Understanding Microsoft Fabric Capacity and Throttling–A first attempt

As someone new to Microsoft Fabric, I found one of the more challenging topics to be how Fabric capacity, and especially throttling, works. And what better way to structure my understanding than writing a blog post? Let’s give it a try!

Remark: If I made any mistakes, please let me know so I can update this article.

What is Microsoft Fabric Capacity?

Microsoft Fabric capacity is the compute and storage resources you purchase to run Fabric workloads. Unlike traditional per-service pricing models, Fabric uses a unified capacity model where you purchase Capacity Units (CUs) that power all Fabric experiences including Data Engineering, Data Warehouse, Data Science, Real-Time Analytics, Power BI, and Data Factory.

When you purchase a Fabric capacity, you're essentially reserving a pool of compute resources that can be shared across different workloads and users within your organization. This capacity is measured in Capacity Units, which represent the computational power available for processing operations.

Microsoft Fabric offers various capacity SKUs, each providing different levels of Capacity Units:

F SKUs (Fabric Capacities):

F2: 2 Capacity Units
F4: 4 Capacity Units
F8: 8 Capacity Units
F16: 16 Capacity Units
F32: 32 Capacity Units
And so on, up to F2048

P SKUs (Power BI Premium): Existing Power BI Premium capacities (P1, P2, P3, etc.) can also be used for Fabric workloads, with their respective capacity unit equivalents.

The larger the SKU, the more concurrent operations you can run and the faster individual operations will complete. Choosing the right SKU depends on your workload characteristics, user count, and performance requirements.

How capacity consumption works

Every operation in Microsoft Fabric consumes capacity units. Different operations have different consumption rates based on their computational complexity. For example:

Running a Spark job consumes capacity while the job is executing
Refreshing a semantic model consumes capacity during the refresh operation
Query execution in a data warehouse consumes capacity per query
Data pipeline activities consume capacity based on their duration and complexity

Fabric uses a metering system that tracks capacity consumption in near real-time. This consumption is measured in CU-seconds (Capacity Unit-seconds), which represents one Capacity Unit consumed for one second.
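To make the CU-second unit concrete, here is a minimal Python sketch of the arithmetic. The function names are my own, and the assumption that the number in an F SKU name equals its Capacity Units is taken from the SKU list earlier in this post. It is not how Fabric meters usage internally, only the unit math.

```python
def sku_capacity_units(sku: str) -> int:
    """Return the Capacity Units of an F SKU such as 'F8'.

    Assumption: the number in the SKU name is the CU count.
    """
    if not sku.upper().startswith("F") or not sku[1:].isdigit():
        raise ValueError(f"not an F SKU: {sku!r}")
    return int(sku[1:])


def available_cu_seconds(sku: str, window_seconds: int) -> int:
    """CU-seconds a capacity provides over a time window."""
    return sku_capacity_units(sku) * window_seconds


def consumed_cu_seconds(cus_used: float, duration_seconds: float) -> float:
    """CU-seconds an operation consumes: CUs used multiplied by seconds."""
    return cus_used * duration_seconds


# An F8 capacity provides 8 CUs * 60 s = 480 CU-seconds per minute.
budget = available_cu_seconds("F8", 60)
# A hypothetical operation using 2 CUs for 30 seconds consumes 60 CU-seconds.
cost = consumed_cu_seconds(2, 30)
```

In this example the operation would use 60 of the 480 CU-seconds the F8 capacity offers in that minute.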

Throttling in Microsoft Fabric

Throttling is the mechanism Fabric uses to protect capacity from being overwhelmed and to ensure fair resource distribution among users and workloads. When your capacity is under heavy load, Fabric may throttle certain operations to maintain system stability and performance.

Throttling typically occurs when:

Capacity saturation: The sum of all operations exceeds the available Capacity Units
Sustained overload: Capacity has been consistently overutilized for an extended period
Burst protection: Very large spikes in demand that could destabilize the system

Fabric distinguishes between two types of operations:

Interactive Operations:

User-initiated queries and reports
Dashboard loads
Ad-hoc data exploration
Real-time dashboards

Background Operations:

Scheduled data refreshes
Pipeline executions
Long-running Spark jobs
Data warehouse maintenance tasks

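When throttling occurs, the two types of operations are treated differently. According to Microsoft's documentation, and as the stages described below show, new interactive operations are delayed and then rejected first, while background operations are only rejected in the final stage. A heavily loaded capacity can still affect background work, which may result in: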

Delayed job starts
Queued refresh operations
Extended pipeline execution times
Slower batch processing

What is very important to understand is that throttling goes through multiple stages:

The first phase of throttling applies a 20-second delay to new interactive operations when a capacity has used up all its available CU resources for the next 10 minutes.
The second phase of throttling rejects new interactive operations when a capacity has used up all its available CU resources for the next hour.
The third phase of throttling rejects all new requests, interactive and background, when the capacity has used up all its available CU resources for the next 24 hours.

The capacity continues to throttle requests until the overage is paid off, meaning the CUs consumed beyond the limit have been covered by later unused capacity.
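The stages can be expressed as a small classification function. This is a sketch under assumptions, not Fabric's implementation: the function and constant names are hypothetical, and the 10-minute boundary before the first stage reflects my reading of Microsoft's throttling documentation, so please verify it there.

```python
# Stage boundaries, in minutes of future capacity already consumed.
# Assumption: 10 / 60 / 1440 minutes, per Microsoft's throttling documentation.
NO_THROTTLING_LIMIT = 10
INTERACTIVE_DELAY_LIMIT = 60
INTERACTIVE_REJECTION_LIMIT = 24 * 60


def overage_minutes(carried_forward_cu_seconds: float, capacity_cus: int) -> float:
    """Minutes of future capacity needed to pay off carried-forward usage."""
    return carried_forward_cu_seconds / capacity_cus / 60


def throttling_stage(minutes: float) -> str:
    """Map minutes of consumed future capacity to a throttling stage."""
    if minutes <= NO_THROTTLING_LIMIT:
        return "no throttling"
    if minutes <= INTERACTIVE_DELAY_LIMIT:
        return "interactive delay"  # 20-second delay on new interactive operations
    if minutes <= INTERACTIVE_REJECTION_LIMIT:
        return "interactive rejection"  # new interactive operations are rejected
    return "background rejection"  # all new requests are rejected


# Hypothetical F8 capacity that carries 9,600 CU-seconds of overage:
# 9,600 / 8 CUs / 60 = 20 minutes of future capacity.
stage = throttling_stage(overage_minutes(9600, 8))
```

With 20 minutes of future capacity consumed, the hypothetical F8 capacity would be in the interactive delay stage.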

You can see if your capacity is throttled by reviewing the Throttling chart in the Microsoft Fabric Capacity Metrics app (more about this later in this post).

Remark: In the Throttling chart, everything above the dotted line is throttled.

Smoothing and bursting

Microsoft Fabric implements a smoothing algorithm that allows for temporary bursts above your nominal capacity limit. This is crucial for handling real-world workload patterns where demand isn't constant.

Think of capacity like a bank account with a credit line. Your capacity SKU determines your "income" rate (CUs per second), and operations "spend" from this account. The smoothing mechanism allows you to temporarily overdraw this account during peak periods, provided you stay within limits over a longer time window. According to Microsoft's documentation, interactive operations are smoothed over a minimum of 5 minutes and background operations over 24 hours.

Example: If you have an F8 capacity (8 CUs), you can theoretically burst to 16 CUs for a short period, as long as your average consumption over the smoothing window remains around 8 CUs. This allows you to handle sudden spikes without immediate throttling.

While smoothing provides flexibility, there are hard limits:

Short-term burst limit: Typically 2x your capacity for very brief periods
Sustained burst limit: Lower multiplier sustainable for several minutes
Throttling threshold: When smoothing buffer is exhausted
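The F8 example above can be checked with a few lines of Python. This is a deliberately simplified model that only averages consumption over one window; it does not reproduce Fabric's actual smoothing algorithm. The 5-minute window and the sample values are assumptions chosen for illustration.

```python
def average_cus(samples: list[tuple[float, float]]) -> float:
    """Average CUs over a window.

    Each sample is (cus_used, duration_seconds).
    """
    total_cu_seconds = sum(cus * seconds for cus, seconds in samples)
    total_seconds = sum(seconds for _, seconds in samples)
    return total_cu_seconds / total_seconds


# Hypothetical 5-minute window on an F8 capacity:
# a 60-second burst at 16 CUs, followed by 240 seconds at 6 CUs.
window = [(16, 60), (6, 240)]
average = average_cus(window)  # (960 + 1440) / 300 = 8.0 CUs
within_capacity = average <= 8
```

Although the burst doubles the nominal capacity for a minute, the average over the window is 8 CUs, so in this simplified model the capacity stays within its limit.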

Capacity metrics and monitoring

To effectively manage capacity and avoid throttling, you need to monitor key metrics:

Key Metrics to Track

Capacity Utilization Percentage: Shows what percentage of your capacity is being consumed
Throttling Events: Indicates when and how often throttling occurs
Rejected Requests: Operations that couldn't execute due to capacity constraints
Overload Count: How often you've exceeded capacity limits
CU-seconds Consumed: Total capacity consumed over time
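As a rough illustration of how the utilization percentage relates to CU-seconds, here is a minimal sketch. The function name and the input values are hypothetical; the Capacity Metrics app calculates these figures for you, so this only shows the underlying ratio.

```python
def utilization_percent(
    consumed_cu_seconds: float, capacity_cus: int, window_seconds: int
) -> float:
    """Consumed CU-seconds as a percentage of what the capacity offers in the window."""
    available = capacity_cus * window_seconds
    return 100 * consumed_cu_seconds / available


# Hypothetical F8 capacity over a 60-second window (480 CU-seconds available).
normal = utilization_percent(360, 8, 60)      # 75.0
overloaded = utilization_percent(600, 8, 60)  # 125.0, more than the capacity provides
```

A value above 100 means more was consumed in the window than the capacity provides, which is the overage that smoothing and throttling then have to deal with.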

The best way to monitor this is through the Fabric Capacity Metrics App. This is a purpose-built Power BI app for capacity monitoring.

To install the app, follow the instructions in Install the Microsoft Fabric Capacity Metrics app.

After the app is installed, you should see a new Fabric Capacity Metrics workspace.

Here you can open the Fabric Capacity Metrics report. I recommend checking out the Health and Compute pages first.

Best Practices for managing capacity

1. Right-size your capacity

Start with capacity planning based on:

Number of users and their usage patterns
Types of workloads (interactive vs. batch)
Peak usage periods
Growth projections

Don't over-provision initially; you can always scale up as needed.

When in doubt, use the Fabric SKU estimator to help you find a matching SKU for your workload.

2. Optimize workload scheduling

Distribute resource-intensive operations throughout the day:

Schedule heavy refreshes during off-peak hours
Stagger data pipeline executions
Avoid scheduling multiple large jobs simultaneously
Use incremental refresh where possible to reduce load
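Staggering can be as simple as giving each heavy job its own start slot. The sketch below only computes start times; the job names, the 01:00 start, and the 30-minute gap are made up for illustration, and you would still enter the resulting times in the Fabric scheduling settings yourself.

```python
from datetime import datetime, timedelta


def stagger_start_times(
    jobs: list[str], first_start: datetime, gap: timedelta
) -> dict[str, datetime]:
    """Assign each job a start time, spaced `gap` apart, in the given order."""
    return {job: first_start + index * gap for index, job in enumerate(jobs)}


# Hypothetical heavy jobs, spread over the night instead of all starting at 01:00.
jobs = ["sales_refresh", "finance_refresh", "inventory_pipeline"]
schedule = stagger_start_times(
    jobs, first_start=datetime(2025, 1, 1, 1, 0), gap=timedelta(minutes=30)
)

for job, start in schedule.items():
    print(f"{job}: {start:%H:%M}")
```

A sensible gap depends on how long each job typically runs, which you can look up in the Capacity Metrics app.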

3. Optimize individual operations

Reduce capacity consumption by:

Optimizing Spark jobs and queries
Using efficient data models
Implementing proper partitioning and indexing
Removing unnecessary data transformations
Leveraging caching where appropriate

What to do when throttling occurs

If you're experiencing throttling, take these steps:

Identify the cause: Check which operations are consuming the most capacity
Prioritize critical workloads: Pause or delay non-critical operations
Review recent changes: New reports, pipelines, or increased user activity
Check for runaway processes: Poorly optimized queries or infinite loops
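For the first step, identifying the cause, it helps to rank operations by their total consumption. The sketch below assumes you have copied item names and CU-seconds from the Capacity Metrics app into a list by hand; the names and numbers are invented, and no Fabric API is involved.

```python
from collections import defaultdict


def top_consumers(
    operations: list[tuple[str, float]], n: int = 3
) -> list[tuple[str, float]]:
    """Sum CU-seconds per item and return the n largest consumers."""
    totals: dict[str, float] = defaultdict(float)
    for item_name, cu_seconds in operations:
        totals[item_name] += cu_seconds
    return sorted(totals.items(), key=lambda entry: entry[1], reverse=True)[:n]


# Hypothetical (item name, CU-seconds) records.
operations = [
    ("Sales report", 1200),
    ("Nightly pipeline", 5400),
    ("Sales report", 800),
    ("Ad-hoc notebook", 300),
]
worst = top_consumers(operations, n=2)
# worst == [("Nightly pipeline", 5400.0), ("Sales report", 2000.0)]
```

The items at the top of this ranking are the first candidates to reschedule or optimize.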

Short-Term Solutions

Scale up capacity: Increase your SKU temporarily or permanently
Reschedule operations: Move batch jobs to off-peak hours
Optimize immediate offenders: Quick fixes for the most problematic workloads

Long-Term Strategies

Capacity planning: Regular review and adjustment of capacity needs
Workload optimization: Systematic improvement of query and job efficiency
User education: Train users on efficient usage patterns
Architecture review: Consider distributing workloads across multiple capacities

Conclusion

Understanding Microsoft Fabric capacity and throttling is not easy, but it is fundamental to running a successful Fabric implementation. By properly sizing your capacity, monitoring utilization, optimizing workloads, and proactively managing resources, you can ensure your organization gets the best performance and value from Microsoft Fabric.

I learned that capacity management is an ongoing process, not a one-time setup. Regular monitoring, continuous optimization, and periodic capacity reviews will help you maintain optimal performance while controlling costs. As your organization's data needs evolve, your capacity strategy should evolve with it.

The capacity model of Microsoft Fabric offers tremendous flexibility and cost benefits, but it requires thoughtful management.

More information

Microsoft Fabric quotas - Microsoft Fabric | Microsoft Learn

Understand your Fabric capacity throttling - Microsoft Fabric | Microsoft Learn

Evaluate and optimize your Microsoft Fabric capacity - Microsoft Fabric | Microsoft Learn

What is the Fabric SKU Estimator (preview)? - Microsoft Fabric | Microsoft Learn
