As a newcomer to Microsoft Fabric, I found one topic particularly challenging: how Fabric capacity, and especially throttling, works. And what better way to structure my understanding than writing a blog post? Let’s give it a try!
Remark: If I made any mistakes, please feel free to let me know so I can update this article.
What is Microsoft Fabric Capacity?
Microsoft Fabric capacity is the pool of compute resources you purchase to run Fabric workloads (OneLake storage is billed separately). Unlike traditional per-service pricing models, Fabric uses a unified capacity model where you purchase Capacity Units (CUs) that power all Fabric experiences including Data Engineering, Data Warehouse, Data Science, Real-Time Analytics (since renamed Real-Time Intelligence), Power BI, and Data Factory.
When you purchase a Fabric capacity, you're essentially reserving a pool of compute resources that can be shared across different workloads and users within your organization. This capacity is measured in Capacity Units, which represent the computational power available for processing operations.
Microsoft Fabric offers various capacity SKUs, each providing different levels of Capacity Units:
F SKUs (Fabric Capacities):
- F2: 2 Capacity Units
- F4: 4 Capacity Units
- F8: 8 Capacity Units
- F16: 16 Capacity Units
- F32: 32 Capacity Units
- And so on, up to F2048
P SKUs (Power BI Premium): Existing Power BI Premium capacities (P1, P2, P3, etc.) can also be used for Fabric workloads, with their respective capacity unit equivalents.
The larger the SKU, the more concurrent operations you can run and the faster individual operations will complete. Choosing the right SKU depends on your workload characteristics, user count, and performance requirements.
How capacity consumption works
Every operation in Microsoft Fabric consumes capacity units. Different operations have different consumption rates based on their computational complexity. For example:
- Running a Spark job consumes capacity while the job is executing
- Refreshing a semantic model consumes capacity during the refresh operation
- Query execution in a data warehouse consumes capacity per query
- Data pipeline activities consume capacity based on their duration and complexity
Fabric uses a metering system that tracks capacity consumption in near real-time. This consumption is measured in CU-seconds (Capacity Unit-seconds), which represents one Capacity Unit consumed for one second.
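To make the CU-second unit concrete, here is a minimal sketch of the arithmetic. The function names and the example job (4 CUs for 10 minutes) are my own illustration, not a Fabric API, and real consumption rates depend on the workload.

```python
# Sketch of CU-second arithmetic. Names and numbers are illustrative only.

def capacity_units(sku: str) -> int:
    """Return the Capacity Units for an F SKU name such as 'F8'."""
    if not sku.upper().startswith("F"):
        raise ValueError(f"not an F SKU: {sku}")
    return int(sku[1:])

def cu_seconds_available(sku: str, window_seconds: int) -> int:
    """CU-seconds the capacity provides during a time window."""
    return capacity_units(sku) * window_seconds

def cu_seconds_consumed(cus_used: float, duration_seconds: float) -> float:
    """CU-seconds an operation consumes: CUs in use multiplied by seconds."""
    return cus_used * duration_seconds

# An F8 capacity provides 8 CUs * 3600 s = 28,800 CU-seconds per hour.
hour_budget = cu_seconds_available("F8", 3600)

# A hypothetical job that uses 4 CUs for 10 minutes costs 2,400 CU-seconds.
job_cost = cu_seconds_consumed(4, 600)

print(f"Job uses {job_cost / hour_budget:.1%} of one hour of F8 capacity")
```

In this example the job takes a little over 8% of what an F8 offers in an hour.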
Throttling in Microsoft Fabric
Throttling is the mechanism Fabric uses to protect capacity from being overwhelmed and to ensure fair resource distribution among users and workloads. When your capacity is under heavy load, Fabric may throttle certain operations to maintain system stability and performance.
Throttling typically occurs when:
- Capacity saturation: The sum of all operations exceeds the available Capacity Units
- Sustained overload: Capacity has been consistently overutilized for an extended period
- Burst protection: Very large spikes in demand that could destabilize the system
Fabric distinguishes between two types of operations:
Interactive Operations:
- User-initiated queries and reports
- Dashboard loads
- Ad-hoc data exploration
- Real-time dashboards
Background Operations:
- Scheduled data refreshes
- Pipeline executions
- Long-running Spark jobs
- Data warehouse maintenance tasks
When throttling occurs, the two types are treated differently. New interactive operations are the first to be delayed and then rejected, while background operations keep being accepted until the final throttling stage. Once that stage is reached, this may result in:
- Delayed job starts
- Queued refresh operations
- Extended pipeline execution times
- Slower batch processing
What is very important to understand is that throttling goes through multiple stages:
- The first phase of throttling applies a 20-second delay to new interactive operations when the capacity has used up all its CU resources for the next 10 minutes.
- The second phase of throttling rejects new interactive operations when the capacity has used up all its CU resources for the next hour.
- The third phase of throttling rejects all new requests, interactive and background, when the capacity has used up all its available CU resources for the next 24 hours.
The capacity continues to throttle requests until the overconsumed CUs are paid off, which happens as unused capacity in later periods burns down the accumulated overage.
You can see if your capacity is throttled by reviewing the Throttling chart in the Microsoft Fabric Capacity Metrics app (more about this later in this post).
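The stages can be sketched as a small classifier. This is my own simplification: it assumes you know how many CU-seconds of future capacity have already been consumed (the `carryforward_cu_seconds` parameter is a hypothetical name), and the 10-minute boundary for the first stage is taken from Microsoft's throttling documentation.

```python
from enum import Enum

class ThrottleStage(Enum):
    NONE = "no throttling"
    INTERACTIVE_DELAY = "interactive delay (20 seconds)"
    INTERACTIVE_REJECTION = "interactive rejection"
    BACKGROUND_REJECTION = "background rejection"

def throttle_stage(carryforward_cu_seconds: float, capacity_cus: int) -> ThrottleStage:
    """Classify the throttling stage from the amount of future capacity used.

    Simplified model: overage is converted into minutes of future capacity
    and compared against the 10-minute, 60-minute and 24-hour thresholds.
    """
    minutes_consumed = carryforward_cu_seconds / (capacity_cus * 60)
    if minutes_consumed <= 10:
        return ThrottleStage.NONE
    if minutes_consumed <= 60:
        return ThrottleStage.INTERACTIVE_DELAY
    if minutes_consumed <= 24 * 60:
        return ThrottleStage.INTERACTIVE_REJECTION
    return ThrottleStage.BACKGROUND_REJECTION

# On an F8, one minute of capacity equals 8 * 60 = 480 CU-seconds.
for overage in (4_000, 10_000, 100_000, 1_000_000):
    print(overage, "->", throttle_stage(overage, 8).value)
```

The real service evaluates this continuously, so treat the sketch as a way to reason about the thresholds rather than a reproduction of Fabric's logic.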
Smoothing and bursting
Microsoft Fabric implements a smoothing algorithm that allows for temporary bursts above your nominal capacity limit. This is crucial for handling real-world workload patterns where demand isn't constant.
Think of capacity like a bank account with a credit line. Your capacity SKU determines your "income" rate (CUs per second), and operations "spend" from this account. The smoothing mechanism allows you to temporarily overdraw this account during peak periods, because the cost of an operation is spread over a longer time window: a minimum of 5 minutes for interactive operations and 24 hours for background operations.
Example: If you have an F8 capacity (8 CUs), an operation can theoretically burst to 16 CUs for a short period, as long as your smoothed consumption over the window stays at or below 8 CUs. This allows you to handle sudden spikes without immediate throttling.
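A rough sketch of what smoothing does to such a burst. It assumes a simple even spread of the CU-seconds over the window, with window lengths of 5 minutes for interactive and 24 hours for background operations as described in Microsoft's documentation; the function name is my own.

```python
def smoothed_rate(cu_seconds: float, window_seconds: int) -> float:
    """Average CUs charged per second when a cost is spread evenly over a window."""
    return cu_seconds / window_seconds

# A burst of 16 CUs for 60 seconds costs 960 CU-seconds in total.
burst_cost = 16 * 60

# Spread over a 5-minute interactive window this is 3.2 CUs,
# well below the 8 CUs an F8 provides.
interactive_rate = smoothed_rate(burst_cost, 5 * 60)

# Spread over a 24-hour background window it is only about 0.011 CUs.
background_rate = smoothed_rate(burst_cost, 24 * 3600)

print(f"interactive: {interactive_rate:.2f} CUs, background: {background_rate:.4f} CUs")
```

This is why a short spike rarely causes throttling on its own, while many overlapping operations can still add up.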
While smoothing provides flexibility, there are hard limits:
- Short-term burst limit: Typically 2x your capacity for very brief periods
- Sustained burst limit: Lower multiplier sustainable for several minutes
- Throttling threshold: When smoothing buffer is exhausted
Capacity metrics and monitoring
To effectively manage capacity and avoid throttling, you need to monitor key metrics:
Key Metrics to Track
- Capacity Utilization Percentage: Shows what percentage of your capacity is being consumed
- Throttling Events: Indicates when and how often throttling occurs
- Rejected Requests: Operations that couldn't execute due to capacity constraints
- Overload Count: How often you've exceeded capacity limits
- CU-seconds Consumed: Total capacity consumed over time
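To show how these metrics relate, here is a small sketch that derives utilization and an overload count from CU-seconds per timepoint. The 30-second timepoint length and the sample numbers are assumptions for illustration; the Capacity Metrics app does this calculation for you.

```python
def utilization_percent(cu_seconds: float, capacity_cus: int,
                        timepoint_seconds: int = 30) -> float:
    """Utilization of one timepoint as a percentage of what the capacity offers."""
    return 100 * cu_seconds / (capacity_cus * timepoint_seconds)

# Hypothetical smoothed CU-seconds for four consecutive timepoints on an F8,
# which offers 8 * 30 = 240 CU-seconds per timepoint.
consumed_per_timepoint = [120, 200, 300, 180]

utilization = [utilization_percent(c, 8) for c in consumed_per_timepoint]
overload_count = sum(1 for u in utilization if u > 100)

for cu_seconds, pct in zip(consumed_per_timepoint, utilization):
    print(f"{cu_seconds:>4} CU-seconds -> {pct:.1f}%")
print("Overloaded timepoints:", overload_count)
```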
The best way to monitor these metrics is through the Microsoft Fabric Capacity Metrics app, a purpose-built Power BI app for capacity monitoring.
To install the app, follow the instructions in Install the Microsoft Fabric Capacity Metrics app.
After the app is installed, you should see a new Fabric Capacity Metrics workspace.
Here you can open the Fabric Capacity Metrics report. I recommend checking out the Health and Compute pages first.
Best Practices for managing capacity
1. Right-size your capacity
Start with capacity planning based on:
- Number of users and their usage patterns
- Types of workloads (interactive vs. batch)
- Peak usage periods
- Growth projections
Don't over-provision initially; you can always scale up as needed.
When in doubt, use the Fabric SKU estimator to help you find a matching SKU for your workload.
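As a back-of-the-envelope alternative, the sizing idea can be sketched as picking the smallest F SKU that covers an estimated peak. The peak estimate and the `headroom` parameter are hypothetical inputs of my own; this is not how the SKU estimator works internally.

```python
# F SKU sizes in Capacity Units, from F2 up to F2048.
F_SKUS = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048]

def smallest_sku(peak_cus: float, headroom: float = 0.2) -> str:
    """Return the smallest F SKU that covers the peak plus a safety margin."""
    required = peak_cus * (1 + headroom)
    for cus in F_SKUS:
        if cus >= required:
            return f"F{cus}"
    raise ValueError(f"no single F SKU covers {required:.0f} CUs")

# An estimated smoothed peak of 10 CUs with 20% headroom needs 12 CUs.
print(smallest_sku(10))
# Without headroom, a peak of 6 CUs fits on an F8.
print(smallest_sku(6, headroom=0.0))
```

Because smoothing already absorbs short spikes, the peak to use here is the smoothed consumption, not the instantaneous maximum.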
2. Optimize workload scheduling
Distribute resource-intensive operations throughout the day:
- Schedule heavy refreshes during off-peak hours
- Stagger data pipeline executions
- Avoid scheduling multiple large jobs simultaneously
- Use incremental refresh where possible to reduce load
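Staggering can be as simple as spreading start times evenly over an off-peak window. The sketch below only calculates start times; the job names and the window are made up, and you would still have to apply the resulting times in your own refresh or pipeline schedules.

```python
from datetime import datetime

def stagger(jobs: list[str], window_start: datetime,
            window_end: datetime) -> dict[str, datetime]:
    """Spread job start times evenly across the window."""
    if not jobs:
        return {}
    slot = (window_end - window_start) / len(jobs)
    return {job: window_start + i * slot for i, job in enumerate(jobs)}

# Hypothetical heavy jobs, spread over an off-peak window from 01:00 to 04:00.
schedule = stagger(
    ["sales_refresh", "finance_pipeline", "inventory_spark_job"],
    datetime(2025, 1, 1, 1, 0),
    datetime(2025, 1, 1, 4, 0),
)

for job, start in schedule.items():
    print(f"{start:%H:%M}  {job}")
```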
3. Optimize individual operations
Reduce capacity consumption by:
- Optimizing Spark jobs and queries
- Using efficient data models
- Implementing proper partitioning and indexing
- Removing unnecessary data transformations
- Leveraging caching where appropriate
What to do when throttling occurs
If you're experiencing throttling, take these steps:
- Identify the cause: Check which operations are consuming the most capacity
- Prioritize critical workloads: Pause or delay non-critical operations
- Review recent changes: New reports, pipelines, or increased user activity
- Check for runaway processes: Poorly optimized queries or infinite loops
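Identifying the cause usually comes down to ranking items by the capacity they consumed. A minimal sketch, assuming you have a list of (item name, CU-seconds) records, for example copied from the Capacity Metrics report; the item names and numbers are invented.

```python
from collections import defaultdict

def top_consumers(operations: list[tuple[str, float]],
                  n: int = 3) -> list[tuple[str, float]]:
    """Sum CU-seconds per item and return the n largest consumers."""
    totals: dict[str, float] = defaultdict(float)
    for item, cu_seconds in operations:
        totals[item] += cu_seconds
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical operations recorded during a throttled period.
operations = [
    ("Sales model refresh", 5200.0),
    ("Ad-hoc warehouse query", 300.0),
    ("Nightly Spark job", 9100.0),
    ("Sales model refresh", 4800.0),
    ("Finance report", 150.0),
]

for item, cu_seconds in top_consumers(operations, n=2):
    print(f"{cu_seconds:>8.0f} CU-seconds  {item}")
```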
Short-Term Solutions
- Scale up capacity: Increase your SKU temporarily or permanently
- Reschedule operations: Move batch jobs to off-peak hours
- Optimize immediate offenders: Quick fixes for the most problematic workloads
Long-Term Strategies
- Capacity planning: Regular review and adjustment of capacity needs
- Workload optimization: Systematic improvement of query and job efficiency
- User education: Train users on efficient usage patterns
- Architecture review: Consider distributing workloads across multiple capacities
Conclusion
Understanding Microsoft Fabric capacity and throttling is not easy, but it is fundamental to running a successful Fabric implementation. By properly sizing your capacity, monitoring utilization, optimizing workloads, and proactively managing resources, you can ensure your organization gets the best performance and value from Microsoft Fabric.
I learned that capacity management is an ongoing process, not a one-time setup. Regular monitoring, continuous optimization, and periodic capacity reviews will help you maintain optimal performance while controlling costs. As your organization's data needs evolve, your capacity strategy should evolve with it.
The capacity model of Microsoft Fabric offers tremendous flexibility and cost benefits, but it requires thoughtful management.
More information
- Microsoft Fabric quotas (Microsoft Learn)
- Understand your Fabric capacity throttling (Microsoft Learn)
- Evaluate and optimize your Microsoft Fabric capacity (Microsoft Learn)
- What is the Fabric SKU Estimator (preview)? (Microsoft Learn)





