Skip to main content

Slowly Changing Dimensions in Microsoft Fabric - The no-code way

Dimension tables don't stay still. A customer moves city, a product gets reclassified, a salesperson switches regions. The question is never whether this happens, it's what you do with the old value once it changes.

That question has a name: Slowly Changing Dimensions (SCD). If you haven't already, it's worth reading my first post about what SCDs actually are and the full set of types first — this post assumes you already know the difference between Type 1 and Type 2 and want to get straight to implementing Type 2 in Fabric.

Until recently, implementing Type 2 in Fabric meant either building a Dataflow Gen2 with a chain of merge steps, or writing a PySpark notebook against Delta tables. Both work. Both also mean you're maintaining custom logic per table, forever.

Fabric's Copy job now has SCD Type 2 built in as a write method. No merge statements, no derived columns for surrogate keys, no alter-row logic. You pick a write method from a dropdown. This post covers when you actually need Type 2, how the built-in option behaves, and where it currently falls short.

Why this post is about Type 1 vs Type 2, and not the full picture

There's a fuller taxonomy of SCD types (Type 0 for attributes that never change, Type 3 for keeping just one prior value, Type 4 for splitting current and history into separate tables, and others). That decision belongs in the modeling conversation, not the tooling one — by the time you're picking a Fabric component, you should already know which type the dimension needs. Type 0 doesn't need any change-tracking mechanism at all, and Type 3/4 aren't (yet) things Copy job builds for you the way it does Type 2. So this post narrows to the two types Fabric actually has dedicated, documented tutorials and native support for: Type 1 and Type 2.

Type 1 or Type 2?

This isn't a Fabric question, it's a business question, and it's worth getting an explicit answer before you build anything.

  • Type 1 overwrites the old value. The dimension table always shows the current state, nothing more. If a product moves from aisle 6 to aisle 11, the old aisle is just gone. Use this when nobody will ever ask "what did this look like before."
  • Type 2 preserves history. Every change creates a new row, the old row gets closed out with an end date, and a flag marks which row is current. Use this when someone will eventually ask that question, for compliance, trend analysis, or "what was true when this fact happened."

The test I'd apply: if a regulator, auditor, or analyst could plausibly ask "what was the customer's address on the date of this transaction," that's Type 2. If the dimension is just operational lookup data nobody analyzes historically, Type 1 is sufficient and a lot less table to maintain.

What Copy job's SCD Type 2 actually does

CDC in Copy job reads inserts, updates, and deletes from a source's change feed, then applies them to the destination according to a write method you choose. SCD Type 1, also called the Merge update method, is the default: the destination always reflects the current state, with updates overwriting rows and deletes physically removing them.

Selecting SCD Type 2 changes that behavior entirely. When a source record is updated, the existing current row in the destination gets expired by setting its Valid_To and flipping Is_Current to false, then a new row is inserted with the updated values, a fresh Valid_From, and Is_Current set to true. Deletes don't disappear either: a deleted source record is soft-deleted at the destination, the row stays but its Valid_To is set and Is_Current flips to false.

To get there, Copy job adds three columns to the destination automatically: Valid_From for when a version became effective, Valid_To for when it was superseded or deleted (NULL while active), and Is_Current as the active-version flag. Here's the example Microsoft uses for a customer who relocates:

That's the whole pattern. No surrogate key generation logic to write, no UPSERT stored procedure, no conditional split. It's enabled as a single write method selection during Copy job configuration, with history tracking and soft-delete handling switched on together.

Where it bites: connector support and the column-mapping trap

This is a preview feature, and the fine print matters more than the demo.

Connector coverage is narrower than you'd expect. As of writing, SCD Type 2 as a write method works for Azure SQL DB, Azure SQL Managed Instance, on-premises SQL Server, Fabric Lakehouse tables, and SQL database in Fabric as destinations. CDC replication from Snowflake or Oracle sources doesn't support SCD Type 2 yet, even though those sources support CDC reads and Type 1 merge writes. Fabric Data Warehouse can be a Type 1 merge destination, but not yet a Type 2 destination through this path — that combination is still rolling out separately. Check the connector matrix for your specific source and destination before you commit to this approach.

Bring-your-own schema doesn't work. If your destination database already has an existing schema that you created yourself, SCD Type 2 isn't supported. Copy job needs to create the destination table itself so it can add the Valid_From, Valid_To, and Is_Current columns on its own terms. If you've already got a hand-built dimension table sitting there, this isn't the path to bolt history onto it.

Don't touch the column mapping after enabling it. This is the gotcha that actually trips people up in practice. Once SCD Type 2 is selected, Fabric needs to auto-generate the history and key columns on every run. Going in afterward and manually adjusting the column mapping breaks that process. If you need to change mappings, do it before turning SCD Type 2 on, not after.

It needs CDC on the source. Type 2 in Copy job rides on CDC-based incremental copy, not watermark-based. If your source database can't have CDC enabled — permissions, licensing, an on-prem system that doesn't support it — this entire approach is off the table and you're back to notebooks or Dataflow Gen2.

Setting it up

The actual configuration is short, which is the point.

  • Create a new Copy job, pick your CDC-enabled source (Azure SQL DB is the most straightforward to test with).
  • Choose your destination. Let Copy job create the table — don't point it at an existing one if you want Type 2.
  • Switch the copy mode to incremental copy with CDC.
  • Under write method, select SCD Type 2 instead of the default Merge.
  • Confirm or set the business key Copy job should track changes against.
  • Run it. The first run does a full load and builds the base table with the history columns already in place. Every subsequent run reads the change feed, closes out changed rows, and inserts the new versions.

Querying current state afterward is just a filter: WHERE Is_Current = 1. Querying as-of a point in time means filtering on Valid_From <= @asOfDate AND (Valid_To > @asOfDate OR Valid_To IS NULL). Nothing about consuming the table is unusual once it's built, it's a standard Type 2 dimension, Fabric just stopped making you hand-write the mechanism that builds it.

When to skip this and reach for a notebook instead

Built-in SCD Type 2 is a good default when your source/destination pair is supported, your dimension logic is the textbook pattern, and you don't need to touch the history columns yourself. It's the wrong tool when you need custom change-detection logic beyond what CDC gives you (say, ignoring certain column changes, or hashing a subset of attributes to detect "meaningful" change), when your destination is one of the currently unsupported connectors, or when you need fine control over surrogate key generation for downstream fact table joins.

More information

Popular posts from this blog

Podman– Command execution failed with exit code 125

After updating WSL on one of the developer machines, Podman failed to work. When we took a look through Podman Desktop, we noticed that Podman had stopped running and returned the following error message: Error: Command execution failed with exit code 125 Here are the steps we tried to fix the issue: We started by running podman info to get some extra details on what could be wrong: >podman info OS: windows/amd64 provider: wsl version: 5.3.1 Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM Error: unable to connect to Podman socket: failed to connect: dial tcp 127.0.0.1:2655: connectex: No connection could be made because the target machine actively refused it. That makes sense as the podman VM was not running. Let’s check the VM: >podman machine list NAME         ...

Azure DevOps/ GitHub emoji

I’m really bad at remembering emoji’s. So here is cheat sheet with all emoji’s that can be used in tools that support the github emoji markdown markup: All credits go to rcaviers who created this list.

VS Code Planning mode

After the introduction of Plan mode in Visual Studio , it now also found its way into VS Code. Planning mode, or as I like to call it 'Hannibal mode', extends GitHub Copilot's Agent Mode capabilities to handle larger, multi-step coding tasks with a structured approach. Instead of jumping straight into code generation, Planning mode creates a detailed execution plan. If you want more details, have a look at my previous post . Putting plan mode into action VS Code takes a different approach compared to Visual Studio when using plan mode. Instead of a configuration setting that you can activate but have limited control over, planning is available as a separate chat mode/agent: I like this approach better than how Visual Studio does it as you have explicit control when plan mode is activated. Instead of immediately diving into execution, the plan agent creates a plan and asks some follow up questions: You can further edit the plan by clicking on ‘Open in Editor’: ...