Building an end-to-end monitoring solution with Azure Arc, Log Analytics and Workbooks–Part 2: Data collection with Azure Arc

In part 1 I explained that we want to set up an application health dashboard to gain insight into the availability and health of the on-premises parts of our applications. Specifically, we want to monitor our application pools, scheduled tasks and Windows services. I introduced the overall architecture and explained the building blocks.

Today we'll dive into the first of these blocks: the data collection part, using Azure Arc Data Collection Rules.

Understanding Data Collection rules

A Data Collection Rule (DCR) is a declarative configuration object in Azure that defines the full lifecycle of telemetry: what to collect, how to transform it, and where to send it. It's the connective tissue between the Azure Monitor Agent running on your VMs and the Log Analytics Workspace where the data lands.

DCRs replaced the older model where agents were configured locally via XML files. The new model is centralized — you define the DCR in Azure, associate it with your VMs, and the agents pull their configuration from the DCR. This makes fleet-wide changes much easier. Need to start collecting a new data source? Update the DCR, and every associated VM picks up the change automatically within a few minutes.

For our health monitoring use case, the DCR does three things (a sketch of the corresponding resource structure follows the list):

  1. Defining the data source: It tells the agent to read from a specific log file that contains the health status of application pools, services, and scheduled tasks.
  2. Transforming the data: The DCR can include a transformation query (written in KQL) that reshapes or filters the data before it's sent to Log Analytics. This is optional but powerful — you can drop unnecessary fields, compute derived values, or normalize formats.
  3. Routing to the destination: It specifies that the collected data should be sent to a specific custom table in a specific Log Analytics Workspace.
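
Under the hood, these three responsibilities map onto three sections of the DCR resource: dataSources, dataFlows (which carry the optional transform) and destinations. Here's a minimal sketch of that shape, written as a PowerShell hashtable; the names and placeholder values are illustrative, not a definitive template.

# Rough shape of a DCR's properties. Placeholder values only; the concrete
# configuration is built step by step in the portal walkthrough below.
$dcrShape = @{
    # 1. Data sources: where the agent reads from
    dataSources  = @{
        logFiles = @(
            @{
                name         = 'ServiceHealthLogFile'
                streams      = @('Custom-ServiceHealth_CL')
                filePatterns = @('D:\ServiceHealth\Logs\HealthStatus.log')
                format       = 'json'
            }
        )
    }
    # 2. Data flows: which transform to apply on the way in
    dataFlows    = @(
        @{
            streams      = @('Custom-ServiceHealth_CL')
            destinations = @('healthWorkspace')
            transformKql = 'source'   # placeholder; the real transform is shown later in this post
            outputStream = 'Custom-ServiceHealth_CL'
        }
    )
    # 3. Destinations: the Log Analytics Workspace the data lands in
    destinations = @{
        logAnalytics = @(
            @{
                name                = 'healthWorkspace'
                workspaceResourceId = '<log-analytics-workspace-resource-id>'
            }
        )
    }
}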

How the agent collects health data

Before we dive into the DCR configuration, it's worth clarifying what the agent is actually doing on each VM. The Azure Monitor Agent doesn't natively know how to extract the status of IIS application pools or scheduled tasks — that's not built-in Windows telemetry that the agent scrapes by default.

Instead, we're using the agent's ability to collect from a JSON log. In our implementation, we wrote a PowerShell script that:

  • Queries IIS for all application pools and their current state
  • Queries the Windows Service Control Manager for all monitored services and their status
  • Queries Task Scheduler for all monitored scheduled tasks and their last run result

The script outputs this data in a structured format (JSON) to a log file in a known location on the VM. The Azure Monitor Agent then reads from that log file and ships the contents to Azure.
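
To make this concrete, here is a trimmed-down sketch of what such a script can look like. It assumes the IISAdministration module for the application pool state, uses placeholder names for the monitored services and tasks as well as for the Environment and JobId values, and writes one JSON object per line, which is the shape the agent's JSON log collection expects. The property names match the transform example further down in this post.

# Sketch of a health collection script. Module, service and task names, the
# Environment value and the JobId field are placeholders for illustration.
Import-Module IISAdministration

$logFile   = 'D:\ServiceHealth\Logs\HealthStatus.log'
$timestamp = (Get-Date).ToUniversalTime().ToString('o')
$records   = @()

# 1. IIS application pools and their current state
foreach ($pool in Get-IISAppPool) {
    $records += [pscustomobject]@{
        TimeGenerated = $timestamp
        Server        = $env:COMPUTERNAME
        Environment   = 'Production'            # placeholder
        ResourceType  = 'ApplicationPool'
        Name          = $pool.Name
        JobId         = ''
        Result        = $pool.State.ToString()  # Started / Stopped
    }
}

# 2. The Windows services we monitor
foreach ($svc in Get-Service -Name 'MyAppService*' -ErrorAction SilentlyContinue) {
    $records += [pscustomobject]@{
        TimeGenerated = $timestamp
        Server        = $env:COMPUTERNAME
        Environment   = 'Production'
        ResourceType  = 'WindowsService'
        Name          = $svc.Name
        JobId         = ''
        Result        = $svc.Status.ToString()  # Running / Stopped
    }
}

# 3. The scheduled tasks we monitor, with their last run result
foreach ($task in Get-ScheduledTask -TaskPath '\MyApp\' -ErrorAction SilentlyContinue) {
    $info = $task | Get-ScheduledTaskInfo
    $records += [pscustomobject]@{
        TimeGenerated = $timestamp
        Server        = $env:COMPUTERNAME
        Environment   = 'Production'
        ResourceType  = 'ScheduledTask'
        Name          = $task.TaskName
        JobId         = ''
        Result        = [string]$info.LastTaskResult  # 0 means the last run succeeded
    }
}

# One JSON object per line: the shape the agent's custom JSON log collection expects.
$records | ForEach-Object { $_ | ConvertTo-Json -Compress | Add-Content -Path $logFile }

In our implementation the script runs on a short interval via a scheduled task, so every run appends a fresh snapshot of the health state to the log file.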

Prerequisites

Before you start creating the Data Collection rule, make sure you have the following (a quick PowerShell check for the first two items follows the list):

  • Arc-enabled Windows VMs: Your on-premises VMs need to already be onboarded to Azure Arc. This is a one-time setup per machine. If your VMs aren't Arc-enabled yet, the Azure Arc documentation walks through the onboarding process — it involves running a script on each VM that installs the Arc agent and registers the machine with Azure.
  • Azure Monitor Agent installed: The Azure Monitor Agent (AMA) should be deployed to your Arc-enabled VMs. You can deploy it through Azure Policy, the portal, or via ARM templates. This is separate from the Arc agent itself — Arc connects the machine to Azure, but AMA is what actually collects and ships telemetry.
  • Appropriate Azure permissions: You'll need Contributor or above on the resource group where you're creating the DCR, and read/write access to the Log Analytics Workspace. If you're working in a locked-down environment, coordinate with your Azure admins to get the right role assignments.
  • A Log Analytics Workspace: The workspace needs to exist before you create the DCR. We'll define the custom table schema in Part 3, but the workspace resource itself is a prerequisite here because the DCR references it as a destination.
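
If you want to verify the first two prerequisites from PowerShell instead of the portal, something along these lines can help; it assumes the Az.ConnectedMachine module, an authenticated Az session, and placeholder resource group and machine names.

# Quick prerequisite check for one Arc-enabled machine (names are placeholders).
$resourceGroup = 'rg-monitoring'
$machineName   = 'onprem-web-01'

# Is the machine onboarded to Azure Arc and reporting in?
Get-AzConnectedMachine -ResourceGroupName $resourceGroup -Name $machineName |
    Select-Object Name, Status, AgentVersion

# Is the Azure Monitor Agent extension installed on it?
Get-AzConnectedMachineExtension -ResourceGroupName $resourceGroup -MachineName $machineName |
    Where-Object { $_.Name -eq 'AzureMonitorWindowsAgent' } |
    Select-Object Name, ProvisioningState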

Creating the Data Collection rule

Navigate to Azure Monitor > Data Collection Rules in the Azure portal and click Create.

Basic Configuration

Rule name: Give it a descriptive name like OnPremHealthMonitoring-DCR. Naming conventions matter when you have multiple DCRs in a subscription.

Subscription and Resource Group: Choose or create a dedicated resource group for your monitoring infrastructure. Keeping DCRs, the workspace, and workbooks in the same resource group simplifies lifecycle management and makes RBAC boundaries cleaner.

Region: The DCR itself is a regional resource, but it can collect data from VMs in any region (or on-prem, via Arc). Choose the same region as your Log Analytics Workspace to avoid cross-region data egress costs.

Platform type: Windows.


Data Sources

This is the core of the DCR. Click Add data source and you'll see several options: Performance Counters, Windows Event Logs, Syslog, and Custom JSON Logs.



For our health monitoring setup, we're using Custom JSON Logs. This lets the agent tail a file and ship its contents to Azure.

File pattern: Specify the path to the log file that your health collection script writes to. For example:

D:\ServiceHealth\Logs\HealthStatus.log

You can use wildcards if needed, like D:\ServiceHealth\Logs\*.log, but be specific to avoid accidentally ingesting unrelated files.

Table name: This is the name of the custom table in Log Analytics where the data will land. Use a name like ServiceHealth_CL; custom log tables always carry the _CL suffix.

Transform: This is an optional KQL query that runs on the data before it's ingested. You can use it to:

  • Parse JSON into structured columns
  • Filter out rows you don't need
  • Rename fields to match your table schema
  • Add computed columns

Here's an example transformation if your script outputs JSON:

source
| project
    Name,
    Environment,
    Server, 
    JobId, 
    ResourceType, 
    Result, 
    TimeGenerated = todatetime(TimeGenerated)

The source table is a built-in reference to the raw data the agent collected. You reshape it here, and the output goes to your custom table.



Alternative: Script-Based Collection

If you'd rather have the DCR execute the PowerShell script directly (instead of reading a log file the script writes to), you can use the Custom Logs via AMA data source type with script execution enabled. This is newer and slightly more complex to configure, but it eliminates the intermediate log file.

In this model, the DCR references a PowerShell script stored in Azure Blob Storage (or inline), and the agent downloads and runs it on a schedule. The script's stdout is captured and sent to Log Analytics.

For our use case, we stuck with the log file approach because it gave us more control over error handling and retry logic in the script itself, and it's easier to test the script locally before deploying the DCR.

Destination

Click Add destination and configure it as follows:

Destination type: Azure Monitor Logs (Log Analytics)

Subscription: The subscription containing your Log Analytics Workspace

Account: Select your workspace from the dropdown

If the table doesn't exist yet (we'll create it in Part 3), that's fine — the DCR just stores the reference. However, data won't flow until the table is created with a matching schema.

Review and create

Review your configuration and click Create. The DCR is now live, but it's not doing anything yet — you need to associate it with your VMs.
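
Side note before we move on: if you prefer to keep the DCR definition in source control instead of clicking through the portal, a rough sketch with the generic New-AzResource cmdlet could look like the following. It extends the $dcrShape skeleton from the first section with a stream declaration describing the incoming columns and fills in the concrete values; the resource group, region, workspace ID and API version are placeholders to adapt, and depending on the API version you may also need to link a data collection endpoint for custom log sources.

# Sketch: create the same DCR as code. $dcrShape is the skeleton from the
# 'Understanding Data Collection rules' section; here we fill it in and add a
# stream declaration that describes the columns the agent will send.
$dcrShape.destinations.logAnalytics[0].workspaceResourceId =
    '/subscriptions/<subscription-id>/resourceGroups/rg-monitoring/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>'
$dcrShape.dataFlows[0].transformKql =
    'source | project Name, Environment, Server, JobId, ResourceType, Result, TimeGenerated = todatetime(TimeGenerated)'
$dcrShape.streamDeclarations = @{
    'Custom-ServiceHealth_CL' = @{
        columns = @(
            @{ name = 'TimeGenerated'; type = 'datetime' }
            @{ name = 'Name';          type = 'string' }
            @{ name = 'Environment';   type = 'string' }
            @{ name = 'Server';        type = 'string' }
            @{ name = 'JobId';         type = 'string' }
            @{ name = 'ResourceType';  type = 'string' }
            @{ name = 'Result';        type = 'string' }
        )
    }
}

New-AzResource `
    -ResourceType 'Microsoft.Insights/dataCollectionRules' `
    -ResourceGroupName 'rg-monitoring' `
    -ResourceName 'OnPremHealthMonitoring-DCR' `
    -Location 'westeurope' `
    -Kind 'Windows' `
    -ApiVersion '2022-06-01' `
    -Properties $dcrShape `
    -Force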

Associating the DCR with your VMs

A DCR by itself is just a configuration blueprint. To activate it, you need to create Data Collection Rule Associations (DCRAs) between the DCR and the Arc-enabled VMs you want to monitor.

Manual Association (Portal)

Open the DCR you just created, go to Resources, and click Add. You'll see a list of all resources in your subscription that are eligible for association — this includes Azure VMs and Arc-enabled machines.

Select the VMs you want to monitor and click Add. Within a few minutes, the Azure Monitor Agent on those VMs will detect the new association, pull the DCR configuration, and start collecting data according to the rules you defined.
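
The same association can be scripted. Here's a small sketch using the Az.Monitor and Az.ConnectedMachine modules with placeholder names; note that the cmdlet's parameter names have changed across Az.Monitor versions, so check Get-Help New-AzDataCollectionRuleAssociation for the set your version expects.

# Sketch: associate the DCR with a single Arc-enabled machine (names are placeholders).
# Parameter names differ between Az.Monitor versions; this uses the older set.
$machine = Get-AzConnectedMachine -ResourceGroupName 'rg-monitoring' -Name 'onprem-web-01'
$dcrId   = '/subscriptions/<subscription-id>/resourceGroups/rg-monitoring/providers/Microsoft.Insights/dataCollectionRules/OnPremHealthMonitoring-DCR'

New-AzDataCollectionRuleAssociation `
    -TargetResourceId $machine.Id `
    -AssociationName 'OnPremHealthMonitoring-DCRA' `
    -RuleId $dcrId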

Scale Association (Azure Policy)

If you're managing a large fleet, manually associating VMs is tedious and error-prone. Instead, use Azure Policy to automatically associate the DCR with any VM that matches certain criteria (like a specific resource tag or resource group).

Azure has built-in policy definitions for DCR association. You can assign a policy that says "any Arc-enabled VM tagged with Monitoring: HealthCheck should be associated with DCR OnPremHealthMonitoring-DCR." This way, onboarding new VMs to health monitoring becomes as simple as applying the correct tag.

Here's an example policy assignment:

Policy definition: Configure Windows Arc-enabled machines to be associated with a Data Collection Rule

Scope: Your subscription or a specific resource group

Parameters:

  • Data Collection Rule Resource ID: The full ARM resource ID of your DCR
  • Tag name: Monitoring
  • Tag value: HealthCheck

Once the policy is assigned and remediation is triggered, all matching VMs will be automatically associated.
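
For completeness, the assignment can also be created from PowerShell. A rough sketch, assuming a recent Az.Resources version; the policy display name is the one mentioned above, the dcrResourceId parameter name is an assumption you should verify against the definition, and the tag-based scoping from the portal walkthrough is left out of this sketch.

# Sketch: assign the built-in DCR association policy (all IDs and names are placeholders).
$dcrId = '/subscriptions/<subscription-id>/resourceGroups/rg-monitoring/providers/Microsoft.Insights/dataCollectionRules/OnPremHealthMonitoring-DCR'

$definition = Get-AzPolicyDefinition -Builtin |
    Where-Object { $_.DisplayName -eq 'Configure Windows Arc-enabled machines to be associated with a Data Collection Rule' }
# On older Az.Resources versions the display name sits under $_.Properties.DisplayName instead.

New-AzPolicyAssignment `
    -Name 'onprem-health-dcr-association' `
    -PolicyDefinition $definition `
    -Scope '/subscriptions/<subscription-id>' `
    -PolicyParameterObject @{ dcrResourceId = $dcrId } `
    -IdentityType SystemAssigned `
    -Location 'westeurope'

Existing machines only get the association once a remediation task runs, for example via Start-AzPolicyRemediation from the Az.PolicyInsights module or from the Remediation blade in the portal.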

What's next?

At this point, you have a DCR defined and associated with your VMs. The Azure Monitor Agent is running, reading health data, and attempting to send it to Azure. But until the custom table exists in Log Analytics with the correct schema, the data has nowhere to land.

In Part 3, we'll create that table, define its schema, and configure retention policies. Once that's done, data will start flowing end-to-end.

I'll keep you posted!

More information

Data collection rules in Azure Monitor - Azure Monitor | Microsoft Learn

Collect data from virtual machine client with Azure Monitor - Azure Monitor | Azure Docs
