Building an end-to-end monitoring solution with Azure Arc, Log Analytics and Workbooks - Part 1: Overview & Architecture
On-premises VMs don't disappear just because you are working on a cloud strategy. We are running a lot of Windows workloads on-prem — application pools, Windows services, scheduled tasks — and still need visibility into whether they're healthy.
Traditional on-prem monitoring solutions could work, but they come with their own operational overhead and are directly tied to our on-premises infrastructure. When an incident happens, we don’t want to context-switch between our cloud monitoring stack and our on-prem monitoring stack. It's not ideal.
We wanted a single, cloud-native view into the health of our on-prem workloads without having to lift and shift them into Azure. Azure Arc made this possible by extending Azure's management plane to our on-premises infrastructure. By combining Arc with Log Analytics and Workbooks, we built a unified health dashboard that sits alongside our cloud monitoring, uses the same query language (KQL), and requires no additional on-prem infrastructure.
The problem we were solving
Before we built this, health monitoring was a manual job: developers would RDP into individual servers, open Services.msc or IIS Manager, and manually check whether critical components were running. For scheduled tasks, they'd dig through Task Scheduler on each box. This worked for a handful of servers, but it didn't scale operationally. It also required giving direct administrator access on these machines, which is a ‘no-go’ from a security perspective.
We needed:
- A single dashboard showing the health of all monitored components across all servers
- The ability to filter and drill down by server, component type, or status
- Historical visibility to spot patterns (is this service flapping? did this task fail last night too?)
- Integration with Azure's alerting stack so we could route notifications to teams already using Azure Monitor
The solution had to be low-friction. We didn't want to deploy another monitoring appliance on-prem, and we didn't want to maintain custom scripts running on every VM that would inevitably drift over time.
The 3 puzzle pieces
Azure Arc is the bridge. It extends Azure's management plane to resources that live outside of Azure — in our case, on-premises Windows VMs. Once a machine is Arc-enabled, Azure treats it as a first-class managed resource. You can see it in the portal, tag it, assign policies to it, and — crucially for our use case — deploy the Azure Monitor Agent to it.
Arc doesn't move your workloads. The VMs stay exactly where they are, on-prem, running the same services and tasks they always have. What Arc does is project them into Azure's control plane so you can manage them using the same tools and patterns you'd use for cloud VMs. This is what lets us push the Azure Monitor Agent to on-prem machines and pull telemetry back into Azure without punching permanent inbound firewall holes or setting up VPN tunnels just for monitoring.
Log Analytics Workspace is where the data lives. Azure Monitor Agent ships telemetry to Log Analytics, and we defined a custom table to store the health state of application pools, Windows services, and scheduled tasks. Using a custom table gave us full control over the schema — we weren't constrained by what the built-in Windows event logs expose out of the box.
The built-in Event table in Log Analytics contains Windows event log data, and you can extract service state changes from it. But that data is verbose, semi-structured XML, and every query involves parsing and filtering through thousands of events to find the handful that matter. By creating a custom table, we store pre-structured health snapshots — one row per component, collected at a regular interval — which makes queries fast and simple.
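To make the "one row per component" idea concrete, here is a minimal Python sketch of what a single health snapshot could look like. The column names (Computer, ComponentType, ComponentName, Status) and the sample values are assumptions for illustration, not the actual schema of our table; TimeGenerated is the standard timestamp column that Log Analytics tables carry. Serializing to JSON is only there to show the shape of a row.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class HealthSnapshot:
    """One row per monitored component per collection interval (hypothetical schema)."""
    TimeGenerated: str   # ISO 8601 UTC timestamp of the snapshot
    Computer: str        # server the component runs on
    ComponentType: str   # "AppPool", "Service" or "ScheduledTask" (assumed values)
    ComponentName: str   # name of the app pool, service or task
    Status: str          # "Running", "Stopped", "Failed", ... (assumed values)


def to_json_line(snapshot: HealthSnapshot) -> str:
    """Serialize a snapshot as one compact JSON object."""
    return json.dumps(asdict(snapshot), separators=(",", ":"))


# Sample row with made-up values.
row = HealthSnapshot(
    TimeGenerated=datetime(2024, 1, 15, 8, 0, tzinfo=timezone.utc).isoformat(),
    Computer="SRV-APP-01",
    ComponentType="Service",
    ComponentName="Spooler",
    Status="Running",
)
print(to_json_line(row))
```

Because every row already carries the component and its state in dedicated columns, a query only has to filter and aggregate; there is no XML to parse at query time.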
Azure Workbooks is where the data becomes actionable. Workbooks let us build interactive dashboards backed by KQL queries against our custom table. The result is a live, filterable view of what's running, what's stopped, and what's failing — across all of our on-prem VMs. Workbooks support parameterization (filter by server name, component type, time range), conditional formatting (red for failed, green for running), and can be shared with the team or embedded in operational runbooks.
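The core of such a dashboard is a "latest state per component" query with optional filters. In KQL this is typically expressed with something like summarize arg_max(TimeGenerated, *) by the component's key columns. As a rough illustration of that logic, here is a Python sketch over in-memory rows. The field names, status values, and colour mapping are assumptions for the example, not our actual Workbook definition.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical rows, shaped like the health snapshots described above.
ROWS = [
    {"TimeGenerated": "2024-01-15T08:00:00Z", "Computer": "SRV-APP-01",
     "ComponentType": "Service", "ComponentName": "Spooler", "Status": "Running"},
    {"TimeGenerated": "2024-01-15T08:05:00Z", "Computer": "SRV-APP-01",
     "ComponentType": "Service", "ComponentName": "Spooler", "Status": "Stopped"},
    {"TimeGenerated": "2024-01-15T08:05:00Z", "Computer": "SRV-WEB-01",
     "ComponentType": "AppPool", "ComponentName": "DefaultAppPool", "Status": "Running"},
]

# Assumed colour mapping, mirroring the conditional formatting idea.
STATUS_COLOURS = {"Running": "green", "Stopped": "red", "Failed": "red"}


def latest_status(
    rows: Iterable[Dict[str, str]],
    computer: Optional[str] = None,
    component_type: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Return the most recent row per component, optionally filtered."""
    latest: Dict[tuple, Dict[str, str]] = {}
    for row in rows:
        if computer is not None and row["Computer"] != computer:
            continue
        if component_type is not None and row["ComponentType"] != component_type:
            continue
        key = (row["Computer"], row["ComponentType"], row["ComponentName"])
        # ISO 8601 timestamps in the same format and time zone sort lexicographically.
        if key not in latest or row["TimeGenerated"] > latest[key]["TimeGenerated"]:
            latest[key] = row
    ordered = sorted(latest.values(), key=lambda r: (r["Computer"], r["ComponentName"]))
    # Attach a display colour; unknown statuses fall back to grey.
    return [dict(r, Colour=STATUS_COLOURS.get(r["Status"], "grey")) for r in ordered]


for item in latest_status(ROWS):
    print(item["Computer"], item["ComponentName"], item["Status"], item["Colour"])
```

The optional computer and component_type arguments play the role of Workbook parameters: leaving them unset shows everything, and setting one narrows the view.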
Why we chose this setup
No inbound connectivity required. The Azure Monitor Agent initiates outbound HTTPS connections to Azure. Our on-prem VMs don't need to accept any inbound traffic from the cloud, which keeps our security team happy and our firewall rules simple.
Separation of concerns. The Data Collection Rule (DCR) is a declarative configuration artifact: it's version-controllable and can be applied consistently across our servers. The custom table schema is independent of the DCR, so the data model can evolve without re-deploying agents. The Workbook queries the table and knows nothing about how the data got there. Each layer has a clear job.
Scalability. The Azure Monitor Agent is designed to handle large-scale telemetry collection. Log Analytics can ingest and query data from thousands of machines. Workbooks render queries in near-real-time. This architecture doesn't hit a ceiling at 10 servers or 50 servers — it scales to hundreds without architectural changes.
Cost control. We only pay for Log Analytics ingestion and retention, and the cost scales with data volume. Because we're collecting structured health snapshots at a defined interval (not high-frequency metrics or verbose logs), the data volume is predictable and relatively low. A typical setup monitoring dozens of components across a fleet of VMs generates a few megabytes per day.
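A back-of-the-envelope calculation shows why the volume stays small. The Python sketch below multiplies the number of components by the number of snapshots per day and an average row size. All three input figures (48 components, a 5-minute interval, roughly 250 bytes per row) are assumptions picked for illustration, not measurements from our environment, and the result ignores any per-record overhead added during ingestion.

```python
def daily_volume_mb(components: int, interval_minutes: int, bytes_per_row: int) -> float:
    """Estimate daily data volume in megabytes (1 MB = 1,000,000 bytes)."""
    snapshots_per_day = (24 * 60) // interval_minutes
    rows_per_day = components * snapshots_per_day
    return rows_per_day * bytes_per_row / 1_000_000


# Assumed figures: 48 components in total, one snapshot every 5 minutes,
# roughly 250 bytes per row.
estimate = daily_volume_mb(components=48, interval_minutes=5, bytes_per_row=250)
print(f"{estimate:.2f} MB/day")
```

With these inputs the estimate comes out at roughly 3.5 MB per day. Halving the interval or doubling the number of components doubles the volume, so the collection interval is the main knob for keeping ingestion predictable.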
What this doesn’t replace
This solution is purpose-built for health monitoring of known components: application pools, services, and scheduled tasks. It's not a replacement for comprehensive infrastructure monitoring, APM, or log aggregation. We still want traditional monitoring for CPU, memory, disk, and network performance, as well as application-level telemetry for request tracing and error rates, but we already capture these by combining Azure Arc with Azure Monitor Application Insights.
What’s next?
In the next post, we look at how we generate the required information on the VMs and capture it using a Data Collection Rule.
We'll keep you posted!
More information
- Collect data from virtual machine client with Azure Monitor (Microsoft Learn)
- Overview of Log Analytics in Azure Monitor (Microsoft Learn)