Building an end-to-end monitoring solution with Azure Arc, Log Analytics and Workbooks - Part 5: Putting it all together
Wow! We covered a lot in this series.
Part 1 - Overview & Architecture
Part 2 - Data collection with Azure Arc
Part 3 - Data persistence in Log Analytics
Part 4 - Data visualization with Azure Workbooks
Time for a wrap-up and some troubleshooting
Let's trace the data flow from start to finish to make sure everything connects:
- The Azure Monitor Agent runs on each Arc-enabled on-prem VM.
- The Data Collection Rule tells the agent what health data to gather — application pools, Windows services, and scheduled tasks.
- The agent collects that data on a regular interval and ships it to Azure.
- The DCR routes the incoming data to our custom table (OnPremHealthStatus_CL) in the Log Analytics Workspace.
- The Workbook queries that table and renders the dashboard.
If any link in that chain breaks, data stops flowing. The troubleshooting section below covers the most common failure points.
Troubleshooting checklist
No data appearing in the workbook: Start at the table. Run a basic OnPremHealthStatus_CL | take 10 query directly in Log Analytics. If there are no results, the issue is upstream of the Workbook — either the agent isn't sending data or the DCR isn't routing it correctly.
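If you'd rather run that check from PowerShell than from the portal, here's a minimal sketch. It assumes the Az.OperationalInsights module is installed, you're signed in with Connect-AzAccount, and the workspace ID below is a placeholder:

```powershell
# Assumes Az.OperationalInsights and an authenticated session (Connect-AzAccount)
$workspaceId = "<your-workspace-guid>"

# Same sanity check as in the portal: is anything landing in the custom table?
(Invoke-AzOperationalInsightsQuery -WorkspaceId $workspaceId `
    -Query "OnPremHealthStatus_CL | take 10").Results

# If that comes back empty, check whether the agents are heartbeating at all
(Invoke-AzOperationalInsightsQuery -WorkspaceId $workspaceId `
    -Query "Heartbeat | where TimeGenerated > ago(1h) | summarize count() by Computer").Results
```

If the Heartbeat query is also empty, the problem is the agent or its connectivity; if Heartbeat has rows but the custom table doesn't, look at the DCR.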
Query timeout: If queries are timing out, reduce the time range or add more specific filters. Log Analytics has query timeout limits (typically 3 minutes for portal queries).
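As a sketch, here's the same idea from PowerShell: constrain the time range inside the query itself, pass -Timespan to the API, or both:

```powershell
$workspaceId = "<your-workspace-guid>"

# Restrict the time range in the query itself...
$query = "OnPremHealthStatus_CL | where TimeGenerated > ago(1h) | take 100"

# ...and/or let the API cap the span for you
Invoke-AzOperationalInsightsQuery -WorkspaceId $workspaceId -Query $query `
    -Timespan (New-TimeSpan -Hours 1) |
    Select-Object -ExpandProperty Results
```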
Conditional formatting not applying: Make sure the status values in your data exactly match the conditions you've set (case-sensitive). A status of running (lowercase) won't match a condition checking for Running.
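A quick way to see exactly which status strings the agents are sending, so your workbook conditions can match them letter for letter (the Status column name is an assumption from my schema; use whatever you defined in Part 3):

```powershell
$workspaceId = "<your-workspace-guid>"

# List every distinct status value actually present in the table
(Invoke-AzOperationalInsightsQuery -WorkspaceId $workspaceId `
    -Query "OnPremHealthStatus_CL | distinct Status").Results
```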
Data is stale: Check the TimeGenerated column. If the most recent rows are hours old, the agent may have stopped collecting or lost connectivity to Azure. Check the agent health on the VM.
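To see at a glance how fresh each server's data is, and whether the agent service is even running, something like this helps (the Computer column name is an assumption; adjust to your schema):

```powershell
# In Log Analytics: latest row per machine
$workspaceId = "<your-workspace-guid>"
(Invoke-AzOperationalInsightsQuery -WorkspaceId $workspaceId `
    -Query "OnPremHealthStatus_CL | summarize LastSeen = max(TimeGenerated) by Computer | order by LastSeen asc").Results

# On the VM itself: is the Azure Monitor Agent service still running?
# (On Windows the service name should be AzureMonitorAgent)
Get-Service -Name AzureMonitorAgent
```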
Partial data (some VMs missing): Verify the DCR association. Not all VMs may have the rule associated. Check the DCR's resource associations in the portal.
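If you'd rather script that check than click through the portal, something like this lists the DCRs associated with a given Arc machine. Treat it as a sketch: the parameter name has changed between Az.Monitor versions (-TargetResourceId in older releases, -ResourceUri in newer ones), so check Get-Help for the one you have installed:

```powershell
# Resource ID of the Arc-enabled server (Microsoft.HybridCompute/machines)
$machineId = "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.HybridCompute/machines/<vm-name>"

# List every data collection rule associated with that machine
Get-AzDataCollectionRuleAssociation -TargetResourceId $machineId |
    Select-Object Name, DataCollectionRuleId
```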
Agent version too old: The custom text logs data source type requires Azure Monitor Agent version 1.10 or higher. If your Arc-enabled VMs have an older agent version, update it through Azure Policy or manually via extension management in the portal.
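Here's a sketch for checking the installed extension version from PowerShell (assumes the Az.ConnectedMachine module; the exact property names can vary slightly between module versions, so fall back to Format-List * if these come back empty):

```powershell
# List the Azure Monitor Agent extension on an Arc-enabled server and show its version
Get-AzConnectedMachineExtension -ResourceGroupName "<rg>" -MachineName "<vm-name>" |
    Where-Object Name -like "*AzureMonitor*" |
    Format-List Name, TypeHandlerVersion, ProvisioningState
```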
File permissions: The Azure Monitor Agent runs under the NT AUTHORITY\SYSTEM account on Windows. Make sure the log file your script writes to is readable by SYSTEM. If you're writing to C:\ProgramData\ or C:\MonitoringData\, this should be fine by default.
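A quick way to confirm SYSTEM can actually read the file (the path and file name here are just examples; use whatever your script from Part 2 writes):

```powershell
# Show the effective ACL on the log file the collection script produces
icacls "C:\MonitoringData\healthstatus.log"

# Or inspect the folder ACL with Get-Acl
(Get-Acl "C:\MonitoringData").Access |
    Format-Table IdentityReference, FileSystemRights, AccessControlType
```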
DCR not updating: If you edit a DCR after it's been associated with VMs, the agents don't pick up the changes instantly. It can take up to 5 minutes for the agent to refresh its configuration. If you need an immediate update, restart the Azure Monitor Agent service on the VM.
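On the VM, that's a one-liner (run from an elevated prompt):

```powershell
# Force the Azure Monitor Agent to reload its configuration
Restart-Service -Name AzureMonitorAgent

# Confirm it came back up
Get-Service -Name AzureMonitorAgent | Select-Object Status, StartType
```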
Firewall blocking outbound connections: The Azure Monitor Agent needs outbound HTTPS access to several Azure endpoints. If your on-prem network has restrictive egress rules, make sure the following domains are allowed:
- *.ods.opinsights.azure.com (data ingestion)
- *.oms.opinsights.azure.com (agent configuration)
- *.monitoring.azure.com (Arc and agent management)
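Wildcard domains can't be tested directly, so substitute your workspace ID and probe the concrete hosts on port 443, for example:

```powershell
# Replace <workspace-id> with your Log Analytics workspace GUID
$workspaceId = "<workspace-id>"

"$workspaceId.ods.opinsights.azure.com", "$workspaceId.oms.opinsights.azure.com" |
    ForEach-Object {
        # TcpTestSucceeded = $true means the endpoint is reachable on 443
        Test-NetConnection -ComputerName $_ -Port 443 |
            Select-Object ComputerName, TcpTestSucceeded
    }
```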
If you're using a proxy, the agent can be configured to route through it via the Arc agent's proxy settings.
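The setting looks roughly like this on the Arc-enabled server (the proxy URL is a placeholder, and it requires a reasonably recent azcmagent version):

```powershell
# Point the Connected Machine agent (and the extensions it manages, including AMA) at your proxy
azcmagent config set proxy.url "http://proxy.contoso.local:8080"

# Verify the setting took effect
azcmagent config get proxy.url
```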
Schema mismatches: If you see errors when the agent tries to write to the custom table, the data being sent doesn't match the table schema. Double-check that the column names and types in your DCR data source configuration match the table definition exactly.
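To compare what's actually in the table with the stream declaration in your DCR, getschema dumps the live column names and types:

```powershell
$workspaceId = "<your-workspace-guid>"

# List the current columns and types of the custom table
(Invoke-AzOperationalInsightsQuery -WorkspaceId $workspaceId `
    -Query "OnPremHealthStatus_CL | getschema | project ColumnName, ColumnType").Results
```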
How can you further improve this setup?
This setup gives you a solid foundation. From here, there are a few natural directions to extend it:
Alerting: Layer Azure Monitor alerts on top of the same custom table. You can alert when a critical service enters a Stopped or Failed state, without needing a separate alerting tool.
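As a rough sketch, the alert condition is just another KQL query over the same table. The column names here (Status, ComponentName) are assumptions, so map them to your own schema from Part 3:

```powershell
# Hypothetical alert query: any component reported Stopped or Failed in the last 15 minutes
$alertQuery = @"
OnPremHealthStatus_CL
| where TimeGenerated > ago(15m)
| where Status in ("Stopped", "Failed")
| summarize FailedChecks = count() by Computer, ComponentName
"@

# Use this as the query of a new log search alert rule
# (portal: Monitor > Alerts > Create, or New-AzScheduledQueryRule from Az.Monitor)
$alertQuery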
Historical trending: Extend the Workbook with time-series queries to track how component health has changed over days or weeks. This is useful for catching intermittent failures that a point-in-time view would miss.
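For example, a simple hourly breakdown over the last week (again, the Status column name is an assumption) gives you a time series you can drop into a workbook time chart:

```powershell
$workspaceId = "<your-workspace-guid>"

# Hourly count of health checks per status over the last 7 days
$trendQuery = @"
OnPremHealthStatus_CL
| where TimeGenerated > ago(7d)
| summarize Checks = count() by bin(TimeGenerated, 1h), Status
| order by TimeGenerated asc
"@

(Invoke-AzOperationalInsightsQuery -WorkspaceId $workspaceId -Query $trendQuery).Results
```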
Automation: Once you have reliable health data flowing into Log Analytics, you can trigger Logic Apps or Azure Functions in response to specific health events — auto-restarting a failed service, for example.
Maybe I’ll cover these extensions in future posts. For now, the five parts above should get you from zero to a working health dashboard.
Feel free to adapt the schema, queries, and workbook structure to fit your environment.
More information
Azure Monitor Agent Network Configuration - Azure Monitor | Microsoft Learn