In the previous post we went deep on sessions: how to create, persist, resume, and manage them in .NET. All of that assumes you have a running application talking to a Copilot CLI. In development, that's trivial: the SDK starts the CLI for you automatically. In production, the picture is more complex.

This post is about what happens between "it works on my machine" and "it's serving real users." We'll look at how the CLI architecture actually works, when to run the CLI as a separate headless server, the isolation patterns that fit different application types, and how to scale horizontally without losing session state.

How the SDK talks to the CLI

Before making deployment decisions, it helps to understand the communication model. Every SDK in every language works the same way underneath:

Your Application
↓
SDK Client
↓
JSON-RPC
↓
Copilot CLI (server mode)
All SDKs communicate with the Copilot CLI server via JSON-RPC. ...
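To make the JSON-RPC layer described above more concrete, here is a minimal sketch of what a JSON-RPC 2.0 request and response envelope looks like. It only illustrates the generic message format from the JSON-RPC 2.0 specification. The method name `example.ping` and its parameters are hypothetical, the server reply is simulated in-process, and nothing here should be read as the actual Copilot CLI protocol or transport framing.

```python
import json
from itertools import count

# Monotonic request ids so each response can be matched to its request.
_ids = count(1)


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request object with a fresh id."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


def parse_response(raw, expected_id):
    """Decode a JSON-RPC 2.0 response; return its result or raise on error."""
    message = json.loads(raw)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if message.get("id") != expected_id:
        raise ValueError("response id does not match request id")
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"RPC error {err['code']}: {err['message']}")
    return message["result"]


# Hypothetical method name and params, for illustration only.
request = make_request("example.ping", {"message": "hello"})
wire = json.dumps(request)  # what a client would send to the server

# Simulated reply; in a real deployment this would come from the CLI process.
reply = json.dumps(
    {"jsonrpc": "2.0", "id": request["id"], "result": {"message": "hello"}}
)
result = parse_response(reply, request["id"])
print(result)
```

The point of the sketch is that the application never builds these envelopes itself: the SDK client does the serializing and id matching, which is why the same model holds whether the CLI is started by the SDK or runs as a separate server.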
If you use GitHub Copilot, you have probably already been notified that a billing change is coming on June 1, 2026, and you should understand it before it kicks in. GitHub is moving from premium request units (PRUs) to a token-based credit system, and for heavy users, especially those doing agentic work, the cost difference can be significant.

The good news (I don't really know if I should call it that): GitHub has shipped a billing preview tool that lets you see your projected costs right now. Here's what's changing, what the impact is, and exactly how to use the preview experience to protect your budget.

What's actually changing?

GitHub Copilot has historically billed "premium requests", a flat unit that counted each interaction with advanced models. Starting June 1, that system goes away. In its place is GitHub AI Credits, priced by token consumption: every input token, output token, and cached token is metered at the published API rate fo...