Sessions in the GitHub Copilot SDK: What they are and how to manage them

In the previous post we got a working .NET app talking to the Copilot agent runtime. We created a CopilotSession, sent messages through it, and saw how multi-turn conversations just work — the agent remembered what you said three messages ago without you having to manage that state yourself.

That "just works" quality is deliberate, and it's worth understanding what's actually happening underneath. Sessions are the stateful core of the Copilot SDK. How you create, configure, scope, and dispose them determines whether your application is resilient, scalable, and cost-efficient — or fragile and leaky.

This post goes deep on sessions: what they are, how their lifecycle works, how to persist them across restarts, and the patterns that hold up in production.

The mental model: Client vs. Session

Before getting into lifecycle specifics, it's worth being precise about the two core classes and what each one owns.

CopilotClient is infrastructure. It manages the connection to the Copilot CLI process running locally — process lifecycle, network communication, authentication. It has no concept of what you're talking about. Create one per application and treat it as a singleton.

CopilotSession is context. It holds the full conversation history, tool execution state, and planning context for a specific interaction. It knows what was said, in what order, with what results. Multiple independent sessions can run simultaneously through the same client connection.

The separation matters: the client handles "how do I talk to Copilot at all", while the session handles "what are we talking about". You set up the client once, then create and dispose sessions as needed.

// Client: singleton, created once
await using var client = new CopilotClient();
await client.StartAsync();

// Session: created per conversation, disposed when done
await using var session = await client.CreateSessionAsync(new SessionConfig
{
    Model = "gpt-4.1",
    OnPermissionRequest = PermissionHandler.ApproveAll
});

Session lifecycle

A session moves through four observable states during its lifetime:

Created — the session instance exists but no request has been sent yet. This is where you configure hooks and event handlers before any interaction begins.
Active — a request is in flight. The agent is processing, potentially invoking tools, streaming tokens. The session is busy.
Idle — the agent has finished responding. The session.idle event fires at this point. The session is holding its context in memory and ready for the next prompt.
Disposed — the session has been shut down. Resources are released. Context is gone (unless you've enabled persistence — more on that below).
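
The four states above can be read as a small state machine. The sketch below is an illustrative model only: SessionState and SessionLifecycle are hypothetical names, not SDK types, and the transition rules are inferred from the descriptions above rather than taken from the SDK.

```csharp
// Hypothetical model of the lifecycle described above; not part of the SDK.
public enum SessionState { Created, Active, Idle, Disposed }

public static class SessionLifecycle
{
    // Returns true when moving from one state to another matches the lifecycle:
    // Created -> Active -> Idle -> Active -> ..., with Disposed reachable from any live state.
    public static bool CanTransition(SessionState from, SessionState to) => (from, to) switch
    {
        (SessionState.Disposed, _) => false,                 // nothing leaves Disposed
        (_, SessionState.Disposed) => true,                  // any live session can be disposed
        (SessionState.Created, SessionState.Active) => true, // first prompt sent
        (SessionState.Active, SessionState.Idle) => true,    // agent finished its turn
        (SessionState.Idle, SessionState.Active) => true,    // next prompt sent
        _ => false
    };
}
```

A model like this can be handy in tests or logging to flag unexpected sequences, such as a prompt being sent to a session that has already been disposed.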

The SDK has a built-in 30-minute idle timeout. Sessions that sit inactive longer than this are automatically cleaned up. If you're building long-running workflows, plan for this: either keep sessions warm, persist and resume them, or design your workflow so individual sessions don't need to outlive the timeout.

await using var session = await client.CreateSessionAsync(new SessionConfig
{
    Model = "gpt-4.1",
    OnPermissionRequest = PermissionHandler.ApproveAll
});

// Wire up lifecycle events before your first send
session.On<SessionIdleEvent>(evt =>
{
    Console.WriteLine($"Agent finished. Idle for {evt.Data.IdleDurationMs}ms");
});

session.On<AssistantMessageEvent>(evt =>
{
    Console.WriteLine($"Full response: {evt.Data.Content}");
});

await session.SendAndWaitAsync(new MessageOptions { Prompt = "What changed in HEAD?" });

Events

The SDK surfaces everything through an event model. You subscribe to events using session.On<T>(), and each event type tells you something specific about what the agent is doing.

The core events you'll work with:

Event	When it fires
AssistantMessageEvent	A complete response has been generated
AssistantMessageDeltaEvent	A streaming token has arrived
ToolExecutionStartEvent	The agent is about to call a tool
ToolExecutionCompleteEvent	A tool call has finished
SessionIdleEvent	The agent has finished its current turn
SessionErrorEvent	Something went wrong :-)
SessionCompactionStartEvent	The context window is being trimmed
SessionCompactionCompleteEvent	Context compaction has finished

The session.On() overload that takes a base event works for catch-all logging:

// Log every event — useful during development
var unsubscribe = session.On(evt =>
    Console.WriteLine($"[{evt.Type}] {evt.Data}"));

// Typed handlers for production use
session.On<ToolExecutionStartEvent>(evt =>
    logger.LogInformation("Tool called: {Tool}", evt.Data.ToolName));

session.On<SessionErrorEvent>(evt =>
    logger.LogError("Session error: {Error}", evt.Data.Message));

// Dispose the subscription when you no longer need it
unsubscribe.Dispose();

Session scoping patterns

The right session scope depends on what your application is doing. Three patterns cover most cases.

Session-per-conversation

One session for each user conversation, created at the start, disposed when the conversation ends. This is the most common pattern and the cleanest to reason about.

public class ConversationService
{
    private readonly CopilotClient _client;

    public ConversationService(CopilotClient client) => _client = client;

    public async Task RunConversationAsync(string userId)
    {
        await using var session = await _client.CreateSessionAsync(new SessionConfig
        {
            Model = "gpt-4.1",
            OnPermissionRequest = PermissionHandler.ApproveAll,
            SessionId = $"{userId}-{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}"
        });

        // Full conversation runs through this session
        while (true)
        {
            var input = Console.ReadLine();
            // ReadLine returns null at end of input, so treat that like "exit"
            if (input is null || input == "exit") break;

            var response = await session.SendAndWaitAsync(
                new MessageOptions { Prompt = input });

            Console.WriteLine(response?.Data.Content);
        }
        // Session is disposed here — context is released
    }
}

Session-per-task

For background workflows or batch operations, create a session for each discrete task rather than each user. This gives each task its own isolated context and makes failures easy to scope.

public async Task ProcessTaskAsync(WorkItem task)
{
    await using var session = await _client.CreateSessionAsync(new SessionConfig
    {
        Model = "gpt-4.1",
        OnPermissionRequest = PermissionHandler.ApproveAll,
        SessionId = $"task-{task.Id}",
        Instructions = $"You are processing a {task.Type} task. Context: {task.Description}"
    });

    var result = await session.SendAndWaitAsync(
        new MessageOptions { Prompt = task.Prompt });

    await _store.SaveResultAsync(task.Id, result?.Data.Content);
}

Long-running sessions

Some workflows genuinely benefit from a session that spans many interactions over time, such as an interactive coding agent that remembers the files it read earlier. This works, but requires you to think about context accumulation. Every turn adds to the session's context window. Eventually the SDK will trigger compaction (you'll see SessionCompactionStartEvent and SessionCompactionCompleteEvent), which trims history automatically. Listen for these events if the completeness of history matters to your use case.

Session persistence: surviving restarts

By default, session state lives in memory and disappears when the session ends. For most use cases that's fine. But if your application restarts, deploys new code, or migrates between hosts, in-memory sessions vanish — and users lose their context.

The SDK supports durable session persistence by letting you provide your own sessionId. When you do, state is saved to disk at ~/.copilot/session-state/{sessionId}/:

~/.copilot/session-state/
└── alice-code-review-1714900000/
    ├── checkpoints/
    │   ├── 001.json    # State after first turn
    │   ├── 002.json    # State after second turn
    │   └── ...
    ├── plan.md         # Agent's planning state
    └── files/          # Artifacts the agent created

Creating a resumable session:

// Use a meaningful, stable ID that encodes ownership and purpose
var sessionId = $"{userId}-{taskType}-{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

var session = await client.CreateSessionAsync(new SessionConfig
{
    SessionId = sessionId,
    Model = "gpt-4.1",
    OnPermissionRequest = PermissionHandler.ApproveAll
});

// Persist the sessionId somewhere — database, cache, user preferences
await _store.SaveSessionIdAsync(userId, sessionId);
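
Because the session ID ends up as a directory name under ~/.copilot/session-state/, it seems prudent to keep it to filesystem-safe characters. The helper below is a minimal sketch of the {userId}-{taskType}-{timestamp} scheme used above. SessionIds is a hypothetical class, not part of the SDK, and the set of allowed characters is an assumption rather than a documented SDK rule.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of the SDK.
public static class SessionIds
{
    // Assumption: letters, digits, underscores and hyphens are safe in a directory name.
    private static readonly Regex Unsafe = new Regex("[^A-Za-z0-9_-]+", RegexOptions.Compiled);

    // Builds "{userId}-{taskType}-{unixSeconds}" from sanitized parts.
    public static string Build(string userId, string taskType, DateTimeOffset createdAt)
    {
        if (string.IsNullOrWhiteSpace(userId))
            throw new ArgumentException("userId is required", nameof(userId));
        if (string.IsNullOrWhiteSpace(taskType))
            throw new ArgumentException("taskType is required", nameof(taskType));

        return $"{Sanitize(userId)}-{Sanitize(taskType)}-{createdAt.ToUnixTimeSeconds()}";
    }

    // Replaces each run of unsafe characters with a single underscore and lower-cases the result.
    private static string Sanitize(string value) =>
        Unsafe.Replace(value.Trim(), "_").ToLowerInvariant();
}
```

For example, SessionIds.Build("alice", "code-review", DateTimeOffset.FromUnixTimeSeconds(1714900000)) yields "alice-code-review-1714900000", matching the directory shown earlier. Sanitizing can map two different inputs to the same ID, so sanitize user IDs at the source if uniqueness matters.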

Resuming after a restart:

var sessionId = await _store.GetSessionIdAsync(userId);

// Resume picks up exactly where the conversation left off
var session = await client.ResumeSessionAsync(sessionId);

// Alternatively, change configuration on resume (useful for model upgrades)
var reconfigured = await client.ResumeSessionAsync(sessionId, new SessionConfig
{
    Model = "gpt-4.1",                                   // Switch models between sessions
    OnPermissionRequest = PermissionHandler.ApproveAll,
    Tools = [myUpdatedTool]                              // Add new tools to an existing conversation
});

The OnSessionStart lifecycle hook fires on both new and resumed sessions, and tells you which case you're in via input.Source:

var session = await client.CreateSessionAsync(new SessionConfig
{
    SessionId = sessionId,
    Model = "gpt-4.1",
    OnPermissionRequest = PermissionHandler.ApproveAll,
    Hooks = new SessionHooks
    {
        OnSessionStart = async (input, invocation) =>
        {
            if (input.Source == "resume")
            {
                // Inject context about what's changed since the session was paused
                var summary = await _store.GetUpdatesSinceAsync(invocation.SessionId, input.Timestamp);
                return new SessionStartHookOutput
                {
                    AdditionalContext = $"Session resumed. Updates since last session:\n{summary}"
                };
            }
            return null;
        }
    }
});

Managing multiple concurrent sessions

If you're building a multi-user application — a web API, an internal tool with multiple operators, anything serving more than one person at a time — you'll need to manage session concurrency deliberately.

A single CLI server can handle many concurrent sessions. The key constraint is memory: each active session holds its conversation history in memory, and long-running sessions accumulate context. Without a ceiling, a busy application will eventually exhaust available memory.

Here's a straightforward SessionManager that enforces a concurrency limit:

public class SessionManager : IAsyncDisposable
{
    private readonly CopilotClient _client;
    private readonly Dictionary<string, CopilotSession> _active = new();
    // Dictionary enumeration order is not guaranteed, so creation order is tracked explicitly
    private readonly Queue<string> _creationOrder = new();
    private readonly int _maxConcurrent;
    private readonly SemaphoreSlim _lock = new(1, 1);

    public SessionManager(CopilotClient client, int maxConcurrent = 50)
    {
        if (maxConcurrent < 1)
            throw new ArgumentOutOfRangeException(nameof(maxConcurrent));

        _client = client;
        _maxConcurrent = maxConcurrent;
    }

    public async Task<CopilotSession> GetOrCreateAsync(string sessionId, string model = "gpt-4.1")
    {
        await _lock.WaitAsync();
        try
        {
            if (_active.TryGetValue(sessionId, out var existing))
                return existing;

            if (_active.Count >= _maxConcurrent)
                await EvictOldestAsync();

            var session = await _client.CreateSessionAsync(new SessionConfig
            {
                SessionId = sessionId,
                Model = model,
                // Same permission handler as the earlier examples
                OnPermissionRequest = PermissionHandler.ApproveAll
            });

            _active[sessionId] = session;
            _creationOrder.Enqueue(sessionId);
            return session;
        }
        finally
        {
            _lock.Release();
        }
    }

    // Must be called while holding _lock
    private async Task EvictOldestAsync()
    {
        var oldest = _creationOrder.Dequeue();
        if (_active.Remove(oldest, out var session))
            await session.DisposeAsync();
    }

    public async ValueTask DisposeAsync()
    {
        await _lock.WaitAsync();
        try
        {
            foreach (var session in _active.Values)
                await session.DisposeAsync();

            _active.Clear();
            _creationOrder.Clear();
        }
        finally
        {
            _lock.Release();
        }
    }
}
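
If "oldest" should mean least recently used rather than first created, the eviction order needs to be updated on every access. The sketch below tracks only session IDs and leaves the sessions themselves to the manager. LruOrder is a hypothetical helper, not an SDK type, and it is not thread-safe on its own: it assumes the caller already holds a lock, as SessionManager does.

```csharp
using System.Collections.Generic;

// Hypothetical helper; not part of the SDK. Not thread-safe by itself.
public sealed class LruOrder
{
    // Front of the list is the least recently used ID, back is the most recent.
    private readonly LinkedList<string> _order = new LinkedList<string>();
    private readonly Dictionary<string, LinkedListNode<string>> _nodes =
        new Dictionary<string, LinkedListNode<string>>();

    public int Count => _nodes.Count;

    // Marks an ID as just used, adding it if it is not tracked yet.
    public void Touch(string id)
    {
        if (_nodes.TryGetValue(id, out var node))
            _order.Remove(node);                    // detach so it can move to the back
        else
            node = new LinkedListNode<string>(id);

        _order.AddLast(node);
        _nodes[id] = node;
    }

    // Removes and returns the least recently used ID, or null when nothing is tracked.
    public string? TakeLeastRecentlyUsed()
    {
        var first = _order.First;
        if (first is null) return null;

        _order.RemoveFirst();
        _nodes.Remove(first.Value);
        return first.Value;
    }
}
```

In a manager like the one above, Touch would be called on every successful lookup or creation, and TakeLeastRecentlyUsed would pick the eviction candidate. Both operations are O(1).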

For production deployments, a few additional considerations:

Isolation. If you're building a multi-tenant SaaS product where data isolation matters, consider giving each user or tenant their own CLI server instance rather than sharing one. The official scaling docs cover this pattern in detail.
Shared storage for horizontal scale. If you're running multiple application instances behind a load balancer, session state needs to live on shared storage (a network file system or mounted volume) so any instance can resume any session. The state path (~/.copilot/session-state/) is configurable if you need to redirect it.
The 30-minute idle timeout. Plan for sessions being cleaned up automatically. Either design your workflows to complete within a session's natural lifetime, or use persistence and resume to span longer timescales.
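
One way to plan for the idle timeout is to record when each session was last used, then decide whether a cached in-memory session can still be trusted or whether it is safer to resume from disk. A minimal sketch, assuming the 30-minute figure above: SessionActivityTracker is a hypothetical helper, not an SDK type, and its clock is only an estimate, since the SDK decides when cleanup actually happens.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper; not part of the SDK.
public sealed class SessionActivityTracker
{
    private readonly ConcurrentDictionary<string, DateTimeOffset> _lastUsed =
        new ConcurrentDictionary<string, DateTimeOffset>();
    private readonly TimeSpan _idleTimeout;

    public SessionActivityTracker(TimeSpan idleTimeout) => _idleTimeout = idleTimeout;

    // Records that the session was used at the given time.
    public void Touch(string sessionId, DateTimeOffset now) => _lastUsed[sessionId] = now;

    // True when the session is unknown or has been idle for at least the timeout.
    public bool IsLikelyExpired(string sessionId, DateTimeOffset now) =>
        !_lastUsed.TryGetValue(sessionId, out var last) || now - last >= _idleTimeout;
}
```

Call Touch after each completed turn, and treat a true result from IsLikelyExpired as a hint to call ResumeSessionAsync instead of reusing a cached CopilotSession. Passing the current time in explicitly keeps the helper easy to unit test.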

Registering CopilotClient in ASP.NET Core

For web applications, register CopilotClient as a singleton and use a hosted service to manage its lifecycle cleanly:

// Program.cs
builder.Services.AddSingleton<CopilotClient>();
builder.Services.AddHostedService<CopilotClientHostedService>();
builder.Services.AddSingleton<SessionManager>();

// CopilotClientHostedService.cs
public class CopilotClientHostedService : IHostedService
{
    private readonly CopilotClient _client;

    public CopilotClientHostedService(CopilotClient client) => _client = client;

    public Task StartAsync(CancellationToken ct) => _client.StartAsync(ct);

    public Task StopAsync(CancellationToken ct) => _client.StopAsync(ct);
}

Then inject SessionManager into your controllers or minimal API handlers and create sessions per request or per user:

app.MapPost("/chat/{sessionId}", async (
    string sessionId,
    ChatRequest request,
    SessionManager sessions) =>
{
    var session = await sessions.GetOrCreateAsync(sessionId);

    // Same request type and response shape as the earlier examples
    var response = await session.SendAndWaitAsync(
        new MessageOptions { Prompt = request.Message });

    return Results.Ok(new { reply = response?.Data.Content });
});

What's next

You now have a solid understanding of how sessions work: their lifecycle, their event model, how to persist and resume them, and how to manage concurrency in a real application.

In the next post, we'll look at deployment: which options are available, when to choose each, and how to scale your setup.
