Rubber Duck in GitHub Copilot CLI

In 2012(!) I blogged about rubber duck debugging as a way to troubleshoot issues. It's a practice I still use today. With GitHub Copilot CLI's newest update, the rubber duck is becoming AI-driven: Copilot introduces a second AI model from a different family to critique your agent's plans and implementations at the moments where feedback has the highest return.


Let's have a look at it in more detail...

What Rubber Duck actually does

Rubber Duck is not a general-purpose chat assistant. It's a review agent that runs on a model from a different AI family than the model you've selected as your orchestrator.

When you're running a Claude model as your orchestrator, Rubber Duck runs on GPT-5.4. This is intentional: a model reviewing its own output is still bounded by its own training biases. A cross-family reviewer brings genuinely different blind spots to the table.
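To make the pairing concrete, here is a minimal sketch of the selection rule as described in this post. The function name `reviewer_for` and the model identifier strings are hypothetical, and this is not how Copilot implements it internally; the only pairing the post states is a Claude orchestrator with a GPT-5.4 reviewer.

```python
from typing import Optional


def reviewer_for(orchestrator: str) -> Optional[str]:
    """Pick a cross-family reviewer for the given orchestrator model.

    Hypothetical helper: it only encodes the single pairing mentioned
    in the post (Claude orchestrator -> GPT-5.4 reviewer).
    """
    if orchestrator.lower().startswith("claude"):
        return "GPT-5.4"
    # Other pairings are not described in the post, so no reviewer is assumed.
    return None


if __name__ == "__main__":
    print(reviewer_for("claude-sonnet"))     # illustrative model name
    print(reviewer_for("some-other-model"))  # illustrative model name
```

The point of the sketch is the asymmetry: the reviewer is chosen because it is from a different family, not because it is a stronger model.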

Rubber Duck's output is deliberately narrow — a short, prioritised list of concerns: missed edge cases, questionable assumptions, and architectural issues the primary agent didn't surface.

When Copilot invokes Rubber Duck automatically

You can invoke Rubber Duck yourself, but Copilot can also invoke it proactively at three specific checkpoints:

  1. After drafting a plan — before implementation begins. This is the highest-value intervention point: a flawed assumption here compounds into every subsequent step.
  2. After a complex implementation — a second pass on non-trivial code before you commit to the direction.
  3. After writing tests, before running them — catching gaps in coverage or incorrect assertions before they self-validate.

Copilot can also trigger Rubber Duck reactively if it gets stuck in a loop and can't make forward progress.
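The trigger logic above can be pictured as a small decision function. This is a toy model for illustration only: the event names, the `repeated_failures` counter, and the `stuck_threshold` value are all assumptions, since the post does not say how Copilot detects a checkpoint or decides that it is stuck.

```python
from enum import Enum, auto


class Event(Enum):
    """Hypothetical agent events; the names are illustrative only."""
    PLAN_DRAFTED = auto()
    COMPLEX_IMPLEMENTATION_DONE = auto()
    TESTS_WRITTEN = auto()
    SIMPLE_EDIT_DONE = auto()


# The three proactive checkpoints listed above.
PROACTIVE_CHECKPOINTS = {
    Event.PLAN_DRAFTED,
    Event.COMPLEX_IMPLEMENTATION_DONE,
    Event.TESTS_WRITTEN,
}


def should_invoke_reviewer(
    event: Event,
    repeated_failures: int = 0,
    stuck_threshold: int = 3,  # assumed value, not taken from the post
) -> bool:
    """Return True when this toy model would request a critique."""
    if event in PROACTIVE_CHECKPOINTS:
        return True
    # Reactive trigger: the agent keeps failing without forward progress.
    return repeated_failures >= stuck_threshold


if __name__ == "__main__":
    print(should_invoke_reviewer(Event.PLAN_DRAFTED))
    print(should_invoke_reviewer(Event.SIMPLE_EDIT_DONE))
    print(should_invoke_reviewer(Event.SIMPLE_EDIT_DONE, repeated_failures=3))
```

Note that a simple edit triggers no review unless the agent is stuck, which matches the intent that the reviewer is invoked sparingly.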

Triggering it yourself

You can request a Rubber Duck critique at any point in your session. Just tell Copilot to critique its current work. It will invoke Rubber Duck, reason over the feedback, and surface a diff of what changed and why. This is useful before you commit to a plan you're not fully confident in, or after a long agentic run you want double-checked before merging.

How to enable it

Rubber Duck lives in experimental mode. To access it, run the /experimental slash command inside a Copilot CLI session:

# Enable experimental features
/experimental

Today it is available when you select any Claude model from the model picker and have access to GPT-5.4. You'll see critiques surface inline, either automatically at the checkpoints above or on demand when you ask.

When it won't help much

Rubber Duck is optimised for complex, multi-step, multi-file work. For short, well-scoped tasks — a single-function change, a straightforward refactor within one file — the overhead of a cross-model review adds latency without proportional benefit. The agent is designed to invoke it sparingly for exactly this reason.

Practical workflow recommendation

For high-stakes agentic tasks, a reasonable pattern is:

  1. Let Copilot draft the plan.
  2. Before approving the plan, explicitly ask for a Rubber Duck critique — even if Copilot didn't trigger one automatically. This is the cheapest point to catch structural issues.
  3. Proceed with implementation.
  4. After implementing anything that touches 3+ files or involves shared state, request another critique before running tests.

The on-demand trigger gives you full control over when cross-model review runs, regardless of what Copilot decides automatically.
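The workflow above boils down to a simple rule of thumb, sketched here as a checklist helper. The function name `critique_points` and its parameters are hypothetical; it models my recommendation, not any Copilot CLI setting.

```python
from typing import List


def critique_points(files_touched: int, touches_shared_state: bool) -> List[str]:
    """List the moments where an on-demand critique is worth requesting.

    Hypothetical helper encoding the workflow above: always review the
    plan, and review the implementation when the change touches 3+ files
    or involves shared state.
    """
    points = ["before approving the plan"]
    if files_touched >= 3 or touches_shared_state:
        points.append("after implementation, before running tests")
    return points


if __name__ == "__main__":
    # A single-file change with no shared state: only the plan gets reviewed.
    print(critique_points(files_touched=1, touches_shared_state=False))
    # A five-file change: the implementation gets a second critique too.
    print(critique_points(files_touched=5, touches_shared_state=False))
```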

More information

Rubber Duck Debugging

GitHub Copilot CLI combines model families for a second opinion - The GitHub Blog
