In 2012(!) I blogged about rubber duck debugging as a way to troubleshoot issues. It's a practice I still use today. With GitHub Copilot CLI's newest update, the rubber duck is becoming AI-driven: Copilot introduces a second AI model from a different family to critique your agent's plans and implementations at the moments where feedback has the highest return.
Let's have a look at it in more detail...
What Rubber Duck actually does
Rubber Duck is not a general-purpose chat assistant. It's a review agent that runs on a model from a different AI family than the one you've selected as your orchestrator.
When you're running a Claude model as your orchestrator, Rubber Duck runs on GPT-5.4. This is intentional: a model reviewing its own output is still bounded by its own training biases. A cross-family reviewer brings genuinely different blind spots to the table.
Rubber Duck's output is deliberately narrow — a short, prioritised list of concerns: missed edge cases, questionable assumptions, and architectural issues the primary agent didn't surface.
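To make the cross-family idea concrete, here is a minimal sketch of how a reviewer could be chosen from the orchestrator's family. It is an illustration only, not Copilot's actual implementation: the `REVIEWER_FOR_FAMILY` table and the `pick_reviewer` function are hypothetical names, and the only pairing taken from this post is Claude orchestrator to GPT-5.4 reviewer.

```python
from typing import Optional

# Hypothetical lookup table. The only pairing stated in this post is
# "Claude orchestrator -> GPT-5.4 reviewer"; nothing else is assumed.
REVIEWER_FOR_FAMILY = {"claude": "GPT-5.4"}


def pick_reviewer(orchestrator: str) -> Optional[str]:
    """Return a reviewer model from a different family, or None if no
    cross-family pairing is known for this orchestrator."""
    name = orchestrator.strip().lower()
    for family, reviewer in REVIEWER_FOR_FAMILY.items():
        if name.startswith(family):
            return reviewer
    return None
```

Returning `None` for unknown families mirrors the point above: a same-family review is not treated as a substitute for a cross-family one.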
When Copilot invokes Rubber Duck automatically
You can invoke Rubber Duck yourself, but Copilot can also invoke it proactively at three specific checkpoints:
- After drafting a plan — before implementation begins. This is the highest-value intervention point: a flawed assumption here compounds into every subsequent step.
- After a complex implementation — a second pass on non-trivial code before you commit to the direction.
- After writing tests, before running them — catching gaps in coverage or incorrect assertions before they self-validate.
Copilot can also trigger Rubber Duck reactively if it gets stuck in a loop and can't make forward progress.
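The triggers above can be summarised as a small decision table. The sketch below is my own mental model, not code from Copilot: the `Checkpoint` enum and `review_kind` function are hypothetical names used purely for illustration.

```python
from enum import Enum, auto


class Checkpoint(Enum):
    """Moments at which a critique may run (names are illustrative)."""
    PLAN_DRAFTED = auto()             # after drafting a plan
    COMPLEX_IMPLEMENTATION = auto()   # after a non-trivial implementation
    TESTS_WRITTEN = auto()            # tests written but not yet run
    STUCK_IN_LOOP = auto()            # agent cannot make forward progress


# The three proactive checkpoints described in this post.
PROACTIVE = {
    Checkpoint.PLAN_DRAFTED,
    Checkpoint.COMPLEX_IMPLEMENTATION,
    Checkpoint.TESTS_WRITTEN,
}


def review_kind(checkpoint: Checkpoint, user_requested: bool = False) -> str:
    """Classify why a critique would run at this checkpoint."""
    if user_requested:
        return "on-demand"
    if checkpoint in PROACTIVE:
        return "proactive"
    return "reactive"
```

For example, `review_kind(Checkpoint.STUCK_IN_LOOP)` yields `"reactive"`, while any checkpoint combined with `user_requested=True` yields `"on-demand"`.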
Triggering it yourself
You can request a Rubber Duck critique at any point in your session. Just tell Copilot to critique its current work. It will invoke Rubber Duck, reason over the feedback, and surface a diff of what changed and why. This is useful before you commit to a plan you're not fully confident in, or after a long agentic run you want double-checked before merging.
How to enable it
Rubber Duck lives in experimental mode. To access it:
# Enable experimental features
/experimental
Today it works when you select any Claude model from the model picker and have access to GPT-5.4. You'll see critiques surface inline, either automatically at the checkpoints above or on demand when you ask.
When it won't help much
Rubber Duck is optimised for complex, multi-step, multi-file work. For short, well-scoped tasks — a single-function change, a straightforward refactor within one file — the overhead of a cross-model review adds latency without proportional benefit. The agent is designed to invoke it sparingly for exactly this reason.
Practical workflow recommendation
For high-stakes agentic tasks, a reasonable pattern is:
- Let Copilot draft the plan.
- Before approving the plan, explicitly ask for a Rubber Duck critique — even if Copilot didn't trigger one automatically. This is the cheapest point to catch structural issues.
- Proceed with implementation.
- After implementing anything that touches 3+ files or involves shared state, request another critique before running tests.
The on-demand trigger gives you full control over when cross-model review runs, regardless of what Copilot decides automatically.
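If you want to turn this workflow into a personal rule of thumb, it might look like the sketch below. Everything here is an assumption for illustration: `ChangeSummary` and `should_request_critique` are hypothetical names, and the thresholds simply restate the 3+ files and shared-state guidance from the list above. Nothing in it calls Copilot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeSummary:
    """Rough description of an implementation step (illustrative only)."""
    files_touched: int
    touches_shared_state: bool


def should_request_critique(stage: str,
                            change: Optional[ChangeSummary] = None) -> bool:
    """Decide whether to ask for a critique manually on a high-stakes task.

    stage is "plan" (before approving the plan) or "implementation"
    (after implementing, before running tests).
    """
    if stage == "plan":
        # Cheapest point to catch structural issues: always ask.
        return True
    if stage == "implementation" and change is not None:
        return change.files_touched >= 3 or change.touches_shared_state
    return False
```

A single-file change with no shared state, such as `ChangeSummary(files_touched=1, touches_shared_state=False)`, returns `False` at the implementation stage, which matches the earlier point that small, well-scoped tasks gain little from a cross-model review.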
More information
GitHub Copilot CLI combines model families for a second opinion - The GitHub Blog