Talk Intermediate MIT License

Kimchi: Building an Open-Source Multi-Model Coding Agent from Scratch

Session Description

Most coding agents are a black box: one model, one loop, no visibility into what's happening or why. Kimchi is our attempt to build one in the open — a CLI agent harness that runs multiple models simultaneously and routes work based on what each model is actually good at.

The architecture is straightforward once you see it. There are five model roles: orchestrator, planner, builder, reviewer, and explorer. Each role gets the model suited to it: reasoning-heavy tasks go to Kimi K2, repetitive coding work goes to MiniMax M2, and codebase exploration goes to Nemotron. The user doesn't have to think about any of this; a classifier picks the tier based on the message and spends under 100 tokens on the decision.
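
To make the routing idea concrete, here is a minimal sketch in Python. It is not Kimchi's actual classifier, whose internals are not described here. The keyword hints, the `classify` function, and the model identifier strings are all assumptions made for illustration; only the role names and the role-to-model pairings come from the description above.

```python
from dataclasses import dataclass

# Role names and pairings follow the session description. The identifier
# strings are placeholders, not Kimchi's real configuration values.
ROLE_MODELS = {
    "planner": "kimi-k2",      # reasoning-heavy work
    "builder": "minimax-m2",   # repetitive coding work
    "explorer": "nemotron",    # codebase exploration
}

# Hypothetical keyword hints standing in for the real classifier.
ROLE_HINTS = {
    "explorer": ("where", "find", "search", "which file"),
    "planner": ("design", "plan", "architecture", "why"),
    "builder": ("implement", "add", "rename", "fix", "write"),
}


@dataclass(frozen=True)
class Route:
    role: str
    model: str


def classify(message: str, default_role: str = "builder") -> Route:
    """Pick a role by counting hint substrings; fall back when nothing matches."""
    text = message.lower()
    scores = {
        role: sum(hint in text for hint in hints)
        for role, hints in ROLE_HINTS.items()
    }
    best = max(scores, key=scores.get)
    role = best if scores[best] > 0 else default_role
    return Route(role, ROLE_MODELS[role])


print(classify("Where is the session store defined?"))
print(classify("Plan the architecture for auth"))
```

Substring matching is deliberately crude (it would match "add" inside "address", for example), which is the kind of place where routing heuristics break down.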

Subagents are the part I find most interesting to talk about. When you ask Kimchi to add auth to your app, it doesn't do it in one big context window. It spawns an @explore subagent to read your codebase, hands off to a planner, then delegates implementation via task() to a builder. Each subagent keeps its own session as JSONL on disk — recoverable if something crashes halfway through. When a model hits 85% of its context limit, the system upgrades mid-conversation to a larger model in the same tier. No manual intervention.
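
The persistence and overflow behavior can be sketched roughly as follows. This is an illustration under stated assumptions, not Kimchi's code: the event shape, the function names, and the `BUILDER_TIER` ladder with its context limits are hypothetical. Only the JSONL-on-disk format and the 85% threshold come from the description above.

```python
import json
import tempfile
from pathlib import Path

UPGRADE_THRESHOLD = 0.85  # from the description: upgrade at 85% of the context limit

# Hypothetical tier ladder: (model name, context limit in tokens), smallest first.
BUILDER_TIER = [("builder-small", 32_000), ("builder-large", 128_000)]


def append_event(session_file: Path, event: dict) -> None:
    """Append one event as a single JSON line."""
    with session_file.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def load_session(session_file: Path) -> list[dict]:
    """Replay a session, skipping any line that fails to parse
    (for example, one cut off by a crash mid-write)."""
    events: list[dict] = []
    if not session_file.exists():
        return events
    for line in session_file.read_text(encoding="utf-8").splitlines():
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            continue
    return events


def pick_model(tier: list[tuple[str, int]], current: str, tokens_used: int) -> str:
    """Return the current model, or the next larger one in the tier once
    usage reaches the threshold. Stays put if already on the largest."""
    names = [name for name, _ in tier]
    idx = names.index(current)
    limit = tier[idx][1]
    if tokens_used >= UPGRADE_THRESHOLD * limit and idx + 1 < len(tier):
        return tier[idx + 1][0]
    return current


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        session = Path(tmp) / "explore.jsonl"
        append_event(session, {"role": "explorer", "content": "read src/auth"})
        print(load_session(session))
    print(pick_model(BUILDER_TIER, "builder-small", 28_000))
```

Appending one line per event means recovery is just a replay of the file, and a damaged final line costs one event rather than the whole session.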

We also built a migration tool for people switching from Claude Code, Cursor, or OpenCode — it pulls your existing MCP server configs over to ~/.config/kimchi/harness/mcp.json in one command. And custom bash hooks let you intercept any shell command the agent tries to run, rewrite it, or block it entirely.
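
The hook idea (intercept, rewrite, or block) can be modeled as a chain of small functions. The sketch below is in Python purely to show the control flow. Kimchi's hooks are bash hooks, and their real interface is not shown here, so the `Decision` type, both hook functions, and `run_hooks` are hypothetical.

```python
import shlex
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Decision:
    action: str       # "allow", "rewrite", or "block"
    command: str      # the command to run (possibly rewritten)
    reason: str = ""


Hook = Callable[[str], Optional[Decision]]


def block_recursive_delete(command: str) -> Optional[Decision]:
    """Block `rm` when called with a recursive flag such as -r, -rf, or --recursive."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: refuse rather than guess at the intent.
        return Decision("block", command, "could not parse command")
    if tokens[:1] != ["rm"]:
        return None
    for tok in tokens[1:]:
        short_flag = tok.startswith("-") and not tok.startswith("--")
        if tok == "--recursive" or (short_flag and "r" in tok.lower()):
            return Decision("block", command, "recursive delete is not allowed")
    return None


def rewrite_npm_install(command: str) -> Optional[Decision]:
    """Rewrite a bare `npm install` to `npm ci`."""
    if command.strip() == "npm install":
        return Decision("rewrite", "npm ci", "use the lockfile")
    return None


def run_hooks(command: str, hooks: list[Hook]) -> Decision:
    """The first hook that returns a decision wins; otherwise allow unchanged."""
    for hook in hooks:
        decision = hook(command)
        if decision is not None:
            return decision
    return Decision("allow", command)


HOOKS: list[Hook] = [block_recursive_delete, rewrite_npm_install]

if __name__ == "__main__":
    for cmd in ("rm -rf build", "npm install", "ls -la"):
        print(run_hooks(cmd, HOOKS))
```

The point of a hook over a prompt instruction is that the check runs on the actual command string, so it holds regardless of what the model was told or decided.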

This talk is the story of building it: what we got wrong early (single model with clever prompting doesn't scale), what changed our minds (the cost-per-task numbers once you route properly), and what's still messy (capability negotiation across models when they hit the same MCP server). The whole thing is on GitHub under MIT, so you can clone it during the talk.

Key Takeaways
  • How multi-model routing works in practice: the classifier design, model role assignments, and where the heuristics break down

  • How subagent sessions work — spawning, context overflow handling, JSONL persistence, and task() delegation

  • How MCP servers are shared across agents without re-initializing connections each time

  • What bash hooks give you that prompt engineering doesn't — and how to write them

  • How to fork or extend Kimchi for your own agent workflow (agent discovery, custom model roles, hooks)

Session Categories

Introducing a FOSS project or a new version of a popular project
Talk License: MIT License
Which track are you applying for?
Main track

Speakers

Kunal Das Developer Advocate APAC | CAST AI

Kunal Das is a Developer Advocate at CAST AI, where he spends most of his time on cloud cost optimization and, more recently, building AI agents that don't cost a fortune to run. He's one of the people behind Kimchi, an open-source multi-model coding agent that routes work across different LLMs based on what each is actually good at, rather than just throwing everything at the most expensive one.

He organizes CNCF community chapters in Mumbai and Kolkata and runs the HashiCorp User Group Bangalore. He's spoken at FOSSASIA, ArgoCon, Observability Summit, and more CNCF meetups than he can count.

Outside work: badminton and too many photos.

Kunal Das
https://heylink.me/kunaldas
