Talk Intermediate Apache-2.0, MIT License First Talk

Token Economy: Doing More with Fewer Tokens

Approved

Session Description

Token Economy: Doing More with Fewer Tokens looks at why token usage drives the cost, speed, and quality of your LLM interactions, and what to do about it. We’ll explore three recent open-source tools that cut waste from different angles: what the agent reads, what it writes, and how it writes it. Practical, tool-focused, and immediately applicable.

Key Takeaways

Tokens are the core resource of any LLM workflow - they decide what you pay, how long you wait, and how well the model performs.
Token waste lives on two sides: what the agent reads and what it writes (over-engineered output).
Three open-source tools, three angles: one to compress what the agent reads, one to shrink CLI output losslessly, and one to constrain the agent to write minimal code.
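
To make the "read side" of token waste concrete, here is a minimal sketch of lossless compaction of command-line output. It is an illustration only and does not reflect how any of the referenced tools work. The function names (collapse_repeats, render, expand, rough_tokens) and the sample build log are hypothetical, and the whitespace word count is a crude stand-in for a real tokenizer.

```python
from itertools import groupby


def collapse_repeats(text: str) -> list[tuple[str, int]]:
    """Run-length encode consecutive identical lines as (line, count) pairs."""
    return [(line, sum(1 for _ in group)) for line, group in groupby(text.split("\n"))]


def render(pairs: list[tuple[str, int]]) -> str:
    """Compact view for the agent: repeated lines carry a count suffix."""
    return "\n".join(line if n == 1 else f"{line}  [x{n}]" for line, n in pairs)


def expand(pairs: list[tuple[str, int]]) -> str:
    """Rebuild the original text from the pairs, showing nothing was lost."""
    return "\n".join(line for line, n in pairs for _ in range(n))


def rough_tokens(text: str) -> int:
    """Crude proxy (assumption): whitespace-separated words, not real tokens."""
    return len(text.split())


# Hypothetical build log with a warning repeated four times.
raw = "\n".join(
    ["Compiling app v0.1.0"] + ["warning: unused variable"] * 4 + ["Finished build"]
)
pairs = collapse_repeats(raw)
compact = render(pairs)

print(compact)
print(rough_tokens(raw), "->", rough_tokens(compact))
```

The (line, count) pairs can be expanded back to the exact original, which is what makes this kind of compaction lossless, while the agent only reads the shorter rendered view.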

References

https://github.com/DietrichGebert/ponytail

https://github.com/chopratejas/headroom

https://github.com/rtk-ai/rtk

Session Categories

Engineering practice - productivity, debugging

Talk License: Apache-2.0, MIT License

Speakers

Vedansh Sharma SDE 3 | InMobi

Senior Software Engineer at InMobi, a global ad tech company, where I build high-scale backend systems and AI-driven engineering workflows using LLMs and agentic systems. Previously, I spent three years at Amazon working on Alexa's multimodal platform. I hold a B.Tech from NIT Kurukshetra with 5+ years of industry experience.