Lightning Talk Intermediate

Why Labs Fear Distillation and how Smaller AI Models Are Built in the Open-Source World

Approved
Akilesh K R
Session Description

Recently, big AI labs have started talking publicly about distillation attacks, which shows how strategically important model distillation has become. But most developers still don't have a clear picture of what distillation actually is or why it matters so much. I will break down knowledge distillation in plain terms: teacher models, student models, soft targets, and how useful behaviour gets transferred from a larger model into a smaller one. DeepSeek-R1 is probably the best real-world example of this: an open-source project that distilled a 671B model's reasoning into models as small as 1.5B and released everything under the MIT license. That is what open-source distillation looks like in practice.
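The core of the teacher/student idea can be sketched in a few lines. This is a minimal illustration of soft targets, not any lab's actual pipeline: the student is trained against the teacher's temperature-softened output distribution rather than hard labels, and the loss is the KL divergence between the two distributions. All names here are illustrative; real distillation setups use a deep-learning framework's built-in loss functions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: higher T softens the distribution,
    exposing the teacher's 'dark knowledge' about non-top classes."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL divergence KL(teacher || student) over soft targets.
    The student is pushed to match the teacher's full output
    distribution, not just its argmax label."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A confident teacher vs. a flat, untrained student: the loss is
# positive, and it vanishes when the student matches the teacher.
teacher = [8.0, 2.0, 0.5]
student = [4.0, 3.0, 2.0]
print(distillation_loss(teacher, student))       # positive
print(distillation_loss(teacher, teacher))       # ~0.0
```

In practice this KL term is usually mixed with an ordinary cross-entropy loss on ground-truth labels, but the soft-target term above is what makes distillation different from standard training.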

I will also walk through the open-source tooling that makes this possible today — tools you can actually experiment with yourself without needing frontier-scale infrastructure. I will cover where distillation works well and where it falls flat, and why evaluation matters just as much as training: benchmarks can be misleading and don't always tell the full story.

Finally, I plan to address the bigger picture: why this matters for FOSS. If smaller, capable models can be built and studied in the open, the barrier to serious AI experimentation becomes much lower for everyone. And in a world where anyone can distill a model, are the open-source models we rely on actually safe? I will touch on what that means for the FOSS community going forward.

Key Takeaways
  1. What is distillation?

  2. Is it theft or a legitimate technique?

  3. What does this mean for the FOSS/open-source world?

Session Categories

Technology architecture
Technology / FOSS licenses, policy

Reviews

The proposal makes sense as a lightning talk, and the proposer clarified how it fits into a FOSS conference.

Reviewer #1 Approved