Large Language Models are no longer limited to closed platforms or proprietary APIs. The open-weight ecosystem has evolved to a point where anyone can build, fine-tune, and deploy their own model while maintaining full control over data and infrastructure. In the current landscape, the rise of large-scale LLM distribution in India through platforms like Gemini Student Offer, Gemini Jio Offer, OpenAI offering ChatGPT Go for Free, and Perplexity Airtel partnership shows how centralized systems increasingly use user data to train and refine their models. This trend highlights the growing importance of owning and operating local, privacy-focused systems that keep data under user control while offering the similar capabilities.
This talk explores the current state of open-weight LLMs and how to use them effectively in real-world applications. It covers major open releases from Meta (Llama 4), Microsoft (Phi-4), Google (Gemma 3), IBM (Granite 4.0), Alibaba (Qwen), and Zhipu (GLM).
The session introduces a hands-on view of running models locally using frameworks such as vLLM, Llama.cpp, and Ollama. These tools make it possible to run powerful open-weight models efficiently on a variety of hardware, including CPUs, GPUs, and Apple MLX devices.
Tooling will be a strong focus, with examples of how developers can use open tools like OpenCode and RooCode to improve developer productivity, and leverage Unsloth for training and fine-tuning models for specific use cases without closed dependencies. The talk will also cover tool calling and how integrating LLMs with external tools, APIs, or automation workflows (through n8n) can enrich data and enable more capable, context-aware systems.
A segment of the talk will focus on benchmarking and evaluation, examining how open-weight models perform against closed ones and how the performance gap has narrowed to a level where open models can now handle most daily tasks effectively. The session will also discuss small language models (SLMs) and their growing importance for privacy-preserving, efficient, and edge-focused deployments.
The goal is to give attendees a clear technical understanding of the open-weight LLM ecosystem, associated tooling, and fine-tuning workflows. By the end of the talk, the audience will know how to set up an open LLM environment locally, fine-tune it for their data, integrate it with automation pipelines, and evaluate it systematically. The session is designed for engineers and practitioners who want to move beyond closed APIs and build their own privacy-preserving, open, and reproducible air-gapped AI systems.