Modern AI feels powerful, yet distant — locked behind cloud APIs, GPU servers, and rising infrastructure costs. For open-source developers, this raises an important question: can AI remain truly open, private, and accessible?
This lightning talk introduces WebLLM, an open-source project that enables large language models to run directly inside the web browser using modern web technologies like WebGPU and WebAssembly. No backend servers, no API keys, and no data leaving the user’s device.
Through simple explanations and real-world developer use cases, this session will show how WebLLM represents a local-first, privacy-preserving shift in how we think about deploying AI on the web. The talk is intentionally practical — aimed at web, frontend, backend, and full-stack developers curious about experimenting with AI without cloud dependency.
This is not about replacing cloud AI — it’s about expanding the open-source AI toolbox and exploring what becomes possible when AI runs where the user already is: the browser.
Why running AI inside the browser matters for open-source ecosystems
How WebLLM leverages WebGPU and WebAssembly for on-device inference
Privacy-first AI: what changes when data never leaves the user’s machine
Cost-free AI experiments — removing cloud infra and API barriers
Realistic use cases: offline tools, education, internal utilities, extensions
How developers can start experimenting with WebLLM today
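To make "start experimenting today" concrete, a first experiment might look like the sketch below. It assumes the `@mlc-ai/web-llm` npm package and a WebGPU-capable browser; the model id shown is one of the project's prebuilt options and is used here only as an example — check the WebLLM docs for the current list.

```typescript
// Minimal WebLLM sketch: load a model in the browser and run one chat turn.
// Requires a WebGPU-capable browser; the model weights are downloaded and
// cached locally on first use — no backend server, no API key.
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Example prebuilt model id (assumption — consult the WebLLM model list).
  const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");

  // The API mirrors the familiar OpenAI-style chat-completions shape,
  // but inference happens entirely on the user's device.
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "Summarize WebLLM in one sentence." }],
  });

  console.log(reply.choices[0]?.message.content);
}

main();
```

Because everything runs client-side, this same snippet works offline once the model is cached — which is exactly the property the use cases above (offline tools, extensions, internal utilities) rely on.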
A clear understanding of local-first AI and why it’s an important direction for FOSS
Awareness that AI is no longer restricted to cloud or server-side environments
Practical insight into when browser-based LLMs make sense — and when they don’t
Motivation to explore WebLLM as an experimental tool, not a black-box solution
Inspiration to rethink how AI can be embedded into everyday web apps responsibly
I would approve this only if the speaker has actual experience contributing to the project (or at least substantial hands-on usage). The proposal does not make that case.