Talk
Intermediate

The Evolution of On-Device AI

Rejected

Session Description

Introduction

On-device AI is becoming increasingly important as it provides increased privacy, security, performance and personalization while lowering costs and energy consumption compared to cloud-based AI. Some key milestones in the history of on-device AI include:

  1. Excel formula evaluators as an early example of on-device machine solvers
  2. Features like background blurring and noise cancellation in conferencing applications like Google Meet, powered by MediaPipe

The exponential growth in client-side compute capabilities, particularly GPUs, is making on-device AI feasible even for large language models. Currently, on-device AI is being integrated at the system layer in products like:

  1. Browsers like Chrome with built-in AI capabilities
  2. Firefox with alt-text generation for images
  3. Apple Intelligence stack across Apple devices ecosystem

Apple Intelligence uses techniques like LoRA (Low-Rank Adaptation) that enable real-time training and personalization of AI models on-device, with a similar feature set planned for Chrome. WebAssembly and WebGPU have emerged as frontrunners for deploying cross-platform ML solutions on browser tech, with popular libraries like Transformers.js leveraging the ONNX ecosystem to target these backends.
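To make the LoRA idea concrete, here is a minimal pure-Python sketch (not Apple's or Chrome's implementation; all names and dimensions are illustrative). Instead of updating a full weight matrix W of shape d_out × d_in, LoRA freezes W and trains two small matrices, A (r × d_in) and B (d_out × r), so the effective weight is W + B·A. Only r·(d_in + d_out) parameters are trained instead of d_in·d_out, which is what makes on-device adaptation cheap.

```python
# Minimal LoRA sketch: the adapted weight is W + B @ A,
# where A and B are low-rank (rank r) and W stays frozen.
# The usual alpha/r scaling factor is omitted for brevity.

def matmul(X, Y):
    """Naive matrix multiply over nested lists."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_weight(W, A, B):
    """Effective weight: frozen base W plus low-rank update B @ A."""
    delta = matmul(B, A)
    return [[W[i][j] + delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

d_in, d_out, r = 4, 3, 1            # a rank-1 adapter for a 3x4 layer
W = [[0.0] * d_in for _ in range(d_out)]  # frozen base weights (zeros here)
A = [[1.0, 2.0, 3.0, 4.0]]                # r x d_in, trainable
B = [[1.0], [0.0], [-1.0]]                # d_out x r, trainable

W_eff = lora_weight(W, A, B)
print(W_eff[0])  # [1.0, 2.0, 3.0, 4.0]
```

With r = 1 this adapter trains 7 numbers instead of the layer's 12; at realistic sizes (e.g. 4096 × 4096 layers, r = 8) the saving is several orders of magnitude, and adapters can be swapped at runtime to personalize a shared base model.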

Demos

To showcase the potential of on-device AI today, here are some prototype demos:

  1. Ollama: A DX-focused wrapper around llama.cpp for running a variety of models on-device
  2. Ratchet: A web-first ML framework that runs models in the browser with WebGPU
  3. Luminal: A deep learning library that uses composable compilers for extensible optimisations

Conclusion

On-device AI requires rethinking operating systems and computational architectures. Some key shifts include:

  1. A transition from apps that capture user attention to agents that understand user intent
  2. The local user agent should be at the center, with the cloud feeling like an extension of the computer (through technologies like cloud enclaves), not vice versa; agents need to reference data across workflows, which is not possible in the present single-origin paradigm

We at Tiles Research are on a mission to advance the communication of human intent with machines and design a more natural way of working. We're developing an intelligence-age operating system built with Rust, WebAssembly, and WebGPU. To build such a novel system, we are actively researching at the intersection of:

  1. Efficient ML: Making on-device inference and training fast, with a particular focus on action models
  2. Portable OS: Shifting OS responsibilities like resource management into the compiler
  3. Distributed Systems: Enabling collaboration by default and software longevity by applying local-first techniques

The future of AI is mediated through an on-device, personalized and private intent router that acts on behalf of the user. An exciting road lies ahead as we architect new intelligence systems that augment the way we do work.

Session Categories

FOSS

Speakers

Riya Bisht
Open-Source Developer, CERN-HSF
