Talk

Intermediate

Beyond the smokescreen: interactive in-browser FOSS tools for science communication and evidence-based environmental policy

Review Pending

Session Description

Introduction

Every winter in New Delhi, India, air quality index (AQI) levels spike, and Indian media outlets often blame farmers burning stubble from the nearby states of Punjab and Haryana. Politicians promise action, while policies target farmers, and real problems remain hidden from public view. However, research data tells an entirely different story: despite a 71.2% drop in stubble burning incidents from October to December 2024, Delhi's PM2.5 levels rose 3.4% to 104.7 µg/m³, which is 2.6 times the national standard. Meanwhile, over 50% of Delhi's pollution comes from vehicular emissions, with stubble burning having contributed only 0.92% on average.

This gap between scientific data and public narratives affects policy decisions, farmers' livelihoods and wellbeing, and the lives of millions in the National Capital Region (NCR). The consequences extend beyond air quality – when evidence-based policymaking fails, resources are misdirected, and vulnerable communities suffer disproportionately.

At the same time, this problem is just one instance of many classic failures in science communication that continue to erode public trust in scientific institutions.

In this talk, we will demonstrate use cases for interactive browser-based FOSS projects built on WebAssembly, particularly Pyodide and JupyterLite, to: (i) democratise access to the computational resources needed to run scientific experiments on research data, (ii) establish a means of enabling evidence-based public discourse, and (iii) provide frameworks to build upon for transparent, reproducible science communication initiatives that serve democratic forms of governance.

The science communication challenge in India

Recent research reveals a paradox in Indian science communication. A first-of-its-kind survey of 259 senior Indian scientists from national science academies found that more than three-quarters personally enjoy science communication and feel confident and well-equipped to communicate their research. Yet formal public engagement activities remain surprisingly low across the country, which suggests that the barriers to building scientific temper are structural (pedagogical, sociopolitical, and monetary) rather than a matter of reluctance at the individual level.

This disconnect is especially consequential in environmental policy debates. When complex datasets about air pollution, climate impacts, or agricultural practices need public scrutiny, current communication models fail. Media narratives simplify multifaceted problems into convenient scapegoats, policymakers lack accessible analytical tools, and citizens cannot independently verify claims made in their name.

Eighty-three per cent of the global population believes that scientists should communicate better with the public. In India, where 60% view the country's scientific achievements as above average or world-class, the challenge isn't scientific capability, but rather making that capability accessible for democratic participation.

Technical details: a dive into browser-based scientific computing

Now that we have the necessary background and context about India's science communication arena, we will explore two open-source software projects, Pyodide and JupyterLite, and describe how they can be used to enable scientific computation directly in web browsers, with experiments on geospatial and environmental data at the heart of our use case.

I plan to provide descriptions of the architectures for both Pyodide and JupyterLite. Some brief points are provided as follows:

JupyterLite technical architecture

a serverless Jupyter implementation - the Jupyter protocol running entirely in the browser, with no backend server required; and
service-worker-based kernel management - how browser APIs enable kernel lifecycle management; and
an in-browser file system - IndexedDB for persistent notebook storage; and
WebAssembly integration - how the JavaScript frontend connects to language runtimes that run in WebAssembly; and
customising kernels - extending JupyterLite with kernels for different languages
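To make the "Jupyter protocol in the browser" point concrete, here is a minimal sketch of the kind of message a frontend sends to a kernel when a cell is run. The field layout follows the public Jupyter messaging specification for `execute_request`; the helper name `make_execute_request`, the user name, and the sample code string are illustrative assumptions, and the sketch does not show how JupyterLite itself transports the message inside the browser.

```python
import json
import uuid
from datetime import datetime, timezone


def make_execute_request(code: str, session: str) -> dict:
    """Build a dict shaped like a Jupyter `execute_request` message."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,       # unique per message
            "session": session,               # shared by all messages of one client session
            "username": "demo",               # hypothetical user name
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",                 # messaging protocol version
        },
        "parent_header": {},                  # empty: this message starts a new exchange
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        "buffers": [],
    }


message = make_execute_request("print('hello from the browser')", session=uuid.uuid4().hex)
print(json.dumps(message, indent=2))
```

The kernel's replies carry the request's header as their `parent_header`, which is how a frontend matches outputs to the cell that produced them.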

Pyodide internals

CPython to WebAssembly compilation – the Emscripten toolchain and the runtime considerations it brings; and
the package ecosystem and wheel format – how Python packages are compiled into WebAssembly wheels; and
lockfiles – ensuring reproducible environments with the pyodide-lock.json specification; and
a foreign function interface – JavaScript-Python interoperability and data marshalling; and
performance characteristics – understanding computational limitations and the optimisation strategies available in this context
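As a rough illustration of why lockfiles help reproducibility, the sketch below checks a downloaded wheel against a recorded hash and derives a dependencies-first install order. The data is a hand-written stand-in, not a real pyodide-lock.json: the package names and wheel bytes are made up, and the field names used (`packages`, `file_name`, `sha256`, `depends`) are assumptions that should be verified against the pyodide-lock specification.

```python
import hashlib

# Placeholder bytes standing in for two wheel files (not real wheels).
WHEELS = {
    "exampledep-1.0.0-py3-none-any.whl": b"placeholder bytes for exampledep",
    "examplepkg-2.0.0-py3-none-any.whl": b"placeholder bytes for examplepkg",
}


def _entry(name, version, depends):
    """Build one hypothetical lockfile entry for a package."""
    file_name = f"{name}-{version}-py3-none-any.whl"
    return {
        "name": name,
        "version": version,
        "file_name": file_name,
        "sha256": hashlib.sha256(WHEELS[file_name]).hexdigest(),
        "depends": depends,
    }


# Hypothetical lockfile content with the assumed structure.
LOCK = {
    "packages": {
        "exampledep": _entry("exampledep", "1.0.0", []),
        "examplepkg": _entry("examplepkg", "2.0.0", ["exampledep"]),
    }
}


def wheel_matches_lock(lock, package, wheel_bytes):
    """Return True if the wheel's SHA-256 equals the hash pinned in the lockfile."""
    expected = lock["packages"][package]["sha256"]
    return hashlib.sha256(wheel_bytes).hexdigest() == expected


def install_order(lock, package):
    """List a package and its transitive dependencies, dependencies first."""
    ordered, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dependency in lock["packages"][name]["depends"]:
            visit(dependency)
        ordered.append(name)

    visit(package)
    return ordered


print(install_order(LOCK, "examplepkg"))
print(wheel_matches_lock(LOCK, "examplepkg", WHEELS["examplepkg-2.0.0-py3-none-any.whl"]))
```

Because every version and hash is pinned ahead of time, two browsers loading the same lockfile should end up with the same set of packages, which is the property the talk relies on.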

Advantages of such deployments

static site hosting on GitHub Pages or similar platforms; and
shareable URLs that include data, code, and results; and
offline capabilities: once the initial data has loaded, it is cached in the browser's storage
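One way to picture a shareable URL is a link that carries the code to run as query parameters. The sketch below builds such a link with the standard library. The base URL is a placeholder, the helper name `build_repl_url` is hypothetical, and the `kernel` and `code` parameter names are an assumption modelled on the JupyterLite REPL application's URL options; it covers only the code part, not data or results.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_repl_url(base_url, code_lines, kernel="python"):
    """Return a URL whose query string carries a kernel name and code lines."""
    params = [("kernel", kernel)]
    # One `code` parameter per line keeps each statement separately encoded.
    params += [("code", line) for line in code_lines]
    return f"{base_url}?{urlencode(params)}"


url = build_repl_url(
    "https://example.org/lite/repl/index.html",  # placeholder deployment URL
    ["import numpy as np", "np.arange(5).sum()"],
)
print(url)

# Decoding the query string recovers the original code lines.
print(parse_qs(urlsplit(url).query)["code"])
```

Since the site is static, a link like this needs nothing beyond the hosting platform itself to reproduce an analysis on someone else's machine.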

Why this matters for India's science communication and FOSS ecosystems

My talk aims to show that scientific analysis of environmental data is an ideal use case for India's FOSS adoption goals. Government agencies generate large volumes of ecological datasets through taxpayer funding; however, accessibility and discoverability often remain limited to technical specialists, researchers, or users of proprietary software whose licences restrict use. Browser-based free and open-source software (FOSS) tools can aid research on such data, enabling science to progress while advancing India's objectives of technological sovereignty.

The broader policy implications extend beyond pollution monitoring. India's National FOSS Policy requires technology suppliers to submit open source options in government bids, but implementation has remained inconsistent. Distributors of environmental data commons can showcase FOSS capabilities in high-stakes, public-interest applications in place of proprietary solutions that often create vendor lock-in and limit democratic oversight. Serverless distributions can also significantly reduce the cost of running and scaling large servers for computation.

This framework applies across India's diverse environmental challenges: water quality monitoring in industrial corridors where communities need independent verification capabilities, agricultural impact assessment for climate adaptation planning in different agro-ecological zones, urban heat island analysis for sustainable city planning in rapidly growing metropolitan areas, biodiversity monitoring using citizen science data from India's rich ecological zones, industrial emissions tracking for community health protection in pollution-affected regions, and several other use cases.

Interactive demonstrations: notebook-based experiments and analyses

I will demonstrate the technical capabilities of Pyodide and JupyterLite through a series of short interactive notebooks that run a range of scientific experiments on data, similar to the NumPy tutorials notebook linked in the references section.

Notebook demonstration components

Environment setup - loading JupyterLite with a pre-configured Pyodide kernel and scientific packages
Data acquisition - fetching environmental datasets from public environmental data APIs using networking libraries
Package installation - demonstrating micropip for runtime package management in a WebAssembly environment
Data processing workflows - using pandas, numpy, and scipy for analysing queried data
Visualisation capabilities - matplotlib, plotly, and ipywidgets for interactive data exploration
Geospatial analyses - using folium for mapping pollution sources and meteorological patterns
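The components above could fit together roughly as follows. This is a sketch under stated assumptions rather than the actual demo notebook: the CSV values are synthetic placeholders (not real measurements), the column names are hypothetical, and the processing step uses only the standard library so that it also runs outside the browser, where a notebook would more likely use pandas. The `micropip.install` and `pyodide.http.pyfetch` calls are only reached when running under Pyodide.

```python
import csv
import io
import statistics
import sys
from collections import defaultdict

# Pyodide builds of CPython report the "emscripten" platform.
IN_BROWSER = sys.platform == "emscripten"

# Synthetic placeholder readings for illustration only, NOT real measurements.
SAMPLE_CSV = """station,date,pm25
A,2024-11-01,100.0
A,2024-11-02,120.0
B,2024-11-01,80.0
B,2024-11-02,90.0
"""


async def install_extras(packages):
    """Install extra packages at runtime with micropip (browser only)."""
    if not IN_BROWSER:
        return False
    import micropip  # assumed available in the Pyodide kernel
    await micropip.install(packages)
    return True


async def fetch_text(url):
    """Fetch a text resource through the browser's fetch API via Pyodide."""
    from pyodide.http import pyfetch  # only importable under Pyodide
    response = await pyfetch(url)
    return await response.string()


def mean_pm25_by_station(csv_text):
    """Group readings by station and average the `pm25` column."""
    readings = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        readings[row["station"]].append(float(row["pm25"]))
    return {station: statistics.fmean(values) for station, values in readings.items()}


print(mean_pm25_by_station(SAMPLE_CSV))
```

In a JupyterLite notebook cell, the two coroutines would be called with a top-level `await`, for example `await install_extras(["folium"])`, before passing fetched text to the processing function.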

Technical deep-dive elements that are at play here

Kernel communication - message passing between the frontend and the Pyodide kernel
Performance profiling - how computational performance compares with native Python environments, and the pitfalls of optimising it
Memory usage patterns - browser memory constraints, optimisation strategies, and their practical limitations
Package dependency resolution - exploring how lockfiles ensure reproducible environments across different browser contexts
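A simple way to compare browser and native performance is to run the same small timing harness in both environments and compare the reports. The sketch below is one such harness using only the standard library; the workload is an arbitrary stand-in for an analysis step, and the function names are hypothetical. Browsers may coarsen timer resolution, so the workload is repeated and the best run is kept.

```python
import sys
import time


def workload(n):
    """Pure-Python stand-in for an analysis step: a sum of squares."""
    return sum(i * i for i in range(n))


def profile(func, *args, repeats=5):
    """Time `func` several times and report the fastest run."""
    timings = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        timings.append(time.perf_counter() - start)
    return {
        "platform": sys.platform,       # "emscripten" when run under Pyodide
        "best_seconds": min(timings),   # least-disturbed measurement
        "result": result,               # returned so runs can be cross-checked
    }


report = profile(workload, 100_000)
print(report)
```

Running the identical cell natively and in JupyterLite gives two reports whose `best_seconds` values can be compared directly, while matching `result` values confirm that both environments computed the same thing.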

Talk outline

Here is a (tentative) outline of my talk:

Introduction: environmental data and science communication challenges (~5 minutes)

Delhi pollution narratives versus data reality as a case study
Science communication failures in environmental policy
Overview of browser-based solutions for data accessibility

Project Jupyter and JupyterLite architectures (~5 minutes)

Project Jupyter ecosystem: kernels, frontends, and protocol design
JupyterLite implementation: service workers, in-browser filesystems, and WebAssembly integration
Serverless computing model: advantages and technical constraints
Kernel architecture and lifecycle management in browser environments

Pyodide technical deep-dive (~5 minutes)

CPython to WebAssembly compilation using the Emscripten compiler toolchain
Package ecosystem: WebAssembly wheels, micropip, and dependency management
Lockfile specifications and reproducible environments
JavaScript-Python interoperability and foreign function interfaces
Performance characteristics and memory management considerations

A series of interactive demonstrations: environmental data analyses (~5 minutes)

JupyterLite environment setup and package installation workflows
Environmental dataset acquisition and processing using the Scientific Python stack
Visualisation and analysis capabilities demonstration
Notebook sharing and reproducibility features

Conclusion: implications for scientific computing and democratic access (~2 minutes)

A call to action (CTA) towards creating in-browser computational environments for everyone, for just about anything that pertains to computational data
A brief discussion of community contribution pathways for browser-based scientific computing in the projects discussed

The speaker has a background in open-source scientific software, and has previously conducted policy analyses and outreach on stubble burning in the greater New Delhi area as part of a social impact initiative that focused on Goal 13: Climate Action within the United Nations' Sustainable Development Goals (SDGs).

While this talk addresses a set of unanswered questions surrounding the intersection of tech policy, geopolitics, open data, and science communication, the content is designed to be technically driven and aimed at a scientific audience. It explores scientific computing from a science communication perspective, and the adoption of FOSS in research contexts where proprietary research software continues to be used.

Key Takeaways

With my talk, attendees can gain:

a general idea of how projects like Pyodide and JupyterLite cater to in-browser scientific computing; and
a brief understanding of how both data and its prevalence affect public environmental/ecological storytelling and policy discourse; and
information on community tools (dataset providers, open-source software, and ancillary resources) that can make accessing complex environmental data practical; and
guidance on approaches to supporting transparent, reproducible science communication that steers policy perspectives; and
an understanding of science communication as a democratic tool, along with calls to action for starting web-based civic tech initiatives and research projects using the current state of FOSS tooling.

Which track are you applying for?

FOSS in Science Devroom

Reviews

Approvability: 100%
Approvals: 2
Rejections: 0
Not Sure: 0

Seems like a very well-researched topic that impacts everyone. Curious how you will fit your content into the time allotted. Will need to pare some things down to fit within 25 minutes.

Reviewer #1

Approved

Reviewer #2

Approved