Imagine an AI system making critical decisions—approving loans, screening candidates, or recommending medical treatments. Behind the scenes, this AI learned from vast amounts of data, carefully labeled by humans or gathered from online sources. But what if the data it learned from carried hidden biases? What if those biases reflected the backgrounds, experiences, or unconscious prejudices of the people who labeled it? Without transparency, these ethical blind spots remain invisible, quietly shaping AI outcomes that affect real lives.
This is the problem we face today: AI models often inherit the biases embedded in their training data, yet little is known about how that data was created, who labeled it, or what ethical safeguards were applied. This lack of visibility makes it difficult for developers, auditors, and users to trust AI systems or ensure fairness.
Enter the Open Ethics Data Passport (OEDP), a framework designed to shine a light on the origins of AI datasets. Think of it as a “passport” that travels with every dataset and model, telling the story of where the data came from, how it was collected, cleaned, and annotated, and who the labelers were, including their expertise and potential influences.
In this session, I will walk you through how OEDP provides a standardized, open, and machine-readable documentation system that brings transparency to the dataset layer of AI development. You’ll learn how OEDP captures:
The provenance of datasets: their sources and collection methods
The annotation process: how data was labeled, including guidelines and quality controls
Labeler profiles: who annotated the data, their background, and possible biases
The data cleaning and preparation steps before training
The scope and ethical considerations shaping the dataset’s intended use
I’ll explain how OEDP helps reveal hidden biases, promotes accountability, and enables better, more ethical AI decision-making by making dataset ethics visible and auditable.
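To give a feel for what "machine-readable" could mean here, below is a minimal Python sketch of a record covering the five areas listed above. It is an illustration only: the class names, field names, and the `missing_sections` helper are hypothetical and are not taken from the actual OEDP schema or templates.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class LabelerProfile:
    # Hypothetical fields describing who annotated the data.
    role: str
    expertise: str
    possible_biases: List[str] = field(default_factory=list)


@dataclass
class DataPassport:
    # Hypothetical structure; NOT the real OEDP schema.
    dataset_name: str
    sources: List[str]                 # provenance: where the data came from
    collection_method: str             # provenance: how it was gathered
    annotation_guidelines: str         # annotation process
    quality_controls: List[str]        # annotation process
    labelers: List[LabelerProfile]     # labeler profiles
    cleaning_steps: List[str]          # cleaning and preparation
    intended_use: str                  # scope
    ethical_considerations: List[str]  # ethical considerations

    def to_json(self) -> str:
        """Serialize the passport, including nested labeler profiles."""
        return json.dumps(asdict(self), indent=2)

    def missing_sections(self) -> List[str]:
        """Return the names of fields that were left empty."""
        return [name for name, value in asdict(self).items() if not value]


# Made-up example values, for illustration only.
passport = DataPassport(
    dataset_name="example-reviews",
    sources=["public product reviews"],
    collection_method="web scraping",
    annotation_guidelines="sentiment labeled as positive, negative, or neutral",
    quality_controls=["double annotation"],
    labelers=[LabelerProfile(role="crowdworker", expertise="none required")],
    cleaning_steps=[],
    intended_use="sentiment analysis research",
    ethical_considerations=["reviews may contain personal data"],
)

print(passport.to_json())
print("Undocumented sections:", passport.missing_sections())
```

In this sketch, an auditor could flag a dataset whose cleaning steps were never documented simply by checking which sections come back empty.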

Beyond data documentation, OEDP is part of a broader Open Ethics ecosystem that also covers decision transparency and algorithmic explainability—all open-source and designed to work seamlessly with existing AI governance tools.
Who is this talk for?
AI researchers and data scientists interested in improving transparency and ethical standards in training datasets.
Open data advocates and practitioners working with publicly accessible datasets for responsible AI development.
Policy makers, auditors, and ethics professionals seeking practical tools to assess and govern AI data fairness and accountability.
In the demo, I will showcase how easy it is to create an Open Ethics Data Passport for any open dataset, using the publicly available code and templates on GitHub. You’ll see how dataset publishers and AI developers can adopt OEDP to build trust and comply with emerging ethical standards.
Join me to discover how the Open Ethics Data Passport can change the way we think about open data for AI, moving from invisible bias to visible ethics and empowering the community to build fairer, more transparent AI systems.
3 key takeaways:
Understanding how the Open Ethics Data Passport (OEDP) enhances transparency and accountability in AI dataset documentation.
Practical insights into implementing OEDP to uncover and mitigate biases in training data.
Awareness of the broader Open Ethics framework and its role in promoting responsible, open-source AI governance.
Maybe as a lightning talk? OEDP is quite old now, but it hasn't seen much adoption.