Package metadata is structured information about the origin of code, its license, its dependencies, and any known vulnerabilities or exploits.
This metadata is essential for complying with licenses, managing vulnerabilities, and distributing a Software Bill of Materials (SBOM) for your product.
(Ref: Cyber Resilience Act in the EU, CERT-In guideline CISG-2024-02 in India, and Executive Order 14028 in the USA)
Data about open source projects should be open source. Today, crucial metadata is predominantly available through proprietary solutions whose tools are often closed source, non-transparent, and non-collaborative. It is simply unacceptable, and against open source values and principles, to have the metadata of open source packages and software locked away in proprietary silos. This leads to vendor lock-in and creates significant barriers for micro, small and medium enterprises (MSMEs), neither of which is good for resilient software supply chains.
FederatedCode stores collected and normalised package metadata, generated by AboutCode tools, in a constellation of distributed git repositories. It federates updates to hosts and clients using ActivityPub, the same protocol that underpins Mastodon. The metadata for a package version can be discovered or subscribed to using its Package-URL (PURL). This architecture lets users and tools "watch" the packages they care about and receive updates whenever a new event occurs for a package, such as a new release, updated scan results, or a newly discovered vulnerability.
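As a rough illustration of what "subscribing by PURL" could look like on the wire, the sketch below builds an ActivityStreams 2.0 `Follow` activity aimed at a package actor. The host names, the `/packages/<encoded purl>` URL layout, and the helper names `purl_to_actor_url` and `build_follow` are assumptions made for this example; they are not taken from FederatedCode itself.

```python
import json
from urllib.parse import quote

ACTIVITYSTREAMS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def purl_to_actor_url(host: str, purl: str) -> str:
    """Map a PURL to a hypothetical package actor URL on a host.

    The "/packages/<percent-encoded purl>" layout is an assumption
    for illustration only.
    """
    return f"https://{host}/packages/{quote(purl, safe='')}"


def build_follow(follower_url: str, host: str, purl: str) -> dict:
    """Build an ActivityStreams 2.0 Follow activity targeting a package actor."""
    return {
        "@context": ACTIVITYSTREAMS_CONTEXT,
        "type": "Follow",
        "actor": follower_url,
        "object": purl_to_actor_url(host, purl),
    }


# Example values are made up for the sketch.
activity = build_follow(
    follower_url="https://client.example/users/alice",
    host="federatedcode.example",
    purl="pkg:npm/lodash@4.17.21",
)
print(json.dumps(activity, indent=2))
```

In a real deployment the activity would be delivered to the package actor's inbox and the follower would then receive later activities for that package; that delivery step is left out here.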
The metadata hosts are decentralised and independent, and each can focus on a subset of packages. For example, I could host a FederatedCode instance that federates scan data only for specific npm packages. These instances join the fediverse to share and exchange package metadata and to support discussions among humans and bots, so we are no longer working in silos. If one host goes down, users can simply switch to another FederatedCode instance.
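The idea of scoped hosts with fallback can be sketched as follows. This is a minimal model, assuming each instance declares a package type and an optional set of names; `HostScope`, `pick_host`, and the host names are hypothetical, and the PURL handling is deliberately simplified (no namespace, qualifiers, or subpath).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HostScope:
    """Hypothetical description of which packages an instance federates."""

    purl_type: str
    # An empty set means "every package of this type".
    names: frozenset = field(default_factory=frozenset)

    def covers(self, purl: str) -> bool:
        # Simplified parsing of "pkg:<type>/<name>@<version>".
        if not purl.startswith("pkg:"):
            return False
        ptype, _, rest = purl[len("pkg:"):].partition("/")
        name = rest.partition("@")[0]
        if ptype != self.purl_type:
            return False
        return not self.names or name in self.names


def pick_host(purl, hosts, is_up):
    """Return the first host that covers the PURL and is reachable, else None."""
    for hostname, scope in hosts.items():
        if scope.covers(purl) and is_up(hostname):
            return hostname
    return None


hosts = {
    "a.example": HostScope("npm", frozenset({"lodash", "express"})),
    "b.example": HostScope("npm"),
}

all_up = lambda hostname: True
a_is_down = lambda hostname: hostname != "a.example"

print(pick_host("pkg:npm/lodash@4.17.21", hosts, all_up))
print(pick_host("pkg:npm/lodash@4.17.21", hosts, a_is_down))
```

The second call falls through to the other instance because the first is treated as unreachable, which is the failover behaviour described above.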
The metadata is transparent and verifiable. Scan results are stored in distributed git repositories, and anyone can rerun the scan with the tool specified in the metadata and check that the results are identical. This creates an open, collaborative way of handling package metadata, with democratic access for everyone.
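One way such a check might work is to compare digests of the published and the reproduced scan. The sketch below assumes scan results are JSON objects that are deterministic apart from a few run-specific top-level fields; the field names in `VOLATILE_KEYS`, the example scanner name, and the sample data are all made up for illustration.

```python
import hashlib
import json

# Hypothetical top-level fields that differ between runs and are ignored.
VOLATILE_KEYS = {"start_timestamp", "end_timestamp", "duration"}


def canonical_digest(scan: dict) -> str:
    """Hash a scan result after dropping volatile top-level keys.

    Keys are sorted and whitespace is fixed so that equal content
    always serialises to the same bytes.
    """
    stable = {k: v for k, v in scan.items() if k not in VOLATILE_KEYS}
    payload = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def scans_match(published: dict, reproduced: dict) -> bool:
    """Return True when both scans have the same canonical digest."""
    return canonical_digest(published) == canonical_digest(reproduced)


published = {
    "tool": "example-scanner",
    "tool_version": "1.0.0",
    "purl": "pkg:npm/lodash@4.17.21",
    "licenses": ["MIT"],
    "start_timestamp": "2024-01-01T00:00:00Z",
}
# Same content, different key order and timestamp: should still match.
reproduced = {
    "purl": "pkg:npm/lodash@4.17.21",
    "licenses": ["MIT"],
    "tool_version": "1.0.0",
    "tool": "example-scanner",
    "start_timestamp": "2024-06-01T12:00:00Z",
}
tampered = dict(published, licenses=["GPL-3.0-only"])

print(scans_match(published, reproduced))
print(scans_match(published, tampered))
```

Hashing a canonical form rather than the raw file means harmless differences such as key order do not cause false mismatches, while any change to the reported findings does.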
FederatedCode offers a new, decentralised approach to sharing metadata about open source packages as open source data, democratising access.
ActivityPub can serve as an effective layer for open-data federation, supporting an ecosystem of publishers and consumers for regular data updates.
If there is no space for this talk in the open-data devroom, it should be considered for the main track instead.