What does community-driven data governance look like?

Some idle thoughts for a Friday afternoon. I was just taking a look at Source.Plus, a dataset of public domain images for training foundation models. It's a project of Spawning.ai, which is working to build "data governance for generative AI". I have some thoughts on the tools they're building, but that's not what I'm writing …

Comments on “A data for AI taxonomy”

Jack Hardinges and Elena Simperl recently published a taxonomy to describe the data relevant to AI models and systems. Their goal is to better distinguish between the different types of data relevant to developing, using and monitoring AI models and systems, and thereby add some nuance to …

What datasets have been classified as Digital Public Goods?

Update: 2024-04-14, I've updated this post with some corrections. See below. A couple of years ago I wrote a short series of posts looking at some different approaches for assessing data infrastructure. It includes this post on the Digital Public Goods standard and registry. Digital Public Goods are defined as: open-source software, open data, open …