“AI-Ready Data” is the wrong framing

A paper was published this week by Stefaan Verhulst, Andrew Zahuranec and Hannah Chafetz called "Moving Toward the FAIR-R principles: Advancing AI-Ready Data". The paper sets out to do two things: Make the case that we are in a "Fourth Wave" of open data in which it is critical that data is made useful for …

Falsehoods this programmer believed about energy meters

This is the second part of a post I published earlier this week in which I summarised some things I learned about working with half-hourly energy data. I'll be updating that shortly with a few extra details and clarifications. This post will be a summary of some things I've learned about energy meters and metering. …

What does community-driven data governance look like?

Some idle thoughts for a Friday afternoon. I was just taking a look at Source.Plus, a dataset of public domain images for training foundation models. It's a project of Spawning.ai, which is working to build "data governance for generative AI". I have some thoughts on the tools they're building, but that's not what I'm writing …

Comments on “A data for AI taxonomy”

Jack Hardinges and Elena Simperl recently published a taxonomy to describe the data relevant to AI models and systems. Their goal is to help to better distinguish between the different types of data relevant to developing, using and monitoring AI models and systems, and thereby add some nuance to …

What datasets have been classified as Digital Public Goods?

Update: 2024-04-14, I've updated this post with some corrections. See below. A couple of years ago I wrote a short series of posts looking at some different approaches for assessing data infrastructure. It includes this post on the Digital Public Goods standard and registry. Digital Public Goods are defined as: open-source software, open data, open …

Confused by SOLID

I keep checking in on the Solid project. But I'm baffled by its lack of functionality. I've written up some of my questions.

It takes about 4588 quasars to help you get around and get paid

I love learning about the data infrastructure that shapes the world we live in. Like all good infrastructure it's usually invisible, because it just works. But there's always something interesting to learn if you dig into the detail. For example, a few years ago when I was researching how geospatial data is accessed, used and …

Increasing consistency of data with FAIR Implementation Profiles

FAIR implementation profiles offer a means to increase consistency around how data is shared.

Consistency before standards

Before jumping straight into scoping and designing new standards, we should look at other quick wins to increase consistency around how data is published.

The Public Charge Point regulations and other examples of open data and standards in UK legislation

This week the UK published some new draft legislation: The Public Charge Point Regulations 2023. You can read a summary of what the legislation covers elsewhere, but what caught my attention was that it purports to require that operators of electric vehicle charging points must publish open data about their charging points. But I was …