I've been doing some research around different types of data intermediary recently and thought I'd share some things I've learned about "Data Unions". Like a lot of the terms being applied to new approaches to data governance, there's no clear definition of what constitutes a data union. For example, this … Continue reading What are Data Unions?
Popular Science have recently published three pieces of speculative fiction exploring the question of "will 'we the people' benefit from our data?". They're called "Shared data", "The Memory of Tomatoes" and "Home@Heart". Each of the pieces of fiction is followed up by a response from a policy expert. I read the first of these this morning. … Continue reading We need the right data institutions
This is the second in a short series of posts in which I'm sharing my notes and thoughts on a variety of different approaches for assessing data infrastructure and data institutions. The first post in the series looked at The Principles of Open Scholarly Infrastructure. In this post I want to take a look at … Continue reading Assessing data infrastructure: the Digital Public Goods standard and registry
How do we create well-designed, trustworthy, sustainable data infrastructure and institutions? This is a question that I remain deeply interested in. Much of the freelance work I've been doing since leaving the ODI has been in that area. For example, I'm currently helping with a multi-year evaluation of a grant-funded data institution. I'm particularly interested … Continue reading Assessing data infrastructure: the Principles of Open Scholarly Infrastructure
I've been reading about different approaches to watermarking AI and the datasets used to train them. This seems to be an active area of research within the machine learning community. But, of the papers I've looked at so far, there hasn't been much discussion of how these techniques might be applied and what groundwork needs … Continue reading How could watermarking AI help build trust?
This is another in a very occasional series of blog posts where I look at different data initiatives, institutions or infrastructure in order to understand a bit more about how they work. And then have opinions about them. Previously I wrote about Common Voice. This time I'm looking at Swash which describes itself as "reimagining … Continue reading What is Swash and is it really changing data ownership?
The Geospatial Commission have recently published some guidance on Designing Geospatial Data Portals. There's a useful overview in the accompanying blog post. It's good clear guidance that should help anyone building a data portal. It has tips for designing search interfaces, presenting results and dataset metadata. There's very little advice that is specifically relevant to … Continue reading Why are we still building portals?
A couple of months ago I wrote a post that provided some background on the data we use in Energy Sparks. The largest data source comes from gas and electricity meters (consumption) and solar panels (generation). While we're integrating with APIs that allow us to access data from smart meters, for the foreseeable future most … Continue reading 24 different tabular formats for half-hourly energy data
Despite being very different projects, Wikidata and OpenStreetMap have a number of similarities: recurring patterns in how they organise and support the work of their communities. We documented a number of these patterns in the ODI Collaborative Maintenance Guidebook. There were also a number we didn't get time to write up. A further pattern which I … Continue reading Schema explorers and how they can help guide adoption of common standards
This is a post about building tools to validate data. I wanted to share a few reflections based on helping to design and build a few different public and private tools, as well as my experience as a user. I like using data validators to check my homework. I've been using a few different recently … Continue reading Building data validators