Assessing data infrastructure: the Digital Public Goods standard and registry

This is the second in a short series of posts in which I'm sharing my notes and thoughts on a variety of different approaches for assessing data infrastructure and data institutions. The first post in the series looked at The Principles of Open Scholarly Infrastructure. In this post I want to take a look at …

Assessing data infrastructure: the Principles of Open Scholarly Infrastructure

How do we create well-designed, trustworthy, sustainable data infrastructure and institutions? This is a question that I remain deeply interested in. Much of the freelance work I've been doing since leaving the ODI has been in that area. For example, I'm currently helping with a multi-year evaluation of a grant-funded data institution. I'm particularly interested …

How could watermarking AI help build trust?

I've been reading about different approaches to watermarking AI and the datasets used to train them. This seems to be an active area of research within the machine learning community. But, of the papers I've looked at so far, there hasn't been much discussion of how these techniques might be applied and what groundwork needs …

What is Swash and is it really changing data ownership?

This is another in a very occasional series of blog posts where I look at different data initiatives, institutions or infrastructure in order to understand a bit more about how they work. And then have opinions about them. Previously I wrote about Common Voice. This time I'm looking at Swash which describes itself as "reimagining …

Why are we still building portals?

The Geospatial Commission have recently published some guidance on Designing Geospatial Data Portals. There's a useful overview in the accompanying blog post. It's good clear guidance that should help anyone building a data portal. It has tips for designing search interfaces, presenting results and dataset metadata. There's very little advice that is specifically relevant to …

24 different tabular formats for half-hourly energy data

A couple of months ago I wrote a post that provided some background on the data we use in Energy Sparks. The largest data source comes from gas and electricity meters (consumption) and solar panels (generation). While we're integrating with APIs that allow us to access data from smart meters, for the foreseeable future most …

Schema explorers and how they can help guide adoption of common standards

Despite being very different projects, Wikidata and OpenStreetMap have a number of similarities: recurring patterns in how they organise and support the work of their communities. We documented a number of these patterns in the ODI Collaborative Maintenance Guidebook. There were also a number we didn't get time to write up. A further pattern which I …

Some lessons learned from building standards around Schema.org

OpenActive is a community-led initiative in the sport and physical activity sector in England. Its goal is to help people get healthier and more active by making it easier for them to find information about activities and events happening in their area. Publishing open data about opportunities to be active is a key part …