Eight years ago I was invited to a workshop. The Office for National Statistics were gathering together people from the statistics and linked data communities to talk about publishing statistics on the web.
At the time there was lots of ongoing discussion within and between the two communities around this topic, with a particular emphasis on government statistics.
I was invited along to talk about how publishing linked data could help improve discovery of related datasets.
Others were there to talk about other related projects. There were lots of people there from the SDMX community who were working hard to standardise how statistics can be exchanged between organisations.
There’s a short write-up that mentions the workshop, some key findings and some follow-on work.
One general point of agreement was that statistical data points or observations should be part of the web.
Every number, like the current population of Bath & North East Somerset, should have a unique address or URI, so people could just point at it with their browsers or code.
Last week the ONS launched the beta of a new API that allows you to create links to individual observations.
Seven years on they’ve started delivering on the recommendations of that workshop.
Agreeing that observations should have URIs was easy. The hard work of doing the digital transformation required to actually deliver it has taken much longer.
Proof-of-concept demos have been around for a while. We made one at the ODI.
But the patient, painstaking work to change processes and culture to create sustainable change takes time. And in the tech community we consistently underestimate how long that takes, and how much work is required.
So kudos to Laura, Matt, Andy, Rob, Darren Barnee and the rest of the present and past ONS team for making this happen. I’ve seen glimpses of the hard work they’ve had to put in behind the scenes. You’re doing an amazing and necessary job.
If you’re unsure as to why this is such a great step forward, here’s a user need I learned at that workshop.
Amongst the attendees was a designer who worked on data visualisations. He would spend a great deal of time working with data to get it into the right format and then designing engaging, interactive views of it.
Often there were unusual peaks and troughs in the graphs and charts which needed some explanation. Maybe there had been an external event that impacted the data, or a change in methodology. Or a data quality issue that needed explaining. Or maybe just something interesting that should be highlighted to users.
What he wanted was a way for the statisticians to give him that context, so he could add notes and explanations to the diagrams. He was doing this manually and it was a lot of time and effort.
For decades statisticians have been putting these useful insights into the margins of their work. Because of the limitations of the printed page and spreadsheet tables, this useful context has been relegated to footnotes for the reader to find for themselves.
But by putting this data onto the web, at individual URIs, we can deliver those numbers in context. Everything you need to know can be provided with the statistic, along with pointers to other useful information.
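To make that idea concrete, here is a minimal Python sketch of an observation that carries its own context. Everything in it is an assumption made for illustration: the `stats.example.org` base address, the URI scheme, the dataset and dimension names, the placeholder value and the note text are all hypothetical, and none of it reflects how the ONS API actually structures its links or responses.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Observation:
    """One statistical data point, addressable at its own URI."""
    uri: str
    dataset: str
    dimensions: dict
    value: float
    # Context that would otherwise end up in a footnote.
    notes: list = field(default_factory=list)
    # Pointers to other useful information.
    see_also: list = field(default_factory=list)


def build_uri(base, dataset, dimensions):
    """Build a stable URI from the dataset name and dimension values.

    Hypothetical scheme: dimensions are sorted by name so the same
    observation always gets the same address.
    """
    path = "/".join(f"{name}/{value}" for name, value in sorted(dimensions.items()))
    return f"{base}/{dataset}/{path}"


def describe(obs):
    """Serialise the observation together with its context as JSON."""
    return json.dumps(asdict(obs), indent=2)


# Illustrative dimensions only; not taken from any real dataset.
dims = {"year": "2016", "area": "bath-and-north-east-somerset"}

obs = Observation(
    uri=build_uri("https://stats.example.org", "population-estimates", dims),
    dataset="population-estimates",
    dimensions=dims,
    value=1000,  # placeholder, not a real figure
    notes=["Methodology changed this year (illustrative note)."],
    see_also=["https://stats.example.org/population-estimates/methodology"],
)

print(describe(obs))
```

The point of the design is that the note and the links travel with the number. A designer building a chart could fetch the observation by its URI and get the explanation for an odd peak or trough in the same response, rather than hunting for it in a footnote.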
Giving observations unique URIs frees statisticians from the tyranny of the document. And might help us all to share and discuss data in a much richer way.
I’m not naive enough to think that linking data can help us address issues with fake news. But it’s hard for me to imagine how being able to more easily work with data on the web isn’t at least part of the solution.