In "What Does Your Dataset Contain?" I outlined a conceptual framework for thinking about how we might want to describe datasets, e.g. how they're produced, what they contain, etc. I've been reading with interest the series on dataset summaries in ScraperWiki, which is exploring similar ideas. I finally found the time to do some quick … Continue reading Summarising Geographic Coverage of Dbpedia (and Wikipedia)
Category: The Commons
How Do We Attribute Data?
This post is another in my ongoing series of "basic questions about open data", which includes "What is a Dataset?" and "What does a dataset contain?". In this post I want to focus on dataset attribution and in particular questions such as: Why should we attribute data? How are data publishers asking to be attributed? … Continue reading How Do We Attribute Data?
What Does Your Dataset Contain?
Having explored some ways that we might find related data and services, as well as different definitions of "dataset", I wanted to look at the topic of dataset description and analysis. Specifically, how can we answer the following questions: what kinds of information does this dataset contain? what types of entity are described in this … Continue reading What Does Your Dataset Contain?
What is a Dataset?
As my last post highlighted, I've been thinking about how we can find and discover datasets and their related APIs and services. I'm thinking of putting together some simple tools to help explore and encourage the kind of linking that my diagram illustrated. There's some related work going on in a few areas which is … Continue reading What is a Dataset?
Dataset and API Discovery in Linked Data
I've recently been thinking about how applications can discover additional data and relevant APIs in Linked Data. While there's been lots of research done on finding and using (semantic) web services, I'm initially interested in supporting the kind of bootstrapping use cases covered by Autodiscovery. We can characterise that use case as helping to answer … Continue reading Dataset and API Discovery in Linked Data
A Brief Review of the Land Registry Linked Data
The Land Registry have today announced the publication of their Open Data -- including both Price Paid information and Transactions as Linked Data. This is great to see, as it means that there is another UK public body making a commitment to Linked Data publishing. I've taken some time to begin exploring the data. This … Continue reading A Brief Review of the Land Registry Linked Data
How I organise data conversions
Factual announced a new project last week, called Drake, which is billed as a "make for data". The tool provides a make-style environment for building workflows for data conversions. It has support for multiple programming languages, uses a standard project layout, and integrates with HDFS. It looks like a really nice tool and I … Continue reading How I organise data conversions
How to use dpm with data.gov.uk
The Data Package Manager is an Open Knowledge Foundation project to create a tool to support discovery and distribution of datasets. The tool uses the concept of a "data package" to describe the basic metadata for a dataset plus the supporting files. Packages are indexed in a registry to make them searchable and to support … Continue reading How to use dpm with data.gov.uk
Not Just Legislation: Sustainable Open Data Curation Projects
Francis Irving recently wrote an excited blog post about the open curation model that now backs legislation.gov.uk. It's hard not to get excited about legislation.gov.uk. There's been so much good work done on the project and everyone involved has achieved a great deal of which they can be proud. If you're not familiar with the … Continue reading Not Just Legislation: Sustainable Open Data Curation Projects
Data is Potential
Jeni Tennison asked an interesting question on Twitter last week: Question: aside from personally identifiable data, is there any data that *should not* be open? The question prompted some interesting discussion which included examples of data that might be sensitive, suggestions about data that would be useful to open up, and the need for better … Continue reading Data is Potential