When are open (geospatial) identifiers useful?

In a meeting today, I was discussing how and when open geospatial identifiers are useful. I thought this might make a good topic for a blog post in my continuing series of questions about data. So here goes.

An identifier provides an unambiguous reference for something about which we want to collect and publish data. That thing might be a road, a school, a parcel of land or a bus stop.

If we publish a dataset that contains some data about “Westminster” then, without some additional documentation, a user of that dataset won’t know whether the data is about a tube station, the Parliamentary Constituency, a company based in Hayes or a school.

If we have identifiers for all of those different things, then we can use the identifiers in our data. This lets us be confident that we are talking about the same things. Publishing data about “940GZZLUWSM” makes it pretty clear that we’re referring to a specific tube station.

If data publishers use the same sets of identifiers, then we can easily combine your dataset on the wheelchair accessibility of tube stations with my dataset of tube station locations and Transport for London’s transit data. Together, they let us build an application that helps wheelchair users make better decisions about how to move around London.
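
To make that concrete, here’s a minimal sketch (in Python, using pandas) of the kind of join that shared identifiers make trivial. The column names and values are illustrative; the only value taken from the post is the stop code used as the join key.

```python
import pandas as pd

# Illustrative fragments of two independently published datasets that both
# use the same stop identifier.
accessibility = pd.DataFrame([
    {"stop_code": "940GZZLUWSM", "step_free_access": True},
])
locations = pd.DataFrame([
    {"stop_code": "940GZZLUWSM", "name": "Westminster", "lat": 51.501, "lon": -0.125},
])

# Because both datasets use the same identifier, the join is unambiguous.
stations = accessibility.merge(locations, on="stop_code")
print(stations)
```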

Helpful services

To help us publish datasets that use the same identifiers, there are a few things that we repeatedly need to do.

For example, it’s common to have to look up an identifier based on the name of the thing we’re describing, e.g. what’s the code for Westminster tube station? We often need to find information about an identifier we’ve found in a dataset, e.g. what’s the name of the tube station identified by 940GZZLUWSM? And where is it?

When we’re working with geospatial data we often need to find identifiers based on a physical location. For example, based on a latitude and longitude:

  • Where is the nearest tube station?
  • Or, what polling district am I in, so I can find out where I should go to vote?
  • Or, what is the identifier for the parcel of land that contains these co-ordinates?
  • …etc

It can be helpful if these repeated tasks are turned into specialised services (APIs) that make it easier to perform them on-demand. The alternative is that we all have to download and index the necessary datasets ourselves.
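
As a rough illustration, here’s what one of those geospatial lookups might look like if you did have to do it yourself. It’s a minimal point-in-polygon sketch in Python using shapely; the district identifiers and boundary coordinates are invented, and a real service would index boundaries properly (e.g. with PostGIS or an R-tree) rather than scanning them one by one.

```python
from shapely.geometry import Point, Polygon

# Hypothetical polling district boundaries, keyed by their identifiers.
districts = {
    "DISTRICT-A": Polygon([(-2.37, 51.37), (-2.35, 51.37), (-2.35, 51.39), (-2.37, 51.39)]),
    "DISTRICT-B": Polygon([(-2.35, 51.37), (-2.33, 51.37), (-2.33, 51.39), (-2.35, 51.39)]),
}

def district_for(lon, lat):
    """Return the identifier of the district containing the point, if any."""
    point = Point(lon, lat)
    for identifier, polygon in districts.items():
        if polygon.contains(point):
            return identifier
    return None

print(district_for(-2.36, 51.38))  # "DISTRICT-A"
```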

Network effects

Choosing which identifiers to use in a dataset is an important part of creating agreements around how we publish data. We call those agreements data standards.

The more datasets that use the same set of identifiers, the easier it becomes to combine those datasets together, in various combinations that will help to solve a range of problems. To put it another way, using common identifiers helps to generate network effects that make it easier for everyone to publish and use data.

I think it’s true to say that almost every problem we might try to solve with better use of data requires combining several different datasets. Some of those datasets might come from the private sector. Some of them might come from the public sector. No single organisation always holds all of the data.

This makes it important to be able to share and reuse identifiers across different organisations. And that is why it is important that those identifiers are published under an open licence.

Open licensing

Open licences allow anyone to access, use and share data. Openly licensed identifiers can be used in both open datasets and those that are shared under more restrictive licences. They give data publishers the freedom to choose the correct licence for their dataset, so that it sits at the right point on the data spectrum.

Identifiers that are not published under an open licence remove that choice. Restricted licensing limits the ability of publishers to share their data in the way that makes sense for their business model or application. Restrictive licences cause friction that gets in the way of making data as open as possible.

Open identifiers create open ecosystems. They create opportunities for a variety of business models, products and services. For example intermediaries can create platforms that aggregate and distribute data that has been published by a variety of different organisations.

So, the best identifiers are those that are:

  • published under an open licence that allows anyone to access, use and share them
  • published alongside some basic metadata (a label, a location or other geospatial data, a type; see the sketch after this list)
  • and, are accessible via services that allow them to be easily used
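
As a sketch of what that basic metadata might look like for a single identifier, here’s a simple record in Python. The field names are illustrative rather than any published standard, and the coordinates are approximate.

```python
# A minimal, hypothetical metadata record for one identifier.
station_record = {
    "id": "940GZZLUWSM",
    "label": "Westminster Underground Station",
    "type": "tube-station",
    "location": {"lat": 51.501, "lon": -0.125},
}
```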

Who provides that infrastructure?

Whenever there is friction around the use of data, application developers are left with a difficult choice. They either have to invest time and effort in working around that friction, or compromise their plans in some way. The need to quickly bring products to market may lead to choices which are not ideal.

For example, developers may choose to build applications against Google’s mapping services. These services are easily and immediately available to any developer wanting to display a map or recommend a route to a user. But these platforms usually come with restrictive licensing, which means it is the platform provider that reaps most of the benefits. In the absence of open licences, network effects can lead to data monopolies.

So who should provide these open identifiers, and the metadata and services that support them?

This is the role of national mapping agencies. These agencies will already have identifiers for important geospatial features. The Ordnance Survey has an identifier called a TOID which is assigned to every feature in Great Britain. But there are other identifiers in use too. Some are designed to support publication of specific types of data, e.g. UPRNs.

These identifiers are national assets. They should be managed as data infrastructure and not be tied up in commercial data products.

Publishing these identifiers under an open licence, in the ways that have been outlined here, will provide a framework to support the collection and curation of geospatial data by many different organisations, across the public and private sector. That infrastructure will allow value to be created from that geospatial data in a variety of new ways.

Provision of this type of infrastructure is also in line with what we can see happening across other parts of government, for example the work of the GDS team to develop registers of important data. Identifiers, registers and standards are important building blocks of our local, national and global data infrastructure.

If you’re interested in reading more about the benefits of open identifiers, then you might be interested in this white paper that I wrote with colleagues from the Open Data Institute and Thomson Reuters: “Creating value from identifiers in an open data world”.

Data assets and data products

A lot of the work that we’ve done at the ODI over the last few years has involved helping organisations to recognise their data assets.

Many organisations will have their IT equipment and maybe even their desks and chairs asset tagged. They know who is using them, where they are, and have some kind of plan to make sure that they only invest in maintaining the assets they really need. But few will be treating data in the same way.

That’s a change that is only just beginning. Part of the shift is in understanding how those assets can be used to solve problems, or to help them, their partners and customers make more informed decisions.

Often that means sharing or opening that data so that others can use it. Making sure that data is at the right point of the data spectrum helps to unlock its value.

A sticking point for many organisations is that they begin to question why they should share or open those data assets, and whether others should contribute to their maintenance. There are many common questions around the value of sharing, respecting privacy, logistics, etc.

I think a useful framing for this type of discussion might be to distinguish between data assets and data products.

A data asset is what an organisation is managing internally. It may be shared with a limited audience.

A data product is what you share with or open to a wider audience. It’s created from one or more data assets. A data product may not contain all of the same data as the data assets it’s based on. Personal data might need to be removed or anonymised, for example. This means a data product might sit at a different point on the data spectrum. It can be more open. I’m using “data product” here to refer to specific types of datasets, not applications that have been made using data.
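
A minimal sketch of that relationship, assuming a pandas DataFrame with invented column names: the internal asset contains personal data, and the derived product drops it and aggregates what’s left so it can be shared more widely.

```python
import pandas as pd

# The internal data asset, including personal data we can't share.
asset = pd.DataFrame([
    {"booking_id": 1, "customer_email": "a@example.com", "venue": "Sports Hall", "attended": True},
    {"booking_id": 2, "customer_email": "b@example.com", "venue": "Pool", "attended": False},
])

# The data product: personal data removed and the rest aggregated to what a
# wider audience actually needs, so it can sit at a more open point on the
# data spectrum than the asset it came from.
product = (
    asset.drop(columns=["customer_email", "booking_id"])
         .groupby("venue", as_index=False)["attended"]
         .mean()
         .rename(columns={"attended": "attendance_rate"})
)
print(product)
```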

An asset is something you manage and invest in. A product is intended to address some specific needs. It may need some support or documentation to make sure it’s useful. It may also need to evolve based on changing needs.

In some cases a data asset could also be a data product. The complete dataset might be published in its entirety. In my experience this is rarely the case though. There’s usually additional information, e.g. governance and version history, that might not be useful to reusers.

In other cases data assets are collaboratively maintained, often in the open. Wikidata and OpenStreetMap are global data assets that are maintained in this way. There are many organisations that are using those assets to create more tailored data products that help to meet specific needs. Over time I expect more data assets will be managed in collaborative ways.

Obviously not every open data release needs to be a fully supported “product”. To meet transparency goals we often just need to get data published as soon as possible, with a minimum of friction for both publishers and users.

But when we are using data as a tool to create other types of impact, more work is sometimes needed. There are often a number of social, legal and technical issues to consider in making data accessible in a sustainable way.

Injecting some product thinking into how we share and open data might help address the types of problems that stop data releases from having the desired impact: Why are we opening this data? Who will use it? How can we help them be more effective? Does releasing the data provide ways in which the data asset might be more collaboratively maintained?

When governments are publishing data that should be part of a national data infrastructure, more value will be unlocked if more of the underlying data assets are available for anyone to access, use and share. Releasing a “data product” that is too closely targeted might limit its utility. So I also think this “data asset” vs “data product” distinction can help us to challenge the types of data that are being released. Are we getting access to the most valuable data assets, or useful subsets of them? Or are we just being given a data product that has much more limited applications, regardless of how well it is being published?

We CAN get there from here

On Wednesday, as part of the Autumn Budget, the Chancellor announced that the government will be creating a Geospatial Commission “to establish how to open up freely the OS MasterMap data to UK-based small businesses”. It will be supported by new funding of £80 million over two years. The Commission will be looking at a range of things including:

  • improving the access to, links between, and quality of their data
  • looking at making more geospatial data available for free and without restriction
  • setting regulation and policy in relation to geospatial data created by the public sector
  • holding individual bodies to account for delivery against the geospatial strategy
  • providing strategic oversight and direction across Whitehall and public bodies who operate in this area

That’s a big pot of money to get something done and a remit that ticks all of the right boxes. As the ODI blog post notes, it creates “the opportunity for national mapping agencies to adapt to a future where they become stewards for national mapping data infrastructure, making sure that data is available to meet the needs of everyone in the country”.

So, I’m really surprised that many of the reactions from the open data community have been fairly negative. I understand the concerns that the end result might not be a completely open MasterMap. There are many, many ways in which this could end up with little or no change to the status quo. That’s certainly true if we ignore the opportunity to embed some change.

From my perspective, this is the biggest step towards a more open future for UK geospatial data since the first OS Open Data release in 2010. (I remember excitedly hitting the publish button to make their first Linked Data release publicly accessible)

Anyone who has been involved with open data in the UK will have encountered the Ordnance Survey licensing issues that are massively inhibiting both the release and use of open data in the UK. It’s a frustration of mine that these issues aren’t manifest in the various open data indexes.

In my opinion, anything that moves us forward from the current licensing position is to be welcomed. Yes, we all want a completely open MasterMap. That’s our shared goal. But how do we get there?

We’ve just seen the government task and resource itself to do something that can help us achieve that goal. It’s taken concerted effort by a number of people to get to this point. We should be focusing on what we all can do, right now, to help this process stay on track. Dismissing it as an already failed attempt isn’t helpful.

I think there’s a great deal that the community could do to engage with and support this process.

Here are a few ways that we could inject some useful thinking into the process:

  • Can we pull together examples of where existing licensing restrictions are causing friction for UK businesses? Those of us who have been involved with open data have internalised many of these issues already, but we need to make sure they’re clearly understood by a wider audience
  • Can we do the same for local government data and services? There are loads of these too. Particularly compelling examples will be those that highlight where more open licensing can help improve local service delivery
  • Where could greater clarity around existing licensing arrangements help UK businesses, public sector and civil society organisations achieve greater impact? It often seems like some projects and local areas are able to achieve releases where others can’t.
  • Even if all of MasterMap were open tomorrow, it might still be difficult to access. No-one likes the current shopping cart model for accessing OS open data. What services would we expect from the OS and others that would make this data useful? I suspect this would go beyond “let me download some shapefiles”. We built some of these ideas into the OS Linked Data site. It still baffles me that you can’t find much OS data on the OS website.
  • If all of MasterMap isn’t made open, then which elements of it would unlock the most value? Are there specific layers or data types that could reduce friction in important application areas?
  • Similarly, how could the existing OS open data be improved to make it more useful? Hint: currently all of the data is generalised and doesn’t have any stable identifiers at all.
  • What could the OS and others do to support the rest of us in annotating and improving their data assets? The OS switched off its TOID lookup service because no-one was using it. It wasn’t very good. So what would we expect that type of identifier service to do?
  • If there is more openly licensed data available, then how could it be usefully added to OpenStreetMap and used by the ecosystem of open geospatial tools that it is supporting?
  • We all want access to MasterMap because it’s a rich resource. What are the options available to ensure that the Ordnance Survey stays resourced to a level where we can retain it as a national asset? Are there reasonable compromises to be made between opening all the data and them offering some commercial services around it?
  • …etc, etc, etc.

Personally, I’m choosing to be optimistic. Let’s get to work to create the result we want to see.

The state of open licensing, 2017 edition

Let’s talk about open data licensing. Again.

Last year I wrote a post, The State of Open Licensing, in which I gave a summary of the landscape as I saw it. A few recent developments mean that I think it’s worth posting an update.

But Leigh, I hear you cry, do people really care about licensing? Are you just fretting over needless details? We’re living in a post-open source world after all!

To which I would respond, if licensing doesn’t have real impacts, then why did the open source community recently go into meltdown about Facebook’s open source licences? And why have they recanted? There’s a difference between throwaway, unmaintained code and data, and resources that could and should be infrastructure.

The key points I make in my original post still stand: I think there is still a need to encourage convergence around licensing in order to reduce friction. But I’m concerned that we’re not moving in the right direction. Open Knowledge are doing some research around licensing and have also highlighted their concerns around current trends.

So what follows is a few observations from me looking at trends in a few different areas of open data practice.

Licensing of open government data

I don’t think much has changed with regards to open licences for government data. The UK Open Government Licence (UK-OGL) still seems to be the starting point for creating bespoke national licences.

Looking through the open definition forum archives, the last government licence that was formally approved as open definition compliant was the Taiwan licence. Like the UK-OGL Version 3, the licence clearly indicates that it is compatible with the Creative Commons Attribution (CC-BY) 4.0 licence. The open data licence for Mexico makes a similar statement.

In short, you can take any data from the UK, Taiwan and Mexico and re-distribute it under a CC-BY 4.0 licence. Minimal friction.

I’d hoped that we could discourage governments from creating new licences. After all, if they’re compatible with CC-BY, then why go to the trouble?

But, chatting briefly about this with Ania Calderon this week, I’ve come to realise that the process of developing these licences is valuable, even if the end products end up being very similar. It encourages useful reflection on the relevant national laws and regulations, whilst also ensuring there is sufficient support and momentum behind adoption of the open data charter. They are as much a statement of shared intent as a legal document.

The important thing is that national licences should always state compatibility with an existing licence. Ideally CC-BY 4.0. This removes all doubt when combining data collected from different national sources. This will be increasingly important as we strengthen our global data infrastructure.

Licensing of data from commercial publishers

Looking at how data is being published by commercial organisations, things are very mixed.

Within the OpenActive project we now have more than 20 commercial organisations publishing open data under a CC-BY 4.0 licence. Thomson Reuters are using CC-BY 4.0 as the core licence for its PermID product. And Syngenta are publishing their open data under a CC-BY-SA 4.0 licence. This is excellent. 10/10 would reuse again.

But in contrast, the UK Open Banking initiative has adopted a custom licence that has a number of limitations, which I’ve written about extensively. Despite feedback, they’ve chosen to ignore concerns raised by the community.

Elsewhere the default is for publishers and platforms to use custom terms and conditions that create complexity for reusers. Or for lists of “open data” to have no clear licensing.

Licensing in the open data commons

It’s a similar situation in the broader open data commons.

In the research community CC0 licences have been recommended for some time and are the default on a number of research data archives. Promisingly, the FigShare State of Open Data 2017 report (PDF) shows a growing awareness of open data amongst researchers, and a reduction in uncertainty around licensing. But there’s still lots of work to do. Julie McMurry of the (Re)usable Data Project notes that less than half of the databases they’ve indexed have a clear, findable licence.

While the CC-BY and CC-BY-SA 4.0 licences are seen to be the best practice default, a number of databases still rely on the Open Database Licence (ODbL). OpenStreetMap being the obvious example.

The OSM Licence Working Group has recently concluded that, pending a more detailed analysis, the Creative Commons licences are incompatible with the ODbL. They now recommend asking for specific permission and the completion of a waiver form before importing CC licenced open data into OSM. This is, of course, exactly the situation that open licensing is intended to avoid.

Obtaining 1:1 agreements is the opposite of friction-less data sharing.

And it’s not clear whose job it is to sort it out. I’m concerned that there’s no clear custodian for the ODbL or investment in its maintenance. Resolving issues of compatibility with the CC licences is clearly becoming more urgent. I think it needs an organisation or a consortia of interested parties to take this forward. It will need some legal advice and investment to resolve any issues. Taking no action doesn’t seem like a viable option to me.

Based on what I’ve seen summarised from previous discussions, there seem to be some basic disagreements around the approaches taken to data licensing that have held things up. Creative Commons could take a lead on this, but so far they’ve not certified any third-party licences as compatible with their suite. All statements of compatibility have been made the other way.

Despite its use by big projects like OSM, it’s really unclear to me what role the ODbL has longer term. Getting to a clear definition of compatibility would provide a potential way for existing users of the licence to transition at a future date.

Just to add to the fun, the Linux Foundation have thrown two new licences into the mix. There has been some discussion about this in the community and some feedback in these two articles in the Register. The second has some legal analysis: “I wouldn’t want to sign it”.

Adding more licences isn’t helpful. It would have been more helpful to explore compatibility issues amongst existing licences and to invest in resolving them. But as their FAQ highlights, the Foundation explicitly chose to just create new licences rather than evaluate the current landscape.

I hope that the Linux Foundation can work with Creative Commons to develop a statement of compatibility, otherwise we’re in an even worse situation.

Some steps to encourage convergence

So how do we move forward?

My suggestions are:

  • No new licences! If you’re a government, you get a pass to create a national licence so long as you include a statement of compatibility with a Creative Commons licence
  • If your organisation has issues with the Creative Commons licences, then document and share them with the community. Then engage with the Creative Commons to explore creating revisions. Spend what you would have given your lawyers on helping the Creative Commons improve their licences. It’s a good test of how much you really do want to work in the open
  • If you’re developing a platform, require people to choose a licence or set a default. Choosing a licence can include “All Rights Reserved”. Let’s get some clarity
  • We need to invest further in developing guidance around data licensing.
  • Let’s sort out compatibility between the CC and ODbL licence suites
  • Let’s encourage the Linux Foundation to do the same, and also ask them to submit their licences to the licence approval process. This should be an obvious step for them, as they’ve repeatedly highlighted the lessons to be learned from open source licences, which go through a similar approval process.

I think these are all useful steps forward. What would you add to the list? What organisations can help drive this forward?

Note that I’m glossing over a set of more nuanced issues which are worthy of further, future discussion. For example whether licensing is always the right protection, or when “situated openness” may be the best approach towards building trust with communities. Or whether the two completely different licensing schemes for Wikidata and OSM will be a source of friction longer term or are simply necessary to ensure their sustainability.

For now though, I think I’ll stick with the following as my licensing recommendations:


What is a Dataset? Part 2: A Working Definition

A few years ago I wrote a post called “What is a Dataset?” It lists a variety of the different definitions of “dataset” used in different communities and standards. What I didn’t do is give my own working definition of dataset. I wanted to share that here along with a few additional thoughts on some related terms.

Answering the right question

I’ve noticed that often, when people ask for a definition of “dataset”, it’s for one of two reasons.

The first occurs when they’re actually asking a different question: “What is data?” Here I usually try to avoid getting into a lengthy discussion around data, facts, information and knowledge and instead focus on providing examples of datasets. I include databases, spreadsheets, sensor readings and collections of documents, images and video. This is to help get across that these days virtually everything is data; it just depends how you process it.

The second occurs when someone is trying to decide how to turn an existing database or some other collection of data into a “dataset” that they can publish on their website, or in a portal, or via an API. Answering this question involves a number of other questions. For example:

  • Is a dataset a single data file?
    • Answer: Not necessarily, it could be several files that have been split up for ease of production or consumption
  • Is a database one dataset or several?
    • Answer: It depends. Sometimes a database might be a single dataset, but sometimes it might be better published as several smaller datasets. You’ll often need to strip personal or commercially sensitive data anyway, so what you publish is unlikely to be exactly what you’ve got in your database. But you might decide to publish a collection of different data files (e.g. one per table) packaged together in some way. This might be best if someone will always want to consume the whole thing, e.g. to create a local copy of your database
  • Are there reasons why a single larger collection of data might be broken up into different datasets?
    • Answer: Yes, if it makes it easier for people to access and use the data. Or maybe there are regular updates, each of which is a separate dataset
  • If a database contains data from different sources, should it be published as several different datasets?
    • Answer: It depends. If you’ve created a useful aggregation, then publishing it as a whole makes sense as a user can access the whole thing. Ditto if you’ve corrected, fixed or improved some third-party data. But sometimes you might just want to release whatever new data you’ve added or created, and let people find other datasets that you reference or reuse by providing a link to the original versions
  • …etc

There are no hard and fast answers. Like everything around publishing open data, you need to take into account a number of different factors.

A working definition

Bringing this together, I’ve ended up with the following rough working definition of “dataset”:

A dataset is a collection of data that is managed using the same set of governance processes, has a shared provenance and shares a common schema

By requiring a common set of governance processes, we group together data that has the same level of quality assurance, security and other policies. By requiring a shared provenance, we focus on data that has been collected in similar ways, which means it will have similar licensing and rights issues. Sharing a common schema means that the data is consistently expressed.
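
To make those three properties a little more tangible, here’s a rough sketch of how they might appear in a description of a single dataset. The field names and values are illustrative only, not a proposed standard.

```python
# A hypothetical description of one dataset, capturing the three properties
# in the working definition: governance, provenance and schema.
dataset_description = {
    "title": "Food hygiene ratings, single local authority",
    "governance": {
        "steward": "Environmental health team",
        "update_frequency": "weekly",
    },
    "provenance": "Collected during routine inspections by one authority",
    "schema": ["business_name", "address", "rating", "inspection_date"],
}
```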

To test this out:

  • If you produce a set of official statistics, each annual release is a new dataset, because the data has been collected and processed at different times
  • A database of images and comments that users have made against them would probably best be released as two datasets: one containing the images (& their metadata) and another containing the comments. Images and comments are two different types of object, collected and managed in different ways
  • A set of food hygiene ratings collected by different councils across the UK consists of multiple datasets. Data on each local area will have been collected at different times by different organisations. Publishing them separately allows users to take just the data they need, when it’s updated
  • …etc

There are always exceptions to any rule, but I’ve found this reasonably useful in practice, as it highlights some important considerations. But I’m pretty sure it can be improved. Let me know if you have comments.

This post is part of a series called “basic questions about data”.


The Lego Analogy

I think Lego is a great analogy for understanding the importance of data standards and registers.

Lego have been making plastic toys and bricks since the late 40s. It took them a little while to perfect their designs. But since 1958 they’ve been manufacturing bricks in the same way, to the same basic standard. This means that you can take any two bricks manufactured over the last 59 years and they’ll fit together. As a company, they have extremely high standards around how their bricks are manufactured. Only 18 in a million are ever rejected.

A commitment to standards maximises the utility of all of the bricks that the company has ever produced.

Open data standards apply the same principle but to data. By publishing data using common APIs, formats and schemas, we can start to treat data like Lego bricks. Standards help us recombine data in many, many different ways.

There are now many more types and shapes of Lego brick than there used to be. The Lego standard colour palette has also evolved over the years. The types and colours of bricks have changed to reflect the company’s desire to create a wider variety of sets and themes.

If you look across all of the different sets that Lego have produced, you can see that some basic pieces are used very frequently. A number of these pieces are “plates” that help to connect other bricks together. If you ask a Master Lego Builder for a list of their favourite pieces, you’ll discover the same. Elements that help you connect other bricks together in new and interesting ways are the most popular.

Registers are small, simple datasets that play the same role in the data ecosystem. They provide a means for us to connect datasets together. A way to improve the quality and structure of other datasets. They may not be the most excitingly shaped data. Sometimes they’re just simple lists and tables. But they play a very important role in unlocking the value of other data.

So there we have it, the Lego analogy for standards and registers.

Mapping wheelchair accessibility: how Google could help

This month Google announced a new campaign to crowd-source information on wheelchair accessibility. It will be asking the Local Guides community of volunteers to begin answering simple questions about the wheelchair accessibility of places that appear on Google Maps. Google already crowd-sources a lot of information from volunteers. For example, it asks them to contribute photos, add reviews and validate the data it’s displaying to users of its mapping products.

It’s great to see Google responding to requests from wheelchair users for better information on accessibility. But I think they can do better.

There are many projects exploring how to improve accessibility information for people with mobility issues, and how to use data to increase mobility. I’ve recently been leading a project in Bath that is using a service called Wheelmap to crowd-source wheelchair accessibility information for the centre of the city. Over two Saturday afternoons we’ve mapped 86% of the city. Crowd-sourcing is a great way to collect this type of information and Google has the reach to really take this to another level.

The problem is that the resulting data is only available to Google. Displaying the data on Google maps will put it in front of millions of people, but that data could potentially be reused in a variety of other ways.

For example, for the Accessible Bath project we’re now able to explore accessibility information based on the type of location. This may be useful for policy makers to help shape support and investment in local businesses to improve accessibility across the city. Bath is a popular tourist destination so it’s important that we’re accessible to all.

We’re able to do this because Wheelmap stores all of its data in OpenStreetMap. We have access to all of the data our volunteers collect and can use it in combination with the rich metadata already in OpenStreetMap. And we can also start to combine it with other information, e.g. data on the ages of buildings, which may yield more insight.
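
Because the data lives in OpenStreetMap, anyone can pull it back out as raw data. As a rough sketch, the snippet below queries the public Overpass API for nodes tagged with wheelchair information inside an area; the area name and the simple tally are illustrative, and a real analysis would be more careful about matching the administrative boundary.

```python
import requests

# Overpass QL query: nodes with a "wheelchair" tag inside an area named Bath.
query = """
[out:json][timeout:25];
area["name"="Bath"]["boundary"="administrative"]->.searchArea;
node["wheelchair"](area.searchArea);
out body;
"""

response = requests.post("https://overpass-api.de/api/interpreter", data={"data": query})
elements = response.json().get("elements", [])

# Tally places by their wheelchair tag value (typically yes / limited / no).
counts = {}
for element in elements:
    value = element.get("tags", {}).get("wheelchair", "unknown")
    counts[value] = counts.get(value, 0) + 1
print(counts)
```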

As we learnt in our meetings with local wheelchair users and stroke survivors, mobility and accessibility issues are tricky to address. Road and pavement surfaces and types of dropped kerbs can impact you differently depending on your specific needs. Often you need more data and more context from other sources to provide the necessary support. Like Google, we’re starting with wheelchair accessibility because that’s the easiest problem to begin to address.

To improve routing, for example, you might need data on terrain, or be able to identify the locations and sizes of individual disabled parking spaces. Microsoft’s Cities Unlocked is combining accessibility and location data from OpenStreetMap with Wikipedia entries to help blind users navigate a city. They chose OpenStreetMap as their data source because of its flexibility, existing support for accessibility information and rapid updates. This type of innovation requires greater access to raw data, not just data on a map.

By collecting and displaying data only on its own maps, Google is not maximising the value of the contributions made by its Local Guides community. If the data they collected was published under an open licence, it could be used in many other projects. By improving its maps, Google is addressing a specific set of user needs. By opening up the data it could let more people address more user needs.

If Google felt they were unable to publish the data under an open licence, they could at least make the data available to OpenStreetMap contributors to support their mapping events. This type of limited licensing is already being used by Microsoft, DigitalGlobe and others to make commercial satellite imagery available to the OpenStreetMap community. While restrictive licensing is not ideal, allowing the data to be used to improve open databases, without the need to worry about IP issues, is a useful step forward from keeping the data locked down.

Another form of support that Google could offer is to extend Schema.org to allow accessibility information to be associated with Places. By incorporating this into Google Maps and then openly publishing or sharing that data, it would encourage more organisations to publish this information about their locations.
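
One possible shape for that kind of markup, expressed here as JSON-LD in a Python dict. I’m assuming the existing amenityFeature / LocationFeatureSpecification vocabulary as the attachment point and inventing the feature name, so treat this as a sketch of the idea rather than an agreed extension.

```python
# A hypothetical schema.org Place with an accessibility feature attached.
place = {
    "@context": "https://schema.org",
    "@type": "Place",
    "name": "Example Cafe",
    "amenityFeature": {
        "@type": "LocationFeatureSpecification",
        "name": "wheelchairAccessible",
        "value": True,
    },
}
```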

But I find it hard to think of good reasons why Google wouldn’t make this data openly available. I think its Local Guides community would agree that they’re contributing in order to help make the world a better place. Ensuring that the data can be used by anyone, for any purpose, is the best way to achieve that goal.