The Open Banking initiative recently began to publicly publish specifications, guidance and data through its website. If you’re not already aware of the initiative, it was created as a direct result of government reforms that aim to encourage the banking sector to be more open and innovative. The CMA undertook a lengthy consultation period during which the ODI coordinated work on the Open Banking Standard report.
The recommendations from that report and the CMA ruling were clear. Banks have to:
- publish open data about their products, branches and locations, and
- develop and provide open APIs to support access to other data, e.g. the transaction history on your account.
Unfortunately, while the banks are moving in that direction, the data they are publishing is not open data.
The Open Definition is the definitive description of what makes content and data open. It describes certain freedoms that are essential to maximise the value of publishing data under an open licence.
I think publishing open data is what the CMA and others really intended. Its also clearly spelt out in the Open Banking report. But unfortunately something has been lost in translation. The Open Banking Licence does not conform to the open definition.
Owen Boswarva has given a detailed review on his blog. For a review of the impacts of non-open licences you can read the ODI guidance which I helped to draft.
Rather than recap that guidance here, I thought it might be useful to try to spell out where the limitations in the Open Banking licence will impact reuse of the data. This is based on my early explorations with the public data.
Exploring the limitations of the open banking licence
The Open Banking API dashboard provides direct access to the currently available data. It includes data on the ATMs provided by each of the participating banks, their branches and products.
The data is published as JSON. A commonly used data format that is easy for developers to work with.
I can’t freely distribute the data
The first thing I did was to build a public map of all of the ATM data. To do this I had to convert the data from JSON to CSV which I could then load into an online mapping tool (Carto).
This is a permitted use under the Open Banking licence. The conversion of the data from JSON to CSV, and the creation of a map is explicitly allowed in the licence. Section 2.1(c) says that I am allowed “to adapt the Open Data into different formats for the purposes of data mapping (or otherwise for display or presentational purposes)“.
But that clause means that:
- I can’t share the CSV version of the data. Data in CSV format is useful to many more potential reusers of the data. Many analytics tools support CSV but won’t support custom JSON documents. Because I can’t distribute the alternative version, fewer people can immediately use the data
- I had to keep the dataset private in my Carto account. I’m lucky enough to have a personal account that lets me keep data private. Most freely available online tools allow people to use their services for free, so long as they’re using open data. If I was allowed to share the data with other Carto users, anyone could use it in their own maps. People without a paid Carto account can’t use this data. The result is, again, that fewer people can get immediate benefit from it.
The ability to freely convert and distribute data is a key part of the open definition. It allows data re-users to support each other in using the data by making it available in alternative formats and on all available platforms.
At the moment we are only allowed to copy, re-use, publish and distribute data so long as we don’t change it.
I’m limited in using the data to enrich other services and products
Because I can’t distribute the data it means I can’t take the data that has been provided and use it to improve an existing system. For example I don’t believe I can use the data to add missing ATM locations in Open Street Map.
The terms of the Open Banking licence are not compatible with the Open Street Map licence. Because it is a custom licence, rather than an existing standard open licence, resolving that issue will require legal advice.
OSM requires contributors to be extra cautious when adding data from other sources. They suggest getting explicit written agreement. This takes time and effort. That doesn’t seem to be achieving the desired outcome of a more open banking sector.
The licence is also revocable. At any time the banks can revoke my ability to use the data. Open licences, like the creative commons licences are not revocable. This means I’m exposing myself to legal and commercial risks if I build it into a product or service. I would need to take legal advice on that.
I can’t improve the data
After creating a basic map of ATM locations, I wanted to link the data with other sources. Data becomes more valuable once its linked together.
I opened my CSV version of the data in a free, open source desktop GIS tool called QGIS. Using the standard features of that tool I was able to match the geographic coordinates in the ATM data against openly licenced geographic data from the Office of National Statistics.
This generated an enriched dataset in which every ATM was now linked to an LSOA. An LSOA is a statistical area used by the ONS and others to help publish statistics about the UK. There are many statistical datasets that are reported against these areas.
Having completed that enrichment process I could now start to explore the data in the context of official statistics on demographics. There are many interesting questions that I can now ask of the data. But other people might also have interesting uses for that enriched dataset.
The process of doing the enriching is quite technical. I’m comfortable with teaching myself how to do that. But it would be great if I could help other people unlock value by letting them explore the enriched data.
Unfortunately I can’t share my enriched version with them. I’m not allowed to change any of the content of the data, or distribute it in alternate forms. The best I could do is tweet out a few interesting insights.
I am discouraged from using the data
One way I could use the enriched data is to explore how ATM and branch locations might relate to deprivation or other demographic statistics. This might highlight patterns in how individual banks have chosen to site their branches.
I could also monitor the data over time and build up a picture of where ATMs and branches are opening and closing around the country. Or explores the changing mix of products available from individual banks.
Unfortunately I don’t think I can do that. Clause 3.1(b) of the licence states that I must not “use or present the Open Data or any analysis of it in a way that is unfair or misleading, for example comparisons must be based on objective criteria and not be prejudiced by commercial interests“.
It’s not clear to me what unfair or misleading means. Unfair to the banks? Unfair to consumers? What type of objective criteria are acceptable?
If I were working for a fintech startup, I could perhaps use the data to identify new financial products that could be offered to consumers. I think that’s the type of innovation that the CMA wanted to encourage?
But if I do that and explain my analysis with others, then am I “prejudiced by commercial interests”? The licence says I can use the data commercially, but seems to discourage certain types of commercial usage.
These types of broad, under defined clauses in licences discourage reuse. They create uncertainty around what is actually permitted under the terms of the licence. This reduces the likelihood of people using the data, unless they can cover the legal guidance needed to remove the uncertainty.
I have probably already broken the terms of the licence
I think I may have already broken the terms of the licence. As a bit of fun I’ve created a twitter account called @allthebarclays. Every day it tweets out a picture of a branch of Barclays along with its name and unique identifier.
I’m probably not allowed to do that. The photos in the data don’t have a licence attached to them, so I’m hoping that if challenged, I can justify it under fair use.
The account is clearly a joke. It’s of real use to anyone. But it gave me a focus for my explorations with the data.
It was also a deliberate attempt to show how the data could be used to create something which far from its original intended use. Because encouraging unexpected uses of the data is one of the primary goals of publishing open data. It’s the unexpected uses that are most likely to hit the types of limitations that I’ve outlined above.
How does this get resolved?
There are several ways in which these issues could begin to be addressed. There are measures that the initiative could take that would address some specific limtations, or they could take steps to address all of them. For example, the Open Banking Initiative could:
- Publish data in other formats, e.g. by providing a CSV download, this would explicitly address one part of the first issue I highlighted, but none of the real concerns
- Publish some guidance for reusers that clarifies some of the terms of its existing licence. This might avoid discouraging some uses of the data but again, it doesn’t address the primary issues. The data would still not be open data
- Revise its licence to remove the problematic clauses and create an open data licence. This would ideally go through the licence approval process. This would address all of the concerns
- Drop the licence completely in favour of the Creative Commons Attribution licence (CC-BY 4.0). This would address all of the concerns with the added benefit that it would be explicitly clear to all users that the data could be freely and easily mixed with other open data
Only the last two options would be substantial progress.
What’s needed is for someone at the Open Banking initiative (or perhaps the CMA?) to step up and take responsibility for addressing the issues. Unfortunately, until that happens, the initiative is just another example of open washing.