The data we use in Energy Sparks

Disclaimer: this blog post is about some of the challenges that we have faced in consuming and using data in Energy Sparks. While I am a trustee of the Energy Sparks application, and am currently working with the team on some improvements to the application, the opinions in this blog post are my own.

Energy Sparks is an online energy analysis tool and energy education programme specifically designed to help schools reduce their electricity and gas usage through the analysis of smart meter data. The service is run by the Energy Sparks charity, which aims to educate young people about climate change and the importance of energy saving and reducing carbon emissions. 

The team provides support for teachers and school eco-teams in running educational activities to help pupils learn about energy and climate change in the context of their school.

It was originally started as a project by Bath: Hacked and Transition Bath and has been funded by a range of organisations. Recent funding has come from the Ovo Foundation and via BEIS as part of the Non-Domestic Smart Meter Innovation Challenge.

The application uses a lot of different types of data to provide insights, analysis and reporting to pupils, teachers and school administrators.

There are a number of challenges with accessing and using these different types of dataset. As there is a lot of work happening across the UK energy data ecosystem at the moment, I thought I’d share some information about what data is being used and where the challenges lie.

School data

Unsurprisingly the base dataset for the service is information about schools. There are actually a lot of different types of information that you need to know about a school in order to do some useful analysis:

  • basic data about the school, e.g. its identifier, type and the curriculum key stages taught at the school
  • where it is, so we can map the schools and find local weather data
  • whether the school is part of a multi-academy trust or which local authority it is with
  • information about its physical infrastructure, for example number of pupils, floor area, whether it has solar panels, night storage heaters, a swimming pool or serves school dinners
  • its calendar, so we can identify term times, inset days, etc. Useful if you want to identify when the heating may need to be on, or when it can be switched off
  • contact information for people at the school (provided with consent)
  • what energy meters are installed at the school and what energy tariffs are being used

This data can be tricky to acquire because:

  • there are separate databases of schools across the devolved nations, with no consistent method of access and little similarity between the data they hold
  • calendars vary across local authorities, school groups and on an individual basis
  • schools typically have multiple gas and electricity meters installed in different buildings
  • schools might have direct contracts with energy suppliers, be part of a group purchase scheme managed by a trust or their local authority, or be part of a large purchasing framework agreement, so tariff and meter data might need to come from elsewhere. Many local authorities appoint separate meter operators, adding a further layer of complexity to data acquisition.

Weather data

If you want to analyse energy usage then you need to know what the weather was like at the location and time it was being used. You need more energy for heating when it’s cold. But maybe you can switch the heating off if it’s cold and it’s outside of term time.

If you want to suggest that the heating might be adjusted because it’s going to be sunny next week, then you need a weather forecast.

And if you want to help people understand whether solar panels might be useful, then you need to be able to estimate how much energy they might have been able to generate in their location. 

This means we use:

  • half-hourly historical temperature data to analyse the equivalent historical energy usage. On average we’re looking at four years’ worth of data for each school, but for some schools we have ten or more years
  • forecast temperatures to drive some user alerts and recommendations
  • estimated solar PV generation data 

The unfortunate thing here is that the Met Office doesn’t provide the data we need. They don’t provide historical temperature or solar irradiance data at all. They do provide forecasts via DataPoint, but these are weather station specific forecasts. So not that useful if you want something more local. 

For weather data, in lieu of using Met Office data we draw on other sources. We originally used Weather Underground until IBM acquired the service and then later shut down the API. So then we used Dark Sky until Apple acquired it and released the important and exciting news that they were shutting down the API.

We’ve now moved on to using Meteostat, a volunteer-run service that provides a range of APIs and bulk data access under a CC-BY-NC-4.0 licence.

The feature that Meteostat, Dark Sky and Weather Underground all offer is location-based weather data, based on interpolating observation data from individual stations. This lets us get closer to actual temperature data at the schools.
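
To give a feel for what interpolating station observations involves, here is a minimal Python sketch using inverse distance weighting. This is an illustrative assumption, not a description of how Meteostat, Dark Sky or Weather Underground actually do it, and the function names and the station tuple layout are made up for the example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def interpolate_temperature(lat, lon, stations, power=2):
    """Estimate the temperature at (lat, lon) from nearby stations.

    stations is a list of (lat, lon, temperature_c) tuples (hypothetical
    layout). Closer stations get a larger weight of 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for s_lat, s_lon, temperature_c in stations:
        distance = haversine_km(lat, lon, s_lat, s_lon)
        if distance < 1e-6:
            # The point is effectively at the station: use its reading.
            return temperature_c
        weight = 1.0 / distance ** power
        weighted_sum += weight * temperature_c
        weight_total += weight
    if weight_total == 0:
        raise ValueError("at least one station is required")
    return weighted_sum / weight_total
```

A real service would also need to account for things like altitude and missing observations, which this sketch ignores.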

It would be great if the Met Office offered a similar feature under a fully open licence. They only offer a feed of recent site-specific observations.

To provide schools with estimates of the potential benefits of installing solar panels, we currently use the Sheffield University Solar PV Live API, which is publicly available, but unfortunately not clearly licensed. But it’s our best option. Based on that data we can indicate potential economic benefits of installing different sizes of solar panels.

National energy generation

We provide schools with reports on their carbon emissions and, as part of the educational activities, give insights into the sources of energy being generated on the national grid. 

For both of these uses, we take data from the Carbon Intensity API provided by the National Grid, which publishes data under a CC-BY licence. The API provides both live and historical half-hourly data, which aligns nicely with our other sources.
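
Because both series are half-hourly, turning meter readings into an emissions figure is mostly a matter of lining up the periods. The sketch below assumes usage in kWh and carbon intensity in gCO2/kWh, each keyed by the start of the half-hour period. The function name and the dictionary layout are hypothetical, not the Energy Sparks implementation.

```python
def emissions_kg(usage_kwh, intensity_g_per_kwh):
    """Combine half-hourly usage with half-hourly carbon intensity.

    Both arguments are dicts keyed by period start (any hashable value,
    e.g. an ISO 8601 string). Returns (total kg CO2, periods skipped
    because no intensity figure was available).
    """
    total_grams = 0.0
    missing = []
    for period, kwh in usage_kwh.items():
        intensity = intensity_g_per_kwh.get(period)
        if intensity is None:
            # No intensity value for this period, so record the gap
            # rather than guessing a value.
            missing.append(period)
            continue
        total_grams += kwh * intensity
    return total_grams / 1000.0, missing
```

For example, 1 kWh used at 200 gCO2/kWh followed by 2 kWh at 100 gCO2/kWh comes to 400 g, or 0.4 kg.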

School energy usage and generation

The bulk of the data coming into the system is half-hourly meter readings from gas and electricity meters (usage) and from solar PV panels from schools that have them (generation and export).

This allows us to chart the readings and to present reports and analysis across each school’s data.

There are numerous difficulties with getting access to this data:

  • the complexity of the energy ecosystem means that data is passed between meter operators, energy suppliers, local authorities, school groups, solar PV systems and a mixture of intermediary platforms. So just getting permission in the right place can be tricky
  • some solar PV monitoring systems, e.g. SolarEdge and RBee, offer APIs so in some cases we integrate with these, further adding to the mixture of sources
  • the mixture of platforms means that, while there is more or less industry standard reporting of half-hourly readings, there is no standard API for accessing this data. There’s a tangle of proprietary, undocumented or restricted-access APIs
  • meters get added and removed over time, so the number of meters can change
  • for some rural schools, reporting of usage and generation is made trickier because of connectivity issues
  • in some cases, we know schools have solar PV installed, but we can’t get access to a proper feed. So in this case, we have to use the Sheffield PV Live API and knowledge of what panels are installed to create estimated outputs
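
As a rough sketch of that last point, one simple way to estimate a school’s output is to scale a wider-area generation figure by the capacity of the school’s own panels. The code below assumes we already have, for a half-hour period, a regional generation figure in MW and the regional installed capacity in MWp. These parameter names are hypothetical and are not taken from the PV Live API, and the approach ignores panel orientation, shading and local cloud cover.

```python
def estimate_school_pv_kwh(regional_generation_mw, regional_capacity_mwp,
                           school_capacity_kwp, period_hours=0.5):
    """Estimate a school's PV generation (kWh) for one period.

    Scales the regional load factor (generation / installed capacity)
    by the school's installed capacity and the period length.
    """
    if regional_capacity_mwp <= 0:
        raise ValueError("regional capacity must be positive")
    # Fraction of peak output the region's panels achieved in this period.
    load_factor = regional_generation_mw / regional_capacity_mwp
    # Average power at the school (kW) multiplied by duration (h) gives kWh.
    return load_factor * school_capacity_kwp * period_hours
```

For example, if the region generated 500 MW from 2,000 MWp of capacity, a school with 20 kWp of panels would be estimated at 2.5 kWh for that half hour.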

The feature that most platforms and suppliers seem to consistently offer, at least to non-domestic customers, is a daily or weekly email with a CSV attachment containing the relevant readings. So this is the main route by which we currently bring data into the system.
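
The layout of these CSV files varies by supplier, so the following is only a sketch of the general shape of the problem. It assumes a made-up format with one row per meter per day: a meter_id column, a date column in YYYY-MM-DD form, and 48 half-hourly kWh columns named hh01 to hh48. None of these names come from a real supplier’s format.

```python
import csv
import io
from datetime import datetime, timedelta


def parse_readings(csv_text):
    """Yield (meter_id, period_start, kwh) tuples from CSV text.

    Expects the hypothetical one-row-per-day layout described above.
    Blank or absent readings are skipped so that gaps stay visible
    downstream instead of being filled with zeros.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        day = datetime.strptime(row["date"], "%Y-%m-%d")
        for slot in range(48):
            value = row.get(f"hh{slot + 1:02d}", "")
            if value is None or value.strip() == "":
                continue  # missing reading for this half hour
            period_start = day + timedelta(minutes=30 * slot)
            yield row["meter_id"], period_start, float(value)
```

A production importer would also need to handle days with 46 or 50 half hours around clock changes, which this sketch does not attempt.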

We’re also currently prototyping ingesting smart meter data, rather than the current AMR (automatic meter reading) data. We will likely be accessing that via one or more intermediaries who provide public APIs that interface with the underlying infrastructure and APIs run by the Data Communications Company (DCC). The DCC is the organisation responsible for the UK’s entire smart meter infrastructure. There is a growing set of companies providing access to this data.

I plan to write more about that in a future post. But I’ll note here that the approach for managing consent and standardising API access is in a very early stage.

Unfortunately the government legislation backing the shift to smart meters only applies to domestic meters. So there is no requirement to stop installing AMR meters in non-domestic settings, nor a path to upgrade them. So services targeting schools and businesses will need to deal with a tangle of data sources for some time to come.

In addition to exploring integration with the DCC, there are other ways that we might improve our data collection. For example, directly integrating with the “Get Information about Schools Service” for data on English schools. Or using the EPC data to help find data on floor area for school buildings.

But as of today, for the bulk of data we use, the two big wins would be access to historical data from the Met Office in a useful format, and some coordination across the industry around standardising access to meter data. I doubt we’ll see the former and I’m not clear yet whether any of the various open energy initiatives will produce the latter.