It is common for energy generation and consumption values to be presented as half-hourly readings, giving 48 readings over the course of a single 24-hour period. This is the type of data we’re working with on a daily basis in Energy Sparks.
I thought I’d share a few things that I learned about working with this type of data, in the classic “Falsehoods Programmers Believe About X” format. But in this case the programmer is me. You might have different or better insights. In which case, leave a comment!
This post focuses just on half-hourly data. I’m going to do a second one about metering.
Half-hourly data is always available
While modern meters are AMR (Automated Meter Reading) or SMETS 1 or 2, there are still meters out there that are not capable of producing half-hourly readings.
Some meters, e.g. those installed to monitor solar arrays, only report cumulative figures over a longer period, e.g. lifetime generation.
Half-hourly data is the most granular data available
Half-hourly data is the standard around which modern electricity and gas metering is based. But that doesn’t mean that more granular data isn’t available in some cases.
Meters installed as part of a solar array, e.g. Solar Edge, might report data at 15 minute intervals.
While SMETS 2 meters make data available to suppliers (or other authorised users) on a half-hourly basis, within the home you can access real-time data from the electricity meter (only).
Those readings are only available to certified devices. Some companies and suppliers are manufacturing in-home devices that can bridge from the Zigbee network to Wi-Fi, allowing readings at a more fine-grained level.
Other “clamp-on” devices which take readings from an electricity cable can do similar real-time reporting.
There are at most 48 readings
Daylight saving means that one day in the year (in autumn, when the clocks go back) will have 50 readings. The spring changeover day is short rather than long, with 46.
Except some supplier data feeds only ever have 48. It’s unclear what happens during daylight savings.
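One way to check how many half-hours a local calendar day really contains is to measure the day in UTC. This is a minimal sketch using Python's standard zoneinfo module. It assumes UK local time (Europe/London), and the function name half_hours_in_day is my own invention rather than anything from a metering library.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def half_hours_in_day(day: date, tz_name: str = "Europe/London") -> int:
    """Count the half-hour periods between local midnight and the next local midnight."""
    tz = ZoneInfo(tz_name)
    start = datetime.combine(day, time.min, tzinfo=tz)
    end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=tz)
    # Subtracting two datetimes that share a tzinfo uses wall-clock time and
    # ignores the change in UTC offset, so convert both to UTC first.
    elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
    return int(elapsed / timedelta(minutes=30))


print(half_hours_in_day(date(2024, 3, 31)))   # clocks go forward
print(half_hours_in_day(date(2024, 10, 27)))  # clocks go back
```

If a feed gives you 48 readings on one of these days, that is a hint that it is either working in UTC or quietly dropping or padding readings.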
There will never be fewer than 48 readings
Meters can fail to record (or submit) a reading so you might have missing readings within a day.
Some solar monitoring systems, e.g. Solis Cloud, only seem to report data when there are non-zero readings. So their API only returns data between e.g. 6am and 8pm when there was any generation on the panels, not a fixed set of 48 readings. Other APIs differ.
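If the rest of your code expects a fixed set of slots, one approach is to pad a sparse response out to a full day. The sketch below assumes each reading is keyed by the start time of its half-hour and that the day has 48 slots. The names slot_index and to_fixed_day are illustrative only.

```python
from datetime import time


def slot_index(t: time) -> int:
    """Map a half-hour start time (e.g. 06:30) to a slot number from 0 to 47."""
    return t.hour * 2 + t.minute // 30


def to_fixed_day(sparse: dict, slots: int = 48, fill=None) -> list:
    """Place sparse readings into a fixed-length day, using `fill` for missing slots."""
    day = [fill] * slots
    for t, value in sparse.items():
        day[slot_index(t)] = value
    return day


# Two readings from a hypothetical API that only reports non-zero generation.
readings = {time(6, 0): 0.1, time(6, 30): 0.4}
day = to_fixed_day(readings)
print(len(day), day[12], day[13])
```

Whether to fill with None or 0.0 is a judgement call. Zero is reasonable if you are sure the API's silence means "no generation", but None is safer if it could also mean "no data".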
The readings are always reporting energy consumption
Different types of meters measure different things. So a half-hourly time series might cover a range of data types and units.
A solar generation meter measures the energy generated by the panels, while a self-consumption meter shows how much of that energy was consumed locally. An export meter measures how much energy is exported to the grid.
An electricity meter might also report the Reactive Energy for the circuit.
If you’re taking data from a supplier then the half-hourly data might also be estimated, rather than actual reads from the meter.
Labelling of half-hourly data is standardised
Some CSV formats organise readings into rows, one for each day and with each half-hour being in a separate column. The column headings might be labelled with the start or the end of the half-hour being reported. E.g. the electricity consumption for the period of time between 1am and 1.30am might be labelled as “1:00” or “1:30”.
For CSV files organised like this there’s no real ambiguity: the 48 (or 46, or 50!) readings are presented in column order.
But some formats report data with one row per half-hourly reading. So you need to take care to ensure you’re parsing the reading time or time-stamps correctly.
2024-04-24T00:00 might be the consumption between 23:30 on the 23rd April and 00:00 on the 24th. Or it might be the data for 00:00 to 00:30 on the 24th.
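One way to keep this manageable is to pick a single convention internally (here, the start of the period) and convert on the way in. This is a minimal sketch: the labelled_by parameter is my own name, and you still have to work out from the feed's documentation, or by inspection, which convention a given source uses.

```python
from datetime import datetime, timedelta

HALF_HOUR = timedelta(minutes=30)


def period_start(timestamp: str, labelled_by: str = "start") -> datetime:
    """Return the start of the half-hour that a timestamp refers to."""
    ts = datetime.fromisoformat(timestamp)
    if labelled_by == "end":
        # The feed labels each reading with the end of its period.
        return ts - HALF_HOUR
    if labelled_by == "start":
        return ts
    raise ValueError(f"unknown labelling convention: {labelled_by!r}")


print(period_start("2024-04-24T00:00", labelled_by="end"))
print(period_start("2024-04-24T00:00", labelled_by="start"))
```

Note that with end-of-period labelling the reading belongs to the previous day, which matters if you are grouping readings into daily totals.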
Consumption readings are always in kWh
Electricity consumption data is reported in kWh. But gas meters report usage based on the volume of gas supplied.
Modern meters report usage in cubic metres, but there are still gas meters in use that report data in imperial units. They might be reporting in cubic feet (cf) or hundreds of cubic feet (hcf).
You’ll need to convert this to kWh using a standard formula that takes into account variations in temperature and pressure as well as the calorific value of the gas.
In the UK calorific values will typically be between 37.5 and 43.0 MJ/m³ with 40 being an acceptable default. The actual calorific value for your gas supply will be on the monthly bill. As far as I’m aware there’s no programmatic way to access this information.
Whether your gas meter reports data in cf or hcf also only seems to be present on energy bills.
The readings are always measured values
SMETS electricity and gas meters occasionally produce very high values which are not actual recorded consumption. They are error codes used to report some kind of meter fault. You need to trap and handle these when processing the data.
As the linked documents note: “Due to the nature of the national programme and the position taken by government and the regulator this is no obligation or requirement for manufacturers to publish these error codes so they can be captured and processed explicitly.”
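Since the codes themselves aren't published, one fallback is a plausibility check: treat anything above a ceiling you choose for the site as suspect, rather than matching specific values. This is only a sketch of that idea. The max_plausible_kwh threshold is an assumption you would need to set per meter, and the 999999.0 in the example is a made-up value, not a real error code.

```python
def split_suspect_readings(readings, max_plausible_kwh):
    """Replace implausibly high readings with None and report where they were."""
    clean, suspect = [], []
    for index, value in enumerate(readings):
        if value is not None and value > max_plausible_kwh:
            clean.append(None)               # treat as missing, not as consumption
            suspect.append((index, value))   # keep the raw value for investigation
        else:
            clean.append(value)
    return clean, suspect


# 999999.0 is an invented stand-in for an error-code style value.
clean, suspect = split_suspect_readings([0.5, 999999.0, 0.7], max_plausible_kwh=100.0)
print(clean)
print(suspect)
```

Keeping the suspect values, rather than silently dropping them, makes it easier to spot a pattern for a given meter type later on.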
The readings are always for the usage in a single half-hourly period
I’ve only observed this for some solar meters, but it might occur elsewhere with other types of metering. It relates to how meters remotely report data in scenarios where the connectivity is poor.
Sometimes there will be a visible spike for a single half-hour after a period of missing readings.
Some meters, if they hit a communication error, will report all of the consumption (or generation) that hasn’t already been reported at the next opportunity. In effect the meter “catches up” by attributing all of the unreported data to the next half-hour in which it’s able to connect.
Depending on the capability of the meters, this “catch-up” reporting might be for a couple of hours or it might extend into the next day.
This can create problems where your generation, export and self-consumption are no longer aligned. One option is to just average out the data across the missing period, allowing for periods when the panels are unlikely to be generating.
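The averaging option might look something like the sketch below. It assumes missing readings are represented as None and that the first reading after a gap is a catch-up total covering the whole gap, which you would need to confirm for your own meters. The optional can_generate callback is a hypothetical hook for excluding slots, e.g. overnight, when the panels are unlikely to be generating.

```python
def spread_catch_up(readings, can_generate=None):
    """Share a catch-up reading evenly across the gap before it and its own slot."""
    result = list(readings)
    gap = []  # indexes of missing slots seen since the last real reading
    for i, value in enumerate(readings):
        if value is None:
            gap.append(i)
            continue
        if gap:
            slots = gap + [i]
            eligible = [s for s in slots if can_generate is None or can_generate(s)]
            if not eligible:
                eligible = [i]  # nowhere sensible to spread to, so leave the total in place
            share = value / len(eligible)
            for s in slots:
                result[s] = share if s in eligible else 0.0
            gap = []
    return result


print(spread_catch_up([1.0, None, None, 3.0, 1.0]))
print(spread_catch_up([1.0, None, None, 3.0], can_generate=lambda slot: slot >= 2))
```

The total for the day is preserved, but the shape within the gap is invented, so it is worth flagging these slots as adjusted rather than measured. A gap with no reading after it is left as None.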
As far as I’m aware there’s no standard way to describe this feature or other capabilities of solar meters.
Estimated data is always clearly labelled
Meters don’t report estimated readings, so if you’re pulling data directly from a meter, e.g. a SMETS 2 meter, then you’ll only get actuals.
But if you’re taking data from suppliers or from other parts of the energy data infrastructure then the readings might include estimates, for example if a meter cannot be read remotely.
You need to take care to understand whether the data feed or API you’re using includes only actuals or also estimated readings. Not every feed makes this clear.
You may also need to reload or refresh data when the actuals become available. Some data feeds will push through actuals when available. Others use a fixed window, e.g. last 7 days, so you need another mechanism to handle getting historical data.
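When reloading, the merge rule I'd suggest is that an actual can replace an estimate but an estimate should never overwrite an actual. A minimal sketch, assuming each reading is stored as a (value, flag) pair keyed by timestamp. The "A" and "E" flags are hypothetical, as real feeds each have their own way of marking estimates, if they mark them at all.

```python
def merge_readings(stored: dict, incoming: dict) -> dict:
    """Merge a fresh batch of readings into stored ones, preferring actuals."""
    merged = dict(stored)
    for timestamp, (value, flag) in incoming.items():
        existing = merged.get(timestamp)
        if existing is not None and existing[1] == "A" and flag == "E":
            continue  # never let an estimate overwrite an actual
        merged[timestamp] = (value, flag)
    return merged


stored = {"2024-04-24T00:00": (1.0, "E"), "2024-04-24T00:30": (2.0, "A")}
incoming = {
    "2024-04-24T00:00": (1.2, "A"),  # actual arrives for an estimated slot
    "2024-04-24T00:30": (9.9, "E"),  # stale estimate for a slot we already have
    "2024-04-24T01:00": (3.0, "E"),  # new slot, estimate only so far
}
merged = merge_readings(stored, incoming)
print(merged)
```

With a fixed-window feed, running this on every load means late-arriving actuals are picked up for as long as the slot is still inside the window.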
Estimated data will eventually be replaced with actual values
Estimated data is used if there’s a problem reading a meter. However, even if that problem is resolved you won’t necessarily be able to get the “missing” half-hourly data.
Some meters do have a memory so will store a history of the readings which can be later downloaded. But there’s an upper limit on this. So you may never get actual half-hourly reads to replace the estimates.
From a billing perspective the customer’s bill will be adjusted based on the current meter readings. But you’re not guaranteed to get the missing data.
Daylight saving means one day will have 50 readings (autumn/fall when clocks go back by 1 hour) and one day will have 46 readings (spring when clocks go forward by 1 hour).