It’s a new year and time to launch a new type of regular post – welcome to the first of our Data Deep Dives, where our expert Data Team will take you through different data analysis techniques.
Here at Spatial Quotient, we are all about spatio-temporal data. From its collection & processing, to its analysis and visualisation, our business is built on the insights that can be extracted and communicated from this rich data type.
Throughout the COVID-19 pandemic, many organisations have used spatio-temporal data to convey the progression of the virus over time. One of our data scientists, Simon Kirby, had a go, and is going to walk you through the steps to make one yourself.
This plot shows the UK COVID-19 case rate (per 100k people) for each local authority district, and how it changes throughout 2020. I made it to explore a variety of things: a new Python library, the Government’s COVID data and APIs, and finally some interesting theming. It has ended up communicating the pandemic in an effective (if a little scary) way.
All that is needed is some kind of map containing a set of bounded regions, with a variable of interest recorded for each region at different time increments. Phew! All that means in this case is a map of the UK, and a record of the number of COVID cases for each region on that map.
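To make those two ingredients concrete, here is a minimal sketch of how they might look in Python. Everything in it is illustrative: the district IDs, the square "borders", the case counts and the populations are invented, and the field names (district_id, cases, population, rate) are assumptions rather than the names used in the real datasets. The only fixed part is the arithmetic: a rate per 100k people is the case count divided by the population, multiplied by 100,000.

```python
from datetime import date

# Hypothetical boundary data in GeoJSON form: one feature per district.
# The IDs and square "borders" are placeholders, not real districts.
boundaries = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"district_id": "D001"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"district_id": "D002"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]],
            },
        },
    ],
}

# One record per district per day; counts and populations are made up.
records = [
    {"district_id": "D001", "date": date(2020, 3, 1), "cases": 12, "population": 80_000},
    {"district_id": "D002", "date": date(2020, 3, 1), "cases": 30, "population": 120_000},
    {"district_id": "D001", "date": date(2020, 3, 2), "cases": 20, "population": 80_000},
    {"district_id": "D002", "date": date(2020, 3, 2), "cases": 45, "population": 120_000},
]


def rate_per_100k(cases, population):
    """Convert a raw case count into a rate per 100,000 people."""
    return 100_000 * cases / population


# Attach the variable of interest (the case rate) to every record.
for record in records:
    record["rate"] = rate_per_100k(record["cases"], record["population"])
```

Keeping the table in this "long" shape, with one row per district per day, makes it easy to step through the dates one frame at a time later on.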
Helpfully, the Government publishes daily data updates for a variety of variables: cases, deaths, and so on, broken down by district. Regional maps containing the district border information are also easy to find, although finding the map with the correct number of districts was tricky (turns out they have changed a number of times). Finally, it can all be grabbed automatically with Python, which is really handy.
There are some excellent Python libraries that can combine the two datasets. I used choropleth_mapbox from the Plotly library. The UK boundary file is converted into a map, and then each district is coloured by the number of cases identified in that area for that day. Each UK district has its own unique ID to link the number of cases to the map, making this a relatively simple process.
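The linking step can be sketched roughly as follows. This is not the exact code behind the video, just a minimal illustration under some assumptions: the sample data, the district_id and rate field names, the map centre and the zoom level are all made up for the example. The helper link_cases_to_districts is hypothetical too; it simply checks that every case record has a matching district in the boundary file, which is the kind of mismatch you hit when the map has the wrong set of districts. Plotly and pandas are imported inside build_figure so the ID check can be run without them installed.

```python
# Hypothetical sample inputs: two districts on the map, three case records.
sample_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"district_id": "D001"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[-3.0, 54.0], [-2.0, 54.0], [-2.0, 55.0],
                                       [-3.0, 55.0], [-3.0, 54.0]]]}},
        {"type": "Feature", "properties": {"district_id": "D002"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[-4.0, 54.0], [-3.0, 54.0], [-3.0, 55.0],
                                       [-4.0, 55.0], [-4.0, 54.0]]]}},
    ],
}
sample_rows = [
    {"district_id": "D001", "date": "2020-03-01", "rate": 15.0},
    {"district_id": "D002", "date": "2020-03-01", "rate": 25.0},
    {"district_id": "D999", "date": "2020-03-01", "rate": 40.0},  # not on the map
]


def link_cases_to_districts(geojson, rows, id_key="district_id"):
    """Split rows into those with a matching map district and the IDs without one."""
    known_ids = {feature["properties"][id_key] for feature in geojson["features"]}
    matched = [row for row in rows if row[id_key] in known_ids]
    unmatched_ids = sorted({row[id_key] for row in rows} - known_ids)
    return matched, unmatched_ids


def build_figure(geojson, rows, id_key="district_id"):
    """Draw an animated choropleth with one frame per date (needs pandas and plotly)."""
    import pandas as pd
    import plotly.express as px

    return px.choropleth_mapbox(
        pd.DataFrame(rows),
        geojson=geojson,
        locations=id_key,                      # column holding the district ID
        featureidkey=f"properties.{id_key}",   # where that ID lives in the GeoJSON
        color="rate",                          # value used to colour each district
        animation_frame="date",                # one animation frame per day
        mapbox_style="carto-positron",         # base map that needs no access token
        center={"lat": 54.5, "lon": -3.0},     # assumed centre, roughly over the UK
        zoom=4,
    )


matched_rows, unmatched_ids = link_cases_to_districts(sample_geojson, sample_rows)
```

Calling build_figure(sample_geojson, matched_rows).show() would then open the map in a browser. Any IDs left in unmatched_ids are worth investigating before plotting, because districts without a match are silently left uncoloured.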
I would like to point out that the threshold COVID rate of 500 used in the video was chosen to show the progression of COVID cases throughout the year, rather than to highlight the scarily high rates seen in some districts by the end of the year (over 1,200).
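One way to read that threshold, and this is an assumption rather than something stated above, is as the top of the colour scale: any rate at or above 500 gets the darkest colour, so the smaller values earlier in the year still produce visible differences. The sketch below shows the effect with a hypothetical helper, colour_position, which returns how far along the colour scale a given rate would land.

```python
RATE_CAP = 500  # threshold rate per 100k, as used in the video


def colour_position(rate, cap=RATE_CAP):
    """Return the position on the colour scale, from 0.0 (lowest) to 1.0 (highest).

    Rates above the cap saturate at 1.0 instead of stretching the scale.
    """
    clipped = min(max(rate, 0), cap)
    return clipped / cap


# A mid-year rate still lands halfway up the scale...
mid_year = colour_position(250)
# ...while a very high end-of-year rate simply saturates at the top.
end_of_year = colour_position(1200)
```

In Plotly Express the same effect comes from passing range_color=(0, 500) to the plotting function, which pins the ends of the colour scale and draws anything outside that range in the end colours. Without a cap, a handful of districts above 1,200 would set the scale and most of the year would look almost uniformly pale.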