NHS vaccination data wrangling

R is for reproducible workflows.

Anthony B. Masters
4 min read · Jul 3, 2021

Colin Angus, a policy modeller, wrote on Twitter:

A few years ago, my workflow tended to be to find interesting data, usually in an Excel file, then tidy it up *in Excel* and export a clean csv to then import into R for plotting…

I recognised this pattern of working, as I do it too.

The NHS England weekly vaccination reports contain a great deal of information, but those statistics are not held in clean formats. We want a table, by age group, of vaccination doses and population estimates. From there, we can produce a graph of vaccination coverage using different population numbers.
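To make the target concrete, here is a minimal sketch of the table shape and the coverage calculation (doses divided by population). Everything in it is an assumption for illustration: the column names, the two population sources and the numbers are placeholders, not NHS figures.

```r
library(tidyverse)

# Placeholder values for illustration only; these are NOT NHS figures.
# Column names and population sources are hypothetical.
vaccination_tbl <- tribble(
  ~age_group,    ~first_doses, ~population_a, ~population_b,
  "Under 50",    400,          1000,          800,
  "50 and over", 900,          1000,          1200
)

# One row per age group and population source,
# with coverage calculated as doses divided by population.
coverage_tbl <- vaccination_tbl %>%
  pivot_longer(cols = starts_with("population_"),
               names_to = "population_source",
               values_to = "population") %>%
  mutate(coverage = first_doses / population)

# A grouped bar chart comparing coverage under each population estimate.
coverage_plot <- ggplot(coverage_tbl,
                        aes(x = age_group, y = coverage,
                            fill = population_source)) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = scales::percent)
```

Putting the population sources in long format means the same doses can be compared against each denominator in one graph.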

Curling towards freedom

Our first step is to install and load the curl package. That allows us to download Excel files straight from hosted pages. The read.csv function can read a file directly from a web address, but only with CSV files served on ‘http’ domains.

library(tidyverse)
library(readxl)
library(curl)
library(scales)
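As a minimal sketch of how curl and readxl could work together, the helper below downloads a workbook to a temporary file and then reads one sheet from it. The function name read_remote_excel and the commented-out web address are hypothetical; the real file address, sheet name and cell range would need to come from the NHS England page.

```r
library(curl)
library(readxl)

# Hypothetical helper: download an Excel workbook to a temporary file,
# then read a single sheet (and optionally a cell range) from it.
read_remote_excel <- function(url, sheet = 1, range = NULL) {
  temp_path <- tempfile(fileext = ".xlsx")
  # curl_download() saves the file at `url` to `destfile`.
  curl::curl_download(url, destfile = temp_path)
  readxl::read_excel(temp_path, sheet = sheet, range = range)
}

# Example call with a placeholder address (not a real NHS link):
# nhs_raw_tbl <- read_remote_excel(
#   "https://example.org/weekly-vaccinations.xlsx",
#   sheet = 1
# )
```

Saving to a temporary file keeps the download out of the project folder, and the sheet and range arguments let us skip the header rows and notes that often surround published tables.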

Anthony B. Masters

This blog looks at the use of statistics in Britain and beyond. It is written by RSS Statistical Ambassador and Chartered Statistician @anthonybmasters.