NHS vaccination data wrangling
R is for reproducible workflows.
Colin Angus, a policy modeller, wrote on Twitter:
A few years ago, my workflow tended to be to find interesting data, usually in an Excel file, then tidy it up *in Excel* and export a clean csv to then import into R for plotting…
I recognised this pattern of working, as I do it too.
The NHS England weekly vaccination reports contain a lot of information, but those statistics are not held in clean formats. We want a table, by age group, of vaccination doses and population estimates. From there, we can produce a graph of vaccination coverage using different population estimates.
Curling towards freedom
Our first step is to load the curl package. That allows us to download Excel files straight from hosted pages. The read.csv function can also read from a web address, but only for CSV files served on ‘http’ domains.
library(tidyverse)
library(readxl)
library(curl)
library(scales)
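With those packages loaded, the download step described above could be sketched roughly as follows. This is a minimal sketch, not necessarily the approach taken later in this post: the helper name read_hosted_excel and its arguments are hypothetical, and it assumes the hosted file is an .xlsx workbook. It relies on curl_download() from curl and read_excel() from readxl.

```r
library(curl)
library(readxl)

# Hypothetical helper (not part of curl or readxl): download a hosted
# Excel workbook to a temporary file, then read one sheet from it.
read_hosted_excel <- function(url, sheet = 1, ...) {
  # Assumption: the hosted file is an .xlsx workbook
  temp_path <- tempfile(fileext = ".xlsx")
  # Remove the temporary file once the function has finished
  on.exit(unlink(temp_path), add = TRUE)
  # Binary mode matters on Windows, where text mode can corrupt the file
  curl::curl_download(url, destfile = temp_path, mode = "wb")
  # Further arguments such as range or skip are passed to read_excel()
  readxl::read_excel(temp_path, sheet = sheet, ...)
}
```

Calling read_hosted_excel() with the address of a weekly report and a sheet name or number would return that sheet as a tibble. Because these workbooks are not laid out as clean tables, arguments such as range or skip would probably be needed to pick out the cells of interest.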