Correlations and Time Series

A high correlation does not mean what you think it means.

Anthony B. Masters
5 min read · Oct 24, 2020

Over 1,700 users shared a series of Twitter posts, which claimed “proof” that:

The vast majority of COVID deaths in England since July have been mislabelled false positive deaths.

The suggestion is that recorded COVID-19 deaths rose because of greater testing. The posts assert that increased testing led to many incorrect diagnoses. Their “proof” was a high correlation between two time series.

This article focuses on statistical problems with measuring correlation in time series. For these reasons, among others, their conclusion is false.
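To illustrate the general problem, here is a minimal sketch in Python. It does not use the data from the posts. It simulates two series that are independent by construction, but which both drift upwards over time. The drift, noise level, series length and seed are all assumptions chosen for illustration. The sketch then compares the correlation of the raw levels with the correlation of the step-to-step changes.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def drifting_walk(rng, n_steps, drift=1.0, noise_sd=1.0):
    """Random walk with upward drift: each step adds drift plus Gaussian noise.

    The drift and noise values are illustrative assumptions, not estimates
    from any real testing or deaths data.
    """
    level, path = 0.0, []
    for _ in range(n_steps):
        level += drift + rng.gauss(0.0, noise_sd)
        path.append(level)
    return path


def differences(series):
    """Step-to-step changes of a series (first differences)."""
    return [b - a for a, b in zip(series, series[1:])]


# Two series generated independently: neither has any effect on the other.
rng = random.Random(2020)
series_a = drifting_walk(rng, 200)
series_b = drifting_walk(rng, 200)

r_levels = pearson(series_a, series_b)
r_changes = pearson(differences(series_a), differences(series_b))

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

Under these assumptions, the levels should be strongly correlated, because both series trend in the same direction. The changes should show little correlation, which reflects the fact that the two series are unrelated. A high correlation between two trending time series is, on its own, weak evidence that one drives the other.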

Correlation coefficients

Dr Craig, a pathologist, starts with this graph:

The graph gives no data source, though it should be the PHE Coronavirus Dashboard. Dr Craig writes:

You will notice that the shape of the two curves are very similar. We can test this. The chart below demonstrates that since August 93% of the rise in deaths can be accounted for by the rise in the number of tests done in hospitals over the 28 days preceding.


Anthony B. Masters

This blog looks at the use of statistics in Britain and beyond. It is written by RSS Statistical Ambassador and Chartered Statistician @anthonybmasters.