
A Spreadsheet of Errors

How can misusing Microsoft Excel induce analytical errors?

Anthony B. Masters
3 min read · Oct 12, 2020

Using Microsoft Excel can cause major problems in statistical reporting and analysis. This article looks at several recent incidents of Excel errors.

The missing cases

Public Health England gathered SARS-CoV-2 swab test results from commercial firms. These results arrived as comma-separated values (CSV) files, a plain-text list format.

An old version of Excel pulled together different text files. (Image: BBC)

In an automated process, the agency used Excel to pull together these text-based files. For each SARS-CoV-2 test, there were several rows in the file.

The agency used an old Excel file format (XLS). That meant each collated worksheet could hold only 65,536 rows. Microsoft superseded that format in 2007. The row limit in the current format (XLSX) is 1,048,576.
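To see what those limits mean in practice, here is a rough sketch of the arithmetic. The two row limits are Excel's documented worksheet limits. The function name `max_tests`, the single header row, and the figure of five rows per test are illustrative assumptions, not details from the agency's process.

```python
# Worksheet row limits for the two Excel file formats.
XLS_MAX_ROWS = 2 ** 16    # 65,536 rows in the legacy .xls format
XLSX_MAX_ROWS = 2 ** 20   # 1,048,576 rows in the .xlsx format


def max_tests(row_limit: int, rows_per_test: int, header_rows: int = 1) -> int:
    """Return how many complete tests fit in one worksheet (hypothetical helper)."""
    return (row_limit - header_rows) // rows_per_test


# rows_per_test=5 is an assumed value for illustration only.
xls_capacity = max_tests(XLS_MAX_ROWS, rows_per_test=5)
xlsx_capacity = max_tests(XLSX_MAX_ROWS, rows_per_test=5)
print(f"XLS holds about {xls_capacity:,} tests; XLSX holds about {xlsx_capacity:,}.")
```

Because each test takes up several rows, the number of tests a sheet can hold is a fraction of its row limit.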

When the data exceeded that limit, the extra rows were left out of the file. The result was a temporary under-reporting of lab-confirmed cases. These reported cases feed into the NHS Test and Trace system, so the problem delayed attempts to control the virus.

This was not a “glitch”, but the inevitable outcome of the automated process.
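An automated process can be written to fail loudly rather than drop rows. The sketch below is one possible guard, not a description of the agency's actual pipeline. The names `collate_csv` and `RowLimitExceeded` are hypothetical, and the sample data is made up.

```python
import csv
import io

XLS_MAX_ROWS = 65_536  # worksheet row limit of the legacy .xls format


class RowLimitExceeded(Exception):
    """Raised when collated data would not fit in the target worksheet."""


def collate_csv(sources, row_limit=XLS_MAX_ROWS):
    """Combine rows from several CSV text streams, refusing to truncate."""
    rows = []
    for source in sources:
        rows.extend(csv.reader(source))
    if len(rows) > row_limit:
        raise RowLimitExceeded(
            f"{len(rows)} rows collated, but the sheet holds only {row_limit}; "
            f"{len(rows) - row_limit} rows would be lost"
        )
    return rows


# Two small in-memory "files" with made-up contents, checked against a tiny limit.
file_a = io.StringIO("id,result\n1,positive\n2,negative\n")
file_b = io.StringIO("3,positive\n4,positive\n")
try:
    collate_csv([file_a, file_b], row_limit=4)
except RowLimitExceeded as error:
    print(error)
```

Counting the rows before writing them turns a silent loss into an error that someone has to deal with.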

Spreadsheet problems


Written by Anthony B. Masters

This blog looks at the use of statistics in Britain and beyond. It is written by RSS Statistical Ambassador and Chartered Statistician @anthonybmasters.
