Reproducible analytical pipelines

How do analysts RAP a process up?

Anthony B. Masters
2 min read · Feb 6, 2022

Producing statistics for publication is often a key part of analytical roles. Analysts may work for governments, sending official figures to ministers and the public. Reports may also be internal to a business, going to senior managers and decision-makers. Industry regulators may be another recipient.

Processes for calculating and publishing statistics can be cumbersome, with many manual steps. There may be many spreadsheets, passing from one team to another. Files may also depend on one another. For instance, a common route from the data store to the final document is:

  1. Statistical software exports a spreadsheet: A data portal may produce a spreadsheet. In other cases, some code runs — exporting a spreadsheet. Data ‘stores’ may also be flat files and other spreadsheets.
  2. Spreadsheet manipulation: That file then goes into another spreadsheet. Formulae convert the data tab into the desired graphs and tables.
  3. Copying into a Word document: Graphs and tables go into a document.
  4. Saving as a PDF: That document then transforms into a PDF.
(Image: UK Government Data Science)

What are the problems with this approach? There are many steps — taking time and leading to human error. Spreadsheet errors can be horrific
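To make the contrast concrete, here is a minimal sketch of the four manual steps collapsed into one scripted run. It is an illustration under stated assumptions, not a description of any particular organisation's pipeline: the data, the column names, and the function names (`load_rows`, `summarise`, `render_report`, `run_pipeline`) are all made up for the example, and the "report" is plain text rather than a Word document or PDF.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical raw extract standing in for the data store (made-up figures).
RAW_CSV = """region,quarter,value
North,Q1,10
North,Q2,14
South,Q1,7
South,Q2,9
"""


def load_rows(text):
    """Step 1: read the raw data directly, with no exported spreadsheet."""
    return list(csv.DictReader(io.StringIO(text)))


def summarise(rows):
    """Step 2: compute the table in code rather than with spreadsheet formulae."""
    by_region = defaultdict(list)
    for row in rows:
        by_region[row["region"]].append(float(row["value"]))
    return {region: mean(values) for region, values in sorted(by_region.items())}


def render_report(summary):
    """Steps 3 and 4: build the document text from the table, with no copy and paste."""
    lines = ["Mean value by region", ""]
    for region, average in summary.items():
        lines.append(f"{region}: {average:.1f}")
    return "\n".join(lines)


def run_pipeline(text):
    """Run every step in order, so the same input always gives the same report."""
    return render_report(summarise(load_rows(text)))


if __name__ == "__main__":
    print(run_pipeline(RAW_CSV))
```

Because each step is a function, re-running the script on updated data regenerates the whole report without anyone retyping or pasting figures by hand.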

Written by Anthony B. Masters

This blog looks at the use of statistics in Britain and beyond. It is written by RSS Statistical Ambassador and Chartered Statistician @anthonybmasters.
