A poster and presentation at the RSS conference featured flawed analysis.

An analyst from the Pension and Population Research Institute gave a rapid-fire, five-minute talk at the Royal Statistical Society conference in Manchester. That talk (plus a poster) asserted a strong link between abortions and breast cancer. Note: this article will discuss abortion and breast cancer statistics in the United Kingdom.

Poster problems

There are many statistical problems with this poster.

The incidence plateaus in older cohorts, whilst cumulative abortions rise. (Image: PAPRI)

To reiterate: correlation is not causation. Imagine there were researchers at…

Analytical decisions affect estimates when working from identical data.

When analysing data, what decisions do we make? Given a problem to solve, there are many dividing paths we could take. Those decisions could affect outputs of our analysis, such as estimates and intervals.

Different methods produce different estimates.

In children, a SARS-CoV-2 infection often has no symptoms or induces only mild disease. Fatalities among children are rare, but mortality is not the only concern. Some people with Covid-19 may experience symptoms long after recovering from infection.

COVID Infection Survey (ONS)

The Office for National Statistics conducts a random infection survey. The survey is of private households. It does not include hospitals, care homes, or other institutions like prisons. As part of that survey, respondents answer this question:

Would you describe yourself as…

You can find statistics for your local area.

Prof Michie (UCL) asks:

Information is absolutely key. We need more data, not less data. Everybody should be able to know, for example, what is the prevalence of Covid in the neighbourhood?

These statistics exist, so this article looks at what people can see.

Using an example postcode (from Central North, Swindon), we get:

  • The number of people testing positive, by swab date, in a seven-day period. For the seven days to 20th August, that is 45.
  • The number per 100,000 people, based on a population estimate…
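
The rate per 100,000 is a simple rescaling of the count by a population estimate. Here is a minimal sketch in Python. The count of 45 is the seven-day figure quoted above. The population value is a made-up placeholder, not the estimate used for this area, and the function name is only for illustration.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Return cases per 100,000 people for a given population estimate."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


weekly_cases = 45            # seven-day count quoted in the text
assumed_population = 10_000  # hypothetical placeholder, not the real estimate

rate = rate_per_100k(weekly_cases, assumed_population)
print(f"{rate:.1f} cases per 100,000 people")
```

With a different population estimate, the same count gives a different rate. That is why the estimate matters when comparing areas.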

The latest Observer column had some errors.

In my recent column, co-written with Sir David Spiegelhalter, we wrote:

if we just look at women who are currently or recently pregnant, this rises to 37%; far higher than the 13% of live births by mothers of black or Asian ethnicity.

That part should read:

if we just look at women who are currently or recently pregnant, this rises to 34%; far higher than the 17% of live births classified as Black or Asian ethnicity.

All apologies.

In detail

The Intensive Care National Audit and Research Centre (ICNARC) produces weekly reports. These publications cover around 40,000 patients in critical care. …

A claimed survey about cheese choices gives a processed answer.

FoodHub, a takeaway delivery service, claim cheese slices are Britain’s favourite cheese. Several outlets laundered the assertion, including Indy100, Mail Online, and the Metro.

Of course. (Image: FoodHub)

Their press release only says it is a “nationwide poll by us”. Yet, the FoodHub post contains no information about survey methods.

Analogies can be useful for explaining statistics.

A journalist from the Financial Times spoke to me about vaccination statistics. I used an analogy to help explain why the vaccinated share of deaths rises as coverage grows:

About two-thirds of people who die on UK roads are wearing a seatbelt, but this is a consequence of usage rates of nearly 99 per cent, Masters said. He added that the same logic applied to severe disease and death in highly vaccinated populations.

These figures are imprecise. What I recall saying was:

In Great Britain, most car occupants who die in incidents are wearing seat-belts. …
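
The logic of the analogy can be sketched with some arithmetic. This Python example assumes a single relative risk of death for seat-belt wearers compared with non-wearers. That value (0.5) and the usage rates are illustrative assumptions, not figures from the article or from official road statistics.

```python
def share_of_deaths_among_wearers(usage: float, relative_risk: float) -> float:
    """Share of deaths that occur among seat-belt wearers.

    usage: fraction of occupants wearing a seat-belt (between 0 and 1).
    relative_risk: risk of death for wearers divided by risk for non-wearers.
    """
    wearer_deaths = usage * relative_risk      # proportional to deaths among wearers
    non_wearer_deaths = (1 - usage) * 1.0      # non-wearers are the reference group
    return wearer_deaths / (wearer_deaths + non_wearer_deaths)


assumed_relative_risk = 0.5  # hypothetical: wearing a belt halves the risk

shares = {}
for usage in (0.50, 0.90, 0.99):
    shares[usage] = share_of_deaths_among_wearers(usage, assumed_relative_risk)
    print(f"Usage {usage:.0%}: {shares[usage]:.0%} of deaths are among wearers")
```

Even though wearing a belt lowers the risk in this sketch, wearers make up most deaths once nearly everyone wears one. The same reasoning applies to the vaccinated share of deaths as coverage grows.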

The American Statistical Association task-force shares a statement.

Back in 2019, the President of the American Statistical Association (ASA) launched a task-force into p-values. This task-force started after an editorial in The American Statistician, an ASA journal. People may have mistaken that editorial for official ASA policy.

the probability under a specified statistical model that a statistical summary of the data would be equal to or more extreme than its observed value.

The task-force’s statement says:

Much of the controversy surrounding statistical significance can be dispelled through a better appreciation of uncertainty, variability, multiplicity, and replicability.

Often, journal articles will use…

This value does not mean the probability the null hypothesis is true.

The p-value is a misunderstood statistic. Sometimes, textbooks carry incorrect definitions, which perpetuates misinterpretations.

The P-value is the bottom line of most statistical tests. It is simply the probability that the hypothesis being tested is true. So if a P-value is given as 0.06, that indicates that the hypothesis has a 6% chance of being true.

This is wrong.

the probability under a specified statistical model that a statistical summary of the data would be equal to…
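
To make that definition concrete, here is a small Python sketch. It assumes a simple model (a fair coin) and a made-up observation (15 heads in 20 flips). Neither comes from the article. The p-value is the probability, under that model, of a result at least as extreme as the one observed.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_flips = 20         # hypothetical number of coin flips
observed_heads = 15  # hypothetical observed count

# One-sided p-value: probability of 15 or more heads if the coin is fair.
p_value = binomial_upper_tail(n_flips, observed_heads)
print(f"One-sided p-value: {p_value:.4f}")
```

The calculation starts by assuming the model is true. So the result cannot be the probability that the hypothesis is true. It describes how surprising the data would be under that model.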

Different averages tell different stories.

How many potential years of life do people dying of Covid-19 lose? That is a prominent question in the pandemic. It is another way of showing mortal impacts, giving greater weight to younger deaths.

Years of life lost

What does ‘years of potential life lost’ mean? There are different definitions. One definition compares a person’s actual age at death with life expectancy at that age. The difference is then the potential years of life lost. It is a difference between reality and expectation. Such analyses could also account for co-morbidities.
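
The definition above can be sketched in a few lines of Python. The remaining life expectancies and ages at death below are made-up placeholders, not values from any published life table or from Covid-19 data. The sketch ignores co-morbidities.

```python
from statistics import mean, median

# Hypothetical remaining life expectancy (in years) at selected ages.
remaining_life_expectancy = {60: 25.0, 70: 16.0, 80: 9.0, 90: 4.0}

# Hypothetical ages at death for five people.
ages_at_death = [60, 80, 80, 90, 90]

# Years of potential life lost: the expected remaining years at the age of death.
years_lost = [remaining_life_expectancy[age] for age in ages_at_death]

total_lost = sum(years_lost)
mean_lost = mean(years_lost)
median_lost = median(years_lost)

print(f"Total years lost: {total_lost}")
print(f"Mean years lost per death: {mean_lost:.1f}")
print(f"Median years lost per death: {median_lost:.1f}")
```

In this toy example, one younger death pulls the mean above the median. The two averages summarise the same deaths differently.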

Anthony B. Masters

This blog looks at the use of statistics in Britain and beyond. It is written by RSS Statistical Ambassador and Chartered Statistician @anthonybmasters.
