A press release claims to find the UK cities swearing most online.

A Scottish Sun headline states “Glasgow is UK’s most foul-mouthed city”. LADbible writes: “Glasgow Is The UK’s Most Potty-Mouthed City”.

The “recent study” was a press release from a digital marketing company. As stated in their methods note, the analysis looked at Reddit. Cities can have dedicated communities on Reddit (called subreddits).

“Which city swears the most online?” (Image: Reboot)

Analysts took ten swear words from Ipsos MORI’s research on offensive language in 2016. Ofcom, the UK’s media regulator, commissioned that work.

That report does not include a prescribed list of ten words that the company searched for. There is a table of “general swear words”, ordered in…


An image spreading on social media refers to hospital patients.

With concerns over rare blood clots, Denmark suspends use of the Oxford-AstraZeneca vaccine. An image comparing the ‘risk of blood clots’ spreads:

The sources are on the bottom. (Image: Twitter)

We can translate sources for each number:

Oxford-AstraZeneca (Vaxzevria) vaccine

The source is either unstated or incorrect. The estimate is from the UK’s Medicines and Healthcare products Regulatory Agency (MHRA):

By 31 March 20.2 million doses of the COVID-19 Vaccine AstraZeneca had been given in the UK meaning the overall risk of these blood clots is approximately 4 people in a million who receive the vaccine.

The European Medicines Agency used a figure from the Paul-Ehrlich-Institut in Germany. Their estimated rate was…


On bar graphs, showing a value using a line can be effective.

Last week, I looked at how to emulate the mortality graph with a ranged ribbon. This week, I seek to emulate a graph in the Office for National Statistics weekly death reports.

There is a lot to deconstruct here. (Image: ONS)

The graph has the following key elements:

  • A stacked bar graph, showing deaths which involve and do not involve COVID-19. A death ‘involves’ a disease if clinicians believe it caused or contributed to the death.
  • A straight line representing the weekly average of deaths in 2015 to 2019.
  • A legend showing what all three counts correspond to on the graph.
  • Informative text and arrows, highlighting public holidays influence…

What does this tiny probability look like?

The European Medicines Agency (EMA) stated there was a plausible link between the Oxford-AstraZeneca (Vaxzevria) vaccine and rare blood clots. The EMA said the probability of this kind of blood clot was “very low”. With millions vaccinated, very rare side effects can emerge.

The Paul-Ehrlich-Institut reported brain blood clots with low platelets in one in 100,000 vaccinations. The MHRA figure was a little lower, at about 0.4 in 100,000. The disparity comes from differences in coverage, case definition, study period, and population.
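
To compare the two estimates, it helps to put them on one scale. This is a minimal sketch using only the two rates quoted above; the function name is my own, not from either agency.

```python
def per_100k(cases: float, per: float) -> float:
    """Convert 'cases per `per` people' into a rate per 100,000 people."""
    return cases / per * 100_000


mhra_rate = per_100k(4, 1_000_000)  # MHRA: about 4 in a million
pei_rate = per_100k(1, 100_000)     # Paul-Ehrlich-Institut: one in 100,000

print(f"MHRA: {mhra_rate:.1f} per 100,000")
print(f"PEI:  {pei_rate:.1f} per 100,000")
```

Both rates are rounded estimates, so the comparison is only as precise as the published figures.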

What does one in 100,000 look like?

One in 100,000 is a very small probability. It is challenging to think about this chance.

We can put 100…


Emulating the Office for National Statistics graph of mortality.

The Office for National Statistics weekly reports feature a graph of death registrations. Readers can see the 2015–2019 range of registrations, plus the latest year. We can emulate this graph for other countries.
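
The range in such a graph is the lowest and highest count for each week across the comparison years. Here is a rough sketch of that calculation. All numbers are hypothetical and for illustration only, not real registrations, and the names are my own.

```python
# Hypothetical weekly death counts for three weeks, for illustration only.
history = {
    2015: [120, 115, 108],
    2016: [112, 118, 104],
    2017: [125, 110, 101],
    2018: [130, 121, 99],
    2019: [117, 109, 106],
}
latest_year = [128, 140, 103]  # hypothetical latest-year counts


def weekly_range(counts_by_year):
    """Return (minimum, maximum) for each week across all years."""
    weeks = zip(*counts_by_year.values())
    return [(min(week), max(week)) for week in weeks]


ribbon = weekly_range(history)

# Flag weeks where the latest year sits above the historical range.
above_range = [
    week_number
    for week_number, (count, (low, high)) in enumerate(zip(latest_year, ribbon), start=1)
    if count > high
]
print(ribbon)
print(above_range)
```

The list of (minimum, maximum) pairs gives the edges of the ribbon, and the latest year is drawn as a line over it.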

These figures are for the number of death registrations. (Image: ONS)

Our World in Data collates figures on all-cause mortality from two sources. University of California researchers maintain the Human Mortality Database. This database relies on Eurostat and national statistics offices. Ariel Karlinsky and Dmitry Kobak put together the World Mortality Dataset.

The example country is Sweden. Statistics Sweden publishes weekly updates to its deaths file. Unlike the ONS, the preliminary figures show deaths by date of…


“Why were more people dying in the years before the pandemic?”

In the New Statesman, George Eaton writes:

On 12 January, one of the grimmest findings of the Covid-19 pandemic was published: “excess deaths” in England and Wales were reported to have risen to their highest level since the Second World War. In 2020, 608,002 deaths were registered, the largest number since the 1918 Spanish flu and 75,925 more deaths than occurred on average over the preceding five years (the technical definition of excess deaths).
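
The two figures in the quote imply the five-year average being used as the baseline, since excess deaths are registered deaths minus that average. A short sketch of the arithmetic, using only the quoted numbers (the variable names are mine):

```python
registered_2020 = 608_002  # deaths registered in 2020, from the quote
excess_2020 = 75_925       # deaths above the five-year average, from the quote

# Excess = registered - baseline, so baseline = registered - excess.
baseline = registered_2020 - excess_2020  # 532,077
excess_share = excess_2020 / baseline

print(f"Implied five-year average: {baseline:,}")
print(f"Excess as a share of the average: {excess_share:.1%}")
```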

Excess deaths are deaths above a baseline. The Office for National Statistics calculates a simple average of the past five years. The ONS continues to use


No, the ‘study’ does not show Prince William is the “world’s sexiest bald man”.

All that glitters is not gold. News websites claimed a study suggested Prince William is the “world’s sexiest bald man”. There were news articles on The Sun, Daily Mirror, and indy100 websites.

Who did this study and what did they do?

A cosmetics firm typed names of bald male celebrities and ‘sexy’ into Google. A person counted the number of search results.

The recent news articles increase the number of search engine results.

Despite this method, some news articles refer to it as a ‘Google study’. This is incorrect. Google neither commissioned nor conducted this study.

What is the problem with this study?

The measure — the number of search results — has no validity. …


How big was the 2020 increase in mortality in England & Wales?

Measuring mortality matters. Death is the final outcome for many health problems. There are three different measures which analysts look at:

  • Death registrations: this is the total number of deaths. These figures can be by the date of death or registration date.
  • Crude mortality rates: the number of deaths divided by a population estimate. This number is often expressed per 100,000 people or per 1,000 people.
  • Age-standardised mortality rates: this calculation starts with mortality rates for each age group. Analysts then calculate a weighted average of all those age-specific rates. That creates the age-standardised mortality rate.
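
The difference between the crude and age-standardised rates can be sketched with a small example. Every number below is hypothetical, including the age bands and the standard population weights, and the function names are my own. This is an illustration of a weighted average of age-specific rates, not the ONS calculation.

```python
# Hypothetical deaths, populations, and standard weights (weights sum to 1).
age_groups = {
    "0-64": {"deaths": 200, "population": 800_000, "std_weight": 0.80},
    "65-84": {"deaths": 1_500, "population": 150_000, "std_weight": 0.17},
    "85+": {"deaths": 2_000, "population": 50_000, "std_weight": 0.03},
}


def crude_rate(groups, per=100_000):
    """Total deaths divided by total population, per `per` people."""
    deaths = sum(g["deaths"] for g in groups.values())
    population = sum(g["population"] for g in groups.values())
    return deaths / population * per


def age_standardised_rate(groups, per=100_000):
    """Weighted average of age-specific rates, using the standard weights."""
    return sum(
        g["std_weight"] * g["deaths"] / g["population"] * per
        for g in groups.values()
    )


crude = crude_rate(age_groups)
standardised = age_standardised_rate(age_groups)
print(f"Crude rate: {crude:.0f} per 100,000")
print(f"Age-standardised rate: {standardised:.0f} per 100,000")
```

In this made-up population, older people make up a larger share than in the standard weights, so the crude rate comes out higher than the age-standardised rate.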

The Office for National…


How should we express the uncertainty in survey estimates?

Surveys provide estimates, subject to many sources of potential error.

How should researchers express uncertainty with survey results? The standard way is to show an interval around each central estimate. There are many different kinds of intervals for estimates of proportions.

Confidence intervals

Surveys draw a sample from the wider population. Each sample is one instance of the sampling distribution. Sample statistics can differ from true population values.

A 95% confidence interval has a technical meaning. If you did the survey 100 times, expect about 95 calculated intervals to include the actual value.
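
A simulation can illustrate this repeated-survey meaning. The sketch below assumes simple random sampling and uses the normal-approximation (Wald) interval, which is only one of the many kinds of interval. The true proportion, sample size, and number of surveys are arbitrary choices of mine.

```python
import math
import random


def wald_interval(successes, n, z=1.96):
    """Normal-approximation 95% interval for a proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def coverage(true_p, n, surveys, seed=1):
    """Share of simulated surveys whose interval contains the true value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(surveys):
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n)
        hits += low <= true_p <= high
    return hits / surveys


share = coverage(true_p=0.3, n=1_000, surveys=2_000)
print(f"Intervals containing the true value: {share:.1%}")
```

The share should land near 95%, though it will not be exact: the simulation is itself random, and the Wald interval is an approximation.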

It does not mean a single interval has 95% chance…



Averages can illustrate, but testing is a random event.

Diagnostic tests are imperfect. In my co-written article in The Observer:

So out of those five positive tests, at least three will be false positives.

This was an illustrative elision, assuming average inaccuracies. Instead, what if inaccuracies varied? What is the distribution of the test’s precision: the share of all positive results that are true positives?

Tests don’t have to be testing

Suppose we had the following situation for a binary diagnostic test:

  • Prevalence: the proportion of infected people was 0.04%, or 400 in one million.
  • False-positive rate: for every 10,000 tests of uninfected people, an average of three are positive (0.03%).
  • False-negative rate: for every test of an infected…
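
With average rates, the expected precision follows from the prevalence and the two error rates. The sketch below uses the prevalence and false-positive rate from the list. The false-negative values are hypothetical placeholders of mine, since the excerpt cuts off before giving that figure.

```python
def precision(prevalence, false_positive_rate, false_negative_rate):
    """Expected share of positive results that are true positives."""
    true_positives = prevalence * (1 - false_negative_rate)
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


PREVALENCE = 0.0004           # 0.04%, from the list above
FALSE_POSITIVE_RATE = 0.0003  # 0.03%, from the list above

# Hypothetical false-negative rates, for illustration only.
results = {
    fnr: precision(PREVALENCE, FALSE_POSITIVE_RATE, fnr)
    for fnr in (0.1, 0.2, 0.3)
}
for fnr, value in results.items():
    print(f"False-negative rate {fnr:.0%}: precision {value:.1%}")
```

This gives a single expected value for each assumed rate. Showing the distribution of precision would need a simulation in which the counts of true and false positives vary at random.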

Anthony B. Masters

This blog looks at the use of statistics in Britain and beyond. It is written by RSS Statistical Ambassador and Chartered Statistician @anthonybmasters.
