Bayesian approximations in A/B tests

How close does an approximation get to numerical simulations?

A/B testing is common in businesses. The test puts two versions of the same experience in contest with one another. People (such as website users) are randomly assigned to see one version.

The research question is simple to state: which version is better? This article compares three different uncertainty intervals in R.

Uncertainty intervals

In an earlier post, I wrote, of an approximation:

For this example, that 95% (highest density) interval is from -0.1 to 8.1 points. This is similar to both numerical methods and classical approximations.

Suppose there are 1,000 users on each version of the web page. Each user converts or…
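To give a feel for the numerical side of that comparison, here is a minimal simulation sketch. The article works in R; this sketch uses Python. It assumes uniform Beta(1, 1) priors and reports an equal-tailed interval rather than a highest density interval. The conversion counts are hypothetical and are not taken from the article.

```python
import random

def difference_interval(conv_a, n_a, conv_b, n_b, draws=20_000, level=0.95, seed=1):
    """Equal-tailed interval for the difference in conversion rates
    (B minus A), in percentage points, from simulated posterior draws."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(draws):
        # Under a uniform prior, the posterior is Beta(1 + successes, 1 + failures).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        diffs.append(100 * (rate_b - rate_a))
    diffs.sort()
    tail = (1 - level) / 2
    lower = diffs[int(tail * draws)]
    upper = diffs[int((1 - tail) * draws) - 1]
    return lower, upper

# 1,000 users per version, as in the text; the conversion counts are invented.
low, high = difference_interval(conv_a=500, n_a=1000, conv_b=540, n_b=1000)
print(f"95% interval: {low:.1f} to {high:.1f} points")
```

With roughly symmetric posteriors like these, an equal-tailed interval and a highest density interval should sit close together.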

Self-selecting surveys and GB News

How do you spot a self-selecting survey?

Opinion polling is important for understanding what the public believe. It is also important for readers to recognise quality in survey estimates.

Voodoo polling in the digital era

Self-selecting surveys allow people to put themselves in the sample. Sir Robert Worcester used the term voodoo polls to describe open access surveys. People could phone or send text messages to news organisations to take part. The digital era heralded two more types of self-selecting survey:

• Social media surveys: Users generate their own questions and response options. People with accounts on those platforms can take part.
• Clickable website surveys: Users can click a button to vote in a survey…

Vaccination reporting differences

What do different figures on UK Covid-19 vaccinations tell us?

Statistics producers can make many different reports. Those reports can differ, depending on analytical choices. I look at two reporting differences for Covid-19 vaccinations in England and Wales.

Why do the daily and weekly NHS England stats differ?

NHS England produce two sets of reports on vaccinations: by day and by week. The latest weekly report goes up to 30th May 2021. These extensive analyses show vaccinations by gender, age, ethnicity and more.

We can compare those figures to the daily report for the same date. Daily reports contain only headline statistics:

• Daily: 32,938,496 first doses and 21,719,461 second doses.
• Weekly: 32,756,725 first doses and 21,609,411 second doses.
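The size of the gap is simple subtraction. A short Python sketch, using only the four figures quoted above:

```python
# Headline figures quoted above for the same date.
daily = {"first": 32_938_496, "second": 21_719_461}
weekly = {"first": 32_756_725, "second": 21_609_411}

# Gap between the daily and weekly reports, by dose.
gaps = {dose: daily[dose] - weekly[dose] for dose in daily}

for dose, gap in gaps.items():
    share = gap / weekly[dose]
    print(f"{dose} doses: daily exceeds weekly by {gap:,} ({share:.2%})")
```

In both cases, the daily figure is higher by around half a percent.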

First…

Vaccine coverage and denominators

What proportion of people had the Covid-19 vaccine?

What share of people have had a Covid-19 vaccine dose? For the United Kingdom on 25th May, different sources give different statistics:

There is a conceptual difference. PHE uses an Office for National Statistics estimate of the UK population aged 18 or over (~52.6m). Our World in Data uses the UN World Population Prospects 2019 estimate for 2020 (~67.9m).

Both are valid statistics. It is important to see the vaccine coverage of those eligible, and of the total population. Vaccine eligibility may change.
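A small sketch shows how much the choice of denominator matters. The two population figures are the approximate values quoted above. The number of people vaccinated is a round illustrative number, not a figure from either source.

```python
ADULTS_ONS = 52.6e6    # UK population aged 18 or over (ONS estimate, used by PHE)
ALL_AGES_UN = 67.9e6   # UK total population (UN estimate, used by Our World in Data)

def coverage(people_vaccinated, population):
    """Share of a population that has had a vaccine dose."""
    return people_vaccinated / population

# Hypothetical numerator, chosen only to illustrate the calculation.
vaccinated = 30e6

adult_coverage = coverage(vaccinated, ADULTS_ONS)
total_coverage = coverage(vaccinated, ALL_AGES_UN)

print(f"Share of adults: {adult_coverage:.1%}")
print(f"Share of total population: {total_coverage:.1%}")
print(f"Ratio between the two: {adult_coverage / total_coverage:.2f}")
```

Whatever the numerator, the adult-based figure is about 1.29 times the total-population figure, because that is the ratio of the two denominators.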

The Birthday Problem

Why are so few random people needed for a likely chance of a shared birthday?

The birthday problem is: how large does a group need to be before it is more likely than not that two share a birthday?

Intuition may offer 183 people as an answer, since that is 365 divided by two rounded up. The answer is 23 or fewer.

A class of children announce their birthdays one at a time. It is simpler to calculate the probability of all birthdays being different. The probability of a birthday pair is then one minus the chance of all different birthdays.
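That complement calculation is short enough to sketch in Python. This assumes 365 equally likely birthdays and ignores leap years; the function name is my own.

```python
def prob_shared_birthday(group_size, days=365):
    """Probability that at least two people in the group share a birthday."""
    prob_all_different = 1.0
    for k in range(group_size):
        # The (k + 1)-th person must avoid the k birthdays already announced.
        prob_all_different *= (days - k) / days
    return 1 - prob_all_different

# Smallest group where a shared birthday is more likely than not.
smallest = next(n for n in range(1, 366) if prob_shared_birthday(n) > 0.5)
print(smallest, round(prob_shared_birthday(smallest), 4))  # prints: 23 0.5073
```

With 22 people the probability is still just under a half, so 23 is the tipping point under these assumptions.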

Assume there are 365 possible birthdays, all with equal chance. The first child says…

Period and cohort life expectancy

What is the difference between two measures of life expectancy?

How long do you expect to live? That is a key question which affects financial planning and health decisions. Life expectancy is a statistical measure of how long a person expects to live. That expectancy depends upon demographic factors, like age and sex.

There are two main ways to think about life expectancy:

• Cohort: the mean lifespan of people born in a specific year (or set of years). This calculation uses observed mortality and projections.
• Period: calculations using mortality rates from a year (or years), turned into average years of life. That assumes those rates stay constant throughout each person’s life.
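A toy calculation can show why the two measures differ. The mortality curve below is hypothetical (a simple exponential rise with age), and so is the assumed 1% annual improvement. It is a sketch of the idea, not how any statistics office produces its figures.

```python
def death_prob(age, years_ahead=0, improvement=0.0):
    """Hypothetical annual probability of dying at a given age.
    Rates fall by `improvement` for each year into the future."""
    base = min(1.0, 0.0001 * 1.1 ** age)
    return base * (1 - improvement) ** years_ahead

def life_expectancy(start_age, improvement, max_age=120):
    """Expected further complete years of life from start_age."""
    alive = 1.0
    expected_years = 0.0
    for age in range(start_age, max_age):
        years_ahead = age - start_age
        alive *= 1 - death_prob(age, years_ahead, improvement)
        expected_years += alive  # chance of completing this year of life
    return expected_years

period = life_expectancy(0, improvement=0.0)   # rates frozen at today's levels
cohort = life_expectancy(0, improvement=0.01)  # rates projected to keep falling

print(f"Period: {period:.1f} years, cohort: {cohort:.1f} years")
```

When mortality is projected to improve, the cohort measure comes out higher than the period measure for the same starting rates.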

Tiers for Beers

A popular statistic overstates the case for deferred pub drinking.

The financial organisation Company Debt claimed:

Every adult in the UK will have to order 124 pints of beer this year to bring pubs back to their pre-COVID levels.

Headlines repeated that number, in the Daily Mirror, Metro, Evening Standard, and elsewhere.

As a whole, the food and accommodation services industry fell by around 43% from 2019 to 2020. The impact of the pandemic remains stark:

How did they calculate the ‘124 pints’ figure?

The calculation goes as follows:

• “Latest estimates suggesting that the UK’s food and beverage industry lost at least £25.66 billion due to COVID-19”. Their article does not give a primary source for this estimate…

Conditional probabilities and imperfect vaccines

Why is it plausible most future Covid-19 hospitalisations are among those vaccinated?

In recent reporting, BBC journalist Marianna Spring showed a sticker left by protesters:

“60–70% of hospital admissions and deaths are from people who have had 2 doses of the vaccines.”

That sticker provides a short link to “NHS SOURCE”. The document is not, in fact, from the National Health Service.

It is from the modelling sub-group advising the UK government. Despite the exact quote-marks, those words do not appear in the document. The sticker intends to paraphrase this part:

32. The resurgence in both hospitalisations and deaths is dominated by…
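A short calculation shows how a majority of admissions can come from vaccinated people even when the vaccine works well. The coverage and efficacy values are hypothetical, and the sketch assumes vaccinated and unvaccinated people would otherwise face the same risk.

```python
def vaccinated_share_of_admissions(coverage, efficacy):
    """Share of hospital admissions among vaccinated people.

    coverage: proportion of the at-risk population that is vaccinated
    efficacy: proportional reduction in admission risk from vaccination
    """
    vaccinated_cases = coverage * (1 - efficacy)
    unvaccinated_cases = 1 - coverage
    return vaccinated_cases / (vaccinated_cases + unvaccinated_cases)

# Hypothetical inputs: 95% of the at-risk group vaccinated, 90% efficacy.
share = vaccinated_share_of_admissions(coverage=0.95, efficacy=0.90)
print(f"{share:.1%} of admissions are among vaccinated people")
```

With those inputs, about two thirds of admissions are among vaccinated people. That is because the vaccinated group is so much larger, not because the vaccine fails.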

Some notes on excess mortality

As I prepared for an interview, I wrote notes on excess deaths.

Excess deaths are deaths from all causes above a baseline. That baseline often represents an expected number of deaths.
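As a minimal sketch of that definition, here the baseline is the mean of the same week in the previous five years. That is one possible baseline among several, and all the death counts are hypothetical.

```python
from statistics import mean

def excess_deaths(observed, baseline_years):
    """Observed all-cause deaths minus the mean of earlier years."""
    baseline = mean(baseline_years)
    return observed - baseline

# Hypothetical weekly all-cause deaths for the same week in five earlier years.
previous_five_years = [10_100, 9_900, 10_300, 10_000, 10_200]
observed_this_week = 12_500

excess = excess_deaths(observed_this_week, previous_five_years)
print(f"Excess deaths: {excess:,.0f}")
```

Changing the baseline years, or using a modelled expectation instead of a simple mean, changes the answer.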

Different institutions use different baselines:

Excess death calculations differ through choices of periods and baselines.

Where can I find data on excess mortality?

There are many institutions collating COVID-19 surveillance deaths. There is no single source for frequent all-cause mortality.

The World Bank calculates the annual crude death rate per 1,000 people for countries. The United Nations Statistics Division publishes annual…

Bad graphs of the 2021 UK elections

Some rules can be bent. Others broken.

The United Kingdom held elections on Thursday 6th May. As volunteers count votes, we can look at some graphs in election literature and on social media.

One way to understand rules of graphical integrity is to see them bent and broken. This is a short article. It is not a complete list of all poor data visualisation during the election.

The elastic axis

One common fault with graphs in election leaflets is the disproportionate bar chart.

Bar charts show their values through the length of the bars. When bar graphs do not start from zero, heights are disproportionate to the values.
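The distortion is easy to quantify. This sketch compares the ratio of the drawn bar heights with the ratio of the underlying values. The vote shares and the axis starting point are hypothetical.

```python
def drawn_ratio(larger, smaller, axis_start=0.0):
    """Ratio of bar heights as drawn, when the axis starts at axis_start."""
    return (larger - axis_start) / (smaller - axis_start)

# Hypothetical vote shares of 40% and 35%.
true_ratio = drawn_ratio(40, 35)                      # axis starts at zero
truncated_ratio = drawn_ratio(40, 35, axis_start=30)  # axis starts at 30

print(f"True ratio: {true_ratio:.2f}")
print(f"Drawn ratio with truncated axis: {truncated_ratio:.2f}")
```

Starting the axis at 30 makes the leading bar look twice as tall, when the underlying value is only about 14% larger.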

