The Problem of Missing Non-Voters

This article illustrates a problem with standard weighting procedures in opinion polls, which arises because non-voters are less likely to respond to surveys.

The example originally appeared in a journal article by Christopher Prosser and Jonathan Mellon (University of Manchester, British Election Study).

In short

Missing non-voters: Non-voters are less likely than voters to answer surveys.

Weighting procedures: Using standard procedures, missing non-voters can cause errors in vote share estimates.

The shadow of 2015: This issue was the primary cause of unrepresentative samples in polling for the 2015 General Election.

A Pollster Calls

People who do not vote are less likely to participate in surveys than voters.

Having too many voters in the sample is one form of what statisticians call ‘non-response bias’: substantial differences between those who answer surveys and those who do not.

In survey research, ‘weights’ adjust the sample so that it matches the population profile.

Suppose a survey of 1,500 people finds 100 young respondents, when a representative sample would contain 110. Their answers are ‘weighted’: each young respondent counts as 1.1 people. The published percentages should then reflect the whole population.
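The arithmetic behind that weight can be sketched in a few lines (the figures are the hypothetical ones from the text):

```python
# A minimal sketch of a demographic weight, using the article's example figures.
population_share = 110 / 1500   # young people's share of a representative sample
sample_count = 100              # young respondents actually obtained
sample_size = 1500              # total respondents

# Weight = (how many there should be) / (how many there are)
weight = (population_share * sample_size) / sample_count
print(weight)  # 1.1: each young respondent counts as 1.1 people
```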

Voters and Population Weights

The problem of missing non-voters arises when we seek the views of voters, but apply demographic weights for the country’s whole adult population.

Let’s start with a hypothetical country with two demographic groups, Young and Old, which are equal halves of the population.

Half of Young people will turn out to vote, and 85% of Young voters back the Red party.

3 in 4 Old people vote, and 80% of Old voters will cast their ballots for Blue.

For British readers, this is recognisable. Other audiences can swap colours.

If vote intentions stay the same, the Blue party has an 8 point lead among voters: 54% to 46%.
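That lead follows directly from the set-up above. A short sketch, using the article's hypothetical figures (the variable names are mine):

```python
# True vote shares among voters in the hypothetical country.
groups = {
    "Young": {"pop_share": 0.5, "turnout": 0.50, "red": 0.85, "blue": 0.15},
    "Old":   {"pop_share": 0.5, "turnout": 0.75, "red": 0.20, "blue": 0.80},
}

# Share of the population that votes: 0.5*0.5 + 0.5*0.75 = 0.625
voters = sum(g["pop_share"] * g["turnout"] for g in groups.values())

# Each party's share of the electorate, conditional on voting.
red = sum(g["pop_share"] * g["turnout"] * g["red"] for g in groups.values()) / voters
blue = sum(g["pop_share"] * g["turnout"] * g["blue"] for g in groups.values()) / voters

print(round(100 * blue), round(100 * red))  # 54 46: Blue leads by 8 points
```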

We send out requests to a representative cross-section of the country.

However, this survey has an extreme form of non-response bias: all voters answer our survey; every non-voter refuses. We are missing our non-voters:

Our unweighted survey data correctly identifies that Blue leads by 8 points.

A social researcher notes that 3 in 5 respondents in the sample are Old, and naively applies population weightings by age group:

Our poll has missed badly, showing a Red lead of 5 points: the kind of error that induces an inquiry.

Inappropriate use of population weights creates the error.

Effectively, the missing Young non-voters are replaced with Red-backing voters from the same age group, inflating the Red share.
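The full miss can be reproduced numerically. This sketch uses the article's hypothetical sample, in which all voters respond and no non-voters do (the respondent counts are per 1,000, and the variable names are mine):

```python
# Sample of 1,000 respondents: 2 in 5 Young voters, 3 in 5 Old voters.
sample = {
    "Young": {"n": 400, "red": 0.85, "blue": 0.15},
    "Old":   {"n": 600, "red": 0.20, "blue": 0.80},
}
pop_share = {"Young": 0.5, "Old": 0.5}  # population weight targets
total = sum(g["n"] for g in sample.values())

# Unweighted estimate: correctly recovers the Blue lead.
red_u = sum(g["n"] * g["red"] for g in sample.values()) / total
blue_u = sum(g["n"] * g["blue"] for g in sample.values()) / total

# Naive population weighting: Young respondents weighted up to half the sample.
weights = {k: pop_share[k] * total / sample[k]["n"] for k in sample}
red_w = sum(sample[k]["n"] * weights[k] * sample[k]["red"] for k in sample) / total
blue_w = sum(sample[k]["n"] * weights[k] * sample[k]["blue"] for k in sample) / total

print(round(100 * (blue_u - red_u)))  # 8: correct Blue lead, unweighted
print(round(100 * (red_w - blue_w)))  # 5: spurious Red lead, population-weighted
```

The Young weight here is 1.25 and the Old weight is about 0.83, which is exactly the substitution the text describes: missing Young non-voters are replaced by heavily Red-leaning Young voters.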

The authors identify this problem as the primary cause of unrepresentative samples in the 2015 General Election, and conclude in their paper:

One such problem is the way in which missing non-voters systematically skew polls towards the opinions of the general population rather than just that of voters… This type of problem has likely long-affected polls around the world.

Responding to Non-Response Bias

In response to the missing non-voters problem, two methods were used in 2017 General Election polling.

Kantar Public incorporated turnout probabilities, alongside an expected level of overall turnout, into the weighting schema itself. As Kantar was the second most accurate company, this method appears successful.

ICM Unlimited adjusted population weights based on expected turnout in each demographic group, which appeared to increase their error.
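The underlying principle of turnout-adjusted weighting can be illustrated on the hypothetical country above. This is an illustrative sketch, not either company's actual schema: it simply scales each group's weight target by its expected turnout, so the weighted sample matches the electorate rather than the whole adult population:

```python
# Illustrative only: scale demographic targets by expected turnout per group.
pop_share = {"Young": 0.5, "Old": 0.5}
turnout = {"Young": 0.50, "Old": 0.75}
sample = {
    "Young": {"n": 400, "red": 0.85, "blue": 0.15},
    "Old":   {"n": 600, "red": 0.20, "blue": 0.80},
}

# Turnout-adjusted targets: Young 40%, Old 60% of the expected electorate.
electorate = sum(pop_share[k] * turnout[k] for k in pop_share)
target = {k: pop_share[k] * turnout[k] / electorate for k in pop_share}

total = sum(g["n"] for g in sample.values())
weights = {k: target[k] * total / sample[k]["n"] for k in sample}

red = sum(sample[k]["n"] * weights[k] * sample[k]["red"] for k in sample) / total
blue = sum(sample[k]["n"] * weights[k] * sample[k]["blue"] for k in sample) / total
print(round(100 * (blue - red)))  # 8: the correct Blue lead is recovered
```

In this stylised case the adjusted targets match the voter-only sample exactly, so the weights are all 1 and the correct lead is recovered; in real polling, turnout probabilities are estimated and can themselves introduce error.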

Their weighted samples reveal major disparities between these methods.

Further research is required to understand similarities and differences in these two approaches.

This blog looks at the use of statistics in Britain and beyond. It is written by RSS Statistical Ambassador and Chartered Statistician @anthonybmasters.