MRP and the Remain United Preamble

Multi-level regression and post-stratification (MRP) is a statistical technique used to project national polling data onto smaller electoral areas, such as constituencies or regions. This matters for elections where the geographic distribution of the vote is crucial, such as US presidential elections (decided through state results in the electoral college) or elections to the House of Commons (decided through constituencies).

Remain United commissioned an MRP projection, using internet panel polling data from ComRes, for European Parliament voting intention. ComRes interviewed 4,060 GB adults between 1st and 7th May 2019.

The statistical outputs were then used to suggest tactical voting. This article considers some issues with the supporting document.

Response Rates and Non-Response Bias

The document starts by recounting that the last two general elections and the EU referendum “were not predicted well by most pollsters”.

Non-response bias arises in survey research when respondents differ substantially from those who were selected but did not respond.

It is then asserted:

Whilst the 2015 General Election polling miss was caused by unrepresentative samples, the same story does not really apply to 2017, when the error was not systemic. Furthermore, a 2008 meta-analysis found only a weak relationship between response rates and non-response bias. Non-response bias is not solely a function of the non-response rate.
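One way to see why the response rate alone does not determine the bias is the standard identity for a sample mean: the bias of the respondent mean equals the non-response rate multiplied by the gap between respondents and non-respondents. The sketch below uses invented figures, and the function name is only illustrative.

```python
def nonresponse_bias(response_rate, mean_respondents, mean_nonrespondents):
    """Bias of the respondent mean relative to the full selected sample.

    Identity: bias = (1 - response rate) * (respondent mean - non-respondent mean).
    """
    return (1 - response_rate) * (mean_respondents - mean_nonrespondents)


# Invented figures: the share intending to vote for some party.
# Survey A: low response rate, but non-respondents look like respondents.
bias_a = nonresponse_bias(0.10, 0.40, 0.39)
# Survey B: high response rate, but non-respondents are quite different.
bias_b = nonresponse_bias(0.60, 0.40, 0.30)

print(round(bias_a, 3))  # 0.009
print(round(bias_b, 3))  # 0.04
```

In this toy case the survey with the much higher response rate has the larger bias, because the gap between respondents and non-respondents matters as well as the rate.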

Exit polling is not ‘naturally representative’

The document then states:

In the UK, results from polling stations are not generally published. The Curtice-Firth method, used in exit polling since 2001, looks at the same polling stations from election to election. People are asked to replicate the vote they just cast, on a mock ballot paper.

The change between exit polls is then studied and used to build a probabilistic forecast for the entire House of Commons, since each seat's result can be expressed as a change from the previous election result.

These samples are not “naturally representative”. Rather, the errors from polling station selection should be broadly consistent between elections, so they largely cancel when measuring change.
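A toy calculation, with invented numbers, shows why a consistent selection error matters less when estimating change. Suppose the sampled stations overstate a party's share by the same amount at both elections: the level read from the stations is biased, but the change is not. This is a sketch of the reasoning only, not the Curtice-Firth model itself.

```python
# Invented figures: a party's true national share at two elections.
true_previous = 0.36
true_current = 0.41

# Assumption: the chosen polling stations overstate the party by the
# same amount at both elections (a consistent selection error).
selection_error = 0.05

exit_poll_previous = true_previous + selection_error
exit_poll_current = true_current + selection_error

# Reading the level straight from the stations carries the error.
level_error = exit_poll_current - true_current

# Estimating the change and adding it to the known previous result
# removes the error, because it appears in both exit polls.
estimated_change = exit_poll_current - exit_poll_previous
projected_current = true_previous + estimated_change

print(round(level_error, 3))        # 0.05
print(round(projected_current, 3))  # 0.41
```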

MRP is not an algorithm

The document continues:

MRP takes polling data and seeks to build a model for individual voting intention out of demographic characteristics. This is the R: regression.

That model is allowed to vary according to the constituency. This is the M: multi-level.

The model of individual vote intention is then projected across census counts of what kind of people live in each place. This is the P: post-stratification.

There is no ‘machine-learning’ involved. It is not an algorithm in that sense. It is a statistical technique, named by political scientists.
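The post-stratification step can be sketched in a few lines of Python. The support rates below stand in for the output of a fitted multi-level regression, and the cell counts stand in for census figures. All numbers, group labels, and the function name are invented for illustration.

```python
# Hypothetical modelled support for one party in one constituency,
# by demographic cell (standing in for the regression output).
modelled_support = {
    ("18-34", "degree"): 0.20,
    ("18-34", "no degree"): 0.30,
    ("35+", "degree"): 0.35,
    ("35+", "no degree"): 0.50,
}

# Hypothetical census counts for the same cells in that constituency.
census_counts = {
    ("18-34", "degree"): 10_000,
    ("18-34", "no degree"): 15_000,
    ("35+", "degree"): 20_000,
    ("35+", "no degree"): 30_000,
}


def poststratify(support, counts):
    """Population-weighted average of the cell-level estimates."""
    total = sum(counts.values())
    return sum(support[cell] * counts[cell] for cell in counts) / total


constituency_estimate = poststratify(modelled_support, census_counts)
print(round(constituency_estimate, 3))  # 0.38
```

The arithmetic is a weighted average: each cell's modelled support counts in proportion to how many such people live in the constituency, rather than how many happened to answer the poll.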

MRP did not do better in 2015

The most confusing assertion made in the document is:

In 2015, YouGov ran its ‘NowCast’ based on modelling with MRP. The final result had the Conservatives and Labour on an equal number of seats (276), just as “flawed classic polls” would have implied.

Nor is it the case that MRP-based models of Commons seats were wholly successful in the 2017 General Election. People may remember the YouGov model, but there was another organisation conducting such modelling.

Lord Ashcroft Polls used British Polling Council members for its fieldwork. Under its MRP model, the central projection was for the Conservatives to win 357 seats; the party went on to win 317. After the large survey of 40,000 people that ‘started’ the model, Lord Ashcroft Polls used smaller weekly surveys of around 2,000 people. In contrast with YouGov’s daily surveys of 8,000 people, this may have made the Lord Ashcroft Polls estimate too resistant to change.
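As a rough illustration of that possibility, suppose that new weekly interviews were simply pooled with the original sample. This is purely an assumption, and the actual model may have treated older interviews differently. The latest week then makes up a small share of the pooled data, so a genuine shift in opinion moves the pooled estimate only slightly. The support figures below are invented.

```python
initial_sample = 40_000  # size of the survey that started the model
weekly_sample = 2_000    # size of each later weekly survey

# Invented figures: support in the initial survey and in the latest week.
initial_support = 0.46
latest_support = 0.42

# Assumption: simple pooling, with every interview weighted equally.
pooled_n = initial_sample + weekly_sample
latest_weight = weekly_sample / pooled_n
pooled_support = (
    initial_support * initial_sample + latest_support * weekly_sample
) / pooled_n

print(round(latest_weight, 3))   # 0.048
print(round(pooled_support, 3))  # 0.458
```

Under these assumptions, a four-point fall in the latest week moves the pooled figure by about a fifth of a point.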

MRP may be an expensive way of discovering systematic sampling bias.

The greater usage of MRP is an important innovation for the polling industry. However, it is a matter of human learning, rather than machine-learning.

Based on our understanding of public opinion, appropriate models for individual vote intention, including the choice of district-level predictors, must be selected. Those subjective choices must be clearly and transparently expressed for future assessment.

This blog looks at the use of statistics in Britain and beyond. It is written by RSS Statistical Ambassador and Chartered Statistician @anthonybmasters.
