RSS Conference 2018: Statistics Holiday

Every year, around 600 statisticians, data scientists and analysts gather from across the world to share information, exchange ideas, and establish friendships.

It was my first time attending this conference, and I felt I learned so much in four days.

In short

Opposing ‘fake news’ wherever we see it: John Pullinger, the National Statistician, highlighted that we should oppose the misuse of statistics and celebrate their proper use;

“Wear your uncertainty with pride”: Dr Hannah Fry’s keynote presentation on algorithms suggested we should be unsure and proud;

Analyse problems, not data: Peter Diggle’s lecture focussed on using data to solve real-world problems. A publication is not a solution.

Statistics Holiday

As I arrived at Cardiff City Hall on Monday evening, delegates were directed towards the assembly room.

John Pullinger, the National Statistician, spoke of mobilising the power of data to make better decisions. We were asked to communicate clearly and crisply, and stand up to shoddiness in statistics — including the misuse of official statistics by politicians and other people, and within our own practice (“cargo-cult statistics”).

John reiterated the need to appreciate the skills we need to be effective, to make change. To some laughs, we were told that this trip to Cardiff was a “statistics holiday”. The National Statistician urged: “Let’s make every day a holiday for statisticians”. This may cause problems when booking holidays with employers.

A tribute to the late, great Doug Altman told the story of a generous genius, and gave one of his powerful lines:

We should be angry when we see statistical nonsense, whether in journals or in the media.

Tuesday began frightfully early, as I woke to undertake an ill-considered trip to the local gym. The Young Statisticians Section gave their guide to the conference, including getting some sleep and remembering to eat. As the RSS badge says: ‘Eat. Sleep. Stats. Repeat.’

My first ‘parallel’ talk was on medical prognosis and prediction. Lucy Teece, a fellow RSS Statistical Ambassador at Keele University, gave a fascinating talk on competing events. Competing events affect the probability of the primary outcome (e.g. dying reduces the chance of getting a hip fracture). Failing to account for these competing events causes an inflated estimate of absolute risk. Simulations showed a standard ‘rule of thumb’ is not that good.
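To see why ignoring competing events inflates absolute risk, here is a minimal sketch of my own (not the method from the talk). It assumes two independent exponential event times, with hypothetical rates of 0.05 per year for hip fracture and 0.10 per year for death, and no other censoring. The function names are made up for illustration. The naive estimate treats deaths as censored observations and takes one minus the Kaplan-Meier curve; the cumulative incidence simply counts who actually had a fracture by the time horizon.

```python
import random


def simulate(n, rate_fracture, rate_death, seed=0):
    """Simulate n subjects; each record is (time, first event observed)."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n):
        t_fracture = rng.expovariate(rate_fracture)
        t_death = rng.expovariate(rate_death)
        if t_fracture < t_death:
            subjects.append((t_fracture, "fracture"))
        else:
            subjects.append((t_death, "death"))
    return subjects


def naive_risk(subjects, horizon):
    """One minus Kaplan-Meier, wrongly treating deaths as censoring."""
    at_risk = len(subjects)
    survival = 1.0
    for time, event in sorted(subjects):
        if time > horizon:
            break
        if event == "fracture":
            survival *= 1 - 1 / at_risk
        at_risk -= 1  # the subject leaves the risk set either way
    return 1 - survival


def cumulative_incidence(subjects, horizon):
    """Share of all subjects who had a fracture by the horizon.

    With no censoring other than the competing event, this simple
    proportion is the cumulative incidence estimate.
    """
    hits = sum(
        1 for time, event in subjects
        if event == "fracture" and time <= horizon
    )
    return hits / len(subjects)


if __name__ == "__main__":
    data = simulate(20_000, rate_fracture=0.05, rate_death=0.10)
    print("Naive 10-year risk:", round(naive_risk(data, 10), 3))
    print("Cumulative incidence:", round(cumulative_incidence(data, 10), 3))
```

Under these assumed rates, theory gives a ten-year naive risk of 1 - exp(-0.5), about 0.39, against a true cumulative incidence of (1/3)(1 - exp(-1.5)), about 0.26. The naive figure answers a different question: the risk of fracture in a world where nobody dies first.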

I then went to talks about the Bean Review into official economic statistics. Dr Penny Babb of the Office for Statistics Regulation gave an overview of what her Office does, including writing guidance on the Code of Practice for Statistics. The talk also highlighted the limited powers of the UK Statistics Authority: “We issue stern letters.”

Xihong Lin, of the Harvard School of Public Health, gave the second keynote lecture on statistical inference in massive health datasets.

The Tuesday afternoon began with a session on data journalism, headed by Joe Twyman of Deltapoll. All presentations converged: data journalism is just journalism, and we need to humanise and tell stories with data. I then attended talks on bilateral trade asymmetries.

Dr Hannah Fry gave a funny and fantastic talk on algorithms. We started off with the case of pigeons diagnosing breast cancer, and moved to showing photos of a tourist couple who followed their car’s satellite navigation straight into the sea. After each joke, a serious point was made: are we right to place faith in the machine?

There are hidden biases within algorithms, just as there are in humans. Dr Fry sought balanced decision-making between human and machine, on “a shared journey of possibilities, where one can’t exist without the other.”

I spoke to the erudite Dr Fry afterwards, asking for tips on effective communication from one of the most prolific popularisers of mathematics in the world. (Hannah Fry’s latest book, Hello World, is now available to purchase.)

Myths, Forecasts and the Election Knight

The Wednesday morning began with talks on data science for official statistics. These presentations included the revelation that job portals sometimes have ‘ghost vacancies’ to capture CVs from unwitting jobseekers.

The rapid-fire talks started with Chloe Gibbs, an RSS Statistical Ambassador at the ONS, talking about the recently-published Annual Survey of Goods and Services. This survey gave greater information on companies providing services outside their main industrial classification.

The next keynote considered the controversy of significance testing and the replicability crisis. Those talks were followed by a discussion on measuring inflation, presentations by RSS Prize winners, and a meeting on data visualisation.

The final day of the conference started with another presentation on medical statistics.

Prof Peter Diggle’s keynote asked us to analyse problems, not data. The University of Lancaster professor looked at two problems, based on spatio-temporal point processes: calls to NHS Direct to indicate potential food poisoning, and the spread of foot-and-mouth disease in Cumbria. There is a triangle of the model, nature and data. The link between the model and nature is theory; between data and nature is observations; between the model and data is statistical inference.

We were reminded that “a publication is not a solution”.

As someone who is interested in fact-checking, it was great to attend talks on statistics in a post-truth age. There are differences between perceptions and reality, and we need to present statistics in an effective way to bust myths. Ipsos MORI have extensively studied the perils of perception (which now forms the basis of a book by Bobby Duffy).

Data visualisation, including uncertainty graphs, is a great tool in fact-checking and telling the truthful story to readers. Fraser Nelson, editor of The Spectator, elaborated on why some myths linger: “Some stories are too good to check.”

The last parallel talks were on the future of forecasting elections. Gary Brown of the ONS highlighted the difficulties in sampling, including non-response bias (those who do not respond to the survey having a different opinion to those that do) and coverage bias (those who could not be sampled hold a different view to those that could).
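Non-response bias is easy to illustrate with a toy example. This is a sketch of my own with made-up numbers, not anything presented in the talk: suppose half the population supports a party, but supporters answer the poll 30% of the time and everyone else 20% of the time. The function names and rates are hypothetical.

```python
import random


def expected_poll_share(true_share, response_if_supporter, response_if_other):
    """Share of respondents who are supporters, given unequal response rates."""
    responding_supporters = true_share * response_if_supporter
    responding_others = (1 - true_share) * response_if_other
    return responding_supporters / (responding_supporters + responding_others)


def simulate_poll(n_contacted, true_share, response_if_supporter,
                  response_if_other, seed=0):
    """Contact n people at random and tally only those who respond."""
    rng = random.Random(seed)
    supporters = 0
    others = 0
    for _ in range(n_contacted):
        is_supporter = rng.random() < true_share
        response_rate = (
            response_if_supporter if is_supporter else response_if_other
        )
        if rng.random() < response_rate:
            if is_supporter:
                supporters += 1
            else:
                others += 1
    return supporters / (supporters + others)


if __name__ == "__main__":
    print("Expected share among respondents:",
          expected_poll_share(0.5, 0.3, 0.2))
    print("Simulated share among respondents:",
          round(simulate_poll(100_000, 0.5, 0.3, 0.2), 3))
```

With these assumed rates, the expected share among respondents is 0.15 / 0.25, or 60%, even though true support is 50%. Contacting more people does not help: the poll just gets more precisely wrong. If both groups respond at the same rate, the bias disappears.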

In my question to the speakers, I highlighted that the Sturgis report ruled out deliberate misreporting (‘shy Tories’) as a cause of the 2015 polling miss. For a 120-page document about polling methodology, it is an engrossing read.

Gianluca Baio (UCL) spoke about using Bayesian statistics to model UK elections, and John Sandall of SixFifty used survey results to try and forecast the number of seats each party would get in the 2017 General Election.

The final keynote presentation was the Campion Lecture, by invitation of the RSS President Sir David Spiegelhalter. The President could think of no better person to invite than Prof Sir John Curtice. Prof Curtice’s now-usual place in the BBC election night coverage is to give pronouncements from a balcony, like a psephological pharaoh.

As a senior research fellow for NatCen, Prof Curtice looked at data from the NatCen mixed mode panel and other polls to give an informative and erudite lecture on the public reactions to the Brexit process.

Relatively few voters have changed their mind on the principle of leaving the EU, underpinned by a strong sense of ‘Remain’ and ‘Leave’ identities. Some attitudes and evaluations have changed considerably in the last two years. Blame for the current state of negotiations is being filtered through a partisan lens, and thus tends not to move opinion on the principle. However, the perceptions of the economic consequences are less easily filtered, and are more likely to shift attitudes. Much of the recent polling lead for Remain reflects the views of non-voters in the 2016 referendum, who heavily favour Remain.

It was a social conference too, with quizzes and dinners after each day. I learnt a lot, and I cannot wait for next year’s conference in Belfast.

This blog looks at the use of statistics in Britain and beyond. It is written by RSS Statistical Ambassador and Chartered Statistician @anthonybmasters.