# A Degree of Extrapolation

## Drawing a straight line into the future may lead to error.

A headline in *The Times* claimed:

First-class degrees for all students by 2030

This article examines that claim, showing the degree of uncertainty in such forecasts.

# Indicative projections

The headline itself refers only to particular universities, and not to all students in UK institutions. The second paragraph contradicts the headline:

If the inflation continues at its present rate, every student in the UK would achieve a first in 38 years’ time, the projections indicate.

The projection is described later in *The Times* article:

The Times used the four years of available data from the Higher Education Statistics Agency and calculated the average year-on-year increase for those four years, then projected it into the future.

HESA has more than four years of available data, holding previous reports on student statistics. There are five academic years shown in the latest report:

The proportion of classified first-degree graduates achieving a First has increased from 20% in 2013/14 to 28% in 2017/18.

Using four HESA files over different time periods, there is a time series running from 2003/04 to 2017/18. (**Note:** there may be small differences between different versions of each file.)

Consequently, we can show that the rise in the share of Firsts achieved by graduates at UK universities accelerated from about 2012/13.

# Uncertain forecasts

The linear projection used in *The Times* article may perform well in short-term forecasting, but is inappropriate for long-term forecasts.

The proportion of Firsts cannot exceed 100%, but a straight line shooting higher will eventually breach this arithmetic limit. Moreover, it is not recommended to produce a forecast for 12 years into the future based on a time series containing just four yearly values.
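To make the arithmetic concrete, here is a minimal sketch in Python (the analysis for this article was done in R) of the averaging method as *The Times* describes it. It uses only the two rounded national shares quoted above, 20% in 2013/14 and 28% in 2017/18, not the figures *The Times* actually used, and the function names are hypothetical.

```python
def average_annual_increase(start_share, end_share, n_steps):
    """Average year-on-year change, in percentage points."""
    return (end_share - start_share) / n_steps


def linear_projection(last_share, annual_increase, years_ahead):
    """Straight-line projection; nothing stops it from passing 100."""
    return last_share + annual_increase * years_ahead


# Rounded shares quoted from the HESA report (2013/14 and 2017/18),
# which are four year-on-year steps apart.
increase = average_annual_increase(20.0, 28.0, n_steps=4)

# 2029/30 is 12 years after 2017/18.
share_2029_30 = linear_projection(28.0, increase, years_ahead=12)

# How long until the straight line reaches, and then passes, 100%?
years_to_100 = (100.0 - 28.0) / increase
share_after_40_years = linear_projection(28.0, increase, years_ahead=40)

print(increase, share_2029_30, years_to_100, share_after_40_years)
```

On these rounded figures the line rises by two percentage points a year, reaches 100% after 36 years and stands at 108% after 40, a share that cannot exist.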

There is no consideration of forecasting uncertainty within the article. The ‘forecast’ function in *R*, applied to the proportion of First-class degrees, recovers a similar forecast to the straight-line projection, with the same issues previously identified.

This is a simple model. Different choices will lead to different results. Triple exponential smoothing breaks a time series down into its components (errors, trends and seasons), updating each of these components as the series moves forward through time.
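The idea of updating components as the series moves forward can be sketched with a simpler relative of that model: Holt's linear method, which tracks only a level and a trend. This Python sketch is not the model fitted in R for this article. The smoothing weights `alpha` and `beta` are arbitrary assumptions, and the input is a placeholder series interpolated between the two published shares (20% and 28%), not the HESA data.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear method: update a level and a trend, then extrapolate."""
    level = series[0]
    trend = series[1] - series[0]  # simple starting value for the trend
    for value in series[1:]:
        previous_level = level
        # New level: blend the observation with the one-step-ahead forecast.
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # New trend: blend the latest change in level with the old trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # Forecasts continue the final level along the final trend.
    return [level + h * trend for h in range(1, horizon + 1)]


# Placeholder series (NOT HESA data): evenly spaced between 20% and 28%.
placeholder_shares = [20.0, 22.0, 24.0, 26.0, 28.0]
forecasts = holt_forecast(placeholder_shares, alpha=0.8, beta=0.2, horizon=12)
print(round(forecasts[-1], 2))
```

On a perfectly straight placeholder series the method simply continues the straight line. That hints at why a smoothing model fitted to a steady rise gives a central forecast close to the linear projection.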

The 80% prediction interval shows our best estimate for 2029/30 is around 53%, but the share of Firsts could plausibly be between 31% and 75%. A prediction interval is where — based on the underlying model — we expect a new, future value to fall.
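As a rough illustration of how the width of an interval relates to forecast uncertainty, the Python sketch below assumes the forecast errors are approximately normal. That is an assumption of this example, not a statement about the fitted model, and the function names are hypothetical. Under it, an 80% interval is the central forecast plus or minus about 1.28 standard deviations, so the quoted bounds imply a standard deviation for the 2029/30 forecast.

```python
from statistics import NormalDist


def implied_forecast_sd(lower, upper, coverage):
    """Standard deviation implied by a symmetric normal prediction interval."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return (upper - lower) / (2 * z)


def normal_interval(centre, sd, coverage):
    """Symmetric interval around a central forecast, assuming normal errors."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return centre - z * sd, centre + z * sd


# Bounds and centre quoted in the text for 2029/30, in percent.
sd = implied_forecast_sd(31.0, 75.0, coverage=0.80)

# Under the same normality assumption, a 95% interval would be wider still.
lower_95, upper_95 = normal_interval(53.0, sd, coverage=0.95)
print(round(sd, 1), round(lower_95, 1), round(upper_95, 1))
```

Under the normality assumption, the implied standard deviation is about 17 percentage points, and the corresponding 95% interval would run from roughly 19% to 87%.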

With 15 years of the past to consider, it is **highly uncertain** what will happen by 2029/30. If the trend accelerates, dampens or even reverses, the future may look very different to what the straight-line projection predicts.

# Transforming the series

One limitation is treating the proportion of Firsts as *if* it could take any value. That proportion cannot be lower than 0% nor exceed 100%.

**This share is bounded.** We can transform that bounded share with the ‘logit’ function, which maps values between 0% and 100% onto the whole real line. Once we produce a forecast on that scale, the transformation can be undone. That yields a bounded forecast of the series.
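The transformation itself is short enough to sketch in Python. The logit maps a share p, strictly between 0 and 1, to log(p / (1 - p)), and its inverse maps any real number back into that range. For illustration only, the sketch draws a straight line on the logit scale through the two published shares (20% in 2013/14 and 28% in 2017/18) instead of fitting the smoothing model used for this article.

```python
import math


def logit(p):
    """Map a share strictly between 0 and 1 onto the real line."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Map any real number back to a share strictly between 0 and 1."""
    return 1 / (1 + math.exp(-x))


# Published shares for 2013/14 and 2017/18, four yearly steps apart.
start, end = logit(0.20), logit(0.28)
slope = (end - start) / 4  # average yearly change on the logit scale

# Extend the line 12 years (to 2029/30) and 100 years, then transform back.
share_2029_30 = inv_logit(end + 12 * slope)
share_after_100_years = inv_logit(end + 100 * slope)
print(share_2029_30, share_after_100_years)
```

However far the line is extended, the back-transformed share stays strictly between 0% and 100%. This sketch only shows the mechanics, and its numbers are not the forecasts reported here.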

The central forecast is now higher for 2029/30, at around 58%. The prediction intervals are narrower too: running from 45% to 70%.

It remains highly uncertain what will happen by 2029/30.

For the time series, I used Higher Education Statistics Agency data files on qualifications by first-time graduates, for 2003/04–2006/07, 2007/08–2009/10, 2010/11–2014/15, and 2015/16–2017/18.

The Excel file and R code are available on GitHub, along with an R Markdown web page.