What is a p-value?
This value is not the probability that the null hypothesis is true.
--
The p-value is a misunderstood statistic. Sometimes, textbooks carry incorrect definitions, which perpetuates misinterpretations.
For example, a Twitter user shared a 2003 biology textbook definition:
The P-value is the bottom line of most statistical tests. It is simply the probability that the hypothesis being tested is true. So if a P-value is given as 0.06, that indicates that the hypothesis has a 6% chance of being true.
This is wrong.
A better definition of the p-value comes from the American Statistical Association:
the probability under a specified statistical model that a statistical summary of the data would be equal to or more extreme than its observed value.
This definition often appears with a null (or nil) hypothesis as the specified model. One such hypothesis is that there is no difference between the studied groups.
Despite its brevity, there are many aspects of this definition to examine.
The p-value indicates compatibility between the model and data. The calculation assumes a particular model holds. The p-value is then a measure of how extreme the observed data is under that model. A small p-value signals low compatibility, which counts as evidence against the hypothesis or the underlying assumptions.
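To make the calculation concrete, here is a minimal sketch for a coin-tossing experiment. The model, the data (15 heads in 20 tosses), and the function name `upper_tail_p_value` are all illustrative assumptions, not taken from any particular study. The specified model is a fair coin, the statistical summary is the number of heads, and "more extreme" is taken to mean "more heads".

```python
from math import comb


def upper_tail_p_value(n_tosses: int, observed_heads: int, prob_heads: float = 0.5) -> float:
    """Return P(X >= observed_heads) when X ~ Binomial(n_tosses, prob_heads).

    This is a one-sided p-value: the probability, under the assumed model,
    of a head count equal to or more extreme than the observed one.
    """
    return sum(
        comb(n_tosses, k) * prob_heads**k * (1 - prob_heads) ** (n_tosses - k)
        for k in range(observed_heads, n_tosses + 1)
    )


# Hypothetical data: 15 heads in 20 tosses, assessed under a fair-coin model.
p_value = upper_tail_p_value(n_tosses=20, observed_heads=15)
print(f"one-sided p-value: {p_value:.4f}")  # one-sided p-value: 0.0207
```

Every number in the calculation comes from the assumed model. Changing `prob_heads` changes the p-value, even though the observed data stay the same.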
This value does not measure the probability the studied hypothesis is true. The p-value is a conditional probability. That condition is the model, including the assumption that the studied hypothesis holds. It is not, and cannot be, the probability of that hypothesis. Classical statistics does not assign probabilities to hypotheses. In Bayesian statistics, by contrast, the language of probability describes unknown parameters and their values.
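A small calculation can show why a p-value cannot double as the probability of the hypothesis. The sketch below applies Bayes' rule to find the probability that the null hypothesis is true given a result below a 0.05 threshold. It needs two extra inputs that the p-value calculation never uses: a prior probability for the null and the probability of a significant result when the null is false (the power). The function name and the values chosen for both inputs are assumptions, picked only for illustration.

```python
def prob_null_given_significant(prior_null: float, alpha: float = 0.05, power: float = 0.8) -> float:
    """Bayes' rule for P(null is true | p < alpha).

    prior_null: assumed prior probability that the null hypothesis is true.
    alpha:      probability of p < alpha when the null is true.
    power:      assumed probability of p < alpha when the null is false.
    """
    significant_and_null = alpha * prior_null
    significant_and_not_null = power * (1 - prior_null)
    return significant_and_null / (significant_and_null + significant_and_not_null)


# Same threshold, two different (assumed) priors for the null hypothesis.
for prior_null in (0.5, 0.9):
    posterior = prob_null_given_significant(prior_null)
    print(f"prior P(null) = {prior_null:.1f} -> P(null | p < 0.05) = {posterior:.3f}")
# prior P(null) = 0.5 -> P(null | p < 0.05) = 0.059
# prior P(null) = 0.9 -> P(null | p < 0.05) = 0.360
```

The same significance threshold gives very different probabilities for the hypothesis depending on the prior. That prior is a Bayesian ingredient, and the p-value calculation does not contain one.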
Modelling assumptions matter and affect the calculation. The distributional assumptions we give to the data are important. That structure creates the sampling distribution. This is how we…