Last year, a little measurement called the p-value was the focus of a big controversy as scientists argued that its accepted threshold should be lowered to help address a major problem in modern science.
Now we have a formal counter-argument, democratically crafted by 100 scientists with a passion for the p and published online for all to read. But don’t for a moment think this will be the last word on the matter.
Any scientist worth their salt is familiar with the so-called p-value, a statistical measurement whose threshold for significance is set by broad consensus at 0.05. That ‘p’, by the way, stands for probability.
In straightforward terms, the threshold sets a benchmark for how much confidence a researcher and their peers should invest in a given set of results.
More precisely, the p-value is the probability of getting results at least as extreme as yours if the null hypothesis (the counter of your awesome prediction) is actually true.
For most things in science, we’re moderately comfortable with a threshold of 0.05 (or a 5 percent chance). A p-value below that line means results like yours would be unlikely if the null hypothesis were true; it does not, on its own, tell you the probability that your hypothesis is a true reflection of the world.
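To make that definition concrete, here is a minimal Python sketch that is not drawn from any of the papers discussed here. It assumes a hypothetical experiment (100 coin flips, with a fair coin as the null hypothesis), and the function names are made up for illustration. It computes an exact two-sided p-value by adding up the probability of every outcome that is no more likely than the one observed.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(observed: int, trials: int, null_prob: float = 0.5) -> float:
    """Exact two-sided p-value for a binomial experiment.

    Sums the null-hypothesis probability of every outcome that is
    no more likely than the observed one.
    """
    observed_prob = binomial_pmf(observed, trials, null_prob)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    total = 0.0
    for k in range(trials + 1):
        prob = binomial_pmf(k, trials, null_prob)
        if prob <= observed_prob * tolerance:
            total += prob
    return total


# Hypothetical outcomes of 100 flips, tested against a fair coin.
for heads in (55, 60, 61, 65):
    p = two_sided_p_value(heads, trials=100)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{heads} heads out of 100: p = {p:.4f} ({verdict} at 0.05)")
```

Under these assumptions, 60 heads falls just short of the 0.05 line while 61 clears it, which gives a feel for how sharp a conventional cut-off can be.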
The 0.05 convention dates back to the 1920s, when the British statistician Ronald Fisher suggested, “We shall not often be astray if we draw a conventional line at 0.05.”
Fast forward to mid-2017: a number of prominent scientists were no longer convinced they ‘shall not often be astray’, proclaiming that an emerging problem called the replication crisis could be eased if we made the threshold 0.005 instead.
Their claim isn’t a new one. Researchers have argued before that the ‘fickle p-value’ deserves to be used cautiously.
Across many fields, from psychology to oncology, scientists have struggled to reproduce the results of influential experiments, prompting some to ask if we’re too relaxed about what we should accept as solid evidence.
Theoretically, lowering the significance threshold would force researchers to meet stricter conditions, such as collecting larger samples, before their results count as statistically significant.
But one Dutch psychologist thinks it amounts to some “horribly bad advice”.
Last year, Daniël Lakens from Eindhoven University of Technology joined forces with more than 100 other scientists and passionate advocates from around the globe to collaborate on an argument for why changing the p-value threshold would be a super bad idea.
The community worked democratically on a shared Google Docs file, which eventually became the basis of a paper they submitted for peer review.
“It was incredible to see how the document evolved from there,” Lakens told Jop de Vrieze at Science.
“People adding, deleting, and adding again. New discussions appearing in the sidelines. It worked like a charm.”
The final paper had 88 authors attached. Titled “Justify your alpha”, it was recently accepted for publication by Nature Human Behaviour.
Lakens and his fellow authors agree there seems to be a problem with replication, and that a universal value of 0.05 is undesirable.
Their argument against setting a new limit of 0.005 boils down to three points:
- Nobody has shown that the problem with replication is the result of a p-value threshold that is too high;
- The arguments in favour of dropping the 0.05 threshold don’t logically imply that any single replacement should be applied across all fields and disciplines;
- There are negative consequences to consider.
As Lakens puts it, “Why prescribe a single p-value, when science is so diverse?”
Some high-stakes studies would certainly require high levels of confidence.
But excluding studies for want of an impossible sample size could also close off areas of study that could become increasingly productive with time.
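A rough sense of the sample sizes at stake can be sketched with a standard textbook approximation for comparing the means of two groups. This is an illustration only, not a calculation taken from either paper: the effect size of 0.5 standard deviations, the 80 percent power target, and the function name are all assumptions chosen for the example.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(alpha: float, power: float, effect_size: float) -> int:
    """Approximate participants needed per group for a two-sided,
    two-sample comparison of means (normal approximation).

    effect_size is the assumed difference in means, in standard deviations.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the threshold
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Compare the conventional threshold with the proposed stricter one.
for alpha in (0.05, 0.005):
    n = sample_size_per_group(alpha, power=0.8, effect_size=0.5)
    print(f"alpha = {alpha}: about {n} participants per group")
```

With these assumed inputs, the stricter threshold calls for roughly 70 percent more participants per group, and smaller effects push the numbers higher still.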
There are no high councils of science to turn to on these matters, so it’s unlikely the debate will be settled soon.
Maybe it’ll never fully come to rest, with advocates on either side pushing and shoving that little number back and forth far into the future.
We can only hope that’s true. It’s democratic discussions such as these that make science so incredibly powerful.