One of the issues that has prompted statisticians and journals to call for significance testing and hypothesis testing to be discontinued is that p-values are not replicable. That is, if you repeat an experiment (each time randomly drawing a new sample from the population) you are likely to get a very different p-value. Emeritus Professor Geoff Cumming from La Trobe University (in Melbourne, Australia) has illustrated this nicely in a video titled “dance of the p-values”. Viewed nearly 50,000 times, this video illustrates just how unpredictable p-values are.
Many people think about p-values as being a measure of how strong the evidence is in a study. For example, very small p-values like p < 0.01 have been called ‘highly significant’, 0.01-0.05 ‘significant’, 0.05-0.10 ‘approaching significance’, and > 0.10 ‘non-significant’. The problem is that p-values tell us almost nothing about what will happen if an experiment is replicated. When computer simulation is used to replicate an experiment, the p-value varies widely.
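To get a feel for this variability, the short Python sketch below replicates one imaginary two-group experiment 25 times and prints the p-value from each replication. It is only an illustration, not the simulation used in the video: the group size (32 per group), the true mean difference (0.5), the standard deviation (1.0) and the use of a simple z-test with a known standard deviation are all assumptions chosen for convenience.

```python
import random
from statistics import NormalDist, mean


def simulate_p_values(n_replications=25, n_per_group=32,
                      true_difference=0.5, sd=1.0, seed=1):
    """Repeat the same two-group experiment and return one p-value per repeat.

    All default values are illustrative assumptions, not figures from any study.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference between two group means (SD assumed known).
    standard_error = sd * (2 / n_per_group) ** 0.5

    p_values = []
    for _ in range(n_replications):
        # Draw a fresh random sample for each group from the same populations.
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treatment = [rng.gauss(true_difference, sd) for _ in range(n_per_group)]
        z = (mean(treatment) - mean(control)) / standard_error
        # Two-sided p-value from the standard normal distribution.
        p_values.append(2 * (1 - std_normal.cdf(abs(z))))
    return p_values


p_values = simulate_p_values()
for i, p in enumerate(p_values, start=1):
    print(f"replication {i:2d}: p = {p:.4f}")
print(f"smallest p = {min(p_values):.4f}, largest p = {max(p_values):.4f}")
```

Nothing about the population changes between replications; only the random sample does. Changing the `seed` or the assumed effect size shows how the spread of p-values shifts.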
Professor Cumming, like many other statisticians, recommends that p-values no longer be at the centre of our thinking about drawing conclusions from research because no single p-value can be trusted. A much better alternative is using confidence intervals. Estimation using confidence intervals is much more informative because confidence intervals tell us what is likely to happen if we repeat the experiment. For example, 95% confidence intervals tell us that if we repeat the experiment 100 times, in about 95 of the 100 repeats the confidence interval will include the mean difference for the population.
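The coverage property of 95% confidence intervals can be checked with a similar sketch. The code below repeats an imaginary two-group experiment many times, builds a 95% confidence interval for the mean difference each time, and counts how often the interval contains the true population difference. As before, the sample size, true difference and standard deviation are assumed values, and the interval uses a known standard deviation for simplicity (a real analysis would usually estimate it from the data and use the t distribution). It runs 1,000 repeats rather than 100 so that the proportion is more stable.

```python
import random
from statistics import NormalDist, mean


def ci_coverage(n_replications=1000, n_per_group=32,
                true_difference=0.5, sd=1.0, seed=1):
    """Return the proportion of 95% CIs that contain the true mean difference.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
    standard_error = sd * (2 / n_per_group) ** 0.5

    covered = 0
    for _ in range(n_replications):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treatment = [rng.gauss(true_difference, sd) for _ in range(n_per_group)]
        difference = mean(treatment) - mean(control)
        lower = difference - z_crit * standard_error
        upper = difference + z_crit * standard_error
        # Count the repeat if the interval captures the population value.
        if lower <= true_difference <= upper:
            covered += 1
    return covered / n_replications


coverage = ci_coverage()
print(f"proportion of intervals containing the true difference: {coverage:.3f}")
```

Under these assumptions the printed proportion should land close to 0.95, whereas the p-values from the same repeats would still range widely.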
If you are interested in perusing the numbers and formulas on which the “dance of the p-values” is based, we recommend you read the following article:
Your ability to read scientific articles will improve with practice. Make the commitment to read at least one article per month and share your reading with the global physiotherapy community in #MyPTArticleOfTheMonth.