A few weeks ago, Nature published an article summarising the various measures and counter-measures suggested to improve statistical inferences and science as a whole (Chawla, 2017). It detailed the initial call to lower the significance threshold from 0.05 to 0.005 (Benjamin et al., 2017) and the paper published in response (Lakens et al., 2017). It was a well-written article, with one minor mistake: an incorrect definition of the p-value:

The two best sources for the correct definition of a p-value (along with its implications and examples of how a p-value can be misinterpreted) are Wasserstein & Lazar (2016) and its supplementary paper Greenland et al. (2016). A p-value has been defined as: “a statistical summary of the compatibility between the observed data and what we would predict or expect to see if we knew the entire statistical model (all the assumptions used to compute the P value) were correct” (Greenland et al., 2016). To put it another way, it tells us the probability of obtaining the observed data, or more extreme data, assuming that the null hypothesis (along with every other assumption about randomness in sampling, treatment assignment, loss and missingness, the study protocol, etc.) is true.

The definition provided in the Chawla article is incorrect because it states “the smaller the p-value, the less likely it is that the results are due to chance”. This gets things backwards: the p-value is a probability deduced from a set of assumptions (e.g. that the null hypothesis is true), so it cannot simultaneously tell you the probability that those assumptions are correct. Joachim Vandekerckhove and Ken Rothman give further evidence as to why this definition is incorrect: