Psychologists do it, biologists do it, and even squeaky clean economists do it. How can you find out whether a group of scientists are guilty of making lots of comparisons and then selecting positive results for publication? You can draw a graph using all the p values in all the papers, or all the p values relating to the central question in all the papers. You might find it looks like this:
That would be a reassuring result. But what about if it looked like this?
Such a result shows p values clustering just below the conventional significance threshold of 0.05, and that clustering is strongly indicative of p-hacking: under honest reporting there is no reason for results to pile up just under the threshold.
In an important blog post, prominent statistician Uri Simonsohn shows that the first – satisfactory – pattern matches what is found when all p values in all papers are analysed. But when the sample of p values is ‘enriched’ by selecting principal findings, or those p values associated with a co-variate, then the signal emerges from the noise, as in the second figure.
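The mechanism behind the second pattern can be illustrated with a small simulation. The sketch below (my own illustration, not Simonsohn's analysis; it assumes numpy and scipy are available) contrasts an honest researcher, whose p values are uniform when no effect exists, with a researcher who ‘peeks’ at the data and stops collecting as soon as p dips below 0.05 – one common form of p-hacking. The hacked significant results pile up just under the threshold.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def honest_p(n=30):
    # One-sample t-test on pure noise: under the null, p is uniform on (0, 1).
    x = rng.normal(size=n)
    return stats.ttest_1samp(x, 0).pvalue

def hacked_p(n_start=10, n_max=60):
    # Optional stopping: re-test after every new observation and
    # stop the study the moment p falls below 0.05.
    x = list(rng.normal(size=n_start))
    p = stats.ttest_1samp(x, 0).pvalue
    while p >= 0.05 and len(x) < n_max:
        x.append(rng.normal())
        p = stats.ttest_1samp(x, 0).pvalue
    return p

honest = np.array([honest_p() for _ in range(1000)])
hacked = np.array([hacked_p() for _ in range(1000)])

def share_just_under(p):
    # Among 'significant' results, what fraction sit in (0.04, 0.05)?
    sig = p[p < 0.05]
    return float(np.mean(sig > 0.04)) if len(sig) else 0.0

print(f"significant, honest: {np.mean(honest < 0.05):.2f}, "
      f"hacked: {np.mean(hacked < 0.05):.2f}")
print(f"share just under 0.05, honest: {share_just_under(honest):.2f}, "
      f"hacked: {share_just_under(hacked):.2f}")
```

With no true effect, the honest researcher is significant about 5% of the time and those p values are spread evenly below 0.05; the peeking researcher is significant far more often, and the stopped p values cluster just under the threshold – the spike in the second figure.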
— Richard Lilford, CLAHRC WM Director
- Simonsohn U. Falsely Reassuring: Analyses of All P-Values. 24 Aug 2015. [Online].