Chocolate Can Help You Lose Weight – a Hoax

The hoax perpetrated by John Bohannon is now well known.[1] [2] He carried out a small RCT of the effects of a high chocolate diet versus a standard diet on multiple end-points, including blood pressure, mood, cholesterol level, etc. In fact there were 18 end-points, so if the null hypothesis were true for all of them, and the end-points were independent, there was a 0.6 (60%) probability of obtaining at least one ‘positive’ result at the 0.05 significance level {1 – (1 – 0.05)^18 ≈ 0.60}. Then he ‘p-hacked’. Sure enough, one end-point was ‘positive’ – effect on weight.
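
The 60% figure can be checked with a few lines of Python. This is a minimal sketch, assuming the 18 end-points are independent and each is tested at the 0.05 level; the function names are illustrative and are not taken from Bohannon's paper. It also shows the per-test threshold that a Bonferroni-style correction would impose.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among n_tests
    independent tests when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold that keeps the overall
    false-positive probability at or below alpha."""
    return alpha / n_tests


n_endpoints = 18  # number of end-points reported for the chocolate trial

uncorrected = familywise_error_rate(n_endpoints)
per_test = bonferroni_threshold(n_endpoints)
corrected = familywise_error_rate(n_endpoints, per_test)

print(f"Chance of at least one false positive: {uncorrected:.3f}")
print(f"Bonferroni per-test threshold:         {per_test:.5f}")
print(f"Chance after correction:               {corrected:.3f}")
```

With the correction, each end-point would need p below roughly 0.003 to count as ‘positive’, which is why correcting for multiple observations makes positive results harder to obtain.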

He paid £600 to publish the paper in an online journal. It was picked up by Europe’s largest circulation newspaper (Bild) and went viral – all the way to talk shows in Arizona. He then exposed the hoax. (The paper has now been withdrawn but is still available online.)[3]

What are the ethics of such a study? The CLAHRC WM Director “is okay” with it on the grounds that the betrayal of trust inherent in such a hoax is justified by the need to expose the orders of magnitude more harmful betrayals in the entire data dredging, scientific publication and journalistic process.

But what does it mean for science? On the one hand, we have people like Ioannidis saying much (even most) research is junk because of ‘p-hacking’,[4] and on the other, those like Pawson and Tilley, arguing that it is the pattern in ‘rich’ data that should inform theory and action.[5] The former advocate publishing hypotheses in advance, distinguishing between ‘primary’ and ‘secondary’ outcomes, and even correcting for multiple observations (thereby making it hard for results to be ‘positive’). The latter advocate mixed methods research and ‘triangulation’.

Both have a point. The chocolate story is but one of many showing that the dangers of p-hacking are very real. Yet the realist camp also have a point – indeed a more profound one – since it is a long-standing part of the scientific method that multiple observations reinforce or undermine theories. The famous philosopher of science, William Whewell, articulated the idea of forming a theory, deciding what observations could confirm or refute the theory, and then collecting the necessary observations.[6] A multiple outcome study does just that (even if the observations come from one study rather than studies in series). Finding that a treatment both reduces blood pressure and heart attack and stroke is more impressive evidence than reduction in one of these end-points alone. The combination of effects in the expected direction suggests that the underlying theoretical construct is correct and that it would be safe to generalise. This would be the case even with respect to an end-point where improvement was not quite significant at the usual threshold. Likewise, showing that improving nurse-patient ratios resulted in nurses spending more time with patients, being more diligent in making observations of vital signs, and turning patients more often, as well as improving satisfaction and improving clinical outcomes, would be more impressive evidence (say in an observational study) than improvement in one end-point alone.

So how to reconcile these two approaches? What we need – the trick to pull off – is to impose a prior discipline (akin to the idea of ‘prior’ hypotheses), while capitalising on the idea of corroboration across different observations, as recommended by Whewell. Here discipline is imposed by first spelling out the hypothesised relationships between end-points. Then observations are made with respect to the hypothesised relationships across the pre-defined causal chain. In the case of nurse-patient ratios, the causal chain may look something like this:

(A) Intervention: advertise for more nurses, so that more nurses are hired; leading to
(B) Intervening variables: nurse morale improves, nurses spend more time with patients, nurse knowledge improves; leading to
(C) Clinical processes: patients turned more often, vital signs more diligently observed, nurses provide more compassion; leading to
(D) Patient outcomes: fewer pressure ulcers, lower mortality / fewer failed resuscitation attempts, and improved satisfaction.

Observations are made across this chain. A borderline improvement in patient satisfaction in an uncontrolled study, in the absence of a change in any other end-point, would not be impressive evidence of effectiveness. However, showing that the intervention was properly implemented (A), and that intervening variables (B), clinical processes (C), and patient outcomes (D) all improved, would support a cause and effect relationship.[7] This would hold even if one end-point, say pressure ulcers, improved but not by enough to cross the usual threshold for statistical significance.
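
A rough way to see why agreement across the chain is persuasive is a sign-based calculation. The sketch below is a deliberately crude illustration and not the method proposed in this post: it assumes that, if the intervention had no effect, each end-point would be equally likely to improve or worsen, and that end-points move independently of one another. End-points along a real causal chain are correlated, so the true probabilities would be larger than these. The function name is hypothetical.

```python
def prob_all_in_expected_direction(n_endpoints: int) -> float:
    """Chance that every end-point moves in the hypothesised direction
    purely by luck, assuming a 50/50 chance per end-point and
    independence between end-points (both simplifying assumptions)."""
    return 0.5 ** n_endpoints


# One end-point alone versus several pre-specified end-points that agree.
for k in (1, 4, 8):
    p = prob_all_in_expected_direction(k)
    print(f"{k} end-point(s) all in the expected direction by chance: {p:.4f}")
```

Under these assumptions a single favourable end-point is a coin toss, whereas eight pre-specified end-points all moving as predicted would be very unlikely to arise by chance, even if some of them individually miss the usual significance threshold.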

So there we have it – a philosophical basis to reconcile two apparently contradictory movements in research. All that remains is to work out how to combine the data. CLAHRC WM is actively investigating Bayesian networks for this purpose with practical examples supported by NIHR Programme, HS&DR, and Health Foundation grants.

— Richard Lilford, CLAHRC WM Director


  1. Bohannon J. I Fooled Millions into Thinking Chocolate Helps Weight Loss. Here’s How. io9. 27 May 2015.
  2. Kassel M. John Bohannon’s Chocolate Hoax and the Spread of Misinformation. Observer. 6 April 2015.
  3. Bohannon J, Koch D, Homm P, Driehaus A. Chocolate with high Cocoa content as a weight-loss accelerator. 2015. [Online]
  4. Ioannidis JPA. Why Most Published Research Findings are False. PLoS Med. 2005; 2: e124.
  5. Pawson R & Tilley N. Realistic Evaluation. London: Sage. 1997.
  6. Whewell W & Butts RE. William Whewell’s Theory of Scientific Method. Pittsburgh: University of Pittsburgh Press. 1968.
  7. Lilford RJ, Chilton PJ, Hemming K, Girling AJ, Taylor CA, Barach P. Evaluating policy and service interventions: framework to guide selection and interpretation of study end points. BMJ. 2010; 341: c4413.

2 thoughts on “Chocolate Can Help You Lose Weight – a Hoax”

  1. Richard, maybe I have missed something, but doesn’t it come down to having to use a Bonferroni correction or similar? I am just thinking of simple advice to help our Junior Docs spot spurious research, and I guess other clues are the lack of information on sample size and the absence of any attempt at a power calculation.

  2. I am not particularly sympathetic to the view that the benefit of the hoax may have justified the means. The study did contravene the Helsinki Declaration, and with clinical trials I think that we have to abide by a set of participant rights and maxims that dictate conduct. After all, many of the most horrendous medical experiments ever conducted have been justified by their utilitarian ends: the crimes tried at the Nazi Doctors’ Trial and the exposure of people to radiation in the US between 1944 and 1974 without their knowledge or consent, to name but two examples. While the harms to the participants were negligible, Bohannon could have just made up the data if the trial itself wasn’t the point of the exercise.
