Caution should be Exercised when Synthesising Evidence for Policy

Policy should be formulated from all the available evidence. For this reason systematic reviews and meta-analyses are undertaken. However, they are often not conclusive. Indeed, there have been notable articles published in the BMJ over the last two years which are critical of the evidence or conclusions of reviews that have been conducted to inform important contemporary public health decisions.

A key theme that often emerges from articles critical of reviews is that only evidence from randomised controlled trials (RCTs) is strong enough to support policy decisions. For example, Teicholz [1] claimed that a number of important RCTs were ignored by a recent report explaining changes in dietary guidance in the US. This claim has since been refuted by a large number of prominent researchers.[2] Kmietowicz [3] argued that there were flaws in a meta-analysis of observational patient data that supported the stockpiling of anti-flu medication for pandemic influenza, casting doubt on the decision to stockpile. An upcoming analysis of clinical trial data was instead alluded to, despite these trials only examining seasonal flu. Recently, McKee and Capewell,[4] and later Gornall,[5] criticised the evidence underpinning a comprehensive review from Public Health England [6] on the relative harms of e-cigarettes. They noted that it “included only two randomised controlled trials” and that there were methodological weaknesses and potential conflicts of interest in the other available evidence. McKee and Capewell make the claim that “the burden of proof that it is not harmful falls on those taking an action.” However, this is illogical because any policy choice, even doing nothing, can be considered an action and can cause harm. This claim therefore merely translates to saying that the policy chosen should be that best supported by the evidence of its overall effects.

Public health decisions should be made on the basis of all the currently available evidence. What, then, are the reasons one might write off a piece of evidence entirely? One might object to the conclusions reached from the evidence on an ideological basis, or one might view the evidence as useless. In the latter case, this opinion could be reached by taking a rigid interpretation of the ‘hierarchy of evidence’. RCTs may be the only way of knowing for sure what the effects are, but this is not tantamount to concluding that other evidence should be rejected. RCTs are often, correctly in our view, regarded as an antidote to ideology. However, it is important not to let matters get out of hand so that RCTs themselves become the ideology.

In a recent paper, Walach and Loef [7] argue that the hierarchy of evidence model, which places RCTs at the top of a hierarchy of study designs, is based on false assumptions. They argue that this model only represents degrees of internal validity, and that as internal validity increases, external validity decreases. We don’t strictly agree: there is no necessary trade-off between internal and external validity. However, we do agree that in many cases, by virtue of the study designs, RCTs may provide greater internal validity and other designs greater external validity. How, then, could we know which results to rely on when RCTs and observational studies disagree? The answer is that one would have to look outside the studies and piece together a story, i.e. a theory, rather than ignore the observational evidence, as recognised by Bradford Hill’s famous criteria.

The case of chorionic villus sampling, a test to detect foetal genetic abnormalities, serves as a good example of how different forms of evidence can provide different insights and be synthesised. Observational studies found evidence that chorionic villus sampling increased the risk of transverse limb deformities, which was not detected in any of the RCTs at the time. To make sense of the evidence and to understand whether the findings from the observational evidence were a result of random variation in the population or perhaps poor study design, knowledge of developmental biology, teratology, and epidemiology was required. It turned out that the level of the transverse abnormality – fingers, hands, forearm, or upper arm – corresponded to the embryonic age at which the sampling was conducted and also to the development of the limb at that point. This finding enabled a cause and effect conclusion to be drawn that explained all the evidence and resulted in recommendations for safer practice.[8] [9]

Knowledge gained from the scientific process can inform us of the possible consequences of different policy choices. The desirability of these actions or their consequences can then be assessed in a normative or political framework. The challenge for the scientist is to understand and synthesise the available evidence independently of their ideological stance. There often remains great uncertainty about the consequences of different policies. In some cases, such as with electronic cigarettes, there may be reason to maintain the current policy if, by doing so, the likelihood of collecting further and better evidence is enhanced. However, in other cases, like stockpiling for pandemic influenza, such evidence depends on there being a pandemic, and by then it is too late. Accepting only RCT evidence or adopting an ideological stance in reporting may distort what is reported to both key policy decision makers and individuals wishing to make an informed choice. It may even be potentially harmful.

— Richard Lilford, CLAHRC WM Director
— Sam Watson, Research Fellow


  1. Teicholz N. The scientific report guiding the US dietary guidelines: is it scientific? BMJ. 2015; 351: h4962.
  2. Centre for Science in the Public Interest. Letter Requesting BMJ to Retract “Investigation”. Nov 5 2015.
  3. Kmietowicz Z. Study claiming Tamiflu saved lives was based on “flawed” analysis. BMJ. 2014; 348: g2228.
  4. McKee M, Capewell S. Evidence about electronic cigarettes: a foundation built on rock or sand? BMJ. 2015; 351: h4863.
  5. Gornall J. Public Health England’s troubled trail. BMJ. 2015; 351: h5826.
  6. McNeill A, Brose LS, Calder R, et al. E-cigarettes: an evidence update: a report commissioned by Public Health England. London: Public Health England, 2015.
  7. Walach H, Loef M. Using a matrix-analytical approach to synthesizing evidence solved incompatibility problem in the hierarchy of evidence. J Clin Epidemiol. 2015; 68(11): 1251–60.
  8. Olney RS. Congenital limb reduction defects: clues from developmental biology, teratology and epidemiology. Paediatr Perinat Epidemiol. 1998; 12: 358–9.
  9. Mowatt G, Bower DJ, Brebner JA, et al. When and how to assess fast-changing technologies: a comparative study of medical applications of four generic technologies. Health Technol Assess. 1996; 1: 1–149.


