History of Controlled Trials in Medicine

Rankin and Rivest recently published a piece looking at the use of clinical trials more than 400 years ago,[1] while Bothwell and Podolsky have produced a highly readable historical account of controlled trials.[2] Alternate treatment designs became quite popular in the late nineteenth century, but Austin Bradford Hill was concerned with the risk of ‘cheating’ and carried out an iconic RCT to overcome the problem.[3] But what next for the RCT? It is time to move to a Bayesian approach,[4] automate trials in medical record systems, and widen credible limits to include the risk of bias when follow-up is incomplete, the therapist is not masked, or subjective outcomes are not effectively blinded.

— Richard Lilford, CLAHRC WM Director


  1. Rankin A & Rivest J. Medicine, Monopoly, and the Premodern State – Early Clinical Trials. N Engl J Med. 2016; 375(2): 106-9.
  2. Bothwell LE & Podolsky SH. The Emergence of the Randomized Controlled Trial. N Engl J Med. 2016; 375(6): 501-4.
  3. Hill AB. The environment and disease: Association or causation? Proc R Soc Med. 1965; 58(5): 295-300.
  4. Lilford RJ & Edwards SJL. Why Underpowered Trials are Not Necessarily Unethical. Lancet. 1997; 350(9080): 804-7.

Frequency of Safety Incidents in Primary Care – an Ephemeral Quality

Most epidemiological studies of safety incidents have been done in hospitals, starting with the iconic Harvard Medical Practice Study.[1] Primary care has proved a more difficult context for quantitative evaluation of safety. A systematic review of reviews and primary studies (109 in total) has recently been published.[2] The main message that I took away is that all estimates are unstable, irrespective of the type of incident (e.g. diagnostic vs. prescribing error) or the quality of the study. Prospective studies seem to detect a higher proportion of incidents than retrospective studies. One important observation is confirmed – diagnostic errors are more likely to result in harm than other types of error – which is why I bang on about diagnostic error.[3]

— Richard Lilford, CLAHRC WM Director


  1. Brennan TA, Leape LL, Laird NM, Hebert L, Localio AR, Lawthers AG, et al. Incidence of adverse events and negligence in hospitalized patients. Results of the Harvard Medical Practice Study I. N Engl J Med. 1991; 324(6): 370-6.
  2. Panesar SS, deSilva D, Carson-Stevens A, et al. How Safe is Primary Care? A Systematic Review. BMJ Qual Saf. 2016; 25: 544-53.
  3. Lilford RJ. Diagnostic Errors – Extremely Important but How Can They Be Measured? NIHR CLAHRC West Midlands News Blog. 26 February 2016.

Interdental Devices. Weak Effectiveness Detected from Weak Studies

Recently the Associated Press published a report detailing their requests to the US departments of Health and Human Services and Agriculture for information regarding the effectiveness of flossing.[1] They found only weak evidence, and when the federal government issued their updated dietary guidelines flossing was no longer recommended.

An overview (meta-review) from 2015 was cited, which evaluated various interdental devices with respect to protecting against plaque and gum disease.[2] The effects of using a device and brushing, versus brushing alone, are small, but mainly in a positive direction. Interdental devices were also superior to floss when the two were compared head-to-head, but again the difference was of small magnitude. The quality of the trials in the six meta-analyses included in the overview was not very high on average. We do not know how good compliance with use of the devices was in the intervention group across studies, nor the extent to which the control group abstained from using them. Interviewed on BBC Radio Hereford and Worcester, the CLAHRC WM Director opined that he would continue to use interdental devices, despite their apparently nugatory effects, for reasons of taste and aesthetics, just as he shampoos his hair regularly even though it will not prevent him from going grey on top.

— Richard Lilford, CLAHRC WM Director


  1. Donn J. Medical Benefits of Dental Floss Unproven. Associated Press. 2 August 2016.
  2. Sälzer S, Slot DE, Van der Weijden FA, Dörfer CE. Efficacy of inter-dental mechanical plaque control in managing gingivitis–a meta-review. J Clin Periodontol. 2015; 42(s16): s92-105.

Ethics of Using Other Researchers’ Data

It is good practice to make data collected in research projects available to others. A recent editorial in the New England Journal of Medicine [1] says that just using such data is ‘parasitical’, and suggests that researchers who use archived data should collaborate with those who collected the original data. The CLAHRC WM Director disagrees. While there may be times when collaboration with the originators of the data is a good idea, it should not be expected or required. The original researchers are ‘invested’, and on many occasions it is in the public and scientific interest for the new investigations to maintain independence. It is best that data ‘gifted’ to the research community should be just that – a gift. Also, the original researchers might have lost interest, retired or died. Reinhart and Rogoff’s magisterial database of economic data is a case in point.[2] Independent re-analysis of the data they had collected and magnanimously made available to other researchers sometimes produced conclusions different from the original.

— Richard Lilford, CLAHRC WM Director


  1. Longo DL & Drazen JM. Data Sharing. N Engl J Med. 2016; 374: 276-7.
  2. Reinhart CM & Rogoff KS. Growth in a Time of Debt. Am Econ Rev. 2010; 100(2): 573-8.

More on Education

Michelle Obama visited an all-girls school in London back in 2009, and met with the pupils again in 2011 and 2012. The school is named after the first English woman doctor, Elizabeth Garrett Anderson. Simon Burgess has examined performance at the school over the years preceding and following these meetings.[1] He used the results of national examinations (the GCSE) and compared the school’s results with those from the rest of London’s schools. A sharp uptick in performance, which later returned to baseline, was seen in the ‘intervention’ school, but not in the controls. Burgess used a difference-in-difference type approach in a multivariate statistical analysis (though a synthetic control may have been even better, as discussed in a previous post). The ‘treatment effect’ was half a standard deviation, which would carry a student destined to achieve eight grade ‘B’s to a mix of ‘A*’s and ‘A’s. The paper is worth a read – it is beautifully written and packs a powerful message regarding the beneficial effect of aspirations. The First Lady did not tell her listeners that getting good grades is easy. She said it is hard, but ‘you can do it’. Most of the pupils at the school are not white, and Michelle Obama would have been a great role model.
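The difference-in-differences logic behind Burgess's comparison can be sketched in a few lines of Python. The numbers below are invented purely for illustration (his actual analysis is a multivariate regression on pupil-level GCSE data):

```python
# Hypothetical, illustrative numbers only - not Burgess's data.
# Difference-in-differences: the change in the 'intervention' school
# minus the change in control schools, which absorbs common trends
# (e.g. London-wide grade inflation).

def did_estimate(treated_before: float, treated_after: float,
                 control_before: float, control_after: float) -> float:
    """Treatment effect = (change in treated group) - (change in controls)."""
    return (treated_after - treated_before) - (control_after - control_before)

# Standardised mean GCSE scores before and after the visits.
effect = did_estimate(treated_before=0.00, treated_after=0.55,
                      control_before=0.00, control_after=0.05)
print(effect)  # → 0.5, i.e. half a standard deviation, as in the paper
```

Subtracting the control schools' change removes any city-wide trend that would otherwise be confounded with the effect of the visits.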

— Richard Lilford, CLAHRC WM Director


  1. Burgess S. Michelle Obama and an English school: the power of inspiration. 2016.

Bring Back the University Lecture: More on Evidence-Based Teaching

News Blog readers are now familiar with Hattie’s monumental work on evidence-based education [1] – an overview (meta-synthesis) of hundreds of meta-analyses covering many thousands of individual studies.

To remind you, a huge proportion of the meta-analyses and studies (96%) show positive effects – maybe a Hawthorne effect of some sort. So an influence or intervention that produces an effect size of, say, only 0.2 of a standard deviation must be considered not particularly useful – it will be at the bottom end of a distribution in which nearly everything ‘works’.

In our last two posts [2] [3] we identified two factors that were, perhaps surprisingly, effete:

  1. Small class sizes.
  2. Problem-based learning.

I should have mentioned that there is no threshold class size – reductions from 200 to 60, from 60 to 20, and from 20 to 8 all yield nugatory benefits. Moreover, and again perhaps surprisingly, the results of most studies are not very age-group dependent. You can see where I am going – abandoning the lecture in universities, in line with current fashion, should be questioned, especially given the cost-efficiency of the method. Important variables (have the students prepared in advance; does the lecturer stop and ask questions to assess understanding; do the students set time aside to reflect; does the lecturer assess herself; does she adapt herself to the type of class/group she is teaching) are all more important than the size of the class. A great lecturer is a scarce resource to be used wisely. Think TED talks.

— Richard Lilford, CLAHRC WM Director


  1. Hattie J. The Applicability of Visible Learning to Higher Education. Scholarship of Teaching and Learning in Psychology. 2015; 1(1): 79-91.
  2. Lilford RJ. Evidence-Based Education (or how wrong the CLAHRC WM Director was). NIHR CLAHRC West Midlands News Blog. 15 July 2016.
  3. Lilford RJ. Ask Not Whether, But Why, Before the Bell Tolls! NIHR CLAHRC West Midlands News Blog. 29 July 2016.

Digital Future of Systematic Reviews

A good friend and colleague, Kaveh Shojania, recently shared an article about bitcoin (a form of digital currency), which predicts the end of the finance industry as we know it.[1] The article argues that commercial banks, in particular, will no longer be needed. But what about our own industry of clinical epidemiology? Two thoughts occur:

  1. The current endeavour might not be sustainable.
  2. There might be another way to study prognosis, diagnosis and treatment.

We have argued in a previous post that traditional systematic reviews might soon become victims of their own success. News Blog readers will remember our argument that the size of the literature will soon become just too large to review in the normal way. In addition, we have posited the twin issues of “question inflation and effect size deflation”. That is to say, the number of potential comparisons is already becoming unwieldy (some network meta-analyses include over 100 individual comparators [2]), and plausible effect sizes are getting smaller as the headroom for further improvements gets used up. Our colleague Norman Waugh tells us that his latest Cochrane review, concerning glucagon-like peptides in diabetes, runs to over 800 pages. Many have written about the role of automation in searching and screening the relevant literature,[3-5] including ourselves in a previous post, but the task of analysing the shedload of retrieved articles will itself become almost insurmountable. At the rate things are going, this may happen sooner than you think![6]
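The 'question inflation' point is easy to quantify: with n comparators in a network meta-analysis, the number of distinct pairwise comparisons grows quadratically (n choose 2). A minimal sketch:

```python
# Number of distinct head-to-head comparisons among n treatments:
# each unordered pair {A, B} is one comparison, i.e. n * (n - 1) / 2.

def n_pairwise(n: int) -> int:
    return n * (n - 1) // 2

for n in (5, 20, 100):
    print(n, n_pairwise(n))
# 5 treatments   -> 10 comparisons
# 20 treatments  -> 190 comparisons
# 100 treatments -> 4950 comparisons
```

So a network with 100 comparators, as cited above, implicitly poses nearly five thousand head-to-head questions.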

What is to be done? One possibility is that the whole of clinical epidemiology will be largely automated. We have written before about electronic patient records as a potential source of data for clinical research. These ‘rich’ data will be available for analysis by standard statistical methods. However, machine learning is being taken increasingly seriously, and so it is possible to imagine a world in which the bulk of clinical epidemiological studies run under programme control. That is to say, machine learning algorithms will sit behind rapidly accumulating clinical databases, searching for signals and conducting replication studies autonomously, perhaps even across national borders. In previous posts we have waxed lukewarm about IT systems, which have the potential to disrupt doctor-patient relationships, and where greater precision may be achieved at the cost of increasing inaccuracy. However, it is also possible that these problems can be mitigated by collecting and adjusting for ever larger amounts of information, and perhaps by finding instrumental variables, including those afforded by Mendelian randomisation.

Will all this mean that the CLAHRC WM director will soon retire, while his young colleagues find themselves being made redundant? Almost certainly not. For as long as can be envisaged, human agency will be required to write and monitor computer algorithms, to apply judgement to the outputs, to work out what it all means, and to design and implement subsidiary studies. If anything, epidemiologists of the future will require deeper epistemological understanding, statistical ability and technical knowhow.

— Richard Lilford, CLAHRC WM Director
— Yen-Fu Chen, Senior Research Fellow


  1. Lanchester J. When bitcoin grows up. London Rev Books. 2016; 38(8): 3-12.
  2. Zintzaras E, Doxani C, Mprotsis T, Schmid CH, Hadjigeorgiou GM. Network analysis of randomized controlled trials in multiple sclerosis. Clin Ther. 2012; 34(4): 857-69.
  3. O’Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Syst Rev. 2015; 4: 5.
  4. Tsafnat G, Glasziou P, Choong MK, Dunn A, Galgani F, Coiera E. Systematic review automation technologies. Syst Rev. 2014; 3: 74.
  5. Choong MK, Galgani F, Dunn AG, Tsafnat G. Automatic evidence retrieval for systematic reviews. J Med Internet Res. 2014; 16(10): e223.
  6. Bastian H, Glasziou P, Chalmers I. Seventy-five trials and eleven systematic reviews a day: how will we ever keep up? PLoS Med. 2010;7(9): e1000326.

Managing Staff: A Role for Tough Love?

Over the years the CLAHRC WM Director has participated in extensive training in HR issues. The training usually starts with feedback from staff on their satisfaction with their work environment and their boss. The idea then is to amend the environment or the behaviour of the boss, with a view to improving staff feedback. It is surely excellent for staff to provide feedback, and for bosses to be humble and to continually strive to be ‘better’ bosses:

[Figure: Managing Staff]

One thing a boss may be asked to do is to reduce stress on staff. But where does this stress come from? Ultimately, the competitive external environment. So what can the boss do about that? Presumably, the worker cannot be shielded from the stress. Academics on research contracts face redundancy if they cannot secure research grants. So, taking the stress out of the job would be self-defeating. Bosses should help staff cope with the real and present threats they face, and do them a disservice if they shield them from those threats. Enter Alia Crum and colleagues, with two wonderful experiments.[1] [2] First they studied interviewees facing stressful interviews, and then bankers facing the financial crisis (poor bankers). In both cases, interventions designed to generate a positive mind-set towards stress bolstered coping mechanisms. They also improved receptivity to critical feedback, which is an essential component of academic life. People receive good salaries for tackling difficult and stressful situations. Do not try to pretend that this is not so; instead select resilient staff, make them feel a little heroic,[3] and create a team ethos where stress is to be relished! The CLAHRC WM Director promises his team ‘blood, sweat, and tears’. When our grant applications are turned down it is what we were expecting; when they succeed we get a nice surprise!

— Richard Lilford, CLAHRC WM Director


  1. Crum AJ, Salovey P, Achor S. Rethinking Stress: The Role of Mindsets in Determining the Stress Response. J Person Soc Psychol. 2013; 104(4): 716-33.
  2. Crum AJ, Akinola M, Martin A, Fath S. The Benefits of a Stress-is-enhancing Mindset in Both Challenging and Threatening Contexts. 2015. [Under Review].
  3. Lilford RJ. Can We Do Without Heroism in Health Care? NIHR CLAHRC West Midlands News Blog. 20 March 2015.


Beyond Logic Models

It has become common in systematic reviews and, increasingly, in research papers to present a logic model – a trend we entirely applaud. Logic models are usually “graphical depictions of processes” that “describe logical linkages among program resources, activities, outputs… and outcomes.” The ultimate purpose is to depict ‘if-then’ causal relationships between elements of the programme.[1] Good examples of logic models can be found in the Cochrane Reviews on slum regeneration,[2] and housing improvements.[3] In health care, the logic model explicates the putative causal pathway linking cause to its ultimate effect at the patient/client level. It may be said to encapsulate programme theory. Here is a simple logic model, adapted from one published previously,[4] relating to the effects of electronic prescribing systems.

[Figure: Beyond Logic Models, Fig 1]

The advantages of such logic models are manifold – they explicate theory, provide information on salient endpoints,[5] clarify research questions, and improve communication.[6]

While logic models lay bare the nodes in a causal chain where data may be collected, they do not entail a method to synthesise all this information. The default approach in healthcare is to synthesise information implicitly – the literature refers to ‘triangulation’. However, the mental workings of implicit ‘triangulation’ are opaque, at least to others. Equally important, they do not yield estimates of the effect of interest to a decision maker – quantities that are required to inform decision models (such as health economic models). But this limitation can be overcome by making use of a method that is well known in other disciplines – development economics, molecular biology, and agriculture, to name but a few. We refer here to Bayesian networks. A Bayesian network is a representation of a joint probability distribution and its conditional independence assumptions. It enables information collected at each node in the chain to be synthesised to estimate the effects of the intervention on outcomes of interest. Qualitative information can be incorporated through probability distributions elicited from experts. It is even possible to adjust for bias in such a model by specifying a probability distribution for bias (which can be used to ‘update’ quantitative estimates) according to the model of Turner and Spiegelhalter.[7] External evidence, say from the literature, can also be incorporated, again by making use of elicited probability densities. The interactions between nodes do not have to be linear, but causality is in one direction only. Clearly, if you think that reverse causality will apply to a material degree within a given time-frame, then a more complex model, such as a dynamic event simulation, would be necessary.
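As a toy illustration of how a Bayesian network propagates evidence along a causal chain, consider a two-link version of the e-prescribing logic model. All probabilities below are invented for illustration; in a real analysis they would come from data observed at each node, or from probability distributions elicited from experts:

```python
# Toy chain: intervention (e-prescribing) -> prescribing error -> patient harm.
# All conditional probabilities are invented for illustration only.

p_error = {True: 0.02, False: 0.05}       # P(error | intervention present?)
p_harm_given = {True: 0.10, False: 0.01}  # P(harm | error occurred?)

def p_harm(intervention: bool) -> float:
    """Marginalise over the intermediate 'error' node:
    P(harm) = P(harm|error)P(error|i) + P(harm|no error)P(no error|i)."""
    pe = p_error[intervention]
    return p_harm_given[True] * pe + p_harm_given[False] * (1 - pe)

risk_difference = p_harm(False) - p_harm(True)
print(p_harm(True), p_harm(False), risk_difference)
# With these toy numbers the intervention reduces harm from
# roughly 1.45% to 1.18% of patients.
```

Bias adjustment in the spirit of Turner and Spiegelhalter could be added by placing a probability distribution over each conditional probability and averaging over it, rather than using the point values shown here.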

We have systematically reviewed the literature of service delivery / health services research and found no examples where the powerful technique of Bayesian networks has been used or even advocated [8] (apart from our own papers). It is not mentioned, for example, in important articles on systematic reviews of complex interventions in health services.[9] [10] Since the method has been used to good effect in other disciplines involving complex interventions, we think the time is propitious to explore its use in our field.

— Richard Lilford, CLAHRC WM Director


  1. McCawley PF. The Logic Model for Program Planning and Evaluation. CIS 1097. Moscow, ID: University of Idaho, 1997.
  2. Turley R, Saith R, Bhan N, Rehfuess E, Carter B. Slum upgrading strategies involving physical environment and infrastructure interventions and their effects on health and socio-economic outcomes. Cochrane Database Syst Rev. 2013; 1: CD010067.
  3. Thomson H, Thomas S, Sellstrom E, Petticrew M. Housing improvements for health and associated socio-economic outcomes. Cochrane Database Syst Rev. 2013; 2: CD008657.
  4. Watson SI & Lilford RJ. Essay 1: Integrating multiple sources of evidence: a Bayesian perspective. In: Challenges, solutions and future directions in the evaluation of service innovations in health care and public health. Southampton (UK): NIHR Journals Library, 2016.
  5. Lilford RJ, Chilton PJ, Hemming K, et al. Evaluating policy and service interventions: framework to guide selection and interpretation of study end points. BMJ. 2010; 341: c4413.
  6. Anderson L, Petticrew M, Rehfuess E, et al. Using logic models to capture complexity in systematic reviews. Res Synth Methods. 2011; 2(1): 33-42.
  7. Turner RM, Spiegelhalter DJ, Smith GC, Thompson SG. Bias modelling in evidence synthesis. J R Stat Soc Ser A Stat Soc. 2009; 172(1):21-47.
  8. Chen Y-F, Uthman OA, Leamon S, Watson SI, Lilford RJ. Potential use of Bayesian networks in healthcare service delivery and quality improvement research. [In preparation].
  9. Petticrew M, Anderson L, Elder R, et al. Complex interventions and their implications for systematic reviews: a pragmatic approach. J Clin Epidemiol. 2013; 66(11): 1209-14.
  10. Petticrew M, Rehfuess E, Noyes J, et al. Synthesizing evidence on complex interventions: how meta-analytical, qualitative, and mixed-method approaches can contribute. J Clin Epidemiol. 2013; 66(11): 1230-43.

Education Update

As News Blog readers know, the CLAHRC WM Director summarises an empirical finding from the experimental educational literature in each fortnightly post. In this issue our focus turns to university education [1] and to just one aspect of it – the perennial question of Problem-Based Learning (PBL). Nine meta-analyses have evaluated this method; most constituent studies were carried out in universities rather than schools, and most among medical students. It turns out that PBL is effete – the summary measure of effect is nugatory (0.08 of a standard deviation). It is one of the smallest effect sizes of any pedagogic method evaluated across the entire corpus of the experimental education literature. Moreover, it is actually harmful in some situations – namely those where PBL precedes learning the basic content. PBL is most likely to be effective where the intellectual scaffold has already been built and the student now has to learn to apply the new knowledge.

Consider a patient with pyrexia of unknown origin. Working backwards from the raised temperature when one does not yet know the possible causes does not create an intellectual scaffold from which forward reasoning can proceed. Rather, start with the potential causes and then narrow them down as information accrues.

— Richard Lilford, CLAHRC WM Director


  1. Hattie J. The applicability of visible learning to higher education. Scholarship Teaching Learning in Psychology. 2015; 1(1): 79-91.