Tag Archives: Evidence-based medicine

Clinical Research Stands Out Among Disciplines for Being Largely Atheoretical

A recent paper in the BMJ (see our recent Director’s Choice) described the (null) result of an RCT of physiotherapy for ankle injury.[1] The broader implications of this finding were discussed neither in the discussion section of the paper itself nor in the accompanying editorial.[2] The focus was confined entirely to the ankle joint, with not a thought given to implications for strains around other joints. The theory by which physiotherapy may produce an effect, and why this might apply to some joints and not others, did not enter the discourse. The ankle joint study is no exception; such an atheoretical approach is de rigueur in medical journals, and it seems to distinguish clinical research from nearly everything else. Most scientific endeavours try to find out what results mean – they seek to explain, not just describe. Pick up an economics journal and you will find, in the introduction, an extensive rationale for the study. Only when the theory that the study seeks to explicate has been thoroughly dealt with do the methods and results follow. An article in a physics journal will use data to populate a mathematical model that embodies theory. Clinical medicine’s parent discipline – the life sciences – is also heavily coloured by theory; Watson and Crick famously built their model (theory) entirely on other researchers’ data.

The premise that theory features less prominently in medical journals compared to the journals of other disciplines is based on my informal observations; my evidence is anecdotal. However, the impression is confirmed by colleagues with experience that ranges across academic disciplines. In due course I hope to stimulate work in our CLAHRC, or with a broader constituency of News Blog readers, to further examine the prominence given to theory across disciplines. In the meantime, if the premise is accepted, contingent questions arise – why is theory less prominent in medicine and is this a problem?

Regarding the first point, it was not ever thus. When I was studying medicine in the late 1960s / early 1970s ‘evidence-based medicine’ lay in the future – it was all theory then, even if the theory was rather shallow and often implicit. With the advent of RCTs and increased use of meta-analysis it became apparent that we had often been duped by theory. Many treatments that were supported by theory turned out to be useless (like physiotherapy for sprained ankles), or harmful (like steroids for severe head injury). At this point there was a (collective) choice to be made. Evidence could have been seen as a method to refine theory and thereby influence practice. Alternatively, having been misdirected by theory in the past, its role could have been extirpated (or downgraded) so that the evidence became the direct basis for practice. Bradford Hill, in his famous talk,[3] clearly favoured the former approach, but the profession, perhaps encouraged by some charismatic proponents of evidence-based medicine, seems to have taken the second route. It would be informative to track the evolution of thought and practice through an exegesis of historical documents since what I am suggesting is itself a theory – albeit a theory which might have verisimilitude for many readers.

But does it matter? From a philosophy of science point of view the answer is ‘yes’. Science is inductive, meaning that results from one place and time must be extrapolated to another. Such an extrapolation requires judgement – the informed opinion that the results can be transferred / generalised / particularised across time and place. And what is there to inform such a judgement but theory? So much for philosophy of science, but is there any evidence from practice to support the idea that an atheoretical approach is harmful? This is an inevitably tricky topic to study because the counterfactual cannot be observed directly – would things have turned out differently under an imaginary counterfactual where theory was given more prominence? Perhaps, if theory had been given more weight, we would have extrapolated from previous data and realised earlier that it is better to treat all HIV-infected people with antivirals, not just those with suppressed immune systems.[4] Likewise, people have over-interpreted null results of adjuvant chemotherapy in rare tumours when they could have easily ‘borrowed strength’ from positive trials in more common, yet biologically similar, cancers.[5] [6]

In the heady days of evidence-based medicine many clear-cut results emerged concerning no treatment versus a proposed new method. Now we have question inflation among a range of possible treatments and diminishing headroom for improvement. Not all possible treatments can be tested across all possible conditions, so we are going to have to rely more on network meta-analyses, database studies and also on theory.

Richard Lilford, CLAHRC WM Director


  1. Brison RJ, Day AG, Pelland L, et al. Effect of early supervised physiotherapy on recovery from acute ankle sprain: randomised controlled trial. BMJ. 2016; 355: i5650.
  2. Bleakley C. Supervised physiotherapy for mild or moderate ankle sprain. BMJ. 2016; 355: i5984.
  3. Hill AB. The environment and disease: Association or causation? Proc R Soc Med. 1965; 58(5): 295-300.
  4. Thompson MA, Aberg JA, Hoy JF, et al. Antiretroviral Treatment of Adult HIV Infection. 2012 Recommendations of the International Antiviral Society – USA Panel. JAMA. 2012; 308(4): 387-402.
  5. Chen Y-F, Hemming K, Chilton PJ, Gupta KK, Altman DG, Lilford RJ. Scientific hypotheses can be tested by comparing the effects of one treatment over many diseases in a systematic review. J Clin Epidemiol. 2014; 67: 1309-19.
  6. Bowater RJ, Abdelmalik SM, Lilford RJ. Efficacy of adjuvant chemotherapy after surgery when considered over all cancer types: a synthesis of meta-analyses. Ann Surg Oncol. 2012; 19(11): 3343-50.


Challenging the Idea of Hospital Culture

Welcome to this first News Blog of 2015, and happy birthday to sibling CLAHRCs. We are exactly one year old!

One of the things our CLAHRC likes to do is challenge the received wisdom. Today we challenge the idea that hospitals have a pervading culture that has a profound influence on the performance of front-line staff across the board – in particular, we question the idea that safe care turns on this latent variable of culture. Of course, we do not doubt the concept of culture itself. National cultures certainly exist, as eloquently demonstrated in a study of the propensity among UN headquarters staff of different nationalities to misuse diplomatic immunity in violation of New York’s parking restrictions.[1] Similarly, there may be micro-cultures among certain specialities or in particular locations (such as wards/units) within a hospital.[2] But we think that culture is a weak force at the hospital level. Our argument is part theoretical, part empirical.

Theoretically, staff have cultural ties outside their hospital, particularly to their trade organisations, which operate over longer time frames than employment contracts. Within hospitals, interaction across departments is limited and episodic. There are thus reasons, a priori, to be sceptical about the hospital as the cultural locus for clinical staff.

A number of studies have looked for correlations between culture and various measures of hospital ‘performance’.[3] [4] [5] [6] [7] The results are mixed at best and the authors tend to seek reasons for unimpressive results, rather than question the importance of ‘culture’ itself. Correlations, albeit weak ones, have been found between mortality and staff satisfaction,[8] and between patient and staff satisfaction,[9] but many other potential explanations, such as better staff/patient ratios in higher performing institutions, could account for these findings. When looking for a direct correlation between culture and clinical performance, none is found.[10] If culture were important then adherence to the tenets of good clinical practice should be correlated across departments/specialities within a hospital, but no such correlation is found.[11] Even within departments/specialities, correlations between individual tenets are either weak [12] [13] or non-existent.[14]

Why has the notion of hospital culture received such widespread support in the face of such paltry evidence? We speculate that the notion has been imported, along the supply line for management ideas, from the private sector. We suspect that commercial organisations have cultures that are stronger than those in hospitals. It is easy to be persuaded that ENRON, for example, had such a corporate culture – malign in that case. Whatever the explanation, it is clear to us that this notion of hospital culture feeds into a ‘meta-narrative’ – a story that is amplified through social networks to become an accepted part of folklore. Such stories become self-referential and hard to oppose – for example, we have anecdotal evidence of strong publication bias in studies on culture and plan to investigate this formally. We seek views and potential collaborators in future study of this topic from our readership. Happy New Year!

— Richard Lilford, CLAHRC WM Director
— Yen-Fu Chen, Senior Research Fellow


  1. Fisman R, Miguel E. Cultures of Corruption: Evidence from Diplomatic Parking Tickets. NBER Working Paper No. 12312. 2006.
  2. Brewer BB. Relationships among teams, culture, safety, and cost outcomes. West J Nurs Res. 2006; 28(6): 641-53.
  3. Wagner C, Mannion R, Hammer A, Groene O, Arah OA, Dersarkissian M, Sunol R. The associations between organizational culture, organizational structure and quality management in European hospitals. Int J Qual Health Care. 2014; 26(s1): 74-80.
  4. Willis C, Saul J, Bevan H, et al. Sustaining organizational culture change in health systems? J Health Organ Manag. [In Press].
  5. Mannion R, Davies TW, Freeman T, Millar R, Jacobs R, Kasteridis P. Overseeing oversight: governance of quality and safety by hospital boards in the English NHS. J Health Serv Res Policy. 2015; 20(s1): 9-16.
  6. Davies HT, Mannion R, Jacobs R, Powell AE, Marshall MN. Exploring the relationship between senior management team culture and hospital performance. Med Care Res Rev. 2007; 64(1): 46-65.
  7. Millar R, Mannion R, Freeman T, Davies HT. Hospital Board Oversight of Quality and Patient Safety: A Narrative Review and Synthesis of Recent Empirical Research. Milbank Q. 2013; 91 (4): 738–70.
  8. Pinder RJ, Greaves FE, Aylin PP, Jarman B, Bottle A. Staff perceptions of quality of care: an observational study of the NHS Staff Survey in hospitals in England. BMJ Qual Saf. 2013; 22(7): 563-70.
  9. Dawson J. Staff experience and patient outcomes: what do we know? A report commissioned by NHS Employers on behalf of NHS England. London: NHS Confederation. 2014. [Online].
  10. Scott T, Mannion R, Marshall M, Davies H. Does organisational culture influence health care performance? A review of the evidence. J Health Serv Res Policy. 2003; 8: 105-17.
  11. Jha AK, Li Z, Orav EJ, Epstein AM. Care in U.S. hospitals – the Hospital Quality Alliance Program. New Engl J Med. 2005; 353: 265-74.
  12. Peterson ED, Roe MT, Mulgund J, et al. Association between hospital process performance and outcomes among patients with acute coronary syndromes. JAMA. 2006; 295: 1912-20.
  13. Bradley EH, Herrin J, Elbel B, et al. Hospital quality for acute myocardial infarction: correlation among process measures and relationship with short-term mortality. JAMA. 2006; 296: 72-8.
  14. Wilson B, Thornton JG, Hewison J, Lilford RJ, Watt I, Braunholtz D, Robinson M. The Leeds University Maternity Audit Project. Int J Qual Health Care. 2002; 14: 175-81.

Statistics is Far Too Important to Leave to Statisticians

“P values, the ‘gold standard’ of statistical validity, are not as reliable as many scientists assume.”
R. Nuzzo. Scientific method: Statistical Errors. Nature. 2014; 506:150-2.

Phase 1: Theoretical Practice

When I was at medical school, the prevailing idea was that all that was needed for the sound practice of medicine was a deep understanding of physiology and pathology. Our teachers had reason to put faith in this idea. They emanated from what is sometimes called the golden age of discovery. Improved understanding of physiology, alongside technical developments, had placed in their hands powerful treatments, such as oral contraception, the breathing machine, kidney dialysis, cardio-pulmonary bypass, and cancer chemotherapy. Patients could be rescued from the clutches of death in the intensive care unit by following the sound principles of living physiology.
However, these heady discoveries soon gave way to a more deliberative process of trial and error to improve the use of generic treatment types.

Phase 2: Evidence-Based Practice

The effects of these second order interventions were not self-evident. Again and again, randomised trials showed that our intuitions, no matter how well-based in physiology and pathology, were often completely and utterly wrong. In short, we did not know enough about physiology and pathology to be able to predict which treatments would do more good than harm. At first the medical profession was nonplussed by this type of direct evidence, but good arguments gradually displace bad. Evangelists such as Archie Cochrane were followed by early adopters, such as David Sackett, Iain Chalmers and Thomas C. Chalmers, and the Evidence-Based Medicine movement was born. Pick up any of the six major journals now and you will most likely be treated to a pageant of randomised trials and systematic reviews of RCTs. RCTs continue to produce iconic results – for example the magnificent CRASH-2 trial [1] and the endovascular aneurysm trials.[2] [3] [4]

However, more and more RCTs are inconclusive, even when they have been of considerable size and very well-funded. Mainly, this is because the headroom for improvement is gradually being eroded by the success of modern medicine; if you halve the absolute effect size you are looking for, then you quadruple the necessary sample size, other things remaining unchanged. Also, as pointed out in a previous blog,[5] we face ‘question inflation’; every question we answer in science spawns a string of subsidiary questions and the science base produces an ever-increasing number of therapeutic targets.[6] This is unmasking an epistemological problem at the heart of the current evidence-based healthcare movement. The results of standard statistical tests have been treated as a decision rule. As argued before, frequentist statistics does not provide a decision rule that can be used in the interpretation of a particular result.[7] [8] This was emphasised by the founding fathers of frequentist statistics, Neyman and Fisher. Frequentist statistics does not yield the probabilistic estimates that are required for an axiomatic theory of decision-making.[9] [10] Yet practitioners of evidence-based medicine reify the p-value and confidence limits, and use them as the basis for decisions. According to this view of the world, confidence limits take account of the play of chance, while various procedural rules, such as randomisation, take care of bias. Statisticians, the only people who really understand the problem, keep silent – Bayesian statistics, which provides the probabilities of treatment effects, has been something “you do not do in front of the children”. This led Steven Goodman (himself a statistician) to make his insightful remark – “…statistics has become too important to leave only to statisticians”.[10]
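The sample-size point above (halve the effect size, quadruple the sample) can be checked with a rough calculation. The sketch below uses the textbook normal-approximation formula for a two-arm trial with a continuous outcome; the function name `n_per_arm` and the chosen effect sizes, standard deviation, significance level and power are illustrative assumptions, not figures from any trial discussed here.

```python
from statistics import NormalDist


def n_per_arm(delta, sigma=1.0, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect a mean difference `delta`.

    Normal-approximation formula (assumed here for illustration):
        n = 2 * sigma^2 * (z_{1-alpha/2} + z_{power})^2 / delta^2
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # quantile corresponding to the power
    return 2 * sigma ** 2 * (z_alpha + z_beta) ** 2 / delta ** 2


# Hypothetical effect sizes, in units of the outcome's standard deviation
full = n_per_arm(delta=0.50)
half = n_per_arm(delta=0.25)  # half the effect size, everything else unchanged
ratio = half / full

print(f"n per arm at delta=0.50: {full:.0f}")
print(f"n per arm at delta=0.25: {half:.0f}")
print(f"ratio: {ratio:.2f}")
```

Because the effect size enters the formula as a squared denominator, the ratio is 4 whatever values are assumed for the standard deviation, significance level and power.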

Phase 3: Integrating Multiple Sorts of Evidence

Of course none of this mattered when RCTs produced iconic results that swept all before them. Under those circumstances misinterpreting a frequentist confidence limit as a Bayesian credible limit does no harm at all – they are virtually the same thing. It is only now, as we enter the fuzzier world of small effect sizes, multiple objectives and the need to combine data of different sorts, that the intellectual flaws in using standard statistical methods as a decision rule are assuming practical importance.
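To illustrate why the confusion is harmless for decisive trials but not for small ones, the sketch below compares a 95% confidence interval with a 95% credible interval under a simple normal prior and normal likelihood. All numbers (the prior, the estimates and the standard errors) are hypothetical and chosen only for illustration, and the conjugate normal model is an assumption rather than a recommended analysis.

```python
from math import sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def confidence_interval(estimate, se):
    """Frequentist 95% interval: estimate plus or minus 1.96 standard errors."""
    return (estimate - Z95 * se, estimate + Z95 * se)


def credible_interval(estimate, se, prior_mean, prior_sd):
    """Bayesian 95% interval from a normal prior and a normal likelihood."""
    prior_prec = 1 / prior_sd ** 2  # precision = 1 / variance
    data_prec = 1 / se ** 2
    post_var = 1 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * estimate)
    half_width = Z95 * sqrt(post_var)
    return (post_mean - half_width, post_mean + half_width)


def gap(a, b):
    """Largest difference between corresponding interval limits."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical prior on a log odds ratio: centred on no effect, sd 0.5
prior = dict(prior_mean=0.0, prior_sd=0.5)

# Large, decisive trial (small standard error): the data swamp the prior
large_gap = gap(confidence_interval(-0.30, 0.05),
                credible_interval(-0.30, 0.05, **prior))

# Small trial (large standard error): the prior now matters
small_gap = gap(confidence_interval(-0.30, 0.40),
                credible_interval(-0.30, 0.40, **prior))

print(f"large trial, biggest gap between limits: {large_gap:.3f}")
print(f"small trial, biggest gap between limits: {small_gap:.3f}")
```

With the large trial the two intervals nearly coincide; with the small trial they diverge, which is the situation described in the paragraph above.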

The only way out of the mess [11] is to think completely differently. We should start with a formal analysis of the decision problem (using expected utility theory or its elaboration into cost-utility analysis). We should then collect the relevant data and analyse it in a Bayesian paradigm, so that it can provide the kind of probabilities we need (probabilities of events if we follow one decision or the other) and so that the different probabilities and utilities can be reconciled.
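A toy version of this approach might look like the sketch below: a Bayesian update gives the probability distribution of the treatment effect, and expected utility then compares the two decisions. Every input (the prior, the trial result, the utilities on a QALY-like scale) is a made-up assumption for illustration, and the simple linear utility model stands in for a full cost-utility analysis.

```python
from statistics import NormalDist


def posterior(prior_mean, prior_sd, estimate, se):
    """Normal prior combined with a normal likelihood gives a normal posterior."""
    prior_prec, data_prec = 1 / prior_sd ** 2, 1 / se ** 2
    var = 1 / (prior_prec + data_prec)
    mean = var * (prior_prec * prior_mean + data_prec * estimate)
    return NormalDist(mean, var ** 0.5)


# Hypothetical trial: absolute risk reduction of 0.03 with standard error 0.02
estimate, se = 0.03, 0.02

# Conventional two-sided p-value for the same result, for comparison
p_value = 2 * (1 - NormalDist().cdf(abs(estimate) / se))

# Hypothetical prior, e.g. informed by theory and by similar treatments
post = posterior(prior_mean=0.0, prior_sd=0.05, estimate=estimate, se=se)
p_benefit = 1 - post.cdf(0.0)  # probability that the treatment reduces risk

# Assumed utilities on a QALY-like scale
gain_per_event_avoided = 5.0
burden_of_treatment = 0.05  # side effects and inconvenience

# Utility is linear in the effect, so the posterior mean suffices here
eu_treat = post.mean * gain_per_event_avoided - burden_of_treatment
eu_no_treat = 0.0
decision = "treat" if eu_treat > eu_no_treat else "do not treat"

print(f"two-sided p-value: {p_value:.3f}")
print(f"probability of any benefit: {p_benefit:.3f}")
print(f"expected utility of treating: {eu_treat:.4f}")
print(f"decision: {decision}")
```

Under these assumed numbers the conventional test does not reach p < 0.05, yet the expected utility of treating is positive. Different priors or utilities could reverse the decision, which is why they need to be made explicit.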

When I was a medical student, evidence-based medicine was not given nearly enough prominence. Subsequently, a very simplified version, based on frequentist statistics, was presented as a one-stop shop for clinical decisions. But the best solution is one which synthesises prior knowledge, and all direct comparative evidence, to yield probabilities that are directly referable to the decision problem. Knowledge about the theory of treatments and of how other similar treatments have fared is also part of the evidence for evidence-based medicine, as Bradford Hill pointed out in his famous lecture.[12]

Personal Reflection

I have been proselytising for Bayesian statistics for 25 years. At one point in my career I spoke to a statistician who, like me, had attracted a certain amount of ridicule. His response had been to back off. Recently, a distinguished statistician, whom I like and admire, told me I should stop banging on about Bayes – the argument, he said, was widely accepted intellectually. Be that as it may, the world ploughs on regardless, with doctors, nurses, psychologists and many others misunderstanding conventional statistics, and statisticians, with a few exceptions, remaining silent. The CLAHRC WM Director has absolutely no intention of stopping banging on about Bayes!

— Richard Lilford, CLAHRC WM Director.

  1. CRASH-2 Trial Collaborators. Effects of tranexamic acid on death, vascular occlusive events, and blood transfusion in trauma patients with significant haemorrhage (CRASH-2): a randomised, placebo-controlled trial. Lancet. 2010; 376(9734): 23-32.
  2. Prinssen M, et al. A randomized trial comparing conventional and endovascular repair of abdominal aortic aneurysms. NEJM. 2004; 351: 1607-18.
  3. EVAR Trial Participants. Endovascular aneurysm repair versus open repair in patients with abdominal aortic aneurysm (EVAR trial 1): randomised controlled trial. Lancet. 2005; 365(9478): 2179-86.
  4. EVAR Trial Participants. Endovascular aneurysm repair and outcome in patients unfit for open repair of abdominal aortic aneurysm (EVAR trial 2): randomised controlled trial. Lancet. 2005; 365(9478): 2187-92.
  5. Lilford RJ. The End of the Hegemony of Randomised Trials. 30 Nov 2012. [Online].
  6. Lord JM, et al. The systemic immune response to trauma: an overview of pathophysiology and treatment. Lancet. 2014; 384: 1455-65.
  7. Lilford RJ, Thornton JG, Braunholtz D. Clinical trials and rare diseases: a way out of a conundrum. BMJ. 1995; 311: 1621.
  8. Lilford RJ, Braunholtz D. Who’s afraid of Thomas Bayes? J Epidemiol Community Health. 2000; 54: 731-9.
  9. Lindley DV. The philosophy of statistics. Statistician. 2000; 49(3): 293-337.
  10. Goodman SN. Toward evidence-based medical statistics. 1: The P value fallacy. Ann Intern Med. 1999; 130(12): 995-1004.
  11. Lilford RJ. The Messy End of Science. 16 April 2014. [Online].
  12. Hill AB. The environment and disease: Association or causation? Proc R Soc Med. 1965; 58(5): 295-300.