Re-thinking Medical Student Written Assessment

“Patients do not walk into the clinic saying ‘I have one of these five diagnoses. Which do you think is most likely?’” (Surry et al., 2017)

The predominant form of written assessment for UK medical students is the ‘best of five multiple choice question’ (Bo5). Students are presented with a clinical scenario – usually information about a patient, a lead-in or question such as “which is the most likely diagnosis?” and a list of five possible answers, only one of which is unambiguously correct. Bo5 questions are incredibly easy to mark, particularly in the age of computer-read answer sheets (or even computerised assessment). This is critical when results must be turned round, ratified and feedback provided to students in a timely manner. Because Bo5s are relatively short (UK medical schools allow a median of 72 seconds per question, compared with short answer or essay questions for which at least 10 minutes per question would be allowed), an exam comprising Bo5 questions can cover a broad sample of the curriculum. This helps to improve the reliability of the exam: a student’s grade is not contingent on ‘what comes up in the exam’, so should have been similar had a different set of questions covering the same curriculum been used. Students not only know that their (or others’) scores are not dependent on what came up, but they are also reassured that they would get the same score regardless of who (or what) marked their paper. There are no hawk/dove issues in Bo5 marking.

On the other hand, Bo5 questions are notoriously difficult to develop. The questions used in the Medical Schools Council Assessment Alliance (MSCAA) Common Content project, where questions are shared across UK medical schools to enable passing standards for written finals exams to be compared,[1] go through an extensive review and selection process prior to inclusion (the general process for MSCAA questions is summarised by Melville et al.[2]). Yet the data are returned for analysis with comments such as “There is an assumption made in this question that his wife has been faithful to the man” or “Poor distractors – no indication for legionella testing”. But perhaps the greatest problem with Bo5 questions is that they are poorly representative of clinical practice. As the title of this blog implied, patients do not come with a list of five possible pathologies, diagnoses, important investigations, treatment options, or management plans. While a doctor would often formulate such a list (e.g. a differential diagnosis) before determining the most likely or appropriate option, such formulation requires considerable skill. We all know that assessment drives learning, so by using Bo5 we may be inadvertently hindering students from developing the full set of clinical reasoning skills required of a doctor. There is certainly evidence that students use test-taking strategies such as elimination of implausible answers and clue-seeking when sitting Bo5-based exams.[3]

A new development in medical student assessment, the Very Short Answer question (VSA) therefore holds much promise. It shifts some of the academic/expert time from question development to marking, but, by exploiting computer-based assessment technology, does so in a way that is not prohibitive given the turn-around times imposed by institutions. The VSA starts with the same clinical scenario as a Bo5. The lead-in changes from “Which is…?” to “What is…?” and this is followed by a blank space. Students are required to type between one and five words in response. A pilot of the VSA style question showed that the list of acceptable answers for a question could be finalised by a clinical academic in just over 90 seconds for a cohort of 300 students.[4] With the finalised list automatically applied to all students’ answers, again there are no concerns regarding hawk/dove markers that would threaten the exam’s acceptability to students. While more time is required per question when using VSAs compared to Bo5s, the internal consistency of VSAs in the pilot was higher for the same number of questions,[4] so it should be possible to find an appropriate compromise between exam length and curriculum coverage that does not jeopardise reliability. The major gain with the use of VSA questions is in clinical validity; these questions are more representative of actual clinical practice than Bo5s, as was reported by the students who participated in the pilot.[4]
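The marking step described above, in which one finalised list of acceptable answers is applied to every student's response, can be illustrated with a minimal sketch. It is not the system used in the pilot or by the MSCAA: the function names, the whitespace and case normalisation, and the example responses are all assumptions made for illustration.

```python
def normalise(answer):
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(answer.lower().split())


def mark_vsa(responses, accepted_answers):
    """Score each response 1 or 0 by exact match against the accepted list."""
    accepted = {normalise(a) for a in accepted_answers}
    return {sid: int(normalise(r) in accepted) for sid, r in responses.items()}


def unmatched_responses(responses, accepted_answers):
    """Distinct responses not on the accepted list, for a clinician to review."""
    accepted = {normalise(a) for a in accepted_answers}
    return sorted({normalise(r) for r in responses.values()} - accepted)


# Hypothetical responses to a single question (invented for this sketch).
responses = {
    "s1": "Pulmonary embolism",
    "s2": "pulmonary  embolus",
    "s3": "Pneumonia",
}
accepted = ["pulmonary embolism", "pulmonary embolus"]

marks = mark_vsa(responses, accepted)
to_review = unmatched_responses(responses, accepted)
print(marks)
print(to_review)
```

Because every student is scored against the same list, the result does not depend on who does the marking; the reviewer's time goes only on the distinct unmatched responses.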

To produce more evidence around the utility of VSAs, the MSCAA is conducting a large-scale pilot of VSA questions with final year medical students across the UK this autumn. The pilot will compare student responses and scores to Bo5 and VSA questions delivered electronically and assess the feasibility of online delivery using the MSCAA’s own exam delivery system. A small scale ‘think aloud’ study will run alongside the pilot, to compare students’ thought processes as they attempt Bo5 and VSA questions. This work will provide an initial test of the hypothesis that gains in clinical reasoning validity could be achieved with VSAs, as students are forced to think ‘outside the list of five’. There is strong support for the pilot from UK medical schools, so the results will have good national generalisability and may help to inform the design of the written component of the UK Medical Licensing Assessment.

We would love to know what others, particularly PPI representatives, think of this new development in medical student assessment.

— Celia Taylor, Associate Professor


  1. Taylor CA, Gurnell M, Melville CR, Kluth DC, Johnson N, Wass V. Variation in passing standards for graduation‐level knowledge items at UK medical schools. Med Educ. 2017; 51(6): 612-20.
  2. Melville C, Gurnell M, Wass V. #5CC14 (28171) The development of high quality Single Best Answer questions for a national undergraduate finals bank. [Abstract] Presented at: The International Association for Medical Education AMEE 2015; 2015 Oct 22; Glasgow. p. 372.
  3. Surry LT, Torre D, Durning SJ. Exploring examinee behaviours as validity evidence for multiple‐choice question examinations. Med Educ. 2017; 51(10): 1075-85.
  4. Sam AH, Field SM, Collares CF, et al. Very‐short‐answer questions: reliability, discrimination and acceptability. Med Educ. 2018; 52(4): 447-55.

Trials are Not Always Needed for Evaluation of Surgical Interventions: Does This House Agree?

I supported the above motion at a recent surgical trials meeting in Bristol. What were my arguments?

I argued that there were four broad categories of intervention where trials were not needed:

  1. Where causality is not in dispute

This scenario arises where, but for the intervention, a bad outcome was all but inevitable. Showing that such an outcome can be prevented in only a few cases is sufficient to put the substantive question to bed. Such an intervention is sometimes referred to as a ‘penicillin-type’ of intervention. Surgical examples include heart transplantation and in vitro fertilisation (for people both of whose Fallopian tubes have been removed). From a philosophy of science perspective, causal thinking requires a counterfactual: what would have happened absent the intervention? In most instances a randomised trial provides the best approximation to that counterfactual. However, when the counterfactual is near inevitable death, then a few cases will be sufficient to prove the principle. Of course, this is not the end of the story. Trials of different methods within a generic class will always be needed, along with trials of cases where the indication is less clear cut, and hence where the counterfactual cannot be predicted with a high level of certainty. Nevertheless, the initial introduction of heart transplantation and in vitro fertilisation took place without any randomised trial. Nor was such a trial necessary.

  2. Speculative procedures where there is an asymmetry of outcome

This is similar to the above category, but the justification is ethical rather than scientific. I described a 15 year old girl who was born with no vagina but a functioning uterus. She was referred to me with a pyometra, having had an unsuccessful attempt to create a channel where the vagina should have been. The standard treatment in such a dire situation would have been hysterectomy. However, I offered to improvise and try an experimental procedure using tissue expansion methods to stretch the skin at the vaginal opening and then to use this skin to create a functioning channel linking the uterus to the exterior. The patient and her guardian accepted this procedure, in the full knowledge that it was entirely experimental. In the event, I am glad to report that the operation was successful, producing a functional vagina and allowing regular menstruation.[1] The formal theory behind innovative practice in such dire situations comes from expected utility theory.[2] An example is explicated in the figure.

[Figure: Trials for evaluation of surgical interventions]

This example relates to a person with very low life expectancy and a high-risk procedure that may either prove fatal or extend their life for a considerable time. In such a situation, the expected value of the risky procedure considerably exceeds doing nothing and is preferable, from the point of view of the patient, to entry in an RCT. In fact, the expected value of the RCT (with a 1:1 randomisation ratio) is (0.5 x 0.25) + (0.5 x 1.0) = 0.625. While favourable in comparison to ‘no intervention’, it is inferior in comparison with the ‘risky intervention’.
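The arithmetic above can be checked with a short sketch. The values 0.25 (no intervention) and 1.0 (risky intervention) are read off the equation in the text and are assumed to sit on a common utility scale; the function and variable names are illustrative only.

```python
def expected_value_of_trial(value_control, value_treatment, p_treatment=0.5):
    """Expected value of trial entry, given the chance of being allocated treatment."""
    return (1 - p_treatment) * value_control + p_treatment * value_treatment


# Expected values taken from the worked example in the text.
NO_INTERVENTION = 0.25
RISKY_INTERVENTION = 1.0

# 1:1 randomisation between no intervention and the risky intervention.
rct = expected_value_of_trial(NO_INTERVENTION, RISKY_INTERVENTION)

print(f"No intervention:    {NO_INTERVENTION}")
print(f"RCT (1:1):          {rct}")
print(f"Risky intervention: {RISKY_INTERVENTION}")
```

For any allocation probability strictly between 0 and 1, the trial's expected value lies between those of the two arms, which is why trial entry cannot beat direct access to the better option in this example.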

  3. When the intervention has not been well thought through

Here my example was full frontal lobotomy. Trials and other epidemiological methods can only work out how to reach an objective, not which objective to reach or prioritise. Taking away someone’s personality is not a fair price to pay for mental calmness.

  4. When the trial is poor value for money

Trials are often expensive and we have made them more so with extensive procedural rules. Collection of end-points by routine systems is only part of the answer to this question. Hence trials can be a poor use of research resources. Modelling shows that the value of the information trials provide is sometimes exceeded by the opportunity cost.[3-5]

Of course, I am an ardent trialist. But informed consent must be fully informed so that the preferences of the patient can come into play. I conducted an RCT of two methods of entering a patient into an RCT and showed that more and better information reduced willingness to be randomised.[6] Trial entry is justified when equipoise applies, and the ‘expected value’ of the alternative treatment is about the same.[7] The exception is when the new treatment is unlicensed. Then equipoise plus should apply – the expected value of trial entry should exceed or equal that of standard treatment.[8]

— Richard Lilford, CLAHRC WM Director


  1. Lilford RJ, Sharpe DT, Thomas DFM. Use of tissue expansion techniques to create skin flaps for vaginoplasty. Case report. Br J Obstet Gynaecol. 1988; 95: 402-7.
  2. Lilford RJ. Trade-off between gestational age and miscarriage risk of prenatal testing: does it vary according to genetic risk? Lancet. 1990; 336: 1303-5.
  3. De Bono M, Fawdry RDS, Lilford RJ. Size of trials for evaluation of antenatal tests of fetal wellbeing in high risk pregnancy. J Perinat Med. 1990; 18(2): 77-87.
  4. Lilford R, Girling A, Braunholtz D. Cost-Utility Analysis When Not Everyone Wants the Treatment: Modeling Split-Choice Bias. Med Decis Making. 2007; 27(1): 21-6.
  5. Girling AJ, Freeman G, Gordon JP, Poole-Wilson P, Scott DA, Lilford RJ. Modeling payback from research into the efficacy of left-ventricular assist devices as destination therapy. Int J Technol Assess Health Care. 2007; 23(2): 269-77.
  6. Wragg JA, Robinson EJ, Lilford RJ. Information presentation and decisions to enter clinical trials: a hypothetical trial of hormone replacement therapy. Soc Sci Med. 2000; 51(3): 453-62.
  7. Lilford RJ. Ethics of clinical trials from a Bayesian and decision analytic perspective: whose equipoise is it anyway? BMJ. 2003; 326: 980.
  8. Robinson EJ, Kerr CE, Stevens AJ, Lilford RJ, Braunholtz DA, Edwards SJ, Beck SR, Rowley MG. Lay public’s understanding of equipoise and randomisation in randomised controlled trials. Health Technol Assess. 2005; 9(8): 1-192.

Estimating Mortality Due to Low-Quality Care

A recent paper by Kruk and colleagues attempts to estimate the number of deaths caused by sub-optimal care in low- and middle-income countries (LMICs).[1] They do so by selecting 61 conditions that are highly amenable to healthcare. They estimate deaths from these conditions from the global burden of disease studies. The proportion of deaths attributed to differences in health systems is estimated from the difference in deaths between LMICs and high-income countries (HICs). So if the death rate from stroke in people aged 70 to 75 is ten per thousand in HICs and 20 per thousand in LMICs, then ten deaths per 1000 are preventable. This ‘subtractive method’ to estimate deaths that could be prevented by improved health services simply answers the otiose question: “what would happen if low-income countries and their populations could be converted, by the wave of a wand, into high-income countries complete with populations enjoying high income from conception?” Such a reductionist approach simply replicates the well-known association between per capita GDP and life expectancy.[2]

The authors of the above paper do try to isolate the effect of institutional care from access to facilities. To make their distinction they need to estimate utilisation of services. This they do from various household surveys, conducted at selected sites around the world. These surveys contain questions about service use. So a further subtraction is performed; if half of all people deemed to be having a stroke utilise care, then half of the difference in stroke mortality can be attributed to quality of care.
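The two subtractions described above can be written out for the stroke example. This is a sketch of the simplified arithmetic as summarised in this post, not of the model Kruk and colleagues actually fitted; the function name and the per-1000 framing are assumptions.

```python
def subtractive_estimate(lmic_rate, hic_rate, utilisation):
    """Split excess deaths per 1000 into a 'quality' and a 'non-utilisation' share.

    lmic_rate, hic_rate: deaths per 1000 in the same age band.
    utilisation: proportion of affected people who use care (0 to 1).
    """
    excess = lmic_rate - hic_rate              # first subtraction: LMIC minus HIC
    due_to_quality = excess * utilisation      # share among those who reached care
    due_to_non_utilisation = excess - due_to_quality
    return excess, due_to_quality, due_to_non_utilisation


# Stroke example from the text: 20 vs 10 deaths per 1000, half of cases use care.
excess, quality, access = subtractive_estimate(20, 10, 0.5)
print(f"Excess deaths per 1000:          {excess}")
print(f"Attributed to quality of care:   {quality}")
print(f"Attributed to non-utilisation:   {access}")
```

Nothing in this calculation adjusts for how ill patients are when they present, which is the gap the criticisms below turn on.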

Based on this methodology the authors find that the lion’s share of deaths are caused by poor-quality care rather than by failure to get care. This conclusion is flawed because:

  1. The link between the databases is at a very coarse level – there is no individual linkage.
  2. As a result risk-adjustment is not possible.
  3. Further to the above, the method is crucially unable to account for delays in presentation and access to care preceding presentation that will inevitably result in large differences in prognosis at presentation.
  4. Socio-economic status and deprivation over a lifetime is associated with recovery from a condition, so differences in outcome are not due only to differences in care quality.[3]
  5. There are measurement problems at every turn. For example, Global Burden of Disease is measured in very different ways across HICs and LMICs – the latter rely heavily on verbal autopsy.
  6. Quality, as measured by crude subtractive methodologies, includes survival achieved by means of expensive high technology care. However, because of opportunity costs, introduction of effective but expensive treatments will do more harm than good in LMICs (until they are no longer LMICs).

The issue of delay in presentation is crucial. Take, for example, cancer of the cervix. In HICs the great majority of cases are diagnosed at an early, if not at a pre-invasive, stage. However, in low-income countries almost all cases are already far advanced when they present. To attribute the death rate difference to the quality of care is inappropriate. Deep in the discussion the authors state ‘comorbidity and disease history could be different between low and high income countries which can result in some bias.’ This is an understatement and the problem cannot be addressed by a passing mention of it. Later they also assert that all sensitivity analyses support the conclusion that poor healthcare is a larger driver of amenable mortality than utilisation of services. But it is really difficult to believe such sensitivity analyses when this bias is treated so lightly.

Let us be clear, there is tons of evidence that care is, in many respects, very sub-optimal in LMICs. We care about trying to improve it. But we think such dramatic results based on excessively reductionist analyses are simply not justifiable and in seeking attention in this way risk undermining broader support for the important goal of improving care in LMICs. In areas from global warming to mortality during the Iraq war we have seen the harm that marketing with unreliable methods and generalizing beyond the evidence can do to a good cause by giving fodder to those who don’t want to believe that there is a problem. What is needed are careful observations and direct measurements of care quality itself, along with evaluations of the cost-effectiveness of methods to improve care. Mortality is a crude measure of care quality.[4][5] Moreover, the extent to which healthcare reduces mortality is quite modest among older adults. The type of paper reported here topples over into marketing – it is as unsatisfying as a scientific endeavour as it is sensational.

— Richard Lilford, CLAHRC WM Director

— Timothy Hofer, Professor in Division of General Medicine, University of Michigan


  1. Kruk ME, Gage AD, Joseph NT, Danaei G, García-Saisó S, Salomon JA. Mortality due to low-quality health systems in the universal health coverage era: a systematic analysis of amenable deaths in 137 countries. Lancet. 2018.
  2. Rosling H. How Does Income Relate to Life Expectancy. Gap Minder. 2015.
  3. Pagano D, Freemantle N, Bridgewater B, et al. Social deprivation and prognostic benefits of cardiac surgery: observational study of 44,902 patients from five hospitals over 10 years. BMJ. 2009; 338: b902.
  4. Lilford R, Mohammed MA, Spiegelhalter D, Thomson R. Use and misuse of process and outcome data in managing performance of acute medical care: avoiding institutional stigma. Lancet. 2004; 363: 1147-54.
  5. Girling AJ, Hofer TP, Wu J, et al. Case-mix adjusted hospital mortality is a poor proxy for preventable mortality: a modelling study. BMJ Qual Saf. 2012; 21(12): 1052-6.

Cognitive Bias Modification for Addictive Behaviours

It can be difficult to change health behaviours. Good intentions to quit smoking or drink less alcohol, for example, do not always translate into action – or, if they do, the change doesn’t last very long. A meta-analysis of meta-analyses suggests that intentions explain, at best, a third of the variation in actual behaviour change.[1] [2] What else can be done?

One approach is to move from intentions to inattention. Quite automatically, people who regularly engage in a behaviour like smoking or drinking alcohol pay more attention to smoking and alcohol-related stimuli. To interrupt this process ‘cognitive bias modification’ (CBM) can be used.

Amongst academics, the results of CBM have been called “striking” (p. 464),[3] prompted questions about how such a light-touch intervention can have such strong effects (p. 495),[4] and led to the development of online CBM platforms.[5]

An example of a CBM task for heavy alcohol drinkers is using a joystick to ‘push away’ pictures of beer and wine and ‘pull in’ pictures of non-alcoholic soft drinks. Alcoholic in-patients who received just an hour of this type of CBM showed a relapse rate a year later that was 13 percentage points lower than in those who did not – 50/108 patients in the experimental group and 63/106 patients in the control group.[4]
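The counts quoted above can be turned into rates with a few lines of arithmetic. The sketch assumes that 50/108 and 63/106 are the numbers of patients who relapsed in each group, as the sentence implies; the variable names are illustrative.

```python
# Counts as quoted in the text (assumed to be relapses / group size).
relapsed_cbm, n_cbm = 50, 108
relapsed_control, n_control = 63, 106

rate_cbm = relapsed_cbm / n_cbm
rate_control = relapsed_control / n_control

# Absolute difference in relapse rate, expressed in percentage points.
absolute_difference = rate_control - rate_cbm

# The same difference relative to the control group's relapse rate.
relative_reduction = absolute_difference / rate_control

print(f"Relapse with CBM:     {rate_cbm:.1%}")
print(f"Relapse without CBM:  {rate_control:.1%}")
print(f"Absolute difference:  {absolute_difference * 100:.1f} percentage points")
print(f"Relative reduction:   {relative_reduction:.1%}")
```

On these counts the difference is about 13 percentage points in absolute terms, which corresponds to a relative reduction of roughly 22%.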

Debate about the efficacy of CBM is ongoing. It appears that CBM is more effective when administered in clinical settings rather than in a lab experiment or online.[6]

— Laura Kudrna, Research Fellow


  1. Sheeran P. Intention-behaviour relations: A conceptual and empirical review. In: Stroebe W, Hewstone M (Eds.). European review of social psychology, (Vol. 12, pp. 1–36). London: Wiley; 2002.
  2. Webb TL, Sheeran P. Does changing behavioral intentions engender behavior change? A meta-analysis of the experimental evidence. Psychol Bull. 2006; 132(2): 249.
  3. Sheeran P, Gollwitzer PM, Bargh JA. Nonconscious processes and health. Health Psychol. 2013; 32(5): 460.
  4. Wiers RW, Eberl C, Rinck M, Becker ES, Lindenmeyer J. Retraining automatic action tendencies changes alcoholic patients’ approach bias for alcohol and improves treatment outcome. Psychol Sci. 2011; 22(4): 490-7.
  5. London School of Economics and Political Science. New brain-training tool to help people cut drinking. 18 May 2016.
  6. Wiers RW, Boffo M, Field M. What’s in a trial? On the importance of distinguishing between experimental lab studies and randomized controlled trials: The case of cognitive bias modification and alcohol use disorders. J Stud Alcohol Drugs. 2018; 79(3): 333-43.

A framework for implementation science: organisational and psychological approaches

Damschroder and colleagues present a meta-analytic approach to development of a framework to guide implementation of service interventions.[1] They call their framework a “consolidated framework for implementation research”. Their approach is based on a review of published theories concerning implementation of service interventions. Since two-thirds of interventions to improve care fail, this is an important activity. They offer an over-arching typology of constructs that deal with barriers to effective implementation, and build on Greenhalgh’s monumental study [2] of factors determining the diffusion, dissemination and implementation of innovations in health service delivery. These frameworks are useful because they take an organisation-wide perspective, and so psychological frameworks of individual behaviour change, such as the trans-theoretical [3] or COM-B [4] frameworks, are subsumed within them. I proposed something similar with my “framework of frameworks”.[5]

In any event, the framework produced seems sensible enough. In effect it is an elaboration of the essential interactive dimensions of intervention, context and the process of implementation. Context can be divided into the external setting and the internal setting. This particular study goes further and ends up with five major domains, each broken up into a number of constructs – eight relating to the intervention itself.

This paper is carefully written and well researched, and is an excellent source of references to some of the icons of the organisational research literature. But is it useful? And will it be the last such framework? I rather think the answer to these two questions is no. I once had a boss who said the important thing about science was ‘knowing what to leave out’! I think a much simpler framework would have sufficed in this case. Maybe I should have a go at producing one!

— Richard Lilford, CLAHRC WM Director


  1. Damschroder LJ, Aron DC, Keith RE, Kirsch SR, Alexander JA, Lowrey JC. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci. 2009; 4: 50.
  2. Greenhalgh T, Robert G, Macfarlane F, Bate P, Kyriakidou O. Diffusion of innovations in service organizations: systematic review and recommendations. Milbank Q. 2004; 82: 581-629.
  3. Prochaska JO, Velicer WF. The transtheoretical model of health behaviour change. Am J Health Promot. 1997; 12(1): 38-48.
  4. Michie S, van Stralen M, West R. The behaviour change wheel: a new method for characterising and designing behaviour change interventions. Implement Sci. 2011; 6: 42.
  5. Lilford RJ. A Theory of Everything! Towards a Unifying Framework for Psychological and Organisational Change Models. NIHR CLAHRC West Midlands News Blog. 28 August 2015.

How Theories Inform our Work in Service Delivery Practice and Research

We have often written about theory in this News Blog.[1] [2] For instance, the ‘iron law’ of incentives – never use reward or sanction unless the manager concerned believes that they can influence the odds of success in reaching the required target.[3] [4] This law, sometimes called ‘expectancy theory’ was articulated by Victor Vroom back in 1964.[5] Here we review some of the theories that we have described, refined or enlarged over the last CLAHRC, and which we shall include among those we will pursue if we are successful in our Applied Research Collaborations (ARC) application. In each case we begin with the theory, then say how we have explicated it, and then describe how we plan to further develop theory through ongoing empirical work. Needless to say, our summaries are an impoverished simulacrum of the full articles:

  1. The theory of ‘hybrid managers’. It is well known that many professionals develop hybrid roles so that they toggle between their professional and managerial duties, and it is also known that tension can arise when the roles conflict. In our work we found that organisational factors can determine the extent to which nurses retain strong professional ethos when fulfilling managerial roles.[6] Simply put, the data seem to show that nurses working in more successful healthcare institutions tend to hew closer to their professional ethos than nurses in less successful units. It is reasonable to infer that an environment that can accommodate a strong professional orientation among hybrid managers is more likely to encompass the checks and balances conducive of safe care, than one that does not accommodate such a range of perspectives; most of us would choose to be treated in the environment where professional ethos is allowed a fair degree of expression. However, whether such a climate reflects better managers or a more difficult external environment is harder to discern. We now plan to examine this issue across many environments – for example, midwife hybrid managers balancing the need to expand choices of place of delivery with logistical limitations on doing so. Similarly, improving care for people with learning difficulties will require clinical managers to have freedom to innovate in order to improve services. Note that working with Warwick Business School enables us to locate our enquiries and theory development in the context of management in general, rather than just the management of health services. For example, the above study of nurse managers encompasses tax inspectors who now have to balance their traditional role in enforcing the tax code with one of helping the likes of us to make accurate declarations.
  2. Hybrid managers as knowledge brokers. Hybrid managers, it is known, act as a conduit between senior managers and frontline professionals, in mediating adoption of effective practice – i.e. knowledge brokering. It is also known that effecting change means overcoming structural, social and motivational barriers. The task of implementing state-of-the-art care practices is a delicate one and, prior to our research, the social dynamic of effecting change was poorly understood. In particular, the CLAHRC WM team wanted to study the role of status and perceived legitimacy in facilitating or inhibiting the knowledge broker’s task. We found that hierarchies are critically important – safe care is more than following rules, but requires a degree of initiative (sometimes called discretional energy) by multiple actors across the hierarchy.[7] Nurses were often severely inhibited in using such personal initiative. The attitude of more senior staff is thus crucial in permitting, indeed encouraging, the use of initiative within a broader system of checks and balances. If the hierarchy within nursing is a barrier to progress, then that between doctors and nurses is a much bigger obstacle to uptake of knowledge. Moreover, there was also evidence of a barrier across different medical specialities, with clinicians at the most action-oriented end of the spectrum (such as surgeons) showing lower levels of team-working than those with more reflective daily tasks (such as geriatricians).
The work pointed towards the effectiveness of creating opportunities for purposeful interaction across these various types of hierarchical barriers – what the researchers called “dyadic relationships between hybrid middle managers with clinical governance responsibility and doctors through engagement and participation in medical-oriented meetings”; Elinor Ostrom would call these opportunities for ‘cheap talk’.[8] This work is crucial for laying the foundation for our work on the integration of care covering management of patients at the acute hospital / ambulance / community interface; care of patients with multiple diseases; care of the elderly; and the care of people with rare diseases, to mention but a few. Clearly, such opportunities for structured interaction are only part of the story, and other factors that have been shown to be important (e.g. job design, performance management, education, patient empowerment, and data sharing) must be included in service improvement initiatives.
  3. Logics. Our third example concerns the unwritten assumptions that underpin what a person should do in their domain of work, and why they do it – so called ‘logics’. In a place like a hospital or university, many professions must co-exist, yet each will have a different ‘logic’. This idea applies across society, but CLAHRC WM investigator Graeme Currie wanted to examine how the professional logic and policy logic interact in a hospital setting.[9] The background to this study is the finding that policy logic has constrained and limited professional logic over the last few decades – doctors are no longer in charge of performance improvement, the management of waiting lists, etc. The researchers used the introduction of a new evidence-based, multi-component guideline as a lens through which to explore the interactions of different ‘logics’ in hospital practice. The implementation of a multi-component guideline is not a simple thing, and some intuitive cost-benefit calculations could justify, at least intellectually, massaging some aspects of the guideline to fit management practices rather than the reverse. However, the way this played out was not the same across contexts. As before, doctors were generally (but not invariably) less amenable to change than nurse practitioners with managerial responsibility. This study, published in a premier management journal,[9] identifies contingencies that will provide depth to our evaluations of different ways to reshape services. We will build on these insights when we examine a proposed service to use Patient Reported Outcome Measures, rather than simply elapsed time, to determine when patients should be seen in the outpatient department. An understanding of ‘logics’ is likely to come into play when we empower community and ambulance staff to elicit patient preferences and respect them even when to do so flies in the face of guidelines. 
At the level of the system, change is best viewed as an institutional problem of professional power and policy, around which change needs to orientate. It is not that systems and organisations can’t be changed, but subtle tactics and work may be required.[10] [11]
  4. Health care organisations viewed as political groupings and the need to do ‘political work’ when implementing interventions. Trish Greenhalgh has recently provided an evidence-based framework which unpicks reasons why IT implementations so often disappoint.[12] She points out that managers consistently underestimate the size of the task and the sheer difficulty of implementing IT systems so that they reach even some of their potential. Likewise, work conducted under an NIHR Programme grant that developed out of CLAHRC WM showed how new IT systems could introduce serious new hazards.[13] One of the methods to avoid failure in any large initiative, such as a large IT system, comes from a study of Italian hospitals conducted by the CLAHRC WM team,[14] advocating an iterative process, time and careful preparation of the ground by doing ‘political work’ to win hearts and minds and adapt interventions to context.[15] This type of approach will be critical to the development of complex interventions, such as those widening access to homebirth, and integrating patient feedback (including Patient Reported Outcome Measures) into patient care pathways.
  5. Absorptive capacity. Many CLAHRCs have relied on a knowledge brokering model to underpin translation of research, through which key individuals ensure knowledge gets to the right people at the right time to benefit patient care.[16] However, such an approach may have a limited effect and we need to consider how organisations and systems can be developed to ensure the efforts of knowledge brokers are leveraged and evidence informs patient care more widely. This is a matter of developing organisation and system ‘absorptive capacity’. Many of the implementation studies under our current CLAHRC have sought to develop the co-ordination capability of organisations and systems to translate evidence into practice. For example, public and patient involvement, GP involvement, and better business intelligence processes and structures are highlighted as helping to ensure that clinical commissioning groups make evidence-informed decisions.[17] We have taken our work further to develop a ‘tool’ to assess the Absorptive Capacity of organisations.[18]

In this short review we have described how theoretical work, based on the development and evaluation of service interventions, can help understand the reasons why an intervention may succeed or fail, and how this may vary from place to place. Increasingly, we are applying Elinor Ostrom’s work, on collaboration between managers when incentives are not aligned, to the problems of integrated care in the NHS.[19] Our work represents successful collaboration between management and medical schools and, indeed, a difference in ‘logics’ between these organisations. This collaboration has taken time to mature, as have those between the services and academia more broadly. The essential point is that consideration of wider organisational and systems context will prove crucial to our efforts to continue broadening, accelerating and deepening translation of evidence into practice in our proposed ARC.

— Richard Lilford, CLAHRC WM Director

— Graeme Currie, Professor of Public Management, CLAHRC WM Deputy Director


  1. Lilford RJ. A Theory of Everything! Towards a Unifying Framework for Psychological and Organisational Change Models. NIHR CLAHRC West Midlands News Blog. 28 August 2015.
  2. Lilford RJ. Demystifying Theory. NIHR CLAHRC West Midlands News Blog. 10 April 2015.
  3. Lilford RJ. Financial Incentives for Providers of Health Care: The Baggage Handler and the Intensive Care Physician. NIHR CLAHRC West Midlands News Blog. 25 July 2015.
  4. Lilford RJ. Two Things to Remember About Human Nature When Designing Incentives. NIHR CLAHRC West Midlands News Blog. 27 January 2017.
  5. Vroom VH. Work and motivation. Oxford, England: Wiley. 1964.
  6. Croft C, Currie G, Lockett A. The impact of emotionally important social identities on the construction of managerial leader identity: A challenge for nurses in the English NHS. Organ Stud. 2015; 36(1): 113-31.
  7. Currie G, Burgess N, Hayton JC. HR Practices and Knowledge Brokering by Hybrid Middle Managers in Hospital Settings: The Influence of Professional Hierarchy. Hum Res Manage. 2015; 54(5): 793-812.
  8. Lilford RJ. Polycentric Organisations. NIHR CLAHRC West Midlands News Blog. 25 July 2014.
  9. Currie G & Spyridonidis D. Interpretation of Multiple Institutional Logics on the Ground: Actors’ Position, their Agency and Situational Constraints in Professionalized Contexts. Organ Stud. 2016; 37(1): 77-97.
  10. Currie G, Lockett A, Finn R, Martin G, Waring J. Institutional work to maintain professional power: Recreating the model of medical professionalism. Organ Stud. 2012; 33(7): 937-62.
  11. Lockett A, Currie G, Waring J, Finn R, Martin G. The influence of social position on sensemaking about organizational change. Acad Manage J. 2014; 57(4): 1102-29.
  12. Lilford RJ. New Framework to Guide the Evaluation of Technology-Supported Services. NIHR CLAHRC West Midlands News Blog. 12 January 2018.
  13. Cresswell KM, Mozaffar H, Lee L, Williams R, Sheikh A. Workarounds to hospital electronic prescribing systems: a qualitative study in English hospitals. BMJ Qual Saf. 2017; 26: 542-51.
  14. Radaelli G, Currie G, Frattini F, Lettieri E. The Role of Managers in Enacting Two-Step Institutional Work for Radical Innovation in Professional Organizations. J Prod Innov Manag. 2017; 34(4): 450-70.
  15. Lilford RJ. Implementation Science at the Crossroads. BMJ Qual Saf. 2017; 27: 331-2.
  16. Rowley E, Morriss R, Currie G, Schneider J. Research into practice: Collaboration for Leadership in Applied Health Research and Care (CLAHRC) for Nottinghamshire, Derbyshire and Lincolnshire (NDL). Implement Sci. 2012; 7:
  17. Croft C & Currie G. ‘Enhancing absorptive capacity of healthcare organizations: The case of commissioning service interventions to avoid undesirable older people’s admissions to hospitals’. In: Swan J, Nicolini D, et al., Knowledge Mobilization in Healthcare. Oxford: Oxford University Press; 2016.
  18. Currie G, Croft C, Chen Y, Kiefer T, Staniszewska S, Lilford RJ. The capacity of health service commissioners to use evidence: a case study. Health Serv Del Res. 2018; 6(12).
  19. Lilford RJ. Evaluating Interventions to Improve the Integration of Care (Among Multiple Providers and Across Multiple Sites). NIHR CLAHRC West Midlands News Blog. 10 February 2017.

Stunted Child Growth: How Good a Marker of Nutrition?

Around the world children, adjusted for age, are getting taller. When a child's height falls more than two standard deviations below the reference mean height for a given age, they are labelled as ‘stunted’. Given that children have been growing faster, even in poor countries, the prevalence of stunting has decreased over the last four decades when compared to an unchanging reference standard. Using the WHO reference standard, for example, the prevalence of stunting in Asia decreased from 49% to 28% over the years 1990 to 2010. In Africa stunting rates remain stubbornly high at ~40%.[1] However, classifying 40% of any population as ‘abnormal’ should always raise a scientist’s suspicion.

The WHO reference is based on measurements across a mixture of rich and poor countries. So it is tempting to use it as a measure of infant nutrition; as nutrition improves so the prevalence of stunting should decline. It is equally tempting to infer that a well-nourished infant population would attain a stunting rate of 2.5%, even if judged against high-income norms. However, this argument is flawed – growth rates reflect not just each individual child’s nutrition, but the nutrition available to at least two preceding generations of the child’s family.[2] Targets should be based on what is achievable, and a statistically defined 2.5% ‘stunting’ rate is an unachievable target; the high-income threshold is unachievable within one generation. However, a poor country’s own 2.5% threshold is too low. In fact, there is no perfect external standard; they are all arbitrary.
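To make the arithmetic behind the stunting definition concrete, here is a minimal sketch in Python. It assumes heights at a given age are roughly normally distributed, and the reference mean and standard deviation used are hypothetical placeholders rather than values from the WHO tables. The function names are ours, chosen for illustration only.

```python
import math

# Hypothetical reference values for a single age group (NOT taken from WHO tables).
REF_MEAN_CM = 87.0
REF_SD_CM = 3.0


def height_for_age_z(height_cm, ref_mean_cm=REF_MEAN_CM, ref_sd_cm=REF_SD_CM):
    """Z-score of a child's height against the reference mean and SD for their age."""
    return (height_cm - ref_mean_cm) / ref_sd_cm


def is_stunted(height_cm, cutoff=-2.0):
    """A child is labelled 'stunted' when the z-score falls below the cutoff."""
    return height_for_age_z(height_cm) < cutoff


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def expected_stunting_rate(mean_shift_sd, cutoff=-2.0):
    """Expected share of children below the cutoff, assuming the population's
    heights are normal with the reference SD but a mean shifted by
    mean_shift_sd reference standard deviations."""
    return normal_cdf(cutoff - mean_shift_sd)


print(round(expected_stunting_rate(0.0), 3))    # population matching the reference
print(round(expected_stunting_rate(-1.75), 3))  # population mean 1.75 SD below it
```

Under these assumptions a population that matches the reference has roughly 2.3% of children below the two standard deviation cutoff (the 2.5% figure corresponds to a cutoff of about 1.96 standard deviations). A population whose mean sits about 1.75 standard deviations below the reference would show a stunting rate of about 40%. In other words, a high stunting rate can reflect a modest shift of the whole distribution rather than a large 'abnormal' subgroup.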

We propose a ‘risk-adjusted’ stunting rate. One method would be based on the evidence of what can be achieved within one generation by examining the variances in height across generations, such as those in Japan, Taiwan and the Gulf states over a period of rapid economic progress and the years that followed. For instance, if two-thirds of the variance in growth rates is attributable to nutrition in a given generation, and one-third to inter-generational effects, then thresholds could be adjusted accordingly. A more refined method still could adjust for the height of the mother, since that also imposes a limit on growth rates. We think that the world should move to an empirically supported method of monitoring. Stunting rates that are based on an arbitrary standard and ignore intergenerational effects should henceforth be regarded as discredited.

— Richard Lilford, CLAHRC WM Director


  1. de Onis M, Blössner M, Borghi E. Prevalence and trends of stunting among pre-school children, 1990–2020. Public Health Nutr. 2012; 15(1): 142-8.
  2. Kaati G, Bygren LO, Pembrey M, Sjöström M. Transgenerational response to nutrition, early life circumstances and longevity. Eur J Hum Genet. 2007; 15: 784-90.

Our CLAHRC’s Unique Approach to Public and Community Involvement, Engagement and Participation (PCIEP)

All NIHR-funded research is required to involve the public/patients at all stages of the research process. Here in CLAHRC WM we are ardent supporters of this principle, and we hew to the INVOLVE guidelines in doing so. We are keen to improve our ways of involving patients and the public in our research and have used the recently-published Standards for involvement to reflect on our activities and develop better ways of working.

CLAHRC WM is a Service Delivery Research Organisation. Because we are in the business of shaping the way health services are designed and delivered, we have given careful consideration to our approach to public and community involvement in research. Our unique approach and rationale are described below.

Let’s start with a basic point. Service delivery research is, in most instances, best conducted prospectively. This is for three reasons:

  1. Prospective involvement of researchers provides access to the world’s literature, along with critical appraisal of that literature, to help inform the selection and design/adaptation of interventions.
  2. Researchers can assist in co-design and alpha-testing of proposed changes, deploying disciplines such as behavioural economics, operations research, and organisational theory.
  3. Prospective evaluations are generally more powerful (valid) than purely retrospective studies – for example, providing baseline data and information on both mediating clinical processes and outcome variables.[1]

This takes us to the next basic point – service interventions are in the purview of managers who control the purse strings, not the researchers. Yes, researchers can influence intervention selection and deployment, but they do not have the final say.

A third basic point is that service managers have a duty to consult patients and the public, just as researchers do.

We could have a model for public involvement with one set of patient/public advisors advising on research, and another set advising on interventions, as in Figure 1.

Figure 1: Separate Involvement of PPI (PCIEP) in Research and in Service

However, such a plan seems an opportunity missed. It could result, for instance, in conflicting advice, with patients/public in the research sector advocating evaluation of an intervention that their counterparts in the service have not prioritised.

We are not advocating combining PCIEP in research with patient and public involvement in the service to create just one monolithic structure. There are many research issues that are not relevant in a purely service context. We do, however, advocate an integrated approach, as represented in Figure 2. By involving patient and public contributors who are also involved in advising the service, we generate a group of people to champion our research and help ensure evidence is used in practice.

Figure 2: A system that integrates patients and the public across the service and research domains

So what can we do to achieve this level of integration? We do not have all the answers, as this is an evolving idea. However, here is what we do in CLAHRC WM:

  1. We try to recruit public contributors who also have (or have had) a role in advising the service. We target them and give some preference to such people in our competitive selection process.
  2. We hold joint consultative sessions with service managers, our public contributors, and (when possible) those who advise the service. Such was the situation, for example, in the ‘Dragon’s Den’ events we held to select priorities for our forthcoming Applied Research Collaboration application.
  3. Working with PCIEP and service partners, we create structures where research PCIEP, service PCIEP (say from Healthwatch) and CLAHRC WM researchers work together. We have worked with Sustainability and Transformation Partnerships (STPs) and our local Academic Health Science Network (AHSN) to create these structures.

Our strategy has evolved through considerable discussion within CLAHRC WM and we have ‘market tested’ our approach with Simon Denegri, the past head of INVOLVE. However, we welcome feedback, advice, and opinions from readers.

Those who wish to read more on our work and/or thoughts on patient and public involvement can do so by clicking here.

— Richard Lilford, CLAHRC WM Director

— Magdalena Skrybant, PPIE Lead


  1. Lilford RJ, Chilton PJ, Hemming K, Girling AJ, Taylor CA, Barach P. Evaluating policy and service interventions: framework to guide selection and interpretation of study end points. BMJ. 2010; 341: c4413.

The Slow March of Epidemiology: From Disease Causation to Treatment to Service Delivery

Traditional epidemiology was concerned with the causes of disease – many of the great medical discoveries, from malaria to the effects of smoking, can be credited to classical epidemiology. The subject continues to make great strides thanks to modern developments, such as genome-wide association studies and Mendelian randomisation. Approximately 70 years ago Austin Bradford Hill ushered in the days of clinical epidemiology.[1] Epidemiological methods were used to study the diagnosis and treatment of disease, rather than simply the causes and prognosis. Randomised trials and systematic reviews became the ‘stock in trade’ of the clinical epidemiologist.

As more and more effective treatments were discovered, people started to worry about large variations in practice and in the quality of care. Service delivery health research and the ‘quality movement’ were born. Researchers naturally felt the need to measure quality. Progress was slow, however. First, quality improvement was initially dominated by management research; a subject that does not have a strong tradition of measurement, as I have reported elsewhere.[2] Second, the constructs that quality researchers were dealing with were much harder to measure than clinical outcomes. For example, an attempt was made to correlate the safety culture with standardised mortality rates across intensive care units. The result was null, but this might have resulted entirely from measurement error; mortality rates suffer from unavoidable signal-to-noise problems,[3] while the active ingredient in culture is hard to capture in a measurement.[4] As the subject of the quality of care seemed to become bogged down with measurement issues, the patient safety movement became dominant. Initially people focused on psychology and organisational science. However, no science can mature without, at some point, making its central concepts quantifiable. As Galileo (allegedly) said, “Measure what can be measured, and make measurable what cannot be measured.” So it became necessary to try to measure safety, and all the problems of quality measurement re-surfaced.

Most sensible people now realise that impatience does more harm than good; shortcuts lead nowhere and we simply have to work away, measuring and mitigating measurement error as best we can. As stated, and as I have argued elsewhere,[5] clinical outcomes are insensitive to many service interventions. This is a lesson that those of us with a background in classical or clinical epidemiology have been slow to learn. Trying to copy clinical epidemiology, and to rely entirely on clinical endpoints, has driven service delivery research into two camps – qualitative researchers who eschew quantification, and quantitative researchers who want to apply rules of evidence that served them well in clinical research. However, there really is a third way. This method is based on observations across the causal chain linking intervention to clinical outcome. I have long argued that it is the pattern of data (qualitative and quantitative) across a causal chain that should be analysed.[5] Since then, people have started to pay attention, not just to the outcome at the human level, but also to mediating variables. More recently still, I have argued for the use of Bayesian networks to synthesise information from the causal chain in a particular study, along with evidence from reviews in salient topics.[6] Note that while coming from the same, realist, epistemology as ‘mixed-methods’ research, mediator variable analysis and Bayesian networking take mixed-methods to another level, since they enable data of different sorts to be captured in a clinical outcome of sufficient importance to populate a decision model. The use of proxy outcomes acquired a bad reputation in clinical epidemiology. However, carrying this idea over into service delivery research is extremely limiting. It is also unscientific, since science is dependent on induction, and induction can only be carried out if the causal mechanisms behind the results obtained are understood.
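A small worked example may help show why clinical outcomes are insensitive to many service interventions, and why observations along the causal chain are informative. The sketch below is a simple two-link chain (intervention, then care process, then outcome), not a full Bayesian network, and every number in it is hypothetical and chosen only for illustration.

```python
def outcome_risk(p_process, risk_with_process, risk_without_process):
    """Overall outcome risk, averaging over whether the care process was delivered
    (law of total probability)."""
    return p_process * risk_with_process + (1.0 - p_process) * risk_without_process


# Hypothetical figures: the intervention raises delivery of a care process
# from 60% to 80% of patients.
P_PROCESS_BEFORE = 0.60
P_PROCESS_AFTER = 0.80

# Hypothetical figures: mortality is 8% when the process is delivered, 10% when not.
RISK_WITH = 0.08
RISK_WITHOUT = 0.10

risk_before = outcome_risk(P_PROCESS_BEFORE, RISK_WITH, RISK_WITHOUT)
risk_after = outcome_risk(P_PROCESS_AFTER, RISK_WITH, RISK_WITHOUT)

process_change = P_PROCESS_AFTER - P_PROCESS_BEFORE  # change in the mediating variable
outcome_change = risk_before - risk_after            # change in the clinical outcome

print(f"Change in process delivery: {process_change * 100:.1f} percentage points")
print(f"Change in mortality:        {outcome_change * 100:.1f} percentage points")
```

With these made-up numbers, a 20 percentage point improvement in the mediating process translates into a fall in mortality of only 0.4 percentage points. A study would need to be very large to detect the second change, whereas the first is easy to observe, which is the case for measuring mediating variables as well as the final outcome.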

— Richard Lilford, CLAHRC WM Director


  1. Hill AB. The environment and disease: Association or causation? Proc R Soc Med. 1965; 58(5): 295-300.
  2. Lilford RJ, Dobbie F, Warren R, Braunholtz D, Boaden R. Top-rated British business research: Has the emperor got any clothes? Health Serv Manage Res. 2003; 16(3): 147-54.
  3. Girling AJ, Hofer TP, Wu J, Chilton PJ, Nicholl JP, Mohammed MA, Lilford RJ. Case-mix adjusted hospital mortality is a poor proxy for preventable mortality: a modelling study. BMJ Qual Saf. 2012; 21(12): 1052-6.
  4. Mannion R, Davies H, Konteh H, Jung T, Scott T, Bower P, Whalley D, McNally R, McMurray R. Measuring and Assessing Organisational Culture in the NHS (OC1). 2008.
  5. Lilford RJ, Chilton PJ, Hemming K, Girling AJ, Taylor CA, Barach P. Evaluating policy and service interventions: framework to guide selection and interpretation of study end points. BMJ. 2010; 341: c4413.
  6. Watson SI & Lilford RJ. Essay 1: Integrating Multiple Sources of Evidence: a Bayesian Perspective. In: Challenges, solutions and future directions in the evaluation of service innovations in health care and public health. Southampton (UK): NIHR Journals Library, 2016.

Sustainability and Transformation Partnerships: Why they are so Very Interesting

There is a strong international, national and local initiative to develop services generically by integrating care across multiple providers and many diseases, rather than to focus exclusively on disease ‘silos’. However, integrating care across providers runs into immediate problems because the interests of these different providers are seldom aligned. For instance, providing care in the community may reduce earnings in a hospital where money follows patients.

Integrating care across multiple providers can take different forms, which might play out in different ways. The least radical solution would consist of informal alliances to help plan services. At the other end of the scale organisations merge into common legal entities with consolidated budgets (so-called Responsible Care organisations). Between these two extremes lie formal structures, but where the budgets and legal responsibility remain with local providers.

The Sustainability and Transformation Partnerships (STPs) in England are a good example of the intermediate arrangement. They are part of official government policy, have some funding, and have generated considerable local buy-in.

However, the interests of local providers cannot be overridden by the STP. It is tempting to say that they are unlikely to be very successful given that, inevitably, the interests of the different organisations are not the same. However, there is some evidence that this might not be the inevitable, dismal outcome. The evidence comes from Elinor Ostrom, Nobel Prize winner for economics. We have cited her work previously within this news blog.[1][2] She describes the conditions under which collaboration can take place, even when the interests of the collaborating organisations are imperfectly aligned:

  1. Clearly defined boundaries.
  2. Congruence between appropriation/provision rules and local conditions.
  3. Collective choice arrangements.
  4. Monitoring.
  5. Graduated sanctions.
  6. Conflict-resolution mechanisms.
  7. Local autonomy.
  8. Nested enterprises (polycentric governance).

Ostrom’s work was carried out in the context of protection of the environment; fisheries, farms, oceans, forests and the like. So, it would be extremely interesting to examine STPs using Ostrom’s findings as an investigative lens. Working with CLAHRC London we plan to conduct numerous case studies of STPs that exhibit different features or philosophies. We expect that we will uncover differences in structure and culture that play out differently in different places. Among other things, we will see whether we can replicate Ostrom’s findings in a health care context. On this basis, we may be able to develop a tool that could help predict how well an organisation, such as an STP, is working. In the long-term we would examine (any) correlation between adherence to Ostrom’s criteria and the overall success of an STP.

Of course, this is not an easy topic for study. That is precisely why we think it is a good topic for a capacity development centre, such as a CLAHRC, to tackle. There is an inverse relationship between the importance of a topic, and its tractability. This is where various tools that we have developed, such as Bayesian networks, come into their own. These tools make intractable subjects, such as evaluating the success of STPs, a little more tractable.

— Richard Lilford, CLAHRC WM Director


  1. Lilford RJ. Polycentric Organisations. NIHR CLAHRC West Midlands News Blog. 25 July 2014.
  2. Ostrom E. Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge University Press: Cambridge, UK; 1990.