
Health Service and Delivery Research – a Subject of Multiple Meanings

Never has there been a topic so subject to lexicological ambiguity as that of Service Delivery Research. Many of the terms it uses are subject to multiple meanings, making communication devilishly difficult; a ‘Tower of Babel’ according to McKibbon, et al.[1] The result is that two people may disagree when they agree, or agree when they are fundamentally at odds. The subject is beset with ‘polysemy’ (one word means different things) and, to an even greater extent, ‘cognitive synonyms’ (different words mean the same thing).

Take the very words “Service Delivery Research”. The study by McKibbon, et al. found 46 synonyms (or near synonyms) for the underlying construct, including applied health research, management research, T2 research, implementation research, quality improvement research, and patient safety research. Some people will make strong statements as to why one of these terms is not the same as another – they will tell you why implementation research is not the same as quality improvement, for example. But seldom will two protagonists agree and give the same explanation as to why they differ, and textual exegesis of the various definitions does not support separate meanings – they all tap into the same concept, some focussing on outcomes (quality, safety) and others on the means to achieve those outcomes (implementation, management).

Let us examine some widely used terms in more detail. Take first the term “implementation”. The term can mean two quite separate things:

  1. Implementation of the findings of clinical research (e.g. if a patient has a recent onset thrombotic stroke then administer a ‘clot busting’ medicine).
  2. Implementation of the findings from HS&DR (e.g. do not use incentives when the service providers targeted by the incentive do not believe they have any control over the target).[2][3]

Then there is my bête noire, “complex interventions”. This term conflates separate ideas, such as the complexity of the intervention vs. the complexity of the system (e.g. the health system) with which the intervention interacts. Alternatively, it may conflate the complexity of the intervention’s components vs. the number of components it includes.

It is common to distinguish between process and outcome, à la Donabedian.[4] But this conflates two very different things – clinical process (such as prescribing the correct medicine, eliciting the relevant symptoms, or displaying appropriate affect), and service level (upstream) process endpoints (such as favourable staff/patient ratios, or high staff morale). We have described elsewhere the methodological importance of this distinction.[5]

Intervention description is famously conflated with intervention uptake/ fidelity/ adaptation. The intervention description should be the intervention as intended (like the recipe), while the way the intervention is assimilated in the organisation is a finding (like the process the chef actually follows).[6]

These are just a few examples of words with multiple meanings that cause health service researchers to fall over their feet. Some have tried to forge an agreement over these various terms, but widespread agreement is yet to be achieved. In the meantime, it is important to explain precisely what is meant when we talk about implementation, processes, complexity, and so on.

— Richard Lilford, CLAHRC WM Director


  1. McKibbon KA, Lokker C, Wilczynski NL, et al. A cross-sectional study of the number and frequency of terms used to refer to knowledge translation in a body of health literature in 2006: a Tower of Babel? Implementation Science. 2010; 5: 16.
  2. Lilford RJ. Financial Incentives for Providers of Health Care: The Baggage Handler and the Intensive Care Physician. NIHR CLAHRC West Midlands News Blog. 2014 July 25.
  3. Lilford RJ. Two Things to Remember About Human Nature When Designing Incentives. NIHR CLAHRC West Midlands News Blog. 2017 January 27.
  4. Donabedian A. Explorations in quality assessment and monitoring. Health Administration Press, 1980.
  5. Lilford RJ, Chilton PJ, Hemming K, Girling AJ, Taylor CA, Barach P. Evaluating policy and service interventions: framework to guide selection and interpretation of study end points. BMJ. 2010; 341: c4413.
  6. Brown C, Hofer T, Johal A, Thomson R, Nicholl J, Franklin BD, Lilford RJ. An epistemology of patient safety research: a framework for study design and interpretation. Part 3. End points and measurement. Qual Saf Health Care. 2008; 17: 170-7.

The Same Data Set Analysed in Different Ways Yields Materially Different Parameter Estimates: The Most Important Paper I Have Read This Year

News Blog readers know that I have a healthy scepticism about the validity of econometric/regression models – in particular, about the importance of distinguishing between confounding and mediating variables, the latter being variables that lie on the causal chain between explanatory and outcome variables. I therefore thank Dr Yen-Fu Chen for drawing my attention to an article by Silberzahn and colleagues.[1] They conducted a most elegant study in which 26 statistical teams analysed the same data set.

The data set concerns the game of soccer and the hypothesis that a player’s skin tone will influence a referee’s propensity to issue a red card (which results in the player being sent off). The provenance of this hypothesis lies in shedloads of studies documenting a preference for, and subconscious bias towards, lighter skin colour across the globe. Based on access to various data sets that included colour photographs of players, each player’s skin colour was graded into four zones of darkness by independent observers with, as it turned out, high reliability (agreement between observers over and above that expected by chance).
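‘Agreement over and above that expected by chance’ is usually quantified with Cohen’s kappa. A minimal sketch follows; the two raters’ gradings are invented for illustration and are not the study’s data:

```python
# Cohen's kappa: observed agreement corrected for chance agreement.
# Ratings below are hypothetical gradings of ten players into four
# skin-tone categories by two independent observers.
from collections import Counter

def cohens_kappa(rater1, rater2):
    n = len(rater1)
    # proportion of players on whom the two raters agree
    observed = sum(r1 == r2 for r1, r2 in zip(rater1, rater2)) / n
    # agreement expected by chance, from each rater's marginal frequencies
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum(c1[k] * c2[k] for k in c1) / n ** 2
    return (observed - expected) / (1 - expected)

r1 = [1, 1, 2, 3, 4, 4, 2, 1, 3, 2]  # hypothetical gradings, rater 1
r2 = [1, 1, 2, 3, 4, 3, 2, 1, 3, 2]  # hypothetical gradings, rater 2
print(round(cohens_kappa(r1, r2), 2))  # 0.86
```

A kappa of 0 would mean agreement no better than chance; values near 1 indicate the sort of high reliability reported in the study.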

The effect of skin tone on player censure by means of the red card was estimated by regression methods. Each team was free to select its preferred method, and to choose which of the 16 available variables to include in the model.

The results across the 26 teams varied widely, but were positive (in the hypothesised direction) in all but one case. The odds ratios (ORs) varied from 0.89 to 2.93, with a median estimate of 1.31. Overall, 20 teams found a significant (in each case positive) relationship. This wide variability in effect estimates is all the more remarkable given that the teams peer-reviewed each other’s methods prior to analysis of the results.

All but one team took account of the clustering of players within referees, and the outlier was also the single team not to have a point estimate in the positive (hypothesised) direction. I guess this could be called a flaw in the methodology, but the remaining methodological differences between teams could not easily be classified as errors that would earn a low score in a statistics examination. Analytic techniques varied very widely, covering linear regression, logistic regression, Poisson regression, Bayesian methods, and so on, with some teams using more than one method. Regarding covariates, all teams included the number of games played under a given referee, and 69% included the player’s position on the field. More than half of the teams used a unique combination of variables. Use of interaction terms does not seem to have been studied.
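To see how two perfectly defensible analytic choices can yield materially different estimates from identical data, consider a toy example (the counts are invented for illustration; this is not the Silberzahn data set): a crude odds ratio pooled over referees versus a Mantel-Haenszel odds ratio adjusted for referee stratum.

```python
# Each stratum holds counts (a, b, c, d):
# a = red cards to darker-skinned players, b = no card (darker),
# c = red cards to lighter-skinned players, d = no card (lighter).
# Counts are hypothetical and chosen so the stratifier is a confounder.
strata = [
    (9, 41, 20, 180),   # lenient referees (invented counts)
    (30, 70, 9, 41),    # strict referees (invented counts)
]

def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

# Choice 1: ignore the stratifier and pool the counts (crude OR)
a, b, c, d = (sum(s[i] for s in strata) for i in range(4))
crude = odds_ratio(a, b, c, d)

# Choice 2: adjust for the stratifier (Mantel-Haenszel OR)
num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
adjusted = num / den

print(round(crude, 2), round(adjusted, 2))  # 2.68 1.96
```

Both numbers are ‘correct’ answers to slightly different questions, which is exactly the space in which the 26 teams’ choices diverged.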

There was little systematic difference in results by the academic rank of the teams, and no association between teams’ prior beliefs about what the study would show and the magnitude of the effects they estimated. This may make the results all the more remarkable, since there would have been no apparent incentive to exploit options in the analysis to produce a positive result.

What do I make of all this? First, it would seem to be good practice to use different methods to analyse a given data set, as CLAHRC West Midlands has done in recent studies,[2] [3] though this opens opportunities to selectively report methods that produce results congenial to the analyst. Second, statistical confidence limits in observational studies are far too narrow, and this should be taken into account in the presentation and use of results. Third, data should be made publicly available so that other teams can reanalyse them whenever possible. Fourth, and a point surprisingly not discussed by the authors, the analysis should be tailored to a specific scientific causal model ex ante, not ex post. That is to say, there should be a scientific rationale for the choice of potential confounders, and explication of variables to be explored as potential mediating variables (i.e. variables that might be on the causal pathway).

— Richard Lilford, CLAHRC WM Director


  1. Silberzahn R, Uhlmann EL, Martin DP, et al. Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results. Adv Methods Pract Psychol Sci. 2018; 1(3): 337-56.
  2. Manaseki-Holland S, Lilford RJ, Bishop JR, Girling AJ, Chen Y-F, Chilton PJ, Hofer TP; the UK Case Note Review Group. Reviewing deaths in British and US hospitals: a study of two scales for assessing preventability. BMJ Qual Saf. 2017; 26: 408-16.
  3. Mytton J, Evison F, Chilton PJ, Lilford RJ. Removal of all ovarian tissue versus conserving ovarian tissue at time of hysterectomy in premenopausal patients with benign disease: study using routine data and data linkage. BMJ. 2017; 356: j372.

Trials are Not Always Needed for Evaluation of Surgical Interventions: Does This House Agree?

I supported the above motion at a recent surgical trials meeting in Bristol. What were my arguments?

I argued that there were four broad categories of intervention where trials were not needed:

  1. Where causality is not in dispute

This scenario arises where, but for the intervention, a bad outcome was all but inevitable. Showing that such an outcome can be prevented in only a few cases is sufficient to put the substantive question to bed. Such an intervention is sometimes referred to as a ‘penicillin-type’ intervention. Surgical examples include heart transplantation and in vitro fertilisation (for women whose Fallopian tubes have both been removed). From a philosophy of science perspective, causal thinking requires a counterfactual: what would have happened absent the intervention? In most instances a randomised trial provides the best approximation to that counterfactual. However, when the counterfactual is near-inevitable death, then a few cases will be sufficient to prove the principle. Of course, this is not the end of the story. Trials of different methods within a generic class will always be needed, along with trials in cases where the indication is less clear cut, and hence where the counterfactual cannot be predicted with a high level of certainty. Nevertheless, the initial introduction of heart transplantation and in vitro fertilisation took place without any randomised trial. Nor was such a trial necessary.

  2. Speculative procedures where there is an asymmetry of outcome

This is similar to the above category, but the justification is ethical rather than scientific. I described a 15-year-old girl who was born with no vagina but a functioning uterus. She was referred to me with a pyometra, having had an unsuccessful attempt to create a channel where the vagina should have been. The standard treatment in such a dire situation would have been hysterectomy. However, I offered to improvise and try an experimental procedure, using tissue expansion methods to stretch the skin at the vaginal opening and then using this skin to create a functioning channel linking the uterus to the exterior. The patient and her guardian accepted this procedure in the full knowledge that it was entirely experimental. In the event, I am glad to report that the operation was successful, producing a functional vagina and allowing regular menstruation.[1] The formal theory behind innovative practice in such dire situations comes from expected utility theory.[2] An example is explicated in the figure.

[Figure: decision tree comparing the expected utilities of the risky intervention, no intervention, and entry into an RCT]

This example relates to a person with very low life expectancy and a high-risk procedure that may either prove fatal or extend their life for a considerable time. In such a situation, the expected value of the risky procedure considerably exceeds that of doing nothing, and is preferable, from the point of view of the patient, to entry into an RCT. In fact, the expected value of the RCT (with a 1:1 randomisation ratio) is (0.5 × 0.25) + (0.5 × 1.0) = 0.625. While favourable in comparison with ‘no intervention’, this is inferior to the ‘risky intervention’.
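The arithmetic of the example above can be sketched as follows. The utilities are those implied by the worked calculation (no intervention 0.25, risky procedure 1.0, on a 0–1 scale); both are illustrative assumptions, not clinical estimates:

```python
# Expected utility of entering a trial is the allocation-weighted average
# of the utilities of its arms.
def expected_value_rct(u_control, u_experimental, p_experimental=0.5):
    """Expected utility of RCT entry when the experimental arm is
    allocated with probability p_experimental."""
    return (1 - p_experimental) * u_control + p_experimental * u_experimental

u_nothing = 0.25  # assumed utility of no intervention
u_risky = 1.0     # assumed expected utility of the risky procedure

ev_rct = expected_value_rct(u_nothing, u_risky)  # 1:1 randomisation
print(ev_rct)  # 0.625: better than doing nothing, worse than the risky option
```

Varying `p_experimental` shows why unequal randomisation ratios are sometimes proposed when one arm is strongly preferred ex ante.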

  3. When the intervention has not been well thought through

Here my example was full frontal lobotomy. Trials and other epidemiological methods can only work out how to reach an objective, not which objective to reach or prioritise. Taking away someone’s personality is not a fair price to pay for mental calmness.

  4. When the trial is poor value for money

Trials are often expensive, and we have made them more so with extensive procedural rules. Collection of end-points through routine data systems is only part of the answer to this problem. Hence trials can be a poor use of research resources. Modelling shows that the value of the information trials provide is sometimes exceeded by their opportunity cost.[3-5]

Of course, I am an ardent trialist. But informed consent must be fully informed, so that the preferences of the patient can come into play. I conducted an RCT of two methods of entering patients into an RCT, and showed that more and better information reduced willingness to be randomised.[6] Trial entry is justified when equipoise applies – when the ‘expected values’ of the alternative treatments are about the same.[7] The exception is when the new treatment is unlicensed. Then ‘equipoise plus’ should apply – the expected value of trial entry should exceed or equal that of standard treatment.[8]

— Richard Lilford, CLAHRC WM Director


  1. Lilford RJ, Sharpe DT, Thomas DFM. Use of tissue expansion techniques to create skin flaps for vaginoplasty. Case report. Br J Obstet Gynaecol. 1988; 95: 402-7.
  2. Lilford RJ. Trade-off between gestational age and miscarriage risk of prenatal testing: does it vary according to genetic risk? Lancet. 1990; 336: 1303-5.
  3. De Bono M, Fawdry RDS, Lilford RJ. Size of trials for evaluation of antenatal tests of fetal wellbeing in high risk pregnancy. J Perinat Med. 1990; 18(2): 77-87.
  4. Lilford R, Girling A, Braunholtz D. Cost-Utility Analysis When Not Everyone Wants the Treatment: Modeling Split-Choice Bias. Med Decis Making. 2007; 27(1): 21-6.
  5. Girling AJ, Freeman G, Gordon JP, Poole-Wilson P, Scott DA, Lilford RJ. Modeling payback from research into the efficacy of left-ventricular assist devices as destination therapy. Int J Technol Assess Health Care. 2007; 23(2): 269-77.
  6. Wragg JA, Robison EJ, Lilford RJ. Information presentation and decisions to enter clinical trials: a hypothetical trial of hormone replacement therapy. Soc Sci Med. 2000; 51(3): 453-62.
  7. Lilford RJ. Ethics of clinical trials from a Bayesian and decision analytic perspective: whose equipoise is it anyway? BMJ. 2003; 326: 980.
  8. Robinson EJ, Kerr CE, Stevens AJ, Lilford RJ, Braunholtz DA, Edwards SJ, Beck SR, Rowley MG. Lay public’s understanding of equipoise and randomisation in randomised controlled trials. Health Technol Assess. 2005; 9(8): 1-192.

Estimating Mortality Due to Low-Quality Care

A recent paper by Kruk and colleagues attempts to estimate the number of deaths caused by sub-optimal care in low- and middle-income countries (LMICs).[1] They do so by selecting 61 conditions that are highly amenable to healthcare. They estimate deaths from these conditions from the Global Burden of Disease studies. The proportion of deaths attributed to differences in health systems is estimated from the difference in deaths between LMICs and high-income countries (HICs). So if the death rate from stroke in people aged 70 to 75 is 10 per 1,000 in HICs and 20 per 1,000 in LMICs, then 10 deaths per 1,000 are deemed preventable. This ‘subtractive method’ of estimating deaths that could be prevented by improved health services simply answers the otiose question: “what would happen if low-income countries and their populations could be converted, by the wave of a wand, into high-income countries, complete with populations enjoying high income from conception?” Such a reductionist approach simply replicates the well-known association between per capita GDP and life expectancy.[2]

The authors of the above paper do try to isolate the effect of institutional care from access to facilities. To make this distinction they need to estimate utilisation of services. This they do from various household surveys, conducted at selected sites around the world, which contain questions about service use. So a further subtraction is performed: if half of all people deemed to be having a stroke utilise care, then half of the difference in stroke mortality can be attributed to quality of care.
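As we read it, the paper’s two-step subtraction runs roughly as follows, using the illustrative stroke figures from above (10 vs. 20 deaths per 1,000 in the 70–75 age band, 50% utilisation; all numbers illustrative, not the paper’s own):

```python
# Sketch of the 'subtractive method' as described in the text.
hic_deaths_per_1000 = 10.0   # high-income countries (illustrative)
lmic_deaths_per_1000 = 20.0  # low- and middle-income countries (illustrative)
utilisation = 0.5            # proportion of stroke patients who access care

# Step 1: the whole HIC-LMIC gap is labelled 'amenable to healthcare'
amenable = lmic_deaths_per_1000 - hic_deaths_per_1000  # 10 per 1,000

# Step 2: split the gap in proportion to utilisation -
# excess deaths among users are blamed on quality of care,
# excess deaths among non-users on lack of access.
due_to_quality = amenable * utilisation        # 5 per 1,000
due_to_access = amenable * (1 - utilisation)   # 5 per 1,000
print(due_to_quality, due_to_access)
```

Laying the arithmetic bare makes the critique below concrete: nothing in these two subtractions adjusts for case-mix, delayed presentation, or lifetime deprivation.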

Based on this methodology the authors find that the lion’s share of deaths is caused by poor quality care, not failure to get care. This conclusion is flawed because:

  1. The link between the databases is at a very coarse level – there is no individual linkage.
  2. As a result risk-adjustment is not possible.
  3. Further to the above, the method is crucially unable to account for delays in presentation and access to care preceding presentation that will inevitably result in large differences in prognosis at presentation.
  4. Socio-economic status and deprivation over a lifetime is associated with recovery from a condition, so differences in outcome are not due only to differences in care quality.[3]
  5. There are measurement problems at every turn. For example, Global Burden of Disease is measured in very different ways across HICs and LMICs – the latter rely heavily on verbal autopsy.
  6. Quality, as measured by crude subtractive methodologies, includes survival achieved by means of expensive high technology care. However, because of opportunity costs, introduction of effective but expensive treatments will do more harm than good in LMICs (until they are no longer LMICs).

The issue of delay in presentation is crucial. Take, for example, cancer of the cervix. In HICs the great majority of cases are diagnosed at an early, if not a pre-invasive, stage. However, in low-income countries almost all cases are already far advanced when they present. To attribute the death rate difference to the quality of care is inappropriate. Deep in the discussion the authors state that ‘comorbidity and disease history could be different between low and high income countries which can result in some bias.’ This is an understatement, and the problem cannot be addressed by a passing mention. Later they also assert that all sensitivity analyses support the conclusion that poor healthcare is a larger driver of amenable mortality than utilisation of services. But it is really difficult to believe such sensitivity analyses when this bias is treated so lightly.

Let us be clear: there is tons of evidence that care is, in many respects, very sub-optimal in LMICs. We care about trying to improve it. But we think such dramatic results, based on excessively reductionist analyses, are simply not justifiable, and in seeking attention in this way they risk undermining broader support for the important goal of improving care in LMICs. In areas from global warming to mortality during the Iraq war, we have seen the harm that marketing with unreliable methods and generalising beyond the evidence can do to a good cause, by giving fodder to those who do not want to believe that there is a problem. What is needed is careful observation and direct measurement of care quality itself, along with evaluation of the cost-effectiveness of methods to improve care. Mortality is a crude measure of care quality.[4][5] Moreover, the extent to which healthcare reduces mortality is quite modest among older adults. The type of paper reported here topples over into marketing – it is as unsatisfying a scientific endeavour as it is sensational.

— Richard Lilford, CLAHRC WM Director

— Timothy Hofer, Professor in Division of General Medicine, University of Michigan


  1. Kruk ME, Gage AD, Joseph NT, Danaei G, García-Saisó S, Salomon JA. Mortality due to low-quality health systems in the universal health coverage era: a systematic analysis of amenable deaths in 137 countries. Lancet. 2018.
  2. Rosling H. How Does Income Relate to Life Expectancy. Gap Minder. 2015.
  3. Pagano D, Freemantle N, Bridgewater B, et al. Social deprivation and prognostic benefits of cardiac surgery: observational study of 44,902 patients from five hospitals over 10 years. BMJ. 2009; 338: b902.
  4. Lilford R, Mohammed MA, Spiegelhalter D, Thomson R. Use and misuse of process and outcome data in managing performance of acute medical care: avoiding institutional stigma. Lancet. 2004; 363: 1147-54.
  5. Girling AJ, Hofer TP, Wu J, et al. Case-mix adjusted hospital mortality is a poor proxy for preventable mortality: a modelling study. BMJ Qual Saf. 2012; 21(12): 1052-6.

A framework for implementation science: organisational and psychological approaches

Damschroder and colleagues present a meta-analytic approach to the development of a framework to guide implementation of service interventions.[1] They call their framework the “consolidated framework for implementation research”. Their approach is based on a review of published theories concerning implementation of service interventions. Since two-thirds of interventions to improve care fail, this is an important activity. They offer an over-arching typology of constructs that deal with barriers to effective implementation, and build on Greenhalgh’s monumental study [2] of factors determining the diffusion, dissemination and implementation of innovations in health service delivery. Such frameworks are useful because they take an organisation-wide perspective, so that psychological frameworks of individual behaviour change, such as the transtheoretical model [3] or COM-B,[4] are subsumed within them. I proposed something similar with my “framework of frameworks”.[5]

In any event, the framework produced seems sensible enough. In effect it is an elaboration of the essential interactive dimensions of intervention, context and the process of implementation. Context can be divided into the external setting and the internal setting. This particular study goes further and ends up with five major domains, each broken up into a number of constructs – eight relating to the intervention itself.

This paper is carefully written and well researched, and is an excellent source of references to some of the icons of the organisational research literature. But is it useful? And will it be the last such framework? I rather think the answer to these two questions is no. I once had a boss who said the important thing about science was ‘knowing what to leave out’! I think a much simpler framework would have sufficed in this case. Maybe I should have a go at producing one!

— Richard Lilford, CLAHRC WM Director


  1. Damschroder LJ, Aron DC, Keith RE, Kirsh SR, Alexander JA, Lowery JC. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci. 2009; 4: 50.
  2. Greenhalgh T, Robert G, Macfarlane F, Bate P, Kyriakidou O. Diffusion of innovations in service organizations: systematic review and recommendations. Milbank Q. 2004; 82: 581-629.
  3. Prochaska JO, Velicer WF. The transtheoretical model of health behaviour change. Am J Health Promot. 1997; 12(1): 38-48.
  4. Michie S, van Stralen M, West R. The behaviour change wheel: a new method for characterising and designing behaviour change interventions. Implement Sci. 2011; 6: 42.
  5. Lilford RJ. A Theory of Everything! Towards a Unifying Framework for Psychological and Organisational Change Models. NIHR CLAHRC West Midlands News Blog. 28 August 2015.

How Theories Inform our Work in Service Delivery Practice and Research

We have often written about theory in this News Blog.[1] [2] For instance, the ‘iron law’ of incentives – never use reward or sanction unless the manager concerned believes that they can influence the odds of success in reaching the required target.[3] [4] This law, sometimes called ‘expectancy theory’, was articulated by Victor Vroom back in 1964.[5] Here we review some of the theories that we have described, refined or enlarged over the course of the current CLAHRC, and which we shall pursue if we are successful in our Applied Research Collaboration (ARC) application. In each case we begin with the theory, then say how we have explicated it, and then describe how we plan to develop it further through ongoing empirical work. Needless to say, our summaries are an impoverished simulacrum of the full articles:

  1. The theory of ‘hybrid managers’. It is well known that many professionals develop hybrid roles so that they toggle between their professional and managerial duties, and it is also known that tension can arise when the roles conflict. In our work we found that organisational factors can determine the extent to which nurses retain a strong professional ethos when fulfilling managerial roles.[6] Simply put, the data seem to show that nurses working in more successful healthcare institutions tend to hew closer to their professional ethos than nurses in less successful units. It is reasonable to infer that an environment that can accommodate a strong professional orientation among hybrid managers is more likely to encompass the checks and balances conducive to safe care than one that does not accommodate such a range of perspectives; most of us would choose to be treated in the environment where professional ethos is allowed a fair degree of expression. However, whether such a climate reflects better managers or a more difficult external environment is harder to discern. We now plan to examine this issue across many environments – for example, midwife hybrid managers balancing the need to expand choices of place of delivery with logistical limitations on doing so. Similarly, improving care for people with learning difficulties will require clinical managers to have freedom to innovate in order to improve services. Note that working with Warwick Business School enables us to locate our enquiries and theory development in the context of management in general, rather than just the management of health services. For example, the above study of nurse managers encompasses tax inspectors who now have to balance their traditional role in enforcing the tax code with one of helping the likes of us to make accurate declarations.
  2. Hybrid managers as knowledge brokers. Hybrid managers, it is known, act as a conduit between senior managers and frontline professionals, mediating the adoption of effective practice – i.e. knowledge brokering. It is also known that effecting change means overcoming structural, social and motivational barriers. The task of implementing state-of-the-art care practices is a delicate one and, prior to our research, the social dynamic of effecting change was poorly understood. In particular, the CLAHRC WM team wanted to study the role of status and perceived legitimacy in facilitating or inhibiting the knowledge broker’s task. We found that hierarchies are critically important – safe care is more than following rules; it requires a degree of initiative (sometimes called discretional energy) by multiple actors across the hierarchy.[7] Nurses were often severely inhibited in using such personal initiative. The attitude of more senior staff is thus crucial in permitting, indeed encouraging, the use of initiative within a broader system of checks and balances. If the hierarchy within nursing is a barrier to progress, then that between doctors and nurses is a much bigger obstacle to the uptake of knowledge. Moreover, there was also evidence of differing barriers across medical specialities, with clinicians at the most action-oriented end of the spectrum (such as surgeons) showing lower levels of team-working than those with more reflective daily tasks (such as geriatricians).
The work pointed towards the effectiveness of creating opportunities for purposeful interaction across these various types of hierarchical barriers – what the researchers called “dyadic relationships between hybrid middle managers with clinical governance responsibility and doctors through engagement and participation in medical-oriented meetings”; Elinor Ostrom would call this opportunities for ‘cheap talk’.[8] This work is crucial for laying the foundation for our work on the integration of care covering management of patients at the acute hospital / ambulance / community interface; care of patients with multiple diseases; care of the elderly; and the care of people with rare diseases, to mention but a few. Clearly, such opportunities for structured interaction are only parts of the story, and other factors that have been shown to be important (e.g. job design, performance management, education, patient empowerment, and data sharing) must be included in service improvement initiatives.
  3. Logics. Our third example concerns the unwritten assumptions that underpin what a person should do in their domain of work, and why they do it – so called ‘logics’. In a place like a hospital or university, many professions must co-exist, yet each will have a different ‘logic’. This idea applies across society, but CLAHRC WM investigator Graeme Currie wanted to examine how the professional logic and policy logic interact in a hospital setting.[9] The background to this study is the finding that policy logic has constrained and limited professional logic over the last few decades – doctors are no longer in charge of performance improvement, the management of waiting lists, etc. The researchers used the introduction of a new evidence-based, multi-component guideline as a lens through which to explore the interactions of different ‘logics’ in hospital practice. The implementation of a multi-component guideline is not a simple thing, and some intuitive cost-benefit calculations could justify, at least intellectually, massaging some aspects of the guideline to fit management practices rather than the reverse. However, the way this played out was not the same across contexts. As before, doctors were generally (but not invariably) less amenable to change than nurse practitioners with managerial responsibility. This study, published in a premier management journal,[9] identifies contingencies that will provide depth to our evaluations of different ways to reshape services. We will build on these insights when we examine a proposed service to use Patient Reported Outcome Measures, rather than simply elapsed time, to determine when patients should be seen in the outpatient department. An understanding of ‘logics’ is likely to come into play when we empower community and ambulance staff to elicit patient preferences and respect them even when to do so flies in the face of guidelines. 
At the level of the system, change is best viewed as an institutional problem of professional power and policy, around which change needs to orientate. It is not that systems and organisations can’t be changed, but subtle tactics and work may be required.[10] [11]
  4. Health care organisations viewed as political groupings, and the need to do ‘political work’ when implementing interventions. Trish Greenhalgh has recently provided an evidence-based framework which unpicks the reasons why IT implementations so often disappoint.[12] She points out that managers consistently underestimate the size of the task and the sheer difficulty of implementing IT systems so that they reach even some of their potential. Likewise, work conducted under an NIHR Programme grant that developed out of CLAHRC WM showed how new IT systems could introduce serious new hazards.[13] One method of avoiding failure in any large initiative, such as a large IT system, comes from a study of Italian hospitals conducted by the CLAHRC WM team,[14] which advocates an iterative process, time, and careful preparation of the ground by doing ‘political work’ to win hearts and minds and adapt interventions to context.[15] This type of approach will be critical to the development of complex interventions, such as those widening access to homebirth, and integrating patient feedback (including Patient Reported Outcome Measures) into patient care pathways.
  5. Absorptive capacity. Many CLAHRCs have relied on a knowledge brokering model to underpin translation of research, through which key individuals ensure knowledge gets to the right people at the right time to benefit patient care.[16] However, such an approach may have a limited effect, and we need to consider how organisations and systems can be developed to ensure the efforts of knowledge brokers are leveraged and evidence informs patient care more widely. This is a matter of developing organisational and system ‘absorptive capacity’. Many of the implementation studies under our current CLAHRC have sought to develop the co-ordination capabilities of organisations and systems to translate evidence into practice. For example, public and patient involvement, GP involvement, and better business intelligence processes and structures have been highlighted as ways of ensuring that clinical commissioning groups make evidence-informed decisions.[17] We have taken our work further to develop a ‘tool’ to assess the absorptive capacity of organisations.[18]

In this short review we have described how theoretical work, based on the development and evaluation of service interventions, can help us understand why an intervention may succeed or fail, and how this may vary from place to place. Increasingly we are applying Elinor Ostrom’s work on collaboration between managers whose incentives are not aligned to the problems of integrated care in the NHS.[19] Our work represents a successful collaboration between management and medical schools, notwithstanding the difference in ‘logics’ between these organisations. This collaboration has taken time to mature, as have those between the services and academia more broadly. The essential point is that consideration of the wider organisational and system context will prove crucial to our efforts to continue broadening, accelerating and deepening the translation of evidence into practice in our proposed ARC.

— Richard Lilford, CLAHRC WM Director

— Graeme Currie, Professor of Public Management, CLAHRC WM Deputy Director


  1. Lilford RJ. A Theory of Everything! Towards a Unifying Framework for Psychological and Organisational Change Models. NIHR CLAHRC West Midlands News Blog. 28 August 2015.
  2. Lilford RJ. Demystifying Theory. NIHR CLAHRC West Midlands News Blog. 10 April 2015.
  3. Lilford RJ. Financial Incentives for Providers of Health Care: The Baggage Handler and the Intensive Care Physician. NIHR CLAHRC West Midlands News Blog. 25 July 2015.
  4. Lilford RJ. Two Things to Remember About Human Nature When Designing Incentives. NIHR CLAHRC West Midlands News Blog. 27 January 2017.
  5. Vroom VH. Work and motivation. Oxford, England: Wiley. 1964.
  6. Croft C, Currie G, Lockett A. The impact of emotionally important social identities on the construction of managerial leader identity: A challenge for nurses in the English NHS. Organ Stud. 2015; 36(1): 113-31.
  7. Currie G, Burgess N, Hayton JC. HR Practices and Knowledge Brokering by Hybrid Middle Managers in Hospital Settings: The Influence of Professional Hierarchy. Hum Res Manage. 2015; 54(5): 793-812.
  8. Lilford RJ. Polycentric Organisations. NIHR CLAHRC West Midlands News Blog. 25 July 2014.
  9. Currie G & Spyridonidis D. Interpretation of Multiple Institutional Logics on the Ground: Actors’ Position, their Agency and Situational Constraints in Professionalized Contexts. Organ Stud. 2016; 37(1): 77-97.
  10. Currie G, Lockett A, Finn R, Martin G, Waring J. Institutional work to maintain professional power: Recreating the model of medical professionalism. Organ Stud. 2012; 33(7): 937-62.
  11. Lockett A, Currie G, Waring J, Finn R, Martin G. The influence of social position on sensemaking about organizational change. Acad Manage J. 2014; 57(4): 1102-29.
  12. Lilford RJ. New Framework to Guide the Evaluation of Technology-Supported Services. NIHR CLAHRC West Midlands News Blog. 12 January 2018.
  13. Cresswell KM, Mozaffar H, Lee L, Williams R, Sheikh A. Workarounds to hospital electronic prescribing systems: a qualitative study in English hospitals. BMJ Qual Saf. 2017; 26: 542-51.
  14. Radaelli G, Currie G, Frattini F, Lettieri E. The Role of Managers in Enacting Two-Step Institutional Work for Radical Innovation in Professional Organizations. J Prod Innov Manag. 2017; 34(4): 450-70.
  15. Lilford RJ. Implementation Science at the Crossroads. BMJ Qual Saf. 2017; 27: 331-2.
  16. Rowley E, Morriss R, Currie G, Schneider J. Research into practice: Collaboration for Leadership in Applied Health Research and Care (CLAHRC) for Nottinghamshire, Derbyshire and Lincolnshire (NDL). Implement Sci. 2012; 7:
  17. Croft C & Currie G. ‘Enhancing absorptive capacity of healthcare organizations: The case of commissioning service interventions to avoid undesirable older people’s admissions to hospitals’. In: Swan J, Nicolini D, et al., Knowledge Mobilization in Healthcare. Oxford: Oxford University Press; 2016.
  18. Currie G, Croft C, Chen Y, Kiefer T, Staniszewska S, Lilford RJ. The capacity of health service commissioners to use evidence: a case study. Health Serv Deliv Res. 2018; 6(12).
  19. Lilford RJ. Evaluating Interventions to Improve the Integration of Care (Among Multiple Providers and Across Multiple Sites). NIHR CLAHRC West Midlands News Blog. 10 February 2017.

Stunted Child Growth: How Good a Marker of Nutrition?

Around the world, children are getting taller for their age. When a child falls more than two standard deviations below the reference mean height for their age, they are labelled as ‘stunted’. Because children have been growing taller, even in poor countries, the prevalence of stunting has decreased over the last four decades when compared with an unchanging reference standard. Using the WHO reference standard, for example, the prevalence of stunting in Asia decreased from 49% to 28% between 1990 and 2010. In Africa stunting rates remain stubbornly high at ~40%.[1] However, classifying 40% of any population as ‘abnormal’ should always raise a scientist’s suspicion.

The WHO reference is based on measurements from a mixture of rich and poor countries, so it is tempting to use it as a measure of infant nutrition: as nutrition improves, the prevalence of stunting should decline. It is also tempting to infer that a well-nourished infant population would attain a stunting rate of 2.5%, even when judged against high-income norms. However, this argument is flawed – growth rates reflect not just each individual child’s nutrition, but the nutrition available to at least two preceding generations of the child’s family.[2] Targets should be based on what is achievable, and a statistically defined 2.5% ‘stunting’ rate is not: the high-income threshold is unachievable within one generation, while a poor country’s own 2.5% threshold sets the bar too low. In fact, there is no perfect external standard; they are all arbitrary.
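To make the arithmetic concrete, here is a minimal sketch, using only Python’s standard library and assuming (for illustration) that height-for-age z-scores are normally distributed with the reference standard deviation, of what these prevalence figures imply:

```python
# Back-of-envelope sketch of the statistical definition of stunting.
# Assumption (ours, not the post's): height-for-age z-scores are
# approximately normal with the reference standard deviation of 1.
from statistics import NormalDist

z = NormalDist()  # standard normal: height-for-age z-scores under the reference

# 'Stunted' means a height-for-age z-score below -2. In the reference
# population itself this labels only ~2.3% of children:
baseline_rate = z.cdf(-2)          # ~0.023

# Conversely, a 40% stunting rate (as reported for Africa) implies the
# whole population mean sits far below the reference mean:
mean_shift = -2 - z.inv_cdf(0.40)  # ~-1.75 SD relative to the reference mean

print(f"reference-population stunting rate: {baseline_rate:.1%}")
print(f"implied population mean: {mean_shift:.2f} SD")
```

On these assumptions, the reference population itself has a ‘stunting’ rate of about 2.3%, and a 40% observed rate implies a population mean roughly 1.75 standard deviations below the reference mean – which is why labelling 40% of a population ‘abnormal’ says more about the standard than about the children.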

We propose a ‘risk-adjusted’ stunting rate. One method would be based on evidence of what can be achieved within one generation, obtained by examining the changes in height across generations in countries such as Japan, Taiwan and the Gulf states during periods of rapid economic progress and the years that followed. For instance, if two-thirds of the variance in growth rates is attributable to nutrition in a given generation, and one-third to intergenerational effects, then thresholds could be adjusted accordingly. A still more refined method could adjust for the height of the mother, since that also imposes a limit on growth rates. We think that the world should move to an empirically supported method of monitoring; stunting rates based on an arbitrary standard that ignores intergenerational effects should henceforth be regarded as discredited.
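As a hedged illustration of how such an adjustment might work – treating the two-thirds/one-third split as applying to the mean height deficit rather than the variance, purely for simplicity, and taking the current deficit from an assumed 40% stunting rate – one could compute a one-generation achievable target:

```python
# Sketch of a 'risk-adjusted' stunting target. The 2/3 vs 1/3 split is the
# post's hypothetical; applying it to the mean deficit (rather than the
# variance) and the 40% starting rate are our simplifying assumptions.
from statistics import NormalDist

z = NormalDist()

current_deficit = -2 - z.inv_cdf(0.40)  # ~-1.75 SD, implied by a 40% stunting rate
within_generation = 2 / 3               # share of the deficit removable by nutrition now

# Even with perfect nutrition, one-third of the deficit persists this generation:
residual_mean = (1 - within_generation) * current_deficit  # ~-0.58 SD

# Achievable share of children below -2 SD, given the residual mean:
achievable_rate = z.cdf(-2 - residual_mean)

print(f"one-generation achievable stunting rate: {achievable_rate:.1%}")
```

Under these assumptions the one-generation achievable rate is roughly 8% – a demanding but attainable target, in contrast to the unattainable 2.5% implied by the high-income threshold.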

— Richard Lilford, CLAHRC WM Director


  1. de Onis M, Blössner M, Borghi E. Prevalence and trends of stunting among pre-school children, 1990–2020. Public Health Nutr. 2012; 15(1): 142-8.
  2. Kaati G, Bygren LO, Pembrey M, Sjöström M. Transgenerational response to nutrition, early life circumstances and longevity. Eur J Hum Genet. 2007; 15: 784-90.

Our CLAHRC’s Unique Approach to Public and Community Involvement, Engagement and Participation (PCIEP)

All NIHR-funded research is required to involve patients and the public at all stages of the research process. Here in CLAHRC WM we are ardent supporters of this principle, and we hew to the INVOLVE guidelines in doing so. We are keen to improve our ways of involving patients and the public in our research, and have used the recently published Standards for involvement to reflect on our activities and develop better ways of working.

CLAHRC WM is a Service Delivery Research organisation. Because we are in the business of shaping the way health services are designed and delivered, we have given careful consideration to our approach to public and community involvement in research. Our unique approach, and the rationale for it, are described below.

Let’s start with a basic point. Service delivery research is, in most instances, best conducted prospectively. This is for three reasons:

  1. Prospective involvement of researchers provides access to the world’s literature, along with critical appraisal of that literature, to help inform the selection and design/adaptation of interventions.
  2. Researchers can assist in co-design and alpha-testing of proposed changes, deploying disciplines such as behavioural economics, operations research, and organisational theory.
  3. Prospective evaluations are generally more powerful (valid) than purely retrospective studies – for example, by providing baseline data and information on both mediating clinical processes and outcome variables.[1]

This takes us to the next basic point – service interventions are in the purview of the managers who control the purse strings, not of the researchers. Yes, researchers can influence intervention selection and deployment, but they do not have the final say.

A third basic point is that service managers have a duty to consult patients and the public, just as researchers do.

We could have a model for public involvement with one set of patient/public advisors advising on research, and another set advising on interventions, as in Figure 1.

Figure 1: Separate Involvement of PPI (PCIEP) in Research and in Service

However, such a plan seems an opportunity missed. It could result, for instance, in conflicting advice, with patients/public in the research sector advocating evaluation of an intervention that their counterparts in the service have not prioritised.

We are not advocating combining PCIEP in research with patient and public involvement in the service to create just one monolithic structure. There are many research issues that are not relevant in a purely service context. We do, however, advocate an integrated approach, as represented in Figure 2. By involving patient and public contributors who are also involved in advising the service, we generate a group of people who champion our research and help ensure evidence is used in practice.

Figure 2: A system that integrates patients and the public across the service and research domains


So what can we do to achieve this level of integration? We do not have all the answers, as this is an evolving idea. However, here is what we do in CLAHRC WM:

  1. We try to recruit public contributors who also have (or have had) a role in advising the service, targeting such people and giving them some preference in our competitive selection process.
  2. We hold joint consultative sessions with service managers, our public contributors, and (when possible) those who advise the service. Such was the situation, for example, in the ‘Dragon’s Den’ events we held to select priorities for our forthcoming Applied Research Collaboration application.
  3. Working with PCIEP and service partners, we create structures where research PCIEP, service PCIEP (say from Healthwatch) and CLAHRC WM researchers work together. We have worked with Sustainability and Transformation Partnerships (STPs) and our local Academic Health Science Network (AHSN) to create these structures.

Our strategy has evolved through considerable discussion within CLAHRC WM, and we have ‘market tested’ our approach with Simon Denegri, the former head of INVOLVE. However, we welcome feedback, advice, and opinions from readers.

Those who wish to read more on our work and/or thoughts on patient and public involvement can do so by clicking here.

— Richard Lilford, CLAHRC WM Director

— Magdalena Skrybant, PPIE Lead


  1. Lilford RJ, Chilton PJ, Hemming K, Girling AJ, Taylor CA, Barach P. Evaluating policy and service interventions: framework to guide selection and interpretation of study end points. BMJ 2010; 341: c4413.

The Slow March of Epidemiology: From Disease Causation to Treatment to Service Delivery

Traditional epidemiology was concerned with the causes of disease – many of the great medical discoveries, from malaria to the effects of smoking, can be credited to classical epidemiology. The subject continues to make great strides thanks to modern developments, such as genome-wide association studies and Mendelian randomisation. Approximately 70 years ago Austin Bradford Hill ushered in the days of clinical epidemiology.[1] Epidemiological methods were used to study the diagnosis and treatment of disease, rather than simply the causes and prognosis. Randomised trials and systematic reviews became the ‘stock in trade’ of the clinical epidemiologist.

As more and more effective treatments were discovered, people started to worry about large variations in practice and in the quality of care. Service delivery health research and the ‘quality movement’ were born. Researchers naturally felt the need to measure quality. Progress was slow, however. First, quality improvement was initially dominated by management research, a subject that does not have a strong tradition of measurement, as I have reported elsewhere.[2] Second, the constructs that quality researchers were dealing with were much harder to measure than clinical outcomes. For example, an attempt was made to correlate safety culture with standardised mortality rates across intensive care units. The result was null, but this might have been due entirely to measurement error; mortality rates suffer from unavoidable signal-to-noise problems,[3] while the active ingredient of culture is hard to capture in a measurement.[4] As the subject of quality of care seemed to become bogged down in measurement issues, the patient safety movement became dominant. Initially people focused on psychology and organisational science. However, no science can mature without, at some point, making its central concepts quantifiable. As Galileo (allegedly) said, “Measure what can be measured, and make measurable what cannot be measured.” So it became necessary to try to measure safety, and all the problems of quality measurement re-surfaced.
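A small sketch illustrates why a null result is unsurprising here. Spearman’s classical correction for attenuation says the observed correlation between two constructs shrinks with the reliability of each measure; the reliability figures below are assumed purely for illustration, not taken from the studies cited:

```python
# Illustrative attenuation calculation (Spearman, 1904):
#   observed r = true r * sqrt(reliability_x * reliability_y)
# All three input values below are assumptions for the sake of example.
import math

def attenuated_r(true_r: float, rel_x: float, rel_y: float) -> float:
    """Expected observed correlation given each measure's reliability (0..1)."""
    return true_r * math.sqrt(rel_x * rel_y)

true_r = 0.6       # suppose safety culture genuinely predicts mortality this strongly
rel_culture = 0.5  # a culture questionnaire capturing half the true signal
rel_smr = 0.3      # case-mix-adjusted mortality: noisier still

print(f"expected observed correlation: {attenuated_r(true_r, rel_culture, rel_smr):.2f}")
```

Even a genuine correlation of 0.6 would appear as roughly 0.23 under these assumed reliabilities – small enough to be indistinguishable from zero in a study of a few dozen intensive care units.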

Most sensible people now realise that impatience does more harm than good; shortcuts lead nowhere and we simply have to work away, measuring and mitigating measurement error as best we can. As stated, and as I have argued elsewhere,[5] clinical outcomes are insensitive to many service interventions. This is a lesson that those of us with a background in classical or clinical epidemiology have been slow to learn. Trying to copy clinical epidemiology, and to rely entirely on clinical endpoints, has driven service delivery research into two camps – qualitative researchers who eschew quantification, and quantitative researchers who want to apply the rules of evidence that served them well in clinical research. However, there really is a third way, based on observations across the causal chain linking intervention to clinical outcome. I have long argued that it is the pattern of data (qualitative and quantitative) across a causal chain that should be analysed.[5] Since then, people have started to pay attention not just to the outcome at the human level, but also to mediating variables. More recently still, I have argued for the use of Bayesian networks to synthesise information from the causal chain in a particular study, along with evidence from reviews of salient topics.[6] Note that while they come from the same, realist, epistemology as ‘mixed-methods’ research, mediator variable analysis and Bayesian networks take mixed-methods to another level, since they enable data of different sorts to be combined into an estimate of a clinical outcome of sufficient importance to populate a decision model. The use of proxy outcomes acquired a bad reputation in clinical epidemiology, but carrying this prohibition over into service delivery research is extremely limiting. It is also unscientific, since science depends on induction, and induction can only be carried out if the causal mechanisms behind the results obtained are understood.

— Richard Lilford, CLAHRC WM Director


  1. Hill AB. The environment and disease: Association or causation? Proc R Soc Med. 1965; 58(5): 295-300.
  2. Lilford RJ, Dobbie F, Warren R, Braunholtz D, Boaden R. Top-rated British business research: Has the emperor got any clothes? Health Serv Manage Res. 2003; 16(3): 147-54.
  3. Girling AJ, Hofer TP, Wu J, Chilton PJ, Nicholl JP, Mohammed MA, Lilford RJ. Case-mix adjusted hospital mortality is a poor proxy for preventable mortality: a modelling study. BMJ Qual Saf. 2012; 21(12): 1052-6.
  4. Mannion R, Davies H, Konteh H, Jung T, Scott T, Bower P, Whalley D, McNally R, McMurray R. Measuring and Assessing Organisational Culture in the NHS (OC1). 2008.
  5. Lilford RJ, Chilton PJ, Hemming K, Girling AJ, Taylor CA, Barach P. Evaluating policy and service interventions: framework to guide selection and interpretation of study end points. BMJ 2010; 341: c4413.
  6. Watson SI & Lilford RJ. Essay 1: Integrating Multiple Sources of Evidence: a Bayesian Perspective. In: Challenges, solutions and future directions in the evaluation of service innovations in health care and public health. Southampton (UK): NIHR Journals Library, 2016.

Sustainability and Transformation Partnerships: Why they are so Very Interesting

There is a strong international, national and local initiative to develop services generically by integrating care across multiple providers and many diseases, rather than to focus exclusively on disease ‘silos’. However, integrating care across providers runs into immediate problems because the interests of these different providers are seldom aligned. For instance, providing care in the community may reduce earnings in a hospital where money follows patients.

Integrating care across multiple providers can take different forms, which might play out in different ways. The least radical solution consists of informal alliances to help plan services. At the other end of the scale, organisations merge into common legal entities with consolidated budgets (so-called Accountable Care Organisations). Between these two extremes lie formal structures in which the budgets and legal responsibility remain with local providers.

The Sustainability and Transformation Partnerships (STPs) in England are a good example of the intermediate arrangement. They are part of official government policy, have some funding, and have generated considerable local buy-in.

However, the STP cannot override the interests of local providers. It is tempting to conclude that STPs are unlikely to be very successful given that, inevitably, the interests of the different organisations are not the same. However, there is some evidence that this dismal outcome is not inevitable. The evidence comes from Elinor Ostrom, winner of the Nobel Prize for economics, whose work we have cited previously in this news blog.[1][2] She describes the conditions under which collaboration can take place, even when the interests of the collaborating organisations are imperfectly aligned:

  1. Clearly defined boundaries.
  2. Congruence between appropriation/provision rules and local conditions.
  3. Collective choice arrangements.
  4. Monitoring.
  5. Graduated sanctions.
  6. Conflict-resolution mechanisms.
  7. Local autonomy.
  8. Nested enterprises (polycentric governance).

Ostrom’s work was carried out in the context of protection of the environment: fisheries, farms, oceans, forests and the like. So it would be extremely interesting to examine STPs using Ostrom’s findings as an investigative lens. Working with CLAHRC London, we plan to conduct numerous case studies of STPs that exhibit different features or philosophies. We expect to uncover differences in structure and culture that play out differently in different places. Among other things, we will see whether we can replicate Ostrom’s findings in a health care context. On this basis, we may be able to develop a tool to help predict how well an organisation such as an STP is working. In the long term we would examine any correlation between adherence to Ostrom’s criteria and the overall success of an STP.

Of course, this is not an easy topic to study. That is precisely why we think it is a good topic for a capacity development centre, such as a CLAHRC, to tackle. There is an inverse relationship between the importance of a topic and its tractability. This is where the various tools that we have developed, such as Bayesian networks, come into their own: they make intractable subjects, such as evaluating the success of STPs, a little more tractable.

— Richard Lilford, CLAHRC WM Director


  1. Lilford RJ. Polycentric Organisations. NIHR CLAHRC West Midlands News Blog. 25 July 2014.
  2. Ostrom E. Governing the Commons: The Evolution of Institutions for Collective Actions. Cambridge University Press: Cambridge, UK; 1990.