I commonly come across colleagues who say that context is all in service delivery research. They argue that summative quantitative studies are not informative because there is so much variation by context that the average is meaningless. I think that this is lazy thinking. If context were all, then there would be no point in studying anything by any means; any one instance would be precisely that – one instance. If the effects of an intervention were entirely context-specific then it would never be permissible to extrapolate from one situation to another, irrespective of the types of observations made. But nobody thinks that.
A softer anti-quantitative view accepts the idea of generalising across contexts, but holds that such generalisations / extrapolations can be built up solely from studies of underlying mechanisms, and that in-depth qualitative studies can tell us all we need to know about those mechanisms. Proponents of this view hold that quantitative epidemiological studies are, at best, extremely limited in what they can offer. It is true that some things cannot easily be studied in a quantitative comparative way – an historian interested in the cause of the First World War cannot easily compare the candidate explanatory variables over lots of instances. In such a case, exploration of various individual factors that may have combined to unleash the catastrophe may be all that is available. But accepting this necessity is not tantamount to eschewing quantitative comparisons when they are possible. It is unsatisfying to study just the mechanisms by which improved nurse ratios may reduce falls or pressure ulcers without measuring whether the incidence of these outcomes is, in fact, correlated with nurse numbers.
Of course, concluding that quantification is important is not tantamount to concluding that quantification alone is adequate. It never is and cannot be, as the eminent statistician, Sir Austin Bradford Hill, implied in his famous speech. Putative causal explanations are generally strengthened when theory generated from one study yields an hypothesis that is supported by another study (Hegel’s thesis, antithesis, synthesis idea). Alternatively, or in addition, evidence for a theory, and for hypotheses that are contingent on that theory, may arise within the framework of a single study. This can happen when observations are made across a causal chain. For example, a single study may follow up heavy, light and non-drinkers and examine the size of the memory centre in the brain (by MRI) and their memory (through a cognitive test). The theory that alcohol affects memory is supported by the finding that memory declines faster in drinkers than in teetotallers, and yet further support comes from alcohol’s effect on the size of the memory centre (the hippocampus). Similarly, a single study may show that improving the nurse to patient ratio results in a lower probability of unexpected deaths and more diligent monitoring of patients’ vital signs. Here the primary hypothesis that the explanatory variable (nurse/patient ratio) is correlated with the outcome variable (unexpected hospital death) is reinforced by also finding a correlation between the intervening / mediating variable (diligence in monitoring vital signs) and the outcome variable (hospital deaths) (see Figure 1). In a previous News Blog we extolled the virtues of Bayesian networks in quantifying these contingent relationships.
Figure 1: Causal chain linking explanatory variable (intervention) and outcome
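The causal-chain argument can be made concrete with a toy simulation. What follows is only a sketch under invented assumptions: the function names, the coefficients and the noise levels are all hypothetical and are not drawn from any study. It generates data in which the nurse-patient ratio affects unexpected deaths only through the diligence of monitoring, and then checks the three correlations that the chain predicts.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_chain(n=5000, seed=1):
    """Simulate ratio -> monitoring -> deaths with made-up coefficients."""
    rng = random.Random(seed)
    ratio, monitoring, deaths = [], [], []
    for _ in range(n):
        r = rng.uniform(0.1, 0.5)            # nurses per patient (hypothetical range)
        m = 2.0 * r + rng.gauss(0, 0.1)      # monitoring diligence rises with the ratio
        d = 1.0 - 0.8 * m + rng.gauss(0, 0.1)  # death index falls with monitoring only
        ratio.append(r)
        monitoring.append(m)
        deaths.append(d)
    return ratio, monitoring, deaths


if __name__ == "__main__":
    ratio, monitoring, deaths = simulate_chain()
    print("ratio vs deaths:     ", round(pearson(ratio, deaths), 2))
    print("ratio vs monitoring: ", round(pearson(ratio, monitoring), 2))
    print("monitoring vs deaths:", round(pearson(monitoring, deaths), 2))
```

In data built this way, the primary correlation (ratio and deaths) is negative, and so is the correlation between the mediating variable and the outcome, while the ratio and monitoring move together. Finding all three in a real study would not prove the chain, but each one makes the causal story harder to dismiss.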
Observations relating to various primary and higher order hypotheses may be quantitative or qualitative. Qualitative observations on their own are seldom sufficient to test a theory and make reliable predictions. But measurement without a search for mechanisms – without representation / theory building – is effete. The practical value of science depends on ‘induction’ – making predictions over time and space. Such predictions across contexts require judgement, and such judgement cannot gain purchase without an understanding of how an intervention might work. Putting these thoughts together (the thesis, antithesis, synthesis idea and the need for induction), we end up with a ‘realist’ epistemology – the idea here is to make careful observations, interpret them according to the scientific canon, and then represent the theory, that is, the underlying causal mechanisms. In such a framework, qualitative observations complement quantitative observations and vice versa.
It is because results are sensitive to context that mechanistic / theoretical understanding is necessary. Context refers to things that vary from place to place and that might influence the (relative or absolute) effects of an intervention. It is also plausible to argue that context is more influential with respect to some types of intervention than others. Arguably, context is (even) more important in service delivery research than in clinical research. In that case, one might say that understanding mechanisms is even more important in service delivery research than in clinical research. At the (absolute) limit, if B always follows A, then sound predictions may be made in the absence of an understanding of mechanisms – the Sun was known to always come up in the East, even before rotation of the Earth was discovered. But scientific understanding requires more than just following the numbers. A chicken may be too quick to predict that a meal will follow egg-laying just because that has happened on 364 consecutive days, while failing to appreciate the underlying socioeconomic mechanisms that might land her on a dinner plate on the 365th day, in Bertrand Russell’s evocative example.
Moving on from a purely epistemological argument, there is plenty of empirical data to show that many quantitative findings are replicated across a sufficient range of contexts to provide a useful guide to action. Here are some examples. The effects of ‘user fees’ and co-payments on consumption of health care are quite predictable – demand is elastic with respect to price, meaning that a relatively small increase in price, relative to average incomes, suppresses demand. Moreover, this applies irrespective of medical need, and across low- and high-income countries. Audit and feedback as a mechanism to improve the effectiveness of care has consistently positive, but small (about 8% change in relative risk) effects. Self-care for diabetes is effective across many contexts. Placing managers under risk of sanction has a high risk of inducing perverse behaviour when managers do not believe they can influence the outcome. It is sometimes claimed that behavioural / organisational sciences are qualitatively distinct from natural sciences because they involve humans, and humans have volition. Quite apart from the fact that we are not the only animals with volition (we share this feature with other primates and cetaceans), the existence of self-determination does not mean that interventions will not have typical / average effects across groups or sub-groups of people.
The diabetes example, cited above, is particularly instructive because it makes the point that the role of context is amenable to quantitative evaluation – context may have no effect, it may modify an effect (but not vitiate it), it may obliterate an effect, or even reverse the direction of an effect. Tricco’s iconic diabetes study combined over 120 RCTs of service interventions to improve diabetes care (there are now many more studies and the review is being updated). The study shows not just how the effect of interventions varies by intervention type, but also how the intervention effect itself varies by context. It is thus untenable to claim, as some do, that ‘what works for whom, under what circumstances’ is discernible only by qualitative methods. The development economist, Abhijit Banerjee, goes further, arguing that the main purpose of RCTs is to generate unbiased point estimates of effectiveness for use in observational studies of the moderating effect of context on intervention effects.
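To show what a quantitative evaluation of context can look like in its simplest form, here is a sketch of a subgroup comparison: trial results are pooled separately within two contexts by inverse-variance weighting, and the two pooled effects are then compared. This is a deliberately crude, fixed-effect illustration and not the method or the data of the diabetes review; the function names and every number in the example are hypothetical.

```python
import math


def pool_fixed_effect(trials):
    """Inverse-variance pooled effect and its standard error.

    trials: list of (effect_estimate, standard_error) pairs.
    """
    weights = [1.0 / se ** 2 for _, se in trials]
    total = sum(weights)
    pooled = sum(w * effect for w, (effect, _) in zip(weights, trials)) / total
    return pooled, math.sqrt(1.0 / total)


def subgroup_difference(trials_a, trials_b):
    """Difference between two pooled effects and its z statistic."""
    effect_a, se_a = pool_fixed_effect(trials_a)
    effect_b, se_b = pool_fixed_effect(trials_b)
    diff = effect_a - effect_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, diff / se_diff


if __name__ == "__main__":
    # Invented effects (e.g. change in an outcome score) with standard errors.
    context_a = [(-0.50, 0.10), (-0.40, 0.10)]  # hypothetical trials, context A
    context_b = [(-0.10, 0.10), (-0.20, 0.10)]  # hypothetical trials, context B
    print("pooled A:", pool_fixed_effect(context_a))
    print("pooled B:", pool_fixed_effect(context_b))
    print("difference and z:", subgroup_difference(context_a, context_b))
```

A difference that is small relative to its standard error suggests that context has little influence on the effect. A large difference with both pooled effects in the same direction suggests that context modifies the effect without vitiating it, and pooled effects of opposite sign would point to a reversal. A real analysis would need to allow for heterogeneity between trials and for the observational nature of comparisons across contexts.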
We have defined context as all the things that might vary from place to place and that might affect intervention effects. Some people conflate context with how an intervention is taken up / modified in a system. This is a conceptual error – how the intervention is applied in a system is an effect of the intervention and, like other effects, it may be influenced by context. Likewise, everything that happens ‘downstream’ of an intervention as a result of the intervention is a potential effect, and again, this effect may be affected by context. Context includes upstream variables (see Figure 2) and any downstream variable at baseline. All that having been said, it is not always easy to tell whether a change in a downstream variable was caused by the intervention or would have happened anyway (i.e. a temporal effect). Note that a variable such as the nurse-patient ratio may be an intervention in one study (e.g. a study of nurse-patient ratios) and a context variable in another (e.g. a study of an educational intervention to reduce falls in hospital). Context is defined by its role in the inferential cause / effect framework, not by the kind of variable it is.
Figure 2: How to conceptualise the intervention, the effects downstream, and the context.
— Richard Lilford, CLAHRC WM Director
- Hill AB. The environment and disease: Association or causation? Proc R Soc Med. 1965; 58(5): 295-300.
- Topiwala A, Allan C, Valkanova V, et al. Moderate alcohol consumption as risk factor for adverse brain outcomes and cognitive decline: longitudinal cohort study. BMJ. 2017; 357:j2353.
- Lilford RJ. Statistics is Far Too Important to Leave to Statisticians. NIHR CLAHRC West Midlands News Blog. 27 June 2014.
- Russell B. Chapter VI. On Induction. In: The Problems of Philosophy. New York, NY: Henry Holt and Company, 1912.
- Watson SI, Wroe EB, Dunbar EL, Mukherjee J, Squire SB, Nazimera L, Dullie L, Lilford RJ. The impact of user fees on health services utilization and infectious disease diagnoses in Neno District, Malawi: a longitudinal, quasi-experimental study. BMC Health Serv Res. 2016; 16(1): 595.
- Lagarde M & Palmer N. The impact of user fees on health service utilization in low- and middle-income countries: how strong is the evidence? Bull World Health Organ. 2008; 86(11): 839-48.
- Effective Practice and Organisation of Care (EPOC). EPOC Resources for review authors. Oslo: Norwegian Knowledge Centre for the Health Services; 2015.
- Tricco AC, Ivers NM, Grimshaw JM, Moher D, Turner L, Galipeau J, et al. Effectiveness of quality improvement strategies on the management of diabetes: a systematic review and meta-analysis. Lancet. 2012; 379: 2252–61.
- Lilford RJ. Discontinuities in Data – a Neat Statistical Method to Detect Distorted Reporting in Response to Incentives. NIHR CLAHRC West Midlands News Blog. 1 September 2017.
- Pawson R & Tilley N. Realistic Evaluation. London: Sage, 1997.
- Banerjee AV & Duflo E. The Economic Lives of the Poor. J Econ Perspect. 2007; 21(1): 141-67.
- Lilford RJ, Chilton PJ, Hemming K, Girling AJ, Taylor CA, Barach P. Evaluating policy and service interventions: framework to guide selection and interpretation of study end points. BMJ. 2010; 341: c4413.