Evidence-Based Guidelines and Practitioner Expertise to Optimise Community Health Worker Programmes

The rapid increase in scale and scope of community health worker (CHW) programmes highlights a clear need for guidance to help programme providers optimise programme design. A new World Health Organization (WHO) guideline in this area [1] is therefore particularly welcome, and provides a complement to existing guidance based on practitioner expertise.[2] The authors of the WHO guideline undertook an overview of existing reviews (N=122 reviews with over 4,000 references included), 15 separate systematic reviews of primary studies (N=137 studies included), and a stakeholder perception survey (N=96 responses). The practitioner expertise report was developed following a consensus meeting of six CHW programme implementers, a review of over 100 programme documents, a comparison of the standard operating procedures of each implementer to identify areas of alignment and variation, and interviews with each implementer.

The volume of existing research, in terms of the number of eligible studies included in each of the 15 systematic reviews, varied widely: from no studies for the review question "Should practising CHWs work in a multi-cadre team versus in a single-cadre CHW system?" to 43 studies for "Are community engagement strategies effective in improving CHW programme performance and utilization?". Across the 15 review questions, only two could be answered with "moderate" certainty of evidence (the remainder were "low" or "very low"): "What competencies should be included in the curriculum?" and "Are community engagement strategies effective?". Only three review questions attracted a "strong" recommendation (as opposed to "conditional"): those on remuneration (remunerate CHWs financially), contracting agreements (give CHWs a written agreement) and community engagement (adopt a variety of strategies). There was also a "strong" recommendation not to use marital status as a selection criterion.

The practitioner expertise report provided recommendations in eight key areas and included a series of appendices with examples of selection, supervision and performance management tools. Across the 18 design elements, there was alignment across the six implementers for 14; variation for two (accreditation – although it is recommended that all CHW programmes include accreditation – and CHW:population ratio); and general alignment but one or more outliers for two (career advancement – although supported by all implementers – and supply chain management practices).

There was general agreement between the two documents in terms of the design elements that should be considered for CHW programmes (Table 1), although not including an element does not necessarily mean that the report authors do not think it is important. In terms of the specific content of the recommendations, the practitioner expertise document was generally more precise; for example, on the frequency of supervision the WHO recommend "regular support" while practitioners recommend "at least once per month". The practitioner expertise report also included detail on selection processes as well as selection criteria: not just what to select for, but how to put this into practice in the field. Both reports rightly highlight the need for programme implementers to consider all of the recommendations within their own local contexts; one size will not fit all. Both also highlight the need for more high-quality research. We recently found no evidence of the predictive validity of the selection tools used by Living Goods to select their CHWs,[3] although these tools are included as exemplars in the practitioner expertise report. Given the lack of high-quality evidence available to the WHO report authors, (suitably qualified) practitioner expertise is vital in the short term, and this should now be used in conjunction with the WHO report findings to agree priorities for future research.

Table 1: Comparison of design elements included in the WHO guideline and Practitioner Expertise report


— Celia Taylor, Associate Professor

References:

  1. World Health Organization. WHO guideline on health policy and system support to optimize community health worker programmes. Geneva, Switzerland: WHO; 2018.
  2. Community Health Impact Coalition. Practitioner Expertise to Optimize Community Health Systems. 2018.
  3. Taylor CA, Lilford RJ, Wroe E, Griffiths F, Ngechu R. The predictive validity of the Living Goods selection tools for community health workers in Kenya: cohort study. BMC Health Serv Res. 2018; 18: 803.

Re-thinking Medical Student Written Assessment

“Patients do not walk into the clinic saying ‘I have one of these five diagnoses. Which do you think is most likely?’” (Surry et al., 2017)

The predominant form of written assessment for UK medical students is the 'best of five' multiple choice question (Bo5). Students are presented with a clinical scenario (usually information about a patient), a lead-in or question such as "which is the most likely diagnosis?", and a list of five possible answers, only one of which is unambiguously correct. Bo5 questions are incredibly easy to mark, particularly in the age of computer-read answer sheets (or even computerised assessment). This is critical when results must be turned round, ratified and fed back to students in a timely manner. Because Bo5s are relatively quick to answer (UK medical schools allow a median of 72 seconds per question, compared with short answer or essay questions, for which at least 10 minutes per question would be allowed), an exam comprising Bo5 questions can cover a broad sample of the curriculum. This helps to improve the reliability of the exam: a student's grade is not contingent on 'what comes up in the exam', so would have been similar had a different set of questions covering the same curriculum been used. Students not only know that their (or others') scores are not dependent on what came up, but are also reassured that they would get the same score regardless of who (or what) marked their paper. There are no hawk/dove issues in Bo5 marking.
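
To see the coverage argument in numbers, consider a hypothetical three-hour paper; the exam length here is an assumption for illustration, but the per-question timings come from the figures above:

```python
# Number of items that fit in a hypothetical 180-minute written paper
SECONDS = 180 * 60
bo5_items = SECONDS // 72            # at the median 72 seconds per Bo5
essay_items = SECONDS // (10 * 60)   # at 10 minutes per short answer/essay
print(bo5_items, essay_items)        # 150 vs 18: far denser curriculum sampling
```

With roughly eight times as many items, the Bo5 paper samples the curriculum far more densely, which is what drives its reliability advantage.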

On the other hand, Bo5 questions are notoriously difficult to develop. The questions used in the Medical Schools Council Assessment Alliance (MSCAA) Common Content project, in which questions are shared across UK medical schools to enable passing standards for written finals exams to be compared,[1] go through an extensive review and selection process prior to inclusion (the general process for MSCAA questions is summarised by Melville et al. [2]). Yet the data are returned for analysis with comments such as "There is an assumption made in this question that his wife has been faithful to the man" or "Poor distractors – no indication for legionella testing". But perhaps the greatest problem with Bo5 questions is how poorly they represent clinical practice. As the quotation at the top of this blog implies, patients do not come with a list of five possible pathologies, diagnoses, important investigations, treatment options, or management plans. While a doctor would often formulate such a list (e.g. a differential diagnosis) before determining the most likely or appropriate option, such formulation requires considerable skill. We all know that assessment drives learning, so by using Bo5s we may be inadvertently hindering students from developing the full set of clinical reasoning skills required of a doctor. There is certainly evidence that students use test-taking strategies such as elimination of implausible answers and clue-seeking when sitting Bo5-based exams.[3]

A new development in medical student assessment, the Very Short Answer question (VSA), therefore holds much promise. It shifts some of the academic/expert time from question development to marking but, by exploiting computer-based assessment technology, does so in a way that is not prohibitive given the turn-around times imposed by institutions. The VSA starts with the same clinical scenario as a Bo5. The lead-in changes from "Which is…?" to "What is…?" and is followed by a blank space into which students type between one and five words. A pilot of the VSA format showed that the list of acceptable answers for a question could be finalised by a clinical academic in just over 90 seconds for a cohort of 300 students.[4] With the finalised list automatically applied to all students' answers, there are again no concerns regarding hawk/dove markers that would threaten the exam's acceptability to students. While more time is required per question with VSAs than with Bo5s, the internal consistency of VSAs in the pilot was higher for the same number of questions,[4] so it should be possible to find a compromise between exam length and curriculum coverage that does not jeopardise reliability. The major gain with VSA questions is in clinical validity; these questions are more representative of actual clinical practice than Bo5s, as reported by the students who participated in the pilot.[4]
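
The marking workflow described above – pool the cohort's typed answers, have a clinical academic sign off a list of acceptable variants, then apply that list to every script automatically – can be illustrated with a minimal Python sketch. The function names and normalisation rules are illustrative assumptions, not the actual pilot software:

```python
import re
from collections import Counter

def normalise(answer: str) -> str:
    """Lower-case, trim and collapse whitespace so trivial variants match."""
    return re.sub(r"\s+", " ", answer.strip().lower())

def distinct_answers(responses: list[str]) -> Counter:
    """Pool a cohort's free-text answers for the reviewer to accept or reject."""
    return Counter(normalise(r) for r in responses)

def mark(response: str, accepted: set[str]) -> bool:
    """Apply the finalised list of acceptable answers to one script."""
    return normalise(response) in accepted

# Illustrative cohort responses to "What is the most likely diagnosis?"
responses = ["Pulmonary embolism", "pulmonary  embolism", "PE", "pneumonia"]
print(distinct_answers(responses))             # reviewer sees each variant once
accepted = {"pulmonary embolism", "pe"}        # list finalised by the academic
print([mark(r, accepted) for r in responses])  # [True, True, True, False]
```

Because the reviewer only ever judges distinct variants rather than individual scripts, per-question review time grows slowly with cohort size, which is broadly consistent with the ~90 seconds per question reported in the pilot.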

To produce more evidence around the utility of VSAs, the MSCAA is conducting a large-scale pilot of VSA questions with final-year medical students across the UK this autumn. The pilot will compare student responses and scores on Bo5 and VSA questions delivered electronically and will assess the feasibility of online delivery using the MSCAA's own exam delivery system. A small-scale 'think aloud' study will run alongside the pilot to compare students' thought processes as they attempt Bo5 and VSA questions. This work will provide an initial test of the hypothesis that gains in clinical reasoning validity could be achieved with VSAs, as students are forced to think 'outside the list of five'. There is strong support for the pilot from UK medical schools, so the results will have good national generalisability and may help to inform the design of the written component of the UK Medical Licensing Assessment.

We would love to know what others, particularly PPI representatives, think of this new development in medical student assessment.

— Celia Taylor, Associate Professor

References:

  1. Taylor CA, Gurnell M, Melville CR, Kluth DC, Johnson N, Wass V. Variation in passing standards for graduation‐level knowledge items at UK medical schools. Med Educ. 2017; 51(6): 612-20.
  2. Melville C, Gurnell M, Wass V. #5CC14 (28171) The development of high quality Single Best Answer questions for a national undergraduate finals bank. [Abstract] Presented at: The International Association for Medical Education AMEE 2015; 2015 Oct 22; Glasgow. p. 372.
  3. Surry LT, Torre D, Durning SJ. Exploring examinee behaviours as validity evidence for multiple‐choice question examinations. Med Educ. 2017; 51(10): 1075-85.
  4. Sam AH, Field SM, Collares CF, et al. Very-short-answer questions: reliability, discrimination and acceptability. Med Educ. 2018; 52(4): 447-55.

The reliability of ethical review committees

I recently submitted the same application for ethical review for a multi-country study to three ethical review panels, two of which were overseas and one in the UK. The three panels together raised 19 points to be addressed before full approval could be given. Of these 19 points, just one was raised by two committees and none was raised by all three. Given CLAHRC WM’s methodological interest in inter-rater reliability and my own interests in the selection and assessment of health care students and workers, I was left pondering a) whether different ethical review committees consistently have different pass/fail thresholds for different ethical components of a proposed research study; and b) whether others have had similar experiences (we would welcome any examples of either convergent or divergent decisions by different ethical review committees).

Let me explain with two examples. One point raised was the need for formal written client consent during observations of Community Health Workers' day-to-day activities. We had argued that, because the field worker would only be observing the actions of the Community Health Worker and not the client, formal written client consent was not required; instead, informal verbal consent would be requested and the field worker would withdraw if the client did not wish them to be present. The two overseas committees both required formal written client consent, but the UK committee was happy with our justification. On the other hand, the UK committee did not think we had provided sufficient reassurance of how we would protect the health and safety of the field worker as they conducted the observations, which could involve travelling alone to remote rural communities. The two overseas committees, however, considered our original plans for ensuring the field worker's health and safety sufficient.

What are the potential implications if different ethical review committees have different "passing standards"? As with pass/fail decisions in selection and assessment, there could be false positives or false negatives if studies are reviewed by "dove-ish" or "hawk-ish" committees respectively. As with selection and assessment, a false positive is probably the more concerning of the two: a study is given ethical clearance when ethical issues that would concern most other committees have not been raised and addressed. Although it is probably very rare that a study never gets ethical approval, a false negative decision would mean that the research team is required to make potentially costly and time-consuming amendments that most other committees would consider excessive. I have no experience on the "other side" of an ethical review committee, but I expect there must be some consideration of balancing the need for the research findings against the potential ethical risks to participants and the research team.

Two interesting research questions arise. The first is to examine how ethical review committees make their decisions and set passing standards for research studies. A study of this nature in undergraduate medical education is currently ongoing: Peter Yates at Keele University is qualitatively examining how medical schools set their standards for finals examinations. The second is to explore the extent of the difference in passing standards across ethical review committees, by asking a sample of committees to each review a set of identical applications and to compare their decisions. A similar study in undergraduate medical education investigated differences in passing standards for written finals examinations across UK medical schools.[1] To avoid significant bias due to the Hawthorne effect, the ethical review committees would really need to be unaware that they were the subjects of such research. This, of course, raises a significant ethical dilemma with respect to informed consent and deception. Therefore it is not known whether such a study would be given ethical approval (and if so, by which committees?).

— Celia Taylor, Associate Professor

Reference:

  1. Taylor CA, Gurnell M, Melville CR, Kluth DC, Johnson N, Wass V. Variation in passing standards for graduation‐level knowledge items at UK medical schools. Med Educ. 2017; 51(6): 612-20.

Publishing Health Economic Models

It has increasingly become de rigueur – if not necessary – to publish the primary data collected as part of clinical trials and other research endeavours. In 2015, for example, the British Medical Journal stipulated that a pre-condition of publication of all clinical trials was a guarantee to make anonymised patient-level data available on reasonable request.[1] Data repositories – those from which data can be requested, such as the Yoda Project, and those from which data can be downloaded directly, such as Data Dryad – provide a critical service for researchers wanting to make their data available and transparent. The UK Data Service also provides access to an extensive range of quantitative and, more recently, qualitative data from studies focusing on matters relating to society, economics and populations. Publishing data enables others to replicate and verify (or otherwise) original findings and, potentially, to answer additional research questions and add to knowledge in a particularly cost-effective manner.

At present, there is no requirement for health economic models themselves to be published. The ISPOR-SMDM Good Research Practices Statement advocates publishing sufficient information to meet its goals of transparency and validation.[2] In terms of transparency, the Statement notes that this should include sufficiently detailed documentation "to enable those with the necessary expertise and resources to reproduce the model". The need to publish the model itself is specifically rejected, with the following justification: "Building a model can require a significant investment in time and money; if those who make such investments had to give their models away without restriction, the incentives and resources to build and maintain complex models could disappear". This justification is relatively hard to defend for "single-use" models that are not intended to be reused. Although the benefits would be more limited, publishing such models would still be useful if, say, a decision-maker facing a different cost structure wanted to evaluate the cost-effectiveness of a specific intervention in their own context. Publication of any economic model would also allow for external validation, which is likely to be stronger than internal validation (which could be considered marking one's own homework).

The most significant benefits of publication are likely to arise from "general" or "multi-application" models, because those seeking to adapt, expand or develop the original model would not have to build it from scratch, saving time and money (a process that would be facilitated by publication of the original model's technical documentation). Yet it is for these models that not publishing gives developers a competitive advantage in any further funding bids in which a similar model is required. This confers partial monopoly status in a world where winning grant income is becoming ever more critical. However, I like to believe most researchers also want to maximise the health and wellbeing of society: an aim rarely achieved by monopolies. The argument for publication gets stronger when society has paid (via taxation) for the development of the original model. It is also possible that the development team benefits from publication through increased citations and even the now much-sought-after "impact". For example, the QRISK2 calculator used to predict cardiovascular risk is available online and its companion paper [3] has earned Julia Hippisley-Cox and colleagues almost 700 citations.

Some examples of published economic models exist, such as a costing model for selection processes for speciality training in the UK. While publication of more – if not all – economic models is not an unrealistic aim, it is also necessary to respect intellectual property rights. We welcome your views on whether existing good practice for transparency in health economic modelling should be extended to include the model itself.

— Celia Taylor, Associate Professor

References:

  1. Loder E, Groves T. The BMJ requires data sharing on request for all trials. BMJ. 2015; 350: h2373.
  2. Eddy DM, Hollingworth W, Caro JJ, et al. Model transparency and validation: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force–7. Med Decis Making. 2012; 32(5): 733-43.
  3. Hippisley-Cox J, Coupland C, Vinogradova Y, et al. Predicting cardiovascular risk in England and Wales: prospective derivation and validation of QRISK2. BMJ. 2008; 336(7659): 1475-82.

Do they think we’re stupid? The rise of statistical manipulitis and preventive measures

If there is one thing that the campaigns on the EU Referendum have taught us, it’s how the same set of data can be used to generate statistics that support two completely opposing points of view. This is beautifully illustrated in a report in the Guardian newspaper.[1] While the research community (amongst others) might accuse the campaigners of misleading the public and lament the journalists who sensationalise our findings, we are not immune from statistical manipulitis. To help control researchers’ susceptibility to statistical manipulitis, compulsory registration of trial protocols had to be instigated,[2] yet years later compliance remained poor: the majority of registered trials for which reporting of results within one year of completion was mandated failed to do so.[3] Furthermore, reporting alone provides insufficient public protection against the symptoms of statistical manipulitis. As highlighted in a previous blog, and in one of Ben Goldacre’s Bad Science blogs,[4] researchers have been known to change primary endpoints, or select which endpoints to report. A full aetiology of statistical manipulitis is beyond the scope of this blog, although Maslow’s belief that esteem (incorporating achievement, status, dominance and prestige) precedes self-actualisation (incorporating the realisation of one’s actual personal potential) provides an interesting starting point.[5] Whatever the causative mechanism, statistical manipulitis is not its only adverse consequence. For example, some professional athletes may stretch the principles underlying Therapeutic Use Exemptions to enable them to legally use substances on the World Anti-Doping Agency’s banned list, such as testosterone-based creams to treat saddle-soreness, when not all physicians would consider the athlete’s symptoms sufficiently severe to justify their use.[6]

We can also think of statistical manipulitis as pushing its victims across a balanced scale to the point at which the statistics presented become too contrived to be believed. Which side in the EU Referendum debate has travelled further from equilibrium is a moot point. While important gains could be had if those engaged with the debate knew the point at which the public’s scale is balanced, watching them succumb has injected some much-needed entertainment. The increased awareness of statistical manipulitis resulting from the debate has also provided an open door for those involved with public engagement with science to help move that tipping point and reduce the expected value of manipulation. To do so, the public need the tools and confidence to ask questions about political, scientific and other claims, as now being facilitated by the work of CLAHRC WM’s new PPIE Lead, Magdalena Skrybant, in her series entitled Method Matters. The first instalment, on regression to the mean, is featured in this blog.
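
For readers who prefer to see the phenomenon rather than take it on trust, here is a minimal, purely illustrative Python simulation of regression to the mean: units selected because of an extreme first measurement drift back towards the average on remeasurement, with no intervention at all. All the numbers are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

true_level = rng.normal(50, 10, n)          # each unit's underlying level
test1 = true_level + rng.normal(0, 10, n)   # first measurement = truth + noise
test2 = true_level + rng.normal(0, 10, n)   # second, independent measurement

worst = test1 < np.percentile(test1, 10)    # single out the "worst" 10% on test 1
print(test1[worst].mean())                  # ~25: selected for being extreme
print(test2[worst].mean())                  # ~38: halfway back towards 50
```

Because the noise variance equals the true variance here, the selected group recovers, in expectation, exactly half the gap to the overall mean on retesting – a "worst performers" intervention evaluated this way would look effective even if it did nothing.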

Method Matters provides ‘bite size’ explanations to help anyone without a degree in statistics or experience in research methods make sense of the numbers and claims that are bandied about in the media, using examples taken from real life. Certainly, we would hope that through Method Matters, more people will be able to accurately diagnose cases of statistical manipulitis and take relevant precautions.

Writing Method Matters is not an easy task: if each student in my maths class had rated my explanation of each topic, those ratings would vary both within and between students. My challenge was how to maximise the number of students leaving the class uttering those five golden words: “I get it now Miss!” Magdalena faces a tougher challenge – one size does not fit all and, unlike a “live” lesson, she cannot offer multiple explanations or answer questions in real time. However, while I had to convince 30 14-year-olds of the value of trigonometry on a windy Friday afternoon, the epidemic of statistical manipulitis highlighted by the EU Referendum debate has provided fertile ground for Method Matters. Please let us know what you think.

— Celia Taylor, Associate Professor

References:

  1. Duncan P, Gutiérrez P, Clarke S. Brexit: how can the same statistics be read so differently? The Guardian. 3 June 2016.
  2. Abbasi K. Compulsory registration of clinical trials. BMJ. 2004; 329: 637.
  3. Prayle AP, Hurley MN, Smyth AR. Compliance with mandatory reporting of clinical trial results on ClinicalTrials.gov: cross sectional study. BMJ. 2012; 344: d7373.
  4. Goldacre B. The data belongs to the patients who gave it to you. Bad Science. 2008.
  5. McLeod S. Maslow’s Hierarchy of Needs. Simply Psychology. 2007.
  6. Bassindale T. TUE – Therapeutic Use Exemptions or legitimised drug taking? We Are Forensic. 2014.

Do we Need ‘Situations’ to Make a Situational Judgement Test?

Rank the following options in order of their likely effectiveness or the extent to which they reflect ideal behaviour in a work situation.

  1. Make a list of the patients under your care on the acute assessment unit, detailing their outstanding issues, leaving this on the doctor’s office notice board when your shift ends and then leave at the end of your shift.
  2. Quickly go around each of the patients on the acute assessment unit, leaving an entry in the notes highlighting the major outstanding issues relating to each patient and then leave at the end of your shift.
  3. Make a list of patients and outstanding investigations to give to your colleague as soon as she arrives.
  4. Ask your registrar if you can leave a list of your patients and their outstanding issues with him to give to your colleague when she arrives and then leave at the end of your shift.
  5. Leave a message for your partner explaining that you will be 30 minutes late.

How would your ranking change if you knew the following about the situation?

You are just finishing a busy shift on the Acute Assessment Unit (AAU). Your FY1 colleague who is due to replace you for the evening shift leaves a message with the nurse in charge that she will be 15 to 30 minutes late. There is only a 30 minute overlap between your timetables to handover to your colleague. You need to leave on time as you have a social engagement to attend with your partner.

(Example from UKFPO SJT Practice Paper © MSC Assessment 2014, reproduced with permission.)

The use of situational judgement tests (SJTs) for selection into education, training and employment has proliferated in recent years, but there remains an absence of theory to explain why they may be predictive of subsequent performance.[1] The name suggests that the tests assess a candidate’s ability to judge the most appropriate action in challenging work-related situations, implying that the tests must include descriptions of such situations. But your ranking of the possible actions listed above probably did not change much (if at all) once you knew the exact details of the situation, compared to when these had to be deduced from the possible actions themselves. A similar finding was recently reported in a fascinating experiment conducted by Krumm and colleagues,[2] in which volunteers were randomised to complete a teamwork SJT with or without situation descriptions. Those given the situation descriptions scored, on average, just 8.5% higher than those not given them. Of course, consideration of the need for a situation description is only possible for SJTs in a format where possible actions are presented to candidates (commonly known as multiple choice), but this format is generally used in practice as it facilitates marking and scoring.

Krumm et al.’s findings clearly raise doubts as to the intended construct of the test (i.e. the candidate’s judgement of specific situations); yet SJTs are predictive of workplace performance, with correlations of around 0.30 reported in meta-analyses (see, for example, McDaniel et al.[3]). So if an SJT doesn’t actually require a “situation” to enable a useful assessment of a candidate’s likely future performance, then what exactly is the assessment of? Lievens and Motowidlo [4] suggest that it is of general domain knowledge regarding the utility of expressing certain traits, such as agreeableness, based on the knowledge that such traits help to ensure effective workplace performance. The implication of this theory for practice is that SJTs may not need to be particularly specific and could therefore be shared across professions and geographical boundaries, making them a particularly cost-effective selection tool. The implication for research is that we need more evidence on the antecedents of general domain knowledge, such as family background, both as part of theoretical development and to evaluate the fairness of SJTs for selection.

And what if one does actually desire an assessment of situational judgement as opposed to general domain knowledge, since both have independent predictive validity for job performance? Rockstuhl and colleagues suggest that candidates need to be asked for an explicit, open-ended judgement of the situation (e.g. “what are the thoughts, feelings and ideas of the people in the situation?”) rather than what they think is the most appropriate response to it.[5] The nub here is whether including open-ended assessments to enable measurement of situational judgement is cost-effective given their incremental validity over general domain knowledge and the cost of marking responses (with at least two markers required). For the moment we simply note that a rather large envelope would be required for even a rapid assessment of selection utility!

— Celia Taylor, Senior Lecturer

References:

  1. Campion MC, Ployhart RE, MacKenzie Jr WI. The state of research on situational judgment tests: a content analysis and directions for future research. Hum Perform. 2014; 27(4): 283-310.
  2. Krumm S, Lievens F, Hüffmeier J, et al. How “situational” is judgment in situational judgment tests? J Appl Psychol. 2015; 100(2): 399-416.
  3. McDaniel MA, Hartman NS, Whetzel DL, Grubb III WL. Situational judgment tests, response instructions, and validity: a meta‐analysis. Pers Psychol. 2007; 60(1): 63-91.
  4. Lievens F, Motowidlo SJ. Situational judgment tests: From measures of situational judgment to measures of general domain knowledge. Ind Organ Psychol. 2016; 9(1): 3-22.
  5. Rockstuhl T, Ang S, Ng KY, Lievens F, Van Dyne L. Putting judging situations into situational judgment tests: Evidence from intercultural multimedia SJTs. J Appl Psychol. 2015; 100(2): 464-80.

CHWonomics

Watching NoCounter interact with “Aunty” Martha (not their real names) in Mahwaqe, South Africa, and learning about NoCounter’s roles as Martha’s health advocate, personal trainer and medication manager was anything but dismal. So as a dismal scientist, I was fascinated by how Community Health Workers (CHWs) seem to contradict one of our most famous founders, Adam Smith. To help explain one of the concepts for which he would become famous, “the invisible hand”, Smith wrote: “I have never known much good done by those who affected to trade for the public good”.[1]

To consider whether NoCounter and other CHWs are an exception to this statement, there are three questions that need to be considered:

Is the CHW doing good?
Almost all of the available research evidence suggests that CHWs are effective in enhancing the health of their communities,[2] and since the World Health Organization also see CHWs as playing a pivotal role in helping countries achieve health-related Millennium Development Goals,[3] it is most likely that CHWs are “doing good”. In Mahwaqe, we saw how NoCounter helped Martha do the chair yoga exercises that now mean she can walk and explained her medications, which helped Martha understand the importance of adherence.

Is the CHW trading?
NoCounter is giving up her time (working around 50% FTE) and in return receives a stipend from an NGO of around R800 (~£36) per month; as such, she is trading. However, as a maid in South Africa she could earn around R1,200 (~£54) per month for the same hours, so NoCounter does not seem to be receiving the full monetary value of her time. If approximate role equivalence can be assumed, NoCounter’s time is undervalued by a factor of around 8.5 compared to a CHW in the US: a US CHW working for an hour could buy 3.3 McDonald’s Big Macs; NoCounter could buy 0.4.[4] [5] NoCounter is also using her skills and experience to provide care, but economics would describe these as “non-rivalrous” and thus not directly tradable.
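
The purchasing-power arithmetic behind that factor is simple enough to set out explicitly. The Big-Macs-per-hour figures come from the text; the hours assumption (50% of a roughly 40-hour week) is mine and purely illustrative:

```python
# Big Macs purchasable with one hour's pay (figures from the text)
us_chw_big_macs_per_hour = 3.3
nocounter_big_macs_per_hour = 0.4
print(us_chw_big_macs_per_hour / nocounter_big_macs_per_hour)  # 8.25, "around 8.5"

# Implied hourly pay, assuming 50% FTE = 20 hours/week (~87 hours/month)
hours_per_month = 20 * 52 / 12   # ~86.7
print(800 / hours_per_month)     # stipend: ~R9.2 per hour
print(1_200 / hours_per_month)   # maid's wage: ~R13.8 per hour
```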

Is the CHW doing so for the public good or her own self-interest?
Adam Smith might be confused by NoCounter, because her aim doesn’t seem to be wealth maximisation. However, a “utility maximising” economist would argue that NoCounter is making up for not being paid the full monetary value of her time by obtaining utility either from substitutes for money or from directly helping her community.[6] Even if NoCounter obtains utility from the latter, her motivation would still be to do public good. With regards to money substitutes, CHWs may also receive non-monetary incentives such as community respect, housing and access to health care and/or be motivated in their roles via the support of their families.[6] [7] Furthermore, the CHW role is particularly desirable in areas where residents have a high marginal rate of substitution for leisure over consumption, since CHWs do not have to commute to their place of work. Finally, a by-product of NoCounter’s work as a CHW from which she benefits directly is that she lives in a healthier community: by encouraging vaccination of new-borns, for example, she is lowering her own risk of TB.

We cannot be certain of the answer to this last question: the relative importance of the different reasons why CHWs undertake their role for a wage lower than their time appears to be worth. Research in this area is critical given the push to eliminate the under-supply of CHWs.[8] There are also additional pre-conditions – the organisational structure required to implement a successful CHW programme [9] – that must be met before the demand for CHWs can be realised (made “effective”) in practice. Nevertheless, it is critical to determine whether all of the additional CHWs required to meet demand would also offer their labour at a low relative price. This was assumed in a costing exercise for a CHW roll-out programme,[10] but prima facie it contradicts the basic economic theory of demand and supply: an outward shift in the demand for labour should, other things equal, raise its price.

Fortunately for me, economics provides approaches to studying the interaction between monetary and non-monetary incentives with respect to the supply of labour, for example discrete choice experiments, in which CHWs would be asked to choose between a series of pairs of packages of stipend/salary, level of health produced, and non-monetary incentives (see [11] for an example). Such experiments would need to be repeated in (and possibly also within) different countries, since the relative value of “doing good” by volunteering may well differ according to a country’s stage of economic development. Such work would help to provide evidence regarding the sustainability of CHWs as a cadre of health care providers. Here, we hypothesise a U-shaped curve if propensity to volunteer is plotted against GDP per capita.
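
To make the design concrete, here is a minimal sketch of how choices in such an experiment are usually modelled: each package is scored by a linear utility plus a random (Gumbel) taste component, which yields conditional logit choice probabilities. The attribute levels and coefficients below are invented for illustration only:

```python
import numpy as np

rng = np.random.default_rng(7)

# Assumed (illustrative) utility weights for each package attribute
beta = {"stipend": 0.004,      # per rand of monthly stipend
        "health": 0.8,         # per unit of community health produced
        "non_monetary": 0.5}   # respect, housing, access to care, etc.

def choose(package_a: dict, package_b: dict) -> str:
    """Random-utility choice: deterministic utility plus iid Gumbel noise
    gives conditional logit choice probabilities."""
    utilities = {}
    for name, pkg in (("A", package_a), ("B", package_b)):
        utilities[name] = sum(beta[k] * pkg[k] for k in beta) + rng.gumbel()
    return max(utilities, key=utilities.get)

package_a = {"stipend": 800, "health": 1, "non_monetary": 1}   # CHW-like offer
package_b = {"stipend": 1200, "health": 0, "non_monetary": 0}  # maid-like wage
choices = [choose(package_a, package_b) for _ in range(1_000)]
print(choices.count("A") / len(choices))  # share preferring the CHW-like package
```

Fitting the same model to real responses runs this logic in reverse: the estimated coefficients reveal how much stipend a CHW would trade for a given non-monetary incentive (the marginal rate of substitution).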

— Celia Taylor, Senior Lecturer

References:

  1. Smith A. An Inquiry into the Nature and Causes of the Wealth of Nations. London: Strahan and Cadell, 1776.
  2. Perry H, Zulliger R. How Effective are Community Health Workers? An Overview of Current Evidence with Recommendations for Strengthening Community Health Worker Programs to Accelerate Progress in Achieving the Health-related Millennium Development Goals. Baltimore, MD: John Hopkins Bloomberg School of Public Health, 2012.
  3. World Health Organization and Global Health Workforce Alliance. Global Consultation on Community Health Workers. Geneva, Switzerland: World Health Organization, 2010.
  4. Payscale Homepage. 2015.
  5. The Economist. The Big Mac Index. 2015.
  6. Greenspan JA, McMahon SA, Chebet JJ, Mpunga M, Urassa DP, Winch PJ. Sources of community health worker motivation: a qualitative study in Morogoro Region, Tanzania. Hum Resour Health. 2013; 11: 52.
  7. Dambisya YM. A review of non-financial incentives for health worker retention in east and southern Africa. In: EQUINET Discussion Paper Number 44 with ESCA-HC. Loewenson R (Editor). Harare, Zimbabwe: EQUINET, 2007.
  8. One Million Community Health Workers Campaign. One Million Community Health Workers Campaign. 2015.
  9. World Health Organization. Community health workers: What do we know about them? Policy brief. Geneva, Switzerland: World Health Organization, 2007.
  10. McCord GC, Liu A, Singh P. Deployment of community health workers across rural sub-Saharan Africa: financial considerations and operational assumptions. Bull World Health Organ. 2012; 91(4): 244-53B.
  11. Kasteng F, Settumba S, Källander K, Vassall A, inSCALE Study Group. Valuing the work of unpaid community health workers and exploring the incentives to volunteering in rural Africa. Health Policy Plan. 2016; 31(2): 205-16.

Another Day, Another (Badly-Reported) Health Story in the Media…

Recent health issues reported in the British media have included the link between consumption of red and processed meat with an increased risk of cancer and the need for a ‘sugar tax’ to curb the ever-increasing rates of obesity and its associated health problems. These are big, newsworthy issues relating to the effect of diet and lifestyle on health: the World Cancer Research Fund estimate that around 6,000 cases of bowel cancer in the UK could be prevented by reducing consumption of red and processed meat,[1] while a 20p/litre tax on sugar-sweetened beverages could reduce the number of obese adults in the UK by 180,000 according to the Faculty of Public Health.[2]

So one has to feel a little pity for a journalist tasked with writing a piece about a study investigating whether the composition of a mother’s breast milk was associated with infant weight and body composition.[3] The journalist from The Times seemed to approach this task by jumping on the obesity bandwagon; two key quotes from the story are: “A mother’s milk can increase the chance of a child growing up obese” and “A study … identified sugars in breastmilk that heightened a baby’s risk of being overweight by the age of 6 months”. This seemed to fly in the face of almost everything I had ever read about breastfeeding, so I decided to look at the evidence in a bit more detail.

The paper was based on a sample of 25 breastfeeding mothers and their babies; no babies were formula-fed. Outcomes were infant growth (weight and length) and body composition (percentage fat, total fat and lean mass). Whether or not a baby was ‘overweight’ or ‘obese’ was not an outcome. The study authors identified an association between the levels of different human milk oligosaccharides (HMOs) in breast milk and infant weight and body composition, adding to the evidence base regarding the factors influencing a baby’s growth and development. The authors themselves made no direct claim that breastfeeding causes childhood obesity; three separate meta-analyses have, in fact, shown the opposite,[4-6] with the smallest of these including data on almost 30,000 babies.

The journalist’s train of thought may have gone thus:

[Diagram: certain HMOs in breast milk → increased infant weight/body fat → overweight/obesity]

The first step in this chain was identified by the study authors. But was the journalist justified in making the second?

The increase in risk of adulthood obesity given a high weight-for-age percentile in infancy has been known for some time,[7] so the second link is plausible. But can it automatically be inferred from this study? To do so relies on the increases in body fat/fat mass being of such magnitude as to class some of the infants in this study as overweight or obese at six months, and we simply don’t know if this was the case. Instead, it is possible that babies receiving alternative combinations of HMOs to those shown in the diagram were actually underweight, and that those at the upper end of the weight range were still of ‘normal’ weight. We also don’t know how the weights and body compositions of the babies in the study would compare with those of formula-fed babies: even if breast milk containing high levels of certain HMOs did increase the risk of obesity, the risk with such HMOs could still be lower than that from infant formula.

That some HMOs were shown to have a negative relationship with body weight and/or composition seemed to make the journalist even more confused, since the story ended by stating: “However, scientists also found that breast milk could protect against obesity.” The meta-analyses quoted above have demonstrated this, but once again, such a conclusion cannot be drawn from this particular study.

Reporting of current research in the media is invaluable in helping to increase the uptake of research findings, yet the dangerous misinterpretation of the study by Alderete et al. means that I hope the story in The Times (not the research study) was ignored by all who read it.

— Celia Taylor

References:

  1. World Cancer Research Fund. Bowel cancer. 2015. [Online]
  2. Faculty of Public Health. A duty on sugar sweetened beverages. A position statement. 2013. [Online]
  3. Alderete TL, Autran C, Brekke BE, et al. Associations between human milk oligosaccharides and infant body composition in the first 6 mo of life. Am J Clin Nutr. 2015. [ePub].
  4. Arenz S, Rückerl R, Boletzko B, von Kries R. Breast-feeding and childhood obesity – a systematic review. Int J Obesity. 2004; 28: 1247-56.
  5. Owen C, Martin R, Whincup P et al. The effect of breastfeeding on mean body mass index throughout life: a quantitative review of published and unpublished observational evidence. Am J Clin Nutr. 2005; 82: 1298-1307.
  6. Harder T, Bergman R, Kallischnigg G et al. Duration of breastfeeding and risk of overweight: a meta-analysis. Am J Epidemiol. 2005; 162:397-403.
  7. Charney E, Goodman HC, McBride M, et al. Childhood Antecedents of Adult Obesity – Do Chubby Infants Become Obese Adults? N Engl J Med. 1976; 295: 6-9.

The Payback from Improving Availability of Donor Human Milk for Premature Babies

CLAHRC WM is collaborating with the African Population and Health Research Center (APHRC) in the evaluation of donor milk banks in slums (informal settlements) in Kenya. The initiative is led by PATH,[1] which has had considerable success in establishing an altruistic donor service in South Africa. The donor milk is supplied to hospital wards caring for premature infants.

There is excellent evidence that donor human milk is superior to ‘formula’ in babies whose mothers are unable to express breast milk. As a result of passive immunity, and also because it has nutritional properties that formula is not able to replicate, donor human milk reduces the risk of neonatal infection.[2] In particular, it reduces the dangerous condition of necrotising enterocolitis (NEC).[3][4] NEC can be fatal and may also require surgery that may have permanent consequences – particularly the ‘short bowel syndrome’. The decreased infection risk resulting from use of donor milk is associated with a measurable decrease in mean length of stay.[5]

One concern is that the mothers of infants who receive donor milk may be less likely to initiate breast feeding at a later date for psychological or physiological reasons. The evidence does not bear out this concern and, if anything, these mothers, perhaps inspired by the altruism of the donors, are more likely to breastfeed.[6][7] If so, this may be expected to augment the benefits of donor milk and also reduce the mother’s risk of developing breast cancer later in life.[8]

The benefits do not seem to end there. There is observational evidence, recently reinforced by a substantial study from Brazil,[9] that cognitive ability in later life is improved by human milk. There is a dose-response effect and the results remain after extensive statistical adjustment for confounders. There is also some experimental (RCT) evidence for a beneficial effect on IQ.[10] Improved IQ is correlated with earning power [11] and, we must assume, payback to society.[12]

To summarise the benefits of breastfeeding we offer the following Influence Diagram (Causal Pathway: Model):

[Influence diagram summarising the benefits of donor human milk]

A health economic analysis of the promotion of breastfeeding for older children (not premature infants specifically) found that the intervention ‘dominated’ (i.e. it was both more effective and less costly): the short-term health benefits (less infection) and the contingent cost savings (reduced hospital stays) mean that interventions to promote breastfeeding are cost-saving, not just beneficial for health.[12][13]

There have been two studies of the cost-effectiveness of a donor milk service for premature babies, both of which found that the service was cost-effective. The first was based on a hypothetical baby who was very premature (28 weeks gestational age), rather than on a mean intervention effect observed at the group level,[14] so the calculated benefits might be exaggerated. The second was based on only 175 propensity-scored low birth weight infants.[5] The risk of sepsis decreased with increasing dose of human milk, and total costs obtained from the hospital billing system were lower in proportion to the amount of human milk consumed. However, most infants received some human milk, so the infants could not be divided into control and intervention populations, and the correlation between outcome and volume of donor milk consumed may have been confounded by factors that determine both access to human milk and sepsis, notwithstanding the propensity scoring. Both studies were American.

Working with the colleagues mentioned above, we propose a comprehensive health economic model that takes account of long-term outcomes and that can be populated with country-specific data. The base-case model will be populated with evidence from systematic reviews,[12][13] and we propose to use Bayesian techniques to ‘down-weight’ observational evidence using the Turner and Spiegelhalter method.[15]
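
In its simplest form – a single additive bias term per study, which is a simplification of the full Turner and Spiegelhalter framework – the down-weighting shifts each observational estimate by the expected bias and inflates its variance by the uncertainty about that bias:

```latex
y_i^{\mathrm{adj}} = y_i - \mu_{\delta,i}, \qquad
v_i^{\mathrm{adj}} = v_i + \sigma^2_{\delta,i}
```

Here each study i contributes an estimate y and variance v, and its bias has an elicited mean (mu) and variance (sigma squared). Because meta-analytic weights are the reciprocal of the adjusted variance, studies judged more prone to bias contribute proportionately less to the pooled estimate.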

— Richard Lilford, CLAHRC WM Director
— Celia Taylor, Senior Lecturer

References:

  1. PATH. Models of milk banking in South Africa. Seattle, WA: PATH, 2011.
  2. Arslanoglu S, Ziegler EE, Moro GE. Donor human milk in preterm infant feeding: evidence and recommendations. J Perinat Med. 2010; 38: 347-51.
  3. Lucas A, Cole TJ. Breast milk and neonatal necrotising enterocolitis. Lancet. 1990; 336: 1519-23.
  4. Quigley M, McGuire W. Formula versus donor milk for feeding preterm or low birth weight infants. Cochrane Database Sys Revs. 2014; 4: CD002971.
  5. Patel AL, Johnson TJ, Engstrom JL, Fogg LF, Jegier BJ, Bigger HR, Meier PP. Impact of early human milk on sepsis and health-care costs in very low birth weight infants. J Perinatol. 2013; 33: 514-9.
  6. Arslanoglu S, Moro GE, Bellù R, Turoli D, De Nisi G, Tonetto P, Bertino E. Presence of human milk bank is associated with elevated rate of exclusive breastfeeding in VLBW infants. J Perinat Med. 2013; 41(2): 129-31.
  7. Vázquez-Román S, Bustos-Lozano G, López-Maestro M, et al. Clinical impact of opening a human milk bank in a neonatal unit. An Pediatr (Barc). 2014; 81(3): 155-60.
  8. Collaborative Group on Hormonal Factors in Breast Cancer. Breast cancer and breastfeeding: collaborative reanalysis of individual data from 47 epidemiological studies in 30 countries, including 50 302 women with breast cancer and 96 973 women without the disease. Lancet. 2002; 360: 187-95.
  9. Victora CG, Horta BL, Loret de Mola C, Quevedo L, Pinheiro RT, Gigante DP, Gonçalves H, Barros FC. Association between breastfeeding and intelligence, educational attainment, and income at 30 years of age: a prospective birth cohort study from Brazil. Lancet Glob Health. 2015; 3(4): e199-205.
  10. Horta BL, Victora CG. Long-term effects of breastfeeding: a systematic review. Geneva, Switzerland: World Health Organization, 2013.
  11. US Environmental Protection Agency. The benefits and costs of the clean air act, 1970 to 1990, appendix G, lead benefits analysis. Washington, DC: Environmental Protection Agency, 1997.
  12. Renfrew MJ, Pokhrel S, Quigley M, et al. Preventing disease and saving resources: the potential contribution of increasing breastfeeding rates in the UK. UNICEF. 2012.
  13. Kramer MS, Kakuma R. Optimal duration of exclusive breastfeeding. Cochrane Database Sys Revs. 2012; 8: CD003517.
  14. Arnold LDW. The Cost-effectiveness of Using Banked Donor Milk in the Neonatal Intensive Care Unit: Prevention of Necrotizing Enterocolitis. J Hum Lact. 2002; 18(2): 172-7.
  15. Turner RM, Spiegelhalter DJ, Smith GCS, Thompson SG. Bias modeling in evidence synthesis. J R Stat Soc Ser A. 2009; 172: 21–47.

A Low-Value Paper on the Assessment of High-Value Care

The provision of ‘high-value’ care (HVC) – balancing health outcomes from treatment against financial costs, potential adverse events and the disutility of undergoing treatment – has become increasingly important in a time of austerity and patient-centred care. A recent paper in the Annals of Internal Medicine therefore set out to establish whether a subset of single-best answer questions used as part of a wider knowledge-based examination could be an effective tool for assessing trainees’ knowledge of HVC.[1] Thirty-eight existing questions were identified as assessing domains of HVC and the scores of around 18,000 residents were analysed for evidence of validity. We are not informed of the extent to which any of the measures of HVC used in the study were reliable, although an examination including just 38 questions is unlikely to have sufficient reliability to be used to classify trainees.
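
The reliability point can be illustrated with the Spearman-Brown prophecy formula, which predicts how reliability changes with test length. The full-exam reliability and item count below are assumptions for illustration, not figures from the paper:

```python
def spearman_brown(rho_full: float, n_full: int, n_sub: int) -> float:
    """Predicted reliability of a test shortened (or lengthened) by a factor k."""
    k = n_sub / n_full
    return k * rho_full / (1 + (k - 1) * rho_full)

# If a hypothetical 300-item examination has reliability 0.90, a 38-item
# subscore would be expected to have reliability of only about 0.53
print(round(spearman_brown(0.90, 300, 38), 2))
```

A reliability near 0.5 falls well short of the 0.8 or more conventionally expected for high-stakes classifications, hence the concern about using the subscore to classify trainees.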

The analysis proceeds at the level of the training programme (N=362), and no data on the variability of trainees’ scores within a programme, compared with that between programmes, are provided. We are informed that the HVC subscore correlates positively at programme level with total examination scores, although no quantitative measure of the correlation is provided, and any such measure would inevitably be biased upwards by the inclusion of the HVC subscore in the total score. Despite the authors’ statement that their findings “support the importance of the training environment in fostering HVC” (p. 737), there was poor agreement between programme quartiles based on HVC subscores and a measure of hospital care intensity (a quadratic weighted kappa of 0.17, calculated from the data provided). Evidence of validity at trainee level could have been provided, as survey data on self-reported HVC behaviours were also collected, but these too were analysed at programme level (with no consistent relationship identified across the eight HVC behaviours included in the survey).
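
For readers who want to check figures like this for themselves, a quadratic weighted kappa can be computed from a published cross-tabulation in a few lines. The 4×4 quartile table below is invented for illustration; it is not the paper’s data:

```python
import numpy as np

def quadratic_weighted_kappa(table) -> float:
    """Agreement between two k-level ratings, penalising disagreements
    by the squared distance between categories."""
    table = np.asarray(table, dtype=float)
    k = table.shape[0]
    obs = table / table.sum()                          # observed joint proportions
    exp = np.outer(obs.sum(axis=1), obs.sum(axis=0))   # expected under independence
    i, j = np.indices((k, k))
    d = (i - j) ** 2 / (k - 1) ** 2                    # quadratic disagreement weights
    return 1 - (d * obs).sum() / (d * exp).sum()

# Invented example: programme quartile on the HVC subscore (rows) against
# quartile on hospital care intensity (columns)
table = [[30, 25, 20, 15],
         [25, 25, 22, 18],
         [20, 23, 25, 22],
         [15, 17, 23, 35]]
print(round(quadratic_weighted_kappa(table), 2))
```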

Research in medical education – of which assessment is a key domain – is often seen as the poor bedfellow of clinical research. Guidance on reporting and interpreting validity evidence is available [2] and needs to be followed if medical education research is to raise its profile.

— Celia Taylor, Senior Lecturer

References:

  1. Ryskina KL, Korenstein D, Weissman A, Masters P, Alguire P, Smith CD. Development of a High-Value Care Subscore on the Internal Medicine In-Training Examination Assessing Residents’ Knowledge of HVC. Ann Intern Med. 2014; 161(10): 733-9.
  2. Downing SM. Validity: on the meaningful interpretation of assessment data. Med Educ. 2003; 37(9): 830-7.