“Patients do not walk into the clinic saying ‘I have one of these five diagnoses. Which do you think is most likely?’” (Surry et al., 2017)
The predominant form of written assessment for UK medical students is the ‘best of five’ multiple choice question (Bo5). Students are presented with a clinical scenario – usually information about a patient – a lead-in or question such as “which is the most likely diagnosis?”, and a list of five possible answers, only one of which is unambiguously correct. Bo5 questions are incredibly easy to mark, particularly in the age of computer-read answer sheets (or even computerised assessment). This is critical when results must be turned around, ratified and fed back to students in a timely manner. Because Bo5s are relatively short (UK medical schools allow a median of 72 seconds per question, compared with short answer or essay questions, for which at least 10 minutes per question would be allowed), an exam comprising Bo5 questions can cover a broad sample of the curriculum. This helps to improve the reliability of the exam: a student’s grade is not contingent on ‘what comes up in the exam’, so it should be similar had a different set of questions covering the same curriculum been used. Students not only know that their (or others’) scores are not dependent on what came up; they are also reassured that they would get the same score regardless of who (or what) marked their paper. There are no hawk/dove issues in Bo5 marking.
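The link between curriculum coverage (the number of questions asked) and reliability can be made concrete with the Spearman–Brown prophecy formula, a standard result from classical test theory included here as background rather than anything from the pilot data:

```latex
\rho^{*} = \frac{k\,\rho}{1 + (k - 1)\,\rho}
```

Here \(\rho\) is the reliability of the current exam and \(k\) is the factor by which the number of questions changes. For example, doubling the length of a paper with reliability 0.7 (\(k = 2\)) would be predicted to raise reliability to \(1.4 / 1.7 \approx 0.82\), which is why a broad sample of short Bo5 items supports a reliable exam.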
On the other hand, Bo5 questions are notoriously difficult to develop. The questions used in the Medical Schools Council Assessment Alliance (MSCAA) Common Content project, where questions are shared across UK medical schools to enable passing standards for written finals exams to be compared, go through an extensive review and selection process prior to inclusion (the general process for MSCAA questions is summarised by Melville et al.). Yet the data are returned for analysis with comments such as “There is an assumption made in this question that his wife has been faithful to the man” or “Poor distractors – no indication for legionella testing”. But perhaps the greatest problem with Bo5 questions is their poor representativeness of clinical practice. As the title of this blog implied, patients do not come with a list of five possible pathologies, diagnoses, important investigations, treatment options, or management plans. While a doctor would often formulate such a list (e.g. a differential diagnosis) before determining the most likely or appropriate option, such formulation requires considerable skill. We all know that assessment drives learning, so by using Bo5s we may be inadvertently hindering students from developing the full set of clinical reasoning skills required of a doctor. There is certainly evidence that students use test-taking strategies such as elimination of implausible answers and clue-seeking when sitting Bo5-based exams.
A new development in medical student assessment, the Very Short Answer question (VSA), therefore holds much promise. It shifts some of the academic/expert time from question development to marking, but, by exploiting computer-based assessment technology, does so in a way that is not prohibitive given the turnaround times imposed by institutions. The VSA starts with the same clinical scenario as a Bo5. The lead-in changes from “Which is…?” to “What is…?” and this is followed by a blank space. Students are required to type between one and five words in response. A pilot of the VSA-style question showed that the list of acceptable answers for a question could be finalised by a clinical academic in just over 90 seconds for a cohort of 300 students. With the finalised list automatically applied to all students’ answers, again there are no concerns regarding hawk/dove markers that would threaten the exam’s acceptability to students. While more time is required per question when using VSAs compared to Bo5s, the internal consistency of VSAs in the pilot was higher for the same number of questions, so it should be possible to find an appropriate compromise between exam length and curriculum coverage that does not jeopardise reliability. The major gain with the use of VSA questions is in clinical validity; these questions are more representative of actual clinical practice than Bo5s, as was reported by the students who participated in the pilot.
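To make the marking step concrete: once the clinical academic has finalised the list of acceptable answers, that list is applied automatically to every student’s typed response. The sketch below, in Python, shows one way such matching could work; the function names and the normalisation rules are my own assumptions for illustration, not the logic of the MSCAA’s actual delivery system.

```python
def normalise(answer: str) -> str:
    """Lower-case, trim and collapse whitespace so that trivial typing
    differences do not affect marking (an assumed rule, for illustration)."""
    return " ".join(answer.lower().split())


def mark_vsa(student_answer: str, accepted_answers: set[str]) -> bool:
    """Return True if the student's free-text answer matches any entry
    on the examiner-finalised list of acceptable answers."""
    normalised_accepted = {normalise(a) for a in accepted_answers}
    return normalise(student_answer) in normalised_accepted


# Hypothetical accepted-answer list for a single VSA item
accepted = {"myocardial infarction", "MI", "heart attack"}

print(mark_vsa("  Myocardial  Infarction ", accepted))  # True
print(mark_vsa("angina", accepted))                     # False
```

Because the same finalised list is applied identically to every script, any answer the examiner adds during review is credited to all students at once, which is what removes the hawk/dove marking concern.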
To produce more evidence around the utility of VSAs, the MSCAA is conducting a large-scale pilot of VSA questions with final-year medical students across the UK this autumn. The pilot will compare student responses and scores on Bo5 and VSA questions delivered electronically and assess the feasibility of online delivery using the MSCAA’s own exam delivery system. A small-scale ‘think aloud’ study will run alongside the pilot, to compare students’ thought processes as they attempt Bo5 and VSA questions. This work will provide an initial test of the hypothesis that gains in clinical reasoning validity could be achieved with VSAs, as students are forced to think ‘outside the list of five’. There is strong support for the pilot from UK medical schools, so the results will have good national generalisability and may help to inform the design of the written component of the UK Medical Licensing Assessment.
We would love to know what others, particularly PPI representatives, think of this new development in medical student assessment.
— Celia Taylor, Associate Professor
- Taylor CA, Gurnell M, Melville CR, Kluth DC, Johnson N, Wass V. Variation in passing standards for graduation‐level knowledge items at UK medical schools. Med Educ. 2017; 51(6): 612-20.
- Melville C, Gurnell M, Wass V. #5CC14 (28171) The development of high quality Single Best Answer questions for a national undergraduate finals bank. [Abstract] Presented at: The International Association for Medical Education AMEE 2015; 2015 Oct 22; Glasgow. p. 372.
- Surry LT, Torre D, Durning SJ. Exploring examinee behaviours as validity evidence for multiple‐choice question examinations. Med Educ. 2017; 51(10): 1075-85.
- Sam AH, Field SM, Collares CF, et al. Very‐short‐answer questions: reliability, discrimination and acceptability. Med Educ. 2018; 52(4): 447-55.