In a previous News Blog [1] we discussed endpoint measurement for trials of wound infection, where the observers were not ‘blinded’ (not masked to the intervention group). Such an approach is simply not adequate, even if the observers use ‘strict criteria’.[1] This is because of subjectivity in the interpretation of the criteria and, more especially, because of reactivity. Reactivity means that observers are influenced, albeit subconsciously, by knowledge of the group to which patients have been assigned (treatment or not). Such reactivity is an important source of bias in science.[2]
We are proposing a trial of a promising treatment for recurrent leprosy ulcers that we would like to carry out in the Leprosy Mission Hospital in Kathmandu, Nepal. We plan to conduct an efficacy trial of a regenerative medicine (RM) technique in which a paste is made from the buffy coat layer of the patient’s own blood. This paste is applied to the ulcer surface at the time of dressing change. The only difference in treatment between the trial arms will be whether or not the RM technique is applied when the regular change of wet dressing is scheduled. We will measure, amongst other things, the rate of healing of the ulcers, the time to complete healing, and the time to discharge from hospital.
Patients will be randomised so as to avoid selection bias. Because the primary endpoints in this efficacy trial are measured during the hospital stay (and patients seldom discharge themselves), loss to follow-up should be limited. For endpoints based on ulcer size, our main concern is therefore bias in outcome measurement.
One obvious way to get around the problem of reactivity is to use a well-described method in which truly masked observers, typically based off-site, measure ulcer size using photographs. Measurements are calibrated against a sterile metal ruler positioned at the level of the ulcer, which standardises the measurement irrespective of the distance of the camera. The measurement can be performed manually, automated by computer, or both. But is that enough? It has been argued that bias can still arise, not at the stage where photographs are analysed, but rather at the earlier stage of photograph acquisition. This argument holds that, again perhaps subconsciously, those responsible for taking the photograph can affect its appearance. The question of blinding / masking of medical images is a long-standing topic of debate.
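To illustrate why a ruler at the level of the ulcer makes the measurement independent of camera distance, here is a minimal Python sketch. It is not the trial's measurement software: the function names, the traced outline and the pixel coordinates are all hypothetical, and it assumes the ruler lies in the same plane as the ulcer and that an observer has traced the ulcer edge as a list of points.

```python
import math

def mm_per_pixel(ruler_p1, ruler_p2, ruler_length_mm):
    """Scale factor from two ruler marks given in pixel coordinates."""
    pixel_distance = math.dist(ruler_p1, ruler_p2)
    if pixel_distance == 0:
        raise ValueError("ruler marks must be two distinct points")
    return ruler_length_mm / pixel_distance

def polygon_area_px(outline):
    """Area enclosed by a traced outline (shoelace formula), in square pixels."""
    twice_area = 0.0
    n = len(outline)
    for i in range(n):
        x1, y1 = outline[i]
        x2, y2 = outline[(i + 1) % n]  # wrap around to close the outline
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def ulcer_area_mm2(outline, ruler_p1, ruler_p2, ruler_length_mm):
    """Convert a traced outline to square millimetres using the ruler scale."""
    scale = mm_per_pixel(ruler_p1, ruler_p2, ruler_length_mm)
    return polygon_area_px(outline) * scale ** 2

# Hypothetical example: the same ulcer photographed close up and from
# further away. In both photographs the two ruler marks are 20 mm apart.
near = ulcer_area_mm2([(0, 0), (200, 0), (200, 100), (0, 100)],
                      (100, 50), (300, 50), 20.0)
far = ulcer_area_mm2([(0, 0), (100, 0), (100, 50), (0, 50)],
                     (50, 25), (150, 25), 20.0)
print(round(near, 1), round(far, 1))
```

In this made-up example the ulcer occupies a quarter as many pixels in the more distant photograph, yet the two calls return the same area, because the ruler shrinks in the image by the same proportion. The sketch does nothing about the acquisition-stage concerns discussed above, such as lighting or wound preparation.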
The ‘gold standard’ method is to have an independent observer arrive on the scene at the appropriate time to make the observations (and take any photographs). Such a method would be expensive (and logistically challenging over long distances). So, an alternative would be to deploy such an observer for a random sub-set of cases. This method may work but it has certain disadvantages. First, it would be tricky to choreograph as it would disrupt the work flow in settings such as that described above. Second, to act as a method of audit, it would need to be used alongside the existing method (making the method still more ‘unwieldy’). Third, the method of preparing the wound would still lie in the hands of the clinical team, and arguably still be subject to some sort of subconscious ‘manipulation’ (unless the observer also provided the clinical care). Fourth, given that agreement would not be exact between observers, a threshold would have to be agreed regarding the magnitude of difference between the standard method and the monitoring method that would be regarded as problematic. Fifth, it would not be clear how to proceed if such a threshold were crossed. While none of these problems is necessarily insurmountable, together they are sufficiently problematic to invite consideration of further methods. What might augment or replace standard third-party analysis of photographic material?
Here we draw our inspiration from a trial of surgical technique in the field of ophthalmology/orbital surgery.[3] In this trial, surgical operations were video-taped in both the intervention and control groups. With permission of patients, we are considering such an approach in our proposed trial. The vast majority of ulcers are on the lower extremities, so patients’ faces would not appear in the videos. The videos could be framed so that staff are not individually identifiable, and footage could be redacted if and where necessary. We would like to try to develop a method whereby the photography is directed in real time by remote video link, but pending the establishment of such a link, we propose that each procedure (dressing change) is video-taped, adhering to certain guidelines (for example, shooting in high definition, moving the camera to give a full view of the limb from all sides, ensuring adequate lighting, including a measurement instrument in the shot, etc.). We propose that measurements are made both in the usual way (from mobile phone photographs), and from ‘stills’ obtained from the video-tapes. Each could be scored by two independent, off-site observers. Furthermore, the videos could be used for an ‘ethnographic’ analysis of the process, to surface any material differences between the trial arms in lighting, preparation of ulcer sites, time spent on various stages of the procedure and photograph acquisition, and so on.
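One way to compare measurements from the mobile phone photographs with those from the video stills, and to check them against a pre-agreed threshold of the kind discussed earlier, is a Bland-Altman style analysis of agreement. The Python sketch below is an illustration under our own assumptions, not a method specified for the trial: the function names are hypothetical, the threshold values are arbitrary, the ulcer areas are invented for the example, and the 95% limits assume the differences are roughly normally distributed.

```python
import statistics

def limits_of_agreement(method_a, method_b):
    """Return (bias, lower, upper): mean difference and 95% limits of agreement."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need paired measurements for at least two ulcers")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

def exceeds_threshold(method_a, method_b, max_abs_diff):
    """True if either limit of agreement falls outside +/- max_abs_diff."""
    _, lower, upper = limits_of_agreement(method_a, method_b)
    return lower < -max_abs_diff or upper > max_abs_diff

# Invented ulcer areas (square millimetres) for five ulcers, each measured
# once from a phone photograph and once from a video still.
photo_areas = [200.0, 150.0, 310.0, 95.0, 240.0]
still_areas = [204.0, 148.0, 305.0, 99.0, 238.0]

bias, lower, upper = limits_of_agreement(photo_areas, still_areas)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
print(exceeds_threshold(photo_areas, still_areas, max_abs_diff=10.0))
```

The same calculation could be applied to the scores of the two off-site observers. It quantifies disagreement but does not say how to proceed if the threshold is crossed; that remains a question for the trial protocol.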
Would this solve the problem? After all, local clinicians would still prepare the ulcer site for re-bandaging and, insofar as they may be able to subconsciously manipulate the situation, this risk has not been eliminated. However, we hypothesise that the video will work a little like a black box on an aeroplane; it cannot stop things happening, but it provides a powerful method to unravel what did happen. The problem we believe we face is not deliberate maleficence, but subtle bias at the most. We think that by using the photographic approach, in accordance with guidelines for such an approach,[4] we already mitigate the risk of outcome measurement bias. We think that by introducing a further level of scrutiny, we reduce the risk of bias still further. Can the particular risk we describe here be reduced to zero? We think not. Replication remains an important safeguard in the scientific endeavour. We now turn our attention to this further safeguard.
Leprosy ulcers are far from the only type of ulcer to which the regenerative medicine solution proposed here is relevant. Diabetic ulcers, in particular, are similar to leprosy ulcers in that loss of neural sensation plays a large part in both. We have argued elsewhere that much can be learned by comparing the results of the same treatment across different disease classes. In due course we hope to collaborate with those who care for other types of skin ulcer, so that we can compare and contrast results and also advance the methodology. Together we will seek the optimal method to limit expense and disruption of workflow while minimising outcome bias from reactive measurements.
— Richard Lilford, CLAHRC WM Director
References:
1. Lilford RJ. Before and After Study Shows Large Reductions in Surgical Site Infections Across Four African Countries. NIHR CLAHRC West Midlands News Blog. 10 August 2018.
2. Kazdin AE. Unobtrusive measures in behavioral assessment. J Appl Behav Anal. 1979; 12: 713–24.
3. Feldon SE, Scherer RW, Hooper FJ, et al. Surgical quality assurance in the Ischemic Optic Neuropathy Decompression Trial (IONDT). Control Clin Trials. 2003; 24: 294–305.
4. Bowen AC, Burns K, Tong SY, Andrews RM, Liddle R, O’Meara IM, et al. Standardising and assessing digital images for use in clinical trials: a practical, reproducible method that blinds the assessor to treatment allocation. PLoS One. 2014; 9(11): e110395.