Use of a new task-relevant test to assess the effects of shift work and drug labelling formats on anesthesia trainees’ drug recognition and confirmation
- Cheeseman, J.F., Webster, C.S., Pawley, M.D.M. et al. Can J Anesth/J Can Anesth (2011) 58: 38. doi:10.1007/s12630-010-9404-3
Drug administration errors occur in every aspect of clinical practice. Using a novel task-relevant Medication Recognition and Confirmation Test (MRCT), we investigated the effects on performance of working night and day shifts and of different drug labelling formats.
Anesthesia trainees (n = 18) participated in one of two experiments during an 8-12 hr day shift and an 8-12 hr night shift. In Experiment-1 (n = 10), we compared standardized colour-coded labels with pictures of ampoules. In Experiment-2 (n = 8), we compared colour-coded labels with black and white labels. Sleep was measured with wrist actigraphy during both day and night shift runs over seven to eight days. The MRCT outcome measures were reaction times and drug errors.
In the two experiments, colour-coded labels were recognized (and therefore selected) more quickly than pictures of conventional ampoules (mean difference 332 msec, 95% confidence interval (CI) 242-422 msec; P < 0.0001) and faster than black and white labels (mean difference 96 msec, 95% CI 46-146 msec; P < 0.0001). Participants obtained less sleep while working night shifts than while working day shifts (mean difference 57 min, 95% CI 0:15-1:39 hr; P = 0.013). Mean confirmation reaction times were slower during night shifts than during day shifts (mean difference 60 msec, 95% CI 1-120 msec; P = 0.048). No differences in error rates were observed between shifts or among drug label types.
Label format influenced recognition and confirmation reaction times to representations of drugs in this study, and we found some evidence to suggest that performance is better during day shifts than during night shifts. The task-relevant test evaluated here may have further application in measuring performance in the wider clinical setting.
Error in drug administration pervades every aspect of clinical practice, notably in anesthesia.1-5 There is reasonable agreement about the basic principles of safe administration of medications6 but relatively little empirical evidence to underpin these principles. Studying medication error in clinical settings is difficult, because errors, although unacceptably frequent, do not occur often enough to lend themselves to study in prospective trials of moderate size. In recent years, the effects of fatigue and time of day have received considerable attention as causes of performance impairment.7-10 Interns in the United States working “traditional” 24-hr shifts had twice as many failures in concentration as those working a 17-hr shift,11 and during the night shift, they made 36% more serious medical errors.12 This impairment in performance was attributed to the effects of sleep deprivation and the circadian clock, which promotes sleep at night.
As a result of these concerns, the work hours of junior doctors in New Zealand have been restricted since 1985 to a maximum of 17 continuous hours and a total of 72 hr per week. Many of the studies of the effects of fatigue are based on tests, such as the psychomotor vigilance test, that seem to have little to do with clinical medicine in general or anesthesia in particular. Perhaps unsurprisingly, Howard et al.13 demonstrated a significant decrement in the performance of anesthesiologists in the psychomotor vigilance test after 25-30 hr of sleep deprivation, but they demonstrated no change in their performance in a high-fidelity human patient simulator. Conventional cognitive tests (CogState™) have been used to demonstrate reduced performance of anesthesiologists after working several consecutive night shifts,14 but these tests are relatively time-consuming to complete during work periods and, arguably, do not relate directly to the task of drug administration. Legible drug labels that are colour-coded according to international standards have been adopted widely as an aid to avoiding drug error,15 and with some overall success, they have been incorporated into a multi-faceted system designed to reduce error in anesthesia.16 However, the influence of drug label format alone on drug administration error has not been definitively elucidated.
Consequently, we have developed a novel task-relevant test to measure the times and error rates of simulated drug recognition and confirmation. In this study, we describe the test and examine two null hypotheses: 1) Speed of drug recognition and accuracy of drug selection, as measured by the new test in terms of reaction time and error rate, do not differ among three different presentations of drug identity; and 2) These measures do not differ between day and night shifts.
Setting, ethics, and consent
Approval was obtained from the Auckland Regional Ethics Committee (ref 97/032) and the University of Auckland Human Participants’ Ethics Committee (ref 2005/051). Written consent was obtained from all participants. The study was conducted at Auckland City Hospital (ACH), Auckland, New Zealand from 2005 to 2008. The ACH is a 760-bed tertiary hospital associated with the University of Auckland and credentialed for training by the Australian and New Zealand College of Anaesthetists (ANZCA).
Eligible participants were anesthesia trainees allocated to one of four sub-departments at ACH. The ACH participates in a regional training scheme that has slightly more than 90 anesthesia trainees at any one time, all of whom are allocated to this sub-department (15 each time) at some point in their training. These trainees are rostered for three or four consecutive night shifts at a time, interspersed with variable (but longer) periods of day shifts. Data were collected from each participant over a seven-day study period that spanned both day and night shifts. Different participants took part in Experiment-1 (undertaken in the earlier part of the study) and Experiment-2 (undertaken in the later period).
Participants wore actigraphs (Actiwatch-L, Cambridge Neurotechnology Inc, Cambridge, UK) to track their time spent sleeping, which was cross-checked against sleep diaries.
The medication recognition and confirmation test
The computer-based Medication Recognition and Confirmation Test (MRCT) was designed with reference to established criteria for neurocognitive testing.17 The user interface is a touch screen that records the time and accuracy of responses to linked pairs of questions: Question One involves recognition and selection, and Question Two involves confirmation, of the identity of pictures of medication labels in any desired format.
All participants were familiarized with the MRCT and practiced it before commencing the study. During training and before each trial, participants were instructed to strive for maximum speed and accuracy. Testing was undertaken on the third shift in a consecutive series of day or night shifts, respectively. Each participant acted as his/her own control. The starting shift run (either nights or days) was determined by the roster, and each test battery was conducted three times (at the start, middle, and end of the shift). To avoid any possible differences in processor speed, the same dedicated desktop computer was used to collect all MRCT data. All testing took place in a quiet room adjacent to the operating rooms. Participants were instructed to use their dominant hand to respond to every question.
Experiment-1: colour-coded labels vs ampoules
Format A: Pictures of standardized labels colour-coded according to a New Zealand and Australian standard19 (identical to standards registered in the United States, Canada, and the United Kingdom),20-22 with both the class and the name of the drug displayed in a large clear font (e.g., “Opioid” and “Fentanyl”). Less salient details, including those required by regulation, were displayed in smaller fonts. These labels incorporated a barcode and were designed in variants for use on prefilled syringes, on ampoules as flag labels, and as user-applied labels for syringes15 (Safer Sleep, Auckland, NZ).
Format B: Photographs of conventional ampoules with care taken to produce the best possible resolution and clarity (Fig. 2). All ampoules were represented 1:1 with their physical equivalents.
Experiment-2: colour-coded labels vs black and white labels
Format A: The same as Experiment-1, Format A.
Format B: Similar to Experiment-1, Format A, but with black and white labels instead of colour (Fig. 2).
Statistical analysis was undertaken with SPSS v.17 (SPSS, Chicago, IL, USA) and the statistical package R, version 2.8.23 Average daily sleep gained during each shift type was calculated using Sleep Analysis Software v. 5.0 (Cambridge Neurotechnology Inc, Cambridge, UK).
For each participant, the median reaction time (MRT) in milliseconds (msec) was calculated for the responses to each recognition question in each format (Question One) and each confirmation question in each format (Question Two). The MRT was chosen purposely after examining the distributions of data, because it was a valid measure of central tendency10 and because transforming the data would make interpretation of the reaction times less intuitive. The total number of errors was calculated for each participant for each of Questions One and Two for each format.
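The per-participant summary described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code (the source reports using SPSS and R). The record layout and field names (`participant`, `question`, `label_format`, `rt_msec`, `correct`) are hypothetical, and the sketch assumes that reaction times from incorrect responses are included in the median, which the source does not specify.

```python
from collections import defaultdict
from statistics import median


def summarise_trials(trials):
    """Return {(participant, question, label_format): (MRT in msec, total errors)}.

    Assumption: every response contributes to the median, including errors.
    """
    times = defaultdict(list)
    errors = defaultdict(int)
    for trial in trials:
        key = (trial["participant"], trial["question"], trial["label_format"])
        times[key].append(trial["rt_msec"])
        if not trial["correct"]:
            errors[key] += 1
    return {key: (median(rts), errors[key]) for key, rts in times.items()}


# Hypothetical trial records, invented for illustration (not study data).
trials = [
    {"participant": "P01", "question": 1, "label_format": "A", "rt_msec": 900, "correct": True},
    {"participant": "P01", "question": 1, "label_format": "A", "rt_msec": 950, "correct": True},
    {"participant": "P01", "question": 1, "label_format": "A", "rt_msec": 2400, "correct": False},
    {"participant": "P01", "question": 1, "label_format": "B", "rt_msec": 1200, "correct": True},
    {"participant": "P01", "question": 1, "label_format": "B", "rt_msec": 1300, "correct": True},
]

summary = summarise_trials(trials)
```

In the invented data, the single slow response of 2400 msec does not shift the Format A median, which illustrates why a median can be preferred to a mean for skewed reaction-time data.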
Label type comparisons
When examining differences in label types, separate analyses were performed on each of the following outcome measurements: recognition MRT, confirmation MRT, and total errors within each experiment. Outcomes were measured for each participant within a shift-type at a shift-stage, so tests were based on the paired differences between the label types. Participants were modelled as levels of a random factor, as they were considered to have been sampled from the population of anesthetic trainees in the region’s training program. Linear mixed-effects models with random intercepts were fitted to the distribution of differences using a normal error structure (identity link-function) and maximum likelihood parameter estimation. The model initially included the following fixed-effect covariates: 1) shift-type (night vs day), 2) shift-stage (start, middle, or end), and 3) sleep duration (minutes). Covariates were dropped from the model if there was no evidence of an effect at the alpha = 5% level.
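One way to write the initial model described above is shown below. The notation is an assumption made for illustration and does not appear in the source: $d_{ijk}$ is the paired difference between label formats for participant $i$ in shift-type $j$ at shift-stage $k$, the covariates are coded as indicator variables with day shift and start of shift as reference levels, and $u_i$ is the random intercept for participant $i$.

```latex
\begin{align}
d_{ijk} &= \beta_0
  + \beta_1\,\mathrm{night}_{j}
  + \beta_2\,\mathrm{middle}_{k}
  + \beta_3\,\mathrm{end}_{k}
  + \beta_4\,\mathrm{sleep}_{ij}
  + u_i + \varepsilon_{ijk}, \\
u_i &\sim N(0, \sigma_u^2), \qquad
\varepsilon_{ijk} \sim N(0, \sigma^2).
\end{align}
```

Under this formulation, once covariates with no evidence of an effect are dropped, the intercept $\beta_0$ estimates the mean paired difference between the two label formats.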
When outliers were present, we used the non-parametric Wilcoxon signed-rank test or the sign test (the latter test was used when data had multiple ties) on the paired differences (averaged across shift-stage for each participant), as indicated in the text. We obtained the exact confidence intervals (CI) for the median difference of the population by the algorithm described by Bauer,24 and we designated the level of significance, alpha, as 5%.
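As a minimal sketch of the sign test mentioned above, the following Python function computes a two-sided exact P value from paired differences. It is illustrative only and not the authors' implementation. It assumes the conventional treatment in which zero differences (ties) are discarded, and it does not reproduce Bauer's confidence interval algorithm. The example differences are invented.

```python
from math import comb


def sign_test(differences):
    """Two-sided exact sign test on paired differences.

    Assumption: zero differences (ties) are dropped before testing.
    Under the null hypothesis, positive and negative signs are equally likely.
    """
    positives = sum(1 for d in differences if d > 0)
    negatives = sum(1 for d in differences if d < 0)
    n = positives + negatives
    if n == 0:
        return 1.0  # no informative pairs
    k = min(positives, negatives)
    # Probability of observing k or fewer of the rarer sign under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical per-participant differences in error counts (not study data).
example_differences = [2, 1, 0, 3, -1, 1, 0, 2, 1, 4]
p_value = sign_test(example_differences)
```

Only the direction of each difference is used, which is what makes the test robust to outliers at the cost of some statistical power.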
A sample size estimate was not undertaken because no prior data were available for this purpose. Instead, we recruited as many participants as practical during the time the first author was completing his PhD.
When examining differences in shift-type (day vs night), the analysis was carried out on the combined Format A data (i.e., coloured label data) amalgamated from both experiments using a paired Student’s t test.
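The paired comparison described above can be sketched as follows. This is a hedged illustration rather than the analysis code used in the study. The function name, argument names, and example values are hypothetical, and the critical value of the t distribution is supplied by the caller because the Python standard library has no t distribution (2.776 is the two-sided 95% value for 4 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(day, night, t_crit):
    """Paired Student's t test summary for night minus day differences.

    Returns (mean difference, t statistic, degrees of freedom, confidence interval).
    t_crit is the caller-supplied critical value for the desired confidence level.
    """
    diffs = [n - d for d, n in zip(day, night)]
    count = len(diffs)
    mean_diff = mean(diffs)
    std_err = stdev(diffs) / sqrt(count)  # standard error of the mean difference
    t_stat = mean_diff / std_err
    interval = (mean_diff - t_crit * std_err, mean_diff + t_crit * std_err)
    return mean_diff, t_stat, count - 1, interval


# Hypothetical per-participant confirmation MRTs in msec (not study data).
day_mrt = [500, 520, 480, 510, 530]
night_mrt = [540, 530, 520, 560, 550]
mean_diff, t_stat, df, interval = paired_t(day_mrt, night_mrt, t_crit=2.776)
```

Because each participant contributes one difference, between-participant variation in baseline speed cancels out, which is the advantage of the paired design.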
Fourteen trainees consented for Experiment-1 and ten trainees consented for Experiment-2. No trainee who was approached at the outset declined to participate, but four and two trainees, respectively, were then unable to participate because of rostering exigencies. At the end of the study, usable MRCT data were available from 18 subjects, all aged < 40 yr (ten in Experiment-1: five male, five female; and eight in Experiment-2: three male, five female); actigraphy data were available from 12 of these participants (five in Experiment-1 and seven in Experiment-2). Missing data were attributable to lack of compliance in completing the tests and equipment failure.
Experiment-1: colour-coded labels vs ampoules
There were 38 errors, 17 errors with the colour-coded label format and 21 errors with the ampoule format (with no evidence of a difference in proportions; P = 0.47, mixed-model analysis). There was no evidence of the effects of shift-stage order on confirmation MRT or error rate. No drug stood out as being more error prone than the others.
Experiment-2: colour-coded labels vs black and white labels
The mean recognition MRT for colour-coded labels was 96 msec faster (95% CI 46-146 msec; P < 0.0001, mixed-model analysis) than that for black and white labels (Fig. 4B). Mean confirmation times for colour-coded labels were 16 msec faster (95% CI 6-26 msec; P = 0.0023, mixed-model analysis) than those for black and white labels (Fig. 6B). There was no evidence of the effects of shift-stage on recognition or confirmation MRT.
A total of 76 errors were made, 42 with the colour-coded label format and 34 with the black and white label format (P = 0.07, Wilcoxon signed-rank test).
Day vs night comparisons: analysis of amalgamated colour-coded label format data (Format A in both experiments)
Mean confirmation MRT was 60 msec slower (95% CI 1-120 msec; P = 0.048, paired Student’s t test) during the night than during the day (Fig. 7). Analysis of recognition and confirmation errors together revealed no evidence of a difference between shifts (P = 0.29, mixed-model analysis).
Fifty-nine errors were made with colour-coded labels, 28 during day shifts and 31 during night shifts.
In Experiment-1, photographs of colour-coded drug labels were selected more quickly than photographs of ampoules, and they were also confirmed more quickly as the required drug when subsequently presented. However, there was no difference in error rates between these two formats.
In Experiment-2, selection and confirmation were also quicker with colour-coded labels than with equivalent black and white labels. Forty-two errors were made with the colour-coded labels vs 34 with black and white labels (P = 0.07).
In Experiment-1, the difference in selection times diminished as the shift progressed from early to mid to late. This result may have been an effect of fatigue, but no other effects from shift phase were identified, so it is difficult to speculate on this theory. We tested for effects of order and found none, suggesting no important learning component in the test.
Participants obtained almost one hour less sleep while on night shifts than while on day shifts (Fig. 3). Both drug recognition time (P = 0.06) and drug confirmation time (P = 0.048) were slightly slower during night shifts, although only the difference in confirmation time reached significance at the 5% level. However, we found no differences between the night and day shifts with regard to error rates.
The value of these data lies in the clinical relevance of the simulated task. The participants in this study were participating in standard rostered shifts, and the only deviation from normal clinical practice was the requirement to perform our tests. We did not control for caffeine, alcohol consumption, or sleep medication. We were surprised that almost all participants obtained considerably less than the eight hours of sleep considered usual25 on both day and night shifts. The relatively small difference between day and night shifts in hours of sleep (approximately one hour) can be attributed to the fact that participants often had the opportunity for sleep while on night duty. Recent data suggest that continuously restricting sleep to six hours each night can have a deleterious effect on vigilance equivalent to 24-hr acute sleep deprivation,26,27 and self-awareness of reduced performance tends to plateau.27,28 The trainees who were studied in our hospital had a tendency to fall within this category, at least on night shifts, despite the relatively rigorous restrictions on work hours in New Zealand. Impairment of anesthesiologists’ performance due to loss of sleep has been shown previously in a similar New Zealand setting.10
Taken collectively, these findings suggest an advantage in favour of the labels that were studied (characterized by a standardized layout, a large clear font, and various other features)15 over most of the varied presentations manufacturers use on their ampoules. The results also provide limited support for the use of colour-coding, although the advantage conferred by this was restricted to speed of selection and confirmation rather than to increased accuracy.
A limitation of this study lies in its relatively small number of participants (the logistics of collecting data from practicing clinicians during actual periods of work at different times of night and day were challenging) and the fact that they all were a part of a single regional training scheme. Therefore, our findings may not apply beyond the population studied. The strength of this study is its paired design - each individual acted as his/her own control. Also, we were able to collect data from many repetitions of the test.
Questions relating to the presentation of drug information may be easier to study than those relating to time of day. We took great care to make the photographs as clear as possible, but results could be different if the actual ampoules were to be tested against the actual labels. It is also possible that performance in a clinical context could differ from that reported here. On the one hand, performance could be worse if there were distractions; on the other hand, perhaps participants would perform better if they were making decisions with consequences for real patients. The absolute reaction and confirmation times in this study are relatively arbitrary, and one might argue that differences of 330 msec (recognition times) or 40 msec (confirmation times) between colour-coded labels and ampoules are not clinically relevant in themselves. However, these times reflect the mental effort required to complete the tests,13 and we therefore infer that quicker reaction times to standardized drug representations correspond with easier recognition of the drugs. Over a large number of administrations, quicker reaction times are likely to facilitate the avoidance of errors. We were surprised not to find a difference in error rates between label types. Despite being instructed to complete the tests as quickly and accurately as possible, the participants in the study knew we were investigating drug error and may have sacrificed speed in favour of accuracy.
The value of colour-coding in promoting safety in drug administration continues to be debated.29,30 The fact that there is room for improvement in the legibility and distinctiveness of some ampoules is less controversial2 – but it remains very difficult to persuade regulators or manufacturers to act on the well recognized problem of so-called “look-alike sound-alike”31 drugs.
The experiments described here employ a simple form of simulation. Complexity in simulation can be varied, from relatively simple task-oriented simulations such as this through to immersive high-fidelity simulation.32 By capitalizing on the strengths of each level (in this case low cost and ease of multiple tests), the same questions can be studied in different ways. Thus, the methods presented here could be used to obtain the information needed to perfect the presentation of drugs, i.e., investigating such features as font size, tall man lettering, and the use of auditory as well as visual presentation of key information.15 One could also test for error rates between drugs that look the same rather than using random substitutions as we did in the present trial. Once this has been accomplished, more complex simulation methods could then be used to verify that these findings apply in the wider context of administering a full anesthetic6 before finally confirming them in a clinical setting.
In conclusion, we have presented a novel task-relevant test of performance in relation to drug administration. We have shown that there are advantages to having legible standardized drug labels that are colour-coded for class of drug, and we have added to the evidence that shift effects (and likely fatigue associated with sleep restriction) influence performance in an anesthetic context.
We are grateful for the participation of the clinical staff from the Department of Anaesthesia, Auckland City Hospital, Auckland District Health Board, New Zealand. We sincerely thank Miss Milica Milovanovic for her assistance with data collection.
J.F.C. received a University of Auckland Doctoral Scholarship and a New Zealand Vice-Chancellors’ Committee William Georgetti Scholarship to conduct this work. Miss Milovanovic received a University of Auckland summer studentship stipend. This project was funded through an equipment grant from the Maurice and Phyllis Paykel Trust. A.F.M. and C.S.W. have a financial interest in Safer Sleep, Auckland, New Zealand.