Abstract
Purpose
Assessment of physician performance has traditionally been a subjective process. An anaesthesia simulator could be used for a more structured and standardized evaluation, but its reliability for this purpose is not known. We sought to determine whether observers witnessing the same event in an anaesthesia simulator would agree in their rating of anaesthetist performance.
Methods
The study had the approval of the research ethics board. Two one-hour clinical scenarios were developed, each containing five anaesthetic problems. For each problem, a rating scale defined the appropriate score (no response to the situation: score = 0; compensating intervention, defined as physiological correction: score = 1; corrective treatment, defined as definitive therapy: score = 2). Videotape recordings, for assessment of inter-rater reliability, were generated through role-playing, with each of the two scenarios recorded three times, resulting in a total of 30 events to be evaluated. Two clinical anaesthetists, uninvolved in the development of the study and the clinical scenarios, reviewed and scored each of the 30 problems independently. The scores produced by the two observers were compared using the kappa statistic of agreement.
Results
The raters were in complete agreement on 29 of the 30 items. There was excellent inter-rater reliability (κ = 0.96, P < 0.001).
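The kappa statistic used above corrects the raw agreement rate for the agreement two raters would reach by chance, given each rater's marginal score frequencies. The sketch below illustrates the calculation with hypothetical item-level scores (the paper does not publish these); 29 of 30 items agree, mirroring the study, but the resulting κ depends on the assumed score distribution and is not intended to reproduce the reported 0.96 exactly.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items scored identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal frequencies,
    # summed over the score categories.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical 0/1/2 scores for 30 problems; the raters disagree on one item.
rater_a = [0] * 5 + [1] * 10 + [2] * 15
rater_b = [0] * 5 + [1] * 9 + [2] * 16
print(round(cohens_kappa(rater_a, rater_b), 3))  # prints 0.945
```

With 29/30 raw agreement, κ stays above 0.9 for any plausible spread of scores across the three categories, which is conventionally interpreted as excellent agreement.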
Conclusion
The use of videotapes allowed the scenarios to be scored by reproducing the same event for each observer. There was excellent inter-rater agreement within the confines of the study. Rating of video recordings of anaesthetist performance in a simulation setting can be used for scoring of performance. The validity of the scenarios and the scoring system for assessing clinician performance has yet to be determined.
Additional information
Supported by a grant from the physicians of Ontario through the Physicians' Services Incorporated Foundation. Dr. Cohen is the recipient of a National Health Scholar Award from Health Canada.
Cite this article
Devitt, J.H., Kurrek, M.M., Cohen, M.M. et al. Testing the raters: inter-rater reliability of standardized anaesthesia simulator performance. Can J Anaesth 44, 924–928 (1997). https://doi.org/10.1007/BF03011962