World Journal of Surgery, Volume 32, Issue 4, pp 548–556

Surgeons’ Non-technical Skills in the Operating Room: Reliability Testing of the NOTSS Behavior Rating System

  • Steven Yule
  • Rhona Flin
  • Nicola Maran
  • David Rowley
  • George Youngson
  • Simon Paterson-Brown



Previous research has shown that surgeons’ intraoperative non-technical skills are related to surgical outcomes. The aim of this study was to evaluate the reliability of the NOTSS (Non-technical Skills for Surgeons) behavior rating system. Based on task analysis, the system incorporates five categories of skills for safe surgical practice (Situation Awareness, Decision Making, Task Management, Communication & Teamwork, and Leadership).


Consultant (attending) surgeons (n = 44) from five Scottish hospitals attended one of six experimental sessions and were trained to use the NOTSS system. They then used the system to rate consultant surgeons’ behaviors in six simulated operating room scenarios that were presented using video. Surgeons’ ratings of the behaviors demonstrated in each scenario were compared to expert ratings (“accuracy”), and assessed for inter-rater reliability and internal consistency.
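The comparison against expert ratings described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names, the example ratings, and the cut-off of 3 for "acceptable" on the 4-point scale are assumptions made for illustration only.

```python
from statistics import mean

# Assumption: on the 4-point scale, ratings of 3 or 4 count as "acceptable"
# and ratings of 1 or 2 as "unacceptable". The abstract does not state the cut-off.
ACCEPTABLE_THRESHOLD = 3


def mean_abs_difference(ratings, reference):
    """Mean absolute distance, in scale points, between raters and the reference rating."""
    return mean(abs(r - reference) for r in ratings)


def dichotomous_accuracy(ratings, reference, threshold=ACCEPTABLE_THRESHOLD):
    """Proportion of raters who land on the same side of the
    acceptable/unacceptable cut-off as the reference rating."""
    reference_acceptable = reference >= threshold
    return mean(1 if (r >= threshold) == reference_acceptable else 0 for r in ratings)


# Hypothetical ratings from five raters for one category in one scenario.
example_ratings = [3, 4, 2, 3, 3]
example_reference = 3

print(mean_abs_difference(example_ratings, example_reference))
print(dichotomous_accuracy(example_ratings, example_reference))
```

In this made-up example, two of the five raters are one scale point away from the reference, and four of the five agree with the reference on whether the behavior was acceptable.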


The NOTSS system had a consistent internal structure. Although raters had minimal training, rating “accuracy” for acceptable/unacceptable behavior was above 60% for all categories, with a mean difference of 0.67 scale points from reference (expert) ratings (on a 4-point scale). For inter-rater reliability, the mean values of within-group agreement (rwg) were acceptable for the categories Communication & Teamwork (.70) and Leadership (.72), but below a priori criteria for the other categories. Intra-class correlation coefficients (ICC) indicated high agreement using average measures (values were .95–.99).
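For readers unfamiliar with the within-group agreement index, the following Python sketch shows the usual single-item form of rwg: one minus the ratio of the observed variance of the ratings to the variance expected if raters responded uniformly at random. For a scale with A options that expected variance is (A² − 1)/12, which is 1.25 for a 4-point scale. The function name and example ratings are hypothetical, and the sketch is not taken from the study.

```python
from statistics import variance


def rwg_single_item(ratings, n_options=4):
    """Single-item within-group agreement index.

    Compares the observed sample variance of the ratings with the variance
    of a discrete uniform distribution over `n_options` scale points.
    """
    expected_variance = (n_options ** 2 - 1) / 12  # 1.25 for a 4-point scale
    observed_variance = variance(ratings)  # sample variance (n - 1 denominator)
    return 1 - observed_variance / expected_variance


# Hypothetical ratings from five raters on a 4-point scale.
print(rwg_single_item([3, 3, 4, 3, 2]))

# Perfect agreement gives an index of 1.
print(rwg_single_item([3, 3, 3, 3]))
```

The index can fall below zero when raters disagree more than random responding would predict. A common convention, not applied in this sketch, is to report such values as zero.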


With the requisite training, the prototype NOTSS system could be used reliably by surgeons to observe and rate surgeons’ behaviors. The instrument should now be tested for usability in the operating room.





The NOTSS system was developed under funding from the Royal College of Surgeons of Edinburgh and NHS Education Scotland. The views presented are those of the authors and should not be taken to represent the position or policy of the funding bodies. The authors thank the surgeons who took part in this study.



Copyright information

© Société Internationale de Chirurgie 2008

Authors and Affiliations

  • Steven Yule (1)
  • Rhona Flin (1)
  • Nicola Maran (2)
  • David Rowley (3)
  • George Youngson (4)
  • Simon Paterson-Brown (5)

  1. School of Psychology, University of Aberdeen, Aberdeen, United Kingdom
  2. Department of Anaesthesia, Royal Infirmary of Edinburgh, Edinburgh, United Kingdom
  3. Department of Orthopaedic and Trauma Surgery, University of Dundee, Ninewells Hospital, Dundee, United Kingdom
  4. Royal Aberdeen Children’s Hospital, Aberdeen, United Kingdom
  5. Department of Surgery, Royal Infirmary Edinburgh, Edinburgh, United Kingdom
