Why do we need video analysis in surgery?

Despite the explosion of new surgical technology in the last 30 years, there had until recently been very little change in the way we assessed surgeons’ intraoperative performance. Video analysis has become integral across multiple industries. High-level athletics invests huge amounts of time and resources into the deconstruction of video footage to better understand the sport and improve performance in key areas. In aviation, hugely complicated data are collected and collated to ensure that passenger safety is never compromised, and this has given the industry a deep understanding of what types of errors lead to adverse outcomes. Despite robust evidence from these and other industries, there has been only minimal innovation and application of existing technologies to surgery. Slowly, this is changing, and the future of video analysis in surgery appears to be bright.

The use of video analysis in surgery has traditionally been best suited to minimally invasive surgery (MIS), because intracorporeal video capture is already needed to perform the procedure [1]. Laparoscopic and endoscopic approaches to what were traditionally open surgical procedures are becoming more and more common as surgical technology evolves, and this gives us a prime opportunity to increase the ‘scope’ of video analysis. In addition to these changes, the technologies surrounding video analysis itself are rapidly becoming more complex. Novel methods of analysis, including motion tracking [2], are becoming more prevalent, and these innovations can potentially allow us to explore the role of intricate psychomotor factors in surgical performance. Crowdsourcing has been suggested as a way of garnering assessments of a given task or skill in a short time period; however, its role in surgical training has not been sufficiently explored in the literature [3].

At a time when health-care systems demand tangible, reportable measurements of the quality of care patients receive [4], we also find ourselves at the outset of the competency-based era in medical education. The two concepts are synergistic: trainees must demonstrate that they are competent in procedural skill, as well as cognitive ability, in order to ensure they will provide optimal care for patients in their future practices [5]. Historically, competency in the surgical trainee has been determined through a variety of methods, from direct intraoperative observation [6] to assessment of skill in the simulated, virtual reality domain [7]. Evaluation through simulation allows comprehensive feedback in a low-stakes environment [8], while intraoperative assessment gives educators the opportunity to see trainees in the ‘real world’ and make more confident judgments on their readiness to perform as solo practitioners [6]. Retrospective analysis of intraoperative video footage allows the analytical techniques employed in simulation to be extrapolated to the clinical world, through robust and objective methods of assessment.

Video analysis is of benefit to training in other ways. When trainees’ surgical and non-technical skills are assessed through video recording, as opposed to live and in-person evaluation, there is a lessening of the Hawthorne Effect, a phenomenon wherein trainees change their behavior in the presence of an examiner or rater. In a systematic review of observation-based assessment, Yanes et al. [9] found that the Hawthorne Effect may artificially raise performance standards in observational studies. Yanes points out that if video recording is done correctly, the risk of the Hawthorne Effect is low, especially in high-risk or crisis scenarios where attention is directed elsewhere.

There are benefits to the rater in this method of assessment. Retrospective analysis of trainees in the operating room has the added benefit of reducing observer bias. While impossible in real-time assessment, video review allows for blinding of both the rater and the trainee. Additionally, the ability to watch intraoperative footage over and over allows for more practical calibration of raters. In the real world, in order to show inter-rater reliability (IRR), multiple assessors need to be present at the time of the performance. Video analysis allows for multiple raters to individually review collected footage, as well as go through cases in greater detail, taking time to ensure that consensus is reached regarding evaluation and decisions about performance, especially in high-stakes assessment.
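As a rough illustration of how agreement between two video raters might be quantified, the sketch below computes Cohen's kappa, one common chance-corrected agreement statistic. It is an assumption made for illustration, not the method of any study cited here: the pass/fail labels and the ratings are hypothetical, and continuous global rating scores are often analyzed with an intraclass correlation coefficient instead.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who scored the same set of recordings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of videos")
    n = len(ratings_a)
    # Observed agreement: fraction of videos given identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical pass/fail judgments by two blinded raters on eight recorded cases.
rater_1 = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "pass"]
rater_2 = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]
kappa = cohens_kappa(rater_1, rater_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

Because the footage is stored, the same calculation can be repeated after a calibration session to check whether the raters' agreement has improved.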

An ideal video recording from the clinical setting should capture as much of the field of view being assessed as possible, as a single, fixed recording location may miss key details of the scenario [9]. An obvious solution to this issue is to position multiple audiovisual collection devices. In the operating room, this may involve the intracorporeal camera footage, a view of the nursing team, a view of the surgical team, and a view of the anesthesia area including the patient vital signs monitors. Dedicating a device to each of these locations in the operating room will prevent key details from being missed. No real-time observer is able to watch four components of the intraoperative environment simultaneously.

Probably the most beneficial aspect of video recording in the operating room is the ability to stop, rewind, and reanalyze key moments of the procedure being assessed. Atul Gawande, in his widely acclaimed article Personal Best in The New Yorker [10], describes his personal experience with the benefits of video review. A surgeon-mentor went through a routine case with Gawande and pointed out aspects ranging from surgical technique to how the patient was draped. Gawande goes on to describe how, with a multi-camera view of the operating room, his surgeon-mentor was able to comment not only on his patient positioning or the length of the incision, but also on the way in which he interacted with his trainees in the operative setting. This is a great example of the potential for video analysis in surgery to allow for coaching not only around surgical technique but also around non-technical skills, an integral part of a surgeon’s skill-set [11].

How is video analysis being used in surgery?

We use video analysis in surgery in two key ways. Firstly, we use video review for assessment of trainee surgeons, and more recently, for peer review of staff surgeons. Video analysis affords easier assessment of surgeon and trainee technical and non-technical skill in both the operating room and the simulation lab. This is possible when stakeholders employ assessment tools that have evidence supporting their reliability and validity. One such rubric devised to evaluate surgical skill is the Objective Structured Assessment of Technical Skill (OSATS) [12], created at the University of Toronto by Dr. Richard Reznick and his team. The OSATS is an example of a global rating scale, meaning the assessor makes judgments about a subject’s abilities according to overarching domains of skill, for example, respect for tissue and instrument handling. Although originally designed for in-person assessment of trainees, it has become a popular mode of assessment in retrospective video review. In a seminal article by Beard and colleagues [13], the ability of surgeons across different levels of skill and experience to perform a saphenofemoral disconnection was assessed using the OSATS. Twenty-eight judges (14 trainee, 14 staff-level) assessed each individual performance in this study. Having this many assessments made on a single performance is only feasible with video recording of procedures and post hoc assessment. Another example of large-scale analysis in trainees comes from De Montbrun et al. [14]. Their study assessed the technical skills of a cohort of first-year general surgery residents across a ten-year span, and this large volume of assessable material allowed their group to confidently set the standard of competency in the tasks they assessed.
By recording technical performance, whether in the operating room or in the laboratory, educators are able to ensure not only that they meet the number of assessments required to answer their research questions, but also that enough judgments are collected to be confident in the establishment of assessment standards.

Understanding the relationship between surgeon error and intraoperative adverse events requires careful review and analysis of surgical video. It is essential to undertake a root-cause analysis approach to adverse surgical events in order to appreciate the events that lead up to an error being committed [15]. Earlier studies looking at surgeon error involved retrospective review of operative notes and morbidity and mortality conferences [16], but this method of data collection does not account for those errors committed intraoperatively that did not result in an adverse event, or did not meet the threshold to be recorded in the surgical note. Bonrath et al. [1] published a novel tool for collation and assessment of surgical error, adverse events, and subsequent rectification of these errors, termed the Generic Error Rating Tool (GERT), shown in Fig. 1. This instrument defines error as ‘the smallest unit of deviation from the intended operative course,’ which may or may not precipitate an adverse event. The tool also captures the nature of the error: whether an incorrect application of force was used, a surgical instrument was incorrectly oriented, or a maneuver was carried out with inadequate visualization. Additionally, the time needed to rectify a committed error is captured by the GERT, which can act as a surrogate for error severity. In their original study, they analyzed 54 laparoscopic Roux-en-Y gastric bypass (LRYGB) procedures using the GERT, and found that in two-thirds of cases, errors were committed that resulted in adverse events requiring some kind of rectification, such as suture repair or hemostasis. The authors discuss the importance of recognizing error in complex surgical procedures as a means of ensuring trainee participation does not jeopardize patient safety. 
They highlight the importance of capturing surgical performance on video, as even routine operations such as the LRYGB can yield important data to be used in surgeon education and quality improvement. The group conducted a further study [17] showing that error, as captured by the GERT, is also a valid method of discriminating between high- and low-skill surgeons, through the inverse correlation of GERT and OSATS scores in a cohort of gynecology staff and trainees performing laparoscopic hysterectomy. In urology, an example of surgical complication analysis comes from Sotelo et al. [18]. This international group of urologic robotic surgeons collated and analyzed intraoperative adverse events occurring during robotic-assisted radical prostatectomy (RARP), with the purpose of quality improvement and patient safety. In the article, they outline a list of potential complications arising from this procedure, and suggest ways these can be avoided.
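To make the idea of an inverse correlation between error counts and global rating scores concrete, the sketch below computes Spearman's rank correlation in plain Python. The choice of statistic is an assumption made for illustration and is not taken from the cited study, and the error counts and OSATS-style totals are hypothetical values, not data from any publication.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least two items")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for six recorded cases (not data from any study).
error_counts = [2, 5, 9, 12, 15, 7]       # errors identified on video review
osats_totals = [33, 28, 22, 24, 14, 30]   # global rating totals for the same cases
rho = spearman_rho(error_counts, osats_totals)
print(f"Spearman rho: {rho:.3f}")
```

A strongly negative coefficient on real data would indicate that cases with more recorded errors tend to receive lower global ratings, which is the pattern the text describes.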

An equally exciting use of video analysis in surgery is the evaluation of non-technical skills. The operating room is a unique ecosystem in the world of healthcare, as it requires seamless, even non-verbal, communication, cooperation, and coordination between medical practitioners with a variety of backgrounds and perspectives. There are multiple methods of assessing non-technical performance in the operating room, and while most of these were designed with the intention of being used for live, intraoperative assessment, they are equally efficacious in video analysis. The NOTSS [19] method of non-technical assessment is a commonly used rubric, employing a global rating scale to evaluate trainees and surgeons across four domains: Situational Awareness, Communication and Teamwork, Decision Making, and Leadership. The tool gives examples of good and bad behavior in each of these categories. Its use in video analysis has been limited, but Hamilton and colleagues [20, 21] demonstrated, in a pre-post designed study, that using video review in crisis scenario training improves team performance. Steven Yule authored a study [22] that again demonstrated the feasibility of the NOTSS as a means of debriefing senior surgical residents on their non-technical skills, and showed that doing this can lead to improved non-technical scores on repeat testing. Pena et al. [23] used NOTSS to demonstrate the utility of video-based analysis and feedback of non-technical skill in a simulation environment.

Drawing inspiration from the world of athletics, recent literature has focused on the use of intraoperative video as a tool to help improve surgical performance, termed ‘coaching’ [10]. A Harvard group, led by Dr. Caprice Greenberg, conducted a study [24] in 2012, in which they offered postoperative coaching sessions to surgeons across procedures of different difficulties. In their article they describe the types of interactions that resulted from this ‘post game analysis’ and the utility of these sessions for the purposes of peer evaluation and shared learning. A randomized controlled trial was undertaken by Bonrath et al. [25], in which they showed the effectiveness of surgical coaching in the real world. They analyzed the jejunojejunostomy step of an LRYGB procedure in trainees randomized to either Comprehensive Surgical Coaching (CSC) or conventional training. In their study, they demonstrated that participants in the coaching arm had significantly improved technical skill scores and committed fewer errors. In a similar study, Singh et al. [26] randomized participants to either an online surgical tutorial or video-based coaching. They found that the video-coaching cohort performed better than controls on a virtual reality and porcine model, although they took longer to complete the tasks. ‘Telemonitoring’ is a recent development in the field of surgical coaching with great potential upside. In a study by Shin et al. from the University of Southern California group [27], urology trainees received either traditional in-room instruction or remote telemonitoring from a surgeon located outside of the operating room, in robotic-assisted renal and prostate surgery. They showed that there were no differences in global skill ratings, as judged by the trainees themselves (self-assessment) and by expert surgeons, between the remotely and locally coached groups. 
The authors suggest that this type of remote assessment using intraoperative video will allow expert surgeons to watch, evaluate, and coach trainees and peer surgeons who are located in other geographic locations. In places such as the United States, where robotic surgery is widespread, this technology allows expert surgeons to coach lower-volume or lower-skill surgeons to improve care delivery for patients undergoing these robotic procedures. The potential for inclusion of surgical coaching into formalized training and continuing medical education (CME) is currently under investigation and promises to be a rich area of surgical education research [28].

Fig. 1

The Generic Error Rating Tool, described by Bonrath et al. [1].

Future directions

The field of video analysis in surgery is growing rapidly, across multiple platforms, surgical specialties, and even across different industries. Stakeholders from both the medical and technology industries are currently investigating ways to integrate a range of novel concepts, from motion tracking [29] to machine learning, into day-to-day surgical practice. It is essential that any efforts to implement routine video analysis in surgery account for the different domains of assessment that this article describes. It would be insufficient to analyze any one of technical skill, non-technical skill, and surgeon error in isolation, as the role of surgeon factors in patient safety becomes more apparent [30]. An integrated system of operative data capture is necessary to ensure that when adverse events occur in surgery, they can be studied in full, and the incidents surrounding them can be illuminated. This concept is embodied by the surgical black box [31], a computer-based system that captures audiovisual data in the operating room (Fig. 2). This system adopts the aviation industry’s philosophy that, even in highly controlled situations with highly skilled people, errors can occur. Through this system it is possible to analyze the impact that all members of the surgical team (nurses, anesthesiologists, surgeons) have on the course and success of the procedure. Rigorous data collection allows for thorough post-operative evaluation of multiple intraoperative systems, from laparoscopic suturing to how the trainee surgeon communicates with the scrub nurse. When adverse events occur, the root cause can be determined, allowing the surgeon to learn from their error. On a large scale, educational interventions can be targeted at procedures, or steps of a procedure, wherein errors frequently occur or often lead to adverse events that compromise patient safety. 
Moving forward, there will be further integration of technology into this process, as thought leaders continue to speak of the move to the world of ‘digital surgery’ [32].

Fig. 2

The Surgical Blackbox™ allows for collection of multiple sources of audiovisual data in the operating room (pictured against the far wall in this photo).

Conclusion

Video analysis in surgery provides multiple advantages over traditional methods for assessing the practical skills of trainees and surgeons. Many important aspects of surgical care are difficult to account for, or are overlooked entirely, when assessments are made in real time. Video review allows for careful evaluation of procedural competency by any number of judges, encompassing any number of styles and types of assessment tools. While many reliable metrics exist for technical skill assessment, in the form of global rating tools and task-specific checklists, there is a paucity of validated tools to examine procedural error and non-technical skills, both of which are important aspects of holistic surgical assessment. New applications of this technology to concepts such as coaching and telemonitoring give educational stakeholders more flexibility in how they approach the evaluation of their trainees, and help ensure standardization of training outcomes. The capacity to improve patient care has drawn the eye of the technology sector, and with a big push from thought leaders in this field, we can expect to see a rapid influx of technology that will further the intricacy and complexity with which we assess surgical performance.