Introduction

Cardiothoracic surgical performance depends on integrating an extensive body of knowledge with technical and non-technical skills that are often complex and nuanced [1]. Given that surgery occurs within the context of individual patients and environmental factors, understanding surgical expertise and performance in a meaningful way that informs patient care and surgical training is a particularly challenging problem. Investment in tools to objectively track and analyse human movement is commonplace in elite sports [2], and similar tools could be used in surgical environments to enhance performance and patient outcomes [3]. Movement tracking in a surgical setting is also not unusual, with performance metrics available through surgical techniques and procedures that support data capture [4, 5]. However, it is relatively rare to analyse movement and technical expertise in open surgical procedures due to difficulties in extracting the required metrics. Yet, a data-driven approach that provides analytics linking intra-operative clinical and technical processes to patient outcomes may provide a means of targeted improvement in surgical care [6, 7]. Recent computer vision innovations may open doors to similar tools that benefit the pursuit of cardiothoracic surgical excellence, which relies less heavily, in relative terms, on robotic and thoracoscopic technology than many other specialities. Thus, the present review intends first to provide background on how objective kinematic parameters link with technical and non-technical skills. The potential for innovations from artificial intelligence and computer vision to track technical and non-technical skills in real and simulated settings will then be evaluated, and last, the benefits and barriers to the uptake of such technology for the cardiothoracic community will be discussed.

Innovations in machine learning that render the analysis of real-time cardiothoracic surgery accessible even without specialist motion-tracking equipment provide a solution to understanding how the surgical environment shapes operative performance by providing objective measures of technical and co-ordinative skill in the operating theatre. Such analyses could also provide a valuable performance feedback tool for surgeons throughout their careers as they evolve or provide immediate objective indicators of factors that can impact surgical performance, such as fatigue. In terms of training, integrating such data-driven approaches with empirically validated teamwork theory may give trainees or those surgeons being mentored more substantial opportunities to develop the technical and professional skills to excel in their practice.

Surgical expertise

Much work has already been done to delineate markers of surgical expertise and provide measures that assess surgical skills. Whilst it is beyond the scope of the present paper to provide a comprehensive overview (reviews: [1, 8]), surgical expertise is thought to comprise both technical and non-technical skills. Technical skill refers to direct psychomotor ability as governed by visuomotor aptitude, economy of movement and co-ordination [1], whereas non-technical skills encompass a broad range of abilities that support the surgical task [1, 9, 10, 11]. Specifically, individual markers of expertise beyond technical skills include declarative knowledge, interpersonal skills, situational awareness, and cognitive flexibility (see Footnote 1).

Surgical performance depends not just on a surgeon’s technical and non-technical skills but also on the skills of the surgical team and the administrative, managerial and organisational policies and procedures that support them. Thus, it is crucial to consider that surgical performance occurs within a broader context. For example, better team communication is associated with higher non-technical skill performance overall [12], and fewer miscommunications occur in teams that have a high degree of familiarity [13]. These effects on non-technical skills translate further: situational awareness in the surgical team demonstrates a strong negative relationship to the frequency of technical errors [14, 15], and high team familiarity appears to be associated with lower rates of postoperative morbidity [16]. Thus, the path to cardiothoracic surgical excellence requires a holistic and integrative approach; considering context can provide insights beyond direct technical performance [9].

From general cognitive research and theory into joint [17] and individual co-ordination [18], as well as applied surgical investigations [8], we know that factors beyond technical skill contribute to psychomotor performance and thus may also be observed in movement execution. Non-technical skills, group-related factors and organisational factors can all influence psychomotor performance. Consider how a cardiothoracic surgeon executes an action plan and how the surgeon must adapt their movements due to an unforeseen event such as major unexpected bleeding or a perfusion issue such as air embolism. The speed at which the surgeon adjusts to new circumstances might be influenced by any number of non-technical skills (e.g. mental readiness, fatigue, anticipatory ability, cognitive flexibility or situational awareness). Such individual factors would also depend on team-level factors; for example, working with new team members may mean attention is diverted from the task towards managing new team dynamics and developing confidence and trust in those around them. If attention is directed towards managing new team dynamics, the cognitive resources available for action planning are reduced, and movement quality may suffer. Conversely, a familiar team may allow for more cognitive flexibility or ease with which to adapt to changing patient circumstances. Specifically, the surgeon does not need to monitor team members to the same extent when they understand their behavioural preferences and abilities and have an innate feeling of trust and confidence. Additionally, the tendency to monitor team members may be lower when the surgeon expects that team members will anticipate their needs [19].
Similarly, managerial or organisational level factors such as policy, established procedures or culture can influence movement by providing formalised mutual understanding among team members, which, in turn, provides a scaffolding for behaviour and minimises the cognitive resources required to coordinate.

Kinematic analysis of surgical performance and expertise

A range of kinematic metrics have been shown to discriminate surgical expertise (see Table 1). For example, increased experience is associated with lower trajectory displacement during suturing, lower acceleration with non-dominant hands, and higher velocity while tying sutures [20]. Similar results have been obtained in live settings [21]. Early increases in expertise as a trainee surgeon are related to increases in psychomotor performance (e.g. increased velocity and precision of movements). In contrast, later expertise developments are characterised by physical efficiency gains [20, 22, 23]. Although most work in this area is not considered specialty-specific, it is important to consider the generalisability of the findings. Speaking directly to the cardiothoracic speciality, recent work indicates that the expertise effects concerning speed and physical efficiency gains are also evident whilst performing a simulated graft anastomosis [24].

Table 1 Example measures of surgical expertise/performance. Note that definitions may vary slightly across studies and measurements must be considered with relevance to the task

Cardiothoracic surgeons and surgical teams must be both experts on a technical level and expert co-ordinators. Yet, little research investigates group-level coordination during surgery using objective kinematic measurements. Recent research on general surgical trainees indicates that expertise is linked with more robust mental representations [29]. A consequence of stronger mental representations is easier and faster retrieval from memory, which can lead to better action planning and execution, as evident in the objective measurement of movement trajectory and speed of movement initiation and execution.

Research from cognitive psychology in ‘joint action’ also strongly emphasises the importance of mental representations. Humans tend to form joint ‘internal predictive models’ - models for action that team members jointly represent. These shared representations promote smooth and effective co-ordination [30, 31] because they allow team members to predict the actions of co-ordinative partners, anticipate their needs and adapt their movements to accommodate those co-ordinative needs, ultimately maximising physical efficiency [32, 33], and potentially cognitive efficiency [34].

Co-ordinative agents can also facilitate shared representation via communicative movements; for example, movements can be exaggerated spatially or temporally to indicate intention [35] or call attention to important parts of a movement for teaching purposes [36, 37]. In attending to the movements’ communicative aspects, the observer can enhance their own internal models and respond more effectively to the signaller. Kinematic analysis of surgery on the co-ordinative level will provide insight into how surgeons optimally engage co-ordinative mechanisms, including communicative movement, to facilitate and enhance task performance.

Marker-less movement tracking for the evaluation of surgical performance

Kinematic analyses have historically been performed in simulated settings [38] due to the need for specialist motion-tracking equipment that would not typically be present during live cardiothoracic surgical procedures. It is possible to use traditional markered motion capture in live surgery using tools equipped with motion capture markers or by attaching them to a surgeon’s hands. However, it should be noted that any changes to an instrument or a surgeon’s hands may influence their movement patterns. Therefore, extensive pre-testing or experience with the new equipment should be undertaken to ensure that the introduction of markers would not impact technical performance. Further, the practice is not well adopted beyond simulated settings and minimally invasive surgery due to concerns over introducing motion capture sensors and markers into sterile environments in open surgery [39].

Assessing surgical skill in simulated settings provides only a limited picture of surgical performance, and objective measures in simulated settings cannot be directly linked to patient outcomes. Marker-less tracking techniques using custom software algorithms have demonstrated success for kinematic analysis of surgical skills in both live open surgery [40, 41] and simulated settings [23, 42]. Such video review is possible using only a video camera.

In computer vision, markerless motion tracking is called ‘pose estimation’; here we focus on innovations in deep learning techniques. Deep learning for pose estimation involves training an artificial neural network on annotated pose data sets [43, 44], so it can ‘learn’ to recognise poses. When provided with new videos of operations, the network identifies the poses within that video. Kinematic parameters can then be extracted using spatial and temporal data. Although deep learning requires more powerful hardware than a standard computer, the acquisition and set-up of appropriate equipment for markerless motion capture is more accessible than for traditional markered motion capture because standard video cameras alone may be used.
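As a minimal sketch of how kinematic parameters might be derived from pose estimation output, the example below takes the per-frame position of a single tracked point and computes path length, speed and acceleration by finite differences. The function name `kinematic_summary`, the choice of metrics and the input layout are illustrative assumptions rather than the method of any cited study, and real pipelines would normally smooth the trajectory before differentiating.

```python
import math


def kinematic_summary(points, fps):
    """Summarise the trajectory of one tracked point (hypothetical helper).

    points: list of (x, y) or (x, y, z) positions, one per video frame,
            in any consistent unit (e.g. millimetres after calibration).
    fps:    frame rate of the video in frames per second.
    """
    if len(points) < 3:
        raise ValueError("need at least three frames")
    dt = 1.0 / fps
    # Distance travelled between consecutive frames.
    steps = [math.dist(a, b) for a, b in zip(points, points[1:])]
    # Speed per frame interval (distance / time).
    speeds = [s / dt for s in steps]
    # Change in speed between consecutive intervals (finite difference).
    accels = [(v2 - v1) / dt for v1, v2 in zip(speeds, speeds[1:])]
    return {
        "path_length": sum(steps),
        "duration_s": (len(points) - 1) * dt,
        "mean_speed": sum(speeds) / len(speeds),
        "peak_speed": max(speeds),
        "mean_abs_accel": sum(abs(a) for a in accels) / len(accels),
    }
```

Shorter path length and task duration are the kind of values that expertise studies typically compare between groups, although which metrics are meaningful depends on the task being analysed.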

Pose estimation algorithms [45, 46, 47] and toolboxes [48] have advanced considerably in recent years. Further, the speed with which a network can identify poses has markedly increased, making deep learning for pose estimation a viable tool in applied kinematic analysis to provide feedback to surgeons on their performance and link psychomotor metrics to patient outcomes. Such techniques can precisely track extremely delicate procedures requiring microprecision instruments, such as retinal microsurgery [49]. They may thus even be useful for analysing procedures such as coronary artery bypass surgery or paediatric heart surgery. Further, innovations in terms of multi-person pose estimation [46] and multi-instrument pose estimation [50] provide opportunities to analyse group co-ordination, which is an under-studied area [38].

Typically, using multiple cameras to enhance 3D spatial precision is optimal. Nevertheless, cameras designed to measure depth without using markers or additional cameras have been developed, for example, Microsoft’s Kinect [51]. Like some optical marker trackers, these cameras emit infrared light and read depth information from the reflected light. High-speed pose recognition can be achieved through traditional machine learning, allowing for real-time interaction [52]. Depth cameras have been used for the 3D pose estimation of medical instruments [53].

Pose estimation techniques do have limitations. For example, occluded points are either not estimated or are estimated with lower accuracy. Many standard methods using markers also suffer from this limitation (see Footnote 2); multiple cameras in markered and markerless [56] approaches may help alleviate occlusion issues, and gap-filling techniques may be employed post hoc to deal with short durations of missing data. As an alternative to simple gap-filling techniques and to prioritise accuracy, artificial neural networks can be used to estimate the missing information [57]. Alternatively, a hybrid approach that incorporates wearable sensors may assist. Flexible sensors (accelerometers and gyroscopes) that can be unobtrusively worn under gloves may provide useful information when the line of sight is obstructed. These sensors have recently been shown to produce measures that differentiate experts and novices during a simulated graft anastomosis [24].
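A simple post hoc gap-filling step of the kind mentioned above can be sketched as linear interpolation across short runs of missing frames. This is an illustrative assumption about one possible approach, not the neural-network method of [57]; the function name `fill_short_gaps` and the `max_gap` parameter are hypothetical, and missing samples are represented as `None`.

```python
def fill_short_gaps(values, max_gap):
    """Linearly interpolate runs of None that are at most max_gap frames long.

    values:  one coordinate of a tracked point per frame, None where occluded.
    max_gap: longest run of missing frames that will be filled.
    Gaps at the very start or end, and gaps longer than max_gap, stay None.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i  # index of the first missing frame in this run
        while i < n and filled[i] is None:
            i += 1
        end = i  # index of the first valid frame after the run (or n)
        gap = end - start
        if start == 0 or end == n or gap > max_gap:
            continue  # no valid neighbour on one side, or gap too long
        left, right = filled[start - 1], filled[end]
        for k in range(start, end):
            frac = (k - start + 1) / (gap + 1)
            filled[k] = left + frac * (right - left)
    return filled
```

Leaving long gaps unfilled is a deliberate choice in this sketch: interpolating across a long occlusion would invent movement that was never observed.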

Training a network to perform pose estimation adequately takes substantial computational time, often days, even with hardware with sizeable computational power. In contrast, some forms of markered motion capture are fast enough to provide feedback in milliseconds with appropriate computer hardware. Nevertheless, a pre-trained network can perform pose estimation extremely quickly to reach real-time processing speeds (10-30 Hz). Last, purpose-built algorithms may be required depending on the goals of the analysis. There have been recent initiatives that aim to collate surgical tool or surgical procedure data sets [58, 59], which can be used for developing pose estimation models.

Pose estimation data and other forms of kinematic data can also be fed into artificial neural networks designed to classify a cardiothoracic surgeon’s skill level [60,61,62]. Most demonstrations have been performed in the context of robotic or laparoscopic surgery as more data is available. However, with appropriate videos, surgical skills in open cardiac or thoracic surgery could be assessed using similar approaches. To build such classifiers, neural networks are trained on data (e.g. kinematic data, videos) taken from surgeons at all skill levels. The neural network then extracts parameters common to each group and then classifies new videos based on how well they match the typical parameters of a given skill level. These parameters, however, may not always represent meaningful skill differences between levels but may be arbitrary parameters that co-occur with skill differences. Thus, cardiothoracic surgeons with unique approaches may be disadvantaged even though such technique differences may not be meaningful for patient outcomes. Further, surgical adaptations associated with environmental factors or personal characteristics may not be accounted for sufficiently. For example, classifier algorithms are known to be prejudiced as a function of their inputs and often disadvantage under-represented groups [63]. Thus, for such skill assessment algorithms to be developed, comprehensive data sets for a given procedure that demonstrate sufficient variation in action execution and skill level are needed from surgical teams and hospitals.
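The idea of classifying a new performance by how well it matches the typical parameters of each skill level can be illustrated with a deliberately simplified stand-in for a neural network: a nearest-centroid classifier. The feature vectors, labels and function names below are hypothetical, and the sketch assumes the features have already been scaled to comparable ranges.

```python
import math


def fit_centroids(samples):
    """Compute the mean feature vector for each skill label.

    samples: list of (feature_vector, label) pairs, e.g. kinematic
             summaries labelled 'novice' or 'expert' (hypothetical data).
    Returns a dict mapping each label to its centroid.
    """
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(features, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))
```

Even this toy version shows the limitation discussed above: a surgeon whose technique sits far from every centroid is still forced into the nearest group, whether or not the difference matters for patient outcomes.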

Whereas deep learning for kinematic analysis is not likely to raise concerns over bias because it directly measures an event, a classifier takes the decision out of human hands and can make decisional factors opaque. Thus, developers should make every effort to ensure that the training data is not biased. Ideally, any classification should be accompanied by a meaningful description of the decisional parameters (Explainable Artificial Intelligence [64]). For example, classifiers can also be designed to provide feedback on what data components are most predictive of the skill classification [60].

A recent systematic review indicated that reported machine learning methods developed to classify expertise typically achieved over 80% accuracy [65]. With further accuracy gains likely, the technology could facilitate assessment in competency-based education. With limitations in mind, we believe such methods could be most usefully employed as a formative tool that aids surgeons in developing their technical expertise, supplementing more human resource-intensive means of feedback. Indeed, a recent systematic review highlighted that personalised feedback supported by Artificial Intelligence is well accepted and considered beneficial by users [66]. However, it is noted that there is a need for rigorous experimental studies that contrast traditional pedagogical interventions with those from artificial intelligence to evaluate any potential learning gains.

Understanding the impact of technical skill on patient outcomes via kinematic data

Patient outcomes can be linked to technical skill. For example, higher-rated technical performance is linked with better post-operative outcomes in neonatal cardiac surgery [67]. High skill levels are also linked to operating time [20], which may consequently influence patient outcomes. For example, prolonged femoral-popliteal bypass procedures in vascular surgery are associated with increased surgical site infection and extended post-operative stays [68]. As we operate in an age of big data, opportunities to understand the factors contributing to patient outcomes within and between hospitals will become more accessible through data mining techniques. Using deep learning to tap into kinematic data is a particularly exciting innovation that will contribute even further to the standard information that is commonly extracted.

First, it is essential to look for objective measurements of expertise linked to patient outcomes in real surgery to identify the most critical aspects of technical skill. However, a necessary corollary to this work is understanding the nuance in technical expertise between surgeons. Nuanced kinematic variations [69], for example, may represent differences in technique that surgeons have adapted to their own biomechanical and cognitive needs or the surgical procedure. Further, surgeons are heavily influenced by their training, resulting in different techniques for the same procedure. Yet, international and national registry and audit data show similar outcomes between surgeons, hospitals and nations, indicating that variations in approach are not necessarily meaningful. Overall, a better understanding of which markers of technical skill are related to patient outcomes will inform cardiothoracic surgical training and development or decline over a surgeon’s whole career. As with elite athletes, surgical trainees should be supported with movement feedback to explore what works for them [70]. Identifying if, how and when technical skill impacts patient outcomes will require exceptional interdisciplinary cooperation from surgeons, surgical team members, hospitals, and researchers. Large amounts of data will be required given that many other factors can influence patient outcomes within and outside the operating room.

Understanding the routes by which non-technical skills influence patient outcomes via kinematic data

One route by which non-technical skills influence patient outcomes, and one particularly pertinent to kinematic analysis, is their capacity to feed into the technical execution of surgical actions [8]. For example, an overall assessment of non-technical skills [71] measuring communication and interaction, situation awareness and vigilance, co-operation and team skills, leadership and managerial skills, and decision-making was linked to the technical performance of surgeons performing carotid endarterectomy. The same is likely to be true in cardiothoracic surgery more generally.

Non-technical skills are not only a modulator of technical skills but may also influence patient outcomes directly. In fact, a video evaluation study assessing the surgical skills of surgeons (including 83 cardiac and cardiothoracic surgeons) demonstrated that increased scores for non-technical skills, independent of technical skill, were related to higher patient safety ratings [72]. This direct influence is particularly evident in crisis settings in the operating room, where non-technical skills drop considerably for all expertise levels [73], whereas changes in technical performance are less pronounced or negligible for highly experienced surgeons [74]. A similar pattern can be observed in response to fatigue [28]. If situational awareness drops, for example, the cardiothoracic surgeon will be less able to monitor all aspects of the surgery well and may not make the most informed patient care decisions.

Measuring non-technical skills objectively via video-based data is less straightforward than speaking to movement execution. Nevertheless, it is achievable. An investigation of nursing students showed that video-based feedback on gaze allowed the trainee to develop situational awareness [75]. In addition, recent advances show that human attention can be tracked within a task space by modelling head pose and orientation. Of course, this approach is less precise than using eye-tracking technology. However, such modelling helps understand various factors contributing to situational awareness, such as concentration loss, collaborative attention and stress levels more generally whilst engaging in collaborative tasks [76]. Further, as mentioned earlier, communicative gestures can be differentiated from goal-directed gestures [35,36,37] in terms of their kinematic features, and collaborative responses could be indexed by reaction times to requests. Although it is theoretically achievable to objectively measure some non-technical skills from video data as indicated by work in other fields, such an approach would need to be empirically validated within a cardiothoracic surgical setting.

Analysing co-ordinative kinematics via deep learning techniques in real crises may explain how and why performance changes. Such analyses will also provide insight into how the organisation and the wider surgical team can optimally support the operating surgeon.

Implications and implementation

Understanding markers of expertise (and learning trajectories) for various surgical tasks, how those markers relate to cognitive mechanisms that optimise performance, and subsequently, how performance impacts patient outcomes would help design targeted training programmes for cardiothoracic trainees and established surgeons at all levels wishing to enhance their own motor expertise. Understanding the aspects of expertise linked to surgical outcomes would allow a surgeon to target the most beneficial areas of improvement at any given time during their learning curve as this changes throughout their career.

Traditional methods of teaching and development focus on repeated practice under the supervision of a more senior surgeon. With surgical coaching and mentorship, the surgeon engages in an ongoing process of performance reflection and adjustment under the guidance of a surgical coach. Although research investigating the efficacy of surgical coaching is still new [77], current evidence suggests it is highly effective for skill acquisition and development, and participants receive it well [78,79,80].

Deliberate individualised practice is essential to surgical skill acquisition [81] and is also an essential part of the surgical coaching process. Tracking metrics over time allows a surgeon to engage in deliberate practice by measuring improvement and using targeted feedback. Thus, kinematic tracking and feedback tools could provide further targeted guidance to complement the feedback and reflection provided in surgical coaching. For example, a user could submit to the software a video of their own performance on which they wish to receive feedback. The software would then analyse kinematic parameters known to reflect expertise against collective benchmarks or their own previous performance and provide targeted recommendations to improve performance. Feedback could also be provided online during simulations if such feedback was helpful.
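One way such a comparison against collective benchmarks could work is sketched below, assuming each kinematic parameter is expressed as a z-score relative to a reference group and only large deviations are reported back. The function name `benchmark_feedback`, the metric names and the one-standard-deviation threshold are hypothetical choices for illustration, not a validated feedback scheme.

```python
from statistics import mean, stdev


def benchmark_feedback(metrics, benchmark, threshold=1.0):
    """Flag metrics that deviate notably from a benchmark group.

    metrics:   {metric_name: value} for one performance.
    benchmark: {metric_name: list of values from the reference group},
               each list needing at least two values.
    threshold: number of standard deviations treated as notable (assumed).
    Returns {metric_name: z_score} for the flagged metrics only.
    """
    flagged = {}
    for name, value in metrics.items():
        reference = benchmark[name]
        sd = stdev(reference)
        if sd == 0:
            continue  # no spread in the reference group, z-score undefined
        z = (value - mean(reference)) / sd
        if abs(z) > threshold:
            flagged[name] = z
    return flagged
```

The same function could compare a surgeon with their own earlier sessions by passing their previous values as the benchmark.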

Whilst it is important to note that engaging in facilitated reflection and problem solving with a surgical coaching tool based on artificial intelligence is unlikely to provide the same experience as with a human coach or mentor given the lack of authentic empathy and emotional intelligence, very recent research suggests good outcomes from AI coaches that aim to assist with goal attainment more generally [82]. Given that surgical training and coaching are extremely human resource intensive, increasing the availability of tools that provide opportunities to gain additional feedback without needing a human expert may assist in accelerating the development of surgical skills [83] to complement or facilitate the more resource-intensive visual assessment used in surgical coaching [84]. Furthermore, surgical coaches could use the technology to assist decision-making or feedback (human in the loop, [85]). Indeed, using audio-visual technology to review performance in coaching contexts has offered additive benefits over in-person observation alone [86].

This technology also offers the potential to monitor the surgical team to provide real-time feedback that may benefit decision-making processes during surgery. For example, given that fatigue is related to technical errors [8], when a surgical team member shows signs of fatigue during long surgeries, an alert could notify the team to take a break or alter roles. Monitoring surgical skills using pose estimation methods could also assist offline in determining how a cardiothoracic surgeon’s role may need to alter toward the end of their career if objective measures of intra-operative performance begin to decline and affect patient outcomes.

Competency-based education and certification are highly labour-intensive and thus could also benefit from these technologies. Pose estimation algorithms could be used to assess if a trainee surgeon meets the competency requirements. Of course, it would be essential to assess trust and acceptance of the technology for this purpose in addition to the accuracy of the algorithm. A hybrid approach could reduce labour if trust and acceptance require human oversight (human in the loop, [85]). For example, algorithms could be used throughout training to track progression against milestones and flag when the trainee meets the competency threshold to be formally assessed by a human.

Beyond training and monitoring interventions, correlative ecological studies investigating already recorded operations could help further understand what factors link with patient outcomes. Indeed, early investigations examining intra-operative performance in laparoscopic and robotic-assisted procedures demonstrate a link to short-term patient outcomes [4]. This new understanding would provide valuable information that can be used to develop data-driven policies, procedures and environments that support the optimal performance of surgeons. By training artificial neural networks to predict and track a surgeon’s movement, large-scale investigations evaluating kinematic data from recorded surgeries are possible. This research could be additionally informative for competency-based educational frameworks as there is currently no evidence to suggest that a trainee’s progression through milestones links with patient outcomes [87]. By understanding more clearly what aspects of skill relate to patient outcomes, it may be possible to determine competency-based thresholds informed by empirical evidence.

Of course, implementing any performance monitoring or feedback tool should be done in such a way as to foster trust between users (cardiothoracic surgeons and their teams) and the institution. Installing cameras to monitor operations is becoming more commonplace with the use of ‘operating room black boxes’, but it is possible that such initiatives could result in resistance. In a cross-sectional survey of Danish healthcare professionals [88], on average, opinions toward using a black box were neutral or positive, with little concern over data safety. Conversely, in a similar study conducted in Canada [89], there were more significant concerns over data safety and the potential for litigation, highlighting the importance of considering any concerns within a societal, cultural and legislative context. Irrespective of perception, video data most often supports healthcare professionals from a legal perspective [90] and thus is more likely to offer protection than be a threat. Ultimately, the success of the technology elaborated upon within this review will rely on fostering a culture of trust and engagement with users and institutions to ensure that any concerns are addressed and there are strong institutional policies to protect and support the interests of the observed cardiothoracic surgical team.

Ethical and legal considerations

A solid institutional policy should be developed to ensure that video footage (both from simulated and live procedures) is recorded, stored and used ethically and legally. Potential concerns and methods for addressing those concerns have been summarised in detail elsewhere [91]. It is common to raise legal fears concerning the recording and storage of footage. However, as mentioned already, these fears are likely unfounded: typically, video data protects healthcare professionals rather than puts them at risk from a legal perspective [90]. Further, it is not considered necessary for video data to be added to a patient’s medical record if the video is collected solely for quality improvement because it is not in any way used for the patient’s care [90]. Nevertheless, there may be variations in patient consent requirements across institutions and regions. Confidentiality and anonymity should be carefully considered as, in some cases, the nature of the research would require analysis of identifiable or sensitive personal information.

Conclusion

New means of analysing surgical performance open doors to understanding surgical excellence in the cardiothoracic specialty. Other disciplines have traditionally benefitted from technological innovations around training and the objective measurement of performance; the fields of computer vision and Artificial Intelligence now offer opportunities that are ideal for use in the cardiothoracic surgical environment. Further, these tools are feasible to use within the operating room, which will assist in understanding how technical and non-technical skills influence patient outcomes. However, the technology is still in the early stages, and thus, further innovation will require commitment and partnership from hospitals and cardiothoracic surgeons to provide (1) data that can be used to develop feedback tools and (2) constructive direction to ensure that any tool used for applied purposes has been developed to meet the needs of the user adequately. With data from simulated and real surgical settings, research aimed at understanding how expertise relates to the cognitive mechanisms that support psychomotor performance within the context of surgery will further help design targeted training interventions and surgical environments more optimally to enhance surgical outcomes. With big data generated across many institutions, it may be possible to develop data-driven guidelines for task execution and team coordination that reduce the surgeon and team’s physical and cognitive load [6].