Artificial intelligence (AI) is a branch of computer science in which computers build mathematical models that, once properly trained, solve problems by mimicking human cognition [1]. Machine learning (ML) is a subfield of AI that enables computers to make predictions based on underlying data patterns, while deep learning (DL) is a subset of ML [2] (Fig. 1).

Fig. 1

Relationship among artificial intelligence, machine learning and deep learning

AI has been applied successfully to medicine in recent years [3]. Surgical Data Science is the research field of AI specifically targeted at surgical practice; it aims to improve the quality of surgical healthcare delivery through the capture, organization, analysis, and modeling of data [4]. AI can be applied to surgery (i) to obtain an automated, objective assessment of skills during training [5, 6], (ii) to automate robot-assisted surgery (RAS), following the paradigm of self-driving cars [7], (iii) to enable intra-operative navigation [8], and (iv) to detect errors early, thereby preventing them and improving patient outcomes [9].

As the literature on AI in surgery grows, surgeons should be able to understand the published studies on AI. Although this additional AI knowledge may currently be perceived as too theoretical, and consequently overlooked as not essential for good practice, this attitude is very likely to change as AI-based medical devices enter the market and are integrated into clinical practice.

However, this is not a trivial task, as the AI terms used in these studies are coined by computer scientists in their own jargon, which is largely unfamiliar to surgeons. Thus, the widespread clinical use of AI in surgery faces several challenges. One is that AI algorithms are not transparent to surgeons, who for this reason regard AI systems as ‘black boxes’ that are incompletely understood [10]. Indeed, few physicians have the necessary knowledge to understand them [11]. Additionally, the structure of the data files is often extremely complex [12].

Thus, the primary outcome of this review is to provide definitions of the commonly used AI terms in surgery to simplify their understanding by surgeons. In this way, we want to contribute to the development of a multidisciplinary collaboration between surgeons, engineers, and computer scientists. The secondary outcome is to provide, in a supplement, a detailed list of surgical articles in which AI terminology is used.

Materials and methods

A literature search was conducted in September 2021 on PubMed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [13]. The search comprised several steps. First, we retrieved any AI terms used in laparoscopy or RAS. We compiled an initial database using the following search strategy:

  1. ((((artificial intelligence) OR (machine learning)) OR (deep learning)) OR (computer vision)) OR (Natural language processing)
  2. ((((laparoscopy) OR (robotic)) OR (minimal invasive)) OR (minimally invasive)) OR (laparoscopic)
  3. (surgery) OR (surgical)
  4. #2 AND #3
  5. #1 AND #4
  6. #5 NOT REVIEW
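The six steps above can also be assembled programmatically. The following is a minimal Python sketch, not the authors' tooling: the helper `any_of` and all variable names are hypothetical, and the flat OR chains it produces are logically equivalent to the nested parentheses shown above. It only builds the query strings and does not contact PubMed.

```python
def any_of(*terms: str) -> str:
    """Join search terms with OR, wrapping each term in parentheses."""
    return " OR ".join(f"({term})" for term in terms)


# Steps 1-3: the three term groups from the search strategy.
ai_terms = any_of(
    "artificial intelligence",
    "machine learning",
    "deep learning",
    "computer vision",
    "Natural language processing",
)
approach_terms = any_of(
    "laparoscopy",
    "robotic",
    "minimal invasive",
    "minimally invasive",
    "laparoscopic",
)
surgery_terms = any_of("surgery", "surgical")

# Step 4: #2 AND #3
step4 = f"({approach_terms}) AND ({surgery_terms})"
# Step 5: #1 AND #4
step5 = f"({ai_terms}) AND ({step4})"
# Step 6: #5 NOT REVIEW
step6 = f"({step5}) NOT REVIEW"

print(step6)
```

Keeping each group in its own variable makes it easy to reuse step 5 on its own, as is done later when the reviews are retrieved.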

We then applied the following filters to the retrieved articles: Abstract, Journal Article, in the last 5 years, Humans, and English language. In total, 1729 articles were retrieved, and two reviewers (AM and KG) independently screened the titles and abstracts of all identified publications for relevance before inclusion. The exclusion criteria were reviews, letters, non-peer-reviewed articles, conference abstracts, and proceedings. A total of 195 articles were finally included in our initial database. Data from these articles were extracted and checked by two authors (AM and KG).

In the second step, we compiled a list of AI terms mentioned at least once in the abstracts of our initial database. The three fundamental terms, i.e., AI, ML, and DL, were excluded. A final database based on the resulting 38 AI-related terms was established. We then ran a corresponding number of queries against our initial database to establish how frequently each term was used.

In the third step, we searched for the first mention of each AI term in the literature. To identify all published reviews, we repeated query #5 of our primary search, restricted to articles tagged as reviews. We applied the same filters as in the initial search for non-review articles and retrieved 34 review articles. Following the same methodology as in the first step, the same authors (AM and KG) inspected all reviews to see whether they explained any of the 38 AI terms, either in the text or in an online Appendix. Ten reviews published in the last five years contained at least one of the 38 AI terms. The flowchart of the searches performed, based on the PRISMA statement, is shown in Fig. 2.
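The frequency count in the second step can be illustrated with a short sketch. This is an assumption about how such a count could be done, not a description of the authors' procedure: the function `count_articles_mentioning` is hypothetical, the three abstracts are invented placeholders, and matching is a simple case-insensitive whole-phrase search.

```python
import re


def count_articles_mentioning(terms, abstracts):
    """Return, for each term, the number of abstracts that mention it at least once."""
    counts = {}
    for term in terms:
        # Case-insensitive match on the whole phrase, bounded by word boundaries.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        counts[term] = sum(1 for abstract in abstracts if pattern.search(abstract))
    return counts


# Invented placeholder abstracts, for illustration only.
abstracts = [
    "A convolutional neural network was trained for tool detection.",
    "We compared a support vector machine with logistic regression.",
    "Logistic regression predicted conversion to open surgery.",
]
terms = [
    "convolutional neural network",
    "support vector machine",
    "logistic regression",
]

counts = count_articles_mentioning(terms, abstracts)
for term, n in counts.items():
    print(f"{term}: {n}")
```

Each article is counted once per term regardless of how often the term recurs in its abstract. A real count would also need to handle plurals and abbreviations (e.g., "CNNs"), which this sketch does not attempt.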

Fig. 2

Flow diagram of the search and inclusion process

Results

Table 1 lists the 38 AI terms and their occurrence in our initial search (second column). The final number of occurrences used in our final database is shown in the third column. Table 1 shows that convolutional neural networks (CNNs) were the most frequent, appearing in 74 studies, followed by classification in 62, artificial neural networks (ANNs) in 53, and regression in 49. The most frequent expressions were supervised learning (reported in 24 articles), support vector machine (SVM) in 21, and logistic regression in 16. The rest of the 38 terms were mentioned infrequently. The detailed list of citations for each of the 38 terms is given in the online Appendix.

Table 1 Occurrence of 38 main AI terms used in Surgery

Table 2 reports the occurrence of each of the 38 main AI terms in the ten retrieved reviews. Except for the review by Zhou et al. [14], in which 21 of the 38 terms (55%) were mentioned, only a minority of the terms occurred in the remaining reviews, ranging from 12 (32%) [15] to just one (3%) [9].

Table 2 Main AI terms mentioned in surgery-related reviews

Table 3 provides a glossary with the definitions of the AI terms identified in our search results. A detailed list of the surgical articles in which each AI term appears is reported in the online Appendix.

Table 3 Glossary of AI terms in surgery

Discussion

Surgery is in its fourth generation (open surgery, endoluminal surgery, laparoscopic surgery, and RAS). Laparoscopic surgery and RAS provide a huge amount of data that can be processed by AI; e.g., one minute of a high-resolution minimal access operation generates 25 times the amount of data found in a high-resolution computed tomography scan [16]. However, minimal access surgery poses a significant challenge to image analysis due to changes in illumination, unfocused frames, blood and smoke in the surgical field, and anatomical diversity [17]. In addition to data from videos, RAS generates data from robot kinematics and event data (e.g., pressing the camera and/or clutch pedals) [18].

As the number of AI terms grows steadily with the expanding literature on AI, the new generation of surgeons will need to become familiar with AI and its literature, since AI is expected to have a significant impact on surgery at all stages: pre-operative, intra-operative, and post-operative.

In this report, we aimed to review and categorize the relevant terms and to provide a glossary for surgeons. Our search revealed that CNN is the AI term appearing in the highest number (n = 74) of published studies in surgery. This is not surprising, as CNNs constitute the backbone of AI frameworks for different applications of computer vision, namely classification (prediction of the correct class of objects in an image) and object detection (localization of objects in addition to predicting their class). CNNs also form the basis of complex DL architectures such as U-Net for semantic segmentation (definition of pixel-wise borders of objects of the same class) and Mask R-CNN for instance segmentation (definition of pixel-wise borders of each object). As shown in Table 1, all these frameworks have been applied to surgery. When coupled with recurrent neural networks (RNNs), CNNs can not only process spatial information to localize surgical tools but also analyze temporal information, so that they can be used to analyze the surgical workflow, for instance for action recognition (e.g., dissection and cutting) and phase recognition (e.g., incision of splenorenal ligament).

The two common tasks in ML, i.e., classification and regression, ranked highly in terms of their occurrence. As noted above, classification is also part of other computer vision tasks, e.g., detection and segmentation. Regression is the task of predicting continuous numerical values. The simplest type of regression is linear regression, in which a fitted line is used to model the data, representing the relation between one dependent and one independent variable. There are also more complex types of regression, e.g., non-linear regression, in which a curve is used to fit the data, and multiple regression, in which the dependent variable is related to more than one independent variable. The main ML algorithms, e.g., support vector machine (SVM), random forests, and multilayer perceptrons (MLP), were developed for both classification and regression (Table 1).
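To make the notion of linear regression concrete, the following minimal Python sketch fits a line to a handful of points using the standard least-squares formulas. The function `fit_line` and the data are hypothetical: the numbers are invented (chosen to lie exactly on a line) and do not come from any study cited here.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: independent variable = number of prior cases performed by a trainee,
# dependent variable = operative time in minutes.
prior_cases = [1, 2, 3, 4, 5]
operative_minutes = [118, 116, 114, 112, 110]

slope, intercept = fit_line(prior_cases, operative_minutes)

# Use the fitted line to predict a continuous value for an unseen input.
predicted_minutes = slope * 6 + intercept
print(slope, intercept, predicted_minutes)
```

The output of the model is a continuous number (minutes), which is what distinguishes regression from classification, where the output would be a discrete class label.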

In contrast, some AI terms currently occur very rarely but are expected to become more frequent over the next few years. Examples include imitation learning and reinforcement learning, mentioned in three and seven studies, respectively, which may become more common in the near term in the wake of the widespread use of robots in several fields, including surgery.

Our glossary provides a comprehensive list of definitions of AI terms to help different stakeholders: first, residents and surgeons who need to understand the fundamentals of AI while reading articles; second, young researchers starting their careers in Surgical Data Science; and third, experts working in the regulatory departments of companies developing AI Software as a Medical Device (SaMD), who prepare and submit documents for approval by the Food and Drug Administration (FDA) or other agencies.

Our glossary contains not only the definitions of AI terms related to software development, e.g., the models, but also those related to the hardware needed to perform the heavy computation required by AI, e.g., the graphics processing unit (GPU). The availability of cloud services hosting large numbers of GPUs has significantly lowered the economic barriers to accessing powerful hardware for training and testing even the most complex AI models.

Access to high-performance computers at a reasonable price and the possibility of recording and storing videos of minimal access surgeries would suggest that building large datasets and training models on them is within the reach of most research centers. However, training AI models for surgery is extremely labor intensive, since the process of labeling images (called annotation) requires specific knowledge. It is not a simple annotation of images for “cat versus dog” classification or detection tasks, but rather a process of correctly identifying the surgical tools, the anatomical parts (e.g., organs and vessels), and the clinically meaningful events. While laypersons and crowd annotators can match surgeons in annotating surgical tools, expert surgeons are required to identify the anatomy and to judge the quality of a dissection [19]. Additionally, the files must be anonymized to protect patients’ identity. Consequently, the datasets in published studies are typically small. An attempt to overcome this limitation is the Critical View of Safety Challenge [20] of the AI Task Force of the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES), an online platform where it is possible to donate videos of laparoscopic cholecystectomy and to contribute as an annotator of the videos.

Conclusions

Surgical Data Science was recently introduced as the specific field of AI in surgery. Literature on this subject is expanding rapidly. For this reason, surgeons need to become familiar with AI terms, which have traditionally been coined by computer scientists. In this review, we prepared a glossary with definitions of AI terms in surgery after reviewing the literature. This glossary will be useful not only to surgeons, but also to young researchers approaching the field and to companies developing SaMD applications.