1 Introduction

Minimally invasive treatments have been in increasing demand as medical technology develops. Laparoscopic surgery has attracted much attention because it requires only small incisions (5–30 mm) to insert a laparoscope and other instruments, instead of the large incision (about 300 mm) needed for open surgery. During laparoscopic procedures, the laparoscopic image is displayed on a monitor so that surgeons can observe the operative field. Laparoscopic surgery can lead to faster recovery than open surgery. On the other hand, bleeding points are difficult to locate because bleeding lowers visibility and the laparoscope has a narrow field of view. When hemostasis cannot be achieved laparoscopically, surgeons must consider converting to open surgery.

In the event of bleeding, surgeons rapidly cover the bleeding area with gauze as primary hemostasis. Primary hemostasis, however, is only an emergency measure; complete hemostasis, such as heat coagulation, suturing or ligation, must follow. Surgeons plan the treatment according to the bleeding status and location, but both are difficult to identify, and the secondary hemostasis is difficult to carry out smoothly, because the bleeding area is covered with gauze after primary hemostasis. Laparoscopic surgery therefore requires close monitoring of the bleeding state and location.

We have been developing the concept of a hemostasis support system that automatically identifies blood regions and indicates them to the surgeon during laparoscopic surgery. Figure 1 shows conceptual images of the applications we aim at. The future system will identify blood regions in real time and present them by highlighting or flashing, or by displaying them on another monitor, as shown in Fig. 1a. By highlighting blood regions, surgeons can confirm the bleeding state and location immediately and reduce the hemostasis time. The system will also help prevent postoperative bleeding by making surgeons aware that bleeding has occurred. In addition, if surgeons want to check the frame in which bleeding began, the future system will record laparoscopic images in a time series and present the frame they specify in picture-in-picture mode or on another monitor, as shown in Fig. 1b. This makes it possible for surgeons to consider and decide the approach to complete hemostasis while primary hemostasis is being performed.

To realize the future system, two technical challenges must be met: identifying blood regions in real time and capturing the instantaneous situation of bleeding. In this paper, we describe a highly accurate blood region identification method that moves us closer to this goal.

Several studies have addressed instrument detection in laparoscopic surgery [1, 2], and an artery detection method based on the arterial pulse has been reported [3]. To the best of the authors' knowledge, however, no reports have addressed blood region identification in laparoscopic surgery. In the field of wireless capsule endoscopy (WCE) [4], blood detection methods have been actively researched. WCE can examine the entire small intestine without causing the patient any pain, but it produces far more images than clinicians can examine manually. Consequently, many researchers have developed automatic blood detection methods, which have been reviewed in the literature [5].

Existing methods can be roughly classified into image-based, patch-based and pixel-based methods [6]. Image-based methods process the image as a whole and classify each image as bleeding or non-bleeding. Liu and Yuan [7] proposed an automatic blood detection method employing RGB color values, and Ghosh et al. [8] distinguished such images using statistical measurements (mean, variance and standard deviation) in the hue space of WCE images. However, image-based methods, which rely on statistical features extracted globally from the image, cannot localize blood regions and are therefore unsuitable for our purpose. Patch-based methods group similar pixels into patches and classify each patch as blood or non-blood. Li and Meng [9] proposed a method based on chrominance moments combined with a local binary pattern texture, and Fu et al. [6] grouped pixels through super-pixel segmentation and patch classification. However, patch-based methods require significant processing time for the grouping. Pixel-based methods classify individual pixels, distinguishing blood regions in WCE images from non-blood regions and identifying their locations. Some studies [10,11,12,13] set a threshold in an RGB or HSV color space to separate blood and non-blood pixels; since the threshold is determined empirically, these methods are less reliable and less robust. More recently, machine learning approaches have been studied: Pan et al. [14] and Fu et al. [15] detected blood pixels using neural networks.

Our proposed blood identification method for laparoscopic surgery is pixel-based and works in real time. We examine feature values suitable for laparoscopic images and then classify pixels as blood or non-blood with a support vector machine (SVM). We show that the proposed method achieves high identification performance with real-time processing.

Fig. 1

Conceptual images of applications of the hemostasis support system under development. a Highlight mode. b Picture-in-picture mode

Fig. 2

Flow chart of the proposed method

2 Materials and methods

The proposed method takes a machine learning approach. It uses color features (RGB and HSV values and their combinations) and a supervised SVM classifier to determine whether each pixel in the image contains blood. As with other supervised learning methods, it consists of two steps: training and application. The processing flow is shown schematically in Fig. 2. First, a ground truth dataset is constructed from acquired laparoscopic images and divided into training and testing sets. In the training step, the most discriminative features are selected and an SVM classifier is trained with parameter optimization. In the application step, each image is first preprocessed to remove pixels that are too dark. Each remaining pixel of every frame is then classified as either a blood pixel (BP) or a non-blood pixel (NBP) by the SVM. Finally, the identified blood regions are superimposed on the original image for display: the values of identified BPs are replaced with a specific color, for example cyan. Details of each step are given in the following subsections.
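As an illustration of the display step, a minimal sketch of the color overlay is given below. This is our reconstruction, not the authors' code; NumPy arrays and OpenCV's BGR channel order are assumed.

```python
def overlay_blood(image_bgr, blood_mask, color=(200, 200, 0)):
    """Replace identified blood pixels with a fixed display color.

    image_bgr  : HxWx3 uint8 frame (OpenCV BGR channel order assumed).
    blood_mask : HxW boolean array, True where the SVM returned BP.
    color      : BGR triple; (200, 200, 0) corresponds to the cyan
                 (R, G, B) = (0, 200, 200) used in Sect. 3.
    """
    out = image_bgr.copy()
    out[blood_mask] = color  # boolean indexing broadcasts the color triple
    return out
```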

2.1 Preprocessing

To reduce image noise and processing time, the original images are resized using a Gaussian pyramid: the image is blurred with a Gaussian filter and then downsized by half along each direction, and this is repeated until the desired image size or resolution is obtained. In this study, the image size is reduced to a quarter of the original. According to [13, 15, 16], BPs/NBPs in dark regions cannot be accurately identified because, owing to noise and the tone curve, the ratio between the RGB values of blood pixels in such regions differs from that in bright regions. It is therefore necessary to remove dark regions by thresholding. In the present study, we masked every pixel whose R (red) value was smaller than 60. This threshold, the same value as in [15], was applied to all images and was confirmed empirically in our study as well.
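A minimal sketch of this preprocessing is shown below, assuming OpenCV and NumPy with frames in BGR order; cv2.pyrDown performs exactly the blur-then-halve step described above.

```python
import cv2

def preprocess(image_bgr, levels=1, r_threshold=60):
    """Gaussian-pyramid downsizing followed by dark-pixel masking.

    Each cv2.pyrDown call Gaussian-blurs the image and halves it along
    each direction; levels=1 yields a quarter of the original area.
    Pixels with R < r_threshold are excluded from classification,
    following the threshold of 60 used in the paper. Returns the
    reduced image and a boolean mask of pixels to keep.
    """
    img = image_bgr
    for _ in range(levels):
        img = cv2.pyrDown(img)
    keep = img[:, :, 2] >= r_threshold  # channel 2 is R in BGR order
    return img, keep
```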

2.2 Feature selection

The selection of feature values is critical for improving SVM classification performance while keeping the number of features small. Several feature selection schemes have been examined in the literature: Sainju et al. [17] and Sergio et al. [18] adopted a wrapper approach to determine the best combination of feature values, and Yeh et al. [19] evaluated the performance of three filter-based methods. Although these procedures yield good feature sets, they are complicated and time-consuming. We therefore propose a simple feature selection method based on linear discriminant analysis. To determine a combination of feature values suitable for blood detection, we selected features from 12 candidates calculated from the pixel's RGB or HSV [20] representation: R, G, B, H, S, V, G/R, B/G, B/R, \(R/(R+G+B)\), \(G/(R+G+B)\) and \(B/(R+G+B)\). These candidates have been frequently used in machine-learning-based blood pixel detection [5]. Considering the trade-off between classification accuracy and processing speed, we selected only three feature values for the SVM. The importance of each feature was measured by discriminant analysis [21], a statistical procedure for classifying datasets: the degree of separation can be measured by calculating the between-class variance of each feature value. The degree of separation C is calculated as:

$$\begin{aligned} C = \omega _{\mathrm{bp}}\omega _{\mathrm{nbp}}(m_{\mathrm{bp}} - m_{\mathrm{nbp}})^{2}, \end{aligned}$$
(1)

where \(\omega _{\mathrm{bp}}\) and \(\omega _{\mathrm{nbp}}\) are the numbers of BPs and NBPs, respectively, and \(m_{\mathrm{bp}}\) and \(m_{\mathrm{nbp}}\) are the mean values of the feature over the BPs and NBPs, respectively. Because the ranges of the feature values vary and the calculated C depends on the mean values, each feature was normalized to [0, 1] before this calculation. We computed the separation C of Eq. (1) for each feature. Figure 3 plots each feature candidate against its calculated degree of separation C; the abscissa represents the value of C, and the features are arranged in descending order of C from top to bottom. Based on these results, we chose the top three components as the features used in the SVM:

$$\begin{aligned} F_{1}&= R(i)/(R(i)+G(i)+B(i)), \nonumber \\ F_{2}&= G(i)/R(i), \nonumber \\ F_{3}&= S(i) = (V_{\mathrm{max}}(i) - V_{\mathrm{min}}(i))/V_{\mathrm{max}}(i), \end{aligned}$$
(2)

where R(i), G(i) and B(i) are the RGB values and S(i) is the saturation at the ith pixel. \(V_{\mathrm{max}}(i)\) and \(V_{\mathrm{min}}(i)\) are the maximum and minimum of R(i), G(i) and B(i). If \(V_{\mathrm{max}}(i)\) is zero, S(i) is set to zero as an exception. \(F_{1}\) is a chromaticity coordinate of the pixel, and \(F_{2}\) is the ratio of G(i) to R(i); both are unaffected by brightness changes in the image. According to [22], the ratio of green to red is distinctive enough to be used to identify the bleeding pattern.
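For illustration, the candidate features and the separation score of Eq. (1) could be computed as sketched below. This is our reconstruction, not the authors' implementation; matplotlib's rgb_to_hsv is assumed for the HSV conversion, and the small eps guards against division by zero are our additions.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv  # HSV conversion on [0, 1] floats

def candidate_features(rgb):
    """Stack the 12 per-pixel candidates from an Nx3 float RGB array
    (values scaled to [0, 1]); the column order is illustrative."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    eps = 1e-8
    h, s, v = rgb_to_hsv(rgb).T          # matplotlib sets S = 0 when V_max = 0
    total = r + g + b + eps
    return np.stack([r, g, b, h, s, v,
                     g / (r + eps), b / (g + eps), b / (r + eps),
                     r / total, g / total, b / total], axis=1)

def separation(feature, is_bp):
    """Degree of separation C of Eq. (1) for one feature column;
    the feature is first normalized to [0, 1] as in the paper."""
    f = (feature - feature.min()) / (np.ptp(feature) + 1e-8)
    bp, nbp = f[is_bp], f[~is_bp]
    return bp.size * nbp.size * (bp.mean() - nbp.mean()) ** 2

# Rank the candidates by C and keep the top three (F1, F2, F3 in Eq. (2)):
# C = np.array([separation(X[:, j], y) for j in range(X.shape[1])])
# top3 = np.argsort(C)[::-1][:3]
```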

Figure 4a–c shows histograms of the three features used in the SVM, and Fig. 4d shows the three-dimensional distribution of the BPs and NBPs. Because the BP and NBP distributions overlap only slightly, accurate classification can be expected.

Fig. 3

Relationship between each feature candidate and its calculated separation C

Fig. 4

Histograms of the features: a–c histograms of \(F_{1}\), \(F_{2}\) and \(F_{3}\); d their distribution in 3D space

2.3 Classification

We used a support vector machine (SVM) to train a classifier that identifies BPs in each laparoscopic image. The SVM, developed by Vapnik and co-workers, is a kernel-based machine learning algorithm [23]. It builds a classifier from the training set by separating the data into two classes with the hyperplane that maximizes the margin. If the classes are not linearly separable, the data are mapped into a high-dimensional feature space, where the SVM finds a linear separating hyperplane.
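The training step could be realized with scikit-learn as sketched below. The paper mentions parameter optimization but does not specify the kernel or the parameter grid, so the RBF kernel and the grid values here are assumptions.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def train_classifier(X_train, y_train):
    """Train the BP/NBP classifier on Nx3 arrays of (F1, F2, F3).

    Kernel and grid are assumed, not taken from the paper: an RBF
    kernel with a small cross-validated grid search over C and gamma.
    """
    grid = GridSearchCV(
        SVC(kernel="rbf"),
        param_grid={"C": [1, 10, 100], "gamma": [0.1, 1, 10]},
        cv=5,
    )
    grid.fit(X_train, y_train)
    return grid.best_estimator_

# Application step: label every remaining pixel of a frame at once, e.g.
# labels = clf.predict(pixel_features)   # 1 = BP, 0 = NBP
```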

2.4 Experiments

We applied the proposed method to ten sets of laparoscopic motion pictures. Table 1 lists the patient IDs and the laparoscope models used. Informed consent was obtained from each patient. BP identification performance and per-frame processing time were evaluated in two experiments.

Table 1 Patient ID and model of laparoscope used

Experiment 1

This experiment examined how accurately a classifier dedicated to a specific laparoscope model classifies pixels in images obtained with that same model. The images of patient pairs #1 and #2, #3 and #4, and #5 and #6, and of patients #7–#10, were each obtained with the same laparoscope model (Table 1). Each patient's data were used as test data for a classifier trained on the data of the other patient(s) in the same group. For training, we randomly selected 40 frames from each patient's data and then, under the guidance of a surgeon, randomly selected 200 BPs and 200 NBPs per patient from the selected frames; these pixels formed the training set.

Experiment 2

In this experiment, we tested the performance of an SVM trained on data from several laparoscope models. For this purpose, we left one patient–device pair out of the training set and used it as the test sequence; for example, the SVM was trained on all patients' data except patient #1 and then applied to patient #1's data. This process was repeated for every patient, as sketched below. The training set was formed in the same way as in experiment 1.
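The evaluation of experiment 2 amounts to a leave-one-patient-out protocol. In the sketch below, data_by_patient, train_fn and eval_fn are hypothetical names tying together the steps sketched earlier; they are not from the paper.

```python
import numpy as np

def leave_one_patient_out(data_by_patient, train_fn, eval_fn):
    """Experiment-2 protocol: hold out each patient-device pair in turn.

    data_by_patient maps a patient ID to its (features, labels) arrays;
    train_fn and eval_fn stand for the training and evaluation steps
    sketched elsewhere in this section.
    """
    results = {}
    for test_id, (X_test, y_test) in data_by_patient.items():
        X_train = np.concatenate([X for pid, (X, _) in data_by_patient.items()
                                  if pid != test_id])
        y_train = np.concatenate([y for pid, (_, y) in data_by_patient.items()
                                  if pid != test_id])
        clf = train_fn(X_train, y_train)
        results[test_id] = eval_fn(clf, X_test, y_test)
    return results
```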

In both experiments, the algorithm was implemented on a computer with two Intel Xeon E5-2630 CPUs (2.30 GHz, 6 cores each) and 64.0 GB of RAM; OpenMP was used to parallelize the processing. The performance of the proposed method was measured in terms of sensitivity, specificity and accuracy, defined as follows:

$$\begin{aligned} \mathrm{Sensitivity}&= \frac{\textit{TP}}{\textit{TP}+\textit{FN}}, \nonumber \\ \mathrm{Specificity}&= \frac{\textit{TN}}{\textit{TN}+\textit{FP}}, \nonumber \\ \mathrm{Accuracy}&= \frac{\textit{TP}+\textit{TN}}{\textit{TP}+\textit{FP}+\textit{TN}+\textit{FN}}. \end{aligned}$$
(3)

Here, TP is the number of positive samples (BPs) correctly classified, FN the number of positive samples incorrectly classified as negative (NBPs), TN the number of negative samples correctly classified, and FP the number of negative samples incorrectly classified as positive.
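These definitions translate directly into code; the following sketch (ours, not the authors' implementation) computes all three measures from boolean ground truth and prediction masks.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Sensitivity, specificity and accuracy per Eq. (3);
    inputs are boolean arrays with True meaning blood pixel."""
    tp = np.sum(y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy
```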

Ground truth data were required to calculate the sensitivity, specificity and accuracy. First, the surgeon among the authors explained the annotation rules to the non-surgeon co-authors by demonstrating the annotation on several images. The non-surgeon co-authors then arbitrarily selected 1000 frames from the application datasets (100 frames \(\times \) 10 patients) and annotated the blood regions in detail. We also evaluated the reproducibility of the annotation: ten frames per patient were selected from the ground truth data (100 frames in total), and the non-surgeon co-authors annotated the blood regions in these frames again (GT2) without referring to the original ground truth data (GT1). Example annotated frames from GT1 and GT2 are shown in Fig. 5. After GT1 and GT2 were prepared, we computed the Dice similarity coefficient (DSC) between them to validate the consistency of the ground truth data. The DSC, defined in Eq. (4), measures the similarity of two samples:

$$\begin{aligned} \mathrm{DSC}(X,Y)=\frac{2|X \cap Y|}{|X|+|Y|}. \end{aligned}$$
(4)

Here, |X| and |Y| are the numbers of elements in the two samples. The average DSC between GT1 and GT2 was \(0.90 \pm 0.03\). A DSC of 0.90 or higher is generally considered to indicate sufficient reproducibility, so we regarded the reproducibility as sufficient and the ground truth data as valid.
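Equation (4) likewise has a direct implementation; the sketch below (ours) computes the DSC of two boolean annotation masks of the same shape.

```python
import numpy as np

def dice(mask_x, mask_y):
    """Dice similarity coefficient of Eq. (4) for two boolean masks."""
    intersection = np.sum(mask_x & mask_y)
    return 2.0 * intersection / (np.sum(mask_x) + np.sum(mask_y))
```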

Fig. 5

Example images for annotation of frames. a Ground truth data, GT1. b Ground truth data, GT2

Fig. 6

Identification results for patients #1 and #2. a, c Original images. b, d Identification results superimposed on the original images in a, c

3 Results

Figures 6, 7, 8 and 9 show the results of experiment 1. The left images are the originals, and the right images show the identification results superimposed on them; BPs are rendered as (R, G, B) = (0, 200, 200) to make the identified regions visible. Table 2 summarizes the sensitivity, specificity and accuracy of the proposed method in experiment 1. The SVM built a highly accurate classifier separating BPs from NBPs, achieving more than 97% sensitivity and 96% specificity and accuracy. Moreover, the average processing time was \(5.0 \pm 4.5\) ms/frame. We can therefore conclude that the proposed method allows real-time processing.

Table 3 summarizes the performance of the proposed method in experiment 2. Although the average specificity and accuracy decreased slightly, the average sensitivity was almost the same as in experiment 1. The specificities and accuracies of patients #3 and #6 decreased markedly; in particular, those of patient #3 fell below 90%. The average processing time was \(12.6 \pm 2.0\) ms/frame, a little more than twice that of experiment 1, but still fast enough to keep up with the frame rate of general-purpose commercial products. The proposed method thus still allows real-time processing.

Fig. 7

Identification results for patients #3 and #4. a, c Original images. b, d Identification results superimposed on the original images in a, c

Fig. 8

Identification results for patients #5 and #6. a, c Original images. b, d Identification results superimposed on the original images in a, c

Fig. 9

Identification results for patients #7–10. a, c, e, g Original images. b, d, f, h Identification results superimposed on the original images in a, c, e, g

Table 2 Identification performance in experiment 1
Table 3 Identification performance in experiment 2

4 Discussion

The sensitivity, specificity and accuracy in experiment 1 all exceeded 90%. The proposed method thus achieved adequate accuracy on other patients' data as long as the same laparoscope model was used. However, the specificity and accuracy for patients #3 and #7 were markedly lower than for the others. For patient #3, dark reddish parts of the organ (upper left of Fig. 7a) were incorrectly identified as blood regions. As shown in Fig. 7c, patient #4's images contained much bright yellow visceral fat and few dark reddish NBPs, whereas most NBPs of patient #3 were reddish. This suggests that the training dataset contained too few reddish NBPs, so the classifier trained on patient #4's data could not correctly identify the reddish background as NBPs; this produced many FPs and resulted in low specificity and accuracy.

The results for patient #7 showed the lowest accuracy of the ten sequences. Comparing Fig. 9a, c, e, g, all captured with the same laparoscope model, we can see that the images were acquired under different conditions. The Storz IMAGE1 S has built-in image enhancement functions, such as contrast enhancement and hue adjustment, whose internal processing is not disclosed to us. In this experiment, patient #7's data were obtained with the image enhancement function off, while the data of patients #8–10 were obtained with it on, so the poor performance on patient #7 is not surprising. The proposed method identifies BPs from pixel-wise color features and does not take the color tone of the whole image into account. It is therefore necessary either to include training samples processed with the enhancement function or to switch to a dedicated classifier built for it. If such training samples are prepared in advance, switching classifiers is easy because the PC can perform it instantaneously.

When blood regions were obscured by smoke from ablation or by specular light, correct identification was very difficult. Figure 10a, b shows example images with smoke and specular light, respectively. In such cases, a classifier using only the SVM is limited in the accuracy it can achieve, and some intelligent pre- and/or post-processing should be introduced. Although experiment 1 achieved sufficient identification accuracy, preparing training sets for each laparoscope model is cumbersome. Experiment 2 was therefore designed to train on data obtained from multiple laparoscope models. All average performance values exceeded 95%, the same level as in experiment 1, though some sensitivities and specificities decreased slightly. Since the images of patients #8–10 were processed with the enhancement function on, their BPs were bluer than in the other images, and such bluish BPs were difficult to identify accurately with the classifier trained in experiment 2. On the other hand, the specificity for patient #7, obtained with the enhancement function off, improved because similar datasets were included in the training step.

Fig. 10

Difficult situations for identification. Occurrence of a smoke and b specular light. The green arrow points to the specular region

For patients #3, #5 and #6, the specificities decreased compared with experiment 1; in particular, that of patient #3 fell below 90%. Because most objects in these images were reddish, several NBPs were identified as BPs, the same situation as in experiment 1. Since the image characteristics of patients #5 and #6 were similar, the classifiers trained in experiment 1 were specialized for this reddish environment, and all performance parameters were good. The experiment 2 classifier, although more versatile thanks to the variety of datasets, was insufficient for the reddish environment. Obtaining better performance there requires an additional classifier or additional training samples.

The processing time in experiment 2 was a little more than twice that of experiment 1, presumably because the larger training set yields more support vectors and thus slows kernel evaluation at prediction time. Overall, the results show that the proposed method is effective for blood identification and can be used online for hemostasis support even under the experiment 2 conditions. However, failing to detect bleeding is the least acceptable error. To construct the most sensitive classifier, training data from the same laparoscope model, collected in various abdominal environments with the enhancement function off, are required.

Prior studies that evaluated performance at the pixel level reported sensitivities of 87–92% and specificities of 85–89% [11,12,13,14]. The sensitivities and specificities of our method were 92.7–99.5% and 86.1–99.8%, respectively; even our lowest results were slightly better than those of the prior studies. The study with the highest performance among those four [11] used only a single feature similar to \(F_{1}\) with an empirically set threshold.

Fu et al. [6] detected bleeding pixels by combining three feature values with an SVM, a combination similar to ours. Their average processing time was 540 ms/frame, at least 40 times longer than ours, which rules out real-time identification; note, however, that direct comparison is difficult because the implementation environment and image size differed. Their performance values of 97% sensitivity, 92% specificity and 94% accuracy were very similar to ours. Fu's method includes super-pixel segmentation as preprocessing to reduce the computational cost of bleeding detection, but super-pixel segmentation is generally more time-consuming than the Gaussian pyramid used in our method. Because one purpose of this study is real-time bleeding identification, this advantage in processing time is important.

Sainju et al. [17] and Yeh et al. [19] reported best performance values of 96% sensitivity with 90% specificity and 93.6% sensitivity with 92.1% specificity, respectively. To achieve this, both groups evaluated several combinations of feature values or feature selection methods: Sainju's group adopted a wrapper approach that evaluates all combinations of feature values, and Yeh's group used three feature selection methods based on the SVM or decision trees. While these approaches may yield effective feature combinations, they are time-consuming and laborious, although neither article reports the processing time. The proposed method instead uses linear discriminant analysis, a simple approach. Because only the degree of separation of each candidate feature is evaluated individually, the best combination may not be found; nevertheless, highly accurate identification was obtained in this study, indicating the effectiveness of the discriminant analysis.

In summary, our method achieved better performance with less processing time because the three best feature values were selected by discriminant analysis and the classifier was built with the SVM.

5 Conclusion

Toward the realization of a hemostasis support system, we proposed an SVM-based blood identification method for laparoscopic surgery, together with a simple feature selection method. Experimental results showed that the proposed method can identify more than 95% of blood pixels in real time.

In this paper, identified blood pixels were simply overlaid in cyan on the original image, but this may not be the best presentation; a more informative and efficient interface should be developed. The hemostasis support system could also warn surgeons of bleeding: while they rapidly cover the bleeding area with gauze pads, they could consider and decide the next hemostatic approach by reviewing the presented bleeding frames. With such a system, novice surgeons could remain composed while stopping bleeding, which would reduce the volume of blood lost.