Abstract
Automated pulmonary nodule detection plays an important role in lung cancer diagnosis. In this paper, we propose a pulmonary nodule detection framework that achieves high sensitivity with few candidates. First, the Feature Pyramid Network (FPN), which leverages multi-level features, is applied to detect nodule candidates that cover almost all true positives. Then, redundant candidates are removed by a simple but effective Conditional 3-Dimensional Non-Maximum Suppression (Conditional 3D-NMS). Moreover, a novel Attention 3D CNN (Attention 3D-CNN), which efficiently utilizes contextual information, is proposed to further remove the overwhelming majority of false positives. The proposed method yields a sensitivity of \(95.8\%\) at 2 false positives per scan on the LUng Nodule Analysis 2016 (LUNA16) dataset, which is competitive with published state-of-the-art methods.
This work was supported by the National Natural Science Foundation of China under Grants 61525206, 61572472, U1703261, and 61571424.
1 Introduction
Lung cancer is one of the most lethal diseases worldwide [1]. It can be diagnosed early among high-risk individuals through screening with low-dose computed tomography (LDCT). Compared with traditional chest radiography screening, LDCT with early diagnosis has reduced lung cancer mortality by \(20\%\) in seven years [3]. The prevalence of CT technology has generated an enormous amount of CT data. However, it is challenging for radiologists to accurately localize every pulmonary nodule appearing across all CT slices.
Over the past two decades, researchers have developed many Computer-Aided Diagnosis (CAD) systems for automatic detection of lung nodules [2, 3]. The performance of these CAD systems has improved significantly over earlier systems. However, these techniques still have a long way to go before they enable practical applications. These CAD systems are often designed using hand-crafted features based on low-level properties such as size, shape, texture and intensity. They cannot deal with the large variations of nodules, and thus may fail to distinguish nodules from ambiguous regions. Fortunately, the development of deep learning makes it possible to learn representative features that recognize the various appearances of pulmonary nodules, which yields significantly improved detection accuracy. Over the past two years, many deep learning based systems have been proposed and have delivered exciting results [3,4,5,6]. However, detecting small pulmonary nodules in CT slices remains a challenging task. Detecting small or even tiny nodules plays an important role in early diagnosis and treatment, which can effectively lower the risk of lung cancers before they develop into worse stages. This inspires us to develop a highly sensitive CAD system.
Conventional pulmonary nodule detection systems usually consist of two stages: proposing nodule candidates and removing false positives. In this paper, we carefully design Convolutional Neural Network (CNN) structures to address the challenge of detecting small pulmonary nodules at each stage. First, to detect all nodules without missing the small ones, we design the candidate detection network (see Fig. 1-Stage 1) by exploiting the properties of Feature Pyramid Networks (FPN) [7]. This detection network covers almost all nodules, with only very few missed (16/1186 \(\approx \) 1.3%). Then, we propose an effective Conditional 3D-NMS to remove the redundant candidates. Moreover, we propose a novel Attention 3D-CNN model (see Fig. 1-Stage 2) that allows the model to focus on the most relevant regions when reducing false positives. We will demonstrate that the proposed pulmonary nodule detection architecture achieves a sensitivity as high as \(95.8\%\) with only two false positives per scan on the LUNA16 dataset, which is very competitive with current state-of-the-art methods.
2 Methods
As shown in Fig. 1, the proposed framework consists of two main parts: (1) the detection of nodule candidates by using FPN and (2) false positive reduction with the Attention 3D-CNN.
2.1 High Sensitivity Candidate Detection with FPN
Candidate detection is a crucial stage for pulmonary nodule detection. The purpose of this step is to recall all possible nodules subject to a reasonable number of candidates. In principle, sliding windows of various scales can cover all possible nodules. In practice, however, this is infeasible because the number of candidates becomes too large, which poses great challenges for subsequent operations. By contrast, the Region Proposal Network (RPN) [8] strikes a better balance between the computational cost and the number of recalled candidates. By using the RPN, Ding et al. [5] showed a higher sensitivity with even fewer candidates compared with traditional CAD systems. However, small nodules are still hard to detect with the original RPN. To this end, we propose an FPN-based detection architecture, described in detail below.
Like [5], we take three consecutive CT slices as input and resize them to \(3\times 1024\times 1024\) before feeding them into the network. To detect candidates, we design the FPN-based architecture illustrated in Fig. 2. The C1\(\sim \)C5 layers in the proposed network correspond to the VGG16 network [9], with one addition: a new \(2\times 2\) pooling layer with stride 2 after Conv5_3 of the original VGG16 network. In most cases, the C5 feature map is already effective for object detection in natural images. However, since nodules in CT are usually very small (3\(\sim \)30 mm in LUNA16 [3]), the features of small nodules become too weak or even disappear after passing through several pooling layers. We therefore take P3\(\sim \)P5 as the final features instead of C5. A \(1\times 1\) convolutional layer is attached to C5 to produce the coarsest-resolution map P5 with 256 channels. Since P5 has the lowest resolution of the three, we search for larger nodule proposals of size \(64 \times 64\) on it. Then, we upsample P5 by a factor of two and add it to the channel-reduced C4 to obtain a middle-resolution P4 layer. In a similar way, we obtain the P3 layer at the highest resolution. Both P3 and P4 contain higher-resolution information (which comes from C3 and C4) than the P5 layer for detecting smaller nodules. Thus, \(32 \times 32\) and \(16 \times 16\) nodule proposals are found at the P4 and P3 layers, respectively. With these feature maps at different resolutions, we use RPN-Net to generate nodule proposals and then classify them as nodule candidates or not.
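The top-down construction of P5, P4 and P3 described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature-map sizes are toy values, the weights are random, nearest-neighbour upsampling is an assumption (the text only says "upsample"), and the function names `lateral_1x1`, `upsample2x` and `top_down` are hypothetical.

```python
import numpy as np

def lateral_1x1(feature, weight):
    """1x1 convolution as a per-pixel linear map over channels.
    feature: (C_in, H, W), weight: (C_out, C_in) -> (C_out, H, W)."""
    return np.einsum("oc,chw->ohw", weight, feature)

def upsample2x(feature):
    """Nearest-neighbour upsampling by two in H and W (assumed mode)."""
    return feature.repeat(2, axis=1).repeat(2, axis=2)

def top_down(c3, c4, c5, w3, w4, w5):
    """Build P5 from C5, then merge downwards into P4 and P3."""
    p5 = lateral_1x1(c5, w5)                   # coarsest map, 256 channels
    p4 = upsample2x(p5) + lateral_1x1(c4, w4)  # add channel-reduced C4
    p3 = upsample2x(p4) + lateral_1x1(c3, w3)  # add channel-reduced C3
    return p3, p4, p5

rng = np.random.default_rng(0)
# Toy VGG16-like channel counts; spatial sizes are illustrative only.
c3 = rng.standard_normal((256, 16, 16))
c4 = rng.standard_normal((512, 8, 8))
c5 = rng.standard_normal((512, 4, 4))
w3 = rng.standard_normal((256, 256)) * 0.01
w4 = rng.standard_normal((256, 512)) * 0.01
w5 = rng.standard_normal((256, 512)) * 0.01

p3, p4, p5 = top_down(c3, c4, c5, w3, w4, w5)

# Proposal sizes per pyramid level, as stated in the text.
PROPOSAL_SIZE = {"P3": 16, "P4": 32, "P5": 64}
```

Each pyramid level keeps the spatial size of its lateral input while sharing a common channel width, which is what lets a single proposal head run on all three levels.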
2.2 Conditional 3D-NMS for Redundant Candidate Removal
Most pulmonary nodule candidates generated by the FPN are detected repeatedly because a nodule often appears in multiple consecutive slices. To reduce the unnecessary computational burden, we propose a simple yet effective Conditional 3D-NMS method to remove redundant candidates. The basic idea is to divide the candidates of the same CT scan into sets of highly overlapping candidates based on their positions and radii. Then we choose the candidate with the highest mean pixel value as the final candidate of each set. We use the mean pixel value as the condition because it corresponds to the Hounsfield unit (HU) value, which reflects the characteristics (pulmonary nodule or lung parenchyma) of the region. The algorithm is summarized in Algorithm 1.
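The grouping-and-selection idea can be sketched in a few lines of Python. This is an illustration of the description above rather than a reproduction of Algorithm 1: the overlap test (centre distance smaller than the larger radius), the greedy grouping against the first member of each set, and the names `Candidate`, `overlaps` and `conditional_3d_nms` are all assumptions.

```python
from dataclasses import dataclass
import math

@dataclass
class Candidate:
    x: float
    y: float
    z: float
    radius: float
    mean_value: float  # mean pixel value inside the candidate region

def overlaps(a, b):
    """Assumed overlap rule: centres closer than the larger radius."""
    dist = math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
    return dist < max(a.radius, b.radius)

def conditional_3d_nms(candidates):
    """Group overlapping candidates, keep the highest mean value per group."""
    groups = []
    for cand in candidates:
        for group in groups:
            if overlaps(cand, group[0]):  # compare with the group's seed
                group.append(cand)
                break
        else:
            groups.append([cand])         # no match: start a new group
    return [max(group, key=lambda c: c.mean_value) for group in groups]

# Illustrative values: two detections of one nodule on adjacent slices,
# plus one unrelated candidate elsewhere in the scan.
cands = [
    Candidate(100, 100, 50, 5, 0.30),
    Candidate(101, 100, 51, 5, 0.45),
    Candidate(300, 200, 80, 4, 0.20),
]
kept = conditional_3d_nms(cands)
```

Unlike score-based NMS, the survivor of each set is chosen by intensity rather than by detector confidence, which is the "condition" in the method's name.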
2.3 Attention 3D-CNN for False Positive Reduction
The proposed candidate detection method can recall almost all of the nodules. Nevertheless, a large number of false positives exist among those candidates since it is difficult to distinguish true nodules from highly similar false positives without using three-dimensional spatial contextual information. Some researchers take advantage of 3D CNN for false positive reduction [5, 6, 10]. By contrast, to use spatial information more effectively, we propose a novel Attention 3D-CNN architecture for false positive reduction.
As shown in Fig. 1 (Stage 2), the Attention 3D-CNN has two components. Branch A is the attention subnet (a U-Net structure [11]), which produces a 3D mask that is expected to have a high response near the nodule. Branch B is the 3D CNN classification network: we apply the resultant 3D mask to the source patch before it is fed into Branch B. This allows the network to focus on the lesion area while ignoring the noisy, irrelevant background. The detailed architecture of the Attention 3D-CNN is presented in Fig. 3, where the convolutional layers are followed by batch normalization and ReLU activation.
Branch A produces a mask that follows a Gaussian distribution around the nodule; the ground-truth mask used to train it is calculated as follows:
where \((\overline{x},\overline{y},\overline{z})\) represents the nodule centroid, (x, y, z) represents the voxel point of the mask in the CT scan, V is the mask value, and K is a constant term. We adopt \(3\sigma =1.5r\) instead of r, because we aim to consider context information in the model to better recognize nodules. Finally, Branch B outputs a classification probability to decide whether the current voxel is a nodule or not. We use a multi-task loss L to jointly train the network:
where the mask loss \(L_{mask}\) is mean squared error (MSE) between the ground truth mask and the prediction mask, the classification loss \(L_{cls}\) is focal loss [7] that is more effective than cross entropy loss to classify hard examples on an imbalanced set, and \(\lambda \) is a hyperparameter balancing between the two losses.
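The ground-truth mask and the joint loss can be sketched with NumPy under explicit assumptions. The mask is taken to be an unnormalized isotropic Gaussian, \(V = K\exp\left(-\frac{(x-\overline{x})^2+(y-\overline{y})^2+(z-\overline{z})^2}{2\sigma^2}\right)\), and the loss is taken to be \(L = L_{cls} + \lambda L_{mask}\); both forms are inferred from the symbol definitions above and may differ from the authors' exact expressions. The focal-loss parameters `gamma=2.0` and `alpha=0.25` are common defaults, not values reported in the text, and all function names are hypothetical.

```python
import numpy as np

def gaussian_mask(shape, center, radius, k=1000.0):
    """Assumed form: V = K * exp(-d^2 / (2 sigma^2)) with 3*sigma = 1.5*r."""
    sigma = 1.5 * radius / 3.0
    zz, yy, xx = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
    cz, cy, cx = center
    d2 = (xx - cx) ** 2 + (yy - cy) ** 2 + (zz - cz) ** 2
    return k * np.exp(-d2 / (2.0 * sigma ** 2))

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss; p is the predicted nodule probability, y in {0, 1}."""
    p = np.clip(p, eps, 1.0 - eps)
    pt = np.where(y == 1, p, 1.0 - p)          # probability of the true class
    at = np.where(y == 1, alpha, 1.0 - alpha)  # class-balancing weight
    return float(np.mean(-at * (1.0 - pt) ** gamma * np.log(pt)))

def multitask_loss(pred_mask, true_mask, p, y, lam=1.0):
    """Assumed combination: L = L_cls + lambda * L_mask (MSE on the mask)."""
    l_mask = float(np.mean((pred_mask - true_mask) ** 2))
    return focal_loss(p, y) + lam * l_mask

# Illustrative 32^3 patch with a nodule of radius 4 voxels at the centre.
mask = gaussian_mask((32, 32, 32), center=(16, 16, 16), radius=4.0)
loss = multitask_loss(mask, mask, np.array([0.9]), np.array([1]))
```

With \(3\sigma = 1.5r\), the mask stays noticeably above zero beyond the nodule boundary, which matches the stated aim of including surrounding context.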
3 Experiments
We evaluate the proposed framework on the LUNA16 [3] dataset. It contains 888 CT scans whose pulmonary nodules have been annotated by four experienced radiologists. The LUNA16 dataset is divided into ten subsets for ten-fold cross validation. Detection performance is evaluated by sensitivity and the average number of false positives per scan (FPs/scan). The overall score, the Competition Performance Metric (CPM), is defined as the average sensitivity at seven predefined false positive rates: 1/8, 1/4, 1/2, 1, 2, 4 and 8 FPs per scan.
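The CPM definition above amounts to averaging seven points read off the FROC curve. The sketch below assumes the curve is given as sampled (FPs/scan, sensitivity) pairs and that values between samples are obtained by linear interpolation, which is an assumption about the evaluation procedure; `cpm_score` is a hypothetical helper and the sample curve is made up for illustration.

```python
import numpy as np

# The seven operating points named in the text.
CPM_FP_RATES = (0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0)

def cpm_score(fps_per_scan, sensitivity):
    """Average sensitivity at the seven CPM false-positive rates."""
    fps = np.asarray(fps_per_scan, dtype=float)
    sens = np.asarray(sensitivity, dtype=float)
    order = np.argsort(fps)  # np.interp requires increasing x values
    at_rates = np.interp(CPM_FP_RATES, fps[order], sens[order])
    return float(at_rates.mean())

# Made-up FROC samples, for illustration only (not results from the paper).
example_fps = [0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
example_sens = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 1.00]
example_cpm = cpm_score(example_fps, example_sens)
```

Because the seven rates are spaced by factors of two, the score weights the low-FP end of the curve as heavily as the high-FP end.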
3.1 Implementation Details
In the candidate detection stage, we clip the values of CT scans to the Hounsfield unit range from −1000 to 400 and normalize them to the range (0, 1). In the false positive reduction stage, we crop \(36\times 36\times 36\) voxels around each detected candidate that remains after Conditional 3D-NMS. Then, data augmentation is used for training the Attention 3D-CNN network: we flip the voxels along the coronal, sagittal and axial dimensions and crop \(32\times 32\times 32\) patches as the input to the network. The constant term K used in generating the Gaussian masks is set to 1000, and the hyper-parameter \(\lambda \) is set based on the cross-validation result.
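These preprocessing steps can be sketched as follows. The HU window and the patch sizes come from the text; flipping each axis independently with probability 0.5 and choosing the crop offset uniformly at random are assumptions, and `normalize_hu` and `augment` are hypothetical names.

```python
import numpy as np

def normalize_hu(volume, lo=-1000.0, hi=400.0):
    """Clip to the HU window [lo, hi] and rescale linearly to [0, 1]."""
    clipped = np.clip(np.asarray(volume, dtype=np.float32), lo, hi)
    return (clipped - lo) / (hi - lo)

def augment(patch, rng, out_size=32):
    """Random flips along the three axes, then a random out_size^3 crop."""
    for axis in range(3):
        if rng.random() < 0.5:  # assumed flip probability
            patch = np.flip(patch, axis=axis)
    # Upper bound of rng.integers is exclusive, so offsets span 0..(s - out_size).
    z, y, x = (int(rng.integers(0, s - out_size + 1)) for s in patch.shape)
    return patch[z:z + out_size, y:y + out_size, x:x + out_size]

rng = np.random.default_rng(0)
# Synthetic 36^3 patch of HU values, for illustration only.
raw_patch = rng.uniform(-1200.0, 600.0, size=(36, 36, 36))
net_input = augment(normalize_hu(raw_patch), rng)
```

Cropping 32-voxel patches from a 36-voxel cube gives up to five offsets per axis, so the nodule centre shifts slightly between training samples.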
3.2 Ablation Study and Results
To check the contribution of each step in the proposed framework, we perform an ablation study. As shown in Table 1, our candidate detection network (Cand-Det) can achieve a sensitivity of 98.7% with an average of 179.6 detected candidate nodules per scan. Compared with the ‘FUSION’ result that combines five traditional CAD systems (ISICAD\(\sim \)ETROCAD), the proposed candidate detection method can achieve higher sensitivity with fewer candidates.
After candidate detection, the Conditional 3D-NMS (Cand-Det-CNMS) method is adopted to remove redundant candidates. From the table, we can see that the Conditional 3D-NMS is more effective at removing redundancy than the normal 3D-NMS. Finally, we use the Attention 3D-CNN to remove the large number of remaining false positives and reach a sensitivity of 95.8% with only 6.6 detected nodules per scan. By comparison, if we remove Branch A from the Attention 3D-CNN network, it becomes a typical 3D CNN, and the result drops to 94.6% sensitivity with 8.5 detections per scan. This shows that adding the attention branch helps improve the sensitivity as well as reduce false detections. The FROC curve for each step is plotted in Fig. 4. We can see that the performance of the proposed framework improves as these steps are combined.
To further analyze the performance of the proposed framework, we compare it with state-of-the-art methods [5, 6, 10] using the CPM score. In Table 2, we can see that the CPM score of the proposed system is 87.8%, which is better than the methods proposed in [6, 10] and only slightly lower than FRCN+3DCNN. However, in clinical practice, radiologists are more concerned with the sensitivities when the FPs per scan rate varies from 1 to 4 [5, 6], where the proposed architecture achieves the best performance.
4 Conclusion
In this work, we propose an architecture for the detection of pulmonary nodules. The architecture first produces nodule candidates with high sensitivity using an FPN-based detection network. Then, a simple and effective Conditional 3D-NMS method removes the redundant candidates. Finally, a novel Attention 3D-CNN network is proposed to reduce the large number of false positives. Experiments show that our architecture achieves a highly sensitive result with few candidates. The architecture can also be extended to other similar object detection tasks in CT scans.
References
1. Siegel, R.L., Miller, K.D., Jemal, A.: Cancer statistics, 2018. CA Cancer J. Clin. 68(1), 7–30 (2018)
2. Murphy, K., van Ginneken, B., Schilham, A.M.R., et al.: A large-scale evaluation of automatic pulmonary nodule detection in chest CT using local image features and k-nearest-neighbour classification. Med. Image Anal. 13(5), 757–770 (2009)
3. Setio, A.A.A., Traverso, A., De Bel, T., et al.: Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge. Med. Image Anal. 42, 1–13 (2017)
4. Setio, A.A.A., Ciompi, F., Litjens, G., et al.: Pulmonary nodule detection in CT images: false positive reduction using multi-view convolutional networks. IEEE Trans. Med. Imaging 35(5), 1160–1169 (2016)
5. Ding, J., Li, A., Hu, Z., Wang, L.: Accurate pulmonary nodule detection in computed tomography images using deep convolutional neural networks. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (eds.) MICCAI 2017. LNCS, vol. 10435, pp. 559–567. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66179-7_64
6. Dou, Q., Chen, H., Jin, Y., Lin, H., Qin, J., Heng, P.-A.: Automated pulmonary nodule detection via 3D ConvNets with online sample filtering and hybrid-loss residual learning. In: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (eds.) MICCAI 2017. LNCS, vol. 10435, pp. 630–638. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66179-7_72
7. Lin, T.Y., Dollár, P., Girshick, R., et al.: Feature pyramid networks for object detection. In: CVPR, vol. 1, no. 2, p. 4 (2017)
8. Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
9. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
10. Dou, Q., Chen, H., Yu, L., et al.: Multilevel contextual 3-D CNNs for false positive reduction in pulmonary nodule detection. IEEE Trans. Biomed. Eng. 64(7), 1558–1567 (2017)
11. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, B., Qi, G., Tang, S., Zhang, L., Deng, L., Zhang, Y. (2018). Automated Pulmonary Nodule Detection: High Sensitivity with Few Candidates. In: Frangi, A., Schnabel, J., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2018. MICCAI 2018. Lecture Notes in Computer Science(), vol 11071. Springer, Cham. https://doi.org/10.1007/978-3-030-00934-2_84
DOI: https://doi.org/10.1007/978-3-030-00934-2_84
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00933-5
Online ISBN: 978-3-030-00934-2
eBook Packages: Computer Science, Computer Science (R0)