Real-Time Lightweight Convolutional Neural Network for Polyp Detection in Endoscope Images

Journal of Shanghai Jiaotong University (Science)

Abstract

Colorectal cancer is among the most common cancers and has the second-highest mortality rate. Polyp lesions are precursors of colorectal cancer, and detecting and removing polyps at an early stage can effectively reduce patient mortality. However, an endoscopy generates a large number of images, which greatly increases the workload of doctors, and prolonged, repetitive screening of endoscopy images also leads to a high misdiagnosis rate. To address the heavy dependence of computer-aided diagnosis models on computational power in the polyp detection task, we propose a lightweight model, coordinate attention-YOLOv5-Lite-Prune, based on the YOLOv5 algorithm. It differs from state-of-the-art methods in existing research, which apply object detection models or their variants (such as faster region-based convolutional neural networks, YOLOv3, YOLOv4, and the single shot multibox detector) directly to the prediction task without any lightweight processing. The innovations of our model are as follows. First, the lightweight EfficientNetLite network is introduced as the new feature extraction network. Second, depthwise separable convolution and its improved modules with different attention mechanisms replace the standard convolution in the detection head. Third, the α-intersection over union (α-IoU) loss function is applied to improve the precision and convergence speed of the model. Finally, the model size is compressed with a pruning algorithm. Our model effectively reduces the parameter count and computational complexity without significant accuracy loss. As a result, the model can be deployed on an embedded deep learning platform and detects polyps at more than 30 frames per second, which frees it from the usual reliance of deep learning models on high-performance servers.
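
Two of the ingredients named in the abstract can be illustrated with a few lines of arithmetic. The sketch below is not the authors' implementation: it only counts the weights of a standard convolution versus a depthwise separable one, and evaluates the basic α-IoU loss, 1 − IoU^α, for a pair of boxes. The function names, the (x1, y1, x2, y2) box format, the channel sizes, and the default α = 3 are all assumptions made for illustration; the abstract does not state which α or which layer sizes the model uses.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise convolution (one filter per input
    channel) followed by a 1 x 1 pointwise convolution (bias ignored)."""
    return k * k * c_in + c_in * c_out


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(box_pred, box_true, alpha=3.0):
    """Basic alpha-IoU loss, 1 - IoU**alpha. alpha=3.0 is an illustrative
    default (assumption), not a value taken from the abstract."""
    return 1.0 - iou(box_pred, box_true) ** alpha


if __name__ == "__main__":
    # Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
    print(conv_params(128, 256, 3), depthwise_separable_params(128, 256, 3))
    # Hypothetical predicted and ground-truth boxes.
    print(alpha_iou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

For the hypothetical 128-to-256-channel 3 × 3 layer, the standard convolution has 294912 weights and the depthwise separable version has 33920, which is the kind of reduction that motivates the replacement in the detection head. Setting alpha to 1 reduces the loss to the ordinary IoU loss, 1 − IoU.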

Author information

Corresponding author

Correspondence to Zhiwu Wang (王志武).

Additional information

Foundation item: the National Natural Science Foundation of China (Nos. 81971767, 62103263 and 62103267), and the Shanghai Science and Technology Commission (Nos. 19142203800, 19441913800 and 19441910600)

About this article

Cite this article

Si, B., Pang, C., Wang, Z. et al. Real-Time Lightweight Convolutional Neural Network for Polyp Detection in Endoscope Images. J. Shanghai Jiaotong Univ. (Sci.) (2023). https://doi.org/10.1007/s12204-023-2671-2

  • DOI: https://doi.org/10.1007/s12204-023-2671-2
