Abstract
Colorectal cancer is one of the most common cancers and has the second-highest mortality rate. Polyps are precursor lesions of colorectal cancer, and their early detection and removal can effectively reduce patient mortality. However, an endoscopy generates a large number of images, which greatly increases physicians' workload, and prolonged manual screening of endoscopic images also leads to a high misdiagnosis rate. To address the heavy dependence of computer-aided diagnosis models on computational power in the polyp detection task, we propose a lightweight model based on the YOLOv5 algorithm, coordinate attention-YOLOv5-Lite-Prune. It differs from state-of-the-art methods in existing research, which apply object detection models or their variants, such as faster region-based convolutional neural networks, YOLOv3, YOLOv4, and the single shot multibox detector, directly to the prediction task without any lightweight processing. The innovations of our model are as follows: First, the lightweight EfficientNet-Lite network is introduced as the new feature extraction network. Second, depthwise separable convolutions and their improved variants with different attention mechanisms replace the standard convolutions in the detection head. Third, the α-intersection over union (α-IoU) loss function is applied to improve the precision and convergence speed of the model. Finally, the model size is compressed with a pruning algorithm. Our model effectively reduces the parameter count and computational complexity without significant accuracy loss. It can therefore be deployed on an embedded deep learning platform and detect polyps at more than 30 frames per second, freeing the model from the limitation that deep learning models must rely on high-performance servers.
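The parameter savings from replacing standard convolutions with depthwise separable ones can be illustrated with a back-of-the-envelope count (the channel sizes and kernel size below are illustrative assumptions, not values taken from the paper):

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a standard k x k convolution (bias terms omitted)."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    """Depthwise separable convolution: a k x k depthwise convolution
    (one filter per input channel) followed by a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

# Illustrative layer: 128 -> 128 channels, 3 x 3 kernel
standard = conv_params(128, 128, 3)           # 147456 parameters
separable = dw_separable_params(128, 128, 3)  # 17536 parameters
print(standard, separable, round(standard / separable, 1))  # roughly 8x fewer
```

This factor-of-eight reduction per layer is why swapping convolutions in the detection head shrinks the model so markedly.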
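The α-IoU family generalizes the usual IoU loss 1 − IoU to a power form 1 − IoU^α. A minimal sketch in plain Python, with α = 3 as an assumed default (the abstract does not state which α the model uses):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

def alpha_iou_loss(box_pred, box_gt, alpha=3.0):
    """Basic alpha-IoU loss, 1 - IoU**alpha; alpha > 1 up-weights
    the gradient for high-IoU (nearly correct) boxes."""
    return 1.0 - iou(box_pred, box_gt) ** alpha
```

For a perfect prediction the loss is 0; raising α sharpens the loss landscape near IoU = 1, which is the source of the faster convergence the abstract mentions.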
Foundation item: the National Natural Science Foundation of China (Nos. 81971767, 62103263 and 62103267), and the Shanghai Science and Technology Commission (Nos. 19142203800, 19441913800 and 19441910600)
Cite this article
Si, B., Pang, C., Wang, Z. et al. Real-Time Lightweight Convolutional Neural Network for Polyp Detection in Endoscope Images. J. Shanghai Jiaotong Univ. (Sci.) (2023). https://doi.org/10.1007/s12204-023-2671-2