Abstract
Deep learning (DL) has been widely used to detect abnormalities in retinal images. Typically, this task focuses on a single domain, such as lesions related to glaucoma or to diabetic retinopathy. In this study, we propose to identify lesions associated with both diseases using a single base model, Cascade R-CNN, avoiding the use of multiple DL models. The task is complicated by the need to annotate a dataset with lesions from a domain other than the one for which it was created. In addition, variation in the size and shape of objects and a bias toward predominant classes pose further challenges. Several techniques characterize this work, including soft labeling for mask predictions, normalized Wasserstein distance for handling small objects, and experiments with image sampling during training using cross-entropy loss combined with Online Hard Negative Mining or asymmetric loss. For result refinement, a cluster-weighted approach with Distance IoU improved the final predictions. The model reached a mean average precision (mAP), a standard metric for object detection models, of 0.46; all experiments were conducted on the public DDR dataset. A detailed error analysis by category is provided. In conclusion, the feasibility of using a single model was demonstrated, and the techniques employed helped to increase mAP-related metrics. Our research provides novel insights into the use of retinal photographs for the prediction of systemic biomarkers associated with multiple diseases.
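To make two of the box-similarity measures named in the abstract concrete, the following is a minimal sketch of Distance IoU and of the normalized Wasserstein distance as they are commonly formulated in the object detection literature. It is not the authors' implementation: the function names, the `(x1, y1, x2, y2)` box format, and the default value of `constant` (a dataset-dependent normalizer) are assumptions made for illustration.

```python
import math


def diou(box_a, box_b):
    """Distance IoU for two boxes given as (x1, y1, x2, y2).

    DIoU = IoU - d^2 / c^2, where d is the distance between box centers
    and c is the diagonal of the smallest box enclosing both.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU from the intersection and union areas.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
        + (max(ay2, by2) - min(ay1, by1)) ** 2
    return iou - d2 / c2 if c2 > 0 else iou


def nwd(box_a, box_b, constant=12.8):
    """Normalized Wasserstein distance for two boxes (x1, y1, x2, y2).

    Each box is modeled as a 2D Gaussian with mean (cx, cy) and standard
    deviations (w / 2, h / 2). The 2-Wasserstein distance between two such
    Gaussians has a closed form, which is mapped into (0, 1] with exp(-W / C).
    `constant` is an assumed, dataset-dependent value, not taken from the paper.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    w2 = math.sqrt(
        ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
        + (((ax2 - ax1) - (bx2 - bx1)) / 2) ** 2
        + (((ay2 - ay1) - (by2 - by1)) / 2) ** 2
    )
    return math.exp(-w2 / constant)


# Two tiny, non-overlapping 4x4 boxes, one pixel apart.
small_a = (10, 10, 14, 14)
small_b = (15, 10, 19, 14)
diou_value = diou(small_a, small_b)  # negative: IoU is 0, only the distance penalty remains
nwd_value = nwd(small_a, small_b)    # stays well above 0 despite zero overlap
```

For small lesions, a shift of a few pixels can drive IoU to zero, whereas the Wasserstein-based measure decays smoothly with distance, which is the usual motivation for using it with small objects.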
Data availability
Not applicable.
Acknowledgements
The authors would like to thank the National Council of Research of Mexico (CONACyT), the Faculty of Engineering at the Autonomous University of Querétaro, the Instituto Mexicano de Oftalmología IAP, and the Department of Electrical and Computer Engineering at the University of Saskatchewan for their advice and support in this investigation.
Funding
This research received no external funding.
Author information
Contributions
GA-F: Conceptualization, methodology, software, validation, formal analysis, data curation, writing—original draft, and visualization; ST-A: Conceptualization, writing—original draft, supervision, and project administration; JCP-O: Writing—review and editing, and supervision; MT-A: Methodology and funding; MAA-F: Writing—reviewing and editing; JR-R: Writing—reviewing and editing; MB-F: Review and editing, and supervision; S-BK: Writing—reviewing and editing. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
About this article
Cite this article
Alfonso-Francia, G., Pedraza-Ortega, J.C., Toledano-Ayala, M. et al. Unraveling the complexity: deep learning for imbalanced retinal lesion detection and multi-disease identification. Netw Model Anal Health Inform Bioinforma 13, 3 (2024). https://doi.org/10.1007/s13721-023-00438-x
DOI: https://doi.org/10.1007/s13721-023-00438-x