A Survey on Deep Learning Approaches to Medical Images and a Systematic Look up into Real-Time Object Detection

  • Review article
  • Published in: Archives of Computational Methods in Engineering

Abstract

This article gives a gentle introduction to Artificial Intelligence and to the concepts of machine learning (ML) and deep learning (DL). The rapid developments in DL techniques motivated this study. DL has flourished from basic theoretical concepts to applications. Deep neural networks are now state-of-the-art ML models commonly used in academia and industry in several fields, from image recognition to natural language processing. These advances have immense potential for medical imaging technology, medical data processing, medical diagnostics and general healthcare. Our aim is two-fold: (1) a survey of DL approaches to medical images and (2) a survey of DL-based object detection approaches. The article describes recent advances, advanced learning technologies and the platforms used for DL approaches. Object detection is the most explored and challenging concept in the field of computer vision systems. The field is receiving growing attention from researchers because it covers real-time applications such as face, pedestrian and text detection. The role of object detection is to detect the target objects present in an image or in video frames and to classify them into their relevant classes. The review of object detection begins with recent works and the datasets used, and then explores real-time applications in terms of their learning strategies. Finally, the article investigates the challenges of DL models and discusses promising future directions in both research areas.


References

  1. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436

    Google Scholar 

  2. Dahl GE, Yu D, Deng L, Acero A (2012) Context-dependent pre-trained deep neural networks for large- vocabulary speech recognition. IEEE Trans Actions Audio Speech Lang Process 20:30–42

    Google Scholar 

  3. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNET classification with deep convolutional neural networks. In: Advances in neural information processing systems. pp 1097–1105

  4. Peters M, Neumann M, Iyyer M, Gardner M, Clark C, Lee K et al (2018) Deep contextualized word representations. In: Proceedings of the 2018 conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (long papers), vol 1. pp 2227–2237

  5. Howard J, Ruder S (2018) Universal language model fine-tuning for text classification. In: Proceedings of the 56th annual meeting of the Association for Computational Linguistics (volume 1: long papers). pp 328–339

  6. Radford A, Narasimhan K, Salimans T, Sutskever I (2018) Improving language understanding by generative pre-training. https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf,2018

  7. Xiong W, Wu L, Alleva F, Droppo J, Huang X, Stolcke A (2018) The Microsoft 2017 conversational speech recognition system. In: Proceedings speech and signal processing (ICASSP) 2018 IEEE international conference acoustics. pp 5934–5938

  8. van den Oord A, Dieleman S, Zen H, Simonyan K, Vinyals O, Graves A et al (2016) WaveNet: a generative model for raw audio. arXiv:1609.03499v2

  9. Guo C, Berkhahn F (2016) Entity embeddings of categorical variables. arXiv:1604.06737

  10. De Brébisson A, Simon É, Auvolat A, Vincent P, Bengio Y (2015) Artificial neural networks applied to taxi destination prediction. arXiv preprint arXiv:1508.00021

  11. Cheplygina V, de Bruijnea M, Pluimb JPW (2019) Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. Med Image Anal 54:280–296

    Google Scholar 

  12. Yan Z, Zhan Y, Peng Z, Liao S, Shinagawa Y, Zhang S, Metaxas DN, Zhou XS (2016) Multi-instance deep learning: discover discriminative local anatomies for bodypart recognition. IEEE Trans Med Imaging 35(5):1332–1343

    Google Scholar 

  13. Anthimopoulos M, Christodoulidis S, Ebner L, Christe A, Mougiakakou S (2016) Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE Trans Med Imaging 35(5):1207–1216

    Google Scholar 

  14. Shen W, Zhou M, Yang F, Yang C, Tian J (2015) Multi-scale convolutional neural networks for lung nodule classification. In: International conference on information processing in medical imaging. Springer, pp 588–599

  15. Schlemper J, Caballero J, Hajnal JV, Price A, Rueckert D (2017) A deep cascade of convolutional neural networks for MR image reconstruction. In: International conference on information processing in medical imaging. Springer, pp 647–658

  16. Mehta J, Majumdar A (2017) Rodeo: robust de-aliasing autoencoder for real-time medical image reconstruction. Pattern Recogn 63:499–510

    Google Scholar 

  17. Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, Pal C, Jodoin P-M, Larochelle H (2017) Brain tumor segmentation with deep neural networks. Med Image Anal 35:18–31

    Google Scholar 

  18. Gillies RJ, Kinahan PE, Hricak H (2016) Radiomics: images are more than pictures, they are data. Radiology 278:563–577

    Google Scholar 

  19. Aerts HJ, Velazquez ER, Leijenaar RT, Parmar C, Grossmann P, Carvalho S et al (2014) Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun 5:4006

    Google Scholar 

  20. Lambin P, Leijenaar RTH, Deist TM, Peerlings J, de Jong EEC, van Timmeren J et al (2017) Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol 14:749–762

    Google Scholar 

  21. Kamnitsas K, Baumgartner C, Ledig C, Newcombe V, Simpson J, Kane A, Menon D, Nori A, Criminisi A, Rueckert D et al (2017) Unsupervised domain adaptation in brain lesion segmentation with adversarial networks. In: International conference on information processing in medical imaging. Springer, pp 597–609

  22. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems 25. Curran Associates, Inc., New York, pp 1097–1105

    Google Scholar 

  23. Kim J, Hong J, Park H (2018) Prospects of deep learning for medical imaging. Precis Future Med 2(2):37–52. https://doi.org/10.23838/pfm.2018.00030

    Article  Google Scholar 

  24. Karpathy A, Toderici G, Shetty S, Leung T, Sukthankar R, Fei-Fei L (2014) Largescale video classification with convolutional neural networks. In: IEEE conference on computer vision and pattern recognition. IEEE Computer Society, pp 1725–1732

  25. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems, vol 25. Curran Associates, New York, pp 1097–1105

    Google Scholar 

  26. Ker, J., Wang, L., Rao, J.P., & Lim, T. (2018). Deep Learning Applications in Medical Image Analysis. IEEE Access, 6, 9375-9389

  27. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556

  28. Noh H, Hong S, Han B (2015) Learning deconvolution network for semantic segmentation. In: Proceedings of the IEEE international conference on computer vision. pp 1520–1528

  29. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of IEEE conference computer vision and pattern recognition (CVPR). pp 1–9

  30. LeCun Y, Jackel LD, Bottou L, Cortes C, Denker JS, Drucker H, Guyon I, Müller UA, Säckinger E, Simard P, Vapnik V (1995) Learning algorithms for classification: a comparison on handwritten digit recognition. In: Neural networks: the statistical mechanics perspective. World Scientific, Singapore. pp 261–276. https://nyuscholars.nyu.edu/en/publications/learning-algorithms-forclassification-a-comparison-on-handwritte

  31. Srivastava, S., Soman, S., Rai, A., & Srivastava, P.K. (2017). Deep learning for health informatics: Recent trends and future directions. 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 1665–1670.

  32. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, van der Laak JAWM, van Ginneken B, Sánchez CI (2017) A survey on deep learning in medical image analysis. Med Image Anal 42:60–88

    Google Scholar 

  33. Qayyum A, Qadir J, Bilal M, Al-Fuqaha A (2020) Secure and robust machine learning for healthcare: a survey. arXiv:2001.08103v1

  34. Sohail MN, Ren J, Uba Muhammad M (2019) A euclidean group assessment on semi-supervised clustering for healthcare clinical implications based on real-life data. Int J Environ Res Public Health 16(9):1581

    Google Scholar 

  35. Zahin A, Hu RQ et al (2019) Sensor-based human activity recognition for smart healthcare: a semi-supervised machine learning. In: International conference on artificial intelligence for communications and networks. Springer, pp 450–472

  36. Mahapatra D (2017) Semi-supervised learning and graph cuts for consensus based medical image segmentation. Pattern Recogn 63:700–709

    Google Scholar 

  37. Bai W, Oktay O, Sinclair M, Suzuki H, Rajchl M, Tarroni G, Glocker B, King A, Matthews PM, Rueckert D (2017) Semisupervised learning for network-based cardiac MR image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 253–260

  38. Chapelle O, Schölkopf B, Zien A (2006) Semi-supervised learning, 1st edn. The MIT Press, Cambridge

    Google Scholar 

  39. Zhu X (2008) Semi-supervised learning literature survey. Technical Report. 1530, University of Wisconsin Madison

  40. Sutton RS, Barto AG et al (1998) Introduction to reinforcement learning, vol 2(4). MIT Press, Cambridge

    MATH  Google Scholar 

  41. Kao H-C, Tang K-F, Chang EY (2018) Context-aware symptom checking for disease diagnosis using hierarchical reinforcement learning. In: Thirty-second AAAI conference on artificial intelligence

  42. Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M et al (2016) Mastering the game of go with deep neural networks and tree search. Nature 529(7587):484

    Google Scholar 

  43. Cheplygina V, de Bruijne M, Pluim JPW (2019) Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis. Med Image Anal 54:280–296

    Google Scholar 

  44. Altaf, F., Islam, S.M., Akhtar, N., & Janjua, N.K. (2019). Going Deep in Medical Image Analysis: Concepts, Methods, Challenges, and Future Directions. IEEE Access, 7, 99540–99572.

  45. White BW, Rosenblatt F (1963) ‘Principles of neurodynamics: Perceptrons and the theory of brain mechanisms.’ Am J Psychol 76(4):705

    Google Scholar 

  46. Lundervold, A.S., & Lundervold, A. (2019). An overview of deep learning in medical imaging focusing on MRI. Zeitschrift fur medizinische Physik, 29 2, 102–127 .

  47. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learningapplied to document recognition. Proc IEEE 86:2278–2324

    Google Scholar 

  48. Ravi D, Wong C, Deligianni F, Berthelot M, Andreu-Perez J, Lo B et al (2017) Deep learning for health informatics. IEEE J Biomed HealthInform 21:4–21

    Google Scholar 

  49. Kuhlmann L, Lehnertz K, Richardson MP, Schelter B, Zaveri HP (2018) Seizure prediction—ready for a new era. Nat Rev Neurol 14:618–630

    Google Scholar 

  50. Leshno M, Lin VY, Pinkus A, Schocken S (1993) Multilayer feedforwardnetworks with a nonpolynomial activation function can approximateany function. Neural Netw 6:861–867

    Google Scholar 

  51. Sonoda S, Murata N (2017) Neural network with unbounded activation functions is universal approximator. Appl Comput Harm Anal 43:233–268

    MathSciNet  MATH  Google Scholar 

  52. Clevert D-A, Unterthiner T, Hochreiter S (2015) Fast and accurate deep net-work learning by exponential linear units (ELUS). arXiv:1511.07289

  53. He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on image net classification. In: Proceedings of the IEEE international conference on computer vision. pp 1026–1034

  54. Springenberg JT, Dosovitskiy A, Brox T, Riedmiller M (2014) Striving for simplicity: the all convolutional net. arXiv:1412.6806

  55. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15:1929–1958

    MathSciNet  MATH  Google Scholar 

  56. Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning. pp 448–456

  57. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge

    MATH  Google Scholar 

  58. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

    Google Scholar 

  59. Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv:1412.3555

  60. Yu AW, Lee H, Le QV (2017) Learning to skim text. arXiv:1704.06877

  61. Lai S, Xu L, Liu K, Zhao J (2015) Recurrent convolutional neural networks for text classification. In: Proceedings of AAAI conference on artificial intelligence (AAAI)

  62. Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv:1406.1078

  63. Chung J, Gulcehre C, Cho K, Bengio Y (2015) Gated feedback recurrent neural networks. In: International conference on machine learning. pp 2067–2075

  64. Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. pp 234–241

  65. Liu L, Cheng J, Quan Q, Fang-Xiang Wu, Wang Y-P, Wang J (2020) A survey on U-shaped networks in medical image segmentations. Neurocomputing 409(7):244–258

    Google Scholar 

  66. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 770–778

  67. Honari S, Yosinski J, Vincent P, Pal C (2016) Recombinator networks: Learning coarse-to fine feature aggregation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 5743–5752

  68. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: IEEE conference on computer vision and pattern recognition. pp 770–778

  69. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 4700–4708

  70. Opitz D, Maclin R (1999) Popular ensemble methods: an empirical study. J Artif Intell Res 11:169–198

    MATH  Google Scholar 

  71. Li H, Jiang G, Zhang J, Wang R, Wang Z, Zheng W-S, Menze B (2018) Fully convolutional network ensembles for white matter hyperintensities segmentation in MR images. Neuroimage 183:650–665

    Google Scholar 

  72. Milletari F, Navab N, Ahmadi SA (2016) V-Net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 fourth international conference on 3D vision. IEEE, pp 565–571

  73. Perez L, Wang J. The effectiveness of data augmentation in image classification using deep learning. arXiv preprint arXiv:1712.04621

  74. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22:1345–1359

    Google Scholar 

  75. Jialin-Pan S, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359

    Google Scholar 

  76. Intisar Rizwan I, Haque JN (2020) Deep learning approaches to biomedical image segmentation. Inform Med Unlocked 18:100297

    Google Scholar 

  77. Kingma DP, Welling M (2013) Auto-encoding variational bayes. arXiv:1312.6114

  78. Ackley DH, Hinton GE, Sejnowski TJ (1985) A learning algorithm for boltzmann machines. Cogn Sci 9(1):147–169

    Google Scholar 

  79. Smolensky P, Smolensky P (1986) Information processing in dynamical systems: Foundations of harmony theory. In: Rumelhart DE (ed) Parallel distributed processing. Explorations in the microstructure of cognition: foundations, vol 1. MIT Press, Cambridge, pp 194–281

    Google Scholar 

  80. van Tulder G, de Bruijne M (2016) Combining generative and discriminative representation learning for lung ct analysis with convolutional restricted boltzmann machines. IEEE Trans Med Imaging 35(5):1262–1272

    Google Scholar 

  81. Ji NN, Zhang SZ, Zhang CX (2014) A sparse response deep belief network based on rate distortion theory. Pattern Recogn 47(9):3179–3191

  82. Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554

    MathSciNet  MATH  Google Scholar 

  83. Bengio Y, Lamblin P, Popovici D, Larochelle H (2007) Greedy layerwise training of deep networks. In: Advances in neural information processing systems. pp 153–160

  84. Khatami A, Khosravi A, Nguyen T, Lim CP, Nahavandi S (2017) Medical image analysis using wavelet transform and deep belief networks. Expert Syst Appl 86:190–198

    Google Scholar 

  85. An open-source software library for machine intelligence (2017) https://www.tensorflow.org/

  86. Shi S, Wang Q, Xu P, Chu X (2016) Benchmarking state-of-the-art deep learning software tools. ArXiv e-prints.

  87. Deep learning: For data scientists who need to deliver (2017) https://skymind.ai/

  88. Deep learning for java: Open-source, distributed, deep learning library for the jvm (2017) https://deeplearning4j.org/

  89. Theano (2017) http://deeplearning.net/software/theano/

  90. Torch: A scientific computing framework for luajit (2017) http://torch.ch/

  91. Shi S, Wang Q, Xu P, Chu X (2016) Benchmarking state-of-the-art deep learning software tools. ArXiv e-prints

  92. The microsoft cognitive toolkit (2017) https://docs.microsoft.com/en-us/cognitive-toolkit/

  93. Caffe (2017) http://caffe.berkeleyvision.org/

  94. Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T (2014) Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093

  95. Caffe2: A new lightweight, modular, and scalable deep learning framework (2017) https://caffe2.ai/

  96. Apache mxnet: A flexible and efficient library for deep learning (2017) https://mxnet.apache.org/

  97. Keras: The python deep learning library (2017) https://keras.io/

  98. I. of H.-C. Center. Chest X-ray NIHCC (2017) https://nihcc.app.box.com/v/ChestXray-NIHCC. Accessed 10 Nov 2019

  99. Yan K, Wang X, Lu L, Summers RM (2018) DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning. J Med Imaging 5(1):03

    Google Scholar 

  100. Wang G (2016) A perspective on deep imaging. IEEE Access 4:8914–8924

    Google Scholar 

  101. Lo SCB, Lou SLA, Lin J-S, Freedman MT, Chien MV, Mun SK (1995) Artificial convolution neural network techniques and applications for lung nodule detection. IEEE Trans Med Imag 14(4):711–718

    Google Scholar 

  102. Rajkomar A, Lingam S, Taylor AG, Blum M, Mongan J (2017) Highthroughput classification of radiographs using deep convolutional neural networks. J Digit Imag 30(1):95–101

    Google Scholar 

  103. Huang G, Liu Z, Weinberger KQ, van der Maaten L (2016) Densely connected convolutional networks. arXiv:1608.06993

  104. Rajpurkar P et al (2017) CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning. arXiv:1711.05225

  105. Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM (2017) ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. arXiv:1705.02315

  106. Shen W, Zhou M, Yang F, Yang C, Tian J (2015) ‘Multi-scale convolutional neural networks for lung nodule classification. In: Information processing in medical imaging, vol 24. Springer, Cham, pp 588–599

  107. Li R et al (2014) Deep learning based imaging data completion for improved brain disease diagnosis. Med Image Comput Comput Assist Interv 17:305–312

    Google Scholar 

  108. Heinsfeld AS, Franco AR, Craddock RC, Buchweitz A, Meneguzzi F (2017) Identification of autism spectrum disorder using deep learning and the ABIDE dataset. Neuroimage Clin 17:16–23. https://doi.org/10.1016/j.nicl.2017.08.017

    Article  Google Scholar 

  109. Awan R, Koohbanani NA, Shaban M, Lisowska A, Rajpoot N (2018) Context-aware learning using transferable features for classification of breast cancer histology images. In: Proceedings of international conference image analysis recognition. Springer, Cham, Switzerland. pp 788–795

  110. Gargeya R, Leng T (2017) Automated identification of diabetic retinopathy using deep learning. Ophthalmology 124(7):962–969

    Google Scholar 

  111. Tomczak JM, Ilse M, Welling M, Jansen M, Coleman HG, Lucas M, de Laat K, de Bruin M, Marquering H, van der Wel MJ, de Boer OJ, Heijink CDS, Meijer SL (2018) Histopathological classification of precursor lesions of esophageal adenocarcinoma: a deep multiple instance learning approach. In: Proceedings of 1st Conference Medical Imaging Deep Learning (MIDL). pp 1–3

  112. Frid-Adar M, Klang E, Amitai M, Goldberger J, Greenspan H (2018) Synthetic data augmentation using GAN for improved liver lesion classification. In: Proceedings of IEEE 15th International Symposium Biomedical Imaging (ISBI). pp 289–293

  113. Islam J, Zhang Y (2018) Early diagnosis of Alzheimer’s disease: a neuroimaging study with deep learning architectures. In: Proceedings of IEEE conference computer vision and pattern recognition workshops. pp 1881–1883

  114. Marcus DS, Fotenos AF, Csernansky JG, Morris JC, Buckner RL (2010) Open access series of imaging studies: longitudinal MRI data in nondemented and demented older adults. J Cogn Neurosci 22(12):2677–2684

    Google Scholar 

  115. Szegedy C, Ioffe S, Vanhoucke V, Alemi AA (2017) Inception-V4, inception-resnet and the impact of residual connections on learning. In: Proceedings of AAAI, vol 4. p 12

  116. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of IEEE conference computer vision and pattern recognition. pp 770–778

  117. Roth HR, Lee CT, Shin H-C, Seff A, Kim L, Yao J, Lu L, Summers RM (2015) Anatomy-specific classification of medical images using deep convolutional nets. In: 2015 IEEE 12th international symposium on biomedical imaging (ISBI). pp 101–104

  118. Shin H-C, Orton MR, Collins DJ, Doran SJ, Leach MO (2013) Stacked autoencoders for unsupervised feature learning and multiple organ detection in a pilot study using 4d patient data. IEEE Trans Pattern Anal Mach Intell 35(8):1930–1943

    Google Scholar 

  119. Alaverdyan Z, Jung J, Bouet R, Lartizien C (2018) Regularized siamese neural network for unsupervised outlier detection on brain multiparametric magnetic resonance imaging: Application to epilepsy lesion screening. In: Proceedings of 1st conference medical imaging deep learning (MIDL)

  120. Chiang T-C, Huang Y-S, Chen R-T, Huang C-S, Chang R-F (2019) Tumor detection in automated breast ultrasound using 3-D CNN and prioritized candidate aggregation. IEEE Trans Med Imag 38(1):240–249

    Google Scholar 

  121. Schlegl T, Waldstein SM, Bogunovic H, Endstraßer F, Sadeghipour A, Philip A-M, Podkowinski D, Gerendas BS, Langs G, Schmidt-Erfurth U (2017) Fully automated detection and quantification of macular fluid in OCT using deep learning. Ophthalmology 125(4):549–558

    Google Scholar 

  122. Li F, Chen H, Liu Z, Zhang X, Wu Z (2019) Fully automated detection of retinal disorders by image-based deep learning. Graefe’s Arch Clin Exp Ophthalmol 257(3):495–505

    Google Scholar 

  123. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: A large-scale hierarchical image database. In: Proceedings of IEEE conference computer vision and pattern recognition (CVPR). pp 248–255

  124. Kermany DS, Goldbaum M, Cai W, Valentim CC, Liang H, Baxter SL, McKeown A, Yang G, Wu X, Yan F (2018) ‘Identifying medical diagnoses and treatable diseases by image-based deep learning.’ Cell 172(5):1122–1131

    Google Scholar 

  125. The U.S. Food and Drug Administration (2018) FDA permits marketing of artificial intelligence-based device to detect certain diabetes-related eye problems News Release

  126. Li Z, He Y, Keel S, Meng W, Chang RT, He M (2018) Efficacy of a deep learning system for detecting glaucomatous optic neuropathy based on color fundus photographs. Ophthalmology 125(8):1199–1206

    Google Scholar 

  127. Christopher M, Belghith A, Bowd C, Proudfoot JA, Goldbaum MH, Weinreb RN, Girkin CA, Liebmann JM, Zangwill LM (2018) Performance of deep learning architectures and transfer learning for detecting glaucomatous optic neuropathy in fundus photographs. Sci Rep 8(1):16685

    Google Scholar 

  128. Forouzanfar M, Forghani N, Teshnehlab M (2010) ‘Parameter optimization of improved fuzzy C-means clustering algorithm for brain MR image segmentation.’ Eng Appl Artif Intell 23(2):160–168

    Google Scholar 

  129. Akkus Z, Galimzianova A, Hoogi A, Rubin DL, Erickson BJ (2017) Deep learning for brain MRI segmentation: state of the art and future directions. J Digit Imaging 30(4):449–459

    Google Scholar 

  130. Moeskops P, Viergever MA, Mendrik AM, de Vries LS, Benders MJ, Isgum I (2016) Automatic segmentation of mr brain images with a convolutional neural network. IEEE Trans Med Imaging 35(5):1252–1261

    Google Scholar 

  131. Marcus DS, Wang TH, Parker J, Csernansky JG, Morris JC, Buckner RL (2007) Open access series of imaging studies (OASIS): Cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J Cogn Neurosci 19(9):1498–1507

    Google Scholar 

  132. Kleesiek J, Urban G, Hubert A, Schwarz D, Maier-Hein K, Bendszus M, Biller A (2016) Deep MRI brain extraction: a 3D convolutional neural network for skull stripping. Neuroimage 129:460–469

    Google Scholar 

  133. Nair T, Precup D, Arnold DL, Arbel T (2018) Exploring uncertainty measures in deep networks for multiple sclerosis lesion detection and segmentation. In: Proceedings of international conference medical image computing and computer-assisted intervention. Springer, Cham, Switzerland, pp 655–663

  134. Roy AG, Conjeti S, Navab N, Wachinger C (2018) Inherent brain segmentation quality control from fully convnet Monte Carlo sampling. arXiv:1804.07046

  135. Brosch T, Tang LY, Yoo Y, Li DK, Traboulsee A, Tam R (2016) Deep 3d convolutional encoder networks with shortcuts for multiscale feature integration applied to multiple sclerosis lesion segmentation. IEEE Trans Med Imaging 35(5):1229–1239

    Google Scholar 

  136. Lee J, Nishikawa RM (2018) Automated mammographic breast density estimation using a fully convolutional network. Med Phys 45(3):1178–1190

    Google Scholar 

  137. Zhang Y, Chung ACS (2018) Deep supervision with additional labels for retinal vessel segmentation task. arXiv:1806.02132

  138. Chartsias A, Joyce T, Papanastasiou G, Semple S, Williams M, Newby D, Dharmakumar R, Tsaftaris SA (2018) Factorised spatial representation learning: application in semi-supervised myocardial segmentation. arXiv:1803.07031

  139. Burlutskiy N, Gu F, Wilen LK, Backman M, Micke P (2018) A deep learning framework for automatic diagnosis in lung cancer. arXiv:1807.10466

  140. Klein A, Andersson J, Ardekani BA, Ashburner J, Avants B, Chiang M-C, Christensen GE, Collins DL, Gee J, Hellier P, Song JH, Jenkinson M, Lepage C, Rueckert D, Thompson P, Vercauteren T, Woods RP, Mann JJ, Parsey RV (2009) Evaluation of 14 nonlinear deformation algorithms applied to human brain MRI registration. Neuroimage 46(3):786–802

    Google Scholar 

  141. El-Gamal FE-ZA, Elmogy M, Atwan A (2016) Current trends in medical image registration and fusion. Egypt Inform J 17(1):99–124

    Google Scholar 

  142. Miao S, Wang ZJ, Liao R (2016) A CNN regression approach for realtime 2d/3d registration. IEEE Trans Med Imaging 35(5):1352–1363

    Google Scholar 

  143. Yang X, Yeo SY, Hong JM, Wong ST, Tang WT, Wu ZZ, Lee G, Chen S, Ding V, Pang B et al (2016) A deep learning approach for tumor tissue image classification. Biomed Eng

  144. Balakrishnan G, Zhao A, Sabuncu MR, Guttag J, Dalca AV (2018) An unsupervised learning model for deformable medical image registration. In: Proceedings of IEEE conference computer vision and pattern recognition. pp 9252–9260

  145. Pan L, Shi F, Zhu W, Nie B, Guan L, Chen X (2018) Detection and registration of vessels for longitudinal 3D retinal OCT images using SURF. Proc SPIE 10578:105782P

    Google Scholar 

  146. Bay H, Tuytelaars T, Van Gool L (1981) SURF: Speeded up robust features. In: Proceedings of european conference on computer vision. Springer, Berlin, Germany. pp 404–417. [211] Fischler MA, Bolles RC (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 24(6):381–395

  147. Fischler MA, Bolles RC (1981) ‘Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography.’ Commun ACM 24(6):381–395

    MathSciNet  Google Scholar 

  148. Mahapatra D, Sedai S, Garnavi R (2018) Elastic registration of medical images with GANs. arXiv:1805.02369

  149. Zheng J, Miao S, Wang ZJ, Liao R (2018) Pairwise domain adaptation module for CNN-based 2-D/3-D registration. Proc SPIE 5(2):021204

    Google Scholar 

  150. Badea, M., Felea, I., Florea, L.M., & Vertan, C. (2016). The use of deep learning in image segmentation, classification and detection. ArXiv, abs/1605.09612.

  151. Dhungel, N., Carneiro, G., & Bradley, A.P. (2015). Deep Learning and Structured Prediction for the Segmentation of Mass in Mammograms. MICCAI.

  152. Zhou X, Yamada K, Kojima T, Takayama R, Wang S, Zhou X, Hara T, Fujita H (2018) Performance evaluation of 2D and 3D deep learning approaches for automatic segmentation of multiple organs on CT images. In: Petrick N, Mori K (eds) Medical imaging 2018: computer-aided diagnosis, Proc. SPIE 10575: 105752C

  153. Roth HR, Shen C, Oda H, Oda M, Hayashi Y, Misawa K, Mori K (2018) Deep learning and its application to medical image segmentation. arXiv:1803.08691v1

  154. Moeskops P, Wolterink JM, van der Velden BHM, Gilhuijs KGA, Leiner T, Viergever MA, Išgum I (2017) Deep learning for multi-task medical image segmentation in multiple modalities. arXiv:1704.03379v1

  155. Doua Qi, Yua L, Chena H, Jina Y, Yanga X, Qinb J, Heng P-A (2017) 3D deeply supervised network for automated segmentation of volumetric medical images. Med Image Anal 41:40–54

  156. Wang G, Li W, Zuluaga MA, Pratt R, Patel PA, Aertsen M, Doel T, David AL, Deprest J, Ourselin S, Vercauteren T (2018) Interactive medical image segmentation using deep learning with image-specific fine tuning. IEEE Trans Med Imaging 37(7):1562–1573

  157. Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, Pal C, Jodoin P-M, Larochelle H (2017) Brain tumor segmentation with deep neural networks. Med Image Anal 35:18–31

  158. Ngo TA, Lu Z, Carneiro G (2017) Combining deep learning and level set for the automated segmentation of the left ventricle of the heart from cardiac cine magnetic resonance. Med Image Anal 35:158–171

  159. Chen H, Dou Q, Yu L, Qin J, Heng P-A (2017) VoxResNet: deep voxelwise residual networks for brain segmentation from 3D MR images. Neuroimage 170:446–455

  160. Milletari F, Ahmadi S-A, Kroll C, Plate A, Rozanski V, Maiostre J, Levin J, Dietrich O, Ertl-Wagner B, Bötzel K, Navab N (2016) Hough-CNN: deep learning for segmentation of deep brain regions in MRI and ultrasound. arXiv:1601.07014v3

  161. Xu Y, Wang Y, Yuan J, Cheng Q, Wang X, Carson PL (2019) Medical breast ultrasound image segmentation by machine learning. Ultrasonics 91:1–9

  162. Prince JL et al (2019) Parallel deep neural networks for endoscopic OCT image segmentation. Biomed Opt Express 10(3):1126

  163. Jia Z, Huang X, Chang EIC, Xu Y (2017) Constrained deep weak supervision for histopathology image segmentation. IEEE Trans Med Imaging 36(11):2376–2388

  164. Zhao Z, Yang L, Zheng H, Guldner IH, Zhang S, Chen DZ (2018) Deep learning based instance segmentation in 3D biomedical images using weak annotation. In: Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and Lecture notes in bioinformatics), vol 11073. pp 352–360

  165. Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition (CVPR 2001), vol 1. IEEE, pp I–I

  166. Viola P, Jones MJ (2004) Robust real-time face detection. Int J Comput Vision 57(2):137–154

  167. Vikram K, Padmavathi S (2017) Facial parts detection using Viola Jones algorithm. In: 2017 4th international conference on advanced computing and communication systems (ICACCS). IEEE, pp 1–4

  168. Zhu Q, Yeh MC, Cheng KT, Avidan S (2006) Fast human detection using a cascade of histograms of oriented gradients. In: 2006 IEEE computer society conference on computer vision and pattern recognition (CVPR'06), vol 2. IEEE, pp 1491–1498

  169. Overett G, Petersson L (2011) Large scale sign detection using HOG feature variants. In: 2011 IEEE intelligent vehicles symposium (IV). IEEE, pp 326–331

  170. Ren H, Li ZN (2014) Object detection using edge histogram of oriented gradient. In: 2014 IEEE international conference on image processing (ICIP). IEEE, pp 4057–4061

  171. Sudowe P, Leibe B (2011) Efficient use of geometric constraints for sliding-window object detection in video. In: International conference on computer vision systems. Springer, Berlin, pp 11–20

  172. Xiao-pei ZJYW, Zhao ZCL (2013) A moving object detection method based on sliding window Gaussian mixture model. J Electron Inf Technol 7

  173. Lampert CH, Blaschko MB, Hofmann T (2008) Beyond sliding windows: object localization by efficient subwindow search. In: 2008 IEEE conference on computer vision and pattern recognition. IEEE, pp 1–8

  174. Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R, LeCun Y (2013) OverFeat: integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229

  175. Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 779–788

  176. Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 7263–7271

  177. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767

  178. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) SSD: single shot multibox detector. In: European conference on computer vision. Springer, Cham, pp 21–37

  179. Lin TY, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 2117–2125

  180. Lin TY, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision. pp 2980–2988

  181. Law H, Deng J (2018) CornerNet: detecting objects as paired keypoints. In: Proceedings of the European conference on computer vision (ECCV). pp 734–750

  182. Zhou X, Wang D, Krähenbühl P (2019) Objects as points. arXiv preprint arXiv:1904.07850

  183. Duan K, Bai S, Xie L, Qi H, Huang Q, Tian Q (2019) CenterNet: object detection with keypoint triplets. arXiv preprint arXiv:1904.08189

  184. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 580–587

  185. Chen C, Liu MY, Tuzel O, Xiao J (2016) R-CNN for small object detection. In: Asian conference on computer vision. Springer, Cham, pp 214–230

  186. Uijlings JR, Van De Sande KE, Gevers T, Smeulders AW (2013) Selective search for object recognition. Int J Comput Vision 104(2):154–171

  187. Kleban J, Xie X, Ma WY (2008) Spatial pyramid mining for logo detection in natural scenes. In: 2008 IEEE international conference on multimedia and expo. IEEE, pp 1077–1080

  188. He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916

  189. Girshick R (2015) Fast R-CNN. In: Proceedings of the IEEE international conference on computer vision. pp 1440–1448

  190. Li J, Liang X, Shen S, Xu T, Feng J, Yan S (2017) Scale-aware fast R-CNN for pedestrian detection. IEEE Trans Multimedia 20(4):985–996

  191. Ren S, He K, Girshick R, Sun J (2016) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149

  192. Jiang H, Learned-Miller E (2017) Face detection with the faster R-CNN. In: 2017 12th IEEE international conference on automatic face & gesture recognition (FG 2017). IEEE, pp 650–657

  193. Dai J, Li Y, He K, Sun J (2016) R-FCN: object detection via region-based fully convolutional networks. arXiv preprint arXiv:1605.06409

  194. He K, Gkioxari G, Dollár P, Girshick R (2017) Mask R-CNN. In: Proceedings of the IEEE international conference on computer vision. pp 2961–2969

  195. Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft COCO: common objects in context. In: European conference on computer vision. Springer, Cham, pp 740–755

  196. Hu H, Gu J, Zhang Z, Dai J, Wei Y (2018) Relation networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 3588–3597

  197. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC (2015) ImageNet large scale visual recognition challenge. Int J Comput Vision 115(3):211–252

  198. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE, pp 248–255

  199. Kuznetsova A, Rom H, Alldrin N, Uijlings J, Krasin I, Pont-Tuset J, Kamali S, Popov S, Malloci M, Kolesnikov A, Duerig T (2018) The open images dataset V4: unified image classification, object detection, and visual relationship detection at scale. arXiv preprint arXiv:1811.00982

  200. Gupta A, Dollar P, Girshick R (2019) LVIS: A dataset for large vocabulary instance segmentation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp 5356–5364

  201. Yang S, Luo P, Loy CC, Tang X (2016) WIDER FACE: a face detection benchmark. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 5525–5533

  202. Jain V, Learned-Miller E (2010) FDDB: a benchmark for face detection in unconstrained settings. UMass Amherst technical report, vol 2, no 4, p 5

  203. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), vol 1. IEEE, pp 886–893

  204. Dollar P, Wojek C, Schiele B, Perona P (2011) Pedestrian detection: an evaluation of the state of the art. IEEE Trans Pattern Anal Mach Intell 34(4):743–761

  205. Ess A, Leibe B, Van Gool L (2007) Depth and appearance for mobile scene analysis. In: 2007 IEEE 11th international conference on computer vision. IEEE, pp 1–8

  206. Geiger A, Lenz P, Stiller C, Urtasun R (2013) Vision meets robotics: the KITTI dataset. Int J Robot Res 32(11):1231–1237

  207. Zhang S, Benenson R, Schiele B (2017) CityPersons: a diverse dataset for pedestrian detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 3213–3221

  208. Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The Cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 3213–3223

  209. He R, Wu X, Sun Z, Tan T (2018) Wasserstein CNN: Learning invariant features for NIR-VIS face recognition. IEEE Trans Pattern Anal Mach Intell 41(7):1761–1773

  210. Guo Y, Jiao L, Wang S, Wang S, Liu F (2017) Fuzzy sparse autoencoder framework for single image per person face recognition. IEEE Trans Cybern 48(8):2402–2415

  211. Cai Z, Saberian M, Vasconcelos N (2019) Learning complexity-aware cascades for pedestrian detection. IEEE Trans Pattern Anal Mach Intell 42(9):2195–2211

  212. Barz B, Rodner E, Garcia YG, Denzler J (2018) Detecting regions of maximal divergence for spatio-temporal anomaly detection. IEEE Trans Pattern Anal Mach Intell 41(5):1088–1101

  213. Shivakumara P, Tang D, Asadzadehkaljahi M, Lu T, Pal U, Anisi MH (2018) CNN-RNN based method for license plate recognition. CAAI Trans Intell Technol 3(3):169–175

  214. Li D, Zhao D, Chen Y, Zhang Q (2018) DeepSign: deep learning based traffic sign recognition. In: 2018 international joint conference on neural networks (IJCNN). IEEE, pp 1–6

  215. Yang Z, Li Q, Liu W, Lv J (2019) Shared multi-view data representation for multi-domain event detection. IEEE Trans Pattern Anal Mach Intell 42(5):1243–1256

  216. Teboul O, Kokkinos I, Simon L, Koutsourakis P, Paragios N (2011) Shape grammar parsing via reinforcement learning. In: CVPR 2011. IEEE, pp 2273–2280

  217. Friedman S, Stamos I (2013) Online detection of repeated structures in point clouds of urban scenes for compression and registration. Int J Comput Vis 102(1–3):112–128

  218. Ma J, Shao W, Ye H, Wang L, Wang H, Zheng Y, Xue X (2018) Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans Multimedia 20(11):3111–3122

Author information

Corresponding author

Correspondence to Yadwinder Singh.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kaur, A., Singh, Y., Neeru, N. et al. A Survey on Deep Learning Approaches to Medical Images and a Systematic Look up into Real-Time Object Detection. Arch Computat Methods Eng 29, 2071–2111 (2022). https://doi.org/10.1007/s11831-021-09649-9

  • DOI: https://doi.org/10.1007/s11831-021-09649-9