Convolutional neural networks in medical image understanding: a survey

Sarvamangala, D. R.; Kulkarni, Raghavendra V.

doi:10.1007/s12065-020-00540-3

Convolutional neural networks in medical image understanding: a survey

Review Article
Published: 03 January 2021

Volume 15, pages 1–22, (2022)
Cite this article

Download PDF

Evolutionary Intelligence Aims and scope Submit manuscript

Convolutional neural networks in medical image understanding: a survey

Download PDF

46k Accesses
272 Citations
4 Altmetric
Explore all metrics

Abstract

Imaging techniques are used to capture anomalies of the human body. The captured images must be understood for diagnosis, prognosis and treatment planning of the anomalies. Medical image understanding is generally performed by skilled medical professionals. However, the scarce availability of human experts and the fatigue and rough estimate procedures involved with them limit the effectiveness of image understanding performed by skilled medical professionals. Convolutional neural networks (CNNs) are effective tools for image understanding. They have outperformed human experts in many image understanding tasks. This article aims to provide a comprehensive survey of applications of CNNs in medical image understanding. The underlying objective is to motivate medical image understanding researchers to extensively apply CNNs in their research and diagnosis. A brief introduction to CNNs has been presented. A discussion on CNN and its various award-winning frameworks have been presented. The major medical image understanding tasks, namely image classification, segmentation, localization and detection have been introduced. Applications of CNN in medical image understanding of the ailments of brain, breast, lung and other organs have been surveyed critically and comprehensively. A critical discussion on some of the challenges is also presented.

Survey of Supervised Learning for Medical Image Processing

Article 17 May 2022

Convolutional Neural Network in Medical Image Analysis: A Review

Article 01 March 2023

Medical Image Analysis using Convolutional Neural Networks: A Review

Article 08 October 2018

1 Introduction

Loss of human lives can be prevented or the medical trauma experienced in an injury or a disease can be reduced through the timely diagnosis of medical anomalies. Medical anomalies include glaucoma, diabetic retinopathy, tumors [34], interstitial lung diseases [44], heart diseases and tuberculosis. Diagnosis and prognosis involve the understanding of the images of the affected area obtained using X-ray, magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), single photon emission computed tomography or ultrasound scanning. Image understanding involves the detection of anomalies, ascertaining their locations and borders, and estimating their sizes and severity. The scarce availability of human experts and their fatigue, high consultation charges and rough estimate procedures limit the effectiveness of image understanding. Further, shapes, locations and structures of the medical anomalies are highly variable [55]. This makes diagnosis difficult even for specialized physicians [4]. Therefore, human experts often feel a need for support tools to aid in precise understanding of medical images. This is the motivation for intelligent image understanding systems.

Image understanding systems that exploit machine learning (ML) techniques are fast evolving in recent years. ML techniques include decision tree learning [35], clustering, support vector machines (SVMs) [47], k-means nearest neighbor (K-NN), restricted Boltzmann machines (RBMs) [42] and random forests (RFs) [28]. The pre-requisite for ML techniques to work efficiently is the extraction of discriminant features. And these features are generally unknown and is also a very challenging task especially for applications involving image understanding and is still a topic of research. A logical step to overcome was to create intelligent machines which could learn features needed for image understanding and extract it on its own. One such intelligent and successful model is the convolutional neural network (CNN) model, which automatically learns the needed features and extracts it for medical image understanding. The CNN model is made of convolutional filters whose primary function is to learn and extract necessary features for efficient medical image understanding. CNN started gaining popularity in the year 2012, due to AlexNet [41], a CNN model, which defeated all the others models with a record accuracy and low error rate in imageNet challenge 2012. CNN has been used by corporate giants for providing internet services, automatic tagging in images, product recommendations, home feed personalization and autonomous cars [59]. The major applications of the CNN are in image and signal processing, natural language processing and data analytics. The CNN had a major breakthrough when GoogleNet used it to detect cancer at an accuracy of 89% while human pathologists could achieve the accuracy of only 70% [3].

1.1 Motivation and purpose

CNNs have contributed significantly in the areas of image understanding. CNN-based approaches are placed in the leader board of the many image understanding challenges, such as Medical Image Computing and Computer Assisted Intervention (MICCAI) biomedical challenge, Brain Tumor segmentation (BRATS) Multimodal Brain Tumor Segmentation challenge [48], Imagenet classification challenge, challenges of International Conference on Pattern Recognition (ICPR) [31] and Ischemic Stroke Lesion Segmentation (ISLES) challenge [32]. CNN has become a powerful choice as a technique for medical image understanding. Researchers have successfully applied CNNs for many medical image understanding applications like detection of tumors and their classification into benign and malignant [52], detection of skin lesions [50], detection of optical coherence tomography images [39], detection of colon cancer [71], blood cancer, anomalies of the heart [40], breast [36], chest, eye etc. Also CNN-based models like CheXNet [56, 58], used for classifying 14 different ailments of the chest achieved better results compared to the average performance of human experts.

CNNs have also dominated the area of COVID-19 detection using chest X-rays/CT scans. Research involving CNNs is now a dominant topic at major conferences. In addition, there are special issues reserved in reputed journals for solving challenges using deep learning models. The vast amount of literature available on CNNs is the testimonial of their efficiency and the widespread use. However, various research communities are developing these applications concurrently and the dissemination results are scattered in a wide and diverse range of conference proceedings and journals.

A large number of surveys on deep learning have been published recently. A review of deep learning techniques applied in medical imaging, bioinformatics and pervasive sensing has been presented in [60]. A thorough review of deep learning techniques for segmentation of MRI images of brain has been presented in [2]. Survey of deep learning techniques for medical image segmentation, their achievements and challenges involved in medical image segmentation has been presented in [27] Though literature is replete with many survey papers, most of them concentrate on deep learning models which include CNN, recurrent neural network, generative adversial network or on a particular application. There is also no coverage of the application of CNN in early detection of COVID-19 as well as many other areas.

The survey includes research papers on various applications of CNNs in medical image understanding. The papers for the survey are queried from various journal websites. Additionally, arxiv, conference proceedings of various medical image challenges are also included in the survey. Also the references of these papers are checked. The query used are: “CNN” or “deep learning” or “convolutional neural network” or terms related to medical image understanding. These terms had to be present either in title or abstract to be considered.

The objective of this survey is to offer a comprehensive overview of applications and methodology of CNNs and its variants, in the fields of medical image understanding including the detection of latest global pandemic COVID-19. The survey includes overview tables which can be used for quick reference. The authors leverage experiences of their own and that of the research fraternity on the applications of CNNs to provide an insight into various state of the art CNN models, challenges involved in designing CNN model, overview of research trends in the field, and to motivate medical image understanding researchers and medical professionals to extensively apply CNNs in their research and diagnosis, respectively.

1.2 Contributions and the structure

Primary contributions of this article are as follows:

1.
To briefly introduce medical image understanding and CNN.
2.
To convey that CNN has percolated in the field of medical image understanding.
3.
To identify the various challenges in medical image understanding.
4.
To highlight contributions of CNN to overcome those challenges

The remainder of this article has been organized as follows: Medical image understanding has been briefly introduced in Sect. 2. A brief introduction of CNN and its architecture has been presented in Sect. 3. The applications of CNN in medical image understanding have been surveyed comprehensively through Sects. 4–7. Finally, concluding remarks and a projection of the trends in CNN applications in image understanding have been presented in Sect. 8.

2 Medical image understanding

Medical imaging is necessary for the visualization of internal organs for the detection of abnormalities in their anatomy or functioning. Medical image capturing devices, such as X-ray, CT, MRI, PET and ultrasound scanners capture the anatomy or functioning of the internal organs and present them as images or videos. The images and videos must be understood for the accurate detection of anomalies or the diagnosis of functional abnormalities. If an abnormality is detected, then its exact location, size and shape must be determined. These tasks are traditionally performed by the trained physicians based on their judgment and experience. Intelligent healthcare systems aim to perform these tasks using intelligent medical image understanding. Medical image classification, segmentation, detection and localization are the important tasks in medical image understanding.

2.1 Medical image classification

Medical image classification involves determining and assigning labels to medical images from a fixed set. The task involves the extraction of features from the image, and assigning labels using the extracted features. Let I denote an image made of pixels and $c_1, c_2, \ldots , c_r$ denote the labels. For each pixel x, a feature vector $\zeta $, consisting of values $f(x_i)$ is extracted from the neighborhood N(x) using (1), where $x_i \in N(x)$ for $i = 0, 1, \ldots , k$.

$$\begin{aligned} \zeta = (f(x_0), f(x_1), \ldots , f(x_k)) \end{aligned}$$

(1)

A label from the list of labels $c_1, c_2, \ldots , c_r$ is assigned to the image based on $\zeta $.

2.2 Medical image segmentation

Medical image segmentation helps in image understanding, feature extraction and recognition, and quantitative assessment of lesions or other abnormalities. It provides valuable information for the analysis of pathologies, and subsequently helps in diagnosis and treatment planning. The objective of segmentation is to divide an image into regions that have strong correlations. Segmentation involves dividing the image I into a finite set of regions $R_1, R_1, \ldots , R_S$ as expressed in (2).

$$\begin{aligned} I = \mathop{\cup}\limits ^{s} _{i = 1} R_i, \ R_i \cap R_j = \emptyset \ {\text {and}} \ i \ne j. \end{aligned}$$

(2)

2.3 Medical image localization

Automatic localization of pathology in images is quite an important step towards automatic acquisition planning and post imaging analysis tasks, such as segmentation and functional analysis. Localization involves predicting the object in an image, drawing a bounding box around the object and labeling the object.

The localization function f(I) on an image I computes $c, \ l_x, \ l_y, \ l_w, l_h $, which represent respectively, class label, centroid x and y coordinates, and the proportion of the bounding box with respect to width and height of the image as expressed in (3).

$$\begin{aligned} f(I) = (c, l_x, l_y, l_w, l_h). \end{aligned}$$

(3)

2.4 Medical image detection

Image detection aims at the classification and the localization of regions of interest by drawing bounding boxes around multiple regions of interest and labeling them. This helps in determining the exact locations of different organs and their orientation. Let I be an image with n objects or regions of interest. Then detection function D(I) computes $c_i, \ x_i, \ y_i, \ w_i, h_i$ and these are respectively the class label, centroid x and y coordinates, proportion of the bounding box with respect to width and height of the image I as given in the (4)

$$\begin{aligned} \mathop{\cup}\limits ^{n} _{i = 1} {c_i, x_i, y_i, w_i, h_i} = D(I). \end{aligned}$$

(4)

Table 1 Confusion matrix

Convolutional neural networks in medical image understanding: a survey

Abstract

Similar content being viewed by others

Survey of Supervised Learning for Medical Image Processing

Convolutional Neural Network in Medical Image Analysis: A Review

Medical Image Analysis using Convolutional Neural Networks: A Review

1 Introduction

1.1 Motivation and purpose

1.2 Contributions and the structure

2 Medical image understanding

2.1 Medical image classification

2.2 Medical image segmentation

2.3 Medical image localization

2.4 Medical image detection

2.5 Evaluation metrics for image understanding

3 A brief introduction to CNNs

3.1 Convolution layers (Conv layers)

3.2 Activation functions or nonlinear functions

3.2.1 Sigmoid

3.2.2 Tan hyperbolic

3.2.3 Rectified linear unit (ReLU)

3.3 Pooling

3.4 Fully connected (FC) layer

3.5 Data preprocessing and augmentation

3.6 CNN architectures and frameworks

4 CNN applications in medical image classification

4.1 Lung diseases

4.1.1 Ensemble CNN

4.1.2 Small-kernel CNN

4.1.3 Whole image CNN

4.1.4 Multicrop pooling CNN

4.2 Coronavirus disease 2019 (COVID-19)

4.2.1 Customized CNN

4.2.2 Bayesian CNN

4.2.3 PDCOVIDNET

4.2.4 CVR-Net

4.2.5 Twice transfer learning CNN

4.3 Immune response abnormalities

4.3.1 CUDA ConvNet CNN

4.3.2 Six-layer CNN

4.4 Breast tumors

4.4.1 Stopping monitoring CNN

4.4.2 Ensemble CNN

4.4.3 Semi-supervised CNN

4.5 Heart diseases

4.5.1 One-dimensional CNN

4.5.2 Fused CNN

4.6 Eye diseases

4.6.1 Gaussian initialized CNN

4.6.2 Hyper parameter tuning inception-v4

4.7 Colon cancer

4.7.1 Ensemble CNN

4.8 Brain disorders

4.8.1 Fused CNN

4.8.2 Input cascaded CNN

5 CNN applications in medical image segmentation

5.1 Brain tumors

5.1.1 Small kernel CNN

5.1.2 Fully blown CNN

5.1.3 Multipath CNN

5.1.4 Cascaded CNN

5.1.5 Multiscale CNN

5.1.6 Multipath and multiscale CNN

5.2 Breast cancer

5.2.1 FCNN

5.2.2 Probability map CNN

5.2.3 Patch CNN

5.3 Eye diseases

5.3.1 Greedy CNN

5.3.2 Multi label inference CNN

5.4 Lung

5.4.1 U net

6 CNN applications in medical image detection

6.1 Breast tumors

6.1.1 GoogLeNet CNN

6.2 Eye diseases

6.2.1 Dynamic CNN

6.2.2 Ensemble CNN

6.3 Cell division

6.3.1 LeNet CNN