The relevance and penetration of machine learning in clinical practice are recent phenomena, with multiple applications currently under development. Deep learning, and especially convolutional neural networks (CNNs), is a subset of machine learning that has recently entered the field of thoracic imaging. The structure of neural networks, organized in multiple layers, allows them to address complex tasks. For several clinical situations, CNNs have demonstrated superior performance compared with classical machine learning algorithms and, in some cases, have achieved performance comparable to or better than that of clinical experts. Chest radiography, a high-volume procedure, is a natural application domain because the large number of stored images and reports facilitates the training of deep learning algorithms. Several algorithms for automated reporting have been developed. Training deep learning algorithms on CT images is more complex because of the dimensionality, variability, and complexity of the 3D signal. The role of these methods in clinical practice is likely to increase, as a complement to the radiologist’s expertise. The objective of this review is to provide definitions for understanding the methods and their potential applications in thoracic imaging.
• Deep learning outperforms other machine learning techniques for a number of tasks in radiology.
• The convolutional neural network is the most popular deep learning architecture in medical imaging.
• Numerous deep learning algorithms are currently being developed; some of them may become part of clinical routine in the near future.
CNN: Convolutional neural networks
COPD: Chronic obstructive pulmonary disease
EGFR: Epidermal growth factor receptor
GAN: Generative adversarial neural networks
GPU: Graphics processing unit
NIH: National Institutes of Health
PACS: Picture archiving and communication systems
RNN: Recurrent neural networks
SVM: Support vector machine
The authors state that this work has not received any funding.
The scientific guarantor of this publication is Pr. MP Revel.
Conflict of interest
Pr. N Paragios is an employee of TheraPanacea (Paris, France).
Statistics and biometry
No complex statistical methods were necessary for this paper.
Written informed consent was not required for this study because this is a review.
Institutional Review Board approval was not required because this is a review.
- Algorithm
a mathematical model that is applied to input data and provides an output in order to solve a specific problem.
- Artificial intelligence
a scientific domain that creates algorithms allowing machines to mimic human cognition or human performance on a dataset. Currently, AI algorithms address specific tasks such as tuberculosis diagnosis or pneumonia detection, which is referred to as narrow or weak AI. They do not detect all potential abnormalities as a human reader would (i.e., general AI).
- CAD (computer-aided diagnosis)
a domain that exploits algorithms derived from artificial intelligence to provide indicators and assist clinical experts in diagnosis.
- CADe (computer-aided detection)
CAD developed for a detection task.
- CADx (computer-aided diagnosis)
CAD developed for a characterization task.
- Classification task
assignment of a label from a predefined set of categories (disease or no disease) to an input signal (an image), by means of a machine learning algorithm.
- Cross validation
a statistical method used to estimate the performance of the machine learning algorithm by exploiting various partitions of the data between training and testing.
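As an illustration of the partitioning idea, the following minimal sketch splits case indices into k folds so that each case is used for testing exactly once. The function name `k_fold_indices` and the choice of contiguous, unshuffled folds are assumptions made for this example; they are not taken from the article.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Hypothetical helper for illustration only.
    """
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Ten cases, five folds: each fold holds out two cases for testing.
folds = list(k_fold_indices(10, 5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

In practice the cases would usually be shuffled (and often stratified by label) before being divided into folds; that step is omitted here to keep the sketch short.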
- Convolution
a mathematical operator that creates a new value from an input signal (for instance, a group of voxels) by combining it with another set of values that acts as a filter. For example, averaging the density values within a patch of voxels.
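The averaging example can be sketched in a few lines of plain Python. This is a minimal illustration that assumes a 2D image stored as a list of rows and a 3 × 3 mean filter; the function name `convolve2d` and the toy values are hypothetical.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and return the filtered values.

    No padding is used, so the output is smaller than the input.
    The kernel is not flipped (as is usual in CNN layers); for a symmetric
    kernel such as the mean filter below this makes no difference.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# 3 x 3 averaging filter: every weight is 1/9.
mean_kernel = [[1 / 9] * 3 for _ in range(3)]

# Toy 4 x 4 patch with a bright 2 x 2 centre (arbitrary values).
patch = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]

smoothed = convolve2d(patch, mean_kernel)
print(smoothed)
```

In a CNN the filter weights are not fixed in advance as they are here; they are learned during training.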
- Convolutional neural network (CNN)
deep neural network which is based on a sequence of convolutional operations.
- Deep learning (= deep neural network)
part of the broader family of machine learning, characterized by a specific configuration of neural networks organized in multiple layers, emulating the human learning approach and increasing the ability to address complex problems. Deep learning networks are iterative methods that propagate information, learning their features automatically through gradient-based optimization methods and backpropagation.
- Epoch
one pass through the entire training dataset; the number of epochs indicates how many times the entire dataset has been used during the iterative optimization of the network.
- Features
image characteristics, which may be invisible to the human eye. Three categories of features are used by classical machine learning algorithms: morphological features such as shape, volume, and diameter; first-order features such as histogram, kurtosis, and mean values; and textural features including co-occurrence of patterns and filter responses.
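To make the first-order category more concrete, the sketch below computes a few such statistics from a flat list of voxel intensities. The function name `first_order_features` and the sample values are assumptions for illustration; radiomics software typically computes many more features and may use slightly different definitions.

```python
def first_order_features(values):
    """Return mean, variance, and kurtosis of a list of intensities.

    Kurtosis is the fourth central moment divided by the squared variance
    (population form, without subtracting 3).
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    fourth_moment = sum((v - mean) ** 4 for v in values) / n
    kurtosis = fourth_moment / variance ** 2 if variance > 0 else float("nan")
    return {"mean": mean, "variance": variance, "kurtosis": kurtosis}


# Arbitrary intensities standing in for the voxels of a region of interest.
features = first_order_features([1, 2, 3, 4, 5])
print(features)
```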
- Formal neuron (= artificial neuron)
mathematical function mimicking the architecture of biological neurons.
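A common form of formal neuron computes a weighted sum of its inputs, adds a bias, and passes the result through an activation function. The sketch below assumes a sigmoid activation; the function name and the example weights are hypothetical.

```python
import math


def formal_neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, followed by a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Two inputs with arbitrary weights; the output always lies between 0 and 1.
activation = formal_neuron([0.5, -1.0], [2.0, 1.0], bias=0.0)
print(activation)
```

A neural network chains many such units, with the outputs of one layer serving as the inputs of the next.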
- Fully connected CNNs
variation of CNNs which consists of connecting all the elements of one layer with all the elements of the next one. Fully connected CNNs are used for classification problems (does this chest radiograph contain signs of tuberculosis?).
- Fully convolutional CNNs
variation of CNNs composed only of convolutional layers. Fully convolutional CNNs are used for segmentation tasks (is this pixel located in a fibrotic area?).
- Generalization capability
capacity for a model to maintain its performance when applied to new cases, unseen during training.
- Generative adversarial neural network (GAN)
a neural network that combines two subnetworks, one generating hypotheses and another evaluating their likelihood.
- Ground truth
refers to the label assigned by the expert or another reference method such as pathology.
- Hyperparameters
parameters which control the training process of the algorithm and are defined before training, such as the number of layers and learning rate, among others.
- Labeling
process of allocating ground truth by associating a label to an image.
- Loss function
function that quantifies the gap between predictions and ground truth when training and optimizing the algorithm.
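As one possible example, binary cross-entropy is a loss frequently used for two-class problems (disease or no disease). The sketch below is illustrative only; the function name, the clipping constant `eps`, and the sample predictions are assumptions.

```python
import math


def binary_cross_entropy(predictions, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(predictions, labels):
        # Clip the probability to avoid taking log(0).
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


labels = [1, 0, 1, 0]
good_loss = binary_cross_entropy([0.9, 0.1, 0.8, 0.2], labels)
bad_loss = binary_cross_entropy([0.2, 0.8, 0.3, 0.9], labels)
print(good_loss, bad_loss)
```

Predictions that agree with the ground truth give a smaller loss than predictions that contradict it, which is the signal the optimizer uses to update the network.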
- Machine learning
a scientific field that gives computers the ability to automatically learn without being explicitly programmed, by relying on sample data, known as “training data,” used to make predictions.
- Neural network
machine learning algorithm made of a succession of formal neurons.
- Overfitting
characterizes algorithms that perform well on the data on which they have been trained but fail to perform equally well on unseen data.
- Radiomics
a field of medical imaging that aims to extract features from medical images, for tasks such as characterization or prediction (prognosis, response to treatment, genotype).
- Recurrent neural networks (RNN)
a class of neural networks that integrate interdependencies between different tasks using the same data (detection and characterization) or between different data (temporal post-contrast enhancement).
- Regression task
process of associating input data with a continuous outcome (for instance survival).
- Semi-supervised learning
class of machine learning techniques that learn from annotated data in order to generate their model and improve its performance using non-annotated data.
- Supervised learning
class of machine learning techniques requiring labeled training data in order to generate their model.
- Semantic segmentation
process of associating every voxel with a specific label/class, for instance diseased or healthy area, which usually requires manual contouring.
- Stochastic gradient descent
an iterative method to optimize machine learning methods, very commonly used for deep learning networks.
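The principle can be shown on a deliberately simple problem: fitting a straight line with one parameter update per training sample. This is a sketch under stated assumptions (a squared-error loss, a fixed learning rate, noise-free toy data); the function name `sgd_linear_fit` and all numeric values are hypothetical, and deep learning frameworks apply the same idea to millions of parameters.

```python
import random


def sgd_linear_fit(xs, ys, learning_rate=0.05, epochs=200, seed=0):
    """Fit y = w * x + b by stochastic gradient descent on the squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        # One epoch: every sample is visited once, in random order.
        rng.shuffle(order)
        for i in order:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= learning_rate * error * xs[i]
            b -= learning_rate * error
    return w, b


# Toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(w, b)
```

With these settings the estimates approach the slope 2 and intercept 1 used to generate the data; the learning rate and the number of epochs are examples of hyperparameters chosen before training.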
- Test dataset
dataset which is used to evaluate the performance of the final model.
- Training dataset
dataset which is used to train the model.
- Transfer learning
concept of exporting parameters, principles, and strategies learned from a dataset to another algorithm, which will be trained on another dataset (for example, learning on nonmedical images before applying to chest imaging).
- Unsupervised learning
class of machine learning techniques that seek to determine patterns or clusters with similar properties (phenotypes, for instance) from unlabeled data. It usually uses techniques different from deep learning.
- Underfitting
inability of an algorithm to perform well on either the training or the test dataset.
- Validation dataset
dataset which is used to determine, among different variants of the trained model, the optimal model that should be selected for testing on the remaining unseen cases (test dataset).
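The relationship between the training, validation, and test datasets defined in this glossary can be sketched as a single random split of case identifiers. The 70/15/15 proportions, the fixed seed, and the function name `split_dataset` are assumptions chosen for illustration, not recommendations from the article.

```python
import random


def split_dataset(case_ids, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle case identifiers and split them into three disjoint subsets."""
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is kept unseen
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))
```

When several images come from the same patient, the split is usually made at the patient level so that no patient contributes to more than one subset.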
About this article
Cite this article
Chassagnon, G., Vakalopolou, M., Paragios, N. et al. Deep learning: definition and perspectives for thoracic imaging. Eur Radiol 30, 2021–2030 (2020). https://doi.org/10.1007/s00330-019-06564-3
- Machine learning
- Deep learning