Abstract
Auto-encoders and their variational counterparts form a family of (deep) neural networks that serve a wide range of applications in medical research and clinical practice. In this chapter we provide a comprehensive overview of how auto-encoders work and how they can be used to improve medical research. We elaborate on various topics such as dimension reduction, denoising auto-encoders, auto-encoders used for anomaly detection, and applications of the data representations that auto-encoders create. We then touch upon variational auto-encoders, explaining their design and training process. We end the chapter with small-scale examples of auto-encoders applied to the MNIST dataset and a recent example of a (disentangled) variational auto-encoder applied to ECG data.
Glossary
- Activation function: In neural networks, (non-linear) activation functions are applied at the output of neurons to convert the input to an 'active' or 'not active' state. An activation function can be a simple linear or sigmoid function, or take more complex forms. The Rectified Linear Unit (ReLU) is currently the most popular choice.
- Backpropagation: A widely used technique in the field of machine learning, applied during the training of a neural network. It updates the weights of the network based on the calculated loss, effectively allowing the network to 'learn'.
- (Mini-) batch: A small set of data samples that is fed through the network at once during training. A batch size that is too small may lead to unstable training, while one that is too large may exhaust computer resources.
- Convolution: A common building block of many neural networks. Convolutional neural networks can be considered the current state of the art for a variety of data sources. Convolutional layers apply a learned filter to the input data, which improves the ability of the network to capture spatial structure. Convolutions can be applied in one-dimensional (signal/time-series data) and two-dimensional (image) forms.
- Decoder: The part of the (variational) auto-encoder that decodes a given latent vector into a reconstruction of the original data.
- Dimension: The dimension of data is the size of the data sample or vector; for a grayscale image this is the height × width in pixels (e.g. 28 × 28), while for an RGB colour image a third dimension of size 3 is added (e.g. 28 × 28 × 3).
- Encoder: The part of the (variational) auto-encoder that encodes the provided data into the latent vector.
- Explainability: The ability of a (trained) observer to interpret the inner workings of a model. Neural networks are generally considered too complex for humans to comprehend and are treated as an 'unexplainable' black box. This lack of explainability is a major issue in many current clinical applications of neural networks.
- Fully connected or linear layer: A common building block of neural networks in which every node (or datapoint) in the input is connected to every node in the output of the layer. Through the weights associated with each connection, the layer is able to perform linear transformations of the input data. Together with non-linear activation functions, fully connected layers make up the most basic forms of neural networks.
- KL divergence: The Kullback-Leibler divergence is a measure of how much one probability distribution differs from another.
- Loss function: The loss function defines the training objective of the neural network. The loss, the output of the loss function, is progressively minimized through backpropagation, allowing the network to learn and be optimized for its training objective.
- MNIST: A commonly used dataset consisting of images of handwritten digits. MNIST is often used for small-scale experiments because of the simple nature of the data.
- PCA: Principal component analysis, a technique commonly used for dimension reduction. It involves projecting the data onto the orthogonal directions along which it varies most (the principal components).
- Posterior collapse: A phenomenon that can occur during the training of a variational auto-encoder, in which the reconstruction accuracy of the network decreases dramatically when the KL divergence is reduced too much.
- Vector: A single row or column of numbers.
- Matrix: A set consisting of multiple rows and columns of numbers.
- Convergence: A neural network has reached convergence when further training no longer improves the model.
- MSE loss: Mean squared error loss, a measure of the difference between two data instances such as images or time series. The MSE loss is a common loss function used to minimize the reconstruction error in auto-encoders.
- Latent variable: A variable that is not directly observed in the data but can be inferred, through a model, from other variables that are observed directly. In the case of auto-encoders, we refer to the entries of the vector produced by the encoder as latent variables.
- Disentanglement: The disentanglement of latent variables refers to the process of separating the influence of each latent variable on the reconstructed data.
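As a minimal illustration of the activation functions defined in the glossary (not code from the chapter itself), the ReLU and sigmoid functions can be written in plain Python:

```python
import math

def relu(x):
    # Rectified Linear Unit: passes positive inputs through, zeroes out negatives.
    return max(0.0, x)

def sigmoid(x):
    # Squashes any real-valued input into the (0, 1) range.
    return 1.0 / (1.0 + math.exp(-x))

print(relu(-2.0), relu(3.0))   # 0.0 3.0
print(sigmoid(0.0))            # 0.5
```

ReLU's popularity comes largely from its cheap computation and well-behaved gradient for positive inputs, compared to the saturating tails of the sigmoid.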
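The mini-batch concept from the glossary can be sketched as a simple generator that slices a dataset into fixed-size chunks; the function name and sizes here are illustrative, not from the chapter:

```python
def minibatches(data, batch_size):
    # Yield successive slices of `data` of length `batch_size`;
    # the final batch may be smaller if the data does not divide evenly.
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

samples = list(range(10))
batches = list(minibatches(samples, 4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```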
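As a small, hedged sketch of the KL divergence from the glossary: for discrete distributions it is a direct sum, and for a diagonal Gaussian against the standard normal (the regularisation term used when training variational auto-encoders) it has a well-known closed form. The example distributions below are arbitrary:

```python
import math

def kl_divergence(p, q):
    # KL(P || Q) = sum_i p_i * log(p_i / q_i) for discrete distributions.
    # Asymmetric in general, and zero only when P equals Q.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def kl_gaussian_standard(mu, sigma):
    # Closed-form KL between N(mu, sigma^2) and the standard normal N(0, 1).
    return 0.5 * (mu ** 2 + sigma ** 2 - math.log(sigma ** 2) - 1.0)

p = [0.5, 0.5]
print(kl_divergence(p, p))                      # 0.0
print(round(kl_divergence(p, [0.9, 0.1]), 3))   # 0.511
print(kl_gaussian_standard(0.0, 1.0))           # 0.0
```

The last line shows why the KL term pushes the encoder of a variational auto-encoder towards the standard normal: the penalty vanishes exactly when mu = 0 and sigma = 1.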
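The MSE loss defined in the glossary can be written out directly; here a short illustrative sequence stands in for a flattened image and its reconstruction:

```python
def mse(original, reconstruction):
    # Mean of the squared element-wise differences between two
    # equal-length sequences (e.g. an input and its reconstruction).
    return sum((a - b) ** 2 for a, b in zip(original, reconstruction)) / len(original)

x = [1.0, 2.0, 3.0]
x_hat = [1.0, 2.5, 2.0]
print(mse(x, x_hat))  # (0.0 + 0.25 + 1.0) / 3 ≈ 0.417
```

Minimizing this quantity over a training set is what drives an auto-encoder's reconstructions towards its inputs.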
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Vessies, M., van de Leur, R., Wouters, P., van Es, R. (2023). Deep Learning—Autoencoders. In: Asselbergs, F.W., Denaxas, S., Oberski, D.L., Moore, J.H. (eds) Clinical Applications of Artificial Intelligence in Real-World Data. Springer, Cham. https://doi.org/10.1007/978-3-031-36678-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36677-2
Online ISBN: 978-3-031-36678-9