Abstract
Auto-encoders and their variational counterparts form a family of (deep) neural networks that serve a wide range of applications in medical research and clinical practice. In this chapter we provide a comprehensive overview of how auto-encoders work and how they can be used to improve medical research. We elaborate on various topics such as dimension reduction, denoising auto-encoders, auto-encoders used for anomaly detection, and applications of the data representations that auto-encoders create. We then touch upon variational auto-encoders, explaining their design and training process. We end the chapter with small-scale examples of auto-encoders applied to the MNIST dataset and a recent example of a (disentangled) variational auto-encoder applied to ECG data.
Glossary
- Activation function: In neural networks, (non-linear) activation functions are applied at the output of neurons to convert the input to an 'active' or 'not active' state. An activation function can be a simple linear or sigmoid function, or take more complex forms. The Rectified Linear Unit (ReLU) is currently the most popular choice.
- Backpropagation: A widely used technique in the field of machine learning, applied during the training of a neural network. It updates the weights of the network based on the calculated loss, effectively allowing the network to 'learn'.
- (Mini-) batch: A small set of data samples that is fed through the network at once during training. A batch size that is too small may lead to unstable training, while one that is too large may exhaust computer resources.
- Convolution: A common building block of many neural networks. Convolutional neural networks can be considered the current state of the art for a variety of data sources. Convolutional layers apply a learned filter to the input data, which improves the ability of the network to capture spatial structure. Convolutions can be applied in one-dimensional (signal/time-series data) and two-dimensional (image) forms.
- Decoder: The part of the (variational) auto-encoder that decodes a given latent vector into a reconstruction of the original data.
- Dimension: The dimension of data is the size of the data sample or vector; for a grayscale image this is the height × width in pixels (e.g. 28 × 28), while for an RGB colour image a third dimension of size 3 is added (e.g. 28 × 28 × 3).
- Encoder: The part of the (variational) auto-encoder that encodes the provided data into the latent vector.
- Explainability: The ability of a (trained) observer to interpret the inner workings of a model. Neural networks are generally considered too complex for humans to comprehend and are treated as an 'unexplainable' black box. This lack of explainability is a major issue in many current clinical applications of neural networks.
- Fully connected or linear layer: A common building block of neural networks in which every node (or datapoint) in the input is connected to every node in the output of the layer. Through the weights associated with each connection, the layer is able to perform linear transformations of the input data. Together with non-linear activation functions, fully connected layers make up the most basic forms of neural networks.
- KL divergence: The Kullback-Leibler divergence is a measure of how much one probability distribution differs from another.
- Loss function: The loss function defines the training objective of the neural network. The loss, the output of the loss function, is progressively minimized through backpropagation, allowing the network to learn and be optimized for its training objective.
- MNIST: A commonly used dataset consisting of images of handwritten digits. MNIST is often used for small-scale experiments because of the simple nature of the data.
- PCA: Principal component analysis, a technique commonly used for dimension reduction. It involves projecting the data onto the orthogonal directions along which it varies most (the principal components).
- Posterior collapse: A phenomenon that can occur during the training of a variational auto-encoder, in which the reconstruction accuracy of the network decreases dramatically when the KL divergence is reduced too much.
- Vector: A single row or column of numbers.
- Matrix: A set consisting of multiple rows and columns of numbers.
- Convergence: A neural network has reached convergence when further training no longer improves the model.
- MSE loss: Mean squared error loss, a measure of the difference between two data instances such as images or time series. The MSE loss is a common loss function used to minimize the reconstruction error in auto-encoders.
- Latent variable: A variable that is not directly observed in the data but can be inferred, through a model, from other variables that are observed directly. In the case of auto-encoders, we refer to the entries of the vector produced by the encoder as latent variables.
- Disentanglement: The disentanglement of latent variables refers to the process of separating the influence of each latent variable on the reconstructed data.
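As a minimal illustration of the activation functions defined in the glossary (not code from the chapter itself), the ReLU and sigmoid functions can be written in plain Python:

```python
import math

def relu(x):
    # Rectified Linear Unit: passes positive inputs through, zeroes out negatives.
    return max(0.0, x)

def sigmoid(x):
    # Squashes any real-valued input into the (0, 1) range.
    return 1.0 / (1.0 + math.exp(-x))

print(relu(-2.0), relu(3.0))   # 0.0 3.0
print(sigmoid(0.0))            # 0.5
```

ReLU's popularity comes largely from its cheap computation and well-behaved gradient for positive inputs, compared to the saturating tails of the sigmoid.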
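The mini-batch concept from the glossary can be sketched as a simple generator that slices a dataset into fixed-size chunks; the function name and sizes here are illustrative, not from the chapter:

```python
def minibatches(data, batch_size):
    # Yield successive slices of `data` of length `batch_size`;
    # the final batch may be smaller if the data does not divide evenly.
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

samples = list(range(10))
batches = list(minibatches(samples, 4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```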
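As a small, hedged sketch of the KL divergence from the glossary: for discrete distributions it is a direct sum, and for a diagonal Gaussian against the standard normal (the regularisation term used when training variational auto-encoders) it has a well-known closed form. The example distributions below are arbitrary:

```python
import math

def kl_divergence(p, q):
    # KL(P || Q) = sum_i p_i * log(p_i / q_i) for discrete distributions.
    # Asymmetric in general, and zero only when P equals Q.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def kl_gaussian_standard(mu, sigma):
    # Closed-form KL between N(mu, sigma^2) and the standard normal N(0, 1).
    return 0.5 * (mu ** 2 + sigma ** 2 - math.log(sigma ** 2) - 1.0)

p = [0.5, 0.5]
print(kl_divergence(p, p))                      # 0.0
print(round(kl_divergence(p, [0.9, 0.1]), 3))   # 0.511
print(kl_gaussian_standard(0.0, 1.0))           # 0.0
```

The last line shows why the KL term pushes the encoder of a variational auto-encoder towards the standard normal: the penalty vanishes exactly when mu = 0 and sigma = 1.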
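The MSE loss defined in the glossary can be written out directly; here a short illustrative sequence stands in for a flattened image and its reconstruction:

```python
def mse(original, reconstruction):
    # Mean of the squared element-wise differences between two
    # equal-length sequences (e.g. an input and its reconstruction).
    return sum((a - b) ** 2 for a, b in zip(original, reconstruction)) / len(original)

x = [1.0, 2.0, 3.0]
x_hat = [1.0, 2.5, 2.0]
print(mse(x, x_hat))  # (0.0 + 0.25 + 1.0) / 3 ≈ 0.417
```

Minimizing this quantity over a training set is what drives an auto-encoder's reconstructions towards its inputs.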
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Vessies, M., van de Leur, R., Wouters, P., van Es, R. (2023). Deep Learning—Autoencoders. In: Asselbergs, F.W., Denaxas, S., Oberski, D.L., Moore, J.H. (eds) Clinical Applications of Artificial Intelligence in Real-World Data. Springer, Cham. https://doi.org/10.1007/978-3-031-36678-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36677-2
Online ISBN: 978-3-031-36678-9