Deep Learning Approaches for End-to-End Modeling of Medical Spatiotemporal Data

Harris, Jacqueline K.; Greiner, Russell

doi:10.1007/978-3-031-46341-9_5

Part of the book series: Studies in Computational Intelligence (SCI, volume 1124)

Abstract

For many medical applications, a single, stationary image may not be sufficient for detecting subtle pathology. Advances in fields such as computer vision have produced robust deep learning (DL) techniques that can learn complex interactions between space and time for prediction. This chapter presents an overview of medical applications of spatiotemporal DL for prognostic and diagnostic prediction tasks, and shows how they build on important advances in DL from other domains. Although many current approaches draw heavily on previous work in other fields, adapting them to the medical domain brings unique challenges; the chapter discusses these challenges along with the techniques being used to address them. The use of spatiotemporal DL in medicine is still relatively new and lags behind the progress seen with still images, but it provides unique opportunities to incorporate information about functional dynamics into prediction, which could be vital in many clinical settings. Current applications have demonstrated the potential of these models, and recent advances leave the field poised to produce state-of-the-art models for many medical tasks.

Author information

Authors and Affiliations

Department of Computing Science, Alberta Machine Intelligence Institute, University of Alberta, Edmonton, AB, Canada
Jacqueline K. Harris & Russell Greiner

Corresponding author

Correspondence to Jacqueline K. Harris.

Editor information

Editors and Affiliations

College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
Hazrat Ali
Department of Computer Science, Munster Technological University, Bishopstown, Cork, Ireland
Mubashir Husain Rehmani
College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
Zubair Shah

About this chapter

Cite this chapter

Harris, J.K., Greiner, R. (2023). Deep Learning Approaches for End-to-End Modeling of Medical Spatiotemporal Data. In: Ali, H., Rehmani, M.H., Shah, Z. (eds) Advances in Deep Generative Models for Medical Artificial Intelligence. Studies in Computational Intelligence, vol 1124. Springer, Cham. https://doi.org/10.1007/978-3-031-46341-9_5

DOI: https://doi.org/10.1007/978-3-031-46341-9_5
Published: 17 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46340-2
Online ISBN: 978-3-031-46341-9
eBook Packages: Intelligent Technologies and Robotics (R0)
