
Deep Learning Approaches for End-to-End Modeling of Medical Spatiotemporal Data

Advances in Deep Generative Models for Medical Artificial Intelligence

Part of the book series: Studies in Computational Intelligence (SCI, volume 1124)


Abstract

For many medical applications, a single static image may not be sufficient for detecting subtle pathology. Advances in fields such as computer vision have produced robust deep learning (DL) techniques that can learn complex interactions between space and time for prediction. This chapter presents an overview of medical applications of spatiotemporal DL for prognostic and diagnostic tasks, and of how they build on important advances in DL from other domains. Many current approaches draw heavily on prior work in other fields, but adapting them to the medical domain brings unique challenges; the chapter discusses these challenges along with the techniques being used to address them. The use of spatiotemporal DL in medicine is still relatively new and lags behind the progress made with still images, yet it offers unique opportunities to incorporate information about functional dynamics into prediction, which could be vital for many of these applications. Existing applications have demonstrated the potential of these models, and recent advances leave the field poised to produce state-of-the-art models for many medical tasks.



Author information


Corresponding author

Correspondence to Jacqueline K. Harris.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Harris, J.K., Greiner, R. (2023). Deep Learning Approaches for End-to-End Modeling of Medical Spatiotemporal Data. In: Ali, H., Rehmani, M.H., Shah, Z. (eds) Advances in Deep Generative Models for Medical Artificial Intelligence. Studies in Computational Intelligence, vol 1124. Springer, Cham. https://doi.org/10.1007/978-3-031-46341-9_5
