Abstract
Accurate cardiac function analysis (i.e., measurement of ventricular/stroke volume and ejection fraction) in 2D echocardiography is challenging because of the low resolution of echo sequences and the motion of cardiac structures. In an echo sequence, cardiac function analysis is a sequential process: identification of the end-diastole (ED) and end-systole (ES) frames (echo phase detection), followed by prediction of the left ventricular ejection fraction (LVEF). Describing cardiac function precisely requires proper attention to spatial and temporal information and to their interaction. Several deep learning techniques (i.e., convolutional neural networks, recurrent neural networks, and transformers) have recently been introduced but have largely ignored this spatial–temporal interaction. To address this issue, this study introduces EchoPhaseFormer, a transformer-based solution for echo phase detection (EPD) and LVEF prediction. A 3D convolutional stem extracts 3D patches from the echo sequence to retain temporal information. EchoPhaseFormer contains an echo phase former block, consisting of a conditional positional encoder and a phase self-attention module, that ensures the extraction of spatial and temporal information and their interaction. EchoPhaseFormer outperformed state-of-the-art architectures on both tasks on the EchoNet dataset. For EPD, we obtain an average absolute frame distance of 1.01 for ED frames and 1.04 for ES frames. For LVEF prediction, we obtain a mean absolute error of 4.77, a root mean square error of 6.14, and an R² score of 0.81.
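To make the described pipeline concrete, below is a minimal, illustrative PyTorch sketch rather than the authors' implementation: a 3D convolutional stem that embeds an echo clip into spatio-temporal tokens, a depthwise-convolution conditional positional encoding, and one pre-norm transformer block. The class names, embedding dimension, patch size, and the use of standard multi-head self-attention as a stand-in for the paper's phase self-attention module are all assumptions for illustration.

```python
# Illustrative sketch (not the authors' code): 3D conv stem + conditional
# positional encoding + one transformer block over spatio-temporal tokens.
import torch
import torch.nn as nn


class ConvStem3D(nn.Module):
    """Embed a video clip (B, C, T, H, W) into 3D patches via a 3D convolution."""
    def __init__(self, in_ch=1, dim=96, patch=(2, 8, 8)):
        super().__init__()
        self.proj = nn.Conv3d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x):
        return self.proj(x)                    # (B, dim, T', H', W')


class CondPosEncoding(nn.Module):
    """Conditional positional encoding: a residual depthwise 3D convolution."""
    def __init__(self, dim=96):
        super().__init__()
        self.dwconv = nn.Conv3d(dim, dim, kernel_size=3, padding=1, groups=dim)

    def forward(self, x):
        return x + self.dwconv(x)


class PhaseFormerBlock(nn.Module):
    """One encoder block: CPE -> self-attention -> MLP (pre-norm, residual).
    nn.MultiheadAttention is a generic stand-in for phase self-attention."""
    def __init__(self, dim=96, heads=4):
        super().__init__()
        self.cpe = CondPosEncoding(dim)
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x):                      # x: (B, dim, T', H', W')
        x = self.cpe(x)
        b, d, t, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)  # (B, T'*H'*W', dim)
        y = self.norm1(tokens)
        tokens = tokens + self.attn(y, y, y, need_weights=False)[0]
        tokens = tokens + self.mlp(self.norm2(tokens))
        return tokens.transpose(1, 2).reshape(b, d, t, h, w)


if __name__ == "__main__":
    clip = torch.randn(2, 1, 16, 112, 112)     # grayscale echo clip, 16 frames
    feats = PhaseFormerBlock()(ConvStem3D()(clip))
    print(feats.shape)                         # torch.Size([2, 96, 8, 14, 14])
```

In a full model, the resulting spatio-temporal features would feed task heads, for example a per-frame classifier for ED/ES detection and a regression head for LVEF; those heads are omitted here.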
Availability of Data and Materials
This manuscript's conclusions, data, and figures have not been published previously, nor are they under consideration by any other publisher.
Funding
This study received no funding.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicting or competing interests, as defined by Springer, that could be perceived as influencing the findings and/or discussion presented in this publication.
Ethical approval
This study complies with national and international norms and did not involve research on human participants or animals.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Singh, G., Darji, A.D., Sarvaiya, J.N. et al. EchoPhaseFormer: A Transformer Based Echo Phase Detection and Analysis in 2D Echocardiography. SN COMPUT. SCI. 5, 878 (2024). https://doi.org/10.1007/s42979-024-03249-7