Abstract
Remote photoplethysmography (rPPG) is a non-contact, non-invasive, and easy-to-use method for measuring physiological signals, with broad application potential in medical health, human factors engineering, and other fields. However, current rPPG methods are highly susceptible to changes in lighting, head pose, and partial occlusion, which hinders their widespread application. To improve the accuracy of remote heart rate estimation and enhance model generalization, we propose PulseFormer, a transformer-based dual-path network. By integrating local and global information through fast and slow paths, PulseFormer captures the temporal variations of key regions and the spatial variations of the global area, which facilitates the extraction of rPPG features while mitigating the impact of background noise variations. Heart rate estimation results show that PulseFormer achieves state-of-the-art performance on public rPPG datasets. Additionally, we establish a dataset containing facial expressions and synchronized physiological signals in driving scenarios, and we test the model pre-trained on public data on this collected dataset. The results indicate that PulseFormer generalizes well across different data distributions in cross-scenario settings. The model is therefore applicable to heart rate estimation of individuals in various scenarios.
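For readers unfamiliar with the final step of an rPPG pipeline, the sketch below illustrates one common way a heart rate can be read from an already-recovered pulse waveform: scan a plausible band of heart rates and pick the one with the strongest spectral power. It is not PulseFormer and is not taken from the paper; the function name `estimate_heart_rate_bpm`, the 42 to 180 bpm search band, and the synthetic test signal are all assumptions made for illustration.

```python
import math

def estimate_heart_rate_bpm(signal, fps, lo_bpm=42, hi_bpm=180):
    """Return the heart rate (bpm) with the strongest spectral power.

    Hypothetical helper: scans integer bpm values in an assumed
    physiological band and evaluates the signal's power at each one.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    best_bpm, best_power = lo_bpm, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):
        omega = 2.0 * math.pi * (bpm / 60.0) / fps  # radians per frame
        re = sum(x * math.cos(omega * t) for t, x in enumerate(centered))
        im = sum(x * math.sin(omega * t) for t, x in enumerate(centered))
        power = re * re + im * im  # squared magnitude at this frequency
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm

# Synthetic check (assumed data): a clean 1.2 Hz pulse at 30 frames per second for 10 s.
fps = 30
pulse = [math.sin(2.0 * math.pi * 1.2 * t / fps) for t in range(10 * fps)]
print(estimate_heart_rate_bpm(pulse, fps))
```

In practice the input would be the pulse signal produced by a model rather than a synthetic sine, and real signals usually need band-pass filtering and longer windows before the spectral peak is reliable.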
Data Availability
Data will be made available on request.
Funding
This work was supported by the National Key R&D Program of China [2022YFB4300300]; the National Natural Science Foundation of China [52075553]; the Hunan Science Foundation for Distinguished Young Scholars of China [2021JJ10059]; and the Hunan Provincial Science and Technology Innovation Leaders [2022RC3044].
Author information
Contributions
GX: Conceptualization, Data Collection, Data Analysis, Methodology, Writing. SY: Data Analysis, Manuscript Review. YP: Guidance, Data Collection, Funding. HD: Data preprocessing. XW: Analysis of data. KW: Experimental design. YL: Technical Support. FW: Manuscript Review. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors have no relevant financial or non-financial interests to disclose.
Ethics approval
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of the Third Xiangya Hospital of Central South University in China (approval number: 2022–326).
Informed consent
Informed consent was obtained from all individual participants included in the study.
Consent for publication
The authors affirm that human research participants provided informed consent for publication of the images in Fig. 5b and Fig. 6.
About this article
Cite this article
Xiang, G., Yao, S., Peng, Y. et al. An effective cross-scenario remote heart rate estimation network based on global–local information and video transformer. Phys Eng Sci Med 47, 729–739 (2024). https://doi.org/10.1007/s13246-024-01401-4
DOI: https://doi.org/10.1007/s13246-024-01401-4