An effective cross-scenario remote heart rate estimation network based on global–local information and video transformer

  • Scientific Paper
  • Published in: Physical and Engineering Sciences in Medicine

Abstract

Remote photoplethysmography (rPPG) is a non-contact, non-invasive, and easy-to-use method for measuring physiological signals. It has broad application potential in medical health, human factors engineering, and other fields. However, current rPPG methods are highly susceptible to variations in lighting conditions, head pose changes, and partial occlusions, which poses significant challenges for widespread application. To improve the accuracy of remote heart rate estimation and enhance model generalization, we propose PulseFormer, a transformer-based dual-path network. By integrating local and global information through fast and slow paths, PulseFormer captures the temporal variations of key regions and the spatial variations of the global area, which facilitates the extraction of rPPG features while mitigating the impact of background noise variations. Heart rate estimation results on popular public rPPG datasets show that PulseFormer achieves state-of-the-art performance. Additionally, we establish a dataset containing facial expressions and synchronized physiological signals in driving scenarios, and we test the model pre-trained on public data on this collected dataset. The results indicate that PulseFormer generalizes well across different data distributions in cross-scenario settings. The model is therefore applicable to heart rate estimation of individuals in various scenarios.
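
The abstract does not state how a heart rate value is read out from a recovered pulse signal. As an illustration only, a common approach in the rPPG literature is to take the dominant spectral peak inside a plausible heart rate band. The minimal sketch below assumes that approach and is not the PulseFormer pipeline; the function name, the band limits (40 to 180 bpm), the search step, and the synthetic input are all hypothetical choices made for this example.

```python
import math


def estimate_heart_rate_bpm(signal, fs, lo_bpm=40.0, hi_bpm=180.0, step_bpm=0.5):
    """Return the candidate rate (beats per minute) with the most spectral power.

    signal: pulse-like samples (a hypothetical rPPG waveform)
    fs: sampling rate in Hz, e.g. the video frame rate
    The band limits and search step are illustrative assumptions.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset

    best_bpm, best_power = lo_bpm, -1.0
    steps = int(round((hi_bpm - lo_bpm) / step_bpm))
    for i in range(steps + 1):
        bpm = lo_bpm + i * step_bpm
        omega = 2.0 * math.pi * (bpm / 60.0) / fs  # radians per sample
        # Correlate the signal with a cosine and a sine at this candidate rate.
        re = sum(x * math.cos(omega * k) for k, x in enumerate(centered))
        im = sum(x * math.sin(omega * k) for k, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic 10 s clip at 30 frames per second with a 1.2 Hz (72 bpm) pulse.
fs = 30.0
synthetic = [math.sin(2.0 * math.pi * 1.2 * k / fs) for k in range(300)]
hr = estimate_heart_rate_bpm(synthetic, fs)  # expected to land close to 72 bpm
```

On real facial video the recovered signal is far noisier than this synthetic sine, so such a readout is usually preceded by band-pass filtering and applied over sliding windows.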

Data Availability

Data will be made available on request.

Funding

This work was supported by the National Key R&D Program of China [2022YFB4300300]; the National Natural Science Foundation of China [52075553]; the Hunan Science Foundation for Distinguished Young Scholars of China [2021JJ10059]; and the Hunan Provincial Science and Technology Innovation Leaders [2022RC3044].

Author information

Contributions

GX: Conceptualization, Data Collection, Data Analysis, Methodology, Writing. SY: Data Analysis, Manuscript Review. YP: Guidance, Data Collection, Funding. HD: Data preprocessing. XW: Analysis of data. KW: Experimental design. YL: Technical Support. FW: Manuscript Review. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yong Peng.

Ethics declarations

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Ethics approval

This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of the Third Xiangya Hospital of Central South University, China (approval number: 2022–326).

Informed consent

Informed consent was obtained from all individual participants included in the study.

Consent to participate

Informed consent was obtained from all individual participants included in the study.

Consent for publication

The authors affirm that human research participants provided informed consent for publication of the images in Fig. 5b and Fig. 6.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xiang, G., Yao, S., Peng, Y. et al. An effective cross-scenario remote heart rate estimation network based on global–local information and video transformer. Phys Eng Sci Med 47, 729–739 (2024). https://doi.org/10.1007/s13246-024-01401-4

  • DOI: https://doi.org/10.1007/s13246-024-01401-4