PERSIST: Improving micro-expression spotting using better feature encodings and multi-scale Gaussian TCN

Gupta, Puneet

doi:10.1007/s10489-022-03553-w

Published: 05 May 2022

Volume 53, pages 2235–2249 (2023)

Applied Intelligence

Puneet Gupta (ORCID: orcid.org/0000-0003-3586-9315)

Abstract

Micro-expressions (MEs) are needed in real-world applications for understanding true human feelings. The preliminary step of ME analysis, ME spotting, is highly challenging for human experts because MEs induce subtle facial movements of short duration. Moreover, existing feature encodings are insufficient for spotting because they are affected by illumination and eye-blinking. Our proposed method, PERSIST (imProved fEatuRe encodingS and multIscale gauSsian Temporal convolutional network), alleviates these issues for better ME spotting. It investigates whether human gaze deformations can be used for spotting. In contrast to well-known sequence models such as RNN and LSTM, it explores the feasibility of a temporal convolutional network (TCN) for better modelling of long-term dependencies. Furthermore, the efficacy of the proposed network is significantly improved by adding a Gaussian filter layer and performing multi-resolution analysis. Experimental results on publicly available ME spotting databases reveal that PERSIST outperforms the well-known methods. They also indicate that eyebrow information is helpful in ME spotting when eye-blinking artifacts are mitigated, and that human gaze information can be combined with other encodings to improve performance.
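The abstract names three building blocks: temporal (dilated) convolution, a Gaussian filter layer, and multi-resolution analysis. The following is a minimal, generic sketch of these ideas on a one-dimensional per-frame signal. It is not the PERSIST implementation: the function names, the kernel sizes, and the choice of averaging across scales are all assumptions made for illustration.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Return a normalised 1-D Gaussian kernel (weights sum to 1)."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))  # assumed 3-sigma support
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Gaussian-smooth a per-frame signal, replicating edge values as padding."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for t in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(t + k - radius, 0), n - 1)  # clamp to valid frames
            acc += w * signal[idx]
        out.append(acc)
    return out


def dilated_conv(signal, weights, dilation):
    """Causal dilated 1-D convolution with implicit zero padding on the left."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation  # look back j * dilation frames
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Frames seen by a stack of dilated convolutions, one layer per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


def multi_scale_scores(signal, sigmas=(1.0, 2.0, 4.0)):
    """Average the Gaussian-smoothed signal over several temporal scales
    (a hypothetical stand-in for multi-resolution analysis)."""
    smoothed = [smooth(signal, s) for s in sigmas]
    return [sum(vals) / len(sigmas) for vals in zip(*smoothed)]
```

Doubling the dilation at each layer grows the receptive field geometrically with depth, which is the usual argument for why a TCN can cover long-term dependencies with few layers; for example, `receptive_field(2, [1, 2, 4, 8])` covers 16 frames.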

Data Availability

All datasets used for performance evaluation are publicly available and can be obtained after signing the corresponding agreement. The manuscript provides the relevant links.

Code Availability

The software application will be made available after paper acceptance.

Acknowledgment

The author is thankful to all the researchers who provided access to the publicly available datasets and code used in the experimental analysis of this method.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology Indore, Indore, 453552, Madhya Pradesh, India
Puneet Gupta

Corresponding author

Correspondence to Puneet Gupta.

Ethics declarations

Conflict of Interests

The author declares that he has no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

The article is accompanied by electronic supplementary material (MP4, 4.29 MB).

About this article

Cite this article

Gupta, P. PERSIST: Improving micro-expression spotting using better feature encodings and multi-scale Gaussian TCN. Appl Intell 53, 2235–2249 (2023). https://doi.org/10.1007/s10489-022-03553-w

Accepted: 24 March 2022
Published: 05 May 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s10489-022-03553-w
