NA-Resnet: neighbor block and optimized attention module for global-local feature extraction in facial expression recognition

Qi, Yongfeng; Zhou, Chenyang; Chen, Yixing

doi:10.1007/s11042-022-14191-2

Published: 11 November 2022

Multimedia Tools and Applications, volume 82, pages 16375–16393 (2023)

Abstract

As deep networks grow deeper to extract high-level abstract features, the contribution of shallow features to the target task inevitably diminishes. To address this issue in facial expression recognition (FER), we propose NA-Resnet, a network that increases the decision weight of the shallow and middle feature maps through a neighbor block (Nei Block) and concentrates on the crucial regions for feature extraction through an optimized attention module (OAM). Our work has several merits. First, to the best of our knowledge, NA-Resnet is the first network that directly uses shallow features to assist image classification. Second, the proposed OAM is embedded into each layer of the network, so that it can extract the critical information appropriate to the current stage. Third, our model achieves the best performance on Fer2013 among single, relatively lightweight networks, without a network ensemble. Extensive experiments show that our model outperforms the previous state-of-the-art single networks on Fer2013. In particular, NA-Resnet achieves 74.59% accuracy on Fer2013 and an average accuracy of 96.06% with a standard deviation of 2.9% under 10-fold cross-validation on CK+.
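
The 10-fold cross-validation figures quoted in the abstract are a mean and a standard deviation over per-fold accuracies. The following is a minimal sketch of how such a summary can be computed, not the authors' evaluation code. The function names (`k_fold_indices`, `cross_validate`, `summarize`) and the `evaluate_fold` callback are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def summarize(fold_accuracies):
    """Return (mean, sample standard deviation) of per-fold accuracies.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return statistics.mean(fold_accuracies), statistics.stdev(fold_accuracies)


def cross_validate(n_samples, evaluate_fold, k=10, seed=0):
    """Run k-fold cross-validation.

    evaluate_fold(train_idx, test_idx) is a hypothetical callback that
    trains a model on train_idx and returns its accuracy on test_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    accuracies = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        accuracies.append(evaluate_fold(train_idx, test_idx))
    return summarize(accuracies)
```

Each sample is used for testing exactly once, so the reported mean reflects the whole dataset while the standard deviation indicates how much the result depends on the particular split.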

Data availability

All data generated or analyzed during this study are included in this published article. The datasets used or analyzed during the current study are available from their official websites or from the corresponding author on reasonable request.

Acknowledgments

This research was supported by the National Natural Science Foundation of China under Grant 62267007 and by the Gansu Provincial Department of Education Higher Education Industry Support Plan Project under Grant 2022CYZC-16.

Code availability

The code and pre-trained model used or analyzed during the current study are available from the corresponding author on reasonable request.

Author information

Authors and Affiliations

College of Computer Science and Engineering, Northwest Normal University, Lanzhou, 730070, Gansu, China
Yongfeng Qi & Chenyang Zhou
School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, 510006, Guangdong, China
Yixing Chen

Corresponding author

Correspondence to Chenyang Zhou.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qi, Y., Zhou, C. & Chen, Y. NA-Resnet: neighbor block and optimized attention module for global-local feature extraction in facial expression recognition. Multimed Tools Appl 82, 16375–16393 (2023). https://doi.org/10.1007/s11042-022-14191-2

Received: 26 September 2021
Revised: 07 March 2022
Accepted: 27 October 2022
Published: 11 November 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s11042-022-14191-2
