Abstract
Facial expression recognition (FER) has attracted considerable attention in recent years. However, current methods focus primarily on recognition accuracy and largely ignore efficiency. Efficient-CapsNet, which builds on CapsNet by employing depthwise separable convolutions, has few network parameters and trains efficiently while maintaining recognition accuracy. Using three public datasets, JAFFE, CK+, and FER2013, we comprehensively compared the recognition accuracy and training efficiency of Efficient-CapsNet and CapsNet. The results show that Efficient-CapsNet reached recognition accuracies of 99.13%, 93.07%, and 72.94% on JAFFE, CK+, and FER2013, respectively, which is superior to most of the latest methods. In terms of training efficiency, the per-image training time of Efficient-CapsNet is only 0.125 ms with 64×64 input and 0.033 ms with 48×48 input, which is 1454.28 times and 2730.03 times faster than CapsNet, respectively. The results also suggest that the training efficiency of Efficient-CapsNet is affected by the sample size: as the sample size grows, training efficiency gradually decreases until it stabilizes.
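To illustrate why a separable convolution can reduce the parameter count, the following minimal sketch compares the number of weights in a standard convolution with that of a depthwise separable convolution (a depthwise k×k convolution followed by a 1×1 pointwise convolution). It assumes that this is the operation the abstract refers to. The function names and the layer sizes (kernel size 9, 256 input and output channels) are hypothetical choices for illustration and are not taken from the article.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_ratio(k, c_in, c_out):
    """Separable-to-standard weight ratio; algebraically 1/c_out + 1/k**2."""
    return depthwise_separable_params(k, c_in, c_out) / standard_conv_params(k, c_in, c_out)


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    k, c_in, c_out = 9, 256, 256
    print("standard: ", standard_conv_params(k, c_in, c_out))
    print("separable:", depthwise_separable_params(k, c_in, c_out))
    print("ratio:     {:.4f}".format(reduction_ratio(k, c_in, c_out)))
```

The ratio depends only on the kernel size and the number of output channels, so the saving grows with larger kernels and wider layers. Parameter count is only one factor in training time; the speed-ups reported in the abstract are measured results and cannot be derived from this count alone.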
Acknowledgements
This work was supported by grants from the National Major Science and Technology Projects of China (grant no. 2018AAA0100703), the National Natural Science Foundation of China (grant no. 61977012), the Anhui Key Laboratory of Building Acoustic Environment (grant no. AAE2021ZR02), and the Anhui International Joint Research Center for Ancient Architecture Intelligent and Multi-Dimensional Modeling (grant no. GJZZX2021ZR01).
About this article
Cite this article
Wang, K., He, R., Wang, S. et al. The Efficient-CapsNet model for facial expression recognition. Appl Intell 53, 16367–16380 (2023). https://doi.org/10.1007/s10489-022-04349-8
DOI: https://doi.org/10.1007/s10489-022-04349-8