Abstract
Edge computing has shown significant success in addressing the security and privacy issues of facial expression recognition (FER) tasks. Although several lightweight networks have been proposed for edge computing, their computing demands and memory access cost (MAC) still hinder deployment on edge devices. We therefore propose an edge computing-oriented real-time facial expression recognition network, called EC-RFERNet. Specifically, to improve inference speed, we devise a mini-and-fast (MF) block based on the partial convolution operation. The MF block reduces the MAC and the number of parameters by processing only a part of the input feature maps and eliminating unnecessary channel expansion operations. To improve accuracy, the squeeze-and-excitation (SE) operation is introduced into certain MF blocks, and MF blocks at different levels are selectively connected by the harmonic dense connection. The SE operation performs adaptive channel weighting, and the harmonic dense connection exchanges information between different MF blocks to enhance feature learning. The MF block and the harmonic dense connection together constitute the harmonic-MF module, the core component of EC-RFERNet, which balances accuracy and inference speed. Experiments on five public datasets validate EC-RFERNet and demonstrate its competitive performance with a model size of only 2.25 MB and 0.55 million parameters. Furthermore, a human–robot interaction system is built on a humanoid robot equipped with a Raspberry Pi. The experimental results demonstrate that EC-RFERNet provides an effective solution for practical FER applications.
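To make the abstract's efficiency argument concrete, the following is a minimal sketch, not the authors' implementation. It compares the cost of a regular convolution with a partial convolution that touches only a fraction of the channels, and shows the gating step of an SE operation. The cost formulas follow the usual cost model for partial convolution; the 56 × 56 × 64 feature map, the 1/4 channel ratio, the float32 storage assumption, and all function names are illustrative assumptions that do not come from the abstract.

```python
import math

def regular_conv_cost(h, w, c, k=3):
    """FLOPs and memory accesses of a k x k convolution over all c channels."""
    flops = h * w * k * k * c * c
    memory_access = h * w * 2 * c + k * k * c * c
    return flops, memory_access

def partial_conv_cost(h, w, c, k=3, ratio=4):
    """Same cost model, but only c // ratio channels are convolved.

    The remaining channels are passed through untouched (assumed ratio: 4).
    """
    c_p = c // ratio
    flops = h * w * k * k * c_p * c_p
    memory_access = h * w * 2 * c_p + k * k * c_p * c_p
    return flops, memory_access

def se_channel_weights(channel_means, w_reduce, w_expand):
    """SE gate: per-channel means -> linear -> ReLU -> linear -> sigmoid."""
    hidden = [max(0.0, sum(wi * m for wi, m in zip(row, channel_means)))
              for row in w_reduce]
    return [1.0 / (1.0 + math.exp(-sum(wi * hj for wi, hj in zip(row, hidden))))
            for row in w_expand]

# Illustrative feature map: 56 x 56 spatial size, 64 channels (assumed values).
full_flops, full_mac = regular_conv_cost(56, 56, 64)
part_flops, part_mac = partial_conv_cost(56, 56, 64)
flops_ratio = full_flops / part_flops   # 16x fewer FLOPs at ratio 4
mac_ratio = full_mac / part_mac         # a little over 4x fewer memory accesses

# Storage estimate for 0.55 million parameters, assuming 4-byte float32 weights.
float32_size_mb = 0.55e6 * 4 / 1e6
```

Under these assumptions the partial convolution needs 16 times fewer FLOPs and a little over 4 times fewer memory accesses than the regular one, which illustrates why processing only part of the feature maps lowers the MAC. The float32 estimate of about 2.2 MB is close to the reported 2.25 MB, which would be consistent with 32-bit weights plus a small storage overhead, although the abstract does not state the storage format.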
Data availability
All datasets are freely available in public repositories. RAF-DB: http://www.whdeng.cn/raf/model1.html, FER2013: https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data, CK+: http://www.jeffcohn.net/Resources/, AffectNet: http://mohammadmahoor.com/affectnet/, SFEW: https://cs.anu.edu.au/few/AFEW.html.
Funding
This work has been supported in part by the Science and Technology Project of Xi’an City (Grant No. 22GXFW0086), the Science and Technology Project of Beilin District in Xi’an City (Grant No. GX2243), and the School-Enterprise Collaborative Innovation Fund for Graduate Students of Xi’an University of Technology (Grant No. 310/252062108).
Author information
Contributions
QS formulated the research and revised the manuscript. YC contributed to the model architecture design, experiment design and implementation, and manuscript writing. DY, JW, and JY contributed to manuscript writing and revision. YL provided the hardware platform for real-time system verification.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Informed consent
Informed consent was obtained from all the participants in this study.
About this article
Cite this article
Sun, Q., Chen, Y., Yang, D. et al. EC-RFERNet: an edge computing-oriented real-time facial expression recognition network. SIViP 18, 2019–2035 (2024). https://doi.org/10.1007/s11760-023-02832-4
DOI: https://doi.org/10.1007/s11760-023-02832-4