Cost-effective real-time recognition for human emotion-age-gender using deep learning with normalized facial cropping preprocess


Owing to technological advances, human face recognition is now commonly applied in many fields. Some applications in human-computer interaction (HCI), such as camera-ready chatbots and companion robots, need to gather more information from the user's face. In this paper, we develop a system called EAGR, for emotion, age, and gender recognition, which perceives a user's emotion, age, and gender based on face detection. The EAGR system first applies normalized facial cropping (NFC) as a preprocessing step on the training data before data augmentation, then trains three convolutional neural network (CNN) models to recognize seven emotions (six basic emotions plus a neutral one), four age groups, and two genders. For better emotion recognition, NFC extracts facial features without retaining hair; for better age and gender recognition, it extracts facial features with hair retained. Experiments were conducted on the three trained models for emotion, age, and gender recognition. On a testing dataset normalized for head tilt by the proposed binocular line angle correction (BLAC), the best mean accuracy rates of real-time recognition for seven emotions, four age groups, and two genders were 82.4%, 74.95%, and 96.65%, respectively. Furthermore, NFC preprocessing substantially reduces training time. We therefore believe that the EAGR system is cost-effective in recognizing human emotions, ages, and genders. The system can be further applied in social applications to help HCI services provide more accurate feedback from pluralistic facial classifications.
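The abstract does not spell out how BLAC is computed, so the following is only a minimal sketch of one plausible reading of the name: measure the angle of the line through the two eye centers and rotate about the midpoint between the eyes until that line is horizontal. The function names, the eye coordinates, and the choice of the midpoint as the rotation center are all assumptions made for illustration and are not taken from the paper.

```python
import math


def eye_line_angle(left_eye, right_eye):
    """Angle (degrees) of the line through both eye centers, relative to the x-axis."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, center, angle_deg):
    """Rotate `point` about `center` by `angle_deg` using the standard 2D rotation formula."""
    theta = math.radians(angle_deg)
    x = point[0] - center[0]
    y = point[1] - center[1]
    xr = x * math.cos(theta) - y * math.sin(theta)
    yr = x * math.sin(theta) + y * math.cos(theta)
    return (xr + center[0], yr + center[1])


def level_eyes(left_eye, right_eye):
    """Hypothetical helper: return both eye centers after the eye line is made horizontal.

    Assumption: rotation is about the midpoint between the eyes. The measured
    tilt angle is returned as well so it could drive an image rotation.
    """
    angle = eye_line_angle(left_eye, right_eye)
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    new_left = rotate_point(left_eye, center, -angle)
    new_right = rotate_point(right_eye, center, -angle)
    return new_left, new_right, angle


if __name__ == "__main__":
    # Made-up eye coordinates for a slightly tilted head.
    left, right, tilt = level_eyes((100.0, 120.0), (160.0, 140.0))
    print("tilt (degrees):", round(tilt, 2))
    print("leveled eyes:", left, right)
```

In a real pipeline the eye centers would come from a facial landmark detector, and the same angle would be applied to the whole image with an affine rotation. The sign convention for that rotation depends on the image library, so it should be checked against the library's documentation rather than copied from this sketch.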




  • Facial Expression Recognition
  • Emotion, Age and Gender Recognitions
  • Normalized Facial Cropping
  • Convolutional Neural Network
  • Binocular Line Angle Correction
  • Human-Computer Interactions
  • Brain-Computer Interfaces
  • Deep Neural Network
  • Emotion and Age Recognitions
  • You Only Look Once
  • Histogram of Oriented Gradients
  • JavaScript Object Notation
  • Emotion Recognition Rate
  • Age Recognition Rate
  • Gender Recognition Rate
  • K Nearest Neighbors
  • Support Vector Machine
  • Generative Adversarial Network



Author information



Corresponding author

Correspondence to Chia-Hui Wang.


About this article


Cite this article

Lu, TT., Yeh, SC., Wang, CH. et al. Cost-effective real-time recognition for human emotion-age-gender using deep learning with normalized facial cropping preprocess. Multimed Tools Appl (2021).



  • Face detection
  • Deep learning
  • Normalized facial cropping
  • Binocular line angle correction
  • Companion robot
  • Social applications