A recursive framework for expression recognition: from web images to deep models to game dataset

  • Original Paper
Machine Vision and Applications

Abstract

In this paper, we propose a recursive framework for recognizing facial expressions in images from real scenes. Unlike traditional approaches, which typically focus on developing and refining algorithms to improve recognition performance on an existing dataset, we integrate three important components in a recursive manner: facial dataset generation, facial expression recognition model building, and interactive interfaces for testing and new data collection. We first create the Candid Images for Facial Expression (CIFE) dataset. We then apply a convolutional neural network (CNN) to CIFE to build a model for classifying expressions in web images. To increase recognition accuracy, we also fine-tune the CNN model and thus obtain a better facial expression recognition model. Based on the fine-tuned CNN model, we design a facial expression game engine and collect a new and more balanced dataset, GaMo. The images in this dataset are collected from the different expressions our game users make when playing the game. Finally, we run yet another recursive step, a self-evaluation of the quality of the data labeling, and propose a self-cleansing mechanism to improve the quality of the data. We evaluate the GaMo and CIFE datasets and show that our recursive framework can help build a better facial expression model for real-scene facial expression tasks.
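
The abstract does not spell out how the self-cleansing step works, so the following is only a minimal sketch of one plausible reading: a sample is flagged for review when the current model confidently predicts a different expression than the label recorded at collection time. The `Sample` class, the `self_cleanse` function, the `min_confidence` threshold, the label names, and the stub predictor are all hypothetical and stand in for the fine-tuned CNN; none of them come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Sample:
    image_id: str
    label: str  # label recorded at collection time (hypothetical field)


def self_cleanse(
    samples: List[Sample],
    predict: Callable[[Sample], Dict[str, float]],
    min_confidence: float = 0.9,  # assumed threshold, not from the paper
) -> Tuple[List[Sample], List[Sample]]:
    """Return (kept, flagged).

    A sample is flagged when the model's top prediction differs from the
    recorded label and the model's confidence is at least min_confidence.
    """
    kept: List[Sample] = []
    flagged: List[Sample] = []
    for sample in samples:
        probs = predict(sample)
        top_label = max(probs, key=probs.get)
        if top_label != sample.label and probs[top_label] >= min_confidence:
            flagged.append(sample)
        else:
            kept.append(sample)
    return kept, flagged


# Stub standing in for the fine-tuned CNN: fixed, made-up probabilities.
_FAKE_OUTPUTS = {
    "img_a": {"happy": 0.95, "sad": 0.05},
    "img_b": {"happy": 0.97, "sad": 0.03},
    "img_c": {"happy": 0.60, "sad": 0.40},
}


def stub_predict(sample: Sample) -> Dict[str, float]:
    return _FAKE_OUTPUTS[sample.image_id]


data = [Sample("img_a", "happy"), Sample("img_b", "sad"), Sample("img_c", "sad")]
kept, flagged = self_cleanse(data, stub_predict)
print([s.image_id for s in kept])     # ['img_a', 'img_c']
print([s.image_id for s in flagged])  # ['img_b']
```

In a recursive setting, the kept samples could feed the next round of fine-tuning while flagged samples go to a human for relabeling; whether the paper discards, relabels, or reweights such samples is not stated in the abstract.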

Notes

  1. http://emo.vistawearables.com/modeltests/new (Firefox tested and recommended).

  2. http://emo.vistawearables.com/usergames/new (Firefox tested and recommended).

Acknowledgements

The first author would also like to thank IBM China Research Laboratory for the summer internship that enabled the collection of the CIFE dataset. Special thanks to Ms. Celina M. Cavalluzzi, Director of Day Services, GoodWill, for her assistance in evaluating the game applications by adults with autism spectrum disorders.

Author information

Corresponding author

Correspondence to Wei Li.

Additional information

This work is supported by the National Science Foundation through Award EFRI-1137172 and VentureWell (formerly NCIIA) through Award 10087-12.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (avi 114067 KB)

Supplementary material 2 (avi 42753 KB)

About this article

Cite this article

Li, W., Tsangouri, C., Abtahi, F. et al. A recursive framework for expression recognition: from web images to deep models to game dataset. Machine Vision and Applications 29, 489–502 (2018). https://doi.org/10.1007/s00138-017-0904-9

  • DOI: https://doi.org/10.1007/s00138-017-0904-9
