Skip to main content
Log in

Deeper cascaded peak-piloted network for weak expression recognition

  • Original Article
  • Published:
The Visual Computer Aims and scope Submit manuscript

Abstract

Facial expression recognition is in general a challenging problem, especially in the presence of weak expression. Most recently, deep neural networks have been emerging as a powerful tool for expression recognition. However, due to the lack of training samples, existing deep network-based methods cannot fully capture the critical and subtle details of weak expression, resulting in unsatisfactory results. In this paper, we propose Deeper Cascaded Peak-piloted Network (DCPN) for weak expression recognition. The technique of DCPN has three main aspects: (1) Peak-piloted feature transformation, which utilizes the peak expression (easy samples) to supervise the non-peak expression (hard samples) of the same type and subject; (2) the back-propagation algorithm is specially designed such that the intermediate-layer feature maps of non-peak expression are close to those of the corresponding peak expression; and (3) an novel integration training method, cascaded fine-tune, is proposed to prevent the network from overfitting. Experimental results on two popular facial expression databases, CK\(+\) and Oulu-CASIA, show the superiority of the proposed DCPN over state-of-the-art methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

References

  1. Agarwal, S., Santra, B., Mukherjee, D.P.: Anubhav : recognizing emotions through facial expression. Vis. Comput. 1–15 (2016)

  2. Bargal, S.A., Barsoum, E., Ferrer, C.C., Zhang, C.: Emotion recognition in the wild from videos using images. In: ACM International Conference on Multimodal Interaction, pp. 433–436 (2016)

  3. Bartlett, M.S., Littlewort, G., Frank, M., Lainscsek, C., Fasel, I., Movellan, J.: Recognizing facial expression: Machine learning and application to spontaneous behavior. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005. CVPR 2005. vol. 2, pp. 568–573 (2005)

  4. Chi, J., Tu, C., Zhang, C.: Dynamic 3d facial expression modeling using Laplacian smooth and multi-scale mesh matching. Vis. Comput. 30(6–8), 649–659 (2014)

    Article  Google Scholar 

  5. Chopra, S., Hadsell, R., Lecun, Y.: Learning a similarity metric discriminatively, with application to face verification. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005. CVPR 2005. vol. 1, pp. 539–546 (2005)

  6. Danelakis, A., Theoharis, T., Pratikakis, I.: A spatio-temporal wavelet-based descriptor for dynamic 3d facial expression retrieval and recognition. Vis. Comput. 32(6–8), 1–11 (2016)

    Google Scholar 

  7. Dhall, A., Goecke, R., Joshi, J., Hoey, J., Gedeon, T.: Emotiw 2016: video and group-level emotion recognition challenges. In: ACM International Conference on Multimodal Interaction, pp. 427–432 (2016)

  8. Fan, Y., Lu, X., Li, D., Liu, Y.: Video-based emotion recognition using cnn-rnn and c3d hybrid networks. In: ACM International Conference on Multimodal Interaction, pp. 445–450 (2016)

  9. Guo, Y., Zhao, G., Pietikainen, M.: Dynamic Facial Expression Recognition Using Longitudinal Facial Expression Atlases. Springer, Berlin (2012)

    Book  Google Scholar 

  10. Han, S., Meng, Z., KHAN, A.S., Tong, Y.: Incremental boosting convolutional neural network for facial action unit recognition. Adv. Neural Inf. Process. Syst. 29, 109–117 (2016)

    Google Scholar 

  11. He, J., Hu, J.F., Lu, X., Zheng, W.S.: Multi-task mid-level feature learning for micro-expression recognition. Pattern Recognit. 66, 44–52 (2016)

    Article  Google Scholar 

  12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  13. Hung, A.P., Wu, T., Hunter, P., Mithraratne, K.: A framework for generating anatomically detailed subject-specific human facial models for biomechanical simulations. Vis. Comput. 31(5), 527–539 (2015)

    Article  Google Scholar 

  14. Jaiswal, S., Valstar, M.: Deep learning the dynamic appearance and shape of facial action units. In: Winter Applications in Computer Vision, pp. 1–8 (2016)

  15. Jung, H., Lee, S., Yim, J., Park, S.: Joint fine-tuning in deep neural networks for facial expression recognition. In: IEEE International Conference on Computer Vision, pp. 2983–2991 (2015)

  16. Klaser, A., Marszalek, M., Schmid, C.: A spatio-temporal descriptor based on 3d-gradients. In: British Machine Vision Conference 2008, Leeds, September (2008)

  17. Li, X., Mori, G., Zhang, H.: Expression-invariant face recognition with expression classification. In: The Canadian Conference on Computer and Robot Vision, p. 77 (2006)

  18. Liu, M., Li, S., Shan, S., Wang, R., Chen, X.: Deeply Learning Deformable Facial Action Parts Model for Dynamic Expression Analysis. Springer International Publishing, Berlin (2014)

    Google Scholar 

  19. Liu, M., Shan, S., Wang, R., Chen, X.: Learning expression lets on spatio-temporal manifold for dynamic facial expression recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1749–1756 (2014)

  20. Liu, P., Han, S., Meng, Z., Tong, Y.: Facial expression recognition via a boosted deep belief network. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1805–1812 (2014)

  21. Liu, Y.J., Zhang, J.K., Yan, W.J., Wang, S.J., Zhao, G., Fu, X.: A main directional mean optical flow feature for spontaneous micro-expression recognition. IEEE Trans. Affect. Comput. 7(4), 1–1 (2016)

    Article  Google Scholar 

  22. Lucey, P., Cohn, J.F., Kanade, T., Saragih, J.: The extended Cohn–Kanade dataset (ck+): a complete dataset for action unit and emotion-specified expression. In: Computer Vision and Pattern Recognition Workshops, pp. 94–101 (2010)

  23. Metaxas, D.N., Huang, J., Liu, B., Yang, P., Liu, Q., Zhong, L.: Learning active facial patches for expression analysis. In: Computer Vision and Pattern Recognition, pp. 2562–2569 (2012)

  24. Shan, C., Gong, S., Mcowan, P.W.: Facial expression recognition based on local binary patterns: a comprehensive study. Image Vis. Comput. 27(6), 803–816 (2009)

    Article  Google Scholar 

  25. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2014). arXiv:1409.1556

  26. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, inception-resnet and the impact of residual connections on learning. In: Workshop Track International Conference on Learning Representations, pp. 1–12 (2016)

  27. Szegedy, C., Liu, W., Jia, Y., Sermanet, P.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–13 (2014)

  28. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016)

  29. Taini, M., Zhao, G., Li, S.Z., Pietikainen, M.: Facial expression recognition from near-infrared video sequences. In: International Conference on Pattern Recognition, pp. 1–4 (2011)

  30. Valstar, M.F., Almaev, T., Girard, J.M., Mckeown, G.: Fera 2015 second facial expression recognition and analysis challenge. In: IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, pp. 1–8 (2015)

  31. Yao, A., Cai, D., Hu, P., Wang, S., Sha, L., Chen, Y.: Holonet: towards robust emotion recognition in the wild. In: The ACM International Conference, pp. 472–478 (2016)

  32. Yu, Z., Zhang, C.: Image based static facial expression recognition with multiple deep network learning. In: ACM on International Conference on Multimodal Interaction, pp. 435–442 (2015)

  33. Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multi-task cascaded convolutional networks. IEEE J. Solid State Circuits 23(99), 1161–1173 (2016)

    Google Scholar 

  34. Zhang, Z., Luo, P., Chen, C.L., Tang, X.: Facial landmark detection by deep multi-task learning. In: European Conference on Computer Vision, pp. 94–108 (2014)

    Google Scholar 

  35. Zhao, R., Gan, Q., Wang, S., Ji, Q.: Facial expression intensity estimation using ordinal information. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3466–3474 (2016)

  36. Zhao, X., Liang, X., Liu, L., Li, T., Han, Y., Vasconcelos, N., Yan, S.: Peak-piloted deep network for facial expression recognition. In: European Conference on Computer Vision, pp. 425–442 (2016)

    Chapter  Google Scholar 

Download references

Acknowledgements

The work of Qingshan Liu is supported by National Natural Science Foundation of China (NSFC) under Grant 61532009. The work of Guangcan Liu is supported in part by NSFC under Grant 61622305 and Grant 61502238, and in part by the Natural Science Foundation of Jiangsu Province of China (NSFJPC) under Grant BK20160040.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Zhenbo Yu.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Yu, Z., Liu, Q. & Liu, G. Deeper cascaded peak-piloted network for weak expression recognition. Vis Comput 34, 1691–1699 (2018). https://doi.org/10.1007/s00371-017-1443-0

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00371-017-1443-0

Keywords

Navigation