Masked Linear Regression for Learning Local Receptive Fields for Facial Expression Synthesis

Published in: International Journal of Computer Vision

Abstract

Compared to facial expression recognition, expression synthesis requires a very high-dimensional mapping. This problem is exacerbated by increasing image size and limits existing expression synthesis approaches to relatively small images. We observe that facial expressions often constitute sparsely distributed, locally correlated changes from one expression to another. Exploiting this observation significantly reduces the number of parameters in an expression synthesis model. We therefore propose a constrained version of ridge regression that exploits the local and sparse structure of facial expressions. We refer to this model as masked regression for learning local receptive fields. In contrast to existing approaches, the proposed model can be trained efficiently on larger image sizes. Experiments on three publicly available datasets demonstrate that our model is significantly better than \(\ell_0\)-, \(\ell_1\)- and \(\ell_2\)-regression, SVD-based approaches, and kernelized regression in terms of mean squared error, visual quality, and computational and spatial complexity. The reduction in the number of parameters also allows our method to generalize better even when trained on smaller datasets. The proposed algorithm is further compared with state-of-the-art GANs, including Pix2Pix, CycleGAN, StarGAN, and GANimation. These GANs produce photo-realistic results as long as the testing and training distributions are similar; in contrast, our results demonstrate significant generalization of the proposed algorithm to out-of-dataset human photographs, pencil sketches, and even animal faces.
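
To make the idea concrete, the following is a minimal sketch of masked ridge regression as described in the abstract: each output pixel is regressed, with an \(\ell_2\) penalty, only on the input pixels inside a small window around it, which is what the receptive-field mask enforces. This is an illustration under stated assumptions, not the authors' implementation; the function name and parameters are hypothetical.

```python
import numpy as np

def masked_ridge_regression(X, Y, img_shape, k=2, lam=1.0):
    """Illustrative sketch of masked (locally constrained) ridge regression.

    X, Y      : (n_samples, n_pixels) flattened input/target expression images
    img_shape : (height, width) of the images before flattening
    k         : half-width of the (2k+1) x (2k+1) local receptive field
    lam       : l2 (ridge) regularisation strength
    """
    height, width = img_shape
    n_pix = height * width
    W = np.zeros((n_pix, n_pix))  # row i holds the weights of output pixel i
    rows, cols = np.unravel_index(np.arange(n_pix), img_shape)
    for i in range(n_pix):
        r, c = rows[i], cols[i]
        # Mask support: input pixels inside the local window around (r, c).
        rr, cc = np.meshgrid(
            np.arange(max(r - k, 0), min(r + k + 1, height)),
            np.arange(max(c - k, 0), min(c + k + 1, width)),
            indexing="ij",
        )
        idx = np.ravel_multi_index((rr.ravel(), cc.ravel()), img_shape)
        Xi = X[:, idx]  # restrict the inputs to the masked pixels
        # Small closed-form ridge solve per output pixel.
        A = Xi.T @ Xi + lam * np.eye(idx.size)
        W[i, idx] = np.linalg.solve(A, Xi.T @ Y[:, i])
    return W  # predict with Y_hat = X @ W.T; W is sparse by construction
```

With a 5x5 receptive field (k=2), each of the n output pixels depends on at most 25 inputs, so the model has O(25n) parameters rather than the O(n^2) of an unconstrained linear map; this is the parameter reduction that lets the method scale to larger images and generalize from smaller training sets.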

Notes

  1. https://github.com/thoughtworksarts/EmoPy.

References

  • Barsoum, E., Zhang, C., Ferrer, C.C., & Zhang, Z. (2016) Training deep networks for facial expression recognition with crowd-sourced label distribution. In Proceedings of the 18th ACM international conference on multimodal interaction, ACM (pp 279–283).

  • Belhumeur, P. N., Hespanha, J. P., & Kriegman, D. J. (1997). Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), 711–720.

  • Bermano, A. H., Bradley, D., Beeler, T., Zund, F., Nowrouzezahrai, D., Baran, I., et al. (2014). Facial performance enhancement using dynamic shape space analysis. ACM Transactions on Graphics, 33(2), 13:1–13:12.

  • Bishop, C. M. (2006). Pattern recognition and machine learning (information science and statistics). Berlin: Springer.

  • Blanz, V., & Vetter, T. (1999). A morphable model for the synthesis of 3D faces. In Proceedings of SIGGRAPH '99 (pp. 187–194).

  • Choi, Y., Choi, M., Kim, M., Ha, J.W., Kim, S., & Choo, J. (2018) StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 8789–8797).

  • Coates, A., & Ng, A.Y. (2011) Selecting receptive fields in deep networks. In Advances in neural information processing systems (pp. 2528–2536).

  • Cootes, T. F., Edwards, G. J., Taylor, C. J., et al. (2001). Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 681–685.

  • Costigan, T., Prasad, M., & McDonnell, R. (2014) Facial retargeting using neural networks. In Proceedings of the seventh international conference on motion in games, ACM (pp. 31–38).

  • De La Hunty, M., Asthana, A., & Goecke, R. (2010) Linear facial expression transfer with active appearance models. In 2010 20th international conference on pattern recognition, IEEE (pp. 3789–3792).

  • Deng, Z., & Noh, J. (2008) Computer facial animation: A survey. In Data-driven 3D facial animation (pp. 1–28). Berlin: Springer.

  • Ekman, P., Friesen, W. V., & Ellsworth, P. (2013). Emotion in the human face: Guidelines for research and an integration of findings. Amsterdam: Elsevier.

  • Elaiwat, S., Bennamoun, M., & Boussaid, F. (2016). A spatio-temporal RBM-based model for facial expression recognition. Pattern Recognition, 49, 152–161.

  • Fabian Benitez-Quiroz, C., Srinivasan, R., & Martinez, A.M. (2016) EmotioNet: An accurate, real-time algorithm for the automatic annotation of a million facial expressions in the wild. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5562–5570).

  • Georgakis, C., Panagakis, Y., & Pantic, M. (2016). Discriminant incoherent component analysis. IEEE Transactions on Image Processing, 25(5), 2021–2034.

  • Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014) Generative adversarial nets. In Advances in neural information processing systems (pp. 2672–2680).

  • Havaldar, P. (2006) Sony Pictures Imageworks. In ACM SIGGRAPH 2006 Courses (p. 5). New York: ACM.

  • Huang, D., & De la Torre, F. (2010) Bilinear kernel reduced rank regression for facial expression synthesis. In European conference on computer vision (pp. 364–377). Berlin: Springer.

  • Isola, P., Zhu, J.Y., Zhou, T., & Efros, A.A. (2017) Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1125–1134).

  • Jain, A. K., & Li, S. Z. (2011). Handbook of face recognition. Berlin: Springer.

  • Jampour, M., Mauthner, T., & Bischof, H. (2015) Multi-view facial expressions recognition using local linear regression of sparse codes. In Proceedings of the 20th computer vision winter workshop.

  • Kim, T., Cha, M., Kim, H., Lee, J.K., & Kim, J. (2017) Learning to discover cross-domain relations with generative adversarial networks. In Proceedings of the 34th international conference on machine learning (Vol. 70, pp. 1857–1865).

  • Kupyn, O., Budzan, V., Mykhailych, M., Mishkin, D., & Matas, J. (2018) DeblurGAN: Blind motion deblurring using conditional adversarial networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 8183–8192).

  • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.

  • Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., et al. (2017) Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4681–4690).

  • Lee, C.S., & Elgammal, A. (2006) Nonlinear shape and appearance models for facial expression analysis and synthesis. In 18th international conference on pattern recognition (Vol. 1, pp. 497–502). IEEE.

  • Lee, H. S., & Kim, D. (2008). Tensor-based AAM with continuous variation estimation: Application to variation-robust face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(6), 1102–1116.

  • Lin, J. R., & Lin, I. C. (2011). Multi-layered expression synthesis. Journal of Information Science and Engineering, 27(1), 337–351.

  • Liu, M.Y., & Tuzel, O. (2016) Coupled generative adversarial networks. In Advances in neural information processing systems (pp. 469–477).

  • Liu, M.Y., Breuel, T., & Kautz, J. (2017) Unsupervised image-to-image translation networks. In Advances in neural information processing systems (pp. 700–708).

  • Liu, S., Huang, D.Y., Lin, W., Dong, M., Li, H., & Ong, E.P. (2014) Emotional facial expression transfer based on temporal restricted Boltzmann machines. In 2014 Asia-Pacific Signal and Information Processing Association annual summit and conference (APSIPA) (pp. 1–7).

  • Liu, Z., Shan, Y., & Zhang, Z. (2001) Expressive expression mapping with ratio images. In Proceedings of the 28th annual conference on Computer graphics and interactive techniques (pp. 271–276). ACM.

  • Lucey, P., Cohn, J.F., Kanade, T., Saragih, J., Ambadar, Z., & Matthews, I. (2010) The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression. In 2010 IEEE computer society conference on computer vision and pattern recognition-workshops (pp. 94–101). IEEE.

  • Lundqvist, D., Flykt, A., & Öhman, A. (1998). The Karolinska Directed Emotional Faces–KDEF, CD ROM. Stockholm: Department of Clinical Neuroscience, Psychology section, Karolinska Institutet.

  • Lyons, M., Akamatsu, S., Kamachi, M., & Gyoba, J. (1998) Coding facial expressions with Gabor wavelets. In Proceedings third IEEE international conference on automatic face and gesture recognition (pp. 200–205). IEEE.

  • Mirza, M., & Osindero, S. (2014) Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784.

  • Nhan Duong, C., Luu, K., Gia Quach, K., & Bui, T.D. (2016) Longitudinal face modeling via temporal deep restricted Boltzmann machines. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5772–5780).

  • Pantic, M., & Rothkrantz, L. J. M. (2000). Automatic analysis of facial expressions: The state of the art. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12), 1424–1445.

  • Patel, N. M., & Zaveri, M. (2010). Parametric facial expression synthesis and animation. International Journal of Computer Applications, 3, 34–40.

  • Pati, Y.C., Rezaiifar, R., & Krishnaprasad, P.S. (1993) Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In Proceedings of 27th Asilomar conference on signals, systems and computers (Vol. 1, pp. 40–44).

  • Pighin, F., & Lewis, J. (2006) Performance-driven facial animation. In Pighin, F., Hecker, J., Lischinski, D., Szeliski, R., & Salesin, D.H. (eds.) Synthesizing realistic facial expressions from photographs, ACM SIGGRAPH 2006 Courses, ACM (p. 19).

  • Pumarola, A., Agudo, A., Martinez, A., Sanfeliu, A., & Moreno-Noguer, F. (2019) GANimation: One-shot anatomically consistent facial animation. International Journal of Computer Vision (IJCV). https://doi.org/10.1007/s11263-019-01210-3.

  • Rizzo, A. A., Neumann, U., Enciso, R., Fidaleo, D., & Noh, J. (2004). Performance-driven facial animation: Basic research on human judgments of emotional state in facial avatars. CyberPsychology & Behavior, 4(4), 471–487.

  • Saragih, J.M., Lucey, S., & Cohn, J.F. (2011) Real-time avatar animation from a single image. In 2011 IEEE international conference on automatic face & gesture recognition and workshops (FG 2011) (pp. 117–124). IEEE.

  • Savran, A., Alyüz, N., Dibeklioğlu, H., Çeliktutan, O., Gökberk, B., Sankur, B., & Akarun, L. (2008) Bosphorus database for 3D face analysis. In European workshop on biometrics and identity management (pp. 47–56). Springer.

  • Shen, W., & Liu, R. (2017) Learning residual images for face attribute manipulation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4030–4038).

  • Susskind, J.M., Anderson, A.K., Hinton, G.E., & Movellan, J.R. (2008) Generating facial expressions with deep belief nets. In Or, J. (ed.) Affective computing (chapter 10, pp. 421–440). InTech.

  • Suwajanakorn, S., Seitz, S.M., & Kemelmacher-Shlizerman, I. (2015) What makes Tom Hanks look like Tom Hanks. In Proceedings of the IEEE international conference on computer vision (pp. 3952–3960).

  • Tenenbaum, J. B., & Freeman, W. T. (2000). Separating style and content with bilinear models. Neural Computation, 12(6), 1247–1283.

  • Thies, J., Zollhofer, M., Stamminger, M., Theobalt, C., & Nießner, M. (2016) Face2Face: Real-time face capture and reenactment of RGB videos. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2387–2395).

  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B (Methodological), 58, 267–288.

  • Tropp, J. A., & Gilbert, A. C. (2007). Signal recovery from random measurements via orthogonal matching pursuit. IEEE Transactions on Information Theory, 53(12), 4655–4666.

  • Wang, H., et al. (2003) Facial expression decomposition. In Proceedings ninth IEEE international conference on computer vision (pp. 958–965). IEEE.

  • Wei, W., Tian, C., Maybank, S. J., & Zhang, Y. (2016). Facial expression transfer method based on frequency analysis. Pattern Recognition, 49, 115–128.

  • Yi, Z., Zhang, H., Tan, P., & Gong, M. (2017) DualGAN: Unsupervised dual learning for image-to-image translation. In Proceedings of the IEEE international conference on computer vision (pp. 2849–2857).

  • Zeiler, M.D., Taylor, G.W., Sigal, L., Matthews, I., & Fergus, R. (2011) Facial expression transfer with input-output temporal restricted Boltzmann machines. In Advances in neural information processing systems (pp. 1629–1637).

  • Zeng, Z., Pantic, M., Roisman, G. I., & Huang, T. S. (2009). A survey of affect recognition methods: Audio, visual, and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1), 39–58.

  • Zhang, G., Kan, M., Shan, S., & Chen, X. (2018) Generative adversarial network with spatial attention for face attribute editing. In Proceedings of the European conference on computer vision (pp. 417–432).

  • Zhang, Q., Liu, Z., Guo, B., Terzopoulos, D., & Shum, H. Y. (2006). Geometry-driven photorealistic facial expression synthesis. IEEE Transactions on Visualization and Computer Graphics, 12(1), 48–60.

  • Zhang, Y., & Wei, W. (2012). A realistic dynamic facial expression transfer method. Neurocomputing, 89, 21–29.

  • Zhu, J.Y., Krähenbühl, P., Shechtman, E., & Efros, A.A. (2016) Generative visual manipulation on the natural image manifold. In European conference on computer vision (pp. 597–613). Springer.

  • Zhu, J.Y., Park, T., Isola, P., & Efros, A.A. (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision (pp. 2223–2232).

Author information

Corresponding author

Correspondence to Nazar Khan.

Additional information

Communicated by Xavier Alameda-Pineda, Elisa Ricci, Albert Ali Salah, Nicu Sebe, Shuicheng Yan.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Khan, N., Akram, A., Mahmood, A. et al. Masked Linear Regression for Learning Local Receptive Fields for Facial Expression Synthesis. Int J Comput Vis 128, 1433–1454 (2020). https://doi.org/10.1007/s11263-019-01256-3
