International Journal of Computer Vision, Volume 123, Issue 2, pp 160–183

Markov Chain Monte Carlo for Automated Face Image Analysis

  • Sandro Schönborn
  • Bernhard Egger
  • Andreas Morel-Forster
  • Thomas Vetter

Abstract

We present a novel, fully probabilistic method to interpret a single face image with the 3D Morphable Model. The new method is based on Bayesian inference and makes use of unreliable image-based information. Rather than searching for a single optimal solution, we infer the posterior distribution of the model parameters given the target image. The method is a stochastic sampling algorithm with a propose-and-verify architecture based on the Metropolis–Hastings algorithm. The stochastic method can robustly integrate unreliable information and therefore does not rely on feed-forward initialization. The integrative concept is based on two ideas: the separation of proposal moves from their verification with the model (Data-Driven Markov Chain Monte Carlo), and filtering with the Metropolis acceptance rule. The method does not need gradients and is less prone to local optima than standard fitters. We also introduce a new collective likelihood that models the average difference between the model and the target image rather than individual pixel differences. The average value shows a natural tendency towards a normal distribution, even when the individual pixel-wise difference is not Gaussian. We employ the new fitting method to calculate posterior models of 3D face reconstructions from single real-world images. A direct application of the algorithm with the 3D Morphable Model leads us to a fully automatic face recognition system with competitive performance on the Multi-PIE database without any database adaptation.
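The propose-and-verify idea can be illustrated with a minimal random-walk Metropolis sampler. This is only a sketch under strong assumptions: the target is a toy one-dimensional Gaussian standing in for the posterior over model parameters, the proposal is a symmetric Gaussian step (so the Hastings correction cancels), and the names `log_target` and `metropolis_hastings` are hypothetical. It is not the face-model sampler described in the article.

```python
import math
import random


def log_target(x):
    """Toy unnormalized log posterior (assumption): Gaussian, mean 2.0, std 0.5."""
    return -0.5 * ((x - 2.0) / 0.5) ** 2


def metropolis_hastings(log_p, x0, n_samples, step=0.5, seed=0):
    """Random-walk Metropolis sampler.

    The proposal is symmetric, so the acceptance ratio reduces to the
    ratio of target densities. No gradients of log_p are needed.
    """
    rng = random.Random(seed)
    x, lp = x0, log_p(x0)
    samples, accepted = [], 0
    for _ in range(n_samples):
        # Propose: draw a candidate without consulting the model.
        cand = x + rng.gauss(0.0, step)
        # Verify: the model decides whether the candidate is kept.
        lp_cand = log_p(cand)
        # 1.0 - random() lies in (0, 1], which keeps log() well defined.
        if math.log(1.0 - rng.random()) < lp_cand - lp:
            x, lp = cand, lp_cand
            accepted += 1
        # A rejected proposal repeats the current state in the chain.
        samples.append(x)
    return samples, accepted / n_samples


if __name__ == "__main__":
    chain, rate = metropolis_hastings(log_target, x0=-5.0, n_samples=20000)
    kept = chain[2000:]  # discard burn-in
    mean = sum(kept) / len(kept)
    print(f"acceptance rate {rate:.2f}, posterior mean estimate {mean:.2f}")
```

The chain starts far from the mode on purpose: poor proposals are simply rejected by the acceptance rule, which is the filtering behaviour that lets unreliable information be used without trusting it.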

Keywords

  • Face image analysis
  • Markov chain Monte Carlo
  • Model fitting
  • Morphable Model
  • Generative models
  • Top-down and bottom-up integration

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Sandro Schönborn (1)
  • Bernhard Egger (1)
  • Andreas Morel-Forster (1)
  • Thomas Vetter (1)

  1. Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland
