International Journal of Computer Vision, Volume 91, Issue 2, pp 200–215

Deformable Model Fitting by Regularized Landmark Mean-Shift

Article

Abstract

Deformable model fitting has been actively pursued in the computer vision community for over a decade. As a result, numerous approaches have been proposed with varying degrees of success. A class of approaches that has shown substantial promise is one that makes independent predictions regarding locations of the model’s landmarks, which are combined by enforcing a prior over their joint motion. A common theme in innovations to this approach is the replacement of the distribution of probable landmark locations, obtained from each local detector, with simpler parametric forms. In this work, a principled optimization strategy is proposed where nonparametric representations of these likelihoods are maximized within a hierarchy of smoothed estimates. The resulting update equations are reminiscent of mean-shift over the landmarks but with regularization imposed through a global prior over their joint motion. Extensions to handle partial occlusions and reduce computational complexity are also presented. Through numerical experiments, this approach is shown to outperform some common existing methods on the task of generic face fitting.
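The update described in the abstract, mean-shift over each landmark combined with a global prior on joint motion and run through a hierarchy of smoothed estimates, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a translation-only shape prior (a real deformable model would use a richer prior), Gaussian kernels, and a coarse-to-fine list of bandwidths standing in for the hierarchy. The names `mean_shift_step`, `fit_translation`, and `synthetic_response` are hypothetical.

```python
import math


def mean_shift_step(x, candidates, weights, bandwidth):
    """One Gaussian mean-shift step for a single 2D landmark at x.

    candidates: list of (cx, cy) locations scored by a local detector.
    weights: detector responses, treated as a nonparametric likelihood.
    """
    sx = sy = total = 0.0
    for (cx, cy), w in zip(candidates, weights):
        d2 = (cx - x[0]) ** 2 + (cy - x[1]) ** 2
        k = w * math.exp(-0.5 * d2 / bandwidth ** 2)
        sx += k * cx
        sy += k * cy
        total += k
    if total == 0.0:
        return x  # no support nearby: leave the landmark where it is
    return (sx / total, sy / total)


def fit_translation(base_shape, candidates, weights, bandwidths, iters=10):
    """Fit a global translation (assumed prior) to all landmarks jointly.

    Each iteration takes an independent mean-shift step per landmark, then
    replaces the result with the least-squares translation, so landmarks
    can only move together. Bandwidths are visited coarse to fine.
    """
    tx = ty = 0.0
    n = len(base_shape)
    for h in bandwidths:
        for _ in range(iters):
            dx = dy = 0.0
            for (bx, by), c, w in zip(base_shape, candidates, weights):
                mx, my = mean_shift_step((bx + tx, by + ty), c, w, h)
                dx += mx - bx
                dy += my - by
            # Least-squares translation: mean offset of targets from base.
            tx, ty = dx / n, dy / n
    return tx, ty


def synthetic_response(center, peak, radius=8, sigma=1.5):
    """Toy detector output: integer grid around center, peaked at peak."""
    cands, wts = [], []
    for i in range(-radius, radius + 1):
        for j in range(-radius, radius + 1):
            c = (center[0] + i, center[1] + j)
            d2 = (c[0] - peak[0]) ** 2 + (c[1] - peak[1]) ** 2
            cands.append(c)
            wts.append(math.exp(-0.5 * d2 / sigma ** 2))
    return cands, wts
```

In this sketch the translation-only prior reduces the regularization step to averaging the per-landmark mean-shift targets. With a linear shape model, that averaging would be replaced by a projection onto the model's parameters.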

Keywords

Deformable, Registration, Mean-shift

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Jason M. Saragih (1)
  • Simon Lucey (2)
  • Jeffrey F. Cohn (3)
  1. ICT Center, CSIRO, Sydney, Australia
  2. ICT Center, CSIRO, Brisbane, Australia
  3. Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
