International Journal of Computer Vision, Volume 38, Issue 1, pp 59–73

Morphable Models for the Analysis and Synthesis of Complex Motion Patterns

  • Martin A. Giese
  • Tomaso Poggio

Abstract

The linear combination of prototypical views provides a powerful approach for the recognition and synthesis of images of stationary three-dimensional objects. In this article, we present initial results demonstrating that similar ideas can be developed for the recognition and synthesis of complex motion patterns. We present a technique that represents complex motion or action patterns as linear combinations of a small number of prototypical image sequences. We demonstrate the applicability of this new approach to the synthesis and analysis of biological motion using simulated and real video data from different locomotion patterns. Our results show that complex motion patterns are embedded in pattern spaces with a defined topological structure, which can be uncovered with our methods. The underlying pattern space appears to have the properties of a linear vector space locally, but not globally. We show how knowledge about the topology of the pattern space can be exploited during pattern recognition. Our method may provide a new and interesting approach for the analysis and synthesis of video sequences and complex movements.
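The core idea of the abstract, representing a motion pattern as a linear combination of a few prototypes, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the prototype sequences are already time-aligned feature trajectories (the correspondence step is skipped entirely), and the names `synthesize`, `estimate_weights`, `solve`, `walk`, and `run` are hypothetical, as are the toy trajectories themselves.

```python
import math


def synthesize(prototypes, weights):
    """Blend time-aligned prototype trajectories frame by frame."""
    n_frames = len(prototypes[0])
    n_feats = len(prototypes[0][0])
    return [
        [sum(w * p[t][d] for w, p in zip(weights, prototypes))
         for d in range(n_feats)]
        for t in range(n_frames)
    ]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def estimate_weights(prototypes, observed):
    """Least-squares weights from the normal equations G w = b."""
    flat = [[v for frame in p for v in frame] for p in prototypes]
    y = [v for frame in observed for v in frame]
    k = len(flat)
    gram = [[sum(u * v for u, v in zip(flat[i], flat[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(u * v for u, v in zip(flat[i], y)) for i in range(k)]
    return solve(gram, rhs)


# Hypothetical toy prototypes: two features per frame over one gait cycle.
T = 20
phase = [2 * math.pi * t / T for t in range(T)]
walk = [[math.sin(p), 0.5 * math.cos(p)] for p in phase]
run = [[1.5 * math.sin(p), math.cos(p) + 0.2] for p in phase]

# Synthesis: a pattern "between" walking and running.
blend = synthesize([walk, run], [0.3, 0.7])

# Analysis: recover the mixing weights from the observed pattern.
recovered = estimate_weights([walk, run], blend)
print([round(w, 6) for w in recovered])
```

In this sketch the recovered weights act as a low-dimensional description of the observed pattern, which is the sense in which a linear combination of prototypes can serve both synthesis and recognition. Since the abstract reports that the pattern space is only locally linear, weights far outside the range spanned by the prototypes should not be expected to give plausible motions.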

Keywords: computer vision, learning, morphing, action recognition, nonrigid motion, animation, prototype, linear superposition, correspondence, structural risk minimization

Copyright information

© Kluwer Academic Publishers 2000

Authors and Affiliations

  • Martin A. Giese (1)
  • Tomaso Poggio (1)
  1. Center for Biological and Computational Learning, Artificial Intelligence Laboratory, M.I.T., Cambridge, USA
