The human body is an articulated object with a large number of degrees of freedom. Despite the high dimensionality of the configuration space, many human motion activities lie intrinsically on low-dimensional manifolds. Even when the intrinsic body configuration manifold is very low in dimensionality, the resulting appearance manifold is challenging to model, because appearance also depends on factors such as the shape and appearance of the person performing the motion, the viewpoint, and the illumination. Our objective is to learn representations for the shape and appearance of moving (dynamic) objects that support tasks such as synthesis, pose recovery, reconstruction, and tracking. We study various approaches for representing global deformation manifolds in a way that preserves their geometric structure. Given such representations, we can learn generative models for dynamic shape and appearance. We also address the fundamental question of separating style and content on nonlinear manifolds representing dynamic objects. We learn factorized generative models that explicitly separate the intrinsic body configuration (content), which varies as a function of time, from the appearance and shape of the person performing the action (style factors), which are treated as time-invariant parameters. We show results on pose recovery, body tracking, and gait recognition, as well as facial expression tracking and recognition.
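To make the idea of a factorized style/content model concrete, here is a minimal toy sketch in Python with NumPy. It is not the chapter's implementation. Everything in it is an assumption made for illustration: the synthetic data, the use of a unit circle as the content manifold of a periodic motion, the Gaussian radial basis mapping, the SVD-based factorization, and all function and variable names. Each "person" produces a 50-dimensional sequence driven by the same one-dimensional phase (content), while a time-invariant scalar changes how the harmonics are mixed (style).

```python
import numpy as np

def circle_rbf(angles, centers, width=0.5):
    """Gaussian RBF features of points on a unit circle (assumed content manifold)."""
    pts = np.stack([np.cos(angles), np.sin(angles)], axis=1)      # (T, 2)
    ctr = np.stack([np.cos(centers), np.sin(centers)], axis=1)    # (K, 2)
    d2 = ((pts[:, None, :] - ctr[None, :, :]) ** 2).sum(axis=2)   # (T, K)
    return np.exp(-d2 / (2.0 * width ** 2))

def toy_sequence(angles, style, basis):
    """Hypothetical high-dimensional observations of one periodic motion cycle."""
    harmonics = np.stack([np.cos(angles), np.sin(angles),
                          np.cos(2 * angles), np.sin(2 * angles)], axis=1)
    gains = np.array([1.0, 1.0, style, 1.0 - style])  # style is time-invariant
    return (harmonics * gains) @ basis                # (T, D)

rng = np.random.default_rng(0)
basis = rng.standard_normal((4, 50))                  # shared by all styles
angles = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)   # content (phase)
centers = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)  # RBF centers
styles = [0.2, 0.5, 0.9]                              # three toy "people"

# Step 1: per style, fit a mapping from the content manifold to observations.
Psi = circle_rbf(angles, centers)                     # (T, K)
sequences = [toy_sequence(angles, s, basis) for s in styles]
coeffs = [np.linalg.lstsq(Psi, Y, rcond=None)[0] for Y in sequences]  # (K, D) each

# Step 2: factorize the stacked mapping coefficients into style vectors and a
# shared core, using a plain SVD as a stand-in for a multilinear decomposition.
C = np.stack([B.ravel() for B in coeffs])             # (num_styles, K * D)
U, sv, Vt = np.linalg.svd(C, full_matrices=False)
rank = 2                                              # toy styles span 2 dimensions
style_vectors = U[:, :rank] * sv[:rank]               # one low-dim vector per style
core = Vt[:rank]                                      # shared across styles

def synthesize(style_vector):
    """Generate a full cycle of observations from a style vector."""
    B = (style_vector @ core).reshape(len(centers), -1)
    return Psi @ B

rel_err = [np.linalg.norm(synthesize(a) - Y) / np.linalg.norm(Y)
           for a, Y in zip(style_vectors, sequences)]
print("relative reconstruction error per style:", rel_err)
```

In this toy setting the style enters linearly, so two style dimensions are enough to capture all three sequences, and averaging two style vectors before calling `synthesize` yields an intermediate style. Real appearance data would need a learned embedding and a richer decomposition than this sketch assumes.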
© 2008 Springer
Cite this chapter
Elgammal, A., Lee, CS. (2008). The Role of Manifold Learning in Human Motion Analysis. In: Rosenhahn, B., Klette, R., Metaxas, D. (eds) Human Motion. Computational Imaging and Vision, vol 36. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6693-1_2
DOI: https://doi.org/10.1007/978-1-4020-6693-1_2
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6692-4
Online ISBN: 978-1-4020-6693-1
eBook Packages: Computer Science, Computer Science (R0)