This chapter presents a Continuous Movement Recognition (CMR) framework that forms a basis for segmenting continuous human motion in order to recognize actions, as demonstrated through the tracking and recognition of hundreds of skills, from gait to twisting somersaults. A novel 3D color clone-body-model is dynamically sized and texture mapped to each person for more robust tracking of both edges and textured regions. Tracking is further stabilized by estimating the joint angles for the next frame using a forward smoothing particle filter, with the search space optimized using feedback from the CMR system. A new paradigm defines an alphabet of dynemes, which are small units of movement, to enable recognition of diverse actions. Using multiple Hidden Markov Models, the CMR system attempts to infer the action that could have produced the observed sequence of dynemes.
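The recognition step described above, in which several Hidden Markov Models compete to explain an observed dyneme sequence, can be illustrated with a minimal sketch. This is not the chapter's implementation: the two action names, the three-symbol dyneme alphabet, and all probabilities below are hypothetical values chosen only to show the idea. Each action has its own HMM, the scaled forward algorithm computes the log-likelihood of the sequence under each model, and the highest-scoring model is reported.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | HMM)."""
    n_states = len(start)
    # Initialise with the start distribution and the first observed symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_lik += math.log(scale)
    return log_lik

def recognise(obs, models):
    """Score obs under every action model; return (best action, all scores)."""
    scores = {
        name: forward_log_likelihood(obs, *params)
        for name, params in models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical models over a toy alphabet of three dynemes (symbols 0, 1, 2).
# Each entry is (start probabilities, transition matrix, emission matrix).
MODELS = {
    "walk": (
        [1.0, 0.0],
        [[0.2, 0.8], [0.8, 0.2]],
        [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
    ),
    "jump": (
        [1.0, 0.0],
        [[0.5, 0.5], [0.1, 0.9]],
        [[0.1, 0.1, 0.8], [0.1, 0.1, 0.8]],
    ),
}

if __name__ == "__main__":
    action, scores = recognise([0, 1, 0, 1], MODELS)
    print(action, scores)
```

Working in log space with per-step rescaling keeps the scores comparable across long sequences, where raw probabilities would underflow. A continuous recognizer would also need to segment the incoming stream, which this sketch does not attempt.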
© 2008 Springer
About this chapter
Cite this chapter
Green, R. (2008). Spatially and Temporally Segmenting Movement to Recognize Actions. In: Rosenhahn, B., Klette, R., Metaxas, D. (eds) Human Motion. Computational Imaging and Vision, vol 36. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6693-1_9
DOI: https://doi.org/10.1007/978-1-4020-6693-1_9
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6692-4
Online ISBN: 978-1-4020-6693-1
eBook Packages: Computer Science, Computer Science (R0)