Abstract
We present a novel model for recognizing long-term complex activities involving multiple persons. The proposed model, named the ‘decomposed hidden Markov model’ (DHMM), combines spatial decomposition with hierarchical abstraction to capture the multi-modal, long-term dependent, and multi-scale characteristics of activities. Decomposing in both space and time makes the model more compact and easier to interpret, and greatly reduces the size of the state space and the number of parameters. DHMMs remain efficient when the number of persons varies. We also introduce an efficient approximation algorithm for inference and parameter estimation. Experiments on multi-person activities and multi-modal individual activities show that DHMMs are more efficient and reliable than familiar models such as coupled HMMs, hierarchical HMMs, and multi-observation HMMs.
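To give a rough sense of why decomposition shrinks the state space, the sketch below compares a single HMM over the joint state of all persons with one separate chain per person. It is an illustration under simplifying assumptions, not the DHMM itself: the function names, the choice of 5 states per person, and the fully independent chains are all hypothetical, and the actual DHMM parameter count depends on how the paper couples its components.

```python
def joint_hmm_size(num_persons, states_per_person):
    """Return (state count, transition-matrix entries) for one HMM
    whose hidden state is the joint configuration of all persons."""
    num_states = states_per_person ** num_persons
    return num_states, num_states ** 2


def factored_size(num_persons, states_per_person):
    """Return (total states, total transition entries) when each person
    gets a separate chain (assumed independent, for illustration only)."""
    return (num_persons * states_per_person,
            num_persons * states_per_person ** 2)


# Hypothetical setting: 5 hidden states per person.
for n in (2, 3, 4):
    print(n, joint_hmm_size(n, 5), factored_size(n, 5))
```

Under these assumptions the joint model grows exponentially with the number of persons, while the factored one grows linearly, which is the kind of saving the abstract refers to.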
Additional information
Project (No. 60772050) supported by the National Natural Science Foundation of China
About this article
Cite this article
Zhang, Wd., Chen, F. & Xu, Wl. Bi-dimension decomposed hidden Markov models for multi-person activity recognition. J. Zhejiang Univ. Sci. A 10, 810–819 (2009). https://doi.org/10.1631/jzus.A0820388
DOI: https://doi.org/10.1631/jzus.A0820388