Learning and Inferring Motion Patterns using Parametric Segmental Switching Linear Dynamic Systems
Switching Linear Dynamic System (SLDS) models are a popular technique for modeling complex nonlinear dynamic systems. An SLDS can describe complex temporal patterns more concisely and accurately than an HMM by using continuous hidden states. However, the use of SLDS models in practical applications is challenging for three reasons. First, exact inference in SLDS models is computationally intractable. Second, the geometric duration model induced in standard SLDSs limits their representational power. Third, standard SLDSs do not provide a principled way to interpret systematic variations governed by higher order parameters.
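To illustrate the second limitation: in a standard SLDS the switching state follows a first-order Markov chain, so the time spent in a regime with self-transition probability a is geometrically distributed, P(d) = a^(d-1)(1 - a). The sketch below is not code from the paper; the function names and the value 0.9 are assumptions chosen for illustration.

```python
import numpy as np

def geometric_duration_pmf(self_transition: float, max_duration: int) -> np.ndarray:
    """P(d) for d = 1..max_duration when a regime persists with
    probability `self_transition` at every time step."""
    d = np.arange(1, max_duration + 1)
    return self_transition ** (d - 1) * (1.0 - self_transition)

def expected_duration(self_transition: float) -> float:
    """Mean of the geometric duration distribution."""
    return 1.0 / (1.0 - self_transition)

# Illustrative value only: a regime that persists with probability 0.9.
pmf = geometric_duration_pmf(0.9, 50)
mean_d = expected_duration(0.9)
```

Whatever value of a is chosen, the probability mass decreases monotonically in d, so the most likely duration is always a single time step. A regime that typically lasts, say, 20 to 30 frames cannot be expressed by tuning a alone.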
The contributions of this paper address all three of these challenges. First, we present a data-driven MCMC (DD-MCMC) sampling method for approximate inference in SLDSs. We show that DD-MCMC provides an efficient method for estimation and learning in SLDS models. Second, we present segmental SLDSs (S-SLDS), where the geometric distributions of the switching state durations are replaced with arbitrary duration models. Third, we extend the standard SLDS model with additional global parameters that can capture systematic temporal and spatial variations. The resulting parametric SLDS model (P-SLDS) uses EM to robustly interpret parametrized motions by incorporating additional global parameters that underlie systematic variations of the overall motion.
Together, these extensions to SLDSs provide a principled framework for interpreting complex motions. The framework is applied to the honey bee dance interpretation task in the context of the ongoing BioTracking project at the Georgia Institute of Technology. The experimental results suggest that the enhanced models provide an effective framework for a wide range of motion analysis applications.
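The segmental idea can be sketched generatively: rather than deciding at every step whether to switch, a duration is drawn for each segment from an explicit duration model, and the regime's linear dynamics run for that many steps. This is a toy sketch under stated assumptions, not the paper's S-SLDS formulation: `sample_segmental_slds`, the shifted-Poisson duration model, and the two rotation regimes are all hypothetical, and observations are omitted.

```python
import numpy as np

def sample_segmental_slds(A, Q, trans, sample_duration, x0, n_steps, rng):
    """Draw regime labels and continuous states from a toy segmental SLDS.

    A[k], Q[k]: dynamics matrix and process-noise covariance of regime k.
    trans: regime transition matrix with a zero diagonal (no self-transitions).
    sample_duration(k, rng): hypothetical callback returning an int >= 1.
    """
    n_regimes, dim = len(A), len(x0)
    labels = np.empty(n_steps, dtype=int)
    states = np.empty((n_steps, dim))
    x = np.asarray(x0, dtype=float)
    k = int(rng.integers(n_regimes))
    t = 0
    while t < n_steps:
        d = sample_duration(k, rng)  # explicit segment length
        for _ in range(d):
            if t >= n_steps:
                break
            noise = rng.multivariate_normal(np.zeros(dim), Q[k])
            x = A[k] @ x + noise
            labels[t], states[t] = k, x
            t += 1
        k = int(rng.choice(n_regimes, p=trans[k]))  # move to a different regime
    return labels, states

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Assumed toy setup: two regimes rotating in opposite directions.
rng = np.random.default_rng(0)
A = np.stack([rotation(0.1), rotation(-0.1)])
Q = np.stack([1e-4 * np.eye(2), 1e-4 * np.eye(2)])
trans = np.array([[0.0, 1.0], [1.0, 0.0]])
shifted_poisson = lambda k, rng: 5 + int(rng.poisson(10))  # never shorter than 5
labels, states = sample_segmental_slds(A, Q, trans, shifted_poisson, [1.0, 0.0], 200, rng)
```

With the shifted-Poisson model every complete segment lasts at least five steps and durations peak well away from one, which a geometric duration model cannot reproduce.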
Keywords: Probabilistic graphical models, Time-series, Trajectory analysis, Behavior recognition, MCMC, Biology