Abstract
Switching Linear Dynamic System (SLDS) models are a popular technique for modeling complex nonlinear dynamic systems. An SLDS can describe complex temporal patterns more concisely and accurately than an HMM by using continuous hidden states. However, the use of SLDS models in practical applications is challenging for three reasons. First, exact inference in SLDS models is computationally intractable. Second, the geometric duration model induced in standard SLDSs limits their representational power. Third, standard SLDSs do not provide a principled way to interpret systematic variations governed by higher-order parameters.
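The second limitation can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes a two-state first-order Markov switching chain with a hypothetical transition matrix, and the function names are chosen here for illustration. A state with self-transition probability a persists for d steps with probability a^(d-1)(1-a), so its expected duration is 1/(1-a) and short durations are always the most likely.

```python
import random


def sample_switch_sequence(transition, length, rng, start=0):
    """Sample switching-state labels from a first-order Markov chain."""
    states = [start]
    for _ in range(length - 1):
        row = transition[states[-1]]
        states.append(rng.choices(range(len(row)), weights=row)[0])
    return states


def run_lengths(states, label):
    """Return the lengths of the maximal runs of `label` in `states`."""
    runs, current = [], 0
    for s in states:
        if s == label:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def geometric_pmf(d, self_prob):
    """P(duration = d) implied by a self-transition probability."""
    return self_prob ** (d - 1) * (1.0 - self_prob)


# Hypothetical transition matrix: state 0 stays put with probability 0.9.
transition = [[0.9, 0.1],
              [0.2, 0.8]]
rng = random.Random(0)
states = sample_switch_sequence(transition, 20000, rng)
runs = run_lengths(states, 0)
mean_duration = sum(runs) / len(runs)

print("empirical mean duration of state 0:", round(mean_duration, 2))
print("geometric prediction 1 / (1 - 0.9):", 1.0 / (1.0 - 0.9))
print("P(duration = 1):", geometric_pmf(1, 0.9))
```

The empirical mean should land close to the geometric prediction, while a duration of one step remains the single most probable outcome. This is a poor fit for motions whose regimes last a characteristic length of time.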
The contributions of this paper address all three of these challenges. First, we present a data-driven MCMC (DD-MCMC) sampling method for approximate inference in SLDSs. We show that DD-MCMC provides an efficient method for estimation and learning in SLDS models. Second, we present segmental SLDSs (S-SLDS), in which the geometric distributions of the switching state durations are replaced with arbitrary duration models. Third, we extend the standard SLDS model with additional global parameters that can capture systematic temporal and spatial variations. The resulting parametric SLDS model (P-SLDS) uses EM to robustly interpret parametrized motions by incorporating additional global parameters that underlie systematic variations of the overall motion.
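To make the segmental idea concrete, the sketch below samples a label sequence in which each regime draws its duration from an explicit, arbitrary distribution instead of inheriting a geometric one from self-transitions. It is an illustrative assumption rather than the S-SLDS formulation of the paper: the duration tables, the label-transition matrix with zero self-transition probability, and all function names are hypothetical.

```python
import random
from itertools import groupby


def sample_segmental_labels(next_label, duration_pmf, length, rng, start=0):
    """Sample labels segment by segment with explicit duration models.

    next_label[i][j] is the probability of moving from label i to label j
    at a segment boundary (self-transitions are assumed to be zero).
    duration_pmf[i] maps each allowed duration of label i to its probability.
    """
    labels, label = [], start
    while len(labels) < length:
        durations, weights = zip(*duration_pmf[label].items())
        d = rng.choices(durations, weights=weights)[0]
        labels.extend([label] * d)
        row = next_label[label]
        label = rng.choices(range(len(row)), weights=row)[0]
    return labels[:length]  # the final segment may be truncated


def runs_of(labels):
    """Collapse a label sequence into (label, run_length) pairs."""
    return [(k, len(list(g))) for k, g in groupby(labels)]


# Hypothetical duration models: label 0 lasts 8 to 10 steps, label 1 exactly 3.
duration_pmf = {0: {8: 0.25, 9: 0.5, 10: 0.25},
                1: {3: 1.0}}
next_label = [[0.0, 1.0],
              [1.0, 0.0]]

rng = random.Random(1)
labels = sample_segmental_labels(next_label, duration_pmf, 60, rng)
runs = runs_of(labels)
print(runs)
```

Every complete segment in the printed runs has a duration drawn from its label's table, so regime lengths can concentrate around a typical value, which a geometric duration model cannot represent.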
Together, these extensions to SLDSs provide a principled framework for interpreting complex motions. The framework is applied to the honey bee dance interpretation task in the context of the ongoing BioTracking project at the Georgia Institute of Technology. The experimental results suggest that the enhanced models provide an effective framework for a wide range of motion analysis applications.
Cite this article
Oh, S.M., Rehg, J.M., Balch, T. et al. Learning and Inferring Motion Patterns using Parametric Segmental Switching Linear Dynamic Systems. Int J Comput Vis 77, 103–124 (2008). https://doi.org/10.1007/s11263-007-0062-z
DOI: https://doi.org/10.1007/s11263-007-0062-z