Set-Oriented Dimension Reduction: Localizing Principal Component Analysis Via Hidden Markov Models

  • Illia Horenko
  • Johannes Schmidt-Ehrenberg
  • Christof Schütte
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4216)


We present a method for simultaneous dimension reduction and metastability analysis of high-dimensional time series. The approach combines hidden Markov models (HMMs) with principal component analysis (PCA). We derive optimal estimators for the log-likelihood functional and employ the Expectation-Maximization (EM) algorithm for its numerical optimization. We demonstrate the performance of the method on a generic 102-dimensional example, apply the new HMM-PCA algorithm to a molecular dynamics simulation of 12-alanine in water, and interpret the results.
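To make the idea of localizing PCA per hidden state concrete, the following is a minimal sketch and not the authors' estimator. It assumes that posterior state weights `gamma` (one column per hidden state) are already available, for example from a forward-backward pass of an HMM, which is not shown. The function name `local_pca` and its parameters are hypothetical and chosen for illustration only. For each state it computes a weighted mean, a weighted covariance matrix, and the `m` dominant eigenvectors of that matrix.

```python
import numpy as np

def local_pca(X, gamma, m):
    """Per-state weighted PCA (illustrative sketch, not the paper's estimator).

    X     : (T, n) array, the observed time series
    gamma : (T, K) array of non-negative state weights (assumed given)
    m     : number of principal directions kept per state
    Returns (means, bases) with shapes (K, n) and (K, n, m).
    """
    T, n = X.shape
    K = gamma.shape[1]
    means = np.zeros((K, n))
    bases = np.zeros((K, n, m))
    for k in range(K):
        w = gamma[:, k]
        w_sum = w.sum()
        # Weighted mean of the observations attributed to state k
        means[k] = (w[:, None] * X).sum(axis=0) / w_sum
        # Weighted covariance around that mean
        Xc = X - means[k]
        cov = (w[:, None] * Xc).T @ Xc / w_sum
        # eigh returns eigenvalues in ascending order; reverse to get dominant ones
        _, eigvecs = np.linalg.eigh(cov)
        bases[k] = eigvecs[:, ::-1][:, :m]
    return means, bases
```

With hard assignments (each row of `gamma` containing a single 1), this reduces to ordinary PCA applied separately to the frames of each state. In an EM setting, a step of this kind would alternate with re-estimation of the state weights.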


Hidden Markov Model; Independent Component Analysis; Dimension Reduction; Hidden State





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Illia Horenko (1)
  • Johannes Schmidt-Ehrenberg (2)
  • Christof Schütte (1)
  1. Department of Mathematics and Informatics, Freie Universität Berlin, Berlin, Germany
  2. Zuse Institute Berlin (ZIB), Berlin
