Variational Approach for Learning Markov Processes from Time Series Data

  • Hao Wu
  • Frank Noé


Inference, prediction, and control of complex dynamical systems from time series are important in many areas, including financial markets, power grid management, climate and weather modeling, and molecular dynamics. The analysis of such highly nonlinear dynamical systems is facilitated by the fact that we can often find a (generally nonlinear) transformation of the system coordinates to features in which the dynamics can be accurately approximated by a linear Markovian model. Moreover, the large number of system variables often change collectively on large time- and length-scales, facilitating a low-dimensional analysis in feature space. In this paper, we introduce a variational approach for Markov processes (VAMP) that allows us to find optimal feature mappings and optimal Markovian models of the dynamics from given time series data. The key insight is that the best linear model can be obtained from the top singular components of the Koopman operator. This leads to the definition of a family of score functions called VAMP-r, which can be calculated from data and employed to optimize a Markovian model. In addition, based on the relationship between the variational scores and the approximation errors of Koopman operators, we propose a new VAMP-E score, which can be applied to cross-validation for hyper-parameter optimization and model selection in VAMP. VAMP is valid for both reversible and nonreversible processes and for stationary and nonstationary processes or realizations.
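
As a rough illustration of how a VAMP-r score might be estimated from data, the following minimal NumPy sketch computes the singular values of the whitened time-lagged covariance matrix of a feature time series and sums their r-th powers. The function names `inv_sqrt` and `vamp_score`, the mean removal, the eigenvalue cutoff `eps`, and the AR(1) toy trajectory are all assumptions made for this example; this is not the authors' reference implementation.

```python
import numpy as np


def inv_sqrt(c, eps=1e-10):
    """Inverse square root of a symmetric PSD matrix.

    Eigenvalues at or below `eps` are dropped (an assumed regularization).
    """
    w, v = np.linalg.eigh(c)
    keep = w > eps
    return (v[:, keep] / np.sqrt(w[keep])) @ v[:, keep].T


def vamp_score(x, y, r=2, k=None):
    """Hypothetical helper: VAMP-r score from paired feature matrices.

    x holds features at times t, y holds features at times t + lag,
    both with shape (n_samples, n_features).
    """
    # Assumption: work with mean-free features, which leaves out the
    # trivial constant singular component.
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = len(x)
    c00 = x.T @ x / n  # instantaneous covariance at time t
    c11 = y.T @ y / n  # instantaneous covariance at time t + lag
    c01 = x.T @ y / n  # time-lagged cross-covariance
    # Singular values of the whitened cross-covariance approximate the
    # leading singular values of the Koopman operator in this feature space.
    s = np.linalg.svd(inv_sqrt(c00) @ c01 @ inv_sqrt(c11), compute_uv=False)
    if k is not None:
        s = s[:k]  # keep only the top k singular components
    return float(np.sum(s ** r))


# Toy data (assumed): a one-dimensional AR(1) process x[t] = a * x[t-1] + noise.
rng = np.random.default_rng(0)
a, n_steps = 0.9, 20000
traj = np.zeros(n_steps)
for t in range(1, n_steps):
    traj[t] = a * traj[t - 1] + rng.normal()

feats = traj[:, None]  # identity feature map, shape (n_steps, 1)
score = vamp_score(feats[:-1], feats[1:], r=2)  # lag of one step
print(score)  # expected to be close to a**2, the squared lag-1 correlation
```

In this sketch a higher score indicates that the chosen features capture more of the slow, predictable part of the dynamics, which is the sense in which such scores can be used to compare feature mappings.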


Koopman operator · Variational approach · Markov process · Data-driven methods

Mathematics Subject Classification

37M10 37L65 47N30 65K10 




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Mathematical Sciences, Tongji University, Shanghai, China
  2. Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
  3. Department of Physics, Freie Universität Berlin, Berlin, Germany
  4. Department of Chemistry, Rice University, Houston, USA
