Data-Driven Methods in Multiscale Modeling of Soft Matter

Reference work entry


As in many other scientific fields, data-driven methods are rapidly impacting multiscale modeling. This chapter illustrates some of the many ways in which advanced statistical models and a data-centric perspective help augment computer simulations in soft matter. It focuses on force fields, sampling, and simulation analysis, drawing on machine learning, high-throughput schemes, and Bayesian inference.



Various discussions have helped shape some of the views developed in this chapter. I am especially grateful to Denis Andrienko, Kurt Kremer, Joseph F. Rudzinski, Omar Valsson, and Anatole von Lilienfeld.

This work was supported in part by the Emmy Noether Programme of the Deutsche Forschungsgemeinschaft (DFG).



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Theory Group, Max Planck Institute for Polymer Research, Mainz, Germany
