Data-Driven Methods in Multiscale Modeling of Soft Matter

  • Tristan Bereau
Living reference work entry


As in many other scientific fields, data-driven methods are rapidly impacting multiscale modeling. This chapter illustrates some of the many ways in which advanced statistical models and a data-centric perspective help augment computer simulations in soft matter. It focuses on force fields, sampling, and simulation analysis, drawing on machine learning, high-throughput schemes, and Bayesian inference.



Various discussions have helped shape some of the views developed in this chapter. I am especially grateful to Denis Andrienko, Kurt Kremer, Joseph F. Rudzinski, Omar Valsson, and Anatole von Lilienfeld.

This work was supported in part by the Emmy Noether Programme of the Deutsche Forschungsgemeinschaft (DFG).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Theory Group, Max Planck Institute for Polymer Research, Mainz, Germany

Section editors and affiliations

  • Kurt Kremer (1)
  1. MPI for Polymer Research, Mainz, Germany
