Data-driven discovery of formulas by symbolic regression

  • The Machine Learning Revolution in Materials Research
MRS Bulletin

Abstract

Discovering knowledge from data is a quantum jump from quantity to quality, and this jump characterizes the development of science. Symbolic regression (SR) is playing a growing role in discovering knowledge from data, especially in this era of exponential data growth, because SR can discover mathematical formulas from data. These formulas may provide scientifically meaningful models, especially when combined with domain knowledge. This article provides an overview of SR applications in materials science and engineering. Integrating domain knowledge with SR is the key approach, allowing knowledge to be gained from data quickly, accurately, and scientifically. In the data-driven paradigm, SR helps uncover the underlying mechanisms of materials behavior, properties, and functions across a wide range of areas, from basic academic research to industrial applications, in both experiments and computations. It does so by providing explicit, interpretable models from data, in contrast to “black-box” machine-learning models. SR will be a powerful tool for rational and automatic materials development.

Author information

Corresponding author

Correspondence to Sheng Sun.

About this article

Cite this article

Sun, S., Ouyang, R., Zhang, B. et al. Data-driven discovery of formulas by symbolic regression. MRS Bulletin 44, 559–564 (2019). https://doi.org/10.1557/mrs.2019.156

  • DOI: https://doi.org/10.1557/mrs.2019.156
