High-Dimensional Materials and Process Optimization Using Data-Driven Experimental Design with Well-Calibrated Uncertainty Estimates

Ling, Julia; Hutchinson, Maxwell; Antono, Erin; Paradiso, Sean; Meredig, Bryce

doi:10.1007/s40192-017-0098-z

Technical Article
Published: 05 July 2017

Volume 6, pages 207–217, (2017)
Integrating Materials and Manufacturing Innovation

Julia Ling¹ (ORCID: orcid.org/0000-0001-9692-4408), Maxwell Hutchinson¹, Erin Antono¹, Sean Paradiso¹ & Bryce Meredig¹

Abstract

The optimization of composition and processing to obtain materials that exhibit desirable characteristics has historically relied on a combination of domain knowledge, trial and error, and luck. We propose a methodology that can accelerate this process by fitting data-driven models to experimental data as it is collected to suggest which experiment should be performed next. This methodology can guide the practitioner to test the most promising candidates earlier and can supplement scientific and engineering intuition with data-driven insights. A key strength of the proposed framework is that it scales to high-dimensional parameter spaces, as are typical in materials discovery applications. Importantly, the data-driven models incorporate uncertainty analysis, so that new experiments are proposed based on a combination of exploring high-uncertainty candidates and exploiting high-performing regions of parameter space. Over four materials science test cases, our methodology led to the optimal candidate being found with three times fewer required measurements than random guessing on average.
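The loop described in the abstract (fit a model to the measurements so far, estimate uncertainty, then pick the next experiment by balancing exploration and exploitation) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the model or the selection rule, so a bootstrap ensemble of nearest-neighbor predictors stands in for the data-driven model, and an upper-confidence-bound score (predicted mean plus `kappa` times the ensemble spread) stands in for the selection criterion. All names here (`bootstrap_predict`, `sequential_design`, `toy_property`, `kappa`) are hypothetical.

```python
import random
import statistics


def bootstrap_predict(measured, x, n_models=20, rng=random):
    """Return (mean, spread) of nearest-neighbor predictions over bootstrap resamples.

    `measured` is a list of (features, value) pairs. The spread across the
    ensemble is used as a crude uncertainty estimate (an assumption of this sketch).
    """
    preds = []
    for _ in range(n_models):
        # Resample the measured data with replacement.
        sample = [rng.choice(measured) for _ in measured]
        # Predict with the value of the closest resampled point.
        nearest = min(
            sample,
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
        )
        preds.append(nearest[1])
    return statistics.mean(preds), statistics.pstdev(preds)


def sequential_design(candidates, measure, n_initial=3, kappa=1.0, seed=0):
    """Measure candidates one at a time; return (features, value) in test order."""
    rng = random.Random(seed)
    untested = list(candidates)
    rng.shuffle(untested)
    # Start from a few randomly chosen experiments.
    measured = [(x, measure(x)) for x in untested[:n_initial]]
    untested = untested[n_initial:]
    while untested:
        def score(x):
            mean, spread = bootstrap_predict(measured, x, rng=rng)
            # Exploit a high predicted mean, explore a high spread.
            return mean + kappa * spread

        best = max(untested, key=score)
        untested.remove(best)
        measured.append((best, measure(best)))
    return measured


def toy_property(x):
    """Hypothetical stand-in for an experiment; the maximum is at (4, 1)."""
    return -((x[0] - 4) ** 2 + (x[1] - 1) ** 2)


grid = [(i, j) for i in range(6) for j in range(6)]
history = sequential_design(grid, toy_property)
best_x = max(grid, key=toy_property)
# Number of measurements taken before the best candidate was tested.
steps = 1 + [x for x, _ in history].index(best_x)
print(steps)
```

In this toy setting, `steps` can be compared against the expected count for random guessing (about half of the candidate pool) to get a feel for the kind of comparison the abstract reports. Setting `kappa` to zero makes the loop purely exploitative, while larger values favor candidates the ensemble disagrees on.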

Acknowledgements

The authors would like to thank S. Wager and T. Covert for their discussions regarding random forest uncertainty estimates. The authors would also like to thank the rest of the Citrine Informatics team. S. Paradiso and M. Hutchinson acknowledge support from Argonne National Laboratories through contract 6F-31341, associated with the R2R Manufacturing Consortium funded by the Department of Energy Advanced Manufacturing Office.

Author information

Julia Ling and Maxwell Hutchinson contributed equally to this work.

Authors and Affiliations

Citrine Informatics, Redwood City, CA, USA
Julia Ling, Maxwell Hutchinson, Erin Antono, Sean Paradiso & Bryce Meredig

Corresponding author

Correspondence to Julia Ling.

About this article

Cite this article

Ling, J., Hutchinson, M., Antono, E. et al. High-Dimensional Materials and Process Optimization Using Data-Driven Experimental Design with Well-Calibrated Uncertainty Estimates. Integr Mater Manuf Innov 6, 207–217 (2017). https://doi.org/10.1007/s40192-017-0098-z

Received: 20 April 2017
Accepted: 14 June 2017
Published: 05 July 2017
Issue Date: September 2017
DOI: https://doi.org/10.1007/s40192-017-0098-z
