Abstract
The availability of large, diverse datasets has enabled transformative advances in a wide variety of technical fields by unlocking data science and machine learning techniques. In materials informatics for heterogeneous microstructures, capitalization on these techniques has been limited by the extreme complexity of generating or curating sizeable heterogeneous microstructure datasets. Historically, this difficulty can be attributed to two main hurdles: quantification (i.e., measuring microstructure diversity) and curation (i.e., generating diverse microstructures). In this paper, we present a framework for curating large, statistically diverse mesoscale microstructure datasets composed of 2-phase microstructures. The framework generates microstructures that are statistically diverse with respect to their n-point statistics, with primary emphasis on diversity in their 2-point statistics. The framework’s foundation is a proposed set of algorithms for synthesizing salient 2-point statistics and neighborhood distributions. We generate statistically diverse microstructures by using the outputs of these algorithms as inputs to a statistically conditioned Local-Global Decomposition generation procedure. Finally, we demonstrate the proposed framework by curating MICRO2D, a diverse, large-scale, and open-source heterogeneous microstructure dataset comprising 87,379 2-phase microstructures. The contained microstructures are periodic and \(256 \times 256\) pixels in size. The dataset also contains salient homogenized elastic and thermal properties computed across a range of constituent contrast ratios for each microstructure. Using MICRO2D, we analyze the statistical and property diversity achievable via the proposed framework. We conclude by discussing important areas of future research in microstructure dataset curation.
Code Availability
The MICRO2D dataset and the code used in this paper will be freely provided upon publication at https://arobertson38.github.io/MICRO2D.
Notes
We emphasize the similarity of this requirement to that given by Niezgoda et al. [71] above.
The mixture weights must sum to 1. In this work, all weights in a single parameterization were set to the same value.
In this work, we set this value to \(\epsilon =10^{-8}\).
This will likely be true even if optimal space filling is accomplished over the parameter space, because of the nonlinear generation transformation step described earlier.
This approximation is a generalization of PyMKS’ standard generative model [125].
The parameterization is numerically implemented as a standard eigenvalue decomposition of the covariance matrix, where the eigenvector matrices are the Euler rotation matrices.
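As a minimal sketch of this kind of parameterization, the following builds a covariance matrix from a rotation matrix and a set of eigenvalues. It assumes a 2D covariance described by a single rotation angle; the function name `covariance_from_rotation` and its arguments are illustrative and are not taken from the paper's codebase.

```python
import numpy as np

def covariance_from_rotation(theta, eigenvalues):
    """Build a 2D covariance C = R diag(eigenvalues) R^T (illustrative helper)."""
    c, s = np.cos(theta), np.sin(theta)
    # Rotation matrix: its columns are the eigenvectors of the covariance.
    rotation = np.array([[c, -s], [s, c]])
    return rotation @ np.diag(eigenvalues) @ rotation.T

# Hypothetical parameter values chosen only for demonstration.
cov = covariance_from_rotation(np.pi / 6, [4.0, 1.0])

# The eigenvalues of the assembled matrix are the ones we specified
# (eigvalsh returns them in ascending order).
recovered = np.linalg.eigvalsh(cov)
```

Because the rotation matrix is orthogonal and the eigenvalues are positive, the resulting matrix is symmetric positive definite by construction, which is the practical appeal of parameterizing a covariance this way.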
For example, the class 'VoidSmallBig' is nonstationary, breaking the stationarity assumption that accompanies many stochastic quantification frameworks. Similarly, the sharp edges in the Voronoi classes and the small features in the NBSA class will be difficult for localization models [126], in particular those utilizing Fourier filters [127].
The exact parameter values—along with all the code necessary to generate the dataset—can be found in the GitHub repository identified at the end of the paper.
In particular, we noticed that if we did not employ volume fraction stratification, the final autocorrelation dataset was strongly skewed toward higher volume fractions. We hypothesize that this is a fingerprint of the space filling under the \(L_2\)-norm.
The total number of microstructures is less than 100,000 (i.e., \(10 \times 10{,}000\)) because several volume fraction and neighborhood combinations resulted in unstable generation, e.g., see NBSA in Table 1.
In the dataset, each class is stored separately to simplify studying subsets of the dataset.
We selected this specific discretization to balance the degree of achievable diversity against practical considerations. This resolution was sufficiently high to allow us to incorporate two important lengthscales: both salient individual features and long-range patterns. However, it is sufficiently low to remain in line with the discretizations preferred by the microstructure informatics community (e.g., in Process-Structure-Property modeling [37, 72, 73, 79, 81, 82, 83] and synthetic generation [58, 65, 76, 80, 128]). Additionally, we construct our heuristic strategies to ensure that the chosen discretization is sufficient to represent the generated systems. Primarily, we do this by ensuring that the correlation length of the generated statistics is less than half the domain size and by generating periodic microstructures. It is well established in the micromechanics community that periodic RVEs and SVEs provide highly stable estimates of homogenized properties even using relatively small domains [84, 129]. We note that the proposed framework is not restricted to this discretization, and datasets containing smaller, larger, or even 3D microstructures can readily be generated without significantly altering the codebase referenced at the end of this paper. However, more advanced generation strategies will need to be established if one is interested in incorporating more than two feature lengthscales.
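Because the microstructures are periodic, their 2-point autocorrelations can be computed with an FFT without any padding. The following is a minimal sketch of that standard computation, not the paper's implementation; the function name `periodic_autocorrelation` is illustrative, and the random binary image is only a stand-in for a \(256 \times 256\) 2-phase microstructure.

```python
import numpy as np

def periodic_autocorrelation(micro):
    """Periodic 2-point autocorrelation of one phase via FFT (illustrative)."""
    spectrum = np.fft.fftn(micro)
    # Circular correlation theorem: autocorrelation = IFFT(|FFT|^2) / N.
    corr = np.fft.ifftn(spectrum * np.conj(spectrum)).real / micro.size
    # Move the zero-shift entry to the center of the array.
    return np.fft.fftshift(corr)

# Stand-in microstructure: uncorrelated binary noise (an assumption for the demo).
rng = np.random.default_rng(0)
micro = (rng.random((256, 256)) < 0.3).astype(float)

stats = periodic_autocorrelation(micro)
center = tuple(n // 2 for n in micro.shape)
volume_fraction = micro.mean()
```

For a binary indicator field, the zero-shift value `stats[center]` equals the volume fraction of the phase. A correlation length below half the domain size corresponds, loosely, to the autocorrelation having decayed to roughly the squared volume fraction before shifts wrap around the periodic domain.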
The TAMU microstructures are rescaled down to \(256 \times 256\) for comparison.
The average relative \(L_2\) reconstruction error of the projection is \(0.0071 \pm 0.0077\) for the spinodal dataset. This is comparable with the reconstruction error of MICRO2D, Appendix B. Therefore, the dataset is well represented by the basis. Additionally, including the spinodal dataset in training the PC basis did not change the structure of the latent space.
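As a hedged illustration of how such a projection error can be evaluated, the sketch below fits a principal component basis on one dataset and measures the per-sample relative \(L_2\) reconstruction error of a second dataset projected onto it. The helper names, the synthetic data, and the normalization by each sample's own norm are assumptions made for this example; the paper's exact definition may differ.

```python
import numpy as np

def fit_pc_basis(train, n_components):
    """Return the training mean and the top principal directions (rows)."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components]

def relative_l2_errors(samples, mean, basis):
    """Per-sample ||x - x_hat|| / ||x|| after projecting onto the basis."""
    centered = samples - mean
    reconstructed = centered @ basis.T @ basis + mean
    residual = np.linalg.norm(samples - reconstructed, axis=1)
    return residual / np.linalg.norm(samples, axis=1)

# Synthetic stand-ins for flattened statistics (not real MICRO2D data).
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 20)) + 5.0
new_data = rng.normal(size=(10, 20)) + 5.0

mean, basis = fit_pc_basis(train, n_components=5)
errors = relative_l2_errors(new_data, mean, basis)
summary = (errors.mean(), errors.std())  # reported as mean and spread
```

A small mean error for data that was not used to fit the basis suggests that the basis spans those samples well, which is the sense in which the note above describes the spinodal dataset as well represented.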
We use an analysis congruent with the one reported in Robertson et al. [64]. Only the subset of 3-point statistics in which the first shift is equal to 3 is considered.
Additionally, we computed localized elastic strain fields that are not included in the dataset due to the extreme memory cost. Interested readers should contact the authors.
References
Jordan MI, Mitchell TM (2015) Machine learning: trends, perspectives, and prospects. Science 349(6245):255–260. https://doi.org/10.1126/science.aaa8415
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A, Kaiser L, Polosukhin I. Attention is all you need, NeurIPS
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial networks, NeurIPS
Chen N, Zhang Y, Zen H, Weiss R, Norouzi M, Chan W (2020) Wavegrad: estimating gradients for waveform generation, https://doi.org/10.48550/arxiv.2009.00713
Mahdavifar S, Ghorbani AA (2019) Application of deep learning to cybersecurity: a survey. Neurocomputing 347:149–176. https://doi.org/10.1016/j.neucom.2019.02.056
Cai L, Gao J, Zhao D (2020) A review of the application of deep learning in medical image classification and segmentation. Ann Translat Med 8:713. https://doi.org/10.21037/atm.2020.02.44
Jiang W (2021) Applications of deep learning in stock market prediction: recent progress. Expert Syst Appl 184:115537. https://doi.org/10.1016/j.eswa.2021.115537
Devlin J, Chang M-W, Lee K, Toutanova K (2019) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805
Song Y, Sohl-Dickstein J, Kingma DP, Kumar A, Ermon S, Poole B (2021) Score-based generative modeling through stochastic differential equations. In: International conference on learning representations, pp 1–36
Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models. NeurIPS
Ho J, Salimans T, Gritsenko A, Chan W, Norouzi M, Fleet D. Video diffusion models, https://doi.org/10.48550/arxiv.2204.03458
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks, In: Pereira F, Burges C, Bottou L, Weinberger K (eds), Advances in neural information processing systems, vol. 25, Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs, In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds), Advances in neural information processing systems, vol. 30, Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2017/file/5dd9db5e033da9c6fb5ba83c7a7ebea9-Paper.pdf
Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J, Shah A, Johnson M, Liu X, Kaiser L, Gouws S, Kato Y, Kudo T, Kazawa H, Stevens K, Kurian G, Patil N, Wang W, Young C, Smith J, Riesa J, Rudnick A, Vinyals O, Corrado G, Hughes M, Dean J (2016) Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv:1609.08144
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Zidek A, Potapenko A, Bridgland A, Meyer C, Kohl S, Ballard A, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Peterson S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior A, Kavukcuoglu K, Kohli P, Hassabis D (2021) Highly accurate protein structure prediction with alphafold. Nature 596:583–589. https://doi.org/10.1038/s41586-021-03819-2
Anand N, Achim T. Protein structure and sequence generation with equivariant denoising diffusion probabilistic models, https://doi.org/10.48550/arxiv.2205.15019
Burley SK, Bhikadiya C, Bi C, Bittrich S, Chen L, Crichlow GV, Christie CH, Dalenberg K, Di Costanzo L, Duarte JM, Dutta S, Feng Z, Ganesan S, Goodsell DS, Ghosh S, Green RK, Guranović V, Guzenko D, Hudson BP, Lawson CL, Liang Y, Lowe R, Namkoong H, Peisach E, Persikova I, Randle C, Rose A, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Tao Y-P, Voigt M, Westbrook JD, Young JY, Zardecki C, Zhuravleva M (2020) RCSB Protein Data Bank: powerful new tools for exploring 3D structures of biological macromolecules for basic and applied research and education in fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences. Nucleic Acids Res 49(D1):D437–D451. https://doi.org/10.1093/nar/gkaa1038
Fersht A (2021) Alphafold: a personal perspective on the impact of machine learning. J Mol Biol 433(20):167088. https://doi.org/10.1016/j.jmb.2021.167088
Zheng S, He J, Liu C, Shi Y, Lu Z, Feng W, Ju F, Wang J, Zhu J, Min Y, Zhang H, Tang S, Hao H, Jin P, Chen C, Noé F, Liu H, Liu T-Y (2023) Towards predicting equilibrium distributions for molecular systems with deep learning. arxiv:2306.05445
Materials genome initiative for global competitiveness
Generale A, Robertson A, Kelly C, Kalidindi S. Inverse stochastic microstructure design, SSRN: preprint https://doi.org/10.2139/ssrn.4590691
Gao Y, Liu Y. Reliability-based topology optimization with stochastic heterogeneous microstructure properties. Mater Des. https://doi.org/10.1016/j.matdes.2021.109713
Marshall A, Kalidindi S (2021) Autonomous development of a machine-learning model for the plastic response of two-phase composites from micromechanical finite element models. JOM 73:2085–2095. https://doi.org/10.1007/s11837-021-04696-w
Kalidindi S, Binci M, Fullwood D, Adams B (2006) Elastic properties closures using second-order homogenization theories: case studies in composites of two isotropic constituents. Acta Mater 54:3117–3126. https://doi.org/10.1016/j.actamat.2006.03.005
Hasan M, Mao Y, Tavazza F, Choudhary A, Agrawal A, Acar P. Data-driven multi-scale modeling and optimization for elastic properties of cubic microstructures. Integr Mater Manuf Innov. https://doi.org/10.1007/s40192-022-00258-3
Acar P, Sundararaghavan V (2019) Stochastic design optimization of microstructural features using linear programming for robust design. AIAA J 57:448–455
Xiong Y, Duong P, Wang D, Park S-I, Ge Q, Raghavan N, Rosen D (2019) Data-driven design space exploration and exploitation for design for additive manufacturing. J Mech Des 141:101101. https://doi.org/10.1115/1.4043587
Morris C, Bekker L, Haberman M, Seepersad C (2018) Design exploration of reliably manufacturable materials and structures with applications to negative stiffness metamaterials and microstereolithography. J Mech Des 140:111415. https://doi.org/10.1115/1.4041251
Pei Z, Rozman KA, Dogan ÖN, Wen Y, Gao N, Holm EA, Hawk JA, Alman DE, Gao MC (2021) Machine-learning microstructure for inverse material design. Adv Sci 8:2101207. https://doi.org/10.1002/advs.202101207
Fung V, Zhang J, Hu G, Ganesh P, Sumpter BG (2021) Inverse design of two-dimensional materials with invertible neural networks. npj Comput Mater 7:200. https://doi.org/10.1038/s41524-021-00670-x
Abram M, Burghardt K, Steeg GV, Galstyan A, Dingreville R. Inferring topological transitions in pattern forming processes with self supervised learning. npj Comput Mater 8. https://doi.org/10.1038/s41524-022-00889-2
Diehl M, Groeber M, Haase C, Molodov D, Roters F, Raabe D (2017) Identifying structure-property relationships through DREAM.3D representative volume elements and DAMASK crystal plasticity simulations: an integrated computational materials engineering approach. JOM 69:848–855. https://doi.org/10.1007/s11837-017-2303-0
Muir C, Swaminathan B, Almansour A, Sevener K, Smith C, Presby M, Kiser J, Pollock T, Daly S. Damage mechanism identification in composites via machine learning and acoustic emission. npj Comput Mater 7. https://doi.org/10.1038/s41524-021-00565-x
Hashemi S, Kalidindi SR (2023) Gaussian process autoregression models for the evolution of polycrystalline microstructures subjected to arbitrary stretching tensors. Int J Plast 162:103532. https://doi.org/10.1016/j.ijplas.2023.103532
Yabansu YC, Steinmetz P, Hötzer J, Kalidindi SR, Nestler B (2017) Extraction of reduced-order process-structure linkages from phase-field simulations. Acta Mater 124:182–194. https://doi.org/10.1016/j.actamat.2016.10.071
Dornheim J, Morand L, Zeitvogel S, Iraki T, Link N, Helm D. Deep reinforcement learning methods for structure-guided processing path optimization. J Intell Manuf 33. https://doi.org/10.1007/s10845-021-01805-z
Vlassis NN, Sun W (2023) Denoising diffusion algorithm for inverse design of microstructures with fine-tuned nonlinear material properties. Comput Methods Appl Mech Eng 413:116126. https://doi.org/10.1016/j.cma.2023.116126
Jain A, Ong S, Hautier G, Chen W, Richards W, Dacek S, Cholia S, Gunter D, Skinner D, Ceder G, Persson K (2013) Commentary: The materials project: a materials genome approach to accelerating materials innovation. APL Mater 1:011002. https://doi.org/10.1063/1.4812323
Groeber M, Jackson M (2014) DREAM.3D: a digital representation environment for the analysis of microstructure in 3d. Integr Mater Manuf Innov 3:56–72. https://doi.org/10.1186/2193-9772-3-5
Groeber M, Ghosh S, Uchic M, Dimiduk D (2008) A framework for automated analysis and simulation of 3d polycrystalline microstructures. part 2: synthetic microstructure generation. Acta Mater 56:1274–1287. https://doi.org/10.1016/j.actamat.2007.11.040
Pilchak AL, Shank J, Tucker JC, Srivatsa S, Fagin PN, Semiatin SL (2016) A dataset for the development, verification, and validation of microstructure-sensitive process models for near-alpha titanium alloys. Integr Mater Manuf Innov, 1–18. https://doi.org/10.1186/s40192-016-0056-1
DeCost BL, Holm EA (2016) A large dataset of synthetic SEM images of powder materials and their ground truth 3d structures. Data Brief 9:727–731. https://doi.org/10.1016/j.dib.2016.10.011
Kalidindi S, Khosravani A, Yucel B, Shanker A, Blekh A (2019) Data infrastructure elements in support of accelerated materials innovation: ELA, PyMKS, and MATIN. Integr Mater Manuf Innov 8:441–454
Hart KA, Rimoli JJ (2020) Microstructpy: a statistical microstructure mesh generator in python. SoftwareX 12:100595. https://doi.org/10.1016/j.softx.2020.100595
Song K, Yan Y (2013) A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects. Appl Surf Sci 285P:858–864. https://doi.org/10.1016/j.apsusc.2013.09.002
DeCost BL, Hecht M, Francis T, Webler BA, Picard YN, Holm E (2017) Uhcsdb: ultra high carbon steel micrograph database. Integr Mater Manuf Innov 6:197–205. https://doi.org/10.1007/s40192-017-0097-0
Barber Z, Leake J, Clyne T. The doitpoms project: micrograph library. https://www.doitpoms.ac.uk/miclib/index.php
Saal J, Kirklin S, Aykol M, Meredig B, Wolverton C (2013) Materials design and discovery with high-throughput density functional theory: the open quantum materials database. JOM 65:1501–1509. https://doi.org/10.1007/s11837-013-0755-4
Choudhary K, Garrity KF, Reid ACE, DeCost B, Biacchi AJ, Walker ARH, Trautt Z, Hattrick-Simpers J, Kusne AG, Centrone A, Davydov A, Jiang J, Pachter R, Cheon G, Reed E, Agrawal A, Qian X, Sharma V, Zhuang H, Kalinin SV, Sumpter BG, Pilania G, Acar P, Mandal S, Haule K, Vanderbilt D, Rabe K, Tavazza F, The joint automated repository for various integrated simulations (JARVIS) for data-driven materials design. npj Comput Mater 6. https://doi.org/10.1038/s41524-020-00440-1
Tanifuji M, Matsuda A, Yoshikawa H (2019) Materials data platform: a fair system for data-driven materials science, In: 2019 8th International congress on advanced applied informatics (IIAI-AAI), pp 1021–1022. https://doi.org/10.1109/IIAI-AAI.2019.00206
Ma R, Luo T (2020) PI1M: a benchmark database for polymer informatics. J Chem Inf Model 60(10):4684–4690. https://doi.org/10.1021/acs.jcim.0c00726
Borysov S, Geilhufe R, Balatsky A. Organic materials database: an open-access online database for data mining. PLoS ONE 12. https://doi.org/10.1371/journal.pone.0171501
Kench S, Squires I, Dahari A. Microlib: a library of 3d microstructures generated from 2d micrographs using slicegan. Sci Data 9. https://doi.org/10.1038/s41597-022-01744-1
Bargmann S, Klusemann B, Markmann J, Schnabel J, Schneider K, Soyarslan C, Wilmers J (2018) Generation of 3d representative volume elements for heterogeneous materials: a review. Prog Mater Sci 96:322–384. https://doi.org/10.1016/j.pmatsci.2018.02.003
Mosser L, Dubrule O, Blunt M (2018) Stochastic reconstruction of oolitic limestone by generative adversarial networks. Transp Porous Med 125:81–103. https://doi.org/10.1007/s11242-018-1039-9
Kench S, Cooper S (2021) Generating three-dimensional structures from a two-dimensional slice with generative adversarial network-based dimensionality expansion. Nature Mach Intell 3:299–305. https://doi.org/10.1038/s42256-021-00322-1
Fokina D, Muravleva E, Ovchinnikov G, Oseledets I (2020) Microstructure synthesis using style-based generative adversarial networks. Phys Rev E 101:043308. https://doi.org/10.1103/PhysRevE.101.043308
Noguchi S, Inoue J (2021) Stochastic characterization and reconstruction of material microstructures for establishment of process-structure-property linkage using the deep generative model. Phys Rev E 104:025302. https://doi.org/10.1103/PhysRevE.104.025302
Fullwood D, Niezgoda S, Adams B, Kalidindi S (2010) Microstructure sensitive design for performance optimization. Prog Mater Sci 55:477–562. https://doi.org/10.1016/j.pmatsci.2009.08.002
Torquato S (2002) Random heterogeneous materials. Springer, New York
Adams B, Kalidindi S, Fullwood D (2013) Microstructure sensitive design for performance optimization. Butterworth-Heinemann, Waltham
Gao Y, Jiao Y, Liu Y (2021) Ultra-efficient reconstruction of 3d microstructure and distribution of properties of random heterogeneous materials containing multiple phases. Acta Mater 204:116526. https://doi.org/10.1016/j.actamat.2020.116526
Robertson A, Kalidindi S (2022) Efficient generation of n-field microstructures from 2-point statistics using multi-output gaussian random fields. Acta Mater 232:117927. https://doi.org/10.1016/j.actamat.2022.117927
Robertson AE, Kelly C, Buzzy M, Kalidindi SR (2023) Local-global decompositions for conditional microstructure generation. Acta Mater 253:118966. https://doi.org/10.1016/j.actamat.2023.118966
Seibert P, Ambati M, Rabloff A, Kastner M (2021) Reconstructing random heterogeneous media through differentiable optimization. Comput Mater Sci 196:110455. https://doi.org/10.1016/j.commatsci.2021.110455
Seibert P, Rabloff A, Ambati M, Kastner M (2022) Descriptor-based reconstruction of three-dimensional microstructures through gradient-based optimization. Acta Mater 227:117667. https://doi.org/10.1016/j.actamat.2022.117667
Seibert P, Husert M, Wollner M, Kalina K, Kastner M. Fast reconstruction of microstructures with ellipsoidal inclusions using analytic descriptors, https://doi.org/10.48550/arxiv.2306.08316
Falco S, Jiang J, Cola FD, Petrinic N (2017) Generation of 3d polycrystalline microstructures with a conditioned Laguerre–Voronoi tessellation technique. Comput Mater Sci 136:20–28. https://doi.org/10.1016/j.commatsci.2017.04.018
Prasad M, Vajragupta N, Hartmaier A (2019) Kanapy: a python package for generating complex synthetic polycrystalline microstructures. J Open Source Softw 4:1732. https://doi.org/10.21105/joss.01732
Mandal S, Lao J, Donegan S, Rollett A (2018) Generation of statistically representative synthetic three-dimensional microstructures. Scripta Mater 146:128–132. https://doi.org/10.1016/j.scriptamat.2017.11.034
Niezgoda S, Fullwood D, Kalidindi S (2008) Delineation of the space of 2-point correlations in a composite material system. Acta Mater 56:5285–5292. https://doi.org/10.1016/j.actamat.2008.07.005
de Oca Zapiain DM, Stewart J, Dingreville R (2021) Accelerating phase field based microstructure evolution predictions via surrogate models trained by machine learning methods. NPJ Comput Mater 3:1–11. https://doi.org/10.1038/s41524-020-00471-8
Attari V, Honarmandi P, Duong T, Sauceda DJ, Allaire D, Arroyave R (2020) Uncertainty propagation in a multiscale calphad-reinforced elastochemical phase-field model. Acta Mater 183:452–470. https://doi.org/10.1016/j.actamat.2019.11.031
Hsu T, Epting WK, Kim H, Abernathy HW, Hackett GA, Rollett AD, Salvador PA, Holm EA (2021) Microstructure generation via generative adversarial network for heterogeneous, topologically complex 3d materials. JOM 73:90–102. https://doi.org/10.1007/s11837-020-04484-y
NIMS, Nims materials database. https://mits.nims.go.jp/en/
Lee K, Yun G. Microstructure reconstruction using diffusion-based generative models
Lin H, Brown LP, Long AC (2011) Modelling and simulating textile structures using texgen, In: Advances in textile engineering, vol. 331 of advanced materials research, pp 44–47. https://doi.org/10.4028/www.scientific.net/AMR.331.44
Krishnamoorthi S, Bandyopadhyay R, Sangid MD (2023) A microstructure-based fatigue model for additively manufactured ti-6al-4v, including the role of prior \(\beta \) boundaries. Int J Plast 163:103569. https://doi.org/10.1016/j.ijplas.2023.103569
Du P, Zebrowski A, Zola J, Ganapathysubramanian B, Wodo O. Microstructure design using graphs. npj Comput Mater 4. https://doi.org/10.1038/s41524-018-0108-5
Dureth C, Seibert P, Rucker D, Handford S, Kastner M, Gude M. Conditional diffusion-based microstructure reconstruction
Jung J, Yoon JI, Park HK, Jo H, Kim HS (2020) Microstructure design using machine learning generated low dimensional and continuous design space. Materialia 11:100690. https://doi.org/10.1016/j.mtla.2020.100690
Tang J, Geng X, Li D, Shi Y, Tong J, Xiao H, Peng F (2021) Machine learned-based microstructure prediction during laser sintering of alumina. Sci Rep 11:10724. https://doi.org/10.1038/s41598-021-89816-x
Iyer A, Dey B, Dasgupta A, Chen W. A conditional generative model for predicting material microstructures from processing methods
Kanit T, Forest S, Galliet I, Mounoury V, Jeulin D (2003) Determination of the size of the representative volume element for random composites: statistical and numerical approach. Int J Solids Struct 40(13):3647–3679. https://doi.org/10.1016/S0020-7683(03)00143-4
Kim Y, Jung J, Park H, Kim H (2023) Importance of microstructural features in bimodal structure-property linkage. Met Mater Int 29:53–58. https://doi.org/10.1007/s12540-022-01200-0
Paulson N, Priddy M, McDowell D, Kalidindi S (2019) Reduced-order microstructure-sensitive protocols to rank-order the transition fatigue resistance of polycrystalline microstructures. Int J Fatigue 119:1. https://doi.org/10.1016/j.ijfatigue.2018.09.011
Latypov M, Toth L, Kalidindi S (2019) Materials knowledge system for nonlinear composites. Comput Methods Appl Mech Eng 346:180. https://doi.org/10.1016/j.cma.2018.11.034
Paulson N, Priddy M, McDowell D, Kalidindi S (2017) Reduced-order structure-property linkages for polycrystalline microstructures based on 2-point statistics. Acta Mater 129:428. https://doi.org/10.1016/j.actamat.2017.03.009
Kaundinya PR, Choudhary K, Kalidindi SR. Machine learning approaches for feature engineering of the crystal structure: application to the prediction of the formation energy of cubic compounds, https://doi.org/10.48550/arXiv.2105.11319
Generale A, Kalidindi S (2021) Reduced-order models for microstructure-sensitive effective thermal conductivity of woven ceramic matrix composites with residual porosity. Compos Struct 274:114399. https://doi.org/10.1016/j.compstruct.2021.114399
Fast T, Wodo O, Ganapathysubramanian B, Kalidindi S (2016) Microstructure taxonomy based on spatial correlations: application to microstructure coarsening. Acta Mater 108:176. https://doi.org/10.1016/j.actamat.2016.01.046
Harrington G, Kelly C, Attari V, Arroyave R, Kalidindi S (2022) Application of a chained-ann for learning the process-structure mapping in \(mg_2si_xsn_{1-x}\) spinodal decomposition. Integr Mater Manuf Innov 11:433–449. https://doi.org/10.1007/s40192-022-00274-3
Barry MC, Gissinger JR, Chandross M, Wise KE, Kalidindi SR, Kumar S (2023) Voxelized atomic structure framework for materials design and discovery. Comput Mater Sci 230:112431. https://doi.org/10.1016/j.commatsci.2023.112431
Yabansu YC, Iskakov A, Kapustina A, Rajagopalan S, Kalidindi S. Application of gaussian process regression models for capturing the evolution of microstructure statistics in aging of nickel-based superalloys. Acta Mater 178
Altschuh P, Yabansu YC, Hötzer J, Selzer M, Nestler B, Kalidindi SR (2017) Data science approaches for microstructure quantification and feature identification in porous membranes. J Membr Sci 540:88–97. https://doi.org/10.1016/j.memsci.2017.06.020
Latypov M, Kalidindi S (2017) Data-driven reduced order models for effective yield strength and partitioning of strain in multiphase materials. J Comput Phys 346:242–261. https://doi.org/10.1016/j.jcp.2017.06.013
Wilson A, Adams R (2013) Gaussian process kernels for pattern discovery and extrapolation, In: Proceedings of the 30th international conference on machine learning, vol 28 of proceedings of machine learning research, PMLR, pp 1067–1075
Lazaro-Gredilla M, Quinonero-Candela J, Rasmussen C, Figueiras-Vidal A (2010) Sparse spectrum gaussian process regression. J Mach Learn Res, 1865–1881
Soutis C (2005) Fibre reinforced composites in aircraft construction. Prog Aerosp Sci 41:143–151. https://doi.org/10.1016/j.paerosci.2005.02.004
Brown Jr WF (1955) Solid mixture permittivities. J Chem Phys 23:1514–1517
Kroner E (1977) Bounds for effective elastic moduli of disordered materials. J Mech Phys Solids 25:137–155
Safdari M, Baniassadi M, Garmestani H, Al-Haik M (2012) A modified strong-contrast expansion for estimating the effective thermal conductivity of multiphase heterogeneous materials. J Appl Phys 112:114318
Torquato S (1997) Effective stiffness tensor of composite media: 1. Exact series expansions. J Mech Phys Solids 45:1421–1448
Torquato S (1998) Effective stiffness tensor of composite media: 2. Applications to isotropic dispersions. J Mech Phys Solids 46:1411–1440
Fullwood D, Adams B, Kalidindi S (2008) A strong contrast homogenization formulation for multi-phase anisotropic materials. J Mech Phys Solids 56:2287–2297
Hashemi S, Kalidindi S (2021) A machine learning framework for the temporal evolution of microstructure during static recrystallization of polycrystalline materials simulated by cellular automaton. Comput Mater Sci 188:110132. https://doi.org/10.1016/j.commatsci.2020.110132
Fullwood D, Adams B, Kalidindi S (2007) Generalized pareto front methods applied to second-order material property closures. Comput Mater Sci 38:788–799. https://doi.org/10.1016/j.commatsci.2006.05.016
Mann A, Kalidindi S (2022) Development of a robust cnn model for capturing microstructure-property linkages and building property closures supporting material design. Front Mater 9:851085. https://doi.org/10.3389/fmats.2022.851085
Rossin J, Leser P, Pusch K, Frey C, Vogel S, Saville A, Torbet C, Clarke A, Daly S, Pollock T (2022) Single crystal elastic constants of additively manufactured components determined by resonant ultrasound spectroscopy. Mater Charact 192:112244. https://doi.org/10.1016/j.matchar.2022.112244
Kroner E (1972) Statistical continuum mechanics. Springer, New York
Niezgoda S, Yabansu Y, Kalidindi S (2011) Understanding and visualizing microstructure and microstructure variance as a stochastic process. Acta Mater 59:6387–6400. https://doi.org/10.1016/j.actamat.2011.06.051
Shlens J (2020) A tutorial of principal component analysis. Accessed 28 Nov 2020. https://www.cs.princeton.edu/picasso/mats/PCA-Tutorial-Intuition_jp.pdf
Kennard RW, Stone LA (1969) Computer aided design of experiments. Technometrics 11(1):137–148. https://doi.org/10.1080/00401706.1969.10490666
Mak S, Joseph V (2018) Minimax and minimax projection designs using clustering. J Comput Graph Stat 27:166–178. https://doi.org/10.1080/10618600.2017.1302881
Huang C, Joseph V, Ray D (2021) Constrained minimum energy designs. Stat Comput 31:80. https://doi.org/10.1007/s11222-021-10054-2
Fullwood D, Niezgoda S, Kalidindi S (2008) Microstructure reconstruction from 2-point statistics using phase recovery algorithms. Acta Mater 56:942–948. https://doi.org/10.1016/j.actamat.2007.10.044
Jiao Y, Stillinger F, Torquato S (2007) Modeling heterogeneous materials via two-point correlation functions: basic principles. Phys Rev E 76:031110. https://doi.org/10.1103/PhysRevE.76.031110
Jiao Y, Stillinger F, Torquato S (2009) A superior descriptor of random textures and its predictive capacity. PNAS 106:17634–17639. https://doi.org/10.1073/pnas.0905919106
Niezgoda SR, Turner DM, Fullwood DT, Kalidindi SR (2010) Optimized structure based representative volume element sets reflecting the ensemble-averaged 2-point statistics. Acta Mater 58(13):4432–4445. https://doi.org/10.1016/j.actamat.2010.04.041
Helton J, Davis F (2003) Latin hypercube sampling and propagation of uncertainty in analyses of complex systems. Reliab Eng Syst Saf, 23–69. https://doi.org/10.1016/S0951-8320(03)00058-9
Sawyer S (2023) Wishart distributions and inverse-wishart sampling. Accessed 4 Oct 2023. https://www.math.wustl.edu/~sawyer/hmhandouts/Wishart.pdf
Odell PL, Feiveson AH (1966) A numerical procedure to generate a sample covariance matrix. J Am Stat Assoc 61(313):199–203. https://doi.org/10.1080/01621459.1966.10502018
Cecen A (2017) Calculation, utilization, and inference of spatial statistics in practical spatio-temporal data. Georgia Tech Library, Atlanta
Cecen A, Yucel B, Kalidindi S (2021) A generalized and modular framework for digital generation of composite microstructures. J Compos Sci 5:1–20. https://doi.org/10.3390/jcs5080211
Brough D, Wheeler D, Kalidindi S (2017) Materials knowledge systems in python: a data science framework for accelerated development of hierarchical materials. Integr Mater Manuf Innov 6:36–53. https://doi.org/10.1007/s40192-017-0089-0
Kelly C, Kalidindi S (2021) Recurrent localization networks applied to the Lippmann–Schwinger equation. Comput Mater Sci 192:110356. https://doi.org/10.1016/j.commatsci.2021.110356
You H, Zhang Q, Ross C, Lee C, Yu Y (2022) Learning deep implicit fourier neural operators (ifnos) with applications to heterogeneous material modeling. Comput Methods Appl Mech Eng 398:115296. https://doi.org/10.1016/j.cma.2022.115296
Chun S, Roy S, Nguyen Y, Choi J, Udaykumar H, Baek S (2020) Deep learning for synthetic microstructure generation in a materials-by-design framework for heterogeneous energetic materials. Sci Rep 10:13307. https://doi.org/10.1038/s41598-020-70149-0
Ostoja-Starzewski M, Kale S, Karimi P, Malyarenko A, Raghavan B, Ranganathan S, Zhang J (2016) Chapter two-scaling to RVE in random media, vol 49 of Advances in Applied Mechanics, pp 111–211. https://doi.org/10.1016/bs.aams.2016.07.001
Zerhouni O, Brisard S, Danas K. Quantifying the effects of two-point correlations on the effective elasticity of specific classes of random porous materials with and without connectivity. Int J Eng Sci. https://doi.org/10.1016/j.ijengsci.2021.103520
Li S (1999) On the unit cell for micromechanical analysis of fibre-reinforced composites. Proc R Soc A 455:815–838. https://doi.org/10.1098/rspa.1999.0336
Li S (2001) General unit cells for micromechanical analyses of unidirectional composites. Compos A Appl Sci Manuf 32(6):815–826. https://doi.org/10.1016/S1359-835X(00)00182-2
Landi G, Niezgoda N, Kalidindi S (2010) Multi-scale modeling of elastic properties of three-dimensional voxel-based microstructure datasets using novel DFT-based knowledge systems. Acta Mater 58:2716–2725. https://doi.org/10.1016/j.actamat.2010.01.007
Fast T, Kalidindi SR (2011) Formulation and calibration of higher-order elastic localization relationships using the MKS approach. Acta Mater 59:4595–4605. https://doi.org/10.1016/j.actamat.2011.04.005
Proust G, Kalidindi S (2006) Procedures for construction of anisotropic elastic-plastic property closures for face-centered cubic polycrystals using first-order bounding relations. J Mech Phys Solids 54:1744–1762. https://doi.org/10.1016/j.jmps.2006.01.010
Hill R (1963) Elastic properties of reinforced solids: some theoretical principles. J Mech Phys Solids 11:357–372
Yang M, Zhang J, Wei H, Zhao Y, Gui W, Su H, Jin T, Liu L. Study of \(\gamma \)’ rafting under different stress states: a phase field simulation considering viscoplasticity. J Alloys Compounds. https://doi.org/10.1016/j.jallcom.2018.07.317
Blesgen T, Chenchiah I. Cahn–Hilliard equations incorporating elasticity: analysis and comparison to experiments. Philos Trans R Soc. https://doi.org/10.1098/rsta.2012.0342
Chen W, Fuge M (2017) Beyond the known: detecting novel feasible domains over unbounded design space. J Mech Des 139:111405. https://doi.org/10.1115/1.4037306
Chen W, Fuge M (2019) Synthesizing designs with interpart dependencies using hierarchical generative adversarial networks. J Mech Des 141:111403. https://doi.org/10.1115/1.4044076
Wang S, Generale AP, Kalidindi SR, Joseph VR (2023) Sequential designs for filling output spaces. Technometrics, 1–12 https://doi.org/10.1080/00401706.2023.2231042
Ahrendt P (2023) The multivariate gaussian probability. Accessed 4 Oct 2023. https://d1wqtxts1xzle7.cloudfront.net/49874923/The_Multivariate_Gaussian_Probability_Di20161026-27105-77g7a0-libre.pdf?1477466954= &response-content-disposition=inline%3B+filename%3DThe_multivariate_gaussian_probability_di.pdf &Expires=1696429097 &Signature=EbY-smInGeeMVvC0qsTaERE9jTZTSJF8NC9MZl0fOkqTiBgWVcmYqZ~u-8vaYnjyuJyCgV-40kYMMHThOOAhgEGQ8~2dzZG~TV7Rn69mTy1I1ieWafwrsatRpsj3CB6KIbhRn6Y2MgwENUL0RVxnycgT2uiSJiAAoucqbOw5cxBO9H2OrgzgT2SywfSb2hxmr~GLayEwsCWUA~QRgm4AYcbK-YwWebZcZ6RkMOCMotDks-aCd66kbFpBz8bdM3avpmNpYJRWn9jxUFhDhJOnhz0OFdidp~fN96dS-J7~hSJDeK4dGDBE03b5sUd4Px7YrFf4jCCD6KOn1ldefSJR9w__ &Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA
Chawla M (2011) PCA and ICA processing methods for removal of artifacts and noise in electrocardiograms: a survey and comparison. Appl Soft Comput 11(2):2216–2226. https://doi.org/10.1016/j.asoc.2010.08.001
Hastie T, Tibshirani R, Friedman J (2016) The elements of statistical learning. Springer, New York
Vetterli M, Kovacevic J, Goyal V (2014) Foundations of signal processing. Cambridge University Press, Cambridge
Berryman J (1987) Relationship between specific surface area and spatial correlation functions for anisotropic porous media. J Math Phys 28:244–245
Blair S, Berge P, Berryman J (1996) Using two-point correlation functions to characterize microgeometry and estimate permeabilities of sandstone and porous glass. J Geophys Res 101:20359–20375. https://doi.org/10.1029/96JB00879
Acknowledgements
A.E. Robertson and S.R. Kalidindi thank the National Science Foundation for their support under NSF 2027105. A.P. Generale acknowledges Pratt & Whitney and the Alfred P. Sloan Foundation. C. Kelly acknowledges NSF 2027105, NSF Graduate Research Fellowship DGE-1650044, and ONR N00014-18-1-2879. M. Buzzy acknowledges support from NSF DMREF 2119640. Additionally, A.E. Robertson would like to acknowledge the continued support of the Jack Kent Cooke Foundation. A.P. Generale would like to acknowledge the continued support of the Alfred P. Sloan Foundation.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Appendices
Appendix A: Real Space Mixtures
The mixture method adopted in this paper differs slightly from the approach historically adopted in Spectral Mixture-based Gaussian Process Regression modeling [97, 98]. Specifically, instead of constructing the kernel function via a mixture model approximation to a probability density in frequency space (e.g., a symmetric Gaussian mixture model [97]), we approximate the kernel using a real-space symmetric Gaussian mixture model and subsequently enforce the spectral requirements of the kernel function via two linear projections. The procedure is as follows. The approximate kernel function, \({\hat{k}}(\varvec{\tau })\), is constructed via a mixture of symmetric Gaussians.
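A form consistent with the surrounding definitions can be written as below. This is a sketch under assumptions rather than the authors' exact notation: the component index \(i\), the number of components \(N\), the spatial dimension \(d\), and the equal split of each weight between \(+\varvec{\mu }_i\) and \(-\varvec{\mu }_i\) are all assumed here.

\[
{\hat{k}}(\varvec{\tau }) = \sum _{i=1}^{N} \frac{\alpha _i}{2\,(2\pi )^{d/2}\,|\varvec{\Sigma }_i|^{1/2}} \left[ \exp \left( -\tfrac{1}{2} (\varvec{\tau } - \varvec{\mu }_i)^{\top } \varvec{\Sigma }_i^{-1} (\varvec{\tau } - \varvec{\mu }_i) \right) + \exp \left( -\tfrac{1}{2} (\varvec{\tau } + \varvec{\mu }_i)^{\top } \varvec{\Sigma }_i^{-1} (\varvec{\tau } + \varvec{\mu }_i) \right) \right] , \qquad \sum _{i=1}^{N} \alpha _i = 1
\]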
Here, \(\varvec{\mu }\) and \(\varvec{\Sigma }\) are the mean and covariance of each mixture component, and \(|\cdot |\) is the determinant operator. The mixture weights, \(\alpha _i\), are selected to sum to unity.
An approximate kernel structure of this form produces the following expression when transformed into frequency space [142].
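As a hedged illustration, assume the transform convention \(F(\varvec{\omega }) = \int {\hat{k}}(\varvec{\tau })\, e^{-2\pi i\, \varvec{\omega }^{\top } \varvec{\tau }}\, d\varvec{\tau }\) and a mixture in which each weight \(\alpha _i\) is split equally between Gaussians centered at \(\pm \varvec{\mu }_i\). Under those assumptions, which may differ from the authors' convention by constant factors, the transform of the symmetric mixture is:

\[
{\hat{F}}(\varvec{\omega }) = \sum _{i} \alpha _i \, \exp \left( -2\pi ^2\, \varvec{\omega }^{\top } \varvec{\Sigma }_i\, \varvec{\omega } \right) \cos \left( 2\pi \, \varvec{\omega }^{\top } \varvec{\mu }_i \right)
\]

Each mirrored pair of Gaussians contributes a Gaussian envelope in frequency multiplied by a cosine whose period is set by \(\varvec{\mu }_i\). The cosine factor is the source of the negative spectral values.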
Here, superscripts refer to exponentiation, not indexing. Note that this produces the inverse of the kernel structure proposed by Wilson and Adams [97], with the cosine fluctuations in frequency space instead of real space. This kernel structure meets only one of the minimum requirements outlined in Sects. 2.2 and 2.1: it is real valued. The presence of the cosine fluctuations introduces negative values in the spectrum. These fluctuations are removed by zeroing the negative values [92].
Here, \(\epsilon \) is a near zero, positive value added for computational stability of the subsequent steps. Finally, the generated kernel function is produced by applying the inverse cosine transform, \({\mathcal {C}}^{-1}[\cdot ]\). This operation returns the real space equivalent of the kernel without introducing spurious imaginary components in either the real or Fourier space representation of the kernel. In practice, we discretely sample the approximate covariance kernel, \({\hat{k}}(\varvec{\tau })\), in real space to a discrete covariance kernel, \({\hat{k}}_r\), and, subsequently, apply the two identified projections discretely. This procedure produces the following set of expressions.
Here, \(\varvec{\tau }_r\) is the value of \(\varvec{\tau }\) at the center of pixel r. The autocorrelation is derived from the kernel function via addition of the mean squared [63].
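The discrete procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the grid is assumed periodic with zero lag at index (0, 0), each mixture weight is split equally between \(\pm \varvec{\mu }_i\), the cosine transform is taken as the real part of the discrete Fourier transform, and \(\epsilon \) is added after the negative values are zeroed.

```python
import numpy as np

def symmetric_mixture_kernel(shape, means, covs, weights):
    """Sample a real-space symmetric Gaussian mixture at pixel-centre lags."""
    # Integer lag vectors on a periodic grid, with zero lag at index (0, 0).
    axes = [np.fft.fftfreq(n, d=1.0 / n) for n in shape]
    tau = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    dim = len(shape)
    k_hat = np.zeros(shape)
    for mu, cov, alpha in zip(means, covs, weights):
        cov = np.asarray(cov, dtype=float)
        inv = np.linalg.inv(cov)
        norm = (2.0 * np.pi) ** (-dim / 2.0) / np.sqrt(np.linalg.det(cov))
        for sign in (1.0, -1.0):  # mirrored pair keeps the mixture symmetric
            d = tau - sign * np.asarray(mu, dtype=float)
            quad = np.einsum("...i,ij,...j->...", d, inv, d)
            k_hat += 0.5 * alpha * norm * np.exp(-0.5 * quad)
    return k_hat

def project_kernel(k_hat, eps=1e-10):
    """Apply the two spectral projections and return the real-space kernel."""
    # Projection 1 (assumed form): keep the real, cosine part of the spectrum.
    spectrum = np.fft.fftn(k_hat).real
    # Projection 2: zero the negative values, then add eps for stability.
    spectrum = np.maximum(spectrum, 0.0) + eps
    # The spectrum is real and even, so its inverse transform is real.
    return np.fft.ifftn(spectrum).real

def autocorrelation_from_kernel(kernel, mean):
    """Add the squared mean to the kernel to obtain the autocorrelation."""
    return kernel + mean ** 2

# Example: a central peak plus a secondary peak at a lag of 16 pixels.
k_hat = symmetric_mixture_kernel(
    (64, 64), [(0, 0), (16, 0)], [4.0 * np.eye(2)] * 2, [0.3, 0.7]
)
kernel = project_kernel(k_hat)
autocorr = autocorrelation_from_kernel(kernel, mean=0.5)
```

In this example the unprojected mixture has negative spectral values, so the clipping step is active. The sketch does not rescale the kernel to match a target variance; any such normalization would be an additional step.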
As noted in the main body of the paper, we used this alternative structure instead of the traditional method for two important reasons. Most importantly, the traditional Fourier-space mixture model produces spatially compact real-space kernels. This means that it cannot easily produce kernels with multiple modes. Mathematically, this is clear in the original real-space expression provided by Wilson and Adams [97] (here, for simplicity, we reproduce the 1D single mixture expression).
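For reference, the 1D single-component spectral mixture kernel is commonly written in the following form, where \(\mu \) and \(\sigma ^2\) denote the mean and variance of the frequency-space Gaussian. These symbol names are assumed here and may differ from the authors' notation.

\[
k(\tau ) = \exp \left( -2\pi ^2 \tau ^2 \sigma ^2 \right) \cos \left( 2\pi \tau \mu \right)
\]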
Clearly, the dominant exponential term has zero mean. As a result, this type of kernel cannot easily reproduce the important longer-range peaks (for example, secondary peaks in layered composites that statistically represent the repetition of the layering [63]) that are present in 2-point statistics maps [90, 94, 123]. In contrast, our real-space formulation can construct these secondary peaks directly by placing the means of the individual symmetric mixtures. The second reason is practical and an extension of the first: in real space, we can use our expert knowledge of 2-point statistics [21, 72, 88, 90, 92, 93, 123] to guide the placement of the symmetric mixtures into common regions. The unfamiliar nature of the Fourier representation makes it challenging to embed domain knowledge into the construction of the kernel function (and, by extension, the autocorrelation). Of course, this second reason would be irrelevant if one were using an optimization-based placement strategy for the mixtures instead of an expert-driven one.
Appendix B: PCA Truncation for MaxPro Filtering
We used PCA to perform distance-preserving dimensionality reduction of the initial candidate autocorrelation set. This facilitated the framework’s spacefilling filtering operation by significantly decreasing the computational expense of the Min–Max optimization central to the MaxPro algorithm [113]. Importantly, the extracted latent space must be a good approximation of the original 2-point statistics; that is, it must have sufficient representational capacity to recreate the salient features of the original autocorrelations with high fidelity. Recent work by Generale et al. observed that PCA is relatively inefficient for generative tasks like this one [21]. As a result, we expect the number of necessary principal components to be quite high. We selected the truncation level by tracking the reconstruction error of the original candidate autocorrelation dataset. The error is measured as the relative \(L_2\) error, i.e., the \(L_2\) distance between a proposed autocorrelation and its reconstruction from the principal component basis, normalized by the \(L_2\) magnitude of the original autocorrelation. Figure 14 summarizes the relative error. We selected a truncation level of 750, achieving an average reconstruction error of approximately \(0.75 \%\).
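The truncation check described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name is hypothetical, the candidate autocorrelations are assumed to be flattened into the rows of a matrix, and PCA is assumed to be computed via an SVD of the mean-centered data.

```python
import numpy as np

def mean_relative_error_curve(X, truncations):
    """Mean relative L2 reconstruction error for each PCA truncation level.

    X holds one flattened autocorrelation per row (an assumed layout).
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    norms = np.linalg.norm(X, axis=1)
    errors = []
    for r in truncations:
        basis = Vt[:r]
        # Project onto the first r components and map back to the data space.
        recon = mean + (Xc @ basis.T) @ basis
        errors.append(np.mean(np.linalg.norm(X - recon, axis=1) / norms))
    return np.array(errors)

# Example on synthetic data standing in for the candidate set.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 200)) + 1.0
curve = mean_relative_error_curve(X, [1, 5, 20, 49])
```

Given such a curve, one would pick the smallest truncation level whose mean error falls below a chosen tolerance. The error can only decrease as components are added, so the search is straightforward.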
We also explored the average feature size of stochastic microstructure functions differentiated by each eigenvector. Here, our aim was to ensure that the selected principal component basis was sensitive to every feature lengthscale in the initial candidate autocorrelation set. This is important to check because PCA is known to filter out short lengthscale (i.e., high frequency) features [112, 143,144,145]. For several eigenvectors, we identified a set of autocorrelations that are diverse with respect to the selected eigenvector. We did so by performing spacefilling in just the subspace defined by that eigenvector using the described MaxPro procedure. Then, we used Berryman’s method [63, 146, 147] to estimate the feature size of microstructures corresponding to the identified autocorrelations. Figure 15 depicts the trend. Importantly, the decay largely stabilizes well before the 750th eigenvector. Therefore, we expect this selected cutoff to create a subspace that is sufficiently sensitive to all salient lengthscales in the candidate autocorrelation set. We also note that the trend displays largely monotonic decay; i.e., lower index eigenvectors correspond to larger features.
Appendix C: Examples from MICRO2D
Here, we simply display a collection of randomly selected examples from each class in the MICRO2D dataset: GRF—Fig. 16, NBSA—Fig. 17, AngEllipse—Fig. 18, RandomEllipse—Fig. 19, VoidSmall—Fig. 20, VoidSmallBig—Fig. 21, VoronoiLarge—Fig. 22, VoronoiMedium—Fig. 23, VoronoiMediumSpaced—Fig. 24, and VoronoiSmall—Fig. 25.
About this article
Cite this article
Robertson, A.E., Generale, A.P., Kelly, C. et al. MICRO2D: A Large, Statistically Diverse, Heterogeneous Microstructure Dataset. Integr Mater Manuf Innov 13, 120–154 (2024). https://doi.org/10.1007/s40192-023-00340-4
DOI: https://doi.org/10.1007/s40192-023-00340-4