MICRO2D: A Large, Statistically Diverse, Heterogeneous Microstructure Dataset

  • Thematic Section: Harnessing the Power of Materials Data
  • Journal: Integrating Materials and Manufacturing Innovation

Abstract

The availability of large, diverse datasets has enabled transformative advances in a wide variety of technical fields by unlocking data science and machine learning techniques. In materials informatics for heterogeneous microstructures, capitalization on these techniques has been limited by the extreme complexity of generating or curating sizeable heterogeneous microstructure datasets. Historically, this difficulty can be attributed to two main hurdles: quantification (i.e., measuring microstructure diversity) and curation (i.e., generating diverse microstructures). In this paper, we present a framework for curating large, statistically diverse mesoscale microstructure datasets composed of 2-phase microstructures. The framework generates microstructures that are statistically diverse with respect to their n-point statistics; the primary emphasis is on diversity in their 2-point statistics. The framework’s foundation is a proposed set of algorithms for synthesizing salient 2-point statistics and neighborhood distributions. We generate statistically diverse microstructures by using the outputs of these algorithms as inputs to a statistically conditioned Local-Global Decomposition generation procedure. Finally, we demonstrate the proposed framework by curating MICRO2D, a diverse, large-scale, open-source heterogeneous microstructure dataset comprising 87,379 2-phase microstructures. The contained microstructures are periodic and \(256 \times 256\) pixels in size. The dataset also contains salient homogenized elastic and thermal properties computed across a range of constituent contrast ratios for each microstructure. Using MICRO2D, we analyze the statistical and property diversity achievable via the proposed framework. We conclude by discussing important areas of future research in microstructure dataset curation.
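As a rough illustration of the 2-point statistics the abstract refers to, the sketch below computes the periodic autocorrelation of one phase of a binary microstructure with an FFT. It is not the authors' code: the function name `periodic_autocorrelation`, the random toy microstructure, and the normalization by the pixel count are assumptions made for this example only.

```python
import numpy as np


def periodic_autocorrelation(micro: np.ndarray) -> np.ndarray:
    """Periodic 2-point autocorrelation of phase 1 in a binary microstructure.

    Entry [r] estimates the probability that a pixel and its periodic
    neighbour at shift r both belong to phase 1.
    """
    m = micro.astype(float)
    spectrum = np.fft.fftn(m)
    # Correlation theorem: the circular autocorrelation is the inverse FFT
    # of the squared magnitude of the spectrum.
    corr = np.fft.ifftn(spectrum * np.conj(spectrum)).real
    return corr / m.size


# Toy stand-in for a 256 x 256 two-phase microstructure (assumed, not MICRO2D data).
rng = np.random.default_rng(0)
micro = (rng.random((256, 256)) < 0.3).astype(int)

stats = periodic_autocorrelation(micro)
volume_fraction = micro.mean()
```

For a binary microstructure, the zero-shift entry `stats[0, 0]` equals the phase-1 volume fraction, which gives a quick sanity check on the normalization.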

Code Availability

The MICRO2D dataset and the code used in this paper will be freely provided upon publication at https://arobertson38.github.io/MICRO2D.

Notes

  1. Specifically, PCA is distance preserving only when the entire basis is maintained [112]. However, in practice, truncated PC representations provide useful dimensionality reduction while being approximately distance preserving [87, 88, 92].

  2. We emphasize the similarity of this requirement to that given by Niezgoda et al. [71] above.

  3. The mixture weights must sum to 1. In this work, all weights in a single parameterization were set to the same value.

  4. In this work, we set this value to \(\epsilon =10^{-8}\).

  5. Empirical observations strongly indicate that large parts of the parameter space are not important for many engineering systems (e.g., [94, 123]). For example, in general, peaks closer to zero, i.e., with \(\varvec{\mu }_i\) near zero, are more prevalent and important in real autocorrelations.

  6. This will likely be true even if optimal space filling is accomplished over the parameter space, because of the nonlinear generation transformation step described earlier.

  7. This approximation is a generalization of the standard generative model in PyMKS [125].

  8. The parameterization is numerically implemented as a standard eigenvalue decomposition of the covariance matrix, where the eigenvector matrices are the Euler rotation matrices.

  9. For example, the class 'VoidSmallBig' is nonstationary, breaking the stationarity assumption that accompanies many stochastic quantification frameworks. Similarly, the sharp edges in the Voronoi classes and the small features in the NBSA class will be difficult for localization models [126], in particular those utilizing Fourier filters [127].

  10. The exact parameter values—along with all the code necessary to generate the dataset—can be found in the GitHub repository identified at the end of the paper.

  11. In particular, we noticed that if we did not employ volume fraction stratification, the final autocorrelation dataset was strongly skewed toward higher volume fractions. We hypothesize that this is a fingerprint of the space filling under the \(L_2\)-norm.

  12. The total number of microstructures is less than 100,000 (i.e., \(10 \times 10{,}000\)) because several volume fraction and neighborhood combinations resulted in unstable generation, e.g., see NBSA in Table 1.

  13. In the dataset, each class is stored separately to simplify studying subsets of the dataset.

  14. We selected this specific discretization to balance the degree of achievable diversity against practical considerations. This resolution was sufficiently high to allow us to incorporate two important lengthscales: both salient individual features and long-range patterns. However, it is sufficiently low to remain in line with the discretizations preferred by the microstructure informatics community (e.g., in Process-Structure-Property modeling [37, 72, 73, 79, 81, 82, 83] and synthetic generation [58, 65, 76, 80, 128]). Additionally, we construct our heuristic strategies to ensure that the chosen discretization is sufficient to represent the generated systems. Primarily, we do this by ensuring that the correlation length of the generated statistics is less than half the domain size and by generating periodic microstructures. It is well established in the micromechanics community that periodic RVEs and SVEs provide highly stable estimates of homogenized properties even using relatively small domains [84, 129]. We note that the proposed framework is not restricted to this discretization, and datasets containing smaller, larger, or even 3D microstructures can readily be generated without significantly altering the codebase referenced at the end of this paper. However, more advanced generation strategies will need to be established if one is interested in incorporating more than two feature lengthscales.

  15. Other microstructures, like grain boundary structures, could be generated by the local diffusion model [64, 76].

  16. The TAMU microstructures are rescaled down to \(256 \times 256\) for comparison.

  17. The average relative \(L_2\) reconstruction error of the projection is \(0.0071 \pm 0.0077\) for the spinodal dataset. This is comparable to the reconstruction error of MICRO2D (see Appendix B). Therefore, the dataset is well represented by the basis. Additionally, including the spinodal dataset in training the PC basis did not change the structure of the latent space.

  18. We use an analysis congruent to the analysis reported in Robertson et al. [64]. Only the subset of 3-point statistics in which the first shift is equal to 3 is considered.

  19. Additionally, we computed localized elastic strain fields that are not included in the dataset due to the extreme memory cost. Interested readers should contact the authors.

  20. In practice, such second-order variability arises in many important material classes and is important to study to achieve desirable properties (e.g., rafting in nickel superalloys [137, 138]).
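Notes 1 and 17 can be made concrete with a small numerical sketch. The example below, which assumes synthetic Gaussian data in place of real 2-point statistics, shows that a full PC basis preserves pairwise Euclidean distances exactly while a truncated basis only approximates them, and it computes one plausible per-sample relative \(L_2\) reconstruction error. The error definition, the helper names `scores` and `pair_distance`, and the choice of 5 retained components are all assumptions for illustration; the paper's own definition may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for flattened 2-point statistics: 200 samples, 50 features,
# with a rapidly decaying variance spectrum (assumed, not MICRO2D data).
X = rng.normal(size=(200, 50)) * (0.5 ** np.arange(50))
mean = X.mean(axis=0)
Xc = X - mean

# PCA via SVD of the centered data; rows of Vt are the principal directions.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)


def scores(k: int) -> np.ndarray:
    """PC scores of every sample using the first k principal directions."""
    return Xc @ Vt[:k].T


def pair_distance(Z: np.ndarray, i: int = 0, j: int = 1) -> float:
    """Euclidean distance between samples i and j in the given representation."""
    return float(np.linalg.norm(Z[i] - Z[j]))


d_true = pair_distance(X)            # distance in the original feature space
d_full = pair_distance(scores(50))   # full basis: distance is preserved
d_trunc = pair_distance(scores(5))   # truncated basis: distance can only shrink

# Assumed definition: per-sample relative L2 error of the truncated reconstruction.
X_hat = scores(5) @ Vt[:5] + mean
rel_err = np.linalg.norm(X - X_hat, axis=1) / np.linalg.norm(X, axis=1)
print(f"mean relative L2 error: {rel_err.mean():.4f} +/- {rel_err.std():.4f}")
```

Because projection onto a subspace is non-expansive, `d_trunc` never exceeds `d_true`; how close the two are depends on how quickly the variance spectrum decays.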

References

  1. Jordan MI, Mitchell TM (2015) Machine learning: trends, perspectives, and prospects. Science 349(6245):255–260. https://doi.org/10.1126/science.aaa8415

  2. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A, Kaiser L, Polosukhin I. Attention is all you need, NeurIPS

  3. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y Generative adversarial networks, NeurIPS

  4. Chen N, Zhang Y, Zen H, Weiss R, Norouzi M, Chan W (2020) Wavegrad: estimating gradients for waveform generation. https://doi.org/10.48550/arxiv.2009.00713

  5. Mahdavifar S, Ghorbani AA (2019) Application of deep learning to cybersecurity: a survey. Neurocomputing 347:149–176. https://doi.org/10.1016/j.neucom.2019.02.056

  6. Cai L, Gao J, Zhao D (2020) A review of the application of deep learning in medical image classification and segmentation. Ann Translat Med 8:713. https://doi.org/10.21037/atm.2020.02.44

  7. Jiang W (2021) Applications of deep learning in stock market prediction: recent progress. Expert Syst Appl 184:115537. https://doi.org/10.1016/j.eswa.2021.115537

  8. Devlin J, Chang M-W, Lee K, Toutanova K (2019) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805

  9. Song Y, Sohl-Dickstein J, Kingma DP, Kumar A, Ermon S, Poole B (2021) Score-based generative modeling through stochastic differential equations. In: International conference on learning representations, pp 1–36

  10. Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models. NeurIPS

  11. Ho J, Salimans T, Gritsenko A, Chan W, Norouzi M, Fleet D. Video diffusion models, https://doi.org/10.48550/arxiv.2204.03458

  12. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks, In: Pereira F, Burges C, Bottou L, Weinberger K (eds), Advances in neural information processing systems, vol. 25, Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf

  13. Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs, In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds), Advances in neural information processing systems, vol. 30, Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2017/file/5dd9db5e033da9c6fb5ba83c7a7ebea9-Paper.pdf

  14. Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J, Shah A, Johnson M, Liu X, Kaiser L, Gouws S, Kato Y, Kudo T, Kazawa H, Stevens K, Kurian G, Patil N, Wang W, Young C, Smith J, Riesa J, Rudnick A, Vinyals O, Corrado G, Hughes M, Dean J (2016) Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv:1609.08144

  15. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, Tunyasuvunakool K, Bates R, Zidek A, Potapenko A, Bridgland A, Meyer C, Kohl S, Ballard A, Cowie A, Romera-Paredes B, Nikolov S, Jain R, Adler J, Back T, Peterson S, Reiman D, Clancy E, Zielinski M, Steinegger M, Pacholska M, Berghammer T, Bodenstein S, Silver D, Vinyals O, Senior A, Kavukcuoglu K, Kohli P, Hassabis D (2021) Highly accurate protein structure prediction with AlphaFold. Nature 596:583–589. https://doi.org/10.1038/s41586-021-03819-2

  16. Anand N, Achim T. Protein structure and sequence generation with equivariant denoising diffusion probabilistic models, https://doi.org/10.48550/arxiv.2205.15019

  17. Burley SK, Bhikadiya C, Bi C, Bittrich S, Chen L, Crichlow GV, Christie CH, Dalenberg K, Di Costanzo L, Duarte JM, Dutta S, Feng Z, Ganesan S, Goodsell DS, Ghosh S, Green RK, Guranović V, Guzenko D, Hudson BP, Lawson CL, Liang Y, Lowe R, Namkoong H, Peisach E, Persikova I, Randle C, Rose A, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Tao Y-P, Voigt M, Westbrook JD, Young JY, Zardecki C, Zhuravleva M (2020) RCSB Protein Data Bank: powerful new tools for exploring 3D structures of biological macromolecules for basic and applied research and education in fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences. Nucleic Acids Res 49(D1):D437–D451. https://doi.org/10.1093/nar/gkaa1038

  18. Fersht A (2021) Alphafold: a personal perspective on the impact of machine learning. J Mol Biol 433(20):167088. https://doi.org/10.1016/j.jmb.2021.167088

  19. Zheng S, He J, Liu C, Shi Y, Lu Z, Feng W, Ju F, Wang J, Zhu J, Min Y, Zhang H, Tang S, Hao H, Jin P, Chen C, Noé F, Liu H, Liu T-Y (2023) Towards predicting equilibrium distributions for molecular systems with deep learning. arxiv:2306.05445

  20. Materials genome initiative for global competitiveness

  21. Generale A, Robertson A, Kelly C, Kalidindi S. Inverse stochastic microstructure design, SSRN: preprint https://doi.org/10.2139/ssrn.4590691

  22. Gao Y, Liu Y. Reliability-based topology optimization with stochastic heterogeneous microstructure properties. Mater Des. https://doi.org/10.1016/j.matdes.2021.109713

  23. Marshall A, Kalidindi S (2021) Autonomous development of a machine-learning model for the plastic response of two-phase composites from micromechanical finite element models. JOM 73:2085–2095. https://doi.org/10.1007/s11837-021-04696-w

  24. Kalidindi S, Binci M, Fullwood D, Adams B (2006) Elastic properties closures using second-order homogenization theories: case studies in composites of two isotropic constituents. Acta Mater 54:3117–3126. https://doi.org/10.1016/j.actamat.2006.03.005

  25. Hasan M, Mao Y, Tavazza F, Choudhary A, Agrawal A, Acar P. Data-driven multi-scale modeling and optimization for elastic properties of cubic microstructures. Integr Mater Manuf Innov. https://doi.org/10.1007/s40192-022-00258-3

  26. Acar P, Sundararaghavan V (2019) Stochastic design optimization of microstructural features using linear programming for robust design. AIAA J 57:448–455

  27. Xiong Y, Duong P, Wang D, Park S-I, Ge Q, Raghavan N, Rosen D (2019) Data-driven design space exploration and exploitation for design for additive manufacturing. J Mech Des 141:101101. https://doi.org/10.1115/1.4043587

  28. Morris C, Bekker L, Haberman M, Seepersad C (2018) Design exploration of reliably manufacturable materials and structures with applications to negative stiffness metamaterials and microstereolithography. J Mech Des 140:111415. https://doi.org/10.1115/1.4041251

  29. Pei Z, Rozman KA, Dogan ÖN, Wen Y, Gao N, Holm EA, Hawk JA, Alman DE, Gao MC (2021) Machine-learning microstructure for inverse material design. Adv Sci 8:2101207. https://doi.org/10.1002/advs.202101207

  30. Fung V, Zhang J, Hu G, Ganesh P, Sumpter BG (2021) Inverse design of two-dimensional materials with invertible neural networks. npj Comput Mater 7:200. https://doi.org/10.1038/s41524-021-00670-x

  31. Abram M, Burghardt K, Steeg GV, Galstyan A, Dingreville R. Inferring topological transitions in pattern forming processes with self supervised learning. npj Comput Mater 8. https://doi.org/10.1038/s41524-022-00889-2

  32. Diehl M, Groeber M, Haase C, Molodov D, Roters F, Raabe D (2017) Identifying structure-property relationships through DREAM.3D representative volume elements and DAMASK crystal plasticity simulations: an integrated computational materials engineering approach. JOM 69:848–855. https://doi.org/10.1007/s11837-017-2303-0

  33. Muir C, Swaminathan B, Almansour A, Sevener K, Smith C, Presby M, Kiser J, Pollock T, Daly S. Damage mechanism identification in composites via machine learning and acoustic emission. npj Comput Mater 7. https://doi.org/10.1038/s41524-021-00565-x

  34. Hashemi S, Kalidindi SR (2023) Gaussian process autoregression models for the evolution of polycrystalline microstructures subjected to arbitrary stretching tensors. Int J Plast 162:103532. https://doi.org/10.1016/j.ijplas.2023.103532

  35. Yabansu YC, Steinmetz P, Hötzer J, Kalidindi SR, Nestler B (2017) Extraction of reduced-order process-structure linkages from phase-field simulations. Acta Mater 124:182–194. https://doi.org/10.1016/j.actamat.2016.10.071

  36. Dornheim J, Morand L, Zeitvogel S, Iraki T, Link N, Helm D. Deep reinforcement learning methods for structure-guided processing path optimization. J Intell Manuf 33. https://doi.org/10.1007/s10845-021-01805-z

  37. Vlassis NN, Sun W (2023) Denoising diffusion algorithm for inverse design of microstructures with fine-tuned nonlinear material properties. Comput Methods Appl Mech Eng 413:116126. https://doi.org/10.1016/j.cma.2023.116126

  38. Jain A, Ong S, Hautier G, Chen W, Richards W, Dacek S, Cholia S, Gunter D, Skinner D, Ceder G, Persson K (2013) Commentary: The materials project: a materials genome approach to accelerating materials innovation. APL Mater 1:011002. https://doi.org/10.1063/1.4812323

  39. Groeber M, Jackson M (2014) DREAM.3D: a digital representation environment for the analysis of microstructure in 3D. Integr Mater Manuf Innov 3:56–72. https://doi.org/10.1186/2193-9772-3-5

  40. Groeber M, Ghosh S, Uchic M, Dimiduk D (2008) A framework for automated analysis and simulation of 3d polycrystalline microstructures. part 2: synthetic microstructure generation. Acta Mater 56:1274–1287. https://doi.org/10.1016/j.actamat.2007.11.040

  41. Pilchak AL, Shank J, Tucker JC, Srivatsa S, Fagin PN, Semiatin SL (2016) A dataset for the development, verification, and validation of microstructure-sensitive process models for near-alpha titanium alloys. Integr Mater Manuf Innov, 1–18. https://doi.org/10.1186/s40192-016-0056-1

  42. DeCost BL, Holm EA (2016) A large dataset of synthetic SEM images of powder materials and their ground truth 3d structures. Data Brief 9:727–731. https://doi.org/10.1016/j.dib.2016.10.011

  43. Kalidindi S, Khosravani A, Yucel B, Shanker A, Blekh A (2019) Data infrastructure elements in support of accelerated materials innovation: ELA, PyMKS, and MATIN. Integr Mater Manuf Innov 8:441–454

  44. Hart KA, Rimoli JJ (2020) Microstructpy: a statistical microstructure mesh generator in python. SoftwareX 12:100595. https://doi.org/10.1016/j.softx.2020.100595

  45. Song K, Yan Y (2013) A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects. Appl Surf Sci 285P:858–864. https://doi.org/10.1016/j.apsusc.2013.09.002

  46. DeCost BL, Hecht M, Francis T, Webler BA, Picard YN, Holm E (2017) Uhcsdb: ultra high carbon steel micrograph database. Integr Mater Manuf Innov 6:197–205. https://doi.org/10.1007/s40192-017-0097-0

  47. Barber Z, Leake J, Clyne T. The doitpoms project: micrograph library. https://www.doitpoms.ac.uk/miclib/index.php

  48. Saal J, Kirklin S, Aykol M, Meredig B, Wolverton C (2013) Materials design and discovery with high-throughput density functional theory: the open quantum materials database. JOM 65:1501–1509. https://doi.org/10.1007/s11837-013-0755-4

  49. Choudhary K, Garrity KF, Reid ACE, DeCost B, Biacchi AJ, Walker ARH, Trautt Z, Hattrick-Simpers J, Kusne AG, Centrone A, Davydov A, Jiang J, Pachter R, Cheon G, Reed E, Agrawal A, Qian X, Sharma V, Zhuang H, Kalinin SV, Sumpter BG, Pilania G, Acar P, Mandal S, Haule K, Vanderbilt D, Rabe K, Tavazza F, The joint automated repository for various integrated simulations (JARVIS) for data-driven materials design. npj Comput Mater 6. https://doi.org/10.1038/s41524-020-00440-1

  50. Tanifuji M, Matsuda A, Yoshikawa H (2019) Materials data platform: a fair system for data-driven materials science, In: 2019 8th International congress on advanced applied informatics (IIAI-AAI), pp 1021–1022. https://doi.org/10.1109/IIAI-AAI.2019.00206

  51. Ma R, Luo T (2020) PI1M: a benchmark database for polymer informatics. J Chem Inf Model 60(10):4684–4690. https://doi.org/10.1021/acs.jcim.0c00726

  52. Borysov S, Geilhufe R, Balatsky A. Organic materials database: an open-access online database for data mining. PLoS ONE 12. https://doi.org/10.1371/journal.pone.0171501

  53. Kench S, Squires I, Dahari A. Microlib: A library of 3d microstructures generated from 2d micrographs using slicegan. Sci Data 9. https://doi.org/10.1038/s41597-022-01744-1

  54. Bargmann S, Klusemann B, Markmann J, Schnabel J, Schneider K, Soyarslan C, Wilmers J (2018) Generation of 3d representative volume elements for heterogeneous materials: a review. Prog Mater Sci 96:322–384. https://doi.org/10.1016/j.pmatsci.2018.02.003

  55. Mosser L, Dubrule O, Blunt M (2018) Stochastic reconstruction of oolitic limestone by generative adversarial networks. Transp Porous Med 125:81–103. https://doi.org/10.1007/s11242-018-1039-9

  56. Kench S, Cooper S (2021) Generating three-dimensional structures from a two-dimensional slice with generative adversarial network-based dimensionality expansion. Nature Mach Intell 3:299–305. https://doi.org/10.1038/s42256-021-00322-1

  57. Fokina D, Muravleva E, Ovchinnikov G, Oseledets I (2020) Microstructure synthesis using style-based generative adversarial networks. Phys Rev E 101:043308. https://doi.org/10.1103/PhysRevE.101.043308

  58. Noguchi S, Inoue J (2021) Stochastic characterization and reconstruction of material microstructures for establishment of process-structure-property linkage using the deep generative model. Phys Rev E 104:025302. https://doi.org/10.1103/PhysRevE.104.025302

  59. Fullwood D, Niezgoda S, Adams B, Kalidindi S (2010) Microstructure sensitive design for performance optimization. Prog Mater Sci 55:477–562. https://doi.org/10.1016/j.pmatsci.2009.08.002

  60. Torquato S (2002) Random heterogeneous materials. Springer, New York

  61. Adams B, Kalidindi S, Fullwood D (2013) Microstructure sensitive design for performance optimization. Butterworth-Heinemann, Waltham

  62. Gao Y, Jiao Y, Liu Y (2021) Ultra-efficient reconstruction of 3d microstructure and distribution of properties of random heterogeneous materials containing multiple phases. Acta Mater 204:116526. https://doi.org/10.1016/j.actamat.2020.116526

  63. Robertson A, Kalidindi S (2022) Efficient generation of n-field microstructures from 2-point statistics using multi-output gaussian random fields. Acta Mater 232:117927. https://doi.org/10.1016/j.actamat.2022.117927

  64. Robertson AE, Kelly C, Buzzy M, Kalidindi SR (2023) Local-global decompositions for conditional microstructure generation. Acta Mater 253:118966. https://doi.org/10.1016/j.actamat.2023.118966

  65. Seibert P, Ambati M, Raßloff A, Kastner M (2021) Reconstructing random heterogeneous media through differentiable optimization. Comput Mater Sci 196:110455. https://doi.org/10.1016/j.commatsci.2021.110455

  66. Seibert P, Raßloff A, Ambati M, Kastner M (2022) Descriptor-based reconstruction of three-dimensional microstructures through gradient-based optimization. Acta Mater 227:117667. https://doi.org/10.1016/j.actamat.2022.117667

  67. Seibert P, Husert M, Wollner M, Kalina K, Kastner M. Fast reconstruction of microstructures with ellipsoidal inclusions using analytic descriptors, https://doi.org/10.48550/arxiv.2306.08316

  68. Falco S, Jiang J, Cola FD, Petrinic N (2017) Generation of 3d polycrystalline microstructures with a conditioned Laguerre–Voronoi tessellation technique. Comput Mater Sci 136:20–28. https://doi.org/10.1016/j.commatsci.2017.04.018

  69. Prasad M, Vajragupta N, Hartmaier A (2019) Kanapy: a python package for generating complex synthetic polycrystalline microstructures. J Open Source Softw 4:1732. https://doi.org/10.21105/joss.01732

  70. Mandal S, Lao J, Donegan S, Rollett A (2018) Generation of statistically representative synthetic three-dimensional microstructures. Scripta Mater 146:128–132. https://doi.org/10.1016/j.scriptamat.2017.11.034

  71. Niezgoda S, Fullwood D, Kalidindi S (2008) Delineation of the space of 2-point correlations in a composite material system. Acta Mater 56:5285–5292. https://doi.org/10.1016/j.actamat.2008.07.005

  72. de Oca Zapiain DM, Stewart J, Dingreville R (2021) Accelerating phase field based microstructure evolution predictions via surrogate models trained by machine learning methods. npj Comput Mater 3:1–11. https://doi.org/10.1038/s41524-020-00471-8

  73. Attari V, Honarmandi P, Duong T, Sauceda DJ, Allaire D, Arroyave R (2020) Uncertainty propagation in a multiscale calphad-reinforced elastochemical phase-field model. Acta Mater 183:452–470. https://doi.org/10.1016/j.actamat.2019.11.031

  74. Hsu T, Epting WK, Kim H, Abernathy HW, Hackett GA, Rollett AD, Salvador PA, Holm EA (2021) Microstructure generation via generative adversarial network for heterogeneous, topologically complex 3d materials. JOM 73:90–102. https://doi.org/10.1007/s11837-020-04484-y

  75. NIMS, Nims materials database. https://mits.nims.go.jp/en/

  76. Lee K, Yun G. Microstructure reconstruction using diffusion-based generative models

  77. Lin H, Brown LP, Long AC (2011) Modelling and simulating textile structures using texgen, In: Advances in textile engineering, vol. 331 of advanced materials research, pp 44–47. https://doi.org/10.4028/www.scientific.net/AMR.331.44

  78. Krishnamoorthi S, Bandyopadhyay R, Sangid MD (2023) A microstructure-based fatigue model for additively manufactured ti-6al-4v, including the role of prior \(\beta \) boundaries. Int J Plast 163:103569. https://doi.org/10.1016/j.ijplas.2023.103569

  79. Du P, Zebrowski A, Zola J, Ganapathysubramanian B, Wodo O. Microstructure design using graphs. npj Comput Mater 4. https://doi.org/10.1038/s41524-018-0108-5

  80. Dureth C, Seibert P, Rucker D, Handford S, Kastner M, Gude M. Conditional diffusion-based microstructure reconstruction

  81. Jung J, Yoon JI, Park HK, Jo H, Kim HS (2020) Microstructure design using machine learning generated low dimensional and continuous design space. Materialia 11:100690. https://doi.org/10.1016/j.mtla.2020.100690

  82. Tang J, Geng X, Li D, Shi Y, Tong J, Xiao H, Peng F (2021) Machine learned-based microstructure prediction during laser sintering of alumina. Sci Rep 11:10724. https://doi.org/10.1038/s41598-021-89816-x

  83. Iyer A, Dey B, Dasgupta A, Chen W. A conditional generative model for predicting material microstructures from processing methods

  84. Kanit T, Forest S, Galliet I, Mounoury V, Jeulin D (2003) Determination of the size of the representative volume element for random composites: statistical and numerical approach. Int J Solids Struct 40(13):3647–3679. https://doi.org/10.1016/S0020-7683(03)00143-4

  85. Kim Y, Jung J, Park H, Kim H (2023) Importance of microstructural features in bimodal structure-property linkage. Met Mater Int 29:53–58. https://doi.org/10.1007/s12540-022-01200-0

  86. Paulson N, Priddy M, McDowell D, Kalidindi S (2019) Reduced-order microstructure-sensitive protocols to rank-order the transition fatigue resistance of polycrystalline microstructures. Int J Fatigue 119:1. https://doi.org/10.1016/j.ijfatigue.2018.09.011

  87. Latypov M, Toth L, Kalidindi S (2019) Materials knowledge system for nonlinear composites. Comput Methods Appl Mech Eng 346:180. https://doi.org/10.1016/j.cma.2018.11.034

  88. Paulson N, Priddy M, McDowell D, Kalidindi S (2017) Reduced-order structure-property linkages for polycrystalline microstructures based on 2-point statistics. Acta Mater 129:428. https://doi.org/10.1016/j.actamat.2017.03.009

  89. Kaundinya PR, Choudhary K, Kalidindi SR. Machine learning approaches for feature engineering of the crystal structure: application to the prediction of the formation energy of cubic compounds, https://doi.org/10.48550/arXiv.2105.11319

  90. Generale A, Kalidindi S (2021) Reduced-order models for microstructure-sensitive effective thermal conductivity of woven ceramic matrix composites with residual porosity. Compos Struct 274:114399. https://doi.org/10.1016/j.compstruct.2021.114399

  91. Fast T, Wodo O, Ganapathysubramanian B, Kalidindi S (2016) Microstructure taxonomy based on spatial correlations: application to microstructure coarsening. Acta Mater 108:176. https://doi.org/10.1016/j.actamat.2016.01.046

  92. Harrington G, Kelly C, Attari V, Arroyave R, Kalidindi S (2022) Application of a chained-ANN for learning the process-structure mapping in \(\mathrm{Mg_2Si_xSn_{1-x}}\) spinodal decomposition. Integr Mater Manuf Innov 11:433–449. https://doi.org/10.1007/s40192-022-00274-3

  93. Barry MC, Gissinger JR, Chandross M, Wise KE, Kalidindi SR, Kumar S (2023) Voxelized atomic structure framework for materials design and discovery. Comput Mater Sci 230:112431. https://doi.org/10.1016/j.commatsci.2023.112431

  94. Yabansu YC, Iskakov A, Kapustina A, Rajagopalan S, Kalidindi S. Application of gaussian process regression models for capturing the evolution of microstructure statistics in aging of nickel-based superalloys. Acta Mater 178

  95. Altschuh P, Yabansu YC, Hötzer J, Selzer M, Nestler B, Kalidindi SR (2017) Data science approaches for microstructure quantification and feature identification in porous membranes. J Membr Sci 540:88–97. https://doi.org/10.1016/j.memsci.2017.06.020

  96. Latypov M, Kalidindi S (2017) Data-driven reduced order models for effective yield strength and partitioning of strain in multiphase materials. J Comput Phys 346:242–261. https://doi.org/10.1016/j.jcp.2017.06.013

  97. Wilson A, Adams R (2013) Gaussian process kernels for pattern discovery and extrapolation, In: Proceedings of the 30th international conference on machine learning, vol 28 of proceedings of machine learning research, PMLR, pp 1067–1075

  98. Lazaro-Gredilla M, Quinonero-Candela J, Rasmussen C, Figueiras-Vidal A (2010) Sparse spectrum gaussian process regression. J Mach Learn Res, 1865–1881

  99. Soutis C (2005) Fibre reinforced composites in aircraft construction. Prog Aerosp Sci 41:143–151. https://doi.org/10.1016/j.paerosci.2005.02.004

  100. Brown Jr WF (1955) Solid mixture permittivities. J Chem Phys 23:1514–1517

  101. Kroner E (1977) Bounds for effective elastic moduli of disordered materials. J Mech Phys Solids 25:137–155

  102. Safdari M, Baniassadi M, Garmestani H, Al-Haik M (2012) A modified strong-contrast expansion for estimating the effective thermal conductivity of multiphase heterogeneous materials. J Appl Phys 112:114318

  103. Torquato S (1997) Effective stiffness tensor of composite media: 1. Exact series expansions. J Mech Phys Solids 45:1421–1448

  104. Torquato S (1998) Effective stiffness tensor of composite media: 2. Applications to isotropic dispersions. J Mech Phys Solids 46:1411–1440

  105. Fullwood D, Adams B, Kalidindi S (2008) A strong contrast homogenization formulation for multi-phase anisotropic materials. J Mech Phys Solids 56:2287–2297

  106. Hashemi S, Kalidindi S (2021) A machine learning framework for the temporal evolution of microstructure during static recrystallization of polycrystalline materials simulated by cellular automaton. Comput Mater Sci 188:110132. https://doi.org/10.1016/j.commatsci.2020.110132

  107. Fullwood D, Adams B, Kalidindi S (2007) Generalized pareto front methods applied to second-order material property closures. Comput Mater Sci 38:788–799. https://doi.org/10.1016/j.commatsci.2006.05.016

  108. Mann A, Kalidindi S (2022) Development of a robust cnn model for capturing microstructure-property linkages and building property closures supporting material design. Front Mater 9:851085. https://doi.org/10.3389/fmats.2022.851085

  109. Rossin J, Leser P, Pusch K, Frey C, Vogel S, Saville A, Torbet C, Clarke A, Daly S, Pollock T (2022) Single crystal elastic constants of additively manufactured components determined by resonant ultrasound spectroscopy. Mater Charact 192:112244. https://doi.org/10.1016/j.matchar.2022.112244

  110. Kroner E (1972) Statistical continuum mechanics. Springer, New York

  111. Niezgoda S, Yabansu Y, Kalidindi S (2011) Understanding and visualizing microstructure and microstructure variance as a stochastic process. Acta Mater 59:6387–6400. https://doi.org/10.1016/j.actamat.2011.06.051

  112. Shlens J (2020) A tutorial of principal component analysis. Accessed 28 Nov 2020. https://www.cs.princeton.edu/picasso/mats/PCA-Tutorial-Intuition_jp.pdf

  113. Kennard RW, Stone LA (1969) Computer aided design of experiments. Technometrics 11(1):137–148. https://doi.org/10.1080/00401706.1969.10490666

  114. Mak S, Joseph V (2018) Minimax and minimax projection designs using clustering. J Comput Graph Stat 27:166–178. https://doi.org/10.1080/10618600.2017.1302881

  115. Huang C, Joseph V, Ray D (2021) Constrained minimum energy designs. Stat Comput 31:80. https://doi.org/10.1007/s11222-021-10054-2

  116. Fullwood D, Niezgoda S, Kalidindi S (2008) Microstructure reconstruction from 2-point statistics using phase recovery algorithms. Acta Mater 56:942–948. https://doi.org/10.1016/j.actamat.2007.10.044

  117. Jiao Y, Stillinger F, Torquato S (2007) Modeling heterogeneous materials via two-point correlation functions: basic principles. Phys Rev E 76:031110. https://doi.org/10.1103/PhysRevE.76.031110

  118. Jiao Y, Stillinger F, Torquato S (2009) A superior descriptor of random textures and its predictive capacity. PNAS 106:17634–17639. https://doi.org/10.1073/pnas.0905919106

  119. Niezgoda SR, Turner DM, Fullwood DT, Kalidindi SR (2010) Optimized structure based representative volume element sets reflecting the ensemble-averaged 2-point statistics. Acta Mater 58(13):4432–4445. https://doi.org/10.1016/j.actamat.2010.04.041

  120. Helton J, Davis F (2003) Latin hypercube sampling and propagation of uncertainty in analyses of complex systems. Reliab Eng Syst Saf, 23–69. https://doi.org/10.1016/S0951-8320(03)00058-9

  121. Sawyer S (2023) Wishart distributions and inverse-Wishart sampling. Accessed 4 Oct 2023. https://www.math.wustl.edu/~sawyer/hmhandouts/Wishart.pdf

  122. Odell PL, Feiveson AH (1966) A numerical procedure to generate a sample covariance matrix. J Am Stat Assoc 61(313):199–203. https://doi.org/10.1080/01621459.1966.10502018

  123. Cecen A (2017) Calculation, utilization, and inference of spatial statistics in practical spatio-temporal data. Georgia Tech Library, Atlanta

  124. Cecen A, Yucel B, Kalidindi S (2021) A generalized and modular framework for digital generation of composite microstructures. J Compos Sci 5:1–20. https://doi.org/10.3390/jcs5080211

  125. Brough D, Wheeler D, Kalidindi S (2017) Materials knowledge systems in python: a data science framework for accelerated development of hierarchical materials. Integr Mater Manuf Innov 6:36–53. https://doi.org/10.1007/s40192-017-0089-0

  126. Kelly C, Kalidindi S (2021) Recurrent localization networks applied to the Lippmann–Schwinger equation. Comput Mater Sci 192:110356. https://doi.org/10.1016/j.commatsci.2021.110356

  127. You H, Zhang Q, Ross C, Lee C, Yu Y (2022) Learning deep implicit fourier neural operators (ifnos) with applications to heterogeneous material modeling. Comput Methods Appl Mech Eng 398:115296. https://doi.org/10.1016/j.cma.2022.115296

  128. Chun S, Roy S, Nguyen Y, Choi J, Udaykumar H, Baek S (2020) Deep learning for synthetic microstructure generation in a materials-by-design framework for heterogeneous energetic materials. Sci Rep 10:13307. https://doi.org/10.1038/s41598-020-70149-0

  129. Ostoja-Starzewski M, Kale S, Karimi P, Malyarenko A, Raghavan B, Ranganathan S, Zhang J (2016) Chapter two-scaling to RVE in random media, vol 49 of Advances in Applied Mechanics, pp 111–211. https://doi.org/10.1016/bs.aams.2016.07.001

  130. Zerhouni O, Brisard S, Danas K. Quantifying the effects of two-point correlations on the effective elasticity of specific classes of random porous materials with and without connectivity. Int J Eng Sci. https://doi.org/10.1016/j.ijengsci.2021.103520

  131. Li S (1999) On the unit cell for micromechanical analysis of fibre-reinforced composites. Proc R Soc A 455:815–838. https://doi.org/10.1098/rspa.1999.0336

  132. Li S (2001) General unit cells for micromechanical analyses of unidirectional composites. Compos A Appl Sci Manuf 32(6):815–826. https://doi.org/10.1016/S1359-835X(00)00182-2

  133. Landi G, Niezgoda S, Kalidindi S (2010) Multi-scale modeling of elastic properties of three-dimensional voxel-based microstructure datasets using novel DFT-based knowledge systems. Acta Mater 58:2716–2725. https://doi.org/10.1016/j.actamat.2010.01.007

  134. Fast T, Kalidindi SR (2011) Formulation and calibration of higher-order elastic localization relationships using the MKS approach. Acta Mater 59:4595–4605. https://doi.org/10.1016/j.actamat.2011.04.005

  135. Proust G, Kalidindi S (2006) Procedures for construction of anisotropic elastic-plastic property closures for face-centered cubic polycrystals using first-order bounding relations. J Mech Phys Solids 54:1744–1762. https://doi.org/10.1016/j.jmps.2006.01.010

  136. Hill R (1963) Elastic properties of reinforced solids: some theoretical principles. J Mech Phys Solids 11:357–372

  137. Yang M, Zhang J, Wei H, Zhao Y, Gui W, Su H, Jin T, Liu L. Study of \(\gamma \)’ rafting under different stress states: a phase field simulation considering viscoplasticity. J Alloys Compounds. https://doi.org/10.1016/j.jallcom.2018.07.317

  138. Blesgen T, Chenchiah I. Cahn–Hilliard equations incorporating elasticity: analysis and comparison to experiments. Philos Trans R Soc. https://doi.org/10.1098/rsta.2012.0342

  139. Chen W, Fuge M (2017) Beyond the known: detecting novel feasible domains over unbounded design space. J Mech Des 139:111405. https://doi.org/10.1115/1.4037306

  140. Chen W, Fuge M (2019) Synthesizing designs with interpart dependencies using hierarchical generative adversarial networks. J Mech Des 141:111403. https://doi.org/10.1115/1.4044076

  141. Wang S, Generale AP, Kalidindi SR, Joseph VR (2023) Sequential designs for filling output spaces. Technometrics, 1–12 https://doi.org/10.1080/00401706.2023.2231042

  142. Ahrendt P (2023) The multivariate gaussian probability. Accessed 4 Oct 2023. https://d1wqtxts1xzle7.cloudfront.net/49874923/The_Multivariate_Gaussian_Probability_Di20161026-27105-77g7a0-libre.pdf?1477466954= &response-content-disposition=inline%3B+filename%3DThe_multivariate_gaussian_probability_di.pdf &Expires=1696429097 &Signature=EbY-smInGeeMVvC0qsTaERE9jTZTSJF8NC9MZl0fOkqTiBgWVcmYqZ~u-8vaYnjyuJyCgV-40kYMMHThOOAhgEGQ8~2dzZG~TV7Rn69mTy1I1ieWafwrsatRpsj3CB6KIbhRn6Y2MgwENUL0RVxnycgT2uiSJiAAoucqbOw5cxBO9H2OrgzgT2SywfSb2hxmr~GLayEwsCWUA~QRgm4AYcbK-YwWebZcZ6RkMOCMotDks-aCd66kbFpBz8bdM3avpmNpYJRWn9jxUFhDhJOnhz0OFdidp~fN96dS-J7~hSJDeK4dGDBE03b5sUd4Px7YrFf4jCCD6KOn1ldefSJR9w__ &Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA

  143. Chawla M (2011) PCA and ICA processing methods for removal of artifacts and noise in electrocardiograms: a survey and comparison. Appl Soft Comput 11(2):2216–2226. https://doi.org/10.1016/j.asoc.2010.08.001

  144. Hastie T, Tibshirani R, Friedman J (2016) The elements of statistical learning. Springer, New York

  145. Vetterli M, Kovacevic J, Goyal V (2014) Foundations of signal processing. Cambridge University Press, Cambridge

  146. Berryman J (1987) Relationship between specific surface area and spatial correlation functions for anisotropic porous media. J Math Phys 28:244–245

  147. Blair S, Berge P, Berryman J (1996) Using two-point correlation functions to characterize microgeometry and estimate permeabilities of sandstone and porous glass. J Geophys Res 101:20359–20375. https://doi.org/10.1029/96JB00879

Acknowledgements

A.E. Robertson and S.R. Kalidindi thank the National Science Foundation for their support under NSF 2027105. A.P. Generale acknowledges Pratt & Whitney and the Alfred P. Sloan Foundation. C. Kelly acknowledges NSF 2027105, NSF Graduate Research Fellowship DGE-1650044, and ONR N00014-18-1-2879. M. Buzzy acknowledges support from NSF DMREF 2119640. Additionally, A.E. Robertson would like to acknowledge the continued support of the Jack Kent Cooke Foundation. A.P. Generale would like to acknowledge the continued support of the Alfred P. Sloan Foundation.

Author information

Corresponding author

Correspondence to Surya R. Kalidindi.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Real Space Mixtures

The mixture method adopted in this paper differs slightly from the approach adopted historically in Spectral Mixture-based Gaussian Process Regression modeling [97, 98]. Specifically, instead of constructing the kernel function via a mixture model approximation to a probability density in the frequency space (e.g., a symmetric Gaussian mixture model [97]), we approximate the kernel using a real-space symmetric Gaussian mixture model and, subsequently, enforce the spectral requirements of the kernel function via two projections. The procedure is as follows. The approximate kernel function, \({\hat{k}}(\varvec{\tau })\), is constructed via a mixture of symmetric Gaussians.

$$\begin{aligned} {\hat{k}}(\varvec{\tau })&= \sum _{i=1}^M \frac{\alpha _i}{2} \left[ \phi (\varvec{\tau }; \varvec{\mu }_i, \varvec{\Sigma }_i) + \phi (\varvec{\tau }; -\varvec{\mu }_i, \varvec{\Sigma }_i) \right] \end{aligned}$$
(11)
$$\begin{aligned} \phi (\varvec{\tau }; \varvec{\mu }, \varvec{\Sigma })&= \left( 2^k \pi ^k |\varvec{\Sigma }| \right) ^{-1/2} \exp \left( -0.5 \left( \varvec{\tau } - \varvec{\mu } \right) ^T \varvec{\Sigma }^{-1} \left( \varvec{\tau } - \varvec{\mu } \right) \right) \end{aligned}$$
(12)

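Here, \(\varvec{\mu }_i\) and \(\varvec{\Sigma }_i\) are the mean and covariance of the i-th mixture component, the exponent k in Eq. 12 is the dimensionality of \(\varvec{\tau }\) (not to be confused with the kernel function), and \(|\cdot |\) is the determinant operator. The mixture weights, \(\alpha _i\), are selected to sum to unity.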
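As an illustration of Eqs. 11 and 12, the following is a minimal NumPy sketch that evaluates a symmetric Gaussian mixture on a 2D grid of lag vectors. The function names, grid size, and all parameter values are hypothetical choices made for this example; they are not taken from the MICRO2D implementation.

```python
import numpy as np

def gaussian_pdf(tau, mu, sigma):
    """Multivariate normal density phi(tau; mu, Sigma) of Eq. 12.

    tau: (..., d) array of lag vectors, mu: (d,), sigma: (d, d).
    """
    d = mu.shape[0]
    diff = tau - mu
    # Solve Sigma x = diff for every lag vector instead of inverting Sigma.
    sol = np.linalg.solve(sigma, diff.reshape(-1, d).T).T.reshape(diff.shape)
    quad = np.sum(diff * sol, axis=-1)
    norm = ((2.0 * np.pi) ** d * np.linalg.det(sigma)) ** -0.5
    return norm * np.exp(-0.5 * quad)

def symmetric_mixture_kernel(tau, alphas, mus, sigmas):
    """Eq. 11: sum_i alpha_i / 2 * [phi(tau; mu_i) + phi(tau; -mu_i)]."""
    k_hat = np.zeros(tau.shape[:-1])
    for alpha, mu, sigma in zip(alphas, mus, sigmas):
        k_hat += 0.5 * alpha * (
            gaussian_pdf(tau, mu, sigma) + gaussian_pdf(tau, -mu, sigma)
        )
    return k_hat

# Hypothetical example: two components on a 64 x 64 grid of lags (in pixels).
n = 64
axis = np.arange(n) - n // 2
tau = np.stack(np.meshgrid(axis, axis, indexing="ij"), axis=-1).astype(float)
alphas = np.array([0.7, 0.3])                      # weights sum to one
mus = np.array([[0.0, 0.0], [12.0, 0.0]])          # second mean sets a secondary peak
sigmas = np.array([np.diag([9.0, 9.0]), np.diag([4.0, 16.0])])
k_hat = symmetric_mixture_kernel(tau, alphas, mus, sigmas)
```

Because each component appears at both \(+\varvec{\mu }_i\) and \(-\varvec{\mu }_i\), the sampled kernel is symmetric about zero lag by construction.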

An approximate kernel structure of this form produces the following expression when transformed into frequency space [142].

$$\begin{aligned} {\mathcal {F}}({\hat{k}}(\varvec{\tau }))(\varvec{\xi })&\propto \sum _{i=1}^M \left[ \exp \left( -i \varvec{\mu }_i^T \varvec{\xi } \right) + \exp \left( i \varvec{\mu }_i^T \varvec{\xi } \right) \right] \exp \left( - \varvec{\xi }^T \varvec{\Sigma }_i \varvec{\xi } \right) \end{aligned}$$
(13)
$$\begin{aligned}&\propto \sum _{i=1}^M \cos \left( \varvec{\mu }_i^T \varvec{\xi } \right) \exp \left( - \varvec{\xi }^T \varvec{\Sigma }_i \varvec{\xi } \right) \end{aligned}$$
(14)

Here, superscripts refer to exponentiation, not indexing, and the non-subscripted \(i\) inside the exponentials of Eq. 13 is the imaginary unit. Note that this produces the inverse of the kernel structure proposed by Wilson and Adams [97], with the cosine fluctuations in the frequency space instead of the real space. This kernel structure meets only one of the minimum requirements outlined in Sects. 2.1 and 2.2: it is real valued. The cosine fluctuations introduce negative values in the spectrum. These negative values are removed by clamping the spectrum from below at a small positive value [92].

$$\begin{aligned} {\mathcal {F}}(k(\varvec{\tau }))(\varvec{\xi }) = \max \left( {\mathcal {F}}({\hat{k}}(\varvec{\tau }))(\varvec{\xi }), \epsilon \right) \end{aligned}$$
(15)

Here, \(\epsilon \) is a near-zero positive value used as a floor for computational stability of the subsequent steps. Finally, the kernel function is produced by applying the inverse cosine transform, \({\mathcal {C}}^{-1}[\cdot ]\). This operation returns the real-space equivalent of the kernel without introducing spurious imaginary components in either the real or Fourier space representation of the kernel. In practice, we discretely sample the approximate covariance kernel, \({\hat{k}}(\varvec{\tau })\), in real space to obtain a discrete covariance kernel, \({\hat{k}}_r\), and subsequently apply the two identified projections discretely. This procedure produces the following set of expressions.

$$\begin{aligned} {\hat{k}}(\varvec{\tau })&= \sum _{i=1}^M \frac{\alpha _i}{2} \left[ \phi (\varvec{\tau }; \varvec{\mu }_i, \varvec{\Sigma }_i) + \phi (\varvec{\tau }; -\varvec{\mu }_i, \varvec{\Sigma }_i) \right] \end{aligned}$$
(16)
$$\begin{aligned} {\hat{k}}_r&= {\hat{k}}(\varvec{\tau }_r) \end{aligned}$$
(17)
$$\begin{aligned} k_r&= {\mathcal {C}}^{-1} [ \max \left( {\mathcal {F}}[{\hat{k}}_r]_t, \epsilon \right) ]_r \end{aligned}$$
(18)

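Here, \(\varvec{\tau }_r\) is the value of \(\varvec{\tau }\) at the center of pixel r, and the subscript t indexes the discrete frequencies. The autocorrelation is derived from the kernel function by adding the square of the mean, i.e., the squared volume fraction \(v_{f}^{\beta }\) of phase \(\beta \) [63].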

$$\begin{aligned} f^{\beta \beta }_r = k_r + (v_{f}^{\beta })^2 \end{aligned}$$
(19)
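The discrete procedure in Eqs. 17 to 19 can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the lag-zero sample is assumed to sit at index [0, 0] (the usual FFT layout), the real part of the inverse FFT is used as a stand-in for the inverse cosine transform (the two agree when the clamped spectrum is real and symmetric), and the helper names, the unnormalized Gaussian bumps, and the volume fraction of 0.4 are hypothetical.

```python
import numpy as np

def project_kernel(k_hat_r, eps=1e-10):
    """Eq. 18: clamp the spectrum from below, then return to real space."""
    spectrum = np.fft.fftn(k_hat_r).real      # real for a symmetric kernel
    clamped = np.maximum(spectrum, eps)       # Eq. 15: remove negative values
    return np.fft.ifftn(clamped).real         # stand-in for C^{-1}[.]

def autocorrelation_from_kernel(k_r, volume_fraction):
    """Eq. 19: add the squared mean (volume fraction) to the kernel."""
    return k_r + volume_fraction ** 2

# Periodic integer lags in FFT order: 0, 1, ..., n/2 - 1, -n/2, ..., -1.
n = 64
lags = np.fft.fftfreq(n, d=1.0 / n)
tx, ty = np.meshgrid(lags, lags, indexing="ij")

def bump(center_x, width):
    """Unnormalized isotropic Gaussian centred at (center_x, 0)."""
    return np.exp(-((tx - center_x) ** 2 + ty ** 2) / (2.0 * width ** 2))

# Eq. 17: sample a symmetric mixture (central peak plus peaks at +/- 10 pixels).
k_hat_r = 0.5 * bump(0.0, 3.0) + 0.25 * (bump(10.0, 2.0) + bump(-10.0, 2.0))

raw_spectrum = np.fft.fftn(k_hat_r).real      # contains negative entries
k_r = project_kernel(k_hat_r)
f_r = autocorrelation_from_kernel(k_r, volume_fraction=0.4)
```

In this example the off-center components introduce cosine oscillations that push part of the raw spectrum below zero, so the clamping step is active rather than a no-op.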

As noted in the main body of the paper, we used this alternative structure instead of the traditional method for two important reasons. Most importantly, the traditional Fourier-space mixture model produces spatially compact real-space kernels. This means that it cannot easily produce kernels with multiple modes. Mathematically, this is clear in the original real-space expression provided by Wilson and Adams [97] (here, for simplicity, we reproduce the 1D single mixture expression).

$$\begin{aligned} k(\tau ) = \exp (-2 \pi ^2 \tau ^2 \sigma ^2) \cos (2 \pi \tau \mu ) \end{aligned}$$
(20)

Clearly, the dominant exponential term is centered at zero lag. As a result, this type of kernel cannot easily reproduce the important longer-range peaks (for example, secondary peaks in layered composites that statistically represent the repetition of the layering [63]) that are present in 2-point statistics maps [90, 94, 123]. In contrast, our real-space formulation can directly construct these secondary peaks by placing the means of the individual symmetric mixtures at the desired lags. The second reason is practical and an extension of the first: in real space, we can use our expert knowledge of 2-point statistics [21, 72, 88, 90, 92, 93, 123] to guide the placement of the symmetric mixtures into common regions. The unfamiliar nature of the Fourier representation makes it challenging to embed domain knowledge into the construction of the kernel function (and, by extension, the autocorrelation). Of course, this second reason would be irrelevant if one were using an optimization-based placement strategy for the mixtures instead of an expert-driven one.
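The contrast between the two constructions can be checked numerically in 1D. The sketch below evaluates Eq. 20 alongside a 1D analogue of Eq. 11 and compares the height of a secondary peak at a lag of 20 relative to the value at zero lag. All parameter values are hypothetical and chosen only for illustration; other choices of \(\sigma \) and \(\mu \) change the numbers, but the spectral kernel always remains bounded by its zero-centered Gaussian envelope.

```python
import numpy as np

tau = np.linspace(0.0, 40.0, 4001)

# Eq. 20: single-component spectral mixture kernel (frequency-space parameters).
sigma_f, mu_f = 0.05, 0.05                      # hypothetical; period 1 / mu_f = 20
envelope = np.exp(-2.0 * np.pi ** 2 * tau ** 2 * sigma_f ** 2)
k_spectral = envelope * np.cos(2.0 * np.pi * tau * mu_f)

def phi(t, mu, s):
    """1D normal density with mean mu and standard deviation s."""
    return np.exp(-0.5 * ((t - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

# 1D analogue of Eq. 11: a central component plus a symmetric pair at +/- 20.
k_real = 0.6 * phi(tau, 0.0, 2.0) + 0.4 * 0.5 * (
    phi(tau, 20.0, 2.0) + phi(tau, -20.0, 2.0)
)

idx = np.argmin(np.abs(tau - 20.0))             # sample closest to a lag of 20
ratio_spectral = k_spectral[idx] / k_spectral[0]
ratio_real = k_real[idx] / k_real[0]
print(f"secondary peak / central peak: spectral={ratio_spectral:.2e}, "
      f"real-space={ratio_real:.2f}")
```

For these values the real-space mixture retains a clearly visible secondary peak, whereas the spectral kernel's oscillation at the same lag is suppressed by the decaying envelope.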

Appendix B: PCA Truncation for MaxPro Filtering

We used PCA to perform distance-preserving dimensionality reduction of the initial candidate autocorrelation set. This facilitated the framework’s spacefilling filtering operation by significantly decreasing the computational expense of the Min–Max optimization central to the MaxPro algorithm [113]. Importantly, the extracted latent space must be a good approximation of the original 2-point statistics; that is, it must have sufficient representational capacity to recreate the salient features of the original autocorrelations with high fidelity. Recent work by Generale et al. observed that PCA is relatively inefficient for generative tasks like this one [21]. As a result, we expect the number of necessary principal components to be quite high. We selected the truncation level by tracking the reconstruction error of the original candidate autocorrelation dataset. Here, the reconstruction error is the relative \(L_2\) error, i.e., the \(L_2\) distance between a proposed autocorrelation and its reconstruction from the principal component basis, normalized by the \(L_2\) magnitude of the original autocorrelation. Figure 14 summarizes the relative error. We selected a truncation level of 750 principal components, achieving an average reconstruction error of approximately \(0.75 \%\).
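The truncation criterion can be sketched as follows. This is a minimal NumPy illustration on synthetic low-rank data, assuming each row of the data matrix is a flattened autocorrelation; the function names, the candidate truncation levels, and the tolerance are hypothetical and do not reproduce the values used for MICRO2D.

```python
import numpy as np

def mean_relative_l2_error(X, levels):
    """Average relative L2 reconstruction error for each truncation level.

    X: (n_samples, n_features), one flattened autocorrelation per row.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    norms = np.linalg.norm(X, axis=1)
    errors = []
    for r in levels:
        basis = Vt[:r]
        X_rec = mean + (Xc @ basis.T) @ basis   # project, then reconstruct
        errors.append(np.mean(np.linalg.norm(X - X_rec, axis=1) / norms))
    return np.array(errors)

def pick_truncation(X, levels, tol):
    """Smallest level whose average relative error is within tol (or None)."""
    errs = mean_relative_l2_error(X, levels)
    ok = np.nonzero(errs <= tol)[0]
    return (levels[ok[0]] if ok.size else None), errs

# Synthetic stand-in data: rank-5 variation around a constant offset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 50)) + 3.0
levels = [1, 2, 5, 10]
r_star, errs = pick_truncation(X, levels, tol=1e-6)
```

Because the truncated bases are nested, the error is non-increasing in the number of retained components, so the first level that meets the tolerance is also the smallest admissible one among the candidates.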

Fig. 14 Trend of the average reconstruction error (the relative \(L_2\) error) as a function of the number of retained principal components

Fig. 15 Trend of the expected feature size as a function of the principal component index

We also examined the average feature size of the stochastic microstructure functions differentiated by each eigenvector. Our aim was to ensure that the selected principal component basis was sensitive to every feature lengthscale in the initial candidate autocorrelation set. This check is important because PCA is known to filter out short-lengthscale (i.e., high-frequency) features [112, 143,144,145]. For several eigenvectors, we identified a set of autocorrelations that are diverse with respect to the selected eigenvector by performing spacefilling in just the subspace defined by that eigenvector using the described MaxPro procedure. Then, we used Berryman’s method [63, 146, 147] to estimate the feature size of microstructures corresponding to the identified autocorrelations. Figure 15 depicts the trend. Importantly, the decay largely stabilizes well before the 750th eigenvector. Therefore, we expect the selected cutoff to create a subspace that is sufficiently sensitive to all salient lengthscales in the candidate autocorrelation set. We also note that the trend displays largely monotonic decay, i.e., lower-index eigenvectors correspond to larger features.

Appendix C: Examples from MICRO2D

Here, we display a collection of randomly selected examples from each class in the MICRO2D dataset: GRF (Fig. 16), NBSA (Fig. 17), AngEllipse (Fig. 18), RandomEllipse (Fig. 19), VoidSmall (Fig. 20), VoidSmallBig (Fig. 21), VoronoiLarge (Fig. 22), VoronoiMedium (Fig. 23), VoronoiMediumSpaced (Fig. 24), and VoronoiSmall (Fig. 25).

Fig. 16 Eighty randomly selected microstructures corresponding to the GRF class

Fig. 17 Eighty randomly selected microstructures corresponding to the NBSA class

Fig. 18 Eighty randomly selected microstructures corresponding to the AngEllipse class

Fig. 19 Eighty randomly selected microstructures corresponding to the RandomEllipse class

Fig. 20 Eighty randomly selected microstructures corresponding to the VoidSmall class

Fig. 21 Eighty randomly selected microstructures corresponding to the VoidSmallBig class

Fig. 22 Eighty randomly selected microstructures corresponding to the VoronoiLarge class

Fig. 23 Eighty randomly selected microstructures corresponding to the VoronoiMedium class

Fig. 24 Eighty randomly selected microstructures corresponding to the VoronoiMediumSpaced class

Fig. 25 Eighty randomly selected microstructures corresponding to the VoronoiSmall class

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Robertson, A.E., Generale, A.P., Kelly, C. et al. MICRO2D: A Large, Statistically Diverse, Heterogeneous Microstructure Dataset. Integr Mater Manuf Innov 13, 120–154 (2024). https://doi.org/10.1007/s40192-023-00340-4

  • DOI: https://doi.org/10.1007/s40192-023-00340-4
