# Materials Knowledge Systems in Python—a Data Science Framework for Accelerated Development of Hierarchical Materials

- 1.7k Downloads
- 6 Citations

## Abstract

There is a critical need for customized analytics that take into account the stochastic nature of the internal structure of materials at multiple length scales in order to extract relevant and transferable knowledge. Data-driven process-structure-property (PSP) linkages provide a systemic, modular, and hierarchical framework for community-driven curation of materials knowledge, and its transference to design and manufacturing experts. The Materials Knowledge Systems in Python project (PyMKS) is the first open-source materials data science framework that can be used to create high-value PSP linkages for hierarchical materials that can be leveraged by experts in materials science and engineering, manufacturing, machine learning, and data science communities. This paper describes the main functions available from this repository, along with illustrations of how these can be accessed, utilized, and potentially further refined by the broader community of researchers.

## Keywords

Materials knowledge systems Hierarchical materials Multiscale materials Python Scikit-learn NumPy SciPy Machine learning## Introduction

Current practices for developing tools and infrastructure used in multiscale materials design, development, and deployment are generally highly localized (sometimes even within a single organization) resulting in major inefficiencies (duplication of effort, lack of code review, not engaging the right talent for the right task, etc.). Although it is well known that the pace of discovery and innovation significantly increases with effective collaboration [1, 2, 3, 4], scaling such efforts to large heterogeneous communities such as those engaged in materials innovation has been very difficult.

The advent of information technology has facilitated massive electronic collaborations (generally referred to as e-collaborations) that have led to significant advances in several domains including the discovery of the Higg’s boson [5], the sequencing of the human genome [6], the Polymath project [7], the monitoring of species migration [8, 9], and numerous open-source software projects. E-collaborations allow experts from complementary domains to create highly productive collaborations that transcend geographical, temporal, cultural, and organizational distances. E-collaborations require a supporting cyber-infrastructure that allows team members to generate, analyze, disseminate, access, and consume information at dramatically increased pace and/or quantity [10]. A key element of this emerging cyber-infrastructure is open-source software, as it eliminates collaboration hurdles due to software licenses and can help foster truly massive e-collaborations. In other words, even with collaborations involving proprietary data, open-source cyber-infrastructure provides a common language that can facilitate e-collaborations with large numbers of team members.

Several recent national and international initiatives [11, 12, 13] have been launched with the premise that the adoption and utilization of modern data science and informatics toolsets offers a new opportunity to accelerate dramatically the design and deployment cycle of new advanced materials in commercial products. More specifically, it has been recognized that innovation cyber-ecosystems [14] are needed to allow experts from the materials science and engineering, design and manufacturing, and data science domains to collaborate effectively. The challenge in integrating these traditionally disconnected communities comes from the vast differences in how knowledge is captured, curated, and disseminated in these communities [15]. More specifically, knowledge systems in the materials field are rarely captured in a digital form. In order to create a modern materials innovation ecosystem, it is imperative that we design, develop, and launch novel collaboration platforms that allow automated distilling of materials knowledge from large amounts of heterogeneous data acquired through customized protocols that are necessarily diverse (elaborated next). It is also imperative that this curated materials knowledge is presented to the design and manufacturing experts in highly accessible (open) formats.

While the generation (from experiments and computer simulations) and dissemination of datasets consisting of heterogeneous images are necessary elements in a modern materials innovation ecosystem, there is an equally critical need for customized analytics that take into account the stochastic nature of these data at multiple length scales in order to extract high-value, transferable knowledge. Data-driven process-structure-property (PSP) linkages [26] provide a systemic, modular, and hierarchical framework for community engagement (i.e., several people making complementary or overlapping contributions to the overall curation of materials knowledge). Computationally cheap PSP linkages also communicate effectively the curated materials knowledge to design and manufacturing experts in highly accessible formats.

The Materials Knowledge Systems in Python project (PyMKS) is the first open-source materials data analytics toolkit that can be used to create high-value PSP linkages for hierarchical materials in large-scale efforts driven and directed by an entire community of users. In this regard, it could be a foundational element of the cyber-infrastructure needed to realize a modern materials innovation ecosystem.

## Current Materials Innovation Ecosystem

Open-access materials databases and computational tools are critical components of the cyber-infrastructure needed to curate materials knowledge through effective e-collaborations [27]. Several materials science open-source computational toolsets and databases have emerged in recent years to help realize the vision outlined in the Materials Genome Initiative (MGI) and the Integrated Computational Materials Engineering (ICME) paradigm [12, 13, 16, 17, 18, 19, 20, 21, 22, 23, 24]. Yet, the creation and adoption of a standard materials taxonomy and database schema has not been established due to the unwieldy size of material descriptors and heterogeneous data. Additionally, the coupled physical phenomena that govern material properties are too complex to model all aspects of a material simultaneously using a single computational tool. Consequently, current practices have resulted in the development of computation tools and databases with a narrow focus on specific length/structure scales, material classes, or properties.

The NIST Data Gateway contains over 100 free and paid query-able web-based materials databases. These databases contain atomic structure, thermodynamics, kinetics, fundamental physical constants, and x-ray spectroscopy, among other features [28]. The NIST DSpace provides a curation of links to several materials community databases [29]. The NIST Materials Data Curation Systems (MDCS) is a general online database that aims to facilitate the capturing, sharing, and transforming of materials data [30]. The Open Quantum Materials Database (OQMD) is an open-source data repository for phase diagrams and electronic ground states computed using density functional theory [31]. MatWeb is a database containing materials properties for over 100,000 materials [32]. Atomic FLOW of Materials Discovery (AFLOW) databases millions of materials and properties and hosts computational tools that can be used for atomic simulations [33]. The Materials Project (and the tool pyMatgen) [34, 35] provides open web-based access to computed information on known and predicted materials as well as analysis tools for electronic band structures. The Knowledgebase of Interatomic Models (OpenKIM) hosts open-source tools for potentials for molecular simulation of materials [36]. PRedictive Integrated Structural Materials Science (PRISMS) hosts a suite of ICME tools and datastorage for the metals community focused on microstructure evolution and mechanical properties [37]. Granta Materials and Citrine Informatics represent two of the for-profit efforts in these domains. Granta Materials Intelligence provides enterprise-scale infrastructure for in-house materials data management, which can be integrated with design tools [38]. Citrine Informatics is a cloud-based platform that provides access to multisource material databases as well as machine learning tools [39, 40]. Citrine Informatics also maintains open-access databases as well as open-source software projects [41].

SPPARKS Kinetic Monte Carlo Simulator (SPPARKS) is a parallel Monte Carlo code for on-lattice and off-lattice models [42]. MOOSE is a parallel computational framework for coupled systems of nonlinear equations [43]. Dream3D is a tool used for synthetic microstructure generation, image processing, and mesh creation for finite element [44].

While there exits a sizable number of standard analytics tools [45, 46, 47, 48, 49, 50, 51, 52, 53, 54], none of them are tailored to create PSP linkages from materials structure image data and their associated properties. PyMKS aims to seed and nurture an emergent user group in the materials data analytics field for establishing homogenization and localization (PSP) linkages by leveraging open-source signal processing and machine learning packages in Python. An overview of the PyMKS project accompanied with several examples is presented here. This paper is a call to others interested in participating in this open science activity.

## Theoretical Foundations of Materials Knowledge Systems

Material properties are controlled by their internal structure and the diverse physical phenomena occurring at multiple time and length scales. Generalized composite theories [55, 56] have been developed for hierarchical materials exhibiting well-separated length scales in their internal structure. Generally speaking, these theories either address homogenization (i.e., communication of effective properties associated with the structure at a given length scale to a higher length scale) or localization (i.e., spatiotemporal distribution of the imposed macroscale loading conditions to the lower length scale). Consequently, homogenization and localization are the essential building blocks in communicating the salient information in both directions between hierarchical length/structure scales in multiscale materials modeling. It is also pointed out that localization is significantly more difficult to establish, and implicitly provides a solution to homogenization.

The most sophisticated composite theory available today that explicitly accounts for the full details of the material internal structure (also simply referred as microstructure) comes from the use of perturbation theories and Green’s functions [55, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68]. In this formalism, one usually arrives at a series expansion for both homogenization and localization, where the individual terms in the series involve convolution integrals with kernels based on Green’s functions. This series expansion was refined and generalized by Adams and co-workers [67, 69, 70] through the introduction of the concept of a microstructure function, which conveniently separates each term in the series into a physics-dependent kernel (based on Green’s functions) and a microstructure-dependent function (based on the formalism of *n*-point spatial correlations [61, 62, 63, 64, 65, 66]).

Materials knowledge systems (MKS) [71, 72, 73, 74, 75, 76, 77] complements these sophisticated physics-based materials composite theories with a modern data science approach to create a versatile framework for extracting and curating multiscale PSP linkages. More specifically, MKS employs a discretized version of the composite theories mentioned earlier to gain major computational advantages. As a result, highly adaptable and templatable protocols have been created and used successfully to extract robust and versatile homogenization and localization metamodels with impressive accuracy and broad applicability over large microstructure spaces.

The MKS framework is based on the notion of a microstructure function. The microstructure function provides a framework to represent quantities that describe material structure such as phase identifiers, lattice orientation, chemical composition, defect types, and densities, among others (typically referred to as local states). The microstructure function, *m* _{ j }(*h*;*s*), represents a probability distribution for the given local state, *h*∈*H*, at each position, *s*∈*S*, in a given microstructure, *j* [78, 79, 80, 81]. The introduction of the local state space *H* (i.e., the complete set of all potential local states) provides a consolidated variable space for combining the diverse attributes (often a combination of scalar and tensor quantities) needed to describe the local states in the material structure. The MKS framework requires a discretized description of *m* _{ j }, which is denoted here as *m* _{ j }[*h*;*s*], where the [⋅;⋅] represent the discretized space (in contrast to (⋅;⋅), which defines the continuous space). The “ ;” symbol separates indices in physical space to the right of “ ;” from indices in local state space to the left of “ ;”. In most applications, *S* is simply tessellated into voxels on a regular (uniform) grid so that the position can be denoted by *s*→*i*,*j*,*k* in three dimensions.

As noted earlier, the local state space in most advanced materials is likely to demand sophisticated representations. In prior work [73, 82, 83], it was found that spectral representations on functions on the local state space offered many advantages both in compact representation and in reducing the computational cost. In such cases, *h* indexes the spectral basis functions employed. The selection of these functions depends on the nature of local state descriptors. Examples include (i) the primitive basis (or indicator functions) used to represent simple tessellation schemes [71, 72, 74, 75, 76, 77, 78, 79, 84], (ii) generalized spherical harmonics used to represent functions over the orientation space [73, 82], and (iii) Legendre polynomials used to represent functions over the concentration space [83].

### Homogenization

*s*in the tessellation of the microstructure volume. Yet the relative spatial distributions of the local states provide a valuable representation of the microstructure that can be used effectively to quantify the microstructure and compare it with other microstructures in robust and meaningful ways [77, 78, 79, 81, 84]. The lowest order of spatial correlations with relative spatial information comes in the form of 2-point statistics and can be computed as a correlation of a microstructure function such that

*r*is a discrete spatial vector within the voxelated domain specified by

*s*,

*f*

_{ j }[

*h*,

*h*

^{′};

*r*] is one set of 2-point statistics for the local stats

*h*and

*h*

^{′}, and Ω

_{ j }[

*r*] is a normalization factor that depends on

*r*[84]. The subscript

*j*refers to a sample microstructure used for analysis (i.e., each

*j*could refer to a microstructure image). The physical interpretation of the 2-point statistics is explained in Fig. 2 with a highly simplified two-phase microstructure (the two phases are colored white and gray). If the primitive basis is used to discretize both the spatial domain and the local state space then

*f*

_{ j }[

*h*,

*h*

^{′};

*r*] can be interpreted as the probability of finding local states

*h*and

*h*

^{′}at the tail and head, respectively, of a randomly placed vector

*r*.

*h*and

*h*

^{′}) with principal component analysis (PCA). The PCA dimensionality reduction can be mathematically expressed as follows:

*f*

_{ j }[

*l*] is a contracted representation of

*f*

_{ j }[

*h*,

*h*

^{′};

*r*] as a large vector (i.e.,

*l*maps uniquely to every combination of

*h*,

*h*

^{′}, and

*r*deemed to be of interest in the analyses). The

*μ*

_{ j }[

*k*] are low dimensional microstructure descriptors (the transformed 2-point statistics) or principal component scores (PC scores). The

*ϕ*[

*k*,

*l*] are the calibrated principle components (PCs) and the \(\overline {f[l]}\) are the mean values from the calibration ensemble of

*f*

_{ j }[

*l*] for each

*l*. The

*k*∈

*K*indices refer to the

*μ*

_{ j }[

*k*] in decreasing order of significance and are independent of

*l*,

*l*

^{′}, and

*r*. The main advantage of this approach is that the

*f*

_{ j }[

*l*] can be reconstructed to sufficient fidelity with only a small subset of

*μ*

_{ j }[

*k*] [85].

*μ*

_{ j }[

*k*] to \(p_{j}^{\text {eff}}\).

### Localization

*p*

_{ j }[

*s*] is the spatially resolved (localized) response field (e.g., a response variable such as stress or strain rate in a structure-property linkage, or an evolved microstructure function in a process-structure linkage), and

*α*[

*h*;

*r*] are the Green’s function-based discretized influence kernels. These digital kernels are calibrated using regression methods [71, 72, 73, 74, 82, 83].

Figure 3 provides schematic overviews of the MKS homogenization and localization workflows. More detailed explanations on the MKS homogenization and localization linkages can be found in prior literature [71, 72, 73, 74, 75, 76, 77, 78, 79, 83, 84].

## Materials Knowledge Systems in Python

PyMKS is an object-oriented numerical implementation of the MKS theory developed in the literature [72]. It provides a high-level, computationally efficient framework to implement data pipelines for classification, cataloging, and quantifying materials structures for PSP relationships. PyMKS is written in Python, a natural choice for scientific computing due to its ubiquitous use among the data science community as well as many other favorable attributes [86]. PyMKS is licensed under the permissive MIT license [87] which allows for unrestricted distribution in commercial and non-commercial systems.

### Core Functionality

PyMKS consists of four main components including a set of tools to compute 2-point statistics, tools for both homogenization and localization linkages, and tools for discretizing the microstructure. In addition, PyMKS has modules for generating data sets using conventional numerical simulations and a module for custom visualization of microstructures. PyMKS builds on Scikit-learn’s pipelining methodology to create materials-specific machine learning models. This is a high-level system for combining multiple data and machine learning transformations into a single customizable pipeline with only minimal required code. This approach makes cross-validation and parameter searches simple to implement and avoids the complicated book keeping issues associated with training, testing, and validating data pipelines in machine learning.

The starting point for an MKS homogenization analysis is to use 2-point statisics as outlined in Eq. 1 and provided in PyMKS by the MKSStructureAnalysis object, which calculates the objective low dimensional structure descriptors, *μ* _{ j }[*k*]. The default dimensionality reduction technique is PCA, but any model that uses the transform_fit or a “transformer” object can be substituted. After calculating the descriptors, the MKSHomogenizationModel is used to create linkages between the *μ* _{ j }[*k*] and the effective material response, \(p_{j}^{\text {eff}}\), as indicated in Eq. 3. The default machine learning algorithm is a polynomial regression, but any estimator with the fit and predict methods can be substituted to create the linkages between *μ* _{ j }[*k*] and \(p_{j}^{\text {eff}}\).

The MKSLocalizationModel object provides the MKS localization functionality. It calibrates the first-order influence kernels *α*[*h*;*r*] used to predict local materials responses, *p* _{ j }[*s*], as outlined in Eq. 4. The calibration of the influence kernels is achieved using a variety of linear regression techniques described in numerous previous studies [71, 72, 73, 83]. The MKSLocalizationModel object uses fit and predict methods to follow the standard interface for a Scikit-learn estimator object.

To use either the homogenization or the localization models in PyMKS, the microstructure first needs to be represented by a microstructure function, *m* _{ j }[*h*,*s*]. The bases module in PyMKS contains four transformer objects for generating the *m* _{ j }[*h*,*s*] using a varietly of discretization methods [71, 72, 73, 74, 75, 76, 77, 83]. These four objects can be thought of as materials-specific extension to the feature extraction module in Scikit-learn. A PrimitiveBasis object uses indicator (or hat) functions and is well suited for microstructures that have discrete local states (e.g., distinct thermodynamic phases). The LegendreBasis and FourierBasis objects create spectral representations of microstructure functions defined on nonperiodic and periodic continuous local state spaces, respectively. For example, functions over a range of chemical compositions can be described using LegendreBasis, while functions over orientations in two-dimensional space can be described using FourierBasis. Furthermore, GSHBasis creates compact spectral representations for functions over lattice orientation space (such as those needed to describe polycrystalline microstructures) [88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99].

PyMKS contains modest data generation tools (in the datasets module) that are used in both the PyMKS examples and the PyMKS test suite. The MicrostructureGenerator object creates stochastic microstructures using digital filters. This assists users in creating PyMKS workflows even when data is unavailable. PyMKS has objects for generating sample data from both a spinodal decomposition simulation (using the CahnHilliardSimulation object) and a linear elasticity simulation (using the ElasticFESimulation object). PyMKS comes with custom functions for visualizing microstructures in elegant ways (in the tools module). These are used extensively in the PyMKS example notebooks to minimize incidental code associated with visualization.

### Underlying Technologies

PyMKS is built upon the highly optimized Python packages NumPy [100], SciPy [101], and Scikit-learn [47]. NumPy arrays are the primary data structure used throughout PyMKS and provide the basic vector and matrix manipulation operations. SciPy’s signal processing and numerical linear algebra functions are used to calibrate models and generate synthetic data. PyMKS is highly integrated with Scikit-learn and mimics its simple API in order to leverage from Scikit-learn’s data pipeling methodology for machine learning and data transformations. In addition, PyMKS uses the Pytest framework to automate execution of the test suite [102].

Optional packages that can be used with PyMKS include Simple Finite Elements in Python (SfePy) [103], the python wrapper for the FFTW library (pyFFTW) [104], and the plotting package Matplotlib [105]. SfePy is used to simulate linear elasticity to create sample response field data. PyFFTW is a highly optimized fast Fourier transform library that enhances the efficiency of PyMKS and enables parallel computations in PyMKS. Matplotlib is used to generate custom microstructure visualizations.

### Development Practices

PyMKS leverages from existing tools, standards, and web resources wherever possible. In particular, the developers are an open community that use GitHub for issue tracking and release management (see https://github.com/materialsinnovation/pymks https://github.com/materialsinnovation/pymks). Additionally, a Google group is used as a public forum to discuss the project development, support, and announcements (see pymks-general@googlegroups.com). The Travis CI continuous integration tool is used to automate running the test suite for branches of the code stored on GitHub. Code standards are maintained by following the Python PEP8 standards and by reviewing code using pull requests on GitHub. Detailed administrative guidelines are outlined in the ADMINISTRATA.md document, and potential developers are encouraged to follow them.

## Examples of Homogenization and Localization with PyMKS

A demonstration of the MKS homogenization and localization workflows as shown in Fig. 3 is presented in this section using PyMKS. Additional workflow examples can be found on the PyMKS website http://pymks.org.

### Prediction of Effective Stiffness with Homogenization

#### Generation of Calibration Data

In this example, the MKSHomogenizationModel is used to create a structure-property linkage between a 2-phase composite material and effective stiffness *C* _{ x x }.

Multiple classes of periodic microstructures and their effective elastic stiffness values can be generated by importing the make_elastic_stiffness function from pymks.datasets.

This function has several arguments. n_samples is a list indicating the number of microstructures for each class. grain_size and volume_fraction are also lists that specify the average grain features and mean volume fractions for each of the microstructure classes. Variance in the volume fractions for each class can be controlled using percent_variance which specifies a range of volume fractions centered about the mean values (i.e., volume_fration ±percent_variance). size indicates the dimensions of all the microstructures. elastic_modulus and poissons_ratio are used to indicate the material properties for each of the phases. Lastly, seed is used as the seed for the random number generator.

#### Calibration of Homogenization Model

Before an instance of the MKSHomogenizationModel can be made, an instance of a basis class is needed to specify the discretization method for the microstructure functions (see Fig. 3). For this particular example, there are only two discrete phases numerated by 0 and 1. It has been shown that the primitive basis provides the most compact representation of discrete phases [71, 71, 74, 77, 78, 79, 81, 84]. In PyMKS, the class PrimitiveBasis from pymks.bases can be used with n_states equal to 2 and the domain equal to [0, 1].

*R*-squared values) within a defined subspace for the hyper parameters for our model. A dictionary params_to_tune defines the subspace. For this example, n_components will be varied between 1 to 13, and degree of the polynomial regression will be varied between 1 to 3. StratifiedKFold is used to ensure that microstructures from each of the classes are used for each fold during cross-validation. The array labels is used to label each of the classes.

*R*-squared values and their standard deviation. The output from draw_gridscores_matrix can be found in Fig. 5.

*R*-squared value was found to have a 2nd-order polynomial with 11 principal components. This model is calibrated using the entire training dataset and is used for the rest of the example.

#### Prediction of Effective Stiffness Values

*R*-squared value for our predicted data can be printed.

### Prediction of Local Strain Field with Localization

#### Generation of Calibration Data

In this example, the MKSLocalizationModel is used to predict the local strain field for a three-phase microstructure with elastic moduli values of 80, 100, and 120 MPa; Poisson’s ratio values all equal to 0.3 and a macroscopic imposed strain equal to 0.02. The model is calibrated using delta microstructures

*ε*

_{ x x }are needed to calibrate the localization model. A delta microstructure and its strain field can be visualized using draw_ microstructure_strain from pymks.tools. The output from draw_microstructure_strain can be found in Fig. 9.

#### Calibration of the Localization Model

#### Prediction of the Strain Field for a Random Microstructure

These examples demonstrate the high-level code that creates accurate and computationally efficient homogenization structure-property linkages using MKSHomogenization Model and localization linkages using MKSLocalization Model with PyMKS.

## Conclusion

The MKS framework offers a practical and computationally efficient approach for distilling and disseminating the core knowledge gained from physics-based simulations and experiments using emerging concepts in modern data science. PyMKS is an open-source project with a permissive license that provides simple high-level APIs to access the MKS framework by implementing pipelines from Scikit-learn with customized objects for data from hierarchical materials. PyMKS has been launched with the aim to nucleate and grow an emergent community focused on establishing data-driven homogenization and localization process-structure-property linkages for hierarchical materials.

## Notes

### Acknowledgments

DBB and SRK acknowledge support from NSF-IGERT Award 1258425 and NIST 70NANB14H191.

## References

- 1.Sawhney M, Verona G, Prandelli E (2005) Collaborating to create: the internet as a platform for customer engagement in product innovation. J Interact Mark 19(4):4–17CrossRefGoogle Scholar
- 2.Edwards AM, Bountra C, Kerr DJ, Willson TM (2009) Open access chemical and clinical probes to support drug discovery. Nat Chem Biol 5(7):436–440CrossRefGoogle Scholar
- 3.Bayne-Smith M, Mizrahi T, Garcia M (2008) Interdisciplinary community collaboration: perspectives of community practitioners on successful strategies. Journal of Community Practice 16(3):249–269CrossRefGoogle Scholar
- 4.Boudreau K (2010) Open platform strategies and innovation: granting access vs. devolving control. Manag Sci 56(10):1849–1872CrossRefGoogle Scholar
- 5.Aad G, Abajyan T, Abbott B, Abdallah J, Khalek SA, Abdelalim A, Abdinov O, Aben R, Abi B, Abolins M et al (2012) Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC. Phys Lett B 716(1):1–29CrossRefGoogle Scholar
- 6.Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W et al (2001) Initial sequencing and analysis of the human genome. Nature 409 (6822):860–921CrossRefGoogle Scholar
- 7.Cranshaw J, Kittur A (2011) The polymath project: lessons from a successful online collaboration in mathematics. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, pp 1865–1874Google Scholar
- 8.Dickinson JL, Zuckerberg B, Bonter DN (2010) Citizen science as an ecological research tool: challenges and benefits. Annu Rev Ecol Evol Syst 41:149–72CrossRefGoogle Scholar
- 9.Hochachka WM, Fink D, Hutchinson RA, Sheldon D, Wong W-K, Kelling S (2012) Data-intensive science applied to broad-scale citizen science. Trends Ecol Evol 27(2):130–137CrossRefGoogle Scholar
- 10.Atkins D (2003) Revolutionizing science and engineering through cyberinfrastructure: report of the National Science Foundation Blue-Ribbon Advisory Panel on CyberinfrastructureGoogle Scholar
- 11.Anderson A (2011) Report to the president on ensuring American leadership in advanced manufacturing. Executive Office of the President. https://eric.ed.gov/?id=ED529992
- 12.National Science and Technology Council Executive Office of the President: Materials Genome Initiative for Global Competitiveness. http://www.whitehouse.gov/sites/default/files/microsites/ostp/materials_genome/_initiative-final.pdf Accessed 2011-06-30
- 13.Materials Genome Initiative National Science and Technology Council Committee on Technology Subcommittee on the Materials Genome Initiative: Materials Genome Initiative Strategic Plan. http://www.whitehouse.gov/sites/default/files/microsites/ostp/NSTC/mgi_strategic_plan_-_dec_2014.pdf Accessed 2014-12-30
- 14.McDowell DL, Kalidindi SR (2016) The materials innovation ecosystem: a key enabler for the materials genome initiative. MRS Bulletin 41(04):326–337CrossRefGoogle Scholar
- 15.Kalidindi SR (2015) Data science and cyberinfrastructure: critical enablers for accelerated development of hierarchical materials. Int Mater Rev 60(3):150–168CrossRefGoogle Scholar
- 16.Ward C (2012) Materials genome initiative for global competitiveness. In: 23rd Advanced Aerospace Materials and Processes (AeroMat) Conference and Exposition. ASMGoogle Scholar
- 17.Allison J, Backman D, Christodoulou L (2006) Integrated computational materials engineering: a new paradigm for the global materials profession. JOM 58(11):25–27CrossRefGoogle Scholar
- 18.Allison J (2011) Integrated computational materials engineering: a perspective on progress and future steps. JOM 63(4):15– 18CrossRefGoogle Scholar
- 19.Olson GB (2000) Designing a new material world. Science 288(5468):993–998CrossRefGoogle Scholar
- 20.Allison J (2008) Integrated computational materials engineering: a transformational discipline for improved competitiveness and national security. National Academies Press, New York, NYGoogle Scholar
- 21.Schmitz GJ, Prahl U (2012) Integrative computational materials engineering: concepts and applications of a modular simulation platform. John Wiley & Sons, Hoboken, NJCrossRefGoogle Scholar
- 22.Robinson L (2013) TMS study charts a course to successful ICME implementation. SpringerGoogle Scholar
- 23.Allison JE Integrated computational materials engineering (ICME): a transformational discipline for the global materials profession. Met Mater 223Google Scholar
- 24.Integrated computational materials engineering (ICME): implementing ICME in the aerospace, automotive, and maritime industries. The Minerals, Metals and Materials, Society, PA. http://www.tms.org/icmestudy/
- 25.CORE-Materials (2009) CORE-Materials—a resource repository contains a large number of open educational resources (OERs) in materials science and engineering. https://www.flickr.com/people/core-materials/. [Online; accessed 6-April-2016]
- 26.Kalidindi SR (2015) Hierarchical materials informatics: novel analytics for materials data. Elsevier, New York, NYGoogle Scholar
- 27.Bhat TN, Bartolo LM, Kattner UR, Campbell CE, Elliott JT (2015) Strategy for extensible, evolving terminology for the materials genome initiative efforts. JOM 67(8):1866–1875CrossRefGoogle Scholar
- 28.of Standards, N.I., Technology: NIST Data Gateway. http://srdata.nist.gov/gateway/Accessed2016-04-01
- 29.Laboratory, N.M.M.: NIST Repositories DSpace. https://materialsdata.nist.gov/dspace/xmlui/ Accessed 2016-04-01
- 30.of Standards, N.I., Technology: NIST Data Curation System. https://mgi.nist.gov/materials-data-curation-system Accessed 2016-04-01
- 31.Saal JE, Kirklin S, Aykol M, Meredig B, Wolverton C (2013) Materials design and discovery with high-throughput density functional theory: the Open Quantum Materials Database (OQMD). JOM 65(11):1501–1509CrossRefGoogle Scholar
- 32.MatWeb L MatWeb—materials property data. http://www.matweb.com/ Accessed 2016-04-01
- 33.Curtarolo S, Setyawan W, Hart GL, Jahnatek M, Chepulskii RV, Taylor RH, Wang S, Xue J, Yang K, Levy O et al (2012) Aflow: an automatic framework for high-throughput materials discovery. Comput Mater Sci 58:218–226CrossRefGoogle Scholar
- 34.Ong SP, Richards WD, Jain A, Hautier G, Kocher M, Cholia S, Gunter D, Chevrier VL, Persson KA, Ceder G (2013) Python Materials Genomics (pymatgen): a robust, open-source Python library for materials analysis. Comput Mater Sci 68:314– 319CrossRefGoogle Scholar
- 35.Jain A, Ong SP, Hautier G, Chen W, Richards WD, Dacek S, Cholia S, Gunter D, Skinner D, Ceder G, et al. (2013) Commentary: the materials project: a materials genome approach to accelerating materials innovation. APL Materials 1(1):011002CrossRefGoogle Scholar
- 36.Project K OpenKIM - The Knowledgebase of Interatomic Models. https://openkim.org/ Accessed 2016-04-01
- 37.Project P PRedictive Integrated Structural Materials Science (PRISMS). http://www.prisms-center.org/#/home Accessed 2016-04-01
- 38.Selector CP (2013) Granta material intelligence, Cambridge, UKGoogle Scholar
- 39.Hill J, Mulholland G, Persson K, Seshadri R, Wolverton C, Meredig B (2016) Materials science with large-scale data and informatics: unlocking new opportunities. MRS Bulletin 41(05):399–409CrossRefGoogle Scholar
- 40.Seshadri R, Sparks TD (2016) Perspective: interactive material property databases through aggregation of literature data. APL Materials 4(5):053206CrossRefGoogle Scholar
- 41.Michel K, Meredig B (2016) Beyond bulk single crystals: a data format for all materials structure-property-processing relationships. MRS Bulletin 41(8):617–623CrossRefGoogle Scholar
- 42.Plimpton S, Thompson A, Slepoy A (2012) SPPARKS kinetic Monte Carlo simulatorGoogle Scholar
- 43.Gaston D, Newman C, Hansen G, Lebrun-Grandie D (2009) Moose: a parallel computational framework for coupled systems of nonlinear equations. Nucl Eng Des 239(10):1768–1778CrossRefGoogle Scholar
- 44.Groeber MA, Jackson MA (2014) Dream. 3D: a digital representation environment for the analysis of microstructure in 3D. Integrating Materials and Manufacturing Innovation 3(1): 1–17CrossRefGoogle Scholar
- 45.Institute S (1985) SAS User’s guide: Statistics, vol 2. Sas Inst, CaliforniaGoogle Scholar
- 46.Seabold S, Perktold J (2010) Statsmodels: econometric and statistical modeling with Python. In: Proceedings of the 9th python in science conference, pp 57–61Google Scholar
- 47.Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in Python. The J Mach Learn Res 12:2825–2830Google Scholar
- 48.Albanese D, Visintainer R, Merler S, Riccadonna S, Jurman G, Furlanello C (2012) mlpy: Machine Learning Python. arXiv:1202.6548
- 49.Goodfellow IJ, Warde-Farley D, Lamblin P, Dumoulin V, Mirza M, Pascanu R, Bergstra J, Bastien F, Bengio Y (2013) Pylearn2: a machine learning research library. arXiv:1308.4214
- 50.McKinney W (2012) Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O’Reilly Media, Inc., CaliforniaGoogle Scholar
- 51.Müller AC, Behnke S (2014) Pystruct: learning structured prediction in Python. The J Mach Learn Res 15(1):2055–2060Google Scholar
- 52.Demšar J, Zupan B, Leban G, Curk T (2004) Orange: from experimental machine learning to interactive data mining. Springer, Berlin HeidelbergGoogle Scholar
- 53.Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M et al (2016) Tensorflow: large-scale machine learning on heterogeneous distributed systems. arXiv:1603.04467
- 54.Van Der Walt S, Schönberger JL, Nunez-Iglesias J, Boulogne F, Warner JD, Yager N, Gouillart E, Yu T (2014) scikit-image: image processing in Python. PeerJ 2:453Google Scholar
- 55.Hill R (1963) Elastic properties of reinforced solids: some theoretical principles. J Mech Phys Solids 11 (5):357–372CrossRefGoogle Scholar
- 56.Hashin Z (1983) Analysis of composite materials—a survey. J Appl Mech 50(3):481–505CrossRefGoogle Scholar
- 57.Brown WF Jr (1955) Solid mixture permittivities. The J Chem Phys 23(8):1514–1517CrossRefGoogle Scholar
- 58.Kröner E (1986) Statistical modelling. In: Modelling small deformations of polycrystals. Springer, Netherlands, pp 229–291CrossRefGoogle Scholar
- 59.Kröner E (1977) Bounds for effective elastic moduli of disordered materials. J Mech Phys Solids 25(2):137–155CrossRefGoogle Scholar
- 60.Kröner E (1972) Statistical continuum mechanics. Springer, ViennaGoogle Scholar
- 61.Etingof P, Adams BL (1993) Representations of polycrystalline microstructure by n-point correlation tensors. Texture, Stress, and Microstructure 21(1):17–37CrossRefGoogle Scholar
- 62.Adams BL, Olson T (1998) The mesostructure-properties linkage in polycrystals. Prog Mater Sci 43 (1):1–87CrossRefGoogle Scholar
- 63.Fullwood DT, Adams BL, Kalidindi SR (2008) A strong contrast homogenization formulation for multi-phase anisotropic materials. J Mech Phys Solids 56(6):2287–2297CrossRefGoogle Scholar
- 64.Torquato S (2013) Random heterogeneous materials: microstructure and macroscopic properties, vol 16. Springer, New YorkGoogle Scholar
- 65.Li D, Saheli G, Khaleel M, Garmestani H (2006) Quantitative prediction of effective conductivity in anisotropic heterogeneous media using two-point correlation functions. Comput Mater Sci 38(1):45–50CrossRefGoogle Scholar
- 66.Milhans J, Li D, Khaleel M, Sun X, Garmestani H (2011) Prediction of the effective coefficient of thermal expansion of heterogeneous media using two-point correlation functions. J Power Sources 196(8):3846–3850CrossRefGoogle Scholar
- 67.Adams BL, Kalidindi S, Fullwood DT (2013) Microstructure-sensitive design for performance optimization. Butterworth-Heinemann, United KingdomGoogle Scholar
- 68.Garmestani H, Lin S, Adams B, Ahzi S (2001) Statistical continuum theory for large plastic deformation of polycrystalline materials. J Mech Phys Solids 49(3):589–607CrossRefGoogle Scholar
- 69.Adams BL, Gao XC, Kalidindi SR (2005) Finite approximations to the second-order properties closure in single phase polycrystals. Acta Mater 53(13):3563–3577CrossRefGoogle Scholar
- 70.Binci M, Fullwood D, Kalidindi SR (2008) A new spectral framework for establishing localization relationships for elastic behavior of composites and their calibration to finite-element models. Acta Mater 56 (10):2272–2282CrossRefGoogle Scholar
- 71.Landi G, Niezgoda SR, Kalidindi SR (2010) Multi-scale modeling of elastic response of three-dimensional voxel-based microstructure datasets using novel DFT-based knowledge systems. Acta Mater 58(7):2716–2725CrossRefGoogle Scholar
- 72.Kalidindi SR, Niezgoda SR, Landi G, Vachhani S, Fast T (2010) A novel framework for building materials knowledge systems. Computers, Materials, and Continua 17(2):103–125Google Scholar
- 73.Yabansu YC, Patel DK, Kalidindi SR (2014) Calibrated localization relationships for elastic response of polycrystalline aggregates. Acta Mater 81:151–160CrossRefGoogle Scholar
- 74.Al-Harbi HF, Landi G, Kalidindi S (2012) Multi-scale modeling of the elastic response of a structural component made from a composite material using the materials knowledge system. Modell Simul Mater Sci Eng 20(5):055001CrossRefGoogle Scholar
- 75.Kalidindi SR, Niezgoda SR, Salem AA (2011) Microstructure informatics using higher-order statistics and efficient data-mining protocols. JOM 63(4):34–41CrossRefGoogle Scholar
- 76.Gupta A, Cecen A, Goyal S, Singh AK, Kalidindi SR (2015) Structure-property linkages using a data science approach: application to a non-metallic inclusion/steel composite system. Acta Mater 91:239–254CrossRefGoogle Scholar
- 77.Çeçen A, Fast T, Kumbur E, Kalidindi S (2014) A data-driven approach to establishing microstructure-property relationships in porous transport layers of polymer electrolyte fuel cells. J Power Sources 245:144–153CrossRefGoogle Scholar
- 78.Niezgoda SR, Kanjarla AK, Kalidindi SR (2013) Novel microstructure quantification framework for databasing, visualization, and analysis of microstructure data. Integrating Materials and Manufacturing Innovation 2(1):1–27CrossRefGoogle Scholar
- 79.Niezgoda SR, Yabansu YC, Kalidindi SR (2011) Understanding and visualizing microstructure and microstructure variance as a stochastic process. Acta Mater 59(16):6387–6400CrossRefGoogle Scholar
- 80.Qidwai SM, Turner DM, Niezgoda SR, Lewis AC, Geltmacher AB, Rowenhorst DJ, Kalidindi SR (2012) Estimating the response of polycrystalline materials using sets of weighted statistical volume elements. Acta Mater 60(13):5284– 5299CrossRefGoogle Scholar
- 81.Niezgoda SR, Turner DM, Fullwood DT, Kalidindi SR (2010) Optimized structure based representative volume element sets reflecting the ensemble-averaged 2-point statistics. Acta Mater 58(13):4432–4445CrossRefGoogle Scholar
- 82.Yabansu YC, Kalidindi SR (2015) Representation and calibration of elastic localization kernels for a broad class of cubic polycrystals. Acta Mater 94:26–35CrossRefGoogle Scholar
- 83.Brough DB, Wheeler D, Warren JA, Kalidindi SR (2016) Microstructure-based knowledge systems for capturing process-structure evolution linkages. Curr Opin Solid State Mater SciGoogle Scholar
- 84.Cecen A, Fast T, Kalidindi SR (2016) Versatile algorithms for the computation of 2-point spatial correlations in quantifying material structure. Integrating Materials and Manufacturing Innovation 5(1):1–15CrossRefGoogle Scholar
- 85.Hotelling H (1933) Analysis of a complex of statistical variables into principal components. J Educ Psychol 24(6):417CrossRefGoogle Scholar
- 86.Pérez F, Granger BE, Hunter JD (2011) Python: an ecosystem for scientific computing. Comput Sci Eng 13(2):13–21. doi: 10.1109/MCSE.2010.119 CrossRefGoogle Scholar
- 87.The MIT License (MIT). https://opensource.org/licenses/mit-license.php. Accessed: 2016-05-18
- 88.Kalidindi SR, Duvvuru HK, Knezevic M (2006) Spectral calibration of crystal plasticity models. Acta Mater 54(7):1795– 1804CrossRefGoogle Scholar
- 89.Shaffer JB, Knezevic M, Kalidindi SR (2010) Building texture evolution networks for deformation processing of polycrystalline fcc metals using spectral approaches: applications to process design for targeted performance. Int J Plast 26(8):1183– 1194CrossRefGoogle Scholar
- 90.Knezevic M, Levinson A, Harris R, Mishra RK, Doherty RD, Kalidindi SR (2010) Deformation twinning in AZ31: influence on strain hardening and texture evolution. Acta Mater 58(19):6230–6242CrossRefGoogle Scholar
- 91.Al-Harbi HF, Knezevic M, Kalidindi SR (2010) Spectral approaches for the fast computation of yield surfaces and first-order plastic property closures for polycrystalline materials with cubic-triclinic textures. Computers, Materials, and Continua 15(2):153–172Google Scholar
- 92.Duvvuru HK, Knezevic M, Mishra RK, Kalidindi S (2007) Application of microstructure sensitive design to FCC polycrystals. In: Materials Science Forum, vol 546. Trans Tech Publ, pp 675–680Google Scholar
- 93.Li D, Garmestani H, Schoenfeld S (2003) Evolution of crystal orientation distribution coefficients during plastic deformation. Scr Mater 49(9):867–872CrossRefGoogle Scholar
- 94.Li D, Garmestani H, Adams B (2005) A texture evolution model in cubic-orthotropic polycrystalline system. Int J Plast 21(8):1591–1617CrossRefGoogle Scholar
- 95.Li D, Garmestani H, Ahzi S (2007) Processing path optimization to achieve desired texture in polycrystalline materials. Acta Mater 55(2):647–654CrossRefGoogle Scholar
- 96.Li DS, Bouhattate J, Garmestani H (2005) Processing path model to describe texture evolution during mechanical processing. In: Materials Science Forum, vol 495. Trans Tech Publ, pp 977–982Google Scholar
- 97.Creuziger A, Hu L, Gnäupel-herold T, Rollett AD (2014) Crystallographic texture evolution in 1008 steel sheet during multi-axial tensile strain paths. Integrating Materials and Manufacturing Innovation 3(1):1CrossRefGoogle Scholar
- 98.Sundararaghavan V, Zabaras N (2008) A multi-length scale sensitivity analysis for the control of texture-dependent properties in deformation processing. Int J Plast 24(9):1581–1605CrossRefGoogle Scholar
- 99.Sundararaghavan V, Zabaras N (2007) Linear analysis of texture-property relationships using process-based representations of rodrigues space. Acta Mater 55(5):1573–1587CrossRefGoogle Scholar
- 100.Van Der Walt S, Colbert SC, Varoquaux G (2011) The numpy array: a structure for efficient numerical computation. Comput Sci Eng 13(2):22–30Google Scholar
- 101.Jones E, Oliphant T, Peterson P (2014) Scipy: open source scientific tools for PythonGoogle Scholar
- 102.Pytest (2016) http://pytest.org
- 103.Cimrman R (2014) SfePy—write your own FE application. arXiv:1404.6391
- 104.Frigo M, Johnson SG (1998) FFTW: an adaptive software architecture for the FFT. In: Proceedings of the 1998 IEEE International Conference On Acoustics, Speech and Signal Processing, 1998, vol 3. IEEE, pp 1381–1384Google Scholar
- 105.Hunter JD et al (2007) Matplotlib: a 2D graphics environment. Comput Sci Eng 9(3):90–95CrossRefGoogle Scholar