Numerical modeling of materials is performed at its most accurate level when each atom is accounted for individually. Such atomistic simulations enable the investigation of defects in crystalline solids, some of which are illustrated in Fig. 1. The collection of all defects within a material—known as its microstructure morphology—controls most of its properties. Historically, atomistic simulations of crystal defects have been performed under a variety of simplifying assumptions, such as limited microstructural complexity and low-fidelity atomic interactions. The rationale behind these simplifications often revolves around reducing computational costs, streamlining post-processing analyses, and facilitating physical-modeling efforts. Developments in computational materials science over the course of the last decade have alleviated the need for many of the traditional simplifying assumptions. Such developments are poised to bring nothing short of a paradigm shift to the field of atomistic simulations of materials. These changes will be supported by the ongoing progress in three distinct directions, which are discussed next.

Figure 1

The microstructure of a crystal consists of all of its defects, some of which are illustrated here. Traditionally,[1] crystal defects are classified according to their dimensionality. Point defects (also known as native defects) are zero-dimensional defects, such as interstitials, vacancies, substitutional defects, surface adatoms, and F-centers. Higher-dimensional defects (also known as extended defects) are linear (dislocations, surface steps, and disconnections), planar (grain boundaries, stacking faults, and surfaces), and volumetric (voids and cracks). The collection of all defects within a material—known as its microstructure morphology—controls most of its properties.

The first direction is, unsurprisingly, the steady improvement of computational hardware, which has resulted in widespread access to computing capabilities that were not long ago available only to select research centers. This has made it possible for large-scale atomistic simulations to be performed routinely and broadly in our community. When such simulations contain a large number of atoms (typically more than \(10^5\)), they enable the investigation of unique materials properties not accessible at smaller scales.[2,3,4,5,6] Such simulations are cross-scale, i.e., they allow the simultaneous examination of materials properties across different length scales. They are not to be confused with multi-scale simulations, where the atom-by-atom description of the material is given up and replaced by a coarser (and consequently less accurate) representation in order to decrease computational costs. Cross-scale atomistic simulations allow one to investigate crystal defects with geometrical complexities akin to the microstructures observed in laboratory experiments.

The second direction is the development of approaches, together with their accompanying high-performance software implementations,[7,8] that aid and augment human intuition in interpreting atomistic simulations and extracting physical information from them.[6,9,10,11,12] It is difficult to identify and quantify crystal defects in atomistic simulations because of their structural complexity. These difficulties are further amplified when the microstructural evolution in time (i.e., kinetics) is considered. Thus, specialized structure characterization techniques are required, namely in silico microscopy methods. While these have historically been challenging to construct, requiring significant intuition and effort, recent progress favoring data-centric approaches and machine learning (ML) techniques over heuristic rules of classification has shown promise in simplifying the development of novel techniques.[13,14]

In silico microscopy is particularly relevant to cross-scale atomistic simulations. The complex microstructures considered in such simulations render the visualization and physical modeling of the results challenging to perform. Information about crystal defects and their evolution is buried within an astronomical amount of data describing the motion and interaction of all atoms. The importance of accessing such atomistic information is further amplified by the fact that this level of simultaneous resolution of defect structure and kinetics—atom-by-atom and femtosecond-by-femtosecond—is virtually impossible to achieve experimentally.

Finally, the third direction, and focus of this article, is the convergence of decades of advancements in strategies for the calculation of atomic interactions (i.e., the energy and forces between atoms) into a class of methods known as machine-learning interatomic potentials (MLIAPs). Crystalline defects display dramatically different behaviors depending on subtle differences in bonding of the underlying atoms. Capturing such features accurately is essential for physically realistic descriptions of the microstructure, including their evolution and kinetics. In fact, the physical fidelity of atomistic simulations is often considered tantamount to the accuracy with which atomic interactions are represented.

Progress along the three directions described above is highly synergistic. For example, the prevalence of cross-scale atomistic simulations has heightened the need for in silico microscopy algorithms, while access to better computational resources enabled the employment of high-fidelity MLIAPs routinely. Similarly, MLIAPs have widened the spectrum of systems that can be realistically simulated in order to investigate their microstructural elements, at the same time as in silico microscopy algorithms facilitate the quantitative evaluation of MLIAPs for crystal defects. Together, these developments enable the investigation of the formation and evolution of the material’s mesoscale morphology as dictated solely by the fundamental quantum and statistical mechanical laws governing atomic interactions.

In this prospective article we review the application of MLIAPs to crystal defects. Our focus is on aspects of the development of MLIAPs that are particularly relevant for systems containing extended crystal defects, especially when they differ from conventional approaches employed for molecules and liquids. We start with a brief introduction to MLIAPs in the “Machine-learning interatomic potentials” section, with the intention of highlighting how MLIAPs are fundamentally changing the way investigations of microstructural evolution with atomistic simulations are carried out. The “Dislocations” and “Grain boundaries” sections cover in depth the development of MLIAPs for these two classes of extended defects. Finally, the “Improving MLIAPs accuracy for crystal defects” section offers a forward-looking view of promising methods and strategies, including notable applications of MLIAPs to other defects and complex systems.

Machine-learning interatomic potentials

Table I List of all machine-learning interatomic potentials (MLIAPs) discussed in this article.

High-fidelity atomic interactions can be obtained directly from ab initio techniques, but those are computationally costly and scale poorly with system size. For example, density-functional theory (DFT) scaling is typically \({\mathcal{O}}({N}_{\hbox{e}}^3)\), where \({N}_{\hbox{e}}\) is the number of electrons. This has limited the application of ab initio atomistic simulations to systems with only a few hundred atoms. Investigation of extended crystal defects with such methods is notoriously challenging and, when possible at all, is often limited to static simulations. The development of linear-scaling DFT methods,[41,42,43] i.e., \({\mathcal{O}}({N}_{\hbox{e}})\), has been an active area of research that often extends DFT calculations to \({N}_{\rm a} \approx 1000\) atoms. While linear-scaling DFT has enabled the investigation of electronic properties of larger systems, this approach has not reduced the computational cost enough to account for extended crystal defects.

Interatomic potentials (IAPs) are functional representations of atomic interactions that are considerably cheaper to evaluate and scale linearly with the number of atoms, \({\mathcal{O}}({N}_{\hbox{a}})\). For this reason they are often employed to circumvent the size and time limitations of ab initio methods. The functional form of traditional IAPs is often rooted in, or at least motivated by, the underlying quantum nature of chemical bonds. Ultimately, however, such functional forms are simplifications of the quantum mechanics of electrons, invariably leading to lower physical fidelity than ab initio methods. The degree of physical fidelity of an IAP depends not only on its functional form, but also on the choice of parameters defining the IAP. Parameters are chosen so that calculations with the IAP reproduce certain target material properties, such as elastic constants, which can be obtained experimentally or through ab initio calculations.
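The practical consequence of the cubic scaling of conventional DFT and the linear scaling of IAPs can be illustrated with a few lines of arithmetic. The sketch below assumes idealized power-law costs with unit prefactors and a number of electrons proportional to the number of atoms; both are simplifying assumptions made here for illustration, and the function name is hypothetical.

```python
def cost_growth(size_factor: float, exponent: int) -> float:
    """Factor by which an idealized O(N^exponent) cost grows when N is multiplied by size_factor."""
    return size_factor ** exponent

# Growing a system from 10^2 atoms to 10^5 atoms (a factor of 1000 in size).
size_factor = 1e5 / 1e2
cubic_growth = cost_growth(size_factor, 3)   # idealized conventional DFT, O(N_e^3)
linear_growth = cost_growth(size_factor, 1)  # idealized IAP, O(N_a)

print(f"cubic-scaling cost grows by  {cubic_growth:.0e}x")
print(f"linear-scaling cost grows by {linear_growth:.0e}x")
```

Under these assumptions the cubic-scaling cost grows a million times faster than the linear-scaling cost over the same increase in system size, which is why the cross-scale regime is reachable only with linear-scaling interaction models.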

The introduction of the force-matching technique in Ref. 44 represented an important development that paved the way to the advent of MLIAPs. With this approach the parameters of an IAP are fitted to reproduce the forces obtained from ab initio calculations as closely as possible. The atomic configurations employed during the optimization process define, albeit indirectly, which materials properties one can expect the potential to reproduce accurately. The force-matching method was successful in deriving IAPs with improved physical fidelity with respect to ab initio calculations. Careful design of the data set of atomic configurations led to smaller errors. Yet, the ultimate accuracy was still limited by the IAP functional form.
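The idea behind force matching can be sketched on a toy problem. The example below is not the procedure of Ref. 44: it assumes a simple pair potential of the form \(A/r^{12} - B/r^{6}\) (chosen here because the forces are linear in \(A\) and \(B\), so the fit reduces to linear least squares), and it uses synthetic reference forces generated from hypothetical coefficients `A_ref` and `B_ref` as a stand-in for ab initio data.

```python
import numpy as np

def force_basis(positions):
    """Per-atom forces generated by the r^-12 and r^-6 terms with unit coefficients."""
    d = positions[:, None, :] - positions[None, :, :]      # d[i, j] = r_i - r_j
    r2 = np.sum(d**2, axis=-1)
    np.fill_diagonal(r2, np.inf)                            # exclude self-interaction
    rep = np.sum(12.0 * r2[..., None] ** -7.0 * d, axis=1)  # force from +1 / r^12
    att = np.sum(-6.0 * r2[..., None] ** -4.0 * d, axis=1)  # force from -1 / r^6
    return rep, att

# Training configurations: a small cubic cluster with random displacements (illustrative only).
rng = np.random.default_rng(0)
grid = np.array([[i, j, k] for i in range(3) for j in range(3) for k in range(3)], dtype=float)
configs = [1.1 * grid + rng.normal(scale=0.05, size=grid.shape) for _ in range(5)]

# Hypothetical "true" coefficients standing in for the ab initio reference.
A_ref, B_ref = 1.0, 2.0

rows, targets = [], []
for pos in configs:
    rep, att = force_basis(pos)
    rows.append(np.column_stack([rep.ravel(), att.ravel()]))
    targets.append((A_ref * rep + B_ref * att).ravel())     # synthetic reference forces

X = np.vstack(rows)          # one row per force component, one column per parameter
y = np.concatenate(targets)

# Force matching: choose the parameters that minimize the squared force residual.
coeffs = np.linalg.lstsq(X, y, rcond=None)[0]
print("fitted A, B:", coeffs)
```

Because the reference forces here are noise-free and generated by the same functional form, the fit recovers the reference coefficients. With real ab initio forces the residual would not vanish, which reflects the limitation imposed by the functional form discussed above.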

Several attempts were made to develop more flexible IAPs while still adhering to some underlying physical motivation for the functional form. For example, the embedded-atom method[45] (EAM) IAP for metals was derived through rigorous approximations based on DFT. Its successor, the modified embedded-atom method[46] (MEAM), was obtained through a much less rigorous approach where angular forces were added to EAM in order to account for the covalent character of bonds in certain materials, such as transition metals with partially filled d-bands. In Ref. 47 physical motivations for the functional form of certain components of the MEAM potentials were abandoned altogether in favor of cubic splines, each with different parameters that were fitted by the force-matching method. However, the fitting was still treated purely as an optimization problem rather than as a statistical learning problem. Consequently, the bias-variance tradeoff was not considered, and the concept of generalization error was ill-defined given the lack of separation between test and training data sets.

MLIAPs made their first appearance in Ref. 24, when Behler and Parrinello trained a neural network potential (NNP) on DFT results to predict energies and forces for silicon. With MLIAPs one abandons any physical motivation for the functional form of the atomic interactions in favor of the unmatched functional flexibility provided by ML models—also referred to as the model capacity. A generalized form of the force-matching method is then employed to train the ML model parameters to reproduce interatomic forces and energies. Standard techniques from the field of ML are also employed to control the bias-variance tradeoff and estimate the generalization error of the model. For a review on technical progress in MLIAPs we refer the reader to Refs. 48,49,50,51.
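The role of a held-out test set in estimating the generalization error can be illustrated with a generic regression problem. The sketch below is not an MLIAP: it assumes synthetic one-dimensional data and uses polynomial degree as a stand-in for model capacity, with all names chosen here for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=60)
y = np.sin(3.0 * x) + rng.normal(scale=0.1, size=x.size)  # synthetic stand-in for reference data

# Hold out one third of the data; it is never used for fitting.
x_train, y_train = x[:40], y[:40]
x_test, y_test = x[40:], y[40:]

def rmse(pred, ref):
    """Root-mean-square error between predictions and reference values."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

results = {}
for degree in (1, 3, 5, 9):                      # increasing model capacity
    coeffs = P.polyfit(x_train, y_train, degree)
    train_err = rmse(P.polyval(x_train, coeffs), y_train)
    test_err = rmse(P.polyval(x_test, coeffs), y_test)   # estimate of the generalization error
    results[degree] = (train_err, test_err)
    print(f"degree {degree}: train RMSE {train_err:.3f}, test RMSE {test_err:.3f}")
```

The training error can only decrease as capacity grows, whereas the test error is the quantity that indicates whether the additional flexibility is capturing the underlying function or the noise. Model selection based on the held-out error is one standard way of controlling the bias-variance tradeoff.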

MLIAPs reproduce atomic interactions from ab initio calculations with high physical fidelity while maintaining the same linear scaling with the number of atoms as IAPs, which is an important property for their employment in cross-scale atomistic simulations. Moreover, the accuracy of an MLIAP is systematically improvable to a large extent. However, some of the most important contributions of MLIAPs to the field of computational materials science are not related to their quantitative improvement over IAPs. MLIAPs have made the process of creating IAPs more transparent,[15] standardized, and therefore more reproducible. The systematization of the process of developing MLIAPs also made this approach more accessible to the community. Data sets for MLIAPs are easy to construct because they employ, for the most part, simple DFT calculations. Meanwhile, most classes of MLIAPs have their own software implementation available to train and test models given a data set. Many of these practices and techniques were borrowed directly from the field of ML and data science, such as emphasis on open-source distribution of software and data sets, reproducibility, and standard statistical metrics to compare competing models. Together, they facilitate the training of MLIAPs tailored for specific applications.

Because of the lack of physical motivation for MLIAP functional forms, the set of atomic configurations employed when training the MLIAPs is critically important. The training data determines not only which materials properties will be reproduced accurately, but also the underlying physics of the MLIAP. In the following two sections, “Dislocations” and “Grain boundaries”, we explore how the judicious choice of training configurations and MLIAP models affects the accuracy of MLIAPs in describing these two classes of crystal defects. Table I lists all the MLIAPs discussed here.

Dislocations

Dislocations[53,54] are linear defects that define the ability of certain crystalline materials to deform plastically. Naturally, many mechanical properties are controlled by dislocations, such as strength and fracture resistance (i.e., toughness). Dislocations also play a role in a surprising variety of other situations, such as ionic transport in ceramics,[55,56] strain relaxation between misfit layers of epitaxial thin films,[57] kinetics of crystal growth,[58] and quantum efficiency loss in semiconductor devices such as light-emitting diodes and solar cells.[58]

Figure 2

(a) Peierls barrier for \(\frac{1}{2} \left<111\right>\) screw dislocations gliding in the \(\left<112\right>\) direction. Many IAPs, such as the EAM potential Mendelev07 shown here, incorrectly predict a degenerate core structure with a metastable configuration, while all MLIAPs examined here correctly predict a non-degenerate core in agreement with DFT calculations. Adapted from Ref. 27. (b) Generalized stacking fault energy (or \(\gamma\)-surface) for the \(\{112\}\) planes. Including \(\gamma\)-surfaces in the training data sets of MLIAPs has been shown to lead to improved accuracy in the description of dislocation core features. Reproduced from Ref. 27. (c) Dependence of the enthalpy barrier for kink-pair nucleation on the applied stress as computed with GAP. Inset shows atoms near a dislocation core during glide. The atomic structure around the kinks is markedly different from the structure along a straight dislocation. Adapted from Ref. 52. (d) Energy profile along the migration path for a straight \(\frac{1}{2}\left<111\right>\) screw dislocation in iron as computed with a NNP. Reproduced from Ref. 26. (e) Dislocation core energy variation with the angle between the dislocation line and its Burgers vector (i.e., the character angle) as computed with IAPs. MLIAPs must account for the variation in dislocation properties with line direction. We are currently not aware of any investigation of the capacity of MLIAPs to extrapolate dislocation properties from the data of a single line direction in the training set.

Dislocation-mediated plastic deformation is controlled by dislocation interactions through their long-range elastic fields.[54] Because of this, any MLIAP aiming at properly describing plastic yielding must accurately reproduce the elastic constants of the material. Luckily, elastic constants are not only easily calculated through DFT simulations, but also seem to be routinely reproduced with outstanding accuracy by MLIAPs for various materials. For example, in Ref. 15 the three independent elastic constants of cubic materials computed by five different MLIAPs for six materials (Ni, Cu, Li, Mo, Si, and Ge) are typically within \(10\%\) of their DFT values.

While elastic constants depend only on the details of the atomic interactions occurring in the perfect crystal structure, most other factors governing dislocation-mediated plasticity involve atomic environments that are profoundly different from the structure of the underlying host lattice. Such environments occur in a region within a few lattice spacings of the dislocation line known as the dislocation core. The atomic structure and interactions at the dislocation core define a variety of properties, such as the dislocation mobility, preferred glide planes, propensity to move out of glide planes (i.e., cross-slip), and the short-ranged part of the elastic interaction with other defects.

Screw dislocations have notoriously low mobilities in body-centered cubic (BCC) metals, which makes them the most relevant type of dislocations for plasticity in these materials. This large resistance to glide has its origins precisely in the atomic structure of the screw dislocation core region, which is non-degenerate (i.e., it does not dissociate into partial dislocations). Many IAPs incorrectly predict a degenerate core structure for screw dislocations in BCC metals, with a metastable configuration that the dislocation must go through while gliding [shown in green in Fig. 2(a)]. The reason IAPs for metals have a difficult time capturing the correct structure is that BCC transition metals, in which this is observed, have atomic interactions dominated by d-band electrons with significant directionality (i.e., covalent character), which strongly favors contributions from first neighbors.[59] Hence, dislocation motion in BCC metals depends strongly on the details of the interatomic bonding at the core, making this a good case study for the effectiveness of MLIAPs.

From the discussion above it is clear that any MLIAP aiming to properly describe plastic yielding in BCC metals must account for the dislocation core properties accurately. Direct DFT calculations of dislocation cores are possible, yet they are not as simple to perform as those for the elastic constants and require specialized methodologies (see Ref. 60 for a review). A promising approach to circumvent the use of dislocation cores in training MLIAPs has its origins in the physics of dislocations itself. It has long been known that many properties of the core are partially defined by the concept of a generalized stacking fault,[54,61] i.e., stacking faults in which the atomic slip between two adjacent atomic planes can assume values that vary continuously from zero (no stacking fault) to one Burgers vector. Given an atomic plane, the generalized stacking fault energy for that plane along all possible slip directions is known as the \(\gamma\)-surface. The \(\gamma\)-surface fully characterizes the resistance of a perfect crystal to shear along that crystallographic plane. This approach was employed in Ref. 27, where the \(\{110\}\) and \(\{112\}\) \(\gamma\)-surfaces [shown in Fig. 2(b)] were included in the training set for a Gaussian approximation potential[34,35] (GAP) for iron, while no dislocation core structures were employed for training. The resulting core structure for the \(\frac{1}{2}\left<111\right>\) screw dislocation was shown to be compact and non-degenerate, in agreement with DFT calculations. The corresponding Peierls barrier, obtained through nudged elastic band calculations, was also in good agreement with DFT results and presented a single hump, as shown in Fig. 2(a).
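The definition of the \(\gamma\)-surface given above can be written compactly. In the sketch below, the symbols \(E({\mathbf{u}})\), \(A\), \({\mathbf{a}}_1\), \({\mathbf{a}}_2\), \(s_1\), and \(s_2\) are notation introduced here for illustration:

\[ \gamma({\mathbf{u}}) = \frac{E({\mathbf{u}}) - E({\mathbf{0}})}{A}, \qquad {\mathbf{u}} = s_1 {\mathbf{a}}_1 + s_2 {\mathbf{a}}_2, \qquad 0 \le s_1, s_2 < 1, \]

where \(E({\mathbf{u}})\) is the total energy of a simulation cell in which the crystal on one side of the chosen plane is rigidly displaced by \({\mathbf{u}}\) relative to the crystal on the other side, \(A\) is the area of the fault plane contained in the cell, and \({\mathbf{a}}_1\) and \({\mathbf{a}}_2\) are in-plane lattice translation vectors. In such calculations the atoms are commonly allowed to relax only along the plane normal so that the imposed slip is preserved. Because each point on the surface requires only a small, defect-free cell with a single planar fault, these configurations are readily accessible to DFT, which is what makes them attractive as training data.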

Despite the promising results described above, evidence suggests that the direct inclusion of dislocation core structures might indeed be necessary. In Ref. 33 a GAP is developed for tungsten that reproduces the core structure of screw dislocations. It is shown there that the error in the final dislocation core structure (RMS error of the Nye tensor for the atoms nearest to the dislocation core) is decreased by half by the inclusion of the \(\{110\}\) and \(\{112\}\) \(\gamma\)-surfaces in the training set, and by one quarter when, in addition to the \(\gamma\)-surfaces, the dislocation core structure itself is included in the training. Notably, the inclusion of the dislocation core did not affect the Peierls barrier, indicating that the \(\gamma\)-surfaces are sufficient to capture the dislocation core properties that dictate dislocation mobility. Yet, the direct inclusion of dislocation cores in the training set led to accuracy improvements in the description of the core structure that cannot be obtained otherwise, indicating that certain subtleties of the atomic structure of dislocation cores cannot be learned from \(\gamma\)-surface structures alone.

The direct inclusion of dislocation cores in the training set is subject to important limitations stemming from the system sizes accessible to DFT. Dislocation core structures in DFT are restricted to straight dislocations with well-defined line directions, while dislocations generally carry a degree of curvature that affects their properties. Additionally, the inclusion of a single dislocation line direction in the training set does not guarantee that the resulting MLIAP will generalize well to other directions, because the atomic structure of the dislocation core (and consequently the dislocation properties) is strongly dependent on the dislocation direction.[62,63] For example, Fig. 2(e) shows the variation of the dislocation core energy as a function of its character angle (i.e., the angle between the line tangent vector and the Burgers vector). Another consideration is that dislocation cores present polymorphism,[64] with different structures being thermally accessible. As a result, the zero-temperature core structures obtained from DFT might not be the structures that govern dislocation mobility at finite temperatures.[65]

In order for dislocation motion to occur, the dislocation must overcome the lattice resistance to its motion (i.e., the Peierls barrier), which has its origins in the structural transformations that the core must undergo while moving. Accurate reproduction of dislocation mobility with MLIAPs requires accounting for the dislocation structure along the entire Peierls barrier—as opposed to only the Peierls valleys. For example, screw dislocations in BCC metals move by thermally activated kink-pair formation due to the large Peierls barrier in these materials, resulting in strong temperature and strain-rate dependencies of the yield stress.[66]

The atomic structure near a kink [Fig. 2(c)] is markedly different from the structure along the straight dislocation line and might not be adequately extrapolated by a MLIAP from the straight dislocation core structure. Yet, simulation cells containing kink pairs are too large to be employed in DFT simulations. Nevertheless, promising results were obtained from kink-pair simulations with MLIAPs. For example, in Ref. 26 a NNP was employed to compute the formation energy of a kink pair in iron for a \(\frac{1}{2} \left<111\right>\) screw dislocation gliding on the \(\{110\}\) plane. The result, \(0.94\;{\hbox{eV}}\), is reasonably close to an estimate employing a line-tension model and DFT calculations: \(0.73\) to \(0.86\;{\hbox{eV}}\). While a direct comparison is not possible, the good agreement of the energy profile along the migration path for a straight dislocation [Fig. 2(d)] is consistent with this result, since this profile controls the energetics of kink-pair formation. It is interesting to note that this NNP shows a substantial improvement in the description of this energy profile over a similar GAP for the same system.[27] Because there are no substantial differences in the training sets, one is led to assume that the methodology of the MLIAP itself is responsible for the observed improvements.

The role of thermal effects on the dislocation slip behavior in iron was thoroughly investigated in Ref. 52 employing the same GAP from Fig. 2(a) and (b). First, the nucleation and migration of kink pairs were observed directly from molecular dynamics simulations at finite temperatures, in which the glide plane was also observed to be consistent with DFT analyses. Nudged elastic band calculations were then performed to compute the stress-dependent enthalpy barrier for kink-pair nucleation. The results, shown in Fig. 2(c), are consistent with DFT calculations and line-tension theoretical models.

Other notable MLIAPs for BCC metals have also reproduced various properties of \(\frac{1}{2}\left<111\right>\) screw dislocations, including the magnitude and single-saddle-point shape of the Peierls barrier, \(\gamma\)-surfaces, and the kink-pair formation energies (always in comparison with line-tension models parametrized by DFT calculations). The metals considered include iron,[28] tungsten,[28] tantalum,[30,66] molybdenum,[30] and niobium.[30]

Investigations of dislocation properties with MLIAPs for face-centered cubic (FCC) and hexagonal close-packed (HCP) materials seem to be less common. Dislocation cores in FCC metals are planar: \(\frac{1}{2} \left<110\right>\{111\}\) dislocations dissociate into a pair of Shockley partial dislocations \(\frac{1}{6} \left<112\right>\{111\}\) separated by a stacking fault. The separation between partials is often too large to be investigated with direct DFT calculations, making it challenging to include the dislocation cores directly in the MLIAP training sets. Yet, the separation distance is decided by an energy balance between the stacking fault energy and the energy associated with the elastic repulsion between partials. This makes the inclusion of \(\gamma\)-surfaces in the training data set even more important than in the case of BCC metals. Dislocations in HCP materials pose an excellent challenge to MLIAPs due to the several potential dislocation slip systems available, with the choice of which one will be active being material-dependent. Similarly to FCC, the inclusion of \(\gamma\)-surfaces in the training set of MLIAPs is critical because the predominant dislocations, \(\left<{\mathrm{a}}\right>\) (i.e., \(\frac{1}{3}\left< 1{\bar{2}}10\right>\)), can dissociate into partials on the prismatic \(\{10{\bar{1}}0\}\) or basal \(\{0001\}\) planes depending on the ratio of the stacking fault energies of these planes. \(\left<{\mathrm{c+a}}\right>\) (i.e., \(\frac{1}{3}\left< 2{\bar{1}}{\bar{1}}3\right>\)) dislocations are also observed in HCP materials to accommodate deformation along the \(\left<0001\right>\) direction. Examples of investigations of dislocation properties with MLIAPs for HCP materials can be found in Refs. 16 and 29.

Grain boundaries

Figure 3

(a) Grain boundary energy dependence on the rotation angle for symmetric tilt \(\left<110\right>\) GBs in Al as computed by MLIAPs (purple circles). Open symbols show the grain boundary energies predicted using an IAP, while DFT values are shown by crosses. Reproduced from Ref. 18. (b) Convergence of GB properties for silicon with the model capacity (or computational cost) of the MLIAPs. The lattice thermal conductivities were computed at 700 K. Reproduced from Ref. 21. (c) Grain boundary excess energy convergence with model capacity (or computational cost) of the MLIAPs. The MLIAP was trained on a data set that did not include any GB structure. Reproduced from Ref. 21. (d) Same as panel (c), but with the 10,000-structure data set augmented with four GB structures, leading to a markedly enhanced performance. Reproduced from Ref. 21.

Grain boundaries (GBs) are interfaces between differently oriented crystals of the same phase. The atomic structure near GBs is complex and diverse,[67,68] which makes these defects as challenging for MLIAPs as dislocations. GB properties are often rationalized in terms of the extent of misfit between the lattices of the two grains in contact, with the key quantity being the reciprocal density of coincident lattice sites between the two grains, denoted by \(\Sigma\). For example, for a \(\Sigma 3\) GB one third of the lattice sites are shared between the two lattices. GBs with low \(\Sigma\) values (such as \(\Sigma 3\)) are expected to exhibit simpler structures than high-\(\Sigma\) GBs. Naturally, low-\(\Sigma\) GBs are more amenable to DFT calculations since their simpler structures often result in smaller simulation cells. Consideration of complex GBs such as high-misfit and asymmetric GBs is more limited in the literature.
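The relation between misorientation and \(\Sigma\) can be made concrete for cubic lattices. The sketch below assumes the standard generating-function construction for coincidence site lattices in cubic crystals, which is not described in the text: for a rotation axis \([hkl]\) and coprime positive integers \(x\) and \(y\), the rotation angle is \(2\arctan(y\sqrt{N}/x)\) with \(N = h^2+k^2+l^2\), and \(\Sigma = x^2 + N y^2\) divided by two until odd. The function name is hypothetical.

```python
import math

def csl_sigma(x, y, axis):
    """Sigma and rotation angle (degrees) of a cubic-lattice CSL for rotation about `axis`.

    x and y are assumed to be coprime positive integers; axis is an (h, k, l) tuple.
    """
    n = sum(c * c for c in axis)
    sigma = x * x + n * y * y
    while sigma % 2 == 0:          # Sigma is odd for cubic lattices; strip factors of two
        sigma //= 2
    theta = 2.0 * math.degrees(math.atan2(y * math.sqrt(n), x))
    return sigma, theta

for x, y, axis in [(3, 1, (1, 0, 0)), (1, 1, (1, 1, 0)), (4, 1, (1, 1, 0))]:
    sigma, theta = csl_sigma(x, y, axis)
    print(f"axis {axis}, (x, y) = ({x}, {y}): Sigma {sigma}, angle {theta:.2f} deg")
```

Small values of \(x\) and \(y\) generate the low-\(\Sigma\) boundaries, whose short periodicity is what keeps the corresponding simulation cells small enough for DFT.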

Nishiyama et al.[18] constructed MLIAPs for a set of FCC elemental metals (Ag, Al, Au, Cu, Pd, and Pt) using Gaussian-type pairwise features and polynomial invariant features. The MLIAPs were employed to systematically compute the structure and excess energy of a family of GBs spanning symmetric-tilt GBs (\(\Sigma 5 \left< 100\right>\), \(\Sigma 3 \left< 110\right>\), and \(\Sigma 9\left<110\right>\)) and pure-twist GBs (\(\Sigma 9 \left< 100\right>\)). The MLIAPs showed great predictive power [Fig. 3(a)], despite the surprising fact that the training data set did not contain any GB structure. Following this approach, one could employ the repository of MLIAPs created by Seko[19] (which contains no defect structures in the training data) to evaluate the GB structures and excess energies for various material systems.

In Ref. 21 Fujii and Seko investigated low- and high-\(\Sigma\) GBs in silicon using MLIAPs that were trained using data sets with and without GB structures. MLIAPs trained without GB structures typically overestimate the GB energies, similarly to the behavior observed in IAPs. Meanwhile, appending only four GB structures to the 10,000-structure data set markedly enhanced the performance of the potential and the prediction accuracy of GB energies [Figs. 3(c) and (d)]. The dependence of GB properties such as lattice thermal conductivity, phonon frequency, and GB energy on the model complexity (or computational cost) of the MLIAPs was also computed [Fig. 3(b)].

Alternative approaches to the inclusion of GB structures in the training data set have been shown to improve the accuracy of GB predictions by MLIAPs, such as optimized structural features[20,69,70] and physically informed training processes.[40,71] For example, Rosenbrock et al.[69] introduced two revised versions of SOAP descriptors specialized in predicting GB energies and structures. Pun et al.[40] developed a physically informed neural network (PINN) potential that combines physics-based models with neural-network regression and has good transferability to complicated structures such as GBs, stacking faults, and solid–liquid interfaces.[17]

Besides the technical developments in improving MLIAP accuracy in the description of GBs, MLIAPs have also been employed directly in the analyses of GB behaviors and relevant features in perovskites,[72] 2D materials,[73] GB complexion systems,[74] refractory high-entropy alloys,[75,76] Si,[21,22,23] Fe,[25] CdTe,[77] W,[32] Li\(_{3}\)N,[78] Ta,[31] and Al.[17,18,20]

Improving MLIAPs accuracy for crystal defects

Figure 4

(a) Estimation of the reliability of a MLIAP for four different defects not included in the training data set. Positive values indicate local atomic environments (LAEs) that are similar to LAEs in the training set, while negative values indicate outliers. Reproduced from Ref. 79. (b) Per-atom error prediction of GAP for atoms near four different types of defects. Reproduced from Ref. 22. (c) On-the-fly learning of a MLIAP for vacancy migration. DFT calls are shown as black dots (left figure) and occur mostly at the beginning of the simulation. The accuracy of the resulting MLIAP is compared to DFT by measuring the activation barrier for vacancy migration (right figure). This approach enables local atomic environments not included in the training set to be automatically identified and added to the MLIAP. Adapted from Ref. 80. (d) Tradeoff between computational cost and accuracy of different MLIAPs trained on the same data set. Points represent MLIAPs with different parameters and model capacities. A grid search is performed and the MLIAPs with the best tradeoff between computational cost and accuracy are shown in red (i.e., the Pareto front). Reproduced from Ref. 19.
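The selection of models with the best tradeoff between cost and accuracy, as in panel (d), can be sketched as follows. The example assumes each candidate MLIAP is summarized by a hypothetical (cost, error) pair with lower values being better for both; the numbers are invented for illustration and are not taken from Ref. 19.

```python
def pareto_front(points):
    """Return the (cost, error) pairs that are not dominated by any other pair.

    A pair is dominated if another pair is no worse in both cost and error
    and strictly better in at least one of them.
    """
    front = []
    best_error = float("inf")
    for cost, error in sorted(points):   # ascending cost, ties broken by ascending error
        if error < best_error:           # strictly more accurate than every cheaper model
            front.append((cost, error))
            best_error = error
    return front

# Hypothetical grid-search results: (relative cost, validation error).
models = [(1.0, 9.0), (2.0, 5.0), (2.5, 6.0), (4.0, 3.0), (5.0, 3.5), (8.0, 2.9)]
front = pareto_front(models)
print(front)
```

Any model outside the front can be replaced by one that is cheaper, more accurate, or both, so the choice among candidates reduces to picking a point on the front according to the computational budget of the intended simulation.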

The physical fidelity of atomistic simulations is often considered tantamount to the accuracy with which atomic interactions are represented. But evaluating the accuracy of a MLIAP for the simulation of crystal defects is not trivial. In this section we briefly review promising new approaches to evaluate and improve the accuracy of MLIAPs. Focus is given to approaches that are agnostic to the specific class of MLIAPs being employed.

During the development of a MLIAP one includes in the training data set a variety of atomic configurations with the goal of sampling local atomic environments (LAEs) of relevance for defects. But, if the crystal defect of interest requires simulation cells too large to be solved by DFT, such as in the case of dislocations with kink-pairs, the configurations including the defect cannot be directly included in the training set. One approach to circumvent this problem is to use knowledge of the crystal defect physics in order to include surrogate configurations that approximate the LAEs of interest while being amenable to DFT calculations. For example, in the “Dislocations” section it was shown that the inclusion of \(\gamma\)-surface configurations leads to large improvements in the description of dislocation cores by MLIAPs.

Finding physically motivated surrogate configurations requires significant intuition (i.e., expert knowledge) and effort. An automated version of the approach described above has been proposed by Goryaeva et al. in Ref. 79. Given the atomic configuration of a crystal defect, the authors employ ML outlier-detection algorithms to generate a metric for the deviation of the defect LAEs from the LAEs included in the training data set. Thus, given a MLIAP and its corresponding training data set, one can evaluate its reliability when modeling a given defect structure. For example, in Fig. 4(a) this metric is evaluated for four different types of defects for a GAP, indicating that two of the defects are not well described by the MLIAP. A similar approach has also been suggested in Ref. 22, where an error estimate for a GAP prediction is obtained along with the prediction itself [Fig. 4(b)].
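The idea of scoring defect LAEs against a training set can be illustrated with a minimal sketch. It is not the method of Ref. 79: the nearest-neighbor distance in descriptor space, the function names, the toy descriptor values, and the sign convention (positive for LAEs resembling the training set, negative for outliers, mirroring Fig. 4(a)) are all assumptions made for illustration.

```python
import math

def nearest_distance(x, points, skip=None):
    """Euclidean distance from x to its nearest neighbor in points."""
    return min(math.dist(x, p) for i, p in enumerate(points) if i != skip)

def reliability_scores(train, query):
    """Score each query LAE descriptor against the training descriptors.

    Hypothetical score: d_ref - d, where d is the distance from the query
    descriptor to its nearest training descriptor and d_ref is the largest
    nearest-neighbor distance found within the training set itself.
    Positive: the LAE is as close to the training data as the training
    points are to each other. Negative: the LAE is an outlier.
    """
    d_ref = max(nearest_distance(p, train, skip=i) for i, p in enumerate(train))
    return [d_ref - nearest_distance(q, train) for q in query]

# Toy two-component descriptors (made-up numbers, not real LAE descriptors).
train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
query = [(0.05, 0.05), (1.0, 1.0)]  # one bulk-like LAE, one far from the data
scores = reliability_scores(train, query)
```

In this toy example the first query descriptor receives a positive score and the second a negative one. In practice the descriptors would come from the same atomic representation used by the MLIAP, and a more robust outlier detector would likely be preferable to a single nearest-neighbor distance.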

If the crystal defect of interest can be processed by DFT calculations, one is able to directly evaluate the MLIAP error by calculating the difference between the energies and forces computed by the MLIAP and the corresponding DFT results. Minimizing this error while avoiding overfitting is important, but it does not guarantee that the MLIAP will perform well in practice. Simulations employing the MLIAP might explore LAEs outside of the domain of LAEs covered by the training set, i.e., the MLIAP would be extrapolating to unknown LAEs instead of interpolating between known LAEs. For example, one might include a certain dislocation core structure in the training set, but during simulations at finite temperatures the core might undergo a structural transformation to a different structure unseen during training.
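A common way to quantify this difference is the root-mean-square error (RMSE) of energies and force components. The sketch below assumes that choice of metric; the function names and all numerical values are hypothetical and serve only to show the bookkeeping.

```python
import math

def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    return math.sqrt(
        sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)
    )

def force_rmse(predicted, reference):
    """RMSE over all Cartesian force components of all atoms."""
    flat_p = [c for atom in predicted for c in atom]
    flat_r = [c for atom in reference for c in atom]
    return rmse(flat_p, flat_r)

# Made-up values: energies per atom for two configurations (eV/atom)
# and forces on two atoms of one configuration (eV/Angstrom).
e_mliap = [-3.70, -3.50]
e_dft = [-3.74, -3.47]
f_mliap = [(0.10, 0.00, 0.00), (-0.10, 0.00, 0.00)]
f_dft = [(0.13, 0.00, 0.00), (-0.13, 0.00, 0.04)]

energy_error = rmse(e_mliap, e_dft)
force_error = force_rmse(f_mliap, f_dft)
```

Such errors are only meaningful for configurations on which they are evaluated; a low RMSE on a held-out set drawn from the same distribution as the training set says little about LAEs that the simulation reaches later.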

The general solution to this problem is to identify such LAEs and include similar configurations in the training set. In Ref. 80 this process is automated and performed over the course of a molecular dynamics simulation involving the defect (i.e., active learning). Figure 4(c) demonstrates this for the case of vacancy migration in aluminum: after an initial period in which many DFT calculations are requested to improve the MLIAP, the simulation proceeds without any additional DFT calculations. The accuracy of the resulting MLIAP is compared to DFT by measuring the activation barrier for vacancy migration. In Ref. 81 an interesting approach to active learning is introduced in which small subregions of large-scale simulations are identified and periodic configurations small enough for DFT calculations are constructed out of these subregions.
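The control flow of such an on-the-fly scheme can be sketched with a toy model. This is not the algorithm of Ref. 80: the one-dimensional "energy landscape" standing in for DFT, the nearest-neighbor surrogate, the distance-based uncertainty, and the threshold value are all assumptions chosen to keep the example short.

```python
import math

def reference_energy(x):
    """Stand-in for an expensive DFT call (a toy 1D energy landscape)."""
    return math.sin(x)

class NearestNeighborModel:
    """Toy surrogate: predicts the energy of the closest training point."""

    def __init__(self):
        self.xs, self.es = [], []

    def add(self, x, e):
        self.xs.append(x)
        self.es.append(e)

    def uncertainty(self, x):
        # Distance to the nearest training point; infinite if untrained.
        return min((abs(x - xi) for xi in self.xs), default=math.inf)

    def predict(self, x):
        i = min(range(len(self.xs)), key=lambda j: abs(x - self.xs[j]))
        return self.es[i]

def run_on_the_fly(trajectory, threshold):
    """Walk a trajectory, calling the reference only when uncertainty is high."""
    model, call_steps, energies = NearestNeighborModel(), [], []
    for step, x in enumerate(trajectory):
        if model.uncertainty(x) > threshold:
            model.add(x, reference_energy(x))  # "DFT call" and retraining
            call_steps.append(step)
        energies.append(model.predict(x))  # simulation uses the surrogate
    return model, call_steps, energies

# A trajectory that keeps revisiting the same region, loosely mimicking
# a defect that repeatedly samples similar environments.
trajectory = [math.sin(0.3 * step) for step in range(200)]
model, call_steps, energies = run_on_the_fly(trajectory, threshold=0.1)
```

Because the trajectory stays within a bounded region, the number of reference calls is bounded as well: once the visited region is covered to within the threshold, the loop runs on the surrogate alone. A real implementation would replace the distance-based uncertainty with an estimate provided by the MLIAP itself.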

The balance between the computational cost and the accuracy of MLIAPs is critical for their employment in cross-scale simulations of crystal defects. Given a fixed training set, the accuracy of a MLIAP can be systematically improved to a large extent by increasing the complexity of the underlying ML model. When adequately controlling for the bias-variance tradeoff, increasing the model complexity leads to an overall better performance of the MLIAP, but it also increases its computational cost. For example, with MTPs the degree of the polynomial employed can be varied systematically in order to increase the model complexity. Yet, improving the MLIAP accuracy beyond a certain point might have no physically distinguishable effect while making the simulations more costly, consequently limiting the applicability of the MLIAP in cross-scale atomistic simulations. For example, in Fig. 3(c) it is clear that increasing the model capacity beyond the point highlighted in the figure leads to negligible changes in the lattice thermal conductivity, phonon frequencies, and GB energetics. Notice, however, that one could still increase the capacity of the MLIAP to the point where calculations would be one order of magnitude more expensive. In Ref. 19 Seko constructed a repository of MLIAPs in which each MLIAP is trained with three different model capacities, where Pareto optimality is estimated through a grid search based on the energy prediction error. This allows users to select the tradeoff between accuracy and computational cost that best fits their goals. One could envision the same approach being applied to crystal defects, where the convergence of relevant defect properties would be evaluated.
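Selecting Pareto-optimal models from a grid search amounts to discarding every model for which another model is at least as cheap and at least as accurate. A minimal sketch follows; the function name, the model labels, and the cost and error values are hypothetical and are not taken from Ref. 19.

```python
def pareto_front(models):
    """Return the models not dominated in (cost, error); lower is better.

    A model is dominated if another model is at least as good in both
    quantities and strictly better in at least one of them.
    """
    front = []
    for name, cost, error in models:
        dominated = any(
            c <= cost and e <= error and (c < cost or e < error)
            for _, c, e in models
        )
        if not dominated:
            front.append((name, cost, error))
    return sorted(front, key=lambda m: m[1])  # order by increasing cost

# Hypothetical grid-search results: (label, cost, energy RMSE).
candidates = [
    ("A", 0.01, 12.0),
    ("B", 0.05, 5.0),
    ("C", 0.08, 6.5),  # slower and less accurate than B
    ("D", 0.30, 2.0),
    ("E", 1.00, 1.9),
    ("F", 1.20, 2.5),  # slower and less accurate than D and E
]
front = pareto_front(candidates)
```

Here models A, B, D, and E form the front. For crystal defects, the error axis could be replaced by the deviation of a defect property of interest (e.g., a GB energy) from its reference value, so that the front reflects the convergence of that property rather than the energy error alone.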


The introduction of MLIAPs has dramatically widened the spectrum of materials systems that can be simulated with high physical fidelity. In this prospective article we have highlighted recent successes in training MLIAPs that accurately capture a variety of crystal defect properties, including their kinetics. Focus was given to examples including dislocations and grain boundaries. The summary in Table I makes it clear that similar work has been performed for a variety of other extended and native defects. Yet, the frontier of MLIAPs for crystal defects includes a large variety of materials systems and defects that have not yet been investigated in depth. This includes MLIAPs for solid–liquid interfaces and strategies for tackling defects in chemically complex systems.[82,83,84] Recent progress in accounting for degrees of freedom other than coordinates will soon enable the simulation of defects in systems with magnetic transitions[85] and space charge effects.[86]

Different state-of-the-art MLIAP classes present similar levels of performance and accuracy for crystal defects. Specific classes of MLIAPs (e.g., physically informed MLIAPs[40] or MLIAPs employing different atomic representations[87]) do not seem able to learn the physics of crystal defects better than others, with the few exceptions identified [Fig. 2(d)] not being comprehensive enough to warrant definitive conclusions. This suggests that, if the goal is to create more accurate MLIAPs for crystal defects, one should focus on the development of better training strategies—including training sets—instead of pushing for incremental improvements in accuracy through the development of a new class of MLIAPs altogether.

The lack of physical motivation for the functional forms of MLIAPs makes the training data set critically important for determining which materials properties will be reproduced accurately. Recent developments reviewed in the “Improving MLIAPs accuracy for crystal defects” section show promising methods for improving MLIAP accuracy for crystal defects by automating the analysis of training data sets. Yet, no approach has leaped ahead and provided a “MLIAP panacea” for crystal defects (i.e., an unbiased and automated approach for creating optimal training sets for multiple defect types). For now one still needs to rely on expert knowledge of the underlying defect physics in order to develop good-quality MLIAPs.