Nanoinformatics, pp 25-44
Potential Energy Surface Mapping of Charge Carriers in Ionic Conductors Based on a Gaussian Process Model
Abstract
The potential energy surface (PES) of a charge carrier in a host crystal is an important concept to fundamentally understand ionic conduction. Such PES evaluations, especially by density functional theory (DFT) calculations, generally require vast computational costs. This chapter introduces a novel selective sampling procedure to preferentially evaluate the partial PES characterizing ionic conduction. This procedure is based on a machine learning method called the Gaussian process (GP), which reduces computational costs for PES evaluations. During the sampling procedure, a statistical model of the PES is constructed and sequentially updated to identify the region of interest characterizing ionic conduction in configuration space. Its efficacy is demonstrated using a model case of proton conduction in a well-known proton-conducting oxide, barium zirconate (BaZrO3) with the cubic perovskite structure. The proposed procedure efficiently evaluates the partial PES in the region of interest that characterizes proton conduction in the host crystal lattice of BaZrO3.
Keywords
Gaussian process, Bayesian optimization, Ionic conduction, Potential energy surface
2.1 Introduction
Atomic transport phenomena in solids, such as atomic diffusion and ionic conduction, are generally governed by thermally activated processes. Based on transition state theory (TST) [1, 2, 3], the mean frequency \( \nu \) of an elementary process with a single saddle point state, a so-called atomic or ionic jump, is approximated by \( \nu = \nu_{0} \exp ( - \Delta E^{\text{mig}} /k_{\text{B}} T) \), where \( \nu_{0} \) is the vibrational prefactor, \( k_{\text{B}} \) is the Boltzmann constant, T is the temperature, and \( \Delta E^{\text{mig}} \) is the potential barrier, i.e., the change in the potential energy (PE) from the initial state to the saddle point state. \( \nu_{0} \) is typically a constant in the range of \( 10^{12} \) to \( 10^{13}\,\text{s}^{-1} \) associated with a lattice vibration [3, 4, 5, 6, 7, 8]. Consequently, \( \Delta E^{\text{mig}} \) mainly determines the rate of an atomic jump in a solid.
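As a rough numerical illustration of the TST expression above, the following Python sketch evaluates the jump frequency for a few barrier heights. The barrier values, the temperature of 600 K, and the function name are assumptions chosen only for illustration; they are not taken from this chapter. The prefactor is set to the upper end of the typical range quoted above.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def jump_frequency(barrier_ev, temperature_k, prefactor_hz=1.0e13):
    """Mean jump frequency nu = nu0 * exp(-dE_mig / (k_B * T))."""
    return prefactor_hz * math.exp(-barrier_ev / (K_B_EV_PER_K * temperature_k))


# Hypothetical barriers (eV) and temperature (K), for illustration only.
for barrier in (0.2, 0.3, 0.5):
    nu = jump_frequency(barrier, 600.0)
    print(f"dE_mig = {barrier:.1f} eV -> nu = {nu:.3e} 1/s")
```

Because the barrier enters through the exponential, a difference of 0.1 eV changes the frequency by roughly a factor of seven at 600 K, whereas the prefactor varies by only about one order of magnitude across its typical range.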
In general, atomic transfer is composed of several types of atomic jumps, which form a complicated three-dimensional (3D) network in the crystal lattice. Therefore, it is necessary to grasp the entire potential energy surface (PES) of a mobile atom or ion. However, a theoretical PES evaluation, e.g., based on density functional theory (DFT), generally requires huge computational costs, particularly in the case of a host crystal with a low crystallographic symmetry. The nudged elastic band (NEB) method [9, 10] is a well-established technique that avoids evaluating the entire PES by focusing only on the minimum energy paths (MEPs). Because of its efficiency and versatility, the NEB method is conventionally used to clarify the atomic-scale picture and the kinetics of atomic transfer in crystals.
However, the NEB method has some practical limitations. First, the initial and final states of all elementary paths in a crystal must be specified. That is, all local energy minima in the crystal and all conceivable elementary paths between adjacent local energy minima must be known in advance. As the crystallographic symmetry of the host crystal decreases, the number of local energy minima and conceivable elementary paths rapidly increases. Consequently, satisfying the requirements of the NEB method is very difficult without a priori information on the entire PES. In such cases, physical and chemical knowledge (e.g., ionic radii, chemical bonding states, electrostatic interaction, and interstitial and bottleneck sizes) is generally used to guess the paths. However, a key elementary path determining the rate of atomic diffusion or ionic conduction is sometimes missed by such an ad hoc approach. In addition, the NEB method requires huge computational costs for low-symmetry crystals, even if only the MEPs in the PES are evaluated. For example, in our recent study on proton conduction in tin pyrophosphate (SnP2O7) with the space group \( P2_{1}/c \), we evaluated 143 possible elementary paths connecting 15 local energy minima by the NEB method [11]. An alternative method that is both robust and efficient is desirable to analyze complicated atomic transfers consisting of many elementary paths in a low-symmetry crystal.
Fig. 2.1 Synthetic two-dimensional (a) PES and (b) force norm surface (FNS) of a charge carrier in a host crystal lattice. The region of interest is defined as the low-PE region in the PES and the low-FN region in the FNS
The proposed sampling procedure has three key features. (1) A statistical model of the PES or FNS is developed as a Gaussian process (GP) [13, 14]. The statistical model is iteratively updated by repeatedly (i) sampling at a point where the predicted PE or FN is low and (ii) incorporating the newly calculated PE or FN value at the sampled point. (2) The statistical PES or FNS model is used to identify the subset of grid points at which the PEs or FNs are relatively low. Because GP applications have generally targeted a single global minimum or maximum point rather than a subset of points, a dedicated selection criterion is introduced for this purpose. (3) The procedure allows us to estimate how many points in the region of interest remain unsampled, i.e., it indicates when sampling should be terminated.
Fig. 2.2 Schematic illustration of the proposed selective sampling procedure in a one-dimensional configuration space with synthetic data [12]. In each plot, the x- and y-axes represent the configuration space and the PEs, respectively. The red area in plot (a) represents the low-PE region. In this example, the goal is to efficiently identify and evaluate the PEs at the nine points in the low-PE region. Plot (b) indicates the initialization step, where two points (filled red squares) are randomly selected and their PEs are evaluated. The remaining 16 plots [plots (c) to (r)] indicate steps 1 to 16 of the procedure
The uncertainty in the GP model is also useful for determining when to terminate sampling. The termination criterion should be based on the probability that unsampled low-PE points still exist, for which the information on the uncertainty is indispensable. As a model case, the efficacy of the proposed procedure is demonstrated herein using proton conduction in a proton-conducting oxide, barium zirconate (BaZrO3) [15, 16, 17, 18].
2.2 Problem Setup
2.2.1 Entire Proton PES in BaZrO3
Fig. 2.3 Crystal structure and the asymmetric unit of BaZrO3 with the cubic perovskite structure
The DFT calculations for the PES (and FNS) evaluation of a proton in BaZrO3 are based on the projector augmented wave (PAW) method as implemented in the VASP code [19, 20, 21, 22]. The generalized gradient approximation (GGA) parameterized by Perdew, Burke, and Ernzerhof is used for the exchange-correlation term [23]. The 5s, 5p, and 6s orbitals for Ba; 4s, 4p, 5s, and 4d for Zr; 2s and 2p for O; and 1s for H are treated as valence states. A supercell consisting of 3 × 3 × 3 unit cells (135 atoms) is used with a 2 × 2 × 2 mesh for the k-point sampling. Only the atomic positions in a limited region corresponding to the 2 × 2 × 2 unit cells around the introduced proton are optimized, with all other atoms and the proton held fixed. The atomic positions are optimized until the residual forces converge to less than 0.02 eV/Å.
Fig. 2.4 (a) Calculated proton PES in the low-PE region below 0.3 eV in reference to the most stable point [12]. (b) Grid points at which the force norm acting on a proton is less than 0.2 eV/Å
2.2.2 Problem Statement
Figure 2.4(a) indicates that the partial PES of a proton in the low-PE region below 0.3 eV is necessary and sufficient to estimate the proton diffusivity and conductivity in the crystal lattice of BaZrO3. In the low-PE region, there are 353 grid points to be evaluated by DFT calculations, corresponding to the lowest 20% of the grid points. Therefore, the first task is to selectively sample all the low-PE grid points as efficiently as possible. Hereafter this is referred to as task 1. Task 2 is based on the force norm (FN) acting on a proton at each grid point. The FN is calculated along with the PE by the DFT calculations. In this task, the region of interest is defined as the grid points with an FN below a threshold (0.2 eV/Å in the present study), denoted by blue spheres in Fig. 2.4(b). There are only 15 grid points in the low-FN region in the asymmetric unit. The region of interest in task 2 is much smaller than that in task 1, which is expected to lead to more efficient sampling.
Prior to the detailed description of the proposed procedure in Sect. 2.3, this problem is generalized and mathematically formulated using the identification of the low-PE region as an example. There are N grid points, \( i = 1, \ldots ,N \), in the asymmetric unit of the host crystal lattice. The PE of a proton at grid point i is denoted by \( E_{i} \). Using the parameter \( 0 < \alpha < 1 \), the low-PE region is defined as the set of \( \alpha N \) points where the PEs are lower than those at the other \( (1 - \alpha )N \) points. The goal is to identify all \( \alpha N \) grid points in the low-PE region as efficiently as possible. For simplicity, α is assumed to be prespecified. However, it can be adaptively determined, as demonstrated in Sect. 2.4.3.
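The definition of the low-PE region can be made concrete with a short Python sketch. It assumes that all PEs are already known, which is exactly what the sampling procedure tries to avoid, so it only illustrates the target set. The function name, the toy energies, the rounding of \( \alpha N \) down to an integer, and the choice of the threshold as the largest PE inside the region are assumptions made for this illustration.

```python
def low_pe_region(energies, alpha):
    """Return (set of indices of the alpha*N lowest-PE points, PE threshold).

    Assumptions of this sketch: alpha*N is rounded down, and the threshold
    is taken as the highest PE found inside the low-PE region.
    """
    # Small epsilon guards against floating-point results such as 1.9999999.
    n_low = int(alpha * len(energies) + 1e-9)
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    low = order[:n_low]
    theta = energies[low[-1]]
    return set(low), theta


# Toy PEs (eV) on ten grid points; values are hypothetical.
toy_energies = [0.9, 0.1, 0.5, 0.0, 0.7, 0.3, 1.2, 0.2, 0.6, 0.4]
region, threshold = low_pe_region(toy_energies, alpha=0.2)
print(sorted(region), threshold)
```

With the grid used in this chapter (1768 points, α = 0.2), the same rounding gives the 353 low-PE points quoted in Sect. 2.2.2.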
where i′ is the sampled point in the step. When the termination criterion is satisfied, \( \hat{P}_{\alpha } \) has a high probability of containing all points in \( P_{\alpha } \). The estimate of \( \theta_{\alpha } \) is likewise denoted by \( \hat{\theta }_{\alpha } \). Section 2.3.3 shows how to estimate \( \theta_{\alpha } \) from the prespecified α. Note that the \( \theta_{\alpha } \) estimation is unnecessary in task 2 because the threshold is specified directly as an FN value.
2.3 GP-Based Selective Sampling Procedure
Here the proposed sampling procedure based on the GP is described using the PES-based task (task 1) as an example. Specifically, the key features are explained in the following subsections: the GP-based PE statistical model (Sect. 2.3.1), the selection criterion of the next grid point (Sect. 2.3.2), the estimation of the PE threshold (Sect. 2.3.3), and the criterion for sampling termination (Sect. 2.3.4). Note that the threshold estimation (Sect. 2.3.3) is irrelevant to task 2 for the low-FN identification.
2.3.1 Gaussian Process Models
At each step, the GP model of the PES is fitted based on \( \{ (\boldsymbol{\chi}_{i} ,E_{i} )\}_{i \in \hat{P}_{\alpha } } \), which is the set of points whose PEs have already been computed by DFT calculations in earlier steps.
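A minimal sketch of such a GP fit is given below in plain Python. It assumes a squared-exponential kernel with amplitude `sigma_f` and length scale `length`, a constant prior mean equal to the average of the sampled PEs, and a small diagonal jitter for numerical stability. These choices, and all function names, are assumptions of this sketch rather than the exact model of this chapter. The output is the predictive mean and variance at each query descriptor, which are the quantities needed by the selection and termination criteria.

```python
import math


def rbf_kernel(x1, x2, sigma_f=0.5, length=0.5):
    """Squared-exponential kernel (assumed form for this sketch)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return sigma_f ** 2 * math.exp(-sq_dist / (2.0 * length ** 2))


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def solve_lower(lower, b):
    """Forward substitution for L y = b."""
    y = []
    for i in range(len(b)):
        s = sum(lower[i][k] * y[k] for k in range(i))
        y.append((b[i] - s) / lower[i][i])
    return y


def solve_lower_transposed(lower, y):
    """Back substitution for L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(lower[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / lower[i][i]
    return x


def gp_posterior(x_train, e_train, x_query, jitter=1e-8):
    """Return a list of (mean, variance) pairs, one per query descriptor."""
    prior_mean = sum(e_train) / len(e_train)  # constant prior mean (assumption)
    n = len(x_train)
    gram = [[rbf_kernel(x_train[i], x_train[j]) + (jitter if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    lower = cholesky(gram)
    residual = [e - prior_mean for e in e_train]
    weights = solve_lower_transposed(lower, solve_lower(lower, residual))
    posterior = []
    for xq in x_query:
        k_star = [rbf_kernel(xq, xt) for xt in x_train]
        mean = prior_mean + sum(k * w for k, w in zip(k_star, weights))
        v = solve_lower(lower, k_star)
        variance = max(rbf_kernel(xq, xq) - sum(vi * vi for vi in v), 0.0)
        posterior.append((mean, variance))
    return posterior


# Two sampled points on a one-dimensional toy descriptor (hypothetical data).
print(gp_posterior([(0.0,), (1.0,)], [0.0, 1.0], [(0.5,), (10.0,)]))
```

Far from the sampled points the prediction falls back to the prior mean with variance `sigma_f ** 2`, while at a sampled point the variance collapses to nearly zero. This behaviour is what allows the model to distinguish explored from unexplored parts of the configuration space.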
2.3.2 Selection Criterion
where \( \phi (\cdot;\mu ,\sigma^{2} ) \) is the probability density function of \( N(\mu ,\sigma^{2} ) \). This study employs the second option, although the performance difference between Eqs. (2.10) and (2.11) is negligible in our experience.
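To illustrate how a criterion of this kind can be evaluated, the sketch below scores unsampled grid points from their GP predictive mean and standard deviation. Two standard choices from Bayesian optimization are shown: the probability that the PE lies below the threshold, and the expected amount by which the PE falls below the threshold, i.e., \( \int_{ - \infty }^{\theta } (\theta - E)\,\phi (E;\mu ,\sigma^{2} )\,{\text{d}}E \). These are assumptions of this sketch and may differ in detail from Eqs. (2.10) and (2.11); the function names and the candidate values are hypothetical.

```python
from statistics import NormalDist


def prob_below(mu, sigma, theta):
    """P(E <= theta) for E ~ N(mu, sigma^2); requires sigma > 0."""
    return NormalDist(mu, sigma).cdf(theta)


def expected_shortfall(mu, sigma, theta):
    """E[max(theta - E, 0)] for E ~ N(mu, sigma^2); requires sigma > 0.

    Closed form: (theta - mu) * Phi(z) + sigma * phi(z), z = (theta - mu) / sigma.
    """
    z = (theta - mu) / sigma
    standard = NormalDist()
    return (theta - mu) * standard.cdf(z) + sigma * standard.pdf(z)


def select_next(candidates, theta, score=expected_shortfall):
    """Pick the unsampled grid point with the highest score."""
    return max(candidates, key=lambda i: score(*candidates[i], theta))


# Hypothetical predictive (mean, standard deviation) at three unsampled points.
candidates = {4: (0.10, 0.05), 7: (0.40, 0.30), 9: (0.35, 0.02)}
print(select_next(candidates, theta=0.3))
```

Both scores favour points that are predicted to be low, but they also give credit to uncertain points such as point 7, whose mean is above the threshold yet whose large standard deviation leaves a sizeable chance of a low PE.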
2.3.3 PE Threshold
Contingency table defining TP, FP, FN, and TN
| | \( P_{\alpha } \) | \( N_{\alpha } \) |
|---|---|---|
| \( \hat{P}_{\alpha } \) | \( \# TP(\hat{\theta }_{\alpha } ) \) | \( \# FP(\hat{\theta }_{\alpha } ) \) |
| \( \hat{N}_{\alpha } \) | \( \# FN(\hat{\theta }_{\alpha } ) \) | \( \# TN(\hat{\theta }_{\alpha } ) \) |
- #TP: The number of sampled points in the low-PE region.
- #FP: The number of sampled points in the high-PE region.
- #FN: The number of not-yet-sampled points in the low-PE region.
- #TN: The number of not-yet-sampled points in the high-PE region.
The estimate of the threshold \( \hat{\theta }_{\alpha } \) is determined for each step so that it satisfies Eq. (2.12) where the quantities on the left-hand side are given by Eqs. (2.13) and (2.14).
2.3.4 Termination Criterion
can assess the badness of the sampled points. \( {\text{F}}\widehat{\text{N}}{\text{R}} \) in Eq. (2.15) can be interpreted as the proportion of low-PE points whose PEs have yet to be evaluated. At each step, \( \# {\text{TP}}(\hat{\theta }_{\alpha } ) \) is computed by Eq. (2.13) and \( \# {\text{FN}}(\hat{\theta }_{\alpha } ) \) is estimated by Eq. (2.14). Then, the sampling is terminated if \( {\text{F}}\widehat{\text{N}}{\text{R}} \) is close to zero (e.g., \( < 10^{ - 6} \)).
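The threshold estimation of Sect. 2.3.3 and the termination check can be sketched together in Python. The sketch rests on three assumptions that stand in for Eqs. (2.12) to (2.15): #TP is counted directly from the sampled PEs, #FN is estimated as the sum over unsampled points of the GP probability that the PE is below the threshold, and the threshold is chosen so that #TP + #FN equals \( \alpha N \). The estimated FNR is then taken as #FN / (#TP + #FN). All function names and numbers are hypothetical.

```python
from statistics import NormalDist


def count_tp(sampled_energies, theta):
    """Number of sampled points whose PE is at or below the threshold."""
    return sum(1 for e in sampled_energies if e <= theta)


def estimate_fn(unsampled_posteriors, theta):
    """Expected number of unsampled points below the threshold.

    Each entry is a GP predictive (mean, standard deviation) with sigma > 0.
    """
    return sum(NormalDist(mu, sigma).cdf(theta)
               for mu, sigma in unsampled_posteriors)


def estimate_threshold(sampled_energies, unsampled_posteriors, alpha,
                       lo, hi, tol=1e-6):
    """Bisection for the threshold at which #TP + #FN reaches alpha * N.

    lo and hi must bracket the solution; the count is non-decreasing in theta.
    """
    target = alpha * (len(sampled_energies) + len(unsampled_posteriors))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        total = count_tp(sampled_energies, mid) + estimate_fn(unsampled_posteriors, mid)
        if total < target:
            lo = mid
        else:
            hi = mid
    return hi


def estimated_fnr(sampled_energies, unsampled_posteriors, theta):
    """Estimated false negative rate #FN / (#TP + #FN)."""
    tp = count_tp(sampled_energies, theta)
    fn = estimate_fn(unsampled_posteriors, theta)
    return fn / (tp + fn) if tp + fn > 0 else 0.0


# Hypothetical state after a few steps: three sampled PEs, one unsampled point.
sampled = [0.0, 0.2, 1.0]
unsampled = [(0.3, 0.1)]
theta_hat = estimate_threshold(sampled, unsampled, alpha=0.625, lo=0.0, hi=1.5)
fnr_hat = estimated_fnr(sampled, unsampled, theta_hat)
print(theta_hat, fnr_hat, "terminate" if fnr_hat < 1e-6 else "continue")
```

In this toy state the unsampled point still has an even chance of lying below the threshold, so the estimated FNR is far from zero and sampling continues.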
2.4 Results of Selective Sampling
2.4.1 Low-PE Region Identification
The performances of several sampling procedures for α = 0.2 are compared in the low-PE region identification problem. Specifically, the following six sampling methods are assessed: (1) GP1(xyz), (2) GP2(xyz + 1st NNs), (3) GP3(xyz + prePES), (4) random, (5) prePES, and (6) ideal. The first three are the proposed GP-based selective sampling methods with different descriptors. In GP1, the 3D coordinates (\( x_{i} \), \( y_{i} \), and \( z_{i} \)) in the host crystal lattice are used as the descriptors of the ith point (denoted as xyz). In GP2, the first nearest neighbor (1st NN) distances to the Ba, Zr, and O atoms from each point are used as additional descriptors (denoted as 1st NNs). In GP3, a preliminary PES (denoted as prePES) is used as an additional descriptor. The preliminary PES means a rough but quick approximation of the PES obtained using less accurate but more efficient computational methods. For prePES, the PE values at all N points obtained by single-point DFT calculations are used. Random indicates naive random sampling, where a point is selected randomly at each step. prePES denotes a selective sampling method based only on the preliminary PES. Specifically, points are sequentially selected in ascending order of the preliminary PEs obtained by single-point DFT calculations. Finally, ideal indicates the ideal sampling method, which can only be realized when the actual PEs at all the points are known in advance.
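The two non-GP baselines that rely on a fixed ordering can be illustrated with a few lines of Python. The sketch counts how many PE evaluations an ordering needs before every low-PE point has been sampled. The toy values of the actual and preliminary PEs and the function name are hypothetical.

```python
def steps_to_find_all(order, positives):
    """Number of evaluations until every point in `positives` is sampled."""
    remaining = set(positives)
    for step, idx in enumerate(order, start=1):
        remaining.discard(idx)
        if not remaining:
            return step
    return None  # the ordering never covers all positives


# Toy data: actual PEs (with relaxation) and preliminary PEs (single point).
actual = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1]
preliminary = [0.0, 0.3, 0.9, 0.5, 1.2, 1.4]

indices = range(len(actual))
ideal_order = sorted(indices, key=lambda i: actual[i])
prepes_order = sorted(indices, key=lambda i: preliminary[i])
positives = set(ideal_order[:3])  # the three lowest actual PEs

print("ideal :", steps_to_find_all(ideal_order, positives))
print("prePES:", steps_to_find_all(prepes_order, positives))
```

The ideal ordering needs exactly as many evaluations as there are low-PE points. The prePES ordering needs more whenever the rank of a low-PE point in the preliminary PES is worse than its rank in the actual PES, as is the case for the third point in this toy data.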
In GP1 to GP3, two points are randomly selected to initialize the GP model. The average and the standard deviation over ten runs with different random seeds are discussed. The tuning parameters of the GP models are set to \( \sigma_{f} = l = 0.5 \). According to our preliminary experiments, the performances are insensitive to the tuning parameter choices.
Fig. 2.5 Efficiencies of the six sampling methods for α = 0.20 [12]. The number of grid points successfully sampled from the low-PE region (#TP) is plotted versus the number of PE evaluations by DFT (#TP + #FP)
Fig. 2.6 Selected grid points (gray dots) at 100, 200, 300, and 400 steps by the different sampling methods in the model crystal lattice of BaZrO3 for α = 0.20 [12]. The yellow surface in each plot is the isosurface corresponding to the PE threshold at α = 0.20
Figure 2.5 indicates that the preliminary PES obtained by single-point DFT calculations is highly valuable as a descriptor when it is used along with the three-dimensional coordinates (x, y, z) in GP modeling. However, the preliminary PES alone, as used in the prePES sampling, cannot identify the low-PE region efficiently. Its overall result is only slightly better than random sampling. In the earlier steps, the sampling curve of prePES almost overlaps with the ideal sampling curve, but it gradually deviates as the sampling proceeds. Eventually, 1479 steps are necessary to find all points in the low-PE region using prePES, which is 4.2 times as many as in the ideal sampling case (353 points). The inefficiency of prePES is attributed to the relationship between the DFT calculations with and without structural optimization.
Fig. 2.7 Rank correlation between the actual and the preliminary PEs [12]. Open circles and crosses show the grid points in \( P_{\alpha } \) and \( N_{\alpha } \), respectively. Blue and red symbols indicate sampled points at 400 steps in (a) prePES and (b) GP4, respectively. The GP4 method samples all the positive points at 400 steps with a small number of false positive points (i.e., sampled points not in the low-PE region). In (b), there are no false negative points
2.4.2 Low-FN Region Identification
The previous subsection demonstrates several types of sampling methods, which use different descriptors to identify the low-PE region. GP3, which employs descriptors of xyz and prePES, exhibits the best performance and is comparable to ideal sampling. However, the region of interest (i.e., the low-PE region) comprises 20% of the configuration space. Thus, the computational cost can be reduced by 80% at most.
To further reduce computational costs, it is necessary to redefine a smaller region of interest. The mean frequency of atomic or ionic jumps in a solid is determined mainly by the change in PE from the initial point to the saddle point. As both of these points can be mathematically defined as points with a zero gradient in the PES, the region of interest can be redefined as the region where the force norm (FN) acting on a proton is small. In this model case, the FN threshold is set to 0.2 eV/Å, which leads to 15 grid points in the low-FN region (see Fig. 2.4(b)).
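The FN-based region of interest is straightforward to express in code. The following Python sketch computes the force norm as the Euclidean norm of the force vector on the proton and keeps the grid points below the 0.2 eV/Å threshold. The force vectors and function names are hypothetical and serve only as an illustration.

```python
import math


def force_norm(force):
    """Euclidean norm of a force vector (eV/Angstrom)."""
    return math.sqrt(sum(component ** 2 for component in force))


def low_fn_points(forces, threshold=0.2):
    """Indices of grid points whose force norm is below the threshold."""
    return [i for i, force in enumerate(forces) if force_norm(force) < threshold]


# Hypothetical forces on a proton at four grid points (eV/Angstrom).
forces = [
    (0.00, 0.00, 0.00),  # stationary point (minimum or saddle point)
    (0.10, 0.10, 0.10),  # norm of about 0.17, inside the region
    (0.30, 0.00, 0.00),  # norm of 0.30, outside
    (0.12, 0.12, 0.12),  # norm of about 0.21, just outside
]
print(low_fn_points(forces))
```

Because both local minima and saddle points have a vanishing force, this criterion captures the points that determine the potential barriers without having to distinguish between them beforehand.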
The efficiencies of four sampling methods are compared for the low-FN region identification problem: (1) GP4(xyz), (2) GP5(xyz + preFNS), (3) preFNS, and (4) ideal. GP4(xyz) and GP5(xyz + preFNS) are GP-based selective sampling procedures where the three-dimensional coordinates (x, y, z) and/or the preliminary FNS (denoted as “preFNS”) are used as descriptors. The preliminary FNS consists of the FN values at all N points computed by single-point DFT calculations and is expected to contribute substantially to the sampling performance. The preFNS method indicates a selective sampling where the grid points are sequentially selected in ascending order of the preliminary FNs. The average and the standard deviation over ten runs with different random seeds are discussed for GP4(xyz) and GP5(xyz + preFNS). The tuning parameters of the GP models, \( \sigma_{f} \) and l, are optimized for each method.
Fig. 2.8 Efficiencies of the four FN-based sampling methods: GP4(xyz), GP5(xyz + preFNS), preFNS, and ideal. The number of grid points successfully sampled from the low-FN region (#TP) is plotted versus the number of FN computations by DFT (#TP + #FP). The green line is the result in GP5 using the 16 lowest FN points in preFNS as the initial grid points
Fig. 2.9 Rank correlation between the actual and the preliminary FNs. Red open circles and crosses show grid points in \( P_{\alpha } \) and \( N_{\alpha } \), respectively. Blue and black crosses denote false positive and true negative points at step 955 in the preFNS sampling method, respectively
Fig. 2.10 Step numbers at which each of the low-FN grid points is sampled in ten runs of (a) GP4 and (b) GP5 (red crosses). Green open diamonds in (b) are the results in GP5 using the 16 lowest FN points in preFNS (bottom 1%) as the initial grid points. (c) The two most difficult grid points to sample among the low-FN grid points (No. 6: black spheres, No. 8: white spheres)
To overcome this difficulty, information on the preliminary FNS is exploited not only as a descriptor in the FNS statistical model but also for choosing the initial grid points. Specifically, the initial grid points for the GP-based methods are not selected randomly, but in ascending order of the preliminary FNs. The green open diamonds in Fig. 2.10(b) show the results of GP5 sampling using the 16 lowest FN points in the preFNS as the initial grid points. The two grid points (Nos. 6 and 8) are sampled at step 2 and step 86, respectively, resulting in 95 DFT computations to sample all low-FN points (see the green line in Fig. 2.8). Thus, fully exploiting information about the preliminary FNS can improve the sampling performance.
2.4.3 Practical Issues
Here two critical issues that limit practicality are discussed for the low-PE region identification (task 1): (1) when to terminate sampling and (2) how to determine α, which sets the PE threshold.
Fig. 2.11 Profiles of the estimated (a) FNRs and (b) PE thresholds for GP3 sampling [12]
Fig. 2.12 (a) Efficiency of GP3(xyz + prePES) sampling when α is increased in a stepwise manner from 0.05 to 0.20 in 0.05 increments [12]. The number of grid points successfully sampled from the low-PE region (#TP) is plotted versus the number of DFT computations (#TP + #FP). (b), (c) Profiles of the estimated FNRs and PE thresholds versus the number of DFT computations [12]
Figure 2.12(b) indicates that the estimated FNR converges slightly more slowly than the ground truth FNR in the first stage with α = 0.05. This is why more than 250 DFT computations are required to ensure that all points in \( P_{0.05} \) are successfully sampled. On the other hand, when α = 0.10, 0.15, or 0.20, the estimated FNRs converge almost as fast as the ground truth FNRs. It should be noted that the number of true positive points abruptly increases when the α value is switched, indicating that some of the positive points for the higher α have already been sampled in earlier steps. Although this stepwise strategy is less efficient than directly specifying α = 0.20, it is much more efficient than the prePES and random sampling methods.
2.5 Conclusions
In this chapter, a machine learning-based selective sampling procedure for PES evaluation is introduced and applied to proton conduction in BaZrO3 to demonstrate its efficacy. The region of interest governing the ionic conduction is defined in two ways: (1) a low-PE region and (2) a low-FN region.
For the low-PE region, the performance of the selective sampling based on the GP model greatly depends on the descriptors. Employing the preliminary PES (prePES), which is evaluated by single-point DFT computations in a smaller supercell, is highly effective. The GP3(xyz + prePES) sampling requires 394 DFT computations to sample all the low-PE grid points (353 points) in a grid with 1768 points for the asymmetric unit of the BaZrO3 crystal. This is a 78% reduction in the computational costs. However, the defined region of interest, i.e., the low-PE region, comprises 20% of the configuration space. Consequently, the reducible computational cost is limited to 80%.
The region of interest should therefore be redefined so that it occupies a smaller part of the configuration space. For the low-FN region, the region of interest contains only 15 grid points, which amount to less than 1% of the configuration space. Among the several sampling methods to identify the low-FN region, GP5(xyz + preFNS) shows the best performance. It requires only 116 DFT computations to identify all grid points in the low-FN region. Furthermore, the computational cost can be further reduced to 95 DFT computations using the 16 lowest FN grid points in the preFNS as the initial points. This means that exploiting the information on the preFNS can reduce the computational cost by 95%.
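The percentages quoted in this section follow directly from the numbers of DFT computations relative to the 1768 grid points in the asymmetric unit. The short Python check below reproduces them; only the variable and function names are introduced here, and all counts are those stated in the text.

```python
N_GRID = 1768  # grid points in the asymmetric unit


def cost_reduction(n_dft, n_total=N_GRID):
    """Fraction of DFT computations saved relative to evaluating every point."""
    return 1.0 - n_dft / n_total


cases = [
    ("GP3(xyz + prePES), low-PE region", 394),
    ("GP5(xyz + preFNS), low-FN region", 116),
    ("GP5 with 16 preFNS-selected initial points", 95),
]
for label, n_dft in cases:
    print(f"{label}: {100 * cost_reduction(n_dft):.1f}% reduction")
# Prints 77.7%, 93.4%, and 94.6%, i.e., the rounded 78% and 95% quoted above.

# Upper bound for task 1: even ideal sampling must evaluate all 353 points.
print(f"ideal, low-PE region: {100 * cost_reduction(353):.1f}% reduction")

# prePES sampling needed 1479 steps versus 353 for ideal sampling.
print(f"prePES / ideal: {1479 / 353:.1f}")
```

The last two lines make the bound explicit: with 20% of the grid inside the low-PE region, no method can save more than about 80%, which is why shrinking the region of interest to the 15 low-FN points pays off.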
Thus, preliminary information (i.e., prePES and preFNS) significantly contributes to the sampling performance. Therefore, a machine learning-based approach hybridized with a low-cost PES and/or FNS evaluation should be a solid methodology for preferential PES evaluation in the region of interest. In addition, using the FNR, which is defined in Eq. (2.15), solves two critical issues, which are when to terminate sampling and how to determine an appropriate α value (equivalent to the PE threshold).
Notes
Acknowledgements
We recognize Mr. Daisuke Hirano and Mr. Makoto Otsubo for their contributions and Dr. Atsuto Seko for the insightful comments and suggestions. This work is financially supported by JSPS KAKENHI (Grant Nos. 25106002 and 26106513).
References
- 1. S. Glasstone, K.J. Laidler, H. Eyring, The Theory of Rate Processes (McGraw-Hill, New York, 1941)
- 2. G.H. Vineyard, J. Phys. Chem. Solids 3, 121 (1957)
- 3. K. Toyoura, Y. Koyama, A. Kuwabara, F. Oba, I. Tanaka, Phys. Rev. B 78, 214303 (2008)
- 4. T. Vegge, Phys. Rev. B 70, 035412 (2004)
- 5. A. Van der Ven, G. Ceder, Phys. Rev. Lett. 94, 045901 (2005)
- 6. C.O. Hwang, J. Chem. Phys. 125, 226101 (2006)
- 7. L.T. Kong, L.J. Lewis, Phys. Rev. B 74, 073412 (2006)
- 8. K. Toyoura, Y. Koyama, A. Kuwabara, I. Tanaka, J. Phys. Chem. C 114, 2375 (2010)
- 9. G. Henkelman, B.P. Uberuaga, H. Jonsson, J. Chem. Phys. 113, 9978 (2000)
- 10. G. Henkelman, B.P. Uberuaga, H. Jonsson, J. Chem. Phys. 113, 9901 (2000)
- 11. K. Toyoura, J. Terasaka, A. Nakamura, K. Matsunaga, J. Phys. Chem. C 121, 1578 (2017)
- 12. K. Toyoura, D. Hirano, A. Seko, M. Shiga, A. Kuwabara, M. Karasuyama, K. Shitara, I. Takeuchi, Phys. Rev. B 93, 054112 (2016)
- 13. C.E. Rasmussen, C.K.I. Williams, Gaussian Processes for Machine Learning (The MIT Press, Cambridge, 2006)
- 14. M.L. Stein, Interpolation of Spatial Data: Some Theory for Kriging (Springer Science & Business Media, New York, 2012)
- 15. H. Iwahara, T. Yajima, T. Hibino, K. Ozaki, H. Suzuki, Solid State Ionics 61, 65 (1993)
- 16. W. Münch, K.-D. Kreuer, G. Seifert, J. Maier, Solid State Ionics 136–137, 183 (2000)
- 17. M.S. Islam, J. Mater. Chem. 10, 1027 (2000)
- 18. M.E. Björketun, P.G. Sundell, G. Wahnström, D. Engberg, Solid State Ionics 176, 3035 (2005)
- 19. P.E. Blöchl, Phys. Rev. B 50, 17953 (1994)
- 20. G. Kresse, J. Hafner, Phys. Rev. B 48, 13115 (1993)
- 21. G. Kresse, J. Furthmüller, Comput. Mater. Sci. 6, 15 (1996)
- 22. G. Kresse, D. Joubert, Phys. Rev. B 59, 1758 (1999)
- 23. J.P. Perdew, K. Burke, M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996)
- 24. J. Mockus, J. Global Optim. 4, 347 (1994)
- 25. E. Brochu, V.M. Cora, N. De Freitas (2010), arXiv:1012.2599
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.











