Nanoinformatics, pp. 25–44
Potential Energy Surface Mapping of Charge Carriers in Ionic Conductors Based on a Gaussian Process Model
Abstract
The potential energy surface (PES) of a charge carrier in a host crystal is an important concept to fundamentally understand ionic conduction. Such PES evaluations, especially by density functional theory (DFT) calculations, generally require vast computational costs. This chapter introduces a novel selective sampling procedure to preferentially evaluate the partial PES characterizing ionic conduction. This procedure is based on a machine learning method called the Gaussian process (GP), which reduces computational costs for PES evaluations. During the sampling procedure, a statistical model of the PES is constructed and sequentially updated to identify the region of interest characterizing ionic conduction in configuration space. Its efficacy is demonstrated using a model case of proton conduction in a well-known proton-conducting oxide, barium zirconate (BaZrO_{3}) with the cubic perovskite structure. The proposed procedure efficiently evaluates the partial PES in the region of interest that characterizes proton conduction in the host crystal lattice of BaZrO_{3}.
Keywords
Gaussian process, Bayesian optimization, Ionic conduction, Potential energy surface
2.1 Introduction
Atomic transport phenomena in solids such as atomic diffusion and ionic conduction are generally governed by thermally activated processes. Based on transition state theory (TST) [1, 2, 3], the mean frequency of an elementary process (ν) with a single saddle point state, a so-called atomic or ionic jump, is approximated by \( \nu = \nu_{0} \exp ( - \Delta E^{\text{mig}} /k_{\text{B}} T) \), where ν _{0} is the vibrational prefactor, k _{B} is the Boltzmann constant, T is the temperature, and ΔE ^{mig} is the potential barrier, i.e., the change in the potential energy (PE) from the initial state to the saddle point state. ν _{0} is typically a constant value in the range of 10^{12}–10^{13} s^{−1} associated with a lattice vibration [3, 4, 5, 6, 7, 8]. Consequently, ΔE ^{mig} mainly determines the rate of an atomic jump in a solid.
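To illustrate how strongly the barrier controls the jump rate, the TST expression above can be evaluated numerically. The following is a minimal sketch; the function name, the prefactor of 10^{13} s^{−1} (taken from the upper end of the typical range quoted above), the temperature of 600 K, and the barrier values are illustrative assumptions, not results from this chapter.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def jump_frequency(barrier_ev, temperature_k, prefactor_hz=1.0e13):
    """Mean jump frequency nu = nu0 * exp(-dE_mig / (kB * T))."""
    return prefactor_hz * math.exp(-barrier_ev / (K_B_EV_PER_K * temperature_k))


# Illustrative barriers only (assumed values, not taken from the chapter).
for barrier in (0.2, 0.3, 0.5):
    nu = jump_frequency(barrier, temperature_k=600.0)
    print(f"dE_mig = {barrier:.1f} eV -> nu = {nu:.3e} s^-1")
```

Because the barrier enters through the exponential, a change of 0.1 eV shifts the frequency by a factor of exp(0.1 eV / k _{B}T), whereas the prefactor only varies within about one order of magnitude.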
In general, atomic transfer is composed of several types of atomic jumps, which form a complicated three-dimensional (3D) network in the crystal lattice. Therefore, it is necessary to grasp the entire potential energy surface (PES) of a mobile atom or ion. However, a theoretical PES evaluation, e.g., based on density functional theory (DFT), generally requires huge computational costs, particularly in the case of a host crystal with a low crystallographic symmetry. The nudged elastic band (NEB) method [9, 10] is a well-established technique that avoids evaluating the entire PES by focusing only on the minimum energy paths (MEPs) in the PES. Because of its efficiency and versatility, the NEB method is conventionally used to clarify the atomic-scale picture and the kinetics of atomic transfer in crystals.
However, the NEB method has some practical limitations. First, the initial and final states of all elementary paths in a crystal must be specified. That is, all local energy minima in the crystal and all conceivable elementary paths between adjacent local energy minima must be known in advance. As the crystallographic symmetry of the host crystal decreases, the numbers of local energy minima and conceivable elementary paths increase rapidly. Consequently, satisfying the requirements of the NEB method is very difficult without a priori information on the entire PES. In cases without a priori information, physical and chemical knowledge (e.g., ionic radii, chemical bonding states, electrostatic interaction, and interstitial and bottleneck sizes) is generally used. However, a key elementary path determining the rate of atomic diffusion or ionic conduction is sometimes missed in such an arbitrary manner. In addition, the NEB method requires huge computational costs for low-symmetry crystals, even if only the MEPs in the PES are evaluated. For example, in our recent study on proton conduction in tin pyrophosphate (SnP_{2}O_{7}) with the space group P2_{1}/c, we evaluated 143 possible elementary paths connecting 15 local energy minima by the NEB method [11]. An alternative method that is both robust and efficient is desirable to analyze complicated atomic transfers consisting of many elementary paths in a low-symmetry crystal.
The proposed sampling procedure has three key features. (1) A statistical model of the PES or FNS is developed as a Gaussian process (GP) [13, 14]. The statistical model is iteratively updated by repeatedly (i) sampling at a point where the predicted PE or FN is low and (ii) incorporating the newly calculated PE or FN value at the sampled point. (2) The statistical PES or FNS model is used to identify the subset of grid points at which the PEs or FNs are relatively low. Here a selection criterion is introduced for this advanced purpose, because GP applications have generally targeted the single global minimum or maximum point (not a subset). (3) The procedure allows us to estimate how many points in the region of interest remain unsampled, i.e., lets us know when sampling should be terminated.
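The iterative update in feature (1) can be sketched on a toy problem. The code below is only an illustration under stated assumptions: the squared-exponential kernel, the zero-mean GP on centered data, the jitter term, the "mean minus standard deviation" score used to pick the next point, and the one-dimensional `toy_energy` function standing in for a DFT calculation are all hypothetical choices, not the chapter's actual model or selection criterion. The two random initial points and the tuning parameters σ _{f} = l = 0.5 follow the settings reported later in Sect. 2.4.1.

```python
import numpy as np


def rbf_kernel(a, b, sigma_f, length):
    """Squared-exponential kernel between two sets of row vectors."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f**2 * np.exp(-0.5 * d2 / length**2)


def gp_posterior(x_train, y_train, x_all, sigma_f=0.5, length=0.5, jitter=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP on x_all."""
    k_tt = rbf_kernel(x_train, x_train, sigma_f, length) + jitter * np.eye(len(x_train))
    k_at = rbf_kernel(x_all, x_train, sigma_f, length)
    mean = k_at @ np.linalg.solve(k_tt, y_train)
    reduction = np.einsum("ij,ji->i", k_at, np.linalg.solve(k_tt, k_at.T))
    var = np.clip(sigma_f**2 - reduction, 0.0, None)
    return mean, np.sqrt(var)


def toy_energy(x):
    """Cheap stand-in for a DFT evaluation (hypothetical 1D landscape)."""
    return np.sin(3.0 * x[:, 0]) + 0.5 * x[:, 0] ** 2


def selective_sampling(x_grid, energy_fn, n_steps, seed=0):
    """Sequentially evaluate grid points where the model predicts low values."""
    rng = np.random.default_rng(seed)
    # Two randomly chosen grid points initialize the model.
    sampled = [int(i) for i in rng.choice(len(x_grid), size=2, replace=False)]
    energies = {i: float(energy_fn(x_grid[[i]])[0]) for i in sampled}
    for _ in range(n_steps):
        idx = np.array(sampled)
        y = np.array([energies[i] for i in sampled])
        offset = y.mean()  # center the data so a zero-mean GP is reasonable
        mean, std = gp_posterior(x_grid[idx], y - offset, x_grid)
        score = (mean + offset) - std  # optimistic estimate (assumed criterion)
        score[idx] = np.inf            # never resample an evaluated point
        nxt = int(np.argmin(score))
        energies[nxt] = float(energy_fn(x_grid[[nxt]])[0])  # "DFT" call
        sampled.append(nxt)
    return sampled, energies


x_grid = np.linspace(-2.0, 2.0, 41)[:, None]
sampled, energies = selective_sampling(x_grid, toy_energy, n_steps=10)
print(len(sampled), "points evaluated; lowest value found:", min(energies.values()))
```

The posterior standard deviation is what distinguishes this loop from a greedy search on the predicted mean: points that are far from any evaluated point keep a large uncertainty and therefore remain candidates.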
The uncertainty in the GP model is also useful to determine when to terminate sampling. The termination criterion should be determined based on the existence probability of unsampled low-PE points, for which the information on the uncertainty is indispensable. As a model case, herein the efficacy of the proposed procedure is demonstrated using proton conduction in a proton-conducting oxide, barium zirconate (BaZrO_{3}) [15, 16, 17, 18].
2.2 Problem Setup
2.2.1 Entire Proton PES in BaZrO_{3}
The DFT calculations for the PES (and FNS) evaluation of a proton in BaZrO_{3} are based on the projector augmented wave (PAW) method as implemented in the VASP code [19, 20, 21, 22]. The generalized gradient approximation (GGA) parameterized by Perdew, Burke, and Ernzerhof is used for the exchange-correlation term [23]. The 5s, 5p, and 6s orbitals for Ba, 4s, 4p, 5s, and 4d for Zr, 2s and 2p for O, and 1s for H are treated as valence states. The supercell consisting of 3 × 3 × 3 unit cells (135 atoms) is used with a 2 × 2 × 2 mesh for the k-point sampling. Only the atomic positions in a limited region corresponding to the 2 × 2 × 2 unit cells around the introduced proton are optimized, while all other atoms and the proton are kept fixed. The atomic positions are optimized until the residual forces converge to less than 0.02 eV/Å.
2.2.2 Problem Statement
Figure 2.4(a) indicates that the partial PES of a proton in the low-PE region below 0.3 eV is necessary and sufficient to estimate the proton diffusivity and conductivity in the crystal lattice of BaZrO_{3}. In the low-PE region, there are 353 grid points to be evaluated by DFT calculations, corresponding to the lowest 20% of the grid points. Therefore, the first task is to selectively sample all the low-PE grid points as efficiently as possible. Hereafter this is referred to as task 1. Task 2 is based on the force norm (FN) acting on a proton at each grid point. The FN is calculated along with the PE by the DFT calculations. In this task, the region of interest is defined as grid points with an FN below a threshold (i.e., 0.2 eV/Å in the present study), denoted by blue spheres in Fig. 2.4(b). There are only 15 grid points in the low-FN region in the asymmetric unit. The region of interest in task 2 is much smaller than that in task 1, hopefully leading to more efficient sampling.
Prior to the detailed description of the proposed procedure in Sect. 2.3, this problem is generalized and mathematically formulated using the identification of the low-PE region as an example. There are N grid points, \( i = 1, \ldots ,N \), in the asymmetric unit of the host crystal lattice. The PE of a proton at grid point i is denoted by E _{ i }. Using the parameter 0 < α < 1, the low-PE region is defined as the set of αN points where the PEs are lower than those at other (1−α)N points. The goal is to identify all αN grid points in the low-PE region as efficiently as possible. For simplicity, α is assumed to be prespecified. However, it can be adaptively determined, as demonstrated in Sect. 2.4.3.
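The definition of the low-PE region can be made concrete with a short sketch. It assumes that all PEs are known (which is only true in hindsight), that αN is rounded down (an assumption that reproduces the 353 points quoted for α = 0.2 on a grid of 1768 points), and that the threshold is taken as the highest PE inside the region; the function name and the synthetic energies are hypothetical.

```python
import numpy as np


def low_pe_region(energies, alpha):
    """Return the indices of the alpha*N lowest-PE points and a PE threshold."""
    n_low = int(alpha * len(energies))        # floor of alpha*N (assumed rounding)
    order = np.argsort(energies, kind="stable")
    region = np.sort(order[:n_low])           # grid indices in the low-PE region
    theta = energies[order[n_low - 1]]        # highest PE inside the region
    return region, theta


# Synthetic energies standing in for the true PES (not data from the chapter).
rng = np.random.default_rng(0)
energies = rng.uniform(0.0, 2.0, size=1768)
region, theta = low_pe_region(energies, alpha=0.2)
print(len(region), "grid points in the low-PE region")
```

In the actual procedure the PEs are not known in advance, so both the region and the threshold have to be estimated from the statistical model as sampling proceeds.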
where i′ is the sampled point in the step. When the termination criterion is satisfied, \( \hat{P}_{\alpha } \) has a high probability of containing all points in P _{ α }. The estimated θ _{ α } is also defined as \( \hat{\theta }_{\alpha } \). Section 2.3.3 shows how to estimate θ _{ α } from the prespecified α. Note that the θ _{ α } estimation is unnecessary in task 2 because the FN threshold is directly specified by the FN value.
2.3 GP-Based Selective Sampling Procedure
Here the proposed sampling procedure based on the GP is described using the PES-based task (task 1) as an example. Specifically, the key features are explained in the following subsections: the GP-based PE statistical model (Sect. 2.3.1), the selection criterion of the next grid point (Sect. 2.3.2), the estimation of the PE threshold (Sect. 2.3.3), and the criterion for sampling termination (Sect. 2.3.4). Note that the threshold estimation (Sect. 2.3.3) is irrelevant to task 2 for the low-FN identification.
2.3.1 Gaussian Process Models
At each step, the GP model of PES is fitted based on \( \{ ({\varvec{\upchi}}_{i} ,E_{i} )\}_{{i \in \hat{P}_{\alpha } }} \), which is the set of points whose PEs have already been computed by DFT calculations in earlier steps.
2.3.2 Selection Criterion
where \( \phi (\cdot;\mu ,\sigma^{2} ) \) is the probability density function of \( N(\mu ,\sigma^{2} ) \). This study employs the second option, although the performance difference between Eqs. (2.10) and (2.11) is negligible in our experience.
2.3.3 PE Threshold
Contingency table defining TP, FP, FN, and TN

| | P _{ α } | N _{ α } |
|---|---|---|
| \( \hat{P}_{\alpha } \) | \( \# TP(\hat{\theta }_{\alpha } ) \) | \( \# FP(\hat{\theta }_{\alpha } ) \) |
| \( \hat{N}_{\alpha } \) | \( \# FN(\hat{\theta }_{\alpha } ) \) | \( \# TN(\hat{\theta }_{\alpha } ) \) |

#TP: The number of sampled points in the low-PE region.
#FP: The number of sampled points in the high-PE region.
#FN: The number of not-yet-sampled points in the low-PE region.
#TN: The number of not-yet-sampled points in the high-PE region.
In this table, FN stands for false negatives, not for the force norm.
The estimate of the threshold \( \hat{\theta }_{\alpha } \) is determined for each step so that it satisfies Eq. (2.12) where the quantities on the left-hand side are given by Eqs. (2.13) and (2.14).
2.3.4 Termination Criterion
can assess the badness of the sampled points. \( {\text{F}}\widehat{\text{N}}{\text{R}} \) in Eq. (2.15) can be interpreted as the proportion of points where the PEs have yet to be evaluated. At each step, \( \# {\text{TP}}(\hat{\theta }_{\alpha } ) \) is computed by Eq. (2.13) and \( \# {\text{FN}}(\hat{\theta }_{\alpha } ) \) is estimated by Eq. (2.14). Then, the sampling is terminated if \( {\text{F}}\widehat{\text{N}}{\text{R}} \) is close to zero (e.g., <10^{−6}).
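A termination check of this kind could look as follows. This is a sketch under explicit assumptions, since Eqs. (2.13) to (2.15) are not reproduced here: #TP is counted as the number of sampled points whose PE does not exceed the estimated threshold, #FN is estimated as the sum over unsampled points of the GP-predicted probability that the PE lies below the threshold, and the estimated FNR is taken in its standard form #FN / (#TP + #FN). All function and variable names are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of N(mean, std^2) evaluated at x."""
    std = max(std, 1e-12)  # guard against a zero predictive deviation
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def estimated_fnr(sampled_energies, unsampled_means, unsampled_stds, theta):
    """Estimated false negative rate #FN / (#TP + #FN) (assumed form)."""
    n_tp = sum(1 for e in sampled_energies if e <= theta)
    # Expected number of unsampled points below the threshold under the model.
    n_fn = sum(normal_cdf(theta, m, s)
               for m, s in zip(unsampled_means, unsampled_stds))
    total = n_tp + n_fn
    return n_fn / total if total > 0 else 1.0


def should_terminate(fnr, tolerance=1e-6):
    """Stop sampling once the estimated FNR is close to zero."""
    return fnr < tolerance


# Hypothetical numbers: every unsampled point is predicted far above theta.
fnr = estimated_fnr([0.10, 0.20, 0.25], [1.0, 1.2], [0.05, 0.05], theta=0.3)
print(fnr, should_terminate(fnr))
```

The check depends on the predictive standard deviation: an unsampled point with a high predicted PE but a large uncertainty still contributes to the estimated #FN and delays termination.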
2.4 Results of Selective Sampling
2.4.1 Low-PE Region Identification
The performances of several sampling procedures for α = 0.2 are compared in the low-PE region identification problem. Specifically, the following six sampling methods are assessed: (1) GP1(xyz), (2) GP2(xyz + 1st NNs), (3) GP3(xyz + pre-PES), (4) random, (5) pre-PES, and (6) ideal. The first three are the proposed GP-based selective sampling methods with different descriptors. In GP1, the 3D coordinates (x _{ i }, y _{ i }, and z _{ i }) in the host crystal lattice are used as the descriptors of the ith point (denoted as xyz). In GP2, the first nearest neighbor (1st NN) distances to the Ba, Zr, and O atoms from each point are used as additional descriptors (denoted as 1st NNs). In GP3, a preliminary PES (denoted as pre-PES) is used as an additional descriptor. The preliminary PES means a rough but quick approximation of the PES obtained using less accurate but more efficient computational methods. For pre-PES, the PE values at all N points obtained by single-point DFT calculations are used. Random indicates naive random sampling, where a point is selected randomly at each step. pre-PES denotes a selective sampling method based only on the preliminary PES. Specifically, points are sequentially selected in ascending order of the preliminary PEs obtained by single-point DFT calculations. Finally, ideal indicates the ideal sampling method, which can only be realized when the actual PEs at all the points are known in advance.
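As an illustration of the GP2 descriptors, the following sketch builds a six-component descriptor (xyz plus the 1st NN distances to Ba, Zr, and O) for a grid point in an ideal cubic perovskite cell. Several choices are assumptions made only for this example: the conventional setting with Ba at the cell corner, Zr at the body center, and O at the face centers, an approximate lattice constant of 4.2 Å, fractional coordinates for the xyz part, and the unrelaxed host lattice. The chapter's actual origin, lattice constant, and relaxed geometry may differ.

```python
import numpy as np

# Ideal cubic perovskite sites in fractional coordinates (assumed setting).
SITES = {
    "Ba": np.array([[0.0, 0.0, 0.0]]),
    "Zr": np.array([[0.5, 0.5, 0.5]]),
    "O": np.array([[0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]]),
}


def first_nn_distance(frac_point, frac_sites, a):
    """Shortest distance (in the units of a) from a point to a set of sites."""
    delta = frac_sites - frac_point
    delta -= np.round(delta)  # minimum-image convention, valid for a cubic cell
    return a * np.linalg.norm(delta, axis=1).min()


def gp2_descriptor(frac_point, a=4.2):
    """Descriptor vector: (x, y, z, d_Ba, d_Zr, d_O)."""
    p = np.asarray(frac_point, dtype=float)
    distances = [first_nn_distance(p, SITES[s], a) for s in ("Ba", "Zr", "O")]
    return np.concatenate([p, distances])


print(gp2_descriptor([0.5, 0.5, 0.25]))
```

The distance components are invariant to the symmetry operations of the host lattice, so they carry chemical information that the raw coordinates alone do not.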
In GP1 to GP3, two points are randomly selected to initialize the GP model. The average and the standard deviation over ten runs with different random seeds are discussed. The tuning parameters of the GP models are set to σ _{f} = l = 0.5. According to our preliminary experiments, the performances are insensitive to the tuning parameter choices.
Figure 2.5 indicates that the preliminary PES obtained by single-point DFT calculations is highly valuable as a descriptor when it is used along with three-dimensional coordinates (x, y, z) in GP modeling. However, using the preliminary PES alone cannot identify the low-PE region in the pre-PES sampling. The results are only slightly better than random. In the earlier steps, the sampling curve of pre-PES almost overlaps with the ideal sampling curve, but it gradually deviates as the sampling proceeds. Eventually, 1479 steps are necessary to find all points in the low-PE region using pre-PES. This is a 4.2-fold increase in the number of steps compared with the ideal sampling case (353 points). The inefficiency of pre-PES is attributed to the relationship between the DFT calculations with and without structural optimization.
2.4.2 Low-FN Region Identification
The previous subsection demonstrates several types of sampling methods, which use different descriptors to identify the low-PE region. GP3, which employs descriptors of xyz and pre-PES, exhibits the best performance and is comparable to ideal sampling. However, the region of interest (i.e., the low-PE region) comprises 20% of the configuration space. Thus, the computational cost can be reduced by 80% at most.
To further reduce computational costs, it is necessary to redefine a smaller region of interest. The mean frequency of atomic or ionic jumps in a solid is determined mainly by the change in PE from the initial point to the saddle point. As both of these points can be mathematically defined as points with a zero gradient in the PES, the region of interest can be redefined as the region where the force norm (FN) acting on a proton is small. In this model case, the FN threshold is set to 0.2 eV/Å, which leads to 15 grid points in the low-FN region (see Fig. 2.4b).
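The low-FN criterion amounts to thresholding the Euclidean norm of the force vector at each grid point. A minimal sketch follows; the function name and the three force vectors are hypothetical placeholders, and only the 0.2 eV/Å threshold comes from the text.

```python
import numpy as np


def low_fn_points(forces, threshold=0.2):
    """Indices of grid points whose force norm (eV/Å) is below the threshold."""
    norms = np.linalg.norm(forces, axis=1)  # FN = |F| at each grid point
    return np.flatnonzero(norms < threshold), norms


# Placeholder force vectors in eV/Å (not data from the chapter).
forces = np.array([
    [0.10, 0.00, 0.00],
    [0.30, 0.40, 0.00],
    [0.10, 0.10, 0.10],
])
indices, norms = low_fn_points(forces)
print(indices, norms)
```

Since the force is the negative gradient of the PES, both local minima and saddle points satisfy this criterion, which is why the low-FN region captures exactly the points needed to evaluate potential barriers.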
The efficiencies of four sampling methods are compared for the low-FN region identification problem: (1) GP4(xyz), (2) GP5(xyz + pre-FNS), (3) pre-FNS, and (4) ideal. GP4(xyz) and GP5(xyz + pre-FNS) are GP-based selective sampling procedures where the three-dimensional coordinates (x, y, z) and/or the preliminary FNS (denoted as “pre-FNS”) are used as descriptors. The preliminary FNS is the FN values at all N points computed by single-point DFT calculations, which should have a higher contribution to the sampling performance. The pre-FNS method indicates a selective sampling where the grid points are sequentially selected in the ascending order of the preliminary FNs. The average and the standard deviation over ten runs with different random seeds are discussed for GP4(xyz) and GP5(xyz + pre-FNS). The tuning parameters of the GP models σ _{f} and l are optimized for each method.
To overcome this difficulty, information on the preliminary FNS is exploited not only as a descriptor in the FNS statistical model but also in the choice of the initial grid points. Specifically, the initial grid points for the GP-based methods are not selected randomly, but in the ascending order of the preliminary FNs. The green open diamonds in Fig. 2.10b show the results by GP5 sampling using the 16 lowest FN points in the pre-FNS as the initial grid points. The two grid points (Nos. 6 and 8) are sampled at step 2 and step 86, respectively, resulting in 95 DFT computations to sample all low-FN points (see the green line in Fig. 2.8). Thus, fully exploiting information about the preliminary FNS can improve the sampling performance.
2.4.3 Practical Issues
Here two critical issues, which limit practicality, are discussed in the case of the low-PE region identification (task 1): (1) when to terminate sampling and (2) how to determine the PE threshold α.
Figure 2.12(b) indicates that the convergence of the estimated FNR is slightly slower than the ground truth FNR in the first step with α = 0.05. This is why more than 250 DFT computations are required to ensure that all points in P _{0.05} are successfully sampled. On the other hand, when α = 0.10, 0.15, or 0.20, the convergences of the FNRs are almost as fast as the ground truth FNRs. It should be noted that the true positive points abruptly increase when the α value is switched, indicating that the positive points for higher α are sampled in earlier steps. Although this stepwise strategy is less efficient than directly specifying α = 0.20, it is much more efficient than the pre-PES and random sampling methods.
2.5 Conclusions
In this chapter, a machine-learning-based selective sampling procedure for PES evaluation is introduced and applied to proton conduction in BaZrO_{3} to demonstrate its efficacy. The region of interest governing the ionic conduction is defined in two ways: (1) a low-PE region and (2) a low-FN region.
For the low-PE region, the performance of the selective sampling based on the GP model greatly depends on the descriptors. Employing the preliminary PES (pre-PES), which is evaluated by single-point DFT computations in a smaller supercell, is highly effective. The GP3(xyz + pre-PES) sampling requires 394 DFT computations to sample all the low-PE grid points (353 points) in a grid with 1768 points for the asymmetric unit of the BaZrO_{3} crystal. This is a 78% reduction in the computational costs. However, the defined region of interest, i.e., the low-PE region, comprises 20% of the configuration space. Consequently, the reducible computational cost is limited to 80%.
The region of interest should, therefore, be redefined so that it becomes smaller in the configuration space. For the low-FN region, the region of interest contains only 15 grid points, whose volume is less than 1% of the configuration space. Among the several sampling methods to identify the low-FN region, GP5(xyz + pre-FNS) shows the best performance. It requires only 116 DFT computations to identify all grid points in the low-FN region. Furthermore, the computational cost can be further reduced to 95 DFT computations using the 16 lowest FN grid points in the pre-FNS as the initial points. This means that exploiting the information on the pre-FNS can reduce the computational cost by 95%.
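The percentages quoted in this section follow from simple ratios against the 1768 grid points of the asymmetric unit. The short check below uses only numbers stated in this chapter; the function name is hypothetical, and the cost is assumed to be proportional to the number of DFT computations.

```python
N_GRID = 1768  # grid points in the asymmetric unit


def cost_reduction(n_dft, n_grid=N_GRID):
    """Fraction of DFT computations saved relative to evaluating every point."""
    return 1.0 - n_dft / n_grid


# Numbers of DFT computations reported in the text.
cases = {
    "GP3 (low-PE region)": 394,
    "ideal (low-PE region)": 353,
    "GP5 (low-FN region)": 116,
    "GP5 with pre-FNS initial points": 95,
}
for label, n_dft in cases.items():
    print(f"{label}: {100.0 * cost_reduction(n_dft):.1f}% reduction")

# Steps needed by pre-PES sampling relative to ideal sampling.
print(f"pre-PES / ideal: {1479 / 353:.1f}-fold")
```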
Thus, preliminary information (i.e., pre-PES and pre-FNS) significantly contributes to the sampling performance. Therefore, a machine-learning-based approach hybridized with a low-cost PES and/or FNS evaluation should be a solid methodology for preferential PES evaluation in the region of interest. In addition, using the FNR, which is defined in Eq. (2.15), solves two critical issues, which are when to terminate sampling and how to determine an appropriate α value (equivalent to the PE threshold).
Notes
Acknowledgements
We recognize Mr. Daisuke Hirano and Mr. Makoto Otsubo for their contributions and Dr. Atsuto Seko for the insightful comments and suggestions. This work is financially supported by JSPS KAKENHI (Grant Nos. 25106002 and 26106513).
References
1. S. Glasstone, K.J. Laidler, H. Eyring, The Theory of Rate Processes (McGraw-Hill, New York, 1941)
2. G.H. Vineyard, J. Phys. Chem. Solids 3, 121 (1957)
3. K. Toyoura, Y. Koyama, A. Kuwabara, F. Oba, I. Tanaka, Phys. Rev. B 78, 214303 (2008)
4. T. Vegge, Phys. Rev. B 70, 035412 (2004)
5. A. Van der Ven, G. Ceder, Phys. Rev. Lett. 94, 045901 (2005)
6. C.O. Hwang, J. Chem. Phys. 125, 226101 (2006)
7. L.T. Kong, L.J. Lewis, Phys. Rev. B 74, 073412 (2006)
8. K. Toyoura, Y. Koyama, A. Kuwabara, I. Tanaka, J. Phys. Chem. C 114, 2375 (2010)
9. G. Henkelman, B.P. Uberuaga, H. Jonsson, J. Chem. Phys. 113, 9978 (2000)
10. G. Henkelman, B.P. Uberuaga, H. Jonsson, J. Chem. Phys. 113, 9901 (2000)
11. K. Toyoura, J. Terasaka, A. Nakamura, K. Matsunaga, J. Phys. Chem. C 121, 1578 (2017)
12. K. Toyoura, D. Hirano, A. Seko, M. Shiga, A. Kuwabara, M. Karasuyama, K. Shitara, I. Takeuchi, Phys. Rev. B 93, 054112 (2016)
13. C.E. Rasmussen, C.K.I. Williams, Gaussian Processes for Machine Learning (The MIT Press, Cambridge, 2006)
14. M.L. Stein, Interpolation of Spatial Data: Some Theory for Kriging (Springer Science & Business Media, New York, 2012)
15. H. Iwahara, T. Yajima, T. Hibino, K. Ozaki, H. Suzuki, Solid State Ionics 61, 65 (1993)
16. W. Münch, K.D. Kreuer, G. Seifert, J. Maier, Solid State Ionics 136–137, 183 (2000)
17. M.S. Islam, J. Mater. Chem. 10, 1027 (2000)
18. M.E. Björketun, P.G. Sundell, G. Wahnström, D. Engberg, Solid State Ionics 176, 3035 (2005)
19. P.E. Blöchl, Phys. Rev. B 50, 17953 (1994)
20. G. Kresse, J. Hafner, Phys. Rev. B 48, 13115 (1993)
21. G. Kresse, J. Furthmüller, Comput. Mater. Sci. 6, 15 (1996)
22. G. Kresse, D. Joubert, Phys. Rev. B 59, 1758 (1999)
23. J.P. Perdew, K. Burke, M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996)
24. J. Mockus, J. Global Optim. 4, 347 (1994)
25. E. Brochu, V.M. Cora, N. De Freitas (2010), arXiv:1012.2599
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.