Sensitivity estimation for calculated phase equilibria

The development of a consistent framework for Calphad model sensitivity is necessary for the rational reduction of uncertainty via new models and experiments. In the present work, a sensitivity theory for Calphad was developed, and a closed-form expression for the log-likelihood gradient and Hessian of a multi-phase equilibrium measurement was presented. The inherent locality of the defined sensitivity metric was mitigated through the use of Monte Carlo averaging. A case study of the Cr-Ni system was used to demonstrate visualizations and analyses enabled by the developed theory. Criteria based on the classical Cram\'er-Rao bound were shown to be a useful diagnostic in assessing the accuracy of parameter covariance estimates from Markov Chain Monte Carlo. The developed sensitivity framework was applied to estimate the statistical value of phase equilibria measurements in comparison with thermochemical measurements, with implications for Calphad model uncertainty reduction.


Introduction
Calphad-based thermodynamic models are routinely used to probe the phase stability in multicomponent systems. Computational efficiency and the ability to incorporate experimental measurements, atomistic simulations, and expert intuition in a semi-empirical fashion have led to the broad adoption of the Calphad approach, but it is only in recent years that serious attention has been paid to uncertainty quantification (UQ) of the model predictions.
Stan and Reardon demonstrated early work on Calphad UQ using genetic algorithms which anticipated the Bayesian approach adopted by later work [1]. As computing efficiency increased, several authors identified Markov Chain Monte Carlo (MCMC) as a powerful technique for optimizing Calphad model parameters and simultaneously determining their uncertainty with respect to the data [2,3]. Readers interested in further discussion of recent developments in Bayesian UQ for ICME, with application to Calphad, are directed to a recent review [4].
While there have been significant developments in understanding the propagation of Calphad model uncertainty to the equilibrium predictions [5], the inverse has received minimal attention since the work of Jansson in Calphad parameter optimization [6]. In seeking to develop a theory of Calphad model sensitivity, it is desired to understand the flow of uncertainty from the experimental measurements to a given Calphad model. Clear definitions must be given to all observation types, including multi-phase equilibrium information, commonly referred to as "phase diagram data." The development of a consistent framework for Calphad model sensitivity is necessary, not only for UQ, but for the rational reduction of uncertainty via new models and experiments.

Theory of Calphad Model Sensitivity
Without loss of generality to multi-component, multi-sublattice systems, we will first consider an isobaric binary system, A-B, consisting of two single-sublattice phases, α and β. The molar Gibbs energies are defined as ( , , ; ) and ( , , ; ), respectively, where is the temperature, is the site fraction of component in phase , and is an empiricallydetermined vector of continuously-valued model parameters for the phases. may have a nonlinear dependence on the elements of , but the partial derivatives with respect to the parameters are assumed to exist. It is often the case that each element of is associated with only one phase, but that assumption is not necessary. is the molar amount of phase . The total molar amount of components A and B, and , are equal to + and + , respectively.
In the interest of brevity, detailed solutions for the equations of equilibrium are not included here. This derivation can be found in several publications, recently in significant detail by Sundman et al. [7]. For this work it was sufficient to assume a particular solution is known to the equations under the given conditions.
Let an experimental observation at equilibrium of the coexistence of phases and at a fixed temperature find the measured mole fractions ̃, ̃, ̃ and ̃. The mole fractions predicted by the candidate model specified by are , , and . The "residual driving force" formulation for the error in a candidate model specified by of a multi-phase equilibrium observation was adopted in this work [8,9]. The Gibbs energies of phases in coexistence lie on an equipotential line (hyperplane in multi-component systems) which minimizes the total energy.
When the model degrees of freedom, , do not satisfy that criteria for a given experimental measurement, the deviation from the equipotential condition can be quantified as a signed distance ("driving force") from a fictitious hyperplane. That hyperplane is the arithmetic mean of the hyperplanes calculated at the measured phase compositions using the candidate model. As the deviation approaches zero, the hyperplanes at each measured composition will approach the mean ( Figure 1).
An advantage of this formulation is that it is continuous with respect to the metastability of one or more observed phases, meaning that if an observed phase is not predicted to be stable according to the candidate model, the error can still be defined. Another advantage is that it is differentiable with respect to . The "residual driving force" of a multi-phase equilibrium measurement is defined as The index refers to each observed phase, i.e., or . Equation 1 is computed at ̃, the tie-line endpoint corresponding to each observed phase . ̅ is the arithmetic mean of the chemical potentials found by computing a multi-phase global equilibrium at each tie-line endpoint, i.e., an average chemical potential of values calculated from their respective compositions at all tie-line endpoints. For the terms, corresponding to the individual phases, we compute single-phase local equilibria, i.e., compute the chemical potentials at ̃ while only considering the phase, and also the chemical potentials at ̃ while only considering the phase ( Figure 1). When are chosen such that ̃ = , ( ) will be equal to zero. It is often the case that, in an experimental multi-phase equilibrium observation, only one of the phase compositions can be determined, e.g., a measurement of the liquidus temperature by differential scanning calorimetry heating/cooling curve analysis. For the case of a phase of undetermined composition in equilibrium with a phase at measured composition ̃, ̃ can be estimated as the composition which maximizes the thermodynamic driving force for formation of the phase relative to ̅ . The computation of ̅ then excludes the chemical potentials calculated at the estimated ̃.
For sensitivity estimation, the first derivatives of ( ) need to be computed. Assume that is independent of . The derivative can then be written as follows.
Because the arithmetic mean is a linear operator, it is sufficient to determine an expression for . The chemical potentials are dependent variables which are outcomes of a non-linear optimization. The Lagrangian formulation of the Gibbs energy minimization problem [6] [7] can be stated as follows.
= −̃ is the mass balance constraint for component , is an index for the internal constraints (e.g., site fraction balance) for all phases in equilibrium, where is the corresponding Lagrange multiplier. For the following and all subsequent steps, we assume calculation at a feasible solution, so that the gradient of the Lagrangian is equal to zero. For the present case this can be expanded as the following system of equations: are the internal variables for all stable phases in the calculation, and is the amount of phase . can otherwise be discarded for the present analysis.
Assume that all and are independent of and differentiate the system of equations, resulting in Because of redundant constraints, under some circumstances this system of equations may be over-determined. By adopting the least-squares solution and referring to the preceding matrix as , the chemical potential derivatives can be written in closed form: This expression shares some similarity with the temperature "dot derivative" in Sundman et al. [7], if the model degrees of freedom ( ) are mathematically interpreted as independent thermodynamic state variables, similar to temperature. It is often the case that the Gibbs energy model of a phase has a linear dependence on , e.g., = ⋯ + ( 1 + 2 ). For this case, is independent of and is constant for a given equilibrium solution, assuming the phases' internal degrees of freedom do not change. is also constant, as a consequence. If this scenario applies to all the phases in the calculation, then will be independent of , and ( ) will be linear in .

Definition of Phase Equilibria Log-Likelihood
Assume that the error associated with an experimental observation of ( ; ̃) is normally distributed about zero with constant variance 2 . Note that ( ; ̃) is not a true observable because its value is inferred from the measured quantities ̃, but in principle this only affects computation of 2 . The log-likelihood function for independent observations can then be expressed as: The second term is often omitted, as it is independent of . In the present work attention is restricted solely to multi-phase equilibrium observations, though observations of other thermochemical quantities, such as heat capacity and enthalpy of formation, can easily be incorporated in the log-likelihood and the following derivation of sensitivity. In statistics, the score function ( ) is defined as the gradient of the log-likelihood function.
A "partial" score of a particular observation can also be defined and is denoted ( ). By the independence of observations, ( ) = ∑ ( ).
The corresponding Fisher information matrix (FIM), (̃, ; ), captures the curvature of the log-likelihood, and is defined as the negative of the expectation of the log-likelihood Hessian as follows Under a common condition, discussed in the previous section, that ( ) is linear in , 2 ( ) = 0 and the higher-order term can be neglected, i.e., The assumption of linearity causes the dependence on to fall out of Equation 10 because A scalar sensitivity metric based on the FIM can be defined several ways, and is strongly connected to the notion of the optimality of a measurement. In the present work a form of "Aoptimality" was adopted, wherein the trace of the FIM was used to define the sensitivity [10].
The specific definition in Equation 12 of an observation, , warrants discussion. A "phase region" is defined in the present work as a group of tie-line endpoints corresponding to the same multi-phase equilibrium. The approach taken in this work was to define an observation in terms of each measured phase region, such that each dataset consisted of multiple "observations," all assumed statistically independent. This definition preserved the additivity property of (̃, ) and made it convenient to generate sensitivity analyses based on both the parameters and the datasets ("groups of phase regions"). A disadvantage of this approach was that trends in (̃, ) with respect to the MCMC iterations were challenging to interpret. The MCMC simulation involves a maximization of the log-likelihood function (Eq. 7) and, while it is expected that the magnitude of the total log-likelihood gradient (Eq. 8) decreases with an approach toward the maximum-likelihood value of , the same is not generally true for (̃, ). This is because of error cancellation due to a summation of terms with opposing sign in the log-likelihood gradient.
In the scaled sensitivity as defined in the present work, the gradient of each observation is squared prior to summation, so opposing gradients do not cancel. For the case of stronglyconflicting (inconsistent) observations, the value of (̃, ) may be large, even near the maximum-likelihood . Another possibility would have been to define each dataset as a single "observation," such that gradients cancel in a way similar to Equation 8. That approach could have value as a diagnostic quantity for Calphad modeling during the parameter optimization process, but a scaled sensitivity defined in such a way would be strongly correlated with the loglikelihood gradient, and so such an analysis might be better performed by just computing the gradient norm. While the issue of potentially conflicting data has been investigated in the context of the pure elements [11], further analysis of the role of outliers in multi-component Calphad sensitivity analyses is recommended for future work.
A potential limitation of the present approach is that the sensitivity metric is only local and, for example, a "nearly mis-predicted" phase close to the limit of stability for a given tie-line will contribute nothing to the sensitivity, even though a small change in could cause it to become stable in the measured phase region of interest. This concern was mitigated through the use of Monte Carlo estimates of around the maximum-likelihood value. ( ;̃, ) was then computed as chain/trace averages, which introduced a degree of non-locality to the predictions.
Another potential limitation is that correlations between parameters are not explicitly considered, i.e., the covariance of . While it was not done in the present work, other optimality criteria incorporating the off-diagonal elements of Equation 10 are known [10]. One challenge to resolve for such an approach would be finding good empirical estimates of the covariance to use in the rescaling of the FIM.

Application to Cr-Ni
The Cr-Ni system is a common exemplar system for thermodynamic modeling of alloys, given its technological importance and relative simplicity. It contains three stable solution phases: fcc, bcc and liquid. If one adopts the Standard Element Reference and tabulated lattice stabilities for the solution phases [12], then it is only left to the modeler to determine the binary interaction parameters for each phase. The low-temperature intermetallic phase, CrNi2, is neglected in the present work, as are all magnetic contributions. A complete thermodynamic assessment was not an objective of this work, and such interested readers are directed to a recent review by Tang and Hallstedt [13].
The initial Cr-Ni thermodynamic model was generated by the ESPEI software using thermochemical data (enthalpies, entropies) for the individual phases [8]. The iterative leastsquares procedure used by ESPEI generated a database with non-zero + terms for the  Figure 2(a). The thermochemical data used to generate the starting point was then discarded for the subsequent analysis. Typically this data would be retained in a Calphad modeling procedure, but it was excluded in this work to isolate the statistical influence of the phase equilibria measurements.
For the MCMC step, phase equilibria data for the solution phases was collected from the literature [14][15][16][17][18][19][20][21][22][23][24][25]. The ESPEI YAML configuration file and JSON data files for the MCMC step can be found in the Supplementary Material. 24 chains (twice the number of parameters) were included in the ensemble. All data were equally weighted with an ESPEI "data weight" of 20, for an effective of 50 J/mol. A flat prior, contributing a log-prior of zero to the log-probability, was assumed for all parameter values. The MCMC simulation was run for 500 fixed iterations without stopping criteria. The phase diagram from the chain-average parameters at the last iteration is shown in Figure 2(b). The log-likelihood trace is shown in Figure 3.  The computed sensitivities can also be visualized in terms of each model degree of freedom. The contribution of each parameter to the scaled sensitivity is shown as a function of MCMC iterations in Figure 5. The sensitivity contribution from the higher-order liquid parameters was minimal throughout the optimization process, indicating that the considered observations were relatively uninformative for those model degrees of freedom. High sensitivities seen from the higher-order parameters in the A1 phase were consistent with the dataset-based analysis ( Figure   4), and were understood as an indicator that those parameters were strongly coupled to the observations, particularly at the Ni-rich side of the A1-A2 phase boundary.
Sensitivities can also be projected back into the space of the observations, providing insight into where new measurements might make the most impact on the likelihood. Figure 6 shows the  Figure 7(a), the peak in the Svechnikov1962 curve was understood to be indicative of a within-dataset initial disagreement in the sign of the log-likelihood gradient with respect to the given parameter. This disagreement was captured by the scaled sensitivity due to the squared gradient term, which increases in magnitude when multiple observations from the same dataset are in conflict. As the apparent conflict was resolved by the MCMC optimization, the sensitivity decreased. Initial sensitivity contributed by some of the other datasets was observed, but quickly dropped to zero as the phase mis-prediction was resolved by the optimization. The log-likelihood contribution of the Bechtoldt1961 dataset remained very sensitive to the regular solution parameter for the A1 phase (Figure 7(b)), and the sensitivity actually increased with respect to the MCMC iterations. While optimization reduces the total log-likelihood gradient, it does not guarantee sensitivity reduction with respect to every dataset.
It is desirable to determine whether an MCMC-based optimization has run for a sufficient number of iterations, and if its estimate of parameter uncertainty is reasonable. In this work the developed Calphad sensitivity theory was applied to perform an analysis using the well-known Cramér-Rao (CR) lower bound on the parameter covariance. The CR bound is a statement on the covariance of an unbiased estimator, giving the inverse of the Fisher information matrix (Eq. ) as the lower bound [26]. While any realized estimator may fail to achieve the lower limit, a corollary to the CR bound is that, if a given estimator's covariance falls below the given limit, the estimator must be biased in some way.
Corner plots for the A1 and liquid phases, with estimated CR covariance ellipsoids, are shown in In seeking to remove undesired bias in the liquid uncertainty estimate, two approaches were considered. The first was to use informative priors for the parameters that were found to be insensitive. This would mean adding information from another source, possibly based on experience or intuition. While this could resolve the issue in one sense, the choice of any particular prior would be difficult to justify in advance.
Another approach would be to add more informative observations to the optimization. In this work only phase equilibria measurements were considered, but a pair of liquid mixing enthalpy measurements, for example, would strongly increase both the magnitude of the likelihood gradient (Eq. 8) for the higher-order constant binary interaction term for the liquid, as well as the curvature of the likelihood function along that direction (Eq. 10). The extent of the statistical information for a given model contained within a set of observations can be quantified by the this spectral approach is the ratio of the largest to smallest eigenvalue (condition number), and was also computed in the table. Assume that such measurements were perfectly consistent with the candidate model (zero error). Even in making the relatively conservative assumption of = 1000 ⁄ for this hypothetical measurement, the strongly informative nature of thermochemical measurements provides a reduction to the uncertainty bound of the higher-order liquid parameters, driving an increase in the smallest FIM eigenvalues and an order-ofmagnitude reduction in the matrix condition number. One would then expect subsequent MCMC optimization to be accelerated by the greater curvature of the augmented likelihood function, and an uncertainty estimate closer to the CR bound to be achieved.
It is promising for the future of Calphad sensitivity that this analysis was able to quantifiably reproduce the long-respected wisdom in the Calphad community that thermochemical measurements are the foundation of an accurate thermodynamic model, with the phase diagram playing a highly-visible, yet merely supporting, role.

Conclusions
Sensitivity analysis is a powerful tool for the development of Calphad-based thermodynamic models, providing data-point level resolution on the coupling of prediction error to the model parameters. Through the presented theory it was shown possible, in a case study of the Cr-Ni system, to assess the accuracy of MCMC-based covariance estimates using the classical Cramér-