A Segregated Approach for Modeling the Electrochemistry in the 3-D Microstructure of Li-Ion Batteries and Its Acceleration Using Block Preconditioners

Battery performance is strongly correlated with electrode microstructure. Electrode materials for lithium-ion batteries have complex microstructure geometries that require millions of degrees of freedom to solve the electrochemical system at the microstructure scale. A fast-iterative solver with an appropriate preconditioner is then required to simulate large representative volume in a reasonable time. In this work, a finite element electrochemical model is developed to resolve the concentration and potential within the electrode active materials and the electrolyte domains at the microstructure scale, with an emphasis on numerical stability and scaling performances. The block Gauss-Seidel (BGS) numerical method is implemented because the system of equations within the electrodes is coupled only through the nonlinear Butler–Volmer equation, which governs the electrochemical reaction at the interface between the domains. The best solution strategy found in this work consists of splitting the system into two blocks—one for the concentration and one for the potential field—and then performing block generalized minimal residual preconditioned with algebraic multigrid, using the FEniCS and the Portable, Extensible Toolkit for Scientific Computation libraries. Significant improvements in terms of time to solution (six times faster) and memory usage (halving) are achieved compared with the MUltifrontal Massively Parallel sparse direct Solver. Additionally, BGS experiences decent strong parallel scaling within the electrode domains. Last, the system of equations is modified to specifically address numerical instability induced by electrolyte depletion, which is particularly valuable for simulating fast-charge scenarios relevant for automotive application.


Introduction
With the increased interest in electric vehicles, there is a focus on increasing the performance of lithium-ion (Li-ion) batteries, especially their volumetric and gravimetric capacity and their ability to charge faster without degradation and safety issues. Electrode materials for Li-ion batteries are porous microstructures, with active material particles blended with conductive additives and binder being the typical solid matrix while an electrolyte solution fills the pores. During operation, conductivity enables electron transport through the connected solid phase of each electrode up to the current collectors, while ion migration and ion diffusion transport Li ions through the continuous electrolyte solution phase all along the cell thickness. Batteries are ideally designed to maximize migration and minimize diffusion. Electrons and Li ions react at the interface between active material and electrolyte and form Li-ion complexes that diffuse within the solid particles. The impact of the microstructure geometry, or topology, on these transport and reaction mechanisms, and on the overall cell electrochemical performances, has been considered by the battery modeling community at different scales.
A first family of Li-ion battery electrochemical models uses porous electrode theory and thus abstract microstructural heterogeneity of composite electrodes using effective macroscopic properties [14,15,23,28,[40][41][42][43]. In porous electrode theory [15,16,[33][34][35], electrodes are treated as homogeneous media with superimposed solid and electrolyte phases. This macro-homogeneous approach encompasses models that consider either uniform or nonuniform effective macroscopic properties throughout the electrode domain. Because the complex microstructure topology is not directly resolved, only a small number of degrees of freedom (DOF) is required, thus allowing fast computation. Such a method is then particularly suitable to investigate large electrode volumes-for instance, to investigate the impact of microstructural mesoscale heterogeneity or to perform sensitivity analyses that require a large number of calculations [10,32]. These models, however, are intrinsically limited by their macroscale approach. (i) First, equations are derived assuming particle morphology is spherical [14,15,23,28,[40][41][42][43], which is not relevant for every electrode material, such as flake-like graphite [46]. (ii) Second, even though effective macroscopic parameter heterogeneity can be introduced in these models, the heterogeneity grid size is limited because a minimum volume is required to locally characterize and define macroscopic parameters. (iii) Third, homogenization induces a loss of information because it consists of a reduction in the microstructure DOFs. Indeed, typical macroscale models consider only four microstructure parameters [40]: (i) the porosity, which represents the available volume for the Li-ion diffusion; (ii) the electrochemically active specific surface area used to scale the reaction current; (iii) the particle diameter that controls the maximum diffusion length of the solid-phase diffusion; (iv) and, last, the factor of tortuosity, which characterizes the effect of the convoluted, tortuous path of the pores that hinders the Li-ion diffusion. This set of four parameters drastically simplifies the actual geometry and does not necessarily identify it uniquely. (iv) Fourth, determination of these microstructure parameters is challenging, especially for particle sizes for which numerical results are method-dependent [24,47] and for tortuosity factor for which the impact of the conductive additive and binder is difficult to precisely quantify [46]. Microstructure parameter errors are then propagated to the macro-homogenous electrochemical model, resulting in increased uncertainty. In addition, even though a correct agreement can be achieved between macroscale homogeneous electrochemical model predictions and experimental data [10,46], this could be the result of an overfitting of the microstructure parameters, which could lead to an overinterpretation of the results and to an inaccurately assessment of the validity of the model. Indeed, as for every model, discrepancy between experimental and numerical results can be reduced by either refining the model, i.e., reducing the number of model assumptions and introducing new parameters, or (over-)fitting existing parameters. (v) Fifth, the impact of local geometric features (e.g., cracks) on the electrochemical response is only partially considered because their contribution is quantified through their impact on the effective microstructure parameters, which might not be noticeable because of the volume averaging, whereas they might have a strong local impact on the electrochemical response and on degradation mechanisms.
Limitations on microstructure homogenization (limited number of microstructure parameters and partial representation of geometric features) and on microstructure homogenization heterogeneity (minimum grid size) provide a biased description of the microstructure. Inplane heterogeneity (defined as the solution variations within slices normal to the electrode thickness) and local heterogeneity (defined as the solution variations in the vicinity of geometric features) is controlled by the microstructure heterogeneity. At a low charge rate and/or for thin electrodes, a partial representation of transport-related parameters is of limited importance because solutions will be quite homogenous. Contrariwise, at high charge rate and/or for thick electrodes, both in-plane and local heterogeneities are expected to be significant and can trigger earlier degradation mechanisms. Therefore, macro-homogenous electrochemical models are particularly well suited for a low charge rate and/or for thin electrodes, but their limited heterogeneity description could lead to an underestimation of degradation mechanisms at a high charge rate and/or for thick electrodes even though they might capture the average response. The latter case corresponds to transport-limited scenarios critical for automotive application.
A second family of Li-ion battery electrochemical models has been developed recently to remedy these limitations and work at the microstructure scale. It consists of applying the governing equations on a mesh that represents the electrode microstructure geometry. This mesh is obtained through experimental observation [13,23,25,27,30,38,48,49], generally using X-ray tomography or Focus ion Beam Scanning Electron Microscopes (FIB-SEM). Some authors numerically generate the electrode microstructure either with ideal geometries [17,19,20] (such as packed spheres) or with ones that try to mimic the actual microstructure [21]. Because the microstructure geometry is used, some of the inconvenience of macrohomogenous electrochemical models are relieved: (i) no spherical assumptions; (ii) higher image resolution, typically in the order of 100 nm, which provides a much more refined description of the microstructure heterogeneity; (iii, iv) no effective microstructure parameters are required by definition; and (v) the impact of submicron geometric features on electrochemical performances can be evaluated locally. Microscale electrochemical models require extensive CPU capabilities because of the very large number of DOF, resulting in significant calculation times [20]. Therefore, to both achieve reasonable calculation times and model the full electrode thickness (essential to investigate transport-limited scenario), the authors used a relatively small in-plane section area of the electrode [23]. These tradeoffs could induce inaccurate results because the investigated volumes are too small to be representative [23]. The current limitation in terms of simulated volume size is a significant bottleneck that limits microscale model predictability. Some larger volumes have been reported in the literature but at the price of an image resolution downscaling, resulting in a coarse description of the microstructure [13], which might result in the loss of small details and in a change of the surface area and roughness. In addition, even though micromodels exhibit a fine description of the active material, the scale difference between the diameter of the active material particles and those of the conductive additives and binder [50] prevents the latter from being accurately represented (i.e.,meshed) at the nanoscale while preserving a field of view large enough to be representative of the active material spatial distribution. Further, X-ray tomography typically does not distinguish the conductive additives and binder from the pores [46]. Two numerical approaches have been reported to partially capture the effect of this phase on the electrochemical response: (i) considering the electrolyte, conductive additives, and binder as a unique phase with effective transport properties [17,23]; and (ii) meshing the conductive additives and binder phase as a uniform thin layer coated at the surface of the active material particles with a high electronic conductivity [13,38]. Last, microstructure models suffer from numerical stability and meshing issues not present in macro-homogeneous models, both leading to simulation inaccuracy and divergence [23].
To provide quantitative, predictive results, microstructure-scale electrochemical models must first address computation time and numerical stability issues, even before considering additional physics. This work focuses on these two areas. Sect. 2 develops the mathematical model used for simulating Li-ion batteries, which is solved using the segregated domain scheme discussed in Sect. 3.3. Various linear solver methods are presented in Sect. 4, which are then compared in the numerical results in Sect. 5. Finally, the concluding remarks are found in Sect. 6.

Li-Ion Electrochemical Model
This section will describe the development of the electrochemical model used to simulate the flow of Li + ions. It should be noted that there are two types of simulations: full-cell and half-cell. In a full-cell simulation, both the cathode and anode are simulated along with the electrolyte. However, a half-cell simulation replaces one of the electrodes with a reference domain which provides a source (or sink) of Li ions. No simulation needs to take place in the reference domain, which makes a half-cell simulation less computationally complex. In this paper, we will only be running half-cell simulation. However, for completeness, this section will describe a full-cell simulation as that is the eventual goal of this project.
We will start by describing the physical domain we wish model followed by the system of equations and boundary conditions that describe the development of Li + concentration and potential fields. Next is a discussion of the interface conditions, which is vital to the coupling between concentration and potential within electrode domains as well as between the electrode and electrolyte domains. Once the model is fully developed, a nondimensionalization is presented to help with numerical performance. In order to simulate the electrochemistry properly, the appropriate initial conditions must be set up, which is described in Sect. 2.6. Finally, this section ends with a description of a optional modification that helps numerical performance when the cell is in a state called "electrolyte depletion."

Domain Setup
In this work, we consider the following domains for a Li-ion cell, , represented in Fig. 1: the positive (cathode) electrode active, c ; the negative (anode) electrode active material, a ; the electrolyte that fill the electrode pores, e ; and the separator, sep , a thin electronblocking layer that prevents direct contact and sort-circuits between the two electrodes, such that: The various additives [46] and the negative and positive current collector volumes are not represented in the domain for this model. When charging, Li ions (noted Li + ) inside of solid positive particles diffuse to the surface, where they react (de-intercalation) and transfer from the solid phase into the electrolyte phase. The positively charged ions travel through the electrolyte and the separator to the negative electrode, where they react and insert (intercalation) into solid negative particles. Electrons follow an opposite path through the electron-conductive solid particles, the current collectors, and the external circuit. More information on the construction of a Li-ion battery can be found in [36]. Figure 1 shows the various domains, boundaries, and interfaces that comprise a cell. Separating electrodes and electrolyte are the interfaces denoted as I ae and I ce . The boundary comprises four parts as well: where a is the boundary between the negative current collector and the anode, c is the boundary between the positive current collector and the cathode, e is where the electrolyte contacts either current collector, and 0 corresponds to all the in-plane boundaries.
Note that the separator is a porous material filled with electrolyte that allows Li + to diffuse through but blocks electrons. In our model, the separator is homogenized. For this reason, the electrolyte and separator subdomains are treated as a single domain, e ∪ sep , with different transport properties. This results in only three computational subdomains. For simplicity, unless otherwise specified, the electrolyte domain is considered to incorporate the separator domain.

Governing Equations
In this work, the mathematical model of a Li-ion cell is restricted to the Li + concentration, c, and the electric potential, φ, in all domains. Both the concentration and potential depend on space, x, and time t. The system of equations used to describe the flow of Li + is also separated into three parts. To condense notation, when used as a subscript, let s = {a, c}, which indicates a property belongs to the anode and cathode domains, respectively. Additionally, let i = {a, e, c} is the same as s but applies to all three computational domains. This way, instead of having to present similar sets of equations for each domain where the only difference is the use of c a versus c c , the notation can be simplified to c s to refer to both equations simultaneously.
Starting with the electrode active materials, define the ionic and potential fluxes: These two depend on the diffusivity of Li + , D s , and the conductivity, κ s both of which depend on c s . As previously noted, the subscript s is the notation to indicate that these values apply to an electrode (solid) domain. When specifically talking about the anode or cathode, the subscript will be a or c, respectively. For example, c c would refer to the Li + concentration in the cathode, and c a would refer to the concentration in the anode. Later, the subscript e will refer to the electrolyte. The flow of Li + in the electrodes is described by: Next, consider the system of equations in the electrolyte domain. The main difference is in the definition of the ionic and potential fluxes: where t + is the transference number for Li + , F is Faraday's constant, D e is the electrolyte ionic diffusivity, κ e is the electrolyte ionic conductivity, and κ D is the diffusional conductivity.
The diffusional conductivity is defined as: where A c is the activity coefficient, R is the gas constant, and T is temperature. The flow of Li + in the electrolyte is described by: where ε is a measure of the porosity. Because of the large pore scale difference between the electrolyte domain, e , and the separator domain, sep , meshing both pores would result in a significant increase in DOF. To overcome this issue, the separator domain is considered as a homogeneous electrolyte domain, i.e., the separator is treated similarly with the electrolyte except for effective transport properties. To ensure mass conservation, an additional porosity term, ε, is introduced in the mass conservation equation with: where ε sep is the bulk porosity of the separator. This form of mass conservation (with a porosity term) is used in macroscale electrochemical models [40]. As described, the present model is then a multiscale electrochemical model because the homogenized medium (separator) and dense mediums (electrolyte and active materials) are solved concurrently.

Boundary Conditions
At the two current collectors, a non-permeability condition is applied to enforce that Li + cannot escape. This condition is represented as: Additionally, along the in-plane boundaries, 0 , an identical no-flux boundary condition is enforced, represented as: This condition results from the fact that the computational domain is a small portion of the full Li-ion battery. Next, a current flux is applied to the two current collectors: where g a and g c are the current densities, constrained by: This constraint essentially enforces that the current flowing into the anode equals the current flowing out of the cathode. Also, notice that the sign of g a should have the opposite sign of g c because the outward normals are facing opposite directions. The current integral balance equation is not explicitly enforced because it is a natural extension of the physics solved with the boundary conditions applied. These boundary conditions drive the entire simulation. The final boundary condition enforces that current does not flow through the sides of the cell: Here, i ∩ 0 refers to the external boundary associated with each subdomain but excludes the current collectors. Additionally, where the electrolyte meets the current collectors: is enforced. A final boundary condition arises from a need to pin down the potential field. The only boundary conditions enforced on the potential are Neumann type, which allow for a valid solution to be shifted by a constant and remain valid. To combat this, a point within one of the electrode domains is set to a fixed potential value, thus removing the constant from the null space. This is not required for the concentration because it already depends on an initial condition.

The Interface Condition
The last aspect of the model is the interface condition. This condition communicates how the Li + is (de)intercalating at the electrochemically active interface between the solid particles and the electrolyte. The active interface is defined as the interface between the pore and the solid particles, which is connected with the current collector. This condition is modeled using the Butler-Volmer equations [36]. First, define the exchange current density for each electrode: where k s is the kinetic constant, and c s max is the maximum Li + concentration in the electrode. Additionally, α a and α c are the anodic and cathodic charge transfer exponent, respectively. Considering the symmetry factor (α a + α c = 1) and assuming that the reaction is equally reactant-like and product-like, then α a = α c = 0.5, so the exchange current density reduces to: Additionally, define the overpotential as: where U s is the open circuit potential. Combining the exchange current density and overpotential according the Butler-Volmer equation results in the Faraday current: which can be simplfied to: if α a = α c = 0.5. The interface conditions can now be represented as: N a · n = N e · n = i ae F , and j a · n = j e · n = i ae for x ∈ I ae , Following is a summary of all the equations, boundary conditions, and interface conditions. System of equations: Boundary conditions: Interface conditions: The current integral balance equation, (17), is not explicitly enforced because it is a natural consequence of the physics solved with the boundary conditions applied. A table of all the parameters used in this model with a short description can be found in Table 1. Pay particular attention to which coefficients depend on concentration

Nondimensionalization
The electrochemical model includes coefficients spanning several order of magnitude and as a result will have a large condition number and will be difficult to solve efficiently and accurately. This numerical issue can be resolved by determining a set scaling coefficients so that the resulting modified system is dimensionless and its solutions are close to being unity. Problem nondimensionalization can be performed using various methods, and the scaling coefficients are not necessarily unique. This section presents one such method for the electrochemical model.
To help improve solver convergence, (6)- (17) is scaled to produce a nondimensional system of equations. To find this scaling, start with the electrolyte domain because it is the most complex. Plugging the definition j e from (4) into the potential balance equation, (8), yields: Now, for the moment, assume all coefficients are constant with respect to concentration, and define: and let φ e = C 0 φ * e . Using these definitions, (18) becomes: Dividing out κ e C 0 results in: and the potential flux simplifies to: The next step is to plug the new representation of j e , (19), and N e , (3), into the concentration balance equation, (7), which yields: Applied current density applied to a or c n -Outward unit normal In the "units" column, a dash, '-', indicates a unitless quantity. The f (c) column indicates whether the parameter can be a function of concentration (20) becomes: which reduces to: Note that: ln because the coefficients are assumed constant.
Next, the length scale, x, is nondimensionalized using x = Lx * , where L is in the characteristic length scale (such as 1e-6 for micrometers). This results in new derivative operators defined by: Using these operators in (21) and moving the L 2 to the left-hand side yields: Finally, redefine the timescale to t = L 2 D e t * . Differentiating this results in: dt * , which when used in (22) reduces to: The full scaled system for the electrolyte domain is: with: Performing dimensional analysis on x * , t * , c * e , and φ * e reveals that they are all unitless. These same scalings could also be used in the solid electrode domain; however, because the segregated-domain scheme solves each domain separately, a different scaling could be used that was specifically designed for the anode and cathode independently. The only guideline is that the timescale and length scale remain the same among all domains. Although this is not strictly necessary, having the same timescale among all domains keeps them from getting out of sync with each other. Additionally, keeping the same length scale makes the transfer of information between domains simpler.
Using the definitions of N s and j s from (1) and (2) in the concentration and potential balance equations, (6) and (8), with the new length and timescales results in the system: Again, this assumes that all coefficients are constant. Not much can be done about the D s D e , so the best scaling is to let c s = c s,max c * s , which will keep c * s around order one. The scaling for the potential term is in a similar situation. A decent choice would be φ s = C 0 φ * s to mimic the electrolyte potential. Another possibility would be to use the maximum voltage if known such that φ s = φ max φ * s . Also note that this scaling does not take the interface condition, (5), into account. This means that before applying the Butler-Volmer condition, c i and φ i need to be in their dimensional form.

Initial Conditions
To perform a simulation, a few quantities must first be calculated. The first quantity is the current flux, g s , which will be used to drive the electrochemical reaction in the Li-ion battery. This flux is dependent on a parameter known as the C-rate. The C-rate is a measure of how fast a battery is charging or discharging relative to its total theoretical capacity. By definition, 1C corresponds to the current required to fully discharge or charge a battery in 1 hour. For example, with a 4 Ah battery, a 1C charge will take an hour and require 4 A, but a 6C charge will take 10 minutes and 24 A. Because a main focus of this research is to use the model described in Eqs. (6)- (17) to simulate the charge/discharge cycles of various batteries at several C-rates, current flux, g, needs to be written in terms of C-rate, C rate : where V is the volume of the corresponding domain, A is the area of the corresponding boundary, and s hr is the number of seconds in an hour. This flux uses the anode to calculate the maximum lithium at full charge. To use the cathode, replace c a and V a with c c and V c . The next set of quantities that needs to be specified to run a simulation is the initial conditions. The initial concentrations, c i 0 , are relatively easy to compute. For the two electrodes, an initial state of charge (SOC) is chosen between 0 and 1, where 0 represents fully depleted, and 1 is fully charged. Then, multiplying the SOC by the corresponding c s,max gives the initial concentration within the two electrode domains. For the electrolyte, the initial concentration is simply set as a constant value.
The initial potentials are a bit more difficult to calculate. Because of the high nonlinearity in the Butler-Volmer equation, (5), a good initial guess is required to get Newton's method to converge. Luckily, with the initial concentrations already known, the Butler-Volmer equation can be inverted to calculate the required initial potentials, φ i 0 . Recall that one of the boundary conditions requires that the potential of one point of the cathode is fixed to an arbitrary value that will provide the reference potential for the whole domain. This reference value is then subtracted (only during post-processing) for all the potentials calculated. Because of this, the initial potential in the cathode is set to this base value, φ 0 : Next, calculate the initial potential in the electrolyte using: Finally, the initial potential in the anode is: As a final note, all of the simulations in this paper are known as half-cell simulations. This is because one of the electrodes is used as a reference domain, and all of its parameters are held fixed. This means that only the electrolyte and one electrode are simulated. For these simulations, the cathode is used as the reference and is considered as a lithium plate, with U c = 0 V and i c = 100 A m −2 , which is simply an arbitrary high value. This method to determine sequentially the potential from a domain to its neighbors is only valid for the initialization step because it requires all fields (concentration and potential) of all domains (solid and electrolyte) to be uniform.

Electrolyte Depletion Modification
This section describes a potential modification to Eq. 4 can help mitigate issues with numerical convergence associate with a state known as electrolyte depletion. This modification is optional and its effects are examined in Sect. 5.3.
Thick electrodes and/or fast charging induce a significant electrolyte concentration gradient all along the cell thickness that could lead to local electrolyte depletion (i.e., c e → 0) [10]. This is an undesirable situation because part of the electrode is then no more electrochemically active (i os → 0 when c e → 0), whereas the remaining part is overused (the integral current density is constant), which in turn triggers a degradation mechanism that will decrease the cell calendar life [10]. Thick electrodes and fast charging however, are desired features for electric automotive application because they reduce cost and charging time, respectively. Therefore, the ability to simulate electrolyte depletion is especially valuable; however, numerical instabilities arise when approaching electrolyte depletion.
The most obvious problem occurs in (4), with the κ D ∇ ln(c e ), which approaches infinity when c e → 0. This singularity can actually be removed by manipulating the definition of κ D . First, use the chain rule: Next, expand with the current definition of κ D : Now, take a closer look at the term κ e c e . In this model of Li + transport, the conductivity in the electrolyte, κ e , is approximated as an nth-degree polynomial: Next, assume that the conductivity, κ e , is zero when c e is zero, which is the case for the electrolyte presented in this paper. This results in a 0 = 0 and κ e c e can be simplified to:  Now define: and, as a result: such that: This modification can be used to overcome some of the numerical difficulties associated with electrolyte depletion.

Numerical Methods
This section discusses the various numerical methods used to address the electrochemical system for Li-ion batteries. Because the system of partial differential equations in (6)- (17) is set up with respect to three separate domains, a possible approach consists in solving this system in a computationally similar way. This results in the segregated-domain scheme. Instead of solving a single system over the whole domain, , the segregated-domain scheme solves three independent systems on the three subdomains: a , e , and c , which is depicted in Fig. 2.
In this paper, we discretize the set of partial differential equations within each subdomain using continuous piecewise linear Galerkin finite elements implemented through the FEniCS Project [1,29]. To solve in one subdomain independently of the others, it must be assumed that the Li + concentration and potential fields in the other subdomains is fixed. Notice that the only communications between subdomains occur at the interface. This results in converting the interface conditions, (15) and (16), to boundary conditions, where either (c s , φ s ) or (c e , φ e ) are fixed, depending on which subdomain is currently being solved. Unless the fixed values of concentration and potential are the exact solutions to (6)- (17), the solution on the independently solved subdomain will not satisfy (6)- (17). This can be resolved by performing a Picard-like iteration between the three subdomains. Picard iteration is used for resolving a nonlinear system of equations by linearizing around a fixed value and then calculating an update. The segregated-domain scheme is similar except values on two domains are fixed while calculating an update to the third.

Variational Form
Let W i for i = a, e, c be the continuous finite element space comprising piecewise linear functions defined on each computational domain. Additionally, let v i ∈ W i for i = a, e, c be the test function for concentration and q i ∈ W i for i = a, e, c be the test functions for potential. Backward (i.e., implicit) Euler is used to handle the time discretization, where τ is the time step, k is the time index, and T is the final time. Additionally, letĉ e andφ e be the fixed approximations to c e and φ e from the electrolyte domain. Then, within the solid active materials, the concentration functional is: and the potential functional is: The nonlinear variational problem on the solid active material domains is to find (c s , φ s ) ∈ W s × W s such that: The functionals for the electrolyte domain are slightly different where the concentration functional is: and the potential functional is: Similarly, the nonlinear variational problem on the electrolyte is to find (c s , φ s ) ∈ W s × W s such that: The segregated-domain scheme iterates between these variational problems using the solutions on one domain to improve the variational form on the other two. Iterating among the three domains is the consistency iteration. The domains are considered consistent when: for some tolerances c and φ .

Newton's Method
Newton's method in each of the segregated domains can be described as: where the superscript (n) denotes the Newton step, u (n) is a solution vector of potentials and concentrations, δu is the update vector, r (n) is the residual vector evaluated at u (n) , and J = F (u (n) )[δu] is the Jacobian matrix of the functional F evaluated at u (n) . The relative Newton convergence criterion is defined as: where n is some tolerance. The linear system of equations could also be represented in block form. Let u (n) be rewritten as a vector of potentials, φ (n) , and concentrations, c (n) . Thus, Eq. (39) could also be rewritten as: The subscripts φ and c found in r (n) and J correspond to the potential and concentration blocks, respectively.

Constant Coefficients
Assuming all coefficients listed in Table 1, the residual functions for each of the solid domains in block form are: where c s 0 is the initial condition or solution from the previous time step, t is the discrete time increment, andî se is the Butler-Volmer equation evaluated atĉ e andφ e along the interface I se . The corresponding Jacobian blocks are: where I is the identity matrix. The residual functions for the electrolyte domain in block form are: whereî se is now evaluated atĉ s andφ s along the interface I se . The corresponding Jacobian is: ∂î se ∂c e I se ,

Nonconstant Coefficients
When any of the coefficients from Table 1 are functions of concentration, the Jacobian blocks become more complex, especially for the electrolyte domain. Assuming that both κ s and D s are functions of c s , the Jacobian for the solid domains would be written as: For the electrolyte, under the assumptions outlined in Sect. 2.7, Eq. (41) can be simplified to: Now, assuming that κ e , κ * D , and D e are all functions of c e , the corresponding Jacobian becomes: ∂î se ∂c e I se ,

Segregated-Domain Scheme Algorithm
Now that all the parts are in place, we present the algorithm used for the segregated-domain scheme displayed in Algorithm 1.

Algorithm 1: Segregated-Domain Scheme
Within each time step, the segregated-domain scheme starts by solving for concentration, c a , and potential, φ a , within the anode domain while assuming c e and φ e are fixed, which results in enforcing the interface conditions as the boundary condition depending on: where: andĉ e ,φ e are the fixed approximations to c e and φ e in the electrolyte. Once Newton's method has converged and produced approximations to c a and φ a , it is time to update the electrolyte fields based on fixed values from the anode and cathode using an equation similar to (42). Finally, the fields in the cathode are updated. One cycle of solves on the anode, electrolyte, cathode is referred to as a consistency iteration and is labeled as such in Algorithm 1. The consistency iteration is repeated until the difference between the newly computed values, (c, φ), and the fixed values, (ĉ,φ), reach some tolerance. Because the coupling between domains only occurs at the interfaces, I ae and I ce , the fixed values, (ĉ,φ), from the previous domain's solve only needs to be interpolated to the next domain at the common interface. In FEniCS, this is achieved by first constructing the mesh to that the DOFs at the interface are shared between the relevant domains(the solid mesh and the electrolyte mesh are then conforming meshes). In other words, the anodes DOFs that lie in I ae are duplicated in the electrolytes copy of I ae . Then, during the setup phase of the simulation, a map can be constructed that takes the values at the interface from the anode's function space to the electrolyte's functions space. This allows for communication between domains without the need for interpolation.
For the tests within this paper, the relative tolerance for the consistency iteration, c , was set to 5 × 10 −4 and the relative tolerance for the Newton iteration, n was set to 1 × 10 −7 . As for the linear solver, When using block Gauss-Seidel, relative tolerance was 1 × 10 −7 . A possible way to reduced the solve time for the segregated-domain scheme would be to loosen the tolerances for the Newton and linear solve iterations. However, the linear solver's tolerance was chosen to reach a similar level of accuracy as the direct solver. Further simulation would be needed to find optimal values for these tolerances that balance speed and accuracy.
One advantage to using the segregated-domain scheme is that the solve on each subdomain requires less memory than solving a single large system. This means that for a fixed memory limit, the segregated-domain scheme can be applied to larger meshes.
The trade-off is the consistency iteration loop. If there is a large change in concentration or potential between times steps, more consistency iteration is expected. These large changes tend to occur at the beginning of the simulation or where the charge rate varies. Note that proof of the convergence of the consistency iteration is an open question, but in practice we find it converges reliably in 3-10 iterations depending on the time step size.

Linear Solvers
Solving the linear system from Eq. (39b) is computationally the most expensive step in our proposed segregated framework. The key to accelerating the overall time to solution is to solve the linear systems of equations in a scalable and efficient way. For notational convenience, let us rewrite Eq. (39b), which solves a linear system of N equations in N variables as: where A ∈ R N ×N is the coefficient matrix, b ∈ R N is the right-hand side vector, and x ∈ R N is the vector of variable unknowns. If all N equations are linearly independent, then there exists a unique solution x = A −1 b. Instead of explicitly computing and storing the inverted A matrix, one can apply the action of A −1 through either direct or iterative solvers.

Background on Direct and Iterative Solvers
One way to directly solve the system of equations is to first perform a lower-upper (LU) decomposition of A. That is: where L and U are the lower and upper triangular matrices, respectively. The diagonal elements should be all ones for one matrix and all zeros for the other. Using this decomposition, Eq. (43) can be rewritten as: which is solved by solving these two systems of equations consecutively: These systems are solved using forward and backward sweeps, respectively, and have an algorithmic complexity proportional to the number of nonzeros in A; however, the L and U matrices are computationally expensive to analyze and factorize; the algorithmic complexity can be as high as O(N 3 ) [45]. Several research efforts over the years have attempted to mitigate the cost by exploiting the sparsity patterns through techniques such as matrix reordering, but even with these improvements, direct approaches still do not scale well with N in terms of both algorithmic complexity and memory consumption [5,11,44]. An alternative is to use an iterative approach to compute the action of A −1 . An initial guess, x 0 , is used to calculate an eventual sequence of iterates {x 0 , x 1 , x 2 , ...}. The idea of an iterative scheme is that at some iterate k, the solution x k will be close enough to the true solution to be used as the approximate solution. To determine whether the approximate solution at iterate k is accurate enough, we evaluate the distance in error between the approximate and exact solutions with the residual vector: The iterative approach employed in this work is the Krylov subspace (KSP) method. The KSP of dimension k belonging to A and r 0 is defined as: KSP methods generate iterates in the following way: For example, the simplest form of the KSP solver can be written as: but in this work, we use the more complex generalized minimum residual (GMRES) method [39], which generates orthonormal vectors {v 0 , v 1 , v 2 , ...} through the Gram-Schmidt process to form a basis for K k . Each iteration of GMRES requires the following matrix-vector product: where w is the resulting vector used to update the solution. Krylov methods such as GMRES are said to have the optimality property if in each iteration the computed iterate is the best possible approximation of the solution within the current KSP; however, the convergence rate of the iterates produced by GMRES is limited by the Krylov subspace itself. That is, if the matrix A is either ill-conditioned or very large and complex, GMRES will converge slowly or in some cases not converge at all. To improve the convergence rate of the iterative solvers, we need to supply a preconditioner matrix P to modify the KSP. In this work, we consider only left-preconditioning. The linear system of equations in (43) with left-preconditioning becomes: The preconditioned residual for this is: The KSP for this new linear system of equations is: In the preconditioned GMRES approach, the action of P −1 alters the spectrum of the linear system by transforming w from Eq. (44) into a new vector z. That is: Choosing the right preconditioner such that P −1 A ≈ I is the task at hand. For elliptic PDEs such as the diffusion or heat equations, the algebraic multigrid (AMG) approach is a good way to construct and apply the action of P −1 .

Block Preconditioning
If Eq. (43) contains multiple physics or domains, applying GMRES with algebraic multigrid preconditioning to the entire linear system will not always yield a convergent solution. Such systems require careful preconditioning techniques where the preconditioning matrix is decomposed into individual blocks. First, let us rewrite Eq. (45) in block form: where the subscripts φ and c correspond to the potential and concentration blocks in Eq. (40b), respectively. We now highlight two common block preconditioning approaches for Eq. (46):

Block Jacobi (BJacobi)
In the BJacobi approach, we set all the blocks in P equal to their corresponding blocks in J. We set the initial guess for both z φ and z c to 0 so their solutions under this approach are obtained by solving these two inner linear systems of equations: The inverses of the diagonal Jacobian blocks are approximated using GMRES coupled with algebraic multigrid preconditioning. Alternatively, one could choose to apply only a single sweep of algebraic multigrid preconditioning to these two equations. This will save computational costs associated with fully solving the two inner systems of equations, but more outer GMRES iterations might be required.

Block Gauss-Seidel (BGS)
The BGS approach is the method of successive displacement. That is, the block equations for z φ and z c are solved sequentially. The solution vector obtained from the first block solve is used to obtain the second solution vector. We again set all the blocks in P equal to their corresponding blocks in J and set the initial guesses to 0, resulting in the following two inner linear systems of equations: These two blocks can also be solved for either through GMRES with algebraic multigrid preconditioning or through a single algebraic multigrid preconditioning sweep. One can alternatively rearrange the block system so that the concentration block is solved first. That is: In this paper, we employ the BGS approach to solve the linear system of equations associated with all three domains. Both the concentration and potential blocks are approximated using GMRES that is preconditioned with AMG. This setup was chosen because the system of equation within the solid domains, s . If the interface condition is ignored, then (6) and (8) are completely decoupled. These two independent systems are similar to a nonlinear heat equation and a nonlinear Poisson equation. Both of these systems are easily handled by GMRES that is preconditioned with AMG. As for the electrolyte domain, we are still investigating potentially better solvers. To this end, part of our future work is to derive a Schur complement approximation to handle nonlinear coupling within the electrolyte domain.

Parallel Implementation
To accelerate our proposed block preconditioning approach even further, we need to use parallelism. Specifically, we leverage the Portable, Extensible Toolkit for Scientific Computation (PETSc) library [3,4,12], a numerical linear algebra library built on top of the message passing interface (MPI). It not only has scalable nonlinear and optimization-based solvers (see [7,8]), but it can interface to other important math libraries such as HYPRE [18]. To handle the block system in Eq. (40b), we use PETSc's composable linear solver capabilities (referred to as PCFieldsplit) to approximate the inverse of the J φφ and J cc blocks individually using the BGS approach discussed previously. This is achieved by setting up a multiplicative FieldSplit within PETSc. The solve is decomposed into two levels of iterations. The outer iteration is simply GMRES, which relies on the inner iteration consisting of a single application of HYPRE's algebraic multigrid BoomerAMG [22] solver on the separate potential and concentration blocks. In the next section, we compare the performance of the BGS solver with that of a popular parallel LU solver, the MUltifrontal Massively Parallel sparse direct Solver (MUMPS) package [2]. All numerical experiments are carried out on Dual Intel Xeon Gold Skylake 6154 (3.0-GHz, 18-core) server nodes with 96, 192, or 768 GB of memory

Numerical Results
This section discusses the numerical results of two simulated batteries to explore how the block solver performs. First, a simple generated battery domain is used to compare the solver

Solver Comparison
To compare MUMPS and BGS, a simple battery geometry was generated. This domain consisted of seven particles in line with the x-direction (electrode thickness) and repeated in the y-and z-directions, which is represented in Fig. 3. The mesh was created using the MATLAB-based mesh generation Iso2mesh toolbox [37]. The initial concentrations for this problem were: For this simulation, the cathode is used as the reference domain, so no actual calculation will occur within it. A reference voltage, φ 0 , of 5 V is arbitrarily chosen and set as the cathode potential for the whole calculation. The initial concentration of the cathode is of no importance because its overcharge potential (OCP) is set constant equal to 0 V, and its exchange current density, i oc , is set to a high constant value (100 A m −2 ) because the cathode is considered to be a lithium plate. The remaining potentials were calculated using Eqs. (28) and (29). The simulation persisted until the intercalation fraction in the anode was 0.95.
The first series of tests performed involve increasing the number of DOFs while recording memory requirements and time to solution. Because these are 3-D simulations, uniform refinement of an initial mesh would increase the number of DOFs by a factor of around 8. In other words, it would take only a few uniform refinements before the meshes grew beyond the limitations of the hardware. To prevent this while still having comparable results, three initial meshes were chosen that resulted in approximately 0.5, 1.0, 2.0 million DOFs for the entire domain. These meshes were then uniformly refined three times to provide nine experimental meshes. All of these tests were performed on a single node with 16 processes.
For each mesh, the simulation was performed to completion or until it ran out of allocated time with a maximum of 10 days. During a simulation, anytime a solve occurred in the electrolyte or anode, the maximum memory and wall clock time were recorded. Keep in mind that a single solve on a domain will happen multiple times because the Newton and  All simulations use the 7-particle domain on a single node with 16 processors. The mesh from the italic row was used to perform the strong scaling study in Fig. 9 Consistency loops discussed in Algorithm 1. The average memory usage and wall clock time for each domain and all meshes are found in Tables 2 and 3. Some simulations ran out of memory, which is denoted as OOM. As a note, The "Test Index" indicates which meshes were being used in the same simulation. For example, Test 4 contained the electrolyte mesh with 1.53 M DOFs and the anode mesh with 1.74 M DOFs. The full simulation for Test 4 involved solving on those two meshes independently and sharing information across their interface, ae . The data in Tables 2 and 3 is also presented in Figs. 4 and 5. Figure 4 show how the time to complete a single solve on a specific domain grows with increasing DOFs. The iterative solver, BGS, is between 5.00 to 16.98 times faster than MUMPS when solving in the anode domain, and between 2.85 to 6.38 times faster than MUMPS in the electrolyte domain. The speed advantage of BGS is expected to increase as the number of DOFs increases. The  The bottom set of plots are a comparison of the peak memory usage used by each solver method: MUMPS or BGS. This data was collected using the Python package memory_profiler. Additionally, the top set of plots show the ratio of MUMPS memory usage to BGS memory usage smaller speedup in the electrolyte domain is expected because the system is more difficult because of the additional off-diagonal terms. It is expected that a better preconditioner using Schur complements could be constructed to increase the performance within the electrolyte domain. This Schur complement method is still in development.
The memory usage information is summarized in Figs. 5 and 6. On average, a solve using BGS takes about half the memory required by MUMPS; however, the ratio of memory required by MUMPS vs. BGS continues to increase as the number of DOFs increases. It is interesting to note that even though the anode domain generally has more DOFs, it requires less memory to solve. This is because the coefficient matrix for the electrolyte system contains more nonzeros resulting from the off-diagonal terms. This is why Test 8 ran out of memory while solving the electrolyte but succeeded in solving the anode. In short, the BGS method is much better suited for solving larger scale problems than MUMPS. This is not only because BGS is faster; it also uses less memory. From looking at the number of bytes per DOF in Fig. 6, the clear trend is that MUMPS becomes less efficient at larger problems sizes, while BGS become more efficient

Parallel Scaling
In the previous section, it was shown that the BGS method is more computationally efficient than the direct solver, MUMPS; however these tests were performed on a single node with 16 processes. This section examines the BGS method's parallel scalability. Traditional methods for testing an algorithm's parallel performance involve strong and weak scaling studies. In a strong scaling study, the problem size is held constant and the number of processes increases. A weak scaling study attempts to keep the ratio of problem size and processes constant by increasing both DOFs and processes at the same rate using a DOF per process constraint. A third option, however, holds the number of processes constant while increasing the problem size. This method is known as static scaling and is used to asses the ideal problem size for a given parallel framework. A summary of all the scaling metrics is found in Table 4. An example of ideal static scaling is found in Fig. 7. In this figure, there are three dominant regions: communication limited, optimal scaling, memory limited. In the communication limited region, the problem size is very small compared to the number of processes. This results in the solve slowing down because of the extensive communication overhead. The memory limited region results from the problem size growing out so large that the number of processes is insufficient, resulting in a slowdown because of transferring information in memory from suboptimal locations. In the final region, the problem size is a perfect match for the number of processes, which minimizes solve time and results in optimal scaling. More about static scaling can be found in [6,9,26,31].   Tables 2 and 3 The static scaling result for BGS and MUMPS on the system in (16)-(17) is found in Fig. 8.
As expected, this figure shows that the BGS method can process more DOFs/s than MUMPS, which is only a different way of presenting the information from Fig. 4. Additionally, for both solution methods the anode problem is more efficient, which is expected because of the difficulty in solving the electrolyte problem. The main point of this figure is to show that the BGS method remains within the optimal scaling region longer than MUMPS. This means the BGS method should exhibit good scaling performance.
The strong scaling study is found in Fig. 9. The initial mesh used for this set of experiments comes from the gray rows in Tables 2 and 3, which also corresponds to the boxed points in Fig. 8. This series of tests used 16 processes per node and double the number of nodes with each successive test. Using BGS to solve the anode problem results in nearly ideal scaling; however, the solve within the electrolyte domain starts to see diminishing returns after using 4 nodes (64 processes).
One possible cause for the suboptimal scaling within the electrolyte is an increase of Krylov iterations required to solve the problem as the number of MPI processes increases. As shown in Fig. 10, however, the average number of Krylov iterations remains relatively constant. Another possible cause has to do with the sparsity patterns of the two domains. Unlike in the solid domains, the electrolyte problem contains coupling on more than just the interface boundary. This could lead to a degradation in parallel performance because sparse matrix-vector multiplications have very low arithmetic intensities and can present as serial bottlenecks at the memory levels of each server node (see [9] for further discussion). The tomography-based mesh generated from a sample of SLC1506T Round 2 electrode used for testing electrolyte depletion. This electrode has a porosity of 37.4% Currently, the inverse of each block, J φφ and J cc , is approximated using algebraic multigrid; however, a better approximation to J cc −1 could be constructed using the Schur complement. This is the topic of future research.

Electrolyte Depletion
The final experiment involves the modification discussed in Sect. 2.7. When charging at high C-rates, real Li-ion batteries can experience electrolyte depletion, which makes solving the electrolyte problem significantly more difficult. The mesh used to test electrolyte depletion was generated from a real SLC1506T Round 2 Electrode fabricated by the Cell Analysis, Modeling and Prototyping facility at Argonne National Laboratory and imaged by University College London with nano-CT, which is shown in Fig. 11. The complex geometry of this mesh will make diffusing Li + through the electrolyte more difficult. Further, to artificially induce electrolyte depletion, several simulations are performed by decreasing the initial concentration of Li + within the electrolyte. Figure 12 shows the performance of BGS with the original and modified version of the system in Eqs. (6)- (17). Because electrolyte depletion occurs as the concentration gradient develops over time within the electrode, the time required to increase the SOC of the anode by 0.003555 was used instead of the time to perform an individual BGS solve. This took about 5 full time steps or about 2.88 simulated minutes. Including all five time steps incorporates the effects of electrolyte depletion on the Newton and consistency iterations as well as the linear solve.
In general, as the initial Li + concentration decreases, the time to solution increases, which is expected; however, without the modification, BGS begins to diverge as the concentration decreases, to the point where the simulation crashes with an initial concentration of 40 mol m −3 . The modified version was able to simulate down to 10 mol m −3 . In addition to this, the modified version is generally faster. This has to do with the fact that the unmodified case is Fig. 12 The effect of electrolyte depletion on solve time. This data was collected by summing the time to completion of the first five time steps. The modified system refers to the modification discussed in Sect. 2.7. In a typical Li-ion battery, the electrolyte concentration is around 1200 mol m −3 closer to singular when concentration approaches zero, which requires more linear iterations to get to the same level of accuracy as the modified case. On average, the modified case was 1.7 times faster than the unmodified problem.

Conclusion
In this paper, we presented a segregated-domain scheme to solve the electrochemistry of the Li-ion battery with FEniCS and a block preconditioning approach that significantly accelerates the linear solver. By simply leveraging the state-of-the-art preconditioned Krylov solvers available in the PETSc library, we achieved a nearly 6 times speedup over traditionally used direct solvers, and we expect this speedup to increase when we examine larger domains. Further, our block solver approach is scalable both in the algorithmic and parallel sense. Additionally, with a small modification to the off-diagonal term within the electrolyte, the solver was able to handle simulations with significantly smaller Li + concentrations. Continuing research will include the development of a better Schur-based block preconditioner for the electrolyte to overcome the issue of off-diagonal dominance associated with electrolyte depletion.