1 Introduction

Modern computing platforms allow the study and simulation of complex biophysical phenomena with very high accuracy, thanks to the large computing and memory resources available on latest-generation machines, as well as to advanced mathematical and numerical models. As a consequence, the development of accurate and efficient large-scale numerical solvers has become a very important research topic in the parallel scientific computing community. In the cardiac research field, the opportunities offered by computational studies of heart function have increased the interactions among clinicians, mathematicians, physicists and engineers, who work together to better understand the behavior of the different parts of the heart and to use these studies to predict and simulate several cardiac pathologies [25, 30].

The mathematical modeling of the heart involves systems of partial differential equations (PDEs) and ordinary differential equations (ODEs), which are combined together to model the three main cardiac functions: the electrophysiology [4], the induced cardiac mechanical activity [27], and the blood circulation inside the heart [8].

In this paper, we focus on the first of these functions and design efficient, scalable and robust numerical solvers for cardiac electrophysiological models. Among the recent works devoted to the study of parallel cardiac solvers, as well as to the coupling of different electrical and mechanical cardiac models, we refer to [2, 6, 23, 32] and to the references therein. The main objective of this work is to investigate the efficiency and robustness of a decoupled dual-primal Newton–Krylov solver for implicit time discretizations of the Bidomain model, already introduced in [15] but only coupled with simplified ionic models. This paper extends that study to include more complex and biophysically detailed ionic models, as well as the presence of ischemic regions.

The Bidomain model describes the propagation of the electric signal in the cardiac tissue by means of a system of parabolic PDEs, coupled with a system of ODEs representing the ionic current dynamics. By using a fully implicit scheme for the discretization of the temporal variable, we obtain at each time step a nonlinear algebraic system, which is solved by the Newton method. We choose to solve the Jacobian linear system arising at each Newton step with a Krylov iterative solver, accelerated by Balancing Domain Decomposition with Constraints (BDDC) [9, 10] and Dual-Primal Finite Element Tearing and Interconnecting (FETI-DP) [12] preconditioners. These algorithms belong to the class of dual-primal Domain Decomposition (DD) methods, where the degrees of freedom (dofs) are classified into those internal to each subdomain and those on the interface, the latter being further divided into dual and primal dofs [29].

In previous work, the authors have analyzed and presented a theoretical convergence rate estimate for the preconditioned solver, together with several preliminary parallel tests using a simple phenomenological ionic model. In particular, the proposed solution strategy relies on a staggered approach, where the two PDE and ODE systems are solved successively, in contrast to a monolithic or coupled approach (e.g. [16, 22]).

We extend here the numerical results by studying the robustness of the proposed solver both when biophysical ionic models are employed, such as the Luo–Rudy phase one [19] and the Ten Tusscher–Panfilov [28] human ionic models, and when ischemic regions are considered. The inclusion of ischemic regions in the computational domain is modeled mathematically by introducing jumps in the diffusion coefficients; in turn, the discontinuity of the diffusion coefficients across the boundary of the ischemic region impacts the conditioning of the linear systems, thus requiring robust preconditioned iterative solvers.

This paper is structured as follows: in Section 2 we review the microscopic and macroscopic models of cardiac electrophysiology, introducing the application we consider for our dual-primal solver, which is presented in Section 3 together with the adopted discretization and solution schemes. Section 4 provides extensive parallel numerical tests that show the scalability and robustness of the proposed dual-primal Newton–Krylov solver, laying the basis for several possible future extensions, discussed in the conclusive Section 5.

2 The Cardiac Electrical Model

In the following section, we briefly review our cardiac reaction-diffusion model, introducing the assumptions needed for its formulation. Moreover, we introduce the ionic models employed both in the theoretical analysis and in the numerical experiments.

2.1 The Bidomain Model

The mathematical description of the electrical activity in the cardiac tissue, known as myocardium, is provided by the Bidomain model.

The myocardium can be represented as the composition of two ohmic conducting media, named the intra- (Ωi) and extracellular (Ωe) domains, separated by the active cellular membrane (Γ); the latter acts as an insulator between the two domains, as otherwise there would be no potential difference across the membrane. These anisotropic continuous media are assumed to coexist at every point of the tissue and to be connected by a distributed continuous cellular membrane [5]. Additionally, the cardiac muscle fibers rotate counterclockwise and their arrangement is modeled as laminar sheets running radially from the outer surface (epicardium) to the inner surface (endocardium) of the heart.

This setting influences the mathematical definition of the conductivity tensors needed for the formulation of the Bidomain equations. In particular, at each point x of the cardiac domain we can define an orthonormal triplet of vectors al(x), at(x) and an(x), respectively parallel to the local fiber direction, tangent to the laminar sheets and transversal to the fiber axis, and orthogonal to the laminar sheets [18]. By denoting with \(\sigma _{l, t, n}^{i,e}\) the conductivity coefficients in the intra- and extracellular domains along the corresponding directions, we define the conductivity tensors Di and De of the two media as

$$ D_{i,e} (\mathbf{x}) = \sigma_{l}^{i,e} \mathbf{a}_{l} (\mathbf{x})\mathbf{a}_{l}^{T}(\mathbf{x}) + \sigma_{t}^{i,e} \mathbf{a}_{t} (\mathbf{x})\mathbf{a}_{t}^{T}(\mathbf{x}) + \sigma_{n}^{i,e} \mathbf{a}_{n} (\mathbf{x})\mathbf{a}_{n}^{T}(\mathbf{x}). $$

Structural inhomogeneities in the intra- or extracellular spaces due to the presence of gap junctions, blood vessels and collagen are generally included in the conductivity tensors Di and De as inhomogeneous functions of space.
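To make this construction concrete, the following minimal sketch assembles a conductivity tensor from a given orthonormal fiber frame; the frame and the conductivity values below are illustrative placeholders, not the calibrated parameters taken from [5] in our experiments.

```python
# Sketch: build D = sigma_l a_l a_l^T + sigma_t a_t a_t^T + sigma_n a_n a_n^T
# from an orthonormal fiber frame (a_l, a_t, a_n) at a point x.
import numpy as np

def conductivity_tensor(a_l, a_t, a_n, sigma_l, sigma_t, sigma_n):
    return (sigma_l * np.outer(a_l, a_l)
            + sigma_t * np.outer(a_t, a_t)
            + sigma_n * np.outer(a_n, a_n))

# Hypothetical fiber frame: along-fiber, in-sheet transverse, sheet-normal directions.
a_l = np.array([1.0, 0.0, 0.0])
a_t = np.array([0.0, 1.0, 0.0])
a_n = np.array([0.0, 0.0, 1.0])

# Hypothetical intracellular conductivities (the actual values are those of [5]).
D_i = conductivity_tensor(a_l, a_t, a_n, sigma_l=3.0e-3, sigma_t=3.1e-4, sigma_n=3.1e-5)
print(D_i)
```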

Thanks to these hypotheses on the cardiac tissue, the electric potential is defined at each point of the two domains as a quantity averaged over a small volume: consequently, every point of the cardiac tissue is assumed to belong to both the intracellular and the extracellular spaces, and is thus assigned both an intra- and an extracellular potential. From now on, we will denote by Ω the cardiac tissue volume represented by the superposition of these two spaces.

The parabolic-parabolic formulation of the Bidomain model can be stated as follows: find the intracellular and extracellular potentials \(u_{i,e}: {\Omega } \times (0, T) \rightarrow \mathbb {R}\), the transmembrane potential \(v = u_{i} - u_{e}: {\Omega } \times (0, T) \rightarrow \mathbb {R}\), the gating variables \(\mathbf {w}: {\Omega } \times (0, T) \rightarrow \mathbb {R}^{M}\), the ionic concentration variables \(\mathbf {c}: {\Omega } \times (0, T) \rightarrow \mathbb {R}^{S}\), \(M, S \in \mathbb {N}\), such that

$$ \left\{\begin{array}{ll} \chi C_{m} \frac{\partial v}{\partial t} - \text{div} \left( D_{i} \nabla u_{i} \right) + \chi I_{\text{ion}} (v, \mathbf{w}, \mathbf{c}) = I_{\text{app}}^{i} & \text{ in } {\Omega} \times (0,T), \\ -\chi C_{m} \frac{\partial v}{\partial t} - \text{div} \left( D_{e} \nabla u_{e} \right) - \chi I_{\text{ion}} (v,\mathbf{w}, \mathbf{c}) = I_{\text{app}}^{e} & \text{ in } {\Omega} \times (0,T), \\ \frac{\partial\mathbf{w}}{\partial t} - \mathbf{R} (v,\mathbf{w}) = 0 &\text{ in } {\Omega} \times (0,T), \\ \frac{\partial\mathbf{c}}{\partial t} - \mathbf{C} (v,\mathbf{w}, \mathbf{c}) = 0 &\text{ in } {\Omega} \times (0,T), \\ \mathbf{n}^{T} D_{i,e} \nabla u_{i,e} = 0 &\text{ on } \partial{\Omega} \times (0,T), \end{array}\right. $$
(1)

given \(I_{\text {app}}^{i,e}: {\Omega } \times (0, T) \rightarrow \mathbb {R}\) intra- and extracellular applied currents and initial values \(v_{0}: {\Omega } \rightarrow \mathbb {R}\), \(\mathbf {w}_{0}: {\Omega } \rightarrow \mathbb {R}^{M}\) and \(\mathbf {c}_{0}: {\Omega } \rightarrow (0, +\infty )^{S}\)

$$ v(\mathbf{x},0) = v_{0}(\mathbf{x}), \quad \mathbf{w}(\mathbf{x}, 0) = \mathbf{w}_{0} (\mathbf{x}), \quad \mathbf{c} (\mathbf{x}, 0) = \mathbf{c}_{0} (\mathbf{x})\qquad \qquad \text{in } {\Omega}. $$

Here, Cm is the membrane capacitance per unit area of the membrane surface and χ is the membrane surface-to-volume ratio. The homogeneous Neumann boundary conditions in system (1) (last row) represent mathematically the assumption that the heart is electrically insulated. The nonlinear reaction term Iion and the ODE systems for the gating variables w (which represent the opening and closing process of the ionic channels) and the ionic concentrations c are given by the chosen ionic membrane model.

Existence, uniqueness and regularity of the solution of system (1) have been extensively studied; see for example [5, 31].

2.2 Ionic Models

In this work, we consider three different ionic models. First, a phenomenological ionic model, derived from a modification of the renowned FitzHugh–Nagumo model [13, 14] and named the Rogers–McCulloch (RMC) ionic model [24]. This model overcomes the hyperpolarization of the cell during the repolarization phase by adding a nonlinear dependence between the transmembrane potential and the gating variable and by neglecting the ionic concentration variables. For this model, Iion(v,w) and R(v,w) are given by

$$ I_{\text{ion}} (v,w) = G v \left( 1 - \frac{v}{v_{th}}\right) \left( 1 - \frac{v}{v_{p}} \right) + \eta_{1} v w, \qquad R(v,w) = \eta_{2} \left( \frac{v}{v_{p}} - w \right), $$

where G, vth, vp, η1 and η2 are given coefficients.
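As an illustration, the following minimal sketch evaluates the RMC reaction terms defined above; the coefficient values are placeholders and should be replaced by the calibrated ones from [24].

```python
# Sketch of the Rogers-McCulloch ionic current and gating dynamics.
import numpy as np

G, v_th, v_p = 1.5, 13.0, 100.0      # illustrative coefficients
eta1, eta2 = 4.4, 0.012

def I_ion(v, w):
    # I_ion(v, w) = G v (1 - v/v_th)(1 - v/v_p) + eta1 v w
    return G * v * (1.0 - v / v_th) * (1.0 - v / v_p) + eta1 * v * w

def R(v, w):
    # R(v, w) = eta2 (v/v_p - w)
    return eta2 * (v / v_p - w)

# Evaluate the reaction terms on an array of nodal transmembrane potentials.
v = np.linspace(0.0, 100.0, 5)
w = np.zeros_like(v)
print(I_ion(v, w), R(v, w))
```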

We then consider two biophysically detailed ionic models, namely the Luo–Rudy phase one (LR1) and the Ten Tusscher–Panfilov (TP06) models, which provide a more accurate description of the ionic currents in cardiac cells. For the full sets of equations of these more complex models we refer to the original papers [19, 28].

The theoretical analysis presented in [15] considered only the class of phenomenological ionic models (neglecting the ionic concentration variables). This work extends our previous study to include the LR1 and TP06 ionic models, as well as the presence of ischemic regions.

3 Dual-Primal Newton–Krylov Solvers

In this section, we briefly present our discretization choices for the model, namely a finite element discretization in space, a fully implicit discretization in time and a decoupled (or staggered) solution strategy, in the same fashion as in [20, 26]. Then, we introduce our dual-primal solver for the arising nonlinear algebraic system and we mention the theoretical convergence result, which can be found in its extended form in [15].

3.1 Space and Time Discretizations

We discretize the cardiac domain Ω in space with Q1 finite elements, leading to the semi-discrete system

$$ \left\{\begin{array}{l} \displaystyle \chi C_{m} \mathcal{M} \frac{\partial \mathbf{u}_{i}}{\partial t} + A_{i} \mathbf{u}_{i} + M \mathbf{I_{\text{ion}}} (\mathbf{v}, \mathbf{w}, \mathbf{c}) = M \mathbf{I_{\text{app}}^{i}}, \\ \displaystyle \chi C_{m} \mathcal{M} \frac{\partial \mathbf{u}_{e}}{\partial t} + A_{e} \mathbf{u}_{e} - M \mathbf{I_{\text{ion}}} (\mathbf{v}, \mathbf{w}, \mathbf{c}) = M \mathbf{I_{\text{app}}^{e}}, \\ \displaystyle \frac{\partial \mathbf{w}}{\partial t} = R \left( \mathbf{v}, \mathbf{w} \right), \quad \displaystyle \frac{\partial \mathbf{c}}{\partial t} = C \left( \mathbf{v}, \mathbf{w}, \mathbf{c}\right), \end{array}\right. $$

with Ai, Ae, M the stiffness and mass matrices arising from the finite element discretization.

Regarding the discretization in time, instead of using implicit-explicit (IMEX) schemes [6, 32], where the diffusion term is treated implicitly and the remaining terms explicitly, or more general operator splitting strategies [3], we consider here a fully implicit time discretization within a decoupled (or staggered) approach.

In this procedure, at each time step, the ODE system representing the ionic model is solved first; then, the nonlinear algebraic Bidomain system is solved and the solution updated. In a very schematic way, this strategy can be summarized as follows: for each time step n,

  1. Given the intra- and extracellular potentials at the previous time step, define \(\mathbf {v} := \mathbf {u}_{i}^{n} - \mathbf {u}_{e}^{n}\) and compute the gating and ionic concentration variables

    $$ \begin{array}{@{}rcl@{}} \mathbf{w}^{n+1} - \tau R (\mathbf{v}, \mathbf{w}^{n+1}) &=& \mathbf{w}^{n}, \\ \mathbf{c}^{n+1} - \tau C (\mathbf{v}, \mathbf{w}^{n+1}, \mathbf{c}^{n+1}) &=& \mathbf{c}^{n}. \end{array} $$
  2. Solve and update the Bidomain nonlinear system: given \(\mathbf {u}_{i,e}^{n}\) at the previous time step and given \(\mathbf{w}^{n+1}\) and \(\mathbf{c}^{n+1}\), compute \(\mathbf {u}^{n+1} = (\mathbf {u}_{i}^{n+1}, \mathbf {u}_{e}^{n+1})\) by solving the system \(\mathcal {F}_{\text {dec}} (\mathbf {u}^{n+1}) = \mathcal {G}\)

    $$ \begin{array}{@{}rcl@{}} \mathcal{F}_{\text{dec}}(\mathbf{u}^{n+1}) &=& \left( \chi C_{m} \mathcal{M} + \tau \mathcal{A} \right) \begin{bmatrix} \mathbf{u}_{i}^{n+1} \\ \mathbf{u}_{e}^{n+1} \end{bmatrix} + \tau \begin{bmatrix} M \mathbf{I_{\text{ion}}}(\mathbf{v}^{n+1}, \mathbf{w}^{n+1}, \mathbf{c}^{n+1}) \\ -M \mathbf{I_{\text{ion}}}(\mathbf{v}^{n+1}, \mathbf{w}^{n+1}, \mathbf{c}^{n+1}) \end{bmatrix},\\ \mathcal{G} &=& \chi C_{m} \mathcal{M} \begin{bmatrix} \mathbf{u}_{i}^{n} \\ \mathbf{u}_{e}^{n} \end{bmatrix} + \tau \begin{bmatrix} M \mathbf{I_{\text{app}}^{i}} \\ -M \mathbf{I_{\text{app}}^{e}} \end{bmatrix}, \end{array} $$

    where

    $$ \mathcal{A} = \begin{bmatrix} A_{i} & 0 \\ 0 & A_{e} \end{bmatrix}, \qquad \mathcal{M} = \begin{bmatrix} M & -M \\ -M & M \end{bmatrix}. $$

We observe that the Jacobian linear system associated with the nonlinear problem in step 2 is symmetric.
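The following sketch illustrates the two steps of this staggered strategy on a tiny 1D toy problem with the RMC ionic model, using dense linear algebra as a stand-in for the parallel finite element assembly and the preconditioned Krylov solvers described in the next subsection; the mesh, parameters and initial data are illustrative.

```python
# Sketch of one staggered time step: (1) implicit gating update, (2) Newton on F_dec(u) = G.
import numpy as np

n, L = 20, 1.0                          # nodes and domain length (illustrative)
h = L / (n - 1)
chi, C_m, tau = 1.0, 1.0, 0.05          # placeholder membrane and time-step parameters
G, v_th, v_p, eta1, eta2 = 1.5, 13.0, 100.0, 4.4, 0.012   # RMC coefficients (illustrative)

# 1D P1 mass and stiffness matrices with Neumann boundary conditions.
M = np.zeros((n, n)); A = np.zeros((n, n))
for e in range(n - 1):
    M[e:e+2, e:e+2] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    A[e:e+2, e:e+2] += 1.0 / h * np.array([[1.0, -1.0], [-1.0, 1.0]])
A_i, A_e = 1.0 * A, 2.0 * A             # stand-ins for the anisotropic stiffness matrices
calM = np.block([[M, -M], [-M, M]])
calA = np.block([[A_i, np.zeros((n, n))], [np.zeros((n, n)), A_e]])

I_ion  = lambda v, w: G * v * (1 - v / v_th) * (1 - v / v_p) + eta1 * v * w
dI_ion = lambda v, w: G * (1 - 2 * v / v_th - 2 * v / v_p + 3 * v**2 / (v_th * v_p)) + eta1 * w

u = np.zeros(2 * n)
u[:n] = 30.0 * np.exp(-50.0 * np.linspace(0.0, L, n) ** 2)   # initial u_i; u_e = 0
w = np.zeros(n)

# Step 1: implicit gating update w^{n+1} - tau R(v, w^{n+1}) = w^n.
# For RMC, R is linear in w, so the update is explicit; detailed models need a local solve.
v = u[:n] - u[n:]
w = (w + tau * eta2 * v / v_p) / (1.0 + tau * eta2)

# Step 2: Newton iterations on F_dec(u^{n+1}) = G.
K = chi * C_m * calM + tau * calA
rhs = chi * C_m * calM @ u              # right-hand side G (no applied current here)
for _ in range(5):
    v = u[:n] - u[n:]
    ion = M @ I_ion(v, w)
    F = K @ u + tau * np.concatenate([ion, -ion]) - rhs
    Md = M @ np.diag(dI_ion(v, w))
    J = K + tau * np.block([[Md, -Md], [-Md, Md]])
    # J is singular (potentials defined up to a constant): take the minimum-norm Newton
    # step here; the paper instead solves the consistent system by preconditioned GMRES.
    du = np.linalg.lstsq(J, -F, rcond=None)[0]
    u = u + du
print("norm of last Newton update:", np.linalg.norm(du))
```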

This decoupled strategy is usually adopted in contrast to a monolithic approach, where the PDE and ODE systems are solved simultaneously and the computational workload is higher, due to the presence of the gating and ionic concentration variables in the nonlinear algebraic system. Nevertheless, the monolithic strategy has been extensively studied and several scalable parallel preconditioners have been designed and analyzed for it (e.g. [16, 21, 22]).

3.2 Dual-primal Preconditioners

In this approach, a linear system has to be solved at each Newton step: to this end, we employ an iterative method, preconditioned by a dual-primal substructuring algorithm.

Let us consider a partition of the computational domain Ω into N nonoverlapping subdomains Ωj of diameter Hj, such that \({\Omega } = \cup _{j=1}^{N} {\Omega }_{j}\), and define the subdomain interface as \({\Gamma } = \left (\cup _{j=1}^{N} \partial {\Omega }_{j} \right ) \backslash \partial {\Omega }\). Let W(j) be the associated local finite element spaces. We introduce the product spaces

$$ W = \prod\limits_{j=1}^{N} W^{(j)}, \qquad \qquad W_{\Gamma} := \prod\limits_{j=1}^{N} W_{\Gamma}^{(j)}, $$

where we have partitioned each W(j) into the interior part \(W_{I}^{(j)}\) and the finite element trace space \(W_{\Gamma }^{(j)}\). We define \(\widehat {W} \subset W\) as the subspace of functions of W which are continuous in all interface variables between subdomains; similarly, we denote by \(\widehat {W}_{\Gamma } \subset W_{\Gamma }\) the subspace formed by the continuous elements of WΓ. In order to obtain good convergence and to ensure that each local problem is invertible, a proper choice of primal constraints is needed. These primal constraints are, in this sense, continuity constraints which we require to hold throughout the iterations. By denoting with \(\widetilde {W}\) the space of finite element functions in W which are continuous in all primal variables, we also have \(\widehat {W} \subset \widetilde {W} \subset W\) and \(\widehat {W}_{\Gamma } \subset \widetilde {W}_{\Gamma } \subset W_{\Gamma }\).

Let \(W_{\Pi }^{(j)} \subset W_{\Gamma }^{(j)}\) be the primal subspace of functions which are continuous across the interface and are subassembled among the subdomains sharing Γ(j). Moreover, denote with \(W_{\Delta }^{(j)} \subset W_{\Gamma }^{(j)}\) the space of finite element functions (called dual) which can be discontinuous across the interface and vanish at the primal degrees of freedom. Analogously, \(W_{\Pi } = {\prod }_{j=1}^{N} W_{\Pi }^{(j)}\) and \(W_{\Delta } = {\prod }_{j=1}^{N} W_{\Delta }^{(j)}\), thus \(W_{\Gamma } = W_{\Pi } \oplus W_{\Delta }\). Using this notation, we can decompose \(\widetilde {W}_{\Gamma }\) into a primal subspace \(\widehat {W}_{\Pi }\), which contains only continuous elements, and a dual subspace WΔ, which contains finite element functions that can be discontinuous. In this work, we will denote with subscripts I, Δ and Π the interior, dual and primal variables, respectively.

In the substructuring framework, the starting problem Kw = f is reduced to the interface Γ by eliminating the degrees of freedom (dofs) interior to each subdomain, obtaining the Schur complement system

$$ S_{\Gamma} w_{\Gamma} = g_{\Gamma}, $$
(2)

where \(S_{\Gamma } = K_{\Gamma {\Gamma }} - K_{\Gamma I} K_{I I}^{-1} K_{I {\Gamma }}\) and \(g_{\Gamma } = f_{\Gamma } - K_{\Gamma I} K_{I I}^{-1} f_{I}\) are obtained by reordering the dofs into interior and interface ones.
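A small dense sketch of this interface reduction is reported below: a random symmetric positive definite matrix stands in for the reordered matrix K, the Schur complement and reduced right-hand side are formed explicitly (which is never done in practice on large problems), the interface problem is solved and the interior unknowns are recovered by back-substitution.

```python
# Sketch: Schur complement reduction to the interface and interior back-substitution.
import numpy as np

rng = np.random.default_rng(0)
n_I, n_G = 8, 4                                   # interior and interface dofs (illustrative)
X = rng.standard_normal((n_I + n_G, n_I + n_G))
K = X @ X.T + (n_I + n_G) * np.eye(n_I + n_G)     # SPD stand-in for the reordered matrix
f = rng.standard_normal(n_I + n_G)

K_II, K_IG = K[:n_I, :n_I], K[:n_I, n_I:]
K_GI, K_GG = K[n_I:, :n_I], K[n_I:, n_I:]
f_I, f_G = f[:n_I], f[n_I:]

S_G = K_GG - K_GI @ np.linalg.solve(K_II, K_IG)   # S_Gamma
g_G = f_G - K_GI @ np.linalg.solve(K_II, f_I)     # g_Gamma

w_G = np.linalg.solve(S_G, g_G)                   # interface solve (here: direct)
w_I = np.linalg.solve(K_II, f_I - K_IG @ w_G)     # interior back-substitution

# Check against the monolithic solve.
print(np.allclose(np.concatenate([w_I, w_G]), np.linalg.solve(K, f)))
```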

This reordering of the degrees of freedom makes it possible to define algorithms where each subproblem is solved independently of the others, except for the primal constraints, where the variables are assumed to be continuous across the subdomains. For a detailed presentation, we refer to [29].

On these premises, it is possible to build two of the most widely used dual-primal iterative substructuring algorithms, namely the Balancing Domain Decomposition with Constraints (BDDC) and the dual-primal Finite Element Tearing and Interconnecting (FETI-DP) preconditioners.

In order to ensure the continuity of the solution across the subdomains, an appropriate interface averaging is needed. In our work, we focus on both the standard ρ-scaling and the deluxe scaling of the dual variables.

Restriction and scaling operators

Before introducing the preconditioners, it is helpful to understand how the scaling procedure works. We define the restriction operators

$$ \begin{array}{@{}rcl@{}} R_{\Delta}^{(j)}: W_{\Delta} \rightarrow W_{\Delta}^{(j)}, &&\qquad R_{\Gamma {\Delta}}: W_{\Gamma} \rightarrow W_{\Delta}, \\ R_{\Pi}^{(j)}: \widehat{W}_{\Pi} \rightarrow W_{\Pi}^{(j)}, &&\qquad R_{\Gamma {\Pi}}: W_{\Gamma} \rightarrow \widehat{W}_{\Pi}, \end{array} $$

and the direct sums \(R_{\Delta } = \oplus R_{\Delta }^{(j)}\), \(R_{\Pi } = \oplus R_{\Pi }^{(j)} \) and \(\widetilde {R}_{\Gamma } = R_{\Gamma {\Pi }} \oplus R_{\Gamma {\Delta }}\), which maps WΓ into \(\widetilde {W}_{\Gamma }\).

The ρ-scaling can be defined for the Bidomain model at each node x ∈Γ(j) as

$$ \delta^{i,e\dagger}_{j} (x) = \frac{\sigma_{M}^{{i,e}^{(j)}}}{{\sum}_{k\in\mathcal{N}_{x}}\sigma_{M}^{{i,e}^{(k)}}}, \qquad \sigma_{M}^{{i,e}^{(j)}} = \max_{\bullet = \{l, t, n\}} \sigma^{{i,e}^{(j)}}_{\bullet}, $$
(3)

where \(\mathcal {N}_{x}\) is the set of indices of all subdomains with x in the closure of the subdomain.
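As a concrete example, the following snippet computes the ρ-scaling weights (3) at a node shared by a few subdomains; the conductivity values are illustrative.

```python
# Sketch of the rho-scaling weights at a node x shared by the subdomains in N_x.
# sigma_M[j] is the maximum conductivity of subdomain j (illustrative values;
# e.g., subdomain 5 could lie in an ischemic region with reduced conductivity).
sigma_M = {1: 3.0e-3, 4: 3.0e-3, 5: 1.0e-3}

total = sum(sigma_M.values())
delta_dagger = {j: s / total for j, s in sigma_M.items()}
print(delta_dagger)              # the weights sum to 1 over N_x
```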

Conversely, the deluxe scaling (introduced in [7, 10]) computes the average \(\bar {w} = E_{D} w\) for each face \(\mathcal {F}\) or edge \(\mathcal {E}\) equivalence class.

Suppose that \(\mathcal {F}\) is shared by subdomains Ωj and Ωk. Let \(S^{(j)}_{\mathcal {F}}\) and \(S^{(k)}_{\mathcal {F}}\) be the principal minors obtained from \(S^{(j)}_{\Gamma }\) and \(S^{(k)}_{\Gamma }\) by extracting all rows and columns related to the degrees of freedom of the face \(\mathcal {F}\). Denote with \(u_{j,\mathcal {F}} = R_{\mathcal {F}} u_{j}\) the restriction of uj to the face \(\mathcal {F}\) through the restriction operator \(R_{\mathcal {F}}\). Then, the deluxe average across \(\mathcal {F}\) can be defined as

$$ \bar{u}_{\mathcal{F}} = \left( S^{(j)}_{\mathcal{F}} + S^{(k)}_{\mathcal{F}} \right)^{-1} \left( S^{(j)}_{\mathcal{F}} u_{j,\mathcal{F}} + S^{(k)}_{\mathcal{F}} u_{k,\mathcal{F}} \right). $$

This definition can be extended to the deluxe average across an edge \(\mathcal {E}\). Suppose for simplicity that \(\mathcal {E}\) is shared by only three subdomains with indices j1, j2 and j3; the extension to more than three subdomains is analogous. Let \(u_{j,\mathcal {E}} = R_{\mathcal {E}} u_{j}\) be the restriction of uj to the edge \(\mathcal {E}\) through the restriction operator \(R_{\mathcal {E}}\) and define \(S_{\mathcal {E}}^{(j_{123})} = S_{\mathcal {E}}^{(j_{1})} + S_{\mathcal {E}}^{(j_{2})} + S_{\mathcal {E}}^{(j_{3})}\); the deluxe average across an edge \(\mathcal {E}\) is defined as

$$ \bar{u}_{\mathcal{E}} = \left( S_{\mathcal{E}}^{(j_{123})}\right)^{-1} \left( S_{\mathcal{E}}^{(j_{1})} u_{j_{1},\mathcal{E}} + S_{\mathcal{E}}^{(j_{2})} u_{j_{2}, \mathcal{E}} + S_{\mathcal{E}}^{(j_{3})} u_{j_{3}, \mathcal{E}} \right). $$
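The following dense sketch evaluates the face and edge deluxe averages defined above, using small random symmetric positive definite matrices as stand-ins for the local Schur complement blocks.

```python
# Sketch of the deluxe averages for a face F (two subdomains) and an edge E (three).
import numpy as np

rng = np.random.default_rng(1)
def spd(n):
    X = rng.standard_normal((n, n))
    return X @ X.T + n * np.eye(n)

# Face average: (S_F^(j) + S_F^(k))^{-1} (S_F^(j) u_jF + S_F^(k) u_kF).
nF = 5
S_jF, S_kF = spd(nF), spd(nF)
u_jF, u_kF = rng.standard_normal(nF), rng.standard_normal(nF)
u_bar_F = np.linalg.solve(S_jF + S_kF, S_jF @ u_jF + S_kF @ u_kF)

# Edge average for three subdomains j1, j2, j3.
nE = 3
S_E = [spd(nE) for _ in range(3)]
u_E = [rng.standard_normal(nE) for _ in range(3)]
u_bar_E = np.linalg.solve(sum(S_E), sum(S @ u for S, u in zip(S_E, u_E)))

print(u_bar_F, u_bar_E)
```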

The average \(\bar {u}\) is constructed from the contributions of the relevant equivalence classes involving the substructure Ωj. These contributions belong to \(\widehat {W}_{\Gamma }\), after being extended by zero to \({\Gamma } \backslash \mathcal {F}\) or \({\Gamma } \backslash \mathcal {E}\). The contributions \(R^{T}_{\ast } \bar {u}_{\ast }\) from the different equivalence classes are then summed to obtain

$$ \bar{u} = E_{D} u = u_{\Pi} + \sum\limits_{\ast = \{\mathcal{F}, \mathcal{E}\}} R^{T}_{\ast} \bar{u}_{\ast}, $$

where ED is a projection and

$$ P_{D} u := (I - E_{D}) u = u_{\Delta} - \sum\limits_{\ast = \{ \mathcal{F}, \mathcal{E} \} } R^{T}_{\ast} \bar{u}_{\ast}, $$

is its complementary projection.

We define the scaling matrix for each subdomain Ωj as

$$ D^{(j)} = \text{diag}\left[D^{(j)}_{\ast_{k_{1}}}, \dots, D^{(j)}_{\ast_{k_{j}}}\right],\qquad \ast = \{\mathcal{F}, \mathcal{E}\} $$

where \(k_{1}, \dots , k_{j} \in {\varXi }_{j}^{\ast }\), the set containing the indices of the subdomains that share the face \(\mathcal {F}\) or the edge \(\mathcal {E}\), and where the diagonal blocks are given by \(D^{(j)}_{\mathcal {F}} = \big (S^{(j)}_{\mathcal {F}} + S^{(k)}_{\mathcal {F}}\big )^{-1} S^{(j)}_{\mathcal {F}}\) or \(D^{(j)}_{\mathcal {E}} = (S_{\mathcal {E}}^{(j_{1})} + S_{\mathcal {E}}^{(j_{2})} + S_{\mathcal {E}}^{(j_{3})})^{-1} S_{\mathcal {E}}^{(j_{1})}\) in case the deluxe scaling is considered. Conversely, if the ρ-scaling is chosen, the j-th diagonal scaling matrix D(j) contains the pseudo-inverses (3) along the diagonal.

Lastly, we define the scaled local restriction operators

$$ R_{D, {\Gamma}}^{(j)} = D^{(j)} R_{\Gamma}^{(j)}, \qquad\qquad R_{D, {\Delta}}^{(j)} = R_{\Gamma {\Delta}}^{(j)} R_{D, {\Gamma}}^{(j)} , $$

with \(R_{D, {\Delta}}\) defined as the direct sum of the \(R_{D, {\Delta }}^{(j)}\), and the global scaled operator \(\widetilde {R}_{D, {\Gamma }} = R_{\Gamma {\Pi }} \oplus R_{D, {\Delta }} R_{\Gamma {\Delta }}\).

BDDC preconditioners

BDDC preconditioners are two-level preconditioners, introduced in [9], for the Schur complement system

$$ \widehat{S}_{\Gamma} w_{\Gamma} = \widehat{g}_{\Gamma}, \qquad \widehat{g}_{\Gamma} = \widehat{f}_{\Gamma} - K_{\Gamma I} K_{II}^{-1} f_{I}, $$
(4)

where \(\widehat {S}_{\Gamma } = R_{\Gamma }^{T} S_{\Gamma } R_{\Gamma }\) and \(\widehat {f}_{\Gamma } = R_{\Gamma }^{T} f_{\Gamma }\) are obtained with the operator RΓ which is the sum of local operators \(R_{\Gamma }^{(j)}\) that return the local interface component.

They can be considered as an evolution of balancing Neumann–Neumann algorithms, where the local and coarse problems are treated additively. In these algorithms, the choice of primal constraints across the interface is important, since it influences the structure and size of the coarse problem and hence the overall convergence of the method.

It is possible to define BDDC preconditioners using the scaled restriction operators \(\widetilde {R}_{D, {\Gamma }}^{T}\) as

$$ M^{-1}_{\text{BDDC}} = \widetilde{R}_{D, {\Gamma}}^{T} \widetilde{S}_{\Gamma}^{-1} \widetilde{R}_{D, {\Gamma}},\qquad\widetilde{S}_{\Gamma} = \widetilde{R}_{\Gamma} S_{\Gamma} \widetilde{R}_{\Gamma}^{T}, $$

where the action of the inverse of \(\widetilde {S}_{\Gamma }\) can be evaluated as

$$ \widetilde{S}_{\Gamma}^{-1} = \widetilde{R}_{\Gamma {\Delta} }^{T} \left( \sum\limits_{j=1}^{N} \begin{bmatrix} 0 &R_{\Delta}^{(j) T} \end{bmatrix} \begin{bmatrix} K_{II}^{(j)} & K_{I {\Delta}}^{(j)} \\ K_{I {\Delta}}^{(j) T} & K_{\Delta {\Delta}}^{(j)} \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ R_{\Delta}^{(j)} \end{bmatrix} \right) \widetilde{R}_{\Gamma {\Delta} } + {\varPhi} S_{\Pi {\Pi}}^{-1} {\varPhi}^{T}. $$

The first term is the sum of local solvers on each subdomain Ωj, while the second is a coarse solver for the primal variables, where

$$ \begin{array}{@{}rcl@{}} S_{\Pi {\Pi} } &=& \sum\limits_{j=1}^{N} R_{\Pi}^{(j) T} \left( K_{\Pi {\Pi}}^{(j)} - \begin{bmatrix} K_{I {\Pi}}^{(j) T} &K_{\Delta {\Pi}}^{(j) T} \end{bmatrix} \begin{bmatrix} K_{II}^{(j)} & K_{I {\Delta}}^{(j)} \\ K_{I {\Delta}}^{(j) T} & K_{\Delta {\Delta}}^{(j)} \end{bmatrix}^{-1} \begin{bmatrix} K_{I {\Pi}}^{(j)} \\ K_{\Delta {\Pi}}^{(j)} \end{bmatrix} \right) R_{\Pi}^{(j)},\\ {\varPhi} &=& R_{\Gamma {\Pi}}^{T} - R_{\Gamma {\Delta}}^{T} \sum\limits_{j=1}^{N} \begin{bmatrix} 0 &R_{\Delta}^{(j) T} \end{bmatrix} \begin{bmatrix} K_{II}^{(j)} & K_{I {\Delta}}^{(j)} \\ K_{I {\Delta}}^{(j) T} & K_{\Delta {\Delta}}^{(j)} \end{bmatrix}^{-1} \begin{bmatrix} K_{I {\Pi}}^{(j)} \\ K_{\Delta {\Pi}}^{(j)} \end{bmatrix} R_{\Pi}^{(j)} \end{array} $$

are, respectively, the primal coarse problem and the matrix which maps the primal degrees of freedom to the interface variables.

Then, the BDDC algorithm for the solution of the Schur complement problem (4) can be defined as: find \(u_{\Gamma } \in \widehat {W}_{\Gamma }\) such that

$$ M^{-1}_{\text{BDDC}} \widehat{S}_{\Gamma} u_{\Gamma} = M^{-1}_{\text{BDDC}} \widehat{g}_{\Gamma}. $$

Once the interface uΓ is computed, we can retrieve the internal solution uI by solving local Dirichlet problems.

FETI-DP preconditioners

FETI-DP preconditioners were first proposed in [12] as an alternative to one-level and two-level FETI methods. They are based on the reformulation of the Schur complement problem (2) as a minimization problem on \(\widetilde {W}_{\Gamma }\) with continuity constraints on the dual degrees of freedom, enforced by means of additional variables called Lagrange multipliers. By introducing a jump matrix B, needed to ensure the correct transmission of the solution between subdomains, the resulting saddle point system

$$ \begin{bmatrix} \widetilde{S}_{\Gamma} & B^{T} \\ B & 0 \end{bmatrix} \begin{bmatrix} w_{\Gamma} \\ \lambda \end{bmatrix} = \begin{bmatrix} \tilde{f}_{\Gamma} \\ 0 \end{bmatrix} $$

is further reduced to a problem only in the Lagrange multipliers unknowns

$$ B \widetilde{S}_{\Gamma}^{-1} B^{T} \lambda = B \widetilde{S}_{\Gamma}^{-1} \tilde{f}_{\Gamma}. $$

This linear system is solved iteratively with FETI-DP preconditioners \(M^{-1}_{\text {FETI-DP}} = B_{D} \widetilde {S}_{\Gamma } {B_{D}^{T}}\) (where BD is the scaled jump operator)

$$ M^{-1}_{\text{FETI-DP}} \left( B \widetilde{S}_{\Gamma}^{-1} B^{T} \right) \lambda = M^{-1}_{\text{FETI-DP}} \left( B \widetilde{S}_{\Gamma}^{-1} \tilde{f}_{\Gamma} \right). $$

Once the Lagrange multipliers are computed and the interface variables wΓ retrieved, it is possible to compute the internal variables wI by solving local problems on each subdomain.
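A small dense sketch of this reduction is given below: a random symmetric positive definite matrix stands in for \(\widetilde{S}_{\Gamma}\) and a hypothetical jump matrix B enforces two continuity constraints; the Lagrange multipliers are computed from the reduced system by a direct dense solve (the paper instead uses GMRES with the FETI-DP preconditioner) and the interface unknowns are then recovered.

```python
# Sketch: eliminate w_Gamma from the saddle point system and solve for the multipliers.
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 2                                   # interface dofs and multipliers (illustrative)
X = rng.standard_normal((n, n))
S = X @ X.T + n * np.eye(n)                   # SPD stand-in for S_Gamma_tilde
B = np.zeros((m, n)); B[0, 1], B[0, 4] = 1.0, -1.0; B[1, 2], B[1, 5] = 1.0, -1.0
f = rng.standard_normal(n)

# Reduced system in the multipliers: B S^{-1} B^T lambda = B S^{-1} f.
SinvBT = np.linalg.solve(S, B.T)
Sinvf = np.linalg.solve(S, f)
lam = np.linalg.solve(B @ SinvBT, B @ Sinvf)

# Recover the interface unknowns and check the saddle point equations.
w = np.linalg.solve(S, f - B.T @ lam)
print(np.allclose(S @ w + B.T @ lam, f), np.allclose(B @ w, 0.0))
```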

Convergence analysis

In [15] we proved a theoretical convergence rate estimate for these two dual-primal preconditioned operators; the estimate holds for both, since BDDC and FETI-DP have been proven in [17] to share the same essential spectrum whenever the same coarse space is chosen.

4 Parallel Numerical Results

We extend here the parallel numerical experiments presented in [15] by testing the efficiency, scalability and robustness of the proposed dual-primal Newton–Krylov solver in the following settings:

  • when biophysical ionic models, such as the LR1 and TP06 models, are considered in a physiological situation;

  • in the presence of an ischemic region, modeled mathematically by introducing jumps in the diffusion coefficients and modifications of some ionic parameters.

We consider two simplified geometries, a thin slab and a portion of a truncated half ellipsoid, the latter modeling an idealized left ventricular geometry. The parametric equations of the prolate ellipsoid are given by the system

$$ \left\{\begin{array}{ll} x = a(r) \cos \theta \cos \varphi,\quad&\theta_{\min} \leq \theta \leq \theta_{\max},\\ y = b (r) \cos \theta \sin \varphi,\quad&\varphi_{\min} \leq \varphi \leq \varphi_{\max},\\ z = c (r) \sin \theta,\quad &0 \leq r \leq 1, \end{array}\right. $$

where a(r) = a1 + r(a2 − a1), b(r) = b1 + r(b2 − b1) and c(r) = c1 + r(c2 − c1), with a1,2, b1,2 and c1,2 given coefficients defining the main axes of the ellipsoid. Regarding the cardiac fibers, we consider an intramural rotation, linear with the depth, of 120°, proceeding counterclockwise from the surface corresponding to the outer layer of the tissue (epicardium, r = 1) to the surface corresponding to the inner layer (endocardium, r = 0), see e.g. Fig. 1.

Fig. 1 Schematic diagram of the fiber orientation on the slab geometry. The nested dashed region represents the location of the transmural ischemic region considered for the experiments in Section 4.2
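As an illustration, the following sketch samples points of the truncated ellipsoid from the parametrization above; the semi-axes and the truncation angles are illustrative placeholders, not the values used in our experiments.

```python
# Sketch: sample a structured grid of points on the truncated half ellipsoid.
import numpy as np

a1, a2 = 1.5, 2.7     # endocardial/epicardial semi-axes (hypothetical values)
b1, b2 = 1.5, 2.7
c1, c2 = 4.4, 5.0

def ellipsoid_point(r, theta, phi):
    a = a1 + r * (a2 - a1)
    b = b1 + r * (b2 - b1)
    c = c1 + r * (c2 - c1)
    return np.array([a * np.cos(theta) * np.cos(phi),
                     b * np.cos(theta) * np.sin(phi),
                     c * np.sin(theta)])

# r in [0, 1] (endocardium -> epicardium), truncated ranges in theta and phi (hypothetical).
r = np.linspace(0.0, 1.0, 5)
theta = np.linspace(-np.pi / 2, np.pi / 8, 9)
phi = np.linspace(-np.pi / 2, np.pi / 2, 9)
pts = np.array([ellipsoid_point(ri, ti, pi_) for ri in r for ti in theta for pi_ in phi])
print(pts.shape)
```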

As already stated, we refer to the original papers [19, 24, 28] for the equations and parameters of Luo–Rudy phase one (LR1), Rogers–McCulloch (RMC) and Ten Tusscher–Panfilov (TP06) ionic models, respectively. Details about the implementation of the ischemic region will be given in the related paragraph. Additionally, we refer to [5] for the parameters related to the Bidomain model.

We apply an external stimulus Iapp = 100 mA/cm3 for 1 ms on a small region of the epicardium. When the slab geometry is considered, the stimulus is applied at one corner of the domain, over a spherical volume of radius 0.1 cm.

We consider a time interval of 2 ms, with fixed time step τ = 0.05 ms, for a total of 40 time steps. This choice is not restrictive: with larger time steps the representation of the wavefront propagating in the tissue would lose accuracy, while smaller time steps would only increase the computational workload for a marginal gain in accuracy.

The parallel C codes have been developed using the Portable Extensible Toolkit for Scientific Computation (PETSc) library [1] and the numerical tests have been carried out on the Indaco cluster at the University of Milan (https://www.indaco.unimi.it/), a Linux Infiniband cluster with 16 nodes, each carrying 2 Intel Xeon E5-2683 V4 2.1 GHz processors with 16 cores each, for a total of 512 cores. We assign one local problem to each processor, thus establishing a one-to-one correspondence between subdomains and processors. We employ the SNES package from the PETSc library, which provides a ready-to-use environment for the solution of nonlinear algebraic systems. We compare the performance of the BDDC and FETI-DP preconditioners with that of algebraic multigrid, both in the Hypre implementation (BoomerAMG, or bAMG, [11]) and in the built-in PETSc implementation (GAMG), with default parameters. For the optimality tests, we compare both the standard ρ-scaling and the deluxe scaling. In order to test the efficiency of our solver on parallel architectures, we also compute the parallel speedup \(S_{p} = \frac {T_{1}}{T_{p}}\), i.e. the ratio between the runtime T1 on 1 processor and the average runtime Tp on p processors.

4.1 Normal Physiological Tests

We start by considering a normal physiological situation for the LR1 and TP06 ionic models and study the weak scalability and optimality of our dual-primal solvers.

Weak scalability

Tables 1 and 2 report a comparison between the bAMG, BDDC and FETI-DP preconditioners. In this set of tests, the local mesh size is fixed to 12 ⋅ 12 ⋅ 12 and the number of processors is increased from 4 to 256, resulting in an increase of the total number of dofs from 16,250 to 926,786. Due to the loss of positive semidefiniteness, the Generalized Minimal Residual (GMRES) method is employed for the solution of the Jacobian linear system.

Table 1 Weak scalability on slab domain for the Bidomain decoupled solver
Table 2 Weak scalability on ellipsoidal domain for the Bidomain decoupled solver

The number of nonlinear iterations does not increase with the number of subdomains and is lower for TP06 on both geometries. Additionally, this parameter seems to be affected by the complexity of the geometry, since it is higher for the truncated ellipsoid.

The performance of BDDC and FETI-DP in terms of average CPU time shows the robustness of the preconditioned solver, since this quantity remains bounded while the number of subdomains increases, except for the case of BDDC for LR1 with 16 processors, for which we do not have any clear explanation; this trend does not hold for bAMG, which presents higher and increasing values.

Regarding the average number of linear iterations, for the slab geometry and for both the LR1 and TP06 ionic models, BDDC and FETI-DP present lower and bounded values, while for bAMG these values increase with the number of processors. On the ellipsoidal geometry, instead, we observe some fluctuations for BDDC and FETI-DP, although the average number of linear iterations remains bounded; the multigrid preconditioner presents higher values than in the corresponding cases on the slab.

Optimality tests

We report the results of optimality tests for increasing ratio H/h in Tables 3 and 4. As in the preliminary tests with the simple RMC ionic model reported in [15], these results are independent of the scaling employed (the ρ-scaling and deluxe scaling, see [10, 29] for more details).

Table 3 Optimality tests on slab domain for the Bidomain decoupled solver
Table 4 Optimality tests on ellipsoidal domain for the Bidomain decoupled solver

The average number of nonlinear iterations settles around 2-4 for the LR1 ionic model and around 2-3 for the TP06 model, with higher values for the ellipsoidal domain. The number of nonlinear iterations seems to be independent of the coarse space employed (V, V+E, V+E+F). As confirmed by Fig. 2, the average number of linear iterations per time step deteriorates as the local problem size increases when the coarse space consists of subdomain vertex constraints only (V). Instead, by enriching the primal space with edge (V+E) and face (V+E+F) constraints, this quantity remains bounded, with lower values.

The average CPU times are almost identical if we consider the richest primal spaces (V+E and V+E+F).

Fig. 2 Optimality tests on the ellipsoidal domain for the Bidomain decoupled solver: average number of linear iterations per time step. Fixed number of subdomains 4 ⋅ 4 ⋅ 4; increasing local size from 4 ⋅ 4 ⋅ 4 to 24 ⋅ 24 ⋅ 24. GMRES solver preconditioned by BDDC. Luo–Rudy phase 1 (LR1) and Ten Tusscher–Panfilov (TP06) ionic models. Comparison between different scalings (dash-dotted: ρ-scaling, continuous: deluxe scaling) and different primal sets (V = vertices, E = edges, F = faces)

4.2 Transmural Ischemic Tests

In order to test the robustness of our dual-primal Bidomain decoupled solver, we add jumps in the diffusion coefficients, modeling pathological conditions such as myocardial ischemia. In particular, we consider a small and regular transmural ischemic region located at the center of both geometries (see Fig. 1). We consider only the RMC and TP06 ionic models, thus investigating the behavior of the dual-primal solver both with a simple phenomenological ionic model and with a human ventricular ionic model.

In the ischemic region, we decrease the diffusion coefficients \({\sigma _{l}^{i}}\) and \({\sigma _{t}^{i}}\) along and across the fibers as shown in Table 5. Furthermore, we reduce the ionic current by 30% for the RMC ionic model; in case of the TP06 ionic model, we increase the extracellular potassium concentration Ko from 5.4 mM to 8 mM and we decrease the sodium conductance GNa by 30%, simulating a region with moderate ischemia. Other settings can be found in [5, Chapter 9] and the references therein.

Table 5 Conductivity coefficients for the Bidomain model in physiological and ischemic tissue for Rogers–McCulloch (RMC) and Ten Tusscher–Panfilov (TP06) ionic models
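The following sketch illustrates how such local modifications of the parameters can be encoded; the shape of the ischemic region, the healthy conductivities and the 50% conductivity reduction are illustrative placeholders (the values actually employed are those of Table 5), while the Ko, GNa and ionic current modifications follow the description above.

```python
# Sketch: return locally modified parameters inside a (hypothetical) spherical ischemic region.
import numpy as np

def in_ischemic_region(x, center=(0.5, 0.5, 0.5), radius=0.2):
    return np.linalg.norm(np.asarray(x) - np.asarray(center)) < radius

def local_parameters(x, model="TP06"):
    # Healthy (hypothetical) values.
    p = {"sigma_l_i": 3.0e-3, "sigma_t_i": 3.1e-4,
         "Ko": 5.4, "GNa_scale": 1.0, "Iion_scale": 1.0}
    if in_ischemic_region(x):
        p["sigma_l_i"] *= 0.5          # reduced intracellular conductivities (cf. Table 5)
        p["sigma_t_i"] *= 0.5
        if model == "TP06":
            p["Ko"] = 8.0              # raised extracellular potassium concentration (mM)
            p["GNa_scale"] = 0.7       # 30% reduction of the sodium conductance
        else:                          # RMC
            p["Iion_scale"] = 0.7      # 30% reduction of the ionic current
    return p

print(local_parameters((0.5, 0.5, 0.55)))   # inside the ischemic region
print(local_parameters((0.0, 0.0, 0.0)))    # healthy tissue
```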

Figures 3 and 4 show the time evolution of the transmembrane potential v and of the extracellular potential ue, respectively, on the epicardial surface, with a transmural ischemic region in the middle of the slab geometry.

In these experiments, we employ the PETSc implementation of algebraic multigrid (GAMG), with default parameters.

Fig. 3 Snapshots (every 2 ms) of the time evolution of the transmembrane potential v in the presence of a transmural ischemic region. For each time frame, we report the epicardial side of the slab domain

Fig. 4 Snapshots (every 2 ms) of the time evolution of the extracellular potential ue in the presence of a transmural ischemic region. For each time frame, we report the epicardial side of the slab domain

Weak scalability. Transmural ischemic region

The results of the weak scalability tests can be found in Tables 6 and 7, where we compare the performance of the algebraic multigrid and dual-primal preconditioners, for both ionic models and both geometries. The local mesh size is fixed to 14 ⋅ 14 ⋅ 14 and the number of processors is increased from 4 to 256, thus resulting in an increase of the total number of dofs from 25,230 to 1,462,050.

Table 6 Weak scalability on slab domain for the Bidomain decoupled solver, transmural ischemic region
Table 7 Weak scalability on ellipsoidal domain for the Bidomain decoupled solver, transmural ischemic region

As already observed in the comparison for the normal physiological case, the average number of nonlinear iterations is higher for the TP06 model than for the RMC model. Additionally, we again observe higher nonlinear iteration counts for the more complex ellipsoidal domain.

Regarding the slab geometry, BDDC and FETI-DP scale well in terms of the average number of linear iterations, since this quantity remains bounded while the number of processors increases, whereas GAMG's iterations grow. For the ellipsoidal geometry, we observe larger fluctuations of the linear iterations for all preconditioners, but the dual-primal preconditioners present lower values than GAMG.

BDDC and FETI-DP are slower than GAMG in terms of average CPU time, probably due to the larger amount of interprocessor communication they require. On the other hand, GAMG's CPU times increase with the number of processors, while the dual-primal preconditioners show a slower CPU time growth.

Strong scalability. Transmural ischemic region

We then consider strong scalability tests, where we fix the global mesh to 192 ⋅ 192 ⋅ 32 elements (for a total of 2,458,434 dofs) for the slab geometry and increase the number of subdomains from 32 to 256. For the portion of truncated ellipsoid, we instead fix the global mesh to 128 ⋅ 128 ⋅ 64 elements (thus resulting in 2,163,330 dofs).
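These dof counts follow directly from the Q1 discretization, which carries the two unknowns ui and ue at each node of the hexahedral mesh; a quick check:

```python
# A hexahedral mesh of n1*n2*n3 Q1 elements has (n1+1)(n2+1)(n3+1) nodes, each with 2 dofs.
def bidomain_dofs(n1, n2, n3):
    return 2 * (n1 + 1) * (n2 + 1) * (n3 + 1)

print(bidomain_dofs(192, 192, 32))   # slab:      2,458,434
print(bidomain_dofs(128, 128, 64))   # ellipsoid: 2,163,330
```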

The average number of nonlinear iterations increases with the complexity of the ionic current: as reported in Tables 8 and 9, this parameter increases from 1-2 per time step with the RMC model to 2-3 per time step with the TP06 model.

Table 8 Strong scalability on slab domain for the Bidomain decoupled solver, transmural ischemic region
Table 9 Strong scalability on ellipsoidal domain for the Bidomain decoupled solver, transmural ischemic region

The robustness of the solver is confirmed by the average number of linear iterations, which is comparable for all the preconditioners and for both ionic models; this indicates that our dual-primal solver retains its good convergence properties even for more complex ionic models. As a consequence of the higher nonlinear iteration counts, the CPU times for the TP06 model are larger than for RMC, but the parallel speedups of the two models are comparable. Since we are working with a small number of processors (hence with larger local problems), BDDC and FETI-DP exceed the ideal speedup (computed with respect to both 32 and 64 processors), because the factorization of the local matrices takes most of the computational time.

Optimality tests. Transmural ischemic region

Lastly, we report in Tables 10 and 11 the results of optimality tests for increasing ratio H/h. We fix the number of processors (and subdomains) to 64 and we increase the local size H/h from 4 to 24, thus reducing the finite element size h.

Table 10 Optimality tests on slab domain for the Bidomain decoupled solver, transmural ischemic region
Table 11 Optimality tests on ellipsoidal domain for the Bidomain decoupled solver, transmural ischemic region

Since the dual-primal preconditioners have been proven to be spectrally equivalent, we focus only on the performance of BDDC. For both ionic models, we again consider both scalings (ρ-scaling on top, deluxe scaling at bottom of each table) and we test the solver for increasing primal spaces including only vertex constraints (V), vertex and edge constraints (V+E) or vertex, edge and face constraints (V+E+F).

Similar results hold for both geometries, independently of the ionic model employed and of the scaling chosen. Despite higher average CPU timings for the deluxe scaling, all the other parameters are similar between the two scalings (the average number of nonlinear iterations is the same for each ionic model).

If we consider only vertex constraints (V) in the coarse space, we observe that the number of linear iterations increases with the local size, as shown in Fig. 5. On the other hand, by adding edge (V+E) and face (V+E+F) constraints to the primal space, we obtain a quasi-optimality condition, where the linear iterations remain bounded, except for the slab geometry with the TP06 ionic model, where the ρ-scaling with the full primal space (V+E+F) behaves like the coarsest primal space (V).

Fig. 5 Optimality tests on the slab and ellipsoidal domains for the Bidomain decoupled solver, transmural ischemic region: average number of linear iterations per time step. Fixed number of subdomains 4 ⋅ 4 ⋅ 4; increasing local size from 4 ⋅ 4 ⋅ 4 to 24 ⋅ 24 ⋅ 24. GMRES solver preconditioned by BDDC. Rogers–McCulloch (RMC) and Ten Tusscher–Panfilov (TP06) ionic models, in the presence of a transmural ischemic region. Comparison between different scalings (dash-dotted: ρ-scaling, continuous: deluxe scaling) and different primal sets (V = vertices, E = edges, F = faces)

5 Conclusions

We have reviewed BDDC and FETI-DP preconditioners for fully implicit time discretizations of the Bidomain system, solved through a staggered strategy. We have presented extensive parallel numerical experiments testing the efficiency, scalability and robustness of the solver, both with complex ionic models and in the presence of regions with moderate ischemia. The results confirm the validity of the proposed solvers, enlarging the class of available methods for the numerical solution of complex reaction-diffusion models. Future studies should investigate alternatives to the Newton method, exploring the potential of quasi-Newton and inexact-Newton algorithms.