1 Introduction

Homogenization theory [1,2,3] serves as the mathematical basis for understanding the behavior of microstructured and composite materials. For given constitutive laws and an explicit description of the microstructure, the effective constitutive law emerges naturally, based on solving a partial differential equation (PDE), the so-called cell problem.

Only in special cases may the effective constitutive laws be expressed in closed form [4], and bounding techniques [5, 6] may lack accuracy. For these reasons, numerical approaches for solving the cell problem, so-called computational homogenization methods [7], emerged. Such techniques face a number of challenges. For a start, microstructures of heterogeneous materials may be rather complex, which becomes clear when looking at typical materials to be homogenized: polycrystalline materials, fiber-reinforced composites, rocks, foams, textiles and many more [8]. Deep insights into such microstructures are provided by modern digital volume imaging techniques like serial sectioning [9, 10], optical microscopy [11, 12], FIB-SEM [13,14,15], electron backscatter diffraction [16,17,18], X-ray diffraction microscopy [19, 20] and micro-computed tomography [21, 22]. Computational homogenization methods need to handle the complexity of industrial-scale microstructures, and should be compatible with digital volume data. In addition to their inherent complexity, microstructured materials typically show a degree of randomness as a result of the production process [23,24,25]. To obtain deterministic effective constitutive laws, the cell problem needs to be solved on sufficiently large cells, so-called representative volume elements (RVEs) [26, 27]. To decide whether a cell is representative (enough), it is necessary to quantify the standard deviation of the computed effective properties for cells of fixed size (the so-called dispersion [28, 29]), and to investigate the change in the empirical mean of the computed effective properties for increasing cell size (to quantify the bias [28, 29]). Thus, computational homogenization methods need to solve the cell problem for complex microstructures on large volumes, and need to do so repeatedly in order to quantify the randomness.

In addition to the geometrical properties of the microstructure, computational homogenization methods should be applicable to a wide range of physical behaviors. This includes a variety of coupled problems, see Milton [4] for an overview. However, for applications in mechanics, for instance, the ability to work with nonlinear (and inelastic) material behavior is crucial.

From a mathematical perspective, the cell problem represents a partial differential equation (PDE) with discontinuous coefficients. In particular, the associated solutions are not smooth across the material interfaces. Thus, to exploit higher-order methods, the interfaces need to be resolved by the mesh. However, due to the complexity of the microstructures, generating such meshes may be difficult.

Fortunately, there are also a few characteristics which simplify treating the cell problem. For a start, the particular shape of the computational cell is not fixed beforehand—it is only required to be "large enough". In particular, we may choose a rectangular cell, and solve the cell problem with periodic boundary conditions. It was found empirically that using periodic boundary conditions produces results with smaller bias than for Dirichlet and Neumann boundary conditions [28, 30, 31].

As a second simplification, the cell problem typically does not need to be solved to high accuracy. Indeed, we are primarily interested in effective, i.e., averaged, quantities. Also, the useful accuracy is limited by the variance associated with the current cell size.

There are a number of computational homogenization methods and associated multi-scale schemes, see Matouš et al. [32] for a recent overview. The article at hand is devoted to the computational homogenization method developed by Moulinec and Suquet [33, 34], which is based on the fast Fourier transform (FFT), and subsequent developments. The Moulinec–Suquet approach builds upon the Lippmann–Schwinger equation [35,36,37], a reformulation of the cell problem as an equivalent volume integral equation. It operates directly on voxel data and was formulated for nonlinear material behavior from the very beginning. Due to its inherently matrix-free formulation and the associated low memory footprint, the method’s applicability to large-scale microstructures is ensured. Last but not least, the method is comparatively simple to implement and produces impressive results in a short time, due to the efficient implementations of the FFT available [38, 39].

In the 2000s, the Moulinec–Suquet method was accelerated and applied to a wider range of materials and constitutive behaviors (to be discussed below). Still, the contributions were rather infrequent. In the beginning of the 2010s, the number of corresponding articles started to increase significantly. Still, there was a lack of deeper understanding of the numerical method, obstructing further developments.

For analyzing and constructing numerical methods, it is often convenient to separate the processes of modeling, discretization and solution. Starting from the physical equation to be solved, the problem at hand is discretized first, giving rise to a discrete system of equations involving a finite number of unknowns. The latter system is then solved by a solution method. In particular, a specific discretized system may be solved by different solution methods. Also, the same solution method may be applied for solving the equations obtained from different discretization schemes. For instance, a finite element discretization of a linear elastic problem gives rise to a linear system with symmetric positive definite system matrix, and the conjugate gradient method [40] may be used for solving the system. Alternatively, multi-grid algorithms [41] may be used, as well. The latter, however, may also be used for finite difference and finite volume discretizations.

The strategy presented by Moulinec and Suquet provides a solution method for a discretized system of equations. By disentangling the underlying discretization and solver, the Moulinec–Suquet approach was re-integrated into the conventional canon of contemporary computational methods. Thus, around 2015, the previous obstructions were removed and it became possible to change the discretization while preserving the salient features of the solver, or to use faster solution methods for the same discretization. The underlying insights and subsequent developments are discussed in Sects. 2 and 3. We shall also report on practical matters, see Sect. 4, discuss applications, see Sect. 5, and hint at promising research directions, see Sect. 6. This article is not intended to provide innovative research, and we claim no originality.

2 Discretization schemes

2.1 Lippmann–Schwinger framework for the cell problem

We use nonlinear elasticity at small strains in its standard form [33, 42, 43] as our point of departure for discussing the discretization schemes and the solution methods. For this purpose, we consider a rectangular cell

$$\begin{aligned} Y= [0,L_1] \times [0,L_2] \times \cdots \times [0,L_d] \end{aligned}$$
(2.1)

for positive \(L_j\) \((j=1,\ldots ,d)\). For the case of sheared cells, we refer to Bergmann and Merkert [44].

Let us denote by \(L^2(Y;\text {Sym}(d))\) the (Hilbert) space of square-integrable \(\text {Sym}(d)\)-valued fields \(\varepsilon \) on the cell \(Y\) (2.1), endowed with the Frobenius inner product

$$\begin{aligned} \langle \varepsilon _1,\varepsilon _2 \rangle _{L^2} = \frac{1}{\text {vol}(Y)}\int _Y\varepsilon _1(x):\varepsilon _2(x) \, dx. \end{aligned}$$
(2.2)

We are interested in displacement fluctuation fields u living in the space \(H^1_\#(Y;\mathbb {R}^d)\), which is the closure of the space

$$\begin{aligned} \left\{ u:Y\rightarrow \mathbb {R}^d \quad \text{ continuously } \text{ differentiable } \quad \Big | \quad \int _Yu\, dx = 0, \quad u \, \text{ and } \, \partial _j u \, \text{ are } \, Y\text{-periodic } \text{ for } \text{ all } \quad j=1,\ldots ,d \right\} \end{aligned}$$

w.r.t. the norm induced by the inner product

$$\begin{aligned} (u_1,u_2) \mapsto \langle \nabla ^s u_1, \nabla ^s u_2\rangle _{L^2}, \end{aligned}$$

where \(\nabla ^s\) denotes the symmetrized gradient \(\nabla ^s u = \frac{1}{2}(\nabla u + \nabla u^T)\).

Suppose that a (condensed) free energy density

$$\begin{aligned} w: Y\times \text {Sym}(d) \rightarrow \mathbb {R}, \quad (x,\varepsilon ) \mapsto w(x,\varepsilon ), \end{aligned}$$
(2.3)

which is continuously differentiable in the second variable and satisfies other salient properties, is given. For a detailed discussion of the assumptions necessary to make the succeeding constructions well-defined, we refer to Schneider [43, 45].

For a given macroscopic strain \({\bar{\varepsilon }} \in \text {Sym}(d)\), we seek a displacement fluctuation field \(u\in H^1_\#(Y;\mathbb {R}^d)\) which minimizes the variational problem

$$\begin{aligned} \int _Yw(x, {\bar{\varepsilon }} + \nabla ^s u(x))\, dx \longrightarrow \min _{u\in H^1_\#(Y;\mathbb {R}^d)}. \end{aligned}$$
(2.4)

The first-order necessary condition reads

$$\begin{aligned} \int _Y\nabla ^sv (x) : \frac{\partial w}{\partial \varepsilon } (x, {\bar{\varepsilon }} + \nabla ^s u(x))\, dx = 0 \quad \text{ for } \text{ all } \quad v\in H^1_\#(Y;\mathbb {R}^d). \end{aligned}$$
(2.5)

In terms of the divergence operator, regarded as a bounded linear operator

$$\begin{aligned} \text {div}\,:L^2(Y;\text {Sym}(d)) \rightarrow H^{-1}_\#(Y;\mathbb {R}^d) \equiv \left( H^1_\#(Y;\mathbb {R}^d) \right) ', \end{aligned}$$

mapping to the continuous dual space of \(H^1_\#(Y;\mathbb {R}^d)\), a formal integration by parts shows that the first-order necessary condition is equivalent to the (quasi-static) balance of linear momentum

$$\begin{aligned} \text {div}\,\frac{\partial w}{\partial \varepsilon } (\cdot , {\bar{\varepsilon }} + \nabla ^s u) = 0. \end{aligned}$$
(2.6)

Please note that the latter equation is not in strong form, but rather equivalent to the weak form (2.5).

Let \(\mathbb {C}^0\) be a linear elastic reference medium, i.e., a linear operator \(\mathbb {C}^0 \in L(\text {Sym}(d))\) which is symmetric and positive definite w.r.t. the Frobenius inner product on \(\text {Sym}(d)\). For any right-hand side \(f\in H^{-1}_\#(Y;\mathbb {R}^d)\) with vanishing mean, the problem

$$\begin{aligned} \text {div}\,\mathbb {C}^0:\nabla ^su = f \end{aligned}$$
(2.7)

may be solved uniquely for \(u \in H^1_\#(Y;\mathbb {R}^d)\). Denote by \(G^0\) the associated solution operator, i.e.,

$$\begin{aligned} \text {div}\,\mathbb {C}^0:\nabla ^su = f \quad \text {precisely if} \quad u = G^0f. \end{aligned}$$

Representing a displacement fluctuation field \(u\in H^1_\#(Y;\mathbb {R}^d)\) as a Fourier series

$$\begin{aligned} u(x) = \sum _{\xi \in \mathbb {Z}^d} {\hat{u}}(\xi ) \, e^{i \, x \cdot \xi _Y}, \end{aligned}$$
(2.8)

where \(\mathbb {Z}^d\) denotes the set of d-tuples of integers, \({\hat{u}}(\xi )\) are the complex-valued Fourier coefficient vectors, and \(\xi _Y\) symbolizes the re-scaled frequency vector

$$\begin{aligned} \xi _Y= \left( 2\pi \, \frac{\xi _1}{L_1}, 2\pi \, \frac{\xi _2}{L_2}, \ldots , 2\pi \, \frac{\xi _d}{L_d}\right) \quad (\text {with} \, L_j \, \text {from Eq.}\, (2.1)){.} \end{aligned}$$
(2.9)
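In an implementation, the re-scaled frequencies (2.9) on a discrete grid are conveniently generated with FFT helper routines; a minimal NumPy sketch (the function name `rescaled_frequencies` is ours):

```python
import numpy as np

def rescaled_frequencies(N, L):
    """Re-scaled frequency vectors (2.9) in FFT ("wrap-around") ordering.

    N : sequence of grid sizes N_j per dimension
    L : sequence of cell edge lengths L_j
    Returns one 1D array per dimension with entries 2*pi*xi_j/L_j.
    """
    # np.fft.fftfreq(n, d=l/n) yields xi_j/l for integer frequencies xi_j
    return [2.0 * np.pi * np.fft.fftfreq(n, d=l / n) for n, l in zip(N, L)]
```

Note that the entries appear in the "wrap-around" ordering native to FFT libraries, i.e., the non-negative frequencies first, followed by the negative ones.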

Then, the action of the differential operator \(\text {div}\,\mathbb {C}^0:\nabla ^s\) is readily computed in Fourier space via

$$\begin{aligned} {\hat{f}}(\xi ) = -\left[ \mathbb {C}^0: \left( \xi _Y\otimes _s {\hat{u}}(\xi ) \right) \right] \xi _Y, \quad \xi \in \mathbb {Z}^d, \end{aligned}$$

in terms of the Fourier coefficients \({\hat{f}}(\xi )\) of the right-hand side in Eq. (2.7) and the symmetrized tensor product \(\otimes _s\). For \(\mathbb {C}^0 = 2\mu _0\, {{\,\mathrm{\text {Id}}\,}}\), solving the latter equation is particularly simple, i.e., the relation

$$\begin{aligned} {\hat{u}}(\xi ) = -\frac{1}{2\mu _0}\left( \frac{2}{\Vert \xi _Y\Vert ^2} \, {{\,\mathrm{\text {Id}}\,}}- \frac{\xi _Y\otimes \xi _Y}{\Vert \xi _Y\Vert ^4}\right) {\hat{f}}(\xi ), \quad \xi \ne 0, \end{aligned}$$
(2.10)

holds. We may rewrite the balance equation (2.6) in the form

$$\begin{aligned} \text {div}\,\mathbb {C}^0:\nabla ^s u = -\text {div}\,\left( \frac{\partial w}{\partial \varepsilon } (\cdot , {\bar{\varepsilon }} + \nabla ^s u) - \mathbb {C}^0:({\bar{\varepsilon }} + \nabla ^s u) \right) , \end{aligned}$$

i.e., isolating the displacement fluctuation field

$$\begin{aligned} u = -G^0\text {div}\,\left( \frac{\partial w}{\partial \varepsilon } (\cdot , {\bar{\varepsilon }} + \nabla ^s u) - \mathbb {C}^0:({\bar{\varepsilon }} + \nabla ^s u) \right) . \end{aligned}$$
(2.11)

Applying the symmetrized gradient \(\nabla ^s\) and adding \({\bar{\varepsilon }}\) gives rise to the Lippmann–Schwinger equation

$$\begin{aligned} \varepsilon = {\bar{\varepsilon }} - \Gamma ^0 : \left( \frac{\partial w}{\partial \varepsilon } (\cdot , \varepsilon ) - \mathbb {C}^0:\varepsilon \right) , \end{aligned}$$
(2.12)

where

$$\begin{aligned} \varepsilon = {\bar{\varepsilon }} + \nabla ^s u \in L^2(Y;\text {Sym}(d)) \end{aligned}$$

denotes the total strain field and \(\Gamma ^0 = \nabla ^s\, G^0\, \text {div}\,\) is the Eshelby–Green operator. The latter operator may also be expressed compactly in Fourier space via

$$\begin{aligned} \left( \widehat{\Gamma ^0:\tau } \right) (\xi ) = \frac{1}{\mu _0} \left[ \frac{\xi _Y\otimes _s ({\hat{\tau }}(\xi )\xi _Y)}{\Vert \xi _Y\Vert ^2} - \frac{ \xi _Y \cdot \left( {\hat{\tau }}(\xi )\xi _Y\right) }{2 \Vert \xi _Y \Vert ^4} \, \xi _Y \otimes \xi _Y \right] , \quad \xi \ne 0, \end{aligned}$$
(2.13)

and \(\left( \widehat{\Gamma ^0:\tau } \right) (0) = 0\) for any square-integrable stress-type field \(\tau \).
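The Fourier-space representation (2.13) translates directly into code. The following NumPy sketch applies \(\Gamma ^0\) for \(\mathbb {C}^0 = 2\mu _0\,\text {Id}\) to a two-dimensional stress-type field via the FFT; the function name and the layout (tensor indices first, grid indices last) are our choices, not prescribed by the text:

```python
import numpy as np

def apply_gamma0(tau, mu0, L=(1.0, 1.0)):
    """Apply the Eshelby-Green operator (2.13), for C0 = 2*mu0*Id and d = 2,
    to a stress-type field tau of shape (2, 2, N1, N2)."""
    N1, N2 = tau.shape[-2:]
    x1 = 2 * np.pi * np.fft.fftfreq(N1, d=L[0] / N1)   # re-scaled freqs (2.9)
    x2 = 2 * np.pi * np.fft.fftfreq(N2, d=L[1] / N2)
    xi = np.stack(np.meshgrid(x1, x2, indexing="ij"))  # shape (2, N1, N2)
    nsq = xi[0] ** 2 + xi[1] ** 2
    nsq[0, 0] = 1.0                                    # avoid 0/0 at xi = 0
    tau_h = np.fft.fftn(tau, axes=(-2, -1))
    t = np.einsum("ijxy,jxy->ixy", tau_h, xi)          # tau_hat(xi) . xi
    sym = 0.5 * (np.einsum("ixy,jxy->ijxy", xi, t)
                 + np.einsum("ixy,jxy->ijxy", t, xi))  # xi (x)_s (tau_hat xi)
    dot = np.einsum("ixy,ixy->xy", xi, t)              # xi . (tau_hat xi)
    out_h = (sym / nsq
             - 0.5 * dot / nsq ** 2 * np.einsum("ixy,jxy->ijxy", xi, xi)) / mu0
    out_h[:, :, 0, 0] = 0.0                            # zero-mean output
    return np.real(np.fft.ifftn(out_h, axes=(-2, -1)))
```

A useful sanity check: for any compatible zero-mean strain field \(e = \nabla ^s u\), the identity \(\Gamma ^0:(\mathbb {C}^0:e) = e\) holds, since \(\Gamma ^0\,\mathbb {C}^0\) acts as a projector onto compatible fields.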

The Lippmann–Schwinger equation (2.12) may also be considered as a Fredholm integral equation of the second kind (with non-compact kernel). Indeed, the operator \(\Gamma ^0\) may also be regarded as a convolution-type singular integral operator

$$\begin{aligned} {\Gamma ^0:\tau (x) = \int _Y K^0(x-y):\tau (y) \, dy, \quad x\in Y, \quad \tau \in L^2(Y;\text {Sym}(d))} \end{aligned}$$
(2.14)

for a suitable (singular) integral kernel \(K^0:Y\rightarrow L(\text {Sym}(d))\), where the difference \(x-y\) has to be understood \(Y\)-periodically. The integral-equation point of view was adopted predominantly in early stages of FFT-based computational micromechanics [33, 34], where the convolution product (2.14) appears to have motivated the recourse to computations in Fourier space.

The fixed point method associated with the Lippmann–Schwinger equation (2.12) is called the (continuous) basic scheme

$$\begin{aligned} \varepsilon ^{k+1} = {\bar{\varepsilon }} - \Gamma ^0:\left( \sigma (\varepsilon ^k) - \mathbb {C}^0:\varepsilon ^k\right) , \quad k=0,1,\ldots , \end{aligned}$$
(2.15)

for any initial value \(\varepsilon ^0 \in L^2(Y;\text {Sym}(d))\).

2.2 The Moulinec–Suquet discretization

The basic scheme introduced by Moulinec and Suquet [33, 34] proceeds as follows. For a d-tuple N of positive integers, describing the number of voxels per dimension, let us introduce the set of indices

$$\begin{aligned} {\mathcal {I}}_N = \left\{ I \in \mathbb {Z}^d \, \Big | \, 0 \le I_j < N_j \text { for all } j=1,\ldots ,d \right\} , \end{aligned}$$
(2.16)

the set of frequencies

$$\begin{aligned} \mathbb {Z}_N = \left\{ \xi \in \mathbb {Z}^d \, \Big | \, -N_j/2 \le \xi _j < N_j/2 \text { for all } j=1,\ldots ,d \right\} , \end{aligned}$$
(2.17)

and the discrete grid of points

$$\begin{aligned} Y_N = \left\{ x \in Y\, \Big | \, x_j = I_j L_j/N_j \text { for all } j=1,\ldots ,d \text { and some fixed } I \in {\mathcal {I}}_N \right\} \end{aligned}$$
(2.18)

in the cell \(Y\) (2.1). We shall write \(x_I \in Y_N\) for the point with coordinates \(I_j L_j / N_j\) (\(j=1,\ldots ,d\)).

Then, Moulinec–Suquet proposed to iterate, for \(k=0,1,\ldots \),

$$\begin{aligned} \begin{aligned} \tau ^k(x_I)&= \frac{\partial w}{\partial \varepsilon }\left( x_I, \varepsilon ^k(x_I)\right) - \mathbb {C}^0:\varepsilon ^k(x_I), \quad x_I \in Y_N,\\ {\hat{\tau }}^k&= {\text {DFT}}(\tau ^k),\\ {\hat{\varepsilon }}^{k+1}(\xi )&= -\frac{1}{\mu _0} \left[ \frac{\xi _Y\otimes _s ({\hat{\tau }}^k(\xi )\xi _Y)}{ \Vert \xi _Y\Vert ^2} - \frac{ \xi _Y \cdot \left( {\hat{\tau }}^k(\xi )\xi _Y\right) }{2 \Vert \xi _Y \Vert ^4} \, \xi _Y \otimes \xi _Y \right] , \quad \xi \in \mathbb {Z}_N\backslash \{0\},\\ {\hat{\varepsilon }}^{k+1}(0)&= {\bar{\varepsilon }},\\ \varepsilon ^{k+1}&= {\text {DFT}}^{-1} ({\hat{\varepsilon }}^{k+1}), \end{aligned} \end{aligned}$$
(2.19)

where \({\text {DFT}}\) denotes the discrete Fourier transform

$$\begin{aligned} {\hat{\tau }} = {\text {DFT}}(\tau ) \text { with } {\hat{\tau }}(\xi ) = \frac{1}{N_1\cdots N_d} \sum _{I \in {\mathcal {I}}_N} \tau (x_I) \, e^{- i \, x_I \cdot \xi _Y}, \quad \xi \in \mathbb {Z}_N, \end{aligned}$$
(2.20)

and \(\xi _Y\) is given by formula (2.9). Notice that, by Eq. (2.9), the product

$$\begin{aligned} x_I \cdot \xi _Y= \sum _{j=1}^d \frac{I_j L_j}{N_j} \frac{2 \pi \xi _j}{ L_j} = 2 \pi \, \sum _{j=1}^d \frac{ I_j \xi _j}{N_j} \end{aligned}$$

is independent of the dimensions \(L_j\) of the cell \(Y\) (2.1). Actually, in the scheme (2.19), the Nyquist frequencies

$$\begin{aligned} \left\{ \xi \in \mathbb {Z}^d \, \Big | \, \xi _j = -N_j/2 \quad \text {for some} \, j=1,\ldots ,d \right\} \end{aligned}$$
(2.21)

require special attention to ensure that the resulting fields are real-valued. For such a frequency \(\xi \), one might either set

$$\begin{aligned} {\hat{\varepsilon }}^{k+1}(\xi ) = 0 \quad \text {or} \quad {\hat{\varepsilon }}^{k+1}(\xi ) = - \left( \mathbb {C}^0\right) ^{-1} : {\hat{\tau }}^k(\xi ), \end{aligned}$$

forcing the Fourier coefficients of the strain or the stress field at the Nyquist frequencies to zero [34]. We refer to Eq. (2.52) for a computationally more efficient evaluation of the \(\Gamma \)-operator.
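To make the iteration (2.19) concrete, the following is a minimal NumPy transcription for two-dimensional linear elasticity under simplifying assumptions (unit cell lengths \(L_j = 1\), reference material \(\mathbb {C}^0 = 2\mu _0\,\text {Id}\), strain coefficients zeroed at the Nyquist frequencies); the function and variable names are ours, and this is a didactic sketch rather than a production implementation:

```python
import numpy as np

def basic_scheme(lam, mu, eps_bar, mu0, tol=1e-8, max_iter=1000):
    """Moulinec-Suquet basic scheme (2.19) for 2D linear elasticity.

    lam, mu : (N1, N2) arrays of voxel-wise Lame parameters
    eps_bar : (2, 2) symmetric macroscopic strain
    mu0     : shear modulus of the reference material C0 = 2*mu0*Id
    Returns the strain field as a (2, 2, N1, N2) array.
    """
    N1, N2 = lam.shape
    x1 = 2 * np.pi * np.fft.fftfreq(N1, d=1.0 / N1)    # re-scaled freqs (2.9)
    x2 = 2 * np.pi * np.fft.fftfreq(N2, d=1.0 / N2)
    xi = np.stack(np.meshgrid(x1, x2, indexing="ij"))
    nsq = xi[0] ** 2 + xi[1] ** 2
    nsq[0, 0] = 1.0                                    # avoid 0/0 at xi = 0
    nyq = ((np.fft.fftfreq(N1)[:, None] == -0.5)       # Nyquist freqs (2.21)
           | (np.fft.fftfreq(N2)[None, :] == -0.5))

    def stress(eps):                                   # sigma = lam*tr(eps)*I + 2*mu*eps
        sig = 2.0 * mu * eps
        tr = lam * (eps[0, 0] + eps[1, 1])
        sig[0, 0] += tr
        sig[1, 1] += tr
        return sig

    eps = np.broadcast_to(eps_bar[:, :, None, None], (2, 2, N1, N2)).copy()
    for _ in range(max_iter):
        tau = stress(eps) - 2.0 * mu0 * eps            # polarization
        tau_h = np.fft.fftn(tau, axes=(-2, -1))
        t = np.einsum("ijxy,jxy->ixy", tau_h, xi)      # tau_hat . xi
        sym = 0.5 * (np.einsum("ixy,jxy->ijxy", xi, t)
                     + np.einsum("ixy,jxy->ijxy", t, xi))
        dot = np.einsum("ixy,ixy->xy", xi, t)
        eps_h = -(sym / nsq - 0.5 * dot / nsq ** 2
                  * np.einsum("ixy,jxy->ijxy", xi, xi)) / mu0
        eps_h[:, :, nyq] = 0.0                         # zero strain at Nyquist
        eps_h[:, :, 0, 0] = eps_bar * N1 * N2          # mean strain = eps_bar
        eps_new = np.real(np.fft.ifftn(eps_h, axes=(-2, -1)))
        if np.linalg.norm(eps_new - eps) <= tol * np.linalg.norm(eps_new):
            return eps_new
        eps = eps_new
    return eps
```

For a homogeneous material the scheme terminates after a single iteration; for heterogeneous materials, the convergence speed depends on the material contrast and on the choice of \(\mu _0\), as discussed for the solution methods in Sect. 3.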

Moulinec–Suquet’s basic scheme in the form (2.19) iterates between real-valued fields, defined on the discrete grid (2.18), and complex-valued fields which live in Fourier space and whose nonzero values are restricted to the discrete frequencies (2.17). However, the connection to the continuous setting (2.15) is not immediate. Inverting the \({\text {DFT}}\) (2.20),

$$\begin{aligned} \tau (x_I) = \sum _{\xi \in \mathbb {Z}_N} {\hat{\tau }}(\xi ) \, e^{i \, x_I \cdot \xi _Y}, \quad x_I \in Y_N, \end{aligned}$$
(2.22)

we observe that we may plug any point \(x \in Y\) (instead of just \(x_I \in Y_N\)) into the latter expression, i.e., for a given set of Fourier coefficients \(\{ {\hat{\tau }}(\xi ) \}_{\xi \in \mathbb {Z}_N}\), a field

$$\begin{aligned} \tau (x) = \sum _{\xi \in \mathbb {Z}_N} {\hat{\tau }}(\xi ) \, e^{i \, x \cdot \xi _Y}, \quad x \in Y, \end{aligned}$$
(2.23)

is defined on the entire cell \(Y\) (2.1). The field (2.23) is called a trigonometric polynomial of order N [46]. With this notion at hand, it is possible to interpret the discrete Fourier transform (2.20) as a solution method for an interpolation problem. Indeed, given a set of values \(\{\tau _I\}_{I \in {\mathcal {I}}_N}\), the discrete Fourier transform (2.20) permits us to find a trigonometric polynomial \(\tau \) of order N, s.t. \(\tau (x_I) = \tau _I\) holds for all \(I \in {\mathcal {I}}_N\). Let us denote the space of trigonometric polynomials of order N with values in some vector space Z as \({\mathcal {T}}_N(Y;Z)\). (Typically, \(Z = \mathbb {R}^d\) or \(Z = \text {Sym}(d)\).) Let us write \(Q_N\) for the associated trigonometric interpolation operator, which takes any field \(g:Y\rightarrow Z\) as input, evaluates \(g_I = g(x_I)\) \((I\in {\mathcal {I}}_N)\), and produces the unique trigonometric polynomial \(Q_N[g] \in {\mathcal {T}}_N(Y;Z)\) which interpolates these values. Then, we may recast the basic scheme (2.19) in the form [47, 48]

$$\begin{aligned} \varepsilon ^{k+1} = {\bar{\varepsilon }} - \Gamma ^0:Q_N\left[ \frac{\partial w}{\partial \varepsilon }(\cdot ,\varepsilon ^k) - \mathbb {C}^0:\varepsilon ^k \right] , \quad k=0,1,\ldots , \end{aligned}$$
(2.24)

for an arbitrary initial strain \(\varepsilon ^0 \in {\mathcal {T}}_N(Y;\text {Sym}(d))\). In particular, the stress field associated with the strain field \(\varepsilon \in {\mathcal {T}}_N(Y;\text {Sym}(d))\) is the trigonometric polynomial \(Q_N\left[ {\partial w}{/}{\partial \varepsilon }(\cdot ,\varepsilon )\right] \in {\mathcal {T}}_N(Y;\text {Sym}(d))\), i.e., the stress is evaluated only at the grid points (2.18), and interpolated thereafter. For linear elastic material behavior, \({\partial w}{/}{\partial \varepsilon }(x,\varepsilon ) = \mathbb {C}(x):\varepsilon \), \(x \in Y\), such an approach is different from conventional finite element methods, where the material law is evaluated exactly on the element, and the balance of linear momentum (2.6) is approximated. Instead, for the Moulinec–Suquet discretization, the balance of linear momentum is satisfied on the entire cell \(Y\), but the stress–strain relationship is approximated.
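In one spatial dimension, the interplay between the DFT (2.20) and the trigonometric interpolation (2.23) underlying \(Q_N\) may be sketched as follows (a didactic fragment; taking the real part off the grid is one way to handle the Nyquist term for even N):

```python
import numpy as np

def trig_interpolate(values, x, L=1.0):
    """Evaluate the trigonometric interpolant Q_N[g] (2.23) of 1D grid
    values g(x_I), x_I = I*L/N, at an arbitrary point x in [0, L)."""
    N = len(values)
    g_hat = np.fft.fft(values) / N                   # DFT (2.20)
    xi_Y = 2 * np.pi * np.fft.fftfreq(N, d=L / N)    # re-scaled freqs (2.9)
    return np.real(np.sum(g_hat * np.exp(1j * x * xi_Y)))
```

At the grid points, \(Q_N[g]\) reproduces the sampled values exactly; fields whose spectrum already fits into \(\mathbb {Z}_N\) are reproduced everywhere on the cell.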

The formulation (2.24) of the basic scheme suggests that the Moulinec–Suquet discretization is a discretization of the integral equation (2.14). Following this rationale, Brisard and Dormieux [49, 50] used the Hashin–Shtrikman variational principle [51,52,53] to get insights into the Moulinec–Suquet discretization, see Sect. 2.4 for more details. In a nutshell, Brisard and Dormieux regarded the Moulinec–Suquet discretization as an under-integrated variant of the conforming Galerkin discretization of the Hashin–Shtrikman variational principle with voxel-wise constant strains as ansatz functions. Brisard and Dormieux [50] were the first to prove a convergence statement for the Moulinec–Suquet discretization. More precisely, they showed that, for a stiffness distribution \(\mathbb {C}:Y\rightarrow L(\text {Sym}(d))\) that is non-degenerate in the sense that there exist positive constants \(\alpha _\pm \), s.t.

$$\begin{aligned} \alpha _- \Vert E\Vert ^2 \le E:\mathbb {C}(x):E \le \alpha _+ \Vert E\Vert ^2 \text { holds for all }x\in Y\text { and }E\in \text {Sym}(d), \end{aligned}$$
(2.25)

the (unique) fixed point \(\varepsilon _N\) of the basic scheme (2.19), considered as a voxel-wise constant strain field with values \(\varepsilon _N(x_I)\), converges to the (unique) solution of the continuous Lippmann–Schwinger equation (2.12), provided an appropriate stiffness averaging is performed,

$$\begin{aligned} \left( \mathbb {C}_V - \mathbb {C}^0\right) ^{-1} = \frac{1}{\text {vol}(V)}\int _V \left( \mathbb {C}(x) - \mathbb {C}^0\right) ^{-1}\, dx \end{aligned}$$
(2.26)

for the stiffness \(\mathbb {C}_V\) inside the voxel \(V\subseteq Y\). However, the interpretation from the Hashin–Shtrikman point of view was not completely satisfactory. For a start, the critical points of the Hashin–Shtrikman variational principle, restricted to a general subspace of \(L^2\), depend on the chosen reference material (and not only on the subspace). This contrasts with the Moulinec–Suquet discretization, where solutions are independent of the reference material. Furthermore, the convergence statement required the judicious averaging rule (2.26).
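The averaging rule (2.26) may be transcribed directly; in the following sketch, the voxel integral is replaced by a mean over sample points inside the voxel, the stiffnesses are represented as matrices in some matrix notation, and \(\mathbb {C}(x) - \mathbb {C}^0\) is assumed invertible at every sample (the function name is ours):

```python
import numpy as np

def voxel_stiffness(C_samples, C0):
    """Stiffness averaging (2.26): C_V = C0 + (mean of (C - C0)^{-1})^{-1}.

    C_samples : (n, m, m) array of stiffness matrices sampled in the voxel
    C0        : (m, m) reference stiffness matrix
    """
    inv_mean = np.mean([np.linalg.inv(C - C0) for C in C_samples], axis=0)
    return C0 + np.linalg.inv(inv_mean)
```

For a voxel occupied by a single phase, the rule returns that phase's stiffness; for mixed voxels, the result depends on the reference material \(\mathbb {C}^0\), reflecting the reference-material dependence of the Hashin–Shtrikman interpretation.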

Another interpretation [47, 48, 54] recasts the continuous Lippmann–Schwinger equation (2.12) in linear elasticity

$$\begin{aligned} \varepsilon + \Gamma ^0:\left( \mathbb {C}- \mathbb {C}^0\right) :\varepsilon = {\bar{\varepsilon }} \end{aligned}$$

in terms of the stress polarization \(\tau = (\mathbb {C}- \mathbb {C}^0):\varepsilon \),

$$\begin{aligned} (\mathbb {C}- \mathbb {C}^0)^{-1}:\tau + \Gamma ^0:\tau = {\bar{\varepsilon }}, \end{aligned}$$
(2.27)

where we suppose that \(\mathbb {C}- \mathbb {C}^0\) is invertible everywhere. Similarly, for the Moulinec–Suquet discretization (2.24), the Lippmann–Schwinger equation may be written in the form

$$\begin{aligned} Q_N\left[ (\mathbb {C}- \mathbb {C}^0)^{-1}:\tau \right] + \Gamma ^0:\tau = {\bar{\varepsilon }} \quad \text {for} \quad \tau \in {\mathcal {T}}_N(Y;\text {Sym}(d)). \end{aligned}$$
(2.28)

Thus, the Moulinec–Suquet discretization may be interpreted as a trigonometric collocation discretization [55] of the Lippmann–Schwinger equation (2.27) in polarization form. Convergence of the fixed point \(\varepsilon _N\) of the basic scheme (2.19), regarded as a trigonometric polynomial, towards the solution of the continuous Lippmann–Schwinger equation (2.12) could be established [48] for a class of nonlinear stress operators, which includes stiffness distributions satisfying the conditions (2.25) as special cases. In contrast to the Hashin–Shtrikman interpretation, no averaging rule (2.26) is necessary. Instead, the nonlinear material law is evaluated directly on the grid \(Y_N\) (2.18).

However, the collocation interpretation (2.28) also appeared stilted: although the reference material does not influence the solutions, it remains an essential part of the formulation (2.28).

Vondřejc et al. [56] offered a natural explanation in terms of an under-integrated Fourier–Galerkin discretization of the displacement. More precisely, in view of the variational principle (2.4)

$$\begin{aligned} \int _Yw(x, {\bar{\varepsilon }} + \nabla ^s u(x))\, dx \longrightarrow \min _{u\in H^1_\#(Y;\mathbb {R}^d)}, \end{aligned}$$
(2.29)

Vondřejc et al. [56] suggested an approximation of the exact integral (2.29) by the trapezoidal quadrature rule, i.e., to consider

$$\begin{aligned} \sum _{x_I \in Y_N} w(x_I, {\bar{\varepsilon }} + \nabla ^s u_N(x_I)) \longrightarrow \min _{u_N \in {\mathcal {T}}_N(Y;\mathbb {R}^d)}. \end{aligned}$$
(2.30)

The first-order necessary condition for the problem (2.30) reads

$$\begin{aligned} \sum _{x_I \in Y_N} \frac{\partial w}{\partial \varepsilon }(x_I, {\bar{\varepsilon }} + \nabla ^s u_N(x_I)) : \nabla ^s v_N(x_I) = 0 \quad \text {for all} \quad v_N \in {\mathcal {T}}_N(Y;\mathbb {R}^d), \end{aligned}$$

which may be rewritten, using the Plancherel theorem for the \({\text {DFT}}\), in the form

$$\begin{aligned} \text {div}\,Q_N\left[ \frac{\partial w}{\partial \varepsilon }(\cdot , {\bar{\varepsilon }} + \nabla ^s u_N) \right] = 0. \end{aligned}$$
(2.31)

The latter equation serves as the equivalent of the continuous balance of linear momentum (2.6). For any reference material \(\mathbb {C}^0\) and an arbitrary right-hand side \(f_N \in {\mathcal {T}}_N(Y;\mathbb {R}^d)\) with vanishing mean, we have

$$\begin{aligned} \text{ div }\,\mathbb {C}^0:\nabla ^s u_N = -f_N \quad \text{ and } \quad \int _Yu_N(x) \, dx = 0 \quad \text{ precisely } \text{ if } \quad u_N = G^0 f_N \end{aligned}$$
(2.32)

in terms of the Green’s operator (2.10) for the displacement. In particular, \(u_N \in {\mathcal {T}}_N(Y;\mathbb {R}^d)\) and the Green’s operator preserves the order of trigonometric polynomials. As the reference material \(\mathbb {C}^0\) is homogeneous, and the differential operators \(\nabla ^s\) as well as \(\text {div}\,\!\!\) map trigonometric polynomials to trigonometric polynomials in a way that does not increase the degree, the line of argument leading to the continuous Lippmann–Schwinger equation (2.12) may be repeated. In particular, the displacement field \(u_N \in {\mathcal {T}}_N(Y;\mathbb {R}^d)\) solves the balance equation (2.32) if and only if the strain field \(\varepsilon _N = {\bar{\varepsilon }} + \nabla ^s u_N\) solves the Lippmann–Schwinger equation corresponding to the basic scheme (2.24)

$$\begin{aligned} \varepsilon _N + \Gamma ^0:Q_N\left[ \frac{\partial w}{\partial \varepsilon }(\cdot ,\varepsilon _N) - \mathbb {C}^0:\varepsilon _N \right] = {\bar{\varepsilon }}. \end{aligned}$$
(2.33)

The variational principle (2.30) has a number of positive properties, which make it the "natural" setting for the Moulinec–Suquet discretization, see Zeman et al. [57] for a more in-depth discussion. For a start, it clarifies that the displacement field, and not the strain field, is the basic quantity of interest. In particular, the Lippmann–Schwinger equation (2.33) is only an auxiliary construction (with benefits for solution methods, see Sect. 3). The reference material gets exposed as an auxiliary quantity as well, having no intrinsic meaning. Rather, it is required in order to write down an equation of fixed point type in the first place. The Moulinec–Suquet discretization thus emerges as the result of a clever combination of trigonometric polynomials and the trapezoidal quadrature rule. Trigonometric interpolation arises naturally in order to write down the Euler–Lagrange equation (2.31).

The variational point of view (2.30) also clarifies the hidden "symmetry" of the Lippmann–Schwinger equation. For linear problems, Zeman et al. [54] showed that the conjugate gradient method may be applied to the Lippmann–Schwinger equation (2.33), although the linear operator involved is not symmetric. The reason, to be discussed in Sect. 3.2 more thoroughly, is the displacement-based formulation (2.30).

Last but not least, the quadrature-based approach (2.30) makes it evident in which way salient properties of the nonlinear energy function w dictate the behavior of nonlinear solution methods applied to the discretized system (2.33). For instance, if w is convex in \(\varepsilon \), the averaged energies

$$\begin{aligned} W(u) = \frac{1}{\text {vol}(Y)}\int _Yw(x,{\bar{\varepsilon }} +\nabla ^s u(x))\, dx \quad \text{ and } \quad W_N(u_N) = \frac{1}{N_1\cdots N_d} \sum _{x_I \in Y_N} w(x_I, {\bar{\varepsilon }} + \nabla ^s u_N(x_I)) \end{aligned}$$

will be convex in \(u \in H^1_\#(Y;\mathbb {R}^d)\) and \(u_N \in {\mathcal {T}}_N(Y;\mathbb {R}^d)\), respectively. Also, strict and strong convexity are inherited by the Moulinec–Suquet discretization.

For completeness, let us remark that the article of Vondřejc et al. [56] contains a convergence proof for the discretization under the condition that the coefficient field is smooth enough to ensure that the corresponding strain field is Hölder continuous. Indeed, then the Fourier series for the strain field converges point-wise, and simpler arguments may be used than for more general, discontinuous coefficients [48, 50].

Last but not least, we refer to a number of works [58,59,60], where a comparison to finite element methods was recorded.

2.3 The Fourier–Galerkin discretization

The Moulinec–Suquet discretization (2.30) is a non-conforming discretization of the variational principle (2.4)

$$\begin{aligned} \int _Yw(x, {\bar{\varepsilon }} + \nabla ^s u(x))\, dx \longrightarrow \min _{u\in H^1_\#(Y;\mathbb {R}^d)}. \end{aligned}$$
(2.34)

Indeed, in addition to restricting to the subspace \({\mathcal {T}}_N(Y;\mathbb {R}^d)\) of N-th order trigonometric polynomials, the exact integral is approximated by a quadrature rule. To dispense with the introduced error, quadrature-free Fourier–Galerkin discretizations were introduced [61,62,63,64], which evaluate the variational principle (2.34) on the subspace \({\mathcal {T}}_N(Y;\mathbb {R}^d)\)

$$\begin{aligned} \int _Yw(x, {\bar{\varepsilon }} + \nabla ^s u_N(x))\, dx \longrightarrow \min _{u_N\in {\mathcal {T}}_N(Y;\mathbb {R}^d)}. \end{aligned}$$
(2.35)

The associated first-order necessary condition computes as

$$\begin{aligned} \int _Y\frac{\partial w}{\partial \varepsilon }(x, {\bar{\varepsilon }} + \nabla ^s u_N(x)):\nabla ^s v_N(x)\, dx = 0 \quad \text{ for } \text{ all } \quad v_N \in {\mathcal {T}}_N(Y;\mathbb {R}^d). \end{aligned}$$
(2.36)

Introducing the trigonometric projector \(P_N\), which truncates the Fourier coefficients of any square-integrable field \(g:Y\rightarrow Z\) to \(\mathbb {Z}_N\), i.e.,

$$\begin{aligned} (\widehat{P_N g})(\xi ) = \left\{ \begin{array}{rl} {\hat{g}}(\xi ), &{}\quad \xi \in \mathbb {Z}_N,\\ 0, &{}\quad \text {otherwise,} \end{array} \right. \end{aligned}$$

the balance equation (2.36) may be rewritten in the form

$$\begin{aligned} \text {div}\,P_N\left[ \frac{\partial w}{\partial \varepsilon }(\cdot , {\bar{\varepsilon }} + \nabla ^s u_N) \right] = 0. \end{aligned}$$
(2.37)

This expression is very similar to the Moulinec–Suquet discretization (2.31). However, the trigonometric projection operator \(P_N\) appears instead of the trigonometric interpolation operator \(Q_N\). Please note that whereas \(Q_N\) solves an interpolation problem, \(P_N\) solves a regression problem, i.e., for given \(\tau \in L^2(Y;\text {Sym}(d))\), it provides the (unique) field \(\tau _N \in {\mathcal {T}}_N(Y;\text {Sym}(d))\) minimizing \(\Vert \tau - \tau _N \Vert _{L^2}\).
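The \(L^2\)-optimality of \(P_N\) can be made concrete in a one-dimensional scalar analogue. The following sketch (function names hypothetical; a naive \(O(M^2)\) DFT is used for transparency) truncates the discrete Fourier coefficients of a sampled field to a frequency band, mimicking the action of \(P_N\):

```python
import cmath

def dft(samples):
    """Naive discrete Fourier transform (O(M^2)); fine for illustration."""
    M = len(samples)
    return [sum(samples[n] * cmath.exp(-2j * cmath.pi * k * n / M)
                for n in range(M)) / M
            for k in range(M)]

def idft(coeffs):
    """Inverse transform matching the normalization of dft above."""
    M = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * n / M)
                for k in range(M))
            for n in range(M)]

def project_PN(samples, N):
    """1D analogue of the trigonometric projector P_N: keep only the
    Fourier coefficients with signed frequency xi in {-N, ..., N}."""
    M = len(samples)
    coeffs = dft(samples)
    for k in range(M):
        xi = k if k <= M // 2 else k - M   # signed frequency of DFT bin k
        if abs(xi) > N:
            coeffs[k] = 0.0
    return idft(coeffs)
```

By Parseval's identity, zeroing the out-of-band coefficients is exactly the \(L^2\)-orthogonal projection onto the band-limited trigonometric polynomials; in particular, applying `project_PN` twice has the same effect as applying it once.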

If w is a heterogeneous Hookean energy with local stiffness field \(\mathbb {C}:Y\rightarrow L(\text {Sym}(d))\), the stress field \(\mathbb {C}:\varepsilon _N\) associated with a strain field \(\varepsilon _N \in {\mathcal {T}}_N(Y;\text {Sym}(d))\) may be computed by the convolution theorem, see pages 171–172 in McGillem and Cooper [65], i.e.,

$$\begin{aligned} (\widehat{\mathbb {C}:\varepsilon _N})(\xi ) = \sum _{\eta \in \mathbb {Z}^d} {\hat{\mathbb {C}}}(\xi - \eta ):{\hat{\varepsilon }}_N(\eta ), \quad \xi \in \mathbb {Z}^d, \end{aligned}$$
(2.38)

involving the Fourier coefficients of \(\mathbb {C}\) and \(\varepsilon _N\). As only the projection of the stress field \(\mathbb {C}:\varepsilon _N\) onto the restricted set \(\mathbb {Z}_N\) of frequencies (in Fourier space) is of interest, the quantity \(P_N(\mathbb {C}:\varepsilon _N)\) may be computed by a discrete Fourier transform on a doubly fine grid [61,62,63]. Still, the Fourier coefficients of the stiffness field \(\mathbb {C}\) are required to be known explicitly. If \(\mathbb {C}\) attains only a finite number of distinct values, it suffices to know the Fourier coefficients of the characteristic functions of the individual subdomains on which \(\mathbb {C}\) is constant.
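The role of the doubly fine grid may be illustrated by a one-dimensional scalar analogue (all names hypothetical): the product of two trigonometric polynomials of order N has order 2N, so sampling on a grid that resolves only order N aliases the extra frequencies back onto the band of interest, whereas a sufficiently fine grid reproduces the exact Fourier coefficients on that band.

```python
import cmath
import math

def dft(samples):
    """Naive DFT returning normalized Fourier coefficients (O(M^2))."""
    M = len(samples)
    return [sum(samples[n] * cmath.exp(-2j * cmath.pi * k * n / M)
                for n in range(M)) / M
            for k in range(M)]

# 1D stand-ins for stiffness and strain, both trigonometric polynomials of
# order N = 1 on the unit cell [0, 1); their product has order 2N = 2.
C = lambda x: 2.0 + math.cos(2 * math.pi * x)   # "stiffness"
e = lambda x: math.cos(2 * math.pi * x)         # "strain"

def product_coefficient(M, xi):
    """Fourier coefficient of the product C*e at frequency xi,
    computed from M equidistant samples via the DFT."""
    samples = [C(n / M) * e(n / M) for n in range(M)]
    return dft(samples)[xi % M]

# Exactly, C*e = 1/2 + 2 cos(2 pi x) + cos(4 pi x)/2, so the coefficient at
# xi = 1 equals 1. A grid with M >= 2(2N) + 1 = 5 points recovers it exactly;
# on a 3-point grid, the frequency xi = -2 aliases onto xi = 1.
fine = product_coefficient(5, 1)     # exact value 1
coarse = product_coefficient(3, 1)   # polluted by aliasing: 1 + 1/4
```

The coarse-grid coefficient picks up the aliased contribution 1/4 from the frequency \(\xi = -2\), while the finer grid is exact on the band; this is precisely the error the doubly fine grid avoids.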

The Lippmann–Schwinger equation associated with the balance equation (2.37) reads

$$\begin{aligned} \varepsilon _N + \Gamma ^0:P_N\left[ \mathbb {C}:\varepsilon _N - \mathbb {C}^0:\varepsilon _N \right] = {\bar{\varepsilon }}, \end{aligned}$$
(2.39)

which is very similar to the Moulinec–Suquet discretization (2.33). Indeed, the same \(\Gamma ^0\)-operator enters, only the trigonometric interpolation \(Q_N\) is replaced by the trigonometric projection \(P_N\). Thus, similar properties hold as well. For a start, the solutions of the balance equation (2.37) and the Lippmann–Schwinger equation (2.39) may be identified via \(\varepsilon _N = {\bar{\varepsilon }} + \nabla ^s u_N\). Secondly, solution methods may exploit the hidden "symmetry" of the Lippmann–Schwinger equation (2.39). Last but not least, salient properties of the free energy w, like convexity conditions, are inherited by the Fourier–Galerkin discretization.

The Fourier–Galerkin discretization was studied by Bonnet [61] for elliptical as well as ellipsoidal inclusions, and for circular as well as rectangular inclusions by Vondřejc [63] and Monchiet [64] for conductivity and elasticity, respectively. The major benefit of the full Galerkin approach (2.35) is the emergence of a hierarchy of upper bounds to the effective elastic properties. However, some care has to be taken if the number of voxels \(N_j\) in some direction \(j=1,\ldots ,d\) is even, see section 4 in Vondřejc et al. [62] for a discussion.

Via dualization, see Sect. 4.1, lower bounds may be established as well [63, 64]. These lower bounds are special for the Fourier–Galerkin discretization, and are much harder to obtain for finite element methods, for instance.

For the Fourier–Galerkin discretization, the equilibrium constraint is easy to encode in Fourier space. However, the price to pay is the increased effort involved in computing the material law (2.38). Also, the Fourier coefficients of the characteristic function need to be known explicitly. Last but not least, evaluating the constitutive law (2.38) appears inherently limited to linear material behavior, and an extension to general constitutive behavior would increase the range of applicability of the Fourier–Galerkin discretization.

2.4 Discretizations based on the Hashin–Shtrikman principle

As an alternative to the variational formulation (2.4) for the displacement fluctuation, Hashin and Shtrikman [51,52,53] introduced another variational approach to the cell problem. Suppose that the material behavior is linear elastic, i.e., \(w(x,\varepsilon ) = \tfrac{1}{2} \varepsilon :\mathbb {C}(x):\varepsilon \) holds, and suppose that

$$\begin{aligned} \alpha _- \Vert E\Vert ^2 \le E:\mathbb {C}(x):E \le \alpha _+ \Vert E\Vert ^2 \quad \text { holds for all } x\in Y\text { as well as }E\in \text {Sym}(d) \end{aligned}$$

with positive constants \(\alpha _\pm \), see condition (2.25). Then, provided the reference material \(\mathbb {C}^0\) is sufficiently small, the difference \(\mathbb {C}- \mathbb {C}^0\) defines a uniformly positive elastic energy. In particular, \(\mathbb {C}- \mathbb {C}^0\) may be inverted, giving rise to a uniformly positive compliance tensor field \((\mathbb {C}- \mathbb {C}^0)^{-1}\). Then, the equivalent formulation (2.27)

$$\begin{aligned} (\mathbb {C}- \mathbb {C}^0)^{-1}:\tau + \Gamma ^0:\tau = {\bar{\varepsilon }} \end{aligned}$$

of the Lippmann–Schwinger equation in terms of the stress polarization \(\tau = (\mathbb {C}- \mathbb {C}^0):\varepsilon \) arises as the first-order necessary condition of the variational principle

$$\begin{aligned} \frac{1}{2} \left\langle \tau , (\mathbb {C}- \mathbb {C}^0)^{-1}:\tau + \Gamma ^0:\tau \right\rangle _{L^2} - \left\langle \tau , {\bar{\varepsilon }}\right\rangle _{L^2} \longrightarrow \min _{\tau \in L^2(Y;\text {Sym}(d))}, \end{aligned}$$
(2.40)

where the angular brackets refer to the \(L^2\)-inner product (2.2). It is not difficult to see that \(\Gamma ^0\) is a self-adjoint and positive semi-definite operator on \(L^2(Y;\text {Sym}(d))\). Moreover, \((\mathbb {C}- \mathbb {C}^0)^{-1}\) is also self-adjoint and even positive definite, by our choice of diagonal reference material \(\mathbb {C}^0\). Thus, the objective function (2.40) is strongly convex (with Lipschitz-continuous gradient) and admits a unique minimizer. A similar argument shows that if the reference material \(\mathbb {C}^0\) is sufficiently large, a complementary variational principle may be established which has a unique maximizer, see Willis [66] for details. The case of reference material \(\mathbb {C}^0\) with general anisotropy is more involved, and we refer to Brisard and Dormieux [50] for the case of isotropic linear elastic constituents.

An advantage of the Hashin–Shtrikman variational principle (2.40) compared to the classical variational formulation (2.4), based on the displacement field, is that general competitor fields \(\tau \in L^2(Y;\text {Sym}(d))\) may enter the functional (2.40) to produce bounds. In particular, discontinuous fields may be used. In contrast, admissible competitors for the displacement-based variational principle (2.4) need to be \(H^1_\#\), see Talbot and Willis [66, 67] for details.

For the discussion at hand, let us stick to the formulation (2.40) and sufficiently small reference material \(\mathbb {C}^0\). Brisard and Dormieux [49] proposed to use a Galerkin discretization of the Hashin–Shtrikman principle (2.40) for the linear subspace \(V_N \subseteq L^2(Y;\text {Sym}(d))\) consisting of voxel-wise constant polarization fields. The first-order necessary condition for the quadratic program

$$\begin{aligned} \frac{1}{2} \left\langle \tau _N, (\mathbb {C}- \mathbb {C}^0)^{-1}:\tau _N + \Gamma ^0:\tau _N \right\rangle _{L^2} - \left\langle \tau _N, {\bar{\varepsilon }}\right\rangle _{L^2} \longrightarrow \min _{\tau _N \in V_N} \end{aligned}$$
(2.41)

reads

$$\begin{aligned} \left\langle \omega _N, (\mathbb {C}- \mathbb {C}^0)^{-1}:\tau _N + \Gamma ^0:\tau _N \right\rangle _{L^2} = \left\langle \omega _N, {\bar{\varepsilon }}\right\rangle _{L^2} \quad \text {for all} \quad \omega _N \in V_N. \end{aligned}$$

Let us denote by \({\mathcal {P}}_N\) the \(L^2\)-orthogonal projector onto \(V_N\), which takes any field \(\tau \) as input and produces the voxel-wise averaged field \({\mathcal {P}}_N{:}\tau \) as output. Then, the first-order necessary condition might be expressed in the form

$$\begin{aligned} (\mathbb {C}_{N,0} - \mathbb {C}^0)^{-1}:\tau _N + \Gamma ^0_N:\tau _N = {\bar{\varepsilon }} \quad \text {with} \quad \Gamma ^0_N \equiv {\mathcal {P}}_N:\Gamma ^0:{\mathcal {P}}_N, \end{aligned}$$
(2.42)

where \(\mathbb {C}_{N,0}\) is the voxel-wise constant stiffness obtained from averaging (2.26), i.e.,

$$\begin{aligned} \left( \mathbb {C}_{N,0}(y) - \mathbb {C}^0\right) ^{-1} = \frac{1}{\text {vol}(V)}\int _V \left( \mathbb {C}(x) - \mathbb {C}^0\right) ^{-1}\, dx \end{aligned}$$

for the stiffness \(\mathbb {C}_{N,0}(y)\) at \(y \in V\) inside the voxel \(V\subseteq Y\). We write \(\mathbb {C}_{N,0}\) to emphasize the dependence of the resulting averaged stiffness on both the resolution and the reference material.
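In a scalar (conductivity-like) analogue, the averaging rule (2.26) can be sketched as follows; the function name is hypothetical, and the snippet assumes all sub-voxel values lie strictly above the reference value so that the "compliances" are positive:

```python
def brisard_dormieux_average(c_values, c0):
    """Scalar analogue of the voxel averaging rule (2.26): average the
    'compliance' (c - c0)^{-1} over the sub-voxel values and invert back.
    Assumes c > c0 throughout, so every term is positive."""
    mean_compliance = sum(1.0 / (c - c0) for c in c_values) / len(c_values)
    return c0 + 1.0 / mean_compliance

# A voxel half-filled with matrix (c = 2) and inclusion (c = 10):
voxel = [2.0, 2.0, 10.0, 10.0]
c_eff = brisard_dormieux_average(voxel, 1.0)   # small reference material
arithmetic = sum(voxel) / len(voxel)           # plain volume average
```

Note that the result differs from the plain volume average and changes with the reference value, reflecting the dependence of \(\mathbb {C}_{N,0}\) on both the resolution and \(\mathbb {C}^0\).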

The formulation (2.42) has a number of advantages. First and foremost, it is inherently symmetric on the entire space \(V_N\). Notice that this is not the case for the strain-based Lippmann–Schwinger equation (2.12) in the continuous setting and for both Fourier-type discretizations discussed in previous sections. Indeed, the Lippmann–Schwinger operator \({{\,\mathrm{\text {Id}}\,}}+ \Gamma ^0:(\mathbb {C}- \mathbb {C}^0)\) is only symmetric on the subspace of compatible fields [68]. The symmetry of the linear operator in Eq. (2.41) naturally led to introducing Krylov-type linear solvers into FFT-based computational homogenization methods, see Sect. 3.2. Secondly, the operator \(\Gamma ^0_N\) has an explicit expression as a Fourier multiplier. This follows, as both operators \({\mathcal {P}}_N\) and \(\Gamma ^0\) are translation-invariant on the voxel grid and their Fourier coefficients are known explicitly. Hence, the composition \(\Gamma ^0_N = {\mathcal {P}}_N:\Gamma ^0:{\mathcal {P}}_N\) is a Fourier multiplier (for the \({\text {DFT}}\)), as well, and its Fourier coefficients are known explicitly. Moreover, the rigorous variational formulation (2.41) made it possible to prove convergence of the Moulinec–Suquet discretization [50] by an ingenious argument. Last but not least, the naturally emerging averaging procedure (2.26) drew attention to the distribution of the phases constituting the microstructure on a sub-voxel scale, triggering the development of composite voxel techniques, see Sect. 4.3.

There are two primary disadvantages of the Brisard–Dormieux discretization (2.42). First, the solution \(\tau _N\) of the Brisard–Dormieux type Lippmann–Schwinger equation depends on the reference material. Thus, in contrast to displacement-based discretization schemes, the reference material has a meaning beyond being a numerical parameter. In particular, a judicious choice of \(\mathbb {C}^0\) is necessary, balancing the accuracy of the discrete solution against a favorable conditioning of the linear system (2.42). This is also encoded in the \(\Gamma ^0_N\)-operator, as the operator \(\mathbb {C}^0:\Gamma ^0_N\) does not define a projection operator, in general.

The second disadvantage of the discretization concerns the practical computation of the operator \(\Gamma ^0_N\). Although explicit expressions for the Fourier coefficients of the operator \(\Gamma ^0_N\) are available, the involved series converges rather slowly in three spatial dimensions [50]. In particular, computing the operator \(\Gamma ^0_N\) on the fly is not recommended, and the Fourier coefficients need to be pre-computed and stored.

Another (less critical) disadvantage is that the strain field \(\varepsilon _N = (\mathbb {C}_{N,0} - \mathbb {C}^0)^{-1}:\tau _N\) associated with a polarization field \(\tau _N\) solving the problem (2.42) is not compatible, in general, i.e., cannot be written as the symmetrized gradient of a displacement field. As a remedy, Brisard [69] proposed an approach to recover an associated displacement field which becomes more closely related to the strain field as the discretization is refined.

The discretized variational principle (2.41) was also utilized for more general decompositions of the unit cell \(Y\) (2.1), for instance obtained from a clustering analysis of solution fields. The resulting method, called self-consistent clustering analysis [70,71,72], represents a state-of-the-art data-driven model order reduction approach to up-scaling nonlinear constitutive relations via homogenization. Please note that the self-consistent clustering analysis concerns the Lippmann–Schwinger equation (2.12), but may also be computed via finite elements. In this context, the convergence analysis of Brisard–Dormieux [50] was generalized to strongly convex free energies w with Lipschitz-continuous gradient, also covering cluster decompositions more general than voxelizations. Recently, the Hashin–Shtrikman variational principle (2.40) was also used for constructing a discretization based on B-Splines [73], generalizing a previous work with splines [74].

2.5 Finite difference, finite element and finite volume discretizations

In 1998, Müller [75] proposed a variant of the basic scheme (2.19) which replaces (continuous) derivatives by appropriate central finite difference approximations on a regular grid and modifies the \(\Gamma ^0\)-operator appropriately. In this way, the well-known ringing artifacts associated with spectral and pseudo-spectral discretizations of non-smooth problems [76, 77] can be avoided. The underlying idea is that (homogeneous) finite difference stencils give rise to Fourier multipliers upon discrete Fourier transformation [78, 79].

When considering finite difference discretizations, care has to be taken concerning the placement of the variables on the grid, and the location where the material law is evaluated. For the Moulinec–Suquet discretization, no such problems arise. Recall that, via trigonometric interpolation, we may extend a set of values given at the locations \(x_I\in Y_N\) to the entire cell \(Y\) (2.1). In particular, we may evaluate the derivatives anywhere, for instance at the positions \(x_I\in Y_N\). For the Moulinec–Suquet discretization, it is thus natural to assume that the \(x_I\) correspond to the voxel centers, and the constitutive law corresponding to this specific voxel is used.

In the context of conductivity, a regular grid of points may be linked by a "resistor network," connecting adjacent grid points (in the coordinate directions) by line segments, and each of these segments is furnished with a scalar conductivity, see Luck [80] for an application without FFT. For such models, Willot and Pellegrini [81] derived the associated \(\Gamma ^0\)-operator and discussed the ensuing solution methods. The approach was extended to more general finite difference approximations by Willot–Abdallah–Pellegrini [82].

For digital images of microstructures, each voxel corresponds to a specific material. For the "resistor network," however, the conductivity naturally lives on the interface between adjacent voxels. Typically, material properties are associated with the individual voxels of a digital image, and the material properties at the material interfaces are a priori undefined. Also, the resistor network interpretation is, by construction, restricted to isotropic conductivity.

An alternative finite difference discretization was exploited by Willot [83], based on the rotated staggered grid [84, 85] popular for simulating the propagation of elastic waves in anisotropic media. The latter discretization places the temperatures at the voxel corners and considers a "resistor network" with resistors connecting diametrically opposite corners of each voxel, see Fig. 1a.

Fig. 1 Illustration of Willot’s discretization scheme [83]

In this way, the resistors meet at the voxel centers, and it makes sense to evaluate anisotropic constitutive laws as well. This approach also made it possible to treat mechanical problems [83]. By a coordinate transformation, the resulting gradient (at the voxel center) equals the averaged forward differences along the coordinate axes, see Fig. 1b. Willot derived compact expressions for the \(\Gamma ^0\)-operator, see Eqs. (2.48) and (2.52), and proposed pertinent FFT-based solution methods.

This finite difference method has a number of positive properties. Like the Moulinec–Suquet discretization, Willot’s discretization may be derived from a variational principle. For this purpose, denote by \(Y_N^{\text {node}}\) the nodal grid, which we consider to be identical to \(Y_N\) (2.18). Furthermore, let us introduce the Gauss-point grid

$$\begin{aligned} Y_N^{\text {Gauss}} = \left\{ x \in Y\quad \Big | \quad x_i = {\left( I_i + \tfrac{1}{2}\right) L_i}{/}{N_i} \quad (i=1,\ldots ,d) \text { for some index } I \in {\mathcal {I}}_N\right\} , \end{aligned}$$
(2.43)

where \({\mathcal {I}}_N\) denotes the set of admissible indices (2.16). For any displacement field

$$\begin{aligned} u: Y_N^{\text {node}} \rightarrow \mathbb {R}^d, \end{aligned}$$

let us denote by Du the symmetrized gradient computed via the rotated staggered grid discretization, giving rise to a field

$$\begin{aligned} Du: Y_N^{\text {Gauss}} \rightarrow \text {Sym}(d). \end{aligned}$$

Then, we may consider the variational problem

$$\begin{aligned} \sum _{x \in Y_N^{\text{ Gauss }}} w(x, {\bar{\varepsilon }} + D u_N(x)) \longrightarrow \min \quad \text{ w.r.t. } \quad u_N: Y_N^{\text{ node }} \rightarrow \mathbb {R}^d, \end{aligned}$$
(2.44)

which is very similar to the Moulinec–Suquet discretization (2.30). The associated first-order necessary condition reads

$$\begin{aligned} \sum _{x \in Y_N^{\text {Gauss}}} \frac{\partial w}{\partial \varepsilon }(x, {\bar{\varepsilon }} + D u_N(x)) : D v_N(x) = 0 \quad \text {for all} \quad v_N: Y_N^{\text {node}} \rightarrow \mathbb {R}^d, \end{aligned}$$

which may be rewritten, via summation by parts, in the form

$$\begin{aligned} D^* \left[ \frac{\partial w}{\partial \varepsilon }(\cdot , {\bar{\varepsilon }} + D u_N) \right] = 0, \end{aligned}$$
(2.45)

where \(D^*\) is (the negative of) the discrete divergence operator associated with the discrete symmetrized gradient D. Introducing the set of undefined indices

$$\begin{aligned} {\mathcal {U}}_N = \{0\} \cup \left\{ \xi \in {\mathbb {Z}_N} \,\Big |\, \text {there is some }j=1,\ldots ,d, \, \text {s.t. } 2\xi _j = -N_j\right\} , \end{aligned}$$
(2.46)

comprising zero and the Nyquist frequencies (2.21) and defining a complex-valued vector field

$$\begin{aligned} k: {\mathbb {Z}_N} \rightarrow \mathbb {C}^d, \end{aligned}$$

with components

$$\begin{aligned} k(\xi ) = \left[ \begin{array}{c} \frac{N_1}{4L_1} \left( e^{2\pi i \xi _1/N_1} - 1\right) \left( e^{2\pi i \xi _2/N_2} + 1\right) \left( e^{2\pi i \xi _3/N_3} + 1\right) \\ \frac{N_2}{4L_2} \left( e^{2\pi i \xi _1/N_1} + 1\right) \left( e^{2\pi i \xi _2/N_2} - 1\right) \left( e^{2\pi i \xi _3/N_3} + 1\right) \\ \frac{N_3}{4L_3} \left( e^{2\pi i \xi _1/N_1} + 1\right) \left( e^{2\pi i \xi _2/N_2} + 1\right) \left( e^{2\pi i \xi _3/N_3} - 1\right) \\ \end{array} \right] \end{aligned}$$
(2.47)

for \(d=3\), the symmetrized gradient D can be conveniently expressed in Fourier space

$$\begin{aligned} \widehat{(Du)}(\xi ) = k(\xi ) \otimes _s {\hat{u}}(\xi ), \end{aligned}$$

and the divergence operator \(D^*\) reads, in terms of the complex conjugate,

$$\begin{aligned} \widehat{(D^*\tau )}(\xi ) = {\hat{\tau }}(\xi ) \overline{k(\xi )}. \end{aligned}$$

Then, arguing as for the Moulinec–Suquet discretization, \(u_N:Y_N^{\text {node}} \rightarrow \mathbb {R}^d\) is a critical point of the variational principle (2.44) precisely if \(\varepsilon _N:Y_N^{\text {Gauss}} \rightarrow \text {Sym}(d)\) solves the Lippmann–Schwinger equation

$$\begin{aligned} \varepsilon _N + \Gamma ^0_N:\left[ \frac{\partial w}{\partial \varepsilon }(\cdot ,\varepsilon _N) - \mathbb {C}^0:\varepsilon _N \right] = {\bar{\varepsilon }} \end{aligned}$$
(2.48)

via the identification \(\varepsilon _N = {\bar{\varepsilon }} + D u_N\), where \(\Gamma ^0_N = D(D^*\mathbb {C}^0D)^{-1}D^*\). Compared to the Lippmann–Schwinger equation for the Moulinec–Suquet discretization (2.33), the similarities are striking. For \(\mathbb {C}^0 = 2\mu _0 {{\,\mathrm{\text {Id}}\,}}\) and for any field \(\tau : Y_N^{\text {Gauss}} \rightarrow \text {Sym}(d)\), the Eshelby–Green operator \(\Gamma ^0_N\) for Willot’s discretization takes the form

$$\begin{aligned} \widehat{\Gamma ^0_N \tau }(\xi ) = \frac{1}{\mu _0} \, \left[ \frac{k(\xi ) \otimes _s ({\hat{\tau }}(\xi )\overline{k(\xi )})}{\Vert k(\xi )\Vert ^2} - \frac{1}{2} \frac{ \overline{k(\xi )} \cdot ({\hat{\tau }}(\xi )\overline{k(\xi )})}{\Vert k(\xi )\Vert ^4} \, k(\xi ) \otimes k(\xi ) \right] \end{aligned}$$
(2.49)

for \(\xi \in \mathbb {Z}_N \backslash {\mathcal {U}}_N\), and identically zero otherwise. Please notice that, upon replacing \(k(\xi )\) by \(\xi _Y\) (which is real-valued), the continuous case (2.13) is recovered.
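As a sanity check on the formula (2.47), the following sketch (names hypothetical) evaluates \(k(\xi )\) and verifies that, for fixed \(\xi \) and increasingly fine grids, it approaches the continuous frequency \(2\pi i \xi _j/L_j\), consistent with the remark on recovering the continuous case:

```python
import cmath

def willot_k(xi, N, L):
    """Modified frequency vector k(xi) of Willot's rotated-staggered-grid
    discretization, Eq. (2.47), for d = 3; xi, N and L are 3-vectors."""
    phase = [cmath.exp(2j * cmath.pi * xi[j] / N[j]) for j in range(3)]
    k = []
    for j in range(3):
        # factor (e^{2 pi i xi_j / N_j} - 1) for the own direction ...
        factor = N[j] / (4.0 * L[j]) * (phase[j] - 1.0)
        # ... times (e^{2 pi i xi_m / N_m} + 1) for the remaining directions
        for m in range(3):
            if m != j:
                factor *= phase[m] + 1.0
        k.append(factor)
    return k
```

For small \(\xi _j/N_j\), each difference factor behaves like \(2\pi i\xi _j/N_j\) and each sum factor like 2, so \(k_j(\xi ) \approx 2\pi i\xi _j/L_j\); at \(\xi = 0\), the vector vanishes identically.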

Willot’s discretization preserves all the advantages of the Moulinec–Suquet discretization: it is displacement based, not restricted to linear constitutive laws, and permits efficient FFT-based solvers. There are two additional advantages of Willot’s discretization over Moulinec–Suquet’s discretization. For a start, as the operators D and \(D^*\) can be computed in real space, it is possible to construct iterative Lippmann–Schwinger solvers formulated for the displacement field. For instance, the basic scheme reads

$$\begin{aligned} u^{k+1}_N = -G^0_N D^*\left[ \frac{\partial w}{\partial \varepsilon }(\cdot , {\bar{\varepsilon }} + Du^k_N) - \mathbb {C}^0:Du^k_N \right] . \end{aligned}$$
(2.50)

In particular, the necessary memory consumption can be reduced by a factor of two for three-dimensional elasticity. This idea has been used earlier [82, 86].

Secondly, Willot’s discretization permits treating porous materials by FFT-based techniques in a stable way. For a porous material, the stress field vanishes identically on a certain sub-domain of the unit cell. For conventional finite element or finite difference discretizations, the nodal degrees of freedom in the pore space can be eliminated from the system to be solved. For FFT-based methods, such an approach is not possible, because the regular grid requires retaining these degrees of freedom in the discretized systems. It has been observed by many authors that the convergence behavior of the basic scheme deteriorates for increasing material contrast, in particular for porous materials. As a remedy, more sophisticated FFT-based linear and nonlinear solution methods were proposed, see Sect. 3. However, after some time it became clear that improved solution methods do not solve the problem. Indeed, the lack of convergence of the basic scheme is a result of using a discretization based on trigonometric polynomials. This is demonstrated by an example in Schneider et al. [87]. There, a Kelvin-type foam microstructure with \(50^3\) voxels is investigated. It is shown that the average stress of the iterates obtained by the Moulinec–Suquet discretization and the basic scheme converges to zero as \(k\rightarrow \infty \), which is unreasonable from a physical point of view. It can be shown that, under reasonable geometric conditions on the pore space and a non-degeneracy assumption on the constitutive behavior in the solid, the continuous basic scheme (2.15) is linearly convergent [43]. However, Fourier-type discretizations do not inherit this property. Without going into details, the essential insight is that the iterates \(\varepsilon ^k\) of the basic scheme (2.15) involve a \(\mathbb {C}^0\)-elastic extension to the pore space. Thus, on this subspace, the basic scheme becomes a contraction mapping. 
Discretizations by trigonometric polynomials are not compatible with \(\mathbb {C}^0\)-elastic extensions for prescribed values on the pore boundary. Indeed, due to the global nature of the trigonometric polynomials, any small error propagates globally. In contrast, finite difference discretizations, like Willot’s discretization, admit well-defined boundary values on the pore-solid interface. In particular, the \(\mathbb {C}^0\)-elastic extension is the Lippmann–Schwinger equivalent of deleting the nodal degrees of freedom in the pore space. We refer to Schneider [43] for details.

It took some time to realize the inherent advantages of finite difference discretizations. One of the reasons is that there was no uniform agreement on the convergence criterion used, so that a particular scheme might be regarded as convergent by one author and as non-convergent by another. Furthermore, the degree of ill-conditioning of Moulinec–Suquet’s discretization strongly depends on the complexity of the microstructure and also on the physics considered. For instance, thermal conductivity problems are less prone to ill-conditioning than mechanical problems, as are two-dimensional microstructures compared to their three-dimensional counterparts.

Alongside the Moulinec–Suquet discretization, Willot’s discretization has become a standard discretization scheme for FFT-based computational homogenization methods. For specific purposes, however, other finite difference discretization schemes were introduced as well.

Instead of the rotated staggered grid, the standard staggered grid [88] may be used for discretizing mechanical problems. In this approach, which originates in finite volume methods for fluid flow problems, each voxel is considered a control volume. Each such volume is furnished with a pressure, whereas the velocities naturally live on the voxel faces and quantify the flux of fluid across the corresponding interface. Thus, the variables are located on different, so-called staggered, grids. Discretizations on a staggered grid may also be used for mechanical problems, where the velocities are replaced by the displacements, see Zhu et al. [89], who use a geometric multi-grid solver. Without going into detail [87], it is also possible to write down a variational principle like for Willot’s discretization (2.44), but for a different symmetrized gradient operator, which reads, for \(d=3\),

$$\begin{aligned} Du = \left[ \begin{array}{c@{\quad }c@{\quad }c} D_1^+ u_1 &{} \tfrac{1}{2} \left( D_1^- u_2 + D_2^- u_1\right) &{} \tfrac{1}{2} \left( D_1^- u_3 + D_3^- u_1\right) \\ \tfrac{1}{2} \left( D_2^- u_1 + D_1^- u_2\right) &{} D_2^+ u_2 &{} \tfrac{1}{2} \left( D_2^- u_3 + D_3^- u_2\right) \\ \tfrac{1}{2} \left( D_3^- u_1 + D_1^- u_3\right) &{} \tfrac{1}{2} \left( D_3^- u_2 + D_2^- u_3\right) &{} D_3^+ u_3 \end{array} \right] \end{aligned}$$
(2.51)

in terms of appropriate forward and backward finite difference operators \(D^\pm _j\) in direction j. As for Willot’s discretization, Lippmann–Schwinger-type solvers can be constructed, exploiting explicit expressions of the \(\Gamma ^0\)-operator in Fourier space, see Eq. (2.52).
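A minimal sketch of the symmetrized gradient (2.51) on a periodic voxel grid may look as follows; the array layout and function names are hypothetical, and the periodic wrap-around implements the periodicity of the unit cell:

```python
def staggered_sym_grad(u, N, L):
    """Symmetrized gradient of the staggered-grid discretization (2.51) on a
    periodic N[0] x N[1] x N[2] voxel grid: forward differences D_j^+ on the
    diagonal, backward differences D_j^- off the diagonal.
    u[i][a][b][c] holds component i of the displacement at voxel (a, b, c)."""
    h = [L[j] / N[j] for j in range(3)]  # grid spacings

    def val(f, I):
        return f[I[0]][I[1]][I[2]]

    def dplus(f, j, I):   # forward difference, periodic wrap-around
        J = list(I); J[j] = (J[j] + 1) % N[j]
        return (val(f, J) - val(f, I)) / h[j]

    def dminus(f, j, I):  # backward difference, periodic wrap-around
        J = list(I); J[j] = (J[j] - 1) % N[j]
        return (val(f, I) - val(f, J)) / h[j]

    E = {}
    for a in range(N[0]):
        for b in range(N[1]):
            for c in range(N[2]):
                I = (a, b, c)
                eps = [[0.0] * 3 for _ in range(3)]
                for i in range(3):
                    eps[i][i] = dplus(u[i], i, I)
                    for j in range(i + 1, 3):
                        eps[i][j] = eps[j][i] = 0.5 * (dminus(u[j], i, I)
                                                       + dminus(u[i], j, I))
                E[I] = eps
    return E
```

By construction, the output is a symmetric matrix at each voxel, and a constant displacement field produces a vanishing strain, as it should.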

The staggered grid scheme leads to more robust solvers than Willot’s scheme for porous materials, in particular for high porosity, see Sect. 5.6. The reasons for the inferior performance of Willot’s discretization are the hourglass modes emerging for microstructures with complex pore space, to be explained below.

More recently, further finite difference discretization schemes [90,91,92,93,94] were investigated. For applications without jumps in the material properties, like for microstructures emerging from minimizing non-convex potentials, high-order central difference schemes [92, 93] were reported to be advantageous.

When implementing the action of the (discrete) \(\Gamma \)-operator, it is often advantageous to exploit the product structure of the \(\Gamma \)-operator. For the most common finite difference schemes, computing \(\varepsilon = \Gamma ^0:\tau \) in Fourier space at specific frequency \(\xi \) is summarized as follows:

$$\begin{aligned} \begin{aligned} \forall j=1,2,3:k_j&\leftarrow \left\{ \begin{array}{rl} 2\pi \xi _j/L_j, &{} \text {Moulinec--Suquet [33,34]}\\ \prod _{k\ne j} (e^{2\pi i {\xi }_k/N_k} + 1) (e^{2\pi i {\xi }_j/N_j} - 1) \, N_j/L_j &{} \text {Willot [83]} \\ (e^{-2\pi i {\xi }_j/N_j} - 1) \, N_j/L_j, &{} \text {staggered grid [87]}\\ i\sin ( 2\pi {\xi }_j/N_j )\, N_j/L_j, &{} \text {2nd order central differences [90,91]}\\ i\left( 8 \sin ( 2\pi {\xi }_j/N_j ) - \sin ( 4\pi {\xi }_j/N_j )\right) \, N_j/L_j, &{} \text {4th order central differences [92]}\\ \begin{array}{r} i\left( 23760\sin ( 2\pi {\xi }_j/N_j ) -7425\sin ( 4\pi {\xi }_j/N_j )\right. \\ +2200 \sin ( 6\pi {\xi }_j/N_j ) -495 \sin ( 8\pi {\xi }_j/N_j )\\ \left. +72 \sin ( 10\pi {\xi }_j/N_j ) -5 \sin ( 12\pi {\xi }_j/N_j )\right) \, N_j/L_j, \end{array}&\text {12th order central differences [93]} \end{array} \right. \\ \eta&\leftarrow k/\Vert k\Vert \\ f&\leftarrow \left\{ \begin{array}{rl} \left[ \begin{array}{r} -{\hat{\tau }}_{11}(\xi ) \eta _1 + {\hat{\tau }}_{12}(\xi ) {\bar{\eta }}_2 + {\hat{\tau }}_{13}(\xi ) {\bar{\eta }}_3\\ {\hat{\tau }}_{21}(\xi ) {\bar{\eta }}_1 - {\hat{\tau }}_{22}(\xi ) \eta _2 + {\hat{\tau }}_{23}(\xi ) {\bar{\eta }}_3\\ {\hat{\tau }}_{31}(\xi ) {\bar{\eta }}_1 + {\hat{\tau }}_{32}(\xi ) {\bar{\eta }}_2 - {\hat{\tau }}_{33}(\xi ) \eta _3\\ \end{array} \right] , &{} \text {staggered grid}\\ \left[ \begin{array}{r} {\hat{\tau }}_{11}(\xi ) {\bar{\eta }}_1 + {\hat{\tau }}_{12}(\xi ) {\bar{\eta }}_2 + {\hat{\tau }}_{13}(\xi ) {\bar{\eta }}_3\\ {\hat{\tau }}_{21}(\xi ) {\bar{\eta }}_1 + {\hat{\tau }}_{22}(\xi ) {\bar{\eta }}_2 + {\hat{\tau }}_{23}(\xi ) {\bar{\eta }}_3\\ {\hat{\tau }}_{31}(\xi ) {\bar{\eta }}_1 + {\hat{\tau }}_{32}(\xi ) {\bar{\eta }}_2 + {\hat{\tau }}_{33}(\xi ) {\bar{\eta }}_3\\ \end{array} \right] , &{} \text {otherwise} \\ \end{array}\right. 
\\ s&\leftarrow f \cdot {\bar{\eta }}\\ u&\leftarrow \left\{ \begin{array}{rl} \left( -f + s \, \eta /2 \right) /\mu _0, &{} \text {staggered grid},\\ \left( -f + s \, {\bar{\eta }}/2 \right) /\mu _0, &{} \text {otherwise},\\ \end{array} \right. \\ {\hat{\varepsilon }}(\xi )&\leftarrow \left\{ \begin{array}{rl} \left[ \begin{array}{c@{\quad }c@{\quad }c} -{\bar{\eta }}_1 u_1 &{} \frac{\eta _1 u_2 + \eta _2 u_1}{2} &{} \frac{\eta _1 u_3 + \eta _3 u_1}{2} \\ \frac{\eta _2 u_1 + \eta _1 u_2}{2} &{} -{\bar{\eta }}_2 u_2 &{} \frac{\eta _2 u_3 + \eta _3 u_2}{2} \\ \frac{\eta _3 u_1 + \eta _1 u_3}{2} &{} \frac{\eta _3 u_2 + \eta _2 u_3}{2} &{} -{\bar{\eta }}_2 u_3\\ \end{array} \right] , &{} \text {staggered grid}\\ \left[ \begin{array}{c@{\quad }c@{\quad }c} \eta _1 u_1 &{} \frac{{\eta }_1 u_2 + {\eta }_2 u_1}{2} &{} \frac{{\eta }_1 u_3 + {\eta }_3 u_1}{2} \\ \frac{{\eta }_2 u_1 + {\eta }_1 u_2}{2} &{} \eta _2 u_2 &{} \frac{{\eta }_2 u_3 + {\eta }_3 u_2}{2} \\ \frac{{\eta }_3 u_1 + {\eta }_1 u_3}{2} &{} \frac{{\eta }_3 u_2 + {\eta }_2 u_3}{2} &{} \eta _2 u_3\\ \end{array} \right] , &{} \text {otherwise} \\ \end{array}\right. \end{aligned} \end{aligned}$$
(2.52)
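To illustrate the structure of (2.52), the following sketch (names hypothetical) applies the \(\Gamma ^0\)-operator of (2.49) at a single frequency for the reference material \(\mathbb {C}^0 = 2\mu _0\,{\text {Id}}\), written so that real frequency vectors (Moulinec–Suquet) and complex ones (finite difference schemes) are handled uniformly:

```python
def gamma0_apply(tau_hat, k, mu0):
    """Apply the Gamma^0 multiplier (2.49) at a single nonzero frequency,
    for the reference material C^0 = 2 mu0 Id. tau_hat is a 3x3 matrix of
    (complex) Fourier coefficients, k the (possibly complex) frequency
    vector of the chosen discretization."""
    kc = [x.conjugate() for x in k]                 # conjugate frequency
    nrm2 = sum(abs(x) ** 2 for x in k)              # ||k||^2
    # f = tau_hat . conj(k)  and  s = conj(k) . (tau_hat . conj(k))
    f = [sum(tau_hat[i][j] * kc[j] for j in range(3)) for i in range(3)]
    s = sum(kc[i] * f[i] for i in range(3))
    # k (x)_s f / ||k||^2  -  (1/2) s k (x) k / ||k||^4, all divided by mu0
    return [[((k[i] * f[j] + k[j] * f[i]) / (2.0 * nrm2)
              - 0.5 * s * k[i] * k[j] / nrm2 ** 2) / mu0
             for j in range(3)] for i in range(3)]
```

As a consistency check, for a real frequency vector k and the compatible strain \({\hat{\varepsilon }} = k \otimes _s {\hat{u}}\), applying the operator to \(\mathbb {C}^0:{\hat{\varepsilon }}\) returns \({\hat{\varepsilon }}\), mirroring the projection property of \(\Gamma ^0:\mathbb {C}^0\) on compatible fields.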

Closely related to finite difference discretizations are finite volume schemes. Based on the immersed interface method [95], which may also be solved efficiently by FFT [79, 96], Wiegmann and Zemitis [97] introduced an FFT-based solver for isotropic thermal conductivity on digital images based on a finite volume discretization. In their solution method, the jumps in the temperature gradient field across voxel interfaces are the primary unknowns. The Wiegmann–Zemitis discretization improves upon the Willot–Pellegrini scheme [81] by associating with each interface the harmonic average of the conductivities of the two adjacent voxels. Dorn and Schneider [98] interpreted the Wiegmann–Zemitis formulation as a finite difference equivalent of a boundary integral equation and provided an associated (volumetric) Lippmann–Schwinger equation. As a result, the finite volume discretization was integrated into the canon of FFT-based computational homogenization methods. Compared to both the Moulinec–Suquet and Willot discretizations, the Wiegmann–Zemitis finite volume method exhibits minimal artifacts.
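The interface rule underlying the Wiegmann–Zemitis scheme can be sketched for scalar conductivity as follows (function name hypothetical):

```python
def interface_conductivity(c_left, c_right):
    """Harmonic average of the conductivities of two adjacent voxels,
    assigned to their common face, as in the Wiegmann-Zemitis scheme."""
    return 2.0 * c_left * c_right / (c_left + c_right)
```

The harmonic average is the exact effective conductivity of two equal-thickness layers in series, and it tends to zero when one neighbor is (nearly) empty, which is consistent with a robust treatment of porous voxels.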

As finite element methods on regular meshes may be interpreted as specific finite difference stencils, it is possible to construct FFT-based solvers for finite element discretizations as well. In particular, it does not make sense to "compare" FFT-based methods to finite element discretizations: the former are solution methods, the latter discretization schemes. Rather, it is reasonable to compare finite element discretizations on boundary/interface-conforming meshes to discretizations on a regular grid. Also, one may compare different solution methods applied to problems arising from the same discretization scheme.

FFT-based solvers for trilinear hexahedral elements on a regular periodic grid were introduced by Schneider et al. [99]. The idea is to consider the nodal values of the displacement field to live on the nodal grid \(Y_N^{\text {node}}\), as for Willot’s discretization, and to introduce a Gauss-point grid \(Y_N^{\text {Gauss}}\) with eight Gauss points per element. Then, the symmetrized gradient may be interpreted as a mapping of displacement fields on the nodal grid \(Y_N^{\text {node}}\) to strain values at the Gauss-point grid \(Y_N^{\text {Gauss}}\), and the finite element variational principle

$$\begin{aligned} \sum _{x \in Y_N^{\text{ Gauss }}} w(x, {\bar{\varepsilon }} + D u_N(x)) \longrightarrow \min _{u_N} \quad \text{ among } \quad u_N: Y_N^{\text{ node }} \rightarrow \mathbb {R}^d \end{aligned}$$

is similar to that for Willot's discretization (2.44), but with eight Gauss points per element instead of a single one. It is interesting to note that the discretized system obtained by Willot's discretization is identical to using trilinear FEM ansatz functions and reduced integration [100, 101] with only a single Gauss point per element, located at the voxel center. The latter interpretation permits applying the established FEM machinery to Willot's discretization, settling questions about convergence and stability of the discretization. Interestingly, for non-porous materials, only global hourglass instabilities may occur. These instabilities are concentrated, after discrete Fourier transform, at the Nyquist frequencies (2.21). Hence, setting the Green's operator to zero for these frequencies, as suggested by Willot [83], serves as an hourglass stabilization. Unfortunately, for porous materials, local hourglass instabilities may emerge that are not taken care of by the \(\Gamma \)-operator.
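On a discrete grid, the frequencies to be zeroed are easy to identify. The helper below is a 1D building block with our own naming, a sketch of how such a stabilization could be realized rather than Willot's actual implementation:

```python
import numpy as np

def nyquist_mask(N):
    """Boolean mask marking the Nyquist frequency on an N-point FFT grid.
    For even N, np.fft.fftfreq places it at index N/2; for odd N, there is
    no Nyquist frequency and the mask is empty."""
    return np.isclose(np.abs(np.fft.fftfreq(N)), 0.5)
```

Zeroing the Green's operator at every grid frequency where any of the three components is flagged by such a mask realizes the stabilization described above.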

Returning to the eight-point trilinear finite elements, a Lippmann–Schwinger equation and corresponding FFT-based solvers may be established as before. The notable difference is the increase in both memory demand and computational complexity when implementing such an FFT-based finite element solver on the strains. Indeed, eight Gauss points are located in every voxel. Thus, for a fixed microstructure, eight times the number of strain fields need to be stored for the full FEM discretization compared to the under-integrated version, equivalent to Willot’s discretization, and also the number of evaluations of the constitutive law is increased eightfold.

These shortcomings were addressed by Fritzen and Leuschner [102], building upon previous finite element contributions by Yvonnet [103]. The resulting Fourier-accelerated nodal solver (FANS) operates on the displacement field, see Eq. (2.50), and utilizes a Newton–Krylov method, see Sect. 3.4. Fritzen and Leuschner store the Newton tangent in a sparse format, reducing the overhead from the increased number of Gauss points. Indeed, only the nodal degrees of freedom are accessed by the linear solver. Moreover, Fritzen and Leuschner exploit selective reduced integration [104].

Table 1 Summary of the considered discretization schemes

2.6 Overview

A (personal) summary of the presented discretization schemes is given in Table 1. Signs in brackets mean the following. For the Moulinec–Suquet and the Fourier–Galerkin discretization, computing the gradient and the divergence requires FFTs. Thus, the basic scheme requires at least a strain field to be kept in memory. However, additional fields may be stored in displacement form. For instance, the CG method can be implemented on a single strain and three displacement fields [105,106,107]. Thus, a low memory implementation is possible, but not to the same degree as for other discretization schemes.

By moving the strain and stress unknowns to the voxel center, the staggered grid discretization is compatible with anisotropic materials. However, the solution fields corresponding to a symmetric geometry may be non-symmetric. Whether this is a problem for complex (non-symmetric) microstructures is another matter. For the FEM discretization, computing the \(\Gamma \)-operator involves inverting a (Hermitian) \(3 \times 3\) matrix. Thus, the resulting \(\Gamma \)-operator is semi-explicit. The finite volume discretization is restricted to isotropic conductivity.

Fig. 2
figure 2

Comparison of von Mises stress (in MPa) on a cross section for a single spherical glass inclusion in a polyamide matrix and different discretization methods (CD = central differences)

For the readers' convenience, a few examples on the characteristics of the different discretization methods are included. For a single spherical glass inclusion with a volume fraction of \(12.78\%\) in a polyamide matrix, discretized by \(128^3\) voxels, we use the material parameters of Schneider [108] and subject the composite to \(5\%\) uniaxial extension in x-direction, assuming linear elastic material behavior. The resulting von Mises stress on a cross section is shown in Fig. 2. For all discretization methods, there are distinct artifacts at the interface. In the matrix, all discretizations look rather similar. In the inclusion, Moulinec–Suquet's discretization shows ringing artifacts, whereas Willot's scheme leads to checkerboarding. The central difference discretizations lead to an oscillatory behavior away from the interface which increases with increasing order of the method. This is a result of the lack of discrete combinatorial consistency of those discretizations, making them more suitable for problems with small contrast.

Table 2 Effective Young’s moduli (in GPa) in x-direction for the spherical glass inclusion and varying discretization and resolution
Fig. 3
figure 3

Influence of different discretization schemes on the convergence behavior of the (linear) conjugate gradient method and a microstructure of bound sand [109]

For completeness, the computed effective Young's moduli are listed in Table 2. We observe that all proposed discretization schemes give rise to rather similar results for a fixed resolution. Concerning the behavior for porous materials, we consider a 3D microstructure of bound sand with a (rather high) porosity of approximately \(40.14\%\) and \(256^3\) voxels, see the original publication [109], where the material parameters are also listed. For this (admittedly complex) microstructure, shown in Fig. 3a, the performance of the linear conjugate gradient method [54] is compared in Fig. 3b. After an initially strong decrease, the residual for the Moulinec–Suquet discretization fails to be reduced significantly below \(10^{-2}\). A similar behavior is observed for the high-order finite difference schemes. Willot's discretization permits reducing the residual further by about an order of magnitude, but stalls afterwards as well. The second-order finite difference method converges after 774 iterations. The staggered grid discretization also converges and requires about an order of magnitude fewer iterations, namely 95.

Table 3 Axial components of the stress (in MPa) for the sandcore structure, \({\bar{\varepsilon }} = 1\%\, e_x \otimes e_x\) and varying discretization scheme

The resulting effective stresses in the three axial directions are reported in Table 3. Please note that we included the effective stresses at iteration 1000 for those schemes which did not converge.

The only two discretization schemes which led to convergence within 1000 iterations, the staggered grid and the second-order central differences, give rise to effective stresses which coincide in the first two significant digits. These values also coincide with those of Willot's discretization, which reaches the lowest residual among the non-convergent cases. All the remaining discretization schemes give different predictions.

Fig. 4
figure 4

Von Mises stress (in MPa) on a quarter of a transverse section in the sand-grain structure, see Fig. 3

Last but not least, let us also take a look at the resulting solution fields, see Fig. 4. In this magnified view, we notice that the observations made before continue to hold for this porous setting, although the effects appear to be less pronounced.

3 Modern nonlinear solution methods

3.1 The basic scheme

Moulinec and Suquet [33] introduced an algorithm that is nowadays known as the basic scheme, following the terminology in Michel et al. [110]. It was formulated at small strains (2.15) and was already applicable to nonlinear and inelastic materials

$$\begin{aligned} \varepsilon ^{k+1} = {\bar{\varepsilon }} - \Gamma ^0:\left( \frac{\partial w}{\partial \varepsilon }(\cdot ,\varepsilon ^k) - \mathbb {C}^0:\varepsilon ^k\right) , \quad k=0,1,\ldots \end{aligned}$$
(3.1)

The algorithm was studied more thoroughly in Moulinec and Suquet [34]. In particular, the treatment of Nyquist frequencies (2.21) was discussed and a selection strategy for the isotropic reference material was suggested, provided all the materials in the microstructure are linear elastic and isotropic. More precisely, from computational experience, it was suggested to compute the spatial maximum and minimum for the two Lamé constants individually, and to select the Lamé constants of the reference medium to be the arithmetic average of these extremes. Last but not least, instead of prescribing a macroscopic direction of strain, a way for simulating a uniaxial stress driven process was introduced, see also Sect. 4.2.
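As an illustration, a minimal 2D analogue of the basic scheme (3.1) for scalar (thermal) conductivity on a periodic pixel grid can be written in a few dozen lines. The function name and the setup are our own; the reference conductivity follows the arithmetic-average recipe just described:

```python
import numpy as np

def basic_scheme_conductivity(kappa, E_bar, tol=1e-8, max_iter=500):
    """Basic scheme for scalar conductivity on a periodic 2D pixel grid.

    Iterates e^{k+1} = E_bar - Gamma^0 ((kappa - k0) e^k), where the
    reference conductivity k0 is the arithmetic average of the extreme
    values of kappa, mirroring the selection strategy in the text."""
    N1, N2 = kappa.shape
    k0 = 0.5 * (kappa.min() + kappa.max())
    xi1 = np.fft.fftfreq(N1)[:, None]
    xi2 = np.fft.fftfreq(N2)[None, :]
    xi_sq = xi1**2 + xi2**2
    xi_sq[0, 0] = 1.0                  # dummy value; the mean is set explicitly below
    # initial guess: the prescribed mean temperature gradient everywhere
    e = np.stack([np.full((N1, N2), E_bar[0]), np.full((N1, N2), E_bar[1])])
    for it in range(1, max_iter + 1):
        tau = (kappa - k0) * e         # polarization flux
        tau_h = np.fft.fft2(tau)
        # Gamma^0 in Fourier space: xi (xi . tau_h) / (k0 |xi|^2)
        div_h = xi1 * tau_h[0] + xi2 * tau_h[1]
        e_h = np.empty_like(tau_h)
        e_h[0] = -xi1 * div_h / (k0 * xi_sq)
        e_h[1] = -xi2 * div_h / (k0 * xi_sq)
        e_h[:, 0, 0] = np.multiply(E_bar, N1 * N2)   # enforce the prescribed mean
        e_new = np.real(np.fft.ifft2(e_h))
        if np.max(np.abs(e_new - e)) < tol:
            return e_new, it
        e = e_new
    return e, max_iter
```

For a homogeneous conductivity, the polarization vanishes and the scheme terminates after a single iteration with the constant field; for a two-phase medium, the iteration count reflects the material contrast, as discussed above.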

In the context of (linear) conductivity, Eyre and Milton [111] analyzed the convergence behavior of the basic scheme by interpreting it as a Neumann series. Indeed, for linear elasticity, \( {\partial w}/{\partial \varepsilon }(\cdot ,\varepsilon ) \equiv \mathbb {C}:\varepsilon \), the basic scheme (3.1) may be expressed in the form

$$\begin{aligned} \varepsilon ^{k+1} = \sum _{j=0}^k \left( -\Gamma ^0:\left( \mathbb {C}- \mathbb {C}^0\right) \right) ^j{\bar{\varepsilon }}. \end{aligned}$$

It is well known that such a Neumann series for \(({{\,\mathrm{\text {Id}}\,}}+ B)^{-1}\) converges provided the spectral radius of B is less than unity. The spectral radius of the operator \(\Gamma ^0:(\mathbb {C}-\mathbb {C}^0)\) for isotropic stiffness \(\mathbb {C}\) was analyzed by Michel et al. [110]. They came to the conclusion that the best bound on the spectral radius (and thus the fastest convergence rate) is reached by computing the maximum and minimum of the shear and the compression modulus of the materials, individually, and taking the arithmetic average of those for the reference medium, see also Vinogradov and Milton [112] for a more functional-analytic discussion. A salient feature of the basic scheme (3.1) is that the necessary iteration count for reaching a desired tolerance is bounded independently of the mesh width, depending only on the material contrast.

Milton [4], Sect. 14.11, extended the analysis of the basic scheme to a class of nonlinear materials. Suppose that positive constants \(\alpha _\pm \) and a heterogeneous stress operator

$$\begin{aligned} \sigma : Y\times \text {Sym}(d) \rightarrow \text {Sym}(d), \quad (x,\varepsilon ) \mapsto \sigma (x,\varepsilon ), \end{aligned}$$
(3.2)

are given, s.t. \(\sigma \) is uniformly \(\alpha _+\)-Lipschitz (in \(\varepsilon \))

$$\begin{aligned} \left\| \sigma (x,\varepsilon _1) - \sigma (x,\varepsilon _2) \right\| \le \alpha _+ \, \Vert \varepsilon _1 - \varepsilon _2\Vert \quad \text {for all} \quad x \in Y\quad \text {and all} \quad \varepsilon _1,\varepsilon _2 \in \text {Sym}(d) \end{aligned}$$
(3.3)

and uniformly \(\alpha _-\)-monotone (in \(\varepsilon \))

$$\begin{aligned} \left( \sigma (x,\varepsilon _1) - \sigma (x,\varepsilon _2)\right) :(\varepsilon _1 - \varepsilon _2) \ge \alpha _- \, \Vert \varepsilon _1 - \varepsilon _2\Vert ^2 \quad \text {for all} \quad x \in Y\quad \text {and all} \quad \varepsilon _1,\varepsilon _2 \in \text {Sym}(d). \end{aligned}$$
(3.4)

Then, for the reference material \(\mathbb {C}^0 = \alpha _0\, {{\,\mathrm{\text {Id}}\,}}\) with \(\alpha _0 > \alpha _+^2/(2 \alpha _-)\), the nonlinear operator

$$\begin{aligned} L^2(Y;\text {Sym}(d)) \rightarrow L^2(Y;\text {Sym}(d)), \quad \varepsilon \mapsto {\bar{\varepsilon }} - \Gamma ^0:\left( \sigma (\varepsilon ) - \mathbb {C}^0:\varepsilon \right) , \end{aligned}$$

defines a contraction mapping [113], and the basic scheme (3.1), interpreted as a fixed point method, converges globally with a linear rate. The (theoretically) fastest convergence rate is reached for \(\alpha _0 = \alpha _+^2/\alpha _-\). Specialized to the linear elastic setting, however, the ensuing convergence rate \(\sqrt{1 - \alpha _-^2/\alpha _+^2}\) is strictly inferior to the rate \(1 - \alpha _-/\alpha _+\) obtained for \(\alpha _0 = (\alpha _- + \alpha _+)/2\).
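To quantify the gap between these two rates, consider a contrast of \(\alpha _+/\alpha _- = 100\) in the following worked comparison (numbers ours):

$$\begin{aligned} \alpha _0 = \alpha _+^2/\alpha _-&: \quad \sqrt{1 - \alpha _-^2/\alpha _+^2} = \sqrt{1 - 10^{-4}} \approx 0.99995,\\ \alpha _0 = (\alpha _- + \alpha _+)/2&: \quad 1 - \alpha _-/\alpha _+ = 1 - 10^{-2} = 0.99. \end{aligned}$$

Per decimal digit of accuracy, the former rate requires roughly \(\ln 10/(5 \cdot 10^{-5}) \approx 46{,}000\) iterations, the latter only about 230, i.e., a factor of roughly 200.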

In their study of finite strain problems, Kabel et al. [105] interpreted the basic scheme (3.1) as a gradient descent method for the total (condensed) elastic energy on the cell \(Y\). Recall that there is a difference between the differential and the gradient of a (Fréchet differentiable) function W on a Hilbert space H. Indeed, for any \(u \in H\), the differential DW(u) is an element of the dual space \(H'\), the space of bounded linear functionals on H, whereas the gradient \(\nabla W(u)\) at \(u \in H\) is a vector in H. Both are related via the inner product \(\langle \cdot ,\cdot \rangle _H\) on H, i.e.,

$$\begin{aligned} DW(u)[v] = \langle \nabla W(u), v\rangle _H, \quad u,v \in H, \end{aligned}$$
(3.5)

holds. In the context of small-strain mechanics, suppose a condensed free energy w (2.3) is given, together with \({\bar{\varepsilon }} \in \text {Sym}(d)\). Let us investigate the objective function (2.4)

$$\begin{aligned} W: H^1_\#(Y;\mathbb {R}^d) \rightarrow \mathbb {R}, \quad u \mapsto \frac{1}{\text {vol}(Y)}\int _Yw(x, {\bar{\varepsilon }} + \nabla ^s u(x))\, dx, \end{aligned}$$

where we endow the space \(H^1_\#(Y;\mathbb {R}^d)\) with the Korn-type inner product

$$\begin{aligned} \langle \cdot ,\cdot \rangle _{H^1_\#}: H^1_\#(Y;\mathbb {R}^d) \times H^1_\#(Y;\mathbb {R}^d) \rightarrow \mathbb {R}, \quad (u_1,u_2) \mapsto \frac{1}{\text {vol}(Y)}\int _Y\nabla ^s u_1(x):\nabla ^s u_2(x)\, dx. \end{aligned}$$
(3.6)

Then, for a sequence of positive step-sizes \(s^k\), the associated gradient descent method

$$\begin{aligned} u^{k+1} = u^k - s^k \, \nabla W(u^k), \quad k=0,1,\ldots \end{aligned}$$
(3.7)

may be studied. It is not difficult to see that, for the inner product (3.6), the gradient of W (3.5) becomes

$$\begin{aligned} \nabla W(u) = G\,\frac{\partial w}{\partial \varepsilon }(\cdot , {\bar{\varepsilon }} + \nabla ^s u), \end{aligned}$$

where \(G\) is a non-dimensional version of the Green’s operator (2.10) with Fourier coefficients

$$\begin{aligned} (\widehat{Gf})(\xi ) = -\left( \frac{2}{\Vert \xi _Y\Vert ^2} \, {{\,\mathrm{\text {Id}}\,}}- \frac{\xi _Y\otimes \xi _Y}{\Vert \xi _Y\Vert ^4}\right) {\hat{f}}(\xi ), \quad \xi \ne 0, \end{aligned}$$

and \((\widehat{Gf})(0) = 0\), for any \(f \in H^{-1}_\#(Y;\mathbb {R}^d)\). In view of these formulae, the gradient descent method (3.7) becomes

$$\begin{aligned} u^{k+1} = u^k - s^k \, G\,\frac{\partial w}{\partial \varepsilon }(\cdot , {\bar{\varepsilon }} + \nabla ^s u^k), \quad k=0,1,\ldots , \end{aligned}$$

which, for \(s^k = \frac{1}{2\mu _0}\), is the fixed point method associated with the Lippmann–Schwinger equation (2.11)

$$\begin{aligned} u = -G^0\text {div}\,\left( \frac{\partial w}{\partial \varepsilon } (\cdot , {\bar{\varepsilon }} + \nabla ^s u) - \mathbb {C}^0:({\bar{\varepsilon }} + \nabla ^s u) \right) \end{aligned}$$

for the displacement field u.

Interpreting the basic scheme as a gradient descent method sheds light on its properties. For a start, if the step size \(s^k\) is sufficiently small, the averaged energy \(W(u^k)\) will be non-increasing from one iteration to the next. On the other hand, if the step size is too large (which corresponds to a very soft reference material), the explicit nature of the gradient update (3.7) leads to an unstable, non-convergent method.

The interpretation as a gradient scheme also permits drawing upon the mature convergence theory developed for gradient methods. For instance, if the stress operator \(\sigma \) (3.2) associated with the condensed free energy w (2.3) satisfies the conditions (3.3) and (3.4), the basic scheme (3.1) with \(\mathbb {C}^0 = \alpha _0\,{{\,\mathrm{\text {Id}}\,}}\) is a contractive fixed point method for \(\alpha _0 > \alpha _+/2\) and the fastest convergence rate is reached for \(\alpha _0 = (\alpha _- + \alpha _+)/2\) at \((\alpha _+ - \alpha _-)/(\alpha _+ + \alpha _-)\). In particular, no speed is lost compared to the linear elastic case.

In view of Milton’s contraction estimate, this is, however, a result of investigating stress operators coming from a potential. Indeed, Milton’s result also applies to linear and monotone "stiffness tensors" without major symmetry, whereas gradient descent only covers symmetric (and positive definite) stiffness tensors.

The interpretation as a gradient scheme also sheds light on the role of anisotropy in the reference material. Under the conditions (3.3) and (3.4), a reference stiffness of arbitrary anisotropy leads to a linearly convergent basic scheme provided the reference material is sufficiently stiff. However, such a strategy is typically not effective. Even for the linear case, to reach fast convergence, it is best if the reference material commutes with all local stiffness tensors \(\mathbb {C}:Y\rightarrow L(\text {Sym}(d))\). Then, an eigenvalue-based recommendation for the reference material can be given. Of course, choosing the reference material proportional to the identity ensures that it commutes with all material tangents (in the nonlinear case), which is the rationale behind this choice.

In any case, estimates for the constants \(\alpha _\pm \) may be obtained from spectral analyses of the material tangents. Indeed, \(\alpha _-\) arises as the minimum (over all points) of the smallest eigenvalues of the symmetric part of the tangent, whereas \(\alpha _+\) is the maximum of the largest singular value of the tangent. If the stress operator (3.2) is derived from a potential, the tangent is symmetric, and no distinction between eigenvalues and singular values needs to be made.
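A spectral estimate of this kind is straightforward to implement. The sketch below (function name ours) treats the material tangents as plain matrices in a chosen basis:

```python
import numpy as np

def monotonicity_and_lipschitz_constants(tangents):
    """Estimate alpha_- and alpha_+ from a collection of material tangents,
    following the spectral recipe in the text: alpha_- is the minimum over
    all tangents of the smallest eigenvalue of the symmetric part, alpha_+
    the maximum of the largest singular value."""
    alpha_minus = min(np.linalg.eigvalsh(0.5 * (A + A.T)).min() for A in tangents)
    alpha_plus = max(np.linalg.svd(A, compute_uv=False).max() for A in tangents)
    return alpha_minus, alpha_plus
```

For symmetric tangents, eigenvalues and singular values coincide (up to sign), consistent with the remark above.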

A closely related, more geometric approach to the basic scheme was introduced by Bellis and Suquet [114] based on the Helmholtz decomposition [115], where also further solvers exploiting geometric principles were discussed. In a nutshell, the basic scheme may also be interpreted as a projected gradient method on \(L^2(Y;\text {Sym}(d))\), where the objective function is

$$\begin{aligned} L^2(Y;\text{ Sym }(d)) \rightarrow \mathbb {R}, \quad \varepsilon \mapsto \int _Yw(x, \varepsilon (x))\, dx, \end{aligned}$$

and the \(L^2\)-orthogonal projector onto the closed, convex and non-empty set

$$\begin{aligned} {\mathcal {C}}_{{\bar{\varepsilon }}} = \left\{ \varepsilon = {\bar{\varepsilon }} + \nabla ^s u \,\Big |\, u \in H^1_\#(Y;\mathbb {R}^d) \right\} \subseteq L^2(Y;\text {Sym}(d)) \end{aligned}$$

of compatible strain fields with prescribed mean is given by

$$\begin{aligned} \varepsilon \mapsto {\bar{\varepsilon }} + \Gamma :\varepsilon , \quad \Gamma = \nabla ^s G\text {div}\,. \end{aligned}$$

In any case, the interpretation as a (projected) gradient method instigated research on accelerated gradient methods in the context of FFT-based computational homogenization, see Sect. 3.3.

An alternative interpretation of the basic scheme was provided by Mishra et al. [116], connecting the basic scheme for linear composites to Richardson’s iteration [117], a classical numerical technique. This interpretation served as the starting point for investigating the Chebyshev iteration [118] in this context [116], as well. Recently, also comparisons to laminates [119] and to series expansions available for composites with exact solutions [120] were made.

For a long time, the basic scheme, albeit cherished for its stability and reliability, has been considered too slow for nonlinear composites or linear composites with high material contrast. For this reason, alternative solution methods were sought, to be discussed below. However, recent work has revived the classical basic scheme to some extent. Motivated by investigations on Quasi–Newton methods, the Barzilai–Borwein method [121]

$$\begin{aligned} {\varepsilon ^{k+1} = {\bar{\varepsilon }} - \frac{1}{\alpha ^k} \, \Gamma :\left( \sigma (\varepsilon ^k) - \alpha ^k \, \varepsilon ^k\right) \quad \text {with} \quad \alpha ^k = \frac{ \langle \sigma (\varepsilon ^k) - \sigma (\varepsilon ^{k-1}), \varepsilon ^k - \varepsilon ^{k-1}\rangle _{L^2}}{ \Vert \varepsilon ^k - \varepsilon ^{k-1} \Vert _{L^2}^2}} \end{aligned}$$

was introduced into FFT-based computational homogenization [108]. The Barzilai–Borwein scheme may be interpreted as a gradient descent method (3.7) with an adaptive choice of step size (or, equivalently, the reference material). It may be implemented in a manner that is remarkably close to Moulinec and Suquet's original suggestion [33], and is competitive with the fastest solvers available.
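Stripped of the micromechanical context, the Barzilai–Borwein rule reads as follows. This is a generic gradient descent sketch on \(\mathbb {R}^n\) with our own naming, not the strain-based implementation of [108]:

```python
import numpy as np

def barzilai_borwein(grad, x0, n_iter=100):
    """Gradient descent with the Barzilai-Borwein step size 1/alpha_k, where
    alpha_k = <g_k - g_{k-1}, x_k - x_{k-1}> / ||x_k - x_{k-1}||^2,
    mirroring the adaptive reference-material choice in the text."""
    x_prev, g_prev = x0, grad(x0)
    x = x0 - 1e-4 * g_prev            # small plain gradient step to start
    for _ in range(n_iter):
        g = grad(x)
        s, y = x - x_prev, g - g_prev
        if np.dot(s, s) == 0.0:       # converged to machine precision
            break
        alpha = np.dot(y, s) / np.dot(s, s)
        x_prev, g_prev = x, g
        x = x - g / alpha             # step size 1/alpha_k
    return x
```

Note the absence of any line search or tunable parameter: the step size adapts itself to the local curvature, which is precisely what makes the scheme attractive in the FFT context.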

Also, it was realized that some discretization schemes lead to a numerical ill-conditioning of the discretized systems [43, 83, 87], independent of the solver used, see Table 1. By changing the discretization, the basic scheme turns into a linearly convergent method also for porous materials [43] (not a competitive one, though).

3.2 Krylov subspace methods

Krylov subspace methods for solving a linear system

$$\begin{aligned} Ax = b, \end{aligned}$$
(3.8)

where b and x are vectors in a high-dimensional space \(\mathbb {R}^n\), and \(A\in \mathbb {R}^{n \times n}\) is an invertible matrix, assume that applying the matrix A is much more expensive than linearly combining vectors in \(\mathbb {R}^n\). Then, a general iterative scheme for obtaining a solution \(x^*\) of Eq. (3.8) may be expressed by counting the number of applications of the matrix A, leading to the notion of Krylov subspaces

$$\begin{aligned} {\mathcal {K}}^k(A;b) = \left\{ x \in \mathbb {R}^n \, \Big | \, x = \sum _{j=0}^{k-1} \alpha _j\, A^jb \quad \text {for some real coefficients} \quad \alpha _0,\ldots ,\alpha _{k-1}\right\} , \end{aligned}$$

see Saad [122]. Krylov subspace methods are iterative methods which select the k-th iterate from the \((k+1)\)-th Krylov subspace, often in such a way that a pre-specified objective function \(\varphi _{A,b}:\mathbb {R}^n \rightarrow \mathbb {R}\) is optimized,

$$\begin{aligned} x^k = \mathop {\mathrm{arg\,min}}\limits _{x \in {\mathcal {K}}^{k+1}(A;b)} \varphi _{A,b}(x). \end{aligned}$$

Such Krylov subspace methods are \(\varphi _{A,b}\)-optimal iterative methods by construction. If the matrix A is symmetric and positive definite, the solution \(x^* \in \mathbb {R}^n\) of Eq. (3.8) may be obtained by minimizing the function

$$\begin{aligned} \varphi _{A,b}(x) = \tfrac{1}{2} \, x^T A x - b^T x. \end{aligned}$$

Choosing the latter function as the objective function on the Krylov subspaces gives rise to a solution strategy that is equivalent to the (linear) conjugate gradient (CG) method of Hestenes and Stiefel [40]. Considering the residual \(\varphi _{A,b}(x) = \Vert Ax - b\Vert ^2\) as the objective to minimize for a symmetric and non-degenerate matrix A gives rise to the minimum residual method (MINRES) [123]. Also, versions for non-symmetric matrices are available [124, 125].

Krylov subspace methods often become particularly powerful when combined with pre-conditioning, i.e., when Eq. (3.8) is transformed into the equivalent system

$$\begin{aligned} P^{-1}Ax = P^{-1}b \end{aligned}$$
(3.9)

for a suitable (symmetric and positive definite) matrix \(P \in \mathbb {R}^{n \times n}\) whose inverse is readily computable, and which improves the conditioning of the linear system (3.9) compared to the original one (3.8).

In the linear elastic setting and for a reference material \(\mathbb {C}^0\), s.t. the difference \(\mathbb {C}-\mathbb {C}^0\) is invertible, Brisard and Dormieux [49] considered the Lippmann–Schwinger equation (2.27)

$$\begin{aligned} (\mathbb {C}- \mathbb {C}^0)^{-1} : \tau + \Gamma ^0:\tau = {\bar{\varepsilon }} \end{aligned}$$
(3.10)

for the polarization \(\tau = (\mathbb {C}- \mathbb {C}^0):\varepsilon \). For both the Moulinec–Suquet and the Brisard–Dormieux discretizations, considering \(\tau \) as the unknown and \({\bar{\varepsilon }}\) as the right-hand side, the operator \((\mathbb {C}- \mathbb {C}^0)^{-1} + \Gamma ^0\) is a symmetric and invertible linear operator [50]. If \(\mathbb {C}^0\) is softer than all local \(\mathbb {C}\)'s, the operator is even positive definite, and if the smallest eigenvalue of \(\mathbb {C}^0\) exceeds the largest eigenvalue of all local \(\mathbb {C}\)'s, the operator is negative definite. In these definite cases, Brisard and Dormieux [49] proposed to use the linear conjugate gradient method [40]. For the remaining reference material choices, they applied MINRES [123].

Krylov subspace solvers significantly speed up the convergence toward the solution \(\tau ^*\) of the Lippmann–Schwinger equation (3.10). Indeed, the number of necessary iterations to reach a desired accuracy grows with \(\sqrt{\kappa }\), where \(\kappa \) is the material contrast, whereas for the basic scheme, the number of iterations to reach the same accuracy is proportional to \(\kappa \). Thus, the improvement when using Krylov-type solvers can be substantial. Also, please keep in mind that the \(\sqrt{\kappa }\)-scaling is optimal in the class of strongly convex functions with Lipschitz gradient, see Sect. 2.1.4 in Nesterov [126].

A salient feature of Brisard–Dormieux’s Krylov-type approach to solving the Lippmann–Schwinger equation (3.10), which it shares with the basic scheme, is that the (estimate on the) number of necessary iterations is independent of the mesh size.

At practically the same time as Brisard and Dormieux [49], Zeman et al. [54] also considered Krylov subspace methods, but applied to the Lippmann–Schwinger equation for the strain (2.12)

$$\begin{aligned} \left( {{\,\mathrm{\text {Id}}\,}}+ \Gamma ^0 : \left( \mathbb {C}- \mathbb {C}^0 \right) \right) : \varepsilon = {\bar{\varepsilon }}. \end{aligned}$$
(3.11)

Zeman et al. [54] considered both the conjugate gradient method and the stabilized bi-conjugate gradient method Bi-CGStab [125], where the latter was developed for non-symmetric linear systems. Indeed, even for a reference material proportional to the identity \(\mathbb {C}^0 = \alpha _0 \, {{\,\mathrm{\text {Id}}\,}}\), the operator appearing on the left-hand side of Eq. (3.11) is not symmetric on \(L^2(Y;\text {Sym}(d))\),

$$\begin{aligned} \int _Y\varepsilon _1:\left( \left( {{\,\mathrm {\text{ Id }}\,}}+ \Gamma ^0 : \left( \mathbb {C}- \mathbb {C}^0 \right) \right) :\varepsilon _2\right) \, dx \ne \int _Y\varepsilon _2:\left( \left( {{\,\mathrm {\text{ Id }}\,}}+ \Gamma ^0 : \left( \mathbb {C}- \mathbb {C}^0 \right) \right) :\varepsilon _1\right) \, dx \end{aligned}$$

for general \(\varepsilon _1,\varepsilon _2 \in L^2(Y;\text {Sym}(d))\). Despite this fact, Zeman et al. [54] reported that the conjugate gradient method converged when applied to the Lippmann–Schwinger equation (2.12) and that the produced iterates were identical to those resulting from Bi-CGStab! The reasons behind this unexpected behavior were uncovered by Vondřejc et al. [68], who showed that the operator \({{\,\mathrm{\text {Id}}\,}}+ \Gamma ^0 : \left( \mathbb {C}- \mathbb {C}^0 \right) \) is symmetric and positive definite on the subspace of compatible strain fields, with respect to a \(\mathbb {C}^0\)-weighted inner product. Indeed, utilizing the properties of the Helmholtz projector \(\Gamma ^0:\mathbb {C}^0\) onto the compatible strain fields, it may be shown that

$$\begin{aligned} \int _Y\varepsilon _1:\mathbb {C}^0:\left( \left( {{\,\mathrm {\text{ Id }}\,}}+ \Gamma ^0 : \left( \mathbb {C}- \mathbb {C}^0 \right) \right) :\varepsilon _2\right) \, dx = \int _Y\varepsilon _1(x):\mathbb {C}:\varepsilon _2(x) \, dx \end{aligned}$$

holds for all \(\varepsilon _i = \nabla ^s u_i\) for some \(u_i \in H^1_\#(Y;\mathbb {R}^d)\) \((i=1,2)\). Alternatively, one might think of the Lippmann–Schwinger equation (3.11) as a pre-conditioned version (3.9) of the balance of linear momentum

$$\begin{aligned} \text {div}\,\mathbb {C}:\nabla ^s u = -\text {div}\,\mathbb {C}:{\bar{\varepsilon }}, \end{aligned}$$
(3.12)

where the pre-conditioner P is chosen by the reference problem \(P = -\text {div}\,\mathbb {C}^0:\nabla ^s\). Then, \(P^{-1} = -G^0\) in terms of the Green’s operator (2.10) for the displacement, and the Lippmann–Schwinger equation (2.11)

$$\begin{aligned} G^0\text {div}\,\mathbb {C}:\nabla ^s u = -G^0\text {div}\,\mathbb {C}:{\bar{\varepsilon }} \end{aligned}$$

for the displacement is recovered. This point of view clarifies the underlying principles: The Lippmann–Schwinger equation (2.12) may be regarded as a pre-conditioned balance equation (3.12), where the pre-conditioner is chosen to solve a comparison problem with constant coefficients. Such a pre-conditioner removes the dependence of the condition number of the original balance equation (3.12) on the mesh size. However, as the coefficients are constant, the dependence on the variation of the coefficients, measured by the material contrast, remains.
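In this reading, the FFT-based solver is an ordinary preconditioned Krylov method. A generic sketch of preconditioned CG (names ours, with apply_P_inv playing the role of the reference-problem solve \(-G^0\)) looks as follows:

```python
import numpy as np

def preconditioned_cg(apply_A, apply_P_inv, b, tol=1e-10, max_iter=1000):
    """Preconditioned CG for P^{-1} A x = P^{-1} b, Eq. (3.9): apply_P_inv
    provides the action of P^{-1}, e.g., the solve of a reference problem
    with constant coefficients."""
    x = np.zeros_like(b)
    r = b.copy()                      # unpreconditioned residual b - A x
    z = apply_P_inv(r)                # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = apply_A(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = apply_P_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

With the constant-coefficient reference problem as P, the mesh-size dependence of the condition number disappears, while the contrast dependence remains, as explained above.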

Also, the point of view proposed by Vondřejc et al. [68] naturally leads to low-memory implementations on the displacement.

To conclude our discussion of Krylov-type methods, let us remark that the pre-conditioned conjugate gradient method in FFT-based computational homogenization was proposed by Nagai et al. [86] as early as 1998, but remained largely unnoticed. Also, let us briefly discuss the memory consumption of the Krylov-type methods. The conjugate gradient method is typically implemented on four vectors, and MINRES requires seven vectors. For the Brisard–Dormieux approach, thus, four and seven strain fields need to be kept in memory, respectively. If CG is implemented à la Zeman et al. [68], four strain fields are necessary, but a memory-efficient implementation on the displacement field may reduce the memory footprint further [106].

3.3 Fast gradient methods

Fast (or accelerated) gradient methods augment plain gradient descent (3.7)

$$\begin{aligned} u^{k+1} = u^k - s^k \, \nabla W(u^k), \quad k=0,1,\ldots , \end{aligned}$$
(3.13)

by a momentum term, and come in essentially two flavors. Heavy ball type methods [127] take the form

$$\begin{aligned} u^{k+1} = u^k - s^k \, \nabla W(u^k) + \beta ^k\, (u^k - u^{k-1}), \quad k=0,1,\ldots , \end{aligned}$$
(3.14)

where \(\beta ^k\), valued in [0, 1), are momentum coefficients, and \(u^{-1} \equiv u^0\). Nesterov’s fast gradient method [128] uses the update formula

$$\begin{aligned} \begin{aligned} v^k&= u^k + \beta ^k\,(u^k - u^{k-1}),\\ u^{k+1}&= v^k - s^k \, \nabla W(v^k), \end{aligned} \end{aligned}$$
(3.15)

for \(k=0,1,\ldots \), and \(u^{-1}\equiv u^0\) and \(\beta ^k\in [0,1)\). Whereas gradient descent (3.13) may be interpreted as a forward Euler discretization in time of continuous gradient descent

$$\begin{aligned} {\dot{u}}(t) = -\nabla W(u(t)) \quad \text {for} \quad t\ge 0 \quad \text {and} \quad u(0)=u_0, \end{aligned}$$

the fast gradient methods (3.14) and (3.15) arise as explicit and semi-explicit time discretizations [129, 130] of the Newtonian dynamical system with linear damping

$$\begin{aligned} \ddot{u}(t) + b\, {\dot{u}}(t) = -\nabla W(u(t)) \quad \text {for} \quad t\ge 0 \quad \text {and} \quad u(0)=u_0, \quad {\dot{u}}(0)=0, \end{aligned}$$

involving a damping coefficient \(b\ge 0\). For FFT-based computational micromechanics, they take the form [131]

$$\begin{aligned} \varepsilon ^{k+1} = {\bar{\varepsilon }} - s^k \, \Gamma : \left( \sigma (\varepsilon ^k) - \tfrac{1}{s^k} \, \varepsilon ^k\right) + \beta ^k\, (\varepsilon ^k - \varepsilon ^{k-1}) \end{aligned}$$
(3.16)

for the heavy ball method, and

$$\begin{aligned} \begin{aligned} e^k&= \varepsilon ^k + \beta ^k\,(\varepsilon ^k - \varepsilon ^{k-1}),\\ \varepsilon ^{k+1}&= {\bar{\varepsilon }} - s^k \, \Gamma : \left( \sigma (e^k) - \tfrac{1}{s^k} \, e^k\right) \end{aligned} \end{aligned}$$
(3.17)

for Nesterov’s method [132]. The two fast gradient variants (3.14) and (3.15) differ in the parameter selection, the convergence speed and the stability properties.
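For illustration, the two updates may be sketched in Python/NumPy on a toy strongly convex quadratic standing in for the discretized elastic energy; the parameter choices follow (3.18) and (3.19) below, and all names are illustrative, not taken from any cited implementation:

```python
import numpy as np

# Toy strongly convex quadratic W(u) = 0.5 u^T A u - b^T u with gradient A u - b;
# alpha_m and alpha_p play the role of the eigenvalue bounds in (2.25).
rng = np.random.default_rng(0)
n = 50
alpha_m, alpha_p = 1.0, 100.0
A = np.diag(np.linspace(alpha_m, alpha_p, n))
b = rng.standard_normal(n)
u_star = b / np.diag(A)                    # exact minimizer

def grad(u):
    return A @ u - b

# Heavy ball (3.14) with the optimal parameters (3.18)
s = (2.0 / (np.sqrt(alpha_m) + np.sqrt(alpha_p))) ** 2
beta = ((np.sqrt(alpha_p) - np.sqrt(alpha_m)) / (np.sqrt(alpha_p) + np.sqrt(alpha_m))) ** 2
u, u_prev = np.zeros(n), np.zeros(n)
for _ in range(600):
    u, u_prev = u - s * grad(u) + beta * (u - u_prev), u

# Nesterov's method (3.15) with the parameters (3.19)
s_n = 1.0 / alpha_p
beta_n = (np.sqrt(alpha_p) - np.sqrt(alpha_m)) / (np.sqrt(alpha_p) + np.sqrt(alpha_m))
v_u, v_prev = np.zeros(n), np.zeros(n)
for _ in range(600):
    v = v_u + beta_n * (v_u - v_prev)
    v_u, v_prev = v - s_n * grad(v), v_u
```

Both recursions require keeping only the current and the previous iterate in memory, which is the source of the low memory footprint discussed below.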

For linear elasticity, and under the conditions (2.25), the (k-independent) choice

$$\begin{aligned} \frac{1}{s^k} = \left( \frac{ \sqrt{\alpha _-} + \sqrt{\alpha _+} }{2}\right) ^2 \quad \text {and} \quad \beta ^k = \left( \frac{ \sqrt{\alpha _+} - \sqrt{\alpha _-} }{ \sqrt{\alpha _+} + \sqrt{\alpha _-} }\right) ^2 \end{aligned}$$
(3.18)

is optimal for the heavy ball method (3.16), leading to the estimate

$$\begin{aligned} \Vert \varepsilon ^k - \varepsilon ^*\Vert _{L^2} \le C_\eta \left( \frac{ \sqrt{\alpha _+} - \sqrt{\alpha _-} }{ \sqrt{\alpha _+} + \sqrt{\alpha _-}} + \eta \right) ^k \Vert \varepsilon ^0 - \varepsilon ^*\Vert _{L^2} \end{aligned}$$

for any \(\eta >0\) and where \(\varepsilon ^*\) is the (unique) solution of the Lippmann–Schwinger equation (2.12), see Polyak [129]. The constant \(C_\eta \) goes to infinity as \(\eta \rightarrow 0\). In particular, the heavy ball method (3.16) has (almost) the optimal convergence rate, but may be implemented on only two strain fields, saving \(50\%\) of memory compared to the conjugate gradient method.

However, the heavy ball method is somewhat limited for nonlinear problems. By a perturbation argument, the choice (3.18) also leads to a locally convergent method for general twice differentiable, \(\alpha _-\)-strongly convex functions with \(\alpha _+\)-Lipschitz gradient [129]. However, the method may be globally unstable using these parameters [133, 134].

In contrast, Nesterov’s method (3.15) converges for general once differentiable, \(\alpha _-\)-strongly convex functions with \(\alpha _+\)-Lipschitz gradient and the choice

$$\begin{aligned} s^k = \frac{1}{\alpha _+} \quad \text {and} \quad \beta ^k = \frac{ \sqrt{\alpha _+} - \sqrt{\alpha _-} }{ \sqrt{\alpha _+} + \sqrt{\alpha _-} } \end{aligned}$$
(3.19)

with an estimate

$$\begin{aligned} \Vert \varepsilon ^k - \varepsilon ^*\Vert _{L^2} \le C \, \left( \sqrt{ 1 - \sqrt{\frac{\alpha _-}{\alpha _+}} }\right) ^k \Vert \varepsilon ^0 - \varepsilon ^*\Vert _{L^2} \end{aligned}$$

and a fixed constant C, see Nesterov’s book [126]. Thus, the rate is slower than for the heavy ball method, but the range of applicability is larger.

In computational practice, both fast gradient variants (3.16) and (3.17) suffer from the same shortcoming as the classical basic scheme, discussed in Sect. 3.1, namely that the algorithmic parameters, \(s^k\) and \(\beta ^k\), need to be chosen judiciously. Often, only estimates of the true constants \(\alpha _\pm \) are available. Also, for porous materials, the "effective" strong convexity constant is not known, and determining it can be more difficult than solving the problem in the first place [43]. For these reasons, an adaptive choice of the algorithmic parameters is recommended. For Nesterov’s method (3.15), a number of restarting strategies became popular [130, 135]. The restarting strategy of Su et al. [130] was investigated in Schneider [132]. Although the resulting method could recover the optimal convergence rate, the necessary number of iterations lags behind that of other competitive methods [108]. Indeed, when using the linear case as a sanity test, relying upon the speed restarting strategy increases the iteration count by a factor of two to three compared to the optimal choice [134] available for the linear case.
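To illustrate the adaptive idea, the following Python/NumPy sketch uses a gradient-based restart in the spirit of O’Donoghue and Candès (a popular variant; not the speed restart of Su et al.) together with the parameter-free momentum schedule \(\beta ^k = k/(k+3)\), so that only the Lipschitz constant \(\alpha _+\) enters; the toy quadratic and all names are illustrative:

```python
import numpy as np

# Toy quadratic; the strong convexity constant is NOT used -- the restart adapts to it.
rng = np.random.default_rng(1)
n = 50
A = np.diag(np.linspace(1.0, 100.0, n))
b = rng.standard_normal(n)
u_star = b / np.diag(A)
alpha_p = 100.0                      # Lipschitz constant of the gradient

def grad(u):
    return A @ u - b

u_prev = np.zeros(n)
u = np.zeros(n)
k = 0
for _ in range(1500):
    beta = k / (k + 3)               # parameter-free momentum schedule
    v = u + beta * (u - u_prev)
    g = grad(v)
    u_next = v - g / alpha_p
    if g @ (u_next - u) > 0:         # momentum opposes descent: restart
        k = 0
    else:
        k += 1
    u_prev, u = u, u_next
```

On strongly convex problems, such restarts empirically recover a linear rate close to the optimal one without knowledge of \(\alpha _-\).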

As a consequence, adaptive strategies for selecting the parameters of the heavy ball method were sought. Actually, for the linear elastic case and for the optimal choice of the parameters \(s^k\) and \(\beta ^k\) (via plane search), the heavy ball method reduces to the (linear) conjugate gradient method of Sect. 3.2, see Polyak [129]. More generally, the class of heavy ball methods (3.14)

$$\begin{aligned} u^{k+1} = u^k - s^k \, \nabla W(u^k) + \beta ^k\, (u^k - u^{k-1}), \quad k=0,1,\ldots , \end{aligned}$$

is identical to the class of nonlinear conjugate gradient methods

$$\begin{aligned} \begin{aligned} d^k&= -\nabla W(u^k) + \gamma ^{k-1} \, d ^{k-1},\\ u^{k+1}&= u^k + s^k \, d^k, \end{aligned} \end{aligned}$$
(3.20)

via the identification \(\beta ^k = \gamma ^{k-1} s^k / s^{k-1}\) [129]. However, for nonlinear conjugate gradient methods, the step size \(s^k\) is typically chosen via line search, and a variety of explicit formulas for the coefficient \(\gamma ^k\) have emerged, see Dai [136] for a recent overview.

In FFT-based computational micromechanics, evaluating \(\nabla W\) via the constitutive law is typically the most expensive step, and a line search should be avoided. Thus, it was suggested [137] to use the constant step size of the gradient scheme, i.e., \(s^k = 1/\alpha _0\) with \(\alpha _0 = (\alpha _+ + \alpha _-)/2\), in combination with the Fletcher–Reeves [138] formula

$$\begin{aligned} \gamma ^{k-1} = \frac{ \Vert \nabla W(u^k)\Vert ^2 }{ \Vert \nabla W(u^{k-1})\Vert ^2 }. \end{aligned}$$

The associated nonlinear CG method may be interpreted as a (discrete) dynamical system with feedback control. The resulting FFT-based method may be implemented on three strain (or displacement) fields, and shows excellent performance [137].

3.4 Newton and Quasi-Newton methods

The Newton–Raphson method is a classic way of solving the balance equation (2.6)

$$\begin{aligned} \text {div}\,\frac{\partial w}{\partial \varepsilon } (\cdot , {\bar{\varepsilon }} + \nabla ^s u) = 0 \end{aligned}$$
(3.21)

via the update

$$\begin{aligned} u^{k+1} = u^k + s^k \triangle u^k, \end{aligned}$$
(3.22)

where the increment \(\triangle u^k \in H^1_\#(Y;\mathbb {R}^d)\) solves the linearization of the balance equation (3.21)

$$\begin{aligned} \text {div}\,\frac{\partial ^2 w}{\partial \varepsilon ^2} (\cdot , {\bar{\varepsilon }} + \nabla ^s u^k): \nabla ^s \triangle u^k = -\text {div}\,\frac{\partial w}{\partial \varepsilon } (\cdot , {\bar{\varepsilon }} + \nabla ^s u^k) \end{aligned}$$
(3.23)

and \(s^k \in (0,1]\) is a step size. For an arbitrary reference material \(\mathbb {C}^0\), the linearized equilibrium equation (3.23) may also be rewritten in Lippmann–Schwinger form (with residual stresses)

$$\begin{aligned} \triangle \varepsilon ^k + \Gamma ^0: \left( \left[ \frac{\partial ^2 w}{\partial \varepsilon ^2} (\cdot , \varepsilon ^k) - \mathbb {C}^0 \right] : \triangle \varepsilon ^k + \frac{\partial w}{\partial \varepsilon } (\cdot , \varepsilon ^k)\right) = 0, \end{aligned}$$
(3.24)

where \(\varepsilon ^k = {\bar{\varepsilon }} + \nabla ^s u^k\) and \(\triangle \varepsilon ^k = \nabla ^s \triangle u^k\). For finite strain problems, Lahellec et al. [139] proposed to use a Newton–Raphson method, where the linearized equilibrium equation (3.23) is solved by the basic scheme naturally associated with the Lippmann–Schwinger equation (3.24).

After the emergence of linear solvers of Krylov type, see Sect. 3.2, Newton–Krylov methods were put forward, both at small [140] and finite strains [105]. Kabel et al. [105] compared Newton’s method, combined with a variety of linear solvers, to the basic scheme, and Newton-CG proved to be the most effective combination.

Coupling Newton’s method to an iterative solver for the linearized equation (3.23) is commonly referred to as a Newton–Krylov solver [141], a special class of inexact Newton methods [142]. To be competitive, Newton’s method (3.22) needs to be supplemented by a globalization strategy and a judicious choice of the forcing term, i.e., the tolerance for the linear solver. Concerning globalization, it is well known that Newton’s method converges only locally: outside the region of attraction, it may fail to converge. Globalization strategies address this deficiency, and back-tracking line search is probably the simplest strategy available, see Boyd [143]. Furthermore, to ensure the convergence of Newton-type methods, it is important to choose the tolerance for the linear solver, the so-called forcing term, depending on the nonlinear residual [144]. For computational homogenization problems, both topics are discussed in Wicht et al. [145].
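The following Python/NumPy sketch illustrates the two ingredients on a hypothetical smooth, strongly convex toy problem: an inner CG solver stopped by a residual-dependent forcing term, and Armijo back-tracking as the globalization strategy. The specific forcing-term formula and all names are illustrative, not taken from the cited works:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20
M = rng.standard_normal((n, n))
A = M.T @ M + np.eye(n)                  # SPD part of the toy energy
b = rng.standard_normal(n)

def W(u):                                # toy strongly convex energy
    return 0.5 * u @ A @ u + np.sum(np.cosh(u) - 1.0) - b @ u

def grad(u):
    return A @ u + np.sinh(u) - b

def hess(u):
    return A + np.diag(np.cosh(u))

def cg(H, rhs, tol, maxit=200):
    """Plain conjugate gradients for H x = rhs, stopped at ||residual|| <= tol."""
    x = np.zeros_like(rhs)
    r = rhs.copy(); p = r.copy(); rs = r @ r
    for _ in range(maxit):
        if np.sqrt(rs) <= tol:
            break
        Hp = H @ p
        a = rs / (p @ Hp)
        x += a * p
        r -= a * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

u = np.zeros(n)
for _ in range(50):
    g = grad(u)
    gn = np.linalg.norm(g)
    if gn < 1e-10:
        break
    eta = min(0.5, np.sqrt(gn))          # forcing term: looser far from the solution
    du = cg(hess(u), -g, eta * gn)       # inexact Newton step
    s = 1.0
    while W(u + s * du) > W(u) + 1e-4 * s * (g @ du):  # Armijo back-tracking
        s *= 0.5
    u = u + s * du
```

Far from the solution, the loose forcing term avoids over-solving the linear system; close to it, the tolerance tightens and the fast local convergence of Newton’s method is retained.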

For general nonlinear optimization problems, Quasi-Newton methods are often preferred over plain Newton methods, see Nocedal and Wright [146] for a discussion. A particularly powerful method is the Quasi-Newton scheme proposed by Broyden, Fletcher, Goldfarb and Shanno (BFGS) [147,148,149,150]. However, investigations for small-strain inelasticity revealed that the limited-memory version of BFGS, L-BFGS [151], is not competitive with the Barzilai–Borwein scheme, see Wicht et al. [145]. The reason is found in the large computational overhead associated with L-BFGS. Furthermore, for general large-scale optimization problems, BFGS typically outperforms plain Newton methods for objective functions with densely populated Hessians. However, for the problem at hand (3.21), the "stiffness matrix" in Eq. (3.23) is block-diagonal and, hence, sparse. Interestingly, the Barzilai–Borwein method [108], which may be interpreted as L-BFGS of depth one without line search, turns out to be competitive with the fastest solvers available.

As a general note, Newton’s method will lead to the fastest solver if applying the tangent material is much cheaper than evaluating the nonlinear constitutive law. Indeed, for the basic scheme and its variants, an evaluation of the nonlinear material law is required for every iteration. In contrast, such an expensive computation is required for Newton’s method only when the right-hand side of the linear system (3.23) is computed. Also, for materials which involve voxel-wise rotations, these rotations may be shifted into the tangent material, accelerating the solution of the linear problem further. However, this speed comes at a price, namely the increased memory demand. Suppose that we wish to solve a problem with \(512^3\) voxels. Then, a single scalar field in double precision occupies 1 GB of memory, and a single strain field requires 6 GB of memory. Ignoring internal variables, the basic scheme may be computed in-place on a single strain field. The Barzilai–Borwein method, the fast gradient methods and the polarization schemes may be implemented on two strain fields, requiring 12 GB of memory. In contrast, the classical Newton-CG requires 51 GB for symmetric material tangents. Indeed, 21 GB are required for the tangent, in addition to five strain fields (\(\varepsilon ^k\) and four fields for computing \(\triangle \varepsilon ^k\) using CG). It is possible to reduce this memory demand by \(50\%\) by using displacement-based implementations and storing the tangent in single precision. Still, more than 25 GB are required. These numbers increase for finite strains and non-symmetric tangents. This is the reason why alternatives to the established Newton–Krylov solvers are (still) sought.
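The memory bookkeeping above is elementary and may be reproduced directly; the following snippet (Python, with the numbers from the text) tallies the budget for a \(512^3\) grid in double precision:

```python
# Memory budget for a 512^3 voxel grid, fields stored in double precision (8 B).
voxels = 512 ** 3
scalar_gb = voxels * 8 / 2 ** 30           # one scalar field: exactly 1 GB
strain_gb = 6 * scalar_gb                  # symmetric strain tensor: 6 components
tangent_gb = 21 * scalar_gb                # symmetric 6x6 tangent: 21 components

basic_scheme_gb = 1 * strain_gb            # in-place on a single strain field
barzilai_borwein_gb = 2 * strain_gb        # two strain fields
newton_cg_gb = tangent_gb + 5 * strain_gb  # tangent + eps^k + four CG strain fields
```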

To conclude our discussion of Newton-type methods, let us remark that Volmer et al. [152] proposed an improved initial guess for Newton-type solvers. Peng et al. [153] investigated the recursive projection method, a variant of Newton’s method, to take care of possible ill-conditioning of Lippmann–Schwinger solvers for porous materials and discretizations based on trigonometric polynomials.

In addition to Quasi-Newton methods, which generalize the secant methods popular for solving one-dimensional problems, a particular type of multi-secant method was introduced to FFT-based computational micromechanics with great success. Anderson mixing [154] is a general approach for accelerating fixed point methods of the type

$$\begin{aligned} u^{k+1} = F(u^k), \end{aligned}$$

where derivatives of F are not available. Anderson mixing of depth \(m>0\) keeps the previous m iterates \(u^k,u^{k-1},\ldots ,u^{k-m+1}\) in memory, together with their images \(F(u^k),F(u^{k-1}),\ldots ,F(u^{k-m+1})\) under F. Then, the next iterate is determined by the mixing rule

$$\begin{aligned} u^{k+1} = \sum _{i=0}^{m-1} \alpha _{i,m} F(u^{k-i}), \end{aligned}$$
(3.25)

where the coefficients \(\alpha _{i,m}\) are chosen to minimize

$$\begin{aligned} \left\| \sum _{i=0}^{m-1} \alpha _{i,m} \left( u^{k-i} - F(u^{k-i})\right) \right\| ^2 \longrightarrow \min \quad \text {among} \quad \left\{ (\alpha _{0,m},\ldots ,\alpha _{m-1,m}) \in \mathbb {R}^m \,\Big |\, \sum _{i=0}^{m-1} \alpha _{i,m} = 1 \right\} .\nonumber \\ \end{aligned}$$
(3.26)

The latter optimization problem (3.26) may be considered as a linearly constrained quadratic optimization problem in m dimensions, and the coefficients involve only scalar products of the vector residuals \(u^\ell - F(u^\ell )\). It has been shown that, applied to linear problems, Anderson mixing (3.25) and (3.26) is essentially equivalent to GMRES, see Walker and Ni [155]. For general fixed-point iterative schemes, Anderson mixing (3.25) and (3.26) was identified as a multi-secant method by Fang and Saad [156].
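A compact sketch of Anderson mixing in Python/NumPy follows; the small constrained least-squares problem (3.26) is solved via its KKT system, and a tiny ridge is added for numerical safety (an implementation detail, not part of the original scheme). The linear contraction serving as \(F\) is a hypothetical stand-in for the basic scheme:

```python
import numpy as np

def anderson(F, u0, m=4, iters=50, tol=1e-12):
    """Anderson mixing (3.25)/(3.26) of depth m for the fixed point u = F(u)."""
    us, Fs = [u0], [F(u0)]
    u = Fs[-1]                               # first step: plain fixed point
    for _ in range(iters):
        us.append(u); Fs.append(F(u))
        if np.linalg.norm(us[-1] - Fs[-1]) < tol:
            return Fs[-1]
        depth = min(m, len(us))
        # columns: residuals u^{k-i} - F(u^{k-i}), i = 0, ..., depth-1
        R = np.stack([us[-1 - i] - Fs[-1 - i] for i in range(depth)], axis=1)
        # minimize ||R a||^2 subject to sum(a) = 1 via the KKT system
        G = R.T @ R + 1e-14 * np.eye(depth)  # ridge against near-singular G
        kkt = np.block([[G, np.ones((depth, 1))],
                        [np.ones((1, depth)), np.zeros((1, 1))]])
        rhs = np.zeros(depth + 1); rhs[-1] = 1.0
        a = np.linalg.solve(kkt, rhs)[:depth]
        u = sum(a[i] * Fs[-1 - i] for i in range(depth))   # mixing rule (3.25)
    return u

# Hypothetical linear contraction F(u) = M u + c as a stand-in for the basic scheme.
rng = np.random.default_rng(4)
B = rng.standard_normal((20, 20))
M = 0.5 * B / np.linalg.norm(B, 2)           # spectral norm 0.5: a contraction
c = rng.standard_normal(20)
u_fix = np.linalg.solve(np.eye(20) - M, c)   # exact fixed point
u = anderson(lambda x: M @ x + c, np.zeros(20))
```

Note that only scalar products of stored residuals enter the KKT system, so the overhead per iteration is small compared to an evaluation of \(F\).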

Anderson mixing was applied to the fixed point methods used in FFT-based computational homogenization by Shanthraj et al. [157] under the banner of nonlinear GMRES, and found to perform well. Independently, Chen et al. [158] reported the Anderson-mixed basic scheme with depth \(m=4\) to be both robust and efficient for problems at small strains.

3.5 Polarization methods

Eyre and Milton [111] proposed an ingenious rewriting of the Lippmann–Schwinger equation (2.12)

$$\begin{aligned} \varepsilon + \Gamma ^0 : \left( \mathbb {C}- \mathbb {C}^0 \right) :\varepsilon = {\bar{\varepsilon }} \end{aligned}$$
(3.27)

for the strain \(\varepsilon \) in terms of an equation

$$\begin{aligned} P - Y^0 : Z^0 : P = 2\,\mathbb {C}^0:{\bar{\varepsilon }} \end{aligned}$$
(3.28)

for the polarization field \(P = (\mathbb {C}+\mathbb {C}^0):\varepsilon \), involving the operators

$$\begin{aligned} Y^0 = {{\,\mathrm{\text {Id}}\,}}- 2 \,\mathbb {C}^0:\Gamma ^0 \quad \text {and} \quad Z^0 = \left( \mathbb {C}- \mathbb {C}^0\right) : \left( \mathbb {C}+ \mathbb {C}^0\right) ^{-1}. \end{aligned}$$
(3.29)

Actually, Eyre–Milton [111] discussed the case of conducting composites, and the extension to linear elasticity was reported by Michel et al. [110]. The Eyre–Milton equation (3.28) was motivated by fractional analytic convergence acceleration techniques for series representations of the effective conductivity [159], and has a number of remarkable properties. First, the Helmholtz reflection \(Y^0\) is non-expansive in the \(\mathbb {C}^0\)-weighted \(L^2\)-norm. Furthermore, if the local stiffness tensors satisfy the conditions (2.25), the Cayley transform \(Z^0\) of the local stiffness field is an \(L^2\)-contraction for every (!) choice of the reference material \(\mathbb {C}^0\). In particular, the Eyre–Milton scheme

$$\begin{aligned} P^{k+1} = 2\,\mathbb {C}^0:{\bar{\varepsilon }} + Y^0 : Z^0 : P^k \end{aligned}$$
(3.30)

is linearly convergent for any reference material \(\mathbb {C}^0\) and any initial value \(P^0\). Under the conditions (2.25), among reference materials of the type \(\mathbb {C}^0 = \alpha _0 \, {{\,\mathrm{\text {Id}}\,}}\) for \(\alpha _0>0\), the choice \(\alpha _0 = \sqrt{\alpha _- \alpha _+}\) leads to the fastest convergence rate

$$\begin{aligned} \Vert P^{k+1} - P^*\Vert _{L^2} \le \frac{ \sqrt{\alpha _+} - \sqrt{\alpha _-} }{ \sqrt{\alpha _+} + \sqrt{\alpha _-} } \, \Vert P^k - P^*\Vert _{L^2}, \end{aligned}$$

where \(P^* = (\mathbb {C}+ \mathbb {C}^0):\varepsilon ^*\) is the unique solution of the Eyre–Milton equation (3.28). In particular, for fixed accuracy, the number of iterations is proportional to the square root of the material contrast, \(\sqrt{\alpha _+ / \alpha _-}\). Thus, the Eyre–Milton method has the same worst case convergence rate as the fastest Krylov methods, see Sect. 3.2.
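The practical meaning of this rate is easily quantified: the number of iterations to reach a fixed tolerance grows like the square root of the contrast. A back-of-the-envelope check in Python (the function name is illustrative):

```python
import math

def iterations_to_tol(contrast, tol=1e-6):
    """Iterations of a linearly convergent scheme with the Eyre-Milton rate
    (sqrt(c) - 1)/(sqrt(c) + 1) needed to shrink the error by a factor tol."""
    rho = (math.sqrt(contrast) - 1.0) / (math.sqrt(contrast) + 1.0)
    return math.ceil(math.log(tol) / math.log(rho))
```

For a contrast of \(10^2\), this yields 69 iterations, and 691 for a contrast of \(10^4\) — roughly ten times as many, consistent with the \(\sqrt{\alpha _+/\alpha _-}\) scaling.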

In the very same paper that reported the extension of Eyre–Milton’s conductivity scheme to linear elastic composites, Michel et al. [110] introduced another solution method. The starting point [160] is the reformulation of the displacement-based, unconstrained variational principle (2.4)

$$\begin{aligned} \int _Yw(x, {\bar{\varepsilon }} + \nabla ^s u(x))\, dx \longrightarrow \min _{u\in H^1_\#(Y;\mathbb {R}^d)} \end{aligned}$$

as a constrained optimization problem

$$\begin{aligned} \int _Yw(x, e(x))\, dx \longrightarrow \min _{(u,e)} \quad \text{ among } \quad \left\{ (u,e) \in H^1_\#(Y;\mathbb {R}^d) \times L^2(Y;\text{ Sym }(d)) \,\Big |\, e = {\bar{\varepsilon }} + \nabla ^s u \right\} . \end{aligned}$$
(3.31)

The Lagrangian function, see chapter 5 in Boyd and Vandenberghe [143], associated with this constrained optimization problem reads

$$\begin{aligned} L(u,e,\lambda ) =\int _Yw(x, e) + \lambda :( {\bar{\varepsilon }} + \nabla ^s u - e)\, dx, \end{aligned}$$
(3.32)

where \(\lambda \in L^2(Y;\text {Sym}(d))\) is the Lagrange multiplier associated with the constraint (3.31). For a parameter \(\alpha _0>0\) with dimensions of stress, the augmented Lagrangian, see Bertsekas [161], becomes

$$\begin{aligned} L_{\alpha _0}(u,e,\lambda ) = \int _Yw(x, e) + \lambda :( {\bar{\varepsilon }} + \nabla ^s u - e) + \frac{\alpha _0}{2} \, \Vert {\bar{\varepsilon }} + \nabla ^s u - e \Vert ^2\, dx. \end{aligned}$$

Michel et al. [110] proposed to use the alternating direction method of multipliers (ADMM) [162, 163], also called the method of partial updates,

$$\begin{aligned} \begin{aligned} u^{k+1}&= \mathop {\mathrm{arg\,min}}\limits _{u} L_{\alpha _0}(u,e^k,\lambda ^k),\\ e^{k+1}&= \mathop {\mathrm{arg\,min}}\limits _{e} L_{\alpha _0}(u^{k+1},e,\lambda ^k),\\ \lambda ^{k+1}&= \lambda ^k + \alpha _0\,({\bar{\varepsilon }} + \nabla ^s u^{k+1} - e^{k+1}), \end{aligned} \end{aligned}$$
(3.33)

for computing saddle points of the Lagrangian L. Please note the difference from the classical augmented Lagrangian method [164, 165], which updates

$$\begin{aligned} \begin{aligned} (u^{k+1},e^{k+1})&= \mathop {\mathrm{arg\,min}}\limits _{(u,e)} L_{\alpha _0}(u,e,\lambda ^k),\\ \lambda ^{k+1}&= \lambda ^k + \alpha _0\,({\bar{\varepsilon }} + \nabla ^s u^{k+1} - e^{k+1}), \end{aligned} \end{aligned}$$

coupling u and e in the first line. The first line of the ADMM recursion (3.33) reads, explicitly,

$$\begin{aligned} \alpha _0 \, \text {div}\,\nabla ^s u^{k+1} = - \text {div}\,(\lambda ^k - \alpha _0\, e^k). \end{aligned}$$

In terms of the operator \(G^0\) (2.10) for \(\mathbb {C}^0 = \alpha _0 \, {{\,\mathrm{\text {Id}}\,}}\), we might express the latter equation in the form

$$\begin{aligned} u^{k+1} = -G^0\text {div}\,(\lambda ^k - \alpha _0\, e^k), \quad \text {or, rather,} \quad \varepsilon ^{k+1} = {\bar{\varepsilon }} - \Gamma ^0:(\lambda ^k - \mathbb {C}^0:e^k) \end{aligned}$$

for \(\varepsilon ^{k+1} = {\bar{\varepsilon }} + \nabla ^s u^{k+1}\). The second line in the update (3.33) reads, more explicitly,

$$\begin{aligned} \frac{\partial w}{\partial \varepsilon } (\cdot , e^{k+1}) + \mathbb {C}^0:e^{k+1} = \lambda ^{k} + \mathbb {C}^0:\varepsilon ^{k+1}, \end{aligned}$$

which is a nonlinear, but local equation for the strain-type field \(e^{k+1} \in L^2(Y;\text {Sym}(d))\). The resulting method, formulated for \(\mathbb {C}^0 = \alpha _0 \, {{\,\mathrm{\text {Id}}\,}}\), reads

$$\begin{aligned} \begin{aligned} \varepsilon ^{k+1}&= {\bar{\varepsilon }} - \Gamma ^0:(\lambda ^k - \mathbb {C}^0:e^k),\\ e^{k+1}&= \left( \frac{\partial w}{\partial \varepsilon } + \mathbb {C}^0\right) ^{-1} \left( \lambda ^{k} + \mathbb {C}^0:\varepsilon ^{k+1}\right) ,\\ \lambda ^{k+1}&= \lambda ^k + \mathbb {C}^0:(\varepsilon ^{k+1} - e^{k+1}). \end{aligned} \end{aligned}$$
(3.34)

Here, the second line involves the inverse of the stress operator,

$$\begin{aligned} e = \left( \frac{\partial w}{\partial \varepsilon } + \mathbb {C}^0\right) ^{-1}(\tau ) \quad \text {precisely if} \quad \tau = \frac{\partial w}{\partial \varepsilon }(\cdot , e) + \mathbb {C}^0:e \quad \text {for} \, e,\tau \in L^2(Y;\text {Sym}(d)). \end{aligned}$$

Notice that \(\lambda ^{k+1} = \partial w / \partial \varepsilon (e^{k+1})\) holds by comparing the second and third lines in the algorithm  (3.34).

For linear elastic composites, a similar method was introduced by Nguyen et al. [47]. Any fixed point \((\varepsilon ^*,e^*,\lambda ^*)\) of the iterative scheme (3.34) satisfies \(\varepsilon ^* = e^*\) and \(\lambda ^* = \partial w/\partial \varepsilon (\cdot ,\varepsilon ^*)\), i.e., \(\lambda ^*\) is the stress field associated to the strain \(\varepsilon ^*\). Furthermore, \(\varepsilon ^*\) solves the Lippmann–Schwinger equation (2.12), i.e.,

$$\begin{aligned} \varepsilon ^* + \Gamma ^0:\left( \frac{\partial w}{\partial \varepsilon }(\cdot ,\varepsilon ^*) - \mathbb {C}^0:\varepsilon ^* \right) = {\bar{\varepsilon }} \end{aligned}$$

holds. By computational experiments, Michel et al. [110, 160] showed that their proposed solution method (3.34) improves upon the basic scheme significantly. Please note that the performance of the ADMM scheme (3.34) strongly depends on the ability to solve the second equation rapidly. Michel et al. [110] remarked that, for nonlinear elastic power-law von-Mises-type materials, the latter nonlinear tensorial equation reduces to a single scalar nonlinear equation, which may be solved quickly, see also Moulinec and Suquet [166].
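To expose the structure of the iteration (3.34), consider a small finite-dimensional analogue of the constrained problem (3.31), with a matrix \(B\) standing in for the symmetrized gradient and a diagonal SPD matrix \(C\) for the stiffness (Python/NumPy; all names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
md, nd = 12, 8                       # 'strain' and 'displacement' dimensions
B = rng.standard_normal((md, nd))    # stands in for the symmetrized gradient
C = np.diag(rng.uniform(1.0, 10.0, md))   # SPD 'stiffness', contrast 10
eps_bar = rng.standard_normal(md)
alpha0 = np.sqrt(1.0 * 10.0)         # reference alpha0 = sqrt(alpha_- alpha_+)

e = np.zeros(md)
lam = np.zeros(md)
BtB = B.T @ B
for _ in range(1000):
    # u-step: minimize the augmented Lagrangian over u
    u = np.linalg.solve(BtB, B.T @ (e - lam / alpha0 - eps_bar))
    eps = eps_bar + B @ u
    # e-step: (C + alpha0 I) e = lam + alpha0 eps (local, and linear here)
    e = np.linalg.solve(C + alpha0 * np.eye(md), lam + alpha0 * eps)
    # multiplier update
    lam = lam + alpha0 * (eps - e)
```

At a fixed point, \(\varepsilon = e\) holds, \(\lambda = C\,e\) is the associated "stress", and \(B^{\textsf{T}} \lambda = 0\) expresses equilibrium, mirroring the fixed-point properties stated above.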

For linear elastic composites, Monchiet and Bonnet [167] introduced a more general class of iterative schemes operating on the polarization field \(P = (\mathbb {C}+ \mathbb {C}^0):\varepsilon \). These methods depend on a scalar parameter \(\gamma \in [0,1)\), and, for \(\gamma = \tfrac{1}{2}\) and linear elastic composites, are equivalent to the ADMM scheme (3.34). Subsequently, these polarization schemes were also considered for conductivity [168] and for nonlinear problems [169]. Through the effort of Moulinec and Silva [170], it was uncovered that the polarization schemes of Monchiet and Bonnet may be written in the form

$$\begin{aligned} P^{k+1} = \gamma \, P^k + (1-\gamma )\left( 2\,\mathbb {C}^0:{\bar{\varepsilon }} + Y^0 : Z^0 : P^k\right) , \end{aligned}$$
(3.35)

which closely resembles the Eyre–Milton scheme (3.30). The expression (3.35) shows that the polarization schemes of Monchiet–Bonnet are damped versions of the Eyre–Milton method. For no damping, \(\gamma = 0\), the Eyre–Milton method (3.30) is recovered, whereas \(\gamma = \tfrac{1}{2}\) leads to the ADMM (3.34) (for linear elastic constituents). Also, by carefully investigating the spectral radius of the linear operator appearing in the fixed point scheme (3.35), Moulinec and Silva showed that the iterative scheme (3.35) converges for any (isotropic) reference material \(\mathbb {C}^0\) and any damping factor \(\gamma \in [0,1)\), provided the conditions (2.25) hold. The (theoretically) fastest convergence rate is reached for \(\gamma = 0\), i.e., the Eyre–Milton scheme, and the previously mentioned optimal choice for the reference material of the Eyre–Milton method.

Thus, for linear elasticity, the ADMM (3.34) turned out to be a damped (and actually slower) version of the Eyre–Milton scheme (3.30). On the other hand, the ADMM was formulated for nonlinear constitutive behavior, whereas the Eyre–Milton method was derived from a fractional-linear transformation in complex coordinates, and appeared intrinsically restricted to linear problems.

It was subsequently realized [171] that the nonlinear extension of the Eyre–Milton method

$$\begin{aligned} P^{k+1} = 2\,\mathbb {C}^0:{\bar{\varepsilon }} + Y^0 : Z^0(P^k) \end{aligned}$$
(3.36)

using the Cayley transform of the nonlinear stress operator \(\partial w/\partial \varepsilon \) (3.2)

$$\begin{aligned} Z^0(P) = P - 2\,\mathbb {C}^0:\left( \frac{\partial w}{\partial \varepsilon } + \mathbb {C}^0\right) ^{-1}(P) \equiv \left( \frac{\partial w}{\partial \varepsilon } - \mathbb {C}^0\right) \left( \frac{\partial w}{\partial \varepsilon } + \mathbb {C}^0\right) ^{-1}(P), \end{aligned}$$
(3.37)

can be identified with the Peaceman–Rachford iterative scheme [172], and the method introduced by Michel et al. (3.34) arises as the Douglas–Rachford iterative method [173, 174]. In convex optimization, the equivalence of ADMM with the Douglas–Rachford method (applied to the dual) is well known [175]. This identification provided guidance for properly selecting the reference material \(\mathbb {C}^0 = \alpha _0 \, {{\,\mathrm{\text {Id}}\,}}\) entering the nonlinear Eyre–Milton method (3.36). Indeed, Giselsson and Boyd [176] showed that, for a (condensed) free energy w which satisfies the conditions (3.3) and (3.4), the choice \(\alpha _0 = \sqrt{\alpha _- \alpha _+}\) and \(\gamma = 0\) leads to the fastest convergence rate

$$\begin{aligned} \Vert P^{k+1} - P^*\Vert _{L^2} \le \frac{ \sqrt{\alpha _+} - \sqrt{\alpha _-} }{ \sqrt{\alpha _+} + \sqrt{\alpha _-} } \, \Vert P^k - P^*\Vert _{L^2}, \end{aligned}$$

where \(P^* = \partial w/\partial \varepsilon (\cdot ,\varepsilon ^*) + \mathbb {C}^0:\varepsilon ^*\) is the unique fixed point of the Eyre–Milton scheme (3.36). In particular, the results obtained by Eyre and Milton [111] for the case of linear composites are recovered. Also, the result confirms the insights of Moulinec and Silva [170].

We already mentioned that the practical performance of the ADMM (3.34) and the nonlinear Eyre–Milton method (3.36) crucially depends on solving the equation

$$\begin{aligned} \frac{\partial w}{\partial \varepsilon }(\cdot , \varepsilon ) + \mathbb {C}^0:\varepsilon = P \end{aligned}$$
(3.38)

quickly for \(\varepsilon \) and prescribed polarization P. Extending previous ideas of Lebensohn et al. [177], Schneider et al. [171] showed that the latter equation may be solved quickly for generalized standard materials with a free energy, for which the dependence on the elastic strain is Hookean. Then, equation (3.38) may be solved by evaluating the original material law \({\partial w}/{\partial \varepsilon }\) with a modified stiffness tensor. Also, for a class of compliance-based damage models, an inexpensive resolution is possible [178].

The point of view taken by Schneider et al. [171] considers the nonlinear Eyre–Milton scheme (3.36) as the basic solution method of interest, and the ADMM (3.34) arises as a special case. However, the opposite point of view may be taken, as well. Indeed, let us consider, for \(\gamma \in [0,1)\), the following variant [176] of the algorithm (3.34)

$$\begin{aligned} \begin{aligned} \varepsilon ^{k+\tfrac{1}{2}}&= {\bar{\varepsilon }} - \Gamma ^0:(\lambda ^k - \mathbb {C}^0:e^k),\\ \varepsilon ^{k+1}&= 2(1-\gamma )\varepsilon ^{k+\tfrac{1}{2}} - (1-2\gamma ) e^k,\\ e^{k+1}&= \left( \frac{\partial w}{\partial \varepsilon } + \mathbb {C}^0\right) ^{-1} \left( \lambda ^k + \mathbb {C}^0:\varepsilon ^{k+1}\right) ,\\ \lambda ^{k+1}&= \lambda ^k + \mathbb {C}^0:(\varepsilon ^{k+1} - e^{k+1}), \end{aligned} \end{aligned}$$
(3.39)

which reduces to the original method (3.34) for \(\gamma = \tfrac{1}{2}\). By algebraic manipulations, see section 2 in Schneider [178], it can be shown that the iterative scheme (3.39) is equivalent to the Monchiet–Bonnet scheme (3.35) (for nonlinear \(Z^0\) and general \(\gamma \in [0,1)\)) via the identification \(P^k = \lambda ^k + \mathbb {C}^0:{e^k}\). In particular, for \(\gamma = 0\), the nonlinear Eyre–Milton method (3.36) and, for general \(\gamma \), the Monchiet–Bonnet methods may also be written in the Michel–Moulinec–Suquet framework (3.34). Thus, it turns out to be a matter of taste whether the Eyre–Milton formulation (3.36) or the Michel–Moulinec–Suquet framework (3.39) is used. The former can be implemented with lower memory requirements. The latter, however, has direct access to the physically meaningful strain and stress fields, and it is more convenient for adaptively choosing the reference material [178].

To sum up, the polarization methods in FFT-based computational homogenization can be connected to the operator splitting methods (ADMM, Douglas–Rachford, PDHG, split Bregman, ...) that became popular in convex optimization of non-smooth problems, see Chambolle and Pock [179] for an introduction. In particular, polarization methods have a different nature than the solution methods that are built on top of the basic scheme and which were discussed in the previous sections. Indeed, whereas solvers built on top of the basic scheme evaluate the stress explicitly, Eq. (3.38) needs to be solved for polarization schemes and ADMM.

This characteristic, however, should not distract from the efficiency of polarization methods. Under the conditions (3.3) and (3.4) and whenever Eq. (3.38) can be solved with little overhead, the polarization methods can be extremely effective methods with minimal memory footprint. Indeed, for a proper choice of reference material, they are on the same level as the most advanced gradient-type and Newton methods, and may be implemented on two polarization fields.

Adaptive choices for general nonlinearity were studied recently [178]. Also, polarization methods were combined with Anderson acceleration [180], giving rise to extremely powerful and robust solution methods for computational micromechanics.

Last but not least, let us remark that To et al. [181] proposed an online estimation of the optimal algorithmic parameters of polarization methods. Also, polarization methods were recently used for non-differentiable problems associated with strongly nonlinear materials [182] and for computing the effective crack resistance of microstructured brittle materials [183].

3.6 Convergence tests and overview

The “natural” way of measuring the extent to which the equilibrium condition (2.6)

$$\begin{aligned} \text {div}\,\sigma = 0 \end{aligned}$$
(3.40)

is satisfied for any stress field \(\sigma \in L^2(Y;\text {Sym}(d))\) is in the dual norm to \(H^1_\#(Y;\mathbb {R}^d)\), i.e.,

$$\begin{aligned} \Vert f \Vert _{H^{-1}_\#} := \sup \left\{ \int _Yf(x)\cdot u(x)\, dx \,\Big |\, u \in H^1_\#(Y;\mathbb {R}^d) \quad \text{ with } \quad \int _Y\Vert \nabla ^s u(x)\Vert ^2\, dx \le 1\right\} , \end{aligned}$$

i.e., to consider

$$\begin{aligned} \Vert \text{ div }\,\sigma \Vert _{H^{-1}_\#} {\mathop {\le }\limits ^{!}} \text{ tol } \, \left\| \frac{1}{\text {vol}(Y)}\int _Y\sigma \, dx\right\| \end{aligned}$$
(3.41)

for a prescribed tolerance \(\text {tol}>0\). Indeed, under the conditions (3.3) and (3.4), the two-sided bounds

$$\begin{aligned} \alpha _- \, \Vert \varepsilon - \varepsilon ^* \Vert _{L^2} \le \left\| \text {div}\,\frac{\partial w}{\partial \varepsilon }(\cdot ,\varepsilon ) \right\| _{H^{-1}_\#} \le \alpha _+ \, \Vert \varepsilon - \varepsilon ^* \Vert _{L^2} \end{aligned}$$
(3.42)

hold for any \(\varepsilon = {\bar{\varepsilon }} + \nabla ^s u\) with \(u \in H^1_\#(Y;\mathbb {R}^d)\), and where \(\varepsilon ^*\) is the unique solution of the Lippmann–Schwinger equation (2.12), see Appendix A.1 in Schneider [43]. Thus, the condition (3.41) measures the distance to the solution \(\varepsilon ^*\), and is both necessary and sufficient for convergence of those Lippmann–Schwinger solvers that are built on top of the basic scheme. Also, the natural convergence criterion for CG is, up to re-scaling, identical to the condition (3.41).

Table 4 Summary of memory consumption and respective merits of FFT-based solvers for nonlinear and inelastic computational homogenization

A practical way of evaluating the convergence criterion is via the identity

$$\begin{aligned} \Vert \text {div}\,\sigma \Vert _{H^{-1}_\#} = \Vert \Gamma :\sigma \Vert _{L^2} \end{aligned}$$
(3.43)

in terms of the non-dimensional \(\Gamma \)-operator \(\Gamma = \nabla ^s G\text {div}\,\) with \(G= (\text {div}\,\nabla ^s)^{-1}\), see Bellis and Suquet [114] or Monchiet and Bonnet [167]. With the formula (3.43) at hand, the convergence criterion (3.41) is readily evaluated in Fourier space, and even appears naturally in the basic scheme (2.15) with \(\mathbb {C}^0 = \alpha _0 \, {{\,\mathrm{\text {Id}}\,}}\) via

$$\begin{aligned} \left\| \text {div}\,\frac{\partial w}{\partial \varepsilon }(\cdot ,\varepsilon ^k) \right\| _{H^{-1}_\#} \equiv \left\| \Gamma :\frac{\partial w}{\partial \varepsilon }(\cdot ,\varepsilon ^k) \right\| _{L^2} = \alpha _0 \, \Vert \varepsilon ^{k+1} - \varepsilon ^k \Vert _{L^2}. \end{aligned}$$

There is even a version using only a single strain field in memory [131], based on the identity

$$\begin{aligned} \left\| \varepsilon ^{k+1} - \varepsilon ^k\right\| ^2_{L^2} = \left\| \varepsilon ^{k+1}\right\| ^2_{L^2} - \left\| \varepsilon ^k\right\| ^2_{L^2} +\frac{2}{\alpha _0\,\text {vol}(Y)}\left( \int _Y\frac{\partial w}{\partial \varepsilon }(x,\varepsilon ^k(x)):\varepsilon ^k(x)\, dx - \int _Y\frac{\partial w}{\partial \varepsilon }(x,\varepsilon ^k(x))\, dx:E \right) , \end{aligned}$$

valid for the basic scheme (2.15). Notice that the estimate (3.42) also holds for suitable discretization schemes, as long as the continuous Green operator \(\Gamma \) is replaced by the corresponding discrete version.
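On a regular voxel grid, the identity (3.43) permits evaluating the dual-norm residual at the cost of a single FFT pass. As an illustration, the following NumPy sketch computes \(\Vert \Gamma :q\Vert _{L^2}\) for the simpler scalar (conductivity) analogue, where the symbol of the \(\Gamma \)-operator reduces to the projector \(\xi \xi ^T/\Vert \xi \Vert ^2\); the function name and normalization conventions are illustrative and not taken from the cited works.

```python
import numpy as np

def h_minus_one_residual(q, lengths=(1.0, 1.0, 1.0)):
    """Evaluate ||Gamma q||_{L^2} = ||div q||_{H^{-1}} for a periodic
    flux field q of shape (3, N1, N2, N3) via FFT -- the scalar analogue
    of the criterion (3.43). Norms use the cell-averaged L^2 inner
    product, matching Parseval for the normalized DFT."""
    dims = q.shape[1:]
    q_hat = np.fft.fftn(q, axes=(1, 2, 3)) / np.prod(dims)
    # frequency vectors xi_j = 2*pi*k_j/L_j on the discrete lattice
    xi = np.stack(np.meshgrid(
        *[2 * np.pi * np.fft.fftfreq(n, d=l / n) for n, l in zip(dims, lengths)],
        indexing="ij"))
    xi_norm2 = np.sum(xi**2, axis=0)
    xi_norm2[0, 0, 0] = 1.0          # avoid division by zero at xi = 0
    # Gamma q in Fourier space: xi * (xi . q_hat) / |xi|^2, zero mean mode
    xi_dot_q = np.sum(xi * q_hat, axis=0)
    xi_dot_q[0, 0, 0] = 0.0
    gamma_q = xi * (xi_dot_q / xi_norm2)
    return np.sqrt(np.sum(np.abs(gamma_q)**2))
```

For a zero-mean gradient field, \(\Gamma \) acts as the identity, so the residual coincides with the \(L^2\)-norm of the field itself; for a constant field it vanishes.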

Alternative convergence criteria are also used in the literature. For instance, the elementary estimate

$$\begin{aligned} \Vert \text {div}\,\sigma \Vert _{H^{-1}_\#} \le \frac{L}{\sqrt{2}\pi } \Vert \text {div}\,\sigma \Vert _{L^2} \quad \text {for} \quad L = \max _{j=1,\ldots ,d} L_j \end{aligned}$$

on the cell \(Y\) (2.1) shows that a convergence criterion of the form

$$\begin{aligned} \Vert \text{ div }\,\sigma \Vert _{L^2} \le \frac{\sqrt{2}\pi }{L} \, \text{ tol } \left\| \frac{1}{\text {vol}(Y)}\int _Y\sigma \, dx\right\| \end{aligned}$$
(3.44)

implies the condition (3.41), see Eisenlohr et al. [184]. However, the root-mean-square criterion (3.44) is not consistent from a functional-analytic point of view, see Bellis and Suquet [114], i.e., the number of iterations for a specific solver to satisfy (3.44) is typically not bounded as the number of voxels goes to infinity.

For the polarization methods, discussed in Sect. 3.5, more care has to be taken, as the iterates of the polarization schemes are neither in equilibrium nor compatible, in general. Indeed, for the iterates of the polarization scheme (3.36) with damping \(\gamma \in [0,1)\)

$$\begin{aligned} P^{k+1} = \gamma \, P^k + (1-\gamma )\left( 2\,\mathbb {C}^0:{\bar{\varepsilon }} + Y^0 : Z^0(P^k)\right) , \end{aligned}$$

the identity

$$\begin{aligned} \frac{\Vert P^{k+1} - P^k\Vert _{L^2}^2}{4(1-\gamma )^2} = \Vert \text {div}\,\lambda ^k \Vert ^2_{H^{-1}_\#} + \alpha _0^2 \, \Vert {e^k} - {\bar{\varepsilon }} - \Gamma :{e^k}\Vert ^2_{L^2} \end{aligned}$$

holds for \(P^k = \lambda ^k + \mathbb {C}^0:{e^k}\) split according to the ADMM (3.39).

For the ADMM implementation (3.39), an equivalent criterion may be set up based on the identity [178]

$$\begin{aligned} {\Vert \varepsilon ^{k + \frac{1}{2}} - e^k\Vert _{L^2}^2 = \Vert \text {div}\,\lambda ^k \Vert ^2_{H^{-1}_\#} + \alpha _0^2 \, \Vert \varepsilon ^{k+1} - {\bar{\varepsilon }} - \Gamma :\varepsilon ^{k+1}\Vert ^2_{L^2}.} \end{aligned}$$

For the convenience of the reader, a (personal) summary of the solution methods reviewed in this article is provided in Table 4. A comparison is made for small strain inelastic materials which derive from a potential, and the primal formulation (in the strain) is considered. The solvers are sorted by their memory footprint, and it is indicated which solvers admit low-memory implementations (provided suitable finite difference discretizations are used). The comparison does not account for the last converged field necessary for extrapolating between load steps, as proposed by Moulinec and Suquet [34]. Also, the (symmetric) tangent for Newton’s method is stored in double precision (which is not necessary). The Eyre–Milton iterative method (3.36) may be implemented on a single strain field. However, it is not clear how to check convergence of the iteration (3.36) with only a single field in memory.

Fig. 5

Performance [108, 137] of different solvers for the bound-sand microstructure, see Fig. 3a, and the staggered-grid discretization [87]

As a conclusion to this section on solution methods, we report on the performance of the different classes of solution methods for the microstructure of bound sand, see Fig. 3a, and the staggered-grid discretization [87]. The results are taken from different publications [108, 137, 178]. For uniaxial extension to \(1\%\) strain, the residual (3.41)

$$\begin{aligned} \frac{\Vert \text{ div }\,\sigma \Vert _{H^{-1}_\#}}{\left\| \frac{1}{\text {vol}(Y)}\int _Y\sigma \, dx\right\| } \end{aligned}$$

is plotted vs. iteration for the gradient-based solution methods. We refer to Schneider et al. [171] for details on the residual for the polarization schemes.

The basic scheme, see Fig. 5b, converges linearly, but fails to reach the desired tolerance \(10^{-5}\) within 1000 iterations. Nesterov’s method with speed restarting [130] or Fercoq–Qu restarting [135] also converges linearly, but does not match the performance of the fastest methods.

The (linear) conjugate gradient method is the fastest method considered, see Fig. 5a. L-BFGS of depth four, the Fletcher–Reeves nonlinear CG and the Barzilai–Borwein method come in second, all with a comparable iteration count. For exact line search, BFGS (without L) and the Fletcher–Reeves nonlinear CG produce iterates identical to those of the linear CG method. However, they are applicable to nonlinear problems as well. Thus, the reduction of performance is a result of this increase in generality. The inherent non-monotonicity of the Barzilai–Borwein method is certainly noticeable. Up to a residual of \(10^{-3}\), the Anderson-mixed basic scheme of depth four is the second-best method. For higher accuracy, the method slows down to some degree. Still, the scheme represents a strong competitor for the other methods just discussed.

Furthermore, the Monchiet–Bonnet scheme is included with Lorenz–Tran–Dinh scaling [178]. Up to a tolerance of \(10^{-3}\), it is the fastest method. The latter combination turned out to be the fastest among various combinations of damping and strategies for adaptively selecting the reference material in ADMM [178]. We observe that, for properly selected solver parameters, polarization schemes are competitive with or even faster than the fastest strain-based solvers, also for porous microstructures.

4 Practical issues and efficient implementation

4.1 The dual scheme

Bhattacharya and Suquet [185] noticed that Lippmann–Schwinger-type solvers may also be developed in a dual framework. Rewriting the unconstrained variational principle (2.4) in the form

$$\begin{aligned} \int _Yw(x, \varepsilon )\, dx \longrightarrow \min _{\varepsilon } \quad \text{ among } \quad \left\{ \varepsilon \in L^2(Y;\text{ Sym }(d)) \, \Big | \, \varepsilon = {\bar{\varepsilon }} + \nabla ^s u \quad \text{ for } \text{ some } \quad u\in H^1_\#(Y;\mathbb {R}^d) \right\} , \end{aligned}$$

the associated Lagrangian reads

$$\begin{aligned} L(\varepsilon ,\sigma ) = \int _Yw(x, \varepsilon ) - \sigma :(\varepsilon - {\bar{\varepsilon }}) dx \end{aligned}$$

for the Lagrangian multiplier \(\sigma \) living in the space of equilibrated stress fields

$$\begin{aligned} \sigma \in {\mathcal {S}} = \left\{ \sigma \in L^2(Y;\text {Sym}(d))\, \Big | \, \text {div}\,\sigma = 0\right\} . \end{aligned}$$

The associated Lagrangian dual function

$$\begin{aligned} W^*(\sigma ) = \inf _{\varepsilon \in L^2(Y;\text {Sym}(d))} L(\varepsilon ,\sigma ) \end{aligned}$$

can be expressed in the form

$$\begin{aligned} W^*(\sigma ) = -\int _Yw^*(x, \sigma ) - \sigma :{\bar{\varepsilon }} \, dx, \end{aligned}$$

where

$$\begin{aligned} w^*(x,\sigma ) = \sup _{\varepsilon \in \text {Sym}(d)} \sigma :\varepsilon - w(x,\varepsilon ) \end{aligned}$$

is the Legendre–Fenchel dual of the condensed (Helmholtz) free energy w. For the variational principle

$$\begin{aligned} \int _Yw^*(x, \sigma ) - \sigma :{\bar{\varepsilon }} \, dx \longrightarrow \min _{\sigma \in {\mathcal {S}}}, \end{aligned}$$

the associated projected gradient method [114] is identical to the dual (basic) scheme of Bhattacharya and Suquet [185],

$$\begin{aligned} \sigma ^{k+1} = \mathbb {C}^0:{\bar{\varepsilon }} + \mathbb {C}^0:\left( {{\,\mathrm {\text{ Id }}\,}}- \Gamma ^0:\mathbb {C}^0 - \frac{1}{\text {vol}(Y)}\int _Y\cdot \, dx \right) : \left( \frac{\partial w^*}{\partial \sigma } (\cdot , \sigma ^k) - \mathbb {D}^0:\sigma ^k \right) , \end{aligned}$$
(4.1)

where \(\mathbb {D}^0 = (\mathbb {C}^0)^{-1}\) is the reference compliance. An advantage of the dual scheme (4.1) is its ability to account for rigid phases, which are characterized by vanishing compliance. Another advantage, also discussed in Sect. 2.3, is that the dual scheme naturally operates on (discretely) divergence-free fields. For finite element methods, it is much more difficult to incorporate the divergence constraint into the formulation. A possible remedy is to introduce a “vector potential”, but the issue of gauge fixing needs to be dealt with.
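As a small numerical sanity check of the duality employed above: for a quadratic energy \(w(\varepsilon ) = \tfrac{a}{2}\varepsilon ^2\) (the scalar analogue of \(w = \tfrac{1}{2}\varepsilon :\mathbb {C}:\varepsilon \)), the Legendre–Fenchel dual is \(w^*(\sigma ) = \sigma ^2/(2a)\), mirroring \(w^* = \tfrac{1}{2}\sigma :\mathbb {D}^0:\sigma \) with \(\mathbb {D}^0 = (\mathbb {C}^0)^{-1}\). The grid-based supremum below is a rough illustrative sketch, not a production transform.

```python
import numpy as np

# Legendre-Fenchel transform of the quadratic energy w(eps) = a*eps^2/2,
# whose dual is w*(sig) = sig^2/(2a); the supremum is taken over a grid.
a = 3.0
eps_grid = np.linspace(-10.0, 10.0, 200001)

def w_star(sig):
    # sup over eps of sig*eps - w(eps), evaluated on the grid
    return np.max(sig * eps_grid - 0.5 * a * eps_grid**2)

for sig in (0.0, 1.0, -2.5):
    assert abs(w_star(sig) - sig**2 / (2 * a)) < 1e-6
```

The grid step of \(10^{-4}\) keeps the error of the discrete supremum far below the asserted tolerance.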

All solvers built on top of the primal basic scheme (2.15) may be developed for the dual basic scheme (4.1), and the convergence assertions carry over to the dual setting. However, reduced memory implementations exploiting the compatibility of the primal iterates are not available for the dual scheme.

For some materials of practical interest, computing the strain as a function of applied stress is computationally less expensive than computing the stress for applied strain. Then, it is possible to utilize solvers based on the dual scheme (4.1) to accelerate FFT-based solvers. For crystal plasticity models at small strains, such a strategy was applied by Wicht et al. [186], where an acceleration of about an order of magnitude could be achieved.

Polarization methods, as discussed in Sect. 3.5, are primal-dual methods by construction. Hence, there is no benefit from a dual formulation. Rather, for the nonlinear Eyre–Milton method (3.36)

$$\begin{aligned} P^{k+1} = 2\,\mathbb {C}^0:{\bar{\varepsilon }} + Y^0 : Z^0(P^k), \end{aligned}$$

the Cayley transform of the nonlinear stress operator \(\partial w/\partial \varepsilon \) (3.37)

$$\begin{aligned} Z^0(P) = P - 2\mathbb {C}^0:\left( \frac{\partial w}{\partial \varepsilon } + \mathbb {C}^0\right) ^{-1}(P) \end{aligned}$$

can be alternatively expressed in the form

$$\begin{aligned} Z^0(P) = 2 \left( \frac{\partial w^*}{\partial \sigma } + \mathbb {D}^0\right) ^{-1} (\mathbb {D}^0:P) - P, \end{aligned}$$

whenever w is convex in \(\varepsilon \). In this way, the dual energy \(w^*\) enters, see Michel et al. [110] and Schneider et al. [171].
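The equivalence of the two expressions for \(Z^0\) is readily verified in the scalar case: for \(w(\varepsilon ) = \tfrac{a}{2}\varepsilon ^2\) and reference stiffness \(\alpha _0\), both reduce to \(Z^0(P) = P\,(a-\alpha _0)/(a+\alpha _0)\). The numbers below are illustrative, chosen only for this check.

```python
# Scalar check of the two expressions for the Cayley transform Z^0:
# w(eps) = a*eps^2/2, so dw/deps = a*eps and dw*/dsig = sig/a.
a, alpha0 = 5.0, 2.0

def z_primal(p):
    eps = p / (a + alpha0)                # (dw/deps + alpha0)^{-1}(p)
    return p - 2.0 * alpha0 * eps

def z_dual(p):
    # (dw*/dsig + 1/alpha0)^{-1}(p/alpha0): solve sig/a + sig/alpha0 = p/alpha0
    sig = (p / alpha0) * a * alpha0 / (a + alpha0)
    return 2.0 * sig - p

for p in (-1.0, 0.3, 7.0):
    assert abs(z_primal(p) - z_dual(p)) < 1e-12
    assert abs(z_primal(p) - p * (a - alpha0) / (a + alpha0)) < 1e-12
```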

4.2 Mixed boundary conditions

The Lippmann–Schwinger equation (2.12)

$$\begin{aligned} \varepsilon = {\bar{\varepsilon }} - \Gamma ^0 : \left( \frac{\partial w}{\partial \varepsilon } (\cdot , \varepsilon ) - \mathbb {C}^0:\varepsilon \right) \end{aligned}$$

was originally formulated for prescribed macroscopic strain \({\bar{\varepsilon }} \in \text {Sym}(d)\). However, for inelastic and nonlinear constitutive behavior, it is often desirable to prescribe the macroscopic stress, instead, or to prescribe some components of the strain and the complementary components of the stress. For uniaxial strain, an extension of the basic scheme was discussed in Appendix B of Moulinec and Suquet [33]. An approach for handling mixed boundary conditions in coordinate directions was proposed by Eisenlohr et al. [184].

In order to cover arbitrary mixed boundary conditions, it is convenient to introduce a pair of orthogonal complementary projectors \(\mathbb {P}\) and \(\mathbb {Q}\) on the space \(\text {Sym}(d)\). Then, for \({\bar{\varepsilon }} \in \text {Sym}(d)\) and \({\bar{\sigma }} \in \text {Sym}(d)\) satisfying the admissibility conditions

$$\begin{aligned} \mathbb {P}: {\bar{\varepsilon }} = {\bar{\varepsilon }} \quad \text {and} \quad \mathbb {Q} : {\bar{\sigma }} = {\bar{\sigma }}, \end{aligned}$$

it is natural to consider the equilibrium problem

$$\begin{aligned} \text{ div }\,\frac{\partial w}{\partial \varepsilon }(\cdot , E + \nabla ^s u) = 0, \quad \mathbb {P}:E = {\bar{\varepsilon }} \quad \text{ and } \quad \mathbb {Q} : \frac{1}{\text {vol}(Y)}\int _Y\frac{\partial w}{\partial \varepsilon }(x, E + \nabla ^s u) \, dx = {\bar{\sigma }} \end{aligned}$$
(4.2)

for \(E \in \text {Sym}(d)\) and \(u \in H^1_\#(Y;\mathbb {R}^d)\). Kabel et al. [187] and Lucarini and Segurado [188] showed that the problem (4.2) is equivalent to a Lippmann–Schwinger equation, which takes the form

$$\begin{aligned} \varepsilon + \left( \Gamma ^0 + \mathbb {D}^0:\mathbb {Q}:\frac{1}{\text {vol}(Y)}\int _Y\cdot \, dx\right) \left( \frac{\partial w}{\partial \varepsilon } (\cdot , \varepsilon ) - \mathbb {C}^0:\varepsilon \right) = {\bar{\varepsilon }} + \mathbb {D}^0:{\bar{\sigma }} \end{aligned}$$

for \(\mathbb {C}^0 = \alpha _0 \, {{\,\mathrm{\text {Id}}\,}}\) and \(\mathbb {D}^0 = \tfrac{1}{\alpha _0} \, {{\,\mathrm{\text {Id}}\,}}\). Furthermore, the equilibrium equation (4.2) arises as the first-order necessary condition of a variational principle, and Lippmann–Schwinger-based solvers may be developed in the framework of gradient-type methods, as well.
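In an orthonormal (Mandel) matrix representation, the projectors \(\mathbb {P}\) and \(\mathbb {Q}\) become plain \(6\times 6\) matrices, and the admissibility conditions are simple algebraic checks. The sketch below sets up a hypothetical mixed loading, prescribing the 11 strain component and leaving the complementary stress components traction-free; the component ordering is a convention assumed here, not fixed by the references.

```python
import numpy as np

# Mandel notation, component order 11, 22, 33, 23, 13, 12 (assumed here)
P = np.diag([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # strain-controlled components
Q = np.eye(6) - P                            # stress-controlled components

# a mixed loading: 1% strain in the 11 direction, zero complementary stress
eps_bar = np.array([0.01, 0.0, 0.0, 0.0, 0.0, 0.0])
sig_bar = np.zeros(6)

# admissibility conditions P:eps_bar = eps_bar and Q:sig_bar = sig_bar
assert np.allclose(P @ eps_bar, eps_bar)
assert np.allclose(Q @ sig_bar, sig_bar)
# P and Q are orthogonal complementary projectors
assert np.allclose(P @ P, P) and np.allclose(Q @ Q, Q)
assert np.allclose(P @ Q, np.zeros((6, 6)))
```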

4.3 Composite voxels

Brisard and Dormieux [49], based on earlier FEM work [189], first proposed the idea of considering sub-voxel scale information on the microstructure. This idea was further refined by Gélébart and Ouaki [190] and by Kabel et al. [191], who investigated, for the linear elastic setting, the effect of using simple micromechanical models for deriving “effective” properties of composite voxels. It was found that furnishing the composite voxels by the effective stiffness of a two-phase laminate material, with volume fractions respecting the sub-voxel volume fractions of the materials and a normal that approximates the average interface normal between the constituents in the voxel, led to the most accurate results for finitely contrasted media. For voxels intersecting pores, Voigt averaging turned out to be advantageous, whereas Reuss averaging is best for voxels intersecting rigid inclusions.
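The Voigt and Reuss choices mentioned above correspond to arithmetic and harmonic averaging of the sub-voxel moduli. A scalar-modulus caricature with illustrative values reads as follows; the actual laminate rule of Kabel et al. [191] additionally uses the interface normal.

```python
# Voigt (arithmetic) and Reuss (harmonic) mixing rules for a two-phase
# composite voxel; scalar-modulus sketch with illustrative values.
def voigt(k1, k2, c1):
    return c1 * k1 + (1.0 - c1) * k2

def reuss(k1, k2, c1):
    return 1.0 / (c1 / k1 + (1.0 - c1) / k2)

k_soft, k_stiff, c = 1.0, 100.0, 0.5
kv = voigt(k_soft, k_stiff, c)   # 50.5: dominated by the stiff phase
kr = reuss(k_soft, k_stiff, c)   # ~1.98: dominated by the soft phase
assert kr <= kv                  # Reuss never exceeds Voigt
```

The strong gap between the two averages at high contrast is precisely why the interface normal matters for accurate composite-voxel models.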

The composite voxel method was subsequently extended to finite strains [192] and to inelastic problems [193]. For the latter, the idea is to introduce internal variables for all the phases in a composite voxel, and to solve the evolution equations on the laminate, considered as a sub-voxel scale microstructure. Using laminates on such a scale appears to be related to homogenization at interfaces, see Josien and Raithel [194].

Often, dedicated composite voxel methods permit reducing the number of degrees of freedom while retaining the accuracy of the computed results, even for inelastic and nonlinear constitutive behavior. Thus, they are an indispensable tool for FFT-based computational methods on an industrial scale.

Composite voxel methods were applied to crystal plasticity [195], hollow glass microspheres composites [196], for predicting slip bands [197] and to assess the failure behaviors of composites [198].

5 Applications

In this section, we indicate which applications may be solved by FFT-based computational homogenization methods. We do not provide many details. Rather, this section serves as a look-up table from which the reader may extract the references concerning his or her area of interest.

5.1 Conductivity and diffusivity

FFT-based methods were applied to predict the electrical response of composites [82], the thermal diffusivity of periodic porous materials [199], also accounting for thermal barrier coatings [200] and imperfect interfaces [201]. A coupling to the Nemat–Nasser–Iwakuma–Hejazi estimates for conductivity was proposed [202], and the conductivity of polygonal aggregates [203] was addressed. Also, FFT-based methods were applied to compute the through-thickness conductivity of heterogeneous plates [204] and for an inverse reconstruction of the local conductivity [205]. Furthermore, extensions to compute Fickian diffusion [206] and ionic conductivity [207] were reported, as were applications to the effective conductivity and diffusivity of porous carbon electrodes [208], the electronic conductivity of lithium ion positive electrodes [209] and the conductivity of solid oxide fuel cell anodes [210].

5.2 Coupled problems

Using FFT-based methods for piezoelectricity was pioneered by Brenner [211]. Disregarding the differential constraints, the central problems can be understood by taking a look at the piezoelectric constitutive law

$$\begin{aligned} \begin{array}{rclcl} \sigma &{}= &{}\mathbb {C}: \varepsilon &{}-&{}e^T\cdot E\\ D &{}= &{}e:\varepsilon &{}+ &{}\gamma \cdot E \end{array} \quad \text {or} \quad \left[ \begin{array}{c} \sigma \\ D \end{array}\right] = \left[ \begin{array}{rr} \mathbb {C}&{} -e^T\\ e &{} \gamma \end{array}\right] \left[ \begin{array}{c} \varepsilon \\ E \end{array}\right] \end{aligned}$$
(5.1)

written in suggestive matrix-vector form. Here, \(\mathbb {C}\) is the stiffness tensor, \(\gamma \) denotes the dielectric permittivity, and e refers to the (third-order) piezoelectric moduli tensor. Furthermore, \(\varepsilon \) and \(\sigma \) are the elastic strain and stress, and D and E symbolize the electric induction and the electric field, respectively. Please note that the linear operator on the right-hand side of equation (5.1) is not symmetric. However, it is strictly monotone, as

$$\begin{aligned} \left[ \begin{array}{c} \varepsilon \\ E \end{array}\right] ^T \left[ \begin{array}{rr} \mathbb {C}&{} -e^T\\ e &{} \gamma \end{array}\right] \left[ \begin{array}{c} \varepsilon \\ E \end{array}\right] = \varepsilon :\mathbb {C}:\varepsilon + E\cdot \gamma \cdot E \end{aligned}$$

holds for all \(\varepsilon \in \text {Sym}(3)\) and \(E \in \mathbb {R}^3\). Brenner [211] applied the basic scheme to the original form (5.1). Such an approach is justified, as the basic scheme may be applied to general strongly monotone and Lipschitz continuous operators, see Sect. 14.11 in Milton [4]. Also, ADMM-type solvers (3.39) may be applied in this setting, see Giselsson [212] for a discussion of how to choose the algorithmic parameters in this case.

Alternatively, the constitutive equation may be rewritten in the form

$$\begin{aligned} \left[ \begin{array}{c} \sigma \\ -D \end{array}\right] = \left[ \begin{array}{rr} \mathbb {C}&{} -e^T\\ -e &{} -\gamma \end{array}\right] \left[ \begin{array}{c} \varepsilon \\ E \end{array}\right] , \end{aligned}$$
(5.2)

which is symmetric, but indefinite. Still, dedicated iterative solution methods may be applied, for instance MINRES [123]. A third alternative consists of eliminating one of the variables, for instance the electric field E, replacing it by the electric induction D as an independent variable. The resulting system reads

$$\begin{aligned} \left[ \begin{array}{c} \sigma \\ E \end{array}\right] = \left[ \begin{array}{rr} \mathbb {C}+ e^T \cdot \gamma ^{-1} \cdot e &{} -e^T \cdot \gamma ^{-1}\\ -\gamma ^{-1} \cdot e &{} \gamma ^{-1} \end{array}\right] \left[ \begin{array}{c} \varepsilon \\ D \end{array}\right] , \end{aligned}$$
(5.3)

which corresponds to a partial Legendre–Fenchel transformation of the free energy in E. The linear operator on the right-hand side is symmetric and positive semi-definite

$$\begin{aligned} \left[ \begin{array}{c} \varepsilon \\ D \end{array}\right] ^T \left[ \begin{array}{rr} \mathbb {C}+ e^T \cdot \gamma ^{-1} \cdot e &{} -e^T \cdot \gamma ^{-1}\\ -\gamma ^{-1} \cdot e &{} \gamma ^{-1} \end{array}\right] \left[ \begin{array}{c} \varepsilon \\ D \end{array}\right] = \varepsilon :\mathbb {C}:\varepsilon + (D - e : \varepsilon ) \cdot \gamma ^{-1} \cdot (D - e : \varepsilon ) \ge 0. \end{aligned}$$

Actually, the operator is even positive definite, as its invertibility is guaranteed by the strong monotonicity of the original formulation. Thus, the conjugate gradient method [40] may be applied for solving the piezoelectric problem in the formulation (5.3).

All three formulations have their respective merits. The primal formulation (5.1) enables low-memory solvers, such as the basic scheme or polarization methods, to be applied. However, those are less effective than their Krylov counterparts. The indefinite formulation (5.2) inherits the conditioning of the original formulation (5.1), and converges rapidly. However, MINRES has a comparatively high memory demand. The symmetric and positive definite formulation (5.3) requires less memory. However, the spectrum of the linear operator may differ strongly from the original formulation.
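These structural properties are easily checked numerically. The sketch below assembles the three block operators from randomly generated moduli in Mandel notation (all values illustrative) and verifies that (5.1) is non-symmetric yet strictly monotone, (5.2) is symmetric indefinite, and (5.3) is symmetric positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)

def spd(n):
    # random symmetric positive definite matrix (shifted Gram matrix)
    a = rng.standard_normal((n, n))
    return a @ a.T + n * np.eye(n)

C = spd(6)                       # stiffness (Mandel notation)
gam = spd(3)                     # dielectric permittivity
e = rng.standard_normal((3, 6))  # piezoelectric moduli

# (5.1): non-symmetric, but strictly monotone (symmetric part is PD)
A1 = np.block([[C, -e.T], [e, gam]])
assert np.linalg.eigvalsh(0.5 * (A1 + A1.T)).min() > 0

# (5.2): symmetric but indefinite
A2 = np.block([[C, -e.T], [-e, -gam]])
w = np.linalg.eigvalsh(A2)
assert w.min() < 0 < w.max()

# (5.3): partial Legendre transform -> symmetric positive definite
gi = np.linalg.inv(gam)
A3 = np.block([[C + e.T @ gi @ e, -e.T @ gi], [-gi @ e, gi]])
assert np.linalg.eigvalsh(A3).min() > 0
```

Note that the symmetric part of A1 is block-diagonal, so the piezoelectric coupling e drops out of the monotonicity estimate, in accordance with the quadratic form given above.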

The discussed alternatives pervade also coupled problems with a higher degree of sophistication. The electroelastic case was investigated by Brenner [213] applying ADMM to the primal form, and the multi-ferroic case was treated by Brenner and Bravo-Castillero [214] by a similar technique. Göküzüm and Keip [215] investigated piezoelectric coupling at finite strains by (a finite strain version of) the partial Legendre transform approach (5.3). Sixto-Camacho et al. [216] homogenized thermo-magneto-electro-elastic media in the primal formulation. Vidyasagar et al. [92] investigated ferroelectric ceramics by a phase-field approach. They rely upon a staggered, semi-explicit technique in the primal formulation. Rambausek et al. [217] proposed a Newton-CG approach for the partially Legendre transformed formulation.

A completely different type of coupling was studied by Hofmann et al. [218, 219], where electro-chemo-mechanical coupling in lithium ion batteries was investigated. The authors consider a finite volume discretization and use an adaptive Newton method with an FFT-based pre-conditioned linear Krylov solver. The effective properties of thermoelastic composites may also be computed by FFT-based methods [112, 220]. A full thermo-mechanical coupling strategy was recently proposed [221].

5.3 Damage and fracture mechanics

Predicting the strength and lifetime properties of multi-scale materials is certainly of practical interest, and a number of FFT-based solvers have been developed. However, care has to be taken, as the underlying models are typically non-convex, and interesting effects may arise. For instance, the balance equations might not be uniquely solvable, and physically plausible solutions need to be selected [222]. Also, there are damage and fracture models which may degenerate upon homogenization [223]. (This is independent of FFT.)

After the initial contribution by Herrmann et al. [224], who considered linear and elastic-plastic fracture mechanics, it took more than ten years until local damage models were considered again. FFT-based approaches are reported for isotropic local damage [225,226,227] and anisotropic local damage in SiC/SiC composite tubes [228], woven composites [229] and braided composites [230, 231].

Non-local damage models, well-known for finite elements [232], remove the spurious mesh dependence of local damage models upon failure, and were considered by different authors [233,234,235,236], where the non-locality is typically realized as a convolution-type integral, which is readily computed in Fourier space.

For FEM, a specific non-local damage model was proposed by Bourdin et al. [237] in order to approximate the variational approach to fracture [238], and is very popular nowadays because of its modular character and the ability to predict complex crack patterns. A semi-explicit FFT solver for phase-field fracture was introduced by Chen et al. [239], and Ernesti et al. [131] discussed a fully implicit solver based on a fast gradient method. Cao et al. [240] investigated the influence of different damage degradation functions with an implicit solver and a viscous regularization. Applications of phase-field fracture to ceramic matrix composites (CMCs) [241], interface decohesion [242], random fracture energies [243], multi-phase-field fracture [244] and fatigue in metals [245, 246] and fiber composites [247] were reported. A different phase-field damage model was investigated by Biner and Hu [248].

As an alternative to non-convex models, an FFT-based solver for computing the effective crack resistance of heterogeneous brittle materials was introduced [183], based on rigorous homogenization results [249, 250]. The problem to be solved is convex, but not differentiable, and an FFT-based primal-dual hybrid gradient method was developed. This approach is closely related to the solution method introduced by Willot [182] for a different application.

5.4 Polycrystalline materials

Applications of FFT-based methods to study the physical behavior of polycrystalline materials are rather frequent, and separate review articles are available on this topic [251,252,253]. For this reason, we will discuss the matter only briefly.

As a general note, different crystals of a material differ only in their orientations, reducing the material contrast of the microstructure to the inner material contrast of the single crystal. Thus, the Moulinec–Suquet discretization is most commonly used, unless softening constitutive laws are considered. Also, the inherent complexity of crystal plasticity models makes evaluating the material law expensive, and (Quasi-)Newton-type solvers are generally preferred when the time to solution is to be minimized.

5.5 Composites

Composite materials may have a larger elastic contrast than polycrystalline materials. Also, isotropic models are encountered frequently, so that a time-consuming rotation step is not necessary. For non-spherical fillers, representative volume elements may be rather large. In view of these facts, memory-efficient solvers are preferred for such composites. Also, the stochastic nature inherent to the production process of composites makes it necessary to determine the size of a representative volume element [254,255,256].

Woven [229] and non-woven [257, 258] materials as well as braided composites [230, 231] were considered. For short fiber composites, studies included their viscoelastic behavior [259, 260], a comparison to in situ micro-tensile experiments [261] and an optimal experimental design [262]. Furthermore, SiC/SiC composites [158, 228], ceramic matrix composites (CMCs) [241], metal matrix composites (MMCs) [263], carbon nanoparticle reinforced composites [264], sheet molding compound (SMC) composites [265] and filled rubbers [266, 267] were studied via FFT-based computational homogenization methods.

5.6 Porous and granular materials

When treating porous and granular materials by FFT-based computational methods, the pore space needs to be meshed as well. In particular, numerically stable discretizations, as discussed in Sect. 2.5, are mandatory. Then, balancing computational speed and memory footprint determines the appropriate solution technique.

FFT-based methods were developed to compute the conductivity [268] and the thermal diffusivity [199] of porous materials, and for predicting microstructure-induced hotspots in granular media [269]. The elastic and plastic properties of porous materials [270, 271] were studied, and the effective yield [272, 273] and failure [274] surfaces could be identified.

FFT-based methods are particularly effective for analyzing digital volume images of rocks [275,276,277], sedimentary rocks [278] and shales [279]. In this context, simulating the mechanical compression of the digital images is indispensable for capturing the ambient conditions several kilometers below the earth’s surface [280]. Closely related are applications to microstructures of bound sand used for casting processes [109, 281]. FFT-based computational methods may be used to calculate the elastic properties of bones [282, 283], for estimating the stiffness and strength of medium-density fiberboards [284,285,286] and for gaining insight into the orientations of fiber bundles in natural fiber based microstructures [285, 287].

FFT-based methods were reported to be effective for aluminum [288], concrete [289] and ceramic foams [290]. Also, applications to cellular alumina ceramics [291], sintered alumina ceramics [292], ceramics with spheroidal pores [293], alumina and zirconia ceramics [294], ferroelectric ceramics [92] and auxetic ceramics [295] were reported. Besides, applications to cubic-cell [296] and porous metamaterials [297] were provided.

5.7 Miscellaneous

Extensions to mechanics at finite strains were reported for hyperelastic [139, 298, 299] and hypoelastic [300] constitutive formulations. FFT-based solvers were developed to compute the permeability [301,302,303,304] of heterogeneous materials, for instance for random cylinders [305] and doubly porous media [306]. Variants also account for inertial terms [307] or compute the dynamic permeability [308].

Moreover, FFT-based methods can be used to compute the effective viscosity of particle suspensions [309]. FFT-based computational techniques proved useful for studying the viscoelasticity of cement paste [310], mortar samples [311], cement [312] and concrete [313]. Furthermore, FFT-based techniques were used for studying explosive materials [314], secondary creep in a porous nuclear fuel [315], the thermal expansion of an energetic material [316], optical properties of deposit models for paint [317] and dynamic recrystallization [318], for computing geodesics in two-dimensional media [182], for fitting microstructure-property relationships [319], for topology optimization [320] and for finding emerging microstructures associated with non-convex potentials [93, 321].

FFT-based computational homogenization methods have been used in concurrent multi-scale simulation [217, 226, 322, 323] and in connection to stochastic FEM [324]. They served as the basis for investigating field statistics [325,326,327,328] and as a reference for mean-field estimates [329,330,331]. FFT-based methods for higher-order homogenization were reported for strain gradient materials [332, 333] as well as to calculate the elastic moduli of beams and plates [334].

6 Concluding remarks and future directions

Since the early pioneering days [33, 34, 335], computational homogenization methods utilizing the fast Fourier transform (FFT) have matured to a state where treating complex microstructures and constituents with sophisticated material behavior is feasible within reasonable time. FFT-based methods exploit the specific structure of regular grids, and utilize the FFT for constructing general-purpose pre-conditioners for the discretized (nonlinear) systems. These methods were developed in a modular fashion, where specific discretization schemes and dedicated solvers can be combined with advantageous problem formulations tailored to the application at hand. For instance, to compute the effective viscosity of a suspension, a dual formulation, discretized by finite differences, may be solved by a fast gradient method. To speed up material modelling, the integration of automatic differentiation approaches [336] can be useful, as well.

Despite the progress made, overcoming a few remaining technological challenges is expected to improve the applicability of computational multi-scale methods, in general, and FFT-based techniques, in particular. For instance, the requirements of modern multi-scale methods necessitate going beyond solving a specific homogenization problem for given parameters and a single microstructure. Indeed, variations in both the microstructure and the material parameters are to be taken into account. For instance, for injection-molded fiber-reinforced composites, both the filler content and the fiber orientation may be locally varying, and material models for any such combination need to be provided for macro-scale computations [259]. Moreover, it might be necessary to calibrate the microscopic material parameters inversely. For example, it is known that the matrix material in a fiber composite may show a (strongly) different behavior from that of a pure matrix sample, for instance as a result of a change in crystallization caused by the presence of fibers.

Possible solutions include the use of sparse sampling techniques [337], tensor methods [338] or AI-based surrogate models [339].