1 Introduction

The constitutive behavior of metals and alloys is strongly influenced by their microstructural characteristics, such as the size, fraction, orientation and composition of the individual metallic phases. The need for understanding the evolution of microstructural characteristics with deformation has stimulated the development of advanced micromechanical models that accurately describe the underlying physical phenomena, e.g., recrystallization [3, 17], martensitic phase transitions [20, 23, 35], phase separation and coarsening by diffusion [5], twinning and detwinning [12, 39], dislocation interactions [7, 9], and cracking and damage growth [2, 4, 8, 29]. In order to apply state-of-the-art micromechanical models for the analysis of large-scale engineering problems, efficient and generic multiscale methods need to be developed for keeping the computational times within manageable bounds.

Starting with the landmark contributions of Voigt [38] and Reuss [22], substantial research effort has been devoted to efficiently transferring information from small length scales to the macroscopic scale, leading to a wide spectrum of analytical and numerical formulations for the effective mechanical behavior of composites [11, 13, 14, 18, 21, 24, 30, 31, 40, 41, 44]. Although for a broad range of materials these methods have provided an impetus to the homogenization of basic constitutive properties (elasticity, (rate-dependent) plasticity, power law creep), their extension towards the description of advanced microstructures composed of a diversity of phases with relatively complex constitutive behavior often is far from straightforward, and poses considerable mathematical challenges. Furthermore, the description of sophisticated micromechanical phenomena may introduce complementary conditions on the static and/or kinematic hypotheses adopted in classical homogenization approaches, such as the well-known Taylor assumption that demands the deformation in each microstructural phase to be equal to the applied macroscopic deformation. For example, for polycrystalline materials this kinematic assumption appears to be too restrictive for adequately simulating grain size effects [7] and deformation texture [37]; hence, during the last decade this has triggered the development of homogenization schemes in which deformation heterogeneity among grains is explicitly accommodated for by relaxing the Taylor assumption [6, 7, 16, 34, 37]. This relaxation can be formulated in various ways and at different degrees, and essentially comes down to requiring that the macroscopic deformation is no longer imposed on each grain individually, but rather on specific clusters of grains, by equating it to the weighted average of the grain deformations within a cluster. Accordingly, the distribution of strain remains homogeneous within each grain, but not within a cluster of grains.

To date, the grain cluster-type formulations presented in the literature typically consider relatively small clusters of 2–8 hexahedral (rectangular) grains, with the deformation incompatibilities at the grain boundaries being described by a set of additional kinematic variables, i.e., the a-priori unknown relaxations [6, 7, 34, 37]. These local relaxations are computed by minimizing the total work of the system, whereby the stationarity condition with respect to the relaxations results in the corresponding equations for traction continuity at the grain boundaries. For providing the relaxations with a physical background, the deformation mismatch at grain boundaries, commonly expressed by the Nye tensor [10, 19], often is constitutively connected to the development of dislocation networks, see [7, 34].

Despite their efficiency in terms of computational time, the current grain cluster-type formulations are not very suitable for being extended to clusters composed of a vast number of grains with realistic (convex and non-convex) shapes, since the incorporation of numerous relaxations at grain boundaries of arbitrary orientation makes the mathematical implementation relatively cumbersome. For this reason, in the present communication a generalized grain cluster method (GGCM) is proposed in which these limitations are removed. In specific, the general character of this formulation can be defined by means of three distinctive aspects, namely: (i) the method is able to model grains of arbitrary polyhedral shape, (ii) the method can handle a relatively large number of grains in a computationally tractable way, thereby explicitly accounting for interactions between individual grains, (iii) the method is formulated within a geometrically nonlinear framework and is independent of the actual micromechanical model(s) applied within the grains. The latter aspect allows for the analysis of an aggregate of dissimilar (multiphase) grains with different, user-defined constitutive properties.

The basic starting point of the method is to assume that each grain in a polycrystalline aggregate deforms homogeneously, whereby the deformation gradient is allowed to vary from grain to grain. However, as opposed to other grain cluster-type models, both traction discontinuities and deformation incompatibilities along grain boundaries are minimized simultaneously in the cluster, by means of iteratively adjusting the deformation gradients in the individual grains. This key ingredient is based upon the construction of a representative objective functional, which brings in the advantage that the method can be straightforwardly applied to a large number of grains of arbitrary shape. The use of separate weighting factors in the objective functional on the conditions for traction continuity and deformation compatibility makes it possible to cover the range of effective nonlinear responses lying between the Taylor bound (uniform deformation in the grains) and the Sachs bound (uniform stress in the grains). Although the method does allow for describing the incompatibilities at grain boundaries in terms of any particular localized deformation mechanism, such as geometrically necessary dislocations or intergranular cracks, for reasons of simplicity and generality these incompatibilities here are straightforwardly adopted as a consequence of the kinematic assumptions made in the formulation.

The paper is organized as follows. The generalized grain cluster method is formulated in Sect. 2 and its numerical implementation is treated in Sect. 3. The calibration of the weighting factors and a detailed analysis of the efficiency and accuracy of the method is demonstrated in Sect. 4 through a series of simulations on grain clusters of various sizes. The simulations were performed for a multiphase material composed of ferritic grains undergoing plastic deformation and austenitic grains undergoing a combination of plastic deformation and phase transformation. The advanced microstructural geometries considered in the analyses were generated by means of a multilevel Voronoi algorithm developed recently in [43], and represent a steel experiencing transformation-induced plasticity, i.e., a TRIP steel. The plasticity and transformation phenomena activated under shear loading were simulated by means of crystallographically-based models presented in previous works by the authors [32, 33, 35, 36, 42]. One important objective for the development of the GGCM is to have a flexible scheme that can be used in a so-called multiscale adaptive algorithm. Within that algorithm, microscale simulations may be conducted using either the GGCM or a fully-resolved finite-element simulation at the level of individual grains, depending on the required resolution. In view of this, the performance of the GGCM is compared to (finite element-based) direct numerical simulations for the same microstructure. This illustrates that the GGCM is able to efficiently account for the evolution of the stress and history variables, such as plastic slip and transformation volume fractions. Some concluding remarks are provided in Sect. 5 on the coupling of the GGCM with commercial FEM software, and the bifurcation sensitivity of microstructural responses.

As a general scheme of notation, vectors are indicated in lowercase boldface and tensors or matrices are indicated in uppercase boldface. A centered dot indicates an inner product between two tensors of the same order. Additional notation is introduced in the text as required.

Fig. 1
figure 1

Periodic aggregate of grains \(\varOmega \)

2 Formulation of generalized grain cluster method

2.1 Basic assumptions

Consider an aggregate of \(N_{\mathrm {gr}}\) grains, with each grain \(N = 1,\ldots , N_{\mathrm {gr}}\) occupying a region \(\varOmega _N\), as shown schematically in Fig.  1. The region filled by the cluster of grains is denoted as \(\varOmega \), i.e.,

$$\begin{aligned} \varOmega =\bigcup _{N=1}^{N_{\mathrm {gr}}}\varOmega _{N} \ . \end{aligned}$$
(1)

This aggregate of grains is assumed to be periodic in space. Each grain \(N\) is taken as a polyhedron, although not necessarily convex. The boundary of each polyhedral grain \(N\) is composed of \(M_N\) faces, denoted by \(\varGamma _{N,\gamma }\) and identified using the global grain index \(N\) and a local face index \(\gamma = 1,\ldots , M_N\). For notational purposes, it is convenient to introduce a global interface index \(I\) given by \(I = \hat{I}(N,\gamma )\). The common interface \(\varGamma _{I}\) between adjacent grains \(N\) and \(N'\) (corresponding to the local indices \(\gamma \) and \(\gamma '\), respectively), is uniquely identified as \(I = \hat{I}(N,\gamma ) = \hat{I}(N',\gamma ')\), as illustrated in Fig.  1. Furthermore, observe that parts of grains on the “external” boundary of \(\varOmega \) appear as disconnected, but are in fact treated as a single grain due to periodicity. In that case, the index \(N\) refers to the whole grain and the index \(I\) to the whole interface, see Fig.  1. Correspondingly, the total number of interfaces in the cluster is

$$\begin{aligned} N_\mathrm{{int}} = \frac{1}{2} \sum _{N=1}^{N_{\mathrm {gr}}} M_N \ . \end{aligned}$$
(2)

The description of a deformation \(\hat{\mathbf {y}}\) from a reference configuration is written as

$$\begin{aligned} \mathbf {y}= \hat{\mathbf {y}}(\mathbf {x}, t) \ \quad \text {with} \quad \mathbf {x} \in \varOmega \quad \text {and} \quad \mathbf {y} \in \varOmega _t \, , \end{aligned}$$
(3)

where \(\varOmega \) and \(\varOmega _t\) denote the regions occupied by the grain cluster in the reference and the current configurations (at time \(t\)), respectively, \(\mathbf {x}\) is a (material) point in the reference configuration and \(\mathbf {y}\) denotes the current location of \(\mathbf {x}\) at time \(t\). The deformation gradient \(\mathbf {F}\) is defined as

$$\begin{aligned} \mathbf {F}:=\nabla \hat{\mathbf {y}} \ , \end{aligned}$$
(4)

where \(\nabla = \partial / \partial \mathbf {x}\) designates the gradient with respect to \(\mathbf {x}\).

Previous grain cluster-type formulations rely on variables that describe the deformation of interfaces, see [6, 7, 34, 37]. While this approach is adequate for the description of cubic-like grains, it becomes cumbersome and requires enforcement of redundant constraints for grains with an arbitrary polyhedral shape. In order to reduce the number of variables while preserving the simplicity of the grain cluster method, it is more efficient to work directly with the deformation gradient of the grain as the primary variable. This modification allows for an extension of the range of applications to a large number of grains of arbitrarily complex shapes; hence, the method is termed the generalized grain cluster method (GGCM). In view of developing the weak formulation of the GGCM, the deformation field \(\hat{\mathbf {y}}\) and a generic test function \(\hat{\mathbf {w}}\) are assumed to be linear inside each grain, i.e.,

$$\begin{aligned} \begin{aligned}&\mathbf {y} = \hat{\mathbf {y}} (\mathbf {x}, t) =\mathbf {F}_{N}(t) \mathbf {x} + \mathbf {c}_{N} (t),\\&\mathbf {w} = \hat{\mathbf {w}} (\mathbf {x}, t) =\mathbf {G}_{N}(t) \mathbf {x} + \mathbf {d}_{N} (t), \end{aligned} \end{aligned}$$
(5)

for \(\mathbf {x}\) in \(\varOmega _{N}\). At a given time \(t\), the tensor \(\mathbf {F}_{N}\) and the vector \(\mathbf {c}_{N}\) thus are considered as uniform in grain \(N\), with \(\mathbf {F}_{N}\) the deformation gradient; note that for an admissible deformation it is required that \(\mathrm {det}\left( \mathbf {F}_{N}\right) >0\). The deformation gradient and the displacement are allowed to vary discontinuously from grain to grain. From this perspective, the proposed method shares similarities with non-conforming Galerkin finite element methods, where displacements are allowed to be discontinuous at element boundaries, see, e.g., [1]. The tensor \(\mathbf {G}_{N}\) and the vector \(\mathbf {d}_{N}\) characterizing the test functions in (5)\(_2\) are assumed to be constant in the interior of each grain. Further, at the common interface \(I\) between two adjacent grains \(N\) and \(N'\), these quantities are taken as a simple average, i.e.,

$$\begin{aligned} \begin{aligned}&\mathbf {G}_{I} :=\frac{1}{2}\left( \mathbf {G}_{N}+\mathbf {G}_{N'}\right) , \\&\mathbf {d}_{I} :=\frac{1}{2}\left( \mathbf {d}_{N}+\mathbf {d}_{N'}\right) , \\ \end{aligned} \end{aligned}$$
(6)

with \(I = \hat{I}(N,\gamma ) = \hat{I}(N',\gamma ')\) representing a global interface index, see the inset in Fig.  1. As will be shown below, the relevance of using (6) is to (approximately) recover continuity of traction and kinematic compatibility across grain boundaries.

2.2 Weak formulation and discretization of the balance of linear momentum

Neglecting body forces, for a quasi-static process the balance of linear momentum in terms of the first Piola–Kirchhoff stress tensor \(\mathbf {P}\) is expressed as

$$\begin{aligned} \mathrm {div}\mathbf {P} = \mathbf {0}, \end{aligned}$$
(7)

with \(\mathrm {div} = \mathrm {div}_{\mathbf {x}}\) denoting the divergence in the reference configuration. Multiplying (7) with a suitable test function \(\mathbf {w}\) gives

$$\begin{aligned} \mathrm {div}\mathbf {P} \cdot \mathbf {w} = \mathrm {div}\left( \mathbf {P}^{\mathrm {T}}\mathbf {w}\right) - \mathbf {P} \cdot \nabla \mathbf {w} = 0 \ , \end{aligned}$$
(8)

where \((\cdot )^T\) designates the transpose of a tensor.

In the classical formulation of a boundary value problem the macroscopic boundary conditions are applied on the external boundary of the domain \(\varOmega \). However, due to the periodicity of the present microstructure, all interfaces of the domain \(\varOmega \) are treated as internal boundaries, for which the boundary data is not explicitly defined. For this reason, the macroscopic deformation is imposed pointwise on the interior of the grain cluster instead, by means of the following multiscale kinematic constraint:

$$\begin{aligned} \int _{\varOmega }\left( \mathbf {F}-\bar{\mathbf {F}}\right) \mathrm {d}v=\mathbf {0}, \end{aligned}$$
(9)

where \(\mathbf {F}\) is the deformation gradient in a microscopic material point and \(\bar{\mathbf {F}}\) reflects the deformation gradient at the macroscopic level. Observe that the multiscale kinematic constraint (9) cannot be transformed to pointwise periodic boundary conditions, since the displacement field is not continuous across grains. Taking a variation in (9) with respect to the deformation gradient, it follows that a (virtual) deformation gradient \(\delta \mathbf {F} = \nabla \mathbf {w}\) should satisfy that its average over the domain \(\varOmega \) is zero. Consequently, a suitable test function \(\mathbf {w}\) is assumed to fulfill the condition

$$\begin{aligned} \int _{\varOmega } \nabla \mathbf {w} \mathrm {d}v =\mathbf {0} \ . \end{aligned}$$
(10)

Integrating (8) over the reference domain \(\varOmega \), followed by using the decomposition (1) and incorporating the constraint (10), gives

$$\begin{aligned}&\sum _{N = 1}^{N_{\mathrm {gr}}} \left[ \int _{\varOmega _N} \left( \mathrm {div}\left( \mathbf {P}^{\mathrm {T}}\mathbf {w}\right) - \mathbf {P} \cdot \nabla \mathbf {w} \right) \mathrm {d}v \right. \nonumber \\&\quad \left. + \varvec{\varSigma } \cdot \int _{\varOmega _N} \nabla \mathbf {w} \mathrm {d}v \right] = 0 \ , \end{aligned}$$
(11)

where \(\varvec{\varSigma }\) is a Lagrange multiplier tensor. Since the assumed displacement field \(\mathbf {u}\) presented in (5) may be discontinuous across grain boundaries, the divergence term in (11) may not be defined at the internal interfaces. However, it is still possible to use the divergence theorem for each grain separately, which leads to

(12)

where \(\mathbf {n} = \mathbf {n}(N,\gamma )\) refers to the outward normal unit vector of face \(\varGamma _{N,\gamma }\). Rewriting (12) using the assumed fields (5) yields the following expression for the weak form

(13)

where \(\mathbf {P}_N\) refers to the first Piola–Kirchhoff stress in grain \(N\) and \(\mathbf {G}_{I}\) and \(\mathbf {d}_{I}\) are given by (6) with \(I = \hat{I}(N, \gamma )\). Consistent with the assumption of a homogeneous grain subjected to a uniform deformation gradient \(\mathbf {F}_N\), the stress \(\mathbf {P}_N\) is taken as uniform within an individual grain \(N\).

To elaborate further on expression (13), observe that the first term refers to a summation over all grains \(N\) and all interfaces \(\gamma \) (local index), hence it involves surface integrals on both sides of each interface \(I = \hat{I}(N, \gamma ) = \hat{I}(N', \gamma ')\). For a specific grain \(N\) that shares an interface \(I\) with a neighboring grain \(N'\), the contribution from both grains to the first term in (13) is, using relation (6),

$$\begin{aligned}&\int _{\varGamma _{N,\gamma }} \left( \mathbf {P}_N \mathbf {n}(N,\gamma ) \right) \cdot \left( \mathbf {G}_{\hat{I}(N,\gamma )} \mathbf {x} + \mathbf {d}_{\hat{I}(N,\gamma )} \right) \mathrm {d}a\\&\qquad + \int _{\varGamma _{N',\gamma '}} \left( \mathbf {P}_{N'} \mathbf {n}'(N',\gamma ') \right) \cdot \left( \mathbf {G}_{\hat{I}(N',\gamma ')} \mathbf {x} + \mathbf {d}_{\hat{I}(N',\gamma ')} \right) \mathrm {d}a\\&\quad = \!\int _{\varGamma _{N,\gamma }} \frac{1}{2} \left( \mathbf {P}_N \mathbf {n}(N,\gamma ) \right) \cdot \left( \left( \mathbf {G}_{N} \!+\! \mathbf {G}_{N'} \right) \mathbf {x} \!+\! \left( \mathbf {d}_{N} \!+\! \mathbf {d}_{N'} \right) \right) \mathrm {d}a \\&\qquad + \!\int _{\varGamma _{N',\gamma '}} \frac{1}{2} \left( \mathbf {P}_{N'} \mathbf {n}'(N',\gamma ') \right) \cdot \left( \left( \mathbf {G}_{N'} \!+\! \mathbf {G}_{N} \right) \mathbf {x} \!+\! \left( \mathbf {d}_{N'} \!+\! \mathbf {d}_{N} \right) \right) \mathrm {d}a\\&\quad = \int _{\varGamma _{N,\gamma }} \frac{1}{2} \left( \mathbf {P}_N \mathbf {n}(N,\gamma ) + \mathbf {P}_{N'} \mathbf {n}'(N',\gamma ') \right) \cdot \left( \mathbf {G}_{N} \mathbf {x} + \mathbf {d}_{N} \right) \mathrm {d}a\\&\qquad + \! \int _{\varGamma _{N',\gamma '}} \frac{1}{2} \left( \mathbf {P}_{N'} \mathbf {n}'(N',\gamma ') \!+\! \mathbf {P}_N \mathbf {n}(N,\gamma ) \right) \cdot \left( \mathbf {G}_{N'} \mathbf {x} \!+\! \mathbf {d}_{N'} \right) \mathrm {d}a \\&\quad = \int _{\varGamma _{N,\gamma }} \frac{1}{2} \left( \mathbf {P}_N - \mathbf {P}_{N'} \right) \mathbf {n}(N,\gamma ) \cdot \left( \mathbf {G}_{N} \mathbf {x} + \mathbf {d}_{N} \right) \mathrm {d}a \\&\qquad + \int _{\varGamma _{N',\gamma '}} \frac{1}{2} \left( \mathbf {P}_{N'} - \mathbf {P}_N \right) \mathbf {n}'(N',\gamma ') \cdot \left( \mathbf {G}_{N'} \mathbf {x} + \mathbf {d}_{N'} \right) \mathrm {d}a \ , \end{aligned}$$

where the last expression follows from the fact that the outward normal unit vector of face \(\varGamma _{N',\gamma '}\) satisfies \(\mathbf {n}' = \mathbf {n}'(N',\gamma ') = - \mathbf {n}(N, \gamma )\), as shown in Fig.  2.

Fig. 2
figure 2

Common interface of grains \(N\) and \(N'\)

Consequently, the first term in (13) may be expressed as follows:

$$\begin{aligned}&\sum _{N=1}^{N_{\mathrm {gr}}} \sum _{\gamma =1}^{M_{N}} \int _{\varGamma _{N,\gamma }} \left( \mathbf {P}_N \mathbf {n} \right) \cdot \left( \mathbf {G}_{I} \mathbf {x} + \mathbf {d}_{I} \right) \mathrm {d}a \nonumber \\&\quad = \sum _{N=1}^{N_{\mathrm {gr}}} \sum _{\gamma =1}^{M_{N}} \int _{\varGamma _{N,\gamma }} \frac{1}{2} \left( \mathbf {P}_N - \mathbf {P}_{N'} \right) \mathbf {n} \cdot \left( \mathbf {G}_{N} \mathbf {x} + \mathbf {d}_{N} \right) \mathrm {d}a,\nonumber \\ \end{aligned}$$
(14)

where, to simplify the notation, the arguments of \(\mathbf {n}\) have been suppressed. A further simplification in (14) may be achieved by using the fact that the traction vector \(\mathbf {P}_N \mathbf {n}\) in each interface \(I\) is constant, hence

$$\begin{aligned}&\sum _{N=1}^{N_{\mathrm {gr}}} \sum _{\gamma =1}^{M_{N}} \int _{\varGamma _{N,\gamma }} \frac{1}{2} \left( \mathbf {P}_N - \mathbf {P}_{N'} \right) \mathbf {n} \cdot \left( \mathbf {G}_{N} \mathbf {x} + \mathbf {d}_{N} \right) \mathrm {d}a \nonumber \\&\quad = \sum _{N=1}^{N_{\mathrm {gr}}} \sum _{\gamma =1}^{M_{N}} \frac{1}{2} A_{N,\gamma } \left( \mathbf {P}_N - \mathbf {P}_{N'} \right) \mathbf {n} \cdot \left( \mathbf {G}_{N} \mathbf {r}_{N,\gamma } + \mathbf {d}_{N} \right) ,\nonumber \\ \end{aligned}$$
(15)

where \(\mathbf {r}_{N,\gamma }\) represents the position vector of the centroid of interface \(I = \hat{I}(N,\gamma )\) and \(A_{N,\gamma }\) is the corresponding area. From the assumptions (5) and in view of (15), it follows that (13) may be written as

(16)

with \(V_N\) reflecting the volume of the \(N\)-th grain.

In the weak formulation (16), which needs to be satisfied for all test tensors \(\mathbf {G}_{N}\) and test vectors \(\mathbf {d}_{N}\), the unknowns are the deformation gradient tensors \(\mathbf {F}_N\) with \(N=1,\ldots ,N_{\mathrm {gr}}\) and the Lagrange multiplier \(\varvec{\varSigma }\); observe that the vectors \(\mathbf {c}_N\) have no contribution in this formulation. Note that the first Piola–Kirchhoff stress \(\mathbf {P}_N\) in grain \(N\) depends on the deformation gradient \(\mathbf {F}_N\) through the constitutive law of grain \(N\), whereas the first Piola–Kirchhoff stress \(\mathbf {P}_{N'}\) in an adjacent grain \(N'\) depends in a similar fashion on the deformation gradient \(\mathbf {F}_{N'}\). Here, it is implicitly assumed that, with the actual constitutive relation, the stress (or stress rate) depends objectively on the deformation through an appropriate formulation.

Since the test tensors \(\mathbf {G}_{N}\) can be specified independently of the test vectors \(\mathbf {d}_{N}\), the (virtual) deformation \(\mathbf {G}_{N}~\mathbf {r}_{N,\gamma }~ +~\mathbf {d}_{N}\) of the centroid of an interface \(\gamma \) may be defined independently of the (virtual) deformation gradient \(\mathbf {G}_{N}\) of the grain. Consequently, the formulation (16), together with the constraint (9), leads to the following system of equations for each grain and each interface:

$$\begin{aligned} \begin{aligned} A_{N,\gamma } \left( \mathbf {P}_N - \mathbf {P}_{N'} \right) \mathbf {n}&= \mathbf {0} \\ V_N \left( \mathbf {P}_N - \varvec{\varSigma } \right)&= \mathbf {0} \\ \text {for all } N=1,\ldots ,&N_{\mathrm {gr}} \ , \ \gamma = 1,\ldots ,M_N&\\ \text {subjected to:}&\\ \;\sum _{N=1}^{N_{\mathrm {gr}}} V_N \left( \mathbf {F}_{N}-\bar{\mathbf {F}}\right)&= \mathbf {0} \ .&\end{aligned} \end{aligned}$$
(17)

In principle this is an over-determined system of equations since, in view of (2) and (17), there are \(N_{\mathrm {int}} + N_{\mathrm {gr}} +1\) distinct tensor-valued equations and only \(N_{\mathrm {gr}}+1\) tensor-valued unknowns, i.e., the deformations gradients in each grain and the global Lagrange multiplier. Observe that one solution of this system of equations corresponds to a uniform state of stress, with the Lagrange multiplier \(\varvec{\varSigma }\) representing the actual macroscopic stress value.

As mentioned before, equation (17)\(_3\) enforces compatibility between the volume-averaged microscopic deformation gradients \(\mathbf {F}_{N}\) in the grains and the macroscopic deformation gradient \(\bar{\mathbf {F}}\) of the whole cluster of grains. However, in analogy with a displacement-controlled process in which \(\bar{\mathbf {F}}\) is prescribed, it also acts as the “external loading” for which the system of equations (17)\(_{1,2}\) must be satisfied. In particular, note that in the absence of the loading term (17)\(_3\), equations (17)\(_{1,2}\) are trivially satisfied with a stress-free state \(\varvec{\varSigma }\) = 0. Accordingly, a state of equal stress in the grains, which reflects the well-known Sachs bound, requires the solution for the deformation gradients \(\mathbf {F}_N\) that satisfies (17)\(_{3}\). Obviously, this solution neglects kinematic compatibility across grain boundaries, which indeed would induce a non-uniform state of stress in the grain cluster. The incorporation of the kinematic compatibility equation in the formulation is treated in the section below.

2.3 Weak formulation and discretization of the kinematic compatibility equation

As mentioned in Sect. 2.1, the basic kinematic assumption adopted in the generalized grain cluster method is that the displacement field is linear within each grain, but generally may be discontinuous across grain boundaries. This discontinuity can be related to a physical mechanism, such as crack formation or dislocation activity [7, 34], but for simplicity and generality is considered here as a direct result of the above kinematic assumptions. Accordingly, the purpose is to find piecewise linear displacement fields that minimize the kinematic incompatibilities at grain boundaries. To this end, the equation of kinematic compatibility, which guarantees continuity of a displacement field, is explicitly incorporated in the formulation as a field equation.

Referring to a cartesian basis, the components of a vector \(\mathbf {n}\) and a tensor \(\mathbf {F}\) are, respectively, given by \(n_i\) and \(F_{ij}\), with \(i,j = 1,2,3\). Accordingly, the curl of the tensor field \(\mathbf {F} = \mathbf {F}(\mathbf {x})\) and the cross product between \(\mathbf {n}\) and \(\mathbf {F}\) can be expressed as

$$\begin{aligned} \begin{aligned}&\left( \nabla \times \mathbf {F}\right) _{ij}=\varepsilon _{imn}F_{jn,m} \ , \\&\left( \mathbf {n}\times \mathbf {F}\right) _{ij}=\varepsilon _{imn}n_{m}F_{jn} \ . \end{aligned} \end{aligned}$$
(18)

Here, implicit summation on repeated indices is assumed, \((\cdot )_{\cdot ,m}\) refers to partial differentiation with respect to \(x_m\), \(\varepsilon _{ijk} = (1/2)(i-j)(j-k)(k-i)\) represents the alternator (or permutation) tensor, and \(\nabla \times (\cdot )\) designates the curl of a tensor (in the reference configuration). When interpreting \(\mathbf {F}\) as the microscopic deformation gradient, the kinematic compatibility equation can be written as

$$\begin{aligned} \nabla \times \mathbf {F} = \mathbf {0} \ . \end{aligned}$$
(19)

Multiplying (19) by a suitable tensor-valued test function \(\mathbf {G}\) and using the identity (73) (see appendix) yields

$$\begin{aligned} \left( \nabla \times \mathbf {F} \right) \cdot \mathbf {G} = \mathrm {tr} \left( \nabla \times \left( \mathbf {G}\mathbf {F}\right) \right) + \left( \nabla \times \mathbf {G}^{\mathrm {T}} \right) \cdot \mathbf {F}^{\mathrm {T}} = 0 \ ,\nonumber \\ \end{aligned}$$
(20)

where \(\mathrm {tr}\) indicates the trace of a tensor. Integrating (20) over the domain \(\varOmega \) and using the decomposition (1) gives

$$\begin{aligned} \sum _{N = 1}^{N_{\mathrm {gr}}} \int _{\varOmega _N} \left( \mathrm {tr} \left( \nabla \times \left( \mathbf {G}\mathbf {F}\right) \right) + \left( \nabla \times \mathbf {G}^{\mathrm {T}} \right) \cdot \mathbf {F}^{\mathrm {T}} \right) \mathrm {d}v = 0.\nonumber \\ \end{aligned}$$
(21)

Applying the generalized divergence theorem for each grain separately then leads to

$$\begin{aligned}&\sum _{N = 1}^{N_{\mathrm {gr}}} \left[ \sum _{\gamma = 1}^{M_{N}} \mathrm {tr}\left( \int _{\varGamma _{N,\gamma }} \mathbf {n} \times \left( \mathbf {G}\mathbf {F} \right) \mathrm {d}a \right) \right. \nonumber \\&\quad \left. + \int _{\varOmega _N} \big ( \nabla \times \mathbf {G}^{\mathrm {T}} \big ) \cdot \mathbf {F}^{\mathrm {T}} \mathrm {d}v \right] = 0 \ , \end{aligned}$$
(22)

where the linearity of the integration and trace operators was used to interchange their order.

The tensor-valued test function \(\mathbf {G}\) is taken as the gradient of a vector-valued test function \(\mathbf {w}\), i.e., \(\mathbf {G} = \nabla \mathbf {w}\). Consequently, \(\nabla \times \mathbf {G}^{\mathrm {T}} = \nabla \times \left( \nabla \mathbf {w}\right) ^{\mathrm {T}} = \nabla \left( \nabla \times \mathbf {w}\right) \). In general, this term is not zero, but for the choice of piece-wise linear test functions \(\mathbf {w}\) introduced in (5), it follows that inside each grain \(N\) the tensor \(\mathbf {G}\) is constant and therefore meets the relation \(\nabla \times \mathbf {G}^{\mathrm {T}} = \mathbf {0}\). Correspondingly, the second term in (22) vanishes. Now, using the assumed fields (5) in (22) and in view of the identity \(\mathrm {tr}\left( \mathbf {n}\times \left( \mathbf {G}\mathbf {F}\right) \right) =\left( \mathbf {n}\times \ \mathbf {F}\right) \cdot \mathbf {G}\), the weak form of the kinematic compatibility equation becomes

$$\begin{aligned} \sum _{N=1}^{N_{\mathrm {gr}}} \sum _{\gamma =1}^{M_{N}} \int _{\varGamma _{N,\gamma }} \left( \mathbf {n} \times \mathbf {F}_{N}\right) \cdot \mathbf {G}_I \mathrm {d}a = 0 \ , \end{aligned}$$
(23)

where \(\mathbf {G}_I\) is given by (6), with \(I=\hat{I}(N,\gamma )\).

The summation in (23) is carried out over all grains \(N\) and all interfaces \(\gamma \), hence it includes surface integrals on both sides of each interface \(I\). Correspondingly, using (6), the weak form (23) may be expressed as

$$\begin{aligned}&\sum _{N=1}^{N_{\mathrm {gr}}} \sum _{\gamma =1}^{M_{N}} \int _{\varGamma _{N,\gamma }} \frac{1}{2} \left( \mathbf {n} \times \mathbf {F}_{N} + \mathbf {n}' \times \mathbf {F}_{N'} \right) \cdot \mathbf {G}_N \mathrm {d}a\nonumber \\&\quad = \sum _{N=1}^{N_{\mathrm {gr}}} \sum _{\gamma =1}^{M_{N}} \int _{\varGamma _{N,\gamma }} \frac{1}{2} \left( \mathbf {n} \!\times \! \left( \mathbf {F}_{N} - \mathbf {F}_{N'} \right) \right) \cdot \mathbf {G}_N \mathrm {d}a \!=\! 0, \end{aligned}$$
(24)

where \(\mathbf {n}' = - \mathbf {n}\) refers to the outward normal unit vector of face \(\varGamma _{N',\gamma '}\).

The formulation (24), together with the constraint (9), leads to the following system of equations

$$\begin{aligned} \begin{aligned} A_{N,\gamma } \frac{1}{2} \mathbf {n} \times \left( \mathbf {F}_{N} - \mathbf {F}_{N'} \right)&= \mathbf {0} \quad \\ \text {for all } N=1,\ldots ,N_{\mathrm {gr}} ,&\ \gamma = 1,\ldots ,M_N&\\ \text {subjected to:}&\\ \sum _{N=1}^{N_{\mathrm {gr}}} V_N \left( \mathbf {F}_{N}-\bar{\mathbf {F}}\right)&= \mathbf {0} \ .&\end{aligned} \end{aligned}$$
(25)

Similar to the weak formulation of linear momentum presented in (17), the weak formulation (25) for the equation of kinematic compatibility leads to an over-determined system of equations, i.e., there are \(N_{\mathrm {int}} + 1\) distinct tensor-valued equations and only \(N_{\mathrm {gr}}\) tensor-valued unknowns. Observe that a trivial solution to this system of equations corresponds to a uniform state of deformation, \(\mathbf {F}_N = \bar{\mathbf {F}}\) in the grains \(N\), reflecting the well-known Taylor bound.

Note that the equilibrium and compatibility equations presented in Sects. 2.2 and 2.3 have been consistently obtained using the common assumption (5) for the test functions \(\mathbf {w}\). This is particularly relevant in view of the framework presented in the section below, which combines both sets of equations.

2.4 Formulation of the constrained minimization problem

By assuming grain-wise constant deformation gradients, see expression (5), the discrete form of the balance of linear momentum (17) leads to the uniform stress solution (Sachs bound), whereas the discrete form of the kinematic compatibility equation (25) provides the state of a uniform deformation gradient or strain (Taylor bound). The GGCM consists of finding solutions that simultaneously approximate these two conditions. A simple combination of the formulations (17) and (25) results in \(2 N_{\mathrm {int}} + N_{\mathrm {gr}} + 1\) distinct tensor-valued equations. The unknown variables in these equations are (i) the deformation gradients of the cluster’s grains, which can be collected in a set \(\mathcal {F}\) defined as

$$\begin{aligned} \mathcal {F} := \left\{ \mathbf {F}_N \right\} _{N=1\ldots ,N_{\mathrm {gr}}} \ , \end{aligned}$$
(26)

and (ii) the Lagrange multiplier \(\varvec{\varSigma }\). Correspondingly, there are \( N_{\mathrm {gr}} + 1\) (tensor-valued) unknowns, resulting in an over-determined system of equations. This is a consequence of the simplifying assumption of a grain-wise constant deformation gradient, which does not provide sufficient degrees of freedom for finding a solution that simultaneously satisfies the weak forms of (7) and (19). Therefore, a compromise between these requirements needs to be found, which is accomplished by using a minimization formulation that approximates (17)\(_{1,2}\) and (25)\(_{1}\) while enforcing the multiscale condition (17)\(_{3}\) (which is the same as (25)\(_{2}\)). For this purpose, a weighted scalar functional \(J\) is defined that depends on the variables \(\mathcal {F}\) and \(\varvec{\varSigma }\) as follows:

$$\begin{aligned} J(\mathcal {F}, \varvec{\varSigma }) := \alpha _1 J_1(\mathcal {F}) + \alpha _2 J_2 (\mathcal {F} ) + \alpha _3 J_3(\mathcal {F}, \varvec{\varSigma }) \end{aligned}$$
(27)

where \(\alpha _i\), with \(i=1,2,3\), are scalar weighting factors and

$$\begin{aligned}&J_{1}(\mathcal {F}) := \frac{1}{2 \beta ^2 A_{\mathrm {int}}} \sum _{N=1}^{N_{\mathrm {gr}}} \sum _{\gamma =1}^{M_{N}} A_{N,\gamma } \left\| \left( \mathbf {P}_{N} - \mathbf {P}_{N'} \right) \mathbf {n} \right\| ^{2}, \nonumber \\&J_{2}(\mathcal {F}) := \frac{1}{2 A_{\mathrm {int}}} \sum _{N=1}^{N_{\mathrm {gr}}} \sum _{\gamma =1}^{M_{N}}A_{N,\gamma } \left\| \mathbf {n} \times \left( \mathbf {F}_{N} - \mathbf {F}_{N'} \right) \right\| ^{2}, \nonumber \\&J_{3}(\mathcal {F},\varvec{\varSigma }) := \frac{1}{2 \beta ^2 V} \sum _{N=1}^{N_{\mathrm {gr}}} V_{N} \left\| \left( \mathbf {P}_{N} - \varvec{\varSigma }\right) \right\| ^{2} \ . \end{aligned}$$
(28)

Here, \(\Vert \cdot \Vert \) refers to the norm of the corresponding vector or tensor, i.e., for a vector \(\mathbf {a}\) with cartesian components \(a_i\), \(\Vert \mathbf a \Vert = \left( \sum _{i=1}^3 a_i^2 \right) ^{1/2}\) and for a second-order tensor \(\mathbf {A}\) with components \(A_{ij}\), \(\Vert \mathbf A \Vert = \left( \sum _{i,j=1}^3 A_{ij}^2 \right) ^{1/2}\).

The terms \(A_{\mathrm {int}}\) and \(V\) denote the total interfacial area and the total volume of the cluster \(\varOmega \), respectively, as expressed by

$$\begin{aligned} A_{\mathrm {int}} := \frac{1}{2} \sum _{N=1}^{N_{\mathrm {gr}}} \sum _{\gamma =1}^{M_{N}} A_{N,\gamma } \, , \quad V := \sum _{N=1}^{N_{\mathrm {gr}}} V_N \ . \end{aligned}$$
(29)

For the numerical implementation of the GGCM it is convenient to warrant that the stress-related terms, i.e., \(J_1\) and \(J_3\), and the term directly related to the deformation gradients, i.e., \(J_2\), are of the same order of magnitude. Thus, a scaling factor \(\beta \) (units of stress) is introduced in (28)\(_{1,3}\) in order to non-dimensionalize the stress terms \(J_{1}\) and \(J_{3}\) and achieve a proper scaling. In principle, the same goal may be realized with the weighting factors \(\alpha _i\); however, for presentation purposes it is convenient to work with nondimensional values for \(\alpha _i\).

The generalized grain cluster method can now be outlined as follows: For a given macroscopic deformation gradient \(\bar{\mathbf {F}}\) applied to a cluster of \(N = 1,\ldots , N_{\mathrm {gr}}\) polyhedral grains, each with volume \(V_N\) and connected to adjacent grains \(N'\) through interfaces of area \(A_{N,\gamma }\) with outward normal unit vectors \(\mathbf {n} = \mathbf {n}(N,\gamma )\), find the collection of deformation gradients \(\mathcal {F}^* = \left\{ \mathbf {F}^*_N \right\} _{N=1\ldots ,N_{\mathrm {gr}}}\) and the Lagrange multiplier \(\varvec{\varSigma }^*\) such that

$$\begin{aligned} \begin{aligned} J(\mathcal {F}^*, \varvec{\varSigma }^*) = \min _{\mathcal {F},\varvec{\varSigma }} J(\mathcal {F},\varvec{\varSigma }) \ , \\ \text {subject to}\quad \mathbf {C}\left( \mathcal {F},\bar{\mathbf {F}}\right) = \mathbf {0}, \end{aligned} \end{aligned}$$
(30)

with \(J\) given by (27) and (28) and the tensor-valued multiscale constraint \(\mathbf {C}\) given by

$$\begin{aligned} \mathbf {C}\left( \mathcal {F},\bar{\mathbf {F}}\right) := \sum _{N=1}^{N_{\mathrm {gr}}}\frac{V_{N}}{V}\mathbf {F}_{N}-\bar{\mathbf {F}} \ . \end{aligned}$$
(31)

The first Piola–Kirchhoff stress tensor \(\mathbf {P}_N\) in (28)\(_{1,3}\) is assumed to be determined by a (path-dependent) constitutive model of grain \(N\) that depends on the deformation gradient \(\mathbf {F}_N\) and a set of internal variables characterizing the inelastic response.

The solution to the constrained minimization problem summarized by expressions (30) and (31) depends on the specific choice of the weighting factors \(\alpha _i\), \(i=1,2,3\). In general, a range of solutions may be obtained that is bounded by the limit cases of a uniform stress and a uniform deformation gradient in the grain cluster. Accordingly, the GGCM should be equipped with a calibration procedure for determining the specific combination of weighting factors for which a close approximation of an accurate reference solution or an experimental response is found. This procedure will be discussed in more detail in Sect. 4. The numerical implementation of the GGCM is discussed in Sect. 3 below.

3 Numerical implementation

If the microscopic material behavior in the grain cluster is inelastic and thus path-dependent, its effective macroscopic response can be computed by incrementally loading the cluster from an initial state to the final state of deformation \(\bar{\mathbf {F}}\). Correspondingly, the loading process may be divided into discrete steps \(s = 1, \ldots , N_{\text {steps}}\), where \(N_{\text {steps}}\) represents the total number of steps. The initial state (\(s=0\)) typically corresponds to an unloaded configuration characterized by the macroscopic deformation gradient being equal to identity, \(\bar{\mathbf {F}}^{s=0} = \mathbf {I}\). The loading process can be parameterized by a scalar \(t^{s}\), which may be interpreted as the actual time for rate-dependent constitutive models, with \(\bar{\mathbf {F}}^{s}= \bar{\mathbf {F}} \left( t^{s} \right) \) reflecting the macroscopic loading at time \(t^{s}\) in a quasi-static process. Consider a given macroscopic loading increment expressed by the change in the deformation gradient going from step \(s\) to step \(s+1\),

$$\begin{aligned} \left( \Delta \bar{\mathbf {F}} \right) ^{s+1} := \bar{\mathbf {F}}^{s+1} - \bar{\mathbf {F}}^{s} \ . \end{aligned}$$
(32)

Denote by \(\left\{ \mathcal {F}, \varvec{\varSigma }, \varvec{\Xi }\right\} \) a microscopic state, where \(\varvec{\Xi } := \left\{ \varvec{\xi }_N \right\} _{N=1\ldots ,N_{\mathrm {gr}}}\) represents a collection of internal variables \(\varvec{\xi }_N\) of the inelastic constitutive model in grain \(N\). Starting from the last converged state \(\left\{ \mathcal {F}^{s}, \varvec{\varSigma }^{s}, \varvec{\Xi }^{s} \right\} \) corresponding to the macroscopic deformation gradient \(\bar{\mathbf {F}}^{s}\), the goal is to determine the state \(\left\{ \mathcal {F}^{s+1}, \varvec{\varSigma }^{s+1}, \varvec{\Xi }^{s+1} \right\} \) that minimizes \(J\) under the incremental deformation (32), subject to the multiscale constraint \(\mathbf {C}(\mathcal {F}^{s+1}, \bar{\mathbf {F}}^{s+1}) = \mathbf {0}\). Since the internal variables \(\varvec{\Xi }^{s+1}\) are determined from a user-defined, constitutive model, the task is to calculate the Lagrange multiplier \(\varvec{\varSigma }^{s+1}\) and a collection of deformation gradients \(\mathcal {F}^{s+1}\) that minimize \(J\). To this end, the gradients of \(J\) with respect to the Lagrange multiplier \(\varvec{\varSigma }\) and the deformation gradients \(\mathbf {F}_N\) (collected in the set \(\mathcal {F}\)) need to be computed, as described below.

3.1 Unconstrained gradient

For solving the constrained minimization problem (30), a simple constrained gradient descent method based on the computation of the gradient of the objective functional \(J\) is proposed. In this section the components of the unconstrained gradient of the objective functional are derived. Observe that, in view of (27) and (28), the symbolic expression for the unconstrained gradient is the same for all loading steps \(s\), hence the superindex \(s\) will be suppressed for notational simplicity.

Consider a generic grain \(K \in [1,\ldots ,N_{\mathrm {gr}}]\) and the corresponding deformation gradient \(\mathbf {F}_K\) with cartesian components \(\left( F_K\right) _{mn}\). Henceforth, implicit summation on repeated cartesian components \(i=1,2,3\) will be assumed. The derivatives of the terms composing \(J\) in (28) are as follows:

$$\begin{aligned} \begin{aligned} \frac{\partial {J_{1}}}{\partial \left( F_{K}\right) _{mn}}&= \frac{2}{\beta ^2 A_{\mathrm {int}}} \sum _{\gamma =1}^{M_{K}}A_{K,\gamma }\\&\quad \, \times \left( \left( P_{K}\right) _{ij} - \left( P_{K'}\right) _{ij} \right) n_{j} \frac{\partial \left( P_{K}\right) _{ik}}{\partial \left( F_{K}\right) _{mn}}n_{k} \ ,\\ \frac{\partial {J_{2}}}{\partial \left( F_{K}\right) _{mn}}&= \frac{2}{A_{\mathrm {int}}} \sum _{\gamma =1}^{M_{K}}A_{K,\gamma } \\&\quad \, \times \left( \left( F_{K}\right) _{ml} - \left( F_{K'}\right) _{ml} \right) \left( \delta _{nl}-n_{n}n_{l} \right) \ , \\ \frac{\partial {J_{3}}}{\partial \left( F_{K}\right) _{mn}}&= \frac{V_{K}}{\beta ^2 V} \left( \left( P_{K}\right) _{ij} - \varSigma _{ij}\right) \frac{\partial \left( P_{K}\right) _{ij}}{\partial \left( F_{K}\right) _{mn}} \ , \\&\frac{\partial {J_{3}}}{\partial \varSigma _{mn}} = -\frac{1}{\beta ^2 V} \sum _{N=1}^{N_{\mathrm {gr}}} V_{N} \left( \left( P_{N}\right) _{mn} - \varSigma _{mn}\right) \ , \end{aligned} \end{aligned}$$
(33)

where the identity \(\epsilon _{ikl} \epsilon _{ipn} = \delta _{kp} \delta _{ln} - \delta _{kn} \delta _{lp}\) was used to derive (33)\(_2\), with \(\delta _{ij}\) representing the Kronecker delta symbol. Observe that the factor \(2\) in front of \(\partial {J_1}/\partial \left( F_K\right) _{mn}\) and \(\partial {J_2}/\partial \left( F_K\right) _{mn}\) is related to the contributions from grains \(K'\) that are adjacent to \(K\). The tangential stiffness \(\partial \left( P_{K}\right) _{ij}/\partial \left( F_K\right) _{mn}\) is obtained from the constitutive model of grain \(K\), and can be calculated analytically and/or numerically, for example, through a numerical perturbation technique [28]. The constitutive model further provides the stress components \(\left( P_{K}\right) _{ij}\) and the internal variables by means of an incremental-iterative update scheme, such as a return mapping algorithm commonly used for classical plasticity models. Henceforth, it is assumed that for an arbitrary deformation gradient \(\left( F_K\right) _{mn}\) it is possible to compute \(\left( P_{K}\right) _{ij}\) and \(\partial \left( P_{K}\right) _{ij}/\partial \left( F_K\right) _{mn}\) from a user-defined constitutive model for grain \(K\).

It is convenient to enforce the necessary condition for a minimum of the objective functional \(J\) with respect to the Lagrange multiplier \(\varvec{\varSigma }\) from the outset, i.e., the derivative of \(J\) with respect to \(\varvec{\varSigma }\) is set to zero. Consequently, in view of (33)\(_4\), the Lagrange multiplier \(\varvec{\varSigma }\) that satisfies the necessary condition for a minimum of \(J\) is interpreted as the macroscopic stress, i.e.,

$$\begin{aligned} \varvec{\varSigma } = \sum _{N=1}^{N_{\mathrm {gr}}} \theta _N \mathbf {P}_{N} \ , \end{aligned}$$
(34)

where the scalars \(\theta _N\) correspond to the volume fractions of the grains:

$$\begin{aligned} \theta _N := \frac{V_N}{V} \quad N=1,\ldots ,N_{\mathrm {gr}} \ . \end{aligned}$$
(35)

It is worth pointing out that the Hill–Mandel condition, which refers to the consistency between the microscale power \(\sum _{N=1}^{N_{\mathrm {gr}}} \theta _N \mathbf {P}_{N} \cdot \dot{\mathbf {F}}_{N}\) and macroscale stress power \(\varvec{\varSigma } \cdot \dot{\bar{\mathbf {F}}}\), is in general only approximately satisfied, since neither the kinematical compatibility relation nor the equilibrium condition are exactly met in the present framework. A consequence of this is that the spatial average of the energy dissipated at the microscale is not equal to the energy dissipated in a material point at the macroscale, although the difference is expected to be small in general. In the limit cases of uniform deformation and uniform stress, the Hill–Mandel condition is satisfied, albeit at the expense of relaxing, respectively, the equilibrium condition and the kinematic compatibility relation.

Using (34) in (33) and in view of (27), the derivative of \(J\) with respect to the deformation gradient in a generic grain \(K\) becomes

$$\begin{aligned}&\frac{\partial J}{\partial \left( F_K\right) _{mn}} \nonumber \\&\quad =\frac{2 \alpha _1}{\beta ^2 A_{\mathrm {int}}} \sum _{\gamma = 1}^{M_K} A_{K,\gamma } \left( \left( P_{K}\right) _{ij} - \left( P_{K'}\right) _{ij} \right) n_j \left( \mathbb {A}_K\right) _{ikmn} n_k\nonumber \\&\qquad + \frac{2 \alpha _2}{A_{\mathrm {int}}} \sum _{\gamma = 1}^{M_K} A_{K,\gamma } \left( \left( F_K\right) _{ml} - \left( F_{K'}\right) _{ml} \right) \left( \delta _{nl} - n_n n_l \right) \nonumber \\&\qquad + \frac{\alpha _3 \theta _{K}}{\beta ^2} \left( \left( P_{K}\right) _{ij} - \varSigma _{ij} \right) \left( \mathbb {A}_K\right) _{ijmn} \ , \end{aligned}$$
(36)

with \(\varSigma _{ij}\) given by (34) and the material tangent stiffness \(\mathbb {A}_K\) of the \(K\)-th grain defined in cartesian components as

$$\begin{aligned} \left( \mathbb {A}_K\right) _{ijmn} := \frac{\partial \left( P_{K}\right) _{ij}}{\partial \left( F_K\right) _{mn}} \ . \end{aligned}$$
(37)

Observe that the tensor \(\mathbf {I} - \mathbf {n} \otimes \mathbf {n}\) (in components: \(\delta _{nl} - n_n n_l\)) appearing in the second term on the right hand side of (36) represents a projection (of the microscopic deformation gradient) onto a grain boundary with normal vector \(\mathbf {n}\); hence this term measures the relative difference in deformation at grain boundaries. Furthermore, the first term in (36) reflects the traction discontinuity across a grain boundary and the third term represents the difference between the microscopic stress in a grain and the macroscopic stress.

In the gradient descent method, the estimate of the deformation gradient is modified by an incremental amount in the opposite direction of the derivative (36) in order to minimize the jumps in traction and displacement across grain boundaries, as well as the deviation of the microscopic stresses from the macroscopic stress. However, this modification cannot be performed arbitrarily, as it is required that the average microscopic deformation gradient remains unconditionally equal to the macroscopic deformation gradient, see expression (30)\(_2\). Accordingly, in the next section a gradient descent direction is constructed that satisfies this multiscale constraint.

3.2 Constrained gradient

In view of the numerical implementation of the constrained gradient descent method, a matrix-vector notation is henceforth used, such that a single index \(Q\) is obtained from a combination of a grain index \(K\) and two cartesian indices \(m\) and \(n\), i.e.,

$$\begin{aligned} Q= & {} Q(K,m,n) \quad \text {with} \ Q=1,\ldots , 9 N_{\mathrm {gr}} \nonumber \\&\text {for} \quad K=1,\ldots , N_{\mathrm {gr}}, \quad m,n = 1,2,3 \ . \end{aligned}$$

Similarly, two cartesian indices \(m\) and \(n\) are combined into a single index \(q\) such that

$$\begin{aligned} q = q(m,n) \quad \text {with} \ q=1,\ldots , 9 \quad \text {for} \quad m,n = 1,2,3 \ . \end{aligned}$$

The index \(Q=1,\ldots ,9 N_{\mathrm {gr}}\) ranges over all degrees of freedom in the minimization problem while the index \(q=1,\ldots ,9\) ranges over all cartesian components of the deformation gradient. With this notational convention, the gradient of the objective functional can be collected in a vector \({{\mathbf {\mathsf{{g}}}}}\), for which the \(9 N_{\mathrm {gr}}\) components \(\mathsf g _Q\) are given by

$$\begin{aligned} \mathsf g _Q :=\frac{\partial J}{\partial \left( F_{K}\right) _{mn}} \quad \text {with} \quad Q = Q(K,m,n) \ . \end{aligned}$$

Similarly, denote as \(\bar{{{\mathbf {\mathsf{{f}}}}}}\) the vector representing the 9 components of the macroscopic deformation gradient \(\bar{\mathbf {F}}\) and denote as \({{\mathbf {\mathsf{{x}}}}}\) the vector representing the \({9 N_{\mathrm {gr}}}\) components of all the microscopic deformation gradients \(\mathbf {F}_K\) in the set \(\mathcal {F}\), i.e.,

$$\begin{aligned}&\bar{\mathsf{f }}_q := \bar{F}_{mn} \quad \text {and} \quad \mathsf {x}_Q := \left( F_K \right) _{mn}\nonumber \\&\quad \text {with} \quad q=q(m,n)\ , Q = Q(K,m,n) \ . \end{aligned}$$

Accordingly, the multiscale constraint (30)\(_2\) can be written as

$$\begin{aligned} {{\mathbf {\mathsf{{L}}}}}{{\mathbf {\mathsf{{x}}}}} = \bar{{{\mathbf {\mathsf{{f}}}}}} \, , \end{aligned}$$
(38)

where \({{\mathbf {\mathsf{{L}}}}}\) is a \(9 \times 9 N_{\mathrm {gr}}\) (non-square) matrix composed of a collection of \(N_{\mathrm {gr}}\) matrices, each of size \(9 \times 9\) and arranged as follows:

$$\begin{aligned} {{\mathbf {\mathsf{{L}}}}} := \left[ \theta _{1} {{\mathbf {\mathsf{{I}}}}} \ | \ \theta _{2}{{\mathbf {\mathsf{{I}}}}} \ | \ \ \ldots |\theta _{N_{\mathrm {gr}}}{{\mathbf {\mathsf{{I}}}}} \right] \ , \end{aligned}$$
(39)

where the scalars \(\theta _N\), with \(N=1,\ldots , N_{\mathrm {gr}}\), are the grain volume fractions defined in (35) and \({{\mathbf {\mathsf{{I}}}}}\) represents the \(9 \times 9\)-identity matrix. The matrix \({{\mathbf {\mathsf{{L}}}}}\) may be viewed as a “volume averaging” operator that maps a microscopic deformation state \({{\mathbf {\mathsf{{x}}}}}\) to the macroscopic deformation state \(\bar{{{\mathbf {\mathsf{{f}}}}}}\).

Projecting the gradient \({{\mathbf {\mathsf{{g}}}}}\) to the subspace characterized by the multiscale constraint (38) ensures that the gradient descent method preserves this constraint for all iterations at a given loading step. The projection may be achieved using a basis for the null space \(\mathcal {N}\left( {{\mathbf {\mathsf{{L}}}}}\right) \) of the matrix \({{\mathbf {\mathsf{{L}}}}}\). In view of (39), it can be shown that the null space \(\mathcal {N}\left( {{\mathbf {\mathsf{{L}}}}}\right) \) has dimension \(9N_{\mathrm {gr}}-9\). The gradient descent direction is thus obtained by first computing the tangent \({{\mathbf {\mathsf{{g}}}}}\) according to (36) and then projecting it onto \(\mathcal {N}\left( {{\mathbf {\mathsf{{L}}}}}\right) \). However, instead of working with the null space directly, it is convenient to operate first in the subspace that is the orthogonal complement of the null space, since this subspace has a dimension that is generally far less than the dimension of the null space itself, i.e., 9 instead of \(9N_{\mathrm {gr}}-9\). Because the projection has to be performed for every newly calculated tangent vector \({{\mathbf {\mathsf{{g}}}}}\) in each iteration, for the efficiency of the computations it is preferable to first project the tangent vector \({{\mathbf {\mathsf{{g}}}}}\) to the complementary subspace \(\mathcal {N}\left( {{\mathbf {\mathsf{{L}}}}}\right) ^{\perp }\) and then subtract this result from the tangent \({{\mathbf {\mathsf{{g}}}}}\). An orthonormal basis for \(\mathcal {N}\left( \mathbf {L}\right) ^{\perp }\) can be constructed by taking the first nine left-singular vectors obtained from the singular value decomposition of \({{\mathbf {\mathsf{{L}}}}}^{\mathrm {T}}{{\mathbf {\mathsf{{L}}}}}\). Consequently, the projected gradient descent direction, denoted as \({{\mathbf {\mathsf{{g}}}}}^{\mathrm {p}}\), is calculated in accordance with:

$$\begin{aligned} {{\mathbf {\mathsf{{g}}}}}^{\mathrm {p}} := {{\mathbf {\mathsf{{g}}}}} - \sum _{q=1}^{9} \left( {{\mathbf {\mathsf{{g}}}}} \cdot {{\mathbf {\mathsf{{u}}}}}_{q} \right) {{\mathbf {\mathsf{{u}}}}}_{q} \, . \end{aligned}$$
(40)

Here, \({{\mathbf {\mathsf{{u}}}}}_{q}\), with \(q=1,\ldots ,9\), are the first nine left-singular vectors of the matrix \({{\mathbf {\mathsf{{U}}}}}\), as obtained from the singular-value decomposition of \({{\mathbf {\mathsf{{L}}}}}^{\mathrm {T}} {{\mathbf {\mathsf{{L}}}}}\), i.e.,

$$\begin{aligned} {{\mathbf {\mathsf{{U}}}}} {{\mathbf {\mathsf{{D}}}}} {{\mathbf {\mathsf{{V}}}}}^{\mathrm {T}} = \mathrm {svd}\left( {{\mathbf {\mathsf{{L}}}}}^{\mathrm {T}}{{\mathbf {\mathsf{{L}}}}}\right) , \end{aligned}$$

with \({{\mathbf {\mathsf{{D}}}}} \) being the diagonal matrix of singular values and \({{\mathbf {\mathsf{{V}}}}}\) the matrix of right-singular vectors. Because the multiscale constraint (38) is linear, the unit vectors \({{\mathbf {\mathsf{{u}}}}}_{q}\left( q=1,\ldots ,9\right) \) can be expressed in closed-form, such that the components of the projected gradient follow as:

$$\begin{aligned} \mathsf {g}_Q^{\mathrm {p}} = \frac{\partial J}{\partial \left( F_K\right) _{mn}} - \frac{\theta _K}{{\hat{\theta }}^2} H_{mn} \quad \text {with} \quad Q = Q(K,m,n),\nonumber \\ \end{aligned}$$
(41)

where

$$\begin{aligned} H_{mn} := \sum _{N=1}^{N_\mathrm {gr}} \theta _N \frac{\partial J}{\partial \left( F_N\right) _{mn}} \quad \text {and} \quad \hat{\theta } := \left( \sum _{N=1}^{N_\mathrm {gr}} \theta _N^2 \right) ^{\frac{1}{2}} \ . \end{aligned}$$
(42)

The tensor \(\mathbf {H}\) with cartesian components \(H_{mn}\) defined in (42) represents the volume average of the gradient of \(J\), while the factor \(\hat{\theta }\) reflects the \(L_2\) norm of the volume fractions of the grains.

3.3 Constrained gradient descent algorithm

Suppose that the converged state at loading step \(s\) has been determined and let \({{\mathbf {\mathsf{{x}}}}}^{s}\) denote the corresponding vector of deformation gradients in the grains. If the constitutive model of a grain uses internal variables, it is assumed that these were determined in the convergence process of the vector \({{\mathbf {\mathsf{{x}}}}}^s\) by means of an incremental (iterative) update algorithm at the grain level. This update algorithm also provides the converged stress \(\mathbf {P}_{N}^{s}\) in the grain, which, in view of (34), results in the update of the macroscopic stress \(\mathbf {\varSigma }^s\). In summary, \({{\mathbf {\mathsf{{x}}}}}^{s}\) may thus be formally interpreted as the vector with the main state variables in the grains of the cluster, for which the corresponding stress and history variables at the grain level are computed through a user-defined constitutive model. To determine the converged state \({{\mathbf {\mathsf{{x}}}}}^{s+1}\) at loading step \(s+1\), the time-like parameter is incremented from \(t^s\) to \(t^{s+1}\) by a sufficiently small time step \(\Delta t^{s+1}\), and the corresponding macroscopic deformation gradient is incremented from \(\bar{{{\mathbf {\mathsf{{f}}}}}}^{s}\) to \(\bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1}\). Starting from an initial estimate \({{\mathbf {\mathsf{{x}}}}}^{s+1,0}\), the vector of deformation gradients is updated from iteration \(i\) to iteration \(i+1\) by using the projected gradient \({{\mathbf {\mathsf{{g}}}}}^{\mathrm {p}}\) as

$$\begin{aligned} {{\mathbf {\mathsf{{x}}}}}^{s+1,i+1} = {{\mathbf {\mathsf{{x}}}}}^{s+1,i} - \omega \left( {{\mathbf {\mathsf{{g}}}}}^{\mathrm {p}} \right) ^{i} \ \quad i=0,1,\ldots \ . \end{aligned}$$
(43)

In (43), the scalar \(\omega > 0\) is a suitably-chosen step size that, for simplicity, is assumed to be constant at a given loading step. The value of \(\omega \) can be chosen such that the magnitude of \(\omega \left( {{\mathbf {\mathsf{{g}}}}}^{\mathrm {p}} \right) ^{i}\) is a fraction of the magnitude of \({{\mathbf {\mathsf{{x}}}}}^{s}\). The projected gradient is given in components in (41), and is computed from the unconstrained gradient \(\left( {{\mathbf {\mathsf{{g}}}}} \right) ^{i}\), given in components in (36) and evaluated at \({{\mathbf {\mathsf{{x}}}}}^{s+1,i}\).

The linearity of the constrained subspace characterized by (38) ensures that the projected gradient always lies within this subspace, and that all estimates \({{\mathbf {\mathsf{{x}}}}}^{s+1,i},\) with \(i=0,1,\ldots \), satisfy the multiscale kinematic constraint (38), independent of the magnitude of the projected gradient or the value of the step size. The estimates are iteratively updated until a convergence criterion is satisfied, as represented by the objective functional \(J\) reaching a minimum within a prescribed tolerance \(\varepsilon \):

$$\begin{aligned} \frac{| J^{i+1} - J^{i} |}{J^{i}} < \varepsilon \ . \end{aligned}$$
(44)

Alternatively, or as a complementary check, the relative magnitude of the projected gradient can be monitored at a given iteration, where at a converged state it should be confirmed that

$$\begin{aligned} \frac{\left\| \left( {{\mathbf {\mathsf{{g}}}}}^{\mathrm {p}} \right) ^{i+1} \right\| }{ \left\| \left( {{\mathbf {\mathsf{{g}}}}}^{\mathrm {p}} \right) ^{0} \right\| } \ll 1 \ . \end{aligned}$$
(45)

with \(\left( {{\mathbf {\mathsf{{g}}}}}^{\mathrm {p}} \right) ^{0}\) the projected gradient at the onset of the iterative process. If the convergence criterion (44) is not satisfied after a certain number of iterations, the time step \(\Delta t^s\) must be reduced and the update algorithm needs to be restarted at the last converged loading step.

The procedure indicated above is repeated for all loading steps \(s\) until the imposed macroscopic deformation gradient \(\bar{{{\mathbf {\mathsf{{f}}}}}}^{s}\) reaches its final value. To start the constrained gradient descent method at a new loading step \(s+1\), it is required to specify an initial estimate \({{\mathbf {\mathsf{{x}}}}}^{s+1,0}\) for the vector of microscopic deformation gradients. This issue deserves special attention and is discussed in the following section.

3.4 Loading step increment satisfying the multiscale kinematic constraint

Moving from loading step \(s\) to \(s+1\), an initial estimate \({{\mathbf {\mathsf{{x}}}}}^{s+1,0}\) for the microscale deformation gradients needs to be specified, with the superscript ’\(0\)’ indicating the onset of the iterative process. This initial estimate should satisfy the multiscale kinematic constraint, i.e.,

$$\begin{aligned} {{\mathbf {\mathsf{{L}}}}}{{\mathbf {\mathsf{{x}}}}}^{s+1,0} = \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1}, \end{aligned}$$
(46)

where \(\bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1}\) represents the vectorized form of the macroscopic deformation gradient \(\bar{\mathbf {F}}^{s+1}\) at time \(t^{s+1}\). The system of equations (46) has more equations than unknown variables and lacks a unique solution. In order to find an accurate, approximate solution to this system of equations, the initial estimate for the microscopic deformation gradients is expressed as

$$\begin{aligned} {{\mathbf {\mathsf{{x}}}}}^{s+1,0} = {{\mathbf {\mathsf{{x}}}}}^{s} + {{\mathbf {\mathsf{{d}}}}}^{s+1} \end{aligned}$$
(47)

where \({{\mathbf {\mathsf{{x}}}}}^{s}\) denotes the converged solution at the previous loading step \(s\) and \({{\mathbf {\mathsf{{d}}}}}^{s+1}\) is a vector of \(9 N_{\mathrm {gr}}\) components representing an initial estimate for the incremental microscopic deformation gradients. Substituting (47) in (46) and using the fact that the converged solution \({{\mathbf {\mathsf{{x}}}}}^{s}\) at step \(s\) meets the constraint \({{\mathbf {\mathsf{{L}}}}} {{\mathbf {\mathsf{{x}}}}}^{s} = \bar{{{\mathbf {\mathsf{{f}}}}}}^{s}\), it follows that

$$\begin{aligned} {{\mathbf {\mathsf{{L}}}}} {{\mathbf {\mathsf{{d}}}}}^{s+1} = \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1} \quad \text {with} \quad \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1} := \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1} - \bar{{{\mathbf {\mathsf{{f}}}}}}^{s} \ . \end{aligned}$$
(48)

From (48), the solution \({{\mathbf {\mathsf{{d}}}}}^{s+1}\) may be generally expressed as

$$\begin{aligned} {{\mathbf {\mathsf{{d}}}}}^{s+1} = \tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1} - {{\mathbf {\mathsf{{L}}}}}^{+} \left( {{\mathbf {\mathsf{{L}}}}} \tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1} - \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1} \right) \end{aligned}$$
(49)

where \(\tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) is an arbitrarily-chosen vector of \(9 N_{\text {gr}}\) components and \({{\mathbf {\mathsf{{L}}}}}^{+} \) is the (right) Moore–Penrose pseudo-inverse of \({{\mathbf {\mathsf{{L}}}}}\). Since the rows of \({{\mathbf {\mathsf{{L}}}}}\) are linearly independent, the pseudo-inverse is given by

$$\begin{aligned} {{\mathbf {\mathsf{{L}}}}}^{+} := {{\mathbf {\mathsf{{L}}}}}^{\mathrm {T}}\left( {{\mathbf {\mathsf{{L}}}}}{{\mathbf {\mathsf{{L}}}}}^{\mathrm {T}}\right) ^{-1} \ . \end{aligned}$$
(50)

The averaging operator \({{\mathbf {\mathsf{{L}}}}}\) presented in (39) has a relatively simple form, as a result of which the pseudo-inverse \({{\mathbf {\mathsf{{L}}}}}^{+}\) can be derived explicitly. This results in a \(9 N_{\mathrm {gr}} \times 9 \) matrix composed of a collection of \(N_{\mathrm {gr}}\) matrices, each of size \(9 \times 9\) and arranged as follows:

$$\begin{aligned} {{\mathbf {\mathsf{{L}}}}}^{+} = \frac{1}{\hat{\theta }^2} \left[ \begin{array}{c} \theta _{1} {{\mathbf {\mathsf{{I}}}}} \\ \theta _{2} {{\mathbf {\mathsf{{I}}}}} \\ \vdots \\ \theta _{N_{\mathrm {gr}}} {{\mathbf {\mathsf{{I}}}}} \end{array} \right] \ , \end{aligned}$$
(51)

where, as before, the scalars \(\theta _N\) are the grain volume fractions, \({{\mathbf {\mathsf{{I}}}}}\) represents the \(9 \times 9\)-identity matrix and \(\hat{\theta }\) is given by (42)\(_2\). It can be confirmed that inserting (49) into (47), followed by multiplying the result by \({{\mathbf {\mathsf{{L}}}}}\) and invoking (50) and (48), indeed leads to the multiscale kinematic constraint (46) for the initial estimate \({{\mathbf {\mathsf{{x}}}}}^{s+1,0}\) of the microscale deformation gradients.

In principle, one may choose any vector \(\tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) in (49) to obtain an increment \({{\mathbf {\mathsf{{d}}}}}^{s+1}\) that can in turn be used in (47) to generate an initial estimate for \({{\mathbf {\mathsf{{x}}}}}^{s+1,0}\) in the constrained minimization procedure. However, since the material response typically is path-dependent, it may be expected that the performance of the update algorithm will significantly depend on this initial estimate. Hence, it is critical to make a judicious choice for \(\tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) in (49), such that the corresponding initial value \({{\mathbf {\mathsf{{x}}}}}^{s+1,0}\) is located close to the final value obtained after reaching the convergence criterion (44). Accordingly, it is convenient to impose conditions on \(\tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\), under which at the grain boundaries the kinematic compatibility and/or traction continuity requirements are approximately satisfied, see Sect. 3.5 for examples. Furthermore, to effectively transfer the properties of \(\tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) to \({{\mathbf {\mathsf{{d}}}}}^{s+1}\), the term \({{\mathbf {\mathsf{{L}}}}} \tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1} - \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1}\) in (49) must be as small as possible, whereby \({{\mathbf {\mathsf{{d}}}}}^{s+1} \approx \tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\). This requirement can be satisfied by determining the vector \(\tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) from a uniform scaling relation, i.e.,

$$\begin{aligned} \tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1} = \eta ^{s+1} \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1} \ . \end{aligned}$$
(52)

Here, \(\eta ^{s+1}\) is a scaling factor and \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) is an auxiliary vector, for which four specific options are discussed in Sect. 3.5. After inserting (52) in (49), it follows that the \(L_2\) norm of the vector \({{\mathbf {\mathsf{{L}}}}} \eta ^{s+1} \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1} - \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1}\) needs to be minimized for meeting the condition \({{\mathbf {\mathsf{{d}}}}}^{s+1} \approx \tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\). This simply leads to the following expression for the scaling factor:

$$\begin{aligned} \eta ^{s+1} = \frac{ \left( {{\mathbf {\mathsf{{L}}}}} \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1} \right) ^{\mathrm {\,T}} \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1}}{\left( {{\mathbf {\mathsf{{L}}}}} \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1} \right) ^{\mathrm {\,T}} {{\mathbf {\mathsf{{L}}}}} \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1} } \ . \end{aligned}$$
(53)

A schematic representation of the above incremental/iterative method is shown in Fig.  3. Starting from the converged state \({{\mathbf {\mathsf{{x}}}}}^{s}\) at loading step \(s\) and a specific choice for the auxiliary vector \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\), the scaling factor \(\eta ^{s+1}\) and the vector \(\tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) are computed from (53) and (52), respectively, the vector \({{\mathbf {\mathsf{{d}}}}}^{s+1}\) containing the initial incremental deformation gradients is computed from (49) and the initial estimate \({{\mathbf {\mathsf{{x}}}}}^{s+1,0}\) for the deformation gradients is determined from (47). Observe that \({{\mathbf {\mathsf{{x}}}}}^{s+1,0}\) lies within the ”feasible solution space”, as characterized by the space containing the vectors \({{\mathbf {\mathsf{{x}}}}}\) that satisfy the multiscale kinematic constraint \({{\mathbf {\mathsf{{L}}}}} {{\mathbf {\mathsf{{x}}}}} = \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1}\). Subsequently the constrained gradient is computed from (41) and the estimate is updated according to (43) until it converges to the final solution \({{\mathbf {\mathsf{{x}}}}}^{s+1}\) within the feasible solution space.

Fig. 3
figure 3

Schematic representation of the initial estimate \({{\mathbf {\mathsf{{x}}}}}^{s+1,0}\) of the vector of deformation gradients at loading step \(s+1\), and the subsequent constrained gradient descent method. Points in the domain on the left represents schematically the collection of microscale deformation gradients for the grain cluster, which are mapped through the averaging operator \({{\mathbf {\mathsf{{L}}}}}\) to the space of macroscale deformation gradients on the right

3.5 Possible estimates for the initial deformation gradient increment

For an optimal performance of the constrained minimization algorithm visualized in Fig.  3, it is critical to choose an appropriate estimate for the vector containing the increments in the deformation gradient \(\tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\), which directly depends on the auxiliary vector \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) through expression (52). Accordingly, four options for \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) are discussed below.

3.5.1 Initial estimate based on uniform deformation gradient increment

A possible choice for \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) is to assume that all grains deform in accordance with the macroscopic increment of the deformation gradient from step \(s\) to step \(s+1\). In components, this choice is given by

$$\begin{aligned}&\left( \hat{\mathsf {d}}^{s+1} \right) _Q = \Delta \bar{F}_{mn}^{s+1} \nonumber \\&\qquad \quad \text {with} \quad Q = Q(K,m,n), \quad K = 1,\ldots , N_{\mathrm {gr}}. \end{aligned}$$
(54)

With this particular choice it can be easily verified that

$$\begin{aligned} {{\mathbf {\mathsf{{L}}}}} \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1} - \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1} = \mathbf {0} \ , \end{aligned}$$

from which it follows from (53) that the scaling factor \(\eta ^{s+1} = 1\), and from (49) that the corresponding vector \({{\mathbf {\mathsf{{d}}}}}^{s+1}\) is given by

$$\begin{aligned} {{\mathbf {\mathsf{{d}}}}}^{s+1} = \tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1} = \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\ . \end{aligned}$$
(55)

The incremental estimate indicated in (54) and (55) is the exact incremental solution to the limit case of a uniform deformation gradient (Taylor bound), for which \(\alpha _1=\alpha _3=0\) in expression (27). Hence, it will serve as a good estimate for cases where the weighting factors \(\alpha _1\) and \(\alpha _3\) are relatively small compared to \(\alpha _2\), but generally will not provide a suitable initial estimate if these weighting factors are relatively large and the limit case of uniform stress (Sachs bound) is approached.

3.5.2 Initial estimate based on uniform stress increment

An alternative way to obtain an initial estimate for \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) is to incrementally deform the grains in accordance with a uniform stress increment. This initial estimate satisfies the equilibrium conditions, and therefore would be particularly useful if the contribution to \(J\) by the equilibrium conditions (reflected by the weighting factors \(\alpha _{1}\) and \(\alpha _{3}\) in (27)) has a higher importance than the kinematic compatibility condition (reflected by the weighting factor \(\alpha _{2}\) in (27)). Suppose that from step \(s\) to step \(s+1\) the average stress tensor in the grain cluster increases from \(\varvec{\varSigma }^{s}\) to \(\varvec{\varSigma }^{s+1}\). Although the stress \(\varvec{\varSigma }^{s+1}\) is unknown at the beginning of step \(s+1\), it can be estimated based on temporarily assuming that all grains are subjected to one and the same increment of the deformation gradient, \(\Delta \bar{\mathbf {F}}^{s+1}\). For computing the corresponding increment in macroscopic stress, \(\varvec{\varSigma }^{s+1}\), a frame-indifferent stress measure based on the Lie derivative of the first Piola–Kirchhoff stress is considered. The Lie derivative \(\mathring{\mathbf {P}}\) of the first Piola–Kirchhoff stress \(\mathbf {P}\) is given as \(\mathring{\mathbf {P}} = \mathbf {F} \dot{\mathbf {S}}\), where \(\mathbf {S} = \mathbf {F}^{-1} \mathbf {P}\) is the second Piola–Kirchhoff stress and \(\dot{\mathbf {S}}\) denotes its invariant material time derivative. Correspondingly, the Lie derivative of \(\mathbf {P}\) may be formulated as

$$\begin{aligned} \mathring{\mathbf {P}} = \dot{\mathbf {P}} - \dot{\mathbf {F}} \mathbf {F}^{-1} \mathbf {P} \, , \end{aligned}$$
(56)

where the superimposed dot indicates a material time derivative. In analogy with this expression, for a sufficiently small load increment and under the assumption of an equal deformation increment in the grains, the initial estimate of the macroscopic stress increment from step \(s\) to step \(s+1\), denoted as \(\Delta \varvec{\varSigma }^{s+1,0}\), may be computed as

$$\begin{aligned}&{\Delta }{\varvec{\varSigma }}^{s+1,0} := \bar{\mathbb {A}}^{s} \Delta \bar{\mathbf {F}}^{s+1} - \Delta \bar{\mathbf {F}}^{s+1} \left( \bar{\mathbf {F}}^{s+1} \right) ^{-1} \varvec{\varSigma }^{s}\nonumber \\&\quad \text {with} \quad \bar{\mathbb {A}}^{s} := \sum _{N=1}^{N_{\mathrm {gr}}} \theta _N \mathbb {A}_N^{s} \, . \end{aligned}$$
(57)

Here, \(\bar{\mathbb {A}}^{s}\) is the volume average of the cluster’s material tangent stiffness, \(\mathbb {A}_N^{s}\) is the material tangent stiffness of grain \(N\) and \(\varvec{\varSigma }^{s}\) is the macroscopic first Piola–Kirchhoff stress, all evaluated at the last converged step \(s\). The estimate of the uniform stress increment, (57), can now be used to compute an initial estimate of the corresponding non-uniform increments in the deformation gradient of the grains, \(\Delta \mathbf {F}^{s+1,0}_K\), by formulating a relation similar to (57) for each specific grain \(K\):

$$\begin{aligned}&\left( \mathbb {A}_K^{s}\right) \Delta \mathbf {F}^{s+1,0}_K - \Delta \mathbf {F}^{s+1,0}_K \mathbf {S}_K^{s} = \Delta \varvec{\varSigma }^{s+1,0} \nonumber \\&\quad \text {with} \quad \mathbf {S}_K^{s} := \left( \mathbf {F}^{s}_K \right) ^{-1} \mathbf {P}_K^{s} \quad K = 1,\ldots , N_{\mathrm {gr}} \ . \end{aligned}$$
(58)

Observe that in the above expression the second Piola–Kirchhoff stress in the grains, \(\mathbf {S}_K^s\), is known from the last converged loading step \(s\). Hence, (58) represents a linear system of equations in terms of the initial values of the deformation gradients in the grains, \(\Delta \mathbf {F}^{s+1,0}_K\), i.e., one set of 9 linear equations for each grain \(K\). For computing the numerical solution of this system of equations, the vectors \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}_K\) and \({{\mathbf {\mathsf{{p}}}}}^{s+1,0}\) are invoked, each composed of 9 components. These vectors incorporate, respectively, the initial estimate of the incremental deformation gradient in grain \(K\) and the estimate of the increment in the macroscopic first Piola–Kirchhoff stress, i.e.,

$$\begin{aligned}&\left( \hat{\mathsf {d}}^{s+1}_K \right) _q := \left( \Delta F_K^{s+1,0} \right) _{mn} \, ; \quad \left( \mathsf {p}^{s+1,0} \right) _q := \Delta \varSigma _{mn}^{s+1,0} \nonumber \\&\quad \text {with} \quad q \!=\! q(m,n), \quad \text {for} \quad K \!=\! 1,\ldots , N_{\mathrm {gr}} \ . \end{aligned}$$
(59)

Similarly, define the \(9 \times 9\) material tangent stiffness matrix \({{\mathbf {\mathsf{{A}}}}}^{s}_K\) of grain \(K\) (which corresponds to the converged solution at the previous step \(s\)) as

$$\begin{aligned}&\left( \mathsf {A}^{s}_K \right) _{pq} := \left( \mathbb {A}_K^{s}\right) _{ijkl} - \delta _{ik} \left( S_K^{s}\right) _{lj} \nonumber \\&\quad \text {with} \quad \! p \!=\! p(i,j), \ q \!=\! q(k,l) \quad \! \text {for} \quad K \!=\! 1,\ldots , N_{\mathrm {gr}}, \end{aligned}$$
(60)

where \(\mathbb {A}_K^{s}\) and \( \mathbf {S}_K^{s}\) are the material tangent stiffness and the second Piola–Kirchhoff stress in the \(K\)-th grain, respectively, and \(\delta _{ik}\) reflects the Kronecker delta symbol. Employing the notation in (60), the system of equations (58) can be expressed in vector-matrix form as

$$\begin{aligned}&{{\mathbf {\mathsf{{A}}}}}^{s}_K \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}_K = {{\mathbf {\mathsf{{p}}}}}^{s+1,0}\nonumber \\&\quad \text {for} \quad K = 1,\ldots , N_{\mathrm {gr}} \quad \text {(no sum on} K) \ . \end{aligned}$$
(61)

The matrix \({{\mathbf {\mathsf{{A}}}}}^{s}_K \) is non-singular for the subspace of (vectorized) symmetric tensors but is singular for the subspace of (vectorized) skew-symmetric tensors. To circumvent this singularity, the general solution of (61) is formulated as

$$\begin{aligned} \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}_K = \tilde{\tilde{{{\mathbf {\mathsf{{d}}}}}}}^{s+1}_K - \left( {{\mathbf {\mathsf{{A}}}}}^{s}_K \right) ^{+} \left( {{\mathbf {\mathsf{{A}}}}}^{s}_K \tilde{\tilde{{{\mathbf {\mathsf{{d}}}}}}}^{s+1}_K - {{\mathbf {\mathsf{{p}}}}}^{s+1,0} \right) \end{aligned}$$
(62)

where \(\left( {{\mathbf {\mathsf{{A}}}}}^{s}_K \right) ^{+}\) is the pseudo-inverse of \({{\mathbf {\mathsf{{A}}}}}^{s}_K \) and \(\tilde{\tilde{{{\mathbf {\mathsf{{d}}}}}}}^{s+1}_K\) is an arbitrarily-chosen vector of 9 components. For definiteness, the components of the vector \(\tilde{\tilde{{{\mathbf {\mathsf{{d}}}}}}}^{s+1}_K\) are chosen as zero, which simplifies (62) into

$$\begin{aligned} \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}_K = \left( {{\mathbf {\mathsf{{A}}}}}^{s}_K \right) ^{+} {{\mathbf {\mathsf{{p}}}}}^{s+1,0} \ . \end{aligned}$$
(63)

The grain cluster vector \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) of \(9 N_{\mathrm {gr}}\) components can now be straightforwardly assembled from the grain level vectors \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}_K\) for the \(K = 1,\ldots , N_{\mathrm {gr}}\) grains. Subsequently, the initial estimate of the deformation gradients \({{\mathbf {\mathsf{{x}}}}}^{s+1,0}\) follows from (53), (52), (49) and (47). Clearly, the determination of the uniform stress initial increment is computationally more demanding than the computation of the uniform deformation gradient initial increment presented in Sect. 3.5.1. Nonetheless, as will be demonstrated in detail in Sect. 4, the uniform stress initial increment has the advantage that it provides an adequate prediction for a wide range of material responses and weighting factors \(\alpha _i\).

3.5.3 Initial estimate based on previous loading steps

Another option for the calculation of the initial estimate \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) is to use the history of converged solutions at previous loading steps. Particularly, one could straightforwardly compute \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) from the difference of the converged solutions at steps \(s\) and \(s-1\), i.e.,

$$\begin{aligned} \hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1} = {{\mathbf {\mathsf{{x}}}}}^{s} - {{\mathbf {\mathsf{{x}}}}}^{s-1} \ . \end{aligned}$$
(64)

Substituting (64) in (53) and noting that \({{\mathbf {\mathsf{{L}}}}} ({{\mathbf {\mathsf{{x}}}}}^{s} - {{\mathbf {\mathsf{{x}}}}}^{s-1}) = \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s}\), it follows that

$$\begin{aligned} \eta ^{s+1}= & {} \frac{\left( \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s} \right) ^{\mathrm {\,T}} \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1}}{\left( \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s} \right) ^{\mathrm {\,T}} \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s} }, \nonumber \\&\text {with} \quad \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1} := \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1} - \bar{{{\mathbf {\mathsf{{f}}}}}}^{s} \ , \quad \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s} := \bar{{{\mathbf {\mathsf{{f}}}}}}^{s} - \bar{{{\mathbf {\mathsf{{f}}}}}}^{s-1}.\nonumber \\ \end{aligned}$$
(65)

The initial guess based on the loading history is proposed here because of its simplicity, and because it may provide an accurate and efficient prediction for material systems subjected to proportional loading. Under the latter condition the vectors \(\Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1}\) and \(\Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s}\) are parallel with respect to each other, whereby \(\eta ^{s+1}\) in (65) becomes equal to the relative change in loading magnitude going from step \(s\) to step \(s+1\). However, under strongly non-proportional loading the initial estimate (64) should be treated with care: Note that in the extreme case a load increment \(\Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1}\) may be specified such that \(\left( \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s} \right) ^{\mathrm {\,T}} \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1} = 0\), for which \(\eta ^{s+1}\) becomes zero and the current estimate would not be applicable.

figure a

3.5.4 Initial estimate based on the null vector

The last option presented in this section is included for completeness, and corresponds mathematically to the most basic choice for \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\), namely the null vector. Although from (53) it may be concluded that for this case the scaling factor \(\eta ^{s+1}\) is not defined, in view of (52) it follows that \(\tilde{{{\mathbf {\mathsf{{d}}}}}}^{s+1} = \mathbf {0}\). Correspondingly, from (49) the vector \({{\mathbf {\mathsf{{d}}}}}^{s+1}\) simply becomes

$$\begin{aligned} {{\mathbf {\mathsf{{d}}}}}^{s+1} = {{\mathbf {\mathsf{{L}}}}}^{+} \Delta \bar{{{\mathbf {\mathsf{{f}}}}}}^{s+1} \ . \end{aligned}$$
(66)

Using expression (51), expression (66) can be written in components as

$$\begin{aligned} \left( \mathsf {d}^{s+1} \right) _Q= & {} \frac{\theta _K}{\hat{\theta }^2} \Delta \bar{F}_{mn}^{s+1}\nonumber \\&\text {with} \quad Q = Q(K,m,n), \quad K = 1,\ldots , N_{\mathrm {gr}},\nonumber \\ \end{aligned}$$
(67)

where \(\hat{\theta }\) is given by (42)\(_2\). Note that this initial estimate is based on the grain size only, i.e., the increment in the deformation gradient for a grain \(K\) scales proportionally with the grain volume fraction \(\theta _K\), see (67). Preliminary numerical tests not presented here have indicated that such an initial estimate, though simple to determine, may result in an inconvenient starting point for the incremental-iterative update algorithm, and therefore in a (very) poor convergence behavior.

3.6 Overview of GGCM algorithm

The incremental-iterative update algorithm for the GGCM is summarized in Algorithm 1. The algorithm is based upon the uniform stress initial increment presented in Sect. 3.5.2; the implementations of the three alternative initial estimates presented in Sect. 3.5 occurs in a similar fashion, but are omitted here for brevity reasons. A detailed analysis of the performance of the different initial estimates is provided in Sect. 4.3.

It is worth pointing out that user-defined constitutive models may provide tangent stiffnesses based on stress and deformation measures different than the first Piola–Kirchhoff stress and the deformation gradient used in expression (37). However, these material tangent stiffnesses may be converted to this format using push-forward and pull-backward relations presented in the literature, see, e.g., [25, 28]. Furthermore, the parameters \(\alpha _1, \alpha _2, \alpha _3, \beta \), \(\omega \) and the time step \(\Delta t\) require a calibration procedure, such as that described in Sect. 4.5. Representative values of these parameters are used in the numerical examples treated in the section below.

4 Simulations of clusters of multiphase materials

4.1 Preliminaries

A series of simulations involving microstructures typically found in low-alloyed multiphase steels is presented in this section in order to illustrate important features of the generalized grain cluster method. These microstructures consist of an aggregate of ferritic grains (primary phase) and metastable retained austenitic grains (secondary phase). Under mechanical loading, the austenitic grains may partially or totally transform into a third phase, called martensite. The crystal plasticity model presented in [32] and extended in [33] and [42] is used to simulate the elasto-plastic deformation in the ferritic grains. This model is suitable for single-crystal grains undergoing plastic deformation along slip systems in a body-centered cubic lattice structure. The model incorporates the asymmetry typically observed in the twinning and anti-twinning directions. For incorporating the crystal plasticity model in the GGCM, the deformation gradient \(\mathbf {F}_{K}\), for each ferritic grain \(K\), is decomposed as

$$\begin{aligned} \mathbf {F}_K = \mathbf {F}^{\mathrm {e}}_K \mathbf {F}^{\mathrm {p}}_K \end{aligned}$$

where \(\mathbf {F}^{\mathrm {e}}_K\) represents the elastic part of the deformation gradient and \(\mathbf {F}^{\mathrm {p}}_{K}\) is the plastic part of the deformation gradient. The crystal plasticity model includes a so-called ferritic microstrain as an internal variable, which reflects the local elastic distortions in the crystalline lattice due to the presence of dislocations, see [33, 42] for more details. For the \(K\)-th ferritic grain, the microstrain is denoted as \(\beta _K^{\mathrm {F}}\), and the volume average over the ferritic grains is indicated as \(\bar{\beta }^{\mathrm {F}}\). The latter parameter will be used for characterizing the average plastic deformation in the ferritic grains in the simulation results presented in this section.

The constitutive behavior of the austenitic grains is simulated using the model developed in [28, 35, 36] and extended in [33, 42]. This model is suitable for simulating single-crystal grains simultaneously undergoing a plastic deformation and a martensitic phase transformation from a face-centered cubic austenitic lattice structure into a body-centered tetragonal martensite upon mechanical and/or thermal loading. The model includes the possible transformation into crystallographically-distinct martensitic phases, referred to as transformation systems. For the implementation of the phase transformation model within the GGCM, the deformation gradient \(\mathbf {F}_{K}\) for each austenitic grain \(K\) is decomposed as

$$\begin{aligned} \mathbf {F}_{K} = \mathbf {F}^{\mathrm {e}}_{K} \mathbf {F}^{\mathrm {p}}_{K} \mathbf {F}^{\mathrm {tr}}_{K} \end{aligned}$$

where, as before, \(\mathbf {F}^{\mathrm {e}}_{K}\) and \(\mathbf {F}^{\mathrm {p}}_{K}\) represent the elastic and plastic parts of the deformation gradient, respectively, and \(\mathbf {F}^{\mathrm {tr}}_{K}\) corresponds to the transformation part. The model allows for separately determining the volume fraction of each crystallographically-distinct martensitic transformation system [28, 33, 35, 36, 42]. For the \(K\)-th austenitic grain, the total martensitic volume fraction is denoted as \(\xi _K^{\mathrm {M}}\), which corresponds to the sum of volume fractions of the individual transformation systems. The volume average of the martensitic volume fraction over the austenitic grains is indicated as \(\bar{\xi }^{\mathrm {M}}\). This parameter will be used for quantifying the average martensitic transformation in the austenitic grains in the simulation results presented in this section. The material parameters used in the numerical simulations for the ferrite, austenite and martensite can be found in [42]. The computation of the stress, tangential stiffness and internal variables in the individual grains is carried out by means of a fully implicit, incremental-iterative update algorithm formulated within a large deformation framework. The details of this numerical implementation, which includes a selection algorithm for the determination of the active transformation systems and slip systems, can be found in [28].

Using the constitutive models outlined above allows for testing the GGCM for relatively complex and challenging material systems, where a large number of internal variables at each material point inside a grain capture inelastic phenomena originating from the sub-grain length scale, i.e., plastic slip and phase transformation resolved in crystallographically-distinct planes. The periodic microstructures used in the simulations are constructed from a two-level Voronoi algorithm generating realistic (convex and non-convex) polyhedral grains, see [43] for details. A randomly-chosen crystal orientation is assigned to each single-crystal grain in the cluster in order to approach a macroscopically isotropic material under an increasing number of grains. Representative pole figures for the orientation distributions of the samples analyzed in the present study can be found in [43]. All grain clusters were macroscopically loaded under simple shear according to

$$\begin{aligned} \bar{\mathbf {F}} = \mathbf {I} + \bar{\gamma } \mathbf {e}_1 \otimes \mathbf {e}_2 \, , \end{aligned}$$

where \(\bar{\gamma }\) is the amount of shear and \(\mathbf {e}_1\) and \(\mathbf {e}_2\) are orthonormal unit vectors perpendicular to the external faces of the cubic grain cluster. The imposed macroscopic shear rate was \(\dot{\bar{\gamma }} = 10^{-4} \mathrm {s}^{-1}\), which is in the range of quasi-static loading, i.e., it was confirmed that the inertial terms in the balance of linear momentum can be neglected. The samples were deformed up to a final value of \(\bar{\gamma } = 0.2\). Unless indicated otherwise, the step size for the constrained gradient descent method equals \(\omega = 20\) and the convergence criterion used is provided by expression (44), with the tolerance prescribed a priori as \(\varepsilon = 10^{-3}\). It is emphasized that the simple shear deformation mode was chosen for reasons of simplicity, such that the basic characteristics of the GGCM can be demonstrated in a consistent and unequivocal fashion. The computation of sample responses under alternative, more complex deformation modes falls beyond the scope of this study, although it may be reasonably expected that these will expose similar characteristics of the GGCM as for the simple shear mode.

The performance of the GGCM is demonstrated by considering six microstructures, each composed of a different number of grains, see Table  1. Note that for all these microstructures the initial volume fraction of the secondary austenitic phase is approximately 12  %, which is within the range of experimental values observed in multiphase TRIP steels [15, 26, 27]. A parametric study is carried out using a wide range of values of the weighting factors \(\alpha _i\), with \(i=1,2,3\), see Table  2. In accordance with (27) and (28), in the solution procedure the weighting factors \(\alpha _1\) and \(\alpha _3\) determine the relative importance of the equilibrium conditions at the grain boundaries and within each grain, respectively, while the weighting factor \(\alpha _2\) sets the relevance of kinematical compatibility across grain boundaries. The scaling factor \(\beta \) that appears in (28) was kept fixed, by setting it equal to \(\beta = 1\) (units of stress).

Table 1 Composition and volume fraction of different microstructural samples used in the GGCM simulations and the (finite element-based) DNS
Table 2 List of distinct weighting factors used in the GGCM simulations. In all cases, \(\alpha _1\) and \(\alpha _3\) were chosen equal to each other. Set W1 approximates the Taylor bound where the deformation gradient is uniform across the grains, while set W5 tends to the Sachs bound, for which the stress in the grains is uniform, within the tolerance provided by the kinematical multiscale constraint. Sets W2–W4 represent intermediate cases

For each grain cluster, the results from a direct numerical simulation (DNS) performed with an accurate finite element model were used as a benchmark. The term DNS is borrowed from its classical context in fluid mechanics to reflect the analogy between resolving the spatial scales of turbulence and resolving the micromechanical scales; it thus represents the full-field numerical solution obtained from determining the balance of linear momentum at each microscopic material point. Details of the finite element-based DNS simulations can be found in [43]. Unless indicated otherwise, all DNS calculations were performed using a regular \(30\times 30\times 30\) mesh (27000 hexahedral elements with a reduced integration scheme) with pointwise periodic boundary conditions, resulting in 81021 displacement degrees of freedom. This mesh size is based on a convergence study where the relative error in the effective main shear stress is less than \(2\,\%\) of the value found for a (substantially) finer \(40\times 40\times 40\) mesh, see also [43]. The maximum time step in the DNS was determined as \(\Delta t\) = 3.2 s, since larger time steps typically triggered numerical convergence problems. All other parameters used in the DNS are identical to those in the simulations performed with the GGCM. For comparison purposes, results are reported in terms of the Cauchy stress tensor \(\mathbf {T} := (\text {det}\,\mathbf {F})^{-1} \mathbf {P} \mathbf {F}^\mathrm {T}\).

4.2 Influence of time step size

In order to examine the influence of the size of the time step on the accuracy and stability of the numerical results computed with the GGCM, the response of a microstructural sample loaded under simple shear is analyzed considering three distinct time steps, namely \(\Delta t\) = 1, 5 and 25 s. The sample is composed of 24 grains of austenite and 176 grains of ferrite (sample S4 in Table  1). The corresponding response curves are shown in Fig.  4, which include the main Cauchy shear stress component \(\bar{T}_{12}\) averaged over the whole cluster (Fig.  4a), the average stress component \(\bar{T}_{11}\) (Fig.  4b), the average martensitic volume fraction \(\bar{\xi }^{\mathrm {M}}\) (Fig.  4c) and the average microstrain in the ferrite \(\bar{\beta }^{\mathrm {F}}\) (Fig.  4d), all plotted as a function of the macroscopic amount of shear \(\bar{\gamma }\). The GGCM curves were computed using calibrated weighting factors \(\alpha _1 = \alpha _3 = 5.0 \times 10^{-6}\) and \(\alpha _2 = 2.0 \times 10^{-2}\) (set W2 in Table  2) and an initial estimate for the deformation gradient based on a uniform stress increment. As shown in the figure, for the largest time step, \(\Delta t\) = 25 s, the GGCM response shows significant fluctuations and deviates strongly from the DNS response represented by the dashed line. In contrast, the GGCM responses for \(\Delta t\) = 1 and 5 s are relatively smooth and remain close to each other over the whole deformation range, thereby approaching the DNS response closely. Hence, it may be concluded that a specific minimum time step is required for obtaining a GGCM solution of satisfactory accuracy. Using an \(L_2\) norm of the main shear stress along the complete deformation path (see also Sect. 4.5, expression (68)), the relative difference between the responses for \(\Delta t\) = 1 and 5 s is calculated as \(2.5\,\%\). Correspondingly, the results obtained with the time step \(\Delta t\) = 5 s were considered as sufficiently accurate, and forthcoming results were computed using this value.

Fig. 4
figure 4

Average (macroscopic) Cauchy shear stress \(\bar{T}_{12}\) (a), stress component \(\bar{T}_{11}\) (b), martensitic volume fraction \(\bar{\xi }^{\mathrm {M}}\) (c) and microstrain in the ferrite \(\bar{\beta }^{\mathrm {F}}\) (d), all plotted as a function of the applied amount of shear \(\bar{\gamma }\), for three distinct time steps in the GGCM: \(\Delta t\) = 1.0, 5.0 and 25.0 s. The sample used is S4 with 200 grains in total, see Table  1, and the set of weighting factors equals W2, see Table  2. The dashed lines represent the corresponding DNS responses, obtained from an accurate \(30 \times 30 \times 30\) finite element model

From the aspect of stability, no specific conclusions on the maximum time step size can be drawn from the GGCM simulations, since for all three time steps considered the response remains bounded within the range analyzed. Conversely, as already indicated in the previous section, the DNS calculation did encounter convergence problems for time steps larger than \(\Delta t\) = 3.2 s; hence the DNS is characterized by a smaller maximum time step than the GGCM simulations.

4.3 Influence of initial estimate of the deformation gradients

In Sect. 3.5 several options were described for estimating the initial incremental microscale deformation gradients \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) required at the onset of each new loading step \(s+1\) of the constrained minimization algorithm. Due to the loading path dependency of inelastic material models, this initial estimate may affect the accuracy and efficiency of the generalized grain cluster method. Since GGCM has the general aim of closely approximating accurate DNS response curves at (much) lower computational cost, it needs to be examined in detail what the effect of this estimate is on the numerical accuracy and efficiency of the GGCM result. Accordingly, in this section the numerical responses computed for three different initial estimates of \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) are compared, as based upon (i) a uniform deformation gradient initial increment (Sect. 3.5.1), (ii) a uniform stress initial increment (Sect. 3.5.2) and (iii) previously converged loading steps (Sect. 3.5.3). The initial estimate based upon the null vector (Sect. 3.5.4) is left out of consideration in this comparison, since preliminary computations (not presented here) clearly indicated a deficient performance with respect to the other three approaches.

In the analysis a sample consisting of 8 grains of austenite and 56 grains of ferrite (set S2 in Table  1) is subjected to simple shear, where the GGCM responses calculated with the three initial estimates of \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) are compared against the response of the accurate benchmark FEM model described in Sect. 4.1. In the GGCM the weighting factors are \(\alpha _1 = \alpha _3 = 5.0 \times 10^{-6}\) and \(\alpha _2 = 2.0 \times 10^{-2}\) (set W2 in Table  2) and the time step equals \(\Delta t\) = 5 s. The results of the simulations are shown in Fig.  5 in terms of the main Cauchy shear stress \(\bar{T}_{12}\) (Fig.  5a), the stress component \(\bar{T}_{11}\) (Fig.  5b), the martensitic volume fraction \(\bar{\xi }^{\mathrm {M}}\) (Fig.  5c) and the microstrain in the ferrite \(\bar{\beta }^{\mathrm {F}}\) (Fig.  5d). All quantities are displayed as a function of the macroscopic amount of shear \(\bar{\gamma }\). It can be observed that the simulations based on a uniform deformation gradient initial increment and a uniform stress initial increment are relatively smooth and remain close to each other. The method based on a uniform deformation gradient increment predicts a slightly stiffer response for the main Cauchy shear stress \(\bar{T}_{12}\) compared to the simulation based on a uniform stress increment, while the opposite occurs for the stress component \(\bar{T}_{11}\), see Fig.  5a, b. For the present choice of weighting factors W2 both methods approach the solution obtained with DNS closely, although the method based on a uniform stress initial increment appears to be slightly more accurate. The response of the secondary phase, expressed in terms of the martensitic phase transformation, is somewhat underpredicted by both methods at larger deformation, \(\bar{\gamma } > 0.1\), see Fig.  5c, while the microstrain \(\bar{\beta }^{\mathrm {F}}\) in the ferrite is predicted accurately over the whole deformation range, see Fig.  5d. In contrast, the simulation based on an initial estimate obtained from previously converged loading steps is characterized by significant fluctuations in the stress components and a relatively large over-prediction of the volume fraction of martensite. Moreover, the step size in the constrained minimization algorithm had to be reduced here from \(\omega = 20\) to \(\omega = 10\) in order to circumvent convergence problems. Consequently, the initial estimate based on extrapolating previously converged loading steps was deemed sub-optimal compared to initial estimates based upon an increment of a uniform deformation gradient and a uniform stress.

Fig. 5
figure 5

Average (macroscopic) Cauchy shear stress \(\bar{T}_{12}\) (a), stress component \(\bar{T}_{11}\) (b), martensitic volume fraction \(\bar{\xi }^{\mathrm {M}}\) (c) and microstrain in the ferrite \(\bar{\beta }^{\mathrm {F}}\) (b) as a function of the applied amount of shear \(\bar{\gamma }\) for distinct initial estimates of the deformation gradient at the beginning of each GGCM loading step. The three initial estimates for the vector of deformation gradients within the grains, \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s}\), are based on (i) a uniform deformation gradient increment, (ii) a uniform stress increment, and (iii) previously converged loading steps. The sample used is S2 with 64 grains in total, see Table  1, and the set of weighting factors is W2, see Table  2. The dashed lines represent the corresponding DNS responses, obtained from an accurate \(30 \times 30 \times 30\) finite element model

The computational times for the three initial estimates of \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\) are compared in Table  3. It is noted that in Sect. 4.6 the computational times of simulations performed with the GGCM will be also compared objectively to those of finite element-based DNS. The table shows that the simulations with a uniform deformation gradient initial increment and an initial increment based on previously converged loading steps are characterized by similar computational times, which are about half of the computational time obtained for a uniform stress initial increment. Nonetheless, additional simulations (not shown here) have indicated that the numerical results computed with the uniform deformation gradient initial increment may substantially loose their accuracy under conditions approaching the uniform stress limit case (i.e., set W5 in Table  2). Conversely, the numerical results obtained with the uniform stress initial increment proved to be accurate for both the uniform stress limit case (set W5) and the uniform deformation gradient limit case (set W1). Hence, despite its higher computational cost, from the viewpoint of accuracy all GGCM simulations discussed in the subsequent sections were carried out using a uniform stress initial increment. Note, however, that the method based on a uniform deformation gradient initial increment remains an attractive alternative under conditions generating negligible to moderate differences between the deformations in individual grains i.e., sets W1–W3 in Table  2.

Table 3 Computational times of GGCM with distinct initial estimates of the microscopic deformation gradient at the onset of each incremental loading step. The sample is loaded under simple shear

4.4 Influence of weighting factors \(\alpha _i\) and sample composition

In this section the effects on the GGCM response by the weighting factors, \(\alpha _i\) with \(i=1,2,3\), see (27), and the sample composition are demonstrated. For this purpose, a sample subjected to simple shear deformation, up to \(\bar{\gamma } = 0.2\), is analyzed for five different sets of weighting factors, W1–W5, as listed in Table  2, and two different microstructural compositions, S1 and S4 (of 8 grains and 200 grains in total, respectively), see Table  1. The samples S1–S6 presented in Table  1 were recently subjected to an extensive convergence study using an accurate finite element model of \(30 \times 30 \times 30\) elements, which demonstrated that sample S4 represents a lower bound for the number of grains required for reaching a macroscopic representative volume element (RVE), see [43] for more details. Accordingly, by comparing the response of the “minimal” RVE sample S4 to the response of the “small sample” S1, the influence by the discreteness of the microstructure on the overall response can be highlighted. A detailed analysis of the convergence behavior of the GGCM responses, including all samples S1–S6 summarized in Table  1, will be presented in Sect. 4.5.

In order to clearly illustrate some essential features of the weighting factors, first for sets W1, W2, and W5 the main Piola–Kirchhoff shear stress distribution \(\bar{P}_{12}\) in sample S2 is considered at the final state of deformation \(\bar{\gamma } = 0.2\), see Fig.  6. As can be observed from Fig.  6a, in the limit of a uniform deformation gradient (set W1) the kinematic compatibility between individual grains is preserved, which occurs at the expense of violating the equilibrium requirements for the grains, i.e., the shear stress \(P_{12}\) jumps significantly in value across grain boundaries. The response shown in Fig.  6c for the limit case of uniform stress (set W5) indeed satisfies the equilibrium requirements, but clearly violates kinematic compatibility between grains. The intermediate case (set W2) shown in Fig.  6b corresponds to a compromise solution that partially satisfies both the kinematic compatibility and equilibrium requirements.

Fig. 6
figure 6

Piola Kirchhoff shear stress distribution \(P_{12}\) across sample S2 at a macroscopic shear deformation \(\bar{\gamma } = 0.2\) for weighting factors representing a the uniform deformation gradient limit case (set W1), b an intermediate case (set W2) and c the uniform stress limit case (set W5)

In Fig.  7 the evolutions of macroscopic state and internal variables of the samples S1 and S4 are compared. The figure illustrates the main Cauchy shear component \(\bar{T}_{12}\), (Fig.  7a, b) and the stress component \(\bar{T}_{11}\) (Fig.  7c, d), i.e., grain cluster averages, of the two samples, as a function of the applied macroscopic shear deformation \(\bar{\gamma }\). It further shows the evolutions of the effective martensitic volume fraction \(\bar{\xi }^{\mathrm {M}}\) in the secondary, austenitic phase (Fig.  7e, f) and the effective microstrain \(\bar{\beta }^{\mathrm {F}}\) in the ferritic matrix (Fig 7g, h), with the latter parameter being a measure for the amount of plastic deformation generated. For both samples the GGCM responses for the five different sets of weighting factors W1–W5 are plotted together with the response obtained from an accurate finite element-based DNS described in Sect. 4.1. As mentioned before, the GGCM simulations were carried out with a uniform stress initial increment.

Fig. 7
figure 7

Average (macroscopic) Cauchy shear stress \(\bar{T}_{12}\) (a, b), stress component \(\bar{T}_{11}\) (c, d), martensitic volume fraction \(\bar{\xi }^{\mathrm {M}}\) (e, f) and microstrain in the ferrite \(\bar{\beta }^{\mathrm {F}}\) (g, h) in samples S1 and S4, respectively, plotted as a function of the applied amount of shear \(\bar{\gamma }\) for different sets of weighting factors W1–W5, see Table  2. The dashed lines represent the corresponding DNS responses, obtained from an accurate \(30 \times 30 \times 30\) finite element model

Table 4 Computational times (sample loaded under simple shear) of GGCM for distinct weighting factors, using a uniform stress initial increment and two different samples, S1 and S4

Both for samples S1 and S4 the limit cases of a uniform deformation gradient (set W1) and a uniform stress (set W5) provide relatively smooth response curves, which, as may be expected from basic homogenization principles, respectively overpredict and underpredict the DNS response. For the intermediate sets with weighting factors W2–W4 the closest resemblance with the DNS response is realized by set W2, although for the small sample S1 some distinctive drops in stress are observed, see Fig.  7a, c. These stress drops result from a competition between the two opposing terms in the objective functional \(J\) given by (27), which are the compatibility term, \(\alpha _2 J_2\), and the equilibrium term, \(\alpha _1 J_1 + \alpha _3 J_3\). During the minimization procedure both terms occasionally fluctuate strongly in magnitude, thereby showing a tendency of compensating each other. Whenever the minimization of the functional \(J\) was dominated by the derivative of the force equilibrium term (i.e., the term \(\partial (\alpha _1 J_1 + \alpha _3 J_3) / \partial \left( F_{K}\right) _{mn} \) following from (36) and (33)), the force equilibrium term \(\alpha _1 J_1 + \alpha _3 J_3\) decreased while the compatibility term \(\alpha _2 J_2\) increased. In contrast, this process was reversed when the compatibility term \(\alpha _2 J_2\) became dominant. As demonstrated by the shear stress response in Fig. 7a, this bifurcation-type behavior is more apparent for the small sample S1, and thus can be associated to the level of discreteness of the microstructure. More discussion on the bifurcation sensitivity of the microstructural responses can be found in Sect. 5.

Under a stepwise change of the sets of weighting factors from W1 to W5 the corresponding Cauchy shear stress responses \(\bar{T}_{12}\) decrease monotonically, see Fig.  7a, b. A similar order in the responses for the different sets of weighting factors can be observed for the ferritic microstrain \(\bar{\beta }^{\mathrm {F}}\), see Fig.  7g, h. In contrast, the responses of the stress component \(\bar{T}_{11}\) and the martensitic volume fraction \(\bar{\xi }^{\mathrm {M}}\) do not follow this regular order, see Fig.  7c–f. This difference can be explained as follows. Under the applied isochoric shear deformation \(\bar{\gamma }\) the secondary austenitic phase transforms into martensite, whereby the resulting volumetric expansion is compensated for by the surrounding ferritic matrix under the generation of normal stresses. Since the number of austenitic grains in samples S1 and S4 is relatively low (i.e., 1 grain and 24 grains, respectively, see Table  1), the resulting predictions for the effective stress component \(\bar{T}_{11}\), and the martensitic volume fraction, \(\bar{\xi }^{\mathrm {M}}\) occasionally are sensitive to inaccuracies, and therefore do not provide the regular order in responses mentioned above. Despite this aspect, for sample S4 all 4 response curves computed with the set of weighting factors W2 follow the corresponding DNS response curves reasonably well, see Fig.  7b, d, f, h.

The computational times of the numerical simulations presented in Fig.  7 are summarized in Table  4. All simulations were carried out with the uniform stress initial increment method, since this initial estimate leads to a good balance between computational accuracy and efficiency, see Sect. 4.3. It can be observed that the computational times are the largest for the uniform deformation gradient limit case W1. This is, because the initial estimate provides a deformation state that is incompatible at the grain boundaries, as a result of which in W1 the most dominant term, \(\alpha _2 J_2\), representing kinematic compatibility, see (33) and (36), initially is relatively far off its minimum. Consequently, the number of iterations required for finding a converged state that satisfies kinematic compatibility is relatively large. Obviously, the application of the uniform deformation gradient initial increment method will reduce the number of iterations for set W1, since it matches the kinematic compatibility requirement and also satisfies the multiscale kinematic constraint given by (9). Recall, however, that this initial estimate may become inaccurate and inefficient under conditions approaching the limit case W5 of a uniform stress in the sample, see Sect. 4.3.

Table  4 shows that the computational time decreases when the set of weighting factors is changed stepwisely from W1 to W4. In accordance with this trend and the discussion presented above, it is somewhat counterintuitive to observe that the computational time for the uniform stress limit case W5, instead of being the lowest, is significantly higher than that of sets W2, W3 and W4. This can be explained, however, from the correction step that projects the uniform stress initial estimate for the deformation gradient, \(\hat{{{\mathbf {\mathsf{{d}}}}}}^{s+1}\), given by (63), within the feasible solution space that satisfies the multiscale kinematic constraint (9), leading to the projected initial deformation gradient increment \({{\mathbf {\mathsf{{d}}}}}^{s+1}\), see (49) and (52) as well as Fig.  3. Accordingly, for step \(s+1\) the starting values for the local deformation gradients in the minimization algorithm \(\varvec{x}^{s+1,0}= \varvec{x}^s + {{\mathbf {\mathsf{{d}}}}}^{s+1}\), see (47), do not fully reflect a state of uniform stress. This makes the minimization process for reaching the uniform stress limit case W5 computationally more demanding than for the “intermediate” sets W2, W3 and W4 for which the stress distribution in the sample is (far) less uniform.

4.5 Calibration procedure

From the results presented in Sect. 4.4 it became clear that the weighting factors \(\alpha _1, \alpha _2\) and \(\alpha _3\) and the sample size play a major role in (i) the accuracy and (ii) the computational efficiency of the generalized grain cluster method. If the analyst wants to adequately account for these two aspects in the calibration of the weighting factors, a parametric analysis similar to that presented in Sect. 4.4 needs to be performed. By inspection of Fig.  7 it was concluded that the set W2 provides the best agreement with the DNS response; therefore the parameters \(\alpha _i\) (with \(i=1,2,3\)) of set W2 may be referred to as the “calibrated weighting factors”. A sensitivity analysis of these weighting factors has indicated that differences in GGCM responses become significant only when the ratio \(\alpha _2 / \alpha _1\) is modified by at least a factor of 2. For example, Table  2 illustrates that the change in \(\alpha _2 / \alpha _1\) in the transition of set W2 to set W3 equals a factor of 4. As further shown in this table, the method becomes rather insensitive to changes in the ratio \(\alpha _2 / \alpha _1\) close to the uniform deformation gradient limit case W1 (\(\alpha _2 / \alpha _1\) approaches infinity) or the uniform stress limit case W5 (\(\alpha _2 / \alpha _1\) approaches zero). If the experimental data required for calibration is not readily available or if the analyst wants to “design” a material starting from its microstructural features, the relation between the microscopic behavior and the effective macroscopic response may be initially estimated using “default” weighting factors in the range of set W2.

In principle, the calibration procedure should be carried out with a grain cluster corresponding to a macroscopic representative volume element (RVE). Similar to common procedures used for multiscale simulations based on the finite element method, the minimally required size of an RVE for the GGCM may be established from a convergence analysis of the macroscopic response on grain clusters of increasing size. As mentioned before, the microstructures S1-S6 presented in Table  1 were recently subjected to such a convergence study, performed with an accurate finite element model of \(30 \times 30 \times 30\) elements [43]. The results of this convergence study are here compared to that of a convergence study carried out with the GGCM, by computing the relative difference in macroscopic response for the individual samples S1-S6. The macroscopic response is reflected by two distinct macroscopic variables, which are the main Cauchy shear stress \(\bar{T}_{12}\) and the martensitic volume fraction \(\bar{\xi }^{\mathrm {M}}\). The main shear stress is representative of the whole grain cluster whereas the martensitic volume fraction reflects the average value in the secondary (austenitic/martensitic) phase. As indicated by Fig.  8a, for both macroscopic variables the relative difference in response is based on an \(L_2\) norm, which is taken over the complete deformation path from \(\bar{\gamma }=0\) to \(\bar{\gamma }=0.2\). For an arbitrary scalar function \(\bar{f} = \bar{f}(\bar{\gamma })\), this norm thus may be expressed as

$$\begin{aligned} \Vert \bar{f} \Vert _{L_2} = \left( \int _{\bar{\gamma } = 0}^{\bar{\gamma } = 0.2} \bar{f}^2 \text {d}\bar{\gamma } \right) ^{1/2} \ . \end{aligned}$$
(68)

The GGCM response to simple shear for all samples S1-S6 was computed using the same set of weighting factors W2. As shown in Fig.  8a, the shear stresses predicted by GGCM and the accurate finite element-based DNS simulation remain within \(5\,\%\) difference when the number of grains grows. Furthermore, the difference between the martensitic volume fractions calculated by the GGCM and DNS simulations oscillates around \(15\,\%\) at a larger number of grains. Hence, the predictions by the GGCM and DNS simulations appear to have similar rates of convergence, from which it can be concluded that the minimal macroscopic RVE for the two methods is identical, i.e., the “minimal” RVE sample S4 with 200 grains in total. An additional observation from Fig.  8a is that, once a set of weighting parameters has been calibrated for a small sample, it can be subsequently used for an alternative, larger sample without significantly compromising to the accuracy of the results. Accordingly, the calibrated weighting factors appear as “objective”, in a sense that they do not need to be modified when analyzing alternative microstructures. In addition, by using a sample that is smaller than the minimal RVE sample, this characteristic leads to a quick and straightforward calibration of the weighting factors. For example, in the present study sample S2 was used for a quick calibration of the weighting factors W2, see Sect. 4.3, which is composed of about three times fewer grains than the minimal RVE sample S4, see Table  1.

Fig. 8
figure 8

a Relative differences in responses (to simple shear) computed by the GGCM and DNS, plotted as a function of the total number of grains in the sample. The GGCM computation of the macroscopic Cauchy shear stress \(\bar{T}_{12}\) and martensitic volume fraction \(\bar{\xi }^{\mathrm {M}}\) was performed using the set of weighting factors W2, see Table  2. b Computational times (in CPU hours) of grain clusters S1–S6 subjected to simple shear (\(\bar{\gamma }=0.2\)), together with a linear approximation (dashed line) in terms of the total number of grains in the sample. The microstructural characteristics of samples S1–S6 are listed in Table  1

4.6 Performance assessment of GGCM in comparison to DNS

Fig.  8b illustrates the overall computational performance of the GGCM by depicting the CPU time for samples S1–S6 versus the corresponding number of grains in the sample. All computations were carried out on the same single-processor computer, using the uniform stress initial increment method and the set of weighting factors W2. The six data points depicted are approximated by a linear relation (dashed line) up to an \(R^2\)-value of 0.8743 calculated by a least squares regression. The linear relation between the GGCM computational time and the total number of grains \(N_{gr}\) in the sample results from solving the local deformation gradient per grain, whereby the average computational effort per grain (i.e., the update of the stress, internal variables and tangential stiffness) is nearly independent of the total number of grains \(N_{gr}\). In contrast, the computational time associated to a finite element calculation typically scales with the power \(n\) of the number of elements, where \(n>1\). Hence, it depends on the number of grains indirectly by means of the number of elements per grain required for achieving a converged solution to within a specific tolerance. Since the finite element models of the samples S1 to S6 use comparable \(30\times 30\times 30\) meshes, they are characterized by similar computational times, in the range of \(220 \pm 30\) CPU hours.

For an objective evaluation of the computational cost of the GGCM, its computational time should be compared against that of a finite element model of the same accuracy. As pointed out in Sect. 4.3, the calibration procedure of the GGCM under simple shear was efficiently performed using a relatively small sample S2 (64 grains in total), which led to the set of weighting factors W2. With these weighting factors the shear response of the “minimal RVE sample” S4 (200 grains in total) was computed, see Fig.  7 in Sect. 4.4. The accuracy of this GGCM solution will now be quantified with respect to a reference DNS solution, calculated using a finer finite element mesh of \(40^3 = 64,000\) hexahedral elements. Subsequently, the computational speed of this GGCM solution will be compared against that of a DNS solution with the same accuracy. This DNS solution can be traced back from FEM responses obtained for distinct finite-element meshes of \(8^3\), \(15^3\), \(20^3\) and \(30^3\) hexahedral elements (equipped with a reduced integration scheme), by evaluating their accuracies measured with respect to the reference DNS solution obtained for the mesh of \(40^3\) elements. The accuracies of these distinct FEM solutions are plotted in Fig.  9a by means of the error in the main Cauchy shear stress \(\bar{T}_{12}^{\text {DNS}}\), measured with respect to the main Cauchy shear stress \(\bar{T}_{12}\) computed with the accurate FEM model of \(40^3\) elements in terms of an \(L_2\) norm.

Fig. 9
figure 9

a Relative error of the main macroscopic Cacuhy shear stress \(\bar{T}_{12}^{\text {DNS}}\) of sample S4 under simple shear (\(\bar{\gamma }=0.2\)), computed for different FEM mesh sizes \(M\) corresponding to \(8^3\), \(15^3\), \(20^3\) and \(30^3\) elements in total. The DNS reference value \(\bar{T}_{12}\) characterizing the relative \(L_2\) error was computed using a fine FEM mesh of \(40^3\) elements. For comparison, the relative error for the GGCM simulation of sample S4 (also measured with respect to the DNS reference value \(\bar{T}_{12}\)) is specified by the dashed line. The GGCM simulation uses the set of weighting factors W2, see Table  2. b Computational time (in CPU hours) of sample S4 for different mesh sizes of the DNS. For comparison, the GGCM computational time of sample S4 is indicated by the dashed line

The figure also shows the relative \(L_2\) error of the GGCM, which is slightly below \(2\,\%\). It can be observed that the errors of the FEM calculations range from about \(6 \,\%\) for a coarse \(8^3\) element mesh to about \(1\,\%\) for a fine \(30^3\) element mesh. By means of straightforward interpolation, the FEM model with the same \(2\,\%\) error as the GGCM turns out to consist of about \(24^3\) elements, see Fig.  9a. The computational times for the various FEM discretizations are shown in Fig.  9b. For comparison purposes, the simulations with coarser meshes (\(8^3\), \(15^3\) and \(20^3\) elements) were carried out with the same time step as applied in the GGCM, namely \(\Delta t\) = 5 s. As already indicated in Sect. 4.1, the time step used for the \(30^3\) element mesh had to be reduced to \(\Delta t\) = 3.2 s to avoid convergence problems. The GGCM requires \(39\) CPU hours to complete a simulation for sample S4. In contrast, through a simple interpolation it can be estimated that a finite element simulation with a \(24^3\) element mesh requires about \(160\) CPU hours for completing this task, which thus is about 4 times more CPU time than the GGCM.

It is worth pointing out that by using the uniform deformation gradient increment method (Sect. 3.5.1) instead of the uniform stress increment method (Sect. 3.5.2), an additional decrease in the GGCM computational time of about a factor of 2 can be expected, see Table  3, which then makes the GGCM about eight times faster than the FEM simulation with \(24^3\) elements. Obviously, for accomplishing this result a recalibration of the weighting factors is required.

The part of the CPU time spent on iteratively solving the mathematical equations of the GGCM can be globally subdivided into (i) the time required at the local level of an integration point by the implicit update algorithms (for crystal plasticity and crystallographic phase transformations) to compute the stress and consistent tangent stiffness in each grain, and (ii) the time required at the global (system) level by the constrained gradient descent algorithm for minimizing the objective potential reflecting the equilibrium conditions and kinematic compatibility conditions of all grains. The update algorithms used at the integration point level are the same as used in the integration points of a finite element-based DNS, where, for the incremental step size and specific crystallographic models used in the current simulations, convergence is typically reached within 10 iterations. Note that the GGCM only uses the equivalent of one integration point per grain, while a representative DNS with a sufficiently fine mesh of \(24 \times 24 \times 24\) elements on average uses 70 integration points per grain when computing the response for the minimal RVE size of 200 grains in total (Sample S4, Table 1). The resulting gain in computational efficiency by the GGCM is partly lost at the system level due to the relatively large (100–200) iterations required for convergence of the constrained gradient descent algorithm. Hence, for further increasing the efficiency of the GGCM it is worthwhile investigating the application of faster numerical algorithms at the system level. This remains a topic for future studies.

5 Concluding remarks

The generalized grain cluster method presented in this communication can be used for the efficient and accurate simulation of advanced multiphase microstructures composed of a large number of grains of arbitrary shape. It was demonstrated that an accurate calibration of the GGCM can be performed relatively quickly by using samples smaller than the “minimal RVE sample”. For samples subjected to simple shear deformation, a comparison of the computational results with those obtained from finite element-based direct numerical simulations indicated that, for a given accuracy, the computational speed of the GGCM may be up to almost an order of magnitude lower, i.e., a factor of 8. This motivates the application of the GGCM in multiscale analyses of large-scale engineering problems, whereby the solution to the macroscale problem is calculated by the finite element method, and the material response at the microscale is computed by the GGCM. The coupling between the two methods is rather straightforward; since the GGCM uses the macroscopic deformation gradient as input for solving the microscale response of a multiphase material, it can be naturally included in commercial finite element codes as a user-supplied subroutine at the integration point level, where the effective stress and the effective constitutive tangent stiffness calculated for the multiphase material are returned by the GGCM to the finite element model. Obviously, the gain in computational time compared to a multiscale FE\(^2\) approach will be considerable for large-scale engineering problems, since with the GGCM a relative decrease in computational time up to almost an order of magnitude is obtained for every macroscopic integration point in the FEM model.

Due to the overdetermined system of equations that characterize the GGCM, the collection of possible microscale solutions contains numerous local minima. Accordingly, under an incremental macroscopic deformation, a stationary point may jump from one local minimum to another local minimum. These fluctuations may be activated by numerical inhomogeneities such as the value of the incremental step size, the convergence criterion applied or the initial estimates used in the iterative solution procedure. In the simulation results presented in this communication, the bifurcation sensitivity of the solution became manifest through local variations in some of the responses computed, as observed in the variation studies on the time step (see Fig.  4a, b), the initial estimate for the incremental deformation gradient (see Fig.  5a–c), and the GGCM weighting factors (see Fig.  7a, c, e). In addition, the oscillatory convergence behavior of microstructural samples under an increasing number of grains (see Fig.  8a) also implies a bifurcation sensitivity of the response. However, the numerical results in Figs. 4, 5, 7 and 8 also show that fluctuations due to bifurcations (substantially) diminish once (the size of) the numerical parameters and the size of the RVE are adequately determined. A further reduction of the fluctuations may be obtained by applying numerical regularization techniques in the iterative update algorithm applied at the system level, which is a topic for future research.