1 Introduction

Heterogeneous materials such as composites have been widely applied in various industries such as aircraft and automobile manufacturing. The multiscale simulation of heterogeneous materials is therefore a crucial task in computational mechanics.

Such simulation is usually facilitated by the classical FE\(^2\) scheme [1], as illustrated in Fig. 1. In a typical FE\(^2\) scheme, the finite element method is applied at the microscale and the macroscale concurrently, hence the name. More precisely, at the macroscale, the entire composite part is discretized into continuum finite elements, each of which has several Gauss quadrature points for numerical integration. For each Gauss quadrature point, the effective constitutive behavior for the macroscale is obtained through a homogenization process via a finite element analysis at the microscale. The computational domain at the microscale is called a representative volume element (RVE). Taking the mechanical simulation of a fiber-reinforced composite as an example, a typical RVE consists of a fiber and the surrounding matrix [2], possibly with defects such as cracks. Normally the desired effective responses include the stress tensor and the elasticity tensor, and the simplest way of homogenization is by volume averaging.

Fig. 1 Flowchart illustrating the FE\(^2\) scheme

Among available numerical methods for the RVE analysis with crack propagation, the phase field approach to fracture [3], also known as the regularized variational theory of fracture, shows clear advantages. This approach is built on Griffith’s theory for brittle fracture [4]. The key idea is to use a scalar field, called the phase field, to represent the crack path, instead of incorporating the explicit geometry of the crack path in the computational domain. The advantages include obviating the need to explicitly track the crack path geometry, and the ability to predict crack nucleation and bifurcation without extra criteria. The method has since been applied to fracture modeling in Euler-Bernoulli beams [5], thin shells [6], composite materials [7, 8], cement-based materials [9], layered structures [10], and CO\(_2\) fracturing [11].

However, solving the equations arising from the phase field method for fracture can be costly. Since the strain energy functional to be minimized in this approach is not convex, the required number of iterations for convergence is not known a priori. The RVE analysis is, of course, no exception. Many efforts have been devoted to accelerating the phase field fracture solution procedure. Heister et al. [12] and Li et al. [13] constructed mesh adaptivity approaches for the problem. Ziaei-Rad and Shen [14] developed a massively parallel algorithm for the phase field approach with time adaptivity. Gerasimov and De Lorenzis [15] proposed a line search procedure for the monolithic scheme to overcome the iterative convergence issues of non-convex minimization. Wick [16, 17] developed modified Newton-Raphson schemes for fully monolithic quasi-static brittle phase field fracture propagation. Farrell and Maurini [18] reformulated the staggered algorithm of the phase field analysis as a nonlinear Gauss-Seidel iteration and employed over-relaxation to accelerate convergence. Wu et al. [19] developed a quasi-Newton monolithic method with the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm. Kopaničáková and Krause [20] proposed a trust region method with application to monolithic phase-field fracture models.

We aim to accelerate the multiscale simulation from another perspective. In fact, in many cases, the RVEs are similar within the same multiscale analysis. This similarity can be exploited to accelerate computation, for example, via manifold learning.

In the machine learning context, manifold learning is employed to extract the manifold that represents high-dimensional data points and to perform data reconstruction with a minimum amount of computation. Manifold learning has been widely applied to multiscale analysis [21,22,23,24], see also the review by Matouš et al. [25]. An instance of manifold learning techniques is locally linear embedding (LLE). Proposed by Roweis and Saul [26], LLE is an unsupervised learning algorithm that computes low-dimensional, topology-preserving embeddings of high-dimensional data points. As an instance of kernel principal component analysis (kernel PCA), LLE has many attractive properties. For example, the local geometry of high-dimensional data is preserved in the low-dimensional manifold. LLE is particularly suitable for problems with a large amount of similar high-dimensional data.

However, LLE assumes that the data all reside on a single continuous manifold [27], which poses certain restrictions on its application. For example, in image-based simulations [28], each RVE is represented as a vector containing, e.g., pixel values. In this case, if the dimension of this vector varies between RVEs, the nonuniform data structure makes LLE training and interpolation impossible. This is because the neighborhood-finding and interpolation operations of the LLE algorithm require linear combinations of data points to be well defined.

Despite such restrictions, the advantages of LLE make it ideal for random RVE computation and computational homogenization [28,29,30] for multiscale analysis of heterogeneous materials.

Inspired by [28] for heat conduction problems, for the multiscale fracture simulation at hand we aim to learn a manifold that contains a collection of similar cracked RVEs, and to efficiently compute any desired output dependent on such microstructures using LLE reconstructions. Concretely, the input is chosen as the phase field pattern at the beginning of a certain time step (termed “initial phase field” for short), and the output can be the phase field at the end of the time step, so as to close the loop for the analysis of the next step, or any other quantity derived from this phase field solution, such as the homogenized stress. In the discrete picture, we construct a finite element mesh to describe the RVE, interpolate the phase field for the crack pattern using the finite element basis functions, and vectorize the description of the initial crack pattern of each RVE using the nodal values of the phase field. The desired output is the phase field solution corresponding to a certain boundary condition.

Compared with recent contributions on applying machine learning techniques, neural networks in particular, for constitutive modeling [31,32,33,34,35,36] and similar computations for RVEs [37,38,39,40,41], the adopted method possesses the following features.

First, the number of hyperparameters is minimal: only the size of the neighborhood and the number of reduced dimensions need to be input by the user. The selection of such hyperparameters is determined by a systematic cross-validation approach.

Second, there is no limit on the dimension of the desired output, as long as it is a continuous functional of the microstructure, while a typical neural network would have one set of thresholds and weights per scalar output.

Finally, for any new input, the uncertainty (“error bar”) of the reconstructed output can be obtained, as a strong correlation is observed between the reconstruction error and a parameter that depends solely on the input information. In this case, the parameter is the distance from the new input to the learned data manifold. This last feature enables a criterion to assess the reconstruction error a priori; in other words, a criterion to decide whether to use the less expensive reconstruction or to resort to the more accurate high-fidelity computation. It also serves as an indicator of whether the collection of inputs should be augmented with the new input in question, in a greedy sampling fashion, should some kind of adaptivity be implemented.

However, it is still worth noting that, just like many other machine learning techniques, the LLE approach requires enough data points to guarantee the accuracy of predictions. Hence the training set should be dense and large enough. Moreover, as inherited from the general LLE technique, the proposed approach requires the data structure to be homogeneous, making the distance function and linear combination between data points well-defined. Finally, the output should continuously depend on the input data, which is also a necessary condition for a well-posed problem anyway.

The content of this paper is structured as follows. In Sect. 2, the FE\(^2\) scheme and the phase field method are introduced. In Sect. 3, the manifold learning and LLE techniques are explained in detail. In Sect. 4, the numerical implementation is presented and validated with error assessments, and the results are discussed in Sect. 5. Finally, in Sect. 6, a summary of the proposed computational strategy is presented.

2 The FE\(^2\) scheme applied to composite fracture

In this section, we introduce the FE\(^2\) scheme in the multiscale fracture simulation of a fiber-reinforced composite. The FE\(^2\) is a two-scale modeling scheme which applies FE discretizations at both macro and micro scales, the former taking input from the latter through the analysis of the RVE.

In our case, as shown in Fig. 2, the RVE is composed of a strong fiber in the center with a weaker matrix. We aim to perform the fracture simulation of the cracked RVE at the microscale. Once the local behavior is determined, the overall macroscopic response of the RVE can be obtained using any well-established homogenization theory and be used for the macroscopic simulation.

Fig. 2 Modeling a macroscopic composite as a collection of RVEs

For simplicity, we only consider the microcrack evolution in the matrix and ignore all other defects, such as cracks on the interface (debonding) and in the fiber, see Fig. 3.

Fig. 3 The simplified RVE to be analyzed in this work. In this RVE there is a strong fiber inside a weaker matrix. The only allowed form of failure is matrix cracking

Phase Field Approach for RVE Cracking. Among many crack simulation methods, we adopt the phase field method to simulate the microcrack evolution in the RVE. The phase field modeling of brittle fracture has shown clear advantages in simulating complex fracture processes, such as obviating the need for remeshing, see [3, 42, 43]. The phase field approach to fracture is based on the variational energy formulation proposed in [44], which can be considered a generalization of Griffith’s theory [4].

As shown in Fig. 4b, the phase field method uses a diffuse field d to represent the cracked microstructure, where \(d=0\) represents the intact material and \(d=1\) the crack. Equipped with a finite element mesh, cracked microstructures can then be represented as vectors containing the nodal values of the phase field, and the distance between cracked RVEs can be measured as the Euclidean norm of the difference of such vectors.

Compared with a geometric description of cracks (Fig. 4a), which may require a heterogeneous data structure (such as the coordinates of a possibly varying number of discrete points on the evolving crack), the phase field method is advantageous in terms of data structure for the manifold learning approach, as each cracked microstructure is uniformly represented as a vector of nodal phase field values. This feature is favorable for the manifold learning process introduced in Sect. 3, as the inputs (and outputs) can be stored as vectors of the same length.
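
To make this data structure concrete, the following minimal Python sketch (illustrative only; the function names and the 5\(\times\)5 nodal grid are assumptions, not the implementation used in this work) shows how a cracked microstructure is vectorized from its nodal phase field values and how the Euclidean distance between two RVEs is evaluated.

```python
import numpy as np

def vectorize_rve(nodal_phase_field):
    """Flatten the nodal phase field values (fixed node ordering) into a 1-D vector."""
    return np.asarray(nodal_phase_field, dtype=float).ravel()

def rve_distance(d_a, d_b):
    """Euclidean (l2) distance between two cracked microstructures
    represented by their nodal phase field vectors."""
    return np.linalg.norm(vectorize_rve(d_a) - vectorize_rve(d_b))

# Example: two RVEs discretized with the same 5 x 5 nodal grid (illustrative values)
rng = np.random.default_rng(0)
d1 = rng.random((5, 5))   # nodal phase field of RVE 1
d2 = rng.random((5, 5))   # nodal phase field of RVE 2
print(rve_distance(d1, d2))
```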

Fig. 4 Representations of a unit cracked microstructure: a discrete crack model; b phase field corresponding to a; c pixel representation of the phase field model with a structured quadrilateral mesh with \(h/l=0.5\)

Figures 4c and 5a show pixel representations of microcracks with the phase field approach. At first sight, there are at least two possible alternatives for translating a cracked microstructure into a numerical representation: (1) the characteristic function of the cracks, i.e., 1 on the crack and 0 otherwise, as shown in Fig. 5b; (2) the distance function to the cracks, as shown in Fig. 5c. Considering that we need to quantify the “distance” between such microstructures, both alternatives present severe drawbacks: alternative (1) cannot distinguish how far apart non-overlapping cracks are, while alternative (2) puts too much weight on differences between crack patterns in areas far away from the cracks.

Fig. 5 Numerical representations of cracked microstructures with 5\(\times\)5 nodes: a phase field (chosen); b characteristic function (not recommended); c distance function (not recommended). We employ a since it vectorizes cracked microstructures with a well-defined distance metric between crack patterns; b cannot distinguish how far apart non-overlapping cracks are, and c puts too much weight on differences between crack patterns in areas far away from the cracks

The adopted variant of the phase field formulation is as follows. In a plane strain setting, let \({\mathscr {B}}=(-L,L)^2\) be the area initially occupied by the RVE. Within the RVE, let \(S\subset \subset {\mathscr {B}}\) be the fiber, and \({\mathscr {B}}_s={\mathscr {B}}\setminus {\overline{S}}\) be the matrix, see Fig. 3. In the absence of body force and traction boundary condition, the phase field formulation for the RVE is [45]

$$\begin{aligned} \Pi _{l}[{{\varvec{u}}},d] = \int _{{\mathscr {B}}_s}\varPsi (\boldsymbol{\varepsilon },d)\, \mathrm {d}{\mathscr {B}} + \int _{S}\varPsi _1(\boldsymbol{\varepsilon })\, \mathrm {d}{\mathscr {B}} + \frac{g_{c}}{2}\int _{{\mathscr {B}}_s}\left( \frac{d^2}{l} + l\left| \nabla d \right| ^2 \right) \mathrm {d}{\mathscr {B}}, \end{aligned}$$
(1)

where the arguments \({{\varvec{u}}}\in H^1({\mathscr {B}},{\mathbb {R}}^2)\) and \(d\in H^1({\mathscr {B}}_s)\) are the displacement field and the phase field, respectively, and the strain tensor is defined as \(\boldsymbol{\varepsilon }=(\nabla {{\varvec{u}}}+\nabla {{\varvec{u}}} ^T)/2\). The convention for the phase field is that \(d=1\) represents the crack and \(d=0\) the intact material. Let \((\lambda ,\mu )\) and \((\lambda _1,\mu _1)\) be the Lamé constants of the matrix and of the fiber, respectively; then the strain energy density for the fiber is given by

$$\begin{aligned} \varPsi _1(\boldsymbol{\varepsilon }) = \frac{\lambda _1}{2}({{\mathrm{tr}\,}}\boldsymbol{\varepsilon })^2 + \mu _1 \boldsymbol{\varepsilon }:\boldsymbol{\varepsilon }, \end{aligned}$$

while that for the matrix also depends on d, for which we adopt the formulation proposed by Amor et al. [42]. This model splits the strain energy density \(\varPsi\) into volumetric and deviatoric parts:

$$\begin{aligned} \varPsi (\boldsymbol{\varepsilon },d)=g(d)\varPsi _+(\boldsymbol{\varepsilon }) + \varPsi _-(\boldsymbol{\varepsilon }), \end{aligned}$$

where

$$\begin{aligned}&\varPsi _+(\boldsymbol{\varepsilon }) = \frac{K}{2} \langle {{\mathrm{tr}\,}}\boldsymbol{\varepsilon }\rangle ^2_+ + \mu \Vert {{\mathrm{dev}\,}}\boldsymbol{\varepsilon }\Vert ^2, \end{aligned}$$
(2a)
$$\begin{aligned}&\varPsi _-(\boldsymbol{\varepsilon }) = \frac{K}{2} \langle {{\mathrm{tr}\,}}\boldsymbol{\varepsilon }\rangle ^2_-, \end{aligned}$$
(2b)
$$\begin{aligned}&\boldsymbol{\sigma }(\boldsymbol{\varepsilon }, d) = g(d)\left( K\langle {{\mathrm{tr}\,}}\boldsymbol{\varepsilon }\rangle _+{\mathbf {1}} + 2\mu {{\,\mathrm{dev}\,}}\boldsymbol{\varepsilon }\right) + K \langle {{\mathrm{tr}\,}}\boldsymbol{\varepsilon }\rangle _-{\mathbf {1}}, \end{aligned}$$
(2c)

where \(K=\lambda + 2\mu /3\) is the bulk modulus, \({{\,\mathrm{dev}\,}}\boldsymbol{\varepsilon } := \boldsymbol{\varepsilon } - (1/3)({{\,\mathrm{tr}\,}}\boldsymbol{\varepsilon }){\mathbf {1}}\), \(\langle a\rangle _{\pm }:=(a\pm |a|)/2\) and the degradation function \(g(d) = (1 - d)^2 +k\), where k is a small positive number. The positive numbers \(g_c\) and l are the energy release rate of crack propagation and the regularization length scale, respectively.
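
As a worked illustration of the energy split (2), the following Python sketch evaluates \(\varPsi\) and the in-plane stress for a given in-plane strain. It is a minimal sketch, not the implementation used in this work: the plane-strain embedding (\(\varepsilon _{zz}=0\)), the return of only the in-plane stress block, and the default value of k are assumptions made here for illustration.

```python
import numpy as np

def amor_split(eps2d, lam, mu, d, k=1e-8):
    """Volumetric-deviatoric (Amor) split of Eq. (2): degraded energy density
    and stress for a 2x2 in-plane strain tensor, assuming plane strain."""
    K = lam + 2.0 * mu / 3.0                      # bulk modulus
    eps = np.zeros((3, 3))
    eps[:2, :2] = eps2d                           # plane-strain embedding, eps_zz = 0
    tr = np.trace(eps)
    dev = eps - tr / 3.0 * np.eye(3)              # deviatoric part of the strain
    tr_pos, tr_neg = max(tr, 0.0), min(tr, 0.0)   # <tr eps>_+ and <tr eps>_-
    g = (1.0 - d) ** 2 + k                        # degradation function g(d)
    psi_plus = 0.5 * K * tr_pos ** 2 + mu * np.tensordot(dev, dev)   # Eq. (2a)
    psi_minus = 0.5 * K * tr_neg ** 2                                # Eq. (2b)
    sigma = g * (K * tr_pos * np.eye(3) + 2.0 * mu * dev) \
            + K * tr_neg * np.eye(3)                                 # Eq. (2c)
    return g * psi_plus + psi_minus, sigma[:2, :2]
```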

The strong form of the governing equations, except the displacement boundary condition at \(\partial {\mathscr {B}}\), read

$$\begin{aligned}&\text {div } \boldsymbol{\sigma } = {\mathbf {0}}, \quad \text {in } {\mathscr {B}}_s \cup S, \end{aligned}$$
(3a)
$$\begin{aligned}&\boldsymbol{\sigma } = \frac{\partial \varPsi }{\partial \boldsymbol{\varepsilon }}, \quad \text {in } {\mathscr {B}}_s, \end{aligned}$$
(3b)
$$\begin{aligned}&\boldsymbol{\sigma } = \frac{\partial \varPsi _1}{\partial \boldsymbol{\varepsilon }}, \quad \text {in } S, \end{aligned}$$
(3c)
$$\begin{aligned}&\frac{\partial \varPsi }{\partial d} +g_c \left( \frac{d}{l} - l\Delta d \right) = 0, \quad \text {in } {\mathscr {B}}_s, \end{aligned}$$
(3d)
$$\begin{aligned}&\boldsymbol{\sigma }\cdot {{\varvec{n}}} \big |_{{\mathscr {B}}_s} = \boldsymbol{\sigma }\cdot {{\varvec{n}}} \big |_S \quad \text {on } \partial S \end{aligned}$$
(3e)
$$\begin{aligned}&{{\varvec{u}}} \big |_{{\mathscr {B}}_s} = {{\varvec{u}}} \big |_{S} \quad \text {on } \partial S \end{aligned}$$
(3f)
$$\begin{aligned}&\nabla d \cdot {{\varvec{n}}} = 0\text { on } \partial {\mathscr {B}}_s, \end{aligned}$$
(3g)

where \({\varvec{n}}\) denotes the outward unit normal vector of \(\partial S\) or \(\partial {\mathscr {B}}\).

The general quasi-static calculation for each load step of the microcrack evolution is shown in Fig. 6: the inputs are the crack configuration (represented by a phase field) at time t and the boundary conditions for \({\varvec{u}}\) and d at the next time step \(t+\Delta t\), and the output is the updated phase field at \(t+\Delta t\). Here t represents a time-like variable to indicate the process of load increment, and likewise \(t+\Delta t\).

Fig. 6 a RVE with microcracks; b boundary conditions of the RVE; c RVE with evolved microcracks

For simplicity, we fix the following boundary conditions on \(\partial {\mathscr {B}}\) and focus on the effect of the crack path at t on its updated counterpart at \(t+\Delta t\). Let \(\overline{\boldsymbol{\varepsilon }}\in {\mathbb {R}}^{2\times 2}\) be the imposed macroscopic strain tensor, then the boundary conditions are set to be

$$\begin{aligned}&{{\varvec{u}}} = \overline{\boldsymbol{\varepsilon }} \cdot {{\varvec{x}}}, \quad \text {on } \partial {\mathscr {B}}. \end{aligned}$$
(3h)
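
For completeness, the boundary condition (3h) amounts to prescribing the displacement of every boundary node from the macroscopic strain. A minimal Python sketch follows, with illustrative values anticipating Sect. 4.1; the function and variable names are our own and not taken from the original implementation.

```python
import numpy as np

def dirichlet_from_macro_strain(boundary_coords, eps_bar):
    """Prescribed boundary displacements u = eps_bar . x, Eq. (3h).
    boundary_coords: (n, 2) coordinates of boundary nodes;
    eps_bar: 2x2 macroscopic strain tensor."""
    return boundary_coords @ eps_bar.T

# uniaxial macroscopic tension in the e2 direction (values of Sect. 4.1)
eps_bar = np.array([[0.0, 0.0],
                    [0.0, 1.4e-4]])
L = 500.0                                          # RVE size parameter, B = (-L, L)^2, in mm
corners = np.array([[-L, -L], [L, -L], [L, L], [-L, L]])
print(dirichlet_from_macro_strain(corners, eps_bar))
```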

3 Manifold learning details

The FE\(^2\) scheme introduced in Sect. 2 requires an unpredictable number of iterations for convergence due to the non-convexity of the functional \(\Pi _l\). In order to reduce the computational cost, we adopt the so-called manifold learning method. The manifold learning scheme uses techniques traditionally designed for machine learning to extract the manifold that represents high-dimensional data points and to perform reconstruction with a minimal amount of computation [28, 30]. The main idea is to generate enough inputs and pre-compute their outputs offline (in this case the phase fields at t and \(t+\Delta t\), respectively), and then provide the desired output for any new input by reconstruction.

In this section, we will elaborate on the manifold learning approach and the LLE technique [26], specialized to the problem stated in Sect. 2. In particular, as we fix the load shown in Fig. 6b, the only input to consider is the initial crack path (i.e. the initial phase field) (Fig. 6a), and the output is the evolved phase field (Fig. 6c) upon equilibrium.

3.1 Locally linear embedding

Locally linear embedding (LLE), proposed by Roweis and Saul [26], is an unsupervised learning algorithm that computes low-dimensional, topology-preserving embeddings of high-dimensional data points. LLE is an instance of kernel principal component analysis (kernel PCA), which handles nonlinear dimensionality reduction [46]. As illustrated in Fig. 7, LLE maps high-dimensional data into a single global coordinate system of lower dimensionality.

Fig. 7 The illustration of locally linear embedding. a A two-dimensional manifold; b the three-dimensional data points sampled from a, colored according to the z-coordinates; c the data points after dimensionality reduction by LLE

In this paper, we use LLE to accelerate the computation of the phase field. The main idea is that from the offline calculation of enough cracked microstructures, we will be able to reconstruct crack evolution due to various initial crack patterns with minimal computation online.

The specific process of LLE is as follows. Suppose that there are N input data points \({{\varvec{X}}}_i\in {\mathbb {R}}^{\mathscr {D}}\) where \(i = 1, \ldots , N\), each \({{\varvec{X}}}_i\) containing the phase field values representing a specific cracked microstructure. According to [26], under the assumption that all inputs are on the same manifold, we can linearly reconstruct each data point \({{\varvec{X}}}_i\) by its \(k_1\) (\(\ll N\)) nearest neighbors, say

$$\begin{aligned} {{\varvec{X}}}_i = \sum _{j\in S_i} W_{ij}{{\varvec{X}}}_{j}, \end{aligned}$$
(4)

where \(W_{ij}\) are the weights to be determined and \(S_i\) represents the set of the \(k_1\) nearest neighbors of \({{\varvec{X}}}_i\) in the \(l^2\)-norm.

To compute these weights \(W_{ij}\), we minimize the cost function which measures the reconstruction errors:

$$\begin{aligned} {\mathscr {F}}({{\varvec{W}}}) = \sum _{i=1}^N \left\Vert {{\varvec{X}}}_i - \sum _{j\in S_{i}} W_{ij}{{\varvec{X}}}_{j}\right\Vert ^2. \end{aligned}$$
(5)

The minimization of \({\mathscr {F}}({{\varvec{W}}})\) is subject to two constraints: (i) each data point \({{\varvec{X}}}_i\) is reconstructed only from its neighbors, i.e., \(W_{ij}=0\) if \({{\varvec{X}}}_j\notin S_{i}\); (ii) the rows of the weight matrix sum to one, \(\sum _{j\in S_{i}} W_{ij} = 1\), \(i = 1, \ldots , N\). An important feature is that, for any data point, the weights are invariant to rotations, rescalings, and translations of that data point and its neighbors [26].
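
The constrained minimization of (5) has the standard closed-form solution of Roweis and Saul [26]: for each data point, solve a small linear system with the local Gram matrix of its neighbors and rescale the solution so that the weights sum to one. The Python sketch below illustrates this step; the regularization parameter and the function name are our own choices for illustration.

```python
import numpy as np

def reconstruction_weights(X, i, neighbor_idx, reg=1e-3):
    """Weights W_ij minimizing the cost (5) for one data point X[i], subject to
    the sum-to-one constraint: solve C w = 1 with the local Gram matrix C,
    then rescale the solution."""
    Zl = X[neighbor_idx] - X[i]                           # neighbors shifted to X_i
    C = Zl @ Zl.T                                         # C_jk = (X_j - X_i) . (X_k - X_i)
    C += reg * np.trace(C) * np.eye(len(neighbor_idx))    # regularization if C is singular
    w = np.linalg.solve(C, np.ones(len(neighbor_idx)))
    return w / w.sum()                                    # enforce sum_j W_ij = 1
```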

Now we suppose that all data points are mapped into a lower dimensional embedding space (manifold) of dimension \({\mathscr {L}}\), \({\mathscr {L}}\ll {\mathscr {D}}\). The reconstruction weights \(W_{ij}\) remain unchanged in such transformation. Therefore, each high dimensional data point \({{\varvec{X}}}_i\) is mapped to a low dimensional vector \({{\varvec{Y}}}_i\) representing coordinates on the manifold. We compute \({{\varvec{Y}}}:=\{{{\varvec{Y}}}_i\}\) by minimizing the embedding cost function

$$\begin{aligned} {\mathscr {G}}({{\varvec{Y}}}) = \sum _{i=1}^N \left\Vert {{\varvec{Y}}}_i - \sum _{j\in S_{i}} W_{ij}{{\varvec{Y}}}_{j}\right\Vert ^2. \end{aligned}$$
(6)

During this minimization, the weights \(W_{ij}\) are fixed. To fully determine \(\{{{\varvec{Y}}}_i\}\), certain constraints have to be imposed so that the solution is unique [26]. The resulting constrained minimization problem can be solved via an \(N\times N\) eigenvalue problem.
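
For illustration, the embedding step can be sketched as follows: collect the weights into the \(N\times N\) matrix \({{\varvec{W}}}\) and take as embedding coordinates the eigenvectors of \(({{\varvec{I}}}-{{\varvec{W}}})^{T}({{\varvec{I}}}-{{\varvec{W}}})\) associated with the smallest nonzero eigenvalues, which is the standard way of minimizing (6) under the uniqueness constraints. This is a minimal sketch under those standard assumptions, not the implementation used in this work.

```python
import numpy as np

def lle_embed_from_weights(W, L_dim):
    """Embedding step of LLE: given the N x N weight matrix W (rows summing to
    one), minimize the cost (6) by keeping the eigenvectors of
    M = (I - W)^T (I - W) with the smallest nonzero eigenvalues."""
    N = W.shape[0]
    I = np.eye(N)
    M = (I - W).T @ (I - W)
    eigval, eigvec = np.linalg.eigh(M)     # eigenvalues in ascending order
    # the first eigenvector is (close to) constant with eigenvalue ~ 0; discard it
    return eigvec[:, 1:L_dim + 1]          # N x L_dim embedding coordinates Y
```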

3.2 Training and output reconstruction

As previously discussed, the offline procedure of this manifold learning scheme consists of two stages: (1) dataset generation with the phase field analysis for the RVE, (2) data manifold construction with LLE. Then for any given phase field under the same load, the online reconstruction procedure readily delivers the phase field evolution.

To generate the training data, we subject a series of RVEs with an initial crack at various locations to the unilateral tension test. The configuration and mesh with an initial phase field are shown in Fig. 8. The mesh shown in Fig. 8b contains \({\mathscr {D}}\) nodes, so every input data point \({{\varvec{X}}}_i\) as well as the corresponding output data point \({{\varvec{Z}}}_i\) is a column vector with \({\mathscr {D}}\) phase field values.

Fig. 8 a Setup of the boundary value problem for the RVE; b mesh and a typical initial phase field

Here we make some simplifications for the microcrack simulation so that we can better illustrate the main idea: (1) as mentioned in Sect. 2, the load is a unilateral tension with a prescribed displacement as in (3h), where the macroscopic strain is \(\overline{\boldsymbol{\varepsilon }} = {\overline{\varepsilon }}_{22} {{\varvec{e}}}_2 \otimes {{\varvec{e}}}_2\); (2) we only consider cracks in the matrix and ignore those on the interface and in the fiber; (3) the initial crack consists of two edges and three connected nodes, where nodes belonging to the same element are not allowed to be chosen.

With the phase field values \(d=1\) imposed at the three nodes mentioned in (3) above and with an all-zero displacement field \({{\varvec{u}}}\equiv {{\varvec{0}}}\), we minimize (1) to get an “equilibrated” phase field as a typical input \({{\varvec{X}}}_i\). The totality of such inputs is termed the training set. The process of construction of the data manifold with the training set is illustrated in Fig. 9.

Fig. 9 The process of manifold learning using LLE

For each input \({{\varvec{X}}}_i\) in the training set, we generate the high-fidelity solution of the evolved phase field with a finite element program; the result is denoted \({{\varvec{Z}}}_i\). Notice that only the input data are used during the LLE construction, while the output data \(\left\{ {{\varvec{Z}}}_i\right\}\) are used only for reconstruction. The output data are not limited to the phase field solution at the given load, nor do they need to have the same dimension as the input data points.

Once we obtain the data manifold, we reconstruct the output, denoted \({{\varvec{Z}}}^{*}_i\), for every new input \({{\varvec{X}}}^{*}_i\) not in the training set through the following process (a code sketch of these steps is given after the list):

  1.

    We find the \(k_2\) (\(\ll N\), possibly equal to \(k_1\), see Sect. 4 for more details) nearest neighbors of \({{\varvec{X}}}_i^*\) in \({{\varvec{X}}}\) and the corresponding weights in the high-dimensional space \({\mathbb {R}}^{\mathscr {D}}\); we then map \({{\varvec{X}}}_i^*\) to a point \({{\varvec{Y}}}_i^*\in {\mathbb {R}}^{\mathscr {L}}\) on the low-dimensional manifold.

  2.

    We find the \(k_2\) nearest neighbors of \({{\varvec{Y}}}_i^*\) in \({{\varvec{Y}}}\), called \(S_i^*\), and their weights \(W_{ij}\) in the low dimensional manifold. Note that these neighbors may not correspond to those in the previous step.

  3.

    We reconstruct the output as a locally linear combination of the corresponding \(k_2\) nearest neighbors in the high-dimensional output space, using these weights:

    $$\begin{aligned}{{\varvec{Z}}}_i^{*}=\sum _{j\in S_i^*}W_{ij}{{\varvec{Z}}}_j.\end{aligned}$$
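
A minimal Python sketch of these three steps is given below. The helper function _lle_weights re-implements the sum-to-one weight solution of Sect. 3.1; all names and default parameters are illustrative rather than taken from the original code.

```python
import numpy as np

def _lle_weights(query, neighbors, reg=1e-3):
    """Sum-to-one locally linear reconstruction weights of `query` from `neighbors`."""
    Zl = neighbors - query
    C = Zl @ Zl.T
    C += reg * np.trace(C) * np.eye(len(neighbors))
    w = np.linalg.solve(C, np.ones(len(neighbors)))
    return w / w.sum()

def reconstruct_output(x_new, X, Y, Z, k2):
    """Online LLE reconstruction of the output for a new input (steps 1-3 above)."""
    # step 1: neighbors and weights in the high-dimensional input space
    idx1 = np.argsort(np.linalg.norm(X - x_new, axis=1))[:k2]
    y_new = _lle_weights(x_new, X[idx1]) @ Y[idx1]       # image of x_new on the manifold
    # step 2: neighbors and weights on the low-dimensional manifold
    idx2 = np.argsort(np.linalg.norm(Y - y_new, axis=1))[:k2]
    w = _lle_weights(y_new, Y[idx2])
    # step 3: locally linear reconstruction of the high-dimensional output
    return w @ Z[idx2]
```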

4 Numerical implementation and validation

In this section, we detail the numerical implementation along with a validation check for the computational strategy.

4.1 Data generation

In our high-fidelity finite element analysis, the material constants are chosen according to Table 1. The RVE size is \(L=500\) mm and the macroscopic strain is \(\overline{\boldsymbol{\varepsilon }} = {\overline{\varepsilon }}_{22} {{\varvec{e}}}_2 \otimes {{\varvec{e}}}_2\) with \({\overline{\varepsilon }}_{22}=1.4\times 10^{-4}\). The regularization length scale l is chosen such that \(h\le l/2\), where h is the mesh size. We randomly generated 496 initial phase fields as detailed in Sect. 3.2, of which 464 data points are used for training (manifold learning) and 32 for testing.

Table 1 Material parameters used in the high-fidelity finite element simulations

4.2 Parameter selection by cross validation

Once the data points are generated, parameter selection is conducted for the manifold learning and reconstruction. Recall the LLE manifold is defined by two hyperparameters \(k_1\) and \({\mathscr {L}}\), while the reconstruction process is defined by one hyperparameter \(k_2\). Hence, the complete manifold model for the problem requires three hyperparameters (\(k_1, k_2, {\mathscr {L}}\)).

The adopted parameter selection method is cross validation (CV). Through CV we select the combination of hyperparameters that best balances cost and accuracy. The CV process proceeds as follows. Out of the whole dataset, we randomly select \(N=490\) data points and split them into \(n=10\) equal-sized, mutually disjoint subsets \({{\varvec{X}}}^{(1)}\),...,\({{\varvec{X}}}^{(n)}\). We then choose \(n-1\) subsets as the training set to generate the manifold and use the remaining one, say the jth subset \({{\varvec{X}}}^{(j)}\), for validation. Let \({{\varvec{Z}}}^{(j)}=\{{{\varvec{Z}}}_i^{(j)}\}\) denote the corresponding output phase field data for the validation set, and \({{\varvec{Z}}}^{*(j)}=\{{{\varvec{Z}}}^{*(j)}_i\}\) the LLE reconstruction. Then the final CV error R reads

$$\begin{aligned} R = \frac{1}{n} \sum _{j=1}^{n} \sum _i \frac{\Vert {{\varvec{Z}}}_i^{*(j)}-{{\varvec{Z}}}_i^{(j)}\Vert _{l^2}}{\Vert {{\varvec{Z}}}_i^{(j)}\Vert _{l^2}}. \end{aligned}$$

This procedure is illustrated in Fig. 10.

Fig. 10 The process of cross validation
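
The CV loop of Fig. 10 can be sketched as follows. The function is written generically: reconstruct(X_train, Z_train, x_new) stands for the offline training plus online reconstruction of Sect. 3 for one hyperparameter combination \((k_1,k_2,{\mathscr {L}})\), and the fold splitting and error accumulation follow the definition of R above. This is an illustrative sketch, not the code used to produce Tables 2 and 3.

```python
import numpy as np

def cv_error(X, Z, reconstruct, n_folds=10, seed=0):
    """k-fold cross-validation error R of Sect. 4.2.  `reconstruct` is any
    surrogate with signature reconstruct(X_train, Z_train, x_new) -> z_star."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    R = 0.0
    for val_idx in folds:                                    # hold one fold out
        train_idx = np.setdiff1d(np.arange(len(X)), val_idx)
        X_tr, Z_tr = X[train_idx], Z[train_idx]
        fold_err = 0.0
        for i in val_idx:                                    # validate on the held-out fold
            z_star = reconstruct(X_tr, Z_tr, X[i])
            fold_err += np.linalg.norm(z_star - Z[i]) / np.linalg.norm(Z[i])
        R += fold_err
    return R / n_folds                                       # average over the n folds
```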

The procedure to select the hyperparameters consists of two stages: (1) the dimension reduction process involving \(k_1\), and (2) the reconstruction process involving \(k_2\). Iterating through combinations of \(({\mathscr {L}}, k_1,k_2)\) with a fixed \(k_2\), we obtain the error matrix shown in Table 2, whose columns denote values of \(k_1/k_2\) and whose rows denote values of \({\mathscr {L}}\). We find that \(k_1=k_2\) yields a low CV error, which is reasonable: \(k_1>k_2\) leads to information loss in the reconstruction process, while \(k_1<k_2\) adds noise to it.

Table 2 CV error with different combinations of \(k_1/k_2\) and \({\mathscr {L}}\), with \(k_2=20\)

Then we fix \(k_1=k_2\) and perform more CV to obtain Table 3, from which we determine that \(k_1=k_2=20\) gives a relatively low CV error for each \({\mathscr {L}}\).

Table 3 CV error with different combinations of \(k_1 (= k_2)\) and \({\mathscr {L}}\)

Then, we plot the CV error as a function of \({\mathscr {L}}\) in Fig. 11. The figure indicates that increasing \({\mathscr {L}}\) reduces the average error, as expected. However, a larger \({\mathscr {L}}\) also increases the training time. Therefore, we follow the standard practice of making the trade-off at the elbow of the curve, i.e., at \({\mathscr {L}}=80\): beyond this value, the error decreases very slowly while the training cost keeps growing.

Fig. 11 CV error versus \({\mathscr {L}}\)

In conclusion, the chosen hyperparameters are \((k_1,k_2,{\mathscr {L}})=(20,20,80)\).

4.3 Reconstruction error analysis for the phase field

In this subsection, the output is the evolved phase field \({{\varvec{Z}}}^*_i=\left\{ d_{j}\right\} _i\), where \(j=1,2,\ldots ,{\mathscr {D}}\). Therefore, the output \({{\varvec{Z}}}^*_i\) and input \({{\varvec{X}}}_i\) have the same dimension. A histogram showing the reconstruction errors is given in Fig. 12, where we use the normalized \(l^2\)-norm to represent the error magnitude in the output phase field, i.e.,

$$\begin{aligned} \frac{\Vert {{\varvec{Z}}}^*_i - {{\varvec{Z}}}_i\Vert _{l^2}}{\Vert {{\varvec{Z}}}_i \Vert _{l^2}}. \end{aligned}$$
(7)

From this figure it can be seen that the LLE reconstruction error for the phase field is acceptable.

Fig. 12 Normalized \(l^2\) reconstruction error of the evolved phase field, i.e., \(\Vert {{\varvec{Z}}}^*_i - {{\varvec{Z}}}_i\Vert _{l^2}/\Vert {{\varvec{Z}}}_i \Vert _{l^2}\), of the 32 test data points

Fig. 13 Normalized \(l^2\) error of the evolved phase field versus the distance to the manifold. The \(l^2\) errors have a positive correlation with the distance to the manifold

To examine what determines this error, we plot the normalized \(l^2\) error of the 32 test points versus their distance to the manifold in Fig. 13. Here the distance of \({{\varvec{X}}}_i^*\) to the manifold is given by

$$\begin{aligned} \left\| {{\varvec{X}}}_i^*-\sum _{j\in S_i^*} W_{ij} {{\varvec{X}}}_j\right\| _{l^2}. \end{aligned}$$

A positive relationship between the reconstruction error and this distance is observed, without outliers. Thus, if a test data point is close enough to the manifold, the reconstruction error of its microcrack propagation result will be small, which supports the validity of the LLE manifold learning method.

4.4 Reconstruction error analysis for the homogenized stress

In this subsection, the output is the homogenized stress \({{\varvec{Z}}}^*_i=\overline{\boldsymbol{\sigma }}_i\), with \(\overline{\boldsymbol{\sigma }}_i=\left\{ {\overline{\sigma }}_x, {\overline{\sigma }}_y, {\overline{\sigma }}_z, {\overline{\sigma }}_{xy}\right\} _i\), where for the plane strain case \({\overline{\sigma }}_z=\nu ({\overline{\sigma }}_x+ {\overline{\sigma }}_y)\) in the matrix and likewise in the fiber. As Fig. 1 shows, the homogenized stress is obtained from the RVE through the volume average

$$\begin{aligned} \overline{\boldsymbol{\sigma }}=\frac{1}{|{\mathscr {B}}|}\int _{{\mathscr {B}}} \boldsymbol{\sigma }\; \mathrm {d}{\mathscr {B}}. \end{aligned}$$
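
In a Gauss-quadrature setting, this volume average reduces to a weighted sum of the quadrature-point stresses. A minimal sketch follows; the array shapes and names are assumptions made for illustration.

```python
import numpy as np

def homogenized_stress(sigma_gp, weights_gp):
    """Volume-averaged (homogenized) stress of the RVE: sum of the quadrature-point
    stresses weighted by their integration volumes, divided by the total RVE volume.
    sigma_gp: (n_gp, 2, 2) stresses at the Gauss points; weights_gp: (n_gp,) volumes."""
    total_volume = weights_gp.sum()
    return np.einsum('g,gij->ij', weights_gp, sigma_gp) / total_volume
```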

Then the normalized reconstruction error in \(l^2\)-norm (7) becomes

$$\begin{aligned} \frac{\Vert \overline{\boldsymbol{\sigma }}^*_i - \overline{\boldsymbol{\sigma }}_i\Vert _{l^2}}{\Vert \overline{\boldsymbol{\sigma }}_i \Vert _{l^2}}. \end{aligned}$$
(8)

The normalized reconstruction error of the homogenized stress is shown in Fig. 14. The error is smaller than 0.05 for all test points, which is very small. Figure 15 shows that the reconstruction error is bounded by a constant times the distance to the manifold, indicating a similar conclusion, i.e., that an a priori error estimate can be obtained.

From Figs. 13 and 15, the positive relationship between the reconstruction error and the distance to the manifold can serve as an input-specific error bar, which will be elaborated in Sect. 5.

Fig. 14 Normalized \(l^2\) reconstruction error of the homogenized stress, i.e., \(\Vert \overline{\boldsymbol{\sigma }}^*_i - \overline{\boldsymbol{\sigma }}_i\Vert _{l^2}/\Vert \overline{\boldsymbol{\sigma }}_i \Vert _{l^2}\), of the 32 test data points

Fig. 15 Normalized \(l^2\) error of the homogenized stress versus the distance to the manifold. The \(l^2\) errors are bounded by a factor times the distance to the manifold

5 Results and discussion

In this section, we present the manifold learning results and discuss the features, applications and future directions of the proposed approach.

5.1 2D visualizations of results

To remove data bias, we generate a new set of 496 data points that shares no points with the set used for parameter selection (cross validation) in Sect. 4.2. With the selected hyperparameters \((k_1,k_2,{\mathscr {L}})=(20,20,80)\), we build the model using 464 data points for training and use the remaining 32 data points for testing. To visualize the manifold built from the training data together with the test data, we perform a further LLE reduction from the 80-dimensional manifold to 2 dimensions, as shown in Fig. 16. It can be observed that the test data points are not far from the manifold trained from the training data.

Fig. 16 2D visualization of the 80D manifold

Fig. 17 a Nearest 20 neighbor points of test point No. 11; b Nearest 20 neighbor points of test point No. 13

Next we extract and visualize the nearest neighbors of a certain data point, as shown in Fig. 17a, b.

In Fig. 17a, we observe that the nearest neighbors in the training set are close to the chosen test data point (point No. 11). In Fig. 17b, however, the nearest neighbors of the chosen test data point (point No. 13) appear scattered. This is still acceptable, since the distances along the remaining 78 dimensions are not visible in the figure.

We next visualize the cracked microstructures in Fig. 18, where we observe that similar microstructures cluster and vary continuously along the manifold, indicating that the dimension reduction is reasonable.

Fig. 18 Cracked microstructures mapped into the manifold described by the first two LLE coordinates. Representative microstructures are shown next to the square points. The solid line shows a continuous mode change of microstructures

5.2 Input-specific error bar

As shown in Figs. 13 and 15, a positive relationship between the reconstruction error and the distance to the manifold is observed. Through this strong correlation, we can determine in advance whether a new input data point \({{\varvec{X}}}_i^*\) is suitable for the manifold learning approach: if \({{\varvec{X}}}_i^*\) is close enough to the manifold, the reconstruction of the phase field at the given load will be accurate; otherwise, if it is far away from the manifold, we should either not use the manifold reconstruction for this particular input or augment the training set with \({{\varvec{X}}}_i^*\). This property can also be exploited in an adaptivity procedure that augments the training set on the fly: if the distance from a new input \({{\varvec{X}}}_i^*\) to its manifold projection is too large, we add it (and its output from the high-fidelity computation) to the training set.
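
A sketch of this decision rule is given below. The threshold tol is a user-chosen value, e.g. calibrated from correlation plots such as Figs. 13 and 15; it is not specified in this work and is therefore an assumption of the sketch, as are the function names.

```python
import numpy as np

def distance_to_manifold(x_new, X, k2, reg=1e-3):
    """Distance from a new input to its locally linear reconstruction by its
    k2 nearest training inputs, used as the a priori error indicator."""
    idx = np.argsort(np.linalg.norm(X - x_new, axis=1))[:k2]
    Zl = X[idx] - x_new
    C = Zl @ Zl.T
    C += reg * np.trace(C) * np.eye(k2)          # regularize the local Gram matrix
    w = np.linalg.solve(C, np.ones(k2))
    w /= w.sum()                                 # sum-to-one weights
    return np.linalg.norm(x_new - w @ X[idx])

def use_reconstruction(x_new, X, k2, tol):
    """Greedy decision rule: accept the cheap LLE reconstruction when the input
    lies close enough to the manifold; otherwise run the high-fidelity solver
    and (optionally) add the new input/output pair to the training set."""
    return distance_to_manifold(x_new, X, k2) <= tol
```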

5.3 On the number of sampling microstructures

In this subsection, we compare reconstruction errors by manifolds learned from different training sets. The output to compare is the evolved phase field, and the only variable is the number of the sampling microstructures in the training set.

Fig. 19 Normalized \(l^2\) reconstruction error of the evolved phase field versus the number of sampling microstructures

In Fig. 19 we compare the reconstruction errors obtained from 100, 200, 300, 400, 464, and 496 sampling microstructures. As expected, the results become more accurate with more sampling microstructures in the training set. Moreover, the errors from 464 and 496 microstructures are both small and nearly identical. Hence, 496 sampling microstructures are sufficient for the types of crack paths considered.

5.4 On the computational costs

As expected, the manifold learning method and the high-fidelity finite element method differ dramatically in computational cost. On a regular laptop using MATLAB, the high-fidelity program spends \(5 \pm 1\) min on a single case, while the manifold approach needs less than 1 s.

5.5 Applications and future directions

A number of generalizations can be made for the proposed approach. For example, it can be generalized to more complicated RVEs, such as those with elastoplastic constitutive behavior. Different boundary conditions can be applied, and various outputs can be obtained. In fact, the output can be of high dimension, as long as it depends continuously on the input, which is a prerequisite of a well-posed problem anyway.

Generalizations to different crack types, e.g., debonding and cracking in the fibers, are also possible. In the proposed manifold learning approach, such generalizations can be realized by modifying the initial input data and the boundary conditions; for example, combinations of the phase field method with cohesive models have been proposed in [47, 48]. Moreover, through the adaptive algorithm introduced in Sect. 5.2, efficient multiscale fracture simulation is possible.

In summary, the applicability of this approach is promising.

6 Conclusions

We have proposed a manifold learning approach to accelerate phase field fracture simulations in the RVE in the context of the FE\(^2\) scheme. Considering a group of RVEs with the same microstructure except for the microcracks, we use the phase field approach to represent such microcracks.

We then make use of the LLE technique to construct a data manifold that contains a collection of similar cracked microstructures (RVEs). This LLE manifold can be used to efficiently and accurately predict the phase field output as a function of the initial phase field, provided that all the analyses are performed at the same load applied to the RVE. Among various machine learning approaches, manifold learning has been widely applied to multiscale analysis [21,22,23,24]. As an instance of manifold learning, LLE is particularly suitable for problems with a large amount of similar high-dimensional data, such as heat conduction problems [28]. The proposed computational approach enjoys the following features. First, only three hyperparameters need to be determined to learn the manifold. Second, once the data manifold is constructed, minimal computation is required to reconstruct the phase field output. Moreover, there exists an indicator which can pre-estimate the reconstruction error and thus pre-determine whether an input data point is suitable for the reconstruction. We emphasize that this feature is very desirable: for more popular machine learning techniques such as neural networks, it is difficult to predict whether an interpolation is accurate without knowing the exact solution.

A number of generalizations can be made for the proposed approach, e.g., to three dimensions, and to the types of RVEs, boundary conditions, and outputs. In fact, the output can be of a high dimension, as long as there exists a continuous dependence of the output on the input, which is anyway a prerequisite of a well-posed problem. The applicability of this approach is promising. The adaptive algorithm makes efficient multiscale fracture simulation possible.