FE${}^\textrm{ANN}$: an efficient data-driven multiscale approach based on physics-constrained neural networks and automated data mining

Kalina, Karl A.; Linden, Lennart; Brummund, Jörg; Kästner, Markus

doi:10.1007/s00466-022-02260-0

Original Paper
Open access
Published: 08 February 2023

Volume 71, pages 827–851, (2023)

Computational Mechanics

Abstract

Herein, we present a new data-driven multiscale framework called FE${}^\textrm{ANN}$ which is based on two main keystones: the usage of physics-constrained artificial neural networks (ANNs) as macroscopic surrogate models and an autonomous data mining process. Our approach allows the efficient simulation of materials with complex underlying microstructures which reveal an overall anisotropic and nonlinear behavior on the macroscale. Thereby, we restrict ourselves to finite strain hyperelasticity problems for now. By using a set of problem-specific invariants as the input of the ANN and the Helmholtz free energy density as the output, several physical principles, e. g., objectivity, material symmetry, compatibility with the balance of angular momentum and thermodynamic consistency, are fulfilled a priori. The necessary data for the training of the ANN-based surrogate model, i. e., macroscopic deformations and corresponding stresses, are collected via computational homogenization of representative volume elements (RVEs). Thereby, the core feature of the approach is a completely autonomous mining of the required data set within an overall loop. In each iteration of the loop, new data are generated by gathering the macroscopic deformation states from the macroscopic finite element simulation and subsequently sorting them by using the anisotropy class of the considered material. Finally, all unknown deformations are prescribed in the RVE simulation to obtain the corresponding stresses and thus to extend the data set. The proposed framework consequently allows the number of time-consuming microscale simulations to be reduced to a minimum. It is applied to several descriptive examples, where a fiber reinforced composite with a highly nonlinear Ogden-type behavior of the individual components is considered. Thereby, a validation of the approach demonstrates a rather high accuracy.

1 Introduction

Materials with an underlying meso- or microstructure, e. g., composites, solid foams, dual-phase steels or 3D-printed structures, enable the targeted design of engineering components with respect to their application. However, due to effects such as anisotropy, nonlinear or multiphysics phenomena, the experimental characterization of the effective constitutive behavior of such materials can be very complex.

1.1 Multiscale schemes

To avoid such a characterization of the material’s effective behavior, computational multiscale schemes can be used. These schemes allow the simulation of engineering components made of materials with underlying microstructure solely based on information about the microstructural arrangement and the properties of the individual components, e. g., matrix and inclusions. Basically, two different types of multiscale schemes exist: the coupled multiscale scheme which is also known as FE${}^2$ approach (finite element square) [13, 42, 48, 49, 62, 68, 69, 85] and the decoupled or sequential multiscale scheme [15, 27, 39, 40, 67, 74, 83, 84].

The FE${}^2$ method allows the microscopic and macroscopic scales to be completely coupled without the need for the explicit formulation of an effective constitutive model. It is thus universally applicable to arbitrary geometries if the necessary microscopic information is available. However, the decisive disadvantage is the very high computational effort, which results from the solution of the microscopic boundary value problem (BVP) for the homogenization at each integration point of the macroscopic FE mesh.

Within a decoupled multiscale scheme, the material’s effective behavior is initially determined from homogenization of representative volume elements (RVEs) and then a suitable constitutive model, the so-called macro or surrogate model, is calibrated by these data. With this model, macroscopic BVPs can now be solved, whereby the influence of the microstructure is implicitly captured by the macro model. Thus, in contrast to the FE${}^2$ method, the explicit solution of the microscopic BVP at each integration point is omitted. The central disadvantage, however, is that the formulation of such a model can be extremely complicated.

1.2 Data-based methods in solid mechanics

To circumvent the time-consuming task of formulating and calibrating a surrogate model within a decoupled multiscale scheme, data-based or data-driven techniques are very promising and have the potential to improve or replace traditional constitutive models. These techniques have become increasingly popular in the computational mechanics community in recent years [3, 63]. In the following, a brief overview of the most common methods and their application to multiscale schemes is given.

1.2.1 Overview on data-based constitutive modeling

A relatively new strategy to substitute classical constitutive equations is the data-driven mechanics approach, initially proposed by Kirchdoerfer and Ortiz [43] and since extended to, e. g., noisy data sets [44], finite strains [64], inelasticity [8, 11] or fracture mechanics [6]. This method completely avoids the use of constitutive equations. Instead, sets of stress-strain tuples which characterize the material’s behavior are used. A data-driven solver hence seeks to minimize the distance between the searched solution and the material data set within a proper energy norm, while compatibility and equilibrium have to be satisfied simultaneously.

The construction of so-called constitutive manifolds from collected data is an alternative approach which is described for elasticity and inelasticity by Ibañez et al. [37]. With this technique, it is possible to strictly fulfill the 2nd law of thermodynamics, i. e., the thermodynamic consistency, by using the GENERIC paradigm (General Equation for Non-Equilibrium Reversible-Irreversible Coupling) during the construction of constitutive manifolds [31].

Besides the previously mentioned techniques, there exist numerous data-based methods originating from the field of machine learning (ML). Probably the most common technique is the application of artificial neural networks (ANNs), which were already proposed in the early 1990s in the pioneering work of Ghaboussi et al. [28]. In the last decades, ANNs have been intensively used for mechanical material modeling and simulations by means of the finite element method (FEM), e. g., in [4, 33, 38, 71, 73, 86] among others.

However, in general, a large amount of data is required to train ANNs to serve as robust and accurate surrogate models for systems with complex underlying physics, e. g., constitutive models. In this context, a comparatively new branch of ML techniques related to constitutive modeling is formed by approaches classified as physics-informed, physics-constrained [17, 21], mechanics-informed [2], physics-augmented [46], or hybrid models [59, 70].^{Footnote 1} In these methods, essential physical principles and information are inserted into the ML-based model or parts of classical models are replaced with data-based methods, which leads to an improvement of the extrapolation capability and enables training with sparse data. In the case of ANNs, this can be achieved via the network architecture or by adapted training algorithms. By choosing problem-specific invariant sets as the input variables, material symmetries are automatically satisfied for hyperelastic models [38, 45, 53, 75]. Furthermore, the thermodynamic consistency can be fulfilled by choosing the free energy as the output quantity. In order to train the ANN with respect to the stresses, gradients of the output with respect to the input are inserted into the loss [12, 45, 46, 52, 53]. This technique is also referred to as Sobolev training in [77, 80]. Furthermore, physical knowledge can be inserted via constrained training processes [81]. For the consideration of dissipative behavior, Masi et al. [61] proposed an adapted ANN architecture consisting of two feedforward neural networks (FNNs). Thereby, the first network is used to predict the evolution of internal variables and the second for approximating the free energy. Thus, internal state variables are needed for the training process. Another physically informed approach to dissipative materials, which has the advantage of requiring only stresses and strains for training, is shown in [35]. Thereby, the internal state variables capturing the path-dependency are inferred automatically from the hidden state of recurrent neural networks (RNNs). Finally, the combination of classical models with data-based techniques is a further method to insert physical knowledge. Thereby, only parts of a model are replaced by data-based techniques, e. g., in plasticity models [59, 70, 79].

1.2.2 Data-based multiscale modeling and simulation

In the context of multiscale problems, the mentioned data-based methods can be used as surrogate models which replace the computationally expensive simulation of RVEs [54]. The high flexibility of trained networks with simultaneously excellent prediction qualities is thereby proven in numerous works: In [12], ANNs are used to describe the homogenized response of cubic lattice metamaterials exhibiting large deformations and buckling phenomena. Thereby, several types of hyperelastic ML-based models which fulfill basic physical requirements are used and compared to each other. Based on this, an extension to polyconvex hyperelastic models is given in [45]. An application of polyconvex ANNs to electro-elasticity is shown in [46]. In [51], FNNs and RNNs are used to replicate the homogenization of RVEs revealing inelastic behavior of the individual components. Thereby, also unknown paths can be predicted with the trained networks. Similarly, ANNs that replace the inelastic constitutive responses of composite materials sampled by RVE simulations are shown in [19]. Thereby, in addition, a deep reinforcement learning combinatorics game is used to automatically find an optimal set of network hyper-parameters from a decision tree. A data-driven multiscale framework called deep material network is shown in [25, 55, 56]. Thereby, the homogenized RVE response is reproduced by a network including a collection of connected mechanistic building blocks with analytical homogenization solutions. With that, a complex effective response could be described without the loss of essential physics. An extension of the deep material network approach to fully coupled thermo-mechanical multiscale simulations of composite materials is given in [26]. The application of ANNs used as a surrogate for molecular dynamics simulations is discussed in [7, 80].

Furthermore, ANNs can be used to link microstructural characteristics and effective properties. For example, in [24], an ML framework for predicting macroscopic yield as a function of crystallographic texture is described. An ML-based multiscale calibration of constitutive models representing the effective response of rate dependent composite materials is applied to brain white matter in [14]. Based on a library of calibrated parameters corresponding to a set of microstructural characteristics, an ML model which predicts the constitutive model parameters directly from a new microstructure is trained. Similarly, an ANN is trained to predict the elastic properties of short fiber reinforced plastics in [5].

Finally, the application of data-based techniques as surrogate models in decoupled multiscale schemes, which enable the simulation of macroscopic engineering components, is promising. In the pioneering work [50], a decoupled multiscale scheme using an ANN-based surrogate model has been shown for elastic composites with cubic microstructures. Another multiscale methodology is presented in [20]. Therein, a hybrid macroscopic surrogate model is used, i. e., a traditional constitutive model is combined with a data-driven correction. An ML-based multiscale framework for the simulation of the elastic response of metals having a polycrystalline microstructure is presented in [77]. Therein, the database is generated by using a 3D FFT (fast Fourier transform) solver. Based on this, [78, 79] show an extension to elastoplasticity, whereby ANNs are used for the description of the yield surface and the stress within a hybrid modeling approach. Similarly, the simulation of the elastic-plastic deformation behavior of open-cell foam structures is shown in [59, 70]. Thereby, a hybrid model including two ANNs is used as the macroscopic surrogate model. Therein, the first network serves for the description of the macroscale yield function and the second one for the prediction of the flow direction. In [26], two-scale simulations of thermo-mechanical problems are solved by using deep material networks. Applying the modeling strategy initially proposed in [61], a multiscale scheme for the inelastic behavior of lattice material structures is described in [60]. An application of ANNs as surrogate models describing the anisotropic electrical response of graphene/polymer nanocomposites is shown in [58]. Furthermore, in [82], the application of RNNs as surrogate models within multiscale simulations of elastoplastic problems is shown and the RNN-based approach is compared to full FE${}^2$ simulations. An application of RNN surrogate models to viscoplasticity is described in [29]. A combination of fully coupled FE${}^2$ simulations with an adaptive on-the-fly switching to ANN-based surrogate models for the complex simulation of RVEs is shown in [18].

In addition, a multiscale framework based on the data-driven mechanics approach [43] is presented in [41] for the application to sand. Thereby, the necessary data set is extracted from lower scale simulations. In [47], a multi-level method is used within a data-driven multiscale scheme which is applied for the simulation of solid foam materials. Finally, a data-driven multiscale scheme with the purpose of taking into account polymorphic uncertainties of the underlying microstructure is presented in [87]. Thereby, material uncertainties are considered within one data set containing uncertain stress and strain states.

1.3 Content

Within this contribution, an efficient data-driven multiscale approach called FE${}^\textrm{ANN}$, which makes use of physics-constrained ANNs as a macroscopic surrogate model, is presented. The approach allows materials with complex microstructures leading to an overall anisotropic behavior to be considered, whereby a restriction to hyperelastic solids is made for now. The ANN-based surrogate model automatically fulfills several physical principles, e. g., objectivity, material symmetry, compatibility with the balance of angular momentum and the thermodynamic consistency. This is done by using a set of problem-specific invariants $I_k$ as the input of the network and the Helmholtz free energy density $\psi $ as the output, cf. Linka et al. [53], Klein et al. [45] or Linden et al. [52], among others. In addition, the weights of the network can be constrained such that the growth condition is guaranteed to be fulfilled. The data which are used to train the ANN are collected via computational homogenization of an RVE representing the material’s microstructure. Thereby, in contrast to most of the data-driven multiscale approaches from the literature, it is not necessary to explore all required macroscopic states of deformation which occur in the macroscopic simulation in advance, e. g., by Latin hypercube sampling [20]. Instead, the required data, i. e., effective deformations and corresponding stresses, are determined by homogenization in a fully autonomous way within the framework by collecting the relevant deformations from the macroscopic FE simulation. This procedure is similar to the approach presented by Korzeniowski and Weinberg [47] which is based on the data-driven mechanics approach [43]. Here, in addition, the macroscopic deformations are mapped into an invariant space associated with the symmetry group of the considered material. In this space, the selection of relevant states is done so that the number of time-consuming microscale simulations can be reduced to a minimum. Moreover, it is not necessary to perform a rough scan of the relevant deformation area in advance here. This is possible through the use of a physics-constrained ANN, which prevents violating basic physical principles even though the network must extrapolate into an unknown region of states. The proposed framework is applied to several descriptive examples, where a fiber reinforced composite with a highly nonlinear behavior of the individual components is considered.
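The idea of using invariants as the network input and the free energy as the output can be illustrated with a minimal sketch. The following NumPy code is not the network of this work: the single hidden layer, the softplus activation, the layer sizes and the random weights are assumptions made only for illustration. It evaluates a scalar energy $\psi (I_1,\dots ,I_5)$ and its analytical gradient $\partial \psi /\partial I_\alpha $, which is the quantity that enters the stress through the chain rule, and compares the gradient with central finite differences.

```python
import numpy as np

# Hypothetical network: 5 invariants -> 8 hidden neurons -> scalar energy.
rng = np.random.default_rng(0)
N_IN, N_HIDDEN = 5, 8
W1 = rng.normal(size=(N_HIDDEN, N_IN))   # hidden-layer weights (random, untrained)
b1 = rng.normal(size=N_HIDDEN)           # hidden-layer biases
w2 = rng.normal(size=N_HIDDEN)           # output weights


def softplus(x):
    """Smooth activation log(1 + exp(x)), evaluated in a numerically stable way."""
    return np.logaddexp(0.0, x)


def sigmoid(x):
    """Derivative of softplus."""
    return 1.0 / (1.0 + np.exp(-x))


def psi_nn(I):
    """Scalar energy predicted from the invariant vector I."""
    return w2 @ softplus(W1 @ I + b1)


def dpsi_nn_dI(I):
    """Analytical gradient of the energy with respect to the invariants."""
    return W1.T @ (w2 * sigmoid(W1 @ I + b1))


def dpsi_fd(I, h=1e-6):
    """Central finite-difference gradient, used only as a check."""
    grad = np.zeros_like(I)
    for a in range(I.size):
        step = np.zeros_like(I)
        step[a] = h
        grad[a] = (psi_nn(I + step) - psi_nn(I - step)) / (2.0 * h)
    return grad


I_sample = np.array([3.2, 3.1, 1.05, 1.1, 1.3])   # arbitrary test point
grad_exact = dpsi_nn_dI(I_sample)
grad_check = dpsi_fd(I_sample)
```

Because the stress is obtained from this gradient rather than predicted directly, a model of this type derives from a potential by construction, independent of the values of the weights.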

The organization of the paper is as follows: In Sect. 2, the basic equations of the finite strain continuum solid mechanics theory as well as basic principles of hyperelastic models are given. After this, the proposed data-driven multiscale framework is described in Sect. 3. This approach is exemplarily applied within several numerical examples in Sect. 4. After a discussion of the results, the paper is closed by concluding remarks and an outlook to necessary future work in Sect. 5.

2 Continuum solid mechanics

In this section, the basic kinematic and stress quantities as well as general relations of anisotropic hyperelastic constitutive models are briefly summarized. The reader is referred to the textbooks of Haupt [34], Holzapfel [36] or Ogden [65] for a detailed overview. Furthermore, a Hill-type homogenization framework is introduced. For a clear mathematical notation, the space of tensors

$$\begin{aligned} \mathcal {L}_{n}:=\underbrace{\mathbb {R}^3 \otimes \cdots \otimes \mathbb {R}^3}_{n\text {-times}} \ \forall n\in \mathbb {N}_{\ge 1} \; , \end{aligned}$$

(1)

except for a tensor of rank zero, is used in the following. In Eq. (1), $\mathbb {R}^3$, $\mathbb {N}$ and $\otimes $ denote the Euclidean vector space, the set of natural numbers and the dyadic product, respectively. Tensors of rank one and two are given by boldface italic symbols in the following, i. e., $\varvec{A} \in \mathcal {L}_{1}$ or $\varvec{B},\varvec{C} \in \mathcal {L}_{2}$. Furthermore, a single or double contraction of two tensors is given by $\varvec{B} \cdot \varvec{C} = B_{kq} C_{ql} \varvec{e}_k \otimes \varvec{e}_l$ or $\varvec{B}:\varvec{C}=B_{kl}C_{lk}$, respectively. Thereby, $\varvec{e}_k\in \mathcal {L}_{1}$ denotes a Cartesian basis vector and the Einstein summation convention is used.
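The two contractions defined above can be written compactly with index notation in code. The following NumPy sketch uses hypothetical helper names and arbitrary sample tensors. Note that the double contraction $\varvec{B}:\varvec{C}=B_{kl}C_{lk}$ used here equals ${{\,\textrm{tr}\,}}(\varvec{B} \cdot \varvec{C})$ and coincides with the alternative definition $B_{kl}C_{kl}$ only if one of the tensors is symmetric.

```python
import numpy as np


def single_contraction(B, C):
    """(B . C)_kl = B_kq C_ql, i.e. the ordinary matrix product."""
    return np.einsum("kq,ql->kl", B, C)


def double_contraction(B, C):
    """B : C = B_kl C_lk, the convention stated in the text."""
    return np.einsum("kl,lk->", B, C)


# Arbitrary, non-symmetric sample tensors
B = np.arange(9.0).reshape(3, 3)
C = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [4.0, 0.0, 1.0]])

bc_single = single_contraction(B, C)
bc_double = double_contraction(B, C)
```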

2.1 Kinematics and stress measures

2.1.1 Kinematics

In the following, a material body ${\mathcal {C}}$ which occupies the reference configuration $\mathcal {B}_0 \subset \mathbb {R}^3$ at time $t_0 \in \mathbb {R}_{\ge 0}$ and the current configuration $\mathcal {B}\subset \mathbb {R}^3$ at time $t\in {\mathcal {T}}:=\{\tau \in \mathbb {R}_+ \,|\,\tau \ge t_0\}$ is considered. The displacement vector $\varvec{u} \in \mathcal {L}_{1}$ of a material point $P\in {\mathcal {C}}$ capturing the positions $\varvec{X} \in \mathcal {B}_0$ at $t_0$ and $\varvec{x} \in \mathcal {B}$ at t is given by $\varvec{u}(\varvec{X},t):=\varvec{\varphi }(\varvec{X},t) - \varvec{X}$. Therein, $\varvec{\varphi }: \mathcal {B}_0 \times {\mathcal {T}} \rightarrow \mathcal {B}\; , (\varvec{X}, t) \mapsto \varvec{x}=:\varvec{\varphi }(\varvec{X},t)$ denotes a bijective motion function which is postulated to be continuous in space and time. As further kinematic quantities, the deformation gradient $\varvec{F} \in \mathcal {L}_{2}$ and the Jacobian determinant $J\in \mathbb {R}_+$ are defined by the relations

$$\begin{aligned} \varvec{F} := (\nabla _{\!\!{\varvec{X}}}\varvec{\varphi })^\text {T} \quad \text {and} \quad J:=\det \varvec{F} > 0 \; . \end{aligned}$$

(2)

In the equation above, $\nabla _{\!\!{\varvec{X}}}$ is the nabla-operator with respect to the reference configuration $\mathcal {B}_0$.

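A deformation measure which is free of rigid body motions is given by the positive definite right Cauchy–Green deformation tensor $\varvec{C} := \varvec{F}^\text {T} \cdot \varvec{F} \in \mathcal {S}ym$ with the space of symmetric second order tensors $\mathcal {S}ym := \{\varvec{A} \in \mathcal {L}_{2} \,|\, \varvec{A} = \varvec{A}^\text {T}\}$. With that and by using Sylvester's formula, the spectral decomposition of $\varvec{C}$ follows to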

$$\begin{aligned} \varvec{C} = \sum _{\beta =1}^{N_\lambda } \lambda _\beta ^2 \varvec{P}^\beta \; \text {with} \; \varvec{P}^\beta := \delta _{1 N_\lambda } \varvec{1} + \prod _{\alpha \ne \beta }^{N_\lambda } \frac{\varvec{C} - \lambda _\alpha ^2 \varvec{1}}{\lambda _\beta ^2 - \lambda _\alpha ^2} \; , \end{aligned}$$

(3)

where $\varvec{1}\in \mathcal {L}_{2}$, $\lambda _\beta \in \mathbb {R}_{+}$ and $\varvec{P}^\beta \in \mathcal {L}_{2}$ denote the identity tensor, the principal stretches and the symmetric projection tensors, respectively. The introduced symbol $N_\lambda \in \{1,2,3\}$ indicates the number of distinct eigenvalues. Furthermore, $\delta _{kl}$ is defined as the Kronecker delta.
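As a numerical illustration of the spectral decomposition, the following NumPy sketch builds each projection tensor as the product over all other distinct eigenvalues and checks that the decomposition reproduces $\varvec{C}$. The function name, the tolerance used to group coinciding eigenvalues and the sample deformation gradient are assumptions. For $N_\lambda = 1$ the product is empty and the projection is set to the identity, which is the role of the term $\delta _{1 N_\lambda } \varvec{1}$.

```python
import numpy as np


def projections_sylvester(C, tol=1e-10):
    """Distinct eigenvalues lambda_beta^2 of symmetric C and the projections P^beta."""
    eigvals = np.linalg.eigvalsh(C)          # ascending order
    distinct = []
    for val in eigvals:                      # group numerically equal eigenvalues
        if not distinct or abs(val - distinct[-1]) > tol * max(1.0, abs(val)):
            distinct.append(val)
    one = np.eye(3)
    projections = []
    for beta in range(len(distinct)):
        P = one.copy()                       # stays the identity if N_lambda = 1
        for alpha in range(len(distinct)):
            if alpha != beta:
                P = P @ (C - distinct[alpha] * one) / (distinct[beta] - distinct[alpha])
        projections.append(P)
    return np.array(distinct), projections


# Sample deformation gradient (assumed values) and right Cauchy-Green tensor
F = np.array([[1.2, 0.3, 0.0],
              [0.0, 0.9, 0.0],
              [0.0, 0.0, 1.1]])
C = F.T @ F

lam_sq, P_list = projections_sylvester(C)
C_rebuilt = sum(l * P for l, P in zip(lam_sq, P_list))
stretches = np.sqrt(lam_sq)                  # principal stretches lambda_beta
```

The same function also covers repeated eigenvalues, e. g., $\varvec{C} = \text {diag}(4,1,1)$ yields $N_\lambda = 2$ projections.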

2.1.2 Stress measures

Within the framework of nonlinear continuum solid mechanics, various stress measures can be defined. The symmetric Cauchy stress $\varvec{\sigma } \in \mathcal {L}_{2}$, also known as true stress, is defined with respect to the current configuration $\mathcal {B}$. Furthermore, the 1st and 2nd Piola–Kirchhoff stress tensors $\varvec{P} \in {\mathcal {L}}_2$ and $\varvec{T} \in {\mathcal {L}}_2$, the latter being symmetric, follow from the pull back operations

$$\begin{aligned} \varvec{P} := J \varvec{F}^{-1}\cdot \varvec{\sigma }\quad \text {and} \quad \varvec{T} := J \varvec{F}^{-1}\cdot \varvec{\sigma }\cdot \varvec{F}^{-\text {T}} \; . \end{aligned}$$

(4)

Accordingly, $\varvec{P}$ is related to both, ${\mathcal {B}}$ and ${\mathcal {B}}_0$, whereas $\varvec{T}$ is completely defined with respect to ${\mathcal {B}}_0$.
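The pull back operations of Eq. (4) translate directly into matrix operations. The following NumPy sketch uses a hypothetical function name and arbitrary sample values for $\varvec{F}$ and $\varvec{\sigma }$. It follows the convention of Eq. (4), in which $\varvec{F}^{-1}$ acts from the left on $\varvec{\sigma }$.

```python
import numpy as np


def piola_kirchhoff_from_cauchy(sigma, F):
    """Return the 1st and 2nd Piola-Kirchhoff stresses according to Eq. (4)."""
    J = np.linalg.det(F)
    if J <= 0.0:
        raise ValueError("det F must be positive")
    F_inv = np.linalg.inv(F)
    P = J * F_inv @ sigma        # P := J F^{-1} . sigma
    T = P @ F_inv.T              # T := J F^{-1} . sigma . F^{-T}
    return P, T


# Sample values (assumed): deformation gradient and symmetric Cauchy stress
F = np.array([[1.10, 0.20, 0.00],
              [0.00, 0.95, 0.10],
              [0.00, 0.00, 1.05]])
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 0.5]])

P, T = piola_kirchhoff_from_cauchy(sigma, F)
sigma_back = F @ P / np.linalg.det(F)    # push forward recovers the Cauchy stress
```

For a symmetric $\varvec{\sigma }$, the resulting $\varvec{T}$ is symmetric as well, whereas $\varvec{P}$ is in general not.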

2.2 Hyperelasticity

2.2.1 General properties

The constitutive behavior of the considered solids is restricted to hyperelasticity within this work. Accordingly, a hyperelastic potential which is equal to the Helmholtz free energy density function $\psi : \mathcal {L}_{2} \rightarrow \mathbb {R}_{\ge 0}\; , \varvec{F} \mapsto \psi (\varvec{F})$ exists. By applying the procedure of Coleman and Noll [9], the relations

$$\begin{aligned} \varvec{P} = \frac{\partial \psi }{\partial \varvec{F}^\text {T}} \; \text {and} \; \varvec{T} = \frac{\partial \psi }{\partial \varvec{F}^\text {T}} \cdot \varvec{F}^{-\text {T}} \; \end{aligned}$$

(5)

then follow from the evaluation of the Clausius–Planck inequality [36]. With that, the thermodynamic consistency of any hyperelastic model is fulfilled a priori.

In addition, there are some further requirements on $\psi $, which ensure a physically reasonable constitutive behavior [36, 65]. The normalization condition requires that $\psi (\varvec{1})=0$, i. e., that the free energy vanishes in the undeformed state. Furthermore, the free energy should increase in any case if deformation appears: $\psi (\varvec{F})>0 \, \forall \varvec{F}\ne \varvec{1}$. The former two conditions imply that $\psi $ has a global minimum at $\varvec{F} = \varvec{1}$. Consequently, the undeformed state is stress-free, i. e., $\varvec{T}(\varvec{1}) = \varvec{0}$ holds. Additionally, the growth condition requires that an infinite amount of energy is needed to infinitely expand the volume or compress it to zero: $\psi (\varvec{F})\rightarrow \infty $ as $(J\rightarrow \infty \vee J \rightarrow 0^+)$. Finally, the principle of material objectivity states that the free energy is invariant with respect to a superimposed rigid body motion. This statement is expressed by the relation $\psi (\varvec{F}) = \psi (\varvec{Q} \cdot \varvec{F})$ which holds for all special orthogonal tensors $\varvec{Q} \in \mathcal {SO}(3)$. Accordingly, the principle of material objectivity is automatically fulfilled if the tensor $\varvec{C}$ is used as the argument of $\psi $ instead of $\varvec{F}$, i. e., $\psi = \psi (\varvec{C})$, which is done in the following.
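These requirements can be checked numerically for a concrete energy. The sketch below uses a compressible neo-Hookean energy $\psi = \frac{\mu }{2}(I_1 - 3 - 2 \ln J) + \frac{\lambda }{2}(J-1)^2$ as an assumed example; this model, its parameters and the sample deformation are not taken from this work. Since the energy is evaluated from $\varvec{C}$, objectivity holds by construction.

```python
import numpy as np

MU, LAM = 1.0, 2.0   # hypothetical material parameters


def psi_neo_hooke(F):
    """Assumed example energy, evaluated via C = F^T . F."""
    C = F.T @ F
    J = np.sqrt(np.linalg.det(C))
    return 0.5 * MU * (np.trace(C) - 3.0 - 2.0 * np.log(J)) + 0.5 * LAM * (J - 1.0) ** 2


def rotation_z(angle):
    """Proper orthogonal tensor describing a rotation about the e_3 axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


F = np.array([[1.2, 0.3, 0.0],
              [0.0, 0.9, 0.0],
              [0.0, 0.0, 1.1]])
Q = rotation_z(0.7)

psi_reference = psi_neo_hooke(np.eye(3))       # normalization: should vanish
psi_deformed = psi_neo_hooke(F)                # should be positive
psi_rotated = psi_neo_hooke(Q @ F)             # objectivity: equals psi_deformed
# Growth: the energy increases as the volume is compressed or expanded further
psi_compress = [psi_neo_hooke(s * np.eye(3)) for s in (1e-2, 1e-3)]
psi_expand = [psi_neo_hooke(s * np.eye(3)) for s in (1e2, 1e3)]
```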

Finally, polyconvexity of the energy $\psi $, i. e., convexity with respect to $\varvec{F}$, ${{\,\textrm{Cof}\,}}\varvec{F} := J \varvec{F}^{-\text {T}}$ and $\det \varvec{F}$, is a further condition. This condition implies material stability, i. e., Legendre–Hadamard ellipticity, see Ebbing [10] for more details. However, polyconvexity is a quite strong requirement on the free energy.

2.2.2 Material symmetry

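Besides the previously mentioned requirements, the constitutive equations should also reflect the symmetry of the described material which is expressed by the principle of material symmetry [10, 34]. According to that, the free energy has to be invariant with respect to all orthogonal transformations belonging to the symmetry group ${\mathscr {G}}$ of the underlying material: $\psi (\varvec{C}) = \psi (\varvec{Q} \cdot \varvec{C} \cdot \varvec{Q}^\text {T}) \, \forall \varvec{Q} \in {\mathscr {G}}$ with ${\mathscr {G}} \subseteq \mathcal {O}(3)$, where $\mathcal {O}(3)$ denotes the orthogonal group.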

In order to describe anisotropic constitutive behavior, the concept of structural tensors can be used [10, 34]. Depending on the considered material, these tensors are of order two $\varvec{M}^1,\varvec{M}^2,\dots ,\varvec{M}^{n_2} \in \mathcal {L}_{2}$, four ${\mathbb {M}}^1,{\mathbb {M}}^2,\dots ,{\mathbb {M}}^{n_4} \in \mathcal {L}_{4}$, or even higher. They reflect the material’s anisotropy and are thus invariant with respect to the symmetry transformations, e. g.,

$$\begin{aligned} \varvec{M}^\alpha = \varvec{Q} \cdot \varvec{M}^\alpha \cdot \varvec{Q}^\text {T} \; \text {and} \; {\mathbb {M}}^\beta = \varvec{Q} * {\mathbb {M}}^\beta \, \forall \varvec{Q}\in {\mathscr {G}} \; . \end{aligned}$$

(6)

The notation $\varvec{Q} * {\mathbb {M}}^\beta $ means $Q_{IM}Q_{JN}Q_{KO}Q_{LP} M_{MNOP}^\beta $, where the Einstein summation convention is used. If the structural tensors are appended to the list of arguments of $\psi $, the energy is an isotropic tensor function even if the material is anisotropic which means that

$$\begin{aligned} \psi (\varvec{C}, {\mathcal {M}}_2, {\mathcal {M}}_4) = \psi (\varvec{Q} \cdot \varvec{C} \cdot \varvec{Q}^\text {T}, \varvec{Q} \cdot {\mathcal {M}}_2 \cdot \varvec{Q}^\text {T},\varvec{Q} *{\mathcal {M}}_4) \end{aligned}$$

(7)

holds for all orthogonal tensors $\varvec{Q} \in \mathcal {O}(3)$. To abbreviate the notation, the two sets ${\mathcal {M}}_2 :=\{\varvec{M}^1,\varvec{M}^2,\dots ,\varvec{M}^{n_2}\}$ and ${\mathcal {M}}_4 :=\{{\mathbb {M}}^1,{\mathbb {M}}^2,\dots ,{\mathbb {M}}^{n_4}\}$ have been used in Eq. (7).
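The invariance property of Eq. (6) can be illustrated for a second order structural tensor. The sketch below takes the rank-one tensor $\varvec{M} = \varvec{A} \otimes \varvec{A}$ with a unit vector $\varvec{A}$ as an assumed example and checks that $\varvec{M}$ is unchanged by rotations about $\varvec{A}$, which belong to the symmetry group in this case, but not by a rotation about a different axis. The helper function and all numerical values are hypothetical.

```python
import numpy as np


def rotation(axis, angle):
    """Rotation tensor about a given axis (Rodrigues formula)."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.cos(angle) * np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * np.outer(a, a)


A = np.array([0.0, 0.0, 1.0])          # preferred direction, |A| = 1
M = np.outer(A, A)                     # structural tensor M = A (x) A

Q_sym = rotation(A, 0.7)               # rotation about A: symmetry transformation
Q_other = rotation([1.0, 0.0, 0.0], 0.7)   # rotation about e_1: not a symmetry

M_sym = Q_sym @ M @ Q_sym.T
M_other = Q_other @ M @ Q_other.T

# A scalar built from C and M is unchanged if C is transformed by Q_sym
F = np.array([[1.2, 0.3, 0.1],
              [0.0, 0.9, 0.2],
              [0.0, 0.0, 1.1]])
C = F.T @ F
scalar_C = np.trace(M @ C)
scalar_C_sym = np.trace(M @ (Q_sym @ C @ Q_sym.T))
scalar_C_other = np.trace(M @ (Q_other @ C @ Q_other.T))
```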

Finally, a set $\underline{{\textbf {I}}} :=(I_1,I_2,\dots ,I_n) \in \mathbb {R}^{n\times 1}$ consisting of $n\in \mathbb {N}$ irreducible scalar valued invariants $I_\alpha \in \mathbb {R}$ can be derived by using the Cayley–Hamilton theorem.^{Footnote 2} Consequently, the free energy is expressed by the isotropic tensor function $\psi =\psi (I_1,I_2,\dots ,I_n)$ which is invariant with respect to all transformations $\varvec{Q} \in \mathcal {O}(3)$. By using the latter definition and applying the chain rule, the 2nd Piola–Kirchhoff stress $\varvec{T}$ follows to

$$\begin{aligned} \varvec{T} = \sum _{\alpha =1}^{n}2 \frac{\partial \psi }{\partial I_\alpha } \underbrace{\frac{\partial I_\alpha }{\partial \varvec{C}}}_{\displaystyle =:\varvec{G}^\alpha } \; , \end{aligned}$$

(8)

where $\varvec{G}^\alpha \in \mathcal {L}_{2}$ denote the symmetric tensor generators corresponding to the invariants $I_\alpha $ [38].

2.2.3 Special anisotropy classes

Within this work, two anisotropy classes are considered, isotropic as well as transversely isotropic materials. Corresponding sets of irreducible invariants are given in the following.

A set of irreducible invariants describing an isotropic hyperelastic solid is given by $\underline{{\textbf {I}}}^\circ := (I_1,I_2,I_3) \in \mathbb {R}^{3\times 1}$. The used principal invariants are given by

$$\begin{aligned} I_1:={{\,\textrm{tr}\,}}\varvec{C},\; I_2:={{\,\textrm{tr}\,}}({{\,\textrm{Cof}\,}}\varvec{C}) ,\;I_3:=\det \varvec{C} \; , \end{aligned}$$

(9)

whereby the cofactor of $\varvec{C}$ is defined by ${{\,\textrm{Cof}\,}}\varvec{C} := J^2 \varvec{C}^{-1}$.^{Footnote 3} Note that $\underline{{\textbf {I}}}^\circ $ is also expressible by the principal stretches $\lambda _\alpha $ which follows from Eq. (3):

$$\begin{aligned} I_1 = \sum _{\alpha =1}^{N_\lambda } \nu _\alpha \lambda _\alpha ^2 \, , \, I_2 = \prod _{\alpha =1}^{N_\lambda } \lambda _\alpha ^{2 \nu _\alpha } \sum _{\beta =1}^{N_\lambda } \nu _\beta \frac{1}{\lambda _\beta ^{2}}\, \text {,} \, I_3 = \prod _{\alpha =1}^{N_\lambda } \lambda _\alpha ^{2 \nu _\alpha } \, . \end{aligned}$$

(10)
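The equivalence of the tensor-based and the stretch-based expressions for the principal invariants can be verified numerically. In the following sketch, $I_1 = {{\,\textrm{tr}\,}}\varvec{C}$ is used, in agreement with Eq. (10), and $\nu _\alpha $ is interpreted as the multiplicity of $\lambda _\alpha $, which is an assumption here. Listing each eigenvalue as often as it occurs is then equivalent to the weighted sums and products of Eq. (10). Function names and sample values are hypothetical.

```python
import numpy as np


def principal_invariants(C):
    """I1 = tr C, I2 = tr(Cof C) with Cof C = det(C) C^{-1}, I3 = det C."""
    I1 = np.trace(C)
    I3 = np.linalg.det(C)
    I2 = np.trace(I3 * np.linalg.inv(C))
    return np.array([I1, I2, I3])


def invariants_from_stretches(C):
    """Same invariants from the squared principal stretches, cf. Eq. (10)."""
    lam_sq = np.linalg.eigvalsh(C)     # eigenvalues repeated per multiplicity
    I1 = lam_sq.sum()
    I3 = lam_sq.prod()
    I2 = I3 * (1.0 / lam_sq).sum()
    return np.array([I1, I2, I3])


F = np.array([[1.2, 0.3, 0.0],
              [0.0, 0.9, 0.0],
              [0.0, 0.0, 1.1]])
C = F.T @ F
I_tensor = principal_invariants(C)
I_stretch = invariants_from_stretches(C)

# Case with a repeated eigenvalue (N_lambda = 2)
I_uniaxial = principal_invariants(np.diag([4.0, 1.0, 1.0]))
```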

According to [10, 34, 36], the structural tensor describing transverse isotropy is given by the second order tensor $\varvec{M} = \varvec{A} \otimes \varvec{A}$, whereby $\varvec{A} \in \mathcal {L}_{1}$ with $|\varvec{A}| \equiv 1$ is the fiber direction in the undeformed state. A set of irreducible invariants is thus $\underline{{\textbf {I}}}^\parallel :=(I_1,I_2,I_3,I_4,I_5)\in \mathbb {R}^{5\times 1}$. Therein, the latter two invariants, which capture the material’s anisotropy, are given by the expressions

$$\begin{aligned} I_4 := \varvec{M} : \varvec{C} \; \text {and} \; I_5 := \varvec{M} : \varvec{C}^2 \; . \end{aligned}$$

(11)
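With the five invariants at hand, the stress of Eq. (8) can be assembled from the derivatives $\partial \psi / \partial I_\alpha $ and the generators $\varvec{G}^\alpha = \partial I_\alpha / \partial \varvec{C}$. The following sketch uses the standard generators for symmetric $\varvec{C}$, namely $\varvec{1}$, $I_1 \varvec{1} - \varvec{C}$, $I_3 \varvec{C}^{-1}$, $\varvec{M}$ and $\varvec{M}\cdot \varvec{C} + \varvec{C}\cdot \varvec{M}$. The energy and its parameters are purely hypothetical and chosen only so that the derivatives are easy to write and the undeformed state is stress-free. The result is compared with a finite-difference approximation of $2\,\partial \psi /\partial \varvec{C}$.

```python
import numpy as np

A = np.array([0.0, 0.0, 1.0])            # fiber direction, |A| = 1
M = np.outer(A, A)                       # structural tensor
c1, c2, c4, c5 = 0.5, 0.1, 0.8, 0.2      # hypothetical parameters


def invariants(C):
    I1 = np.trace(C)
    I3 = np.linalg.det(C)
    I2 = np.trace(I3 * np.linalg.inv(C))     # tr(Cof C)
    I4 = np.trace(M @ C)                     # M : C
    I5 = np.trace(M @ C @ C)                 # M : C^2
    return np.array([I1, I2, I3, I4, I5])


def generators(C):
    I1, _, I3, _, _ = invariants(C)
    one = np.eye(3)
    return [one, I1 * one - C, I3 * np.linalg.inv(C), M, M @ C + C @ M]


def psi(I):
    """Hypothetical energy in terms of the invariants."""
    I1, I2, I3, I4, I5 = I
    return (c1 * (I1 - 3.0) + c2 * (I2 - 3.0) - (c1 + 2.0 * c2) * np.log(I3)
            + c4 * (I4 - 1.0) ** 2 + c5 * (I5 - 1.0) ** 2)


def dpsi_dI(I):
    _, _, I3, I4, I5 = I
    return np.array([c1, c2, -(c1 + 2.0 * c2) / I3,
                     2.0 * c4 * (I4 - 1.0), 2.0 * c5 * (I5 - 1.0)])


def stress_T(C):
    """2nd Piola-Kirchhoff stress according to Eq. (8)."""
    return sum(2.0 * d * G for d, G in zip(dpsi_dI(invariants(C)), generators(C)))


def stress_T_fd(C, h=1e-6):
    """Finite-difference check of 2 dpsi/dC using symmetric perturbations."""
    T = np.zeros((3, 3))
    for k in range(3):
        for l in range(3):
            S = np.zeros((3, 3))
            S[k, l] += 0.5
            S[l, k] += 0.5
            T[k, l] = (psi(invariants(C + h * S)) - psi(invariants(C - h * S))) / h
    return T


F = np.array([[1.2, 0.3, 0.1],
              [0.0, 0.9, 0.2],
              [0.0, 0.0, 1.1]])
C = F.T @ F
T_analytic = stress_T(C)
T_numeric = stress_T_fd(C)
```

In a data-driven setting, the function `dpsi_dI` would be replaced by the gradient of a trained model, while the invariants and generators remain unchanged.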

2.3 Scale transition scheme

In the following, a distinction between two different scales, the micro- and the macroscale is made. The former is characterized by a heterogeneous structure consisting of matrix $\mathcal {B}_0^\text {m}\subset \mathbb {R}^3$ and inhomogeneities $\mathcal {B}_0^\text {i}\subset \mathbb {R}^3$ of characteristic length $\ell \in \mathbb {R}_+$, whereas the second considers a macroscopic body ${\bar{\mathcal {B}}}_0 \subset \mathbb {R}^3$ of characteristic length ${\bar{\ell }} \in \mathbb {R}_+$ and is assumed to be homogeneous. For the lengths introduced, the relation $\bar{\ell } \gg \ell $ known as scale separation must hold [68]. To label macroscopic quantities, they are marked by $\bar{(\bullet )}$ in the following.

In order to connect microscopic and macroscopic tensor quantities, an appropriate homogenization scheme is needed. Consequently, each macroscopic point $\bar{\varvec{X}} \in {\bar{\mathcal {B}}}_0$ gets assigned properties resulting from the behavior of the microstructure. For this purpose, a representative volume element (RVE) of the material is considered on the microscale in the vicinity of $\bar{\varvec{X}}$, cf. Fig. 1. An effective macroscopic quantity is then identified from the microscopic field distribution within the RVE by the volume average

$$\begin{aligned} \langle (\bullet ) \rangle := \frac{1}{V^\text {RVE}}\int \limits _{\mathcal {B}_0^\text {RVE}} (\bullet ) \, \text {d}V \; . \end{aligned}$$

(12)

Using Eq. (12), the macroscopic deformation gradient and the 1st Piola–Kirchhoff stress are defined by $\bar{\varvec{F}} := \langle \varvec{F} \rangle $ and $\bar{\varvec{P}} := \langle \varvec{P} \rangle $, respectively [40, 68, 69]. Appropriate boundary conditions for the microscopic BVP, which has to be solved before the volume averaging can be performed, are deducible from the equivalence of the macroscopic and the averaged microscopic energies which is also known as the Hill–Mandel condition. For the considered finite strain setting it is given by the following relation [40, 68, 69]:

$$\begin{aligned} \bar{\varvec{P}} : \dot{\bar{\varvec{F}}} = \langle \varvec{P} : \dot{\varvec{F}} \rangle \; , \end{aligned}$$

(13)

where $\dot{(\bullet )}$ denotes the material time derivative. Regarding purely hyperelastic behavior, the Hill–Mandel condition expresses the equality of the rates of the macroscopic and the averaged microscopic Helmholtz free energies: $\dot{{\bar{\psi }}} = \langle {\dot{\psi }} \rangle $. Consequently, it holds ${\bar{\psi }} = \langle \psi \rangle $.
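In a discretized RVE, the volume average of Eq. (12) is commonly approximated by a weighted sum over quadrature points. The following minimal sketch illustrates this. The function name `volume_average`, the array layout and the sample values are assumptions made for illustration and are not taken from the in-house code used in this work.

```python
import numpy as np

def volume_average(field, dV):
    """Discrete version of Eq. (12): <f> = (1/V) * sum_q f_q dV_q.

    field: array of shape (n_q, ...) with one value per quadrature point.
    dV:    array of shape (n_q,) with the reference volume of each point.
    """
    field = np.asarray(field, dtype=float)
    dV = np.asarray(dV, dtype=float)
    return np.tensordot(dV, field, axes=(0, 0)) / dV.sum()

# Two quadrature points with different deformation gradients (hypothetical data).
F = np.array([np.diag([1.2, 0.9, 1.0]), np.diag([1.0, 1.1, 1.0])])
dV = np.array([1.0, 3.0])
F_bar = volume_average(F, dV)  # macroscopic deformation gradient <F>
```

The same function can be applied to the 1st Piola–Kirchhoff stress or to the free energy density, since Eq. (12) is linear in the averaged quantity.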

To fulfill Eq. (13), several types of boundary conditions (BCs) can be used, whereby periodic BCs given by the spaces

$$\begin{aligned} \varvec{u}\in {\mathcal {U}}(\bar{\varvec{F}})&:= \left\{ \varvec{u} \in \mathbb {R}^3 \; | \; \varvec{u} = (\bar{\varvec{F}} - \varvec{1}) \cdot \varvec{X} + \tilde{\varvec{u}}, \; \tilde{\varvec{u}}^+ = \tilde{\varvec{u}}^-\right\} , \end{aligned}$$

(14)

$$\begin{aligned} \varvec{p} \in {\mathcal {P}}&:= \left\{ \varvec{p} \in \mathbb {R}^3 \; |\; \varvec{p}^- = -\varvec{p}^+ \right\} \end{aligned}$$

(15)

are applied within this work [40]. In the equation above, $\varvec{p} = \varvec{N} \cdot \varvec{P}$ is the nominal stress vector. Furthermore, $(\bullet )^+$ and $(\bullet )^-$ denote values on opposing boundaries of the RVE which is supposed to be periodic. The tilde $\tilde{(\bullet )}$ marks the fluctuation part of a microscopic tensor quantity.
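The structure of Eq. (14) can be checked with a small numerical sketch: if the fluctuation is identical on two coupled boundary points, their displacement difference depends on $\bar{\varvec{F}}$ only. The unit cell size, the point coordinates and the fluctuation values below are hypothetical.

```python
import numpy as np

def periodic_displacement(F_bar, X, u_tilde):
    """Eq. (14): u = (F_bar - 1) . X + u_tilde for one material point X."""
    return (F_bar - np.eye(3)) @ X + u_tilde

F_bar = np.array([[1.1, 0.2, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.95]])
X_minus = np.array([0.0, 0.3, 0.7])       # point on the face x1 = 0
X_plus = np.array([1.0, 0.3, 0.7])        # matching point on the face x1 = 1
u_tilde = np.array([0.01, -0.02, 0.005])  # same fluctuation on both faces

u_minus = periodic_displacement(F_bar, X_minus, u_tilde)
u_plus = periodic_displacement(F_bar, X_plus, u_tilde)

# The displacement jump between coupled points depends only on F_bar.
jump = u_plus - u_minus
expected = (F_bar - np.eye(3)) @ (X_plus - X_minus)
```

Constraint equations of this kind, $\varvec{u}^+ - \varvec{u}^- = (\bar{\varvec{F}} - \varvec{1}) \cdot (\varvec{X}^+ - \varvec{X}^-)$, are what a periodic FE implementation typically enforces between opposing boundary nodes.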

3 ANN-based multiscale approach with autonomous data generation

Based on the summarized continuum theory, the following section introduces the proposed data-driven multiscale scheme FE${}^\textit{ANN}$. The general procedure of this framework is subdivided into five steps referred to as

(a) initial data generation,

(b) training process of the ANN,

(c) macroscopic simulation,

(d) data analysis, as well as

(e) data enrichment.

After the framework is initiated with step (a), the steps (b)–(e) are executed within an overall loop which is given in Algorithm 1. Accordingly, the ANN is trained with the current data set ${\mathcal {D}}_i$ of the iteration $i\in \mathbb {N}$, the macroscale problem is solved, unknown macroscopic states of deformation $\bar{\varvec{F}}$ are detected, and the macroscopic stresses $\bar{\varvec{P}}$ corresponding to the previously unknown deformations are generated via computational homogenization. The loop terminates in an iteration $i\ge 1$ once no further deformation states are found and the relevant space of deformation is thus completely sampled by ${\mathcal {D}}_{i}$. It may happen that the macroscopic simulation does not reach the final time step $t_\text {goal}$, which could occur due to a failed convergence of the ANN, while no new deformations are detected within $t_j \in \{t_0,t_1,\dots ,t_\text {end}<t_\text {goal}\}$. To handle this case, a further inner repeat loop is implemented. Therein, the steps (b)–(d) are performed multiple times, until the final time step $t_\text {goal}$ is reached or new states of deformation are found. A schematic representation of the framework is given in Fig. 2. The single steps are described in detail in the following.

In order to enable a fully automated use of the framework, a Python wrapper which runs on a high performance cluster (HPC) using the batch system SLURM (Simple Linux Utility for Resource Management) has been implemented. This wrapper processes the loop independently by starting the individual jobs and managing the results.
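The control flow of the steps (b)–(e) can be sketched as a plain Python function. This is only a schematic reading of the description above and not the wrapper used in this work: the callables `train`, `simulate`, `find_new_states` and `homogenize` as well as the limits `max_retrain` and `max_iter` are hypothetical placeholders, and no SLURM job handling is included.

```python
def run_multiscale_loop(D_init, train, simulate, find_new_states, homogenize,
                        t_goal, max_retrain=5, max_iter=50):
    """Skeleton of steps (b)-(e); all callables are user-supplied placeholders.

    train(D)               -> surrogate model                 (step b)
    simulate(model)        -> (F_macro, t_end)                (step c)
    find_new_states(F, D)  -> list of unknown deformations    (step d)
    homogenize(F_new)      -> list of (F, P) tuples           (step e)
    """
    D = list(D_init)
    for i in range(1, max_iter + 1):
        # Inner repeat loop: retrain until t_goal is reached or new states appear.
        for _ in range(max_retrain):
            model = train(D)
            F_macro, t_end = simulate(model)
            F_new = find_new_states(F_macro, D)
            if F_new or t_end >= t_goal:
                break
        if F_new:
            D = D + homogenize(F_new)   # D_{i+1} = D_i united with D_i^new
        elif t_end >= t_goal:
            return model, D, i          # relevant deformation space is sampled
        else:
            raise RuntimeError("simulation stalled before t_goal without new states")
    raise RuntimeError("no convergence within max_iter iterations")
```

The explicit error for a stalled simulation is a design choice of this sketch; the text above does not state how often the inner loop is repeated.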

3.1 Initial data generation (a)

To start with, an initial data set ${\mathcal {D}}_1:=\{{}^{1}\!{\mathcal {T}},{}^{2}\!{\mathcal {T}}, \dots , {}^{k}\!{\mathcal {T}}\}$ consisting of $k\in \mathbb {N}$ data tuples ${}^{\alpha }\!{\mathcal {T}} := ({}^{\alpha }\! \bar{\varvec{F}}^\text {RVE} , {}^{\alpha }\! \bar{\varvec{P}}^\text {RVE} ) \in \mathcal {L}_{2} \times \mathcal {L}_{2}$ is generated. Basically, a low number of tuples is sufficient to initiate the multiscale scheme. Suitable states of deformation are simple load cases such as uniaxial tension and equibiaxial tension

$$\begin{aligned} {[}{\bar{F}}_{lK}^\text {RVE}] = \begin{bmatrix} {\bar{\lambda }} &{} \quad 0 &{} \quad 0\\ 0 &{} \quad - &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad - \end{bmatrix} \; , \; {[}{\bar{F}}_{lK}^\text {RVE}] = \begin{bmatrix} {\bar{\lambda }} &{} \quad 0 &{} \quad 0\\ 0 &{} \quad {\bar{\lambda }} &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad - \end{bmatrix} \end{aligned}$$

(16)

or simple shear

$$\begin{aligned} {[}{\bar{F}}_{lK}^\text {RVE}] = \begin{bmatrix} 1 &{} \quad {\bar{\gamma }} &{} \quad 0\\ 0 &{} \quad 1 &{} \quad 0 \\ 0 &{} \quad 0 &{} \quad 1 \end{bmatrix} \; , \end{aligned}$$

(17)

where these load cases could be applied on the RVE in different directions. In the equations above, ${\bar{\lambda }} \in \mathbb {R}_+$ and ${\bar{\gamma }} \in \mathbb {R}$ denote prescribed effective stretches and shears, respectively. The labeling with $(-)$ means that the corresponding coordinate of the effective 1st Piola–Kirchhoff stress ${\bar{P}}_{Kl}$ is prescribed to zero. The stresses ${}^{\alpha }\! \bar{\varvec{P}}^\text {RVE}$ belonging to the deformations ${}^{\alpha }\! \bar{\varvec{F}}^\text {RVE}$ are calculated from a computational homogenization according to Sect. 2.3, whereby this is done by using an in-house Matlab FE code. The periodic BCs are applied via the master node concept therein [32].
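A minimal sketch of how the load cases of Eqs. (16) and (17) could be encoded is given below. Marking the free components $(-)$ by `NaN` is an assumption of this sketch, as are the function names and the stretch and shear ranges; the in-house Matlab code may handle mixed deformation and stress control differently.

```python
import numpy as np

FREE = np.nan  # marks "(-)": component left free, conjugate stress set to zero

def uniaxial_tension(lam, axis=0):
    """Eq. (16)_1: stretch lam along `axis`, lateral normal components free."""
    F = np.zeros((3, 3))
    for k in range(3):
        F[k, k] = lam if k == axis else FREE
    return F

def equibiaxial_tension(lam, free_axis=2):
    """Eq. (16)_2: stretch lam in two directions, third normal component free."""
    F = np.zeros((3, 3))
    for k in range(3):
        F[k, k] = FREE if k == free_axis else lam
    return F

def simple_shear(gamma, i=0, j=1):
    """Eq. (17): F = 1 + gamma e_i (x) e_j with i != j."""
    F = np.eye(3)
    F[i, j] = gamma
    return F

# A small load path for each case (stretch and shear values are hypothetical).
paths = {
    "uniaxial_x1": [uniaxial_tension(l) for l in np.linspace(1.0, 1.5, 6)],
    "biaxial_x1x2": [equibiaxial_tension(l) for l in np.linspace(1.0, 1.5, 6)],
    "shear_x1x2": [simple_shear(g) for g in np.linspace(0.0, 0.5, 6)],
}
```

Changing `axis`, `free_axis` or the index pair `(i, j)` applies the same load case in a different direction of the RVE.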

3.2 ANN-based macroscopic surrogate model

Within the data-driven multiscale loop, an appropriate macroscopic surrogate model is needed. To this end, a physics-constrained ANN model, which a priori fulfills several physical conditions, is used, compare Sect. 2.2.

Assuming that the macroscopic material symmetry group of the considered composite is known, and thus the structural tensors ${\mathcal {M}}_2$, ${\mathcal {M}}_4$ corresponding to the material are available, the invariants ${\bar{I}}_\alpha (\bar{\varvec{C}},{\mathcal {M}}_2,{\mathcal {M}}_4)$ are determined first. These scalar values are mapped into a normalized domain, i. e., $S_\alpha :\mathbb {R}\rightarrow [-1,1] \subset \mathbb {R}, \; {\bar{I}}_\alpha \mapsto \bar{{\mathfrak {i}}}_\alpha \forall \alpha \in \{1,2,\dots ,n\}$ with respect to the training data set, and are arranged in a vector $\bar{\underline{{\mathfrak {i}}}}=(\bar{{\mathfrak {i}}}_1,\bar{{\mathfrak {i}}}_2,\dots ,\bar{{\mathfrak {i}}}_n) \in \mathbb {R}^{n\times 1}$. Thereafter, the free energy is predicted by an FNN with the normalized invariants $\bar{\underline{{\mathfrak {i}}}}$ serving as input values and ${\bar{\psi }}$ as output. Restricting the network architecture to only one hidden layer containing N neurons, it holds

$$\begin{aligned} {\bar{\psi }}^\text {ANN} := B + \sum _{\alpha =1}^{N} W_{\alpha }\,{\mathscr {S}}\mathscr {P}\left( \sum _{\beta =1}^{n} w_{\alpha \beta } \bar{{\mathfrak {i}}}_\beta + b_\alpha \right) , \end{aligned}$$

(18)

where the monotonically increasing and convex Softplus activation function [45]

$$\begin{aligned} {\mathscr {S}}\mathscr {P}: \mathbb {R}\rightarrow (0,\infty ), \ x \mapsto {\mathscr {S}}\mathscr {P}(x) := \ln (1 + \exp (x)) \end{aligned}$$

(19)

is used.^{Footnote 4} The stress prediction is finally done by applying Eq. (8):

$$\begin{aligned} \bar{\varvec{T}}^\text {ANN} = \sum _{\alpha =1}^{n}2 \frac{\partial {\bar{\psi }}^\text {ANN}}{\partial \bar{{\mathfrak {i}}}_\alpha } \frac{\partial \bar{{\mathfrak {i}}}_\alpha }{\partial {\bar{I}}_\alpha }\frac{\partial {\bar{I}}_\alpha }{\partial \bar{\varvec{C}}} \; . \end{aligned}$$

(20)

Thus, the surrogate model automatically fulfills several physical principles: thermodynamic consistency^{Footnote 5}, objectivity and material symmetry. A graphical summary of the described surrogate model structure is given in Fig. 3.
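To make Eqs. (18) to (20) concrete, the following NumPy sketch evaluates the energy and its derivative with respect to the normalized invariants for a single-hidden-layer network. It is not the TensorFlow or FEniCS implementation of this work. The linear form of the scaling $S_\alpha$, the random placeholder weights and the invariant bounds are assumptions.

```python
import numpy as np

def softplus(x):
    """Eq. (19), written in a numerically stable form."""
    return np.logaddexp(0.0, x)

def scale(I, I_min, I_max):
    """Linear map of raw invariants to [-1, 1] based on training-set bounds."""
    return 2.0 * (I - I_min) / (I_max - I_min) - 1.0

def psi_ann(i, W, w, b, B):
    """Eq. (18): one hidden layer with N neurons; i has n entries, w is (N, n)."""
    return B + W @ softplus(w @ i + b)

def dpsi_di(i, W, w, b):
    """Derivative of Eq. (18) w.r.t. the normalized invariants, used in Eq. (20).

    The derivative of the Softplus function is the logistic sigmoid.
    """
    z = w @ i + b
    sigmoid = 1.0 / (1.0 + np.exp(-z))
    return (W * sigmoid) @ w

# Random placeholder weights for n = 3 invariants and N = 4 neurons.
rng = np.random.default_rng(0)
n, N = 3, 4
W, w, b, B = rng.normal(size=N), rng.normal(size=(N, n)), rng.normal(size=N), 0.1
i0 = scale(np.array([3.2, 3.1, 1.05]),
           I_min=np.array([3.0, 3.0, 0.9]),
           I_max=np.array([4.0, 4.0, 1.2]))
grad = dpsi_di(i0, W, w, b)
```

With the linear scaling assumed here, the factor $\partial \bar{{\mathfrak {i}}}_\alpha / \partial {\bar{I}}_\alpha$ in Eq. (20) is the constant $2/({\bar{I}}_\alpha ^{\max } - {\bar{I}}_\alpha ^{\min })$, so the stress follows from `grad` and the tensor derivatives $\partial {\bar{I}}_\alpha / \partial \bar{\varvec{C}}$.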

In order to fulfill additional physical conditions, it may be necessary to incorporate further non-independent invariants ${\bar{I}}_\beta ^*$ into the argument list of ${\bar{\psi }}^\text {ANN}$. In this case, Eqs. (18) and (20) are to be modified according to

$$\begin{aligned} {\bar{\psi }}^\text {ANN} := B + \sum _{\alpha =1}^{N} W_{\alpha } {\mathscr {S}}\mathscr {P}\left( \sum _{\beta =1}^{n} w_{\alpha \beta } \bar{{\mathfrak {i}}}_\beta + \sum _{\beta \in {\mathcal {A}}} w^*_{\alpha \beta } \bar{{\mathfrak {i}}}^*_\beta + b_\alpha \right) \end{aligned}$$

(21)

and

$$\begin{aligned} \bar{\varvec{T}}^\text {ANN} = \sum _{\alpha =1}^{n}2 \frac{\partial {\bar{\psi }}^\text {ANN}}{\partial \bar{{\mathfrak {i}}}_\alpha } \frac{\partial \bar{{\mathfrak {i}}}_\alpha }{\partial {\bar{I}}_\alpha }\frac{\partial {\bar{I}}_\alpha }{\partial \bar{\varvec{C}}} + \sum _{\beta \in {\mathcal {A}}}2 \frac{\partial {\bar{\psi }}^\text {ANN}}{\partial \bar{{\mathfrak {i}}}_\beta ^*} \frac{\partial \bar{{\mathfrak {i}}}_\beta ^*}{\partial {\bar{I}}_\beta ^*}\frac{\partial {\bar{I}}_\beta ^*}{\partial \bar{\varvec{C}}}\; , \end{aligned}$$

(22)

where ${\mathcal {A}} := \{\gamma _1,\dots ,\gamma _A\}, A = \vert {\mathcal {A}}\vert $, is a set containing the indices of the additional invariants ${\bar{I}}_\beta ^*$. Accordingly, the growth condition can be additionally guaranteed for ${\bar{J}}\rightarrow 0^+$ by including ${\bar{I}}_3^* := 1/{\bar{I}}_3$ as a further invariant and not only ${\bar{I}}_3$ itself, where ${\bar{I}}_3^*$ is defined to be independent, i. e., $\partial _{{\bar{I}}_3}{\bar{I}}_3^*=0$. Thus, ${\mathcal {A}}=\{3\}$ in this case. However, in addition, further constraints have to be satisfied by the weights to ensure that the growth condition is fulfilled. As shown in Appendix A, a possible sufficient condition is given by

$$\begin{aligned}&\left( W_\alpha> 0 \forall \alpha \in {\mathcal {N}}\right) \wedge \left( \exists \,w_{\alpha 3}>0 \text { with } \alpha \in {\mathcal {N}}\right) \nonumber \\&\quad \wedge \left( \exists \,w^*_{\alpha 3}>0 \text { with } \alpha \in {\mathcal {N}}\right) \; . \end{aligned}$$

(23)

Therein, the set ${\mathcal {N}}:=\{1,2,\ldots ,N\}$ contains the indices of the hidden layer neurons. In the work [45], the growth condition is fulfilled in a similar way by adding an additional energy term which is not directly included in ${\bar{\psi }}^\text {ANN}$. In [80], a post-training validation test is suggested to check the growth condition. Thereby, deformation gradients with $\det \bar{\varvec{F}} \rightarrow 0^+$ are prescribed to the network and it is observed whether the resultant energy is monotonically increasing.
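The sufficient condition of Eq. (23) is easy to verify for a given set of trained weights. The helper below is a sketch with hypothetical argument names; it only checks the inequality and does not enforce it during training.

```python
import numpy as np

def satisfies_growth_constraint(W, w3, w3_star):
    """Sufficient condition of Eq. (23).

    W:       output weights W_alpha, shape (N,)
    w3:      hidden weights w_{alpha 3} acting on the normalized I3, shape (N,)
    w3_star: hidden weights w*_{alpha 3} acting on the normalized I3*, shape (N,)
    """
    W, w3, w3_star = map(np.asarray, (W, w3, w3_star))
    return bool(np.all(W > 0) and np.any(w3 > 0) and np.any(w3_star > 0))
```

Since the condition is only sufficient, a network that fails this check may still satisfy the growth condition, which is why a post-training test as in [80] can be a useful complement.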

3.2.1 Training process of the ANN (b)

The ANN-based model is trained with respect to the current data set ${\mathcal {D}}_i$ in each iteration i of the multiscale loop, where a random division into training and test data is made.

Within the training process, the weights and bias values $B, W_\alpha , b_\alpha $, $w_{\alpha \beta }$ and $w^*_{\alpha \beta }$ are then determined. In order to enable an adjustment of the ANN to the stress values^{Footnote 6}, gradients of the output with respect to the input are inserted into the loss

$$\begin{aligned} {\mathcal {L}} := \sum _\alpha \sqrt{({}^{\alpha }\! {\bar{T}}^\text {ANN}_{11} - {}^{\alpha }\! {\bar{T}}_{11}^\text {RVE})^2 + \cdots + ({}^{\alpha }\! {\bar{T}}^\text {ANN}_{12} - {}^{\alpha }\! {\bar{T}}_{12}^\text {RVE})^2} \end{aligned}$$

(24)

which is similar to [12, 53, 77]. The training is done by using the SLSQP optimizer (Sequential Least Squares Programming). Thereby, the ANN is trained several times, where the parameters of the best achieved training state are stored at the end [38].^{Footnote 7} An implementation of the described workflow is realized using Python, Tensorflow and SciPy. Within the training the constraint (23) can be switched on optionally. Finally, in order to fulfill the normalization condition, the bias value B is chosen such that ${\bar{\psi }}^\text {ANN} = 0$ within the initial (undeformed) state.
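The loss of Eq. (24) and the choice of the bias $B$ can be sketched in NumPy as follows. The sketch assumes that the ellipsis in Eq. (24) covers the six independent stress coordinates and that they are stored in the order 11, 22, 33, 23, 13, 12. The function names are hypothetical, and the optimizer call itself is omitted.

```python
import numpy as np

def stress_loss(T_ann, T_rve):
    """Eq. (24): sum over all tuples of the Euclidean norm of the stress error.

    T_ann, T_rve: arrays of shape (k, 6) holding six stress coordinates per
    tuple (the ordering 11, 22, 33, 23, 13, 12 is an assumption).
    """
    diff = np.asarray(T_ann, dtype=float) - np.asarray(T_rve, dtype=float)
    return float(np.sum(np.sqrt(np.sum(diff**2, axis=1))))

def normalization_bias(W, w, b, i_undeformed):
    """Bias B such that psi_ANN = 0 in the undeformed state, cf. Eq. (18)."""
    z = w @ i_undeformed + b
    return -float(W @ np.logaddexp(0.0, z))
```

Because $B$ does not enter the stress, it can be fixed after the remaining weights have been determined without changing the value of the loss.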

Remark 1

In the proposed framework, it is assumed that the more data tuples are included in the set ${\mathcal {D}}_i$, the better the ANN’s prediction quality becomes. In the context of the examples given in Sect. 4, this has been shown to hold. However, it should be noted that in the case of more complex microstructures or constitutive properties of the individual phases, it may happen that the expressiveness of the ANN is limited and thus the error cannot decrease further. In this case, it would be necessary to introduce another tolerance for the deviation of the ANN’s stress prediction compared to the RVE solution in order to decide whether the macroscopic simulation was successful.

3.2.2 Implementation and macroscopic simulation (c)

The model equation (21) has been implemented within the FE toolbox FEniCS [1, 57]. Therein, the stress relation given by Eq. (22) and the material tangent

$$\begin{aligned} \bar{{\mathbb {C}}}^\text {ANN}&:= 4 \frac{\partial ^2 {\bar{\psi }}^\text {ANN}}{\partial \bar{\varvec{C}} \partial \bar{\varvec{C}}} \in \mathcal {L}_{4} \; , \end{aligned}$$

(25)

which is required within the solution via a standard Newton-Raphson scheme, are calculated by means of automatic differentiation. When applying the model in FE simulations, the typical quadratic convergence of the Newton iteration becomes apparent.

In addition to weights and bias values of the trained ANN, problem specific structural tensors ${\mathcal {M}}_2^\text {macro}$, ${\mathcal {M}}_4^\text {macro}$ have to be prescribed within the macroscopic simulation, cf. Sect. 2.2.2. Note that the preferred directions on the macroscale do not necessarily have to match the ones of the RVE, e. g., ${\mathcal {M}}_2^\text {macro}\ne {\mathcal {M}}_2^\text {RVE}$. A conversion which is necessary in this case is done in step (e), cf. Sect. 3.3.2.

3.3 Autonomous data mining

Besides the physics-constrained surrogate model, the core feature of the data-driven multiscale framework is the data mining process which is done in a fully automatic manner.

3.3.1 Data analysis (d)

Within each iteration of the overall multiscale loop, the local deformation states of the macroscopic body ${\bar{\mathcal {B}}}$ are collected. To this end, the deformation gradient $\bar{\varvec{F}}$ is stored at each quadrature point of the FE mesh and at each time increment $t_j$. Consequently, the body’s full state of deformation is characterized by the set ${\mathcal {F}}_i^\text {macro} :=\{{}^{1}\!\bar{\varvec{F}}^\text {macro}_i,{}^{2}\!\bar{\varvec{F}}^\text {macro}_i,\ldots ,{}^{m}\!\bar{\varvec{F}}^\text {macro}_i\}$ which is a subset of $\mathcal {L}_{2}$ and includes the tuples of all time steps.

Now, previously unknown deformations have to be detected, whereby it is meaningful to only consider states providing additional information for the material under observation. Since the intrinsic constitutive behavior of the material lives in the space of invariants ${\mathcal {I}} \subset \mathbb {R}^{(n+A)\times 1}$, it is useful to perform a transformation into this space at this point [38]:

$$\begin{aligned} T: \mathcal {L}_{2} \rightarrow \mathbb {R}^{(n+A)\times 1}, \bar{\varvec{F}} \mapsto ({\bar{I}}_1,\dots ,{\bar{I}}_n, {\bar{I}}_{\gamma _1}^*,\dots ,{\bar{I}}_{\gamma _A}^*)\; , \end{aligned}$$

(26)

whereby ${\mathcal {M}}_2^\text {macro}$, ${\mathcal {M}}_4^\text {macro}$ and ${\mathcal {M}}_2^\text {RVE}$, ${\mathcal {M}}_4^\text {RVE}$ are used, respectively. Thus, by making use of Eq. (26), the current data set ${\mathcal {D}}_i$ is compared to ${\mathcal {F}}_i^\text {macro}$ within the invariant space. If a state which is contained in ${\mathcal {F}}_i^\text {macro}$ is unique within a given tolerance $\varepsilon _\text {tol,1}$, i. e., it holds $({\bar{I}}_\alpha ^\text {macro}-{\bar{I}}_\alpha ^\text {RVE})/{\bar{I}}_\alpha ^\text {RVE}>\varepsilon _\text {tol,1}$ for at least one $\alpha \in \{1,2,\dots ,n+A\}$, it is needed to enrich the data set for the next multiscale iteration step $i+1$. Thereby, the set ${\mathcal {F}}^\text {macro}_i$ is searched in reverse order, i. e., starting from the last time step $t_\text {end}$. If a unique state is identified in the step $t_j$, the full time series $\bar{\varvec{F}}(t_0), \bar{\varvec{F}}(t_1), \dots , \bar{\varvec{F}}(t_j)$ with $t_j \le t_\text {end}$ is added to ${\mathcal {F}}_i^\text {new} \subset \mathcal {L}_{2}$.^{Footnote 8} Thus, it is possible that deformation states in the set ${\mathcal {F}}_i^\text {new}$ occur multiple times with respect to ${\mathcal {F}}_i^\text {new}$ itself and/or ${\mathcal {D}}_i$. To avoid that this unnecessarily inflates the training process, a further filtering step is performed in the data enrichment step (e).

By using the described technique for the identification of relevant macroscopic deformation states, it is possible to significantly reduce the number of time consuming microscale simulations following in the next step.^{Footnote 9}
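The detection of unique states can be sketched as follows for a transversely isotropic material with a single fiber direction. The invariant definitions used here ($I_4 = \varvec{A} \cdot \bar{\varvec{C}} \cdot \varvec{A}$, $I_5 = \varvec{A} \cdot \bar{\varvec{C}}^2 \cdot \varvec{A}$) are one common choice and may differ from the set of Sect. 2.2.3. Taking the absolute value of the relative deviation and all function names are likewise assumptions of this sketch.

```python
import numpy as np

def invariants(F, A):
    """Map F to (I1, I2, I3, I4, I5, I3*) for a fiber direction A, cf. Eq. (26).

    Assumed definitions: I1 = tr C, I2 = tr cof C, I3 = det C,
    I4 = A.C.A, I5 = A.C^2.A, I3* = 1/I3.
    """
    C = F.T @ F
    I1 = np.trace(C)
    I2 = 0.5 * (I1**2 - np.trace(C @ C))
    I3 = np.linalg.det(C)
    return np.array([I1, I2, I3, A @ C @ A, A @ C @ C @ A, 1.0 / I3])

def is_unique(I_new, I_known, tol):
    """True if I_new deviates from every known state in at least one invariant.

    The relative deviation is taken as an absolute value (an assumption).
    I_known has shape (k, n + A).
    """
    rel = np.abs(I_new - I_known) / np.abs(I_known)
    return bool(np.all(np.max(rel, axis=1) > tol))

def collect_new_paths(paths, A_macro, I_known, tol):
    """Search each quadrature-point history backwards from t_end.

    paths: list of time series [F(t_0), ..., F(t_end)]. If the state at t_j is
    unique, the series F(t_0), ..., F(t_j) is added and the search stops.
    """
    F_new = []
    for series in paths:
        for j in range(len(series) - 1, -1, -1):
            if is_unique(invariants(series[j], A_macro), I_known, tol):
                F_new.extend(series[: j + 1])
                break
    return F_new
```

A brute-force comparison against all known states scales with the product of the set sizes; for large data sets, a spatial search structure in invariant space would be a natural replacement.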

3.3.2 Data enrichment (e)

After ${\mathcal {F}}_i^\text {new}$ has been determined, it is necessary to identify the stresses belonging to the states ${}^{\alpha }\!\bar{\varvec{F}}^\text {macro}$. This is done by prescribing these states within the RVE simulations and calculating the stresses by volume averaging. However, within the macroscopic body, arbitrary microstructure orientations, which may differ from the orientation in the RVE by a rotation $\varvec{Q}$, are allowed. Thereby, these orientations are captured by the sets of structural tensors ${\mathcal {M}}_2^\text {macro}$ and ${\mathcal {M}}_4^\text {macro}$ as well as ${\mathcal {M}}_2^\text {RVE}$ and ${\mathcal {M}}_4^\text {RVE}$, respectively. Thus, to fulfill

$$\begin{aligned} {\bar{\psi }} ({}^{\alpha }\!\bar{\varvec{C}}^\text {macro}, {\mathcal {M}}_2^\text {macro}, {\mathcal {M}}_4^\text {macro}) = {\bar{\psi }} ({}^{\alpha }\!\bar{\varvec{C}}^\text {RVE}, {\mathcal {M}}_2^\text {RVE}, {\mathcal {M}}_4^\text {RVE}) \; , \end{aligned}$$

(27)

where ${\mathcal {M}}_2^\text {RVE} = \varvec{Q} \cdot {\mathcal {M}}_2^\text {macro} \cdot \varvec{Q}^\text {T}$ and ${\mathcal {M}}_4^\text {RVE}=\varvec{Q} *{\mathcal {M}}_4^\text {macro}$, the deformation gradient ${}^{\alpha }\!\bar{\varvec{F}}^\text {macro}$ has to be transformed before it is prescribed within the homogenization. Thereby, it is the goal to get ${}^{\alpha }\!\bar{\varvec{C}}^\text {RVE}=\varvec{Q} \cdot {}^{\alpha }\!\bar{\varvec{C}}^\text {macro} \cdot \varvec{Q}^\text {T}$ for the right Cauchy–Green deformation tensor, compare Eq. (7). This can be achieved by using either of the transformations

$$\begin{aligned} {}^{\alpha }\!\bar{\varvec{F}}^\text {RVE} = {}^{\alpha }\!\bar{\varvec{F}}^\text {macro} \cdot \varvec{Q}^\text {T} \text { or } {}^{\alpha }\!\bar{\varvec{F}}^\text {RVE} = \varvec{Q} \cdot {}^{\alpha }\!\bar{\varvec{F}}^\text {macro} \cdot \varvec{Q}^\text {T} \; . \end{aligned}$$

(28)

In this work, the second transformation is favored because, in contrast to the first one, the application of an additional rigid body rotation of the RVE is omitted within the homogenization. It should be noted that both transformations given in Eq. (28) naturally yield ${}^{\alpha }\!{\bar{I}}_\beta ^\text {macro} = {}^{\alpha }\!{\bar{I}}_\beta ^\text {RVE}$ for the invariants which depend on the deformation and structural tensors, respectively. After the homogenization, the tuples ${}^{\alpha }\!{\mathcal {T}} = ({}^{\alpha }\! \bar{\varvec{F}}^\text {RVE} , {}^{\alpha }\! \bar{\varvec{P}}^\text {RVE})$ consisting of applied deformations and corresponding stresses are then collected in the set ${\mathcal {D}}_i^\text {RVE}$.
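Both options of Eq. (28) can be verified numerically. In the sketch below, the rotation $\varvec{Q}$, the deformation gradient and the fiber direction are hypothetical sample values, and $I_4 = \varvec{A} \cdot \bar{\varvec{C}} \cdot \varvec{A}$ is used as a representative mixed invariant.

```python
import numpy as np

# Rotation by 90 degrees about e_2 that maps e_1 onto e_3 (hypothetical setup).
Q = np.array([[0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0]])
A_macro = np.array([1.0, 0.0, 0.0])
A_rve = Q @ A_macro

F_macro = np.array([[1.3, 0.1, 0.0],
                    [0.0, 0.9, 0.2],
                    [0.0, 0.0, 1.1]])
C_macro = F_macro.T @ F_macro

F_rve_1 = F_macro @ Q.T        # first option in Eq. (28)
F_rve_2 = Q @ F_macro @ Q.T    # second option in Eq. (28)

C_rve_1 = F_rve_1.T @ F_rve_1
C_rve_2 = F_rve_2.T @ F_rve_2
C_target = Q @ C_macro @ Q.T

# Fiber stretch invariant I4 = A.C.A evaluated in both frames.
I4_macro = A_macro @ C_macro @ A_macro
I4_rve = A_rve @ C_rve_2 @ A_rve
```

The two transformed deformation gradients differ only by the rigid body rotation $\varvec{Q}$, i. e., `F_rve_2` equals `Q @ F_rve_1`, which leaves the right Cauchy–Green tensor unchanged.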

As already mentioned above, it is possible that data tuples in the set ${\mathcal {D}}_i^\text {RVE}$ are multiple with respect to ${\mathcal {D}}_i^\text {RVE}$ itself and/or ${\mathcal {D}}_i$. Thus, a further filtering step is performed, where multiple tuples are sorted out with respect to ${\mathcal {D}}_{i}$ and ${\mathcal {D}}_i^\text {RVE}$ within a given tolerance $\varepsilon _\text {tol,2}$. To this end, a transformation to the invariant space is used again, see Sect. 3.3.1. Finally, the enriched data set for the next iteration of the multiscale loop follows from ${\mathcal {D}}_{i+1} = {\mathcal {D}}_{i} \cup {\mathcal {D}}_{i}^\text {new}$, where ${\mathcal {D}}_{i}^\text {new}$ only contains relevant and unique tuples of the added deformations and corresponding stresses.
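The filtering step can be read as a greedy selection in invariant space. The sketch below is one possible realization under that assumption: the function name is hypothetical, the deviation is measured as an absolute relative value, and the outcome depends on the order in which candidates are visited.

```python
import numpy as np

def filter_tuples(I_known, I_candidates, tol):
    """Greedy filter in invariant space.

    A candidate is kept if it deviates by more than `tol` (relative, in at
    least one invariant) from all known states and all candidates kept so far.
    Returns the indices of the kept candidates.
    """
    reference = [np.asarray(I, dtype=float) for I in I_known]
    kept = []
    for idx, I in enumerate(np.asarray(I_candidates, dtype=float)):
        is_new = all(np.max(np.abs(I - R) / np.abs(R)) > tol for R in reference)
        if is_new:
            kept.append(idx)
            reference.append(I)
    return kept
```

The tuples selected by the returned indices then form ${\mathcal {D}}_{i}^\text {new}$, which is merged with ${\mathcal {D}}_{i}$.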

4 Examples

In order to demonstrate the ability of the developed data-driven multiscale approach FE${}^\textit{ANN}$ described in Sect. 3, it is applied to the solution of three numerical examples within this section. Specifically, three macroscopic structures–a cuboid with a circular hole, a torsional sample and the Cook membrane–are considered. All of them consist of a fiber reinforced composite revealing a highly nonlinear behavior of the individual phases.

4.1 Microscopic properties of the composite

4.1.1 Constitutive behavior

Table 1 Material parameters for matrix and fiber phases of the composite described by the Ogden model (31). Initial shear modulus $G^\text {init}$ and Poisson’s ratio $\nu ^\text {init}$ as well as parameter sets $\mu _p$, $\alpha _p$ and $\kappa $. The parameters of the matrix phase are chosen according to Kalina et al. [38]. The initial elastic constants are calculated as given in Footnote 10

The constitutive behavior of the composite’s individual components, i. e., fibers and matrix, is described by a hyperelastic Ogden model [65]. It is given by the free energy density function

$$\begin{aligned}&\psi := \sum _{p=1}^{N_\text {O}} \frac{\mu _p}{\alpha _p} \left[ \sum _{\beta =1}^{N_\lambda } \nu _\beta \left( \lambda _\beta ^\text {iso}\right) ^{\alpha _p} -3 \right] + \frac{\kappa }{4}(J^2 - 2 \ln J -1), \end{aligned}$$

(29)

where $\mu _p,\alpha _p \in \mathbb {R}$ and $\kappa \in \mathbb {R}_+$ are model parameters.^{Footnote 10} The symbol $\mathbb {R}_+ \ni \lambda _\beta ^\text {iso} := J^{-1/3} \lambda _\beta $ denotes the isochoric principal stretches following from the Flory split [16]:

$$\begin{aligned} \varvec{F} = \varvec{F}^\text {vol} \cdot \varvec{F}^\text {iso} \; \text {with} \; \varvec{F}^\text {iso}=J^{-1/3} \varvec{F} \; \text {and} \; \det \varvec{F}^\text {iso} \equiv 1 \; , \end{aligned}$$

(30)

i. e., the multiplicative decomposition of $\varvec{F}$ into volumetric $\varvec{F}^\text {vol}$ and isochoric $\varvec{F}^\text {iso}$ parts. Furthermore, in order to guarantee a physically meaningful behavior, the parameters $\alpha _p$ and $\mu _p$ have to be restricted by the following constraints: $(\alpha _p < -1 \, \vee \, \alpha _p\ge 2$ and $\mu _p \alpha _{p} > 0) \, \forall p \in \{1,2,\dots ,N_\text {O}\}$ [65]. The stress of the Ogden model is determined by using Eq. (5) and reads

$$\begin{aligned} \varvec{T}&= \sum _{\beta =1}^{N_\lambda } \left[ \frac{1}{\lambda _\beta ^2} \sum _{p=1}^{N_\text {O}} \mu _p \left( (\lambda _\beta ^\text {iso})^{\alpha _p} -\frac{1}{3} \sum _{\gamma =1}^{N_\lambda } \nu _\gamma (\lambda _\gamma ^\text {iso})^{\alpha _p}\right) \right. \nonumber \\&\quad + \left. \frac{\kappa }{2} \lambda _\beta ^{-2} (J^2-1) \right] \varvec{P}^\beta \; . \end{aligned}$$

(31)

The calculation of the projection tensors $\varvec{P}^\beta $ related to the right Cauchy–Green deformation tensor $\varvec{C}$ is given in Eq. (3).
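The free energy of Eq. (29) can be evaluated directly from the principal stretches. The following sketch assumes that $\nu _\beta $ denotes the multiplicity of the stretch $\lambda _\beta $, so that summing over all three principal stretches is equivalent. The one-term parameter set is hypothetical and does not correspond to Table 1.

```python
import numpy as np

def ogden_energy(F, mu, alpha, kappa):
    """Eq. (29) evaluated from the principal stretches of F.

    All three principal stretches are summed, which equals the sum over
    distinct stretches weighted by their multiplicities nu_beta.
    """
    C = F.T @ F
    lam = np.sqrt(np.linalg.eigvalsh(C))   # principal stretches
    J = np.prod(lam)                       # equals det F for det F > 0
    lam_iso = J ** (-1.0 / 3.0) * lam      # isochoric stretches
    psi_iso = sum(m / a * (np.sum(lam_iso**a) - 3.0) for m, a in zip(mu, alpha))
    psi_vol = kappa / 4.0 * (J**2 - 2.0 * np.log(J) - 1.0)
    return psi_iso + psi_vol

# Hypothetical one-term parameter set (alpha = 2 gives a neo-Hookean response).
mu, alpha, kappa = [1.0], [2.0], 10.0
psi_0 = ogden_energy(np.eye(3), mu, alpha, kappa)
psi_1 = ogden_energy(np.diag([1.2, 1.0, 1.0]), mu, alpha, kappa)
```

For a single term with $\alpha _1 = 2$, the isochoric part reduces to $\frac{\mu _1}{2}(J^{-2/3}\,\text {tr}\,\varvec{C} - 3)$, which offers a simple consistency check of the implementation.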

Within the following numerical examples, the material parameters given in Table 1 are chosen. With this choice, a highly nonlinear behavior of the matrix phase is achieved, cf. [38]. The parameters of the fibers are chosen in such a way that a neo-Hookean model results. Note that the selected parameters are not related to a real material.

4.1.2 Microstructure and RVE definition

Besides the constitutive behavior of the individual components, the geometric arrangement of the individual phases has to be defined. In order to choose a realistic microstructure, it is characterized by a monodisperse stochastic arrangement of fibers with $\phi ={30}\,\%$ volume fraction here.^{Footnote 11} Furthermore, a minimum distance of $d:=0.4 R$ with respect to the fiber radius $R={10}\,\upmu \hbox {m}$ is chosen.

According to the homogenization concept, a suitable RVE which is sufficiently large to capture the essential statistical properties of the microstructure is needed. Herein, this condition is checked by applying the $\chi ^2$-test proposed by Gitman et al. [30]. Thereby, n random RVEs with a fixed number of inclusions and volume fraction are generated. The effective response of these RVEs is then determined by a computational homogenization, whereby a specific load case is chosen depending on the property a of interest, e. g., this could be a stress or a stiffness component. Based on these results, a statistical analysis is done by evaluating the scatter of a by means of the quantity

$$\begin{aligned} \chi ^2:=\sum _{i=1}^n\frac{\left( a_i - \langle a \rangle _n \right) ^2}{\langle a \rangle _n} \; \text {with} \; \langle a\rangle _{n}&:=\frac{1}{n} \sum _{i=1}^{n} a_i \; . \end{aligned}$$

(32)

If the accuracy of a is sufficient, i. e., $\chi ^2\le \varepsilon _\text {tol}$, the tested sample size is the final RVE size. Otherwise, the sample size has to be increased and the analysis is repeated.
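The quantity of Eq. (32) and the acceptance check can be written in a few lines. The sample values below are hypothetical and only illustrate the arithmetic.

```python
def chi_square(a):
    """Eq. (32): scatter of the property a over n RVE realizations."""
    n = len(a)
    mean = sum(a) / n
    return sum((a_i - mean) ** 2 for a_i in a) / mean

def rve_size_accepted(a, tol):
    """True if the tested RVE class meets chi^2 <= tol."""
    return chi_square(a) <= tol

# Hypothetical stress values from n = 4 realizations of one RVE class.
samples = [10.0, 10.5, 9.5, 10.0]
```

For these values the mean is 10 and the squared deviations sum to 0.5, so $\chi ^2 = 0.05$; with a tolerance of 0.025 this RVE class would be rejected and a larger one tested.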

The described $\chi ^2$-test has been applied to the fiber reinforced material. Thereby, a tolerance of $\varepsilon _\text {tol}={2.5}\,\%$ and a number of $n=10$ RVEs for each RVE-class with $N^\text {inc}$ inclusions have been chosen. As representative load cases, uniaxial tensions according to Eq. (16)${}_1$ are applied in the $x_1$-, $x_2$- and the $x_3$-direction, i. e., perpendicular and parallel to the fiber orientation. Thereby, the stress in the tension direction is evaluated according to Eq. (32). The prescribed tolerance is finally reached for the RVE-class with $N^\text {inc}=100$ fibers. The stress-stretch curves determined for the final RVE and the chosen RVE itself are depicted in Fig. 4. Accordingly, a highly nonlinear and anisotropic effective response occurs.

Generation and meshing of the periodic cells have been done by using the Python tool gmshModel^{Footnote 12}, whereby the placement of the fibers is realized via the random sequential adsorption (RSA) algorithm [76]. The RVEs were meshed with 10-node tetrahedral elements. A total of 94,456 elements is reached for the final RVE with 100 fibers.
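The idea of the RSA algorithm can be illustrated for the fiber cross-section in a periodic square cell. The sketch below is not the gmshModel implementation and does not use its API. It assumes that the minimum distance $d$ is measured between fiber surfaces; the function name, the seed and the attempt limit are hypothetical.

```python
import math
import random

def rsa_fibers(n_fibers, radius, phi, gap, seed=0, max_attempts=200_000):
    """Place fiber centres in a periodic square cell by random sequential adsorption.

    The cell edge follows from the fiber volume fraction phi. `gap` is taken
    as the minimum surface-to-surface distance (an assumption).
    """
    rng = random.Random(seed)
    L = math.sqrt(n_fibers * math.pi * radius**2 / phi)
    d_min = 2.0 * radius + gap
    centres = []
    for _ in range(max_attempts):
        if len(centres) == n_fibers:
            break
        x, y = rng.uniform(0.0, L), rng.uniform(0.0, L)
        accepted = True
        for cx, cy in centres:
            dx = abs(x - cx)
            dy = abs(y - cy)
            dx = min(dx, L - dx)   # shortest distance across periodic images
            dy = min(dy, L - dy)
            if math.hypot(dx, dy) < d_min:
                accepted = False
                break
        if accepted:
            centres.append((x, y))
    if len(centres) < n_fibers:
        raise RuntimeError("RSA did not place all fibers; increase max_attempts")
    return centres, L
```

A call such as `rsa_fibers(20, 10.0, 0.3, 4.0)` mirrors the parameters $R={10}\,\upmu \hbox {m}$, $\phi ={30}\,\%$ and $d=0.4R$ for a smaller cell with 20 fibers.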

4.2 Application of the data-driven multiscale framework for the simulation of macroscopic samples

After the definition of the composite’s microscopic properties, the data-driven framework FE${}^\textit{ANN}$ is applied for the simulation of several multiscale problems. The macroscopic geometries and applied BCs are depicted in Fig. 5.

In all examples, the following specifications and meta parameters have been used: The effective transversely isotropic behavior, which results from the RVE homogenization, is described by using the set $\bar{\underline{{\textbf {I}}}}^\parallel :=({\bar{I}}_1,{\bar{I}}_2,{\bar{I}}_3,{\bar{I}}_4, {\bar{I}}_5,{\bar{I}}_3^*)\in \mathbb {R}^{6\times 1}$ consisting of six invariants, cf. Sect. 2.2.3. As discussed in Sect. 3.2, the non-independent invariant ${\bar{I}}_3^*=1/{\bar{I}}_3$ is added to allow the growth condition to be satisfied by construction of the network architecture. As already mentioned, this requires additional constraints for the weights, e. g., Eq. (23). However, training under these constraints results in noticeably worse stress predictions. Thus, in order to achieve maximum prediction quality, the constraints were not considered within the training here. The adapted architecture is nevertheless used to allow a better comparability to the same network architecture trained with the constraint, see the study in Appendix A. The ANNs used as the surrogate model consist of only one hidden layer with 15 neurons which has been shown to be sufficiently accurate, where the networks are trained 25 times in each macroscopic iteration. For comparison see the study given in [38].

As shown in Fig. 4b, the fiber orientation of the RVE is given by $\varvec{A}^\text {RVE}=\varvec{e}_3$, where $\varvec{e}_3$ is the Cartesian base vector. In order to transform deformation states for the general case $\varvec{A}^\text {macro} \ne \varvec{A}^\text {RVE}$ by using Eq. (28), Rodrigues’ rotation formula

$$\begin{aligned} \varvec{Q} = \varvec{N} \otimes \varvec{N} + \cos (\alpha ) \left[ \varvec{1} - \varvec{N} \otimes \varvec{N}\right] + \sin (\alpha ) \varvec{N} \times \varvec{1} \; \end{aligned}$$

(33)

is applied to determine $\varvec{Q}$. Therein, $\varvec{N} := \varvec{A}^\text {macro} \times \varvec{A}^\text {RVE}$ is the unit normal vector and $\alpha := \measuredangle (\varvec{A}^\text {macro}, \varvec{A}^\text {RVE})$ the angle between the fiber directions.
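Eq. (33) translates into a short NumPy function. In this sketch, the cross product is normalized explicitly to obtain a unit axis, both fiber directions are assumed to be unit vectors, and the handling of parallel and antiparallel directions is an addition that is not discussed in the text.

```python
import numpy as np

def rotation_between(A_macro, A_rve):
    """Eq. (33): rotation Q about N by alpha, so that Q . A_macro = A_rve.

    Both directions are assumed to be unit vectors. The cross product is
    normalized here to obtain a unit axis N.
    """
    axis = np.cross(A_macro, A_rve)
    s = np.linalg.norm(axis)                 # sin(alpha)
    c = float(np.dot(A_macro, A_rve))        # cos(alpha)
    if s < 1e-12:
        if c > 0:
            return np.eye(3)                 # parallel directions: no rotation
        raise ValueError("antiparallel directions: rotation axis is not unique")
    N = axis / s
    N_cross = np.array([[0.0, -N[2], N[1]],
                        [N[2], 0.0, -N[0]],
                        [-N[1], N[0], 0.0]])  # matrix representation of N x 1
    NN = np.outer(N, N)
    return NN + c * (np.eye(3) - NN) + s * N_cross
```

For $\varvec{A}^\text {macro}=\varvec{e}_1$ and $\varvec{A}^\text {RVE}=\varvec{e}_3$, the function returns a rotation by $90^\circ $ about $-\varvec{e}_2$ which maps $\varvec{e}_1$ onto $\varvec{e}_3$.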

Finally, the tolerance for the detection of unique deformation states within the invariant space ${\mathcal {I}}^\parallel $ is set to $\varepsilon _\text {tol,1}={5}\,\%$. The tolerance for the filtering step is set to $\varepsilon _\text {tol,2}={1}\,\%$, cf. Sects. 3.3.1 and 3.3.2.

4.2.1 Cuboid under tension

As a first example, a cuboid with a circular hole is loaded by tension. To this end, a displacement of $\hat{\bar{\varvec{u}}} = 0.4 {\bar{L}}_{x_1}$ is prescribed at the top surface, where ${\bar{L}}_{x_1}$ is the initial length in $x_1$-direction. The fiber orientation is chosen as $\varvec{A}^\text {macro} = \varvec{e}_1$. The cuboid’s geometric dimensions are specified by ${\bar{L}}_{x_1}\times {\bar{L}}_{x_2} \times {\bar{L}}_{x_3} = (100\times 100 \times 25) \, \hbox {mm}$. The hole in the center has a radius of ${30}\,\hbox {mm}$.

Multiscale iterations. The individual iterations within the loop are described for the multiscale simulation of the cuboid in the following.

Table 2 Load cases for the generation of the initial data. The effective deformations $\bar{\varvec{F}}^\text {RVE}$ are prescribed according to Eqs. (16) and (17)

To start the algorithm, the initial data have to be generated first, cf. Sect. 3.1 step (a). Here, a total number of 18 load cases–six uniaxial tension and compression, six equibiaxial tension and compression as well as six simple shear states–are prescribed to the fiber reinforced RVE. Different loading directions are considered to collect knowledge about the composite’s overall anisotropy. The applied stretches and shears as well as the directions are given in Table 2. In order to avoid that data providing the same physical information are contained multiple times, a filtering process is applied in the invariant space ${\mathcal {I}}^\parallel $.^{Footnote 13} Within ${\mathcal {I}}^\parallel $, the collected states cover only a sparse region which is pervaded by a few paths consisting of 193 tuples ${}^{\alpha }\!{\mathcal {T}}$. This is shown in Fig. 6, where the set is exemplarily visualized in the sectional planes ${\bar{I}}_1$-${\bar{I}}_3$ and ${\bar{I}}_1$-${\bar{I}}_5$. However, due to the physical knowledge, which is incorporated into the ANN-based macroscopic surrogate model, this sparse data set is sufficient to initiate the multiscale scheme.

Now, the algorithm enters the multiscale loop with iteration 1 and the initial data set is labeled as ${\mathcal {D}}_1$. It is used to train the physics-constrained ANN which is afterwards utilized as the RVE surrogate model within the macroscopic simulation, cf. Sect. 3.2 steps (b) and (c). Note that the prediction of the sample’s macroscopic fields, i. e., displacement $\bar{\varvec{u}}(\varvec{X},t_j)$, deformation $\bar{\varvec{F}}(\varvec{X},t_j)$, stress $\bar{\varvec{P}}(\varvec{X},t_j)$, etc., is only an initial guess in this first iteration. This is due to the fact that the ANN does not know the occurring stress-deformation states for the most part at this point. However, it is still sufficient to identify new deformation states. Again, thanks to the physics-constrained architecture, the ANN will thereby not violate fundamental physical principles, although it must extrapolate into an unknown area. As described in Sect. 3.3.1, the identification of unique states is done by transforming the sample’s deformation states, collected into the set ${\mathcal {F}}^\text {macro}_1$, into the invariant space ${\mathcal {I}}^\parallel $ and doing a comparison to ${\mathcal {D}}_1$. As already mentioned, the full time series $\bar{\varvec{F}}(t_0), \bar{\varvec{F}}(t_1), \dots , \bar{\varvec{F}}(t_j)$ with $t_j \le t_\text {end}$ is added to ${\mathcal {F}}_1^\text {new}$ if a unique state is identified in the step $t_j$. These deformation paths are prescribed in the RVE simulation and the corresponding stresses $\bar{\varvec{P}}$ are determined. After the subsequent filtering process with the tolerance $\varepsilon _\text {tol,2}$, a total number of 495 unique tuples is identified. They are given in Fig. 6.

Table 3 Multiscale iterations of the cuboid under tension: tuples in the current data set ${\mathcal {D}}_i$ and the new states collected in ${\mathcal {D}}_i^\text {new}$ as well as reached time step $t_\text {end}$ within the macro simulation


In iteration 2, the new information is used to enrich the data set, ${\mathcal {D}}_2={\mathcal {D}}_1\cup {\mathcal {D}}_1^\text {new}$. This set is used to train the ANN which is then applied for the prediction of the sample’s macroscopic fields. As shown in Fig. 6, only a small number of new tuples is now detected from ${\mathcal {F}}_2^\text {macro}$, because the macroscopic simulation already terminates in time step $t_\text {end} = t_{12}<t_\text {goal}$, cf. Table 3.

However, with these new data, the macroscopic simulation in the subsequent third iteration shows that the sample’s deformation field maps to a new region within ${\mathcal {I}}^\parallel $ which consists of 60 tuples. The described process, consisting of the training process, macroscopic simulation, data analysis and data enrichment, is repeated until no further unique data are found, which is the case in iteration 5. Note that an overlap of tuples from ${\mathcal {D}}_i$ and ${\mathcal {D}}_i^\text {new}$ is possible, since a state already counts as unique if it deviates in only one of the six relevant invariants. Thus, an intersection in the depicted sectional planes may occur.
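The overall loop described above can be sketched in a few lines. All callables are hypothetical placeholders for the training, the macroscopic simulation and the RVE simulation, and the exact membership test stands in for the tolerance-based comparison in invariant space. The toy stand-ins at the bottom only serve to make the sketch executable.

```python
from typing import Callable, Dict, Iterable, Tuple


def multiscale_loop(
    initial_data: Dict[int, float],
    train: Callable[[Dict[int, float]], object],
    macro_simulation: Callable[[object], Iterable[int]],
    rve_simulation: Callable[[int], float],
    max_iterations: int = 20,
) -> Tuple[Dict[int, float], int]:
    """Repeat training, macro simulation, data analysis and data enrichment.

    Returns the final data set and the iteration in which no new state
    was found. All callables are hypothetical placeholders.
    """
    data = dict(initial_data)
    for iteration in range(1, max_iterations + 1):
        model = train(data)                       # training process
        visited = macro_simulation(model)         # macroscopic simulation
        new_states = sorted(s for s in set(visited) if s not in data)  # data analysis
        if not new_states:
            return data, iteration
        for state in new_states:                  # data enrichment
            data[state] = rve_simulation(state)
    raise RuntimeError("no convergence within max_iterations")


# Toy stand-ins: a state is an integer load level, the 'model' is the largest
# level seen so far, and the macro run reaches at most two levels beyond it,
# mimicking a simulation that stops before the target level is reached.
GOAL = 6


def toy_train(data: Dict[int, float]) -> int:
    return max(data)


def toy_macro_simulation(reach: int) -> Iterable[int]:
    return range(0, min(reach + 2, GOAL) + 1)


def toy_rve_simulation(state: int) -> float:
    return float(state**2)


final_data, iterations = multiscale_loop(
    {0: 0.0, 1: 1.0}, toy_train, toy_macro_simulation, toy_rve_simulation
)
```

In this toy setting, every pass extends the reachable range until the target level is covered, and the loop stops in the first iteration that produces no unknown state.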

In addition, the macroscopic sample’s states ${\mathcal {F}}_5^\text {macro}\rightarrow {\mathcal {D}}_5^\text {macro}$ mapped to ${\mathcal {I}}^\parallel $ and the data set ${\mathcal {D}}_5$ of iteration 5 are depicted within the full invariant space in Fig. 7. It becomes apparent that ${\mathcal {D}}_5 \setminus {\mathcal {D}}_5^\text {macro}$ contains tuples which are not included in ${\mathcal {F}}_5^\text {macro}$. Thus, as described above, the predictions of iterations 1–4 are only necessary to gradually approach the correct result of the macroscopic simulation. However, since the collected deformations are prescribed within the RVE simulations to determine the corresponding stresses, no erroneous information can enter the data set.

The stress predictions of the ANN for all tuples in ${\mathcal {D}}_5$ are compared to the reference values of the RVE simulations in Fig. 8a. As shown there, an almost perfect prediction is obtained for both training and test data. A single loading path and the corresponding deformed RVE at time $t_\text {goal}$ are depicted exemplarily in Fig. 9.

In order to showcase the functionality of the autonomous data mining within FE${}^\textit{ANN}$, the convergence of the ANN’s stress prediction quality with respect to the data set ${\mathcal {D}}_5$ from the final multiscale iteration step is analyzed. To do so, $\bar{\varvec{T}}^\text {ANN}$ is calculated for all states in ${\mathcal {D}}_5$, where the weights of the respective multiscale iterations are used. As shown in Fig. 10, poor predictions occur within the first two iterations, because a wide range of relevant states within the invariant space ${\mathcal {I}}^\parallel $ is still unknown to the network at this point. Then, as a larger and larger area of ${\mathcal {I}}^\parallel $ is sampled with each successive multiscale iteration, the prediction quality continues to increase. However, the prediction quality within iteration 4 is almost as good as in the final iteration 5. This is because only a few new data tuples are added here, which are also relatively close to the area already sampled up to that point, cf. Fig. 6. Thus, it may be useful to apply an adjusted criterion to decide whether a deformation state is relevant or not, see Footnote 9. However, the presented method works reliably in any case.
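The adjusted criterion mentioned in Footnote 9 could look roughly as follows. This is a sketch only: it assumes a Gaussian process with a squared-exponential kernel of unit prior variance over the invariants, written directly in NumPy. The kernel, length scale, noise level, threshold and sample values are placeholder choices, and none of this is part of the presented implementation.

```python
import numpy as np


def rbf_kernel(X: np.ndarray, Y: np.ndarray, length_scale: float = 0.5) -> np.ndarray:
    """Squared-exponential kernel with unit prior variance."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def posterior_std(X_train: np.ndarray, X_query: np.ndarray,
                  length_scale: float = 0.5, noise: float = 1e-6) -> np.ndarray:
    """Posterior standard deviation of a GP at the query points.

    For fixed hyperparameters it depends only on where data were sampled,
    not on the observed energy values.
    """
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_query, length_scale)
    v = np.linalg.solve(K, K_s)
    var = 1.0 - np.sum(K_s * v, axis=0)
    return np.sqrt(np.clip(var, 0.0, None))


def needs_rve_simulation(X_train: np.ndarray, X_query: np.ndarray,
                         threshold: float = 0.3) -> np.ndarray:
    """Flag query states whose predictive uncertainty exceeds the threshold."""
    return posterior_std(X_train, X_query) > threshold


# Hypothetical states in a two-dimensional section of the invariant space.
sampled = np.array([[3.0, 1.0], [3.2, 1.0], [3.4, 1.1]])
candidates = np.array([[3.1, 1.0], [5.0, 2.0]])
flags = needs_rve_simulation(sampled, candidates)
```

With such a criterion, a candidate lying between already sampled states would be skipped, whereas a candidate far away from the data would trigger a new RVE simulation.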

Validation. In order to demonstrate the quality of the developed FE${}^\textit{ANN}$ approach, the full deformation space of the cuboid is analyzed and the individual states are prescribed in the RVE simulation. To this end, the symmetric right Cauchy–Green deformation tensors ${}^{\alpha }\!\bar{\varvec{C}}^\text {macro}_5$ are calculated from ${\mathcal {F}}_5^\text {macro}$ and included in the new set

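$$\begin{aligned} {\mathcal {C}}_5^\text {macro} := \left\{ {}^{\alpha }\!\bar{\varvec{C}}^\text {macro}_5 \right\} \; . \end{aligned}$$ (34)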

This set is then compared to the set ${\mathcal {D}}_5$ within the space of the tensor $\bar{\varvec{C}}$. To this end, the deformation gradients ${}^{\alpha }\!\bar{\varvec{F}}^\text {RVE}$ contained in ${\mathcal {D}}_5$ are rotated back to ${}^{\alpha }\!\bar{\varvec{F}}^\text {macro}$ by inverting Eq. (28)${}_2$. Subsequently, they are transformed to $\bar{\varvec{C}}$-values to enable a comparison with ${\mathcal {C}}_5^\text {macro}$. The unique states are then determined within a tolerance of ${5}\,\%$. The states contained in ${\mathcal {D}}_5$ and the unique, unknown states in the deformation space, i. e., ${\mathcal {C}}_5^\text {macro}\setminus {\mathcal {D}}_5$, are depicted in Fig. 11. Accordingly, a wide region of ${\mathcal {C}}_5^\text {macro}$ does not intersect with the training data set, which results from the transformation into the invariant space during the multiscale loop.

The unknown states are prescribed within RVE simulations to get the corresponding stress values. These stresses are then compared to the predictions of the ANN which has been trained with ${\mathcal {D}}_5$. As shown in Fig. 8b, an almost perfect prediction is also observed for these states. Consequently, the proposed FE${}^\textit{ANN}$ approach has been shown to be highly accurate.

Furthermore, in order to demonstrate that the chosen tolerance $\varepsilon _\text {tol,1}={5}\,\%$ within the multiscale loop is sufficient, the achieved macroscopic solution is compared to a reference solution in which this tolerance is set to ${2.5}\,\%$.^{Footnote 14} As shown in Fig. 12, the relative error with respect to the reference solution is below ${2}\,\%$ for the stress ${\bar{P}}_{11}$.

4.2.2 Torsional sample

After the detailed description of the multiscale loop within the solution process of the cuboid geometry, a further example is considered. Now, the FE${}^\textit{ANN}$ scheme is applied to the solution of a torsional sample with a circular hole given in Fig. 5b. Accordingly, the fiber direction of the macro sample is chosen as $\varvec{A}^\text {macro}=\varvec{e}_2$ in this example. Nevertheless, due to the applied transformation given by Eq. (28), the same RVE with $\varvec{A}^\text {RVE}=\varvec{e}_3$ is used. The torsional sample is loaded by specifying a distortion of $\hat{{\bar{\phi }}} = 45^\circ $ around the $x_1$-axis. The sample’s geometric dimensions are specified by ${\bar{L}}_{x_1}\times {\bar{L}}_{x_2} \times {\bar{L}}_{x_3} = (200\times 100 \times 100) \, \hbox {mm}$. The hole in the center has a radius of ${40}\,\hbox {mm}$.

In order to minimize the computational effort, the scheme is initiated by using the collected data set ${\mathcal {D}}_5$ from the previous example as the initial data. In this way, a knowledge base is created for a specific material under consideration, which can be used for further simulations and, at the same time, can be continuously expanded.

The multiscale loop now terminates after only 2 iterations, which is due to the described initiation with the available data set. As shown exemplarily for the sectional planes ${\bar{I}}_1$-${\bar{I}}_3$ and ${\bar{I}}_1$-${\bar{I}}_5$ in Fig. 13, a wide range of relevant states is already covered by the states extracted from the cuboid simulation. Thus, the advantage of a transformation into the invariant space again becomes very clear. Although the deformations of the cuboid tensile specimen and the torsional specimen are very different in the $\bar{\varvec{C}}$-space, both overlap clearly in invariant space, which is in accordance with [38]. Owing to the geometry and the anisotropic nonlinear elastic behavior, a complex deformation of the macroscopic sample occurs, cf. the surface plot of ${\bar{C}}_{12}$ on the deformed geometry in Fig. 14.
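The overlap in invariant space despite different $\bar{\varvec{C}}$-values can be made concrete with a small check. The sketch assumes the fiber direction $\varvec{e}_3$ and uses the principal invariants together with $I_4 = \varvec{A}\cdot\bar{\varvec{C}}\cdot\varvec{A}$ and $I_5 = \varvec{A}\cdot\bar{\varvec{C}}^2\cdot\varvec{A}$ as an illustrative stand-in for the invariant set of this work. The stretch value is arbitrary.

```python
import numpy as np

A = np.array([0.0, 0.0, 1.0])  # fiber direction, assumed to be e_3


def invariants(F: np.ndarray) -> np.ndarray:
    """Invariants of C = F^T F for transverse isotropy (assumed set)."""
    C = F.T @ F
    I1 = np.trace(C)
    I2 = 0.5 * (I1**2 - np.trace(C @ C))
    I3 = np.linalg.det(C)
    I4 = A @ C @ A
    I5 = A @ C @ C @ A
    return np.array([I1, I2, I3, I4, I5])


stretch = 1.3
F_x1 = np.diag([stretch, 1.0, 1.0])  # stretch perpendicular to the fibers
F_x2 = np.diag([1.0, stretch, 1.0])  # stretch perpendicular to the fibers
F_x3 = np.diag([1.0, 1.0, stretch])  # stretch along the fibers

same_in_C_space = np.allclose(F_x1.T @ F_x1, F_x2.T @ F_x2)
same_in_invariant_space = np.allclose(invariants(F_x1), invariants(F_x2))
fiber_loading_differs = not np.allclose(invariants(F_x1), invariants(F_x3))
```

The two stretches perpendicular to the fibers are distinct points in $\bar{\varvec{C}}$-space but coincide in invariant space, so one RVE simulation serves both. The stretch along the fibers remains a separate state.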

4.2.3 Cook’s membrane

As a last example, Cook’s membrane is simulated by using the developed data-driven multiscale scheme. The fiber direction within the membrane is set to $\varvec{A}^\text {macro}=(\varvec{e}_1 + \varvec{e}_3) / \sqrt{2}$. To initiate the scheme, the data set from the previous example is used, which also contains the data collected within the first example.

Again, a fast convergence of the multiscale loop is achieved after only 2 overall iterations. As shown exemplarily for the sectional planes ${\bar{I}}_1$-${\bar{I}}_3$ and ${\bar{I}}_1$-${\bar{I}}_5$ in Fig. 15, a wide range of relevant states is already covered by the states extracted from the cuboid and the torsional sample simulations. Thus, as already mentioned, the advantage of a transformation into the invariant space is underpinned. Although the deformations of the cuboid tensile specimen, the torsional specimen and the Cook membrane are very different in the $\bar{\varvec{C}}$-space, they overlap clearly in invariant space. The deformed macroscopic states with the right Cauchy–Green deformation ${\bar{C}}_{11}$ are depicted in Fig. 16 for the time steps $t_{10}$, $t_{15}$ and $t_{25}$, where the load $\hat{\bar{\varvec{p}}}$ is applied linearly within the steps $t_j\in \{t_0,t_1,\dots ,t_{25}\}$. Since the fibers are oriented obliquely with respect to the alignment of the Cook membrane in the $x_1$-$x_2$-plane, an out-of-plane deformation of the membrane occurs. This results from the coupling between shear and tension, which is a well-known effect in fiber reinforced materials. Although a relatively complex response behavior occurs here, the multiscale problem can be solved quickly and without further human supervision even in this example.

5 Conclusions

In this work, a novel data-driven multiscale approach called FE${}^\textit{ANN}$ is presented. It is based on physics-constrained ANNs which are used as highly efficient surrogate models and an unsupervised data mining process. The approach allows the efficient simulation of materials with complex underlying microstructures which reveal an overall anisotropic and nonlinear elastic behavior on the macroscale, e. g., composites, architectured materials with pronounced microstructure or foams. The framework has been implemented in such a way that it can be used on an HPC cluster running the SLURM batch system.

Starting from basic kinematics and stress measure definitions, a short review of anisotropic hyperelastic constitutive models at finite strains is given. Furthermore, a Hill-type homogenization framework is described in brief. On this theoretical basis, the developed data-driven multiscale scheme is illustrated in detail. This includes the general procedure and a description of the single steps: initial data generation (a), training process of the ANN (b), macroscopic simulation (c), data analysis (d) and data enrichment (e). Afterwards, the approach is exemplarily applied to the solution of three demonstrative examples: a cuboid under tension, a torsional sample and the Cook membrane. The considered macroscopic bodies consist of a fiber reinforced composite revealing a highly nonlinear behavior of the individual phases. Due to the incorporation of physical knowledge into the ANN-based surrogate model, only a small number of computationally expensive RVE simulations was needed to solve the considered macroscopic problems. Furthermore, a rather high accuracy of the surrogate model has been shown within a validation.

Altogether, the presented data-driven approach has been shown to be an efficient tool for the solution of complex multiscale problems at finite strains. Due to the implemented unsupervised data mining, it is universally applicable to various macroscopic geometries and BCs. In order to widen the scheme’s application area, several extensions have to be made in the future. For instance, an extension to further material symmetry groups [10] has to be made by integrating appropriate invariant sets into the implementation. Furthermore, to make the FE${}^\textit{ANN}$ approach even more general, the inclusion of a preprocessing step using tensor-basis ANNs which can discover type and orientation of the underlying anisotropy would be a valuable addition [22]. In order to incorporate additional physics, the usage of ANN-based models that account for polyconvexity is possible [45, 46, 72]. However, this may result in a reduction of the network’s prediction quality within the training regime. Finally, an extension to dissipative constitutive behavior [60, 61, 78] is needed. To do this, in addition to extending the ANN-based surrogate model to the more general path-dependent case, the data mining procedure must also be adapted. In contrast to the elastic case, complete time sequences within the deformation and possibly also the stress invariant space have to be compared, which will lead to additional difficulties.

Notes

In the following, the term physics-informed does not directly refer to the PINN-approach (physics-informed neural network) according to Raissi et al. [66]. In a PINN, the searched solution field approximated by the ANN, e. g., displacement $\varvec{u}(\varvec{x},t)$, is inserted into the governing partial differential equation (PDE) at collocation points. This expression is then added to the loss, so that the fulfillment of the PDE is enforced. In the context of constitutive modeling, however, the idea to enrich the ML approach with physical knowledge is applied in a similar way.
The Cayley–Hamilton theorem states that a second order tensor satisfies its own characteristic equation, i. e., $\varvec{C}^3 - I_1 \varvec{C}^2 + I_2 \varvec{C} -I_3 \varvec{1} = \varvec{0}$, where $I_1,I_2,I_3$ denote the principal invariants of $\varvec{C}$, cf. Eq. (9). Consequently, any power $\varvec{C}^n$ with $n\ge 3$ as well as the inverse $\varvec{C}^{-1}$ are expressible in terms of $\varvec{C}^2$, $\varvec{C}$ and $\varvec{1}$ [34].
Note that the more common expression $I_2 = \frac{1}{2} ({{\,\textrm{tr}\,}}^2 \varvec{C} - {{\,\textrm{tr}\,}}\varvec{C}^2)$ is equivalent to $I_2={{\,\textrm{tr}\,}}({{\,\textrm{Cof}\,}}\varvec{C})$. Both expressions can be transformed into each other by using the Cayley–Hamilton theorem, see Footnote 2.
Note that an unbounded activation function is necessary to fulfill the growth condition, i. e., ${\bar{\psi }}^\text {ANN}(\bar{\varvec{C}})\rightarrow \infty $ as $({\bar{J}}\rightarrow \infty \vee {\bar{J}} \rightarrow 0^+)$. Besides the Softplus activation function, further choices are possible, e. g., ELU or ReLU. However, in contrast to the Softplus function, they are not twice continuously differentiable.
The thermodynamic consistency is a priori fulfilled since Eq. (5) is used for the stress calculation based on the free energy defined by the ANN. This is true although the stresses and thus also the energy following from the computational homogenization can only be approximated by the network because $-\text {d}_t{\bar{\psi }}^\text {ANN} + \bar{\varvec{P}}^\text {ANN} : \dot{\bar{\varvec{F}}} = 0$ holds for arbitrary $\dot{\bar{\varvec{F}}}$.
It should be noted that the ANN-based surrogate is only a mean field optimizer and thus will not exactly represent the stresses at the training points. This could be achieved by using e. g. Gaussian process regression (GPR), see [17, 23].
Due to local minima within the loss function, the optimization procedure which is applied here depends on the starting values of the weights and biases. Thus, the network is trained several times to overcome this.
Note that it is useful to save the time series of a new deformation state, i. e., all states $\bar{\varvec{F}}(t_0,t_1,\dots ,t_j)$ occurring at a fixed quadrature point. This facilitates the application of the macroscopic deformation within the computational homogenization later on. Furthermore, an extension to path-dependent materials makes this mandatory.
To further reduce the number of RVE simulations, the use of another criterion for the selection of relevant states could be meaningful. To this end, an additional metamodel for the energy ${\bar{\psi }}$ depending on ${\bar{I}}_\alpha $, e.g., Gaussian process regression (GPR), could be used to quantify the uncertainty at an arbitrary point. Based on that, it is possible to decide whether a state is relevant and an RVE simulation has to be started or not.
Note that the introduced material parameters are related to initial shear modulus $G^\text {init}$ and Poisson’s ratio $\nu ^\text {init}$ via the following relations:
$$\begin{aligned} G^\text {init}=\frac{1}{2} \sum _{p=1}^{N_\text {O}} \alpha _p \mu _p \quad \text {and} \quad \kappa = \frac{2}{3} G^\text {init} \frac{1+\nu ^\text {init}}{1-2\nu ^\text {init}} \; . \end{aligned}$$
It should be noted that it is not sufficient to represent the microstructure of a fiber reinforced composite with an overall transversely isotropic behavior by a hexagonal unit cell if finite strains are considered. This is due to the loss of the material symmetry which results from the deformation of the RVE cell, cf. Appendix B.
The Python tool gmshModel is freely accessible under https://gmshmodel.readthedocs.io/en/latest/.
Due to the material symmetry (transverse isotropy), several loadings are equivalent from the material’s point of view, e. g., uniaxial tension in $x_1$- or $x_2$-direction. Note that the fiber direction of the RVE is $\varvec{A}^\text {RVE}=\varvec{e}_3$.
For lack of a fully coupled FE${}^2$ implementation, this comparison is used.
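The identities quoted in the notes on the Cayley–Hamilton theorem and on $I_2$ can be verified numerically. The following sketch uses NumPy and an arbitrarily chosen invertible deformation gradient. The helper name `cofactor` and the numerical values are illustrative.

```python
import numpy as np

# Arbitrary deformation gradient with det F > 0, so that C is positive definite.
F = np.array([[1.2, 0.1, 0.0], [0.0, 0.9, 0.2], [0.05, 0.0, 1.1]])
C = F.T @ F
identity = np.eye(3)


def cofactor(M: np.ndarray) -> np.ndarray:
    """Cofactor Cof M = det(M) * M^{-T} of an invertible tensor."""
    return np.linalg.det(M) * np.linalg.inv(M).T


# Principal invariants of C.
I1 = np.trace(C)
I2 = 0.5 * (I1**2 - np.trace(C @ C))
I3 = np.linalg.det(C)

# Cayley-Hamilton: C^3 - I1 C^2 + I2 C - I3 1 = 0.
residual = C @ C @ C - I1 * (C @ C) + I2 * C - I3 * identity

# Alternative expression I2 = tr(Cof C).
I2_from_cofactor = np.trace(cofactor(C))

# Inverse expressed by C^2, C and 1, obtained by multiplying the theorem with C^{-1}.
C_inv = (C @ C - I1 * C + I2 * identity) / I3
```

The residual vanishes up to round-off, both expressions for $I_2$ agree, and the inverse obtained from the theorem matches the directly computed one.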

References

Alnæs M, Blechta J, Hake J, Johansson A, Kehlet B, Logg A, Richardson C, Ring J, Rognes ME, Wells GN (2015) The FEniCS project version 1.5. Arch Numer Softw 3(100):9–23
As’ad F, Avery P, Farhat C (2022) A mechanics-informed artificial neural network approach in data-driven constitutive modeling. In: AIAA SCITECH 2022 Forum. American Institute of Aeronautics and Astronautics, San Diego, CA & Virtual. https://doi.org/10.2514/6.2022-0100
Bock FE, Aydin RC, Cyron CJ, Huber N, Kalidindi SR, Klusemann B (2019) A review of the application of machine learning and data mining approaches in continuum materials mechanics. Front Mater 6:110. https://doi.org/10.3389/fmats.2019.00110
Bonatti C, Mohr D (2022) On the importance of self-consistency in recurrent neural network models representing elasto–plastic solids. J Mech Phys Solids 158:104697. https://doi.org/10.1016/j.jmps.2021.104697
Breuer K, Stommel M (2021) Prediction of short fiber composite properties by an artificial neural network trained on an RVE database. Fibers 9(2):8. https://doi.org/10.3390/fib9020008
Carrara P, De Lorenzis L, Stainier L, Ortiz M (2020) Data-driven fracture mechanics. Comput Methods Appl Mech Eng 372:113390. https://doi.org/10.1016/j.cma.2020.113390
Chung I, Im S, Cho M (2021) A neural network constitutive model for hyperelasticity based on molecular dynamics simulations. Int J Numer Meth Eng 122(1):5–24. https://doi.org/10.1002/nme.6459
Ciftci K, Hackl K (2022) Model-free data-driven simulation of inelastic materials using structured data sets, tangent space information and transition rules. Comput Mech 70(2):425–435. https://doi.org/10.1007/s00466-022-02174-x
Coleman BD, Noll W (1963) The thermodynamics of elastic materials with heat conduction and viscosity. Arch Ration Mech Anal 13(1):167–178
Ebbing V (2010) Design of polyconvex energy functions for all anisotropy classes. No. 8 in Bericht / Universität Duisburg-Essen, Institut für Mechanik, Abt. Bauwissenschaften. Inst. für Mechanik, Abt. Bauwissenschaften, Essen. OCLC: 750952548
Eggersmann R, Kirchdoerfer T, Reese S, Stainier L, Ortiz M (2019) Model-free data-driven inelasticity. Comput Methods Appl Mech Eng 350:81–99. https://doi.org/10.1016/j.cma.2019.02.016
Fernández M, Jamshidian M, Böhlke T, Kersting K, Weeger O (2020) Anisotropic hyperelastic constitutive models for finite deformations combining material theory and data-driven approaches with application to cubic lattice metamaterials. Comput Mech. https://doi.org/10.1007/s00466-020-01954-7
Feyel F, Chaboche JL (2000) FE2 multiscale approach for modelling the elastoviscoplastic behaviour of long fibre SiC/Ti composite materials. Comput Methods Appl Mech Eng 183(3):309–330. https://doi.org/10.1016/S0045-7825(99)00224-8
Field D, Ammouche Y, Peña JM, Jérusalem A (2021) Machine learning based multiscale calibration of mesoscopic constitutive models for composite materials: application to brain white matter. Comput Mech 67(6):1629–1643. https://doi.org/10.1007/s00466-021-02009-1
Fleischhauer R, Thomas T, Kato J, Terada K, Kaliske M (2020) Finite thermo-elastic decoupled two-scale analysis. Int J Numer Meth Eng 121(3):355–392. https://doi.org/10.1002/nme.6212
Flory P (1961) Thermodynamic relations for high elastic materials. Trans Faraday Soc 57:829–838
Frankel AL, Jones RE, Swiler LP (2020) Tensor basis gaussian process models of hyperelastic materials. J Mach Learn Model Comput. https://doi.org/10.1615/.2020033325
Fritzen F, Fernández M, Larsson F (2019) On-the-fly adaptivity for nonlinear twoscale simulations using artificial neural networks and reduced order modeling. Front Mater 6:75. https://doi.org/10.3389/fmats.2019.00075
Fuchs A, Heider Y, Wang K, Sun W, Kaliske M (2021) DNN2: a hyper-parameter reinforcement learning game for self-design of neural network based elasto–plastic constitutive descriptions. Comput Struct 249:106505
Fuhg JN, Böhm C, Bouklas N, Fau A, Wriggers P, Marino M (2021) Model-data-driven constitutive responses: application to a multiscale computational framework. Int J Eng Sci 167:103522. https://doi.org/10.1016/j.ijengsci.2021.103522
Fuhg JN, Bouklas N (2022) On physics-informed data-driven isotropic and anisotropic constitutive models through probabilistic machine learning and space-filling sampling. Comput Methods Appl Mech Eng 394:114915. https://doi.org/10.1016/j.cma.2022.114915
Fuhg JN, Bouklas N, Jones RE (2022) Learning hyperelastic anisotropy from data via a tensor basis neural network. J Mech Phys Solids 168:105022. https://doi.org/10.1016/j.jmps.2022.105022
Fuhg JN, Marino M, Bouklas N (2022) Local approximate Gaussian process regression for data-driven constitutive models: development and comparison with neural networks. Comput Methods Appl Mech Eng 388:114217. https://doi.org/10.1016/j.cma.2021.114217
Fuhg JN, van Wees L, Obstalecki M, Shade P, Bouklas N, Kasemer M (2022) Machine-learning convex and texture-dependent macroscopic yield from crystal plasticity simulations. Materialia 23:101446. https://doi.org/10.1016/j.mtla.2022.101446
Gajek S, Schneider M, Böhlke T (2020) On the micromechanics of deep material networks. J Mech Phys Solids 142:103984. https://doi.org/10.1016/j.jmps.2020.103984
Gajek S, Schneider M, Böhlke T (2022) An FE-DMN method for the multiscale analysis of thermomechanical composites. Comput Mech. https://doi.org/10.1007/s00466-021-02131-0
Gebhart P, Wallmersperger T (2022) A constitutive macroscale model for compressible magneto-active polymers based on computational homogenization data: Part I - Magnetic linear regime. Int J Solids Struct 236–237:111294. https://doi.org/10.1016/j.ijsolstr.2021.111294
Ghaboussi J, Garrett JH, Wu X (1991) Knowledge-based modeling of material behavior with neural networks. J Eng Mech 117(1):132–153. https://doi.org/10.1061/(ASCE)0733-9399(1991)117:1(132)
Ghavamian F, Simone A (2019) Accelerating multiscale finite element simulations of history-dependent materials using a recurrent neural network. Comput Methods Appl Mech Eng 357:112594. https://doi.org/10.1016/j.cma.2019.112594
Gitman I, Askes H, Sluys L (2007) Representative volume: existence and size determination. Eng Fract Mech 74(16):2518–2534. https://doi.org/10.1016/j.engfracmech.2006.12.021
González D, Chinesta F, Cueto E (2019) Thermodynamically consistent data-driven computational mechanics. Continuum Mech Thermodyn 31(1):239–253. https://doi.org/10.1007/s00161-018-0677-z
Haasemann G, Kästner M, Ulbricht V (2006) Multi-scale modelling and simulation of textile reinforced materials. In Motasoares CA, Martins JAC, Rodrigues HC, Ambrósio JAC, Pina CAB, Motasoares CM, Pereira EBR, Folgado J (eds) III European Conference on Computational Mechanics. Springer Netherlands, pp 510–510. https://doi.org/10.1007/1-4020-5370-3_510
Hashash YMA, Jung S, Ghaboussi J (2004) Numerical implementation of a neural network based material model in finite element analysis: neural network based material model. Int J Numer Meth Eng 59(7):989–1005. https://doi.org/10.1002/nme.905
Haupt P (2000) Continuum mechanics and theory of materials. Springer, Berlin
He X, Chen JS (2022) Thermodynamically consistent machine-learned internal state variable approach for data-driven modeling of path-dependent materials. Comput Methods Appl Mech Eng. https://doi.org/10.1016/j.cma.2022.115348
Holzapfel GA (2000) Nonlinear solid mechanics—a continuum approach for engineering. Wiley, Chichester
Ibañez R, Abisset-Chavanne E, Aguado JV, Gonzalez D, Cueto E, Chinesta F (2018) A manifold learning approach to data-driven computational elasticity and inelasticity. Arch Comput Methods Eng 25(1):47–57. https://doi.org/10.1007/s11831-016-9197-9
Kalina KA, Linden L, Brummund J, Metsch P, Kästner M (2021) Automated constitutive modeling of isotropic hyperelasticity based on artificial neural networks. Comput Mech. https://doi.org/10.1007/s00466-021-02090-6
Kalina KA, Metsch P, Brummund J, Kästner M (2020) A macroscopic model for magnetorheological elastomers based on microscopic simulations. Int J Solids Struct 193–194:200–212. https://doi.org/10.1016/j.ijsolstr.2020.02.028
Kalina KA, Raßloff A, Wollner M, Metsch P, Brummund J, Kästner M (2020) Multiscale modeling and simulation of magneto-active elastomers based on experimental data. Phys Sci Rev. https://doi.org/10.1515/psr-2020-0012
Karapiperis K, Stainier L, Ortiz M, Andrade J (2021) Data-driven multiscale modeling in mechanics. J Mech Phys Solids 147:104239. https://doi.org/10.1016/j.jmps.2020.104239
Keip MA, Rambausek M (2017) Computational and analytical investigations of shape effects in the experimental characterization of magnetorheological elastomers. Int J Solids Struct 121:1–20. https://doi.org/10.1016/j.ijsolstr.2017.04.012
Kirchdoerfer T, Ortiz M (2016) Data-driven computational mechanics. Comput Methods Appl Mech Eng 304:81–101. https://doi.org/10.1016/j.cma.2016.02.001
Kirchdoerfer T, Ortiz M (2017) Data driven computing with noisy material data sets. Comput Methods Appl Mech Eng 326:622–641. https://doi.org/10.1016/j.cma.2017.07.039
Klein DK, Fernández M, Martin RJ, Neff P, Weeger O (2021) Polyconvex anisotropic hyperelasticity with neural networks. J Mech Phys Solids. https://doi.org/10.1016/j.jmps.2021.104703
Klein DK, Ortigosa R, Martínez-Frutos J, Weeger O (2022) Finite electro-elasticity with physics-augmented neural networks. Comput Methods Appl Mech Eng 400:115501. https://doi.org/10.1016/j.cma.2022.115501
Korzeniowski TF, Weinberg K (2021) Data-driven finite element method with RVE generated foam data. arXiv:2110.11129 [cs]
Koyanagi J, Kawamoto K, Higuchi R, Tan VBC, Tay TE (2021) Direct FE2 for simulating strain-rate dependent compressive failure of cylindrical CFRP. Compos Part C Open Access 5:100165. https://doi.org/10.1016/j.jcomc.2021.100165
Lange N, Hütter G, Kiefer B (2021) An efficient monolithic solution scheme for FE2 problems. Comput Methods Appl Mech Eng 382:113886. https://doi.org/10.1016/j.cma.2021.113886
Le BA, Yvonnet J, He QC (2015) Computational homogenization of nonlinear elastic materials using neural networks: neural networks-based computational homogenization. Int J Numer Meth Eng 104(12):1061–1084. https://doi.org/10.1002/nme.4953
Li B, Zhuang X (2020) Multiscale computation on feedforward neural network and recurrent neural network. Front Struct Civ Eng 14(6):1285–1298. https://doi.org/10.1007/s11709-020-0691-7
Linden L, Kalina KA, Brummund J, Metsch P, Kästner M (2021) Thermodynamically consistent constitutive modeling of isotropic hyperelasticity based on artificial neural networks. PAMM. https://doi.org/10.1002/pamm.202100144
Linka K, Hillgärtner M, Abdolazizi KP, Aydin RC, Itskov M, Cyron CJ (2021) Constitutive artificial neural networks: a fast and general approach to predictive data-driven constitutive modeling by deep learning. J Comput Phys 429:110010. https://doi.org/10.1016/j.jcp.2020.110010
Liu X, Tian S, Tao F, Yu W (2021) A review of artificial neural networks in the constitutive modeling of composite materials. Compos Part B Eng 224:109152. https://doi.org/10.1016/j.compositesb.2021.109152
Liu Z, Wu C (2019) Exploring the 3D architectures of deep material network in data-driven multiscale mechanics. J Mech Phys Solids 127:20–46. https://doi.org/10.1016/j.jmps.2019.03.004
Liu Z, Wu C, Koishi M (2019) A deep material network for multiscale topology learning and accelerated nonlinear modeling of heterogeneous materials. Comput Methods Appl Mech Eng 345:1138–1168. https://doi.org/10.1016/j.cma.2018.09.020
Logg A, Mardal KA, Wells G (2012) Automated solution of differential equations by the finite element method: the FEniCS book, vol 84. Springer, New York
Lu X, Giovanis DG, Yvonnet J, Papadopoulos V, Detrez F, Bai J (2019) A data-driven computational homogenization method based on neural networks for the nonlinear anisotropic electrical response of graphene/polymer nanocomposites. Comput Mech 64(2):307–321. https://doi.org/10.1007/s00466-018-1643-0
Malik A, Abendroth M, Hütter G, Kiefer B (2021) A hybrid approach employing neural networks to simulate the elasto–plastic deformation behavior of 3D-foam structures. Adv Eng Mater. https://doi.org/10.1002/adem.202100641
Masi F, Stefanou I (2022) Multiscale modeling of inelastic materials with thermodynamics-based artificial neural networks (TANN). Comput Methods Appl Mech Eng 398:115190. https://doi.org/10.1016/j.cma.2022.115190
Masi F, Stefanou I, Vannucci P, Maffi-Berthier V (2021) Thermodynamics-based artificial neural networks for constitutive modeling. J Mech Phys Solids 147:104277. https://doi.org/10.1016/j.jmps.2020.104277
Miehe C, Koch A (2002) Computational micro-to-macro transitions of discretized microstructures undergoing small strains. Arch Appl Mech (Ingenieur Archiv) 72(4–5):300–317. https://doi.org/10.1007/s00419-002-0212-2
Montáns FJ, Chinesta F, Gómez-Bombarelli R, Kutz JN (2019) Data-driven modeling and learning in science and engineering. C R Méc 347(11):845–855. https://doi.org/10.1016/j.crme.2019.11.009
Article Google Scholar
Nguyen LTK, Keip MA (2018) A data-driven approach to nonlinear elasticity. Comput Struct 194:97–115. https://doi.org/10.1016/j.compstruc.2017.07.031
Article Google Scholar
Ogden RW (1997) Non-linear elastic deformations. Dover Publications, Mineola
Google Scholar
Raissi M, Perdikaris P, Karniadakis G (2019) Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 378:686–707. https://doi.org/10.1016/j.jcp.2018.10.045
Article MathSciNet MATH Google Scholar
Saito R, Yamanaka Y, Matsubara S, Okabe T, Moriguchi S, Terada K (2021) A decoupling scheme for two-scale finite thermoviscoelasticity with thermal and cure-induced deformations. Int J Numer Methods Eng 122(4):1133–1166. https://doi.org/10.1002/nme.6575
Article MathSciNet Google Scholar
Schröder J, Hackl K (eds) (2014) Plasticity and beyond: microstructures, crystal-plasticity and phase transitions. No. 550 in Courses and lectures / International Centre for Mechanical Sciences. Springer, Wien. OCLC: 931441976
Schröder J, Labusch M, Keip MA (2016) Algorithmic two-scale transition for magneto-electro-mechanically coupled problems. Comput Methods Appl Mech Eng 302:253–280. https://doi.org/10.1016/j.cma.2015.10.005
Article MATH Google Scholar
Settgast C, Hütter G, Kuna M, Abendroth M (2020) A hybrid approach to simulate the homogenized irreversible elastic-plastic deformations and damage of foams by neural networks. Int J Plast 126:102624. https://doi.org/10.1016/j.ijplas.2019.11.003
Article Google Scholar
Stoffel M, Bamer F, Markert B (2018) Artificial neural networks and intelligent finite elements in non-linear structural mechanics. Thin Walled Struct 131:102–106. https://doi.org/10.1016/j.tws.2018.06.035
Article Google Scholar
Tac V, Sahli Costabal F, Tepole AB (2022) Data-driven tissue mechanics with polyconvex neural ordinary differential equations. Comput Methods Appl Mech Eng 398:115248. https://doi.org/10.1016/j.cma.2022.115248
Article MathSciNet MATH Google Scholar
Tac V, Sree VD, Rausch MK, Tepole AB (2022) Data-driven modeling of the mechanical behavior of anisotropic soft biological tissue. Eng Comput. https://doi.org/10.1007/s00366-022-01733-3
Article Google Scholar
Terada K, Kato J, Hirayama N, Inugai T, Yamamoto K (2013) A method of two-scale analysis with micro-macro decoupling scheme: application to hyperelastic composite materials. Comput Mech 52(5):1199–1219. https://doi.org/10.1007/s00466-013-0872-5
Article MathSciNet MATH Google Scholar
Thakolkaran P, Joshi A, Zheng Y, Flaschel M, De Lorenzis L, Kumar S (2022) NN-EUCLID: deep-learning hyperelasticity without stress data. J Mech Phys Solids 169:105076. https://doi.org/10.1016/j.jmps.2022.105076
Article MathSciNet Google Scholar
Torquato S (2002) Random heterogeneous materials: microstructure and macroscopic properties. No. 16 in Interdisciplinary applied mathematics. Springer, New York
Book MATH Google Scholar
Vlassis NN, Ma R, Sun W (2020) Geometric deep learning for computational mechanics part I: anisotropic hyperelasticity. Compu Methods Appl Mech Eng 371:113299. https://doi.org/10.1016/j.cma.2020.113299
Article MathSciNet MATH Google Scholar
Vlassis NN, Sun W (2021) Component-based machine learning paradigm for discovering rate-dependent and pressure-sensitive level-set plasticity models. J Appl Mech. https://doi.org/10.1115/1.4052684
Vlassis NN, Sun W (2021) Sobolev training of thermodynamic-informed neural networks for interpretable elasto–plasticity models with level set hardening. Compu Methods Appl Mech Eng 377:113695. https://doi.org/10.1016/j.cma.2021.113695
Article MathSciNet MATH Google Scholar
Vlassis NN, Zhao P, Ma R, Sewell T, Sun W (2022) Molecular dynamics inferred transfer learning models for finite-strain hyperelasticity of monoclinic crystals: Sobolev training and validations against physical constraints. Int J Numer Meth Eng 123(17):3922–3949. https://doi.org/10.1002/nme.6992
Article MathSciNet Google Scholar
Weber P, Geiger J, Wagner W (2021) Constrained neural network training and its application to hyperelastic material modeling. Comput Mech 68(5):1179–1204. https://doi.org/10.1007/s00466-021-02064-8
Article MathSciNet MATH Google Scholar
Wu L, Nguyen VD, Kilingar NG, Noels L (2020) A recurrent neural network-accelerated multi-scale model for elasto–plastic heterogeneous materials subjected to random cyclic and non-proportional loading paths. Compu Methods Appl Mech Eng 369:113234. https://doi.org/10.1016/j.cma.2020.113234
Article MathSciNet MATH Google Scholar
Yamamoto T, Okabe T, Terada K (2022) Numerical simulation for deformation of laminates combining the novel shell element with the decoupled two-scale viscoelastic analysis of FRP. Int J Solid Struct 234–235:111236. https://doi.org/10.1016/j.ijsolstr.2021.111236
Article Google Scholar
Yamazaki Y, Koyanagi J, Sawamura Y, Ridha M, Yoneyama S, Tay T (2018) Numerical simulation of dynamic failure behavior for cylindrical carbon fiber reinforced polymer. Compos Struct 203:934–942. https://doi.org/10.1016/j.compstruct.2018.06.075
Article Google Scholar
Yvonnet J (2019) Computational homogenization of heterogeneous materials with finite elements, solid mechanics and its applications, vol 258. Springer, Cham. https://doi.org/10.1007/978-3-030-18383-7
Book MATH Google Scholar
Zopf C, Kaliske M (2017) Numerical characterisation of uncured elastomers by a neural network based approach. Comput Struct 182:504–525. https://doi.org/10.1016/j.compstruc.2016.12.012
Article Google Scholar
Zschocke S, Leichsenring F, Graf W, Kaliske M (2022) A concept for data-driven computational mechanics in the presence of polymorphic uncertain properties. Eng Struct 267:114672. https://doi.org/10.1016/j.engstruct.2022.114672

Download references

Acknowledgements

All presented computations were performed on a PC-Cluster at the Center for Information Services and High Performance Computing (ZIH) at TU Dresden. The authors thus thank the ZIH for generous allocations of computer time. Finally, the authors want to thank Vincent Scholz for several discussions on the topic.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Institute of Solid Mechanics, TU Dresden, 01062, Dresden, Germany
Karl A. Kalina, Lennart Linden, Jörg Brummund & Markus Kästner

Corresponding author

Correspondence to Markus Kästner.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix: Fulfillment of the growth condition by adjusted physics-constrained ANNs

Within this appended section, it is discussed how the growth condition, i. e.,

$$\begin{aligned} {\bar{\psi }}^\text {ANN}(\bar{\varvec{C}})\rightarrow \infty \; \text {as} \; ({\bar{J}}\rightarrow \infty \vee {\bar{J}} \rightarrow 0^+) \; , \end{aligned}$$

(35)

can be fulfilled by construction of physics-constrained ANNs. Furthermore, a comparison of such an adapted network with a network neglecting the growth condition is given.

1.1 Network architecture and additional constraints

As discussed in Sect. 3.2, the ANN has to meet several requirements to fulfill Eq. (35) by construction. First, an unbounded activation function is necessary in any case. This requirement is fulfilled by using the Softplus activation function, cf. Eq. (19). Second, the additional invariant ${\bar{I}}_3^* := 1/{\bar{I}}_3$ has to be included in the argument list of ${\bar{\psi }}^\text {ANN}$, which enables the fulfillment of Eq. (35) for ${\bar{J}}\rightarrow 0^+$. Third, additional constraints have to be satisfied by the weights belonging to ${\bar{I}}_3$ and ${\bar{I}}_3^*$. A possible constraint is given by

$$\begin{aligned} \begin{aligned}&\left( W_\alpha> 0 \forall \alpha \in {\mathcal {N}}\right) \cdots \\ \cdots \wedge&\left( \exists \,w_{\alpha 3}>0 \text { with } \alpha \in {\mathcal {N}}\right) \wedge \left( \exists \,w^*_{\alpha 3}>0 \text { with } \alpha \in {\mathcal {N}}\right) \; , \end{aligned} \end{aligned}$$

(36)

which is a sufficient condition to guarantee that Eq. (35) holds. In the equation above, the set ${\mathcal {N}}:=\{1,2,\ldots ,N\}$ contains the indices of the hidden layer neurons. A proof that Eq. (36) is sufficient is given in the following. Taking into account the requirements given above, the following network with one hidden layer is considered:

$$\begin{aligned} {\bar{\psi }}^\text {ANN} := B + \sum _{\alpha =1}^{N} W_{\alpha } {\mathscr {S}}\mathscr {P}\Big (\!\sum _{\beta =1}^{n} w_{\alpha \beta } \bar{{\mathfrak {i}}}_\beta + \sum _{\gamma \in {\mathcal {A}}} w^*_{\alpha \gamma } \bar{{\mathfrak {i}}}^*_\gamma + b_\alpha \!\Big ). \end{aligned}$$

(37)
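To make the structure of Eq. (37) concrete, the following is a minimal NumPy sketch of the forward evaluation of such a network. The function and argument names (`psi_ann`, `i_bar`, `i_star`, `w_star`, ...) are illustrative assumptions and not taken from the authors' implementation; the array `i_star` stands for the additional normalized invariants indexed by $\gamma \in {\mathcal {A}}$, which in this appendix reduces to the single invariant built from ${\bar{I}}_3^*$.

```python
import numpy as np

def softplus(x):
    # Softplus ln(1 + exp(x)), evaluated in a numerically stable way.
    return np.logaddexp(0.0, x)

def psi_ann(i_bar, i_star, W, w, w_star, b, B):
    """Forward pass of the one-hidden-layer network of Eq. (37).

    i_bar  : (n,)   normalized invariants
    i_star : (m,)   normalized additional invariants (here m = 1)
    W      : (N,)   output-layer weights
    w      : (N, n) hidden-layer weights acting on i_bar
    w_star : (N, m) hidden-layer weights acting on i_star
    b      : (N,)   hidden-layer biases
    B      : float  output bias
    """
    z = w @ i_bar + w_star @ i_star + b   # argument of the activation
    return B + W @ softplus(z)

# Illustrative call with arbitrary (untrained) parameters.
rng = np.random.default_rng(0)
N, n, m = 4, 5, 1
psi = psi_ann(rng.normal(size=n), rng.normal(size=m),
              rng.uniform(0.1, 1.0, size=N), rng.normal(size=(N, n)),
              rng.normal(size=(N, m)), rng.normal(size=N), 0.0)
```

In an actual model the stresses would be obtained by differentiating this energy with respect to the deformation measure, which is not shown here.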

The following discussion is carried out using the example of the transversely isotropic material symmetry group considered in Sect. 4. However, the argumentation is equally valid for other symmetry groups which can be described by a set of invariants. Now, the purely volumetric deformation state $\bar{\varvec{F}}= {\bar{\lambda }} \varvec{1}$ is analyzed to study the ANN-based model’s behavior for ${\bar{J}}\rightarrow 0^+$ and ${\bar{J}} \rightarrow \infty $. For this state, the relevant invariants follow as

$$\begin{aligned} {\bar{I}}_1&= 3 {\bar{\lambda }}^2 \; , \; {\bar{I}}_2 = 3 {\bar{\lambda }}^4 \; , \; {\bar{I}}_3 = {\bar{\lambda }}^6 \; , \; {\bar{I}}_4={\bar{\lambda }}^2 \; , \; {\bar{I}}_5 = {\bar{\lambda }}^4 \nonumber \\ {\bar{I}}_3^*&= {\bar{\lambda }}^{-6} \; . \end{aligned}$$

(38)
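The values in Eq. (38) can be checked numerically. The sketch below assumes the classical definitions of the transversely isotropic invariants (the definitions from Sect. 2.2.3 are not repeated in this appendix) and a fiber direction along $x_3$. For the purely volumetric state, ${\bar{I}}_5 = {\bar{\lambda }}^4$ results both for $\text {tr}(\bar{\varvec{C}}^2 \varvec{G})$ and for $\text {tr}(\text {cof}\,\bar{\varvec{C}}\, \varvec{G})$, so this check does not depend on which of the two definitions is used.

```python
import numpy as np

def invariants_ti(F, a):
    """Invariants of C = F^T F for transverse isotropy (assumed definitions).

    F : (3, 3) deformation gradient
    a : (3,)   unit vector of the preferred (fiber) direction
    """
    C = F.T @ F
    G = np.outer(a, a)                     # structural tensor
    I1 = np.trace(C)
    I2 = 0.5 * (I1**2 - np.trace(C @ C))   # equals tr(cof C)
    I3 = np.linalg.det(C)
    I4 = np.trace(C @ G)
    I5 = np.trace(C @ C @ G)               # assumed definition of I5
    I3_star = 1.0 / I3                     # additional invariant
    return I1, I2, I3, I4, I5, I3_star

lam = 1.3                                  # arbitrary stretch
F = lam * np.eye(3)                        # purely volumetric state
a = np.array([0.0, 0.0, 1.0])              # assumed fiber direction x3
vals = invariants_ti(F, a)
expected = (3 * lam**2, 3 * lam**4, lam**6, lam**2, lam**4, lam**(-6))
print(np.allclose(vals, expected))
```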

Additionally, the normalization of the invariants according to

$$\begin{aligned} \bar{{\mathfrak {i}}}_\alpha ({\bar{I}}_\alpha ) :=\left[ {\bar{I}}_\alpha - \frac{{\bar{I}}_\alpha ^\text {max}+{\bar{I}}_\alpha ^\text {min}}{2}\right] \frac{2}{{\bar{I}}_\alpha ^\text {max}-{\bar{I}}_\alpha ^\text {min}} \end{aligned}$$

(39)

has to be taken into account, where ${\bar{I}}_\alpha ^\text {max} \in \mathbb {R}$ and ${\bar{I}}_\alpha ^\text {min} \in \mathbb {R}$ denote the maximum and minimum values of the respective invariant within a given training data set, i. e., these values are finite. As one can see from Eq. (39), the normalization is an affine map and thus has no influence on the respective power order. An evaluation of the normalized invariants for ${\bar{\lambda }} \rightarrow 0^+$ and ${\bar{\lambda }} \rightarrow \infty $ gives

$$\begin{aligned}&\lim _{{\bar{\lambda }} \rightarrow 0^+} \bar{{\mathfrak {i}}}_\alpha = -d_\alpha , \quad \lim _{{\bar{\lambda }} \rightarrow 0^+} \bar{{\mathfrak {i}}}^*_3 = \infty \; \text {and} \end{aligned}$$

(40)

$$\begin{aligned}&\lim _{{\bar{\lambda }} \rightarrow \infty } \bar{{\mathfrak {i}}}_\alpha = \infty , \quad \lim _{{\bar{\lambda }} \rightarrow \infty } \bar{{\mathfrak {i}}}^*_3 = -d_3^*, \end{aligned}$$

(41)

with $d_\alpha ,d_3^*\in \mathbb {R}_+$ respectively. Starting from Eq. (37) to analyze the case ${\bar{\lambda }} \rightarrow 0^+$, it follows

$$\begin{aligned} \begin{aligned} \lim _{{\bar{\lambda }} \rightarrow 0^+} {\bar{\psi }}^\text {ANN}&= \lim _{{\bar{\lambda }} \rightarrow 0^+} \Bigg [ B + \sum _{\alpha =1}^{N} W_{\alpha } \ln \Bigg (1+\exp \Bigg ( \sum _{\beta =1}^{n} w_{\alpha \beta } \bar{{\mathfrak {i}}}_\beta \\&\quad + w^*_{\alpha 3} \bar{{\mathfrak {i}}}^*_3 + b_\alpha \Bigg )\Bigg ) \Bigg ] \; . \end{aligned} \end{aligned}$$

(42)

Using Eqs. (40) and (41) and supposing that there exists at least one $w_{\alpha 3}^*>0$, one finds that

$$\begin{aligned} \lim _{{\bar{\lambda }} \rightarrow 0^+} {\bar{\psi }}^\text {ANN} = \lim _{{\bar{\lambda }} \rightarrow 0^+} {\bar{\lambda }}^{-6} \frac{2}{{\bar{I}}_3^\text {*,max}-{\bar{I}}_3^\text {*,min}} \underbrace{\sum _{\alpha =1}^N W_\alpha w^*_{\alpha 3} \theta (w^*_{\alpha 3})}_{C^*_3} \; , \end{aligned}$$

(43)

where $\theta : \mathbb {R}\rightarrow \{0,1\}$ denotes the Heaviside step function. Here, it has been used that the values $d_\alpha $, B and $b_\alpha $ are finite and thus negligible compared to $\bar{{\mathfrak {i}}}_3^*$. Consequently, it holds

$$\begin{aligned} \lim _{{\bar{\lambda }} \rightarrow 0^+} {\bar{\psi }}^\text {ANN} = \infty \; \text {if} \; C^*_3 > 0 \; . \end{aligned}$$

(44)
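The step from Eq. (42) to Eq. (43) relies on the asymptotic behavior of the Softplus function. As a brief sketch, with $z_\alpha $ introduced here only as shorthand for the argument of the activation function of neuron $\alpha $, one has

$$\begin{aligned} \ln (1+e^{z}) = z + \ln (1+e^{-z}) \sim z \; \text {as} \; z\rightarrow \infty \; , \quad \ln (1+e^{z}) \rightarrow 0 \; \text {as} \; z\rightarrow -\infty \; . \end{aligned}$$

For ${\bar{\lambda }} \rightarrow 0^+$, Eq. (40) gives $z_\alpha = w^*_{\alpha 3} \bar{{\mathfrak {i}}}^*_3 + {\mathcal {O}}(1)$. Neurons with $w^*_{\alpha 3}>0$ thus contribute $W_\alpha w^*_{\alpha 3} \bar{{\mathfrak {i}}}^*_3$ to leading order, whereas neurons with $w^*_{\alpha 3}\le 0$ only yield a bounded contribution. This is what the Heaviside factor in Eq. (43) expresses, and the factor ${\bar{\lambda }}^{-6}\, 2/({\bar{I}}_3^\text {*,max}-{\bar{I}}_3^\text {*,min})$ is the leading term of $\bar{{\mathfrak {i}}}^*_3$ according to Eqs. (38) and (39).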

In the same way, supposing that there exists at least one $w_{\alpha 3}>0$, it holds

$$\begin{aligned} \lim _{{\bar{\lambda }} \rightarrow \infty } {\bar{\psi }}^\text {ANN} = \lim _{{\bar{\lambda }} \rightarrow \infty } {\bar{\lambda }}^{6} \frac{2}{{\bar{I}}_3^\text {max}-{\bar{I}}_3^\text {min}} \underbrace{\sum _{\alpha =1}^N W_\alpha w_{\alpha 3} \theta (w_{\alpha 3})}_{C_3} \; . \end{aligned}$$

(45)

Consequently, similar to Eq. (44), it holds

$$\begin{aligned} \lim _{{\bar{\lambda }} \rightarrow \infty } {\bar{\psi }}^\text {ANN} = \infty \; \text {if} \; C_3 > 0 \; . \end{aligned}$$

(46)

Note that for the case ${\bar{\lambda }} \rightarrow \infty $, the other invariants, which also tend towards infinity, have no influence, since ${\bar{I}}_3$ is the leading term. The two conditions given in Eqs. (44) and (46) together constitute a sufficient condition for the growth condition to be satisfied. However, an easy-to-implement condition which is also sufficient but more restrictive is given by Eq. (36): if Eq. (36) holds, every summand in $C^*_3$ and $C_3$ is non-negative and at least one is strictly positive, so that $C^*_3> 0 \wedge C_3 > 0$ follows directly.
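The relation between the constraint (36) and the constants $C_3$ and $C^*_3$ can be illustrated with a few lines of NumPy. This is a sketch with made-up weight values; the function names `growth_constants` and `satisfies_eq36` are hypothetical helpers and not part of the published framework. The arrays `w3` and `w3_star` hold the hidden-layer weights $w_{\alpha 3}$ and $w^*_{\alpha 3}$ of all neurons.

```python
import numpy as np

def growth_constants(W, w3, w3_star):
    """Constants C_3 of Eq. (45) and C_3^* of Eq. (43).

    The Heaviside factor is realized by the boolean masks (w > 0).
    """
    C3 = np.sum(W * w3 * (w3 > 0))
    C3_star = np.sum(W * w3_star * (w3_star > 0))
    return C3, C3_star

def satisfies_eq36(W, w3, w3_star):
    """Check the sufficient but more restrictive constraint of Eq. (36)."""
    return bool(np.all(W > 0) and np.any(w3 > 0) and np.any(w3_star > 0))

# Example 1: weights fulfilling Eq. (36), hence C_3 > 0 and C_3^* > 0.
W = np.array([0.5, 1.2, 0.3])
w3 = np.array([-0.4, 0.8, 0.0])
w3_star = np.array([0.6, -1.0, 0.2])
C3, C3_star = growth_constants(W, w3, w3_star)

# Example 2: Eq. (36) is violated by a negative output weight,
# yet both constants are still positive, i.e. Eq. (36) is not necessary.
W_b = np.array([-0.1, 1.0])
w3_b = np.array([0.5, 0.5])
w3_star_b = np.array([0.5, 0.5])
C3_b, C3_star_b = growth_constants(W_b, w3_b, w3_star_b)
```

How such a constraint is enforced during training (e.g. by a reparametrization of the weights or by projection) is a separate design choice that is not specified in this appendix.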

1.2 Comparison of an adapted network and a network neglecting the growth condition

Here, the network which is used in the examples discussed within Sect. 4 is compared to an adapted network which, in contrast to the other network, fulfills the growth condition. To this end, the prediction quality for the data set containing relevant deformation states of the cuboid, the torsional sample and the Cook’s membrane is analyzed for both ANNs. Thereby, the constraint (36) has been taken into account within the training of the second network.

Table 4 Relative errors of the single components within the stress prediction. ANN-1 and ANN-2 designate the network without and with further constraint (36), respectively

The stresses predicted by the networks and the reference stresses obtained from RVE simulations are given in Fig. 17a, b. As one can see, the prediction quality is very good for both ANNs. However, the zoom plots reveal a noticeable difference between both networks: the prediction quality of the adapted network which fulfills the growth condition is reduced compared to the network with no further constraints on the weights. This is underlined by a comparison of the maximum relative errors within the stress components given in Table 4. Note that the comparatively large maximum error in ${\bar{T}}_{13}$ results from the small stresses within this component.

Comparison of hexagonal unit cells and random cells

In this appended section, the effective stress-strain responses of a fiber reinforced composite represented by two different microstructures, an ideal hexagonal and a random fiber distribution, are compared to each other. The first microstructure is represented by a unit cell, whereas the second one consists of 100 fibers to capture statistical effects. The fiber orientation points in the $x_3$-direction for both. As examples, uniaxial tension in the $x_1$- and the $x_2$-direction, i. e., perpendicular to the fiber orientation, is considered, where a maximum stretch of ${\bar{\lambda }} = 2$ is applied.

In Fig. 18a, b, the stress-stretch curves and the transverse stretch ${\bar{\lambda }}^\perp $ are depicted for the hexagonal unit cell. Thereby, ${\bar{\lambda }}^\perp $, which follows due to lateral contraction, is measured in the $x_2$- or $x_1$-direction, respectively. Since in the undeformed state all fibers have the same distance in the $x_1$-$x_2$-plane, the curves in the initial region are nearly equivalent. However, as the deformation of the RVE increases, the curves deviate more and more from each other. This is due to the fact that the arrangement of the microstructure changes significantly as a result of the deformation, which could be termed a deformation-induced anisotropy. The deformed microstructure for tension in the $x_1$-direction no longer corresponds to the arrangement for tension in the $x_2$-direction (rotated by 90 degrees), cf. Fig. 18c. Thus, in summary, the material loses the property of transverse isotropy if finite deformations occur.

Compared to this, the same uniaxial loadings are depicted in Fig. 19a, b for the cell with a random fiber distribution. As one can see there, the curves lie on top of each other over the complete range of stretch ${\bar{\lambda }} \in [1,2]$. Thus, in contrast to the hexagonal unit cell, a transversely isotropic effective behavior, which is expected for a fiber reinforced composite, results even for finite strains, whereby the $x_1$-$x_2$-plane is the isotropy plane. In order to simulate the overall behavior of a realistic fiber reinforced composite, the use of a statistical RVE is thus mandatory for finite strains.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kalina, K.A., Linden, L., Brummund, J. et al. FE${}^\textrm{ANN}$: an efficient data-driven multiscale approach based on physics-constrained neural networks and automated data mining. Comput Mech 71, 827–851 (2023). https://doi.org/10.1007/s00466-022-02260-0

Received: 29 June 2022
Accepted: 04 December 2022
Published: 08 February 2023
Issue Date: May 2023
DOI: https://doi.org/10.1007/s00466-022-02260-0
