1 Introduction

Many common engineering materials are characterized by a thermomechanically coupled behavior, i.e., a coupling between temperature and deformation. In particular, variations in temperature may affect the mechanical response of a structural material. In addition, deformations may lead to changes in temperature as well, e.g., via changes in internal entropy or in the form of internal energy dissipation. For instance, dissipation-induced self-heating is commonly observed for thermoplastic polymers subjected to cyclic loading. As such polymers are particularly sensitive to temperature fluctuations, especially in the vicinity of their glass transition temperature, deformation-induced self-heating effects may significantly influence the mechanical properties of such materials and even lead to premature failure [1].

To complicate matters, many structural materials are composites, i.e., they feature a (spatially varying) complex microstructure. As the effective response of composite materials depends both on the constituent materials and the geometric composition of the microstructure, predicting the thermomechanical response of such materials is a challenging task, even for rather simple geometries.

For instance, a monolithic finite element (FE) model of a structural component, also resolving the microstructural heterogeneities, is typically not feasible with today’s computational power. Alternatively and under a suitable separation of scales, asymptotic homogenization methods [2,3,4] may be used to derive so-called effective material models which account for the geometric composition of the microstructure and the material behavior of the constituents on the lower scale. Chatzigeorgiou et al. [5] applied first-order asymptotic homogenization to composites of small-strain non-isothermal generalized standard materials (GSM) [6] and deduced the governing equations for the macroscopic and microscopic scale. In particular, provided the force term varies only slowly on the macroscopic scale, Chatzigeorgiou et al. [5] deduced that the balance of linear momentum on the microscopic scale, the thermomechanical cell problem, only depends on the macroscopic temperature, i.e., temperature fluctuations on the microscopic scale constitute only a lower-order contribution to the effective stress. By solving the thermomechanical cell problem on a suitable microstructure, the effective, non-isothermal model of the composite emerges naturally. For linear constituent materials, the effective material behavior can be pre-computed and cached for later use. The outlined strategy does not, however, extend to inelastic materials as the internal variables naturally live on the microscopic scale and cannot be homogenized to the macroscopic scale. For this reason, \(\text {FE}^2\) methods [7,8,9,10,11] were developed. In an \(\text {FE}^2\) simulation, each Gauss point of the macroscopic finite element simulation is furnished with a finite element model of the microstructure on which the cell problem is solved. Thus, the evolution of the internal variables can be accounted for.
The \(\text {FE}^2\) method for thermomechanical composites was investigated, for instance, in the context of thermo-elastoplasticity [12, 13], phase transforming polycrystals under dynamic loading [14] or single-crystal thermo-elastoviscoplasticity [15]. Recently, Tikarrouchine et al. [16] investigated a short-fiber reinforced composite in a concurrent two-scale setting accounting for heat conduction and convection but temperature-independent material properties.

As an alternative to FE models on the microscale, FFT-based computational micromechanics [17,18,19] may be used to solve the thermomechanical cell problem more efficiently giving rise to the so-called FE-FFT method [20,21,22]. Recently, Wicht et al. [23] proposed an efficient, fully implicit FFT-based solution scheme for thermomechanical composites.

To reduce the computational burden on the microscopic scale, model order reduction (MOR) techniques exploit that the cell problem is solved repeatedly, but with slightly different input parameters, in order to derive a reduced order model. MOR techniques include the transformation field analysis (TFA) [24,25,26,27], the self-consistent clustering analysis (SCA) [28,29,30,31,32] and the non-uniform transformation field analysis (NTFA) [33,34,35,36,37,38,39,40]. These approaches can be incorporated into a concurrent two-scale framework giving rise to the \(\text {FE}^{2\text {R}}\) (R for reduced) method [41]. Furthermore, these approaches allow for incorporating thermomechanical loading, thermal eigenstrains and temperature-dependent material parameters. However, they typically do not consider the back-coupling of the mechanical deformation onto the temperature evolution.

In contrast to approximating the solution of the cell problem, alternative strategies seek to approximate the effective properties directly. Data-driven approaches, e.g., artificial neural networks (ANN), are predestined for such tasks as they effortlessly operate on a high-dimensional domain of interest. For instance, the regularity of the effective stress permits approximating the stress–strain relationship directly. Being by no means exhaustive, we refer to the works of Jadid [42], Penumadu-Zhao [43] or Srinivasu et al. [44] for different approaches. By considering the temperature as an additional degree of freedom of the feature space, ANNs can be extended to thermomechanical problems, see for example the works of Ji et al. [45] or Li et al. [46]. Machine learning approaches were applied in a concurrent two-scale setting both for isothermal and non-isothermal problems, see, e.g., Acuna et al. [47] or Fritzen et al. [48]. Using ANNs comes with two significant drawbacks, however. First, the capability of ANNs to extrapolate beyond the training domain is limited, in general. Secondly, the underlying physical principles, e.g., thermodynamic consistency or preservation of stress–strain monotonicity, may be violated unless specifically accounted for by the model. Recently, Masi and co-workers [49, 50] proposed so-called thermodynamics-based artificial neural networks (TANN) which ensure thermodynamic consistency a priori. Their findings indicate that the predictive capabilities of TANNs outperform those of standard ANNs. Please note that the mentioned approaches only consider a one-way thermomechanical coupling, i.e., from the temperature on the effective properties, and not vice versa.

Applying the concepts underlying deep learning in a more micromechanics-aware context, Liu and co-workers [51, 52] proposed so-called deep material networks (DMN) as a surrogate model for micromechanical computations. To be more precise, for an N-phase microstructure, they consider an N-ary tree structure of N-phase laminates with intermittent rotations associated with the edges of the tree as their primary modeling approach. Instead of approximating the stress–strain relationship directly, DMNs approximate the effective stiffness of a fixed microstructure and variable constituents. For identifying the free parameters of the DMN, the so-called training process, Liu et al. [51, 52] rely upon stochastic gradient descent and automatic differentiation. Once the training process is complete, DMNs can be applied to inelastic problems at finite and infinitesimal strains with impressive accuracy. Subsequently, direct DMNs were introduced by Gajek et al. [53, 54] which allow for an efficient solution scheme in the inelastic setting as they do not involve additional rotations. Furthermore, Gajek et al. [53] motivated the approximation capabilities of (direct) DMNs by showing that, to first-order in the strain rate, the effective inelastic behavior of composite materials is determined by linear elastic localization. In addition, Gajek et al. [53] clarified that DMNs inherit thermodynamic consistency and stress–strain monotonicity from their phases. This inheritance is crucial for stable and fast simulations, especially in a two-scale context, as it ensures that the effective model inherits stabilizing numerical properties, e.g., strong convexity, from its phases. Recently, DMNs were augmented by cohesive zone models to account for interface damage [55] or multiscale strain localization modeling [56]. Liu et al. [57] and Gajek et al. [54] extended DMNs to accelerate two-scale concurrent simulations giving rise to the FE-DMN method.

In this work, we extend the framework of direct DMNs [53, 54] to composites with full thermomechanical coupling, effectively enabling thermomechanical two-scale simulations of industrial problems. As a point of departure, we recapitulate the results of Chatzigeorgiou et al. [5] in Sect. 2, who introduced a framework for the first-order asymptotic homogenization of thermomechanical composites. Subsequently, we extend the framework of direct DMNs to thermomechanical composites, see Sect. 3. We take special care in incorporating the coupling of microscopic mechanical deformation onto the macroscopic temperature and vice versa into our approach. For this purpose, we exploit the homogeneity of the absolute temperature on the microscopic scale to arrive at an efficient solution scheme for solving the balance of linear momentum of a direct DMN. To accelerate a component-scale simulation of industrial complexity, we discuss the efficient implementation of our approach as a user-material subroutine (UMAT) only relying on the provided interfaces.

To demonstrate the capabilities of the proposed approach, we consider a short-fiber reinforced polyamide featuring a pronounced thermomechanical coupling, see Sect. 4. In Sect. 5, we elaborate on the training and the validation of the identified DMN surrogate model separately. We show that the DMN is able to predict the effective stress, the effective dissipation as well as the deformation-induced change in temperature of the composite with sufficient accuracy for all investigated loading conditions and strain rates. Later on, we demonstrate the power of our approach in Sect. 6, where we conduct a fully coupled thermomechanical two-scale simulation of an asymmetric notched specimen subjected to cyclic loading also considering heat conduction and convection on the macroscopic scale.

2 First-order asymptotic homogenization of thermomechanical composites

In their work, Chatzigeorgiou et al. [5] introduced a framework for the (first-order) asymptotic homogenization of thermomechanical composites at small strains. More precisely, they considered quasi-static, non-isothermal generalized standard materials (GSM) [6] and derived governing equations for the microscopic and macroscopic scale.

Let \(\mathrm{Sym }( d )\) denote the set of symmetric \(d \times d\) matrices. Then, in \(d \in \left\{ 2, 3\right\} \) spatial dimensions, we consider a small-strain, quasi-static, non-isothermal GSM to be a quadruple \(\left( Z, \psi , \phi , {{\varvec{ z}} }_0 \right) \) comprising

  1. (D1) a (sufficiently large) Banach vector space Z of internal variables,

  2. (D2) a Helmholtz free energy density \(\psi : \mathrm{Sym }( d ) \times {\mathbb { R}}_{> 0} \times Z \rightarrow {\mathbb { R}}\), which we assume to be differentiable w.r.t. all arguments,

  3. (D3) an extended-real-valued dissipation potential \(\phi : {\mathbb { R}}_{> 0} \times Z \rightarrow {\mathbb { R}}\cup \{+\infty \}\), which we assume to be proper, convex, lower semicontinuous in its second argument, and to satisfy \(\phi (\cdot , 0) = 0\) as well as \(0\in \partial _{\dot{z}} \phi (\cdot , 0)\), where \(\partial _{\dot{z}} \phi \) denotes the subdifferential of the convex function \(\phi \) w.r.t. the second argument,

  4. (D4) and an element \({{\varvec{ z}} }_0 \in Z\) serving as initial condition for the dynamics.

For every strain path \({\varvec{\varepsilon }}: [0, T] \rightarrow \mathrm{Sym }( d )\), temperature path \(\theta : [0, T] \rightarrow {\mathbb { R}}_{> 0}\) and internal variables \({{\varvec{ z}} }: [0, T] \rightarrow Z\) with final time \(T \in \left( 0, \infty \right] \), the Cauchy stress \({\varvec{\sigma }}: [0, T] \rightarrow \mathrm{Sym }( d )\) is expressed in terms of the potential relation

$$\begin{aligned} {\varvec{\sigma }}(t) = \displaystyle \frac{\partial \psi }{\partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}(t), \theta (t), {{\varvec{ z}} }(t)), \end{aligned}$$
(2.1)

and the evolution of the internal variables satisfies the initial value problem described by Biot’s equation

$$\begin{aligned}&\displaystyle \frac{\partial \psi }{\partial {{\varvec{ z}} }}({\varvec{\varepsilon }}(t), \theta (t), {{\varvec{ z}} }(t)) + \displaystyle \frac{\partial \phi }{\partial \dot{{{\varvec{ z}} }}}(\theta (t), \dot{{{\varvec{ z}} }}(t)) = 0 \nonumber \\&\quad \text {with} \quad {{\varvec{ z}} }(0) = {{\varvec{ z}} }_0. \end{aligned}$$
(2.2)
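To illustrate the potential relation (2.1) and Biot’s equation (2.2), the following minimal sketch integrates a scalar Maxwell-type element by an implicit Euler scheme. The one-dimensional setting, the quadratic potentials \(\psi = \tfrac{1}{2} E (\varepsilon - z)^2\) and \(\phi = \tfrac{1}{2} \eta (\theta ) \dot{z}^2\), the function name `integrate_maxwell` and all parameter values are assumptions chosen for illustration only and are not the constitutive models used in this work.

```python
import numpy as np

def integrate_maxwell(eps_path, theta_path, dt, E, eta_of_theta, z0=0.0):
    """Implicit Euler integration of Biot's equation (2.2) for a scalar Maxwell element.

    Assumed potentials (illustrative only):
        psi(eps, theta, z) = 0.5 * E * (eps - z)**2
        phi(theta, z_dot)  = 0.5 * eta(theta) * z_dot**2
    """
    z = z0
    sigma = np.empty(len(eps_path))
    for k, (eps, theta) in enumerate(zip(eps_path, theta_path)):
        r = dt * E / eta_of_theta(theta)
        # Biot: -E * (eps - z) + eta * z_dot = 0 with z_dot = (z_new - z_old) / dt
        z = (z + r * eps) / (1.0 + r)
        # Potential relation (2.1): sigma = d psi / d eps
        sigma[k] = E * (eps - z)
    return sigma, z

# Stress relaxation at constant strain and constant temperature
n_steps, dt = 50, 0.1
sigma, z_end = integrate_maxwell(np.full(n_steps, 0.01), np.full(n_steps, 300.0),
                                 dt, E=2.0, eta_of_theta=lambda theta: 4.0)
```

For a constant strain, the computed stress decays geometrically by the factor \(1/(1 + \Delta t\, E/\eta )\) per time step, which is the discrete counterpart of the exponential relaxation of the Maxwell element.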

With these definitions at hand, we turn our attention to the first-order homogenization of non-isothermal GSMs. We refer to Chatzigeorgiou et al. [5] for more details.

We consider a macroscopic body \(\Omega \subseteq {\mathbb { R}}^d\) with macroscopic point \(\bar{{{\varvec{ x}} }} \in \Omega \). To every macroscopic point \(\bar{{{\varvec{ x}} }}\), we associate a (rectangular) two-phase periodic microstructure \(Y \subseteq {\mathbb { R}}^d\).

The microscopic cell problem The microstructure Y comprises two non-isothermal GSMs, i.e., \(\left( Z_1, \psi _1, \phi _1, {{\varvec{ z}} }_{1,0} \right) \) and \(\left( Z_2, \psi _2, \phi _2, {{\varvec{ z}} }_{2,0} \right) \), with measurable characteristic functions \(\chi _{1/2}: Y \rightarrow \left\{ 0, 1\right\} \) whose associated sets are mutually disjoint and cover all of Y, i.e., the conditions

$$\begin{aligned} \chi _1 \chi _2 = 0 \quad \text {and} \quad \chi _1 + \chi _2 = 1 \end{aligned}$$
(2.3)

hold. Then, on the microscopic level, the so-called thermomechanical cell problem of first-order homogenization, i.e., the (quasi-static) microscopic balance of linear momentum, reads

$$\begin{aligned} \mathrm{div } _ {\! x}\left( \sum _{i=1}^{2} \chi _i \displaystyle \frac{\partial \psi _i}{\partial {\varvec{\varepsilon }}}(\bar{{\varvec{\varepsilon }}}+ \nabla ^\text {s}_{\!\! x}\, {{\varvec{ u}} }, \bar{\theta }, {{\varvec{ z}} }_i) \right) = {{\varvec{ 0}} }, \end{aligned}$$
(2.4)

where \(\mathrm{div } _ {\! x}\) and \(\nabla ^\text {s}_{\!\! x}\) refer to the divergence and the symmetrized gradient operator w.r.t. the microscopic point \({{\varvec{ x}} }\in Y\), respectively. Furthermore, \(\bar{{\varvec{\varepsilon }}}: \Omega \times [0, T] \rightarrow \mathrm{Sym }( d )\) denotes the macrostrain, \({{\varvec{ u}} }: \Omega \times Y \times [0, T] \rightarrow {\mathbb { R}}^d\) symbolizes the periodic displacement fluctuation with anti-periodic normal derivative and \({{\varvec{ z}} }_{1/2}: \Omega \times Y \times [0, T] \rightarrow Z_{1/2}\) stands for the fields of internal variables. Chatzigeorgiou et al. [5] established that, for first-order homogenization, the absolute temperature is a macroscopic quantity, i.e., there is no temperature fluctuation on the microscopic level. Most importantly, there is no need to solve for the temperature on the microscopic level. Thus, the macroscopic absolute temperature \(\bar{\theta }: \Omega \times [0, T] \rightarrow {\mathbb { R}}_{> 0}\) as well as the macrostrain \(\bar{{\varvec{\varepsilon }}}\) enter Equation (2.4) as inputs and constitute the one-way coupling between the macroscopic and the microscopic scale.

The macroscopic balance of linear momentum and the macroscopic heat equation On the macroscopic level, two governing equations emerge. First, the quasi-static balance of linear momentum, governing the evolution of the macrostrain \(\bar{{\varvec{\varepsilon }}}\), reads

$$\begin{aligned} \mathrm{div } _ {\! \bar{x}} \left\langle \sum _{i=1}^{2} \chi _i \displaystyle \frac{\partial \psi _i}{\partial {\varvec{\varepsilon }}}(\bar{{\varvec{\varepsilon }}}+ \nabla ^\text {s}_{\!\! x}\, {{\varvec{ u}} }, \bar{\theta }, {{\varvec{ z}} }_i) \right\rangle _\text {Y} + {{\varvec{ b}} }= {{\varvec{ 0}} }, \end{aligned}$$
(2.5)

where \(\left\langle \cdot \right\rangle _\text {Y}\) denotes the volume average over Y

$$\begin{aligned} \left\langle \cdot \right\rangle _\text {Y} = \frac{1}{|Y|} \int _{Y} (\cdot ) {\,\mathrm d}V. \end{aligned}$$
(2.6)

Furthermore, \({{\varvec{ b}} }: \Omega \times [0, T] \rightarrow {\mathbb { R}}^d\) denotes the vector of volume forces and \(\mathrm{div } _ {\! \bar{x}}\) designates the divergence operator w.r.t. the macroscopic point \(\bar{{{\varvec{ x}} }} \in \Omega \). Secondly, the macroscopic heat equation reads

$$\begin{aligned} \bar{c}_{\varepsilon } \, \dot{\bar{\theta }}= \bar{w} - \mathrm{div } _ {\! \bar{x}}(\bar{{{\varvec{ q}} }}) + \bar{D}, \end{aligned}$$
(2.7)

which governs the evolution of the macroscopic absolute temperature \(\bar{\theta }\). Here, \(\bar{w}: \Omega \times [0, T] \rightarrow {\mathbb { R}}\) denotes the macroscopic heat source and \(\bar{{{\varvec{ q}} }}: \Omega \times [0, T] \rightarrow {\mathbb { R}}^d\) stands for the macroscopic heat flux. The effective heat capacity at constant strain \(\bar{c}_{\varepsilon }\) is given explicitly by

$$\begin{aligned} \bar{c}_{\varepsilon } = -\bar{\theta }\left\langle \sum _{i=1}^{2} \chi _i \displaystyle \frac{\partial ^{2} \psi _i}{\partial \theta ^{2}}(\bar{{\varvec{\varepsilon }}}+ \nabla ^\text {s}_{\!\! x}\, {{\varvec{ u}} }, \bar{\theta }, {{\varvec{ z}} }_i) \right\rangle _\text {Y}. \end{aligned}$$
(2.8)

To keep the notation reasonable, we introduced the thermomechanical coupling term

$$\begin{aligned} \begin{aligned} \bar{D}&= \bar{\theta }\left\langle \sum _{i=1}^{2} \chi _i \displaystyle \frac{\partial ^2 \psi _i}{\partial \theta \partial {\varvec{\varepsilon }}}(\bar{{\varvec{\varepsilon }}}+ \nabla ^\text {s}_{\!\! x}\, {{\varvec{ u}} }, \bar{\theta }, {{\varvec{ z}} }_i) : (\dot{\bar{{\varvec{\varepsilon }}}} + \nabla ^\text {s}_{\!\! x}\, \dot{{{\varvec{ u}} }}) \right\rangle _\text {Y}\\&\quad + \bar{\theta }\left\langle \sum _{i=1}^{2} \chi _i \displaystyle \frac{\partial ^2 \psi _i}{\partial \theta \partial {{\varvec{ z}} }}(\bar{{\varvec{\varepsilon }}}+ \nabla ^\text {s}_{\!\! x}\, {{\varvec{ u}} }, \bar{\theta }, {{\varvec{ z}} }_i) \cdot \dot{{{\varvec{ z}} }}_i \right\rangle _\text {Y} \\&\quad - \left\langle \sum _{i=1}^{2} \chi _i \displaystyle \frac{\partial \psi _i}{\partial {{\varvec{ z}} }}(\bar{{\varvec{\varepsilon }}}+ \nabla ^\text {s}_{\!\! x}\, {{\varvec{ u}} }, \bar{\theta }, {{\varvec{ z}} }_i) \cdot \dot{{{\varvec{ z}} }}_i \right\rangle _\text {Y} \end{aligned} \end{aligned}$$
(2.9)

as an additional source term of the macroscopic heat equation. This term constitutes the back-coupling between the microscopic scale and the evolution of the macroscopic temperature. Please note that the coupling term \(\bar{D}\) may be decomposed further. The first two terms are linked to changes in entropy, whereas the last summand is commonly referred to as the dissipation

$$\begin{aligned} \bar{\mathcal {D}}= - \left\langle \sum _{i=1}^{2} \chi _i \displaystyle \frac{\partial \psi _i}{\partial {{\varvec{ z}} }}(\bar{{\varvec{\varepsilon }}}+ \nabla ^\text {s}_{\!\! x}\, {{\varvec{ u}} }, \bar{\theta }, {{\varvec{ z}} }_i) \cdot \dot{{{\varvec{ z}} }}_i \right\rangle _\text {Y}. \end{aligned}$$
(2.10)

The dissipation measures the dissipated energy of the composite due to the evolution of the internal variables, e.g., the dissipated energy due to plastic flow, and is the primary cause for the self-heating of the material due to irreversible processes.
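The role of the dissipation as a heat source may be illustrated by a strongly simplified, zero-dimensional sketch. We assume a single homogeneous material point (no microstructure), the scalar Maxwell-type potentials \(\psi = \tfrac{1}{2} E (\varepsilon - z)^2\) and \(\phi = \tfrac{1}{2} \eta \dot{z}^2\) with temperature-independent parameters, a constant heat capacity c, and adiabatic conditions without external heat sources, so that the heat equation (2.7) reduces to \(c\, \dot{\theta } = \mathcal {D}\). The function name `self_heating` and all numerical values are hypothetical.

```python
import numpy as np

def self_heating(E=2.0e9, eta=1.0e9, c=2.0e6, eps_amp=0.01, freq=1.0,
                 theta0=293.15, t_end=10.0, n_steps=2000):
    """Adiabatic self-heating of a scalar Maxwell element under cyclic strain."""
    dt = t_end / n_steps
    r = dt * E / eta
    z, theta = 0.0, theta0
    thetas, diss = [theta0], []
    for k in range(1, n_steps + 1):
        eps = eps_amp * np.sin(2.0 * np.pi * freq * k * dt)
        z_new = (z + r * eps) / (1.0 + r)   # implicit Euler step for Biot's equation
        z_dot = (z_new - z) / dt
        D = eta * z_dot**2                  # dissipation -(d psi / d z) * z_dot >= 0
        theta += dt * D / c                 # reduced heat equation c * theta_dot = D
        z = z_new
        thetas.append(theta)
        diss.append(D)
    return np.array(thetas), np.array(diss)

thetas, diss = self_heating()
```

In this sketch, the dissipation is non-negative by construction, so the temperature can only increase over the loading cycles.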

Typically, in a concurrent two-scale setting, the macroscopic balance of linear momentum (2.5) and the macroscopic heat equation (2.7) are solved on the macroscopic scale while, in every Gauss point of the macroscopic model, the thermomechanical cell problem (2.4) is solved as well. Here, the above-mentioned two-way thermomechanical coupling prevails. On the one hand, the macrostrain and the macroscopic absolute temperature influence the mechanical behavior at the microscopic scale. On the other hand, the evolution of the macroscopic absolute temperature is driven by the coupling term \(\bar{D}\), which comprises deformation-induced changes of entropy and dissipated energy on the microscopic level.

In the article at hand, we consider speeding up such a thermomechanical two-scale simulation by means of direct DMNs. In this context, a DMN might be regarded as a surrogate for the underlying microstructure for which the thermomechanical cell problem (2.4) can be solved efficiently. However, to use a DMN to speed up such a fully coupled thermomechanical two-scale simulation, the aforementioned two-way thermomechanical coupling needs to be taken into account. This will be the topic of the following section.

3 Direct deep material networks for thermomechanical composites

3.1 The framework of direct deep material networks

We start with the formal definition of a direct DMN. For more detailed information, we refer to Gajek et al. [53, 54]. We consider a two-phase direct DMN to be a perfect, ordered, rooted binary tree of depth K, see Fig. 1 for an illustration. Each node of the binary tree is given by a two-phase laminate \(\mathcal {B}^i_k\) with unknown direction of lamination \({{\varvec{ n}} }^i_k\) and unknown volume fractions \(c^i_{k,1}\) and \(c^i_{k,2}\). Here, we denote the depth of a node by the letter \(k = 1, \dots , K\) and consistently index the horizontal position by the letter \(i = 1, \dots , 2^{k-1}\). Then, the DMN’s free parameters are given by the directions of lamination, which we collect in the form of a (large) vector

$$\begin{aligned} \vec {{{\varvec{ n}} }} =&\left[ {{\varvec{ n}} }^1_K, \dots , {{\varvec{ n}} }^{2^{K-1}}_K, {{\varvec{ n}} }^1_{K-1}, \dots , {{\varvec{ n}} }^{2^{K-2}}_{K-1}, \dots , {{\varvec{ n}} }^1_1 \right] \nonumber \\&\in \left( {\mathbb { R}}^d\right) ^{2^K - 1}, \end{aligned}$$
(3.1)

and the volume fractions of all laminates \(c^i_{k,1}\) and \(c^i_{k,2}\). For reasons of numerical stability during the parameter identification, Liu et al. [51, 52] proposed a change of coordinates when parameterizing the volume fractions. For this reason, the laminates’ volume fractions are expressed in terms of the (input) weights \(w^i_{K+1}\). These weights are assigned to the laminates at the bottom layer of the binary tree in pairs. By traversing the binary tree from the leaves to the root, the weights on level k are inductively computed by a pairwise summation of the weights of the previous level, i.e.,

$$\begin{aligned} w^i_k = w^{2i-1}_{k+1} + w^{2i}_{k+1} \end{aligned}$$
(3.2)

holds, see Fig. 1a for a schematic. Then, for every laminate, the volume fractions \(c^i_{k,1}\) and \(c^i_{k,2}\) are computed by normalization

$$\begin{aligned} c^i_{k,1} = \frac{w^{2i-1}_{k+1}}{w^{2i-1}_{k+1} + w^{2i}_{k+1}} \quad \text {and} \quad c^i_{k,2} = 1 - c^i_{k,1}. \end{aligned}$$
(3.3)

For consistency, the weights \(w^i_{K+1}\) need to be non-negative and sum to unity, i.e., the conditions

$$\begin{aligned} w^i_{K+1} \ge 0 \quad \text {and} \quad \sum _{i=1}^{2^K} w^i_{K+1} = 1 \end{aligned}$$
(3.4)

hold. In the following, we collect the input weights \(w^i_{K+1}\) into the vector

$$\begin{aligned} \vec {w} = \left[ w^1_{K+1}, \dots , w^{2^K}_{K+1} \right] \in {\mathbb { R}}^{2^K}_{\ge 0}. \end{aligned}$$
(3.5)

Thus, the network topology of a two-phase direct DMN of depth K is uniquely determined by the vector \(\vec {{{\varvec{ n}} }}\), containing \(2^K-1\) independent directions of lamination, and the vector \(\vec {w}\) of weights comprising \(2^K\) scalar parameters, of which \(2^K-1\) parameters are independent. We call the process of identifying these free parameters the offline training. During the offline training, the DMN is fitted to the effective elastic response of a fixed microstructure Y but varying stiffness parameters of the constituting phases. Afterwards, during the online evaluation, the free parameters \(\vec {{{\varvec{ n}} }}\) and \(\vec {w}\) are fixed. Then, the DMN acts as a high-fidelity surrogate model for inelastic computations on the microscopic scale.
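The weight propagation (3.2) and the normalization (3.3) may be sketched as follows. The function name `propagate_weights` and the dictionary-based storage per level are illustrative choices, and strictly positive input weights are assumed to avoid a division by zero.

```python
import numpy as np

def propagate_weights(w_input):
    """Propagate the input weights w_{K+1} to the root, see Eqs. (3.2) and (3.3).

    Returns two dictionaries indexed by the level k:
    weights[k] holds the weights w^i_k, c1[k] the volume fractions c^i_{k,1}.
    """
    w = np.asarray(w_input, dtype=float)
    K = w.size.bit_length() - 1
    if w.size != 2**K or K < 1:
        raise ValueError("the number of input weights must be 2**K with K >= 1")
    weights, c1 = {K + 1: w}, {}
    for k in range(K, 0, -1):
        left, right = weights[k + 1][0::2], weights[k + 1][1::2]
        weights[k] = left + right      # Eq. (3.2): pairwise summation
        c1[k] = left / weights[k]      # Eq. (3.3): normalization, c2 = 1 - c1
    return weights, c1

weights, c1 = propagate_weights([0.1, 0.2, 0.3, 0.4])
```

For the input weights above, which sum to unity, the root weight \(w^1_1\) equals one, in agreement with the condition (3.4).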

Fig. 1 Weight and stiffness propagation (from the bottom to the top) in a two-phase direct DMN [54] of depth \(K=3\)

3.2 Offline training

For isothermal problems, DMNs are trained on linear elastic data alone, see Liu et al. [51, 52]. Although we wish to predict the effective stress response of the composite for nonlinear and non-isothermal constituents, we assume that the linear elastic training still suffices. Thus, the following section serves as a brief summary of Gajek et al. [53, 54].

In the following, we treat the linear elastic training data as given, see Sect. 5.1 for more information on the sampling of the training data. The training data is represented by a sequence of triples of stiffnesses \(\left\{ \left( \bar{{\mathbb { C}}}^s, {\mathbb { C}}^s_1, {\mathbb { C}}^s_2 \right) \right\} ^{N_\text {s}}_{s=1}\) where s denotes the sample index and \(N_\text {s}\) denotes the number of samples. For a fixed microstructure Y, the training data is generated by sampling \(N_\text {s}\) tuples of input stiffnesses \(\left( {\mathbb { C}}^s_1, {\mathbb { C}}^s_2 \right) \) and computing the associated effective stiffness \(\bar{{\mathbb { C}}}^s\) by means of computational homogenization.

The offline training, i.e., the parameter identification, is expressed by solving the optimization problem

$$\begin{aligned} J(\vec {{{\varvec{ n}} }}, {\vec {w}}) \longrightarrow \min _{\vec {{{\varvec{ n}} }}, {\vec {w}}} \quad {\text {s.t.} \quad w^i_{K+1} \ge 0}. \end{aligned}$$
(3.6)

Please note that there is some freedom in selecting a suitable objective function J, see, e.g., Liu et al. [51, 52] or Gajek et al. [53, 54]. In this work, we follow Gajek et al. [53, 54] and prescribe the following objective function

$$\begin{aligned} J\left( \vec {{{\varvec{ n}} }}, \vec {w}\right)= & {} \frac{1}{N_b} \root q \of {\sum _{s=1}^{N_b} \left( \frac{\left\| \, \bar{{\mathbb { C}}}^s - \mathcal {DMN}^{\mathcal {L}}_{}\left( {\mathbb { C}}^s_{1}, {\mathbb { C}}^s_{2}, \vec {{{\varvec{ n}} }}, \vec {w}\right) \, \right\| _p}{\left\| \, \bar{{\mathbb { C}}}^s \, \right\| _p}\right) ^q}\nonumber \\&+ \lambda \left( \sum _{i=1}^{2^K} w^i_{K+1} - 1\right) ^2. \end{aligned}$$
(3.7)

Here, the \(\Vert \cdot \Vert _{p}\)-norm on the stiffness tensors is defined via the \(\ell ^p\)-norm of the components in (normalized) Voigt-Mandel notation and \(p,q \ge 1\) and \(\lambda \gg 0\) hold. Furthermore, \(\mathcal {DMN}^{\mathcal {L}}_{}\) denotes the DMN’s linear elastic homogenization function

$$\begin{aligned}&\mathcal {DMN}^{\mathcal {L}}_{}: {{\varvec{ C}} }\times {{\varvec{ C}} }\times \left( {\mathbb { R}}^d\right) ^{2^K-1} \nonumber \\&\quad \times {\mathbb { R}}^{2^K} \rightarrow {{\varvec{ C}} }, \quad \left( {\mathbb { C}}_1, {\mathbb { C}}_2, \vec {{{\varvec{ n}} }}, \vec {w}\right) \mapsto \bar{{\mathbb { C}}}, \end{aligned}$$
(3.8)

which maps two input stiffnesses \({\mathbb { C}}_1\), \({\mathbb { C}}_2\) and the parameter vectors \(\vec {{{\varvec{ n}} }}\), \(\vec {w}\) to the DMN’s predicted effective stiffness. Efficiently evaluating the linear elastic homogenization function \(\mathcal {DMN}^{\mathcal {L}}_{}\) is paramount and involves computing a sequence of effective stiffnesses of two-phase laminates which are propagated from the bottom to the top of the binary tree. More formally, the effective stiffness

$$\begin{aligned} {\mathbb { C}}^i_k = \mathcal {B}^i_k({\mathbb { C}}^{2i-1}_{k+1}, {\mathbb { C}}^{2i}_{k+1}) \end{aligned}$$
(3.9)

of a single laminate at level k and position i with direction of lamination \({{\varvec{ n}} }^i_k\) and volume fractions \(c^i_{k,1}\) and \(c^i_{k,2}\) is computed by solving the equation

$$\begin{aligned}&\left( {\mathbb { P}}({{\varvec{ n}} }^i_k) + \lambda \left[ {\mathbb { C}}^i_k - \lambda {\mathbb { I}}_\text {s}\right] ^{-1}\right) ^{-1} \nonumber \\&\quad = c^i_{k,1} \left( {\mathbb { P}}({{\varvec{ n}} }^i_k) + \lambda \left[ {\mathbb { C}}^{2i-1}_{k+1} - \lambda {\mathbb { I}}_\text {s}\right] ^{-1}\right) ^{-1}\nonumber \\&\qquad + c^i_{k,2} \left( {\mathbb { P}}({{\varvec{ n}} }^i_k) + \lambda \left[ {\mathbb { C}}^{2i}_{k+1} - \lambda {\mathbb { I}}_\text {s}\right] ^{-1}\right) ^{-1} \end{aligned}$$
(3.10)

for the effective stiffness \({\mathbb { C}}^i_k\), see Section 9.5 in Milton’s book [58]. With \({\mathbb { I}}_\text {s}: \mathrm{Sym }( d ) \rightarrow \mathrm{Sym }( d )\), we denote the identity on \(\mathrm{Sym }( d )\) and \({\mathbb { P}}: \mathrm{Sym }( d ) \rightarrow \mathrm{Sym }( d )\) stands for a projection operator which reads

$$\begin{aligned} \left( {\mathbb { P}}({{\varvec{ n}} })\right) _{mnop}= & {} \frac{1}{2}(n_m \delta _{no} n_p + n_n \delta _{mo} n_p \nonumber \\&+ n_m \delta _{np} n_o + n_n \delta _{mp} n_o) - n_m n_n n_o n_p \end{aligned}$$
(3.11)

in Cartesian coordinates. Here, \(\delta \) denotes the Kronecker symbol and \(\lambda \) is a parameter (unrelated to the penalty parameter in Equation (3.7)) which needs to be chosen either sufficiently large or suitably small, see the Appendix in Kabel et al. [59]. Starting at level \(K+1\), the input stiffnesses \({\mathbb { C}}_1\), \({\mathbb { C}}_2\) are assigned pairwise to laminates at the K-th level and the respective effective stiffnesses are computed. These homogenized stiffnesses serve as the input for the next higher level, i.e., the level \(K-1\), until the effective stiffness of the DMN \({\mathbb { C}}^1_1 = \bar{{\mathbb { C}}}\) emerges on the highest level. We refer to Fig. 1b for a schematic of the stiffness propagation.
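A minimal NumPy sketch of the stiffness propagation, assuming \(d = 3\) and stiffnesses stored as \(6 \times 6\) matrices in normalized Voigt-Mandel notation, reads as follows. Equation (3.10) is solved in closed form via \({\mathbb { C}}^i_k = \lambda {\mathbb { I}}_\text {s} + \lambda \left( {\mathbb { A}}^{-1} - {\mathbb { P}}\right) ^{-1}\), where \({\mathbb { A}}\) abbreviates the right-hand side of (3.10). All function names, the dictionary `normals`, the component ordering and the normalization of the direction of lamination inside `projector` are assumptions of this sketch, not the implementation used in this work. Strictly positive weights are assumed.

```python
import numpy as np

# Index pairs and scaling factors of the normalized Voigt-Mandel notation (d = 3)
PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
SCALE = np.array([1.0, 1.0, 1.0, np.sqrt(2.0), np.sqrt(2.0), np.sqrt(2.0)])

def projector(n):
    """Operator P(n) of Eq. (3.11) as a 6x6 Mandel matrix; n is normalized first."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    d = np.eye(3)
    P4 = 0.5 * (np.einsum("m,no,p->mnop", n, d, n)
                + np.einsum("n,mo,p->mnop", n, d, n)
                + np.einsum("m,np,o->mnop", n, d, n)
                + np.einsum("n,mp,o->mnop", n, d, n))
    P4 -= np.einsum("m,n,o,p->mnop", n, n, n, n)
    P = np.array([[P4[i, j, k, l] for (k, l) in PAIRS] for (i, j) in PAIRS])
    return P * np.outer(SCALE, SCALE)

def laminate(C1, C2, c1, n, lam):
    """Effective stiffness of a two-phase laminate by solving Eq. (3.10)."""
    P, I = projector(n), np.eye(6)

    def core(C):
        return np.linalg.inv(P + lam * np.linalg.inv(C - lam * I))

    A = c1 * core(C1) + (1.0 - c1) * core(C2)
    return lam * I + lam * np.linalg.inv(np.linalg.inv(A) - P)

def dmn_stiffness(C1, C2, normals, w, lam):
    """Propagate stiffnesses from level K+1 to the root of the binary tree.

    normals[k] holds the 2**(k-1) directions of lamination of level k = 1, ..., K,
    w holds the 2**K input weights.
    """
    K = len(normals)
    weights = np.asarray(w, dtype=float)
    stiff = [C1 if i % 2 == 0 else C2 for i in range(2**K)]  # phases alternate
    for k in range(K, 0, -1):
        left, right = weights[0::2], weights[1::2]
        parent = left + right
        stiff = [laminate(stiff[2 * i], stiff[2 * i + 1], left[i] / parent[i],
                          normals[k][i], lam) for i in range(2**(k - 1))]
        weights = parent
    return stiff[0]

def isotropic(lame, mu):
    """Isotropic stiffness in Mandel notation from the Lame constants."""
    C = 2.0 * mu * np.eye(6)
    C[:3, :3] += lame
    return C
```

As a plausibility check, for two isotropic phases laminated in the \(e_1\)-direction, the sketch recovers the harmonic mean of the out-of-plane shear moduli and the arithmetic mean of the in-plane shear moduli. The parameter `lam` has to exceed the largest eigenvalue of all involved stiffnesses (or fall below the smallest one) for the inverses to exist.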

The objective function J penalizes the difference of the DMN’s predicted effective stiffness to the actual effective stiffness \(\bar{{\mathbb { C}}}^s\) of microstructure Y for all sampled input stiffnesses \(\left\{ \left( {\mathbb { C}}^s_1, {\mathbb { C}}^s_2 \right) \right\} ^{N_s}_{s=1}\). Additionally, the mixing constraint on the weights

$$\begin{aligned} \sum _{i=1}^{2^K} w^i_{K+1} = 1 \end{aligned}$$
(3.12)

is encoded by the quadratic penalty term of Equation (3.7). To ensure that the non-negativity constraint on the weights \(w^i_{K+1} \ge 0\) holds, we express the constrained weights \(\vec {w} \in {\mathbb { R}}^{2^K}_{\ge 0}\) in terms of the unconstrained weights \(\vec {v} = \left[ v_1, \dots , v_{2^K}\right] \in {\mathbb { R}}^{2^K}\) by projecting each element of \(\vec {v}\) onto the non-negative real numbers, i.e.,

$$\begin{aligned} \vec {w}= & {} \langle \vec {v} \rangle _{+} \quad \text {with} \quad \langle \cdot \rangle _{+}: {\mathbb { R}}^{2^K} \rightarrow {\mathbb { R}}^{2^K}_{\ge 0}, \quad \nonumber \\&\vec {v} \mapsto \left[ \max (0, v_1) , \dots , \max (0, v_{2^K})\right] , \end{aligned}$$
(3.13)

holds. In this way, the regression problem (3.6) may be rewritten as

$$\begin{aligned} J(\vec {{{\varvec{ n}} }}, \langle \vec {v}\rangle _{+}) \longrightarrow \min _{\vec {{{\varvec{ n}} }}, \vec {v}}, \end{aligned}$$
(3.14)

which we solve by means of accelerated stochastic gradient descent using mini batches of size \(N_b\). A training epoch j consists of the following steps: First, the loss function (3.7) is evaluated for all stiffness samples in a batch. Then, the gradients \({\partial J}/{\partial \vec {{{\varvec{ n}} }}}\left( \vec {{{\varvec{ n}} }}_{j}, {\langle \vec {v}_j\rangle _{+}}\right) \), \({\partial J}/{\partial {\vec {v}}}\left( \vec {{{\varvec{ n}} }}_{j}, {\langle \vec {v}_j\rangle _{+}}\right) \) are computed by means of automatic differentiation. Subsequently, the fitting parameters are updated by

$$\begin{aligned} \vec {{{\varvec{ n}} }}_{j+1}= & {} \vec {{{\varvec{ n}} }}_{j} - \alpha _{\vec {n}} \frac{\partial J}{\partial \vec {{{\varvec{ n}} }}}\left( \vec {{{\varvec{ n}} }}_{j}, {\langle \vec {v}_j\rangle _{+}}\right) , \quad \nonumber \\ {\vec {v}}_{j+1}= & {} {\vec {v}}_{j} - \alpha _{{\vec {v}}} \frac{\partial J}{\partial {\vec {v}}}\left( \vec {{{\varvec{ n}} }}_{j}, {\langle \vec {v}_j\rangle _{+}}\right) \quad \text {and} \quad {\vec {w}_j = \langle \vec {v}_j\rangle _{+}} \end{aligned}$$
(3.15)

where \(\alpha _{\vec {n}}, \alpha _{{\vec {v}}} \in {\mathbb { R}}_{>0}\) denote the learning rates. This procedure is repeated for all batches in the training set and for a pre-defined number of epochs. Upon convergence, the fitting parameters of the DMN, i.e., \(\vec {{{\varvec{ n}} }}\) and \(\vec {w}\), are determined.
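The update (3.15) for the weights can be illustrated by a minimal sketch. It rests on several assumptions that are not part of the method described above: a toy least-squares loss replaces the loss function (3.7), plain (non-accelerated) mini-batch gradient descent is used, the gradient is coded by hand instead of being obtained by automatic differentiation, and the lamination directions are left out. All function names and numerical values are hypothetical.

```python
import random


def relu(v):
    """Element-wise projection <v>_+ = [max(0, v_1), ...], cf. (3.13)."""
    return [max(0.0, vi) for vi in v]


def loss_and_grad(v, batch):
    """Toy least-squares loss J(<v>_+) on a batch and its gradient w.r.t. v."""
    w = relu(v)
    loss = 0.0
    grad_w = [0.0] * len(v)
    for x, y in batch:
        r = sum(wi * xi for wi, xi in zip(w, x)) - y
        loss += r * r / len(batch)
        for i, xi in enumerate(x):
            grad_w[i] += 2.0 * r * xi / len(batch)
    # Chain rule through max(0, .): slope 1 for v_i > 0, else 0 (subgradient choice).
    grad_v = [g if vi > 0.0 else 0.0 for g, vi in zip(grad_w, v)]
    return loss, grad_v


def train(samples, v0, alpha_v=0.05, n_epochs=200, batch_size=4, seed=0):
    """Plain mini-batch gradient descent on v; returns (v, w = <v>_+)."""
    rng = random.Random(seed)
    v = list(v0)
    for _ in range(n_epochs):
        data = list(samples)
        rng.shuffle(data)
        for k in range(0, len(data), batch_size):
            _, grad_v = loss_and_grad(v, data[k:k + batch_size])
            v = [vi - alpha_v * gi for vi, gi in zip(v, grad_v)]
    return v, relu(v)


# Synthetic data generated from non-negative reference weights (illustrative only).
_rng = random.Random(1)
w_ref = [0.7, 0.0, 0.3, 0.0]
samples = []
for _ in range(32):
    x = [_rng.uniform(0.0, 1.0) for _ in w_ref]
    samples.append((x, sum(wi * xi for wi, xi in zip(w_ref, x))))

v0 = [0.5, 0.5, 0.5, 0.5]
v_opt, w_opt = train(samples, v0)
print("trained weights:", [round(wi, 3) for wi in w_opt])
```

Because the optimization runs on the unconstrained vector and the projection is applied only when the loss is evaluated, the returned weights are non-negative by construction, without any explicit constraint handling in the optimizer.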

3.3 Online evaluation

For fixed fitting parameters \(\vec {{{\varvec{ n}} }}\) and \(\vec {w}\), the goal of the online evaluation is to efficiently integrate a deep material network implicitly at a single Gauss point of a macroscopic FE simulation. Indeed, direct DMNs are defined as a hierarchy of nested laminates. For this reason, they inherit thermodynamic consistency and stress–strain monotonicity from their phases, see Section 3.1 and Appendix C in Gajek et al. [53] for a discussion. Thus, extending DMNs to non-isothermal problems does not pose any challenges from the point of view of thermodynamics. The governing equation, i.e., the DMN’s balance of linear momentum, emerges naturally by incorporating the homogeneity of the absolute temperature into the framework. Furthermore, accounting for the back-coupling from the microscopic onto the macroscopic scale is straightforward as well. Both will be explained in the following.

We consider a two-phase DMN of depth K comprising two non-isothermal GSMs \({\mathcal {G}}_1 = (Z_1, \psi _1, \phi _1, {{\varvec{ z}} }_{0,1})\) and \({\mathcal {G}}_2 = (Z_2, \psi _2, \phi _2, {{\varvec{ z}} }_{0,2})\) as phases. We regard this DMN as a single laminate with complex kinematics, comprising \(2^K\) independent phases in total, see Gajek et al. [53] for a schematic. We index these phases by \(i = 1, \dots , 2^K\) and assign to each phase the non-isothermal GSM \({\mathcal {G}}_i\) which alternates between \({\mathcal {G}}_1\) and \({\mathcal {G}}_2\), i.e.,

$$\begin{aligned} {\mathcal {G}}_i = \left\{ \begin{array}{rl} {\mathcal {G}}_1 = (Z_1, \psi _1, \phi _1, {{\varvec{ z}} }_{0,1}), &{} i\text { odd,}\\ {\mathcal {G}}_2 = (Z_2, \psi _2, \phi _2, {{\varvec{ z}} }_{0,2}), &{} i\text { even.} \end{array} \right. \end{aligned}$$
(3.16)

Let the superscript n refer to the n-th time step at time \(t^n\) and let \(\triangle t = t^{n+1} - t^{n}\) denote the time increment. Then, for each phase \(i = 1, \dots , 2^K\), discretizing Biot’s equation (2.2) in time with an implicit Euler method gives rise to the condensed free energy potential \({\Psi _i: \mathrm{Sym }( d ) \times {\mathbb { R}}_{> 0} \times Z_i \rightarrow {\mathbb { R}}}\),

$$\begin{aligned} \Psi _i\left( {\varvec{\varepsilon }}^{n+1}_i, \theta ^{n+1}_i, {{\varvec{ z}} }^n_i\right)= & {} \inf _{{{\varvec{ z}} }^{n+1}_i \in Z_i}\left( \psi _i\left( {\varvec{\varepsilon }}^{n+1}_i, \theta ^{n+1}_i, {{\varvec{ z}} }^{n+1}_i\right) \right. \nonumber \\&\left. + \triangle t \, \phi _i\left( \theta ^{n+1}_i, \frac{{{\varvec{ z}} }^{n+1}_i - {{\varvec{ z}} }^n_i}{\triangle t}\right) \right) ,\nonumber \\ \end{aligned}$$
(3.17)

solely depending on the strain \({\varvec{\varepsilon }}^{n+1}_i \in \mathrm{Sym }( d )\), the temperature \(\theta ^{n+1}_i \in {\mathbb { R}}_{> 0}\) and the internal variables \({{\varvec{ z}} }^{n}_i \in Z_i\) of the last converged time step. Then, for a fixed temperature, the stress of phase i

$$\begin{aligned} {\varvec{\sigma }}^{n+1}_i = \frac{\partial \Psi _i}{\partial {\varvec{\varepsilon }}} \left( {\varvec{\varepsilon }}^{n+1}_i, \theta ^{n+1}_i, {{\varvec{ z}} }^{n}_i\right) \end{aligned}$$
(3.18)

is given by a nonlinear elastic law, see Lahellec-Suquet [60] for more information. For the sake of exposition, we omit explicit reference to time step \(n+1\) from here on.
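To make the condensed potential (3.17) and the stress (3.18) concrete, consider a one-dimensional sketch. It assumes a single scalar internal variable with \(\psi = \tfrac{1}{2} E (\varepsilon - z)^2\) and \(\phi = \tfrac{1}{2} \eta \dot{z}^2\); the temperature is held fixed and would enter only through the parameters. This toy model, the function name and the default parameter values are hypothetical and not taken from the paper. For this choice, the infimum over \(z^{n+1}\) is attained in closed form.

```python
def condensed_potential(eps, z_old, dt, E=1000.0, eta=50.0):
    """Return (Psi, z_new, sigma) for psi = E/2 (eps - z)^2 and phi = eta/2 zdot^2.

    Psi is the implicit-Euler condensed potential, cf. (3.17), evaluated at the
    minimizer z_new; sigma is the stress E (eps - z_new), cf. (3.18).
    """
    k = eta / dt
    # Stationarity of psi + dt * phi with respect to z_new
    z_new = (E * eps + k * z_old) / (E + k)
    psi = 0.5 * E * (eps - z_new) ** 2
    phi = 0.5 * eta * ((z_new - z_old) / dt) ** 2
    sigma = E * (eps - z_new)
    return psi + dt * phi, z_new, sigma


# Compare the stress with a central difference of the condensed potential.
eps, z_old, dt, h = 0.01, 0.002, 0.1, 1e-6
Psi_plus = condensed_potential(eps + h, z_old, dt)[0]
Psi_minus = condensed_potential(eps - h, z_old, dt)[0]
_, z_new, sigma = condensed_potential(eps, z_old, dt)
print("sigma:", sigma, "finite difference:", (Psi_plus - Psi_minus) / (2.0 * h))
```

The agreement of both values reflects that, for a fixed old internal variable, the time-discretized inelastic material behaves like a nonlinear elastic one whose potential is the condensed free energy.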

First, we consider the kinematics of the DMN by collecting the phase strains into the vector \(\vec {{\varvec{\varepsilon }}} = \left[ {\varvec{\varepsilon }}_1,\ldots ,{\varvec{\varepsilon }}_{2^K}\right] \in (\mathrm{Sym }( d ))^{2^K}\), introducing the vector of macrostrains \(\vec {\bar{{\varvec{\varepsilon }}}} = \left[ \bar{{\varvec{\varepsilon }}},\ldots ,\bar{{\varvec{\varepsilon }}}\right] \in (\mathrm{Sym }( d ))^{2^K}\) and the vector of the unknown displacement jumps \(\vec {{{\varvec{ a}} }} \in ({\mathbb { R}}^d)^{2^K-1}\). The latter inherits its ordering from the vector of lamination directions \(\vec {{{\varvec{ n}} }}\). Then, the DMN’s kinematics admits the representation

$$\begin{aligned} \vec {{\varvec{\varepsilon }}} = \vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \end{aligned}$$
(3.19)

where \({{\varvec{ A}} }:({\mathbb { R}}^d)^{2^K-1} \rightarrow (\mathrm{Sym }( d ))^{2^K}\) is a gradient-type operator encoding the DMN’s topology, i.e., the tree structure, lamination directions and volume fractions, in a single linear mapping, see Gajek et al. [53] for the derivation of the special structure of \({{\varvec{ A}} }\). Secondly, the homogeneity of the absolute temperature on the microscopic scale, see Sect. 2, implies that only the macroscopic absolute temperature \(\bar{\theta }\) needs to be considered, i.e., \(\theta _i \equiv \bar{\theta }\) holds for all phases \(i = 1, \dots , 2^K\). Here, the macroscopic absolute temperature \(\bar{\theta }\) and the macrostrain \(\bar{{\varvec{\varepsilon }}}\) act as inputs to the DMN. Both are provided by the macroscopic finite element simulation for every Gauss point and for every increment of the global (Newton) solver. As outputs, the effective stress \(\bar{{\varvec{\sigma }}}\), the thermomechanical coupling term \(\bar{D}\) and the algorithmic tangents, i.e., the partial derivatives of the effective stress and the thermomechanical coupling term w.r.t. the macrostrain and the macroscopic temperature, need to be returned.
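For the simplest case of depth \(K = 1\), i.e., a single two-phase laminate, the kinematics (3.19) can be sketched as follows. The scaling of the jump vector by the volume fractions is an assumption made here for illustration; the operator \({{\varvec{ A}} }\) derived in Gajek et al. [53] may use a different normalization. The function names and numerical values are hypothetical.

```python
def sym_outer(a, n):
    """Symmetrized dyadic product 1/2 (a (x) n + n (x) a)."""
    d = len(a)
    return [[0.5 * (a[i] * n[j] + n[i] * a[j]) for j in range(d)] for i in range(d)]


def laminate_strains(eps_bar, a, n, c1):
    """Phase strains of a two-phase laminate with normal n and jump vector a.

    Assumed scaling: eps_1 = eps_bar + c2 * sym(a (x) n),
                     eps_2 = eps_bar - c1 * sym(a (x) n).
    """
    c2 = 1.0 - c1
    jump = sym_outer(a, n)
    d = len(a)
    eps1 = [[eps_bar[i][j] + c2 * jump[i][j] for j in range(d)] for i in range(d)]
    eps2 = [[eps_bar[i][j] - c1 * jump[i][j] for j in range(d)] for i in range(d)]
    return eps1, eps2


# Illustrative input: macrostrain, jump vector, unit lamination direction, volume fraction
eps_bar = [[0.01, 0.002, 0.0], [0.002, -0.004, 0.001], [0.0, 0.001, 0.003]]
a = [0.003, -0.001, 0.002]
n = [0.0, 0.6, 0.8]
c1 = 0.3
eps1, eps2 = laminate_strains(eps_bar, a, n, c1)

# The volume average of the phase strains recovers the macrostrain.
avg = [[c1 * eps1[i][j] + (1.0 - c1) * eps2[i][j] for j in range(3)] for i in range(3)]
print(avg)
```

With this scaling, the strain fluctuation averages to zero for any jump vector, and the difference of the two phase strains is a symmetrized rank-one tensor, as required for kinematic compatibility across the laminate interface.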

We start by deriving the governing equation of a thermomechanically coupled direct DMN. Let \(\bar{\Psi }:(\mathrm{Sym }( d ))^{2^K} \times {\mathbb { R}}_{> 0} \times {\mathcal {Z}} \rightarrow {\mathbb { R}}\) denote the averaged condensed free energy of the DMN

$$\begin{aligned} \bar{\Psi }(\vec {{\varvec{\varepsilon }}}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) = \sum _{i=1}^{2^K} w^i_{K+1} \Psi _{i}({\varvec{\varepsilon }}_i, \bar{\theta }, {{\varvec{ z}} }_i^n) \end{aligned}$$
(3.20)

where \(\vec {{{\varvec{ z}} }}^{\, n} = \left[ {{\varvec{ z}} }_1^n, \dots , {{\varvec{ z}} }_{2^K}^n\right] \in {\mathcal {Z}} := Z_1 \oplus Z_2 \oplus \cdots \oplus Z_1 \oplus Z_2\) denotes the vector of internal variables of the last time step. Then, critical points of the optimization problem

$$\begin{aligned} \bar{\Psi }(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) \longrightarrow \min _{\vec {{{\varvec{ a}} }} \in ({\mathbb { R}}^d)^{2^K-1}} \end{aligned}$$
(3.21)

encode the DMN’s (microscopic) balance of linear momentum

$$\begin{aligned} {{{\varvec{ A}} }}^\mathsf{T}{{\varvec{ W}} }\vec {{\varvec{\sigma }}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) = {{\varvec{ 0}} }. \end{aligned}$$
(3.22)

Here, \(\vec {{\varvec{\sigma }}} = \left[ {\varvec{\sigma }}_1, \dots , {\varvec{\sigma }}_{2^K} \right] \in (\mathrm{Sym }( d ))^{2^K}\) represents the vector of phase stresses for which Relation (3.18) holds. Furthermore, the weight matrix \({{\varvec{ W}} }:(\mathrm{Sym }( d ))^{2^K} \rightarrow (\mathrm{Sym }( d ))^{2^K}\)

$$\begin{aligned} {{\varvec{ W}} }= \mathrm{diag }\left( w^1_{K+1}, \dots , w^{2^K}_{K+1} \right) \end{aligned}$$
(3.23)

associates the weights \(w^i_{K+1}\) to the corresponding phase stresses \({\varvec{\sigma }}_i\) and may be represented by a diagonal matrix. Indeed, \({{{\varvec{ A}} }}^\mathsf{T}{{\varvec{ W}} }: (\mathrm{Sym }( d ))^{2^K} \rightarrow ({\mathbb { R}}^d)^{2^K-1}\) may be regarded as a divergence-type operator, such that the similarity of Relation (3.22) to the thermomechanical cell problem in general form (2.4) is immediately revealed.

For solving the DMN’s balance of linear momentum (3.22) for the unknown displacement jumps \(\vec {{{\varvec{ a}} }}\), we rely upon Newton’s method. Let the subscript j refer to the j-th Newton iteration. Then, for an initial guess \({\vec {{{\varvec{ a}} }}_0 \in ({\mathbb { R}}^d)^{2^K-1}}\), the unknown displacement jump vector is iteratively updated,

$$\begin{aligned} \vec {{{\varvec{ a}} }}_{j+1} = \vec {{{\varvec{ a}} }}_{j} + s_j\,\triangle \vec {{{\varvec{ a}} }}_j, \end{aligned}$$
(3.24)

for which the increment \(\triangle \vec {{{\varvec{ a}} }}_j \in \left( {\mathbb { R}}^d\right) ^{2^K-1}\) solves the linear system

$$\begin{aligned}&\left[ {{{\varvec{ A}} }}^\mathsf{T}{{\varvec{ W}} }\frac{\partial \vec {{\varvec{\sigma }}}}{\partial \vec {{\varvec{\varepsilon }}}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }_j}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) {{\varvec{ A}} }\right] \triangle \vec {{{\varvec{ a}} }}_j\nonumber \\&\qquad = - {{{\varvec{ A}} }}^\mathsf{T}{{\varvec{ W}} }\vec {{\varvec{\sigma }}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }_j}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}). \end{aligned}$$
(3.25)

To ensure convergence, the step size \(s_j\in (0,1]\) is determined by backtracking, i.e., it may be smaller than unity. The Jacobian \(\partial \vec {{\varvec{\sigma }}} / \partial \vec {{\varvec{\varepsilon }}}\) may be represented by a block-diagonal matrix comprising the (stress–strain related) algorithmic tangents of the phase materials \(\partial {\varvec{\sigma }}_i / \partial {\varvec{\varepsilon }}\, ({\varvec{\varepsilon }}_i, \bar{\theta }, {{\varvec{ z}} }^n_i)\), i.e.,

$$\begin{aligned}&\frac{\partial \vec {{\varvec{\sigma }}}}{\partial \vec {{\varvec{\varepsilon }}}}(\vec {{\varvec{\varepsilon }}}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) = \text {block-diag}\nonumber \\&\quad \left( \frac{\partial {\varvec{\sigma }}_1}{\partial {\varvec{\varepsilon }}} ({\varvec{\varepsilon }}_1, \bar{\theta }, {{\varvec{ z}} }_1^n), \dots , \frac{\partial {\varvec{\sigma }}_{2^K}}{\partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}_{2^K}, \bar{\theta }, {{\varvec{ z}} }_{2^K}^n) \right) \end{aligned}$$
(3.26)

holds. Upon convergence, the DMN’s effective stress \(\bar{{\varvec{\sigma }}}\) is computed by averaging the phase stresses, i.e.,

$$\begin{aligned} \bar{{\varvec{\sigma }}}= {\left[ {\mathbb { I}}_\text {s}, {\mathbb { I}}_\text {s}, \dots , {\mathbb { I}}_\text {s}\right] }^\mathsf{T} {{\varvec{ W}} }\vec {{\varvec{\sigma }}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}), \end{aligned}$$
(3.27)

where \(\left[ {\mathbb { I}}_\text {s}, \dots , {\mathbb { I}}_\text {s}\right] \in \mathrm{Sym }( d )^{2^K}\) stands for a vector of identity operators on \(\mathrm{Sym }( d )\) and \({{\varvec{ W}} }\) constitutes the weight matrix (3.23).
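The Newton scheme (3.24)–(3.25) and the averaging (3.27) can be sketched for a one-dimensional two-phase laminate, for which \({{\varvec{ A}} }\) and \({{\varvec{ W}} }\) reduce to a pair of scalars each. The following is a minimal sketch under stated assumptions: the phases follow a hypothetical monotone nonlinear elastic law \(\sigma = E(\varepsilon + \beta \varepsilon ^3)\) at fixed temperature, the scaling of \({{\varvec{ A}} }\) is chosen for illustration, and the backtracking criterion (sufficient decrease of the residual) is one possible choice that the text does not prescribe.

```python
def phase(eps, E, beta):
    """Stress and tangent of a hypothetical monotone nonlinear elastic 1D phase."""
    sig = E * (eps + beta * eps**3)
    tan = E * (1.0 + 3.0 * beta * eps**2)
    return sig, tan


def solve_laminate(eps_bar, c1, mats, tol=1e-10, max_iter=50):
    """Solve A^T W sigma = 0 for the scalar jump a; return (a, effective stress)."""
    c2 = 1.0 - c1
    A = (c2, -c1)  # assumed kinematics: eps_i = eps_bar + A[i] * a
    w = (c1, c2)   # diagonal of the weight matrix W

    def evaluate(a):
        out = [phase(eps_bar + A[i] * a, *mats[i]) for i in range(2)]
        resid = sum(A[i] * w[i] * out[i][0] for i in range(2))       # A^T W sigma
        jac = sum(A[i] * w[i] * out[i][1] * A[i] for i in range(2))  # A^T W C A
        sig_bar = sum(w[i] * out[i][0] for i in range(2))            # averaged stress
        return resid, jac, sig_bar

    a = 0.0
    for _ in range(max_iter):
        resid, jac, _ = evaluate(a)
        if abs(resid) < tol:
            break
        da = -resid / jac
        s = 1.0
        # Backtracking: halve the step until the residual decreases sufficiently.
        while abs(evaluate(a + s * da)[0]) > (1.0 - 1e-4 * s) * abs(resid) and s > 1e-8:
            s *= 0.5
        a += s * da
    return a, evaluate(a)[2]


# Illustrative parameters (E, beta) per phase; not taken from the paper.
mats = [(1000.0, 50.0), (3000.0, 10.0)]
a_sol, sig_bar = solve_laminate(0.02, 0.4, mats)
print("jump:", a_sol, "effective stress:", sig_bar)
```

In this one-dimensional setting, the balance (3.22) reduces to the equality of the two phase stresses, so the converged effective stress coincides with either of them. For linear phases, the sketch reproduces the harmonic (series) average of the two moduli.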

In Sect. 2, we learned that the evolution of the macroscopic temperature \(\bar{\theta }\) is coupled to the microscopic scale by the thermomechanical coupling term \(\bar{D}\). For computing \(\bar{D}\) efficiently, we introduce the phase-wise coupling terms

$$\begin{aligned}&D_i({\varvec{\varepsilon }}_i, \bar{\theta }, {{\varvec{ z}} }_i^n) = \bar{\theta }\displaystyle \frac{\partial ^2 \Psi _i}{\partial \theta \partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}_i, \bar{\theta }, {{\varvec{ z}} }_i^n) : \frac{{\varvec{\varepsilon }}_i - {\varvec{\varepsilon }}^n_i}{\triangle t} \nonumber \\&+ \left[ \bar{\theta }\displaystyle \frac{\partial ^2 \Psi _i}{\partial \theta \partial {{\varvec{ z}} }}({\varvec{\varepsilon }}_i, \bar{\theta }, {{\varvec{ z}} }_i^n) - \displaystyle \frac{\partial \Psi _i}{\partial {{\varvec{ z}} }}({\varvec{\varepsilon }}_i, \bar{\theta }, {{\varvec{ z}} }_i^n)\right] \cdot \frac{{{\varvec{ z}} }_i - {{\varvec{ z}} }^n_i}{\triangle t} \end{aligned}$$
(3.28)

for every phase \(i = 1, \dots , 2^K\), individually. With the vector of coupling terms \(\vec {D} = \left[ D_1, \dots , D_{2^K}\right] \in {\mathbb { R}}^{2^K}\) and the vector of ones, \(\left[ 1, \dots , 1\right] \in {\mathbb { R}}^{2^K}\), we compute \(\bar{D}\) by averaging, i.e.,

$$\begin{aligned} \bar{D}= {\left[ 1, \dots , 1\right] }^\mathsf{T} {{\varvec{ W}} }\vec {D}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) \end{aligned}$$
(3.29)

holds.

To employ a DMN in a two-scale setting, four algorithmic tangents need to be computed and provided to the macroscopic solver. We start with the algorithmic tangents related to the effective stress. Differentiating the effective stress \(\bar{{\varvec{\sigma }}}\) (3.27) w.r.t. the macrostrain \(\bar{{\varvec{\varepsilon }}}\) and the absolute temperature \(\bar{\theta }\) gives rise to the DMN’s (stress-related) algorithmic tangents

$$\begin{aligned} {\mathbb { C}}^\text {algo}_{\bar{\varepsilon }}:= & {} \frac{\partial \bar{{\varvec{\sigma }}}}{\partial \bar{{\varvec{\varepsilon }}}} = {\left[ {\mathbb { I}}_\text {s}, \dots , {\mathbb { I}}_\text {s}\right] }^\mathsf{T} {{\varvec{ W}} }\nonumber \\&\left[ \frac{\partial \vec {{\varvec{\sigma }}}}{\partial \bar{{\varvec{\varepsilon }}}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n})\right. \nonumber \\&\left. + \frac{\partial \vec {{\varvec{\sigma }}}}{\partial \vec {{\varvec{\varepsilon }}}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) {{\varvec{ A}} }\frac{\partial \vec {{{\varvec{ a}} }}}{\partial \bar{{\varvec{\varepsilon }}}} \right] \end{aligned}$$
(3.30)

and

$$\begin{aligned} {\mathbb { C}}^\text {algo}_{\bar{\theta }}:= & {} \frac{\partial \bar{{\varvec{\sigma }}}}{\partial \bar{\theta }} = {\left[ {\mathbb { I}}_\text {s}, \dots , {\mathbb { I}}_\text {s}\right] }^\mathsf{T} {{\varvec{ W}} }\nonumber \\&\left[ \frac{\partial \vec {{\varvec{\sigma }}}}{\partial \bar{\theta }}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) \right. \nonumber \\&\left. + \frac{\partial \vec {{\varvec{\sigma }}}}{\partial \vec {{\varvec{\varepsilon }}}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) {{\varvec{ A}} }\frac{\partial \vec {{{\varvec{ a}} }}}{\partial \bar{\theta }} \right] . \end{aligned}$$
(3.31)

To get compact expressions, we introduced the vectors of algorithmic tangents

$$\begin{aligned} \frac{\partial \vec {{\varvec{\sigma }}}}{\partial \bar{{\varvec{\varepsilon }}}}(\vec {{\varvec{\varepsilon }}}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n})\!=\! \left[ \frac{\partial {\varvec{\sigma }}_1}{\partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}_1, \bar{\theta }, {{\varvec{ z}} }_1^n), \dots , \frac{\partial {\varvec{\sigma }}_{2^K}}{\partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}_{2^K}, \bar{\theta }, {{\varvec{ z}} }_{2^K}^n) \right] \nonumber \\ \end{aligned}$$
(3.32)

and

$$\begin{aligned} \frac{\partial \vec {{\varvec{\sigma }}}}{\partial \bar{\theta }}(\vec {{\varvec{\varepsilon }}}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) \!=\!\left[ \frac{\partial {\varvec{\sigma }}_1}{\partial \theta }({\varvec{\varepsilon }}_1, \bar{\theta }, {{\varvec{ z}} }_1^n), \dots , \frac{\partial {\varvec{\sigma }}_{2^K}}{\partial \theta }({\varvec{\varepsilon }}_{2^K}, \bar{\theta }, {{\varvec{ z}} }_{2^K}^n) \right] \nonumber \\ \end{aligned}$$
(3.33)

which arise by inserting \(\partial {\varvec{\sigma }}_i / \partial {\varvec{\varepsilon }}\, ({\varvec{\varepsilon }}_i, \bar{\theta }, {{\varvec{ z}} }^n_i)\) and \(\partial {\varvec{\sigma }}_i / \partial \theta \, ({\varvec{\varepsilon }}_i, \bar{\theta }, {{\varvec{ z}} }^n_i)\), \(i=1,\dots ,2^K\), into column vectors. To evaluate Expressions (3.30) and (3.31), the partial derivatives of the displacement jumps with respect to the macrostrain \(\partial \vec {{{\varvec{ a}} }}/\partial \bar{{\varvec{\varepsilon }}}\) and the absolute temperature \(\partial \vec {{{\varvec{ a}} }}/\partial \bar{\theta }\) need to be computed first. To this end, differentiating the balance of linear momentum (3.22) with respect to the macrostrain \(\bar{{\varvec{\varepsilon }}}\) and the absolute temperature \(\bar{\theta }\) yields the linear systems

$$\begin{aligned}&\left[ {{{\varvec{ A}} }}^\mathsf{T}{{\varvec{ W}} }\frac{\partial \vec {{\varvec{\sigma }}}}{\partial \vec {{\varvec{\varepsilon }}}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) {{\varvec{ A}} }\right] \frac{\partial \vec {{{\varvec{ a}} }}}{\partial \bar{{\varvec{\varepsilon }}}}\nonumber \\&\quad = -{{{\varvec{ A}} }}^\mathsf{T}{{\varvec{ W}} }\frac{\partial \vec {{\varvec{\sigma }}}}{\partial \bar{{\varvec{\varepsilon }}}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) \end{aligned}$$
(3.34)

and

$$\begin{aligned}&\left[ {{{\varvec{ A}} }}^\mathsf{T}{{\varvec{ W}} }\frac{\partial \vec {{\varvec{\sigma }}}}{\partial \vec {{\varvec{\varepsilon }}}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) {{\varvec{ A}} }\right] \frac{\partial \vec {{{\varvec{ a}} }}}{\partial \bar{\theta }} \nonumber \\&\quad = -{{{\varvec{ A}} }}^\mathsf{T}{{\varvec{ W}} }\frac{\partial \vec {{\varvec{\sigma }}}}{\partial \bar{\theta }}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) \end{aligned}$$
(3.35)

which need to be solved for \(\partial \vec {{{\varvec{ a}} }}/\partial \bar{{\varvec{\varepsilon }}}\) and \(\partial \vec {{{\varvec{ a}} }}/\partial \bar{\theta }\). By comparing Equations (3.34) and (3.35) to (3.25), we observe that all three problems share the same linear operator, i.e., only the right-hand sides differ. When a direct solver is used, e.g., one based on a Cholesky decomposition, the factorization can be computed once and reused for all right-hand sides, which minimizes the computational overhead.
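The implicit differentiation leading to (3.30), (3.31), (3.34) and (3.35) can be checked numerically on a one-dimensional two-phase laminate. The sketch below assumes hypothetical thermoelastic phases with \(\sigma = E(e + \beta e^3)\) and \(e = \varepsilon - \alpha (\theta - \theta _0)\), an illustrative scaling of \({{\varvec{ A}} }\), and Newton's method with full steps. In one dimension the shared linear operator is a scalar, so its reciprocal stands in for the reusable factorization.

```python
THETA0 = 293.15  # hypothetical reference temperature


def phase(eps, theta, E, beta, alpha):
    """Stress and its partial derivatives w.r.t. strain and temperature."""
    e = eps - alpha * (theta - THETA0)
    sig = E * (e + beta * e**3)
    dsig_deps = E * (1.0 + 3.0 * beta * e**2)
    dsig_dtheta = -alpha * dsig_deps
    return sig, dsig_deps, dsig_dtheta


def dmn_response(eps_bar, theta, c1, mats, tol=1e-10, max_iter=50):
    """Return (effective stress, d sigma_bar / d eps_bar, d sigma_bar / d theta)."""
    A = (1.0 - c1, -c1)  # assumed kinematics: eps_i = eps_bar + A[i] * a
    w = (c1, 1.0 - c1)   # diagonal of the weight matrix W
    a = 0.0
    for _ in range(max_iter):
        out = [phase(eps_bar + A[i] * a, theta, *mats[i]) for i in range(2)]
        resid = sum(A[i] * w[i] * out[i][0] for i in range(2))
        K = sum(A[i] * w[i] * out[i][1] * A[i] for i in range(2))  # A^T W C A
        if abs(resid) < tol:
            break
        a -= resid / K
    K_inv = 1.0 / K  # computed once, reused for both right-hand sides
    da_deps = -K_inv * sum(A[i] * w[i] * out[i][1] for i in range(2))    # cf. (3.34)
    da_dtheta = -K_inv * sum(A[i] * w[i] * out[i][2] for i in range(2))  # cf. (3.35)
    sig_bar = sum(w[i] * out[i][0] for i in range(2))
    C_eps = sum(w[i] * (out[i][1] + out[i][1] * A[i] * da_deps) for i in range(2))
    C_theta = sum(w[i] * (out[i][2] + out[i][1] * A[i] * da_dtheta) for i in range(2))
    return sig_bar, C_eps, C_theta


# Illustrative parameters (E, beta, alpha) per phase; not taken from the paper.
mats = [(1000.0, 50.0, 1e-4), (3000.0, 10.0, 2e-5)]
sig_bar, C_eps, C_theta = dmn_response(0.02, 313.15, 0.4, mats)

# Central finite differences of the effective stress for comparison.
h, k = 1e-6, 1e-3
fd_eps = (dmn_response(0.02 + h, 313.15, 0.4, mats)[0]
          - dmn_response(0.02 - h, 313.15, 0.4, mats)[0]) / (2.0 * h)
fd_theta = (dmn_response(0.02, 313.15 + k, 0.4, mats)[0]
            - dmn_response(0.02, 313.15 - k, 0.4, mats)[0]) / (2.0 * k)
print(C_eps, fd_eps, C_theta, fd_theta)
```

In the general multi-dimensional case, the scalar reciprocal would be replaced by a matrix factorization that is applied to the Newton right-hand side and to both tangent right-hand sides.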

Differentiating the effective coupling term \(\bar{D}\) w.r.t. the macrostrain \(\bar{{\varvec{\varepsilon }}}\) and the absolute temperature \(\bar{\theta }\) gives rise to the DMN’s (energy-related) algorithmic tangents

$$\begin{aligned} {\mathbb { D}}^\text {algo}_{\bar{\varepsilon }}:= & {} \frac{\partial \bar{D}}{\partial \bar{{\varvec{\varepsilon }}}} = \left[ 1, \dots , 1\right] ^T {{\varvec{ W}} }\nonumber \\&\left[ \frac{\partial \vec {D}}{\partial \bar{{\varvec{\varepsilon }}}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) \right. \nonumber \\&\left. + \frac{\partial \vec {D}}{\partial \vec {{\varvec{\varepsilon }}}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) {{\varvec{ A}} }\frac{\partial \vec {{{\varvec{ a}} }}}{\partial \bar{{\varvec{\varepsilon }}}} \right] \end{aligned}$$
(3.36)

and

$$\begin{aligned} {\mathbb { D}}^\text {algo}_{\bar{\theta }}:= & {} \frac{\partial \bar{D}}{\partial \bar{\theta }} = \left[ 1, \dots , 1\right] ^T {{\varvec{ W}} }\nonumber \\&\left[ \frac{\partial \vec {D}}{\partial \bar{\theta }}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) \right. \nonumber \\&\left. + \frac{\partial \vec {D}}{\partial \vec {{\varvec{\varepsilon }}}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) {{\varvec{ A}} }\frac{\partial \vec {{{\varvec{ a}} }}}{\partial \bar{\theta }} \right] . \end{aligned}$$
(3.37)

As before, \(\partial \vec {D} / \partial \vec {{\varvec{\varepsilon }}} \, (\vec {{\varvec{\varepsilon }}}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n})\) denotes the block-diagonal matrix of phase-wise algorithmic tangents

$$\begin{aligned} \frac{\partial \vec {D}}{\partial \vec {{\varvec{\varepsilon }}}}(\vec {{\varvec{\varepsilon }}}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) =&\text {block-diag}\left( \frac{\partial D_1}{\partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}_1, \bar{\theta }, {{\varvec{ z}} }_1^n), \dots , \frac{\partial D_{2^K}}{\partial {\varvec{\varepsilon }}}\right. \nonumber \\&\left. ({\varvec{\varepsilon }}_{2^K}, \bar{\theta }, {{\varvec{ z}} }_{2^K}^n) \right) .\nonumber \\ \end{aligned}$$
(3.38)

Furthermore, for brevity, the vectors of the (energy-related) algorithmic tangents

$$\begin{aligned} \frac{\partial \vec {D}}{\partial \bar{{\varvec{\varepsilon }}}}(\vec {{\varvec{\varepsilon }}}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) = \left[ \frac{\partial D_1}{\partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}_1, \bar{\theta }, {{\varvec{ z}} }_1^n), \dots , \frac{\partial D_{2^K}}{\partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}_{2^K}, \bar{\theta }, {{\varvec{ z}} }_{2^K}^n) \right] \nonumber \\ \end{aligned}$$
(3.39)

and

$$\begin{aligned} \frac{\partial \vec {D}}{\partial \bar{\theta }}(\vec {{\varvec{\varepsilon }}}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}) = \left[ \frac{\partial D_1}{\partial \theta }({\varvec{\varepsilon }}_1, \bar{\theta }, {{\varvec{ z}} }_1^n), \dots , \frac{\partial D_{2^K}}{\partial \theta }({\varvec{\varepsilon }}_{2^K}, \bar{\theta }, {{\varvec{ z}} }_{2^K}^n) \right] \nonumber \\ \end{aligned}$$
(3.40)

were introduced. Indeed, to efficiently compute Relations (3.36) and (3.37), the already computed partial derivatives \(\partial \vec {{{\varvec{ a}} }}/\partial \bar{{\varvec{\varepsilon }}}\) and \(\partial \vec {{{\varvec{ a}} }}/\partial \bar{\theta }\) are reused.

Later on in Sect. 5.4, we take a closer look at the effective dissipation \(\bar{\mathcal {D}}\) to assess the self-heating of the DMN under cyclic and non-cyclic loading. For this reason, we compute the phase-wise dissipation by

$$\begin{aligned} \mathcal {D}_i({\varvec{\varepsilon }}_i, \bar{\theta }, {{\varvec{ z}} }_i^n)= & {} - \displaystyle \frac{\partial \Psi _i}{\partial {{\varvec{ z}} }}({\varvec{\varepsilon }}_i, \bar{\theta }, {{\varvec{ z}} }_i^n) \cdot \frac{{{\varvec{ z}} }_i - {{\varvec{ z}} }^n_i}{\triangle t} \quad \text {with} \quad \nonumber \\ \vec {\mathcal {D}}= & {} \left[ \mathcal {D}_1, \dots , \mathcal {D}_{2^K}\right] \in {\mathbb { R}}^{2^K}. \end{aligned}$$
(3.41)

Then, the effective dissipation is computed by averaging

$$\begin{aligned} \bar{\mathcal {D}}= {\left[ 1, \dots , 1\right] }^\mathsf{T} {{\varvec{ W}} }\vec {\mathcal {D}}(\vec {\bar{{\varvec{\varepsilon }}}} + {{\varvec{ A}} }\vec {{{\varvec{ a}} }}, \bar{\theta }, \vec {{{\varvec{ z}} }}^{\, n}). \end{aligned}$$
(3.42)
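The phase-wise dissipation (3.41) and its average (3.42) can be illustrated with a one-dimensional sketch. It assumes a hypothetical phase model with \(\psi = \tfrac{1}{2} E (\varepsilon - z)^2\) and \(\phi = \tfrac{1}{2} \eta \dot{z}^2\) at fixed temperature, and it evaluates the thermodynamic driving force at the updated internal variable, which is one reading of (3.41). The function names and parameter values are not taken from the paper.

```python
def maxwell_step(eps, z_old, dt, E, eta):
    """Implicit Euler update of z and the resulting dissipation of one phase."""
    k = eta / dt
    z_new = (E * eps + k * z_old) / (E + k)
    z_dot = (z_new - z_old) / dt
    driving_force = E * (eps - z_new)  # equals -d psi / d z at the updated state
    return z_new, driving_force * z_dot


def effective_dissipation(phase_strains, z_old, dt, weights, mats):
    """Weighted average of the phase-wise dissipation, cf. (3.42)."""
    total = 0.0
    for eps, z, w, (E, eta) in zip(phase_strains, z_old, weights, mats):
        _, diss = maxwell_step(eps, z, dt, E, eta)
        total += w * diss
    return total


# Illustrative two-phase state: one phase is loaded, the other one unloaded.
mats = [(1000.0, 50.0), (3000.0, 200.0)]
weights = [0.4, 0.6]
strains = [0.01, 0.0]
z_old = [0.002, 0.01]
d_bar = effective_dissipation(strains, z_old, 0.1, weights, mats)
print("effective dissipation:", d_bar)
```

For this quadratic dissipation potential, the stationarity condition of the update implies that the phase dissipation equals \(\eta \dot{z}^2\), so that it is non-negative both under loading and unloading.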

The pseudo-code summarizing the relevant steps of the algorithm can be found in Algorithm 1. Please note that the effective stress \(\bar{{\varvec{\sigma }}}\), the effective thermomechanical coupling term \(\bar{D}\), the effective dissipation \(\bar{\mathcal {D}}\) and the algorithmic tangents \({\mathbb { C}}^\text {algo}_{\bar{\varepsilon }}\), \({\mathbb { C}}^\text {algo}_{\bar{\theta }}\), \({\mathbb { D}}^\text {algo}_{\bar{\varepsilon }}\) and \({\mathbb { D}}^\text {algo}_{\bar{\theta }}\) are computed after the convergence of Newton’s method for reasons of numerical efficiency.


4 Short fiber reinforced polyamide

In general, thermoplastic polymers feature a pronounced thermomechanical coupling. For this reason, we study a short fiber reinforced polyamide 6.6 (PA66) as our benchmark composite. As reinforcement, we consider E-glass fibers with a (uniform) fiber length of \(L_\text {f} = 200~{\mu }\)m and a fiber diameter of \(D_\text {f} = 10~{\mu }\)m. We choose a fiber volume fraction of \(c_\text {f} = 16 \%\), which corresponds to a fiber mass fraction of approximately \(30 \%\). The fiber orientation is described by a transversely isotropic fiber orientation tensor of second order [61] which reads

$$\begin{aligned} \mathrm{diag }\left( 0.8, 0.1, 0.1\right) \end{aligned}$$
(4.1)

in Cartesian coordinates, i.e., \(80\%\) of the fibers point in the \({{\varvec{ e}} }_1\) direction, whereas \(20\%\) of the fibers are uniformly distributed in the \({{\varvec{ e}} }_2\)-\({{\varvec{ e}} }_3\) plane. Figure 2 illustrates an example of such a microstructure comprising 577 straight, cylindrical fibers.

E-glass fibers

We model the E-glass fibers as isotropic, linear thermoelastic. We rely upon the commonly used additive splitting of the volume-specific Helmholtz free energy density into two parts

$$\begin{aligned} \psi ({\varvec{\varepsilon }}, \theta ) = \psi _\text {mech}({\varvec{\varepsilon }}, \theta ) + \psi _\text {heat}(\theta ). \end{aligned}$$
(4.2)

The first part \(\psi _\text {mech}({\varvec{\varepsilon }}, \theta )\) represents the storage of mechanical energy, whereas the second part \(\psi _\text {heat}(\theta )\) represents the heat storage alone. We assume the heat capacity at constant strain to be independent of the deformation \({\varvec{\varepsilon }}\). Thus, the mechanical part of the Helmholtz free energy \(\psi _\text {mech}({\varvec{\varepsilon }}, \theta )\) may at most be linear in the temperature and

$$\begin{aligned} c_{\varepsilon }(\theta ) = - \theta \frac{\partial ^2 \psi _\text {heat}}{\partial \theta ^2}(\theta ) \end{aligned}$$
(4.3)

holds. For a constant heat capacity at constant strain \(c_{\varepsilon }(\theta ) = c_0\), the heat storage part of the Helmholtz free energy reads

$$\begin{aligned} \psi _\text {heat}(\theta ) = c_0 \left[ (\theta - \theta _0) - \theta \ln \left( \frac{\theta }{\theta _0}\right) \right] , \end{aligned}$$
(4.4)

where \(\theta _0\) stands for the reference temperature. The mechanical part of the Helmholtz free energy is given by the following quadratic form

$$\begin{aligned} \psi _\text {mech}({\varvec{\varepsilon }}, \theta ) = \frac{1}{2} {\varvec{\varepsilon }}: {\mathbb { C}}\left[ {\varvec{\varepsilon }}\right] - {\varvec{\varepsilon }}: {\mathbb { C}}[{\varvec{\alpha }}(\theta - \theta _0)], \end{aligned}$$
(4.5)

Fig. 2 Generated microstructure realization comprising 577 fibers

such that the stress response of the material computes to

$$\begin{aligned} {\varvec{\sigma }}= {\mathbb { C}}\left[ {\varvec{\varepsilon }}- {\varvec{\alpha }}(\theta - \theta _0)\right] . \end{aligned}$$
(4.6)
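As a brief consistency check of Relations (4.3)–(4.6), which is added here as a worked step and uses only the definitions above, differentiating the heat-storage part (4.4) twice with respect to the temperature gives

$$\begin{aligned} \frac{\partial \psi _\text {heat}}{\partial \theta }(\theta )&= c_0 \left[ 1 - \ln \left( \frac{\theta }{\theta _0}\right) - 1 \right] = - c_0 \ln \left( \frac{\theta }{\theta _0}\right) , \\ \frac{\partial ^2 \psi _\text {heat}}{\partial \theta ^2}(\theta )&= - \frac{c_0}{\theta }, \quad \text {so that} \quad c_{\varepsilon }(\theta ) = - \theta \frac{\partial ^2 \psi _\text {heat}}{\partial \theta ^2}(\theta ) = c_0 . \end{aligned}$$

Similarly, provided the stiffness \({\mathbb { C}}\) has the major symmetry, differentiating the mechanical part (4.5) with respect to the strain yields

$$\begin{aligned} \frac{\partial \psi _\text {mech}}{\partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}, \theta ) = {\mathbb { C}}\left[ {\varvec{\varepsilon }}\right] - {\mathbb { C}}[{\varvec{\alpha }}(\theta - \theta _0)] = {\mathbb { C}}\left[ {\varvec{\varepsilon }}- {\varvec{\alpha }}(\theta - \theta _0)\right] , \end{aligned}$$

which recovers the stress (4.6). As \(\psi _\text {mech}\) is linear in the temperature, it does not contribute to the second temperature derivative in (4.3).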

Both the stiffness \({\mathbb { C}}\) and the coefficient of thermal expansion \({\varvec{\alpha }}\) are assumed to be isotropic, i.e., the following relations

$$\begin{aligned} {\mathbb { C}}= 3 K {\mathbb { P}}_1 + 2 G {\mathbb { P}}_2 \quad \text {and} \quad {\varvec{\alpha }}= \alpha _0 \mathbf {1}\end{aligned}$$
(4.7)

hold, with the projection operators \({\mathbb { P}}_1: \mathrm{Sym }( d ) \rightarrow \mathrm{Sph }( d )\) and \({\mathbb { P}}_2: \mathrm{Sym }( d ) \rightarrow \mathrm{Dev }( d )\) on the spherical and deviatoric subspaces of \(\mathrm{Sym }( d )\) which read

$$\begin{aligned} \left( {\mathbb { P}}_1\right) _{mnop}= & {} \frac{1}{3}\delta _{mn} \delta _{op} \quad \text {and} \quad \left( {\mathbb { P}}_2\right) _{mnop} \nonumber \\= & {} \frac{1}{2}(\delta _{mo} \delta _{pn} + \delta _{mp} \delta _{on}) - \frac{1}{3} \delta _{mn} \delta _{op} \end{aligned}$$
(4.8)

in Cartesian coordinates and \(\mathbf {1}: {\mathbb { R}}^d \rightarrow {\mathbb { R}}^d\) denotes the identity on \({\mathbb { R}}^d\). The bulk modulus K and the shear modulus G may be expressed in terms of the Young’s modulus E and the Poisson’s ratio \(\nu \), i.e.,

$$\begin{aligned} \quad K = \frac{E}{3(1 - 2 \nu )} \quad \text {and} \quad \quad G = \frac{E}{2(1 + \nu )}. \end{aligned}$$
(4.9)
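Relations (4.6)–(4.9) translate into a few lines of code. The following minimal sketch evaluates the isotropic thermoelastic stress for a symmetric \(3 \times 3\) strain stored as nested lists, using \({\mathbb { C}}[{{\varvec{ e}} }] = K \,\mathrm{tr}({{\varvec{ e}} })\, \mathbf {1} + 2 G \,\mathrm{dev}({{\varvec{ e}} })\), which follows from (4.7) and (4.8). The function names and the numerical values are illustrative placeholders and are not the parameters of Table 1.

```python
def bulk_shear(E, nu):
    """Bulk and shear modulus from Young's modulus and Poisson's ratio, cf. (4.9)."""
    return E / (3.0 * (1.0 - 2.0 * nu)), E / (2.0 * (1.0 + nu))


def thermoelastic_stress(eps, theta, theta0, E, nu, alpha0):
    """Stress C[eps - alpha (theta - theta0)] of an isotropic solid, cf. (4.6)."""
    K, G = bulk_shear(E, nu)
    th = alpha0 * (theta - theta0)
    # Elastic strain: subtract the isotropic thermal strain on the diagonal.
    e = [[eps[i][j] - (th if i == j else 0.0) for j in range(3)] for i in range(3)]
    tr = e[0][0] + e[1][1] + e[2][2]
    # C[e] = 3K P1[e] + 2G P2[e] = K tr(e) 1 + 2G dev(e)
    return [[(K * tr if i == j else 0.0)
             + 2.0 * G * (e[i][j] - (tr / 3.0 if i == j else 0.0))
             for j in range(3)] for i in range(3)]


# Illustrative placeholder values (stiffness in MPa, temperatures in K).
E, nu, alpha0, theta0 = 72000.0, 0.22, 5e-6, 293.15

# Uniaxial stress state at the reference temperature: lateral contraction -nu * e11.
e11 = 1e-3
eps_uni = [[e11, 0.0, 0.0], [0.0, -nu * e11, 0.0], [0.0, 0.0, -nu * e11]]
sig_uni = thermoelastic_stress(eps_uni, theta0, theta0, E, nu, alpha0)

# Free thermal expansion: the strain equals the thermal strain, the stress vanishes.
theta = theta0 + 50.0
th = alpha0 * (theta - theta0)
eps_free = [[th if i == j else 0.0 for j in range(3)] for i in range(3)]
sig_free = thermoelastic_stress(eps_free, theta, theta0, E, nu, alpha0)
print(sig_uni[0][0], sig_free[0][0])
```

The two load cases serve as sanity checks: the first one should return an axial stress of \(E \varepsilon _{11}\) with vanishing lateral stresses, and the second one should be stress-free.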

As the material is purely elastic, the dissipation potential vanishes identically, i.e., \(\phi (\theta ) \equiv 0\). Thus, the thermomechanical coupling term

$$\begin{aligned} D({\varvec{\varepsilon }}, \theta )&= \theta \displaystyle \frac{\partial ^2 \psi }{\partial \theta \partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}, \theta ) : \dot{{\varvec{\varepsilon }}} = - \theta \, \dot{{\varvec{\varepsilon }}} : {\mathbb { C}}\left[ {\varvec{\alpha }}\right] \end{aligned}$$
(4.10)

is driven solely by the strain rate \(\dot{{\varvec{\varepsilon }}}\) because the dissipation vanishes, i.e., \(\mathcal {D}\equiv 0\) holds. In fact, a non-vanishing strain rate causes self-cooling under hydrostatic extension and self-heating under hydrostatic compression. This effect is commonly referred to as the Gough-Joule effect, see, e.g., Section 96 in Truesdell-Noll [62]. The material parameters for the E-glass fibers are taken from Tikarrouchine et al. [16] and summarized in Table 1.
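The sign of the coupling term (4.10) can be checked with a short sketch. For the isotropic case (4.7), \({\mathbb { C}}[{\varvec{\alpha }}] = 3 K \alpha _0 \mathbf {1}\) holds, so that \(D = -3 K \alpha _0 \, \theta \, \mathrm{tr}(\dot{{\varvec{\varepsilon }}})\). The function name and the numerical values below are illustrative placeholders and are not the parameters of Table 1.

```python
def coupling_term(theta, eps_rate, E, nu, alpha0):
    """D = -theta * eps_rate : C[alpha] = -3 K alpha0 theta tr(eps_rate), cf. (4.10)."""
    K = E / (3.0 * (1.0 - 2.0 * nu))
    trace_rate = eps_rate[0][0] + eps_rate[1][1] + eps_rate[2][2]
    return -3.0 * K * alpha0 * theta * trace_rate


# Illustrative placeholder values
E, nu, alpha0, theta = 72000.0, 0.22, 5e-6, 293.15

rate = 1e-3
extension = [[rate, 0.0, 0.0], [0.0, rate, 0.0], [0.0, 0.0, rate]]
compression = [[-rate, 0.0, 0.0], [0.0, -rate, 0.0], [0.0, 0.0, -rate]]
shear = [[0.0, rate, 0.0], [rate, 0.0, 0.0], [0.0, 0.0, 0.0]]

print("extension:  ", coupling_term(theta, extension, E, nu, alpha0))
print("compression:", coupling_term(theta, compression, E, nu, alpha0))
print("shear:      ", coupling_term(theta, shear, E, nu, alpha0))
```

For a positive coefficient of thermal expansion, the coupling term acts as a heat sink under hydrostatic extension and as a heat source under hydrostatic compression, whereas an isochoric strain rate does not contribute.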

Table 1 Material parameters of the E-glass fibers [16]

Polyamide 6.6 matrix

For modeling the material behavior of the PA66 matrix, we adapt the model proposed by Krairi and co-workers [63], which was specifically derived for thermoplastic polymers under non-isothermal conditions. The model couples linear viscoelasticity, viscoplasticity and thermal effects such as thermal softening and dissipative self-heating. More precisely, the linear viscoelastic part of the model is given by a generalized Maxwell model comprising N Maxwell elements, and the viscoplastic part is governed by \(J_2\)-viscoplasticity. We refer to Krairi et al. [63] for all underlying modeling assumptions and the experimental calibration of the model.

For the PA66 matrix, we prescribe the following heat-storage related free energy

$$\begin{aligned} \psi _\text {heat}(\theta ) = c_0 \left[ (\theta - \theta _0) - \theta \ln \left( \frac{\theta }{\theta _0}\right) \right] . \end{aligned}$$
(4.11)

Furthermore, the mechanical part of the Helmholtz free energy reads

$$\begin{aligned} \begin{aligned} \psi _\text {mech}({\varvec{\varepsilon }}, \theta , {{\varvec{ z}} })&= \frac{1}{2} ({\varvec{\varepsilon }}- {\varvec{\varepsilon }}_\text {vp}) : {\mathbb { C}}_{\infty } [{\varvec{\varepsilon }}- {\varvec{\varepsilon }}_\text {vp}] \\&\quad - ({\varvec{\varepsilon }}- {\varvec{\varepsilon }}_\text {vp}) : {\mathbb { C}}_{\infty } [{\varvec{\alpha }}(\theta - \theta _0)] \\&\quad + \int _0^{\varepsilon _\text {p}} H(\theta , {\bar{\varepsilon }}_\text {p}) \mathop {}\!\mathrm {d}{\bar{\varepsilon }}_\text {p}\\&\quad + \frac{1}{2} \sum _{i=1}^{N} ({\varvec{\varepsilon }}- {\varvec{\varepsilon }}_\text {vp} - {\varvec{\varepsilon }}_{\text {v},i}) : {\mathbb { C}}_i [{\varvec{\varepsilon }}- {\varvec{\varepsilon }}_\text {vp} - {\varvec{\varepsilon }}_{\text {v},i}]\\&\quad - \sum _{i=1}^{N} ({\varvec{\varepsilon }}- {\varvec{\varepsilon }}_\text {vp} - {\varvec{\varepsilon }}_{\text {v},i}) : {\mathbb { C}}_i [{\varvec{\alpha }}(\theta - \theta _0)].\\ \end{aligned} \end{aligned}$$
(4.12)

For readability, we collect the state variables, i.e., the accumulated plastic strain \(\varepsilon _\text {p}\), the viscoplastic strain \({\varvec{\varepsilon }}_\text {vp}\) and the viscoelastic strains \(\left\{ {\varvec{\varepsilon }}_{\text {v},i}\right\} _{i=1}^N\) into the state vector \({{\varvec{ z}} }= \left[ \varepsilon _\text {p}, {\varvec{\varepsilon }}_\text {vp}, {\varvec{\varepsilon }}_{\text {v}, 1}, \dots , {\varvec{\varepsilon }}_{\text {v}, N} \right] \in Z := {\mathbb { R}}_{\ge 0} \times \mathrm{Dev }( d ) \times \mathrm{Sym }( d )^{\times ^N}\).

With the Helmholtz free energy (4.12) at hand, the material’s stress response computes to

$$\begin{aligned} {\varvec{\sigma }}= & {} {\mathbb { C}}_{\infty } [{\varvec{\varepsilon }}- {\varvec{\varepsilon }}_\text {vp} - {\varvec{\alpha }}(\theta - \theta _0)]\nonumber \\&+ \sum _{i=1}^{N} {\mathbb { C}}_i [{\varvec{\varepsilon }}- {\varvec{\varepsilon }}_\text {vp} - {\varvec{\varepsilon }}_{\text {v},i} - {\varvec{\alpha }}(\theta - \theta _0)]. \end{aligned}$$
(4.13)

For a fixed viscoplastic strain \({\varvec{\varepsilon }}_{\text {vp}}\), we assume the material to be linear and isotropic, both in its long-term elastic and its purely viscoelastic response, and to feature an isotropic thermal expansion. More precisely, the stiffness governing infinitely slow processes \({\mathbb { C}}_{\infty }\), the stiffness \({\mathbb { C}}_i\) associated to the i-th dashpot and the coefficient of thermal expansion \({\varvec{\alpha }}\) admit the representations

$$\begin{aligned} {\mathbb { C}}_{\infty }= & {} 3 K_{\infty } {\mathbb { P}}_1 + 2 G_{\infty } {\mathbb { P}}_2, \quad {\mathbb { C}}_i = 3 K_i {\mathbb { P}}_1 + 2 G_i {\mathbb { P}}_2 \quad \text {and} \quad \nonumber \\ {\varvec{\alpha }}= & {} \alpha _0 \mathbf {1}. \end{aligned}$$
(4.14)

The bulk \(K_{\infty }\), \(K_{i}\) and shear moduli \(G_{\infty }\), \(G_{i}\) are expressed in terms of the Young’s moduli \(E_{\infty }\), \(E_i\) and the Poisson’s ratio \(\nu \), i.e., the following relations

$$\begin{aligned} K_{\infty }= & {} \frac{E_{\infty }}{3(1 - 2 \nu )}, \quad K_i = \frac{E_i}{3(1 - 2 \nu )}, \quad \nonumber \\ G_{\infty }= & {} \frac{E_{\infty }}{2(1 + \nu )} \quad \text {and} \quad G_i = \frac{E_i}{2(1 + \nu )} \end{aligned}$$
(4.15)

hold. Indeed, for the model at hand, the bulk and shear moduli \(K_i\) and \(G_i\) are coupled due to an assumed constant Poisson’s ratio \(\nu \), see Krairi et al. [63]. Such an assumption is not unusual if only experimental data from uniaxial experiments are available.
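To make this coupling explicit, relation (4.15) may be sketched in a few lines of Python. The function name and the numerical values below are purely illustrative assumptions and are not taken from Table 2.

```python
def bulk_and_shear_moduli(young_modulus, poisson_ratio):
    """Convert (E, nu) into (K, G) following relation (4.15)."""
    if not -1.0 < poisson_ratio < 0.5:
        raise ValueError("Poisson's ratio must lie in (-1, 0.5)")
    bulk = young_modulus / (3.0 * (1.0 - 2.0 * poisson_ratio))
    shear = young_modulus / (2.0 * (1.0 + poisson_ratio))
    return bulk, shear


# Hypothetical Young's moduli (in MPa) of two Maxwell branches
# sharing a single, hypothetical Poisson's ratio
nu = 0.4
K_a, G_a = bulk_and_shear_moduli(2000.0, nu)
K_b, G_b = bulk_and_shear_moduli(500.0, nu)
```

With a shared Poisson's ratio, the ratio \(K_i / G_i = 2(1+\nu ) / (3(1-2\nu ))\) is identical for all branches, which is the coupling mentioned above.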

Concerning the thermo-viscoelastic behavior, we assume the PA66 to be thermorheologically simple, i.e., the viscosity tensor \({\mathbb { V}}_i\) associated to the i-th dashpot of the generalized Maxwell model should have the form

$$\begin{aligned} {\mathbb { V}}_i = a_\theta (\theta ) \left( 3 K_i \, \tau _{\text {K},i} \, {\mathbb { P}}_1 + 2 G_i \tau _{\text {G},i} \, {\mathbb { P}}_2\right) , \end{aligned}$$
(4.16)

where \(a_{\theta }: {\mathbb { R}}_{> 0} \rightarrow {\mathbb { R}}_{> 0}\) denotes a temperature-dependent shift function. The volumetric and deviatoric relaxation times

$$\begin{aligned} \tau _{\text {K},i} = \frac{\tau _i E_i}{K_i} \quad \text {and} \quad \tau _{\text {G},i} = \frac{\tau _i E_i}{G_i} \end{aligned}$$
(4.17)

are expressed in terms of the Young’s modulus \(E_i\), the bulk and shear moduli \(K_i\) and \(G_i\) and the relaxation time \(\tau _i\). The fluidity tensor \({\mathbb { F}}_i\) is given by the pseudoinverse of the viscosity tensor \({\mathbb { F}}_i = {\mathbb { V}}_i^{\dagger }\), giving rise to the evolution equation for the viscous strain

$$\begin{aligned} \dot{{\varvec{\varepsilon }}}_{\text {v},i} = {\mathbb { F}}_i \left[ {\varvec{\sigma }}_{\text {v},i} \right] , \end{aligned}$$
(4.18)

where \({\varvec{\sigma }}_{\text {v},i}\) denotes the (viscous) partial stress

$$\begin{aligned} {\varvec{\sigma }}_{\text {v},i} = {\mathbb { C}}_i [{\varvec{\varepsilon }}- {\varvec{\varepsilon }}_\text {vp} - {\varvec{\varepsilon }}_{\text {v},i} - {\varvec{\alpha }}(\theta - \theta _0)] \end{aligned}$$
(4.19)

of the i-th dashpot. As we consider temperatures above the glass transition, the temperature-dependent shift function is assumed to obey the Williams-Landel-Ferry (WLF) [64] equation

$$\begin{aligned} \log _{10}(a_\theta (\theta )) = -\frac{C_1 (\theta - \theta _{\text {ref}})}{C_2 + (\theta - \theta _{\text {ref}})}. \end{aligned}$$
(4.20)
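As a minimal sketch, the WLF relation (4.20) can be evaluated as follows. The function name is hypothetical, and the constants in the example are the frequently quoted "universal" WLF values, used here only as placeholders for the parameters of Table 2.

```python
def wlf_shift_factor(theta, theta_ref, c1, c2):
    """Evaluate a_theta from log10(a_theta) = -C1 (theta - theta_ref) / (C2 + theta - theta_ref)."""
    delta = theta - theta_ref
    if c2 + delta <= 0.0:
        raise ValueError("WLF equation is singular or undefined for this temperature")
    return 10.0 ** (-c1 * delta / (c2 + delta))


# Placeholder constants (assumption, not the PA66 values of Table 2)
C1, C2 = 17.44, 51.6
shift_at_reference = wlf_shift_factor(300.0, 300.0, C1, C2)
shift_above_reference = wlf_shift_factor(310.0, 300.0, C1, C2)
```

At the reference temperature, the shift factor equals one. For temperatures above the reference, it drops below one, i.e., the viscosities in (4.16) decrease and relaxation accelerates.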

To capture thermal softening of the material, the yield stress

$$\begin{aligned} \sigma _\text {Y}: {\mathbb { R}}_{> 0} \rightarrow {\mathbb { R}}_{> 0}, \quad \theta \mapsto \Gamma (\theta , \beta _1) \, \sigma _\text {Y0}, \end{aligned}$$
(4.21)

and the power-law hardening

$$\begin{aligned} H: {\mathbb { R}}_{> 0} \times {\mathbb { R}}_{\ge 0} \rightarrow {\mathbb { R}}_{\ge 0}, \quad (\theta , \varepsilon _\text {p}) \mapsto \Gamma (\theta , \beta _1) \, k \, \varepsilon _\text {p}^n,\nonumber \\ \end{aligned}$$
(4.22)

feature an explicit temperature-dependence. The temperature-degradation function

$$\begin{aligned} \Gamma : {\mathbb { R}}_{> 0} \times {\mathbb { R}}_{\ge 0} \rightarrow {\mathbb { R}}_{> 0}, \quad (\theta , \beta ) \mapsto \exp (- \beta (\theta - \theta _{\text {ref}})),\nonumber \\ \end{aligned}$$
(4.23)

takes the temperature and the material parameter \(\beta _1 \in {\mathbb { R}}_{\ge 0}\) as input and degrades both the yield stress and the isotropic hardening w.r.t. the temperature. As in classical \(J_2\)-viscoplasticity, the evolution of the viscoplastic strain

$$\begin{aligned} \dot{{\varvec{\varepsilon }}}_{\text {vp}} = \sqrt{\frac{3}{2}} \dot{\varepsilon }_\text {p} \frac{{\varvec{\sigma }}'}{\Vert {\varvec{\sigma }}' \Vert } \end{aligned}$$
(4.24)

is driven by the deviatoric part of the stress tensor \({\varvec{\sigma }}'\). The accumulated plastic strain rate \(\dot{\varepsilon }_\text {p}\) is given by the following evolution equation

$$\begin{aligned} \dot{\varepsilon }_\text {p} = \frac{\sigma _\text {Y}(\theta )}{\eta (\theta )} \left\langle \frac{ \sqrt{\frac{3}{2}} \Vert {\varvec{\sigma }}' \Vert - \sigma _\text {Y}(\theta ) - H(\theta , \varepsilon _\text {p}) }{\sigma _\text {Y}(\theta )}\right\rangle _{+}^m, \end{aligned}$$
(4.25)

see Krairi et al. [63], where the reference viscosity

$$\begin{aligned} \eta : {\mathbb { R}}_{> 0} \rightarrow {\mathbb { R}}_{> 0}, \quad \theta \mapsto \Gamma (\theta , \beta _2) \, \eta _0, \end{aligned}$$
(4.26)

involves a temperature-dependence as well.
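The interplay of (4.21)–(4.23), (4.25) and (4.26) may be illustrated by the following sketch. It takes the equivalent stress \(\sqrt{3/2} \, \Vert {\varvec{\sigma }}' \Vert \) as a scalar input. All names and parameter values are hypothetical and do not correspond to Table 2.

```python
import math


def degradation(theta, beta, theta_ref):
    """Temperature-degradation function Gamma, cf. (4.23)."""
    return math.exp(-beta * (theta - theta_ref))


def accumulated_plastic_strain_rate(sigma_eq, theta, eps_p, *, sigma_y0, k, n,
                                    eta0, m, beta1, beta2, theta_ref):
    """Rate of the accumulated plastic strain, cf. (4.25).

    sigma_eq is the equivalent stress sqrt(3/2) * ||dev(sigma)||.
    """
    sigma_y = degradation(theta, beta1, theta_ref) * sigma_y0          # (4.21)
    hardening = degradation(theta, beta1, theta_ref) * k * eps_p ** n  # (4.22)
    eta = degradation(theta, beta2, theta_ref) * eta0                  # (4.26)
    overstress = (sigma_eq - sigma_y - hardening) / sigma_y
    # Macaulay bracket: no viscoplastic flow below the current yield limit
    return sigma_y / eta * max(overstress, 0.0) ** m


# Hypothetical parameters (stresses in MPa, temperatures in K)
params = dict(sigma_y0=30.0, k=100.0, n=0.4, eta0=1000.0, m=2.0,
              beta1=0.01, beta2=0.01, theta_ref=293.15)
rate_elastic = accumulated_plastic_strain_rate(20.0, 293.15, 0.0, **params)
rate_plastic = accumulated_plastic_strain_rate(45.0, 293.15, 0.0, **params)
```

Below the yield limit, the rate vanishes. Above it, the rate grows with the normalized overstress raised to the power \(m\).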

In addition to the Helmholtz free energy, the material’s (extended-valued) dissipation potential takes the following form

$$\begin{aligned} \phi (\theta , \dot{{{\varvec{ z}} }}) = \left\{ \begin{array}{rl} \sigma _\text {Y}(\theta ) \, \dot{\varepsilon }_\text {p} + \sum _{i=1}^{N} {\varvec{\sigma }}_{\text {v},i} : \dot{{\varvec{\varepsilon }}}_{\text {v},i}, &{} \dot{\varepsilon }_\text {p} = \sqrt{\frac{2}{3}} \Vert \dot{{\varvec{\varepsilon }}}_{\text {vp}} \Vert ,\\ +\infty , &{} \text {otherwise}.\\ \end{array} \right. \nonumber \\ \end{aligned}$$
(4.27)

For the material at hand, the thermomechanical coupling term D computes to

$$\begin{aligned} \begin{aligned} D({\varvec{\varepsilon }}, \theta , {{\varvec{ z}} }) =&\, \theta \displaystyle \frac{\partial ^2 \psi }{\partial \theta \partial {\varvec{\varepsilon }}}({\varvec{\varepsilon }}, \theta , {{\varvec{ z}} }) : \dot{{\varvec{\varepsilon }}} \\&+ \theta \displaystyle \frac{\partial ^2 \psi }{\partial \theta \partial {{\varvec{ z}} }}({\varvec{\varepsilon }}, \theta , {{\varvec{ z}} }) \cdot \dot{{{\varvec{ z}} }}- \displaystyle \frac{\partial \psi }{\partial {{\varvec{ z}} }}({\varvec{\varepsilon }}, \theta , {{\varvec{ z}} }) \cdot \dot{{{\varvec{ z}} }}\\ =&- \theta \, (\dot{{\varvec{\varepsilon }}} - \dot{{\varvec{\varepsilon }}}_\text {vp}) : {\mathbb { C}}_{\infty } \left[ {\varvec{\alpha }}\right] \\&- \theta \sum _{i=1}^{N} (\dot{{\varvec{\varepsilon }}} - \dot{{\varvec{\varepsilon }}}_\text {vp} - \dot{{\varvec{\varepsilon }}}_{\text {v},i}) : {\mathbb { C}}_i \left[ {\varvec{\alpha }}\right] \\&+ \theta \frac{\partial H}{\partial \theta }(\theta , \varepsilon _\text {p}) \, \dot{\varepsilon }_\text {p}\\&+ \sigma _\text {Y}(\theta ) \, \dot{\varepsilon }_\text {p} + \sum _{i=1}^{N} {\varvec{\sigma }}_{\text {v}, i} : \dot{{\varvec{\varepsilon }}}_{\text {v}, i}, \end{aligned} \end{aligned}$$
(4.28)

which is composed of three independent parts. The first two terms are responsible for the Joule-Gough effect. The third term is related to the thermal softening and the last two terms, i.e., the dissipation

$$\begin{aligned} \mathcal {D}({\varvec{\varepsilon }}, \theta , {{\varvec{ z}} }) = \sigma _\text {Y}(\theta ) \, \dot{\varepsilon }_\text {p} + \sum _{i=1}^{N} {\varvec{\sigma }}_{\text {v}, i} : \dot{{\varvec{\varepsilon }}}_{\text {v}, i}, \end{aligned}$$
(4.29)

comprise the dissipated energy due to viscoplastic and viscoelastic flow. This dissipation is responsible for the self-heating of the material under viscoelastic or viscoplastic deformations. The full set of material parameters for the PA66, involving \(N=12\) Maxwell elements, is summarized in Table 2.

Table 2 Material parameters of the PA66 [63]

5 Identifying a DMN surrogate model using FFT-based computational homogenization

This section is dedicated to the identification of the DMN surrogate model. First, we consider the sampling of the linear elastic training data. To this end, we start by identifying both the necessary resolution and the size of the representative volume element (RVE). Secondly, we present the offline training of the DMN and the validation of the surrogate model for thermomechanically coupled inelastic computations on the microscopic scale. For all numerical computations, we rely on a workstation equipped with two AMD EPYC 7642 with 48 physical cores each and \({1\,024}\) GB of DRAM.

5.1 Material sampling

We start with the sampling of tuples of linear elastic input stiffnesses \(\left( {\mathbb { C}}^s_1, {\mathbb { C}}^s_2 \right) \). Indeed, there is some freedom in selecting appropriate sampling strategies. For example, Liu and coworkers [51, 52] and Gajek et al. [53] sampled orthotropic stiffnesses. In the work at hand, we follow Gajek et al. [54] who proposed to draw the samples from the space of possible algorithmic tangents occurring during the online evaluation. More precisely, the input stiffnesses \({\mathbb { C}}^s_1\), corresponding to the isotropic, purely thermoelastic glass fibers, are sampled from the set of isotropic stiffnesses, i.e., we use a parameterization

$$\begin{aligned} {\mathbb { C}}^s_1 = 3 K^s_1 \, {\mathbb { P}}_1 + 2 G^s_1 \, {\mathbb { P}}_2. \end{aligned}$$
(5.1)

As the polyamide matrix features a thermo-viscoelastic, viscoplastic material behavior, the samples \({\mathbb { C}}^s_2\) are assumed to be isotropic minus a rank-one perturbation, see Chapter 3 in Simo-Hughes [65]. Thus, the stiffness \({\mathbb { C}}^s_2\) is assumed to have the form

$$\begin{aligned} {\mathbb { C}}^s_2 = 3 K^s_2 \, {\mathbb { P}}_1 + 2 G^s_2 \left( {\mathbb { P}}_2 - a_s \, {{\varvec{ N}} }'_s \otimes {{\varvec{ N}} }'_s\right) , \end{aligned}$$
(5.2)

where \({{\varvec{ N}} }'_s \in {\mathcal {N}} := \left\{ {{\varvec{ N}} }\in \mathrm{Sym }( d ) \ | \ \mathrm{tr }\left( {{\varvec{ N}} } \right) = 0, \ \left\| \, {{\varvec{ N}} } \, \right\| _\text {F}=1\right\} \) is normalized and deviatoric. In other words, the set of all considered positive definite stiffness tuples \(\left( {\mathbb { C}}^s_1, {\mathbb { C}}^s_2\right) \) may be parameterized via

$$\begin{aligned}&\left( K^s_1, G^s_1, K^s_2, G^s_2, a_s, {{\varvec{ N}} }'_s \right) \in {\mathbb { R}}_{>0} \times {\mathbb { R}}_{>0} \nonumber \\&\qquad \times {\mathbb { R}}_{>0} \times {\mathbb { R}}_{>0} \times \left[ 0, 1\right) \times {\mathcal {N}}. \end{aligned}$$
(5.3)

The former set is given in terms of an eight-dimensional continuum. For more details on sampling \(\Big (K^s_1, G^s_1, K^s_2, G^s_2, a_s, {{\varvec{ N}} }'_s \Big )\) from this eight-dimensional space and assembling the stiffnesses \(\left( {\mathbb { C}}^s_1, {\mathbb { C}}^s_2 \right) \), we refer to Section 4.4 in Gajek et al. [54].

In the following, we assume that \(N_\text {s} = 1000\) tuples of input stiffnesses \(\left\{ \left( {\mathbb { C}}^s_1, {\mathbb { C}}^s_2 \right) \right\} ^{N_s}_{s=1}\) were generated. With these samples at hand, we turn our attention to the computation of the associated effective stiffnesses. For this purpose, a representative volume element (RVE) with a suitable resolution and size needs to be generated first. To this end, we take a closer look at the sampled input stiffnesses. More precisely, we consider the distribution of the material contrast \(\mu \) which is defined, for the sample s, as

$$\begin{aligned} \mu ^s = \max \left( \frac{\lambda ^s_{1, \text {max}}}{\lambda ^s_{2, \text {min}}}, \frac{\lambda ^s_{2, \text {max}}}{\lambda ^s_{1, \text {min}}} \right) . \end{aligned}$$
(5.4)

Here, \(\lambda ^s_{1/2, \text {max}}\) and \(\lambda ^s_{1/2, \text {min}}\) denote the largest and smallest eigenvalues of stiffnesses \({\mathbb { C}}^s_1\) and \({\mathbb { C}}^s_2\), respectively. Figure 3a illustrates the sorted material contrast vs. the \({1\,000}\) samples. We observe that the material contrast starts at around two and goes up to around \({23\,000}\). To get a better understanding of how the material contrast is distributed on the sample set, Fig. 3b shows the respective histogram with 50 evenly log-spaced bins. We observe that the median of the distribution is well below a material contrast of 100 and that only \(3 \%\) of the samples exceed a material contrast of \({1\,000}\). In the following section, we consider finding a suitable resolution and size of the volume element, taking into account the findings of this section.
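For the parameterizations (5.1) and (5.2), the eigenvalues entering the material contrast (5.4) are available in closed form: \({\mathbb { C}}^s_1\) has the eigenvalues \(3 K^s_1\) and \(2 G^s_1\), whereas \({\mathbb { C}}^s_2\) has the eigenvalues \(3 K^s_2\), \(2 G^s_2\) and \(2 G^s_2 (1 - a_s)\), the latter belonging to the direction \({{\varvec{ N}} }'_s\). The following sketch, with a hypothetical function name and illustrative inputs, evaluates (5.4) on this basis.

```python
def material_contrast(K1, G1, K2, G2, a):
    """Material contrast (5.4) for the stiffness pair defined by (5.1) and (5.2)."""
    if min(K1, G1, K2, G2) <= 0.0 or not 0.0 <= a < 1.0:
        raise ValueError("parameters must lie in the admissible set (5.3)")
    eig_1 = (3.0 * K1, 2.0 * G1)
    # The rank-one perturbation lowers a single deviatoric eigenvalue to 2 G2 (1 - a)
    eig_2 = (3.0 * K2, 2.0 * G2, 2.0 * G2 * (1.0 - a))
    return max(max(eig_1) / min(eig_2), max(eig_2) / min(eig_1))


# Illustrative (non-physical) inputs
contrast_unperturbed = material_contrast(1.0, 0.5, 1.0, 0.5, 0.0)
contrast_perturbed = material_contrast(1.0, 0.5, 1.0, 0.5, 0.99)
```

The second call illustrates how a value of \(a_s\) close to one, i.e., an almost singular tangent, inflates the material contrast.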

5.2 On the necessary resolution and the size of the RVE

Finding a suitable resolution and RVE size is necessary to obtain accurate effective properties. However, performing a resolution and RVE size study for every tuple of input stiffnesses \(\left( {\mathbb { C}}^s_1, {\mathbb { C}}^s_2 \right) \) is computationally expensive. This is especially relevant for samples with a high material contrast, i.e., greater than 1000, which only occur with a small frequency in the sample set. For this reason, we conduct a resolution and RVE size study for selected samples alone. To be more precise, we choose samples from the sampling set \(\left\{ \left( {\mathbb { C}}^s_1, {\mathbb { C}}^s_2 \right) \right\} ^{N_\text {s}}_{s=1}\) corresponding to the 70th, 80th, 90th and 95th percentile, i.e., samples with a material contrast of \(\mu = 160\), \(\mu = 274\), \(\mu = 470\) and \(\mu = 764\), respectively, see Fig. 3a for an illustration and color coding.

For a start, we consider generated cubic microstructures with a variable resolution and with a fixed edge length of \(L=384~\mu \)m, i.e., roughly twice the fiber length of \(L_\text {f} = 200~{\mu }\)m. We vary the resolution from 3.3 to 13.3 voxels per fiber diameter in equidistant steps. The former corresponds to volume element discretizations with \(128^3\) to \(512^3\) voxels. We choose a resolution of 20 voxels per fiber diameter, i.e., discretized by \(768^3\) voxels, as reference.
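The quoted resolutions follow from simple arithmetic, sketched below. The fiber diameter of \(10~\mu \)m is not stated in this passage. It is inferred from the quoted voxel counts and should be read as an assumption.

```python
def voxels_per_fiber_diameter(voxels_per_edge, edge_length, fiber_diameter):
    """Number of voxels resolving one fiber diameter for a cubic volume element."""
    return voxels_per_edge * fiber_diameter / edge_length


EDGE_LENGTH = 384.0     # micrometers, as stated in the text
FIBER_DIAMETER = 10.0   # micrometers, inferred from the quoted numbers (assumption)

resolutions = {
    n: voxels_per_fiber_diameter(n, EDGE_LENGTH, FIBER_DIAMETER)
    for n in (128, 384, 512, 768)
}
```

Under this assumption, \(128^3\), \(512^3\) and \(768^3\) voxels correspond to roughly 3.3, 13.3 and 20 voxels per fiber diameter, respectively.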

Fig. 3 Distribution of the material contrast in the sample set

Fig. 4 Study to determine necessary resolution and size of the RVE. Shown is the relative error of the computed effective stiffness vs. the resolution and volume element size

For generating the volume elements, we rely upon the Sequential Addition and Migration (SAM) [66] method, using the exact closure approximation [67]. The SAM method takes the fiber length \(L_\text {f}\), the fiber diameter \(D_\text {f}\), the fiber volume fraction \(c_\text {f}\) and the (axis-aligned) fiber orientation tensor \({{\varvec{ A}} }\) as inputs and generates short fiber reinforced microstructure realizations. The effective stiffnesses are computed with the help of an FFT-based computational micromechanics code [17, 18] using a conjugate gradient solver [68, 69] and the staggered grid discretization [70, 71].

Figure 4a shows the relative error of the effective stiffness, measured in the Frobenius norm of the corresponding Voigt matrices. For the crudest resolution of 3.33 voxels per fiber diameter, the relative error exceeds \(10\%\). Increasing the resolution decreases the relative error for the four considered material contrasts. At a resolution of ten voxels per fiber diameter, the relative error of the sample corresponding to the 95th percentile falls below \(3 \%\). For the samples corresponding to the 90th and the 80th/70th percentiles, the relative error is below \(2 \%\) and \(1 \%\), respectively. As material contrasts of \(\mu = 764\) and above only occur with a frequency of less than \(5 \%\), we consider the resolution of 10 voxels per fiber diameter as sufficient. We fix this resolution and focus on finding a suitable size of the RVE.

We investigate volume elements with edge length L ranging from 0.96 up to 3.84 fiber lengths. This range corresponds to volume element discretizations with \(192^3\) up to \(768^3\) voxels. To obtain the reference, we generate a volume element with an edge length of 5.76 fiber lengths, discretized by \(1152^3\) voxels. As before, we consider the relative error in the effective stiffness as error measure. For all considered edge lengths, the relative error is well below \(1 \%\), see Fig. 4b. Indeed, even for the smallest volume element, i.e., for an edge length smaller than the fiber length, the relative error is below \(0.5\%\). Increasing the volume element edge length from 0.96 to 3.84 fiber lengths further decreases the error. In the work at hand, we consider a volume element edge length of 1.92 fiber lengths, i.e., an edge length of \(L = 384~\mu \)m, as sufficient to keep the computational costs for the sampling of the training data reasonable.

With the optimal resolution and RVE size at hand, i.e., a resolution of 10 voxels per fiber diameter and a volume element discretization with \(384^3\) voxels, we compute the effective stiffnesses of all generated \(N_\text {s} = 1000\) stiffness samples and turn our attention to the training of the DMN.

5.3 Offline training

As explained in Sect. 3.2, we implement the offline training in PyTorch [72], exploiting the framework’s automatic differentiation capabilities, see Gajek et al. [53, 54] for more details. From previous works [51,52,53,54], we know that at least eight layers are necessary to achieve a sufficient approximation quality for inelastic computations, i.e., during the online evaluation. For this reason, we restrict ourselves to a two-phase DMN with \(K = 8\) layers, i.e., 255 individual directions of lamination and 256 weights as free parameters.

Fig. 5 Loss function (a) and mean training and validation error (b) for the 3000 training epochs

We randomly split the training data \(\left\{ \left( \bar{{\mathbb { C}}}^s, {\mathbb { C}}^s_1, {\mathbb { C}}^s_2 \right) \right\} ^{N_s}_{s=1}\) into a training and a validation set, comprising \(90 \%\) and \(10 \%\) of the samples, respectively. The DMN is trained with mini batches with a batch size of \(N_\text {b} = 32\) samples, which are drawn randomly from the training set. Batches with fewer than 32 samples are discarded. Prior to the offline training, we sample the unknown directions of lamination \({{\varvec{ n}} }^i_k\) from a uniform distribution on the unit sphere and the initial weights \(w^i_{K+1}\) are sampled from a uniform distribution on \(\left[ 0, 1\right] \) and subsequently rescaled to sum to unity.
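The initialization described above may be sketched as follows, using only the Python standard library. Normalizing Gaussian vectors is one standard way to obtain uniformly distributed points on the unit sphere. The function names, the seed and the data layout are assumptions and do not reflect the actual PyTorch implementation.

```python
import math
import random


def sample_unit_vector(rng):
    """Draw a direction uniformly from the unit sphere by normalizing a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:  # guard against a (practically impossible) zero vector
            return [c / norm for c in v]


def initialize_dmn(depth, seed=0):
    """Initial directions of lamination and weights for a two-phase DMN of given depth."""
    rng = random.Random(seed)
    directions = [sample_unit_vector(rng) for _ in range(2 ** depth - 1)]
    raw_weights = [rng.random() for _ in range(2 ** depth)]
    total = sum(raw_weights)
    weights = [w / total for w in raw_weights]  # rescale to sum to unity
    return directions, weights


directions, weights = initialize_dmn(depth=8)

# 90 % of the 1000 samples form the training set; incomplete batches are dropped
n_train = 900
n_full_batches = n_train // 32
```

For \(K = 8\), this yields 255 directions and 256 weights. With 900 training samples and a batch size of 32, 28 full batches are processed per epoch and the remaining four samples are discarded.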

For training the DMN, we rely on the AMSGrad method [73, 74] and determine appropriate learning rates \(\alpha _{\vec {n}}\) and \(\alpha _{{\vec {v}}}\) by a learning rate sweep as suggested by Smith-Topin [75]. The learning rate sweep yields almost identical learning rates, i.e., \(\alpha _{\vec {n}} = \alpha _{{\vec {v}}} = 1.5 \cdot 10^{-2}\). To aid in finding a suitable minimizer for J (3.7), we employ the warm restart technique as suggested by Loshchilov-Hutter [76]. The warm restarts are realized by a harmonic learning rate modulation

$$\begin{aligned}&\alpha : {\mathbb { N}}\rightarrow {\mathbb { R}}, \quad m \mapsto \gamma ^m\left( \alpha _\text {min} + \frac{1}{2}\left( \alpha _\text {max} - \alpha _\text {min} \right) \right. \nonumber \\&\left. \quad \left( 1 + \cos \left( \pi \frac{m}{M}\right) \right) \right) \end{aligned}$$
(5.5)

of the learning rates \(\alpha _{\vec {n}}\) and \(\alpha _{{\vec {v}}}\) in combination with a geometric decay to enforce convergence. Here, \(\alpha _\text {min}\) and \(\alpha _\text {max}\) denote the minimum and maximum learning rate, 2M corresponds to the period of the modulation, and \(\gamma \) represents the geometric decay rate. The maximum learning rate, both for \(\alpha _{\vec {n}}\) and \(\alpha _{{\vec {v}}}\), is set to \(\alpha _\text {max} = 1.5 \cdot 10^{-2}\), i.e., the result of the learning rate sweep. The minimum learning rate is set to \(\alpha _\text {min} = 1.5 \cdot 10^{-3}\), i.e., one order of magnitude smaller. In addition, we choose \(M = 50\) as well as \(\gamma = 0.999\) for the learning rate modulation (5.5) and set \(p = 1\), \(q = 10\) and \(\lambda = 10^3\) for the loss function (3.7).
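The modulation (5.5) with the parameters stated above can be written as a plain function of the epoch index \(m\). The following sketch uses a hypothetical function name and is independent of any particular scheduler class of a deep learning framework.

```python
import math


def learning_rate(m, alpha_min=1.5e-3, alpha_max=1.5e-2, M=50, gamma=0.999):
    """Harmonic learning rate modulation with geometric decay, cf. (5.5)."""
    modulation = alpha_min + 0.5 * (alpha_max - alpha_min) * (1.0 + math.cos(math.pi * m / M))
    return gamma ** m * modulation


# Learning rates for the 3000 training epochs
schedule = [learning_rate(m) for m in range(3000)]
```

The schedule starts at \(\alpha _\text {max}\), reaches \(\gamma ^{M} \alpha _\text {min}\) after \(M\) epochs and returns to \(\gamma ^{2M} \alpha _\text {max}\) after a full period of \(2M\) epochs.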

We measure the accuracy of the fit by the mean error

$$\begin{aligned} e_\text {mean} = \frac{1}{N_s} \sum _{s=1}^{N_s} \frac{\left\| \, \mathcal {DMN}^{\mathcal {L}}_{}\left( {\mathbb { C}}^s_1, {\mathbb { C}}^s_2, \vec {{{\varvec{ n}} }}, \vec {w}\right) - \bar{{\mathbb { C}}}^s \, \right\| _1}{\left\| \, \bar{{\mathbb { C}}}^s \, \right\| _1}, \end{aligned}$$
(5.6)

where \(\Vert \cdot \Vert _1\) denotes the \(\ell ^1\)-norm of the components in (normalized) Voigt-Mandel notation, and \(N_\text {s}\) denotes the number of elements in the training or validation set, depending on the considered scenario.
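A minimal sketch of the error measure (5.6) is given below. It assumes that every stiffness is available as a nested list holding its components in (normalized) Voigt-Mandel notation. The function name and the small \(2 \times 2\) stand-in matrices are illustrative only.

```python
def mean_relative_error(predicted, reference):
    """Mean relative l1-error between predicted and reference stiffnesses, cf. (5.6)."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need two non-empty sample lists of equal length")
    total = 0.0
    for c_pred, c_ref in zip(predicted, reference):
        diff = sum(abs(p - r)
                   for row_p, row_r in zip(c_pred, c_ref)
                   for p, r in zip(row_p, row_r))
        norm = sum(abs(r) for row in c_ref for r in row)
        total += diff / norm
    return total / len(reference)


# Toy data: 2x2 matrices standing in for the 6x6 Voigt-Mandel matrices
reference = [[[2.0, 0.0], [0.0, 2.0]]]
predicted = [[[2.2, 0.0], [0.0, 1.8]]]
error = mean_relative_error(predicted, reference)
```

In this toy example, the absolute deviations sum to 0.4 and the reference norm equals 4, i.e., the mean error is \(10 \%\).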

In Fig. 5, the training progress in terms of the loss J and the mean error \(e_\text {mean}\) is illustrated. Overall, the effect of the learning rate modulation becomes apparent. The loss as well as the mean training and validation error fluctuate heavily, especially for the first 500 epochs. The fluctuation decreases due to the learning rate decay such that in the last 500 epochs, convergence is ensured. During the training, no significant model over-fitting can be observed as the validation error does not increase noticeably during training.

5.4 Online validation

This section is concerned with validating the identified DMN surrogate model for the inelastic regime. To this end, we compare the DMN’s predicted effective stress \(\bar{{\varvec{\sigma }}}\), the associated effective dissipation \(\bar{\mathcal {D}}\) as well as the change of the absolute temperature \(\triangle \bar{\theta }= \bar{\theta }- \bar{\theta }_0\) to a high-fidelity full-field solution on the microscopic scale. To compute the reference solution, we use the implicit staggered solution scheme of Wicht et al. [23], an inexact Newton-CG [77] solver and the discretization by trigonometric polynomials as introduced by Moulinec-Suquet [17, 18].

First, to obtain accurate inelastic results, a suitable resolution and size of the RVE need to be determined. In Sect. 5.2, we learned that the RVE size has a minor influence on the effective elastic response of the composite. For this reason, we fix the volume element’s edge length to \(L = 384~{\mu }\)m and only vary the RVE’s resolution from 5 to 10 voxels per fiber diameter in equidistant steps. This range corresponds to volume element discretizations with \(192^3\) to \(384^3\) voxels, respectively. As loading, we consider a uniaxial extension in the principal fiber direction, i.e.,

$$\begin{aligned} \bar{{\varvec{\varepsilon }}}= \bar{\varepsilon }\, {{\varvec{ e}} }_1 \! \otimes {{\varvec{ e}} }_1, \end{aligned}$$
(5.7)

and use mixed boundary conditions [78], i.e., stress-free loading perpendicular to the loading direction. The strain loading is applied in 40 equidistant load steps with a strain rate of \(\dot{\bar{\varepsilon }} = 5 \cdot 10^{-4} {~\mathrm s^{-1}}\). The reference temperature is set to \(\bar{\theta }_0 = {293.15}\,{\mathrm{K}}\). For simplicity, we assume adiabatic conditions, as we consider a single macroscopic point without any additional macroscopic heat sources.

Fig. 6 Effective stress, temperature change and effective dissipation for the four considered resolutions and a uniaxial extension in the principal fiber direction

In Fig. 6, the computed effective stress \(\bar{{\varvec{\sigma }}}\), the change of the absolute temperature \(\triangle \bar{\theta }\) and the effective dissipation \(\bar{\mathcal {D}}\) are shown for all four considered resolutions. For a macrostrain of \(\bar{\varepsilon }= 1.0 \%\) and below, the Joule-Gough effect, i.e., an almost linear temperature decrease due to the hydrostatic extension, becomes apparent. This regime is captured well, even for the coarsest resolution. At around \(\bar{\varepsilon }= 1.0 \%\) macrostrain, the matrix starts to deform plastically. Due to the increasing dissipation, self-heating of the composite occurs and the four solutions start to deviate noticeably. Thus, to accurately capture self-heating effects, a resolution of at least 8.33 voxels per fiber diameter is necessary. Such a resolution suffices to accurately compute the effective stress and the effective dissipation as well. For this reason, we consider a resolution of 8.33 voxels per fiber diameter, i.e., a volume element discretization with \(320^3\) voxels, as sufficient for the inelastic computations.

With the identified resolution at hand, we turn back to the validation of the DMN surrogate model. For this purpose, we implemented the procedure introduced in Sect. 3.3 as an implicit user-material subroutine. A computationally efficient implementation of the UMAT is critical. For this reason, we use the binary tree compression as explained in Gajek et al. [54] and exploit the sparsity pattern of the gradient operator \({{\varvec{ A}} }\) and the Jacobians \(\partial \vec {{\varvec{\sigma }}} / \partial \vec {{\varvec{\varepsilon }}}\) and \(\partial \vec {D} / \partial \vec {{\varvec{\varepsilon }}}\). To this end, we rely upon the Eigen3 [79] library for all linear algebra operations. We set the tolerance for the convergence criterion to \(\text {tol} = 10^{-12}\) and solve the linear system with the help of a sparse Cholesky decomposition. This allows reusing the decomposition for computing the algorithmic tangents \({\mathbb { C}}^\text {algo}_{\bar{\varepsilon }}\), \({\mathbb { C}}^\text {algo}_{\bar{\theta }}\), \({\mathbb { D}}^\text {algo}_{\bar{\varepsilon }}\), \({\mathbb { D}}^\text {algo}_{\bar{\theta }}\) with minimal computational overhead, see Sect. 3.3.

Strain-controlled monotonic and non-monotonic virtual experiments

We first consider strain-controlled virtual experiments. Using the material parameters of Sect. 4, we investigate six monotonic uniaxial strain loadings

$$\begin{aligned} \bar{{\varvec{\varepsilon }}}= & {} \frac{\bar{\varepsilon }}{2} \left( {{\varvec{ e}} }_i \otimes {{\varvec{ e}} }_j + {{\varvec{ e}} }_j \otimes {{\varvec{ e}} }_i\right) \quad \text {with} \quad \nonumber \\ (i,j) \in L_1:= & {} \left\{ (1,1),(2,2),(3,3),(1,2),(1,3),(2,3)\right\} .\nonumber \\ \end{aligned}$$
(5.8)

For every uniaxial strain loading direction in the index set \(L_1\), a monotonic strain loading amplitude of \(\bar{\varepsilon }= 4.0\%\) is applied in 40 equidistant load steps. To capture the rate dependence of the polyamide matrix, we investigate four individual strain rates which are logarithmically spaced from \(\dot{\bar{\varepsilon }} = 5 \cdot 10^{-4} {~\mathrm s^{-1}}\) to \(\dot{\bar{\varepsilon }} = 5 \cdot 10^{-1} {~\mathrm s^{-1}}\).
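The resulting load step sizes follow directly from the amplitude, the strain rate and the number of load steps. A short sketch, assuming a constant strain rate within each virtual experiment and using hypothetical helper names, reads:

```python
def log_spaced_rates(first=5e-4, last=5e-1, count=4):
    """Logarithmically spaced strain rates (in 1/s) between first and last."""
    ratio = (last / first) ** (1.0 / (count - 1))
    return [first * ratio ** k for k in range(count)]


def time_increment(amplitude, rate, n_steps=40):
    """Time step size (in s) for reaching the strain amplitude in n_steps equidistant steps."""
    return amplitude / rate / n_steps


rates = log_spaced_rates()
increments = [time_increment(0.04, rate) for rate in rates]
```

The four strain rates are thus separated by one decade each, and the time increment ranges from \(2\,\)s for the slowest loading down to \(2 \cdot 10^{-3}\,\)s for the fastest one.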

To evaluate the approximation errors of the DMN in a quantitative way, we introduce the following error measures. For a load in direction (ij), we define the relative error in the effective stress component \(\bar{\sigma }_{ij}\), the change in absolute temperature \(\triangle \bar{\theta }\) and the effective dissipation \(\bar{\mathcal {D}}\) as

$$\begin{aligned} \eta ^{\bar{\sigma }}_{ij}(t)= & {} \frac{\left| \bar{\sigma }^{\, \text {DMN}}_{ij}(t) - \bar{\sigma }^{\, \text {FFT}}_{ij}(t)\right| }{\underset{t \in {\mathcal {T}}}{\max }\left| \bar{\sigma }^{\, \text {FFT}}_{ij}(t)\right| }, \nonumber \\ \eta ^{\triangle \bar{\theta }}_{ij}(t)= & {} \frac{\left| \triangle \bar{\theta }^{\, \text {DMN}}(t) - \triangle \bar{\theta }^{\, \text {FFT}}(t)\right| }{\underset{t \in {\mathcal {T}}}{\max }\left| \triangle \bar{\theta }^{\, \text {FFT}}(t)\right| }, \quad \nonumber \\ \eta ^{\bar{\mathcal {D}}}_{ij}(t)= & {} \frac{\left| \bar{\mathcal {D}}^{\, \text {DMN}}(t) - \bar{\mathcal {D}}^{\, \text {FFT}}(t)\right| }{\underset{t \in {\mathcal {T}}}{\max }\left| \bar{\mathcal {D}}^{\, \text {FFT}}(t)\right| }, \end{aligned}$$
(5.9)

where \({\mathcal {T}}=[0,T]\) denotes the considered time interval of the simulation. Furthermore, the mean and the maximum error are defined by

$$\begin{aligned} \eta ^{(\cdot )}_\text {mean}= & {} \underset{i,j \in \{1,2,3\}}{\max } \frac{1}{T} \int _{0}^{T} \eta ^{(\cdot )}_{ij}(t) \mathop {}\!\mathrm {d}t \quad \text {and} \quad \nonumber \\ \eta ^{(\cdot )}_\text {max}= & {} \underset{i,j \in \{1,2,3\}}{\max } \underset{t \in {\mathcal {T}}}{\max } \,\, \eta ^{(\cdot )}_{ij}(t). \end{aligned}$$
(5.10)
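For discrete time series, the error measures (5.9) and (5.10) may be evaluated as sketched below. The time integral is approximated by the trapezoidal rule, which is an assumption of this sketch, as are the function names and the data layout.

```python
def relative_error_history(dmn, fft):
    """Pointwise error normalized by the maximum absolute reference value, cf. (5.9)."""
    scale = max(abs(v) for v in fft)
    return [abs(a - b) / scale for a, b in zip(dmn, fft)]


def time_average(times, values):
    """Time average over [times[0], times[-1]] via the trapezoidal rule."""
    integral = sum(0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
                   for i in range(len(times) - 1))
    return integral / (times[-1] - times[0])


def mean_and_max_error(times, histories):
    """Mean and maximum error over all load cases, cf. (5.10).

    histories maps a load case (i, j) to a pair (dmn_series, fft_series).
    """
    means, maxima = [], []
    for dmn, fft in histories.values():
        eta = relative_error_history(dmn, fft)
        means.append(time_average(times, eta))
        maxima.append(max(eta))
    return max(means), max(maxima)


# Toy data for a single load case
times = [0.0, 1.0, 2.0]
histories = {(1, 1): ([0.0, 1.1, 2.0], [0.0, 1.0, 2.0])}
eta_mean, eta_max = mean_and_max_error(times, histories)
```

Note that the normalization by the maximum reference value, rather than by the instantaneous one, avoids a division by zero at the unloaded initial state.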

In Fig. 7, the results for the monotonic loading in the principal fiber direction, i.e., \((i,j) \equiv (1,1)\), are shown. We observe that, up to the maximum load of \(\bar{\varepsilon }= 4.0\%\), the effective stresses predicted by the DMN and the full-field solution are almost indistinguishable.

Fig. 7 Strain-controlled monotonic loading: uniaxial extension in principal fiber direction

The relative stress error for all four considered strain rates is well below \(2.0 \%\). The same holds for the temperature change \(\triangle \bar{\theta }\) and the effective dissipation \(\bar{\mathcal {D}}\) for strain loadings up to \(2.0 \%\). Only from \(2.0 \%\) macroscopic strain and above, deviations in the effective dissipation, and, thus, also the temperature change, become noticeable. These deviations are a result of the power-law hardening of the polyamide matrix. Indeed, due to the power-law hardening, local clusters of significant plastic deformation form in the microstructure. Figure 8 visualizes this effect by showing the evolution of the accumulated plastic strain \(\varepsilon _\text {p}\), for the strain rate \(\dot{\bar{\varepsilon }} = 5 \cdot 10^{-4} {~\mathrm s^{-1}}\), on a \({{\varvec{ e}} }_1\)-\({{\varvec{ e}} }_2\) slice of the 3D microstructure.

Figure 8 illustrates that for the chosen loading, clusters with more than \(30 \%\) accumulated plastic strain form in the vicinity of fiber ends. This strong plastification leads to a pronounced energy dissipation, which is slightly underestimated by the DMN, see Fig. 7. For this reason, the DMN underestimates the self-heating of the composite as it does not fully capture such localization phenomena.

To account for more complex loading conditions, i.e., load reversal or biaxial loadings, we investigate six independent non-monotonic loadings and six independent biaxial loadings in “Appendix A”. The relative errors in the effective stress, the temperature change and the effective dissipation for all four considered strain rates and all considered load cases are summarized in Table 3.

Fig. 8 Accumulated plastic strain \(\varepsilon _\text {p}\) for a \(4.0\%\) uniaxial extension in principal fiber direction with a strain rate of \(\dot{\bar{\varepsilon }} = 5 \cdot 10^{-4} {~\mathrm s^{-1}}\)

Table 3 Mean and maximum relative errors for the investigated strain-controlled uniaxial and biaxial loadings

Stress-controlled cyclic loading

In the previous section, we investigated the identified DMN surrogate model for monotonic and non-monotonic, uniaxial and biaxial loadings. Indeed, for such loadings, self-heating effects played a minor role. However, polymers, in general, show a significant self-heating under cyclic loading, see, e.g., Benaarbia et al. [80]. For this reason, we conclude this section with the validation of the DMN surrogate model for cyclic loading and conduct stress-controlled virtual experiments

$$\begin{aligned} \bar{{\varvec{\sigma }}}(t)= & {} \frac{\bar{\sigma }(t)}{2} \, \left( {{\varvec{ e}} }_i \otimes {{\varvec{ e}} }_j + {{\varvec{ e}} }_j \otimes {{\varvec{ e}} }_i\right) \quad \text {with} \quad \nonumber \\ \bar{\sigma }(t)= & {} \bar{\sigma }^{\, \text {ampl}} \sin \left( 2 \pi \frac{t}{T_\text {c}} \right) , \quad \nonumber \\ (i,j) \in L_4:= & {} \left\{ (1,1), (2,2)\right\} . \end{aligned}$$
(5.11)

More precisely, for both loading directions in the index set \(L_4\), we apply a uniaxial, sinusoidal stress load. Here, \(\bar{\sigma }^{\, \text {ampl}}\) denotes the stress amplitude and \(T_\text {c}\) represents the period of the harmonic loading. As self-heating effects depend on the loading amplitude, we consider four linearly spaced stress amplitudes, ranging from \(\bar{\sigma }^{\, \text {ampl}} = {20}\,{\mathrm{MPa}}\) to \(\bar{\sigma }^{\, \text {ampl}} = {80}\,{\mathrm{MPa}}\). We simulate 100 cycles, where every cycle is discretized with 20 equidistant load steps, i.e., \({2\,000}\) load steps in total. The stress load is applied with a frequency of \(f = {10}\,{\mathrm{Hz}}\), i.e., the period is \(T_\text {c} = {0.1}\,{\mathrm{s}}\), and adiabatic conditions are assumed due to the short simulation time of \({10}\,{\mathrm{s}}\). Please note that we consider small stress amplitudes up to \(\bar{\sigma }^{\, \text {ampl}} = {80}\,{\mathrm{MPa}}\), resulting in strain amplitudes well below \(2.5\%\). For this loading, resolutions of the volume element lower than 8.33 voxels per fiber diameter are admissible, see Fig. 6. Therefore, we use a volume element resolved with \(256^3\) voxels, corresponding to 6.67 voxels per fiber diameter, to keep computational costs reasonable.
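The load schedule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `stress_schedule` is hypothetical, and the sketch assumes that each load step ends at an equidistant point in time.

```python
import numpy as np

def stress_schedule(amplitude_mpa, n_cycles=100, steps_per_cycle=20, period_s=0.1):
    """Return load-step times (s) and sinusoidal stress values (MPa), cf. (5.11)."""
    n_steps = n_cycles * steps_per_cycle
    # Assumption: load step k ends at t_k = k * T_c / steps_per_cycle.
    times = np.arange(1, n_steps + 1) * period_s / steps_per_cycle
    stress = amplitude_mpa * np.sin(2.0 * np.pi * times / period_s)
    return times, stress

# Four linearly spaced stress amplitudes between 20 MPa and 80 MPa.
amplitudes = np.linspace(20.0, 80.0, 4)
times, stress = stress_schedule(amplitudes[-1])
```

With the default arguments, the schedule comprises 2000 load steps and ends at 10 s, matching the numbers stated in the text.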

In Fig. 9, the results for the cyclic loading perpendicular to the principal fiber direction, i.e., \((i,j) \equiv (2,2)\), are shown. The strain amplitude \(\bar{\varepsilon }^{\, \text {ampl}}_{22}\), which is computed by

$$\begin{aligned} \bar{\varepsilon }^{\, \text {ampl}}_{22}(n)= & {} \frac{1}{2} \left( \max _{t \in {\mathcal {T}}_\text {c}(n)}(\bar{{\varvec{\varepsilon }}}(t) \cdot {{\varvec{ e}} }_2 \otimes {{\varvec{ e}} }_2) \right. \nonumber \\&\left. - \min _{t \in {\mathcal {T}}_\text {c}(n)}(\bar{{\varvec{\varepsilon }}}(t) \cdot {{\varvec{ e}} }_2 \otimes {{\varvec{ e}} }_2) \right) \nonumber \\&\quad \text {with} \quad {\mathcal {T}}_\text {c}(n) := [(n-1) T_\text {c}, n T_\text {c}] \end{aligned}$$
(5.12)

for a cycle n, is shown vs. the cycles for all four considered amplitudes. We observe that for stress amplitudes of \(\bar{\sigma }^{\, \text {ampl}} = {60}\,{\mathrm{MPa}}\) and above, the composite exhibits viscoplastic flow, resulting in a decrease of the strain amplitude in the first few cycles due to hardening. Subsequently, for the two largest amplitudes, the strain amplitude increases again due to the self-heating-induced thermal softening of the composite. Besides the strain amplitude, the temperature change and the dissipated energy are illustrated in Fig. 9 as well. Here, for cycle n, \(\triangle \bar{\theta }^{\, \text {cycle}}\) denotes the mean temperature change and \(\bar{\mathcal {D}}^{\, \text {cycle}}\) expresses the total dissipated energy, i.e.,

$$\begin{aligned} \triangle \bar{\theta }^{\, \text {cycle}}(n)= & {} \frac{1}{T_\text {c}} \int _{{\mathcal {T}}_\text {c}(n)} \triangle \bar{\theta }(t) \mathop {}\!\mathrm {d}t \quad \text {and} \quad \nonumber \\ \bar{\mathcal {D}}^{\, \text {cycle}}(n)= & {} \int _{{\mathcal {T}}_\text {c}(n)} \bar{\mathcal {D}}(t) \mathop {}\!\mathrm {d}t \end{aligned}$$
(5.13)

hold. We observe an almost linear self-heating of the composite for all considered amplitudes. In the first few cycles, the total dissipated energy is dominated by viscoplastic flow, which decreases for an increasing number of cycles due to hardening. Furthermore, we observe a noticeable temperature dependence of the dissipated energy, i.e., a noticeable oscillation of the dissipation starting at around 10 cycles. Both effects are especially visible for the two highest amplitudes. The oscillation can be attributed to the Maxwell elements which are, due to the employed WLF shift function, activated and deactivated depending on the temperature. With an increasing number of cycles, the dissipated energy increases again as the material starts to soften, resulting in higher strain amplitudes and thus a higher dissipation.
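The per-cycle quantities defined in (5.12) and (5.13) can be evaluated from sampled time histories as sketched below. This is only an illustration under stated assumptions: the function names are hypothetical, the histories are assumed to start at \(t = 0\) with the same number of equidistant samples in every cycle, and the integrals are approximated by the trapezoidal rule, which the text does not prescribe.

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoidal rule for samples y at points x."""
    return 0.5 * np.sum((y[1:] + y[:-1]) * np.diff(x))

def per_cycle_quantities(t, eps22, dtheta, dissipation, period_s):
    """Strain amplitude (5.12), mean temperature change and dissipated energy (5.13)."""
    n_cycles = int(round(t[-1] / period_s))
    steps = (len(t) - 1) // n_cycles
    eps_ampl, dtheta_mean, d_total = [], [], []
    for n in range(n_cycles):
        # Closed interval of one cycle: both end points belong to the cycle.
        sl = slice(n * steps, (n + 1) * steps + 1)
        eps_ampl.append(0.5 * (eps22[sl].max() - eps22[sl].min()))
        dtheta_mean.append(_trapezoid(dtheta[sl], t[sl]) / period_s)
        d_total.append(_trapezoid(dissipation[sl], t[sl]))
    return np.array(eps_ampl), np.array(dtheta_mean), np.array(d_total)
```

Because consecutive cycles share their end points, each cycle is evaluated on the closed interval \({\mathcal {T}}_\text {c}(n)\), as in the definitions.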

Fig. 9 Stress-controlled cyclic loading: uniaxial extension perpendicular to the principal fiber direction

Comparing the DMN and the full-field solution, we observe an excellent agreement. The strain amplitude, temperature change and dissipated energy of the DMN compared to the full-field solution are almost indistinguishable. To quantify the approximation errors, we evaluate the mean \(\eta ^{(\cdot )}_{\text {mean}}\) and maximum \(\eta ^{(\cdot )}_{\text {max}}\) error (5.10), for the strain amplitude \(\bar{\varepsilon }^{\, \text {ampl}}\), the mean temperature change \(\triangle \bar{\theta }^{\, \text {cycle}}\) and the total dissipation \(\bar{\mathcal {D}}^{\, \text {cycle}}\), respectively. These results are summarized in Table 4 for the cyclic loading parallel and perpendicular to the principal fiber direction.

Table 4 Mean and maximum relative errors for the investigated stress-controlled uniaxial cyclic loadings

Summing up, we investigated monotonic, non-monotonic uniaxial, biaxial and cyclic loading scenarios to validate the identified DMN surrogate model for thermomechanically coupled simulations on the microscopic scale. Indeed, the DMN is able to provide a digital twin for the investigated short-fiber reinforced plastic microstructure of thermomechanically coupled constituents. The approximation errors for the effective stress in the inelastic setting were well below \(3.5 \%\) for every investigated loading condition. Even the errors in the effective dissipation and the predicted temperature change only range up to \(5 \%\), depending on the considered scenario.

6 A computational example

With the identified DMN at hand, we turn our attention to conducting a DMN-accelerated two-scale concurrent thermomechanical simulation. More precisely, we study the macroscopic response of a non-symmetric, notched plate subjected to cyclic loading using the FE software ABAQUS. The effective material response of the short-fiber reinforced polyamide is provided by the identified DMN surrogate model. The local orientation of the material, i.e., the principal fiber direction, aligns with the loading direction. The geometry of the structure is similar to that of Tikarrouchine et al. [16] and is illustrated in Fig. 10.

Fig. 10 Non-symmetric, notched plate subjected to a cyclic loading [16]

The structure is clamped on the left-hand side and is subjected to a cyclic stress load

$$\begin{aligned} \bar{\sigma }(t) = \bar{\sigma }^{\, \text {ampl}} \sin \left( 2 \pi \frac{t}{T_\text {c}} \right) \end{aligned}$$
(6.1)

with \(\bar{\sigma }^{\, \text {ampl}} = {50}\,{\mathrm{MPa}}\) on the right-hand side of the plate. We simulate \({3\,000}\) cycles, and every cycle is discretized by 20 equidistant load steps, i.e., \({60\,000}\) load steps in total. The stress load is applied with a frequency of \(f = {10}\,{\mathrm{Hz}}\), i.e., the period is \(T_\text {c} = {0.1}\,{\mathrm{s}}\). Due to the long simulation time of \({300}\,{\mathrm{s}}\), the assumption of adiabatic conditions is not appropriate. For this reason, we prescribe a convective boundary condition on the free surfaces of the plate, i.e., the heat flux across the surface of the plate

$$\begin{aligned} \bar{q}_{\text {s}} = - h (\bar{\theta }_{\text {s}} - \bar{\theta }_0) \end{aligned}$$
(6.2)

is a function of the film coefficient h and the difference of the surface temperature \(\bar{\theta }_{\text {s}}\) and the ambient temperature \(\bar{\theta }_0 = {293.15}\,{\mathrm{K}}\). We assume free convection. Thus, the film coefficient for air is set to

$$\begin{aligned} h = 10 \, \mathrm {\frac{W}{m^2 \, K}}, \end{aligned}$$
(6.3)

see Kosky et al. [81]. To account for heat conduction on the macroscopic level, we assume Fourier’s law

$$\begin{aligned} \bar{{{\varvec{ q}} }} = - \bar{{{\varvec{ \kappa }} }} \, \nabla _{\!\! \bar{x}}\,\bar{\theta }\end{aligned}$$
(6.4)

Fig. 11 Evolution of the absolute temperature on the surface of the notched plate subjected to a cyclic loading

Fig. 12 The strain amplitude, the absolute temperature and the dissipated energy versus the number of cycles for the five locations shown in Fig. 11a

to hold, where \(\nabla _{\!\! \bar{x}}\) denotes the gradient operator w.r.t. the macroscopic point \(\bar{{{\varvec{ x}} }} \in \Omega \). We use the thermal conductivities of the glass fibers and the polyamide matrix from Tables 1 and 2 and compute the effective thermal conductivity tensor \(\bar{{{\varvec{ \kappa }} }}\) by means of an FFT-based computational homogenization code [17, 18, 82]. Indeed, the effective thermal conductivity is almost isotropic and reads

(6.5)

in Cartesian coordinates. The notched plate is discretized by \({1\,099}\) thermally coupled quadratic hexahedron elements. At every Gauss point, a DMN is integrated implicitly. For solving the global system, we rely on the direct Newton solver in ABAQUS, which solves for the displacements and absolute temperature simultaneously.
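For illustration, the two heat-flux relations (6.2) and (6.4) can be written as small functions. This is a sketch only: the function names are hypothetical, and the conductivity `kappa_demo` is an arbitrary isotropic placeholder, not the effective conductivity (6.5) computed by the authors.

```python
import numpy as np

def convective_flux(theta_s, theta_0=293.15, h=10.0):
    """Surface heat flux (6.2) in W/m^2; temperatures in K, h in W/(m^2 K)."""
    return -h * (theta_s - theta_0)

def fourier_flux(kappa, grad_theta):
    """Macroscopic heat flux (6.4): q = -kappa . grad(theta)."""
    return -np.asarray(kappa) @ np.asarray(grad_theta)

# Hypothetical isotropic placeholder conductivity in W/(m K), NOT the value of (6.5).
kappa_demo = 0.5 * np.eye(3)
q_s = convective_flux(theta_s=303.15)            # surface 10 K above ambient
q = fourier_flux(kappa_demo, [100.0, 0.0, 0.0])  # temperature gradient in K/m
```

With the sign convention of (6.2), a surface warmer than the ambient air yields a negative \(\bar{q}_{\text {s}}\), i.e., heat leaves the plate.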

In Fig. 11, the evolution of the mean temperature change \(\triangle \bar{\theta }^{\, \text {cycle}}\) is shown. For the first 250 cycles, a slight self-heating of the plate is observed in the vicinity of the two notches where the viscoelastic and viscoplastic deformation localizes. For an increasing number of cycles, the inner part of the plate starts to heat up as well, due to both energy dissipation and heat conduction.

In addition to the contour plots, Fig. 12 shows the temporal evolution of the strain amplitude \(\bar{\varepsilon }^{\, \text {ampl}}\), mean temperature change \(\triangle \bar{\theta }^{\, \text {cycle}}\) and the dissipated energy \(\bar{\mathcal {D}}^{\, \text {cycle}}\) for five distinct points A to E, see Fig. 11a. Here, in the macroscopic setting, we compute the strain amplitude \(\bar{\varepsilon }^{\, \text {ampl}}\) of cycle n by

$$\begin{aligned} \bar{\varepsilon }^{\, \text {ampl}}(n) = \frac{1}{2} \left( \max _{t \in {\mathcal {T}}_\text {c}(n)}(\lambda ^{\text {max}}_{\bar{\varepsilon }}(t)) - \min _{t \in {\mathcal {T}}_\text {c}(n)}(\lambda ^{\text {min}}_{\bar{\varepsilon }}(t)) \right) , \end{aligned}$$
(6.6)

where \(\lambda ^{\text {min}}_{\bar{\varepsilon }}\) and \(\lambda ^{\text {max}}_{\bar{\varepsilon }}\) denote the smallest and the largest eigenvalue of the macroscopic strain tensor \(\bar{{\varvec{\varepsilon }}}\).
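The eigenvalue-based strain amplitude (6.6) can be sketched as follows for a single cycle. The function name `strain_amplitude` and the array layout are assumptions made for this illustration; they are not taken from the authors' post-processing.

```python
import numpy as np

def strain_amplitude(strain_history):
    """Evaluate (6.6) for one cycle.

    strain_history: array of shape (n_steps, 3, 3) holding the symmetric
    macroscopic strain tensor at every load step of the cycle.
    """
    # eigvalsh returns the eigenvalues of symmetric matrices in ascending order.
    eigenvalues = np.linalg.eigvalsh(np.asarray(strain_history))
    lam_min = eigenvalues[:, 0]
    lam_max = eigenvalues[:, -1]
    return 0.5 * (lam_max.max() - lam_min.min())
```

Note that the maximum is taken over the largest eigenvalue and the minimum over the smallest eigenvalue, so the two extrema may belong to different principal directions.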

A closer look at the evolution of the strain amplitude shows that, as in the microscopic setting, the hardening of the viscoplastic matrix results in a decrease of the strain amplitude in the first few cycles for all five investigated points. Afterwards, the strain amplitude increases again until a steady state is reached. The reason for the renewed increase of the strain amplitude and subsequent saturation becomes clear by inspecting the evolution of the absolute temperature. In the first few cycles, the temperature increases rapidly in all considered points as a result of the dissipated energy due to the viscoelastic and viscoplastic flow. This increase is more pronounced in the vicinity of the two notches and decreases towards the inside of the plate. The temperature increase results in the thermal softening of the material, and, in turn, the strain amplitude increases. After about \({1\,000}\) cycles, the temperature increase saturates and a steady state is reached. This steady state is the result of two effects. On the one hand, the dissipated energy in a cycle decreases with increasing temperature due to thermal softening. On the other hand, the heat loss due to free convection increases with an increasing surface temperature.

These results, i.e., the saturating temperature increase and the temperature-dependent mechanical behavior, can only be reproduced in a macroscopic setting, since heat conduction and convection have to be considered. Therefore, only relying on microscopic simulations for characterizing thermomechanically coupled composites by simulative means does not suffice. For this reason, DMNs are a promising technique to enable thermomechanically coupled concurrent two-scale simulations with reasonable computational resources. Last but not least, we consider the computational costs of our approach, accounting for the offline training and the online validation in the following section.

7 Computational costs

The material sampling was performed in parallel, i.e., six independent load steps for computing the effective stiffnesses using 16 threads each. The training of the DMN was carried out on four threads. The wall-clock times of the sampling and the offline training are summarized in Table 5. Indeed, sampling of the training data took \({74}\,{\mathrm{h}}\) whereas the training finished in under \({2}\,{\mathrm{h}}\). As we only considered DMNs with a depth of \(K=8\), 765 independent fitting parameters were determined during the offline training.
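As a plausibility check of the parameter count, note that \(765 = 3\,(2^K - 1)\) for \(K = 8\). One decomposition consistent with this number, which is an assumption made here and is not stated in this section, attributes two angles to the lamination direction of each of the \(2^K - 1\) laminates and one weight to each of the \(2^K\) inputs of the bottom layer, with one weight removed by normalization:

$$\begin{aligned} \underbrace{2 \, (2^8 - 1)}_{510} + \underbrace{2^8}_{256} - 1 = 765 = 3 \, (2^8 - 1). \end{aligned}$$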

Table 5 Wall-clock times for sampling, training and number of fitting parameters

Turning our attention to the online evaluation, we focus on the computational costs of the DMN evaluated at a single Gauss point. Solving the thermomechanical cell problem for a microstructure discretized by \(320^3\) voxels for a prescribed macrostrain and absolute temperature takes about \({2737}\,{\mathrm{s}}\) on average on a single thread (Table 6). In contrast, integrating a DMN at a single Gauss point takes less than \({6}\,{\mathrm{ms}}\). Thus, we achieve a speed-up of about half a million times compared to solving the cell problem by means of an FFT-based micromechanics solver. For applications which admit using DMNs with fewer than eight layers, speed-ups in the range of several millions may be possible.
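For orientation, the quoted speed-up follows directly from the two timings above. Since the DMN integration takes less than \({6}\,{\mathrm{ms}}\), the ratio is a lower bound:

$$\begin{aligned} \frac{{2737}\,{\mathrm{s}}}{{6}\,{\mathrm{ms}}} = \frac{2737}{6 \cdot 10^{-3}} \approx 4.6 \cdot 10^{5}. \end{aligned}$$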

Table 6 Wall-clock times and speed-up (compared to an FFT-based computational micromechanics solver) for a single time step of the inelastic micro simulation

Wall-clock time and memory consumption for the component scale simulation are summarized in Table 7. Indeed, the macroscopic FE model was discretized by \({1\,099}\) elements resulting in \({9\,706}\) degrees of freedom. Computing all \({60\,000}\) time steps involved \({161\,240}\) total Newton iterations and took about \({117}\,{\mathrm{h}}\) on 96 threads and required about \({2}\,{\mathrm{GB}}\) of DRAM. Indeed, ABAQUS only required about 2.7 Newton iterations (on average) per load increment, indicating a robust quadratic convergence.
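The averages implied by these figures can be checked by simple arithmetic. The time per load step below is a rough estimate that assumes the total wall-clock time is distributed evenly over all load steps, including any overhead:

$$\begin{aligned} \frac{161\,240}{60\,000} \approx 2.69 \ \text {Newton iterations per load step} \quad \text {and} \quad \frac{117 \cdot {3600}\,{\mathrm{s}}}{60\,000} \approx {7.0}\,{\mathrm{s}} \ \text {per load step}. \end{aligned}$$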

Table 7 Wall-clock time, memory consumption and total Newton iterations of the concurrent two-scale simulation

These results indicate that DMNs are a promising technique for accelerating thermomechanical two-scale simulations. This holds for structures of moderate complexity resolved by \({10\,000}\) time steps (and more), as in the previous examples. Alternatively, DMNs can be used in large-scale concurrent multiscale simulations consisting of millions of elements and a smaller number of time steps, see Gajek et al. [54].

8 Conclusion

In the work at hand, we extended the framework of direct DMNs to fully coupled thermomechanical two-scale simulations. More precisely, we incorporated the intrinsic two-way thermomechanical coupling between the microscopic and macroscopic scale into the framework. Accounting for this coupling is essential to accurately capture the mechanical response of common engineering materials, e.g., short-fiber reinforced thermoplastics, in structural simulations.

For this purpose, we built upon the first-order homogenization framework of thermomechanical composites established by Chatzigeorgiou et al. [5], who showed that there is no fluctuation of the absolute temperature on the microscopic scale. For this reason, both the absolute temperature and the macrostrain are regarded as inputs to the DMN’s (microscopic) balance of linear momentum. This way, the one-way thermomechanical coupling from the macroscopic onto the microscopic scale is accounted for. Furthermore, we incorporated the back-coupling from the microscopic scale onto the evolution of the macroscopic temperature into the framework. To this end, changes of entropy and dissipated energy are computed and propagated to the macroscopic scale where both, combined, act as an additional source term in the macroscopic heat equation. This way, the two-way thermomechanical coupling was incorporated into the framework of direct DMNs. To accelerate a thermomechanically coupled two-scale simulation, we explained how our approach was implemented as an implicit user-material subroutine.

Choosing a short-fiber reinforced polyamide 6.6 with industrial aspect ratio and filler fraction, we demonstrated that the trained DMN was able to predict, for a macroscopic point, the effective stress, the effective dissipation and the ensuing temperature change of the composite with high accuracy for a set of different strain rates and loading conditions. Indeed, DMNs are trained on linear elastic data alone. Predicting the dissipated energy at a macroscopic point, which in turn is intrinsically associated with nonlinear effects on the underlying microstructure, e.g., plasticity, is a remarkable result which can be attributed to the DMN’s internal structure. As DMNs rely on laminates as building blocks which are combined in a hierarchical manner, it is ensured that DMNs naturally inherit thermodynamic consistency and stress-strain monotonicity from their phases. The former constitutes a key feature both in terms of physics and numerical implementation and represents one reason for the DMN’s approximation capabilities, even for thermomechanically coupled problems.

To evaluate the performance of our approach in a concurrent two-scale setting, we conducted a thermomechanically coupled simulation of an asymmetric notched plate. The notched plate was subjected to a cyclic stress load also considering heat conduction and convection. Indeed, our results indicate that the FE-DMN method is a powerful piece of technology for accelerating two-scale concurrent simulations. With the possibility of providing speed-ups of five to six orders of magnitude, DMNs promise to become a standard tool for industrial applications. This way, the FE-DMN method finally realizes the promise of fully coupled thermomechanical two-scale simulations of large-scale industrial problems as envisioned by Chatzigeorgiou et al. [5].

Fig. 13 Strain-controlled non-monotonic loading: uniaxial extension in principal fiber direction

Fig. 14 Strain-controlled biaxial loading: extension in principal fiber direction followed by an extension perpendicular to the principal fiber direction

In terms of future work, it would be of interest to formulate the underlying material models directly in cycle space in the fashion of Köbler et al. [83]. Such a formulation alleviates the need to resolve every load cycle with multiple load steps, enabling the simulative characterization of thermomechanical composites in the regime of high-cycle fatigue. Furthermore, combining our approach with the fiber-orientation interpolation scheme [54, 84], in order to arrive at a DMN surrogate model applicable to short-fiber reinforced polymers with a locally varying fiber orientation, would further increase the applicability. Also, extensions to problems involving damage [85, 86] and fracture [87,88,89] are of interest.