Anisotropic hyperelastic constitutive models for finite deformations combining material theory and data-driven approaches with application to cubic lattice metamaterials

This work investigates the capabilities of anisotropic theory-based, purely data-driven and hybrid approaches to model the homogenized constitutive behavior of cubic lattice metamaterials exhibiting large deformations and buckling phenomena. The effective material behavior is assumed as hyperelastic, anisotropic and finite deformations are considered. A highly flexible analytical approach proposed by Itskov (Int J Numer Methods Eng 50(8): 1777–1799, 2001) is taken into account, which ensures material objectivity and fulfillment of the material symmetry group conditions. Then, two non-intrusive data-driven approaches are proposed, which are built upon artificial neural networks and formulated such that they also fulfill the objectivity and material symmetry conditions. Finally, a hybrid approach combing the approach of Itskov (Int J Numer Methods Eng 50(8): 1777–1799, 2001) with artificial neural networks is formulated. Here, all four models are calibrated with simulation data of the homogenization of two cubic lattice metamaterials at finite deformations. The data-driven models are able to reproduce the calibration data very well and reproduce the manifestation of lattice instabilities. Furthermore, they achieve superior accuracy over the analytical model also in additional test scenarios. The introduced hyperelastic models are formulated as general as possible, such that they can not only be used for lattice structures, but for any anisotropic hyperelastic material. Further, access to the complete simulation data is provided through the public repository https://github.com/CPShub/sim-data.

mations [2,4,12,23,34], harness mechanical instabilities and buckling [28], absorb energy or dissipate vibrations [5]. This functional behavior of soft metamaterials is enabled by their microstructural topology [3], whether they consist of truss, beam or shell-like lattice members or multi-phase solids, and their soft material constituents, typically elastomers or polymer hydrogels. Due to their microstructural topology and the soft materials, they are mechanically characterized by nonlinearities, i.e., large elastic deformations and finite strains, as well as mechanical instabilities, such as buckling and snapping, and anisotropy with specific material symmetries.
To efficiently model the mechanical behavior of large scale structures that are constituted of soft metamaterials at a microscopic level, nonlinear multiscale simulation methods are required, for which the effective microstructural behavior must be homogenized, see, e.g., [15,30]. Concurrent multiscale simulation, e.g., in the framework of the FE 2 method, requires the solution and homogenization of a microstructural boundary value problem at each evaluation point of the macroscopic finite element (FE) problem and is thus computationally expensive. In contrast, hierarchical or sequential multiscale simulation requires the formulation of an effective constitutive model, which can then be cheaply evaluated within the macroscale simulation. However, developing such effective constitutive models even for elastic soft metamaterials is difficult, since analytical hyperelastic models based on strain energy potentials must be adapted to include anisotropy and often do not reflect the behaviour of metamaterials well-as will also be shown here. Thus, empirical, data-driven constitutive models are generally developed for nonlinear microstructures by fitting either strain energies or stresses obtained from microstructural simulations as interpolating or approximating response surfaces, e.g., by polynomial database interpolation [19,36], B-splines [6,37], or reduced basis models [24].
More recently, the development of data-driven constitutive modeling has gained great momentum. Methods based on artificial neural networks (ANNs) and other machine learning (ML) approaches have shown excellent results in terms of prediction quality, flexibility and performance. In the field of nonlinear elasticity, [25] shows how ANNs can be used in order to calibrate hyperelastic laws, but the approach is limited to small deformations and does not offer strategies for specific material symmetries. In [27] ANNs are used for the identification constitutive relations with known material symmetry for large large deformations. The ansatz is demonstrated for Nickel with cubic material symmetry, where the ANNs are trained on the deformation gradient or specific invariants with structure tensors up to fourth-order for cubic behavior. The approach yields the best results based on the cubic invariants. But since no illustrations of the stress strain behavior are provided, it remains unclear if the ansatz is able to capture highly nonlinear material behavior. In [19] a polynomial approach is presented for the data-driven identification of the constitutive manifold. The approach shows very good results but is limited to small strains and does not cover material symmetry considerations. A data-driven model-free method for nonlinear hyperelasticity at finite strains is developed in [31]. The method demonstrates very good performance but is demonstrated only in one-and twodimensional problems for isotropic material behavior. The RNEXP approach of [14] offers a highly efficient method for hyperelastic materials, but it is formulated for small deformations and specific material symmetries can only be approximated through patterns in the provided calibration data. In [13] a ANN-based on-the-fly adaptive ansatz is presented for nonlinear hyperelasticity at small strains. The effective material law of the microstructure is replaced by an ANN and used in macroscopic FE simulations. The approach shows also very good results within the training range but does not consider strategies specifically incorporating the material symmetry. In [16] an interesting approach for the data-driven calibration of corrections of given hyperelastic models for finite strain is presented. This approach shows very promising results due to its thermodynamical properties and its application in biomechanics, see [17], but is only demonstrated for isotropic material behavior. The work of [35] introduces efficient sampling strategies and ANN-based models for hyperelasticity at finite deformations for isotorpic material behavior.
Effective constitutive modeling of beam lattices, assumed in this work as hyperelastic systems, is particularly challenging, since they exhibit either stretching-or bendingdominated behavior, which depends on the microarchitecture, and can vary between deformation modes (such as uniaxial, biaxial, shear, etc.) and even between tensile and compressive deformation, see, e.g., [1,11,18]. Furthermore, stretching-dominated behavior often results in instabilities and buckling, which makes microstructural simulations and identification of effective material laws challenging. Effective continuum models have so far only been investigated for finite deformations of simpler types of 2D lattices [7,10,32]. In [22] a hyperelastic model is proposed, which, however, only performs satisfactory for uniaxial tensile deformation. To the best of the authors' knowledge, no sensible anisotropic hyperelastic models have been proposed and investigated for typical 3D beam lattices subject to large deformations and instabilities.
The present work introduces three data-driven models for the homogenized three-dimensional constitutive behavior of anisotropic hyperelastic (meta-) materials at finite deformations. The models are build upon ML methods, more specifically, based on ANNs, and are compared to a highly flexible hyperelastic model incorporating material symmetries proposed by [20]. The structure of the ANN models is designed such that all models fulfill the principle of objectivity and the material symmetry conditions for the known symmetry group of the material. All models are formulated as general as possible, in order to allow for future applications for any hyperelastic material with known material symmetry and alternative ML models. For the investigation of cubic lattice structures, considered as hyperelastic systems, the ANN models are trained based on synthetic data from three-dimensional simulations of two different lattice cells. The calibrated models are then compared to the hyperelastic model of [20] and among themselves in terms of prediction quality, capture of lattice instabilities and generalization behavior.
The outline of the manuscript is as follows. In Sect. 2, first, the basic material theory considerations for anisotropic hyperelastic materials are shortly described. Then, the hyperelastic model of [20] is sketched. This is followed by the development of the three ML models. In Sect. 3 the abilities of all models for the calibration of the effective material behavior of two different lattice cells is demonstrated. The manuscript ends in Sect. 4 with conclusions on the performance of the models. Notation A symbolic tensor notation is preferred throughout the manuscript. The present work requires only zeroth-, second-and fourth-order tensors. An orthonornmal basis {e 1 , e 2 , e 3 } of the three-dimensional physical space is used to represent all tensors. Scalars (zeroth-order tensors) are denoted by italic letters, e.g., W , a, b. Second-and fourthorder tensors are denoted by bold characters, e.g., A, B, P, S, and blackboard bold characters, e.g., C, respectively. Einstein's summation convention is not used in this work. The composition of two second-order tensors is denoted simply by AB with components ( AB) i j = k=1 A ik B k j . The double contraction is simply denoted by a colon, i.e., A : B is the scalar A : k,l=1 A kl C kli j and A : B is a fourth-order tensor with components (A : B) i jkl = 3 m,n=1 A i jmn B mnkl . The number of elements in a list or countable finite set L is denoted by #(L). The tensor product is denoted by ⊗, while the tensor power is denoted by A ⊗n . Standard variable symbols from continuum mechanics are used, i.e., W is reserved for quantities connected to the strain energy density, F denotes the deformation gradient, P and S are reserved for quantities immediately connected to the first and second Piola-Kirchhoff stress tensors, respectively.

Basic material theory considerations
A hyperelastic material model for finite deformations is described by a scalar potential W = W (F), where F denotes the deformation gradient, which belongs to the set of invertible tensors with positive determinant I nv + . The potential W represents the strain energy density with respect to the volume in the initial placement. In order to fulfill the principle of material objectivity, see [33], the dependency of the potential is reduced to the right Cauchy-Green tensor C = F T F, i.e., i.e.,Ŵ is prescribed and W is determined by (1). In the following, the material model is referred to as objective for short. The material symmetry of the material is specified by the collection G of all symmetry transformations Q The collection G is referred to as the material symmetry group. The set of orthogonal second-order tensors in three-dimensional space is denoted by O(3), while the special orthogonal group is denoted as SO(3) ⊂ O (3). The present work is restricted to finite symmetry groups. For finite G, #(G) is used to denote the number of elements.
The potential satisfies which for (1) translates forŴ tô where Sym + denotes the set of symmetric positive definite second-order tensors. It is shortly remarked that the present work considers material symmetry groups as subsets of SO (3), and not of O(3), since the potential W is restricted to evaluations of arguments F Q in (3) with positive determinant.
The first and second Piola-Kirchhoff stress tensors, P and S, respectively, are defined as follows such that the identity P = F S (6) and the material symmetry conditions, see, e.g., [8], are automatically fulfilled if the potential fulfills (1) and (3). Naturally, the just described properties of constitutive models for hyperelasticity form only a subset of the large set of constraints in material modeling. For instance, ellipticity is an important property in terms of material stability, but it lies beyond the scope of the present work. In the following, only the principle of objectivity, hyperelasticity and material anisotropy are considered for all upcoming constitutive models.

Material model of [20]
A highly flexible material model was proposed by [20]. It was first proposed for orthotropic hyperelasticity, but it can be easily generalized to arbitrary material symmetry. The potential of [20] will be denoted by W I . The core idea of [20] is to define the potential as follows with N isotropic tensor-valued functions E i , i.e., and major-symmetric anisotropic fourth-order tensors fulfilling the associated material symmetry conditions where (9) is objective per definition and fulfills the symmetry conditions (3) or (4), respectively. It should be remarked that the ansatz (9) was originally motivated by [20] from the St. Venant material model, which becomes immediately obvious by replacing E i → (C − I)/2 and identifying C = N i=1 c i C i as the elastic stiffness of the anisotropic material.
In order to formulate a sensible material model, the isotropic functions E i (C) should fulfill the conditions where O and I S denote the zero second-order tensor and the identity on symmetric second-order tensors, respectively. Condition immediately shows, that condition (12) also ensures S I (I) = O = P I (R) for any R ∈ SO(3) for the correspondingly derived stresses.
In [20], the isotropic functions E i are chosen as analytic. More specifically, based on the spectral decomposition of C with M distinct eigenvalues Λ α and corresponding projectors P α , the functions E i (C) are chosen as which is a specific choice of isotropic functions with scalar core functions e i (x). In [20], the core function is used, since it provides a close connection to the Ogden model and related ones. Additionally, the core function (17) can be interpreted from a mechanical point of view as a superposition of R different Seth-strains. The core function (17) fulfills (12). The condition (13) can be reduced to the constraint e i (1) = 1/2, which imposes R r =1 a ir = 1. The formulas given in [21] can be used to compute the analytic gradient (14).
We refer to the parametrized potential as W I = W I (F; p), where p refers to the collection of all model parameters, in this case, a ir and b ir . The quantities c i in (9) could also be considered as free parameters, but in this work, they will be fixed together with C i if the anisotropic stiffness C = N i=1 c i C i for the linear behavior is known. For given stiffness C of the linear elastic behavior (corresponding to, e.g., triclinic, monoclinic, hexagonal, cubic, etc. material symmetry), one can decompose the stiffness into its spectral representation and use it as c i and C i , where c i correspond to the eigenvalues and C i to the projectors of the decomposed stiffness. This will be discussed later on in more detail in Sect. 3.1.

Theory-driven non-intrusive ML approaches
Based on available data of a considered material or microstructure, we propose two theory-driven machine learning approaches for building constitutive models that fulfill the objectivity and material symmetry conditions (3): 1. Group symmetrization of the potential: We consider an arbitrary core model for the potentialW with parameters p, which takes C = F T F as input, i.e.,W =W (C; p). We assume that the core modelW is twice continuously differentiable with respect to C. The core model can be any scalar-valued function accepting a six-dimensional input (degrees of freedom of C) which is twice continuously differentiable. We define the auxiliary quantity W 0 (F; p) =W (C; p) and formulate the potential as This hyperelastic ansatz is objective and, due to the properties of groups, fulfills the symmetry conditions (3) for any choice of its parameters p. A proof of the fulfillment of the group symmetry is given in "Appendix A".
This means that the stresses derived from (18) automatically fulfill (7) and (8). Since the core modelW is twice continuously differentiable, not only the stress P MLW = ∂ W MLW /∂ F is computable, but also the stress tangent tensor. Alternatively, one could define the auxiliary quantities and useW 0 (F; p) instead of W 0 (F; p) in (18). The usage ofG(I; p) in (18) through (20) ensures that the gradient ofŴ 0 vanishes for C = I. This would yield a modified model in (18) which is stress free for any rigid body rotation, see "Appendix A" for a proof. Additionally, W MLW (I; p) = 0 would hold per construction. While these properties are appealing, we do not implement them in the following, since they add computational complexity to the ML model in terms of the parameter dependency with small benefit. Based on a few training points the model (18) can be calibrated to achieve an acceptable approximation of the material state at C = I. 2. Group symmetrization of the stress: We consider a continuously differentiable modelS with parameters p, i.e., S =S(C; p). We define the auxiliary quantity P 0 (F; p) = FS(F T F; p) (22) and the corresponding first Piola-Kirchhoff stress as The ansatz (23) is objective and, due to the properties of groups, fulfills the symmetry conditions (7) for any choice of its parameters. A proof of the fulfillment of the group symmetry is given in "Appendix A". Since the core modelS is differentiable, the corresponding stress tangent is computable. Alternatively, the quantity could be used in (23) instead of P 0 . This would yield a modified model fulfilling P MLP (R; p) = O ∀R ∈ SO (3). In the following, we do not implement this alternative formulation, since it only adds computational complexity to the model with small benefit. We are further interested in the ability of models in approximating the material state at C = I.
Both proposed approaches offer a highly flexible and nonintrusive form for the usage of arbitrary data-driven ML methods. Here, artificial neural networks (ANNs) will be considered forW andS. It should be remarked that the outlined approaches could use any other data-driven model forW and S. Further, it should be remarked that the potential model (18) is a hyperelastic model, whereas the stress model (23) is only an elastic model, since no potential for it is known. Still, the model (23) is taken into consideration as a pragmatic model for the stress of an elastic material in view of its practical use for macroscopic FE simulations only requiring P and its derivative for structural simulations.
As indicated in Sect. 2.1, the present work is restricted to finite material symmetry groups. For material symmetry groups with an infinite number of elements, the approaches (18) and (23) could be reformulated in terms of integrals over the corresponding group. From an implementation point of view, this may be difficult, such that simply using (18) or (23) with a finite subgroup of the infinite group may suffice for the corresponding application. For instance, if transversely isotropic behavior is of interest, then the finite subgroup with 60-degrees rotations around the symmetry axis may provide acceptable results in some situations.
For the current investigation, the material law is pathindependent, i.e., solely state-dependent. From the rich pool of data-driven approaches, we consider in this work ANNs due to their enormous flexibility. More specifically, instead of recurrent neural networks for history-dependent processes, we choose feedforward neural networks (FFNNs) for pure state-dependency. Hereby, different number of neurons and hidden layers, as illustrated in detail in "Appendix B" , are considered. In order to ensure (infinite) continuous differentiability, we choose the softplus function s(x) = log(1 + exp(x)) as activation function in all hidden layers. For a compact reference of the FFNN taken into account, the networks forW andS will be addressed as where n i , i ∈ {1, . . . , H }, denote the number of neurons in the H hidden layers. After fixing the number of hidden layers and neurons in each layer, the remaining FFNN parameters are the layer weights and biases, addressed simply by the parameter p. The total number of parameters of each model is described in "Appendix B" and tabulated in Table 3.

ML-extended theoretical model
Methods from the field of ML can also help (i) to decide, which model from a given collection offers the best quality based on inference criteria, as shown in [29] with a Bayesian approach for a collection of three isotropic hyperelastic models, or (ii) to extend the capabilities of an existing model.
Typically, constitutive models are based on several considerations of material theory, but require manual, human decisions on the choice of representations or ansatz functions at different steps of the model formulation, e.g., here the choice of isotropic functions E i in (16) and of the scalar core functions e i in (17) for the model of [20]. These human decisions may be, depending on the modeling step, too restrictive with respect to the richness of the theoretical model. Thus, it could be beneficial to combine a motivated material model with ML methods at an appropriate decision point.
The present section aims at this perspective and proposes to extend the model of [20] as follows. The structure of the model of [20] is highly flexible for isotropic E i . Mainly the choice of E i is limited to human imagination and intuition. For a more general approach, one may consider Reiner's representation theorem which states that any isotropic functioñ E i (C) of a symmetric tensor C can be represented as with C 0 = I, C 1 = C, C 2 = CC, the invariants and (26) are unknown, which motivates in the current setting the use of FFNNs with parameters p i j for each function φ i j = φ i j (I , I I , I I I ; p i j ). Usage of (26) in the approach of [20] instead of E i given in (16) offers a hybrid model combining some material theory requirements and the power of datadriven ML methods. The ML-extended ansatz based on (26) yields a corresponding potential W MLI . This ML-extended ansatz inherits from [20] the fulfillment of objectivity, hyperelasticity and material symmetry per construction. The collection of all parameters for the hybrid approach is referred to as p, as for the purely data-driven ones. We denote the corresponding potential as W MLI = W MLI (F; p). Due to linearity, the functioň is also isotropic. Alternative usage of (28) instead of (26) in the approach of [20] fulfills (12), such that a modified model with vanishing stresses at C = I can be generated. As for the other ML models, we do not implement this alternative form in order to reduce the complexity of parameter dependency and check the ability to approximate the material state at C = I. Furthermore, the amplifying quantities c i in (9) are set to c i = 1 Pa for the instances of the ML-extended approach W MLI , since the output amplification is already carried out by the internal FFNNs.

Application to cubic beam-lattice metamaterials
The constitutive models introduced in the previous section are now used to represent the effective material behavior of beam-lattice metamaterials with cubic symmetry. For this purpose, the parameters of the models are fitted to data obtained from numerical homogenization of the respective unit cell models, see "Appendix C" for details.

Preparations and implementation of considered models for cubic material symmetry
For a clearer identification of the considered models, we refer to them in this section as follows: -Theoretical model of [20] (9) with potential W I and derived P I : -ML model (18) with group symmetrization of the potential W MLW and derived P MLW ; -ML model (23) with group symmetrization of the stress P MLP ; -ML extension of [20] based on (26) with potential W MLI and derived P MLI .
The list of models is referred to as {I, MLW, MLP, MLI}. The preparation of the models for representing cubic material symmetry, their calibration and implementation are discussed in the following.
Pre-calibrated c i and C i for [20] model. Calibration of the original approach of [20] requires the determination of the parameters of the core functions E i and, eventually, of the quantities c i in (9), as well as a decision on which C i are to be used for the cubic material law. Based on linear elasticity, in the present work we choose to decompose a given linear elastic stiffness tensor C into its spectral representation, such that C i are the corresponding projectors and c i the corresponding eigenvalues. For cubic stiffnesses, the N = 3 projectors are known For the cubic lattice unit cells under consideration, the eigenvalues c i are given explicit values, such that C = 3 i=1 c i C i equals the effective stiffness of the cell for infinitesimal deformations, which can be computed from linear homogenization or nonlinear simulations results around F = I. Group symmetrization The Schoenflies octahedral groups O and O h , see, e.g., [9], containing 24 and 48 orthogonal transformations for cubic objects, respectively, are well known. These parameters will be calibrated by minimization of the mean squared error (MSE) of the output model, W (F; p) and P (F; p), respectively, based on available data. With a given dataset D = {F 1 , . . . } with corresponding measured/simulated outputW andP, in J and Pa, respectively, we define the corresponding dimensionless MSE as objective function for the model W (F; p) as The factor 1/9 in the second term of the sum is used to compute the average over the 9 components of P. For the model P MLP (F; p) given in (23) we consider only the MSE with respect to the measured/simulated stresses The parameters p of the corresponding models {I, MLW, MLP, MLI} will be the minimizers of the corresponding MSE, i.e., It is shortly remarked, that calibration of material models is usually focused on the stress values. The present approach incorporates the potential values in (30) for the hyperelastic models from the following perspective. Any function being calibrated at given points benefits from information of the function values and of its gradient at those points. In virtual material testing and design, simulation technologies allow for a deeper insight into several state variables. If this information is available, it can enhance the model quality and should, therefore, be used. In real-world experiments many of these state variables are not accessible or only under significant effort. For instance, in lattice structures, if the material of the elastic beams is characterized well, then, in principle, their deformation could be tracked with digital image correlation and the elastic energy density could be computed. Since the current work focuses the simulative investigation of lattice structures, the potential values are at hand and used for the calibration of the hyperelastic models. It should finally be noted, that the gradient error may still dominate (30).

Simulation data
We first consider a cubic lattice type with a body-centered micro-architecture, which consists of 8 beams connecting at the center of the cell, see Fig. 1, here denoted as "X" cell. Synthetic data has been generated in nonlinear simulations with uniaxial, equibiaxial, planar, volumetric and simple shear deformations enforced by periodic boundary conditions. These particular deformation modes were chosen here, since they can also be applied in physical material tests and are thus often used for the experimental characterization of hyperelastic materials. The calibration dataset D C = {(F 1 ,P 1 ,W 1 ), . . . } contains triples (F,P,W ) for all simulation steps denoting the effective deformation gradient F of the cell, the effective first Piola-Kirchhoff stress tensorP in Pa and the effective potential (effective strain energy of the cell)W in J . The dataset D C will be used for the calibration of models with the objective functions (30) and (31). Additionally, 3 validation tests, to be addressed as the test dataset D T , were conducted to check the generalization capabilities of the calibrated models. The dataset D T consists of 2 biaxial tests with different stretch ratios (test 1 and 2) and a test combining tension and shear (test 3). Further technical details of the conducted simulations are described in "Appendix C" . The dataset D C is composed of 905 data points (each with a corresponding triple (F,P,W )), while D T contains 603 data points. It should be noted that these numbers are relatively low in terms of the dimensionality of the input and output space. Since the input space of deformation tensors is nine-dimensional, an equidistant distribution of only 10 points in each dimension would yield 10 9 data points. The current investigation aims at the examination of the proposed models with mechanically sensible design with small amounts of data.
All components of the effective deformation and stress tensors for D C and D T for the X cell are displayed in Figs. 2 and 3 over the corresponding process parameter, respectively. The data for the potential will not be illustrated, we focus the visualization of the deformation and stress state. In the uniaxial, biaxial, planar and volumetric simulations, the process parameter is F 11 , where typically 0.5 ≤ F 11 ≤ 1.5, while in simple shear F 12 is used with 0 ≤ F 12 ≤ 0.5. In the test data F 11 is used as process parameter. All plots show the components of the corresponding 3-by-3 matrices for F (left column in Figs. 2 and 3) and P (right column in Figs. 2 and 3), denoted by colors in the legends.
The X cell has been chosen as a first benchmark example, since it shows vastly differing stress responses for F 11 > 1 and F 11 < 1, as shown in Fig. 2. Furthermore, the stress response in the uniaxial case and biaxial case differ by an order of magnitude. Finally, buckling manifests itself in the volumetric case very early for F 11 < 1 and the shear case activates several stress components. This very particular and highly nonlinear effective behavior occurs for the X lattice type, since it does not have any beams along the cell edges. The soft responses in compression result from bending of the struts, while the increasingly stiff behaviors in tension result from a combination of bending and stretching of struts. Only in volumetric deformation, all struts are subjected to equal loading, which causes the much stiffer behavior in tension and buckling in compression. Based on these observations, the X cell offers an excellent challenging study case for the proposed anisotropic material models.
In order to further verify the calibrated material models, the test cases (test 1, 2 and 3) are taken into consideration, see Fig. 3. Test 1 and 2 are biaxial tests with different stretching ratio along the 1 and 2 directions (see F 11 and F 22 is the left plots of Fig. 3), such that the adaptive properties of the calibrated models solely based on D C are to be tested. Test 3 offers a more challenging scenario combining tension and shearing (see F 11 and F 12 components) and a highly different resulting stress behavior of the non-symmetric P including instabilities for compression (F 11 < 1) and tension (F 11 > 1), see P 11 , P 12 and P 21 in the bottom right plot of Fig. 3. This behavior is not encountered in any of the cases in D C . It should, therefore, be stressed that the calibration dataset D C contains less complex mechanical scenarios compared to the test dataset D T . This makes, form the authors' perspective, the dataset D T a good benchmark dataset to evaluate the generalization properties of the considered models. Interlude Before continuing to the advanced models, it should be shortly discussed that the elementary, linear constitutive relation of St. Venant's law P SV = F(C : (F T F − I)/2) with cubic stiffness C = 3 i=1 c i C i is not at all suitable for the finite-deformation modeling of the considered structures.
For a clear example, consider the uniaxial case of the X cell depicted in Fig. 2. The uniaxial case with diagonal F imposes F 11 values and is carried out with the stress free condition P 22 = P 33 = 0, such that The (F 11 , F 22 ) uni path of the simulated uniaxial case and the fictitious path for (F 11 , F 22 ) SV−ideal based on (33) are displayed in Fig. 4 (left plot). It can be seen that the simulation path (F 11 , F 22 ) uni (blue curve) does not deviate much from the fictitious one (F 11 , F 22 ) SV−ideal (orange curve) yielding P SV 22 = 0. The P 11 values corresponding to the simulated (F 11 , F 22 ) uni are depicted in blue in Fig. 4 (right plot), while P SV 11 corresponding to (F 11 , F 22 ) SV−ideal is illustrated in orange. It can be seen, that these stress curves agree reasonably in the vicinity of F 11 = 1. But it should be stressed that P SV 11 for (F 11 , F 22 ) SV−ideal corresponds to a fictitious deformation, it does not correspond to the real deformation (F 11 , F 22 ) uni of the cell. Evaluation of the real cell deformation (F 11 , F 22 ) uni with St.-Venant's law yields the green curve shown in Fig. 4 (right plot).
From this comparison, it can be concluded that outside the infinitesimal deformation regime around F ≈ I, St. Venant's law may not be acceptable for modeling the effective constitutive behavior of such a metamaterial, not even for the simple uniaxial case. Naturally, the cells under consideration pose further challenges for a material law, e.g., manifestation of buckling in compression and tension (cf. test 3 in Fig. 3), such that we do not further discuss St. Venant's law and move on to the comparison of the ML approaches.

Comparison of calibrated models
The four considered constitutive models are calibrated with the dataset D C . For the model W I R = 7 has been chosen. For the FFNNs in the ML models, the following architec-  For the W I model it can be seen by the very high M SE W for D T and for D C that it does not perform well, despite the high number of model functions in each of the E i functions (R = 7) and the incorporation of the elastic stiffness for small deformations. The W MLW model shows a significantly better performance than the W I approach. Various architectures achieve satisfactory results in the M SE W for D C (second last column). It can be seen through M SE P for D C (last column), that the stress errors dominate. The same holds in the evaluations for D T . No monotonicity with respect to the M SE W for D T with respect to the size of the FFNNs can be seen. This is well known in the ANN literature, larger networks do not always yield better results and a unique "best" performing architecture is usually hardly identifiable. The elastic model P MLP shows good results with respect to its objective function (M SE P for D C , last column), but does not achieve the generalization quality of W MLW with respect to M SE P for D T (fourth column). Finally, the W MLI model shows acceptable results larger than W MLW for D C , but a dramatic loss of generalization quality with respect to D T . In the following, we consider for W MLW the tabulated instance with N [8,8], for P MLP the instance with N [8,8,8] yielding M SE P = 5.00 · 10 2 and for W MLI the instance with N [8,8,8]. This choice is based on pure pragmatism targeting the selection of models yielding comparable and tolerable quality (within the corresponding model groups) with low computational complexity. In terms of evaluation speed of the chosen instances, 100,000 deformation gradients are evaluated in a single forward pass by W MLW in 0.89 s (potential and stress values), by P MLP in 0.62 s (only stress values) and by W MLI in 0.18 s (potential and stress values). This evaluation speed seems acceptable for FE simulations requiring the evaluation of the material model at a corresponding numbers of integration points.
In terms of the approximation of the material state at F = I and for rigid body rotations, the chosen ML-model instances have been evaluated for 100 random rotations. The corresponding maximum of the stress norm P over the 100 sampled rotations evaluates to 1. 27  , are addressed for the sake of brevity simply as W MLW , P MLP and W MLI , respectively. Evaluation of all stress components of P of the calibrated models for the D C (uniaxial, biaxial, planar, volumetric and shear cases) is depicted in Fig. 5. The simulation data is depicted by the continuous lines, while the model predictions are depicted by the dashed lines. The models W I , W MLW , P MLP and W MLI are depicted in the first, second, third and fourth column, respectively.
As shown in Fig. 5, the theoretical model for the stresses based on the potential W I is not able to adequately fit the homogenized material behavior of the metamaterial, even though the model of [20] carries the cubic symmetry and uses the stiffness C for infinitesimal deformations (extracted from the simulation data), such that the stress tangent matches exactly around F ≈ I. The performance of the calibrated model W I is very low, not only with respect to the potential values, cf. Table 1, but also with respect to the derived stresses. The reason for the low performance of the model W I in this example is the choice of the special class of isotropic function in (16) and of core functions in (17). More specifically, the chosen core functions are too smooth to be able to fit the manifestation of the instabilities due to beam buckling In the second column of Fig. 5 the stresses derived from the ML model W MLW are depicted. As visible in the plots, the stress predictions of this ML model are almost indistinguishable from the simulation data. This remarkable results show that this machine learning-based hyperelastic model fulfilling the principle of objectivity and all anisotropy conditions of the material law is also able to capture manifestation of the instability in the volumetric case.
The evaluation of the stresses for the model P MLP is shown in the third column of Fig. 5. This model has been calibrated solely with the stress data of the simulations. It achieves very good results with respect to the calibration data and is able to capture the instability in the volumetric case.
In the last column of Fig. 5 the stresses based on the hybrid model W MLI are shown. Compared to W I , the hybrid W MLI is able to fit the simulation data very well and also to capture the instability in the volumetric case. This is due to the far more general approach using Reiner's representation theorem (26) and insertion of highly flexible ML functions for φ i j . This result demonstrates that theory-based models can very well be extended by ML methods at appropriate decision points. Examination of test cases As done above, the chosen MLmodel instances, i.e., W MLW with N [8,8], P MLP with Fig. 6 Evaluation of calibrated models W I (first column), W MLW (second column), P MLP (third column) and W MLI (fourth column) for cubic X cell for test 1, 2 and 3 (from top to bottom). Continuous lines depict the simulation data, while the dashed lines denote the calibrated models N [8,8,8] and W MLI with N [8,8,8], are addressed for the sake of brevity by W MLW , P MLP and W MLI , respectively. After calibration, the models have been evaluated with the test data, as shown in Fig. 6. As for the previous figure, the simulation data is depicted by the continuous lines, while the model predictions are depicted by the dashed lines. The models W I , W MLW , P MLP and W MLI are shown in the first, second, third and fourth column, respectively.
It can be clearly seen that W I does not generalize well in any form, which is expected since it did not yield satisfactory results for D C . The model W MLW in the second column shows excellent generalization properties, which is best seen in the third test. The instability is well predicted in compression (F 11 < 1) as well as in tension (F 11 > 1). Furthermore, the hyperelastic model W MLW shows in the third test better generalization with respect to P 11 , compared to P MLP , and far better compared to W MLI . The model W MLI does not generalize well. As in the evaluation of W I , the derived stresses of W MLI grow too fast and diverge very quickly from the material behavior in the third test. The authors suspect that this behavior may be an intrinsic property of the structure of the model of [20]. Re-calibration with test data If more calibration data is provided, better results can be obtained with the highly flexible ML models proposed in the present work. The models W MLW , P MLP and W MLI have been re-trained with the complete simulation data (i.e., concatenating D C and D T ) for 5,000 additional epochs. The evaluation of the re-calibrated models with D C is depicted in Fig. 7. The evaluation of the re-calibrated models with D T is displayed in Fig. 8. It can be clearly seen that the ML model W MLW not only maintains its excellent performance in the calibration cases but also improves very well in capturing the material behavior in the test cases, cf. Fig. 8. This also holds for the stress model P MLP . The re-calibrated model W MLI improves its performance in the test cases, cf. Fig. 8. In this example, the ML hyperelastic model W MLW shows its excellent performance in terms of (1) capturing the manifestation of instabilities, (2) generalization behavior and (3) maintaining accuracy during re-calibration. The stress model P MLP shows also excellent results, but does not offer, strictly speaking, a hyperelastic model. The ML-extended approach W MLI shows its ability to go beyond [20] and to capture instabilities, but does not achieve the prediction quality of the former ML models.

Simulation data
This second example follows the structure of the previous one, but a different cell topology is examined, the "BCC" cell, which is displayed in Fig. 9. Hereby, it should be noted that a periodic arrangement of the cell is considered through the periodic boundary conditions in all simulations. The cell consists of a body-centered architecture that has a cubic frame (through periodic extension) added with Fig. 7 Evaluation of re-calibrated models W MLW (first column), P MLP (second column) and W MLI (third column) of X cell for calibration cases uniaxial, biaxial, planar, volumetric and shear (from top to bottom). Continuous lines depict the simulation data, while the dashed lines denote the re-calibrated models Fig. 8 Evaluation of re-calibrated models W MLW (first column), P MLP (second column) and W MLI (third column) of X cell for tests 1, 2 and 3 (from top to bottom). Continuous lines depict the simulation data, while the dashed lines denote the re-calibrated models further beams on all edges of the cell. This stiffens the cell, but makes it also susceptible to instabilities already in uniaxial compression and other simple cases, cf. Fig.  9. Therefore, this second example provides an even more challenging scenario in terms of complex effective material behavior.
As in the previous example, 3D simulations for the computation of the effective constitutive behavior of the cell are conducted. The calibration dataset D C (consisted of the uniaxial, biaxial, planar, volumetric cases as in the previous example), is depicted in Fig. 10 by the continuous lines. The test dataset D T (composes of the test 1, 2 and 3 as in the previous example) is depicted in Fig. 11 by the continuous lines. All simulation cases of the corresponding datasets are illustrated over the respective process parameter. In the shear case, the process parameter is the F 12 component of the effective deformation gradient of the cell, while in all other cases the F 11 is taken as the process parameter. It should be noted that the BCC cell, compared to the X cell, exhibits further instabilities in uniaxial, biaxial and planar compression (F 11 < 1), cf. Fig. 10.

Uni cell
Uniaxial compression Volumetric compression Fig. 9 Elementary unit BCC cell of a periodic arrangement (left), deformation for uniaxial compression (middle), deformation for volumetric compression (right)

Comparison of calibrated models
In this second example we compare the performance of the models W MLW , P MLP and W MLI . As in the previous example, we conduct the same architecture sweep with the same 48 architecture instances for each ML model. The five best performing instances of each model with respect to D T are tabulated in Table 2. As in the previous example, Table 2 shows that different architectures achieve comparable quality with respect to D T .  The hyperelastic models W I/MLW/MLI have been trained on M SE W with D C (second last column) and sorted by increasing M SE W with D T (third column). The elastic model P MLP has been trained on M SE P with D C (last column) and sorted by increasing M SE P with D T (fourth column) The major challenge is still the minimization of the dominant stress errors, which constitute the major part of the M SE W in the hyperelastic models. For D T , with respect to M SE W the W MLW achieves better results than W MLI . With respect to M SE P the W MLW model performs slightly better than P MLP and W MLI . In the following, from Table 2 one instance for each group is selected: for W MLW the instance with N [16,16], for P MLP the instance with N [8,8,8,8] and for W MLI the instance with N [32]. These selected instances will be referred to as W MLW , P MLP and W MLI , for the sake of brevity.
The selected instances have been evaluated with D C . The corresponding stress values are depicted in Fig. 10 by dashed lines. The second column corresponds to the evaluation of W MLW . The third column shows the predictions of the stress model P MLP . The right column depicts the predictions of W MLI . All models are able to capture the main features of the BCC cell, where W MLI shows the lowest performance, as also reflected by the M SE P for D C in Table 2.
Evaluation of the selected instances for D T is displayed in Fig. 11. As reflected by the corresponding M SE P of Table  2, all models show comparable performance. Recalibration of the selected instances for 5,000 additional epochs with the union of D C and D T improves the performance of all instances for D T , as displayed in Fig. 12.
It should be noted that the selected instance W MLW with N [8,8,8] has the least number of parameters from the selected ones. Still, it shows an excellent performance, despite the small size of the core FFNN. In view of the rel-atively low model-complexity of W MLW and the challenging nonlinearities of the present cell, its generalization behavior seen in Fig. 11 is considered as excellent.

Conclusions
The present work proposes new anisotropic models for hyperelasticity at finite deformations, which combine material theoretical considerations with machine learning methods. In this way, data-driven approaches are developed that fulfill by construction the principle of objectivity and material symmetry conditions for any given material symmetry group. In particular, a ML-based hyperelastic model W MLW through (18), a ML-based stress model P MLP through (23), and a ML-extended hyperelastic model W MLI based on (28), which is motivated from the work of [20], are introduced. In all ML models, any suitable ML approach can be inserted. In the present work, differentiable FFNNs were used in order to ensure the computation of stresses and potential stress tangents.
These ML-based approaches have been compared to the highly flexible theoretical model proposed by [20] and among themselves. As benchmark examples, simulation data of the homogenized constitutive behavior of cubic 3D lattice cells has been gathered. These considered flexible metamaterials exhibit finite deformations with strong instabilities in various loading scenarios depending on the cell topology. In the first example of the X cell, the approaches of the present work manage not only to capture the instability in volumetric compression, but they also forecast the manifestation of instabilities in more complex test scenarios not contained in the calibration data. Thus, they largely outperform the constitutive model of [20]. In the second example of the BCC cell, the ML models are trained with synthetic data of a BCC cell, which even exhibits several instabilities. All instabilities are reproduced by the ML models, at what W MLW shows the best performance.
We conclude that the novel approaches proposed in the present work not only offer better results than the approach of [20], but higher flexibility at minor intrusion. Thus, the hyperelastic ML-based model W MLW based on (18) showed the best generalization behavior and the best prediction quality in the investigated examples. The model W MLW offers an attractive alternative for anisotropic hyperelastic constitutive models, not only for lattice structures, but for other anisotropic hyperelastic cases, with potential application as material law in finite element simulations. For non-hyperelastic problems, the stress model P MLP may be used as an initial structure for future developments of data-driven constitutive laws. The ML-extended approach W MLI does not offer the prediction quality of the former ML models, but shows that mechanical models, as the one of [20], can be extended at key modeling points in order to capture new constitutive behavior, as the instabilities of lattice structures. All models have been calibrated with small amounts of data based on few mechanical simulations. As shown in the re-calibration of all ML models, the more data is provided, the better performance can be achieve due to their flexibility. Finally, it should be remarked that no single best performing architecture has been identified in the provided examples. For the W MLW , based on Tables 1 and 2, it appears that the small architecture N [8,8,8] yields very good results at low complexity. For the remaining ML models, very different architectures achieve among themselves comparable quality. It is, therefore, recommended to always perform an architecture sweep for new hyperelastic microstructures, if resources are available. This not only helps to obtain a set of high-quality candidates, but also to allow a model selection with a sensible compromise between prediction quality Acknowledgements The authors thank the anonymous reviewers for their valuable comments, which have motivated the authors to improve the present work.
Funding Open Access funding enabled and organized by Projekt DEAL.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.

Data availability
The authors provide access to the complete simulation data through the public GitHub repository https://github.com/CPShub/ sim-data Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecomm ons.org/licenses/by/4.0/.

Appendix: A Auxiliary proofs
In this appendix, we prove some properties of the hyperelastic model (18) and of the stress model (23).
We begin with the hyperelastic model (18). For a compact notation in the proof in this appendix, we drop the parameters p of (18), such that we consider the core scalar functioñ W (C) with the optional extension (21) leading to the modi-fied model of (18) The scalar function W (F) defined in (37) corresponds to (18) including (21). The function W (F) fulfills objectivity, the anisotropy conditions and yields a stress free state for arbitrary rigid body motions. In order to prove these properties, one elementary feature of group theory is required. Two . . , n for any R ∈ G yields, due to (1) the closure axiom of groups and (2) the relation (3), that G then is simply a reshuffled list of the elements of G. This implies that G and G are equal in the sense of sets, i.e., Summations over G or G then yield the exact same results. This elementary property is used for (37) as follows This proves that W (F) in (37) fulfills the anisotropy conditions of the group G (3). We now prove the vanishing stresses at F = I for W (F) as defined in (37). We define the following auxiliary quantities and consider for a differentiable function f ( A) and corresponding gradient G( A) = ∂ f /∂ A the following elementary application of the chain rule Inserting (37) into (39), application of (42) and taking (40) into consideration yields The result (43) is a consequence of the group symmetrization of W 0 in (37), such that P inherits the fulfillment of the group conditions (7) This immediately proves that the ML approach (23) fulfills the stress anisotropy conditions (7). For the evaluation of P(R) for any R ∈ SO(3), we can see in (43) that P 0 (R ) for arbitrary R = R Q ∈ SO(3) is to be computed. This requires the evaluation of S 0 (I), cf. (40), which finally leads to the necessary examination of the gradient ofŴ 0 (C), cf. (41). Based on the definition (35) of W 0 , its gradient is computed as It then becomes visible that for C = I the gradient ofŴ 0 vanishes per construction, implying P(R) = O for any R ∈ SO(3).

Appendix: B Feedforward neural networks
Feedforward neural networks (FFNN) are a special class of artificial neural networks. A neuron is modeled as the atomic unit processing the input x through an activation function a with weights w and bias b as follows A standard FFNN is a network connecting several neurons and transporting signals from layer to layer. Such a network with a total of L layers with input vector x ∈ R n can be defined recursively with layer index l = 1, . . . , L as follows The activation function of layer l, a [l] , is applied componentwise. The weights and biases of layer l with n [l] neurons are denoted by W [l] ∈ R n [l] ×n [l−1] and b [l] ∈ R n [l] . The layers l ∈ {1, . . . , H } with H = L − 1 are referred to as hidden layers, while the last layer l = L is referred to as the output layer.
For the potential approach built upon (18), the FFNÑ W (C) is constructed by extracting the six independent components of C, yielding a six-dimensional input vector x [0] ∈ R 6 . The activation functions of all hidden layers are chosen as the softplus function such that the network is differentiable for arbitrary input. The activation function of the last layer is chosen as the identity function, such that the scalar output [H ] ∈ R 1 , i.e.,W (C) = y is a linear combination of the outputs of the last hidden layer. The network is then evaluated for the given input C = F T F, and group symmetrized to the final model output W (F) as defined in (18). For the stress approach (23) built upon (24), the FFNN gets the six independent components of C as input vector x [0] ∈ R 6 , which is then processed through H hidden layers with the softplus function as activation function. The output layer with the identity function as activation layer returns a six-dimensional output vector y = x [L] ∈ R 6 , which corresponds to the six independent components of the symmetric S =S T . The network is then evaluated and group symmetrized for S as defined in (24). The final model output P is obtained as P(F) = F S(F T F).
For each of the functions φ i0/1/2 (I , I I , I I I ) in the hybrid approach formulated in (26) and (28) Table 3. It should be noted that for equal number of layers and neurons per layer

Appendix: C Effective constitutive behaviour of soft lattice unit cells
To characterize the effective behaviour of flexible beamlattice metamaterials, we homogenize the simulated stressstrain response of the lattice unit cells under various deformation scenarios. For this purpose, we use the experimentally validated micromechanical computational model of [22], which is based on the nonlinear buckling analysis of beamlattice structures and implemented in the commercially available finite element package Abaqus. First, a linear buckling analysis of the lattice model is performed to obtain the buckling mode shapes. These are then multiplied by an imperfection factor and incorporated into the lattice model as geometric imperfections. These perturbations of the lattice wireframe can be associated with the inevitable manufacturing imperfections that result in a softer structural response and smooth buckling scenarios, while instantaneous buckling would occur for perfect geometries. The large deformation response of the resultant imperfect lattice model is then simulated by a quasi-static nonlinear analysis. As detailed in [22], the total number of imperfection mode shapes is determined via a structural response convergence analysis and the imperfection factor is estimated by fitting the converged simulated response to experimentally obtained load-displacement curves.
To characterize the effective stress response of a unit cell subjected to an arbitrary effective deformation, an effective deformation gradient is applied to the unit cell under periodic boundary conditions (PBC). Considering a 3D Timoshenko beam finite element model of a typical "X"-type cubic 8node unit cell of size L • as shown in Table 4, each unit cell node has six degrees of freedom (DOFs), including three transnational DOFs 1, 2, 3 denoting nodal translations along axis x, y, z, respectively, and three rotational DOFs 4, 5, 6 denoting nodal rotations about axis x, y, z, respectively. Let U N (i) DO F represent the displacement boundary condition on a DO F ∈ {1, . . . , 6} of node i ∈ {1, . . . , 8} denoted by N (i).
On top of all finite element nodes in unit cell model, six reference points are created, each having the same six DOFs as those of the beam element nodes. Let U R P(k) DO F represent the boundary condition on a DO F ∈ {1, . . . , 6} of reference point k ∈ {1, . . . , 6} denoted by R P(k). PBC are imposed to the unit cell by a set of kinematic constraints defined in Table 4. Subsequently, the arbitrary effective deformation gradient F = I + H under PBC is applied to the unit cell by applying displacement boundary conditions on the reference point DOFs as where H is the effective displacement gradient tensor and I is the identity tensor. Without loss of generality, the zero entries in the effective displacement gradient tensor are inevitable due to technical reasons in simultaneous implementation of PBC and prevention of rigid body motions. The rigid body motions (translations and rotations) of the unit cell are prevented by applying fixed boundary conditions on the unit cell nodes as given in Table 5.
From the simulated response of the unit cell to the applied deformation gradient, the effective first Piola-Kirchhoff stress tensor P(F) is calculated based on the reaction forces on reference points and fixed unit cell nodes as where R F N (i) DO F is the reaction force on a particular DOF of unit cell node i and R F R P(k) DO F is the reaction force on a specific DOF of reference point k. Here, in the case of a lattice unit cell modeled by beam finite elements, this approach is equivalent to continuum averaging of P. Likewise, the effective strain energy density W (F) is computed from the discrete strain energy of the beam model divided by the volume of the unit cell L • 3 . For characterizing the effective constitutive behaviour of a lattice unit cell type in terms of W (F) or P(F), simulated data is generated via five standard tests that are commonly used for experimental characterization of hyperelastic material models, including uniaxial, biaxial, planar, and volumetric tension and compression, as well as simple shear. The effective displacement gradient for these stan-  Table 6 The effective displacement gradient and the corresponding boundary conditions under PBC for the standard characterization cases and our designed evaluation tests 1 to 3 Test Displacement gradient H(λ) dard tests and the corresponding boundary conditions are given in Table 6. We have further designed three test cases to evaluate the developed constitutive model. The displacement gradient tensor and the corresponding boundary conditions for the evaluation tests 1 to 3 are also given in Table 6. In the finite element simulations, each reference point DOF is either subjected to an applied boundary condition or free to move, on top of the PBC constraints. In the former case, the corresponding reaction force is read and used to calculate the corresponding first Piola-Kirchhoff stress component by (52). In the latter case, the free DOF is measured to calculate the corresponding displacement gradient component by (51). As shown in Table 6, in all tests, the applied boundary conditions are proportional with the proportionality constant λ ∈ [−1, 1], where λ = 0 denotes the undeformed reference configuration, λ = −1 corresponds to the maximum compressive deformation state, and λ = 1 represents the maximum tensile deformation state.