1 Introduction

Constitutive models are equations (or sets of equations) that describe the response of a material to imposed loads, deformations, and/or temperature changes. The philosophy behind formulating constitutive models has evolved considerably over the years, as illustrated schematically in Fig. 1. We focus here on models that consider data from macroscopic tests as the basis for their development, rather than models that consider microstructural/micromechanical information. The oldest stage can perhaps be traced back to Robert Hooke [1], and involves writing simplified and often empirical equations to describe observations of the material’s response within a restricted range of loading conditions (e.g., the ideal Hookean elastic solid). The modern continuum theory of constitutive modeling introduced a different philosophy: it begins with very general functional constitutive equations that are constrained by physical laws and thermodynamic considerations, and seeks to apply specializing assumptions to obtain the specific model (based on the observed material response) as late as possible [2,3,4]. This approach is very widely used, and is particularly valuable for formulating constitutive models for materials that exhibit complex deformation response (e.g., large deformations), because broad classes of constitutive equations can be developed without extensive experimental exploration. Constitutive models for soft materials—where the mechanical response involves large deformations, nonlinear stress–strain behavior, and strain-rate or time-dependence—have mostly been developed following this paradigm.

Fig. 1
figure 1

Schematic diagram showing the different stages in the evolution of constitutive modeling

The third stage of constitutive modeling utilized machine learning (ML) tools to create mappings between stress and deformation. This approach has been used for many material classes over the last three decades [5,6,7,8,9,10,11]. ML offers numerous advantages: (i) it can directly utilize experimental data with no requirement of physical laws or expert knowledge of response trends, (ii) it can capture complex trends when given sufficient training data, and (iii) it offers potentially lower computational cost. However, the accuracy of these “black-box models" is limited due to their requirement of a large amount of training data, which is often not available in real experimental settings. Further, because these models are not restricted by physical considerations, they can show very poor prediction accuracy in regimes that are not considered during model training [12,13,14].

The most recent phase of constitutive modeling approaches seeks to combine the physics-informed nature of the continuum theory of constitutive modeling (i.e., the second stage) with the flexibility and efficiency of classical ML (i.e., the third stage) to formulate "physics-informed data-driven constitutive models" [12, 15,16,17,18,19,20]. For example, Liu et al. [18] recently proposed a physics-informed neural network material model (NNMat) for isotropic hyperelastic soft tissues. This model is based on a mapping between the invariants of the right Cauchy–Green deformation tensor and the derivatives of the strain energy density with respect to the invariants, and imposes convexity constraints in the NN loss function to ensure physically reasonable predictions. Frankel et al. [19] employed Gaussian process regression (GPR) to create a mapping between the invariants of the right Cauchy–Green deformation tensor and the coefficients of the irreducible integrity basis of the stress tensor (for isotropic hyperelastic solids). Fuhg and Bouklas [12] showed that this type of generalized functional constitutive equation automatically ensures a number of physical constraints on material behavior: material frame-indifference, material symmetry, the balance of angular momentum, and the second law of thermodynamics (as the Clausius–Planck inequality), where the latter is imposed locally at training points.

Although important and useful, these GPR-based models are limited in applicability to hyperelastic solids that assume no strain-rate-dependence in material behavior, and were trained using a large amount of artificially generated data. In practice, soft materials often show strain-rate dependence in material response. In addition, only sparse data is usually available from mechanical experiments over a limited range of strains and strain rates [21,22,23].

For completeness, it is important to note that in addition to the physics-informed data-driven constitutive modeling approach (used in this work), a "model-free" computing paradigm is gaining interest in the computational mechanics field. This framework avoids explicit mathematical models by reformulating the classical boundary value problem (BVP) into a distance minimization or search problem. Examples include the series of pioneering works by Ortiz and co-workers that seek to solve BVPs by searching for a state in an available material data set that is closest to a constraint set comprising all states fulfilling the equilibrium and compatibility equations [24,25,26], and the MAP series of works by Guo and co-workers that project a three-dimensional stress/strain state onto a reduced-dimensional state (e.g., uniaxial) and then employ search techniques to locate the closest data point in an available material data set [27,28,29].

The goal of this work is to develop a physics-informed data-driven constitutive model that can capture the strain-rate-sensitive mechanical response of visco-hyperelastic soft materials, while allowing for robust extrapolation/generalization beyond the limited training data, consistent with current experimental practice. To this end, this work leverages the recent contributions by Frankel et al. [19] and Fuhg and Bouklas [12] for hyperelastic soft materials, which enables some form of generalization—consistent with phenomenological modeling—contrary to model-free approaches that are valid only in paths where data is available. Note that strain rate-dependence in visco-hyperelasticity can be classified into short-time (e.g., high strain rate deformation) and long-time (e.g., creep and relaxation) responses [30]. This work focuses only on the short-time response, which is of special interest in the field of injury biomechanics (e.g., simulations of crashes, blast, and ballistic impact) and in the design of protective equipment [31,32,33,34].

The paper is organized as follows: Sect. 2 formulates a generalized functional constitutive equation for visco-hyperelastic soft materials, which forms the basis for the data-driven model proposed in this study. In Sect. 3, a data-driven mapping is defined for the generalized constitutive equation. The physics-based constraints on this data-driven mapping, which stem from both the generalized model construction and the mapping approach, are also summarized. Gaussian process regression is the primary supervised learning method utilized in this work, which allows the imposition of several additional physical constraints. These are described in Sect. 4. Section 5 demonstrates the fitting and prediction performance of our GPR-based physics-informed data-driven constitutive model on several numerical tests that consider multiple deformation modes and a wide range of loading rates. The performance of our model is also compared against both conventional visco-hyperelastic constitutive models (i.e, from the second stage of the constitutive modeling evolution) and the classical ML mapping models (i.e., from the third stage). Finally, Sect. 6 presents a summary of this work.

2 Generalized visco-hyperelastic constitutive framework

In general, every admissible thermomechanical process (and so the choice of a constitutive model) must satisfy the Clausius–Duhem inequality,

$$\begin{aligned} \rho _0\dot{\eta }_0 + \nabla _0\cdot \left( \frac{\textbf{Q}}{\theta }\right) - \rho _0 \frac{R}{\theta } \ge 0 \end{aligned}$$
(1)

where \(\rho _0\) is the mass density, \(\eta _0\) is the specific entropy, \(\textbf{Q}\) is the heat flux, R is the rate of internal heating per unit mass, and \(\theta \) is the absolute temperature. Subscript 0 denotes the reference configuration, and the operator \(\left( \nabla _0\cdot \right) \) represents divergence. Further, in the reference configuration, the local form of the conservation of energy is

$$\begin{aligned} \rho _0\dot{u}_0 = \textbf{T}^0 \cdot \cdot \dot{\textbf{F}} - \nabla _0 \cdot \textbf{Q} + \rho _0 R \end{aligned}$$
(2)

where \(u_0\) is the specific internal energy, \(\textbf{T}^0\) is the nominal stress tensor, and \(\textbf{F}\) is the deformation gradient tensor. The symbol \(\cdot \cdot \) represents the tensor scalar product, such that \(\textbf{T}^0 \cdot \cdot \dot{\textbf{F}} = T^0_{ij}\dot{F}_{ji}\) in rectangular Cartesian coordinates. By substituting R from Eq. (2) in Eq. (1) and introducing the Helmholtz free energy function, \(\psi = u_0 - \theta \eta _0\), we obtain

$$\begin{aligned} {\textbf{T}^0 \cdot \cdot \dot{\textbf{F}} - \rho _0\left( \dot{\psi } + \dot{\theta }\eta _0\right) - \frac{\textbf{Q} \cdot \nabla _0 \theta }{\theta } \ge 0}. \end{aligned}$$
(3)

Restricting the focus for the moment to isothermal deformations only, we particularize the second law of thermodynamics through the Clausius-Planck inequality as

$$\begin{aligned} \Xi _{int} = \textbf{T}^0 \cdot \cdot \dot{\textbf{F}} - \rho _0 \dot{\psi } \ge 0 \end{aligned}$$
(4)

where \(\Xi _{int}\) is the internal dissipation or local entropy production [35]. Equation 4 represents the primary physical constraint on the possible mathematical forms of our generalized constitutive equation.

In addition to the Clausius–Planck inequality, another postulate that constrains the forms of constitutive equations is the principle of local action, which states that material response at a given point depends only on conditions in the close vicinity of that point (e.g., \(\psi =\psi (\theta ,\textbf{F},\nabla _0\textbf{F},\dots )\)). Assuming that the Helmholtz free energy function depends only on \(\theta \) and \(\textbf{F}\) (i.e., neglecting higher-order gradients) so that \(\psi =\psi (\theta ,\textbf{F})\), Eq. (4) becomes

$$\begin{aligned} \Xi _{int} = \left( \textbf{T}^{0^\text {T}} - \rho _0 \frac{\partial \psi }{\partial \textbf{F}}\right) : \dot{\textbf{F}} \ge 0. \end{aligned}$$
(5)

In terms of the right Cauchy–Green deformation tensor, \(\textbf{C} = \textbf{F}^\text {T}\textbf{F}\), and the hyperelastic strain energy density function, \(W_h\),

$$\begin{aligned} \Xi _{int} = \left( \textbf{T}^{0^\text {T}} - 2\textbf{F} \frac{\partial W_h(\textbf{C})}{\partial \textbf{C}}\right) : \dot{\textbf{F}} \ge 0. \end{aligned}$$
(6)

For an ideal hyperelastic material, the inequality in Eq. (6) reduces to equality under the assumption that the material undergoes zero dissipation during deformation. This leads to a rate-independent form of the constitutive model. While this assumption holds reasonably well in the case of quasi-static deformations, the response of soft materials under high strain rate deformation is associated with irreversible thermodynamic processes resulting from viscous dissipation effects stemming from microscale processes [36, 37]. This is the case when the deformation time scale is very small compared to the time it takes for the internal material microstructure to rearrange or relax under load. Under these conditions, the existence of a viscous dissipation potential (also called pseudo-potential) \(W_v\) that describes the energy dissipation is usually assumed [30, 38,39,40,41], such that

$$\begin{aligned} \Xi _{int}= & {} \left( 2\textbf{F}\frac{\partial W_v(\textbf{C},{\dot{\textbf{C}}})}{\partial \dot{\textbf{C}}}\right) :\dot{\textbf{F}}\nonumber \\= & {} \left( \textbf{T}^{0^\text {T}} - 2\textbf{F} \frac{\partial W_h(\textbf{C})}{\partial \textbf{C}}\right) : \dot{\textbf{F}} \ge 0. \end{aligned}$$
(7)

Rearranging Eq. (7) and introducing the second Piola-Kirchhoff stress tensor \(\textbf{S}\) defined by \(\textbf{S}= \textbf{F}^\text {-1} \textbf{T}^{0^\text {T}}\), we can define two components of \(\textbf{S}\):

$$\begin{aligned} \textbf{S} = \textbf{S}_h + \textbf{S}_v = 2\frac{\partial W_h(\textbf{C})}{\partial \textbf{C}} + 2\frac{\partial W_v(\textbf{C},{\dot{\textbf{C}}})}{\partial \dot{\textbf{C}}}. \end{aligned}$$
(8)

From Eq. (8), the total stress in the material may be additively decomposed into a hyperelastic stress component (\(\textbf{S}_h\)) and a viscous overstress component (\(\textbf{S}_v\)). Note that because both the hyperelastic strain energy density and the viscous dissipation potential are functions of symmetric tensors (i.e., \(\textbf{C}\) and \(\dot{\textbf{C}}\)), the stress tensor \(\textbf{S}\) is also symmetric. This ensures the physical constraint of the balance of angular momentum. Further, it can be readily confirmed that Eq. (8) also satisfies the principle of determinism or causality (i.e., material response at a given instant is only a function of past and present events).

Next, assuming that the material is isotropic, Eq. (8) can be written in terms of scalar invariants of the tensors \(\textbf{C}\) and \(\dot{\textbf{C}}\) [3, 30],

$$\begin{aligned} \textbf{S}= & {} \textbf{S}_h + \textbf{S}_v = 2\frac{\partial W_h(I_1,I_2,I_3)}{\partial \textbf{C}}\nonumber \\{} & {} + 2\frac{\partial W_v(I_1,I_2,I_3,J_1,J_2,J_3,J_4,J_5,J_6,J_7)}{\partial \dot{\textbf{C}}} \end{aligned}$$
(9)

where the invariants are given by

$$\begin{aligned} I_1&= \text {tr}\textbf{C},\quad I_2 = \frac{1}{2}[(\text {tr}\textbf{C})^2 - \text {tr}(\textbf{C}^2)], \quad I_3 = \text {det}\textbf{C} \end{aligned}$$
(10a)
$$\begin{aligned} J_1&= \text {tr}\dot{\textbf{C}}, \quad J_2 = \text {tr}({\dot{\textbf{C}}}^2), \quad J_3 = \text {det}(\dot{\textbf{C}}) \end{aligned}$$
(10b)
$$\begin{aligned} J_4&= \text {tr}(\textbf{C}\dot{\textbf{C}}), \quad J_5 = \text {tr}(\textbf{C}{\dot{\textbf{C}}}^2), \quad J_6 = \text {tr}(\textbf{C}^2\dot{\textbf{C}}), \nonumber \\ J_7&= \text {tr}(\textbf{C}^2{\dot{\textbf{C}}}^2). \end{aligned}$$
(10c)

Following one common approach for modeling compressible materials [42], the tensors \(\textbf{F}\) and \(\textbf{C}\) are multiplicatively decomposed into dilatational and isochoric components,

$$\begin{aligned} \textbf{F} = J^{1/3}\bar{\textbf{F}}, \quad \textbf{C} = J^{2/3}\bar{\textbf{C}} \end{aligned}$$
(11)

where J is defined as

$$\begin{aligned} J = \sqrt{I_3}=\sqrt{\text {det} \textbf{C}}=\text {det}\textbf{F} \end{aligned}$$
(12)

and the modified deformation gradient tensor \(\bar{\textbf{F}}\) along with the modified right Cauchy–Green deformation tensor \(\bar{\textbf{C}}\) are introduced. These are associated with the isochoric (volume-preserving) part of the deformation, such that,

$$\begin{aligned} \text {det}\bar{\textbf{F}} = 1, \quad \text {det}\bar{\textbf{C}} = 1 \end{aligned}$$
(13)

The decomposition of the deformation gradient in Eq. (11) allows an equivalent decomposition of the stress tensor in Eq. (9) into dilatational and isochoric stress components (see [41]):

$$\begin{aligned} \textbf{S}= & {} \textbf{S}_\text {vol} + \textbf{S}_{h,\text {iso}} + \textbf{S}_{v,\text {iso}} \nonumber \\= & {} 2\frac{\partial U(J)}{\partial \textbf{C}} + 2\frac{\partial \bar{W}_h(\bar{I}_1,\bar{I}_2)}{\partial \textbf{C}}\nonumber \\{} & {} + 2\frac{\partial \bar{W}_v(\bar{I}_1,\bar{I}_2,\bar{J}_1,\bar{J}_2,\bar{J}_3,\bar{J}_4,\bar{J}_5,\bar{J}_6,\bar{J}_7)}{\partial \dot{\textbf{C}}} \end{aligned}$$
(14)

where the barred, modified invariants are

$$\begin{aligned} \bar{I}_1&= \text {tr}\bar{\textbf{C}},\quad \bar{I}_2 = \frac{1}{2}[(\text {tr}\bar{\textbf{C}})^2 - \text {tr}(\bar{\textbf{C}}^2)] \end{aligned}$$
(15a)
$$\begin{aligned} \bar{J}_1&= \text {tr}\dot{\bar{\textbf{C}}}, \quad \bar{J}_2 = \text {tr}({\dot{\bar{\textbf{C}}}}^2), \quad \bar{J}_3 = \text {det}(\dot{\bar{\textbf{C}}}) \end{aligned}$$
(15b)
$$\begin{aligned} \bar{J}_4&= \text {tr}(\bar{\textbf{C}}\dot{\bar{\textbf{C}}}), \quad \bar{J}_5 = \text {tr}(\bar{\textbf{C}}{\dot{\bar{\textbf{C}}}}^2), \quad \bar{J}_6 = \text {tr}({\bar{\textbf{C}}}^2\dot{\bar{\textbf{C}}}), \nonumber \\ \bar{J}_7&= \text {tr}({\bar{\textbf{C}}}^2{\dot{\bar{\textbf{C}}}}^2). \end{aligned}$$
(15c)

In Eq. (14), \(\textbf{S}_\text {vol}\) is the volumetric stress component, \(\textbf{S}_{h,\text {vol}}\) is the isochoric hyperelastic stress component, and \(\textbf{S}_{v,\text {vol}}\) is the isochoric viscous overstress component. These three stress components are captured by the volumetric energy density function U(J), the modified hyperelastic strain energy density \(\bar{W}_h\), and the modified viscous dissipation potential \(\bar{W}_v\), respectively. For convenience, \(\bar{W}_h\) and \(\bar{W}_v\) will henceforth be simply referred to as the strain energy density and the viscous dissipation potential, respectively. Similarly, the barred, modified invariants will be simply referred to as invariants.

Equation (14) is the generalized functional form of constitutive equation considered in this work. This generalized framework (referring to both Eq. (9) and Eq. (14)) is called external state variable driven viscous dissipation-based visco-hyperelasticity [30, 40]. Here, \(\textbf{C}\) and \(\dot{\textbf{C}}\) are the external thermodynamic state variables that are employed to relate the strain rate stiffening or softening of material response to viscous dissipation. This is in contrast to the internal state variable driven framework in which the non-equilibrium part of the Helmholtz free energy function is described via a set of internal variables that a priori lack any physical meaning (e.g., see [43,44,45,46]). Unlike this alternative class of visco-hyperelastic models, many of which result in a hereditary-integral-based equation for stress, the external state variable driven viscous dissipation-based visco-hyperelastic constitutive framework of this work is bound by the constraint of limited memory (a limiting case of the principle of fading memory). Here, limited memory means that the viscous material behavior is dependent only on the instantaneous deformation rate (i.e., very recent history), and dependence on the entire previous loading history (often described via a hereditary-integral) is neglected. First proposed by Pioletti et al. [39], constitutive models within this framework have been employed to successfully capture the rate-dependent response of numerous soft materials under rapid loading: human ligaments and tendons [41, 47, 48], skeletal muscles [49], hydrogels and elastomers [30, 33], tongue tissue [50], and the brain and pericardium [22, 51, 52], among others. These studies assume specific mathematical forms for U(J), \(\bar{W}_h(\bar{I}_1,\bar{I}_2)\) and \(\bar{W}_v(\bar{I}_1,\bar{I}_2,\bar{J}_1,\bar{J}_2,\bar{J}_3,\bar{J}_4, \bar{J}_5,\bar{J}_6,\bar{J}_7)\) based on expert knowledge and experience. The physics-informed data-driven mapping approach of the present study (Sect. 3) aims to eliminate this need for expert intervention by employing ML to flexibly discover the mapping between stress-like and strain-like or strain rate-like variables directly from limited experimental data.

The partial derivatives representing the three stress components in Eq. (14) can be expanded via the chain rule, leading to the following generalized equations for the three stress components (see derivation in the Appendix A):

$$\begin{aligned} \textbf{S}_\text {vol}= & {} \zeta _1(J)\textbf{C}^{-1} \end{aligned}$$
(16)
$$\begin{aligned} \textbf{S}_{h,\text {iso}}= & {} J^{-2/3} \left[ \Gamma _1(\bar{I}_1,\bar{I}_2) \text {Dev}(\textbf{I}) + \Gamma _2(\bar{I}_1,\bar{I}_2) \text {Dev}(\bar{\textbf{C}}) \right] \nonumber \\ \end{aligned}$$
(17)
$$\begin{aligned} \textbf{S}_{v,\text {iso}}= & {} J^{-2/3} \bigl [ \Phi _1(\bar{I}_1,\bar{I}_2, \bar{J}_1, \dots , \bar{J}_7) \text {Dev}(\textbf{I}) \nonumber \\{} & {} + \Phi _2(\bar{I}_1,\bar{I}_2, \bar{J}_1, \dots , \bar{J}_7) \text {Dev}(\bar{\textbf{C}}) + \nonumber \\{} & {} \Phi _3(\bar{I}_1,\bar{I}_2, \bar{J}_1, \dots , \bar{J}_7) \text {Dev}({\bar{\textbf{C}}}^{-1}) \nonumber \\{} & {} + \Phi _4(\bar{I}_1,\bar{I}_2, \bar{J}_1, \dots , \bar{J}_7) \text {Dev} (\dot{\bar{\textbf{C}}}) + \nonumber \\{} & {} \Phi _5(\bar{I}_1,\bar{I}_2, \bar{J}_1, \dots , \bar{J}_7) \text {Dev} ({\dot{\bar{\textbf{C}}}}^{-1})\nonumber \\{} & {} +\Phi _6(\bar{I}_1,\bar{I}_2, \bar{J}_1, \dots , \bar{J}_7) \text {Dev} (\bar{\textbf{C}}\dot{\bar{\textbf{C}}} + \dot{\bar{\textbf{C}}}\bar{\textbf{C}}) +\nonumber \\{} & {} \Phi _7(\bar{I}_1,\bar{I}_2, \bar{J}_1, \dots , \bar{J}_7) \text {Dev} (\bar{\textbf{C}}^{2}\dot{\bar{\textbf{C}}} + \dot{\bar{\textbf{C}}}\bar{\textbf{C}}^{2})\bigr ] \end{aligned}$$
(18)

where \(\text {Dev}(\cdot ) = (\cdot ) - \frac{1}{3}((\cdot ):{\textbf {C}}){\textbf {C}}^{-1}\) is the deviatoric operator in the Lagrangian description [42]. The list of invariants in Eq. (18) can be somewhat simplified, as shown in the next two sections.

Notice that individual stress components in these equations are linear combinations of the components of an integrity basis \(\mathbb {G}\) of the total stress \(\textbf{S}\), defined as

$$\begin{aligned} \mathbb {G} = \{\mathbb {G}_1, \mathbb {G}_2, \mathbb {G}_3, \mathbb {G}_4, \mathbb {G}_5, \mathbb {G}_6, \mathbb {G}_7, \mathbb {G}_8\} \end{aligned}$$
(19)

where

$$\begin{aligned} \mathbb {G}_1&= \textbf{C}^{-1},\quad \mathbb {G}_2 = \text {Dev}(\textbf{I}), \quad \mathbb {G}_3 = \text {Dev}(\bar{\textbf{C}}) \end{aligned}$$
(20a)
$$\begin{aligned} \mathbb {G}_4&= \text {Dev}({\bar{\textbf{C}}}^{-1}),\quad \mathbb {G}_5 = \text {Dev} (\dot{\bar{\textbf{C}}}), \mathbb {G}_6 = \text {Dev} ({\dot{\bar{\textbf{C}}}}^{-1}) \end{aligned}$$
(20b)
$$\begin{aligned} \mathbb {G}_7&= \text {Dev} (\bar{\textbf{C}}\dot{\bar{\textbf{C}}} + \dot{\bar{\textbf{C}}}\bar{\textbf{C}}),\quad \mathbb {G}_8 = \text {Dev} (\bar{\textbf{C}}^{2}\dot{\bar{\textbf{C}}} + \dot{\bar{\textbf{C}}}\bar{\textbf{C}}^{2}). \end{aligned}$$
(20c)

Further, \(\zeta _1, \Gamma _1, \Gamma _2, \Phi _1, \dots , \Phi _7\) are coefficients of the integrity basis components in the stress equations (Eqs. (1618)), and are functions of the invariants of the tensors \(\bar{\textbf{C}}\) and \(\dot{\bar{\textbf{C}}}\). For several commonly used constitutive models in the literature (i.e., for a given choice of U, \(\bar{W}_h\), and \(\bar{W}_v\)), the explicit mathematical forms of these coefficients are provided in Tables 13 of the Appendix A.

The physics-informed data-driven mapping approach of the present study discovers a mapping between invariants and coefficients of the integrity basis components directly from stress–strain–strain rate data.

3 Proposed physics-informed data-driven mapping approach

We consider a dataset \(\mathcal {D}\) consisting of strain and strain rate as input and stress components as output,

$$\begin{aligned} \mathcal {D}= & {} \mathcal {D}_\text {vol} \cup \mathcal {D}_{h,\text {iso}} \cup \mathcal {D}_{v,\text {iso}} \nonumber \\= & {} \{ \textbf{C}^i, \textbf{S}_\text {vol}^i \}_{i=1}^{N_\text {vol}} \cup \{ \textbf{C}^j, \textbf{S}_{h,\text {iso}}^j \}_{j=1}^{N_{h,\text {iso}}}\nonumber \\{} & {} \cup \{ \textbf{C}^k, \dot{\textbf{C}}^k, \textbf{S}_{v,\text {iso}}^k\}_{k=1}^{N_{v,\text {iso}}}. \end{aligned}$$
(21)

Here, constituent datasets \(\mathcal {D}_\text {vol}\), \(\mathcal {D}_{h,\text {iso}}\) and \(\mathcal {D}_{v,\text {iso}}\) correspond to the material response under hydrostatic, isochoric quasi-static, and isochoric dynamic conditions, respectively. Superscripts i,j, and k denote a particular data point in the datasets \(\mathcal {D}_\text {vol}\), \(\mathcal {D}_{h,\text {iso}}\), and \(\mathcal {D}_{v,\text {iso}}\), respectively. A dataset like \(\mathcal {D}\) for a soft material can be obtained in practice by compiling its hydrostatic stress–strain data as \(\mathcal {D}_\text {vol}\) [53, 54], quasi-static stress–strain data under uniaxial and/or shear deformations as \(\mathcal {D}_{h,\text {iso}}\) [55, 56], and viscous overstress (total stress minus stress under quasi-static loading)–strain data from high strain rate testing at multiple strain rate levels under uniaxial and/or shear deformations as \(\mathcal {D}_{v,\text {iso}}\) [21, 33, 54, 55, 57]. For every data point in the dataset \(\mathcal {D}\), the inputs \(\textbf{C}\) and \(\dot{\textbf{C}}\) can be used to compute the set of invariants (using Eqs. (12) and (15)) as well as the components of the integrity basis \(\mathbb {G}\) (using Eq. (20)). The coefficients of the integrity basis in Eqs. (1618) can then be obtained by solving the following systems of equations:

$$\begin{aligned}&\Bigl [\text {vec}(\textbf{S}_\text {vol})\Bigr ] = \Bigl [ \text {vec}(\mathbb {G}_1)\Bigr ] \Bigl [ \zeta _1 \Bigr ] \end{aligned}$$
(22)
$$\begin{aligned}&\left[ \text {vec}\left( \frac{\textbf{S}_{h,\text {iso}}}{J^{-2/3}}\right) \right] = \Bigl [ \text {vec}(\mathbb {G}_2) \quad \text {vec}(\mathbb {G}_3)\Bigr ] \begin{bmatrix} \Gamma _{1} \\ \Gamma _{2} \end{bmatrix} \end{aligned}$$
(23)
$$\begin{aligned}&\left[ \text {vec}\left( \frac{\textbf{S}_{v,\text {iso}}}{J^{-2/3}}\right) \right] \nonumber \\&\quad = \Bigl [ \text {vec}(\mathbb {G}_2) \quad \text {vec}(\mathbb {G}_3) \quad \text {vec}(\mathbb {G}_4) \quad \text {vec}(\mathbb {G}_5) \nonumber \\&\qquad \text {vec}(\mathbb {G}_6) \quad \text {vec}(\mathbb {G}_7) \quad \text {vec}(\mathbb {G}_8)\Bigr ]\nonumber \\&\quad \begin{bmatrix} \Phi _{1} \\ \Phi _{2} \\ \Phi _{3} \\ \Phi _{4} \\ \Phi _{5} \\ \Phi _{6} \\ \Phi _{7} \end{bmatrix} \end{aligned}$$
(24)

where \(\text {vec}(\cdot )\) denotes the Voigt form of a matrix (for an arbitrary second-order symmetric tensor \(\textbf{Z}\), \(\text {vec}(\textbf{Z}) = \bigl [\textbf{Z}_{11}, \textbf{Z}_{22}, \textbf{Z}_{33}, \textbf{Z}_{23}, \textbf{Z}_{13}, \textbf{Z}_{12}\bigr ]^\text {T}\)). Notice that the above systems of equations are identical to Eqs. (1618), and that each of these systems is written in the form of \(\bigl [\varvec{b}\bigr ] = \bigl [\varvec{A}\bigr ] \bigl [\varvec{x}\bigr ]\). The vectors \(\bigl [\varvec{x}\bigr ]\) containing the coefficients of the integrity basis can be obtained by solving the corresponding optimization problems—\(\min _{x} \left\Vert \bigl [\varvec{A}\bigr ] \bigl [\varvec{x}\bigr ] - \bigl [\varvec{b}\bigr ]\right\Vert _2^2\)—via, for example, QR-decomposition.

This procedure of obtaining invariants and the coefficients of the integrity basis is carried out for every data point in \(\mathcal {D}\). Using this information, an alternative dataset is generated that is composed of invariants and integrity basis coefficients following the definition below,

$$\begin{aligned} \begin{aligned} \mathcal {D}^*&= \mathcal {D}_\text {vol}^* \cup \mathcal {D}_{h,\text {iso}}^* \cup \mathcal {D}_{v,\text {iso}}^*\\&= \{ J^i, \zeta _1^i \}_{i=1}^{N_\text {vol}} \cup \{[ \bar{I}_1^j, \bar{I}_2^j ], [\Gamma _1^j, \Gamma _2^j] \}_{j=1}^{N_{h,\text {iso}}} \cup \\&\{[ \bar{I}_1^k, \bar{I}_2^k, \bar{J}_1^k, \bar{J}_4^k, \bar{J}_6^k], [\Phi _1^k, \dots ,\Phi _7^k] \}_{k=1}^{N_{v,\text {iso}}} \end{aligned} \end{aligned}$$
(25)

Note, the reason behind omitting invariants \(\bar{J}_2\), \(\bar{J}_3\), \(\bar{J}_5\) and \(\bar{J}_7\) in the above dataset is described briefly later in this section and in detail in Sect. 4. The constituent datasets in \(\mathcal {D}^*\) are used in the present study to train three surrogate models that comprise our physics-informed data-driven constitutive model:

$$\begin{aligned} \begin{aligned} \widetilde{\mathcal {M}}_{\text {vol}}: {}&\mathcal {J}_\text {vol} \in \mathbb {R}^1 \rightarrow \mathcal {\zeta } \in \mathbb {R}^1 \end{aligned} \end{aligned}$$
(26)
$$\begin{aligned} \begin{aligned} \widetilde{\mathcal {M}}_{h,\text {iso}}: {}&\mathcal {I} \in \mathbb {R}^2 \rightarrow \varGamma \in \mathbb {R}^2 \end{aligned} \end{aligned}$$
(27)
$$\begin{aligned} \begin{aligned} \widetilde{\mathcal {M}}_{v,\text {iso}}: {}&\mathcal {J} \in \mathbb {R}^5 \rightarrow {\varPhi } \in \mathbb {R}^7 \end{aligned} \end{aligned}$$
(28)

Specifically, \(\widetilde{\mathcal {M}}_{\text {vol}}\) captures the volumetric stress component and is a mapping between the random vector of the invariant J (i.e., \(\mathcal {J}_\text {vol}\)) and the random vector of the coefficient \(\zeta _1\) (i.e., \(\mathcal {\zeta }\)). Next, \(\widetilde{\mathcal {M}}_{h,\text {iso}}\) captures the isochoric hyperelastic stress component and is a mapping between the random vector of the invariants \([\bar{I}_1, \bar{I}_2]\) (i.e., \(\mathcal {I}\)) and the random vector of the corresponding coefficients \([\Gamma _1, \Gamma _2]\) (i.e., \({\varGamma }\)). Finally, \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) captures the isochoric viscous overstress and is a mapping between the random vector of the invariants \([\bar{I}_1, \bar{I}_2, \bar{J}_1, \bar{J}_4, \bar{J}_6]\) (i.e., \(\mathcal {J}\)) and the random vector of the corresponding coefficients \([\Phi _1, \dots , \Phi _7]\) (i.e., \({\varPhi }\)). In a more explicit form, we have

$$\begin{aligned} \begin{aligned} \widetilde{\mathcal {M}}_{\text {vol}}: {}&\Bigl [ J \Bigr ] \rightarrow \Bigl [ \zeta _1 \Bigr ] \end{aligned} \end{aligned}$$
(29)
$$\begin{aligned} \begin{aligned} \widetilde{\mathcal {M}}_{h,\text {iso}}: {}&\begin{bmatrix} \bar{I}_1 \\ \bar{I}_2 \end{bmatrix} \rightarrow \begin{bmatrix} \Gamma _{1} \\ \Gamma _{2} \end{bmatrix} \end{aligned} \end{aligned}$$
(30)
$$\begin{aligned} \begin{aligned} \widetilde{\mathcal {M}}_{v,\text {iso}}: {}&\begin{bmatrix} \bar{I}_1 \\ \bar{I}_2 \\ \bar{J}_1 \\ \bar{J}_4 \\ \bar{J}_6 \end{bmatrix} \rightarrow \begin{bmatrix} \Phi _{1} \\ \Phi _{2} \\ \Phi _{3} \\ \Phi _{4} \\ \Phi _{5} \\ \Phi _{6} \\ \Phi _{7} \end{bmatrix} \end{aligned} \end{aligned}$$
(31)

The rationale behind not considering invariants \(\bar{J}_2\), \(\bar{J}_3\), \(\bar{J}_5\) and \(\bar{J}_7\) in the surrogate model \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) (Eq. (28/31)) is described in the next section. Briefly, this assumption of a negligible effect of certain invariants on the viscous overstress allows imposition of an assumed natural state (a stress-free reference state); i.e., to ensure \(\textbf{S}_{v,\text {iso}}(\textbf{C} = \textbf{I}) = \textbf{0}\) regardless of the applied rate of the right Cauchy–Green deformation tensor (\(\dot{\textbf{C}}\)).

Algorithm 1
figure a

Development of a physics-informed data-driven constitutive model for strain-rate-sensitive soft materials

In this work, the Gaussian Process Regression (GPR) supervised learning method [58] is chosen for surrogate modeling purposes (see details of GPR in Sect. 4), i.e., to learn the mappings in Eqs. (2628) from a given input–output training dataset \(\mathcal {D}\) and then predict integrity basis coefficients for a new set of input tensors \(\textbf{C}\) and \(\dot{\textbf{C}}\). Given the linear relationship between individual stress components and their corresponding integrity basis coefficients (Eq. (1618)), the trained surrogate models allow prediction of the three stress components as

$$\begin{aligned} \widetilde{\textbf{S}}_\text {vol}= & {} \widetilde{\zeta }_1 \mathbb {G}_1 \end{aligned}$$
(32)
$$\begin{aligned} \widetilde{\textbf{S}}_{h,\text {iso}}= & {} J^{-2/3} \left( \widetilde{\Gamma }_1 \mathbb {G}_2 + \widetilde{\Gamma }_2 \mathbb {G}_3\right) \end{aligned}$$
(33)
$$\begin{aligned} \widetilde{\textbf{S}}_{v,\text {iso}}= & {} J^{-2/3} \left( \widetilde{\Phi }_1 \mathbb {G}_2 + \widetilde{\Phi }_2 \mathbb {G}_3 + \widetilde{\Phi }_3 \mathbb {G}_4 + \widetilde{\Phi }_4 \mathbb {G}_5 \right. \nonumber \\{} & {} \left. +\widetilde{\Phi }_5 \mathbb {G}_6 + \widetilde{\Phi }_6 \mathbb {G}_7 + \widetilde{\Phi }_7 \mathbb {G}_8\right) \end{aligned}$$
(34)

In Eqs. (3234), the accent symbols \(\widetilde{\cdot }\) over stress and coefficients of the integrity basis denote that these are predicted values from the surrogate models. Note, for any given input strain and strain rate, the components of the integrity basis \(\mathbb {G}\) can be obtained using Eq. (20). The entire process of the development of our physics-informed data-driven constitutive model from a stress–strain–strain rate dataset (i.e., \(\mathcal {D}\)) is summarized in Algorithm 1.

As the proposed surrogate models in Eqs. (2628) are based on the generalized functional forms of the visco-hyperelastic stress equations (i.e., Eq. (1618)), our data-driven constitutive model and its predictions (Eqs. (3234)) automatically satisfy a number of physics-based constraints:

  • Principle of local action: Because the integrity basis only captures spatially local deformation (via \(\textbf{C}\) and \(\dot{\textbf{C}}\)).

  • Balance of angular momentum: Because the integrity basis is necessarily symmetric.

  • Principle of determinism (or causality): Because the predicted stress is only a function of past and present events.

  • Principle of material frame-indifference (objectivity): Because \(\textbf{S}\), \(\textbf{C}\), and \(\dot{\textbf{C}}\) are all objective tensors associated with the reference configuration and thus remain unaffected under change of observer.

  • Isotropic material symmetry: Because the surrogate model stress equations (Eqs. (3234)) remain unaffected by any transformation from the proper orthogonal material symmetry group SO(3) (i.e., under rigid body rotation). This is a direct consequence of the generalized model formulation in Eq. (14) in the form of isotropic invariants of tensors \(\bar{\textbf{C}}\) and \(\dot{\bar{\textbf{C}}}\).

  • Limited memory constraint: Because the viscous overstress component (Eq. (34)) depends only on the instantaneous deformation rate (i.e., very recent history).

Two additional physics-based constraints will be imposed via the employment of GPR as the chosen surrogate modeling technique: \(\bullet \) the assumed stress-free reference state, and \(\bullet \) the second law of thermodynamics. These additional constraints along with the GPR technique will now be described.

4 Gaussian process regression and the enforcement of additional physical constraints

Gaussian process regression has recently seen increased interest as a tool for building surrogate models for describing material behaviors [59,60,61]. This is in part because GPR is derived from a convenient statistical background, offers strict convergence guarantees (unlike, for example, artificial neural networks (ANN)), has empirically shown excellent performance for out-of-sample model predictions, and can be conveniently utilized in a finite element method setting to simulate mechanical responses [60, 62, 63]. For an excellent overview of GPR, the reader is referred to [64]. In this section, the basic ideas behind GPR and its specific employment for regression in our surrogate models are discussed. Two distinct but related GPR formulations are introduced: the standard GPR and the constrained GPR. The latter allows enforcement of thermodynamic consistency in the viscous overstress component of our data-driven model.

4.1 Standard gaussian process regression (GPR)

Consider a general dataset \(\mathcal {D}_g\) of N data points,

$$\begin{aligned} \mathcal {D}_{g} = \lbrace \textbf{x}_{i}, \textbf{y}_{i} \rbrace _{i=1}^{N} \end{aligned}$$
(35)

where \(\textbf{x} \in \mathbb {R}^{n_{i}}\) and \(\textbf{y} \in \mathbb {R}^{n_{o}}\) represent input and output data points of dimensions \(n_{i}\) and \(n_{o}\), respectively. The sets of all input and output data points can be reformulated as matrices \(\textbf{X} \in \mathbb {R}^{N \times n_{i}}\) and \(\textbf{Y} \in \mathbb {R}^{N \times n_{o}}\), respectively. The objective of a GPR model that is trained using the dataset \(\mathcal {D}_g\) is to predict an output \(\widetilde{\textbf{y}}\) for a new input data point \(\textbf{x}_{\star }\) that is not contained in the training dataset.

In GPR, the notion of the similarity between input data points is critical. The assumption is that two input data points that are closer together are more likely to have closer target values than the two input data points that are farther away from each other. Typically, the similarity between two input data points \(\textbf{x}\) and \(\textbf{x}'\) is modeled by a user-defined covariance function. The choice of this function is an important part of GPR. Based on the work of Laurent et al. [65], which compared different functional forms of the covariance function for computer experiments when no prior knowledge is available, we restrict ourselves to the Matérn 3/2 kernel [66], given as

$$\begin{aligned} \begin{aligned} k(\textbf{ x}, \textbf{ x}')&= \sigma _{f}^{2} \left( 1 + \dfrac{\sqrt{3} \Vert \textbf{ x} - \textbf{ x}'\Vert _2}{l} \right) \\&\quad \exp \left( -\dfrac{\sqrt{3} \Vert \textbf{ x} - \textbf{ x}'\Vert _2 }{l} \right) + \alpha \delta _{x, x'} \end{aligned} \end{aligned}$$
(36)

where \(\Vert \left( \cdot \right) \Vert _2\) represents the L2-norm, \(\sigma _{f}\) is a scaling factor, and l is the characteristic length-scale of the covariance function. \(\sigma _{f}\) and l can be combined into a vector \(\varvec{\theta }=\left[ \sigma _{f}, l \right] ^{T}\) that collects the trainable parameters. The term \(\alpha \) denotes a small positive value that helps to overcome numerical instabilities [67]. \(\delta _{x, x'}\) is the Kronecker delta function. Using the functional form of the covariance function in Eq. (36), a covariance matrix \(\textbf{K}(\textbf{X},\textbf{X}')\) can be constructed to define similarity between sets of points \(\textbf{X}\) and \(\textbf{X}'\).

The optimal set of parameters \({\varvec{\theta }}\) that best describes the training dataset (Eq. (35)) is obtained using a maximum log-likelihood approach [68],

$$\begin{aligned} \begin{aligned} \widetilde{{\varvec{\theta }}}&=\mathop {\mathrm {arg\,max}}\limits _{ {\varvec{\theta }}^{\star }} \text {log} \, p(\textbf{Y} | \textbf{X}, \varvec{\theta }) \\&= \mathop {\mathrm {arg\,max}}\limits _{ {\varvec{\theta }}^{\star }}\left[ -\frac{1}{2} \textbf{Y}^{T} \textbf{K}(\textbf{X},\textbf{X}) ^{-1} \textbf{Y} \right. \\&\quad \left. - \frac{1}{2} \text {log}( \det ( \textbf{K}(\textbf{X},\textbf{X}) )) - \frac{N}{2} \log (2 \pi ) \right] \end{aligned} \end{aligned}$$
(37)

After finding the best parameters, the GPR regression model is fully defined. Given a new input data point \(\textbf{ x}_{\star }\), the predicted output value \(\widetilde{\textbf{y}}\) of the Gaussian process regressor reads

$$\begin{aligned} \begin{aligned} \widetilde{\textbf{y}}({\textbf{ x}}_{\star })&= \textbf{K}(\textbf{X},\textbf{x}_{\star })^{\text {T}} \, \textbf{K}(\textbf{X},\textbf{X}) ^{-1} \, \textbf{Y} \end{aligned} \end{aligned}$$
(38)

As \(\alpha \rightarrow 0\) in Eq. (36), the GPR predictor becomes an exact interpolator (see [69]), i.e. \(\widetilde{\textbf{y}}(\textbf{ x}_{i})= \textbf{ y}_{i}\) for all points \(i=1, \ldots , N\) of the training dataset of Eq. (35). This is termed as the exact inference property of GPR. Since the prediction is probabilistic, the variance \(\widetilde{\varvec{\sigma }}^{2}\) of the regressor fit can also be obtained,

$$\begin{aligned} \begin{aligned} \widetilde{\varvec{\sigma }}^{2}&= \textbf{K}(\textbf{x}_{\star },\textbf{x}_{\star }) - \textbf{K}(\textbf{X},\textbf{x}_{\star })^{\textrm{T}} \, \textbf{K}(\textbf{X},\textbf{X}) ^{-1}\, \textbf{K}(\textbf{X},\textbf{x}_{\star }) \end{aligned} \end{aligned}$$
(39)

In this study, standard GPR (as defined above) is employed to fit the surrogate models \(\widetilde{\mathcal {M}}_{\text {vol}}\) (Eq. (26) and \(\widetilde{\mathcal {M}}_{h,\text {iso}}\) (Eq. (27) using datasets of the form of \(\mathcal {D}_{\text {vol}}^*\) and \(\mathcal {D}_{h,\text {iso}}^*\) (Eq. (25), respectively; remember, the datasets \(\mathcal {D}_{\text {vol}}^*\) and \(\mathcal {D}_{h,\text {iso}}^*\) are first derived from the initial training datasets \(\mathcal {D}_{\text {vol}}\) and \(\mathcal {D}_{h,\text {iso}}\) (Eq. (21), respectively (see Algorithm 1). Further, the exact inference property of GPR is harnessed to enforce the physical constraint of an assumed stress-free reference (i.e., undeformed) state, also called the normalization condition [42]. For the two stress components \(\textbf{S}_{\text {vol}}\) and \(\textbf{S}_{h,\text {iso}}\) that are modeled using standard GPR, the normalization condition reads

$$\begin{aligned} \textbf{S}_\text {vol}(\textbf{C} = \textbf{I}) = \textbf{0}, \quad \textbf{S}_{h,\text {iso}}(\textbf{C} = \textbf{I}) = \textbf{0} \end{aligned}$$
(40)

To enforce the above constraint, the stress-free reference states—\(\{\textbf{C}=\textbf{I}, \textbf{S}_{\text {vol}}=\textbf{0}\}\) and \(\{\textbf{C}=\textbf{I}, \textbf{S}_{h,\text {iso}}=\textbf{0}\}\)—are included in the training dataset. This guarantees that the normalization condition is achieved, provided the stabilization constant \(\alpha \) of Eq. (36) is kept small enough. Note, in terms of invariants, \(\textbf{C}=\textbf{I}\) translates to \(\bar{I}_1 = \bar{I}_2 = 3\).

A physics-based constraint of primary interest in this work is the second law of thermodynamics. In light of the Clausius–Planck inequality, this constraint requires that the internal dissipation \(\Xi _{int}\) in a visco-hyperelastic soft material as described in Eq. (7) is non-negative. Among the three stress components \(\textbf{S}_\text {vol}\), \(\textbf{S}_{h,\text {iso}}\) and \(\textbf{S}_{v,\text {iso}}\), the first two are rate-independent and therefore do not contribute to dissipation. Thus, enforcing the second law of thermodynamics constraint on the surrogate models \(\widetilde{\mathcal {M}}_{\text {vol}}\) and \(\widetilde{\mathcal {M}}_{h,\text {iso}}\) that are based on standard GPR becomes trivial. In other words, the specific construction of the generalized constitutive equation in this work automatically ensures the second law constraint on the rate-independent portion of our data-driven constitutive model.

4.2 Constrained gaussian process regression (C-GPR)

Unlike the rate-independent stress components \(\textbf{S}_\text {vol}\) and \(\textbf{S}_{h,\text {iso}}\), the rate-dependent isochoric viscous overstress \(\textbf{S}_{v,\text {iso}}\) results in viscous dissipation. In this regard, the Clausius–Planck inequality in Eq. (7) can be rewritten as [40]

$$\begin{aligned} \Xi _{int} = 2 \left( \textbf{F} \textbf{S}_{v,\text {iso}} \right) : \dot{\textbf{F}} = \textbf{S}_{v,\text {iso}}: \dot{\textbf{C}} \ge 0 \end{aligned}$$
(41)

For arbitrary input strain and strain rates, the isochoric viscous overstress prediction of the surrogate model \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) (Eq. (34)) must satisfy the above inequality. This type of inequality constraint on input–output mappings can be enforced by employing the constrained GPR (C-GPR) formulation recently proposed by Pensoneault et al. [70]. In standard GPR, the model output remains unconstrained.

The basic idea of C-GPR is to restrict the solution space in the hyperparameter optimization problem (i.e., Eq. (37)) such that the predicted output at a set of user-defined "constraint points" follows the desired inequality constraint. Accordingly, in the present study, the following optimization problem is devised for training the GPR regressor of the surrogate model \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) (cf. Equation (37)):

$$\begin{aligned} \begin{aligned}&\widetilde{{\varvec{\theta }}} =\mathop {\mathrm {arg\,max}}\limits _{ {\varvec{\theta }}^{\star }} \text {log}\, p(\textbf{Y} | \textbf{X}, \varvec{\theta }) \\&\text {s.t.} \quad \left( \widetilde{\textbf{S}}_{v,\text {iso}}(\textbf{C}^{m},\dot{\textbf{C}}^{m}) \right) : \dot{\textbf{C}}^{m} \ge 0,\\&\quad \quad \forall \, \, m=1, \ldots , N_{c} \end{aligned} \end{aligned}$$
(42)

Here, \(\widetilde{\textbf{S}}_{v,\text {iso}}(\textbf{C}^{m},\dot{\textbf{C}}^{m})\) denotes the isochoric viscous overstress prediction of the surrogate at the \(m^\text {th}\) constraint point (i.e., for \(\{\textbf{C}^m, \dot{\textbf{C}}^m\}\) input). \(N_c\) is the total number of constraint points. The constrained optimization of Eq. (42) imposes the Clausius–Planck inequality (Eq. (41)) on the rate-dependent part of our physics-informed data-driven constitutive model. Note that the inequality condition in Eq. (42) is not a functional constraint; rather, it is applied at a finite set of input data points. Therefore, C-GPR only applies a "weak" constraint on the model output.

Next, the normalization condition constraint for the isochoric viscous overstress is written as

$$\begin{aligned} \textbf{S}_{v,\text {iso}}(\textbf{C} = \textbf{I}, \dot{\textbf{C}}) = \textbf{0} \end{aligned}$$
(43)

The above condition requires that the isochoric viscous overstress vanishes in the undeformed state (i.e., \(\textbf{C} = \textbf{I}\)) regardless of the applied rate of deformation \(\dot{\textbf{C}}\) in that state. Owing to the particular choice of the strain rate invariants \(\bar{J}_1\), \(\bar{J}_4\), and \(\bar{J}_6\) in the mapping of the \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) surrogate (Eq. (31)), this condition can be simply enforced by including the stress-free reference state—\(\{\textbf{C}=\textbf{I}, \textbf{S}_{v,\text {iso}}=\textbf{0}\}\)—in the training data. This is possible because in the undeformed state, these three invariants hold a fixed value of zero regardless of the loading rate, i.e.,

$$\begin{aligned} \bar{J}_1(\textbf{C}= & {} \textbf{I}, \dot{\textbf{C}}) = 0, \quad \bar{J}_4(\textbf{C} = \textbf{I}, \dot{\textbf{C}}) = 0, \nonumber \\ \bar{J}_6(\textbf{C}= & {} \textbf{I}, \dot{\textbf{C}}) = 0 \end{aligned}$$
(44)

For a derivation of the above result, see Appendix B. From Eq. (44), \(\textbf{C}=\textbf{I}\) translates to a single set of input values in the mapping of \(\widetilde{\mathcal {M}}_{v,\text {iso}}\): \([ \bar{I}_1, \bar{I}_2, \bar{J}_1, \bar{J}_4, \bar{J}_6] = [3,3,0,0,0]\), at which the exact inference property of GPR will always lead to a near-zero viscous overstress value (provided the stabilization constant \(\alpha \) is kept very small).

Unlike the invariants \(\bar{J}_1\), \(\bar{J}_4\), and \(\bar{J}_6\), the invariants \(\bar{J}_2\), \(\bar{J}_3\), \(\bar{J}_5\), and \(\bar{J}_7\) in the undeformed state of \(\textbf{C}=\textbf{I}\) are functions of the tensor \(\dot{\textbf{C}}\) (see Appendix B). Therefore, if these invariants were included in the mapping of \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) (Eq. (31)), there would be infinitely many possible combinations of invariant values (i.e., input data points) that would correspond to the stress-free reference state, making it impossible to utilize the GPR exact inference property for ensuring the normalization condition.

5 Model performance

In this section, the performance of our physics-informed data-driven constitutive model is demonstrated in terms of (i) the accuracy of fitting training datasets from various deformation modes (e.g., hydrostatic compression, and unconfined uniaxial tension and compression), (ii) the accuracy and physical plausibility of out-of-sample predictions outside the training regime, and (iii) the training dataset size requirement. As the three surrogate models that make up our constitutive model—\(\widetilde{\mathcal {M}}_{\text {vol}}\), \(\widetilde{\mathcal {M}}_{h,\text {iso}}\), and \(\widetilde{\mathcal {M}}_{v,\text {iso}}\)—are independent of each other in terms of the specific dataset type required for training them (i.e., \(\mathcal {D}_\text {vol}\), \(\mathcal {D}_{h,\text {iso}}\), and \(\mathcal {D}_{v,\text {iso}}\) in Eq. (21), respectively), the following subsections will evaluate these surrogate models individually using multiple deformation modes for training and testing purposes. In practice, given multiple datasets spanning volumetric, isochoric quasi-static, and isochoric dynamic deformation modes, the individually trained \(\widetilde{\mathcal {M}}_{\text {vol}}\), \(\widetilde{\mathcal {M}}_{h,\text {iso}}\), and \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) surrogate models can be conveniently combined to form one constitutive model. Note that we will be working with synthetic data for training and testing purposes. We will extract this data by simulating virtual experiments that could be conducted physically, remaining in a data-poor regime in terms of strain, strain rate, and stress states. A traditional phenomenological constitutive law will be selected for each surrogate model case to simulate the virtual experiments and to obtain the corresponding training and testing data.

To assess the fitting accuracy of the surrogate models, the following percent relative error metric is employed:

$$\begin{aligned} \text {err} (\widetilde{\textbf{S}}, \textbf{S}) = 100 \times \frac{\left\Vert \text {vec}(\widetilde{\textbf{S}}) - \text {vec}(\textbf{S})\right\Vert _F}{\left\Vert \text {vec}(\textbf{S})\right\Vert _F} \end{aligned}$$
(45)

where \(\Vert \left( \cdot \right) \Vert _F\) represents the Frobenius norm, \(\widetilde{\textbf{S}}\) is the stress tensor predicted by a surrogate model, and \(\textbf{S}\) is the true value of the stress tensor (i.e., the ground truth). Note that the stress-free reference state, when \(\left\Vert \text {vec}(\textbf{S})\right\Vert _F = 0\), is excluded during the calculation of \(\text {err}\) to avoid division by zero. The scalar error, \(\text {err}\), is obtained at individual data points. For convenience, a mean percent relative error value will also be reported,

$$\begin{aligned} \overline{\text {err}} = \frac{\sum _{i=1}^N \text {err}_i}{N} \end{aligned}$$
(46)

where \(\text {err}_i\) is the error value at the ith data point, and N is the total size of the dataset (training or testing).

To assess the accuracy and physical plausibility of model predictions outside the training regime, two types of out-of-sample "testing regions" will be considered: (i) a testing region that corresponds to the same deformation mode that was considered during training—for example, training a model using uniaxial tension stress–strain data in the [0, 0.25] strain range, and then predicting tensile stresses for the wider [0, 0.5] strain range; (ii) a testing region that corresponds to a different deformation mode from those considered during model training (for example, training a model using uniaxial tension stress–strain data and then using the trained model to predict material responses under uniaxial compression and simple shear deformation modes). Further, the effect of training dataset size on both fitting errors and out-of-sample predictions will be studied.

Finally with a focus on generalization capabilities as an added benchmark to test the performance of our approach, for each surrogate model case, the performance of our model will be compared to (i) a conventional visco-hyperelastic constitutive model, and (ii) a corresponding surrogate model that employs a classical ML black-box mapping between the Voigt forms of tensors \(\textbf{C}\) and \(\dot{\textbf{C}}\) (as strain and strain rate input) and \(\textbf{S}\) (as the second Piola-Kirchoff stress output) [6, 7], i.e.,

$$\begin{aligned} \text {Classical mapping:} {} \begin{bmatrix} \text {vec} (\textbf{C})\\ \text {vec} (\dot{\textbf{C}}) \end{bmatrix} \rightarrow \begin{bmatrix} \text {vec} (\textbf{S})\\ \end{bmatrix}. \end{aligned}$$
(47)

Standard GPR will be utilized to learn the classical mappings. Of course, tensor \(\dot{\textbf{C}}\) will not be considered in the case of quasi-static loading conditions.

All the calculations presented in this work are performed in Python [71]. For machine learning tasks, the Scikit-learn python package [72] is utilized. Note that constrained optimization capability was added by the authors in the GPR module of this package locally for the development of C-GPR-based surrogate models. The Constrained Optimization BY Linear Approximation (COBYLA) algorithm was used for this purposeFootnote 1

5.1 The \(\widetilde{\mathcal {M}}_{\text {vol}}\) surrogate model under hydrostatic (confined uniaxial) loading

Consider the following deformation gradient tensor,

$$\begin{aligned} \textbf{F}_{app} = \textbf{I} + J \textbf{e}_1 \otimes \textbf{E}_1 \quad \text {with} \hspace{2 pt} J \in [-0.75,0] \end{aligned}$$
(48)

Deformations of the form of Eq. (48) are applied in practice via confined compression experiments to study the hydrostatic bulk material response [53, 73, 74]. These experiments provide hydrostatic stress / pressure versus volumetric strain (i.e., J) response of the material, which in turn can be used to calibrate a particular form of the volumetric energy density function U(J) (see Eq. (14)). In this study, the commonly used Simo–Miehe volumetric energy density [75] is employed to generate the hydrostatic stresses for training the \(\widetilde{\mathcal {M}}_{\text {vol}}\) surrogate model,

$$\begin{aligned} U^{\text {SM}} (J) = \frac{\kappa }{2} \left( \frac{J^2 - 1}{2} - \ln {J} \right) \end{aligned}$$
(49)

where \(\kappa \) is the bulk modulus. We choose \(\kappa \) = 10. The corresponding volumetric stress component \(\textbf{S}_\text {vol}^\text {SM}\) (see Eq. (14)) is given by

$$\begin{aligned} \textbf{S}_\text {vol}^\text {SM} = 2 \frac{\partial U^{\text {SM}}(J)}{\partial \textbf{C}} = \frac{\kappa }{2} \left( J^2 - 1 \right) \textbf{C}^{-1} \end{aligned}$$
(50)

where \(\textbf{C}\) = \(\textbf{C}_{app}\) = \(\textbf{F}_{app}^{\text {T}} \textbf{F}_{app}\) is the applied right Cauchy–Green deformation tensor.

\(\textbf{C}_{app}\) and \(\textbf{S}_\text {vol}^\text {SM}\) constitute the training dataset \(\mathcal {D}_\text {vol}\) of Eq. (21) (note, \(\mathcal {D} = \mathcal {D}_\text {vol}\) in this case) that is utilized to train the \(\widetilde{\mathcal {M}}_{\text {vol}}\) surrogate model (Eq. (26)) by following the procedure detailed in Sect. 3. Standard GPR is used for mapping. As the stress-free undeformed state (i.e., \(\textbf{F}_{app} = \textbf{I}\)) is included in the training data, the normalization condition is simply enforced by assigning a near-zero Gaussian noise of \(\alpha = 10^{-4}\) (Eq. (36)) at all data points.

The confined compression deformation mode in Eq. (48) generates three non-zero volumetric stress components: \(\textbf{S}_{\text {vol,11}}\), \(\textbf{S}_{\text {vol,22}}\), and \(\textbf{S}_{\text {vol,33}}\). Given the 11-loading direction, \(\textbf{S}_{\text {vol,22}} = \textbf{S}_{\text {vol,33}}\), resulting in only two independent stress components. Figure 2a shows the evolution of \(\textbf{S}_{\text {vol,11}}\) and \(\textbf{S}_{\text {vol,22}}\) versus J from the training data and compares it with the corresponding \(\textbf{S}_{\text {vol,11}}\) versus J and \(\textbf{S}_{\text {vol,22}}\) versus J responses predicted by the trained surrogate model. 26 data points were considered for training in this case (i.e., a strain increment of 0.01). The corresponding percent relative error (Eq. (45)) versus J plot is shown in Fig. 2b. From these plots, a very good agreement between training data and surrogate model predictions in the training regime is evident, which is expected from an ML-based model. Overall, the maximum and mean percent relative errors in the training regime are 1.12% and 0.12%, respectively, suggesting a very good fitting accuracy of our surrogate model.

Fig. 2
figure 2

a Comparison of the numerically generated volumetric stress components (\(\textbf{S}_{\text {vol,11}}\) and \(\textbf{S}_{\text {vol,22}}\)) versus volumetric strain (J) training data under confined compression with the corresponding surrogate model predictions. Inset shows a schematic illustration of the confined compression deformation mode, highlighting the reference and deformed states and the 11-loading direction. b Evolution of the percent relative error (\(\text {err}\)) of surrogate model predictions versus J in the training regime

Next, the performance of our surrogate model for a wider range of J (i.e., the testing regime) is analyzed and compared with the corresponding predictions of (i) a conventional volumetric strain energy density function of the neo-Hookean model ([76]), and (ii) the standard GPR with the classical mapping approach (Eq. (47)). The neo-Hookean volumetric strain energy density and its corresponding volumetric stress tensor are given as

$$\begin{aligned} U^{\text {NH}} (J)&= \frac{\kappa _\text {NH}}{2} \left( J - 1\right) ^2 \end{aligned}$$
(51a)
$$\begin{aligned} \textbf{S}_\text {vol}^\text {NH}&= 2 \frac{\partial U^{\text {NH}}(J)}{\partial \textbf{C}} = \kappa _\text {NH} J \left( J - 1 \right) \textbf{C}^{-1} \end{aligned}$$
(51b)

The volumetric neo-Hookean model was calibrated against the training data using linear least squares optimization. This yielded a \(\kappa _\text {NH}\) value of 11.16, which is close to the ground truth of \(\kappa = \)10 considered in the Simo–Miehe model for generating the training data.

Fig. 3
figure 3

Comparison of the numerically generated volumetric stress–volumetric strain data from the Simo–Miehe model (i.e., ground truth) in the overall testing regime (\(J \in [0.5, 1.5]\)) with the corresponding predictions of our surrogate model, the volumetric neo-Hookean model, and the classical ML-based mapping model: a \(\textbf{S}_{\text {vol,11}}\) versus J under confined compression, b \(\textbf{S}_{\text {vol,22}}\) versus J under confined compression, c \(\textbf{S}_{\text {vol,11}}\) versus J under confined tension, and d \(\textbf{S}_{\text {vol,22}}\) versus J under confined tension. e Comparison of the percent relative error (\(\text {err}\)) versus J responses of the present surrogate model, the volumetric neo-Hookean model, and the classical mapping model

Fig. 3a–b show the \(\textbf{S}_{\text {vol,11}}\)J and \(\textbf{S}_{\text {vol,22}}\)J responses as predicted by the surrogate model proposed in this work, the calibrated volumetric neo-Hookean model, and the classical mapping of Eq. (47), in a strain range (\(J \in [0.5, 1]\)) that spans beyond the training regime in the compression deformation mode (i.e., the sole deformation mode considered in training). The ground truth (i.e., Simo–Miehe Model) in this extended strain regime is also shown in the same figure for comparison, with the corresponding percent relative error \(\text {err}\) (with respect to the ground truth) versus J plots for the three models shown in the left half of Fig. 3e. From these figures, it is clear that within the training regime (i.e., \(J \in [0.75, 1]\)), all three models exhibit reasonable accuracy in predicting the stress–strain data. Among the three models, the two ML-based data-driven models—the surrogate model of this study and the classical mapping model—resulted in much smaller mean percent relative errors (\(\overline{\text {err}}\)) of 0.12% and 0.27% compared to the volumetric neo-Hookean model, which resulted in a \(\overline{\text {err}}\) of 4.90%. This was expected because unlike ML-based mappings, conventional constitutive models have limited fitting accuracy due to their fixed mathematical forms. Outside the training regime when \(J \in [0.5,0.75]\), our surrogate model clearly outperforms the other two models by predicting stress components that are in excellent agreement with the ground truth. In terms of the mean percent relative error, our surrogate model yields a \(\overline{\text {err}}\) of 1.07% in this regime, which is an order-of-magnitude smaller than the corresponding errors of the volumetric neo-Hookean model (i.e., 14.86%) and the classical mapping model (i.e., 12.72%).

Figure 3c–d compare the predictions of the three models with the ground truth (Simo–Miehe model) in the tension (confined) deformation mode (\(J \in [1, 1.5]\)) that was not considered at all in the training data. Here, the classical mapping model results in physically implausible predictions in that the stresses at large volumetric strains start to decrease, violating thermodynamic stability for incremental deformations [77, 78]. In fact, for \(J > 1.6\) (not shown in the plots), this model predicts compressive stresses for a confined tension loading, which is not possible. This type of erroneous model behavior is attributed to the purely black-box mapping in these conventional ML-based constitutive models, which do not respect any physical or mechanistic constraints on the response of the continuum. From Fig. 3c–d, it is clear that our physics-informed data-driven surrogate model does not suffer from this issue and makes physically-plausible and trustworthy predictions even in the deformation mode that was not considered in training. Not surprisingly, the predictions of the volumetric neo-Hookean model are also physically reasonable (as \(\kappa _{\text {NH}} > 0\)). Further, our surrogate model outperforms the other two models in terms of agreement of the predicted stress components with the ground truth. The mean percent relative error of our model in the tensile regime is 6.66%, which is significantly lower compared to the errors of the volumetric neo-Hookean model (i.e., 23.59 %) and the classical mapping model (i.e., 37.23%).

Lastly, the effect of training dataset size on the performance of the two data-driven models is analyzed in Fig. 4a–b, which plot the mean percent relative errors \(\overline{\text {err}}\) in the training regime (\(J \in [0.75, 1]\)) and the overall testing regime (\(J \in [0.5, 1.5]\)) as a function of the training dataset size \(N_\text {vol}\) (i.e., the number of data points in \(\mathcal {D}_\text {vol}\)). In the training regime (Fig. 4a), the errors of both the models decrease asymptotically with \(N_\text {vol}\). This is a typical behavior of ML-based models, which yield improved predictions in the training regime (i.e., interpolation) as larger volumes of data are used in model training. Overall, both the models show excellent fitting accuracy (\(\overline{\text {err}}\)) even with a small training dataset of 26 stress–strain values. In the testing regime (Fig. 4b), which includes both confined compression and tension deformation modes, the classical mapping model no longer shows improvement in the performance with an increase in training dataset size. In fact, for \(N_\text {vol}> 50\), \(\overline{\text {err}}\) versus \(N_\text {vol}\) increases monotonically, which coincides with highly physically implausible predictions in the confined tension regime (see an example of this in the supplementary material, Section SM1). Our surrogate model does not suffer from this limitation due to its physics-informed nature, and its predictions continue to improve with an increase in training dataset size in both the training and testing regimes. Further, as the inclusion of physics-based constraints confines the feature space of possible GPR hyperparameters, our surrogate model predictions are in very good agreement with the ground truth even at \(N_\text {vol} = 26\). Clearly, our physics-informed data-driven constitutive model combines the high fitting accuracy of ML-based models with the physical nature and small-data compatibility of conventional constitutive models. As a direct outcome, we note that our model is trustworthy in unseen states and can be developed from limited/low data.

Fig. 4
figure 4

a Comparison of the evolution of mean percent relative error (\(\overline{\text {err}}\)) in the predictions of the surrogate model and the classical mapping model in the training regime (\(J \in [0.75,1]\)), as a function of the training dataset size (\(N_\text {vol}\)). b The corresponding \(\overline{\text {err}}\) versus \(N_\text {vol}\) responses of the present surrogate model and the classical mapping model, for their predicted responses in the overall testing regime (\(J \in [0.5,1.5]\))

Our surrogate models’ superior performance is both due to their specific input–output construction that stems from a thermodynamically consistent framework and the employment of GPR for supervised learning, which ensures the normalization condition and yields excellent out-of-sample prediction accuracy even with limited training data. To highlight this point, we repeated the analysis of the hydrostatic loading case of this section by employing an ANN (unconstrained) to learn the mapping in Eq. (26). The \(\text {err}\) versus J plot for this case (corresponding to Fig. 3e) is shown in Appendix C. From this plot, relatively large prediction errors are observed near the stress-free reference state when using an ANN-based model, which can be attributed to a violated normalization condition. In addition, ANN-based mapping also reduces the overall performance. This exercise further justifies the choice of GPR as the supervised learning method for our surrogate models.

5.2 The \(\widetilde{\mathcal {M}}_{h,\text {iso}}\) surrogate model under quasi-static uniaxial and shear loading

Now, consider the following isochoric (volume-preserving) uniaxial deformation mode,

$$\begin{aligned} \textbf{F}_{app}= & {} \lambda \textbf{e}_1 \otimes \textbf{E}_1 + \frac{1}{\sqrt{\lambda }} \left( \textbf{e}_2 \otimes \textbf{E}_2 + \textbf{e}_3 \otimes \textbf{E}_3\right) \nonumber \\{} & {} \quad \text {with} \hspace{2 pt} \lambda \in [1,1.25] \end{aligned}$$
(52)

where \(\lambda \) is called the uniaxial stretch. Under the quasi-static assumption, the rate of stretching \(\dot{\lambda }\) is approximately zero. Deformations like Eq. (52) are routinely applied in the mechanical characterization of soft materials during unconfined uniaxial tensile tests [79,80,81]. The uniaxial stress versus stretch responses obtained from these tests are used to calibrate hyperelastic constitutive models. In this study, we use a polynomial-type hyperelastic model known as the Mooney–Rivlin model [82, 83] to generate training data for developing our \(\widetilde{\mathcal {M}}_{h,\text {iso}}\) surrogate model. The strain energy density \(\bar{W}_h\) and the corresponding isochoric hyperelastic stress \(\textbf{S}_{h,\text {iso}}\) of the Mooney–Rivlin model are given by

$$\begin{aligned} \bar{W}_h^\text {MR}&= A_{10}\left( \bar{I}_1 - 3\right) + A_{01}\left( \bar{I}_2 - 3\right) \end{aligned}$$
(53a)
$$\begin{aligned} \textbf{S}_{h,\text {iso}}^\text {MR}&= 2 \frac{\partial \bar{W}_h^\text {MR}}{\partial \textbf{C}} \nonumber \\&= J^{-2/3}\left[ 2 \left( A_{10} + \bar{I}_1 A_{01}\right) \text {Dev}\textbf{I} - 2 A_{01} \text {Dev}\bar{\textbf{C}} \right] \end{aligned}$$
(53b)

where \(A_{10}\) and \(A_{01}\) are model parameters. We choose \(A_{10} = 1\) and \(A_{01} = 0.5\) for generating the training data. Also, \(J = J_{app} = \text {det}(\textbf{F}_{app})\), \(\textbf{C} = \textbf{C}_{app} = \textbf{F}_{app}^\text {T} \textbf{F}_{app}\), and \(\bar{\textbf{C}} = \bar{\textbf{C}}_{app} = J^{-2/3} \textbf{C}_{app}\) are the applied deformations derived from Eq. (52).

Using \(\textbf{C}_{app}\) and \(\textbf{S}_{h,\text {iso}}^\text {MR}\) as the training dataset \(\mathcal {D}_{h,\text {iso}}\) (in this case, \(\mathcal {D} = \mathcal {D}_{h,\text {iso}}\)), a standard GPR-based mapping as in Eq. (27) was trained. Like the hydrostatic loading case, the normalization condition, in this case, is enforced by assigning a near-zero Gaussian noise of \(\alpha \) = \(10^{-5}\) (Eq. (36)) at the stress-free reference state of the training data (a relatively higher \(\alpha \) value of \(10^{-2}\) was assigned at all other data points to avoid numerical issues and promote data fitting flexibility). This trained surrogate model can predict the isochoric hyperelastic stress component \(\widetilde{\textbf{S}}_{h,\text {iso}}\) for any arbitrary quasi-static deformation.

Fig. 5
figure 5

a Comparison of the numerically generated isochoric hyperelastic stress components (\(\textbf{S}_{h,\text {iso,11}}\) and \(\textbf{S}_{h,\text {iso,22}}\)) versus uniaxial stretch (\(\lambda \)) training data under uniaxial tension with the corresponding surrogate model predictions. Inset shows a schematic illustration of the unconfined uniaxial tension deformation mode, highlighting the reference and deformed states and the 11-loading direction. b Evolution of the percent relative error (\(\text {err}\)) of surrogate model predictions versus \(\lambda \) in the training regime

Fig. 5 compares the stress–stretch response considered for training the surrogate model with the corresponding predictions of the trained surrogate model in this regime. Like the hydrostatic case (Sect. 5.1), there are two independent stress components resulting from the uniaxial loading condition, \(\textbf{S}_{h, \text {iso,11}}\) and \(\textbf{S}_{h, \text {iso,22}}\), with \(\textbf{S}_{h, \text {iso,22}} = \textbf{S}_{h, \text {iso,33}}\). 26 data points were considered for training in this case. From Fig. 5a, the model predictions are in excellent agreement with the training data, leading to relatively small errors as shown in Fig. 5b. The mean percent relative error in this case is 4.75%.

Fig. 6
figure 6

Comparison of the numerically generated isochoric hyperelastic stress–uniaxial stretch data from the Mooney–Rivlin model (i.e., ground truth) in the uniaxial testing regime (\(\lambda \in [0.5, 1.5]\)) with the corresponding predictions of our surrogate model, the Yeoh model, and the classical mapping model: a \(\textbf{S}_{h,\text {iso,11}}\) versus \(\lambda \) under uniaxial tension, b \(\textbf{S}_{h,\text {iso,22}}\) versus \(\lambda \) under uniaxial tension, c \(\textbf{S}_{h,\text {iso,11}}\) versus \(\lambda \) under uniaxial compression, and d \(\textbf{S}_{h,\text {iso,22}}\) versus \(\lambda \) under uniaxial compression. e Comparison of the percent relative error (\(\text {err}\)) versus \(\lambda \) responses of the present surrogate model, the Yeoh model, and the classical mapping model

With a good fitting accuracy established from Fig. 5, the performance of our surrogate model is now analyzed in a wider testing regime. The testing regime in this case consists of uniaxial deformation mode (Eq. (52)) with \(\lambda \in [0.5, 1.5]\), and the simple shear deformation mode,

$$\begin{aligned} \textbf{F}_{app} = \textbf{I} + \gamma \textbf{e}_1 \otimes \textbf{E}_2 \quad \text {with} \hspace{2 pt} \gamma \in [0,0.5] \end{aligned}$$
(54)

where \(\gamma \) is the shear strain. The surrogate model performance is compared with that of the classical mapping model (Eq. (47)) and another polynomial hyperelastic model, the 2-parameter Yeoh model (henceforth, referred to simply as the Yeoh model). The Yeoh model [84] is given by

$$\begin{aligned} \bar{W}_h^\text {Y}&= C_1 \left( \bar{I}_1 - 3\right) + C_2 \left( \bar{I}_1 - 3\right) ^2 \end{aligned}$$
(55a)
$$\begin{aligned} \textbf{S}_{h,\text {iso}}^\text {Y}&= 2 \frac{\partial \bar{W}_h^\text {Y}}{\partial \textbf{C}} \nonumber \\&= J^{-2/3}\left[ 2 C_1 + 4 C_2 \left( \bar{I}_1 - 3\right) \right] \text {Dev}\textbf{I} \end{aligned}$$
(55b)

The Yeoh model was calibrated by fitting its predicted \(\textbf{S}^{\text {Y}}_{h, \text {iso,11}}\) versus \(\lambda \) response to the corresponding uniaxial stress–stretch response in the training data, using linear least squares optimization. This yielded \(C_1 = 1.46\) and \(C_2 = -0.21\).

Fig. 7
figure 7

a Comparison of the numerically generated isochoric hyperelastic shear stress (\(\textbf{S}_{h,\text {iso,12}}\))–shear strain (\(\gamma \)) data from the Mooney–Rivlin model (i.e., ground truth) in the shear testing regime (\(\gamma \in [0, 0.5]\)) with the corresponding predictions of our surrogate model, the Yeoh model, and the classical mapping model. b Comparison of the percent relative error (\(\text {err}\)) versus \(\gamma \) responses of the present surrogate model, the Yeoh model, and the classical mapping model

Figure 6 compares the stress versus uniaxial stretch predictions of the present surrogate model, the classical mapping model and the calibrated Yeoh model, with the ground truth (i.e., from the Mooney–Rivlin model that was utilized for generating training data). In the training regime of \(\lambda \in [1, 1.25]\) (see Fig. 6a–b), all three models show an excellent agreement with the ground truth. In terms of the percent relative error that is shown in Fig. 6e, the mean errors in the three model predictions in this regime are 4.75% (surrogate model), 1.26% (classical mapping model), and 0.50% (Yeoh model).

In the overall testing regime, our surrogate model results in a markedly better performance than the other two models both in terms of the physical plausibility of predicted responses and the prediction accuracy. For example, the compressive stress–stretch responses predicted by the classical mapping model (Fig. 6c–d) exhibit unreasonable mechanical features: (i) the \(\textbf{S}_{h, \text {iso,11}}\) versus \(\lambda \) response shows a softening response that is uncharacteristic of hyperelastic soft materials, which typically exhibit compression–tension asymmetry in the loading direction (i.e., the material exhibits a highly nonlinear and stiffer response in compression than in tension) [56, 85, 86], and (ii) the \(\textbf{S}_{h, \text {iso,22}}\) versus \(\lambda \) response starts to decrease following a maximum at approximately \(\lambda = 0.57\), which violates the expected monotonicity and thermodynamic stability in stress–stretch responses prior to damage/failure [78]. The latter behavior is also exhibited by the Yeoh model, for which the predicted \(\textbf{S}_{h, \text {iso,22}}\) versus \(\lambda \) response violates monotonicity at large compressive strains of \(\lambda < 0.61\). Non-physical response predictions from conventional hyperelastic models have been observed in the literature for certain model parameter values and are attributed to the violation of the second law of thermodynamics by the model with those parameter values [77, 78, 87]. From the figure, our surrogate model prevents this issue in the investigated strain range and yields a physically-plausible mechanical response even at large deformations. Further, it results in a mean percent relative error of 10.98% in the uniaxial testing regime, which is slightly lower compared to the \(\overline{\text {err}}\) of 11.08% of the Yeoh model, and significantly lower than the \(\overline{\text {err}}\) of 49.94% of the classical mapping model.

Fig. 8
figure 8

a Comparison of the evolution of mean percent relative error (\(\overline{\text {err}}\)) in the predictions of the surrogate model and the classical mapping model in the training regime (\(\lambda \in [1,1.25]\)), as a function of the training dataset size (\(N_{h,\text {iso}}\)). b The corresponding \(\overline{\text {err}}\) versus \(N_{h,\text {iso}}\) responses of the present surrogate model and the classical mapping model, for their predicted responses in the overall testing regime (\(\lambda \in [0.5,1.5] \cup \gamma \in [0,0.5]\))

Figure 7a shows the predicted shear stress versus shear strain responses from the three models and the ground truth. Here, the classical mapping model predicts zero stress regardless of the applied strain. This is attributed to the sole consideration of uniaxial deformation in the training dataset (which constitutes only a line segment in the six-dimensional strain-space that needs to be explored), in which non-diagonal components in the \(\textbf{C}\) and \(\textbf{S}_{h, \text {iso}}\) tensors were always zero. As our surrogate model is based on the mapping of strain invariants with response coefficients, it does not suffer from this limitation and leads to accurate shear stress predictions as shown in Fig. 7a. Note that even though the simple shear deformation mode generates multiple non-zero stress components (viz., \(\textbf{S}_{h, \text {iso}, 11}\), \(\textbf{S}_{h, \text {iso}, 12}\), \(\textbf{S}_{h, \text {iso}, 22}\), and \(\textbf{S}_{h, \text {iso}, 33}\)), only the dominant 12-component is shown for brevity. In terms of the percent relative error metric that includes all non-zero stress components and is plotted in Fig. 7b, our surrogate model leads to a maximum \(\text {err}\) value of 13.46% across the shear deformation mode, with a mean error of 10.09%. For comparison, the maximum \(\text {err}\) value for the Yeoh model is 14.59%, while the mean error \(\overline{\text {err}}\) is 7.54%. Overall, it is seen that both the surrogate model of this study and the Yeoh model exhibit reasonable prediction accuracy across the entire testing regime (uniaxial + shear), but only our surrogate model results in stress–strain predictions that are physically plausible across all the investigated deformation modes.

Figure 8 shows the evolution of mean percent relative error \(\overline{\text {err}}\) in the training and overall testing regimes, with the training dataset size \(N_{h,\text {iso}}\). Similar to the hydrostatic loading case, our surrogate model results in a monotonically decreasing \(\overline{\text {err}}\) versus \(N_{h,\text {iso}}\) response with asymptotic behavior in both cases. On the other hand, the classical mapping model shows an asymptotic decrease in \(\overline{\text {err}}\) only in the testing regime, while showing a slight increase in the training regime. Ultimately, while both the models show very good fitting accuracy for every investigated training dataset size, our surrogate model results in significantly more accurate predictions across multiple deformation modes. This ability to seamlessly transition between deformation modes without loss of accuracy is a distinctive feature of trustworthy constitutive models [88].

5.3 The \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) surrogate model under dynamic uniaxial and shear loading

Thus far, the performance of the physics-informed data-driven constitutive model of this study has been evaluated only under quasi-static loading conditions, when the applied strain rate is so small that viscous dissipation remains negligible. Now, consider an isochoric dynamic uniaxial loading at a finite applied loading rate, equivalently following

$$\begin{aligned} \textbf{F}_{app}&= \lambda \textbf{e}_1 \otimes \textbf{E}_1 \nonumber \\&\quad + \frac{1}{\sqrt{\lambda }} \left( \textbf{e}_2 \otimes \textbf{E}_2 + \textbf{e}_3 \otimes \textbf{E}_3\right) \quad \text {with} \hspace{2 pt} \lambda \in [1,1.5] \end{aligned}$$
(56a)
$$\begin{aligned} \dot{\textbf{F}}_{app}&= \dot{\lambda } \left[ \textbf{e}_1 \otimes \textbf{E}_1 - \frac{1}{2 \lambda ^{3/2}} \left( \textbf{e}_2 \otimes \textbf{E}_2 + \textbf{e}_3 \otimes \textbf{E}_3\right) \right] \nonumber \\&\quad \text {with} \hspace{2 pt} \dot{\lambda } \in [10,100] \end{aligned}$$
(56b)

where \(\dot{\lambda }=d\lambda /dt\) is the applied uniaxial strain rate. At any given time t during loading, \(\lambda = \dot{\lambda }t\).

Dynamic loading of the form of Eq. (56) is applied during high strain rate tension tests on soft materials such as the tensile Kolsky bar experiment (e.g., see [22, 55]). The stress–strain (or stretch)–strain rate data from these tests combined with the stress–strain (or stretch) data from quasi-static uniaxial tests are then used to calibrate visco-hyperelastic constitutive models. As a number of visco-hyperelastic constitutive models (see [30] for a review) assume an additive decomposition of total stress into hyperelastic (rate-independent) and viscous overstress components (e.g., see Eq. (14)), the quasi-static test data is exclusively used to calibrate the hyperelastic model (\(\bar{W}_h\)) parameters, while the viscous overstress that is calculated by subtracting quasi-static stress–strain data from high strain rate stress–strain data is used to calibrate the model parameters of the viscoelastic component of the constitutive model (e.g., \(\bar{W}_v\) in Eq. (14)). In this study, the 3-parameter Upadhyay–Subhash–Spearot (USS) viscous dissipation potential [30, 33] is utilized to generate the viscous overstress training data for developing the \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) surrogate model,

$$\begin{aligned} \bar{W}_v^\text {USS}&= k_{11} \bar{J}_2 \sqrt{\bar{I}_1 - 3} + \frac{k_{21}}{c_{21}} \bar{J}^{c_{21}}_5 \sqrt{\bar{I}_2 - 3} \end{aligned}$$
(57a)
$$\begin{aligned} \textbf{S}_{v,\text {iso}}^\text {USS}&= 2 \frac{\partial \bar{W}_v^\text {USS}}{\partial \dot{\textbf{C}}} \nonumber \\&= J^{-2/3}\left[ \left( 4 k_{11} \sqrt{\bar{I}_1 - 3} \right) \text {Dev}(\dot{\bar{\textbf{C}}}) \right. \nonumber \\&\quad \left. + \left( 2 k_{21} \bar{J}_5^{c_{21} - 1} \sqrt{\bar{I}_2 - 3} \right) \text {Dev} (\bar{\textbf{C}}\dot{\bar{\textbf{C}}} + \dot{\bar{\textbf{C}}}\bar{\textbf{C}})\right] \end{aligned}$$
(57b)

where \(k_{11}\) and \(k_{21}\) are the linear and nonlinear rate sensitivity control parameters, respectively, and \(c_{21}\) is the rate sensitivity index. We use \(k_{11} = 1\), \(k_{21} = 1\), and \(c_{21} = 0.75\) for generating the training data. The expressions for invariants \(\bar{I}_1\), \(\bar{I}_2\), \(\bar{J}_2\), and \(\bar{J}_5\) are given in Eq. (15). Just like the quasi-static loading case, \(J = J_{app} = \text {det}(\textbf{F}_{app})\), \(\textbf{C} = \textbf{C}_{app} = \textbf{F}_{app}^\text {T} \textbf{F}_{app}\), and \(\bar{\textbf{C}} = \bar{\textbf{C}}_{app} = J^{-2/3} \textbf{C}_{app}\) are the applied deformations derived from Eq. (56); \(\dot{\bar{\textbf{C}}} = \dot{\bar{\textbf{C}}}_{app}\) is the time rate of change of \(\bar{\textbf{C}}_{app}\).

The training dataset \(\mathcal {D}_{v,\text {iso}} = \mathcal {D} = \{ \textbf{C}_{app}, \dot{\textbf{C}}_{app}, \textbf{S}_{v,\text {iso}}^{\text {USS}}\}\) (see Eq. (21)) generated from the USS model is used to train the \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) surrogate model (Eq. (28)). Similar to the previous quasi-static cases (involving \(\textbf{S}_\text {vol}\) and \(\textbf{S}_{h,\text {iso}}\)), the normalization condition is implicitly enforced due to the consideration of stress-free reference states (i.e., when \(\textbf{F}_{app} = \textbf{I}\)) in the training data and assigning a near-zero Gaussian noise of \(\alpha \) = \(10^{-5}\) (Eq. (36)) at those data points (a relatively higher \(\alpha \) value of \(10^{-2}\) was assigned at all other data points). The second law of thermodynamics constraint, on the other hand, is enforced via C-GPR as the chosen regression method. Here, the inequality constraint on the hyperparameter optimization problem in Eq. (42) is enforced at all training input data points (i.e., the constraint points are the same as the training data points; \(N_c = N_{v,\text {iso}}\)). The trained surrogate model can predict viscous overstress \(\widetilde{\textbf{S}}_{v,\text {iso}}\) for any given applied tensors \(\textbf{C}\) and \(\dot{\textbf{C}}\).

Fig. 9
figure 9

a Comparison of the numerically generated training data of isochoric viscous overstress component \(\textbf{S}_{v,\text {iso,11}}\) versus uniaxial stretch (\(\lambda \)) at multiple stretch rates (\(\dot{\lambda }\)), with the corresponding surrogate model predictions. Inset shows a schematic illustration of the unconfined uniaxial tension deformation mode. b The corresponding \(\textbf{S}_{v,\text {iso,22}}\) versus \(\lambda \) responses at fixed \(\dot{\lambda }\) values from the training data and the surrogate model. c Evolution of the percent relative error (\(\text {err}\)) of surrogate model predictions versus \(\lambda \) in the training regime (dashed line: mean \(\text {err}\) across all stretch rates; shaded region: mean ± standard deviation)

Figure 9 investigates the fitting response of the data-driven surrogate model in the training regime of uniaxial tension loading (Eq. (52)) that is composed of 5 uniformly distributed stretch rates in the range of \(\dot{\lambda } \in [10, 100]\) (viz., \(\dot{\lambda } = 10, 32.5, 55, 77.5,\) and 100), and 31 uniformly distributed stretch values in the range of \(\lambda \in [1,1.5]\) for every stretch rate considered. The total number of training data points is thus \(5 \times 31 = 155\). Both the non-zero stress-components under uniaxial tension, \(\textbf{S}_{v, \text {iso,11}}\) and \(\textbf{S}_{v, \text {iso,22}}\), are plotted in Fig. 9a, b, respectively. An excellent agreement between model predictions and the stress–stretch–stretch rate training data is apparent from these plots. The corresponding scalar fitting error (\(\text {err}\)) versus stretch response is shown in Fig. 9c; here, the line represents the mean \(\text {err}\) values averaged across all the investigated stretch rates, and the shaded region represents the vertical range of mean ± standard deviation. The \(\text {err}\) versus \(\lambda \) response starts at relatively high values at small stretches and then assumes consistently small values close to zero for \(\lambda > 1.1\). The large \(\text {err}\) at small stretch values is attributed to the very small stresses in the small strain regime in the training data (i.e., the denominator in the formula for \(\text {err}\)), which causes spuriously high \(\text {err}\) even when the absolute differences between true and predicted stresses are very small. Overall, the mean percent relative error \(\overline{\text {err}}\) across all the 155 training data points is 1.73%, suggesting a very good model fitting accuracy.

Fig. 10
figure 10

Comparison of the numerically generated isochoric viscous overstress–uniaxial stretch data at multiple stretch rates from the USS model (i.e., ground truth) in the uniaxial testing regime (\(\lambda \in [0.5, 1.75]\), \(\dot{\lambda } \in [-145,145]\)) with the corresponding predictions of our surrogate model, the Pioletti model, and the classical mapping model: a \(\textbf{S}_{v,\text {iso,11}}\) versus \(\lambda \) under uniaxial tension, b \(\textbf{S}_{v,\text {iso,22}}\) versus \(\lambda \) under uniaxial tension, c, e \(\textbf{S}_{v,\text {iso,11}}\) versus \(\lambda \) under uniaxial compression, and d, f \(\textbf{S}_{v,\text {iso,22}}\) versus \(\lambda \) under uniaxial compression. Note: the light gray lines in the background are the data corresponding to the stretch rates \(\dot{\lambda } = \pm 32.5, \pm 77.5, \pm 100\) and \(\pm 122.5\), which were also considered in model testing but are not shown in this figure for clarity

The accuracy and physical plausibility of our data-driven model predictions are now investigated in a wider testing regime that consists of tension, compression, and simple shear deformation modes. Here, the two uniaxial deformation modes (i.e., tension and compression) comprise stretch rates in the range of \(\dot{\lambda } \in [-145,145]\) and stretch values in the range of \(\lambda \in [0.5,1.75]\) (cf. Equation (56)). Specific stretch rate values considered in model testing are: \(\dot{\lambda } = \pm 10, \pm 32.5, \pm 55, \pm 77.5, \pm 100, \pm 122.5,\) and \(\pm 145\). Now, the dynamic simple shear deformation is given by

$$\begin{aligned} \textbf{F}_{app}&= \textbf{I} + \gamma \textbf{e}_1 \otimes \textbf{E}_2 \quad \text {with} \hspace{2 pt} \gamma \in [0,0.5] \end{aligned}$$
(58a)
$$\begin{aligned} \dot{\textbf{F}}_{app}&= \dot{\gamma } \textbf{e}_1 \otimes \textbf{E}_2 \quad \text {with} \hspace{2 pt} \dot{\gamma } \in [10,145] \end{aligned}$$
(58b)

where \(\dot{\gamma }=d\gamma /dt\) is the applied shear strain rate. At any given time t during loading, \(\gamma = \dot{\gamma }t\). For model testing, we consider shear strain rates in the range of \(\dot{\gamma } \in [10, 145]\) (specifically, \(\dot{\gamma } = 10, 32.5, 55, 77.5, 100, 122.5,\) and 145); for each strain rate, shear strains in the range of \(\gamma \in [0, 0.5]\) are considered. The overall testing regime outstretches the training regime both in terms of the maximum stretch/strain and stretch-/strain-rate magnitudes and spans multiple deformation modes. Like the previous two quasi-static cases, the present model predictions are compared with those from the classical mapping model (Eq. (47)) and an existing constitutive model from the literature. Here, we choose the Pioletti model [39, 40],

$$\begin{aligned} \bar{W}_v^\text {PL}&= \frac{\eta '}{4} \left( \bar{I}_1 - 3\right) \bar{J}_2 \end{aligned}$$
(59a)
$$\begin{aligned} \textbf{S}_{v,\text {iso}}^\text {PL}&= 2 \frac{\partial \bar{W}_v^\text {PL}}{\partial \dot{\textbf{C}}} \nonumber \\&= J^{-2/3} \eta ' \left( \bar{I}_1 - 3\right) \text {Dev}(\dot{\bar{\textbf{C}}}) \end{aligned}$$
(59b)

The Pioletti model was calibrated by fitting its predicted tensile stress–stretch–stretch rate response (i.e., the \(\textbf{S}^{\text {PL}}_{v, \text {iso,11}}\)\(\lambda \)\(\dot{\lambda }\) response) to the corresponding response in the training data (generated using the USS model), using nonlinear least squares optimization. This resulted in \(\eta '\) = 6.94. Figure 10a–b compare the tensile stress versus stretch responses in the testing regime at three representative stretch rates (for clarity) as predicted by the present data-driven surrogate model, the classical mapping model, and the Pioletti model, with the ground truth (i.e., USS model predictions). Out of the three stretch rates shown in these figures, \(\dot{\lambda }\) = 10 and \(\dot{\lambda }\) = 55 belong to the training subset, while \(\dot{\lambda }\) = 145 belongs to the testing dataset that was not considered during training. In the training regime (i.e., \(\lambda \in [1, 1.5]\) and \(\dot{\lambda } \in [10, 100]\)), both the data-driven models exhibit excellent fitting performance as their predicted tensile stress–stretch responses nearly overlap the ground truth. The Pioletti model, on the other hand, results in a relatively poor fitting accuracy, which is attributed to its overly simple mathematical form with only one model parameter. In terms of the scalar percent relative error metric plotted in Fig. 11 (averaged across all investigated strain rates), the average prediction error of our surrogate model in the training regime is the lowest at 1.73%, followed by the classical mapping model (4.67%) and the Pioletti model (40.59%). Outside the training regime in the tension deformation mode when \(\lambda > 1.5\) or \(\dot{\lambda } > 100\) (see Fig. 10a–b), the Pioletti model remains the worst performing model among the three investigated models. This time, however, the stress predictions of the classical mapping model also differ considerably from the ground truth, especially at high stretch rates. Overall, in the tensile deformation mode, the mean percent relative error, \(\overline{\text {err}}\), of our surrogate model is just 3.40%, which is considerably lower than that of the classical mapping model (12.30%) and the Pioletti model (41.59%).

Fig. 11
figure 11

Comparison of the percent relative error (\(\text {err}\)) versus \(\lambda \) responses of the present surrogate model, the Pioletti model, and the classical mapping model. The lines represent mean \(\text {err}\) across all investigated strain rates, and the shaded regions represent mean ± standard deviation

Fig. 12
figure 12

Comparison of the internal dissipation (\(\Xi _{int}\)) versus \(\lambda \) responses of the present surrogate model and the classical ML-based mapping model. Note: the light gray lines in the background are the data corresponding to the stretch rates \(\dot{\lambda } = \pm 32.5, \pm 77.5, \pm 100\) and \(\pm 122.5\), which were also evaluated in model testing but are not shown in this figure for clarity

In the compression deformation mode (Fig. 10c–f), the stress predictions of all three models show disagreement with the ground truth to varying degrees. Nevertheless, the stress–stretch plots of our data-driven surrogate model and the Pioletti model are physically reasonable and exhibit the expected compressive nature (i.e., negative \(\textbf{S}_{v, \text {iso,11}}\)), monotonicity at every investigated stretch rate, and compression–tension asymmetry in the 11-loading direction (i.e., stiffer response in compression than in tension). The classical mapping model, on the other hand, predicts tensile stresses under compression (see Fig. 10e–f), which is physically implausible. Owing to these unreasonable predictions, the mean percent relative error \(\overline{\text {err}}\) of this model, 253.14%, is an order of magnitude higher than the corresponding mean errors of our surrogate model (39.69%) and the Pioletti model (45.59%).

Fig. 13
figure 13

a Comparison of the numerically generated isochoric shear viscous overstress (\(\textbf{S}_{v,\text {iso,12}}\))–shear strain (\(\gamma \)) data from the USS model (i.e., ground truth) in the shear testing regime (\(\gamma \in [0, 0.5]\), \(\dot{\gamma } \in [10,145]\)) with the corresponding predictions of our surrogate model, the Pioletti model, and the classical mapping model. b The corresponding percent relative error (\(\text {err}\)) versus \(\gamma \) responses of these three models. The lines represent mean \(\text {err}\) across all investigated strain rates, and the shaded regions represent mean ± standard deviation. c Comparison of the internal dissipation (\(\Xi _{int}\)) versus \(\gamma \) responses of the two data-driven models: our surrogate model and the classical mapping model. Note: in (a) and (c), the background light gray lines are the data corresponding to the strain rates \(\dot{\gamma } = 32.5, 77.5, 100,\) and 122.5, which were also evaluated in model testing but are not shown for clarity

The predictions of the two data-driven models, the present surrogate model and the classical mapping model, are now analyzed in light of their conformity to the second law of thermodynamics in the testing regime. In this regard, Fig. 12 plots the internal viscous dissipation \(\Xi _{int}\) as a function of stretch at multiple stretch rates as predicted by these two models. In the tension deformation mode, both the models result in non-negative viscous dissipation, thus satisfying the second law of thermodynamics (see Eq. (41). However, in compression, the classical mapping model predictions show a negative internal dissipation at every investigated stretch rate, which violates the second law of thermodynamics. This causes the model to predict physically unreasonable stress–stretch–stretch rate predictions as were seen in Fig. 10e–f. Owing to its physics-informed construction that enforces the non-negativity of viscous dissipation, our surrogate model prevents this issue and its predicted uniaxial mechanical response conforms to the second law of thermodynamics throughout the uniaxial testing regime.

Fig. 14
figure 14

a Comparison of the evolution of mean percent relative error (\(\overline{\text {err}}\)) in the predictions of the surrogate model and the classical mapping model in the training regime (\(\lambda \in [1,1.5], \dot{\lambda } \in [10, 100]\)), as a function of the training dataset size (\(N_{v,\text {iso}}\)). b The corresponding \(\overline{\text {err}}\) versus \(N_{v,\text {iso}}\) responses of the present surrogate model and the classical mapping model, for their predicted responses in the overall testing regime (\([\lambda \in [0.5,1.75], \dot{\lambda } \in [-145, 145]] \cup [\gamma \in [0,0.5]], \dot{\gamma } \in [10,145]\))

The performance of our surrogate model in the simple shear deformation mode is analyzed and compared with the Pioletti and classical mapping models in Fig. 13. Like the quasi-static hyperelastic case of Sect. 5.2, the classical model predicts a zero shear stress component (i.e., \(\textbf{S}_{v, \text {iso, 12}}\)) regardless of the applied shear strain and strain rates (see Fig. 13a). Note, the diagonal stress components predicted by the classical mapping model in shear loading are non-zero. Among the other two models under comparison, both the present surrogate model and the Pioletti model lead to physically plausible predictions, but our surrogate model predictions are in better agreement with the ground truth compared to the Pioletti model predictions. The percent relative error versus shear strain responses (averaged across investigated shear strain rates) of the three models are compared in Fig. 13b. The mean percent error \(\overline{\text {err}}\) of our surrogate model across the shear deformation mode is 15.86%, which is significantly lower than the corresponding mean error of the Pioletti model at 60.43%, and more than two orders of magnitude lower compared to the \(\overline{\text {err}}\) of the classical mapping model, 2453.32%.

From Fig. 13a–b, it is clear that among the two data-driven constitutive models under comparison, only the surrogate model of this work results in trustworthy predictions that maintain physical plausibility even under large strain and at high strain rates and more importantly at deformation modes completely unseen by the model, confirming its ability to generalize proficiently. Figure 13c plots the internal viscous dissipation versus shear strain at multiple shear strain rates from these models. Unsurprisingly, the model predictions of our surrogate model show non-negative dissipation, thus obeying the second law of thermodynamics. Similar to the compression deformation mode, the classical mapping model in shear deformation exhibits negative dissipation that is inconsistent with the second law of thermodynamics.

Finally, the effect of training dataset size \(N_{v,\text {iso}}\) on the mean percent relative error \(\overline{\text {err}}\) of the two data-driven models is studied in Fig. 14. Here, \(N_{v,\text {iso}}\) is the product of the number of distinct tensile stretch rates in the range of \(\dot{\lambda } \in [10, 100]\) considered during training and the number of stretch values in the range of \(\lambda \in [1,1.5]\) for every stretch rate case (e.g., in Fig. 9, \(N_{v,\text {iso}}\) was equal to \(5 \times 31 = 155\)). We consider nine \(N_{v,\text {iso}}\) values: 155 (5 (stretch rates) \(\times \) 31 (stretch values)), 186 (\(6 \times 31\)), 217 (\(7 \times 31\)), 305 (\(5 \times 61\)), 366 (\(6 \times 61\)), 427 (\(7 \times 61\)), 455 (\(5 \times 91\)), 546 (\(6 \times 91\)), and 637 (\(7 \times 91\)). The effect of \(N_{v,\text {iso}}\) on \(\overline{\text {err}}\) values computed for the training regime (i.e., the fitting error) and for the overall testing regime (tension \(+\) compression \(+\) shear) are shown in Fig. 14a, b, respectively. From these plots, the mean percent relative errors of both the data-driven models in the training regime are very small (maximum \(\overline{\text {err}}\) value of less than 5%), demonstrating a consistently high fitting accuracy. Further, both the \(\overline{\text {err}}\) versus \(N_{v,\text {iso}}\) responses exhibit a generally decreasing trend. In the testing regime, the responses of the two data-driven models differ significantly. On one hand, the classical mapping model leads to extremely high mean percent relative errors that generally increase with the training dataset size. On the other hand, our data-driven model consistently results in reasonably small mean percent relative errors (< 20%) for all investigated training dataset sizes, revealing a good prediction performance even when data availability is limited.

The three numerical examples of this section demonstrate the robustness of the proposed physics-informed data-driven modeling framework in capturing highly nonlinear mechanical responses with limited training data and generalizing seamlessly to other stress states. Although we used noise-free training data from phenomenological constitutive models in these examples, our surrogate models perform well even when the training data is noisy. Appendix D provides model fitting and prediction results for the isochoric dynamic loading case of Sect. 5.3 when noisy rate-dependent stress–strain data is used to train the \(\widetilde{\mathcal {M}}_{v,\text {iso}}\) surrogate model.

6 Summary and discussion

This work presents a framework for the development of physics-informed data-driven constitutive models to describe the short-time, strain-rate-dependent mechanical response of soft materials. A major motivation of this work is the limitations of the traditional continuum thermodynamics-based and the machine learning-based constitutive models: while the former have a limited fitting and prediction accuracy owing to their fixed mathematical form and require expert model selection based on experimental observations (e.g., choice of the hyperelastic model), the latter require exorbitant amounts of training data (experiments that would be required to uniformly sample strain space are not physically plausible) and generally result in poor out-of-sample predictions (poor generalization performance). Our proposed framework takes a significant step toward eliminating these limitations by combining the physics-informed nature of continuum thermodynamics with the highly accurate and flexible regression capability of supervised ML. The result is a fully data-driven constitutive model that can capture complex material response features with high accuracy without any expert intervention, yields physically reasonable and accurate out-of-sample predictions, and can be trained with a small amount of training data that is achievable from simple contemporary experiments. As some of these points were previously exhibited for hyperelasticity and elastoplasticity [12,13,14] this work outlines the extension of this general framework toward capturing the rate-dependent response of viscoelastic materials at finite deformations.

The formulation of our data-driven constitutive model is based on the generalized stress–strain–strain rate equations of the continuum thermodynamics-based framework of external state variable driven viscous dissipation-based visco-hyperelasticity. In these equations, the total stress is additively decomposed into volumetric, isochoric hyperelastic, and isochoric viscous overstress components. Each of these stress components is written as linear combinations of the components of an integrity basis. This type of linear relationship allowed us to propose three data-driven surrogate model mappings—\(\widetilde{\mathcal {M}}_{\text {vol}}\), \(\widetilde{\mathcal {M}}_{h,\text {iso}}\), and \(\widetilde{\mathcal {M}}_{v,\text {iso}}\)—to capture each of the three stress components, respectively. These surrogate models map strain / strain rate invariants to the coefficients of the integrity basis that make up their corresponding stress components. It is shown that this type of model construction ensures key physics-based constraints on the predicted response: principles of local action, determinism, material frame-indifference, the balance of angular momentum, isotropic material symmetry, and limited memory. Further, owing to the exact inference property of the GPR supervised learning method (both standard and C-GPR) and the special inequality constraint capability of C-GPR, the proposed surrogate models also respect the normalization condition and the second law of thermodynamics.

The performance of each of the three surrogate models that form our constitutive model was evaluated by fitting them to a small numerically-generated training dataset, each obtained from one deformation mode—corresponding to common experimental protocols—and then applying the trained model to predict material responses in a significantly wider testing regime comprising multiple distinct deformation modes. In every case, our model’s performance was compared with those of a traditional continuum thermodynamics-based constitutive model and a classical ML-based mapping model. The results showed that our models provide critical improvements in describing material responses compared to the other modeling frameworks. For example, under hydrostatic loading, it was seen that the \(\widetilde{\mathcal {M}}_{\text {vol}}\) surrogate model predictions are in excellent agreement with the ground truth both in the training regime of confined compression (\(J \in [0.75,1]\)) and in the overall testing regime comprising confined compression and tension (\(J \in [0.5,1.5]\)). In comparison, the traditional neo-Hookean model resulted in large prediction errors in the testing regime and the classical mapping model resulted in large errors as well as a physically unreasonable softening response in confined tension.

The pathology of predicting unreasonable / thermodynamically unstable physical responses in deformation modes that are not considered in training was consistently noted in the classical mapping approach throughout this study. For example, in the isochoric quasi-static loading case, the classical mapping model trained under uniaxial tension violated stress-stretch monotonicity and compression–tension asymmetry under uniaxial compression, and resulted in a zero shear stress regardless of the applied deformation under simple shear loading. Here, even the traditional Yeoh model resulted in a physically unreasonable softening response in compression. In the isochoric dynamic loading case, the classical mapping model trained under rate-dependent uniaxial tension predicted tensile stresses under compression and no shear stresses under simple shear. It was seen that these model predictions violated the second law of thermodynamics through negative internal dissipation in both the compression and shear deformation modes (the two deformation modes not considered in training). Notably, most efforts in ML-enabled constitutive models in the literature follow the so-called classical approach of mapping strain components to stress components, which is evidently insufficient for performing tasks that traditional phenomenological and micromechanical modeling excelled at. Our physics-informed data-driven surrogate models resulted in physically-reasonable mechanical responses that obeyed the second law across all the investigated deformation modes (training and testing) and their prediction errors were consistently at the lowest level (among the three model types).

The physics-informed nature of our constitutive model also eliminates the requirement of large training datasets by restricting the solution space of possible surrogate model parameters based on physical laws [89]. Notably, our model works in both the low-data regime and in the limited-data regime (due to experimental limitations) allowing it to extrapolate beyond stress states seen by simple experiments. For each of the three surrogate models of this work, a reasonably good prediction accuracy over the full testing regime was obtained with relatively small-sized training datasets like the ones commonly seen in the experimental literature. In addition, the prediction accuracy generally improved when the training dataset size was increased (notably, still in the limited-data regime, utilizing synthetic data in deformation modes consistent with easily accessible experiments). On the other hand, classical ML-based mapping models resulted in very high prediction errors at low training dataset sizes, and these errors did not necessarily decrease with increasing data volume. In fact, in the hydrostatic and the isochoric dynamic loading cases, the prediction errors of these black-box models increased when large datasets were employed for training them, which coincided with highly non-physical predictions in the testing regime. Overall, even the lowest mean percent relative error of the classical mapping models, in any case, was several times the peak mean percent relative error of our surrogate models.

Recall that the proposed data-driven constitutive model is only applicable to short-time responses that are dominant under high strain rate deformation. Under sufficiently slow loading rates, long-time effects such as relaxation and creep become important. In addition, our current model enforces material isotropy as one of the physics-based constraints. While many soft materials (e.g., hydrogels, tissues, and elastomers) are considered isotropic, some are anisotropic (e.g., the white matter of the brain [90]). In future work, efforts to add long-time effects and anisotropy modeling capabilities in the current framework will be pursued. We will also further investigate model generalizability to 3D stress states by training our surrogate model with direct experimental data of simple deformations (e.g., tension, shear, etc.) and comparing model predictions with full-field strain data from complex mechanical experiments (e.g., large deformation indentation). Finally, we plan to explore model-free strategies and compare the results of our model-based framework with available model-free algorithms, even though we expect that it is even more challenging from the model-free perspective to generalize to unseen loading conditions.