1 Introduction

This paper proposes new continuum descriptions of stress, strain, and related quantities that are intermediate in detail between tensor mechanics and particle mechanics. The descriptions help to bridge the gap between these disciplines, and provide new ways to use information from both in the construction of continuum constitutive models.

Cauchy’s [13] concept and equations for continuum mechanical stress assumes that effects of granularity become smoothed out if large numbers of particles are considered. This assumption serves well for linear elastic materials, but does not appear to have been formally proved for more complex behaviours. Terzaghi [72, 73] proposed a first modification, the now well-established Principle of Effective Stress, which implies that the symmetric stress tensor \(\varvec{\sigma}^{{^{\prime } }}\) that is effective in terms of the stress–strain–strength behaviours of fully saturated, uncemented granular soils is related to the total (Cauchy) stress tensor σ by:

$${\varvec{\upsigma}}^{\prime } = {\varvec{\upsigma}} - u{\mathbf{\rm I}}$$
(1)

where u is the pore fluid stress, and I is the identity tensor. This principle is also used for sedimentary rocks [40, 85]. Explanations for its success include those of Bishop [3], Skempton [67], Mitchell and Soga [51], and others, who relate effective stress to inter-granular forces. Some limitations have been explored by Singh and Wallender [66] and others.

Effective stress “at a point” is a local concept, though referring to spatially distributed effects at microscopic scale in geomaterials. At macroscopic scale, let F represent the deformation gradient tensor from a reference configuration to a current configuration of a representative elementary volume of granular material [70]. Then the rate \(W^{\prime }\) of effective work done on the soil per unit particle volume is, in the absence of fluid flow:

$$W^{\prime } = - Vtr\left( {{\varvec{\upsigma}}^{\prime T} {\dot{\mathbf{F}}}.{\mathbf{F}}^{ - 1} } \right)$$
(2)

The negative sign is used when taking compressive stress as positive and work as positive when done on the material. V = 1+e is the specific volume, equal to the ratio of macroscopic volume to particle volume, e is the void ratio, and tr is the trace operator. Houlsby [39] discusses the complicating effects of fluid flow.

The new proposals made herein are intended to provide new connections between the microscopic and macroscopic scales. Note however that the new proposals involve some simplifications and therefore that the new connections will not be as explicit as those available in either the full discrete element method (DEM) or in a macroscopic model deduced solely from particle mechanics. This therefore represents a different approach to the ones taken by Jiang et al. [42], Liu and Chang [47],Wu et al. [82] and others who propose new continuum models based on results obtained with these more fundamental methods.

The present starting point is a proof, outlined by O’Sullivan [54] and attributed by her to several previous independent authors. The proof is widely accepted in DEM modelling and states that the effective stress in a dry granular matrix can be expressed in terms of the statistics of inter-particle forces in one of the following ways (differing for different researchers):

$${\varvec{\upsigma}}^{\prime } = \frac{1}{{V_{REP} }}\sum\limits_{c = 1}^{N} {\underline{{\mathbf{F}}}_{c} \otimes \underline{{\mathbf{L}}}_{c} } = \frac{1}{{V_{REP}^{\prime } }}\sum\limits_{c = 1}^{N} {\underline{{\mathbf{L}}}_{c} \otimes \underline{{\mathbf{F}}}_{c} }$$
(3)

where \(V_{REP}\) and \(V_{REP}^{\prime }\) are measures of macroscopic volume enclosing a representative group of particles, \(\underline{{\mathbf{F}}}_{c}\) is the inter-particle force at the cth contact, \(\otimes\) is the dyadic product, \(\underline{{\mathbf{L}}}_{c}\) is the inter-particle branch vector for that contact, and N is the number of contacts in the volume. A dyadic product is not in general symmetric, but the sum of many such products can be. Each contact is counted once.

The proposals herein are expected to be particularly useful in resolving two modelling problems for soils. One is that of induced anisotropy, defined as anisotropy induced by strain [12]. All natural soils appear to be anisotropic, with most being transversely anisotropic in situ, due partly to vertical deposition and partly to a subsequent history of one-dimensional vertical compression [31, 45, 51]. More complex forms of anisotropy can develop with different depositional and/or stress–strain histories [24]. Anisotropy and fabric evolution are active topics in granular matter research, see for example Cottechia et al. [20], Papadimitriou et al. [56], Chen et al. [15], Vijayan et al. [78], Wang et al. [79, 80], Zhao and Kruyt [84] and others.

The second is “plastic spin”, described by Dafalias and Aifantis [23] and Dafalias [22] for soils and by Soare [68] for crystalline metals. Dafalias [21] argued that plastic spin is a “necessary constitutive ingredient” for large deformation anisotropic elasto-plasticity with tensorial internal variables. Soare [68] interpreted plastic spin as the rate of rotation of a material frame, relative to the material itself, where the material frame is defined by axes that characterize a yield surface.

The following text starts with a description of the notation used herein for the familiar process of orientational averaging. This process is then applied in a new way to the above equation, resulting in a new relation between inter-particle point forces and effective stress. The familiar algebra for the deformation gradient is then used to deduce new expressions for strain rate and work rate in granular media. Some general implications for modelling are then considered. Finally, some simple constitutive assumptions are explored that reveal new information about simple behaviours, including about the origins of Poisson’s ratio for elastic granular materials. Application to elastoplastic models and critical states is also discussed.

2 Proposed functional representations

2.1 Directional notation for orientational averaging

The idea of averaging quantities over physical directions has a long history, and is part of the process of “homogenization” or “upscaling” by which micro-scale features can be averaged to create macroscale results. Examples for soils include Pande and Sharma’s [55] multi-laminate model in which macroscopic constitutive behaviour is constructed from simple behaviours that are individually associated with physical direction. Faria [29] discusses applications in mixture theory, and Altenbach et al. [1] discuss orientation averaging for fiber suspensions. Göodert and Hutter [30] use orientational averaging in their model of induced anisotropy in ice shields. Yimsiri and Soga [83] upscale their proposals for contact mechanics using a spherical integration method. Blumenfeld [4] proposed a specialised upscaling method to solve a problem related to fabric in isostaticity theory.

Figure 1 illustrates some concepts associated with physical direction. A direction emanating from a material point M may be represented by an arrow, or by a label ψ, or by a unit vector \(\underline{{\mathbf{n}}}_{\psi }\), or by a point P on a sphere centred on M. The unit vector may be represented in a variety of external coordinate systems, including a cartesian system {x,y,z} or a spherical system {r,θ,ϕ}:

$$\underline{{\mathbf{n}}}_{\psi } = \left[ {\begin{array}{*{20}c} {n_{\psi x} } \\ {n_{\psi y} } \\ {n_{\psi z} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {\sin \theta \cos \phi } \\ {\sin \theta \sin \phi } \\ {\cos \theta } \\ \end{array} } \right]$$
(4)

where the polar angle \({\theta }\) runs from 0 to π radians, and the azimuthal angle ϕ runs from 0 to 2π radians. The set of all possible directions may also be represented by the set of all points on the sphere.

Fig. 1
figure 1

Notation and concepts associated with physical direction

Let \(Q_{\psi }\) be a function of direction ψ, such that for every ψ there is a value of the quantity Q denoted as \(Q_{\psi }\). This value may be a scalar, or a vector (such as the unit vector \(\underline{{\mathbf{n}}}_{\psi }\)). It may be a tensor if the tensor can be associated in some way with direction, and an example used later is the tensor \(\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi }\) where \(\otimes\) is the dyadic (outer) product. This tensor can represent a unit uniaxial stress in the ψ direction, with:

$$\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } = \underline{{\mathbf{n}}}_{\psi } \underline{{\mathbf{n}}}_{\psi }^{T} = \left[ {\begin{array}{*{20}c} {n_{\psi x}^{2} } & {n_{\psi x} n_{\psi y} } & {n_{\psi x} n_{\psi z} } \\ {n_{\psi y} n_{\psi x} } & {n_{\psi y}^{2} } & {n_{\psi y} n_{\psi z} } \\ {n_{\psi z} n_{\psi x} } & {n_{\psi z} n_{\psi y} } & {n_{\psi z}^{2} } \\ \end{array} } \right]$$
(5)

The function \(Q_{\psi }\) may alternatively be determined by some different process, such as by a stress–strain history. Its average over all directions ψ is here denoted as \(\left\langle {Q_{\psi } } \right\rangle\) and could be calculated by integrating over the surface of the sphere and dividing by the surface area a:

$$\left\langle {Q_{\psi } } \right\rangle = \int\limits_{{}}^{{}} {Q_{\psi } \frac{{da_{\psi } }}{a}} = \int\limits_{{}}^{{}} {Q_{\psi } \frac{{d\varOmega_{\psi } }}{4\pi }}$$
(6)

where \(da_{\psi }\) is the area of an infinitesimal patch on the sphere centred on the point P representing the direction ψ, and \(d\varOmega_{\psi }\) is the solid angle in steradians that is subtended by the patch. The solid angle is the area of the patch divided by the square of the radius \(r_{\psi }\) to the centre of the patch, with \(r_{\psi }\) = 1 unit for a sphere of unit radius:

$$d\varOmega_{\psi } = \frac{{da_{\psi } }}{{r_{\psi }^{2} }} = \frac{{3dv_{\psi } }}{{r_{\psi }^{3} }}$$
(7)

where \(dv_{\psi }\) is the volume of a cone subtended at the centre by the patch. The full spherical surface subtends a solid angle of 4π steradians.

Equation 6 is just a definition of a directional average in three dimensions. Some details are nevertheless worth exploring. The area of a rectangular patch illustrated in Fig. 1 is the product of its polar dimension and its circumferential dimension. This product is \(\sin \theta d\phi d\theta\) for a unit radius, so:

$$\left\langle {Q_{\psi } } \right\rangle = \frac{1}{4\pi }\int\limits_{\phi = 0}^{2\pi } {\int\limits_{\theta = 0}^{\pi } {Q_{\psi } \sin \theta d\theta d\phi } }$$
(8)

Using this, the following results are obtained:

$$\left\langle {n_{\psi x}^{2} } \right\rangle = \left\langle {n_{\psi y}^{2} } \right\rangle = \left\langle {n_{\psi z}^{2} } \right\rangle = \frac{1}{3}$$
(9)
$$\left\langle {n_{\psi y} n_{\psi z} } \right\rangle = \left\langle {n_{\psi z} n_{\psi x} } \right\rangle = \left\langle {n_{\psi x} n_{\psi y} } \right\rangle = 0$$
(10)
$${\text{so}}:\;\;\left\langle {\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } } \right\rangle = \frac{1}{3}{\mathbf{I}}$$
(11)

where \({\mathbf{I}}\) is the 3 × 3 identity matrix. The trace of a matrix is the sum of its diagonal terms, so the trace of \(\left\langle {\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } } \right\rangle\) is 1. By transforming to a coordinate frame in which \(\underline{{\mathbf{n}}}_{\psi }\) acts along a coordinate axis, it can readily be verified that \(\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi }\) is a tensor with two principal values equal to zero and one equal to 1.

If a function \(Q_{\psi }\) has the same value for all directions, then its average is that value (the average of a constant is that constant). If a and b are two quantities that are independent of direction, and \(Q_{1\psi }\) and \(Q_{2\psi }\) are two functions of direction, then the average of the linear combination is the linear combination of the averages:

$$\left\langle {aQ_{1\psi } + bQ_{2\psi } } \right\rangle = a\left\langle {Q_{1\psi } } \right\rangle + b\left\langle {Q_{2\psi } } \right\rangle$$
(12)

In the constitutive context, a and b may be variables that change with time. A directional average is not itself a function of direction, and a useful result later will be that:

$$\left\langle {A_{\psi } \left\langle {B_{\psi } } \right\rangle } \right\rangle = \left\langle {A_{\psi } } \right\rangle \left\langle {B_{\psi } } \right\rangle = \left\langle {\left\langle {A_{\psi } } \right\rangle B_{\psi } } \right\rangle$$
(13)

where the average inside an averaging operator is done first. One might call this a “swap property” since the change from left to right involves swapping the position of the inner averaging operator. Another property follows from the fact that there are many functions of direction whose average is zero, for example \(\cos \theta\). If two functions have the same average, their difference is a function whose average is zero:

$${\text{If}}\;\;\left\langle {Q_{1\psi } } \right\rangle = \left\langle {Q_{2\psi } } \right\rangle ,\;{\text{then}}\,Q_{1\psi } = Q_{2\psi } + E_{\psi } \,$$
(14)

where \(E_{\psi }\) is some function with \(\left\langle {E_{\psi } } \right\rangle = 0\). One might call this the “extraction property”, since it allows a directional equation to be extracted from an equality of averages.

In tensor mechanics an invariant is a quantity whose value does not depend on physical direction or coordinate frame. The average \(\left\langle {Q_{\psi } } \right\rangle\) is a first invariant of the function \(Q_{\psi }\). But a directional function can also be associated with an infinite number of invariants, for example the average of its mth power is invariant, for any given m. The variance \(Q_{q}^{2}\) of \(Q_{\psi }\) is:

$$Q_{q}^{2} = \left\langle {\left( {Q_{\psi } - \left\langle {Q_{\psi } } \right\rangle } \right)^{2} } \right\rangle = \left\langle {Q_{\psi }^{2} } \right\rangle - \left\langle {Q_{\psi } } \right\rangle^{2}$$
(15)

The first expression for \(Q_{q}^{2}\) is the average of the square of the deviation \(Q_{\psi } - \left\langle {Q_{\psi } } \right\rangle\) from the average \(\left\langle {Q_{\psi } } \right\rangle\), and the second is then a standard result for variance in statistics. The deviator \(Q_{q}\) is a second invariant of the function, and would normally be taken non-negative. Its value does not depend on any physical direction.

A complement \(A_{\psi }^{\dag }\) of a function \(A_{\psi }^{\prime}\) can be defined as the average of \(A_{\psi }^{\prime}\) over directions that are normal to ψ. This is defined more formally in Appendix 1.

The label ψ in the expression \(\left\langle {Q_{\psi } } \right\rangle\) is a dummy variable, and may be replaced by any other if the meaning is clear. Thus \(\left\langle {Q_{\beta } } \right\rangle\) would be the same as \(\left\langle {Q_{\psi } } \right\rangle\) if the first average is understood as being taken for all directions β. Where there is ambiguity, the variable over which the average is taken can also be placed outside as a subscript. Functions of more than one direction can also be useful, and the algebra of such functions can be very similar to that of tensors. For example, a function \(C_{\psi }\) may be defined in terms of functions \(A_{\psi \beta }\) and \(B_{\beta }\) as:

$$C_{\psi } = \left\langle {A_{\psi \beta } B_{\beta } } \right\rangle_{\beta } = \int {A_{\psi \beta } B_{\beta } \frac{{d\varOmega_{\beta } }}{4\pi }}$$
(16)

This is analogous to the multiplication of two tensors, or the post-multiplication of a matrix A by a vector \(\underline{\mathbf{B}}\), except that the result is an average rather than a sum.

2.2 Effective stress

Figure 2 shows the sign convention used herein for positive components of the effective stress tensor. Normal stresses are positive in compression. The off-diagonal components are in the opposite directions to shear stresses denoted as \(\tau_{rs} = - \sigma_{rs}\).

Fig. 2
figure 2

Sign convention for components of the effective stress tensor

In Eq. 3, since \(\underline{{\varvec{\upsigma}}}^{\prime }\) is symmetric, only the symmetric parts of the tensors \(\underline{{\mathbf{F}}}_{c} \otimes \underline{{\mathbf{L}}}_{c}\) or \(\underline{{\mathbf{L}}}_{c} \otimes \underline{{\mathbf{F}}}_{c}\) are needed in calculating the macroscopic stress. Consistently with Cauchy [13], every symmetric stress tensor can be regarded as the sum of three uniaxial stress tensors, the ith of which represents a stress in a principal direction of the original tensor. Let \({\mathbf{u}}_{ic}\) be the uniaxial stress tensor representing a uniaxial stress in the ith principal direction of the dyadic product for the cth contact, so that:

$$sym\left( {\underline{{\mathbf{F}}}_{c} \otimes \underline{{\mathbf{L}}}_{c} } \right) = sym\left( {\underline{{\mathbf{L}}}_{c} \otimes \underline{{\mathbf{F}}}_{c} } \right) = \sum\limits_{i = 1}^{3} {{\mathbf{u}}_{ic} }$$
(17)

Substituting this into Eq. 3 gives a summation over 3N uniaxial stress tensors. Let these uniaxial stress tensors now be organized into groups, the ψth of which contains only stress directions that correspond to points in a particular patch on the unit sphere of area \(da_{\psi }\). Then:

$${\varvec{\upsigma}}^{\prime}= \frac{1}{{V_{REP}^{\prime \prime } }}\sum\limits_{c,i} {\underline{{\mathbf{u}}}_{ic} } = \sum\limits_{\psi } {\left( {\left( {\frac{1}{{V_{REP}^{\prime \prime } }}\sum\limits_{c,i,\gamma } {\underline{{\mathbf{u}}}_{ic\psi } } } \right) + \underline{{\varvec{\updelta}}}_{\psi } } \right)}$$
(18)

where the first summation on the right is taken over all the groups. \(V_{REP}^{\prime \prime }\) depends on which original expression is used from Eq. 3, and the quantity in brackets is the summation taken over all the members of the ψth group. Inside the second sum on the right, \(\underline{{\mathbf{u}}}_{ic\psi }\) is a uniaxial tensor with its one non-zero principal value in direction ψ, so it is a scalar multiple of \(\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi }\). The last term \(\underline{{\varvec{\updelta}}}_{\psi }\) describes the second-order deviations from direction ψ over the area of the patch. A stress magnitude \(\sigma_{\psi }^{\prime}\) can then be defined for the ψth group as:

$$\frac{1}{{V_{REP}^{\prime \prime } }}\sum\limits_{c,i,\gamma } {\underline{{\mathbf{u}}}_{ic\psi } } = 3\sigma_{\psi }^{\prime}\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } \frac{{da_{\psi } }}{a}$$
(19)

Combining this with the previous equation, and then putting the result in the DEM Eq. 3 for the stress tensor \(\underline{{\varvec{\upsigma}}}^{\prime }\), gives:

$${\varvec{\upsigma}}^{\prime } = \sum\limits_{\psi } {\left( {3\sigma_{\psi }^{\prime}\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } \frac{{da_{\psi } }}{a} + \underline{{\varvec{\updelta}}}_{\psi } } \right)}$$
(20)

Taking the limit as smaller and smaller patch sizes are considered, the deviations from direction ψ tend to zero, so the second order term \(\underline{{\varvec{\updelta}}}_{\psi }\) is expected to collapse to zero, and the summation will become an integral, giving:

$$\underline{{\varvec{\upsigma}}}^{\prime } = \left\langle {3\sigma_{\psi }^{\prime}\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } } \right\rangle$$
(21)

where \(\sigma_{\psi }^{\prime}\) has now become a continuous scalar function of direction ψ. It will be convenient to call this the “outer stress” function. Its derivation involves no special constraint, other than that it be integrable in the way described. It may for example be continuous or discontinuous, provided the integration remains possible and gives a definite result.

Equations 19 and 21 represent a new way of upscaling from particle mechanics to continuum mechanics, but the summation in Eq. 19 means that the method does not provide the inverse path, from continuum to particle mechanics. The upscaling is based on orientational averaging, and in this sense it follows many previous models in the literature. For example, Pande and Sharma’s [55] multi-laminate model involves averaging of the behaviours of laminates each of which is affected by several stresses. A key feature is that Eq. 21 has been proved, essentially simply from the assumption that macroscopic stress represents point forces at inter-particle contacts—the assumption underlying Eq. 3. Equation 21 does not involve any additional constitutive assumption. And the stress function \(\sigma_{\psi }^{\prime}\) has a clear physical meaning in terms of the statistics of the principal values and directions of stresses that are set up in the soil matrix by different combinations of differently oriented inter-particle contact forces and different brace vectors.

Equation 21 implies that each element of the stress tensor is a directional average of the outer stress function, for example, \(\sigma_{xx}^{\prime}= \left\langle {3\sigma_{\psi }^{\prime}n_{\psi x}^{2} } \right\rangle\). Since the symmetric tensor has only six independent elements, Eq. 21 puts only six constraints on the function, and the function can contain infinitely more information than the tensor. Indeed, the tensor contains no information at all about the function deviator \(\sigma_{q}^{\prime}\), which is defined as in the general Eq. 15, or about any of the infinite number of invariants \(\left\langle {\sigma {^\prime}_{\psi }^{m} } \right\rangle\) for m ≠ 1.

The stress function will have a basic symmetry that its value in a given direction is equal to its value in the diametrically opposite direction. Taking the trace of both sides of Eq. 21, noting that the trace of \(\underline{{\varvec{\upsigma}}}^{\prime }\) is three times the mean normal effective stress p’, and \(tr\left( {\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } } \right) = 1\), gives:

$$\left\langle {\sigma_{\psi }^{\prime}} \right\rangle = p^{\prime}$$
(22)

Also, if \(\sigma_{\psi }^{\prime}\) equals p’ for all directions ψ, then from Eqs. 11 and 21, the Cauchy stress tensor will be hydrostatic. However, the converse is not necessarily true. Many different functions can satisfy the same six constraints, so a hydrostatic tensor does not necessarily imply that the stress function equals p’ in all directions.

2.3 Strain rate and work

For tensor models, it is conventional to use strain rate or incremental strain parameters that are “work-conjugate” with the chosen measures of stress (e.g. [57, 70]. An extension of this idea to functions is straightforward. Using Eq. 21 to substitute for the stress tensor in Eq. 2, and organizing the result appropriately, gives:

$$\dot{W^{\prime}} = \left\langle {\dot{W}_{\psi }^{\prime}} \right\rangle$$
(23)
$${\text{with}}:\;\,\dot{W}_{\psi }^{\prime}= - \sigma_{\psi }^{\prime}\dot{V}_{\psi }$$
(24)
$$\dot{V}_{\psi } = - 3Vtr(\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } {\dot{\mathbf{F}}}.{\mathbf{F}}^{ - 1} )$$
(25)

\(\dot{V}_{\psi }\) might be conveniently called an “outer” volumetric rate. It can be different for different directions. There may be some relationship with Pietruszczak and Krucinski’s [59] proposal of “directional porosity”, but there is no general implication that \(\dot{V}_{\psi }\) is a differential of any quantity that is independent of the choice of reference configuration. \(\dot{W}_{\psi }^{\prime}\) is a work rate per unit particle volume associated with direction ψ. The first equation shows that the overall work rate \(\dot{W}^{\prime }\) is the first invariant of \(\dot{W}_{\psi }^{\prime}\). The product \({\dot{\mathbf{F}}}.{\mathbf{F}}^{ - 1}\) can be expressed as:

$${\dot{\mathbf{F}}}.{\mathbf{F}}^{ - 1} = {\dot{\mathbf{\theta }}} - {\dot{\mathbf{\varepsilon }}} = \left[ {\begin{array}{*{20}c} 0 & { - \dot{\theta }_{z} } & {\dot{\theta }_{y} } \\ {\dot{\theta }_{z} } & 0 & { - \dot{\theta }_{x} } \\ { - \dot{\theta }_{y} } & {\dot{\theta }_{x} } & 0 \\ \end{array} } \right] - \left[ {\begin{array}{*{20}c} {\dot{\varepsilon }_{xx} } & {\dot{\varepsilon }_{xy} } & {\dot{\varepsilon }_{xz} } \\ {\dot{\varepsilon }_{yx} } & {\dot{\varepsilon }_{yy} } & {\dot{\varepsilon }_{yz} } \\ {\dot{\varepsilon }_{zx} } & {\dot{\varepsilon }_{zy} } & {\dot{\varepsilon }_{zz} } \\ \end{array} } \right]$$
(26)

This rate equation has an equivalent form in terms of infinitesimals, and a way of using it in the context of a theory of large deformations appropriate to soils is discussed later. The first matrix is the anti-symmetric part of \({\dot{\mathbf{F}}}.{\mathbf{F}}^{ - 1}\), and contains rotation rates. The second is the symmetric parts (\(\dot{\varepsilon }_{rs} = \dot{\varepsilon }_{sr}\)), and contains strain rates with compression positive. In some soil mechanics conventions, shear strain rates may be represented as \(\dot{\gamma }_{rs} = - 2\dot{\varepsilon }_{rs}\) [45]. Evaluating the trace in Eq. 2 gives:

$$W^{\prime } = - V\left\{ \begin{aligned} {\sigma}_{xx}^{\prime}\dot{\varepsilon }_{xx} + \tau_{yz} \dot{\gamma }_{yz} + \\ \sigma_{yy}^{\prime}\dot{\varepsilon }_{yy} + \tau_{zx} \dot{\gamma }_{zx} + \\ \sigma_{zz}^{\prime}\dot{\varepsilon }_{zz} + \tau_{xy} \dot{\gamma }_{xy} \\ \end{aligned} \right\}$$
(27)

This is consistent with the familiar result that rotation rates are not associated with any working [70]. Using Eqs. 5 and 26 to expand the right side of Eq. 25 gives the rate of outer volume strain as:

$$\begin{aligned}\dot{V}_{\psi } = - 3V\left\{ \begin{aligned} n_{\psi x}^{2} \dot{\varepsilon }_{xx} + 2n_{\psi y} n_{\psi z} \dot{\varepsilon }_{yz} + \\ n_{\psi y}^{2} \dot{\varepsilon }_{yy} + 2n_{\psi z} n_{\psi x} \dot{\varepsilon }_{zx} + \\ n_{\psi z}^{2} \dot{\varepsilon }_{zz} + 2n_{\psi x} n_{\psi y} \dot{\varepsilon }_{xy} \\ \end{aligned} \right\} \\ = - 3Vtr(\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } {\dot{\mathbf{\varepsilon }}}) \end {aligned}$$
(28)

If special axes are considered with the x-axis in direction ψ, then \(n_{\psi x} = 1\) and \(n_{\psi y} = n_{\psi z} = 0\), and the strain rate \(\dot{\varepsilon }_{xx}\) will equal the normal compressive strain rate \(\dot{\varepsilon }_{\psi }\) in direction ψ. Hence the right side simplifies to:

$$\dot{V}_{\psi } = - 3V\dot{\varepsilon }_{\psi }$$
(29)

The work rate \(\dot{W}_{\psi }^{\prime}\) in Eq. 24 might then be interpreted as a measure of external work that is done on the material in compressing it in this direction, per unit particle volume. Using Eqs. 9, 10 and 28 to compute the average \(\left\langle {\dot{V}_{\psi } } \right\rangle\) gives:

$$\left\langle {\dot{V}_{\psi } } \right\rangle = - V\left\{ {\dot{\varepsilon }_{xx} + \dot{\varepsilon }_{yy} + \dot{\varepsilon }_{zz} } \right\} = \dot{V}$$
(30)

Thus the directional average of the outer volumetric rates \(\dot{V}_{\psi }\) is the rate of change \(\dot{V}\) of the overall specific volume V.

3 General aspects of constitutive modelling with functions

3.1 Noll’s axioms

Noll’s [53] four axioms for the constitutive modelling of “simple materials” have largely been accepted [49, 77]. The Axioms of Local Action, Frame Indifference (Objectivity), and Indifference to Rigid Transformation would seem to be directly transferable to functions through Eqs. 21 and 25, because the tensors in these equations satisfy those axioms.

3.2 Local action

The concept of local action is that the stress and other descriptions of the state of a material can be represented by descriptions that apply at infinitesimal points. Bower [7] states that, if the principle holds, “the stress at a point in the solid depends only on the change in shape of a vanishingly small volume element surrounding the point. It must therefore be a function of the deformation gradient or a strain measure that is derived from it.” Another implication is that the stress–strain behaviour at a point does not depend on the spatial gradient of stress at that point. So, for example, the equilibrium equations are formally independent of the constitutive behaviour.

It is easy to imagine that granular media might not obey this principle. Consider for example a hard particle that, as a result of a small change of loading conditions, slips at some inter-particle contact. If the particle is hard enough, there will not be enough local elasticity to take up the movement in the immediate vicinity of the slip, and the movement will necessarily spread to nearby particles. Their movements will spread further, and there seems to be no a priori reason to suppose that this spreading effect is limited in scale.

This paper explores the concept of spreading later. Lambe and Whitman [44] describe the usual assumption in practical engineering in the following way: “When we talk about the stresses at a point within a soil, we often must envision a rather large point”. On this basis, Eq. 3 is a local equation in the sense that it involves a summation over a representative volume that is small enough to be considered as a “large point”—small enough in comparison to the scale of a practical engineering problem, but large enough that the statistics involved in the summation are not sensitive to the actual size of the REV. It follows that Eq. 21 is also a local equation, since it applies to the same elementary volume.

The calculation of work rate in Eq. 2 involves the assumption that the deformation gradient has physical meaning. The present paper assumes that this is the case, implying that the gradient is not measured over dimension scales that are less than the order of magnitude of a large point. It would then follow that the outer volume rates in Eq. 25 are also local variables.

3.3 Objectivity

The concept of objectivity is that behaviour that is the property of a macroscopic physical system does not depend on the way it is observed [18, 34, 50, 52]. Wooseok et al. [81] argue that large errors occur in some finite element models due to the use of non-objective stress rates. For the present proposals, the algebra by which Eq. 21 was proved from the objective Eq. 3 ensures that the stress function \(\sigma_{\psi }^{\prime}\) is objective. Indeed, this is implicit in the definition of ψ as a physical direction. The following text verifies this formally.

Noll’s [53] Axiom of Material Frame-Indifference also expresses the expectation that constitutive behaviour of a material does not depend on the frame in which it is observed. Let Q be the frame transformation tensor that relates the value \(\underline{{\mathbf{r}}}\) of a vector in a given frame to its value \(\underline{{\mathbf{r}}}_{alt} = {\mathbf{Q}}\underline{{\mathbf{r}}}\) in an alternative frame. The tensor transformation law for stress is \({\varvec{\upsigma}}_{alt}^{\prime } = {\mathbf{Q\sigma }}^{\prime } {\mathbf{Q}}^{\text{T}}\) (e.g. [70]. Using Eq. 21 for \({\varvec{\upsigma}}^{\prime }\) gives:

$${\mathbf{\sigma^{\prime}}}_{alt} = {\mathbf{Q}}\left\langle {3\sigma_{\psi }^{\prime}\underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } } \right\rangle {\mathbf{Q}}^{\text{T}}$$
(31)

Since Q is independent of direction it can be brought inside the averaging operator. Now the dyadic product of two column vectors equals the result of post-multiplying the first vector by the transpose (to row vector) of the second (e.g. Eq. 5). Using this property then gives:

$${\mathbf{\sigma^{\prime}}}_{alt} = \left\langle {3\sigma_{\psi ,alt}^{\prime}\underline{{\mathbf{n}}}_{\zeta ,alt} \otimes \underline{{\mathbf{n}}}_{\zeta ,alt} } \right\rangle$$
(32)
$${\text{with}}:\underline{{\mathbf{n}}}_{\psi ,alt} = {\mathbf{Q}}\underline{{\mathbf{n}}}_{\psi }$$
(33)
$$\sigma_{\psi ,alt}^{\prime}= \sigma_{\psi }^{\prime}$$
(34)

The first equation has the same form as the original Eq. 21, confirming frame-indifference. The second equation defines a one-to-one correspondence between the values of the unit vector in direction ψ as evaluated in the two different frames. The third relates the stress functions in the two frames.

3.4 Determinism

Together with Locality, the Axiom of Determinism implies that a model for isothermal stress–strain behaviour needs to be capable of predicting the rates of all constitutive variables given the deformation rate and their current values. This might be represented by the following calculation sequence, where TCM represents a tensor-based model:

$${\dot{\mathbf{F}}\mathbf{F}}^{ - 1} \to \text{TCM} \to {\dot{\mathbf{\sigma }}}^{\prime }$$
(35)

This “roadmap” is the opposite compared to models of elasto-plasticity in which strain rates are determined from given stress rates, though this form can be achieved by inverting a model’s compliance matrix [28].

Equation 21 implies that the stress function is not necessarily determinable uniquely solely from the stress tensor. Equations 21 and 25 would imply that the above process can be expanded as follows, where FCM represents a functions-based model:

$${\dot{\mathbf{F}}\mathbf{F}}^{ - 1} \to \dot{V}_{\psi } \to \text{FCM} \to \dot{\sigma }_{\psi }^{\prime}\to {\dot{\mathbf{\sigma }}}^{\prime }$$
(36)

This raises the following issue. It would seem natural to expect that the FCM would need to be able to calculate the rate of change of the outer stress function for any given outer volumetric rate function \(\dot{V}_{\psi }\). However, Eq. 25 implies that the only functions that are involved in practice will be those that can be specified by six independent tensor components of strain rate. Since the set of all possible functions \(\dot{V}_{\psi }\) is infinitely larger than the set of functions that satisfy the six constraints, the question arises as to whether to impose the six constraints on the FCM or not.

Three arguments exist in favour of not imposing these six constraints on FCM. First, the calculation of \(\dot{V}_{\psi }\) from \({\dot{\mathbf{F}}\mathbf{F}}^{ - 1}\) automatically applies the constraints, so Occam’s razor suggests there is no need to apply them again in the FCM. Second, as shown by example in the next section, applying them is algebraically inconvenient, and can prevent inferences which seem natural to make. Third, the behaviour of a soil element in a laboratory test is only controlled on its boundary. The six constraints are not explicitly enforced inside a sample, which may be free to deform in ways that use more freedoms.

Accordingly, the present paper assumes that it is not appropriate to impose the tensor constraints on an FCM. This might be called a principle of “freedom from tensor constraint”.

3.5 An example of using freedom from tensor constraint

Composite quantities are used in many tensor-based models. A simple example is the triaxial deviator stress which consists of a combination of principal stresses. In the context of the modelling schematics above, some FCMs might involve a process of combining outer geometrical rates from different directions.

A simple example, developed for the purposes of illustration and not from any particular data, might be the following definition of a new “inner volumetric rate” \(\dot{V}_{\psi }^{*}\), in terms of the outer geometric rate \(\dot{V}_{\psi }\) and another function, \(\alpha_{\psi }\):

$$\dot{V}_{\psi }^{*} = \left( {1 - \alpha_{\psi } } \right)\dot{V}_{\psi } + \alpha_{\psi } \left\langle {\dot{V}_{\psi } } \right\rangle$$
(37)

where \(\left\langle {\dot{V}_{\psi } } \right\rangle\) is the same as the rate \(\dot{V}\) of change of overall specific volume. The function \(\alpha_{\psi }\) may depend on other factors in the model, and is herein called a “spreading coefficient”, because the effect of the last term is a spreading of strain rates from different directions. Taking averages gives:

$$\left\langle {\dot{V}_{\psi }^{*} } \right\rangle = \left( {1 + \left\langle {\alpha_{\psi } } \right\rangle } \right)\dot{V} - \left\langle {\alpha_{\psi } \dot{V}_{\psi } } \right\rangle$$
(38)

Hence, while the average inner rate equals the overall volumetric rate in a purely volumetric deformation, independently of the spreading coefficient, it does not equal this for a general deformation unless it happens to be the case that \(\left\langle {\alpha_{\psi } } \right\rangle \dot{V} = \left\langle {\alpha_{\psi } \dot{V}_{\psi } } \right\rangle\).

Let \(\sigma_{\psi }^{*}\) be a function that is work-conjugate with \(\dot{V}_{\psi }^{*}\). One might call this an “inner” stress function. In the context of functions, “work-conjugate” would mean that, for all possible outer functions \(\dot{V}_{\psi }\), the rate of working can alternatively be calculated using the following equations analogous to Eqs. 23, 24:

$$\dot{W}^{\prime } = \left\langle {\dot{W}_{\psi }^{*} } \right\rangle$$
(39)
$$\dot{W}_{\psi }^{*} = - \sigma_{\psi }^{*} \dot{V}_{\psi }^{*}$$
(40)

Using Eq. 37 to substitute for \(\dot{V}_{\psi }^{*}\), then taking the average, gives:

$$\dot{W}^{\prime } = - \left\langle {\sigma_{\psi }^{*} \left( {1 - \alpha_{\psi } } \right)\dot{V}_{\psi } + \alpha_{\psi } \sigma_{\psi }^{*} \left\langle {\dot{V}_{\psi } } \right\rangle } \right\rangle$$
(41)

Using Eqs. 23, 24 to substitute for the work rate on the left, and using the swap property (Eq. 13) on the last term inside the outer averaging operator on the right gives:

$$- \left\langle {\sigma_{\psi }^{\prime}\dot{V}_{\psi } } \right\rangle = - \left\langle {\left\{ {\sigma_{\psi }^{*} \left( {1 - \alpha_{\psi } } \right) + \alpha_{\psi } \sigma_{\psi }^{*} } \right\}\dot{V}_{\psi } } \right\rangle$$
(42)

It seems natural to expect that the outer stress is given by the quantity inside the curly brackets. However, Eq. 25 imposes six constraints of \(\dot{V}_{\psi }\), and the extraction property (Eq. 14) implies that:

$$\sigma_{\psi }^{\prime}\dot{V}_{\psi } = \left\{ {\sigma_{\psi }^{*} \left( {1 - \alpha_{\psi } } \right) + \alpha_{\psi } \sigma_{\psi }^{*} } \right\}\dot{V}_{\psi } + \dot{W}_{\psi }^{\prime \prime }$$
(43)

where \(\left\langle {\dot{W}_{\psi }^{\prime \prime } } \right\rangle = 0\). If the principle of freedom from tensor constraint is adopted, one can now proceed as follows. First, referring to the patch on the sphere in Fig. 1, a function \(\delta_{\psi }\) for direction ψ is defined such that it is large and finite and constant for the patch, and zero everywhere. Using this as \(\dot{V}_{\psi }\) gives, on calculating the integrals:

$$- \sigma_{\psi }^{\prime}\delta_{\psi } = - \left\{ {\sigma_{\psi }^{*} \left( {1 - \alpha_{\psi } } \right) + \left\langle {\alpha_{\psi } \sigma_{\psi }^{*} } \right\rangle } \right\}\delta_{\psi }$$
(44)

Dividing each side by \(\delta_{\psi }\) then gives a relation for the patch around direction ψ. Doing the same for each patch, then taking the limits as the patch sizes tend to zero (assuming that any fractal effects can be managed) gives:

$$\sigma_{\psi }^{\prime}= \left( {1 - \alpha_{\psi } } \right)\sigma_{\psi }^{*} + \left\langle {\alpha_{\psi } \sigma_{\psi }^{*} } \right\rangle$$
(45)

This is an assembly equation for the outer stresses in terms of the inner stresses, ensuring work conjugacy. It happens to have the property that the average outer stress equals the average inner stress (which means that both averages always equal the mean normal effective stress). Appendix 2 derives the inverse, giving the inner stress function in terms of the outer stress function.

3.6 Determinism and locality revisited

If the representation of spreading using a function \(\alpha_{\psi }\) is valid, then Eq. 37 implies that the inner variables are local variables just as the outer ones are. In effect the spreading is assumed to happen only within a “large point”, and to not violate the principle of local action.

In a continuum context, Eq. 37 is one example of a wide range of possibilities of relating outer functions to functions that may have some more direct role in continuum constitutive behaviour. This raises the possibility that, in general, the constitutive roadmap of Eq. 36 can be further expanded as follows:

$${\dot{\mathbf{F}}\mathbf{F}}^{ - 1} \to \dot{V}_{\psi } \to \dot{V}_{\psi }^{*} \to \text{ICM} \to \dot{\sigma }_{\psi }^{*} \to \dot{\sigma }_{\psi }^{\prime}\to {\dot{\mathbf{\sigma }}}^{\prime }$$
(46)

where ICM represents an “inner constitutive model” that is somehow simpler than the models at other levels. For example, the inner equations for a given direction ψ will not necessarily need to use all the information available from all other directions.

3.7 Mechanical consistency

Nothing in the above algebra restricts the direction ψ to be fixed in relation to an external coordinate frame. It may alternatively be the direction of a material vector, which can change in a stress–strain process.

In this second case, ψ itself has a rate of change. In Fig. 1, the directions and the size of the patch will all change, and this will affect the calculation of the change of the average of any general function \(Q_{\psi }\). To calculate the effect, it is convenient to re-state the integral of Eq. 6 as a limit as the sizes of patches covering the sphere tend to zero, or equivalently as the solid angles that they subtend tend to zero:

$$\left\langle {Q_{\psi } } \right\rangle = \mathop {\lim }\limits_{{\delta \varOmega_{\psi } \to 0 }} \sum\limits_{patches} {Q_{\psi } \frac{{\delta \varOmega_{\psi } }}{4\pi }}$$
(47)

Differentiating with respect to time, and noting that the solid angle has a rate, gives:

$$\mathop {\overline{{\left\langle {Q_{\psi } } \right\rangle }} }\limits^{.} = \mathop {\lim }\limits_{{\delta \varOmega_{\psi } \to 0 }} \sum\limits_{patches} {\left( {\dot{Q}_{\psi } \frac{{\delta \varOmega_{\psi } }}{4\pi } + Q_{\psi } \frac{{\mathop {\overline{{\delta \varOmega_{\psi } }} }\limits^{.} }}{4\pi }} \right)}$$
(48)

The rate of change of the solid angle can be obtained by expressing Eq. 7 in terms of finite patch sizes, so that:

$$\delta \varOmega_{\psi } = 3\frac{{\delta v_{\psi } }}{{r_{\psi }^{3} }}$$
(49)

This is an exact equation involving finite quantities, so there is no difficulty in differentiating it with respect to time, giving:

$$\mathop {\overline{{\delta \varOmega_{\psi } }} }\limits^{.} = \frac{{\mathop {\overline{{\delta v_{\psi } }} }\limits^{.} }}{{\delta v_{\psi } }}\delta \varOmega_{\psi } - 3\frac{{\dot{r}_{\psi } }}{{r_{\psi } }}\delta \varOmega_{\psi }$$
(50)

The first fraction on the right is the rate of change of volume divided by volume, which can be expressed in terms of specific volume as \(\dot{V}/V\). The second fraction is the rate of extension strain of a material vector in direction ψ, which can be calculated from Eq. 29 (recalling that \(\dot{\varepsilon }_{\psi }\) is positive in compression while \(\dot{r}_{\psi } /r_{\psi }\) is positive in expansion). Using these results in Eq. 48, then taking the limit as the patch sizes tend to zero, gives:

$$\mathop {\overline{{\left\langle {Q_{\psi } } \right\rangle }} }\limits^{.} = \left\langle {\dot{Q}_{\psi } + Q_{\psi } \dot{g}_{\psi } } \right\rangle$$
(51)
$$\dot{g}_{\psi } = \frac{{\mathop {\overline{{d\varOmega_{\psi } }} }\limits^{.} }}{{d\varOmega_{\psi } }} = \frac{{\dot{V} - \dot{V}_{\psi } }}{V}$$
(52)

The first equation shows that, when ψ represents material direction, the rate of change of an average is the sum of the average of the rate of change of the quantity and an extra, geometric effect. The second shows that this is due to the rate of change of solid angle. This might be called a “bunching effect”, with \(\dot{g}_{\psi }\) being negative if material directions in the neighbourhood of ψ are changing in a way that makes them bunch together, and positive if they are spreading.

The above results are new in the sense that they show that the bunching effect is a real feature of behaviour that is implied by classical tensor mechanics but is not explicitly recognized in classical tensor mechanics. Equation 51 might be called a “general bunching equation”. Its effects for the Cauchy stress rate are described in Appendix 3. Taking the average of both sides of Eq. 52, and using Eq. 30, gives \(\left\langle {\dot{g}_{\psi } } \right\rangle = 0\). This is expected since there is no change to the total solid angle subtended by the full sphere of Fig. 1.

3.8 Thermodynamic consistency

Collins and Houlsby [19] developed a thermodynamic analysis for granular materials subject to continuum stress–strain events. They showed that, under conditions of equilibrium and constant temperature (isothermal), work is used up as follows:

$$\dot{W}^{\prime } = \dot{H}^{\prime } + \dot{W}_{d}^{\prime }$$
(53)

where \(\dot{H}^{\prime }\) is the change of the Helmholtz free energy \(H^{\prime }\) per unit particle volume. \(\dot{W}_{d}^{\prime }\) is the rate of work or energy loss in irreversible, dissipative processes such as those associated with frictional sliding, contact damage, or particle crushing. \(\dot{W}_{d}^{\prime }\) is truly irreversible. It is non-negative in the sign convention used herein.

Collins and Houlsby [19] derived their equation on the basis of the familiar assumptions of small deformation theory, including the assumptions represented (in different notation) in Eq. 26 of the present paper. In effect then, their result is applicable as a local theory. An important implication relates to the concept that has become known as the “exploitation of the entropy principle” [17, 75]. The issue was addressed by Coleman and Noll [18] in the context of heat conduction, which requires that there be a spatial gradient of temperature, and by Liu [46] and others, and the two papers were discussed by Triani et al. [76]. In the present paper, the deformation gradient is the only spatial gradient involved: the assumption is therefore formally required that this gradient does not affect the basic constitutive behaviour of a large material point.

Collins and Houlsby’s [19] derivation is such that \(\dot{H}^{\prime }\) includes the reversible component of the change of entropy in addition to change of strain energy. Consequently, the terms on the right of their equation are not in general the same as the elastic and plastic components of work that are often assumed in plasticity models [37]. In those models, plastic work is the net work done in a closed stress cycle. By contrast, dissipative work is here taken to be associated with the irreversible component of change of entropy that occurs in a stress–strain process that is not necessarily a closed stress cycle, and is not necessarily a loading process.

If the inner work equals the outer work, no loss of generality occurs if the work rate in Collins and Houlsby’s [19] original equation is taken to be the average \(\left\langle {\dot{W}_{\psi }^{*} } \right\rangle\) of the inner work rate. It seems natural to consider whether there might be materials for which the Helmholtz energy and dissipated work are also averages, so that:

$$H^{\prime } = \left\langle {H_{\psi }^{\prime}} \right\rangle$$
(54)
$$\dot{W}_{d}^{\prime } = \left\langle {\dot{W}_{\psi d}^{\prime}} \right\rangle$$
(55)

Differentiating the first equation with respect to time, and using the general bunching Eq. 51, gives:

$$\dot{H}^{\prime } = \left\langle {\dot{H}_{\psi }^{\prime}+ H_{\psi }^{\prime}\dot{g}_{\psi } } \right\rangle$$
(56)

with \(\dot{g}_{\psi }\) given by Eq. 52 if ψ represents a material direction, or with \(\dot{g}_{\psi } = 0\) if ψ represents direction fixed with respect to an external coordinate frame. \(\dot{H}_{\psi }^{\prime}\) would be the rate of change of a Helmholtz free energy that is associated with direction ψ, and \(\dot{W}_{\psi d}^{\prime}\) would be a rate of dissipation of work or energy in particulate processes associated with direction ψ. Putting these results into Eq. 53 then gives:

$$\left\langle {\dot{W}_{\psi }^{*} } \right\rangle = \left\langle {\dot{H}_{\psi }^{\prime}+ H_{\psi }^{\prime}\dot{g}_{\psi } } \right\rangle + \left\langle {\dot{W}_{\psi d}^{\prime}} \right\rangle$$
(57)

The averaging operator is linear (Eq. 12), so the two averages on the right can be combined into one. Using the extraction property (Eq. 14) then gives:

$$\dot{W}_{\psi }^{*} + \dot{E}_{\psi }^{\prime}= \dot{H}_{\psi }^{\prime}+ H_{\psi }^{\prime}\dot{g}_{\psi } + \dot{W}_{\psi d}^{\prime}$$
(58)

where \(\left\langle {\dot{E}_{\psi }^{\prime}} \right\rangle = 0\). This is now an equation that involves only quantities for direction ψ. It might be regarded as the fundamental work-balance equation for a postulated class of materials for which Eqs. 54, 55 have physical meaning.

The balance is between two inputs on the left and three outputs on the right. On the left, the inner work \(\dot{W}_{\psi }^{*}\) can be known in a calculation once the strain rates are known. It is calculated using Eq. 25 and whatever is found to be the relation between outer and inner geometric rates, an example being Eq. 37. The second input is \(\dot{E}_{\psi }^{\prime}\). This is interpreted as energy that is transferred into particle mechanical processes associated with direction ψ from particle mechanical processes associated with other directions. These would be local transfers, occurring inside the “large point”. Experimental evidence that supports their existence is discussed later.

The right side of Eq. 58 contains three ways the inputs are used. One is the change the Helmholtz energy, one is to account for the bunching effect. The third is in dissipated work. Appendix 4 derives the relation between \(\dot{W}_{\psi d}^{\prime}\) and a cumulative dissipated work function.

Equation 58 provides the following guide as to what a continuum constitutive model is required to do, and what kinds of assumptions it needs to provide. Since the inner work rate and the bunching effect can be calculated if the strain rates are known, all that remains for a model to do is:

  1. 1.

    provide rules for calculating two of the functions \(\dot{E}_{\psi }^{\prime}\), \(\dot{H}_{\psi }^{\prime}\), and \(\dot{W}_{\psi d}^{\prime}\) for a given input function \(\dot{W}_{\psi }^{*}\), (the balance equation will then provide the means of calculating the third of the three functions), and

  2. 2.

    provide rules for inferring the inner stress rates from the results.

For example, if a relation between energy and stress is used to solve the second requirement, the rules needed in the first would need to result in equations for the receipts and the dissipated work rate, given the inner work rate.

3.9 Large deformations

The derivations above are in terms of work rates, and the equations can be readily converted to infinitesimal (incremental) form by multiplying by an infinitesimal increment of time. There remains the question of whether the results are applicable to realistic stress–strain processes for granular media, which typically involve finite deformations, and often involve very large deformations [5, 10, 62].

A simple procedure by which the present equations can be used is proposed as follows. First, a large deformations process is discretized into a number of small increments. In an increment, small deformations are obtained by multiplying rates by the relevant small time increment. The proposed procedure will use the directions ψ as material directions for the increment, and these will in general also have a rate of change. The bunching effect will therefore need to be included.

At the end of the constitutive calculation for the increment, the material directions are now different, and all the functions of direction (such as Helmholtz energy) now refer to the new directions. Let us re-name the new directions ζ, so that the new Helmholtz energy is now \(H_{\zeta }^{\prime}\). The relation between ζ and the original directions ψ at the start of the increment is known because the incremental strain is known. So it is now possible to re-cast \(H_{\zeta }^{\prime}\) as a new function \(H_{\psi }^{\prime}\) of the directions ψ as they were at the start of the increment. This is now the energy function that will apply at the start of the next increment.

If this proposed procedure is done for every increment on the deformations path, the final result is a function \(H_{\psi }^{\prime}\) that describes the material state at the end of the path in terms of directions that have not changed since the start of the path.

The procedure is similar in concept to a re-meshing method used in a large-strain finite element analysis. Further aspects of the use of meshes to represent the continuous constitutive functions are discussed later.

4 Examples of simple constitutive models

4.1 Preliminary considerations

The aim of the following presentation is to explore the use of the functional representation of stress in the context of some relatively simple constitutive behaviours. This is found to lead to potentially new understandings of the particle mechanical origins of some elastic parameters and of critical and steady states. However, these results would need to be verified by future DEM simulations. The exploration also sheds potential light on anisotropy and plastic spin. Again, these results would need to be confirmed in further work.

For simplicity in the following, strains and rotations are assumed to be small so that their products can be neglected. Direction ψ is taken to be relative to a fixed external frame, so that the bunching effect does not appear in the equations.

4.2 Linear elastic behaviours

Linear elastic models, sometimes called Hooke’s law, are described for soils by Davis and Selvadurai [26], Lings [45], and others. The most general model has 21 independent material constants, but the most complex type usually considered for soils is typically either the transversely isotropic linear elastic (TILE) model, with rotational symmetry about one axis, or the fully isotropic linear elastic (FILE) model. Orthotropic behaviour is also sometimes relevant [24].

Linear combinations of well-connected linear elastic systems are typically linear elastic, so linear combinations of linear elastic relations for different directions ψ might be expected to be linear elastic. Since linear elastic behaviour is reversible, the dissipated work rate in the work balance would be zero, and a simple approach would be to take the receipts as zero. A natural assumption would be that the inner stress rate would be linearly related to the inner geometric rate:

$$\dot{\sigma }_{\psi }^{*} = - K_{\psi }^{*} \frac{{\dot{V}_{\psi }^{*} }}{V}$$
(59)

where the modulus \(K_{\psi }^{*}\) could be a function of direction. For simplicity, the following analysis assumes that the volume strains are sufficiently small that that changes of \(K_{\psi }^{*}\) and \(K_{\psi }^{*} /V\) are of secondary significance during a stress–strain process, and can be ignored in a linear analysis. Integrating over time then gives:

$$\sigma_{\psi }^{*} = \frac{{K_{\psi }^{*} }}{V}\left( {V_{\psi 0}^{*} - V_{\psi }^{*} } \right)$$
(60)

where \(V_{\psi 0}^{*}\) could be interpreted as a value of \(V_{\psi }^{*}\) at zero stress. Using Eqs. 25 and 37 the right side can be expressed in terms of tensor strain rates. If the spreading coefficient is a constant, there are no non-linearities involved, and the result can be integrated to give a path-independent relation between inner stress and tensor strains.

Using Eqs. 40 and 59 to calculate the inner work rate, then integrating the balance equation over time taking the receipts, bunching effect, and dissipated work as zero, and assuming that energy is zero at zero inner stress, gives:

$$H_{\psi }^{\prime}= \int\limits_{time}^{{}} { - \sigma_{\psi }^{*} \dot{V}_{\psi }^{*} dt} = \frac{{\left( {\sigma_{\psi }^{*} } \right)^{2} }}{{2K_{\psi }^{*} /V}}$$
(61)

This result could be interpreted as energy associated with direction ψ. Taking the average, Eq. 54, then gives the Helmholtz energy \(H^{\prime }\).

$$H^{\prime} = \frac{1}{2}\left\langle {\frac{{\left( {\sigma_{\psi }^{*} } \right)^{2} }}{{K_{\psi }^{*} /V}}} \right\rangle$$
(62)

Because the inner stress is related to inner geometry through Eq. 60, and provided that the inner geometry can be related to cumulative strain by integration over time, it follows that \(H^{\prime }\) is a function of strain, implying that this model would satisfy the familiar continuum mechanical concept of a Cauchy elastic material (e.g. [70].

Lings [45] discusses the thermodynamic constraint of non-negative strain energy, and shows how this provides limits of Poisson’s ratios for transversely isotropic and fully isotropic linear elastic materials. These limits include those by Love [48], Pickering [58], and Raymond [61]. If \(H^{\prime }\) can be interpreted as strain energy, the above equation shows that the thermodynamic constraint is automatically satisfied if the inner modulus \(K_{\psi }^{*}\) is non-negative for all directions.

The detailed stress–strain equations are obtained by first substituting for the inner functions in Eq. 59. Using the particular example of Eq. 37 for the geometrical relation, and using the differential of Eq. 45 for the stress relation, gives:

$$\dot{\sigma }_{\psi }^{\prime}= - \left\{ \begin{aligned} \left( {1 - \alpha_{\psi }^{\prime}} \right)^{2} K_{\psi }^{*} \frac{{\dot{V}_{\psi }^{*} }}{V} + \alpha_{\psi }^{\prime}\left( {1 - \alpha_{\psi }^{\prime}} \right)K_{\psi }^{*} \frac{{\dot{V}}}{V} \hfill \\ + \left\langle {\alpha_{\psi }^{\prime}\left( {1 - \alpha_{\psi }^{\prime}} \right)K_{\psi }^{*} \frac{{\dot{V}_{\psi }^{*} }}{V}} \right\rangle + \left\langle {\alpha_{\psi }^{\prime 2} K_{\psi }^{*} } \right\rangle \frac{{\dot{V}}}{V} \hfill \\ \end{aligned} \right\}$$
(63)

This shows that, while outer stress is associated with direction ψ, it can be determined by changes of geometry that occur in other directions too. This equation can now be used with Eq. 21, in rate form, and Eq. 28 to determine the rate of change of the Cauchy stress tensor in terms of the tensor strain rates.

4.3 Special linear elastic behaviours

Constitutive properties of a transversely isotropic material are symmetric under rotation about an axis of symmetry [45, 70]. Taking this as the z-axis, the spreading coefficient and the modulus function would be a function of, at most, the polar angle θ (Fig. 1), and might be written as \(\alpha_{\theta }\) and \(K_{\theta }^{*}\). The integrals implied by Eq. 63 then simplify considerably. Appendix 5 confirms that the symmetries in the matrix of moduli calculated in this way are consistent with the symmetries in the compliance matrix given by Lings [45] and others for a transversely isotropic linear elastic (TILE) material.

If the spreading coefficient and modulus function are independent of direction, they may be denoted simply as α and K. Appendix 6 confirms that model then reduces to one of a standard fully isotropic linear elastic (FILE) material. The inner modulus K is then found to equal the elastic bulk modulus. The shear modulus is found to be given by:

$$G = \frac{3}{5}\left( {1 - \alpha } \right)^{2} K$$
(64)

The isotropic Poisson’s ratio μ is found to be given by:

$$\mu = \frac{{5 - 2\left( {1 - \alpha } \right)^{2} }}{{10 + 2\left( {1 - \alpha } \right)^{2} }}$$
(65)

The spreading coefficient has a range from minus infinity to plus infinity. This equation shows that this range corresponds to the familiar limits of − 1 to + 1/2 on the fully isotropic Poisson's ratio.

The above results represent new proposed understandings of what determines the isotropic shear modulus and Poisson’s ratio. The above equation shows that the isotropic Poisson’s ratio is 0.25 if the spreading coefficient is zero. This is in the range that is typical for loose to medium dense sand or sandy clay [8, 25]. In terms of the functions approach, it suggests that there may be little or no reversible interaction between directions in such soils.

4.4 Elastic behaviours: discussion

Many previous authors have attempted to relate macroscopic elastic parameters to particle mechanical ones. Bathurst and Rothenburg [2] extended Horne’s [38] work and showed how a simple linear contact elasticity law could be used to generate the macroscopic elastic properties of an assemblage of discs. Their proposals involve a two-dimensional version of the three-dimensional Eq. 3 used herein. Chang and Gao [14] and Liu and Chang [47] propose expressions for continuum elastic constants based on micromechanics including a version of Eq. 3 of the present paper. Stránský et al. [71] and Kruyt [43] developed equations for an assemblage of spheres. Gu and Yang [32] investigated experimentally and through DEM modelling a problem with the classical Hertz–Mindlin contact law [27]. It therefore seems relevant to ask how the spreading function and inner elastic moduli might relate to these previous works.

An answer will need to take account of the loss of particle mechanical information that is inherent in the process of defining the stress function \(\sigma_{\psi }^{\prime}\) in terms of the average on the left of Eq. 19. This loss means that one cannot deduce detailed particle mechanical information from macroscopic parameters of this type of model, though a deduction in the opposite direction remains possible, from particle to continuum parameters. It follows that the spreading coefficients and inner stiffnesses are not necessarily fundamental microscopic particle mechanical properties. Instead, they may possibly relate to mesoscopic scales, i.e. to aggregate structure or fabric.

4.5 Elasto-plastic behaviours (1) some preliminary ideas

The constitutive map of Eq. 46 indicates that there are two places where more complex behaviours can be introduced. One is the spreading relation \(\dot{V}_{\psi } \to \dot{V}_{\psi }^{*}\) and the consequent conjugate stress assembly relation \(\sigma_{\psi }^{*} \to \sigma_{\psi }^{\prime}\). The second is the inner relation \(\dot{V}_{\psi }^{*} \to \dot{\sigma }_{\psi }^{*}\). Whatever properties or behaviours are introduced in these places will appear in the tensor model \({\dot{\mathbf{\varepsilon }}} \to {\dot{\mathbf{\sigma }}}^{\prime}\), typically in more complicated form due to the spreading and assembly processes.

A simple way to start to introduce elasto-plasticity with functions would be to adapt a linear-elastic model by adding a yield stress \(\sigma_{\psi y}^{*}\) as a function of direction ψ (as well as of stress–strain history in that direction). Figure 3 illustrates this for a given direction ψ, based on direct adaptation of basic features of the hydrostatic parts of the Cam–Clay models of Schofield and Wroth [64] and Roscoe and Burland [63]:

Fig. 3
figure 3

Adapting familiar ideas to create a simple elastoplastic inner constitituve relation

  1. (a)

    Along the elastic lines such as EB and FC, reversible elastic behaviour would occur, governed by some relation that would be a development of Eq. 60 and in which a constant analogous to \(V_{\psi 0}^{*}\) would occur, different for different elastic lines. The soil state at a given time may be on different elastic lines for different directions ψ,

  2. (b)

    If the stress reaches the yield limit, for example at point B for line EB or point C for line FC, further compression in direction ψ would cause the state in that direction to move down the elastoplastic limit line ABCD. On subsequent unloading, the state in that direction would move up a new elastic line,

  3. (c)

    A no-tension limit might be imposed in each direction ψ. The derivation of Eq. 21 showed that \(\sigma_{\psi }^{\prime}\) represent a combination of principal values of tensors that are determined by contact forces and branch vectors. If contact forces cannot be tensile, this would suggest a limit of \(\sigma_{\psi }^{\prime}\ge 0\) in uncemented granular soils. A limit \(\sigma_{\psi }^{*} \ge 0\) might alternatively be postulated.

Item (b) introduces plastic behaviour relative to a direction ψ, so a model with this feature would certainly exhibit elasto-plasticity in its overall macroscopic behaviour, though not necessarily in accordance with conventional elasto-plastic model assumptions. The modelling framework would allow for different behaviours in different physical directions, perhaps giving a result akin to Callladine’s [11] micro-structural model. Whether or not a model that includes (b) would be realistic may depend on other assumptions, as discussed in general terms earlier in relation to the work balance Eq. 58.

One difference between this framework and some previous elasto-plastic models might be in its incremental calculation procedure. A simple procedure for this model would be as follows. Given the tensor strain rates \({\dot{\mathbf{\varepsilon }}}\) over a time increment dt, the outer and inner changes of geometry are calculated. Then, independently for each direction ψ, three steps would be carried out:

  1. 1.

    a trial calculation: the stress response to the given change of geometry is calculated assuming the soil state in that direction remains on its elastic line

  2. 2.

    a yield check: if the result is greater than the yield stress for the line, the response is recalculated using the elastoplastic line ABCD

  3. 3.

    a tension check: if the result is tensile, the stress is re-set to zero. This would mean that this state in direction ψ would move up the geometric axis, such as from F towards E.

This simple procedure omits consideration of the receipts in the work balance equation, and some important aspects of this omission will be considered later. The simple model includes hardening along ABCD, softening by movement up the geometric axis, and induced anisotropy due to the fact that different directions can evolve differently.

4.6 Elasto-plastic behaviours (2) preliminary discussion

The preliminary considerations suggest that the development of ideas about plasticity in the context of functions may require very different ways of thinking compared to previous conventional plasticity approaches. If yielding is something that occurs in terms of inner, direction-related variables, then

  1. 1.

    the question of whether loading or unloading occurs will need to be answered in terms of inner variables rather than tensor stresses, and the answer may be different for different directions ψ.

  2. 2.

    a macroscopic yield envelope and flow rule may be likely to be emergent properties rather than fundamental ones.

These features illustrate both the potential richness and the potential complexity of the functional approach. But also there can be relative simplicity: for example, the concept of “plastic spin” would likely not be needed, because the model structure in Eq. 46 is already complete, and so is expected to automatically create any changes of symmetry that may result from the different responses in different material directions ψ.

4.7 Elasto-plastic behaviours (3) further preliminary explorations

It can be interesting to briefly explore standard triaxial behaviours. If the spreading coefficient is a constant α, as in the FILE model, then the rate of change of inner geometry due to axial and radial strain rates \(\dot{\varepsilon }_{a}\) and \(\dot{\varepsilon }_{r}\) is calculated in Appendix 7 as follows:

$$- \frac{{\dot{V}_{\psi }^{*} }}{V} = 3\left( {1 - \alpha } \right)\left( {\dot{\varepsilon }_{a} \cos {}^{2}\theta + \dot{\varepsilon }_{r} \sin {}^{2}\theta } \right) + \alpha \dot{\varepsilon }_{vol}$$
(66)

where \(\dot{\varepsilon }_{vol} = \dot{\varepsilon }_{a} + 2\dot{\varepsilon }_{r}\) is the volumetric strain rate. If \(0 < \alpha < 1\), directions close to axial will experience faster compression in one-dimensional axial compression (\(\dot{\varepsilon }_{a} > 0\) with \(\dot{\varepsilon }_{r} = 0\)), indicating that yielding will begin in directions close to vertical. By contrast, directions close to radial will experience faster compression in two-dimensional, radial compression (\(\dot{\varepsilon }_{r} > 0\) with \(\dot{\varepsilon }_{a} = 0\)). In pure volumetric compression, \(\dot{\varepsilon }_{a} = \dot{\varepsilon }_{r} = \dot{\varepsilon }_{vol} /3\), and so

$$- \frac{{\dot{V}_{\psi }^{*} }}{V} = \dot{\varepsilon }_{vol} = \frac{{ - \dot{V}}}{V}$$
(67)

for this special case. This suggests that, if the inner relations are also the same in all directions, the inner volumes can be identified with the overall specific volume in this special type of hydrostatic process for isotropic soils.

The next section will confirm that this is a special case, and that \(V_{\psi }^{*}\) cannot be identified with V in more complex stress–strain processes, but it is useful to first explore the simple case of hydrostatic processes for isotropic samples. The asymptotic isotropic compression curve assumed in the Cam–Clay models by Schofield and Wroth [64] and Roscoe and Burland [63] can be expressed as:

$$V + \lambda \ln p^{\prime } = \varGamma$$
(68)

where λ is a slope constant and Γ is a positional constant. If the material is isotropic then all the inner stresses will evolve as the mean normal effective stress in these hydrostatic events. The corresponding inner asymptotic compression curve would therefore be:

$$V_{\psi }^{*} + \lambda \ln \sigma_{\psi }^{*} = \varGamma$$
(69)

Alternative formulae may be developed in the same way, such as the double-log proposals by Butterfield [9], Hashiguchi [33], Butterfield and Marchi [10]. Other inner properties might be inferable similarly, depending on modelling assumptions, such as the relation between Helmholtz free energy function \(H_{\psi }^{\prime}\) and the inner stress and inner volume.

Finally, it is worth noting that the energy receipts term \(\dot{E}_{\psi }^{\prime}\) in Eq. 58 was not used in the linear elastic models discussed earlier. This term provides a second connection between physical directions, the first being the spreading coefficient. It therefore seems possible that the spreading coefficient might represent reversible aspects of spreading, while the receipts represent irreversible aspects. A simple particle mechanical explanation might be as follows:

  • the spreading coefficients model, in some average way, the mechanics by which a given particle equilibrates one inter-particle force by a combination of inter-particle forces at other contacts and in different directions,

  • the receipts represent energy changes caused by the changes of these forces that would be induced if particles slide frictionally at inter-particle contacts.

Further work outside the scope of the present paper would be needed to explore this further. It would imply that \(\dot{E}_{\psi }^{\prime}\) would be determined by dissipated work rates in different directions, subject to the thermodynamic requirement that the average receipt is zero, \(\left\langle {\dot{E}_{\psi }^{\prime}} \right\rangle = 0\).

4.8 Critical and steady states as experimental evidence

Schofield and Wroth [64] define a “critical state” as one at which a soil element can be sheared continuously at constant volume and constant effective stress. An analogous concept is that of “steady states” [41, 60]. A critical state can also be interpreted as a special case of proportional straining, other examples of which include isotropic and one-dimensional compression [16, 74].

If the state of the soil does not change as shearing continues, then \(V_{\psi }^{*}\) cannot be a description of state, since any definite equation like Eq. 37 would mean that it does change. It follows that the elastoplastic curve ABCD would need to be defined in a different way, one for which \(\dot{V}_{\psi }^{*} = \dot{V}\) for asymptotic hydrostatic compression if the soil is isotropic, but for which \(V_{\psi }^{*}\) and V become uncoupled for asymptotic critical state deformations.

During critical state straining, the constancy of the specific volume and effective stress would suggest that inner parameters such as the Helmholtz energy function are also constant as the straining occurs. But some directions will receive a negative inner work rate \(\dot{W}_{\psi }^{*}\). From Eq. 66 with \(\dot{\varepsilon }_{vol} = \dot{\varepsilon }_{a} + 2\dot{\varepsilon }_{r} = 0\), this will occur in triaxial compression for directions with \(\tan^{2} \theta > 2\). If the rate of change of the Helmholtz energy function is zero, the work balance Eq. 58 in the absence of the bunching effect then reduces to:

$$\dot{W}_{\psi }^{*} + \dot{E}_{\psi }^{\prime}= \dot{W}_{\psi d}^{\prime}$$
(70)

The right side cannot be negative, so this implies that sufficient energy transfers must occur between directions such that the receipts for the directions experiencing negative inner work rates offset the negative rates and make the left sides positive. In effect, therefore, critical and steady states are prima facie experimental evidence for the occurrence of receipts.

4.9 Critical and steady states as challenges

The elastic models discussed earlier did not need to involve receipts, suggesting that receipts may depend only on dissipated work. This might be consistent with a particle mechanics explanation in which receipts represent the re-distribution of forces in a particle aggregate after frictional slip at particle contacts or due to other irreversible processes. For purposes of discussion, a simple example of a formula might be:

$$\dot{E}_{\psi }^{\prime}= j\,\left( {1 - \frac{{H_{\psi }^{\prime}}}{{\left\langle {H_{\psi }^{\prime}} \right\rangle }}} \right)\left\langle {\dot{W}_{\psi d}^{\prime}} \right\rangle$$
(71)

where j could be a positive dimensionless material constant. This describes a system in which energy is transferred from high energy directions to low energy directions. This might be sufficient to prevent the energy continually increasing in directions for which \(\dot{W}_{\psi }^{*} > 0\), and also prevent energy reaching zero in directions for which \(\dot{W}_{\psi }^{*} < 0\).

However, this cannot be a complete answer, for the following reason. It means that the work balance equation for general processes would become:

$$\dot{W}_{\psi }^{*} + j\,\left( {1 - \frac{{H_{\psi }^{\prime}}}{{\left\langle {H_{\psi }^{\prime}} \right\rangle }}} \right)\left\langle {\dot{W}_{\psi d}^{\prime}} \right\rangle = \dot{H}_{\psi }^{\prime}+ \dot{W}_{\psi d}^{\prime}$$
(72)

The problem now is that the three-step calculation procedure for the simple plasticity model can no longer be applied in the same way, because the average dissipated work would need to be known in order to do the trial calculation in step 1, but the dissipated work will not be known until step 2 has been done.

One possible way to address this new challenge would be through modelling the rate of change of energy. If \(\dot{H}_{\psi }^{\prime}\) involves a term that offsets the average on the left, then the simplicity of the calculation procedure might be recovered. This would be consistent with the proposed physical explanation of receipts discussed earlier. It might relate to the familiar phenomenon of plastic hardening, and if so would indicate a difference compared to other plasticity models in the literature. Further work on this is outside the scope of the present paper.

5 Discussion

5.1 Nature of the new approach proposed herein

Equation 21 for the functional representation of effective stress was derived from the well-established DEM Eq. 3, which might be regarded as a mathematical representation of the familiar explanation of Terzaghi’s [72, 73] Principle of Effective Stress in terms of inter-particle forces. Equation 30 for the outer volumetric rates followed directly from work-conjugacy in the absence of Cosserat stress effects.

So these continuum variables provide a modelling basis that is consistent with a widely accepted approximation to the micro-mechanics of aggregates of hard, inert particles.

Bolton [6] argues that “continuum soil mechanics is insufficient”. Soga and O’Sullivan [69] divide models into macro and micro ones. Liu and Chang [47] note that continuum plasticity approaches involve parameters that “describe directly the behavior of a particle assembly element (macro-scale)”, which are typically measured in a laboratory element test. By contrast, the micro-mechanical approach includes parameters that describe behaviour of an inter-particle contact (micro-scale), and which may be “difficult or impossible to be obtained in laboratory directly”. The aim is to use an upscaling or other procedure to deduce macroscopic continuum behaviour.

Another recent example is given in the detailed micromechanics-based model by He et al. [35, 36], who identify a problem of an “overwhelming number of parameters in the model”. They attribute this to the complexity of granular materials, but it does present a challenge in terms of practical engineering application. Micro-mechanical approaches to continuum modelling differ from DEM approaches in that the latter do not necessarily involve an intermediate stage of a continuum model. Huge efforts by many authors over many decades have been made using all of these approaches.

In the fully isotropic linear elastic model explored earlier, the inner modulus was equal to the macroscopic bulk modulus, and the Poisson’s ratio was related to the spreading coefficient. This suggests that the functional representation can involve a hybrid approach, which

  • uses the basic micro–macro relationships of Eqs. 21 and 30, and

  • can include mesoscopic concepts such as the postulated spreading coefficients, and

  • can have its parameters measurable in macroscopic, element-scale testing

There is no implication of conflict between the various approaches, and one would expect that there might eventually be ways by which all three approaches can agree in their predictions for engineering behaviours of interest. Indeed, one might expect that advances in one approach may assist in the development of one or more of the others.

An example of potential mutual assistance and consistency might be as follows. If the spreading coefficient \(\alpha_{\psi }\) is a function of direction, then it may relate to the way particles interact and therefore to fabric. One might use it to define a variety of tensors, for example:

$${\varvec{\upalpha}} = \left\langle {3\alpha_{\psi } \underline{{\mathbf{n}}}_{\psi } \otimes \underline{{\mathbf{n}}}_{\psi } } \right\rangle$$
(73)

This would transform like stress (Eqs. 3234). It may be that, if \(\alpha_{\psi }\) is also a function of stress–strain history as experienced in terms of inner quantities, that function’s evolution would constitute a fabric evolution law. Whether this particular \({\varvec{\upalpha}}\) is useful or not, it illustrates the potential power of this new approach to constitutive modelling of granular media.

5.2 Achievements and some further considerations

The simple models developed herein confirm that modelling is possible and can produce useful results when functions are used to represent stress and strain rate. The elastic model reproduces the typical transversely isotropic behaviour of soil and the elasto-plastic one includes yield behaviours and other realistic features. The challenges of using functions to model critical and steady states have also been explored.

Features that have not been explored herein include the possibility of a more complex approach to the spreading equations such as making the spreading coefficient \(\alpha_{\psi }^{\prime}\) depend on energy ratios. It seems clear that any behavioural features that are included in the inner relation will also appear in the overall behaviour, in more complex form. For instance, an entire model would be time-dependent if the inner relation between \(\dot{V}_{\psi }^{*}\) and \(\dot{\sigma }_{\psi }^{*}\) was.

In numerical implementation, a function of direction might be represented by its values for a finite number of directions. Several models in the literature already involve multiple directions, including the multi-laminate models [55, 65]. It therefore seems likely that functions of direction can be modelled using simple adaptations of existing finite element or other numerical methods.

Figure 4 illustrates this for four symmetry cases. If only triaxial stress–strain processes are to be explored on a fully isotropic or transversely isotropic material aligned with the axis, then only the polar angle need be discretized, and this would correspond to a system in which the sphere of Fig. 1 was discretized into hoops as in Fig. 4a. If more complex processes are applied, both the polar and azimuthal angles need to be varied, and the deformation symmetries then indicate discretizations of one-eighth of the sphere for true triaxial process, or one quarter for plane strain processes that include simple shear. For general processes and material symmetries, only one half of the sphere need to be discretized, because of the basic symmetry that the value of a function in a given direction will be the same as its value in the diametrically opposite direction.

Fig. 4
figure 4

Discretizations of direction for various symmetries a standard triaxial b true triaxial c plane strain in the yz plane d general

6 Concluding remarks

This paper has proposed a continuum mechanics that is intermediate in detail between particle mechanics and tensor continuum mechanics. The proposal provides a more rigorous but also simplified approach to the task of upscaling from particle mechanics to continuum mechanics. This leads to an alternative framework for the stress–strain constitutive modelling of granular materials that in particular considers directionality.

The proposals are based on Terzaghi’s Principle of Effective Stress and on a well established result in the discrete element method, Eq. 3, which relates inter-particle point forces to macroscopic tensors. The proposed descriptions of stress and deformations result from organizing the information according to the directions of the principal stresses associated with the effects of each particle contact in a representative elementary volume.

The organization involves a summation (Eq. 19) that loses some particle mechanical information, so the proposed descriptions lose part of the explicit link between particulate-level and macroscopic behaviours that a full discrete element method or a fundamental micro-mechanical derivation of macroscopic behaviour might provide. However, analysis of elastic behaviour suggested that the new framework may provide a link to mesoscopic, aggregate or fabric properties.

The framework was shown to satisfy Noll’s Axioms of constitutive modelling. A principle of freedom from tensor constraint was proposed. A general structure for constitutive models employing functions was inferred from the Axiom of Determinism, Eq. 46. A new work balance equation was derived, based on Collins and Houlsby’s [19] thermo-mechanics. A “bunching” effect was identified and included in this equation.

Familiar linear elastic models of soils—fully isotropic and transversely isotropic—were shown to be readily expressed in terms of functions of direction. The assumption of non-negative strain energy, which produces all the known limits on Poisson’s ratio [45], is expressed by the simple criterion that the inner modulus function \(K_{\psi }^{*}\) is non-negative. For the fully isotropic linear elastic (FILE) model, the spreading coefficient is independent of direction, and determines Poisson’s ratio though Eq. 65.

By adding a yield stress function to the isotropic model, it was argued that a simple elasto-plastic model can certainly be obtained. A model so constructed will be able to allow for different hardening in different physical directions, thereby providing opportunities for modelling induced anisotropy. The structure of Eq. 46 also means that the model is complete, so that one can expect that the phenomenon that is interpreted as “plastic spin” in a tensor context would emerge automatically from a functional plasticity model, without the need for a separate concept of spin. The new proposals also create the potential for new approaches in numerical analysis methods, which were briefly discussed.

A future challenge will be to find the required additional constitutive assumptions that can lead to a good match between model and data. Proposing such assumptions in detail is outside the scope of this paper, but some general aspects have been briefly discussed in relation to the modelling of critical or steady states.