1 Introduction

The divergence theorem for submanifolds of a Euclidean vector space is indispensable for proving alternative representations of internal virtual work functionals of a continuum. These representations are required, among other things, to identify compatible external virtual work functionals and to obtain the local equilibrium equations together with the boundary conditions, see among others [11, 14, 16, 18, 22, 27, 28, 37]. Typically, the divergence theorem for an m-dimensional orientable submanifold \(M\subseteq {\mathbb {E}}^n\) of an n-dimensional Euclidean vector space \({\mathbb {E}}^n\) is employed, where \(m\le n=3\). Applying the Einstein summation convention, which implies summation over indices that appear twice in a term, once as an upper and once as a lower index, a point \(x \in {\mathbb {E}}^n\) can be written as \(x=x^ie_i\), where \((e_1,\dots ,e_n)\) is a basis of \({\mathbb {E}}^n\). In particular, we agree that Roman indices range from 1 to n. Consider a (local) parametrization \(\psi =\psi (\theta ^1,\dots ,\theta ^m)\) of the m-dimensional manifold M; then for \(\alpha \in \{1,\dots ,m\}\) the vectors \(g_\alpha (x):=\frac{\partial \psi }{\partial \theta ^\alpha }\big |{}_{\psi ^{-1}(x)}\) define a basis of \(T_xM\). Hence, a vector field \(X\in \mathop {\mathrm {Vect}}(M)\) can be represented in two natural ways, namely \(X=X_\parallel ^ie_i=X^\alpha g_\alpha \), with the tacit assumption that Greek indices range from 1 to m. Since the inner product \(\langle \cdot ,\cdot \rangle \) on \({\mathbb {E}}^n\) induces a Riemannian metric g on M, the manifold is equipped with the Levi-Civita connection \(\nabla \) as well as the Riemannian volume form dM. Exploiting that \(T_xM\) is a subspace of \({\mathbb {E}}^n\), we furthermore define the orthogonal projection \(P_\parallel (x):{\mathbb {E}}^n\rightarrow {\mathbb {E}}^n\) onto \(T_xM\).

According to [25, 26], the divergence theorem corresponds to the equality

$$\begin{aligned} \int \limits _M \mathop {\mathrm {div}}\!X\, dM = \int \limits _{\partial M} \langle X,\nu \rangle \, d\partial M\,, \end{aligned}$$
(1)

where \(\nu \) denotes the outward-pointing unit normal to the topological boundary \(\partial M\) of M, \(\mathop {\mathrm {div}}\!\,(X)\) is the divergence of the vector field X and \(d\partial M\) is used for the Riemannian volume element of the boundary manifold \(\partial M\). In the literature on higher gradient continua, see for instance [2, 5, 13, 14, 17, 18, 22, 23, 27, 28, 32, 36, 37], several expressions of the divergence of a vector field are encountered. Specifically, these are

$$\begin{aligned} \mathop {\mathrm {div}}\!X = \frac{1}{\sqrt{g}}\frac{\partial }{\partial \theta ^\alpha }\Big (\sqrt{g}X^\alpha \Big )= \frac{\partial X^\alpha }{\partial \theta ^\alpha }+\Gamma _{\alpha \beta }^\alpha X^\beta = {P_\parallel }_i^j\frac{\partial X_\parallel ^i}{\partial x^j} \, , \end{aligned}$$
(2)

where \(g:=\det (g(g_\alpha ,g_\beta ))= \det (\langle g_\alpha ,g_\beta \rangle )\) denotes the determinant of the first fundamental form, \(\Gamma _{\alpha \beta }^\gamma \) are the Christoffel symbols of the connection \(\nabla \) and the \({P_\parallel }_i^j\) denote the components of the orthogonal projection. In fact, all of these are local expressions, valid once a parametrization of the manifold is given or the orthogonal projector has been introduced.
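As an illustration (added here, not part of the original text), the following SymPy sketch compares the three coordinate expressions in (2) for the graph surface \(\psi (u,v)=(u,v,h(u,v))\subseteq {\mathbb {E}}^3\). The height function and the symbolic components X1, X2 of the tangent field are chosen arbitrarily, and the constant-in-\(x^3\) extension of \(X_\parallel \) used for the last expression is admissible because only tangential derivatives enter there.

```python
# Minimal SymPy sketch (illustrative, not from the original text): compare the
# three coordinate expressions of div X in (2) for the graph surface
# psi(u, v) = (u, v, h(u, v)) in E^3 and a generic tangent field X = X^alpha g_alpha.
import sympy as sp

u, v = sp.symbols('u v')
h = u**2 - 3*u*v                                            # height function (arbitrary choice)
psi = sp.Matrix([u, v, h])
q = (u, v)

g_ = [psi.diff(s) for s in q]                               # tangent basis g_alpha
G = sp.Matrix(2, 2, lambda a, b: g_[a].dot(g_[b]))          # first fundamental form g_{alpha beta}
Ginv, sqrtg = G.inv(), sp.sqrt(G.det())

X = [sp.Function('X1')(u, v), sp.Function('X2')(u, v)]      # components X^alpha

# first expression: (1/sqrt(g)) d/dtheta^alpha (sqrt(g) X^alpha)
div1 = sum(sp.diff(sqrtg*X[a], q[a]) for a in range(2)) / sqrtg

# second expression: dX^alpha/dtheta^alpha + Gamma^alpha_{alpha beta} X^beta
Gamma = lambda c, a, b: sum(sp.Rational(1, 2)*Ginv[c, d]*(G[d, a].diff(q[b])
            + G[d, b].diff(q[a]) - G[a, b].diff(q[d])) for d in range(2))
div2 = sum(sp.diff(X[a], q[a]) for a in range(2)) \
     + sum(Gamma(a, a, b)*X[b] for a in range(2) for b in range(2))

# third expression: P^j_i dX_par^i/dx^j, with P^j_i = B^alpha_i A^j_alpha and the
# extension of X_par that is constant in x^3 (only tangential derivatives matter)
Xpar = X[0]*g_[0] + X[1]*g_[1]                              # ambient components X_par^i
A = sp.Matrix(3, 2, lambda i, a: g_[a][i])                  # A^i_alpha = e^i(g_alpha)
B = Ginv*A.T                                                # B^alpha_i = g^alpha(e_i)
P = A*B                                                     # P[j, i] = P^j_i
dX = [sp.diff(Xpar, u), sp.diff(Xpar, v), sp.zeros(3, 1)]   # d X_par / d x^j
div3 = sum(P[j, i]*dX[j][i] for i in range(3) for j in range(3))

print(sp.simplify(div1 - div2), sp.simplify(div1 - div3))   # expected output: 0 0
```

The same check can be repeated for any other parametrization; only the definition of psi changes.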

In addition to the application of the divergence theorem to second-gradient continua in Sect. 2, the main goal of this paper is to show why these different representations occur and how they are related. In fact, the divergence theorem (1) is a consequence of Stokes’ theorem on manifolds [25], which is formulated in the context of intrinsic differential geometry. To get to the divergence theorem in the form (1), we will show in this paper how to apply results from intrinsic differential geometry to submanifolds of \({\mathbb {E}}^n\). In Sect. 3, we mainly gather results from the literature and present a concise derivation of the divergence theorem on Riemannian manifolds, which is of great interest for the theory of general relativity [9, 10]. Along the way, we will see that there exist two different but equivalent definitions of the divergence, whose local representations in a natural way lead to the second and third expression of (2). Then, in Sect. 4, the special case of submanifolds of \({\mathbb {E}}^n\) is considered, which directly leads to the divergence theorem in the form (1) as well as to the last coordinate expression in (2). In particular, we will show again the equivalence of the two definitions of the divergence. This allows a reader that is only interested in the equivalence of these definitions for submanifolds of Euclidean vector spaces to omit Theorem 1 and Corollary 1 of Sect. 3.

2 Equivalent representations of the internal virtual work functional for second-gradient continua

To get two different representations of the internal virtual work functional of second-gradient continua in Eulerian description, we apply the divergence theorem for submanifolds (1) together with the divergence representation given by the last expression in (2). The aim of this section is to show the importance of the divergence theorem for continuum mechanics; it is not intended as an exhaustive treatment of second-gradient continua. For more details about second-gradient continua and possible applications, we refer to [1, 3, 4, 7, 12, 19, 20, 21, 29, 30, 33].

Following the postulation accepted for Galilean Mechanics, the physical space, where a second-gradient continuum can be placed, is modeled as a three-dimensional Euclidean vector space \(\mathbb {E}^{3}\) with the inner product denoted by \(\langle \cdot ,\cdot \rangle \). We assume the actual configuration \(\omega \subset {\mathbb {E}}^3\) of the considered continuum to be a three-dimensional submanifold with corners, which is sufficiently regular to perform all the required calculations, see [14]. The topological boundary of \(\omega \) is denoted by \(\partial \omega \), which is the union of a finite number of two-dimensional orientable surfaces with boundary. These surfaces, called faces of \(\omega \), are again manifolds with corners and their boundary curves are called edges. The union of all edges of \(\omega \) is denoted by \(\partial \partial \omega \).

Choosing a basis \((e_1,e_2,e_3)\) of \({\mathbb {E}}^3\), we can represent any spatial point as \(x=x^ie_i\in {\mathbb {E}}^3\). A spatial virtual displacement field \(\delta x\) is a vector field on the actual configuration, i.e., a vector-valued function \(\delta x:\omega \rightarrow {\mathbb {E}}^3\). Using the basis representation \(\delta x(x) = \delta x^i(x)e_i\) for the spatial virtual displacements, we introduce the abbreviations

$$\begin{aligned} \delta d_{j}^{i}:=\frac{\partial \delta x^{i}}{\partial x^{j}}\quad \text {and}\quad \delta \mathbbm {d}_{jk}^{i}:=\frac{\partial ^2\delta x^{i}}{\partial x^{j}\partial x^{k}} \end{aligned}$$
(3)

for the components of the first and second gradient of the spatial virtual displacement field \(\delta x\).

The internal virtual work functional of a second-gradient continuum in Eulerian description can be defined as

$$\begin{aligned} \delta \mathscr {W}_{\omega }^{\text{ int }}(\delta x):=-\int \limits _{\omega }\big ( c_{i}^{k}\delta d_{k}^{i}+\mathbbm {c}_{i}^{jk}\delta \mathbbm {d}_{jk}^{i}\big )d\omega \,,\end{aligned}$$
(4)

where \(c_{i}^{k}\) and \(\mathbbm {c}_{i}^{jk}\) are the components of the Cauchy–Euler stress c and the Cauchy–Euler double-stress \(\mathbbm {c}\), respectively. In fact, the functional \(\delta \mathscr {W}_{\omega }^{\text {int}}\) can be considered as a representation of a second-order distribution. With the subsequent transformations, we will find an equivalent representation, which is important for the choice of compatible external virtual work functionals for second-gradient continua.

Using (3) together with the product rule, we have that

$$\begin{aligned} c_{i}^{k}\delta d_{k}^{i} + \mathbbm {c}_{i}^{jk}\delta \mathbbm {d}_{jk}^{i} =c_{i}^{k}\frac{\partial \delta x^{i}}{\partial x^{k}} + \mathbbm {c}_{i}^{jk}\frac{\partial ^2\delta x^{i}}{\partial x^{j}\partial x^{k}} = \Big (c_{i}^{k} - \frac{\partial \mathbbm {c}_{i}^{jk}}{\partial x^j}\Big )\frac{\partial \delta x^i}{\partial x^k} + \frac{\partial }{\partial x^j}\Big (\mathbbm {c}_{i}^{jk}\frac{\partial \delta x^i}{\partial x^k}\Big ) \,. \end{aligned}$$
(5)
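The step leading to the last expression in (5) is a pure product-rule identity; the following SymPy sketch (illustrative, not part of the original text) verifies it for generic smooth components, with the ad hoc names c, cc and dx standing for \(c_i^k\), \(\mathbbm {c}_i^{jk}\) and \(\delta x^i\).

```python
# Symbolic check of the product-rule identity (5) in E^3 (illustrative sketch;
# c, cc and dx stand for c_i^k, the double stress cc_i^{jk} and delta x^i).
import sympy as sp

x = sp.symbols('x1:4')                                   # coordinates x^1, x^2, x^3
I = range(3)

c  = {(i, k): sp.Function(f'c_{i}{k}')(*x) for i in I for k in I}
cc = {(i, j, k): sp.Function(f'cc_{i}{j}{k}')(*x) for i in I for j in I for k in I}
dx = {i: sp.Function(f'dx_{i}')(*x) for i in I}

lhs = sum(c[i, k]*sp.diff(dx[i], x[k]) for i in I for k in I) \
    + sum(cc[i, j, k]*sp.diff(dx[i], x[j], x[k]) for i in I for j in I for k in I)

rhs = sum((c[i, k] - sum(sp.diff(cc[i, j, k], x[j]) for j in I))*sp.diff(dx[i], x[k])
          for i in I for k in I) \
    + sum(sp.diff(cc[i, j, k]*sp.diff(dx[i], x[k]), x[j]) for i in I for j in I for k in I)

print(sp.simplify(lhs - rhs))                            # expected output: 0
```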

Introducing the abbreviation

$$\begin{aligned} \bar{c}_i^k:=c_{i}^{k} - \frac{\partial \mathbbm {c}_{i}^{jk}}{\partial x^j}\,, \end{aligned}$$

the integrand (5) of the internal virtual work can be further recast to

$$\begin{aligned} c_{i}^{k}\delta d_{k}^{i} + \mathbbm {c}_{i}^{jk}\delta \mathbbm {d}_{jk}^{i} = -\frac{\partial \bar{c}_i^k}{\partial x^k}\delta x^i + \frac{\partial }{\partial x^j}\Big (\bar{c}_i^j\delta x^i + \mathbbm {c}_{i}^{jk}\frac{\partial \delta x^i}{\partial x^k}\Big ) \end{aligned}$$
(6)

by using the product rule on the first term. Since \(T_x\omega ={\mathbb {E}}^3\) for all \(x\in \omega \), the orthogonal projection onto \(T_x\omega \) is the identity map, whose components are given by the Kronecker delta \(\delta _i^j\). Consequently, comparing the last term in (6) with the last coordinate expression in (2) reveals that the internal virtual work (4) can be reformulated as

$$\begin{aligned} \delta \mathscr {W}_{\omega }^{\text{ int }}(\delta x)=\int \limits _{\omega } \frac{\partial \bar{c}_i^k}{\partial x^k}\delta x^i\,d\omega -\int \limits _{\omega }\mathop {\mathrm {div}}\!\Big (\Big (\bar{c}_i^j\delta x^i + \mathbbm {c}_{i}^{jk}\frac{\partial \delta x^i}{\partial x^k}\Big )e_j\Big )\,d\omega \,. \end{aligned}$$

We can now invoke the divergence theorem (1) on the second integral to arrive at

$$\begin{aligned} \begin{aligned} \delta \mathscr {W}_{\omega }^{\text{ int }}(\delta x)&=\int \limits _{\omega } \frac{\partial \bar{c}_i^k}{\partial x^k}\delta x^i\,d\omega -\int \limits _{\partial \omega } \Big (\bar{c}_i^j\delta x^i + \mathbbm {c}_{i}^{jk}\frac{\partial \delta x^i}{\partial x^k}\Big )n_j\,d\partial \omega \,, \end{aligned} \end{aligned}$$

where \(n_i :=\langle e_i,n\rangle \) and n denotes the outward-pointing unit normal vector field to \(\partial \omega \). Consequently, the internal virtual work functional can be represented as

$$\begin{aligned} \delta \mathscr {W}_{\omega }^{\text {int}} = \delta \mathscr {W}_{\omega }^{\text {int},0} + \delta \mathscr {W}_{\partial \omega }^{\text {int},0} + \delta \bar{\mathscr {W}}_{\partial \omega }^{\text {int},I}\,, \end{aligned}$$

where we have introduced the functionals

$$\begin{aligned} \begin{aligned}&\delta \mathscr {W}_{\omega }^{\text{ int },0}(\delta x):=\int \limits _{\omega } \frac{\partial \bar{c}_i^k}{\partial x^k}\delta x^i\,d\omega \, ,\\ {}&\delta \mathscr {W}_{\partial \omega }^{\text{ int },0}(\delta x):=-\int \limits _{\partial \omega } \bar{c}_i^jn_j\delta x^i\,d\partial \omega \, ,\\ {}&\delta \bar{\mathscr {W}}_{\partial \omega }^{\text{ int },I}(\delta x):=- \int \limits _{\partial \omega } \mathbbm {c}_{i}^{jk}n_j\, \delta d^i_k\,d\partial \omega \,. \end{aligned}\end{aligned}$$
(7)

Since \(\partial \omega \) has co-dimension one, the tangent space \(T_x\partial \omega \) is a two-dimensional subspace of \({\mathbb {E}}^3\) for all \(x\in \partial \omega \). Consequently, the normal space \(N_x\partial \omega :=(T_x\partial \omega )^\perp \) is one-dimensional and is spanned by the outward-pointing unit normal vector n(x). It is immediately clear that the map

$$\begin{aligned} {m_{\perp }\!}:{\mathbb {E}}^3\rightarrow {\mathbb {E}}^3, X \mapsto \langle n,X\rangle n \end{aligned}$$

is the orthogonal projection onto the normal space. Moreover, since

$$\begin{aligned} {m_{\perp }\!}(e_i) = \langle n,e_i\rangle n^je_j = n_in^je_j = {m_\perp }_i^je_j\,, \end{aligned}$$
(8)

the components of the projection can be identified as \({m_\perp }_{i}^{j} = n_{i} n^{j}\). Denoting the orthogonal projection onto \(T_x\partial \omega \) by \(m_{\parallel }(x)\), it holds that \(\text {id} = m_{{\perp }} + m_{\parallel }\), where “id” is the identity map on \({\mathbb {E}}^3\). Hence, \(\delta _j^i = {m_{\perp }}_{j}^{i} +{m_\parallel }_{j}^{i}\), which we use to further manipulate \(\delta {\bar{\mathscr {W}}}_{\partial \omega }^{\text {int},I}\) defined in (7). Namely,

$$\begin{aligned} \delta \bar{\mathscr {W}}_{\partial \omega }^{\text{ int },I}(\delta x) = -\int \limits _{\partial \omega } \mathbbm {c}_{i}^{jk}n_j\frac{\partial \delta x^i}{\partial x^l} \delta _k^l\,d\partial \omega = -\int \limits _{\partial \omega } \mathbbm {c}_{i}^{jk}n_j \frac{\partial \delta x^i}{\partial x^l}({m_\perp }_{k}^{l}+{m_\parallel }_{k}^{l})\,d\partial \omega \,.\end{aligned}$$
(9)

Hence, the virtual work functional (9) can be represented as the sum

$$\begin{aligned} \delta \bar{\mathscr {W}}_{\partial \omega }^{\text {int},I} = \delta {\mathscr {W}}_{\partial \omega }^{\text {int},I} + \delta \tilde{\mathscr {W}}_{\partial \omega }^{\text {int},I} \, , \end{aligned}$$

where the first functional is defined as

$$\begin{aligned} \delta {\mathscr {W}}_{\partial \omega }^{\text{ int },I}(\delta x):=-\int \limits _{\partial \omega } \mathbbm {c}_{i}^{jk}\frac{\partial \delta x^i}{\partial x^l}n_j {m_\perp }_{k}^{l}\,d\partial \omega {\mathop {=}\limits ^{(8)}}- \int \limits _{\partial \omega } (\mathbbm {c}_{i}^{jk}n_jn_k) \frac{\partial \delta x^i}{\partial x^l}n^l\,d\partial \omega \,,\end{aligned}$$

which is a second-order transverse distribution [31], involving the normal derivative \(\frac{\partial \delta x^i}{\partial x^l}n^l\) of the virtual displacement field \(\delta x\). The second functional is then given by

$$\begin{aligned} \begin{aligned} \delta \tilde{\mathscr {W}}_{\partial \omega }^{\text{ int },I}(\delta x)&:=-\int \limits _{\partial \omega } \mathbbm {c}_{i}^{jk}n_j \frac{\partial \delta x^i}{\partial x^l}{m_\parallel }_{k}^{l}\,d\partial \omega \\ {}&=- \int \limits _{\partial \omega } \mathbbm {c}_{i}^{jk}n_j\frac{\partial \delta x^i}{\partial x^l} {m_\parallel }_{m}^{l} {m_\parallel }_{k}^{m}\,d\partial \omega \\ {}&= -\int \limits _{\partial \omega } \Big [{m_\parallel }_{m}^l \frac{\partial }{\partial x^l}\Big ({m_\parallel }_{k}^{m}\mathbbm {c}_{i}^{jk} n_j \delta x^i\Big ) - {m_\parallel }_{m}^{l}\frac{\partial }{\partial x^l}\Big ({m_\parallel }_{k}^{m}\mathbbm {c}_{i}^{jk}n_j\Big )\delta x^i \Big ] d\partial \omega \\ {}&{\mathop {=}\limits ^{(2)}}-\int \limits _{\partial \omega } \Big [\mathop {\mathrm {div}}\!\Big ({m_\parallel }_{k}^{m}\mathbbm {c}_{i}^{jk}n_j \delta x^i e_m\Big ) - {m_\parallel }_{m}^{l}\frac{\partial }{\partial x^l}\Big ({m_\parallel }_{k}^{m}\mathbbm {c}_{i}^{jk}n_j\Big )\delta x^i \Big ] d\partial \omega \,, \end{aligned}\end{aligned}$$
(10)

where for the equalities we have used the idempotence of the projector \(m_{\parallel }\), the product rule and the local representation (2) of the divergence. Note that we are allowed to use (2) for the last equality, since \({m_\parallel }_{k}^{m}\mathbbm {c}_{i}^{jk}\delta x^in_j\) are the components of the vector field \(m_{{\parallel }}(\mathbbm {c}_{i}^{jk}n_j\delta x^ie_k)\in \mathop {\mathrm {Vect}}(\partial \omega )\), which by virtue of the projector \(m_{\parallel }\) is indeed a vector field on \(\partial \omega \). Using the divergence theorem (1) in (10) leads to

$$\begin{aligned} \delta \tilde{\mathscr {W}}_{\partial \omega }^{\text {int},I} = \delta \mathscr {W}_{\partial \partial \omega }^{\text {int},0} +\delta \tilde{\mathscr {W}}_{\partial \omega }^{\text {int},0} \end{aligned}$$

with the functionals given by

$$\begin{aligned} \begin{aligned}&\delta \mathscr {W}_{\partial \partial \omega }^{\text{ int },0}(\delta x):=-\int \limits _{\partial \partial \omega } {m_\parallel }_{k}^{m}\mathbbm {c}_{i}^{jk}n_jb_m\delta x^i\, d\partial \partial \omega = -\int \limits _{\partial \partial \omega } \mathbbm {c}_{i}^{jk}n_jb_k\delta x^i\, d\partial \partial \omega \, ,\\ {}&\delta \tilde{\mathscr {W}}_{\partial \omega }^{\text{ int },0}(\delta x):=\int \limits _{\partial \omega }{m_\parallel }_{m}^{l}\frac{\partial }{\partial x^l} \Big ({m_\parallel }_{k}^{m}\mathbbm {c}_{i}^{jk}n_j\Big )\delta x^i\,d\partial \omega \,, \end{aligned}\end{aligned}$$
(11)

where \(b_m:=\langle e_m,b\rangle \) and b denotes the outward-pointing unit normal field to the edges constituting \(\partial \partial \omega \), which is also tangent to \(\partial \omega \). Note that, to obtain \(\delta \mathscr {W}_{\partial \partial \omega }^{\text {int},0}\), the divergence theorem has been applied, leading to a line integral along the edges of \(\partial \omega \). Here, we used a notational convention in the expression of the integral. In fact, an edge \(\gamma \) is constituted by two concurring boundary surface manifolds \(\sigma ^+\) and \(\sigma ^-\), say. Hence, \(\gamma \) is traversed twice: once with the surface normal \(n^-\), edge normal \(b^-\) and the limit \((\delta x^i\mathbbm {c}_i^{jk})^-\) approached from the surface \(\sigma ^-\), as well as once with the corresponding \(n^+\), \(b^+\) and \((\delta x^i\mathbbm {c}_i^{jk})^+\). Consequently, if we denote the edge curves by \(\gamma _p\) for \(p = 1,\dots ,n_\mathrm {e}\), the integral expression of the first equality in (11) must be understood as follows

$$\begin{aligned} \int \limits _{\partial \partial \omega } \mathbbm {c}_{i}^{jk}n_jb_k\delta x^i\, d\partial \partial \omega := \sum _{p=1}^{n_\mathrm {e}} \int \limits _{\gamma _p} \big [(\mathbbm {c}_{i}^{jk}n_jb_k\delta x^i)^+ + (\mathbbm {c}_{i}^{jk}n_jb_k\delta x^i)^-\big ] d\gamma _p \, . \end{aligned}$$

In conclusion, the internal virtual work functional (4) can be represented equivalently in the form

$$\begin{aligned} \begin{aligned} \delta \mathscr {W}_{\omega }^{\text{ int }}(\delta x)&= [\delta \mathscr {W}_{\omega }^{\text{ int },0} + (\delta \mathscr {W}_{\partial \omega }^{\text{ int },0} + \delta \tilde{\mathscr {W}}_{\partial \omega }^{\text{ int },0}) + \delta {\mathscr {W}}_{\partial \omega }^{\text{ int },I} +\delta \mathscr {W}_{\partial \partial \omega }^{\text{ int },0}](\delta x) \\ {}&= \int \limits _{\omega } \frac{\partial \bar{c}_i^k}{\partial x^k}\delta x^i\,d\omega +\int \limits _{\partial \omega } \Big [ {m_{\parallel }}^l_m\frac{\partial }{\partial x^l}\Big ({m_{\parallel }}^m_k\mathbbm {c}_{i}^{jk}n_j\Big ) -\bar{c}_i^jn_j\Big ]\delta x^i\,d\partial \omega \\ {}&\quad - \int \limits _{\partial \omega } (\mathbbm {c}_{i}^{jk}n_jn_k) \frac{\partial \delta x^i}{\partial x^l}n^l\,d\partial \omega -\int \limits _{\partial \partial \omega } \mathbbm {c}_{i}^{jk}n_jb_k\delta x^i\, d\partial \partial \omega \,. \end{aligned}\end{aligned}$$
(12)

Since in D’Alembert–Lagrange continuum mechanics the fundamental principle is the principle of virtual work, which requires for a static equilibrium that the total virtual work \(\delta \mathscr {W}_\omega ^\mathrm {tot}:= \delta \mathscr {W}_{\omega }^{\text {int}} + \delta \mathscr {W}_{\omega }^{\text {ext}}\) vanishes for all admissible virtual displacement fields \(\delta x\), the external virtual work functional compatible with second-gradient continua must be of the form

$$\begin{aligned} \delta \mathscr {W}_{\omega }^{\mathrm {ext}}(\delta x)=\int \limits _{\omega }f_{i}^{\omega }\delta x^{i} d\omega +\int \limits _{\partial \omega }f_{i}^{\partial \omega }\delta x^{i} d\partial \omega +\int \limits _{\partial \omega }d_{i}^{\partial \omega }\frac{\partial \delta x^{i}}{\partial x^{j}}n^{j} d\partial \omega +\int \limits _{\partial \partial \omega }f_{i}^{\partial \partial \omega }\delta x^{i}d\partial \partial \omega \;, \end{aligned}$$
(13)

where the co-vector fields \(f^{\omega }\), \(f^{\partial \omega }\) and \(f^{\partial \partial \omega }\) are forces per unit actual volume, surface and line, respectively. Note the somewhat uncommon additional surface density \(d^{\partial \omega }\), called the surface density of double-forces, which is a density per unit actual surface and which is dual to the normal derivative \(\frac{\partial \delta x^{i}}{\partial x^{j}}n^{j}\) of the virtual displacement field with respect to the actual normal vector. Using (12) and (13) in the principle of virtual work, the expressions in the volume integral readily lead to the equilibrium equations, whereas the expressions in the surface and edge integrals lead to the boundary conditions.

3 Divergence theorem on Riemannian manifolds

In this section, we aim at revising the concepts needed to understand the divergence theorem in a setting that uses as little mathematical structure as possible. For that, we regard M as a smooth oriented (topological) manifold equipped with a Riemannian metric g, i.e., g is a covariant tensor field of rank two on M which endows the tangent spaces \(T_xM\) with an inner product for each \(x\in M\).

Since the manifold M is oriented, we can define the Riemannian volume form \(dM\in \Omega ^m(M)\) as the unique differential m-form defined by the condition

$$\begin{aligned} dM_x(B_1,\dots ,B_m) = 1 \end{aligned}$$
(14)

for every \(x\in M\) and for any positively oriented orthonormal basis \((B_1,\dots ,B_m)\) of \(T_xM\). The Riemannian volume form is well-defined and unique since the transformation matrix between two positively oriented orthonormal bases has determinant 1. It can be shown, see Proposition 15.31 in [26], that

$$\begin{aligned} dM = \sqrt{g}\,\mathrm {d}{\theta }^1\wedge \cdots \wedge \mathrm {d}{\theta }^m \end{aligned}$$
(15)

with respect to a positively oriented chart \(\phi :M\supseteq U\rightarrow {\mathbb {R}}^m, x\mapsto (\theta ^1,\dots ,\theta ^m)\), where \(\sqrt{g} = \sqrt{\det (g_{\alpha \beta })}\) and \(g_{\alpha \beta } = g\big (\frac{\partial }{\partial \theta ^\alpha },\frac{\partial }{\partial \theta ^\beta }\big )\).

With the Riemannian volume form at hand, we can define the divergence of a vector field X on M as the scalar function \(\mathop {\mathrm {div}}\!X\in C^\infty (M)\) satisfying

$$\begin{aligned} \mathop {\mathrm {div}}\!X\,dM = {\mathcal {L}}_X(dM)\,, \end{aligned}$$
(16)

where \({\mathcal {L}}_X\) denotes the Lie derivative with respect to X, see [25, 26]. Hence, the divergence of X captures how the (incremental) volume of M changes under the flow of the vector field X. Denoting the interior product with X by \(\iota _X\), it follows immediately from Cartan’s magic formula \({\mathcal {L}}_X = \mathrm {d}\circ \iota _X + \iota _X\circ \mathrm {d}\), and because dM is a top-degree form so that \(\mathrm {d}(dM)=0\), that

$$\begin{aligned} \mathop {\mathrm {div}}\!X\,dM = {\mathcal {L}}_X(dM) = \mathrm {d}\big (\iota _X dM\big )\,. \end{aligned}$$
(17)

The divergence operator is not \(C^\infty \)-linear, since the Leibniz rule applies for the exterior derivative. In fact, for \(X,Y\in \mathop {\mathrm {Vect}}(M)\) and \(f\in C^\infty (M)\),

$$\begin{aligned} \mathop {\mathrm {div}}\!\,(X+fY)\,dM = \mathrm {d}\big (\iota _{X+fY}dM\big ) = \mathrm {d}\big (\iota _X dM\big ) + \mathrm {d}\big (f\,\iota _Y dM\big ) = \mathop {\mathrm {div}}\!X\,dM + f\,\mathrm {d}\big (\iota _Y dM\big ) + \mathrm {d}f\wedge \iota _Y dM\,. \end{aligned}$$
(18)

Since the interior product is an anti-derivation and \(\mathrm {d}f\wedge dM = 0\), because dM is a top-degree form, we have that

$$\begin{aligned} 0 = \iota _Y\big (\mathrm {d}f\wedge dM\big ) = \mathrm {d}f(Y)\,dM - \mathrm {d}f\wedge \iota _Y dM\,. \end{aligned}$$
(19)

From the sum of (18) and (19) the well-known property

$$\begin{aligned} \mathop {\mathrm {div}}\!\,(X+fY) = \mathop {\mathrm {div}}\!X + f\mathop {\mathrm {div}}\!Y + \mathrm {d}f(Y)\,. \end{aligned}$$
(20)

of the divergence operator is readily derived. Property (20) reveals that the divergence operator \(X\mapsto \mathop {\mathrm {div}}\!X\) is \({\mathbb {R}}\)-linear, which can also be immediately seen from (17) using the \({\mathbb {R}}\)-linearity of the exterior derivative and the interior product.
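For a quick consistency check of (20) (added for illustration, not part of the original text), the following SymPy sketch verifies the property via the coordinate formula (21), which is derived next, in a generic two-dimensional chart; the arbitrary function w stands in for \(\sqrt{g}\).

```python
# Check of property (20) via the coordinate formula (21) in a generic 2D chart
# (illustrative sketch; w plays the role of sqrt(g)).
import sympy as sp

t1, t2 = sp.symbols('theta1 theta2')
w = sp.Function('w')(t1, t2)                              # stands for sqrt(g)
f = sp.Function('f')(t1, t2)
X = [sp.Function('X1')(t1, t2), sp.Function('X2')(t1, t2)]
Y = [sp.Function('Y1')(t1, t2), sp.Function('Y2')(t1, t2)]

div = lambda V: sum(sp.diff(w*V[a], q) for a, q in enumerate((t1, t2))) / w

lhs = div([X[a] + f*Y[a] for a in range(2)])              # div(X + f Y)
rhs = div(X) + f*div(Y) + sum(sp.diff(f, q)*Y[a] for a, q in enumerate((t1, t2)))  # ... + df(Y)
print(sp.simplify(lhs - rhs))                             # expected output: 0
```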

Using the representation (15), the first local form of the divergence in (2) is found by straightforward computations in coordinates.

Proposition 1

Let \(\phi :M\supseteq U\rightarrow {\mathbb {R}}^m, x\mapsto (\theta ^1,\dots ,\theta ^m)\) be a positively oriented chart of M and let \(X=X^\alpha \frac{\partial }{\partial \theta ^\alpha }\) be a vector field on M. Then,

$$\begin{aligned} \mathop {\mathrm {div}}\!X = \frac{1}{\sqrt{g}}\frac{\partial }{\partial \theta ^\alpha }\Big (\sqrt{g}X^\alpha \Big )\,. \end{aligned}$$
(21)

Proof

This proof can be found on page 399 of [25]. Still, it is given here for completeness, as it is also the goal of this paper to gather all important proofs related to the divergence theorem in one place.

Using the local representation (15) of the Riemannian volume form, we have that

$$\begin{aligned} \iota _X dM = \sum _{\alpha =1}^{m}(-1)^{\alpha -1}\sqrt{g}\,X^\alpha \,\mathrm {d}{\theta }^1\wedge \cdots \wedge \widehat{\mathrm {d}{\theta }^{(\alpha )}}\wedge \cdots \wedge \mathrm {d}{\theta }^m\,, \end{aligned}$$

where the hat indicates that the dual vector \(\mathrm {d}{\theta }^{(\alpha )}\) is left out in the sequence of wedge products. Consequently, by (17), we have that

$$\begin{aligned} \mathop {\mathrm {div}}\!X\,dM = \mathrm {d}\big (\iota _X dM\big ) = \sum _{\alpha =1}^{m}(-1)^{\alpha -1}\frac{\partial }{\partial \theta ^\beta }\Big (\sqrt{g}X^\alpha \Big )\,\mathrm {d}{\theta }^\beta \wedge \mathrm {d}{\theta }^1\wedge \cdots \wedge \widehat{\mathrm {d}{\theta }^{(\alpha )}}\wedge \cdots \wedge \mathrm {d}{\theta }^m\,. \end{aligned}$$
(22)

The m-form appearing in the last sum in (22) vanishes whenever \(\alpha \ne \beta \) and can be rearranged to equal \(\mathrm {d}{\theta }^1\wedge \dots \wedge \mathrm {d}{\theta }^m\) when \(\alpha = \beta \). For that, \((\alpha -1)\) permutations of the indices occur, such that (22) takes the form

$$\begin{aligned} \begin{aligned} \mathop {\mathrm {div}}\!X\,dM&=\sum _{\alpha =1}^{m}(-1)^{2(\alpha -1)}\frac{\partial }{\partial \theta ^\alpha }(\sqrt{g}X^\alpha )\mathrm {d}{\theta }^1\wedge \dots \wedge \mathrm {d}{\theta }^m\\ {}&=\frac{\partial }{\partial \theta ^\alpha }(\sqrt{g}X^\alpha )\mathrm {d}{\theta }^1\wedge \dots \wedge \mathrm {d}{\theta }^m\\ {}&=\frac{\partial }{\partial \theta ^\alpha }(\sqrt{g}X^\alpha )\frac{1}{\sqrt{g}}dM. \end{aligned}\end{aligned}$$

Again, (15) has been used. \(\square \)

Instead of characterizing the divergence of a vector field implicitly by condition (16), an explicit formula can be found.

Theorem 1

Let \(\nabla \) denote the Levi-Civita connection on the Riemannian manifold (M, g). The divergence of a vector field \(X\in \mathop {\mathrm {Vect}}(M)\) at \(p\in M\) is the trace of the map \(\nabla X:T_pM\rightarrow T_pM,\ Y_p\mapsto \nabla _{Y_p}X\). Explicitly, for any chart \(\phi :M\supseteq U\rightarrow {\mathbb {R}}^m, x\mapsto (\theta ^1,\dots ,\theta ^m)\), the divergence can be locally written as

$$\begin{aligned} \mathop {\mathrm {div}}\!X = \mathrm {d}{\theta }^\alpha \big (\nabla _{\frac{\partial }{\partial \theta ^\alpha }}X\big )\,. \end{aligned}$$
(23)

Hence, as done for instance by [6, 34], the divergence could alternatively be defined as

$$\begin{aligned} \mathop {\mathrm {div}}\!X = \mathop {\mathrm {tr}}(Y\mapsto \nabla _Y X)\,, \end{aligned}$$

where “\(\mathop {\mathrm {tr}}\)” denotes the trace operator. The proof of Theorem 1 involves one of the structural equations of a Riemannian manifold, which we will revise briefly before proving the theorem. The reader familiar with the subject can skip the subsequent paragraphs and directly go to the proof.

The structural equations result from Cartan’s theory of moving frames, which roughly speaking stems from representing connections in terms of arbitrary basis fields \(B_\alpha \in \mathop {\mathrm {Vect}}(M)\) instead of using the basis fields \(\frac{\partial }{\partial \theta ^\alpha }\) induced by a chart of the manifold. Let \(B^1,\dots ,B^m\) denote the canonical dual fields to the basis given by the \(B_\alpha \). It is clear that, in contrast to the exterior derivative of the dual fields \(\mathrm {d}{\theta }^\alpha \), the exterior derivatives \(\mathrm {d}B^\alpha \) do not vanish. In fact, by the well-known formula for the exterior derivative of one-forms, see Proposition 14.29 in [26], for any \(X,Y\in \mathop {\mathrm {Vect}}(M)\) it holds that

$$\begin{aligned} \mathrm {d}B^\alpha (X,Y) = \nabla _X\big (B^\alpha (Y)\big ) - \nabla _Y\big (B^\alpha (X)\big ) - B^\alpha ([X,Y])\,, \end{aligned}$$
(24)

where by definition \(\nabla _X(f):=X(f)\) for any smooth \(f :M\rightarrow {\mathbb {R}}\) and \([X,Y]\) denotes the Lie bracket of X and Y. Since the covariant derivative \(\nabla _X\omega \) of a one-form \(\omega \) is defined such that the product rule

$$\begin{aligned} \nabla _X(\omega (Y))=\nabla _X\omega (Y) + \omega (\nabla _XY) \end{aligned}$$
(25)

is satisfied, equation (24) can be reformulated to

$$\begin{aligned} \begin{aligned} \mathrm {d}B^\alpha (X,Y)&= \nabla _XB^\alpha (Y) + B^\alpha (\nabla _XY)- \nabla _YB^\alpha (X) - B^\alpha (\nabla _YX) - B^\alpha ([X,Y])\\&= \nabla _XB^\alpha (Y) - \nabla _YB^\alpha (X) + B^\alpha (\nabla _XY - \nabla _YX - [X,Y])\\&=\nabla _XB^\alpha (Y) - \nabla _YB^\alpha (X)\,. \end{aligned} \end{aligned}$$
(26)

Herein, we have used that the Levi-Civita connection is torsion free, that is, \(T(X,Y) = \nabla _XY - \nabla _YX - [X,Y] = 0\). Exploiting the linearity of the connection and the fact that any vector field can be represented as \(X = B^\beta (X)B_\beta \), it follows from (26) that

$$\begin{aligned} \mathrm {d}B^\alpha (X,Y) = B^\beta (X)\nabla _{B_\beta }B^\alpha (Y) - B^\beta (Y) \nabla _{B_\beta }B^\alpha (X) = (B^\beta \wedge \nabla _{B_\beta }B^\alpha )(X,Y)\,, \end{aligned}$$

from which we conclude the structural equation

$$\begin{aligned} \mathrm {d}B^\alpha = B^\beta \wedge \nabla _{B_\beta }B^\alpha \,. \end{aligned}$$
(27)

Introducing the real-valued functions \(\omega _{\alpha \beta }^\gamma :=B^\gamma (\nabla _{B_\alpha }B_\beta )\), we have

$$\begin{aligned} \nabla _{B_\alpha }B_\beta = \omega _{\alpha \beta }^\gamma B_\gamma \quad \text {and}\quad \nabla _{B_\alpha }B^\gamma = - \omega _{\alpha \beta }^\gamma B^\beta \,, \end{aligned}$$
(28)

where the second equality follows from the first as a consequence of (25). Using (28), the structural equation (27) can finally be written as

$$\begin{aligned} \mathrm {d}B^\alpha = B^\beta \wedge \nabla _{B_\beta }B^\alpha = - \omega _{\beta \gamma }^\alpha B^\beta \wedge B^\gamma \,. \end{aligned}$$
(29)

The complete structural equations of a Riemannian manifold can be found in Theorem 5 of [35, p. 267]. For a more in-depth exposition of the theory of moving frames, we refer to [25] or [35].

To bridge the gap between the theory of moving frames and the “classical” treatment of connections, we note that we can always choose \(B_\alpha =\frac{\partial }{\partial \theta ^\alpha }\). In that case, \(B^\alpha = \mathrm {d}{\theta }^\alpha \). It is immediately clear from (28) that \(\omega _{\alpha \beta }^\gamma =\Gamma _{\alpha \beta }^\gamma \), where the \(\Gamma _{\alpha \beta }^\gamma \) denote the Christoffel symbols of the Levi-Civita connection. Moreover, since \(\mathrm {d}B^\alpha = \mathrm {d}(\mathrm {d}{\theta }^\alpha )=0\), the structural equation (29) expresses the symmetry of the Christoffel symbols, i.e., \(\Gamma _{\alpha \beta }^\gamma = \Gamma _{\beta \alpha }^\gamma \).
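As a small worked example (added for illustration, not contained in the original text), consider on the punctured Euclidean plane the positively oriented orthonormal frame \(B_1=\frac{\partial }{\partial r}\), \(B_2=\frac{1}{r}\frac{\partial }{\partial \varphi }\) in polar coordinates \((r,\varphi )\), with dual frame \(B^1=\mathrm {d}r\), \(B^2=r\,\mathrm {d}\varphi \). The only non-vanishing connection coefficients are \(\omega _{21}^2=-\omega _{22}^1=\frac{1}{r}\), and the structural equation (29) is readily checked:

$$\begin{aligned} \mathrm {d}B^1 = 0 = -\omega _{\beta \gamma }^1 B^\beta \wedge B^\gamma \,,\qquad \mathrm {d}B^2 = \mathrm {d}r\wedge \mathrm {d}\varphi = \frac{1}{r}B^1\wedge B^2 = -\omega _{21}^2 B^2\wedge B^1\,. \end{aligned}$$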

Proof of Theorem 1

A similar proof can be found in Addendum 1 of Chapter 7 in [34]. Without loss of generality, we choose basis fields \(B_\alpha \in \mathop {\mathrm {Vect}}(M)\) which constitute a positively oriented orthonormal basis \((B_1,\dots ,B_m)\) and by \((B^1,\dots ,B^m)\) denote its canonical dual basis. By that choice, the Riemannian volume form has the simple representation \(dM = B^1\wedge \dots \wedge B^m\), which can immediately be seen from (14). Since the trace can be computed using any basis together with its dual basis, we can prove the theorem by showing that

$$\begin{aligned} \mathop {\mathrm {div}}\!X = B^\alpha (\nabla _{B_\alpha }X)\,. \end{aligned}$$
(30)

Moreover, the vector field X may be represented as \(X= X^\beta B_\beta \) and by (20) we have that

$$\begin{aligned} \mathop {\mathrm {div}}\!X = \mathop {\mathrm {div}}\!\,(X^\beta B_\beta ) = X^\beta \mathop {\mathrm {div}}\!B_\beta + \mathrm {d}X^\beta (B_\beta )\,. \end{aligned}$$
(31)

By the property \(\nabla _X(fY)= f \nabla _X Y + \mathrm {d}f(X) Y\) for any smooth function f, the right hand side of (30) satisfies

$$\begin{aligned} B^\alpha (\nabla _{B_\alpha }X) = B^\alpha (\nabla _{B_\alpha }(X^\beta B_\beta )) = X^\beta \,B^\alpha (\nabla _{B_\alpha }B_\beta ) + \mathrm {d}X^\beta (B_\beta )\,. \end{aligned}$$
(32)

Hence, it follows from inserting (31) and (32) in (30) that the theorem can be proven by showing (30) for the basis vectors \(B_1,\dots ,B_m\). Moreover, since the basis can be reordered arbitrarily, it even suffices to show (30) for one basis vector. We choose \(B_1\) for simplicity and subsequently show that

$$\begin{aligned} \mathop {\mathrm {div}}\!B_1 = B^\alpha (\nabla _{B_\alpha }B_1)\,. \end{aligned}$$

It is straightforward to see that

$$\begin{aligned} \iota _{B_1}dM = \iota _{B_1}\big (B^1\wedge \dots \wedge B^m\big ) = B^2\wedge \dots \wedge B^m\,. \end{aligned}$$

Consequently, by (17) we have

$$\begin{aligned} \begin{aligned} \mathop {\mathrm {div}}\!B_1\,dM&= \mathrm {d}(B^2\wedge \dots \wedge B^m)\\ {}&= \sum _{\alpha =2}^{m}(-1)^\alpha B^2\wedge \dots \wedge \mathrm {d}B^\alpha \wedge \dots \wedge B^m\\ {}&= \sum _{\alpha =2}^{m}(-1)^{\alpha } B^2\wedge \dots \wedge \big (- \omega _{\beta \gamma }^\alpha B^\beta \wedge B^\gamma \big )\wedge \dots \wedge B^m\\ {}&= \sum _{\alpha =2}^{m}(-1)^{\alpha } B^2\wedge \dots \wedge \big (- \omega _{\alpha 1}^\alpha B^\alpha \wedge B^1\big )\wedge \dots \wedge B^m\\ {}&= \sum _{\alpha =2}^{m}(-1)^{\alpha }(-1)^{\alpha -1} (- \omega _{\alpha 1}^\alpha B^1)\wedge B^2\wedge \dots \wedge B^\alpha \wedge \dots \wedge B^m\\ {}&= \sum _{\alpha =2}^{m}\omega _{\alpha 1}^\alpha dM\,, \end{aligned} \end{aligned}$$

where we have used (29). It can be concluded that

$$\begin{aligned} \mathop {\mathrm {div}}\!B_1 = \sum _{\alpha =2}^{m}\omega _{\alpha 1}^\alpha \,. \end{aligned}$$
(33)

Because the chosen basis is orthonormal, i.e., \(g(B_\beta ,B_\gamma )=\delta _{\beta \gamma }\), and \(\nabla \) is metric, it holds that \(\omega ^\gamma _{\alpha \beta }=-\omega ^\beta _{\alpha \gamma }\). Indeed,

$$\begin{aligned} 0=\nabla _{B_\alpha }(g(B_\beta ,B_\gamma )) = g(\nabla _{B_\alpha }B_\beta ,B_\gamma ) + g(B_\beta ,\nabla _{B_\alpha }B_\gamma ) = \omega ^\gamma _{\alpha \beta }+\omega ^\beta _{\alpha \gamma }\,. \end{aligned}$$

As a consequence, \(\omega _{11}^1=0\), which allows the sum in (33) to be taken from 1 to m. Using that \(\omega _{\alpha \beta }^\gamma = B^\gamma (\nabla _{B_\alpha }B_\beta )\), (33) can be stated as

$$\begin{aligned} \mathop {\mathrm {div}}\!B_1 = B^\alpha (\nabla _{B_\alpha }B_1)\,. \end{aligned}$$

\(\square \)
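Continuing the polar-frame example given before the proof (an illustration added here, not part of the original text): for \(m=2\), formula (33) gives \(\mathop {\mathrm {div}}\!B_1 = \omega _{21}^2 = \frac{1}{r}\), which agrees with (21) applied to \(X=\frac{\partial }{\partial r}\), i.e., \(X^r=1\), \(X^\varphi =0\) and \(\mathop {\mathrm {div}}\!X = \frac{1}{r}\frac{\partial }{\partial r}(r)=\frac{1}{r}\).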

Corollary 1

Let \(\Gamma _{\alpha \beta }^\gamma \) denote the Christoffel symbols of the Levi-Civita connection on the Riemannian manifold (M, g) w.r.t. the chart \(\phi :M\supseteq U\rightarrow {\mathbb {R}}^m, x\mapsto (\theta ^1,\dots ,\theta ^m)\). Moreover, let \(X=X^\alpha \frac{\partial }{\partial \theta ^\alpha }\), then

$$\begin{aligned} \mathop {\mathrm {div}}\!X = \frac{\partial X^\alpha }{\partial \theta ^\alpha }+\Gamma _{\alpha \beta }^\alpha X^\beta \,. \end{aligned}$$
(34)

Proof

The representation (34) follows immediately from using the coordinate expression

$$\begin{aligned} \nabla _{\frac{\partial }{\partial \theta ^\alpha }}X = \Big (\frac{\partial X^\gamma }{\partial \theta ^\alpha }+\Gamma _{\alpha \beta }^\gamma X^\beta \Big )\frac{\partial }{\partial \theta ^\gamma } \end{aligned}$$
(35)

in (23). \(\square \)
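As a simple illustration (added here, not part of the original text), consider polar coordinates \((\theta ^1,\theta ^2)=(r,\varphi )\) on the Euclidean plane, for which \((g_{\alpha \beta })=\mathrm {diag}(1,r^2)\), \(\sqrt{g}=r\) and the only non-vanishing Christoffel symbols are \(\Gamma _{\varphi \varphi }^r=-r\) and \(\Gamma _{r\varphi }^\varphi =\Gamma _{\varphi r}^\varphi =\frac{1}{r}\). Both (21) and (34) then yield the familiar expression

$$\begin{aligned} \mathop {\mathrm {div}}\!X = \frac{1}{r}\frac{\partial }{\partial r}\big (rX^r\big ) + \frac{\partial X^\varphi }{\partial \varphi } = \frac{\partial X^r}{\partial r} + \frac{\partial X^\varphi }{\partial \varphi } + \frac{1}{r}X^r\,. \end{aligned}$$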

The boundary \(\partial M\) of the oriented Riemannian manifold (M, g) is a codimension-one submanifold of M. This implies that for every boundary point \(x\in \partial M\), the tangent space \(T_x\partial M\) is a subspace of \(T_xM\). Moreover, the orthogonal complement \((T_x\partial M)^\perp \) is one-dimensional. This allows us to define the outward-pointing unit normal vector field \(\nu \) to \(\partial M\) by the following three conditions. First, \(\nu \) is normalized, i.e., \(g(\nu ,\nu )=1\). Second, \(\nu \) is orthogonal to the boundary, that is, \(\nu (x)\) spans \((T_x\partial M)^\perp \), and last, \(\nu \) is outward-pointing, which means that for all \(x\in \partial M\) there is a curve \(\gamma : (-\varepsilon ,0]\rightarrow M\) \((\varepsilon >0)\) with \(\gamma (0)=x\) and \(\dot{\gamma }(0)=\nu (x)\).

The orientation of M induces an orientation on the boundary by means of the outward-pointing unit normal \(\nu \). In fact, we define the induced orientation of \(\partial M\) by saying that a basis \((B_1,\dots ,B_{m-1})\) of \(T_x\partial M\) is positively oriented if and only if the basis \((\nu ,B_1,\dots ,B_{m-1})\) of \(T_xM\) is positively oriented.

Proposition 2

(see [26], Proposition 15.34) Let (M, g) be an oriented Riemannian manifold with boundary \(\partial M\), which carries the induced orientation. Moreover, let dM and \(d\partial M\) denote the Riemannian volume forms of M and \(\partial M\), respectively. Then,

$$\begin{aligned} d\partial M = \iota _\nu dM\big \vert _{\partial M}\,, \end{aligned}$$
(36)

where \(\nu \) is the outward-pointing unit normal to \(\partial M\).

Proof

For \(x\in \partial M\), let \((B_1,\dots ,B_{m-1})\) be a positively oriented orthonormal basis of \(T_x\partial M\), then by definition of the induced orientation, \((\nu (x),B_1,\dots ,B_{m-1})\) is a positively oriented orthonormal basis of \(T_xM\). Hence, by the condition (14) of the Riemannian volume form, it holds that

$$\begin{aligned} \big (\iota _\nu dM\big )_x(B_1,\dots ,B_{m-1}) = dM_x\big (\nu (x),B_1,\dots ,B_{m-1}\big ) = 1 = d\partial M_x(B_1,\dots ,B_{m-1})\,. \end{aligned}$$

This proves the claim, since the orthonormal basis \((B_1,\dots ,B_{m-1})\) is arbitrary. \(\square \)

With this preparatory work, we can finally proceed to the derivation of the divergence theorem for Riemannian manifolds. For that, we integrate (17) over M to arrive at

$$\begin{aligned} \int \limits _M \mathop {\mathrm {div}}\!X\,dM = \int \limits _M \mathrm {d}\big (\iota _X dM\big ) = \int \limits _{\partial M} \iota _X dM\,, \end{aligned}$$
(37)

where Stokes’ theorem has been invoked. To further manipulate the integrand of the right-hand side, we use the outward-pointing unit normal \(\nu \) of \(\partial M\) to write \(X = g(\nu , X)\nu + X_\parallel \), where \((X_\parallel )_x\in T_x\partial M\) for all \(x\in \partial M\). Due to the linearity of the interior product, we have

$$\begin{aligned} \iota _X dM = g(\nu ,X)\,\iota _\nu dM + \iota _{X_\parallel }dM = g(\nu ,X)\,d\partial M + \iota _{X_\parallel }dM \end{aligned}$$
(38)

on \(\partial M\), where (36) has been used. Moreover, it is easy to see that \(\iota _{X_\parallel }dM\big \vert _{\partial M}=0\). In fact, for each \(x\in \partial M\), let \((B_1,\dots ,B_{m-1})\) be a basis of \(T_x\partial M\), then

$$\begin{aligned} \big (\iota _{X_\parallel }dM\big )_x(B_1,\dots ,B_{m-1}) = dM_x\big (X_\parallel (x),B_1,\dots ,B_{m-1}\big ) = 0\,, \end{aligned}$$

since the vectors \(X_\parallel (x),B_1,\dots ,B_{m-1}\) are linearly dependent because \(X_\parallel (x)\in T_x\partial M\). Hence, we can drop the last summand when inserting (38) in (37), which proves the divergence theorem in the following form.

Theorem 2

(see [26], Theorem 16.32) Let (M, g) be an oriented Riemannian manifold with boundary \(\partial M\), which carries the induced orientation. Moreover, let dM and \(d\partial M\) denote the Riemannian volume forms of M and \(\partial M\), respectively. For a vector field \(X\in \mathop {\mathrm {Vect}}(M)\), it holds that

$$\begin{aligned} \int \limits _M \mathop {\mathrm {div}}\!X\,dM = \int \limits _{\partial M} g(\nu , X)\,d\partial M \,, \end{aligned}$$
(39)

where \(\nu \) is the outward-pointing unit normal to \(\partial M\).

In many applications of the divergence theorem, see for example Sect. 2, the boundary \(\partial M\) is the union of a finite number of faces \(F_i\), which are orientable manifolds with boundary. This makes M a manifold with corners, and \(\partial M\) is not a smooth manifold in general. However, everything which has been said in this section remains valid for manifolds with corners, cf. [26]. To see that, it suffices to use \(\partial M = \cup _iF_i\) and treat the faces \(F_i\) separately whenever \(\partial M\) is invoked. That is, we can define the outward-pointing unit normal \(\nu _i\) as well as the induced orientation for every \(F_i\) exactly as done above. Also, equation (36) remains valid for every \(F_i\), i.e., \(dF_i = \iota _{\nu _i}dM\big \vert _{F_i}\), where \(dF_i\) denotes the Riemannian volume form of \(F_i\). Finally, the integration over the boundary of M in the divergence theorem (39) must be interpreted as the sum of the integrals over the faces, viz.

$$\begin{aligned} \int \limits _M \mathop {\mathrm {div}}\!X\,dM = \sum _i\int \limits _{F_i} g(\nu _i, X)\,dF_i\,.\end{aligned}$$
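The face-wise form of the divergence theorem can be checked directly in a simple setting; the following SymPy sketch (illustrative, not part of the original text) does so for the flat square \(M=[0,1]^2\subseteq {\mathbb {E}}^2\), whose boundary consists of four faces (the edges), and for an arbitrarily chosen smooth vector field.

```python
# Illustrative SymPy check (not from the paper) of the face-wise divergence theorem
# for the flat square M = [0,1]^2 in E^2 and an arbitrary smooth vector field.
import sympy as sp

x, y, t = sp.symbols('x y t')
X = sp.Matrix([x**2*y, sp.sin(x) + y**3])             # arbitrary smooth field

lhs = sp.integrate(sp.diff(X[0], x) + sp.diff(X[1], y), (x, 0, 1), (y, 0, 1))

faces = [                                             # (unit-speed parametrization, outward normal)
    (sp.Matrix([t, 0]), sp.Matrix([0, -1])),          # bottom edge
    (sp.Matrix([1, t]), sp.Matrix([1, 0])),           # right edge
    (sp.Matrix([t, 1]), sp.Matrix([0, 1])),           # top edge
    (sp.Matrix([0, t]), sp.Matrix([-1, 0])),          # left edge
]
rhs = sum(sp.integrate(X.subs({x: p[0], y: p[1]}).dot(n), (t, 0, 1)) for p, n in faces)

print(sp.simplify(lhs - rhs))                         # expected output: 0
```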

4 Divergence theorem on submanifolds of \({\mathbb {E}}^n\)

In this section, the divergence theorem for the case of an oriented submanifold M of \({\mathbb {E}}^n\) is studied. In particular, the last representation of the divergence in (2) is derived.

The fact that \(M\subseteq {\mathbb {E}}^n\) is a submanifold implies that \(T_xM\subseteq T_x{\mathbb {E}}^n\) for all \(x\in M\). In extrinsic differential geometry, it is customary to identify \(T_x{\mathbb {E}}^n\) with \({\mathbb {E}}^n\). This is done implicitly by saying that a vector \(v\in {\mathbb {E}}^n\) is tangent to M at x, i.e., lies in \(T_xM\), if there is a curve \(\gamma \) in M with \(\gamma (0)=x\) and \(\dot{\gamma }(0)=v\), where \(\dot{\gamma }\) denotes the derivative of the curve. Hence, \(T_xM\) is a vector subspace of \({\mathbb {E}}^n\). The inner product \(\langle u,v\rangle \) between two vectors \(u,v\in {\mathbb {E}}^n\) can be used to induce a Riemannian metric on the manifold M by setting

$$\begin{aligned} g(x):T_xM\times T_xM\rightarrow {\mathbb {R}}, (u,v)\mapsto \langle u,v\rangle \end{aligned}$$
(40)

for every point \(x\in M\). The fact that (M, g) is an oriented Riemannian manifold gives rise to the Levi-Civita connection on M as well as the Riemannian volume form dM. Moreover, it implies that the results from Sect. 3 apply directly to submanifolds of \({\mathbb {E}}^n\). In particular, using the induced metric (40) in the right-hand side of (39) leads to the divergence theorem in the following form.

Corollary 2

Let M be an oriented submanifold of \({\mathbb {E}}^n\) equipped with the induced Riemannian metric (40). Let \(\partial M\) denote the submanifold’s boundary, which carries the induced orientation. Then, the divergence theorem is given as

$$\begin{aligned} \int \limits _M \mathop {\mathrm {div}}\!X\, dM = \int \limits _{\partial M} \langle \nu ,X\rangle \, d\partial M\,, \end{aligned}$$

where dM and \(d\partial M\) are the Riemannian volume forms of M and \(\partial M\), respectively, and \(\nu \) is the outward-pointing unit normal to \(\partial M\).

To finally derive the last representation of the divergence in (2), it is insightful to further examine the geometric structure which M inherits from the surrounding space \({\mathbb {E}}^n\). Choosing a basis \((e_1,\dots ,e_n)\) of \({\mathbb {E}}^n\), we can write \(x=x^ie_i\in M\subseteq {\mathbb {E}}^n\). Moreover, consider a (local) parametrization \(\psi :{\mathbb {R}}^m\rightarrow M, (\theta ^1,\dots ,\theta ^m) \mapsto \psi (\theta ^1,\dots ,\theta ^m)\) of the manifold M, then the vectors \(g_\alpha (x):=\frac{\partial \psi }{\partial \theta ^\alpha }\big |_{\psi ^{-1}(x)}\) (\(\alpha =1,\dots ,m\)) define a basis of \(T_xM\). Let \((g^1,\dots ,g^m)\) be the canonical dual basis of \((g_1,\dots ,g_m)\), i.e., \(g^\beta \) is a linear map such that \(g^\beta (g_\alpha )=\delta _\alpha ^\beta \). Similarly, let \((e^1,\dots ,e^n)\) be the canonical dual basis of \((e_1,\dots ,e_n)\), then

$$\begin{aligned} g_\alpha = e^i(g_\alpha )e_i = A^i_\alpha e_i \quad \text {and}\quad g^\alpha = g^\alpha (e_i) e^i = B_i^\alpha e^i\,, \end{aligned}$$
(41)

where we have implicitly introduced the abbreviations \(A^i_\alpha :=e^i(g_\alpha )\) and \(B_i^\alpha :=g^\alpha (e_i)\).

Since \(T_xM\) is a linear subspace of the Euclidean space \({\mathbb {E}}^n\), we can define the orthogonal projection \(P_\parallel (x):{\mathbb {E}}^n\rightarrow {\mathbb {E}}^n\) onto \(T_xM\). By the projection property, \(P_\parallel \) must be the identity map when restricted to \(T_xM\). Hence, \(P_\parallel (g_\alpha )=g_\alpha \) for all \(\alpha =1,\dots ,m\), which implies the local representation

$$\begin{aligned} P_\parallel :{\mathbb {E}}^n\rightarrow {\mathbb {E}}^n, V \mapsto g^\alpha (V)g_\alpha \,. \end{aligned}$$

Using (41), it follows that

$$\begin{aligned} P_\parallel (e_i) = g^\alpha (e_i)g_\alpha = B^\alpha _i g_\alpha = B^\alpha _iA_\alpha ^j e_j\,. \end{aligned}$$
(42)

With the components \({P_\parallel }_i^j :=e^j(P_\parallel (e_i))\) of the projection, which is equivalent to stating that \(P_\parallel (e_i) = {P_\parallel }_i^j e_j\), a comparison with (42) yields

$$\begin{aligned} {P_\parallel }_i^j = B^\alpha _iA_\alpha ^j\,. \end{aligned}$$
(43)
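The component formula (43) can be probed directly; the following SymPy sketch (illustrative, not part of the original text) builds \(P_\parallel \) from (41)–(43) for a graph surface with generic height function and checks the expected projection properties.

```python
# SymPy sketch (illustrative): build P from (41)-(43) for psi(u,v) = (u, v, h(u,v))
# and check the projection properties P P = P, P^T = P and P g_alpha = g_alpha.
import sympy as sp

u, v = sp.symbols('u v')
h = sp.Function('h')(u, v)                              # generic height function
psi = sp.Matrix([u, v, h])
g_ = [psi.diff(s) for s in (u, v)]                      # g_alpha
G = sp.Matrix(2, 2, lambda a, b: g_[a].dot(g_[b]))

A = sp.Matrix(3, 2, lambda i, a: g_[a][i])              # A^i_alpha = e^i(g_alpha)
B = G.inv()*A.T                                         # B^alpha_i = g^alpha(e_i)
P = A*B                                                 # P[j, i] = P^j_i = B^alpha_i A^j_alpha

print(sp.simplify(P*P - P))                             # expected: zero matrix
print(sp.simplify(P - P.T))                             # expected: zero matrix
print([sp.simplify(P*g_[a] - g_[a]) for a in range(2)]) # expected: two zero vectors
```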

The directional derivative of a scalar function \(f:{\mathbb {E}}^n\rightarrow {\mathbb {R}}\) and a vector field \(U:{\mathbb {E}}^n\rightarrow {\mathbb {E}}^n\) in the direction of the vector field \(V:{\mathbb {E}}^n\rightarrow {\mathbb {E}}^n\) at \(x\in {\mathbb {E}}^n\) are, respectively, defined as

$$\begin{aligned} \mathrm {D}_Vf(x) = \frac{\mathrm {d}(f\circ \gamma )}{\mathrm {d}t}\Big \vert _{t=0} \quad \text {and}\quad \mathrm {D}_VU(x) = \frac{\mathrm {d}(U\circ \gamma )}{\mathrm {d}t}\Big \vert _{t=0}\,, \end{aligned}$$
(44)

where \(\gamma \) is any curve in \({\mathbb {E}}^n\) with \(\gamma (0)=x\) and \(\dot{\gamma }(0)=V(x)\). Using the chain rule and the product rule, it is straightforward to verify the following properties

$$\begin{aligned} \begin{aligned} \mathrm {D}_V(fU+W)&= \mathrm {D}_Vf\,U + f \mathrm {D}_VU + \mathrm {D}_VW\\ \mathrm {D}_{fV+W}U&= f\mathrm {D}_{V}U + \mathrm {D}_{W}U\\ \mathrm {D}_V\langle U,W\rangle&=\langle \mathrm {D}_VU,W\rangle + \langle U,\mathrm {D}_VW\rangle \,, \end{aligned} \end{aligned}$$
(45)

where UVW are vector fields on \({\mathbb {E}}^n\) and f is a scalar function. Since in \({\mathbb {E}}^n\), we can always use \(\gamma (t)= x + tV\), the directional derivative in the direction of the basis vector \(e_j\) is just the partial derivative with respect to the component \(x^j\), i.e.,

$$\begin{aligned} \begin{aligned} \mathrm {D}_{e_j}f(x)&= \frac{\mathrm {d}}{\mathrm {d}t}\Big \vert _{t=0}f(x+te_j) = \frac{\partial f}{\partial x^j}(x) \\ \mathrm {D}_{e_j}U(x)&= \frac{\mathrm {d}}{\mathrm {d}t}\Big \vert _{t=0}U(x+te_j) = \frac{\partial U}{\partial x^j}(x)\,, \end{aligned} \end{aligned}$$
(46)

where (44) has been employed. With the representations \(U = U^ie_i\) and \(V=V^je_j\), it follows from (45) that

$$\begin{aligned} \mathrm {D}_VU = \mathrm {D}_{V^je_j}U = V^j\, \mathrm {D}_{e_j}U {\mathop {=}\limits ^{(46)}} V^j\frac{\partial U}{\partial x^j} = V^j\frac{\partial U^i}{\partial x^j}e_i\,. \end{aligned}$$

Since two vector fields \(X,Y\in \mathop {\mathrm {Vect}}(M)\) are also vector fields on \({\mathbb {E}}^n\), it makes sense to compute the directional derivative \(\mathrm {D}_YX\). Instead of dedicating ourselves to general vector fields, it is insightful to compute the directional derivative in the direction of the basis vector field \(g_\alpha =\frac{\partial \psi }{\partial \theta ^\alpha } \circ \psi ^{-1}\) induced by the parametrization \(\psi \) of M. By definition, at point \(x=\psi (\theta ^1,\dots ,\theta ^m)\), the curve \(\gamma (t)=\psi (\theta ^1,\dots ,\theta ^\alpha +t,\dots \theta ^m)\) satisfies \(\gamma (0)=x\) and \(\dot{\gamma }(0)=g_\alpha (x)\). Consequently, in agreement with (44), it holds that

$$\begin{aligned} \begin{aligned} \mathrm {D}_{g_\alpha }f(x)&= \frac{\mathrm {d}}{\mathrm {d}t}\Big \vert _{t=0}f\circ \psi (\theta ^1,\dots ,\theta ^\alpha +t,\dots \theta ^m) = \frac{\partial (f\circ \psi )}{\partial \theta ^\alpha }\circ \psi ^{-1}(x) \\ \mathrm {D}_{g_\alpha }X(x)&= \frac{\mathrm {d}}{\mathrm {d}t}\Big \vert _{t=0}X\circ \psi (\theta ^1,\dots ,\theta ^\alpha +t,\dots \theta ^m) = \frac{\partial (X\circ \psi )}{\partial \theta ^\alpha }\circ \psi ^{-1}(x)\,. \end{aligned} \end{aligned}$$
(47)

To keep notation short, we will drop the parametrization and briefly write \(\mathrm {D}_{g_\alpha }X = \frac{\partial X}{\partial \theta ^\alpha }\). In other words, X and \(X\circ \psi \) are identified and it is assumed to be clear from the context which of the two is meant. Moreover, we will interchangeably use \(\mathrm {D}_{g_\alpha }X\) and \(\frac{\partial X}{\partial \theta ^\alpha }\).

Using the representation \(X = X^\beta g_\beta \) in (47), it follows that

$$\begin{aligned} \mathrm {D}_{g_\alpha }X = \frac{\partial (X^\beta g_\beta )}{\partial \theta ^\alpha } = \frac{\partial X^\beta }{\partial \theta ^\alpha }g_\beta + X^\beta \frac{\partial g_\beta }{\partial \theta ^\alpha }\,. \end{aligned}$$

It is immediately clear that, while \(g_\alpha \) and X are vector fields on M, the directional derivative \(\mathrm {D}_{g_\alpha }X\) is not necessarily a vector field on M, because in general \(\mathrm {D}_{g_\alpha }g_\beta (x)=\frac{\partial g_\beta }{\partial \theta ^\alpha }(x)\notin T_xM\). Hence, the directional derivative D is not suitable as directional derivative on M. However, we can easily construct the covariant derivative \(\nabla \) on M by setting

$$\begin{aligned} \nabla _YX = P_\parallel \big (\mathrm {D}_YX\big )\,, \end{aligned}$$
(48)

where \(X,Y\in \mathop {\mathrm {Vect}}(M)\). The covariant derivative plays the role of the directional derivative on M. Moreover, since the projection is linear, the covariant derivative inherits the properties (45) of the directional derivative, where the inner product is replaced by the induced metric (40). That is,

$$\begin{aligned} \begin{aligned} \nabla _Y(fX+Z)&= \nabla _Yf\,X + f \nabla _YX + \nabla _YZ\\ \nabla _{fY+Z}X&= f\nabla _{Y}X + \nabla _{Z}X\\ \nabla _Y\big (g(X,Z)\big )&=g(\nabla _YX,Z) + g(X,\nabla _YZ)\,, \end{aligned} \end{aligned}$$
(49)

where \(X,Y,Z\in \mathop {\mathrm {Vect}}(M)\) and f is a scalar function. In particular, the last property shows that the covariant derivative is metric. In contrast to \(\mathrm {D}_{g_\alpha }g_\beta =\frac{\partial g_\beta }{\partial \theta ^\alpha }\), the covariant derivative \(\nabla _{g_\alpha }g_\beta \) is a vector field on M by construction; hence, it can be spanned by the basis vector fields \(g_1,\dots ,g_m\), that is,

$$\begin{aligned} \nabla _{g_\alpha }g_\beta = \Gamma _{\alpha \beta }^\gamma g_\gamma \,, \end{aligned}$$
(50)

where the coefficients \(\Gamma _{\alpha \beta }^\gamma \) are called Christoffel symbols. Since \(g_\gamma = \frac{\partial \psi }{\partial \theta ^\gamma }\) and using the symmetry of second derivatives, we have that

$$\begin{aligned} \nabla _{g_\alpha }g_\beta - \nabla _{g_\beta }g_\alpha = P_\parallel \Big (\frac{\partial g_\beta }{\partial \theta ^\alpha }-\frac{\partial g_\alpha }{\partial \theta ^\beta } \Big ) = P_\parallel \Big (\frac{\partial ^2\psi }{\partial \theta ^\alpha \partial \theta ^\beta } -\frac{\partial ^2\psi }{\partial \theta ^\beta \partial \theta ^\alpha } \Big ) = 0\,. \end{aligned}$$
(51)

Using (50), the symmetry property (51) implies the symmetry of the Christoffel symbols

$$\begin{aligned} \Gamma _{\alpha \beta }^\gamma = \Gamma _{\beta \alpha }^\gamma \end{aligned}$$

and shows that the covariant derivative is torsion-free. Since, by the fundamental theorem of Riemannian geometry, the Levi-Civita connection is the unique connection which is metric and torsion-free, the covariant derivative \(\nabla \) is exactly the Levi-Civita connection on the Riemannian manifold (M, g), where g is given by (40). For more details, we refer to Section 4.3 in [25] or Section 4A in [24]. Finally, using the representation \(X = X^\beta g_\beta \), (49) and (50), the covariant derivative of X with respect to \(g_\alpha \) takes the form

$$\begin{aligned} \nabla _{g_\alpha }X = \Big (\frac{\partial X^\gamma }{\partial \theta ^\alpha } + \Gamma ^\gamma _{\alpha \beta }X^\beta \Big )g_\gamma \,, \end{aligned}$$
(52)

which corresponds to (35) since in extrinsic differential geometry the basis vector \(g_\alpha \) plays the role of the basis vector \(\frac{\partial }{\partial \theta ^\alpha }\) from the intrinsic theory.
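That the projected ambient derivative (48) is indeed the Levi-Civita connection of the induced metric can also be checked symbolically; the following SymPy sketch (illustrative, not part of the original text) compares the Christoffel symbols obtained from (48) and (50) with those given by the Koszul formula for a graph surface with generic height function h.

```python
# SymPy sketch (illustrative): for psi(u, v) = (u, v, h(u, v)), the Christoffel
# symbols of the projected ambient derivative (48) coincide with those of the
# Levi-Civita connection of the induced metric (40).
import sympy as sp

u, v = sp.symbols('u v')
h = sp.Function('h')(u, v)                                  # generic height function
psi = sp.Matrix([u, v, h])
q = (u, v)

g_ = [psi.diff(s) for s in q]                               # g_alpha
G = sp.Matrix(2, 2, lambda a, b: g_[a].dot(g_[b]))          # induced metric g_{alpha beta}
Ginv = G.inv()

# from (48) and (50): Gamma^c_{ab} = g^c(P(d g_b/d theta^a)) = G^{cd} <g_d, d g_b/d theta^a>
Gamma_ext = lambda c, a, b: sum(Ginv[c, d]*g_[d].dot(g_[b].diff(q[a])) for d in range(2))

# Levi-Civita Christoffel symbols of G via the Koszul formula
Gamma_lc = lambda c, a, b: sum(sp.Rational(1, 2)*Ginv[c, d]*(G[d, a].diff(q[b])
               + G[d, b].diff(q[a]) - G[a, b].diff(q[d])) for d in range(2))

print([sp.simplify(Gamma_ext(c, a, b) - Gamma_lc(c, a, b))
       for c in range(2) for a in range(2) for b in range(2)])   # expected: eight zeros
```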

This preparatory work on the geometry of M induced from the surrounding space \({\mathbb {E}}^n\) is used in the subsequent paragraphs to give an alternative proof of Theorem 1 for the case of extrinsic differential geometry and to derive the last representation of the divergence (2).

Theorem 3

Let \(\nabla \) denote the Levi-Civita connection (48) on the Riemannian manifold (M, g) with the induced metric (40). The divergence of a vector field \(X\in \mathop {\mathrm {Vect}}(M)\) at \(x\in M\) is the trace of the map \(\nabla X:T_xM\rightarrow T_xM,\ Y_x\mapsto \nabla _{Y_x}X\). Explicitly, for any basis \((g_1,\dots ,g_m)\) given by a parametrization \(\psi :{\mathbb {R}}^m\rightarrow M, (\theta ^1,\dots ,\theta ^m) \mapsto \psi (\theta ^1,\dots ,\theta ^m)\), the divergence can be written as

$$\begin{aligned} \mathop {\mathrm {div}}\!X = g^\alpha (\nabla _{g_\alpha }X)\,. \end{aligned}$$
(53)

Proof

The proof follows from direct computations in coordinates starting from (21). By the product rule, we have

$$\begin{aligned} \mathop {\mathrm {div}}\!X = \frac{1}{\sqrt{g}}\frac{\partial }{\partial \theta ^\alpha }\Big (\sqrt{g}X^\alpha \Big ) = \frac{\partial X^\alpha }{\partial \theta ^\alpha } + \frac{1}{\sqrt{g}}\frac{\partial (\sqrt{g})}{\partial \theta ^\alpha }X^\alpha \,. \end{aligned}$$
(54)

Since \(\sqrt{g}=\sqrt{\det (g_{\alpha \beta })}\), using the short notation \(\det (g):=\det (g_{\alpha \beta })\), the chain rule can be used to compute

$$\begin{aligned} \frac{\partial (\sqrt{g})}{\partial \theta ^\alpha } = \frac{1}{2\sqrt{g}}\frac{\partial \det (g)}{\partial \theta ^\alpha } = \frac{1}{2\sqrt{g}}\frac{\partial \det (g)}{\partial g_{\beta \gamma }}\frac{\partial g_{\beta \gamma }}{\partial \theta ^\alpha }\,. \end{aligned}$$
(55)

Since the matrix \(g_{\beta \gamma }\) is symmetric and invertible, it holds by Jacobi’s formula that \(\frac{\partial \det (g)}{\partial g_{\beta \gamma }} = \det (g) g^{\beta \gamma }\), where \(g^{\beta \gamma }\) are the components of the inverse matrix of \(g_{\beta \gamma }\), i.e., \(g^{\alpha \beta }g_{\beta \gamma } = \delta ^\alpha _\gamma \). Consequently, (55) can be further manipulated to

$$\begin{aligned} \frac{\partial (\sqrt{g})}{\partial \theta ^\alpha } = \frac{\sqrt{g}}{2}g^{\beta \gamma }\frac{\partial g_{\beta \gamma }}{\partial \theta ^\alpha } = \frac{\sqrt{g}}{2}g^{\beta \gamma }\big (g(\nabla _{g_\alpha }g_\beta ,g_\gamma ) +g(g_\beta ,\nabla _{g_\alpha }g_\gamma )\big )\,, \end{aligned}$$

where for the last equality we have used the last property in (49) as well as the fact that \(g_{\beta \gamma }=g(g_\beta ,g_\gamma )\). We can now use (50) and the linearity of the metric to arrive at

$$\begin{aligned} \frac{\partial (\sqrt{g})}{\partial \theta ^\alpha } = \frac{\sqrt{g}}{2}g^{\beta \gamma }\big (\Gamma _{\alpha \beta }^\nu g_{\nu \gamma }+\Gamma _{\alpha \gamma }^\nu g_{\beta \nu }\big ) = \sqrt{g}\, \Gamma _{\alpha \nu }^\nu \,. \end{aligned}$$
(56)

Finally, inserting (56) in (54) yields

$$\begin{aligned} \mathop {\mathrm {div}}\!X = \frac{\partial X^\alpha }{\partial \theta ^\alpha } + \Gamma _{\alpha \nu }^\nu X^\alpha \,, \end{aligned}$$
(57)

which corresponds to (53) after (52) is inserted. \(\square \)

Note that, due to the symmetry of the Christoffel symbols, expression (57) coincides with the coordinate expression (34) of Corollary 1.

Proposition 3

Let M be a submanifold of \({\mathbb {E}}^n\). For a basis \((e_1, \dots , e_n)\) of \(\mathbb {E}^n\), points \(x \in M\) and vector fields \(X \in \mathop {\mathrm {Vect}}(M)\) can be expressed as \(x=x^i e_i\) and \(X = X_\parallel ^i e_i\), respectively. Using the orthogonal projection \(P_\parallel (x):{\mathbb {E}}^n\rightarrow {\mathbb {E}}^n\) onto \(T_xM\), the divergence of the vector field X can be represented as

$$\begin{aligned} \mathop {\mathrm {div}}\!X = {P_\parallel }_i^j \frac{\partial X_\parallel ^i}{\partial x^j}\,. \end{aligned}$$
(58)

Proof

Using (48) and (41) in (53) and exploiting linearity, it follows that

$$\begin{aligned} \mathop {\mathrm {div}}\!X = g^\alpha \big (P_\parallel (\mathrm {D}_{g_\alpha }X)\big ) = B^\alpha _k e^k\big (P_\parallel (\mathrm {D}_{A^j_\alpha e_j}X)\big )= B^\alpha _k A^j_\alpha e^k\big (P_\parallel (\mathrm {D}_{e_j}X)\big )\,. \end{aligned}$$

Moreover, invoking (43) and (46), the divergence can be further simplified to

$$\begin{aligned} \mathop {\mathrm {div}}\!X = {P_\parallel }_k^j\, e^k\big (P_\parallel (e_i)\big ) \frac{\partial X_\parallel ^i}{\partial x^j} = {P_\parallel }_k^j {P_\parallel }_i^k \frac{\partial X_\parallel ^i}{\partial x^j} = {P_\parallel }_i^j \frac{\partial X_\parallel ^i}{\partial x^j}\,, \end{aligned}$$

where we have used the idempotence of the projection, i.e., \({P_\parallel }_k^j {P_\parallel }_i^k = {P_\parallel }_i^j\). \(\square \)

5 Conclusions

In Sect. 2, we have used the divergence theorem for submanifolds of \({\mathbb {E}}^3\) according to Corollary 2 with the representation of the divergence due to Proposition 3. In fact, we have used the divergence theorem twice, once to transform the volume integral into a surface integral and once to transform a surface integral into a line integral. While in the first application the orthogonal projector is trivially given by the identity map, the second application shows the benefit of using (58) as a representation of the divergence. Since the surface is a codimension-one submanifold, the outward-pointing unit normal vector field readily defines the orthogonal projector.

To arrive at the divergence theorem in this form, we have shown that one can take the following path. We define the divergence of a vector field X on a manifold M by the relation (16). Starting from this definition and accepting Stokes’ theorem on manifolds, the divergence theorem on Riemannian manifolds can be readily derived. Specifically, the divergence is integrated over M, then Cartan’s magic formula, Stokes’ theorem and Proposition 2 are successively employed. Finally, the divergence theorem for submanifolds of \({\mathbb {E}}^n\) follows from using the induced metric on M. The representation (58) of the divergence follows directly from computations in coordinates of the submanifold and is gathered in Theorem 3 and Proposition 3. Thereby, Theorem 3 provides an alternative definition of the divergence, which uses the trace of the covariant derivative of the vector field.

For completeness, we wanted to show that the alternative definition of the divergence as the trace of the covariant derivative of the vector field is also valid for general Riemannian manifolds. This result is provided by Theorem 1. Similarly to Theorem 3, the proof follows from computations in coordinates. However, since we do not have the luxury of a surrounding Euclidean space, the proof is more technical and employs the structural equations of Riemannian manifolds.

For a submanifold M of \({\mathbb {E}}^n\), it is also possible and common to define the divergence by (53) for vector fields X which are not tangent to the submanifold, see for instance [8, 13, 15, 32]. This exploits the fact that the expression (48) defining the covariant derivative makes sense also for such vector fields X. In that case, one needs an extension of the divergence theorem which stems from an additive splitting of X into two parts, one tangential and one normal to the manifold M. The need for this splitting is a consequence of the fact that the divergence theorem from the intrinsic theory can only be applied to the part of the vector field X which is tangent to M. For more details on this construction, we refer to [6].