1 Introduction

Many mechanical systems of interest in applications possess underlying geometric structures that are preserved along the time evolution, for instance energy and other constants of the motion, reversibility and symplecticity. Therefore, when we implement numerical simulations, it is desirable to preserve some of these geometric properties exactly in order to improve the quantitative and qualitative accuracy and the long-time stability of the proposed methods. This is precisely the main idea behind geometric integration [3, 18, 33] and, in particular, behind discrete mechanics and variational integrators [27]. In this last case, the construction of an exact discrete Lagrangian is a crucial element for the analysis of the error between the continuous trajectory and the numerical simulation derived from a variational integrator (see [27, 32], and [7, 14] for forced systems). However, an open question is how to derive the exact discrete version for nonholonomic mechanics (see [29] for an attempt), and this is the main topic of the present paper. The importance of this problem was pointed out by R.I. MacLachlan and C. Scovel, who stated it as an open problem:

The problem for the more general class of non-holonomic constraints is still open, as is the question of the correct analogue of symplectic integration for non-holonomically constrained Lagrangian systems. [28]

Nonholonomic systems are important because they model mechanical systems subjected to velocity constraints that are not derivable from position (holonomic) constraints, and their equations are not obtained using variational techniques. This is the case, for instance, of rolling without slipping. These systems are of considerable interest since velocity (nonholonomic) constraints are present in a great variety of mechanical systems in engineering and robotics (see [4] and references therein). However, at the moment, there is no consensus in the scientific community on the best geometric methods to numerically integrate a nonholonomic system, although several possibilities have been proposed, inspired by the geometry of each nonholonomic system and by suitable discretizations of the Lagrange-d’Alembert principle (see [30] for a discussion on this topic and [2, 5, 6, 13, 15, 16, 20, 29], among others, for some proposals of numerical integrators). Apart from the preservation of geometric structure, another reason why variational integrators work so well for unconstrained systems is that we can compare them with the corresponding exact discrete version to estimate the error of the numerical integrator. Following the work started in [29], the main contribution of our paper is to equip nonholonomic mechanics with an exact discrete version and, as a relevant consequence, to define a new family of nonholonomic integrators containing the exact discrete nonholonomic trajectory. For this purpose, we study how to describe geometrically the exact discrete space where the nonholonomic flow evolves as a submanifold of the Cartesian product of two copies of the configuration space, and then we construct an exact discrete version of nonholonomic dynamics. The new geometric integrators are based on the application of the discrete modified Lagrange-d’Alembert principle and, for this reason, they will be called modified Lagrange-d’Alembert integrators (see [31] for an application of similar methods to Dirac systems).

The outline of the paper is the following. In Sect. 2, we review Lagrangian mechanics in three settings: unconstrained, forced and nonholonomically constrained systems. In Sect. 3, we construct the nonholonomic exponential map using the theory of second-order differential equations on the tangent bundle (see [19]) and also restricted to the constraint distribution. The main result is summarized in Theorem 3.7. The nonholonomic exponential map allows us to introduce an important geometric object: the exact discrete nonholonomic constraint submanifold. In Sect. 4, we review discrete Lagrangian mechanics for unconstrained systems and the discrete Lagrange-d’Alembert principle for discrete forced mechanics. In Sect. 5, we introduce the exact discrete flow for nonholonomic mechanics and derive an integrator having it as a particular solution. With this motivation, we construct a new family of nonholonomic integrators based on the properties of the exact discrete equations. This theory is applied to several examples, whose numerical computations show an excellent behaviour of the energy. Finally, in Sect. 6 we discuss new directions to find a completely intrinsic version of nonholonomic mechanics as a discrete version of the recently proposed continuous setting [12, 17].

Unless stated otherwise, all the maps and manifolds in this paper are smooth. Einstein’s summation convention is used throughout the paper.

2 Continuous Lagrangian mechanics

2.1 Unconstrained systems

A mechanical system is a pair formed by a smooth manifold Q called the configuration space and a smooth function \(L:TQ\rightarrow {\mathbb {R}}\) on its tangent bundle called the Lagrangian [1, 9]. If the system is not subjected to any constraints or external forces, a motion of the mechanical system is a solution of the Euler-Lagrange equations, whose expression in natural coordinates relative to a chart \((q^{i})\) for Q and the induced coordinates \((q^i, {\dot{q}}^i)\) on TQ is

$$\begin{aligned} \frac{d}{dt}\left( \frac{\partial L}{\partial {\dot{q}}^{i}}\right) - \frac{\partial L}{\partial q^{i}}=0. \end{aligned}$$
(1)

As is well known, these equations characterize the critical points of the action functional defined over curves with fixed end points. Denote the set of twice differentiable curves with fixed end-points \(q_0, q_1 \in Q\) by

$$\begin{aligned} C^2 (q_0,q_1)= \{q:[0,T]\longrightarrow Q| \ q(\cdot ) \ \text {is} \ C^2, q(0)=q_0, q(T)=q_1\}. \end{aligned}$$

Then the action functional is defined by

$$\begin{aligned} {\mathcal {J}}:C^2 (q_0,q_1)\longrightarrow {\mathbb {R}}, \ \ q(\cdot )\mapsto {\mathcal {J}}(q(\cdot ))=\int _{0}^{T} L(q(t),{\dot{q}}(t)) \ dt. \end{aligned}$$

We can also express these equations using the geometric ingredients on the tangent bundle. Let \(\tau _Q: TQ\rightarrow Q\) be the canonical tangent projection which in coordinates is given by \((q^i, {\dot{q}}^i)\longrightarrow (q^i)\). The vertical lift of a vector \(v_q\in T_qQ=\tau ^{-1}_Q(q)\) to \(T_{u_q}TQ\), with \(u_q\in T_qQ\) is given by

$$\begin{aligned} (v_q)^V_{u_q}=\left. \frac{d}{dt}\right| _{t=0}(u_q + t v_q) \end{aligned}$$

and the Liouville vector field on TQ is

$$\begin{aligned} \Delta (v_q)=\left. \frac{d}{dt}\right| _{t=0}(v_q + t v_q)=(v_q)^V_{v_q}. \end{aligned}$$

The vertical endomorphism \(S:TTQ\rightarrow TTQ\) is defined by

$$\begin{aligned} S(X_{v_q})=(T_{v_q} \tau _{Q} (X_{v_q}))^V_{v_q}. \end{aligned}$$

In local coordinates, \(\Delta (q^i, v^i)=v^{i}\frac{\partial }{\partial {\dot{q}}^{i}}\) and \(S(X^{i}\frac{\partial }{\partial q^{i}}+X^{n+i}\frac{\partial }{\partial {\dot{q}}^{i}})=X^{i}\frac{\partial }{\partial {\dot{q}}^{i}}\).

A vector field \(\Gamma \) on TQ is said to be a second order differential equation (SODE) if \(S\Gamma = \Delta \) or, equivalently,

$$\begin{aligned} (T_{v_q}\tau _Q)(\Gamma (v_q)) = v_q, \; \; \; \forall v_q \in T_qQ. \end{aligned}$$
(2)

From (2), it also follows that \(\Gamma \in \mathfrak {X}(TQ)\) is a SODE if and only if its local expression is

$$\begin{aligned} \Gamma (q, {\dot{q}}) = {\dot{q}}^{i}\frac{\partial }{\partial q^{i}} + \xi ^{i}(q, {\dot{q}}) \frac{\partial }{\partial {\dot{q}}^{i}}, \end{aligned}$$

with \(\xi ^{i}\) local real \(C^{\infty }\)-functions on TQ. So, the integral curves

$$\begin{aligned} t \rightarrow (q^{i}(t), {\dot{q}}^{i}(t)) \end{aligned}$$

of a SODE \(\Gamma \) satisfy the following system of differential equations

$$\begin{aligned} \displaystyle \frac{dq^{i}}{dt} = {\dot{q}}^{i}, \; \; \; \frac{d{\dot{q}}^i}{dt} = \xi ^{i}(q, {\dot{q}}), \; \; \forall i \end{aligned}$$

or, equivalently, the following system of second order differential equations

$$\begin{aligned} \displaystyle \frac{d^2q^{i}}{dt^2} = \xi ^{i}(q, {\dot{q}}), \; \; \forall i. \end{aligned}$$

In particular, every integral curve \( \gamma : I \rightarrow TQ \) of a SODE \(\Gamma \) is the tangent lift of a curve on Q. Namely,

$$\begin{aligned} \gamma (t) = \frac{d}{dt}(\tau _Q \circ \gamma ). \end{aligned}$$

Such curves on Q are the trajectories of \(\Gamma \).
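The passage from a SODE to a first-order system on TQ can be illustrated numerically. The following is a minimal sketch, not part of the original construction: it assumes \(Q={\mathbb {R}}\), the hypothetical choice \(\xi (q,{\dot{q}})=-q\) (a harmonic oscillator) and a classical fourth-order Runge-Kutta scheme; the names `rk4_sode` and `xi_oscillator` are introduced only for this illustration.

```python
import math

def rk4_sode(xi, q, v, t_final, steps):
    """Integrate dq/dt = v, dv/dt = xi(q, v) with the classical RK4 scheme.

    The pair (q, v) plays the role of an integral curve of the SODE on TQ;
    its first component is the corresponding trajectory on Q.
    """
    dt = t_final / steps
    for _ in range(steps):
        k1q, k1v = v, xi(q, v)
        k2q, k2v = v + 0.5 * dt * k1v, xi(q + 0.5 * dt * k1q, v + 0.5 * dt * k1v)
        k3q, k3v = v + 0.5 * dt * k2v, xi(q + 0.5 * dt * k2q, v + 0.5 * dt * k2v)
        k4q, k4v = v + dt * k3v, xi(q + dt * k3q, v + dt * k3v)
        q, v = (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0,
                v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0)
    return q, v

def xi_oscillator(q, v):
    # Hypothetical SODE coefficient chosen for the example: d^2 q / dt^2 = -q.
    return -q

# Integral curve through (q, v) = (1, 0), evaluated at t = 1.
q_end, v_end = rk4_sode(xi_oscillator, 1.0, 0.0, 1.0, 1000)
```

For this choice the trajectory is \(q(t)=\cos t\), and the velocity component of the integral curve is its derivative \(-\sin t\), in agreement with the tangent lift property stated above.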

Another notion that will be used later is that of the vertical lift of a vector field on Q to TQ. Let \(X\in {{\mathfrak {X}}}(Q)\). The vertical lift of X is the vector field on TQ defined by

$$\begin{aligned} X^V(v_{q})=\left. \frac{d}{dt}\right| _{t=0}(v_{q}+tX(q))=(X(q))^V_{v_{q}},\; \ \forall v_{q}\in T_{q} Q. \end{aligned}$$

Locally,

$$\begin{aligned} X^V=X^i\frac{\partial }{\partial {\dot{q}}^i} \end{aligned}$$
(3)

where \(X=X^i\frac{\partial }{\partial q^i}\).

Denote by \(\{\Phi ^X_t\}\) the flow of a vector field \(X\in {{\mathfrak {X}}}(Q)\). We can also define the complete lift \(X^C\in {{\mathfrak {X}}}(TQ)\) of X in terms of its flow. We say that \(X^C\) is the vector field on TQ with flow \(\{T\Phi ^X_t\}\). In other words,

$$\begin{aligned} X^C(v_q)=\left. \frac{d}{dt}\right| _{t=0}\left( T_q\Phi ^X_t(v_q)\right) \; . \end{aligned}$$

In coordinates

$$\begin{aligned} X^C=X^i\frac{\partial }{\partial q^i}+{\dot{q}}^j\frac{\partial X^i}{\partial q^j}\frac{\partial }{\partial {\dot{q}}^i} \ . \end{aligned}$$
(4)

Note that, if \(q^{i}(t)\) are the local coordinates of a curve on Q, then using (3) and (4), it is easy to prove that such a curve is a solution of the Euler-Lagrange equations (1) if and only if

$$\begin{aligned} X^{C}(L) (q,{\dot{q}})-\frac{d}{dt}\left( X^{V}(L)(q,{\dot{q}})\right) =0, \quad \forall \ X\in {\mathfrak {X}}(Q). \end{aligned}$$
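For completeness, the computation behind this equivalence can be sketched as follows. It uses only the local expressions (3) and (4) and the chain rule along the curve; all quantities are evaluated at \((q(t),{\dot{q}}(t))\), and \(X^i\) stands for \(X^i(q(t))\):

$$\begin{aligned} X^{C}(L)-\frac{d}{dt}\left( X^{V}(L)\right)&= X^i\frac{\partial L}{\partial q^i}+{\dot{q}}^j\frac{\partial X^i}{\partial q^j}\frac{\partial L}{\partial {\dot{q}}^i}-\frac{d}{dt}\left( X^i\frac{\partial L}{\partial {\dot{q}}^i}\right) \\&= X^i\frac{\partial L}{\partial q^i}+{\dot{q}}^j\frac{\partial X^i}{\partial q^j}\frac{\partial L}{\partial {\dot{q}}^i}-{\dot{q}}^j\frac{\partial X^i}{\partial q^j}\frac{\partial L}{\partial {\dot{q}}^i}-X^i\frac{d}{dt}\left( \frac{\partial L}{\partial {\dot{q}}^i}\right) \\&= -X^i\left( \frac{d}{dt}\left( \frac{\partial L}{\partial {\dot{q}}^i}\right) -\frac{\partial L}{\partial q^i}\right) . \end{aligned}$$

Since the values \(X^i(q(t))\) can be prescribed arbitrarily, the left-hand side vanishes for every \(X\in {\mathfrak {X}}(Q)\) exactly when the Euler-Lagrange equations (1) hold along the curve.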

When the function L is regular, that is, when the matrix \(\text {Hess}(L):=\left( \frac{\partial ^2 L}{\partial {\dot{q}}^i \partial {\dot{q}}^j}\right) \) is non-singular, Eq. (1) may be written as a system of second-order differential equations obtained by computing the integral curves of the unique vector field \(\Gamma _L\) satisfying

$$\begin{aligned} i_{\Gamma _L}\omega _L=dE_L, \end{aligned}$$
(5)

where \(\omega _L =-d(S^* dL)\) and \(E_L =\Delta L-L\) are the Poincaré-Cartan 2-form and the energy function, respectively. Moreover, \(\Gamma _L\) is a SODE vector field on TQ (see [9]). Observe that regularity of L is equivalent to \(\omega _L\) being symplectic and therefore to the uniqueness of solution for Eq. (5). In effect, the local expression of the Poincaré-Cartan 2-form is

$$\begin{aligned} \omega _L=\frac{\partial ^2 L}{\partial {\dot{q}}^i \partial q^j} d q^{i}\wedge d q^{j}+\frac{\partial ^2 L}{\partial {\dot{q}}^i \partial {\dot{q}}^j} dq^{i}\wedge d{\dot{q}}^{j}. \end{aligned}$$

Now, we move on to a brief description of standard Hamiltonian mechanics. The cotangent bundle \(T^*Q\) of a differentiable manifold Q is equipped with a canonical exact symplectic structure \(\omega _Q=-d\theta _Q\), where \(\theta _Q\) is the canonical 1-form on \(T^*Q\) defined by

$$\begin{aligned} (\theta _Q)_{\alpha _q}(X_{\alpha _q})=\langle \alpha _q, T_{\alpha _q}\pi _Q(X_{\alpha _q})\rangle \end{aligned}$$

with \(X_{\alpha _q}\in T_{\alpha _q}T^*Q\), \(\alpha _q\in T_q^*Q\) and \(\pi _Q: T^*Q\rightarrow Q\) the canonical projection which in canonical coordinates is \((q^i, p_i)\rightarrow (q^i)\). In canonical bundle coordinates these become

$$\begin{aligned} \theta _Q= p_i\, dq^i\; ,\ \omega _Q= dq^i\wedge dp_i\; . \end{aligned}$$

Given a Hamiltonian function \(H: T^*Q\rightarrow {{\mathbb {R}}}\) we define the Hamiltonian vector field \(X_H\) by

$$\begin{aligned} \imath _{X_H}\omega _Q=dH\; . \end{aligned}$$

The integral curves of \(X_H\) are determined by Hamilton’s equations:

$$\begin{aligned} \frac{dq^i}{dt}=\frac{\partial H}{\partial p_i}\; ,\qquad \frac{dp_i}{dt}=-\frac{\partial H}{\partial q^i}\; . \end{aligned}$$

We can define the Legendre transformation \({\mathbb {F}} L: TQ\rightarrow T^*Q\) by:

$$\begin{aligned} \langle {\mathbb {F}} L (u_q), v_q\rangle =\left. \frac{d}{dt}\right| _{t=0}L(u_q + t v_q) \end{aligned}$$

and if L is regular, its Legendre transformation is a local diffeomorphism. In local coordinates \({\mathbb {F}}L (q^i, {\dot{q}}^i)=(q^i, \frac{\partial L}{\partial \dot{q}^i})\). Defining \(H = E_L \circ \left( {\mathbb {F}}L\right) ^{-1}\) we have that the solutions of \(\Gamma _L\) and \(X_H\) are \({\mathbb {F}}L\)-related. An extensive account of this subject is contained in [1, 9], for instance.
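As an illustration of these definitions, consider a Lagrangian of mechanical type. This particular form is an assumption made only for the example: \(g_{ij}(q)\) denotes a symmetric non-degenerate matrix of functions on Q and V a potential function on Q. Then

$$\begin{aligned} L(q,{\dot{q}})&=\frac{1}{2}g_{ij}(q){\dot{q}}^i{\dot{q}}^j-V(q), \qquad {\mathbb {F}}L(q^i,{\dot{q}}^i)=\left( q^i, g_{ij}(q){\dot{q}}^j\right) , \\ E_L(q,{\dot{q}})&=\frac{1}{2}g_{ij}(q){\dot{q}}^i{\dot{q}}^j+V(q), \qquad H(q,p)=\frac{1}{2}g^{ij}(q)p_ip_j+V(q), \end{aligned}$$

where \((g^{ij})\) is the inverse matrix of \((g_{ij})\). In this case \(\text {Hess}(L)=(g_{ij})\), so L is regular, and \({\mathbb {F}}L\) is linear and invertible on each fiber and hence a global diffeomorphism.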

2.2 Forced mechanics

We now add external forces to the picture. An external force can be interpreted as a fiber-preserving map denoted by \(F:TQ\rightarrow T^{*}Q\) satisfying \(\pi _Q \circ F = \tau _Q\). In canonical bundle coordinates \((q^i, p_i)\) on \(T^*Q\) we have that \(\pi _Q(q^i, p_i)=(q^i)\), thus \(F(q^i, {\dot{q}}^i)=(q^i, F_i(q^{i}, {\dot{q}}^{i}))\).

It is well known that to each such map we can associate a semibasic one-form on TQ defined by

$$\begin{aligned} \langle \mu _{F}(v_q), W \rangle =\langle F(v_q),T\tau _{Q}(W) \rangle , \ \ v_q\in TQ \ \text {and} \ W\in T_{v_q}TQ. \end{aligned}$$

In coordinates \( \mu _F=F_i(q^i, {\dot{q}}^i)\, dq^i\; . \)

A system described by a Lagrangian function \(L:TQ\rightarrow {\mathbb {R}}\) and subjected to an external force F, satisfies the Lagrange-d’Alembert principle, which asserts that a motion of this system between two fixed points \(q_0,q_1\in Q\) is a curve \(q\in C^{2}(q_0,q_1)\) satisfying

$$\begin{aligned} \left. \frac{d}{ds}\right| _{s=0} \int _{0}^{T} L(q(t,s),{\dot{q}}(t,s)) \ dt + \int _{0}^{T} \left\langle F(q(t),{\dot{q}}(t)), \frac{\partial q}{\partial s}(t,0)\right\rangle \ dt=0, \end{aligned}$$
(6)

for all smooth variations \(q(s)\in C^{2}(q_0,q_1)\) of q. This is locally equivalent to the forced Euler-Lagrange equations

$$\begin{aligned} \frac{d}{dt}\left( \frac{\partial L}{\partial {\dot{q}}^{i}}\right) - \frac{\partial L}{\partial q^{i}}=F_{i}\; . \end{aligned}$$
(7)

As in the case of unconstrained systems, it is easy to see, using (3), (4) and (7), that a curve q(t) on Q satisfies the forced Euler-Lagrange equations if and only if

$$\begin{aligned} X^{C}(L) (q,{\dot{q}})-\frac{d}{dt}\left( X^{V}(L)(q,{\dot{q}})\right) =-\langle F(q,{\dot{q}}),X\circ q \rangle , \quad \forall \ X\in {\mathfrak {X}}(Q). \end{aligned}$$

If L is regular, then the solutions of Eq. (7) are the trajectories of a SODE vector field on TQ, denoted by \(\Gamma _{(L,F)}\) and called the forced Lagrangian vector field, which is the unique vector field satisfying

$$\begin{aligned} i_{\Gamma _{(L,F)}}\omega _{L}=dE_{L}-\mu _{F}. \end{aligned}$$
(8)

Now, we move on to the Hamiltonian description of systems subjected to external forces. Given a Hamiltonian function \(H: T^*Q\rightarrow {{\mathbb {R}}}\) we may construct the transformation \({\mathbb {F}}H: T^*Q\rightarrow TQ\) where \(\langle \beta _q, {\mathbb {F}}H(\alpha _q)\rangle ={\frac{d}{dt}\big |_{ t=0}H(\alpha _q+t\beta _q)}\). In coordinates, \({\mathbb {F}}H(q^i, p_i)=(q^i, \frac{\partial H}{\partial p_i}(q,p))\). We say that the Hamiltonian is regular if \({\mathbb {F}}H\) is a local diffeomorphism, which in local coordinates is equivalent to the regularity of the Hessian matrix whose entries are:

$$\begin{aligned} M^{i j} = \frac{\partial ^2 H}{\partial p_i\partial p_j}. \end{aligned}$$

Consider now the external force previously defined in the Lagrangian description and denote \(F^H = F\circ {\mathbb {F}}H: T^*Q\rightarrow T^*Q\).

It is possible to modify the Hamiltonian vector field \(X_H\) to obtain the forced Hamilton’s equations as the integral curves of the vector field \(X_H+Y^v_F\) where the vector field \(Y^v_F\in {{\mathfrak {X}}}(T^*Q)\) is defined by

$$\begin{aligned} Y^v_F(\alpha _q)=\frac{d}{dt}\Big |_{t=0}(\alpha _q+t F^H(\alpha _q))\; . \end{aligned}$$

We will say that the forced Hamiltonian system is determined by the pair \((H, F^H)\).

Locally,

$$\begin{aligned} Y^v_F=F_i\left( q^j, \frac{\partial H}{\partial p_j}(q, p)\right) \frac{\partial }{\partial p_i}=F^H_i(q, p)\frac{\partial }{\partial p_i} \end{aligned}$$

modifying Hamilton’s equations as follows:

$$\begin{aligned} \frac{dq^i}{dt}= & {} \frac{\partial H}{\partial p_i}(q,p)\; , \end{aligned}$$
(9)
$$\begin{aligned} \frac{dp_i}{dt}= & {} -\frac{\partial H}{\partial q^i}(q,p)+F^H_i(q,p)\; . \end{aligned}$$
(10)
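A direct consequence of (9) and (10) is that the Hamiltonian is no longer conserved: along solutions, \(\frac{dH}{dt}=F^H_i\frac{\partial H}{\partial p_i}\). The following minimal sketch illustrates this numerically. It is only an illustration under stated assumptions: \(Q={\mathbb {R}}\), \(H(q,p)=\frac{1}{2}(p^2+q^2)\), the hypothetical linear damping force \(F(q,{\dot{q}})=-c{\dot{q}}\) (so that \(F^H(q,p)=-cp\)), and a classical Runge-Kutta scheme; none of the function names come from the original text.

```python
import math

def forced_hamilton_rhs(q, p, c):
    """Right-hand side of (9)-(10) for H = (p^2 + q^2)/2 and F^H(q, p) = -c*p."""
    dq = p             # dq/dt =  dH/dp
    dp = -q - c * p    # dp/dt = -dH/dq + F^H
    return dq, dp

def rk4_step(q, p, c, dt):
    k1q, k1p = forced_hamilton_rhs(q, p, c)
    k2q, k2p = forced_hamilton_rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p, c)
    k3q, k3p = forced_hamilton_rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p, c)
    k4q, k4p = forced_hamilton_rhs(q + dt * k3q, p + dt * k3p, c)
    return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0,
            p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0)

def energy(q, p):
    return 0.5 * (p * p + q * q)

c, T, steps = 0.2, 5.0, 5000   # damping coefficient, final time, number of steps
dt = T / steps
q, p = 1.0, 0.0
energies = [energy(q, p)]
for _ in range(steps):
    q, p = rk4_step(q, p, c, dt)
    energies.append(energy(q, p))

# Closed-form solution of q'' + c q' + q = 0 with q(0) = 1, q'(0) = 0 (underdamped case).
omega = math.sqrt(1.0 - c * c / 4.0)
q_exact = math.exp(-c * T / 2.0) * (math.cos(omega * T) + c / (2.0 * omega) * math.sin(omega * T))
```

In this example \(\frac{dH}{dt}=-cp^2\le 0\), so the recorded energies decrease along the computed solution, while the position agrees with the closed-form damped oscillation up to the integration error.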

2.3 Nonholonomic systems

A nonholonomic system is defined by the triple \((Q, L, {\mathcal {D}})\) where \(L: TQ\rightarrow {{\mathbb {R}}}\) is a Lagrangian function and \({\mathcal {D}}\) is a nonintegrable distribution on the configuration manifold Q. The distribution \({\mathcal {D}}\) restricts the velocity vectors of motions to lie on \({\mathcal {D}}\) without imposing any restriction on the configuration space. Note that if the distribution were integrable, then the manifold Q would be foliated by immersed submanifolds of Q whose tangent space at each point coincides with the subspace given by the distribution at that point. Hence, motions of these systems would be confined to submanifolds \(N\subseteq Q\) (the leaves of the foliation). In this way, we can consider this case as a holonomic system specified by \((N,L|_{N})\). Constraints of this class are called holonomic constraints. See [4] for more details.

Locally, the nonholonomic constraints are given by a set of k equations that are linear in the velocities

$$\begin{aligned} \mu ^{a}_{i}(q){\dot{q}}^{i}=0, \end{aligned}$$

where \(1\leqslant a\leqslant k\) and the rank of \({\mathcal {D}}\) is \(\text {dim}(Q)-k\). From another point of view, these equations define the vector subbundle \({\mathcal {D}}^{o}\subseteq T^{*}Q\), called the annihilator of \({\mathcal {D}}\), spanned at each point by the one-forms \(\{\mu ^{a}\}\) locally given by \(\mu ^{a}=\mu ^{a}_{i}(q) dq^{i}\). Observe that with this relationship, we can identify the distribution \({\mathcal {D}}\) with a submanifold of the tangent bundle that we also denote by \({\mathcal {D}}\).

In nonholonomic mechanics, the equations of motion are completely determined by the Lagrange-d’Alembert principle. This principle states that a curve \(q(\cdot )\in C^2(q_0, q_1)\) is an admissible motion of the system if

$$\begin{aligned} \delta {\mathcal {J}}=\delta \int ^{T}_{0}L( q(t), \dot{q}(t))\, dt=0\, , \end{aligned}$$

for all variations such that \(\delta q (t)\in {\mathcal {D}}_{q(t)}\), \(0\le t\le T\), with \(\delta q(0)=\delta q(T)=0\). The velocity of the curve itself must also satisfy the constraints \({\dot{q}}(t)\in {\mathcal {D}}_{q(t)}\). From the Lagrange-d’Alembert principle, we arrive at the well-known nonholonomic equations

$$\begin{aligned}&\frac{d}{dt}\left( \frac{\partial L}{\partial {\dot{q}}^{i}}\right) - \frac{\partial L}{\partial q^{i}}=\lambda _{a}\mu ^{a}_{i}(q) \end{aligned}$$
(11)
$$\begin{aligned}&\mu ^{a}_{i}(q){\dot{q}}^{i}=0, \end{aligned}$$
(12)

for some Lagrange multipliers \(\lambda _{a}\), which may be determined with the help of the constraint equations.

In more geometric terms, Eqs. (11) and (12) are the differential equations for a SODE \(\Gamma _{nh}\) on \({\mathcal {D}}\) satisfying the equations

$$\begin{aligned}&i_{\Gamma _{nh}}\omega _L-dE_L\in \Gamma (F^o), \end{aligned}$$
(13)
$$\begin{aligned}&\Gamma _{nh} \in {\mathfrak {X}}({\mathcal {D}}), \end{aligned}$$
(14)

where \(F^{o}=S^*((T{\mathcal {D}})^{o})\) is the annihilator of a distribution F on TQ defined along \({\mathcal {D}}\) and \(\Gamma (F^{o})\) is the space of sections of \(F^{o}\). The fact that \(\Gamma _{nh}\) is a SODE along \({\mathcal {D}}\) means that

$$\begin{aligned} (T_v\tau )(\Gamma _{nh}(v)) = v, \; \; \; \forall v \in {\mathcal {D}}, \end{aligned}$$

with \(\tau : {\mathcal {D}} \rightarrow Q\) the vector bundle projection. Thus, the integral curves of \(\Gamma _{nh}\) are tangent lifts of curves on Q, the trajectories of \(\Gamma _{nh}\).

The nonholonomic system is said to be regular if the following compatibility condition is satisfied (see [8]):

$$\begin{aligned} {T_{v}{\mathcal {D}}\cap (\sharp )_{v}(F_{v}^{o})=\{0\} \text{ for } \text{ all } v\in {\mathcal {D}}}. \end{aligned}$$

The sharp isomorphism \(\sharp :T^{*}(TQ)\rightarrow T(TQ)\) is the inverse map to the flat isomorphism defined by \(\flat (X)=i_{X}\omega _{L}\). If the nonholonomic system is regular, then Eqs. (13) and (14) have a unique solution denoted by \(\Gamma _{nh}\) whose integral curves satisfy Eqs. (11) and (12).

To each of the one-forms \(\mu ^{a}\in \Gamma ({\mathcal {D}}^o)\) we associate the constraint function \(\Phi ^{a}:TQ\rightarrow {\mathbb {R}}\) defined by \(\Phi ^{a}(v_q)=\langle \mu ^{a}(q),v_q \rangle \) or \(\Phi ^{a}(q, {\dot{q}})= \mu ^{a}_i(q){\dot{q}}^i \). In local coordinates, Eq. (13) may be written as

$$\begin{aligned} i_{\Gamma _{nh}}\omega _L-dE_L=\lambda _{a} S^{*}(d\Phi ^{a})=\lambda _{a} \mu ^{a}_{i} dq^{i}, \end{aligned}$$

for some Lagrange multipliers \(\lambda _{a}\). Therefore, a solution \(\Gamma _{nh}\) of (13) is of the form \(\Gamma _{nh}=\Gamma _{L}+\lambda _{a} Z^{a}\), where \(Z^{a}=\sharp (\mu ^{a}_{i} dq^{i})\). The Lagrange multipliers may be computed by imposing the tangency condition (14), which is equivalent to

$$\begin{aligned} 0=\Gamma _{nh}(\Phi ^{b})=\Gamma _{L}(\Phi ^{b})+\lambda _{a} Z^{a}(\Phi ^{b}), \ \ \text {for} \ b=1,\ldots ,k. \end{aligned}$$
(15)

This equation has a unique solution for the Lagrange multipliers if and only if the matrix \(C=({\mathcal {C}}^{a b})=(Z^{a}(\Phi ^{b}))\) is invertible at all points of \({\mathcal {D}}\), which is equivalent to the compatibility condition (cf. [8]).

Recall from symplectic geometry that \(F^{\bot }=\sharp (F^{o})\) for any distribution F, where \(\bot \) denotes the symplectic complement relative to \(\omega _{L}\). Hence, the compatibility condition also implies the following Whitney sum decomposition

$$\begin{aligned} T(TQ)|_{{\mathcal {D}}}=T{\mathcal {D}}\oplus F^{\bot }, \end{aligned}$$

to which we may associate two complementary projectors \(P:T(TQ)|_{{\mathcal {D}}}\rightarrow T{\mathcal {D}}\) and \(P':T(TQ)|_{{\mathcal {D}}}\rightarrow F^{\bot }\) with coordinate expressions

$$\begin{aligned} P(X)=X-{\mathcal {C}}_{a b}\, d\Phi ^{b}(X)Z^{a}, \quad P'(X)={\mathcal {C}}_{a b}\, d\Phi ^{b}(X)Z^{a}, \end{aligned}$$

where \({\mathcal {C}}_{ab}\) are the entries of the inverse matrix \(C^{-1}\) of C.

Proposition 2.1

The nonholonomic dynamics is given by

$$\begin{aligned} \Gamma _{nh}=P(\Gamma _{L}|_{{\mathcal {D}}}). \end{aligned}$$

Indeed, under all the assumptions we have considered so far, we can compute the Lagrange multipliers to be

$$\begin{aligned} \lambda _{a}=-{\mathcal {C}}_{a b}\Gamma _{L}(\Phi ^{b}), \end{aligned}$$
(16)

from which the result follows. Recall that, under the compatibility condition, the nonholonomic system \((L,{\mathcal {D}})\) is said to be regular. For more details see [4] or [8].
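The step from (15) to (16), and from there to the proposition, can be spelled out as follows. This is a sketch which uses that the matrix \(({\mathcal {C}}^{ab})\) is symmetric, a consequence of the symmetry of \(\text {Hess}(L)\) in the local expression of \(Z^{a}\), so that \({\mathcal {C}}_{ab}\,{\mathcal {C}}^{bc}=\delta _a^c\):

$$\begin{aligned} \lambda _{a}\,{\mathcal {C}}^{a b}&=-\Gamma _{L}(\Phi ^{b}) \quad \Longrightarrow \quad \lambda _{a}=-{\mathcal {C}}_{a b}\,\Gamma _{L}(\Phi ^{b}), \\ \Gamma _{nh}&=\Gamma _{L}+\lambda _{a}Z^{a}=\Gamma _{L}-{\mathcal {C}}_{a b}\, d\Phi ^{b}(\Gamma _{L})\,Z^{a}=P(\Gamma _{L}|_{{\mathcal {D}}}), \end{aligned}$$

where we have used \(\Gamma _{L}(\Phi ^{b})=d\Phi ^{b}(\Gamma _{L})\) and all the expressions are evaluated at points of \({\mathcal {D}}\).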

Remark 2.2

Note that, under the compatibility condition, nonholonomic mechanics can be interpreted as “restricted forced systems”, in the sense that we can define the nonholonomic external force \(F_{nh}:{\mathcal {D}}\rightarrow T^{*}Q\) which turns (11) into forced Euler-Lagrange equations. In coordinates, \(F_{nh}(v_q)=\lambda _{a}(v_q)\mu _{i}^{a}(q)dq^{i}\) where the \(\lambda _a\) are given in expression (16). Moreover, as in the case of forced Lagrangian systems, if q(t) is a curve on Q such that \({\dot{q}}(t)\in {\mathcal {D}}\), then such a curve is a solution of the nonholonomic Eq. (11) if and only if

$$\begin{aligned} X^{C}(L) (q,{\dot{q}})-\frac{d}{dt}\left( X^{V}(L)(q,{\dot{q}})\right) =\langle F_{nh}(q,{\dot{q}}),X\circ q \rangle , \quad \forall \ X\in {\mathfrak {X}}(Q). \end{aligned}$$
(17)

Taking the restriction of the Lagrangian \(L: TQ\rightarrow {\mathbb {R}}\) to \({\mathcal {D}}\) denoted by \(l: {\mathcal {D}}\rightarrow {\mathbb {R}}\) we can construct the nonholonomic Legendre map

$$\begin{aligned} {\mathbb {F}} l: {\mathcal {D}}\longrightarrow {\mathcal {D}}^*\; , \end{aligned}$$

as

$$\begin{aligned} \langle {\mathbb {F}} l (u_q), v_q\rangle =\left. \frac{d}{dt}\right| _{t=0}l(u_q + t v_q) \end{aligned}$$

for \(u_q, v_q\in {\mathcal {D}}\). Under the compatibility assumption, the map \({\mathbb {F}} l\) is a local diffeomorphism and we can transport the vector field \(\Gamma _{nh}\in {\mathfrak {X}}({\mathcal {D}})\) to a vector field \({{\bar{\Gamma }}}_{nh}\in {\mathfrak {X}}({\mathcal {D}}^*)\) which represents the almost-Hamiltonian dynamics on \({\mathcal {D}}^*\) [12, 17].

Example 1

We introduce here an example of a simple nonholonomic system to which we will return throughout the text: the nonholonomic particle. Consider a mechanical system on the configuration manifold \(Q={\mathbb {R}}^3\) defined by the Lagrangian

$$\begin{aligned} L(x,y,z,{\dot{x}},{\dot{y}},{\dot{z}})=\frac{1}{2}({\dot{x}}^2 +{\dot{y}}^2 +{\dot{z}}^2) \end{aligned}$$

and subjected to the nonholonomic constraint \({\dot{z}}-y {\dot{x}}=0\). The one-form \(\mu =dz-y \ dx\) spans the vector subbundle \({\mathcal {D}}^{o}\), which is the annihilator of the distribution

$$\begin{aligned} {\mathcal {D}}=\text {span}\left\{ \frac{\partial }{\partial x}+y \frac{\partial }{\partial z}, \frac{\partial }{\partial y} \right\} . \end{aligned}$$

Then the equations of motion of this system are given by the Lagrange-d’Alembert Eqs. (11) and (12), which in this case read

$$\begin{aligned} {\left\{ \begin{array}{ll} \ddot{x}=- \lambda y \\ \ddot{y}=0 \\ \ddot{z}=\lambda \\ {\dot{z}}-y{\dot{x}}=0 \end{array}\right. } \quad \Rightarrow \quad {\left\{ \begin{array}{ll} \ddot{x}=-y\frac{{\dot{x}}{\dot{y}}}{1+y^2} \\ \ddot{y}=0 \\ \ddot{z}=\frac{{\dot{x}}{\dot{y}}}{1+y^2} \\ {\dot{z}}-y{\dot{x}}=0, \end{array}\right. } \end{aligned}$$
(18)

where the value of \(\lambda \) is computed with the help of the constraints. These equations have an explicit solution given by

$$\begin{aligned} {\left\{ \begin{array}{ll} x_{nh}(t)= \frac{{\dot{x}}_0}{{\dot{y}}_0}\sqrt{y_0^2+1}({{\,\mathrm{arcsinh}\,}}({\dot{y}}_0 t+y_0)-{{\,\mathrm{arcsinh}\,}}(y_0))+x_0\\ y_{nh}(t)={\dot{y}}_0 t+y_0 \\ z_{nh}(t)= \frac{{\dot{x}}_0}{{\dot{y}}_0}\sqrt{y_0^2+1}(\sqrt{({\dot{y}}_0 t+y_0)^2+1}-\sqrt{y_0^2+1})+z_0, \quad \text {if} \ \ {\dot{y}}_0\ne 0, \end{array}\right. } \end{aligned}$$
(19)

or

$$\begin{aligned} {\left\{ \begin{array}{ll} x_{nh}(t)= {\dot{x}}_0 t+x_0\\ y_{nh}(t)=y_0 \\ z_{nh}(t)= y_0 {\dot{x}}_0 t+z_0, \quad \text {if} \ \ {\dot{y}}_0= 0. \end{array}\right. } \end{aligned}$$
(20)
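The explicit solution can be checked against a direct numerical integration of (18). The following is a minimal sketch under stated assumptions: the initial data and the classical Runge-Kutta scheme are hypothetical choices made for the illustration, and the function names are not taken from the original text. It also monitors the constraint \({\dot{z}}-y{\dot{x}}=0\) and the energy, both of which are preserved by the continuous nonholonomic flow.

```python
import math

def rhs(state):
    """Right-hand side of (18) written as a first-order system on TQ."""
    x, y, z, vx, vy, vz = state
    lam = vx * vy / (1.0 + y * y)   # multiplier obtained from the constraint
    return (vx, vy, vz, -lam * y, 0.0, lam)

def rk4_step(state, dt):
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, 0.5 * dt))
    k3 = rhs(shifted(state, k2, 0.5 * dt))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def exact(t, x0, y0, z0, vx0, vy0):
    """Closed-form trajectory: (19) if vy0 != 0, (20) otherwise."""
    if vy0 == 0.0:
        return (vx0 * t + x0, y0, y0 * vx0 * t + z0)
    y = vy0 * t + y0
    c = vx0 / vy0 * math.sqrt(y0 * y0 + 1.0)
    x = c * (math.asinh(y) - math.asinh(y0)) + x0
    z = c * (math.sqrt(y * y + 1.0) - math.sqrt(y0 * y0 + 1.0)) + z0
    return (x, y, z)

def energy(state):
    return 0.5 * (state[3] ** 2 + state[4] ** 2 + state[5] ** 2)

# Hypothetical initial condition lying on D: vz0 = y0 * vx0.
x0, y0, z0, vx0, vy0 = 0.0, 1.0, 0.0, 1.0, 0.5
state = (x0, y0, z0, vx0, vy0, y0 * vx0)
initial_energy = energy(state)

T, n = 2.0, 2000
for _ in range(n):
    state = rk4_step(state, T / n)

position_error = max(abs(a - b)
                     for a, b in zip(state[:3], exact(T, x0, y0, z0, vx0, vy0)))
constraint_residual = abs(state[5] - state[1] * state[3])
energy_drift = abs(energy(state) - initial_energy)
```

For these data the three monitored quantities remain at the level of the integration error, which is consistent with (19) being the solution of (18) and with the constraint and the energy being first integrals of the nonholonomic dynamics in this example.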

3 The nonholonomic exponential map

In this section, we will define the nonholonomic exponential map associated with a regular nonholonomic system \((L, {\mathcal {D}})\) with configuration space Q and dynamical vector field \(\Gamma _{nh} \in \mathfrak {X}({\mathcal {D}})\). We will use an arbitrary SODE extension \(\Gamma \in {{\mathfrak {X}}}(TQ)\) of \(\Gamma _{nh}\) and the exponential map \(\text {exp}_h^{\Gamma }\) for the SODE \(\Gamma \) on TQ at time \(h > 0\). We remark that such SODE extension vector fields always exist (see Appendix 1).

First, we will review the definition of the exponential map associated with \(\Gamma \) (for more details, see [26]).

3.1 Exponential map for SODE vector fields on the tangent bundle

A preliminary convexity result for a SODE \(\Gamma \) may be deduced using the theory of explicit second order differential equations (see [19]).

Theorem 3.1

Let \(\Gamma \) be a SODE in Q and \(q_{0}\) be a point of Q. Then, one may find a sufficiently small positive number \(h_{0}\), a family of tangent vectors of Q at \(q_0\),

$$\begin{aligned} v_{(h, q_0)} \in T_{q_{0}}Q, \; \; \text{ for } 0 < h \le h_0, \end{aligned}$$

and two compact subsets C and \({\bar{C}}\) of Q and TQ, respectively, with \(q_0 \in C\) and \(v_{(h, q_0)} \in {\bar{C}}\), such that there exists a unique trajectory of \(\Gamma \)

$$\begin{aligned} \sigma _{q_0q_0h}: [0, h] \rightarrow C \subseteq Q \end{aligned}$$

satisfying

$$\begin{aligned} \sigma _{q_0q_0h}(0) = q_0, \; \; \; \sigma _{q_0q_0h}(h) = q_0, \end{aligned}$$

and

$$\begin{aligned} {\dot{\sigma }}_{q_0q_0h}(t) \in {\bar{C}}, \; \text{ for } \text{ every } t \in [0, h]. \end{aligned}$$

Proof

Let \(({U}, {\varphi } \equiv (q^i))\) be a local chart on Q such that

$$\begin{aligned} {\varphi }({U}) = B(0; \epsilon ) \; \text{ and } {\varphi }(q_0) = (0, \dots , 0), \end{aligned}$$

where \(B(0; \epsilon )\) is the open ball in \({\mathbb {R}}^n\) with centre the origin and radius \(\epsilon > 0\).

We consider the corresponding local coordinates \((\tau _{Q}^{-1}({U}), {\bar{\varphi }} \equiv (q^i, v^i))\) on TQ. Note that \({\bar{\varphi }}(\tau _{Q}^{-1}({U})) = {\varphi }({U}) \times {\mathbb {R}}^n\). Since \(\Gamma \) is a SODE, we also have that

$$\begin{aligned} \Gamma = \displaystyle {\dot{q}}^i \frac{\partial }{\partial q^i} + \xi ^{i} (q, {\dot{q}}) \frac{\partial }{\partial {\dot{q}}^i}. \end{aligned}$$

Then, the trajectories of \(\Gamma \) in U are the solutions of the system of second order differential equations

$$\begin{aligned} \displaystyle \frac{d^2q^i}{dt^2} = \xi ^i\left( q, \frac{dq}{dt}\right) , \; \; \; \text{ for } \text{ all } i. \end{aligned}$$

Now, if we take

$$\begin{aligned} 0< R< \epsilon \; \; \text{ and } \; \; 0 < R' \end{aligned}$$

then, using that \(\xi ^{i}\) is a real \(C^{\infty }\)-function on \(B(0; \epsilon ) \times {\mathbb {R}}^n\), we deduce that there exist positive constants \(L, L' > 0\) satisfying

$$\begin{aligned} \Vert D_1\xi (q, {\dot{q}})\Vert \le L, \; \; \Vert D_2\xi (q, {\dot{q}})\Vert \le L', \; \; \text{ for } (q, {\dot{q}}) \in \overline{{B}(0; R)} \times \overline{{B}(0; R')}, \end{aligned}$$

where \(\overline{{B}(0; R)}\) and \(\overline{{B}(0; R')}\) are the closed balls in \({\mathbb {R}}^n\) centred at the origin and with radius R and \(R'\), respectively. Thus, from Proposition B.3 (see Appendix 1), it follows that

$$\begin{aligned} \begin{array}{rcl} \Vert \xi ^{i}(q^j_1, {\dot{q}}^j_1) - \xi ^{i}(q^j_2, {\dot{q}}^j_2)\Vert &{}\le &{} \Vert \xi ^{i}(q^j_1, {\dot{q}}^j_1) - \xi ^{i}(q^j_2, {\dot{q}}^j_1)\Vert + \Vert \xi ^{i}(q^j_2, {\dot{q}}^j_1) - \xi ^{i}(q^j_2, {\dot{q}}^j_2)\Vert \\ &{} \le &{} L \Vert q_2 - q_1\Vert + L' \Vert {\dot{q}}_2 - {\dot{q}}_1 \Vert \end{array} \end{aligned}$$

for \((q^j_1, {\dot{q}}^j_1), (q^j_2, {\dot{q}}^j_2) \in B(0; R) \times B(0; R')\).

Moreover, it is clear that there exists a positive constant \(M > 0\) such that

$$\begin{aligned} \Vert \xi (q^j, {\dot{q}}^j) \Vert \le M, \; \; \forall (q, {\dot{q}}) \in \overline{{B}(0; R)} \times \overline{{B}(0; R')}. \end{aligned}$$

Next, we choose a sufficiently small positive number \(h_{0}\) satisfying

$$\begin{aligned} \displaystyle \frac{L h_{0}^2}{8} + \frac{L' h_{0}}{2} < 1, \; \; \frac{Mh_{0}^2}{8} \le R, \; \; \frac{Mh_{0}}{2} \le R'. \end{aligned}$$

Now, if we take \(h\in {\mathbb {R}}\), \(0 < h \le h_0\) and the compact subsets K and \({\bar{K}}\) of Q and TQ, respectively, given by

$$\begin{aligned} K = {\varphi }^{-1}(\overline{{B}(0; R)}), \; \; {\bar{K}} = {\bar{\varphi }}^{-1}(\overline{{B}(0; R)} \times \overline{{B}(0;R')}) \end{aligned}$$

then, using Theorem B.1 (see Appendix 1), we conclude that there exists a unique trajectory \(\sigma _{q_{0}q_{0}}: [0, h] \rightarrow K \subseteq Q\) of \(\Gamma \) such that

$$\begin{aligned} \sigma _{q_{0}q_{0}}(0) = q_0, \; \; \; \sigma _{q_{0}q_{0}}(h) = q_0, \end{aligned}$$

and

$$\begin{aligned} {\dot{\sigma }}_{q_0q_0}(t) \in {\bar{K}}, \; \; \text{ for } t \in [0, h]. \end{aligned}$$

Therefore, if we take \(C = K\), \({\bar{C}} = {\bar{K}}\), \(\sigma _{q_0q_0h} = \sigma _{q_0q_0}\) and \(v_{(h, q_0)} = {\dot{\sigma }}_{q_0q_0}(0)\), we end the proof of the result. \(\square \)

Now, we will denote by \(\Phi ^{\Gamma }\) the flow of the SODE \(\Gamma \)

$$\begin{aligned} \Phi ^{\Gamma }: M^{\Gamma }\subseteq {\mathbb {R}} \times TQ \rightarrow TQ. \end{aligned}$$

Here, \(M^{\Gamma }\) is the open subset of \({\mathbb {R}} \times TQ\) given by

$$\begin{aligned} M^{\Gamma } = \{(t, v) \in {\mathbb {R}} \times TQ \mid \Phi ^{\Gamma }(\cdot , v) \text{ is } \text{ defined } \text{ at } \text{ least } \text{ in } [0, t] \}. \end{aligned}$$

Now, if \(q_{0}\) is a point of Q and \(h \ge 0\), we may consider the open subset \(M^{\Gamma }_{(h, q_{0})}\) of \(T_{q_{0}}Q\) given by

$$\begin{aligned} M^{\Gamma }_{(h,q_{0})} = \{v \in T_{q_{0}}Q \mid (h, v) \in M^{\Gamma } \}. \end{aligned}$$

Note that if \(h > 0\) is sufficiently small then it is clear that \(M^{\Gamma }_{(h,q_{0})} \ne \emptyset \). Moreover, we may introduce the exponential map associated with \(\Gamma \) at \(q_{0}\) for the time h as follows

$$\begin{aligned} \text {exp}^{\Gamma }_{(h, q_{0})}(v) = (\tau _{Q} \circ \Phi ^{\Gamma })(h, v), \; \; \text{ for } v\in M^{\Gamma }_{(h, q_{0})}. \end{aligned}$$
(21)
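As an illustration of definition (21), the exponential map can be approximated by integrating the SODE numerically and keeping only the base point. The following Python sketch is not part of the original construction: it assumes a one-dimensional configuration space, a hypothetical helper `rk4_flow` based on the classical Runge-Kutta scheme, and the harmonic oscillator \(\xi (q, {\dot{q}}) = -q\) as a test case, for which the exponential map is known in closed form.

```python
import math

def rk4_flow(xi, q, v, h, steps=200):
    """Approximate Phi^Gamma(h, (q, v)) for the SODE q'' = xi(q, q') using RK4."""
    dt = h / steps
    for _ in range(steps):
        k1q, k1v = v, xi(q, v)
        k2q, k2v = v + 0.5 * dt * k1v, xi(q + 0.5 * dt * k1q, v + 0.5 * dt * k1v)
        k3q, k3v = v + 0.5 * dt * k2v, xi(q + 0.5 * dt * k2q, v + 0.5 * dt * k2v)
        k4q, k4v = v + dt * k3v, xi(q + dt * k3q, v + dt * k3v)
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return q, v

def exp_map(xi, h, q0, v):
    """exp^Gamma_{(h, q0)}(v) = tau_Q(Phi^Gamma(h, v)): keep only the base point."""
    return rk4_flow(xi, q0, v, h)[0]

# Test case (an assumption of this sketch): harmonic oscillator, xi = -q.
oscillator = lambda q, v: -q
h, q0, v = 0.3, 1.0, 0.5
approx = exp_map(oscillator, h, q0, v)
exact = q0 * math.cos(h) + v * math.sin(h)  # closed-form solution at time h
print(approx, exact)
```

For \(h = 0\) the same function returns \(q_0\) for every v, in agreement with the fact that \(\text {exp}^{\Gamma }_{(0, q_{0})}\) is constant.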

We remark that the map \(\text {exp}^{\Gamma }_{(0, q_{0})}\) is constant. However, we have the following result.

Theorem 3.2

Let \(\Gamma \) be a SODE in Q and \(q_0\) a point in Q. We take a sufficiently small positive real number h and \(v_{(h, q_0)} \in T_{q_0}Q\) as in Theorem 3.1. Then,

$$\begin{aligned} v_{(h, q_0)} \in M^{\Gamma }_{(h, q_0)}, \; \; \mathrm{exp}^{\Gamma }_{(h, q_0)}(v_{(h, q_0)}) = q_0, \end{aligned}$$

and

$$\begin{aligned} T_{v_{(h, q_0)}}{{\mathrm{exp}}}^{\Gamma }_{(h, q_0)}: T_{v_{(h, q_0)}}M^{\Gamma }_{(h, q_0)} \rightarrow T_{q_0}Q \end{aligned}$$

is an isomorphism.

Proof

From Theorem 3.1, it follows that

$$\begin{aligned} v_{(h, q_0)} \in M^{\Gamma }_{(h, q_0)} \; \; \text{ and } \; \; \text {exp}^{\Gamma }_{(h, q_0)}(v_{(h, q_0)}) = q_0. \end{aligned}$$

Moreover, it is clear that the map

$$\begin{aligned} \text {exp}^{\Gamma }_{(h, q_0)}: M^{\Gamma }_{(h, q_0)} \subseteq T_{q_0}Q \rightarrow Q \end{aligned}$$

is smooth.

Next, we will proceed locally. So, we will denote by

$$\begin{aligned} (t, q^{i}, {\dot{q}}^{i}) \rightarrow (x^j(t, q^{i}, {\dot{q}}^{i}), {\dot{x}}^j(t, q^{i}, {\dot{q}}^{i})) \end{aligned}$$

the flow of the SODE \(\Gamma \) given by

$$\begin{aligned} \Gamma (q^{j}, {\dot{q}}^{j}) = {\dot{q}}^{i}\frac{\partial }{\partial q^{i}} + \xi ^{i}(q^{j}, {\dot{q}}^j) \frac{\partial }{\partial {\dot{q}}^{i}}, \end{aligned}$$

so that the second order equations

$$\begin{aligned} \ddot{x}^{i}(t, q^j, {\dot{q}}^j) = \xi ^{i}(x^k(t, q^{j}, {\dot{q}}^{j}), {\dot{x}}^k(t, q^{j}, {\dot{q}}^{j})) \end{aligned}$$
(22)

are satisfied as well as the following initial conditions

$$\begin{aligned} x^{i}(0, q^j, {\dot{q}}^j) = q^{i}, \; \; \; {\dot{x}}^{i}(0, q^{j}, {\dot{q}}^j) = {\dot{q}}^{i}. \end{aligned}$$
(23)

The local expression of the map \(\text {exp}^{\Gamma }_{(h, q_0)}\) is

$$\begin{aligned} {\dot{q}}^{i} \rightarrow \text {exp}^{\Gamma }_{(h, q_0)}({\dot{q}}^{i}) = x(h, q_0, {\dot{q}}^{i}). \end{aligned}$$

Denote by \({\dot{q}}_{0h}\) the tangent vector \(v_{(h, q_0)} \in T_{q_0}Q\). We must prove that the Jacobian matrix of \(\text {exp}^{\Gamma }_{(h, q_0)}\) at \({\dot{q}}_{0h}\)

$$\begin{aligned} (D_{{\dot{q}}} \text {exp}^{\Gamma }_{(h, q_0)})({\dot{q}}_{0h}) = (D_{{\dot{q}}}x)(h, q_0, {\dot{q}}_{0h}) \end{aligned}$$

is non-singular which, by the inverse function theorem, automatically implies that the map \(\text {exp}^{\Gamma }_{(h, q_0)}\) is a diffeomorphism on a local neighbourhood of \({\dot{q}}_{0h}\).

Denote by \(U_{(q_0,{\dot{q}}_{0h})}(t)\) the Jacobian matrix of the smooth map \(\text {exp}^{\Gamma }_{(t, q_0)}\) at \({\dot{q}}_{0h}\), that is,

$$\begin{aligned} U_{(q_0,{\dot{q}}_{0h})}(t) = (D_{{\dot{q}}} \text {exp}^{\Gamma }_{(t, q_0)})({\dot{q}}_{0h}) = (D_{{\dot{q}}}x)(t, q_0, {\dot{q}}_{0h}). \end{aligned}$$

Then from the second order system of Eq. (22), using a standard argument on the differentiability of solutions with respect to initial conditions, we may prove that

$$\begin{aligned} \begin{aligned} {\ddot{U}}_{(q_0, {\dot{q}}_{0h})}(t)&= (D_{q}\xi )(x^{i}(t, q_0, {\dot{q}}_{0h}), {\dot{x}}^{i}(t, q_0, {\dot{q}}_{0h}))U_{(q_0,{\dot{q}}_{0h})}(t) \\&\quad + (D_{{\dot{q}}}\xi )(x^{i}(t, q_0, {\dot{q}}_{0h}), {\dot{x}}^{i}(t, q_0, {\dot{q}}_{0h})){\dot{U}}_{(q_0,{\dot{q}}_{0h})}(t) \end{aligned} \end{aligned}$$

and, in a similar way, using (23) we also deduce that

$$\begin{aligned} U_{(q_0, {\dot{q}}_{0h})}(0) = 0, \; \; \; {\dot{U}}_{(q_0, {\dot{q}}_{0h})}(0) = Id. \end{aligned}$$

So, if we denote by \(B_{(q_0, {\dot{q}}_{0h})}(t)\) and \(F_{(q_0, {\dot{q}}_{0h})}(t)\) the matrices

$$\begin{aligned} (D_{q}\xi )(x^{i}(t, q_0, {\dot{q}}_{0h}), {\dot{x}}^{i}(t, q_0, {\dot{q}}_{0h})) \text{ and } (D_{{\dot{q}}}\xi )(x^{i}(t, q_0, {\dot{q}}_{0h}), {\dot{x}}^{i}(t, q_0, {\dot{q}}_{0h})), \end{aligned}$$

respectively, it follows that

$$\begin{aligned} {\ddot{U}}_{(q_0, {\dot{q}}_{0h})}(t) = B_{(q_0, {\dot{q}}_{0h})}(t) U_{(q_0, {\dot{q}}_{0h})}(t) + F_{(q_0, {\dot{q}}_{0h})}(t) {\dot{U}}_{(q_0, {\dot{q}}_{0h})}(t). \end{aligned}$$

Now, we consider the linear system of second order differential equations

$$\begin{aligned} \ddot{y}(t) = B_{(q_0, {\dot{q}}_{0h})}(t) y(t) + F_{(q_0, {\dot{q}}_{0h})}(t) {\dot{y}}(t). \end{aligned}$$
(24)

Note that \(B_{(q_0,{\dot{q}}_{0h})}\) and \(F_{(q_0, {\dot{q}}_{0h})}\) are \(C^{\infty }\)-matrices, for every sufficiently small positive number h.

So, taking into account that there exists a compact subset \({\bar{C}} \subseteq TQ\) such that \(v_{(h, q_0)} \in {\bar{C}}\) (for every h), using Theorem B.1 and proceeding as in the proof of Theorem 3.1, we conclude that there exists a sufficiently small number \(p_0 > 0\) such that, for all h, the unique solution

$$\begin{aligned} t \rightarrow y_{(q_0,{\dot{q}}_{0h})}(t) \end{aligned}$$

of the system (24) satisfying the boundary conditions

$$\begin{aligned} y_{(q_0,{\dot{q}}_{0h})}(0) = 0, \; \; y_{(q_0,{\dot{q}}_{0h})}(p) = 0, \; \; \text{ with } 0 < p \le p_0, \end{aligned}$$

is the trivial solution.

Thus, from Lemma 3.1, Chapter XII in [19] (see Lemma B.2), we deduce that the matrix

$$\begin{aligned} U_{(q_0, {\dot{q}}_{0h})}(p), \; \; \text{ with } 0 < p \le p_0, \end{aligned}$$

is regular, for every h.

Therefore, it is sufficient to take \(h = p\), with \(0 < p \le p_0\), and the result is proved. \(\square \)
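The key step of the proof, namely that \(U(t)\) solves the linear system (24) with \(U(0) = 0\) and \({\dot{U}}(0) = Id\), can be checked numerically. The sketch below is only an illustration under stated assumptions: a one-dimensional pendulum \(\xi (q, {\dot{q}}) = -\sin q\) (so that \(B(t) = -\cos x(t)\) and \(F(t) = 0\)), an arbitrary initial velocity instead of the specific \(v_{(h, q_0)}\), and a hypothetical helper `rk4`. It compares \(U(h)\) with a finite-difference derivative of the exponential map with respect to the velocity.

```python
import math

def rk4(f, y, h, steps=400):
    """Integrate y' = f(y) over [0, h] with the classical RK4 scheme."""
    dt = h / steps
    for _ in range(steps):
        k1 = f(y)
        k2 = f([a + 0.5 * dt * b for a, b in zip(y, k1)])
        k3 = f([a + 0.5 * dt * b for a, b in zip(y, k2)])
        k4 = f([a + dt * b for a, b in zip(y, k3)])
        y = [a + dt * (p + 2 * q + 2 * r + s) / 6
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
    return y

def pendulum(y):
    """First-order form of x'' = -sin(x)."""
    x, xd = y
    return [xd, -math.sin(x)]

def augmented(y):
    """Pendulum together with the variational equation U'' = B U + F U'."""
    x, xd, U, Ud = y
    B = -math.cos(x)  # D_q xi along the trajectory
    F = 0.0           # D_qdot xi along the trajectory
    return [xd, -math.sin(x), Ud, B * U + F * Ud]

h, q0, v = 0.2, 0.4, -0.7
# Initial data U(0) = 0, U'(0) = 1, as deduced from (23).
U_h = rk4(augmented, [q0, v, 0.0, 1.0], h)[2]

# Central finite difference of v -> exp_{(h, q0)}(v).
eps = 1e-5
fd = (rk4(pendulum, [q0, v + eps], h)[0]
      - rk4(pendulum, [q0, v - eps], h)[0]) / (2 * eps)
print(U_h, fd)
```

In this test case \(U(h)\) is close to h and, in particular, non-zero, which is the one-dimensional counterpart of the regularity statement.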

From Theorem 3.2, we have that there exist open subsets \({{\mathcal {U}}}_0\) and U in \(M^{\Gamma }_{(h, q_0)}\) and Q, respectively, with \(v_{(h, q_0)} \in {{\mathcal {U}}}_0\) and \(q_0 \in U\), such that the map

$$\begin{aligned} \text {exp}^{\Gamma }_{(h, q_0)}: {{\mathcal {U}}}_0 \subseteq M^{\Gamma }_{(h, q_0)} \rightarrow U\subseteq Q \end{aligned}$$

is a diffeomorphism.

Next, we will consider the open subset \(M^{\Gamma }_{h}\) of TQ given by

$$\begin{aligned} M^{\Gamma }_{h} = \{v \in TQ \mid (h, v) \in M^{\Gamma } \}. \end{aligned}$$

Note that

$$\begin{aligned} v \in M^{\Gamma }_{h} \Longrightarrow M^{\Gamma }_{(h, \tau _{Q}(v))} = M^{\Gamma }_{h} \cap T_{\tau _{Q}(v)}Q \subseteq M^{\Gamma }_{h}. \end{aligned}$$

Thus, since \(\tau _Q: TQ \rightarrow Q\) is an open map, it follows that \(\tau _Q(M^{\Gamma }_{h})\) is an open subset of Q and

$$\begin{aligned} M^{\Gamma }_{h} = \bigcup _{q\in \tau _Q(M^{\Gamma }_{h})} M^{\Gamma }_{(h, q)}. \end{aligned}$$

Definition 3.3

The smooth map \(\text {exp}_{h}^{\Gamma }: M^{\Gamma }_{h} \subseteq TQ \rightarrow Q \times Q\) defined as follows

$$\begin{aligned} \text {exp}^{\Gamma }_{h}(v) = (\tau _{Q}(v), \text {exp}^{\Gamma }_{(h, \tau _{Q}(v))}(v)), \; \; \text{ for } v \in M^{\Gamma }_{h}, \end{aligned}$$

is the exponential map associated with the SODE \(\Gamma \) at time h.

Now, we deduce that

Lemma 3.4

Let v be an element of \(M^{\Gamma }_{h}\) such that \(\mathrm{exp}^{\Gamma }_{(h, \tau _{Q}(v))}\) is non-singular at v. Then, \({{\mathrm{exp}}}^{\Gamma }_{h}\) is also non-singular at v.

Proof

We must prove that the map

$$\begin{aligned} T_{v}(\text {exp}^{\Gamma }_{h}): T_v(M^{\Gamma }_{h}) \simeq T_v(TQ) \rightarrow T_{\tau _{Q}(v)}Q \times T_{\text {exp}^{\Gamma }_{(h,\tau _{Q}(v))}(v)}Q \end{aligned}$$

is a linear isomorphism.

Suppose that

$$\begin{aligned} 0 = (T_{v}(\text {exp}_{h}^{\Gamma }))(X_{v}), \; \; \text{ with } X_{v} \in T_{v}(M^{\Gamma }_{h}). \end{aligned}$$

Then, we have that

$$\begin{aligned} 0 = (T_{v}\tau _{Q})(X_{v}) \; \; \text{ and } \; \; 0 = (T_{v}\text {exp}^{\Gamma }_{(h,\tau _{Q}(v))})(X_{v}). \end{aligned}$$

The first condition implies that

$$\begin{aligned} X_{v} \in T_{v}(M^{\Gamma }_{h} \cap T_{\tau _{Q}(v)}Q) = T_{v}(M^{\Gamma }_{(h, \tau _{Q}(v))}) \end{aligned}$$

and thus, using the second one, we conclude that

$$\begin{aligned} X_{v} = 0. \end{aligned}$$

\(\square \)

As we know, if \(h > 0\) is sufficiently small and \(q_{0} \in Q\) then the map \(\text {exp}_{(h, q_{0})}^{\Gamma }: M^{\Gamma }_{(h, q_{0})} \rightarrow Q\) is non-singular at the point \(v_{(h, q_0)} \in M^{\Gamma }_{(h, q_{0})}\). Therefore, using Lemma 3.4, we deduce the following result.

Theorem 3.5

Let \(\Gamma \) be a SODE in TQ and \(q_{0}\) be a point of Q. Then, one may find a sufficiently small positive number h, an open subset \({{\mathcal {U}}}_h \subseteq M^{\Gamma }_{h} \subseteq TQ\), with \(v_{(h, q_0)} \in {{\mathcal {U}}}_h\), and an open subset U of Q, with \(q_{0} \in U\), such that:

  1.

    The map

    $$\begin{aligned} {{\mathrm{exp}}}^{\Gamma }_{h}: {{\mathcal {U}}}_h \subseteq M^{\Gamma }_{h} \rightarrow U \times U \subseteq Q \times Q \end{aligned}$$

    is a diffeomorphism.

  2.

    For every couple \((q, q') \in U \times U\) there exists a unique trajectory of \(\Gamma \)

    $$\begin{aligned} \sigma _{qq'}: [0, h] \rightarrow Q \end{aligned}$$

    satisfying

    $$\begin{aligned} \sigma _{qq'}(0) = q, \; \; \sigma _{qq'}(h) = q' \; \; \text{ and } \; \; {\dot{\sigma }}_{qq'}(0) \in {{\mathcal {U}}}_h. \end{aligned}$$

We will denote by \(R^{e^-}_{h}: U \times U \rightarrow {{\mathcal {U}}}_h\) (respectively, \(R^{e^+}_{h}: U \times U \rightarrow \Phi ^{\Gamma }_{h}({{\mathcal {U}}}_h)\)) the inverse map of the diffeomorphism \(\text {exp}^{\Gamma }_{h}: {{\mathcal {U}}}_h \rightarrow U \times U \) (respectively, \(\text {exp}^{\Gamma }_{h} \circ \Phi ^{\Gamma }_{-h}: \Phi ^{\Gamma }_{h}({{\mathcal {U}}}_h) \rightarrow U \times U\)).

The maps

$$\begin{aligned} R^{e^-}_{h}: U \times U \subseteq Q \times Q \rightarrow {{\mathcal {U}}}_h\subseteq TQ \text{ and } R^{e^+}_{h}: U \times U \subseteq Q \times Q \rightarrow \Phi ^{\Gamma }_{h}({{\mathcal {U}}}_h)\subseteq TQ \end{aligned}$$

are called the exact retraction maps associated with \(\Gamma \). We have that

$$\begin{aligned} R^{e^-}_{h}(q, q') = {\dot{\sigma }}_{qq'}(0), \; \; R^{e^+}_{h}(q, q') = {\dot{\sigma }}_{qq'}(h). \end{aligned}$$

Note that

$$\begin{aligned} R^{e^+}_{h} = \Phi ^{\Gamma }_{h} \circ R^{e^-}_{h}, \end{aligned}$$

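that is, the triangle formed by the maps \(R^{e^-}_{h}: U \times U \rightarrow {{\mathcal {U}}}_h\), \(\Phi ^{\Gamma }_{h}: {{\mathcal {U}}}_h \rightarrow \Phi ^{\Gamma }_{h}({{\mathcal {U}}}_h)\) and \(R^{e^+}_{h}: U \times U \rightarrow \Phi ^{\Gamma }_{h}({{\mathcal {U}}}_h)\) is commutative.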

In [26] (see also [24]), the authors give a generalized version of the previous theorem in the setting of SODE vector fields on Lie algebroids.
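To make the exact retraction maps concrete, one can consider a case in which everything is available in closed form. The sketch below assumes the harmonic oscillator \(\ddot{q} = -q\) on \(Q = {\mathbb {R}}\) and a time step \(0< h < \pi \); this example and all function names are illustrative choices and are not taken from the text. It checks that \(R^{e^-}_{h}\) inverts \(\text {exp}^{\Gamma }_{h}\) and that \(R^{e^+}_{h} = \Phi ^{\Gamma }_{h} \circ R^{e^-}_{h}\).

```python
import math

def flow(h, q, v):
    """Flow Phi^Gamma_h of q'' = -q on TQ = R x R."""
    return (q * math.cos(h) + v * math.sin(h),
            -q * math.sin(h) + v * math.cos(h))

def exp_h(h, q, v):
    """exp^Gamma_h(v) = (tau_Q(v), tau_Q(Phi^Gamma_h(v)))."""
    return q, flow(h, q, v)[0]

def retraction_minus(h, q0, q1):
    """R^{e-}_h(q0, q1): initial velocity of the trajectory joining q0 to q1."""
    return q0, (q1 - q0 * math.cos(h)) / math.sin(h)

def retraction_plus(h, q0, q1):
    """R^{e+}_h = Phi^Gamma_h o R^{e-}_h: final state of that trajectory."""
    return flow(h, *retraction_minus(h, q0, q1))

h, q0, q1 = 0.25, 0.8, 1.1
v_minus = retraction_minus(h, q0, q1)
v_plus = retraction_plus(h, q0, q1)
print(exp_h(h, *v_minus))  # recovers the pair (q0, q1) up to rounding
print(v_plus)              # base point q1 together with the final velocity
```

In this example the velocity \(v_{(h, q_0)}\) of Theorem 3.1 is `retraction_minus(h, q0, q0)[1]`, which equals \(q_0 \tan (h/2)\).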

3.2 Exponential map and the exact discrete submanifold for the nonholonomic dynamics

Let \(L:TQ\rightarrow {\mathbb {R}}\) be a regular Lagrangian function and \({\mathcal {D}}\) a regular distribution on Q such that the non-holonomic system \((L,{\mathcal {D}})\) is also regular, and let \(\Gamma _{nh}\) be the SODE on \({\mathcal {D}}\) which is the solution of the non-holonomic dynamics. Denote by \(\phi _t^{\Gamma _{nh}}: {\mathcal {D}} \rightarrow {\mathcal {D}}\) the flow of \(\Gamma _{nh}\). For h a sufficiently small positive number, we consider the open subset of \({\mathcal {D}}\) given by

$$\begin{aligned} M_{h}^{\Gamma _{nh}}=\{ v\in {\mathcal {D}}\ | \ \phi _{t}^{\Gamma _{nh}}(v) \ \text {is defined for} \ t\in [0,h] \}. \end{aligned}$$

Note that, if \(\Gamma \in {\mathfrak {X}}(TQ)\) is a SODE extension of \(\Gamma _{nh}\) then

$$\begin{aligned} M_{h}^{\Gamma _{nh}}=M_h^{\Gamma } \cap {\mathcal {D}}. \end{aligned}$$

Definition 3.6

The map

$$\begin{aligned} \text {exp}_h^{\Gamma _{nh}}:M_{h}^{\Gamma _{nh}}\subseteq {\mathcal {D}}&\rightarrow Q\times Q \\ v \in {\mathcal {D}}&\mapsto (\tau _Q(v),(\tau _{Q}\circ \phi _h^{\Gamma _{nh}})(v)) \in Q \times Q \end{aligned}$$

is called the nonholonomic exponential map of \(\Gamma _{nh}\) at time h.

Now, we may prove the following result.

Theorem 3.7

Let \((L, {\mathcal {D}})\) be a regular nonholonomic system with configuration space Q and \(q_0\) a point in Q. Then, one may find a sufficiently small positive number h, an open subset \({{\mathcal {U}}}_h^{nh} \subseteq M^{\Gamma _{nh}}_h \subseteq {\mathcal {D}}\) and an open subset \(U \subseteq Q\), with \(q_0 \in U\), such that the map

$$\begin{aligned} {{\mathrm{exp}}}_h^{\Gamma _{nh}}: {{\mathcal {U}}}_h^{nh} \subseteq M^{\Gamma _{nh}}_h \rightarrow U \times U \end{aligned}$$

is an embedding.

Proof

Let \(\Gamma \) be a SODE in TQ such that \(\Gamma _{|{\mathcal {D}}} = \Gamma _{nh}\) (see Appendix 1). Then, using Theorem 3.5, we may find a sufficiently small positive number h, an open subset \({{\mathcal {U}}}_h \subseteq M^{\Gamma }_h \subseteq TQ\), with \(v_{(h, q_0)} \in {{\mathcal {U}}}_h\), and an open subset U of Q, with \(q_0 \in U\), such that the map

$$\begin{aligned} \text {exp}_h^{\Gamma }: {{\mathcal {U}}}_h \subseteq M^{\Gamma }_h \rightarrow U \times U \subseteq Q \times Q \end{aligned}$$

is a diffeomorphism.

Now, since \(\Gamma _{|{\mathcal {D}}} = \Gamma _{nh}\), it is clear that

$$\begin{aligned} (\text {exp}^{\Gamma }_h)_{|{{\mathcal {U}}}_h \cap M^{\Gamma _{nh}}_h} = \text {exp}_h^{\Gamma _{nh}}. \end{aligned}$$

So, if we take the open subset of \({\mathcal {D}}\)

$$\begin{aligned} {{\mathcal {U}}}_h^{nh} = {{\mathcal {U}}}_h \cap M^{\Gamma _{nh}}_h \end{aligned}$$

then, using that every immersion is a local embedding, we can suppose (without loss of generality) that the map \(\text {exp}_h^{\Gamma _{nh}}: {{\mathcal {U}}}_h^{nh} \rightarrow U \times U\) is an embedding. \(\square \)

Now, we may introduce the following definition.

Definition 3.8

The exact discrete nonholonomic constraint submanifold is the submanifold of \(Q\times Q\) given by

$$\begin{aligned} {\mathcal {M}}_h^{e,nh}= \text {exp}_h^{\Gamma _{nh}} ({\mathcal {U}}_h^{nh})\; . \end{aligned}$$

In view of Theorem 3.7, the map

$$\begin{aligned} \text {exp}_h^{\Gamma _{nh}}:{\mathcal {U}}_h^{nh}\rightarrow {\mathcal {M}}_{h}^{e,nh} \end{aligned}$$

is a diffeomorphism and we can define its inverse diffeomorphism, called the nonholonomic exact retraction map

$$\begin{aligned} R_{h,nh}^{e^-}:{\mathcal {M}}_{h}^{e,nh}\longrightarrow {\mathcal {U}}_h^{nh}. \end{aligned}$$

The following are commutative diagrams:

figure a

We will also use the map \(R_{h,nh}^{e^+}:{\mathcal {M}}_{h}^{e,nh}\longrightarrow \phi _{h}^{\Gamma _{nh}}({\mathcal {U}}_h^{nh})\) defined by

figure b

Example 2

Let us return to Example 1, the nonholonomic particle, and identify the different geometric objects involved. The nonholonomic vector field is given by

$$\begin{aligned} \Gamma _{nh}={\dot{x}}\frac{\partial }{\partial x}+{\dot{y}}\frac{\partial }{\partial y}+y{\dot{x}}\frac{\partial }{\partial z}-y\frac{{\dot{x}}{\dot{y}}}{1+y^2}\frac{\partial }{\partial {\dot{x}}}+\frac{{\dot{x}}{\dot{y}}}{1+y^2}\frac{\partial }{\partial {\dot{z}}}. \end{aligned}$$

From (19) and (20), we construct its corresponding flow and nonholonomic exponential map

$$\begin{aligned} \phi _{t}^{\Gamma _{nh}}(x_0,y_0,z_0,{\dot{x}}_0,{\dot{y}}_0,y_0{\dot{x}}_0)= & {} (x_{nh},y_{nh},z_{nh},{\dot{x}}_{nh},{\dot{y}}_{nh},{\dot{z}}_{nh}),\\ \text {exp}_h^{\Gamma _{nh}}(x_0,y_0,z_0,{\dot{x}}_0,{\dot{y}}_0,y_0{\dot{x}}_0)= & {} (x_0,y_0,z_0,x_{nh}(h),y_{nh}(h),z_{nh}(h)). \end{aligned}$$

We see that this is an invertible map, when we restrict the co-domain to its image, and we may explicitly compute the inverse to be

$$\begin{aligned} R_{h,nh}^{e^-}(x_0,y_0,z_0,x_1,y_1,z_1)&= \left( x_0,y_0,z_0, \frac{(x_1-x_0)(y_1-y_0)}{h\sqrt{y_0^2+1}({{\,\mathrm{arcsinh}\,}}(y_1)-{{\,\mathrm{arcsinh}\,}}(y_0))},\right. \\&\quad \left. \frac{y_1-y_0}{h},\frac{y_0(x_1-x_0)(y_1-y_0)}{h\sqrt{y_0^2+1}({{\,\mathrm{arcsinh}\,}}(y_1)-{{\,\mathrm{arcsinh}\,}}(y_0))}\right) , \end{aligned}$$

in the case where \(y_1\ne y_0\). Note that the domain of the map \(R_{h,nh}^{e^-}\) is not the whole of \({\mathbb {R}}^3\times {\mathbb {R}}^3\); it is restricted to \({\mathcal {M}}_{h}^{e,nh}\), which explicitly means that

$$\begin{aligned} \frac{z_1-z_0}{h}-\frac{(x_1-x_0)\left( \sqrt{y_1^2+1}-\sqrt{y_0^2+1} \right) }{h({{\,\mathrm{arcsinh}\,}}(y_1)-{{\,\mathrm{arcsinh}\,}}(y_0))}=0. \end{aligned}$$
(25)

In fact, let the left-hand side of Eq. (25) be denoted by \(\mu _d:Q\times Q\rightarrow {\mathbb {R}}\). It is a constraint function whose zero level set is the discrete space \({\mathcal {M}}_{h}^{e,nh}\).
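The objects of Example 2 can be checked numerically. In the following Python sketch, the flow is obtained by integrating \(\Gamma _{nh}\) directly (using that \({\dot{y}}\) and \({\dot{x}}\sqrt{1+y^2}\) are constant along solutions); we assume, without having Eqs. (19) and (20) at hand here, that this agrees with them. The retraction and the constraint function are transcribed from the formulas above. All function names are hypothetical, and the case \({\dot{y}}_0 \ne 0\) (equivalently \(y_1 \ne y_0\)) is assumed throughout.

```python
import math

def flow_nh(t, x0, y0, z0, xd0, yd0):
    """Flow of Gamma_nh on D (zdot = y * xdot); assumes yd0 != 0."""
    c = xd0 * math.sqrt(1 + y0 ** 2)  # conserved quantity xdot * sqrt(1 + y^2)
    y = y0 + yd0 * t
    x = x0 + c * (math.asinh(y) - math.asinh(y0)) / yd0
    z = z0 + c * (math.sqrt(1 + y ** 2) - math.sqrt(1 + y0 ** 2)) / yd0
    xd = c / math.sqrt(1 + y ** 2)
    return x, y, z, xd, yd0, y * xd

def exp_nh(h, x0, y0, z0, xd0, yd0):
    """Nonholonomic exponential map: initial and final positions."""
    x1, y1, z1, *_ = flow_nh(h, x0, y0, z0, xd0, yd0)
    return x0, y0, z0, x1, y1, z1

def retraction_nh(h, x0, y0, z0, x1, y1, z1):
    """R^{e-}_{h,nh}, valid on the discrete constraint submanifold with y1 != y0."""
    da = math.asinh(y1) - math.asinh(y0)
    xd0 = (x1 - x0) * (y1 - y0) / (h * math.sqrt(y0 ** 2 + 1) * da)
    return x0, y0, z0, xd0, (y1 - y0) / h, y0 * xd0

def mu_d(h, x0, y0, z0, x1, y1, z1):
    """Left-hand side of Eq. (25)."""
    da = math.asinh(y1) - math.asinh(y0)
    ds = math.sqrt(y1 ** 2 + 1) - math.sqrt(y0 ** 2 + 1)
    return (z1 - z0) / h - (x1 - x0) * ds / (h * da)

h = 0.1
v = (0.2, -0.3, 0.5, 1.0, 0.7)  # (x0, y0, z0, xdot0, ydot0), with zdot0 = y0 * xdot0
pair = exp_nh(h, *v)
recovered = retraction_nh(h, *pair)
print(mu_d(h, *pair))  # vanishes up to rounding: the pair lies on the constraint
print(recovered)       # reproduces the initial state, including zdot0
```

A generic pair of points in \({\mathbb {R}}^3\times {\mathbb {R}}^3\) does not satisfy `mu_d == 0`, which is why the retraction is only meaningful on \({\mathcal {M}}_{h}^{e,nh}\).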

4 Lagrangian discrete mechanics and the exact discrete Lagrangian

4.1 Unconstrained discrete mechanics

We will now describe a theory of discrete mechanics on the discretized velocity space \(Q\times Q\) [27]. Discrete mechanics differs from continuous mechanics in the description of motion: a discrete motion is not a curve on the configuration manifold Q, but rather a sequence of points on Q.

We describe a variational discrete theory based on a discretized Hamilton’s principle. From here we see that much of the theory evolves in parallel with the continuous Lagrangian theory. See [27] for the main bibliographic account on the subject.

Let \(L_d:Q\times Q\rightarrow {\mathbb {R}}\) be the discrete Lagrangian function. Let us fix some \(N\in {{\mathbb {N}}}\) (number of steps) and a pair of points \(q_0, q_N\in Q\).

The discrete path space is the space of sequences:

$$\begin{aligned} C_d (q_0,q_N)= \{q_d\equiv \{q_k\}_{k=0}^{N} \ |\; q_k \in Q \hbox { and } q_0, q_N \hbox { fixed} \}. \end{aligned}$$

The discrete action map is defined to be the map \(S_d:C_d (q_0, q_N)\rightarrow {\mathbb {R}}\),

$$\begin{aligned} S_d(q_d)=\sum _{k=0}^{N-1} L_d(q_k,q_{k+1}). \end{aligned}$$
(26)

Note that when one wishes to construct a numerical method using this approach, one usually regards the value of the discrete Lagrangian on a point \((q_0,q_1)\) as being an approximation of the (continuous) action, integrated over a solution connecting the two fixed points \(q_0,q_1\) in a fixed time-step \(h\in {\mathbb {R}}\), i.e.,

$$\begin{aligned} L_d (q_0,q_1)\approx \int _{0}^{h} L(q_{0,1}(t),{\dot{q}}_{0,1}(t)) \ dt, \end{aligned}$$

where \(L:TQ\rightarrow {\mathbb {R}}\) is a regular continuous-time Lagrangian function and \(q_{0,1}(t)\) is the unique solution of the Euler-Lagrange equations connecting \(q_0\) and \(q_1\) (as a consequence of Theorem 3.5).

The discrete Hamilton’s principle states that a solution of the discrete Lagrangian system given by the discrete Lagrangian function \(L_d\) is an extremum for the discrete action map (26) among all sequences of points with fixed end-points. That is, \(q_d\in C_{d}(q_0, q_N)\) is a solution if and only if \(q_d\) is a critical point of the functional \(S_d\), i.e.

$$\begin{aligned} dS_d(q_d)(X_d) = 0, \end{aligned}$$

for all \(X_d \in T_{q_d} C_d(q_0, q_N)\).

Analogously to the continuous-time case, we obtain the discrete Euler-Lagrange equations (DEL equations) as necessary and sufficient conditions for \(q_d\) to be a critical point of \(S_d\):

$$\begin{aligned} D_2 L_d(q_{k-1},q_{k})+D_1 L_d (q_{k},q_{k+1})=0, \ \ \text {for all} \ k=1,\ldots ,N-1, \end{aligned}$$
(27)

where \(D_1L_d(q_{k-1},q_k)\in T^*_{q_{k-1}}Q\) and \(D_2L_d(q_{k-1},q_k)\in T^*_{q_k}Q\) correspond to \(dL_d(q_{k-1},q_k)\) under the identification \(T^*_{(q_{k-1},q_k)}(Q\times Q)\cong T^*_{q_{k-1}}Q\times T^*_{q_k}Q\), that is,

$$\begin{aligned} dL_d(q_{k-1},q_k)=D_1 L_d (q_{k-1},q_{k})+D_2 L_d (q_{k-1},q_{k})\; . \end{aligned}$$

Given a discrete Lagrangian \(L_d: Q\times Q\rightarrow {\mathbb {R}}\) we can define two discrete Legendre transformations \({\mathbb {F}}^{\pm } L_{d}:Q\times Q \rightarrow T^{*}Q\) given by

$$\begin{aligned} {\mathbb {F}}^{+}L_{d}(q_{k-1},q_{k})= & {} (q_{k},D_2L_d(q_{k-1},q_{k})) \, ,\\ {\mathbb {F}}^{-}L_{d}(q_{k-1},q_{k})= & {} (q_{k-1},-D_1L_d(q_{k-1},q_{k})) \, . \end{aligned}$$

We say that \(L_d\) is regular if \({\mathbb {F}}^{+}L_{d}\) (or, equivalently, \({\mathbb {F}}^{-}L_{d}\)) is a local diffeomorphism. This is equivalent to the regularity of the matrix \(D_{12}L_d\).

Under this regularity condition the 2-form on \(Q\times Q\) defined by

$$\begin{aligned} ({\mathbb {F}}^{+}L_{d})^*\omega _Q=({\mathbb {F}}^{-}L_{d})^*\omega _Q=:\Omega _{L_d} \end{aligned}$$

is a symplectic form.

Moreover, if \(L_d\) is regular then we can obtain a well-defined discrete Lagrangian map

$$\begin{aligned} F_{L_d}: Q\times Q\longrightarrow & {} Q \times Q \\ (q_{k-1},q_k)\longmapsto & {} (q_k,q_{k+1}(q_{k-1},q_k)) \, , \end{aligned}$$

which is the discrete dynamical flow of our system. Here \(q_{k+1}\) is the unique solution of the DEL Eq. (27) for the given pair \((q_{k-1},q_k)\). We can easily check the symplecticity of the flow:

$$\begin{aligned} F_{L_d}^*\Omega _{L_d}=\Omega _{L_d}. \end{aligned}$$
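The DEL equations (27) and the discrete Lagrangian map can be illustrated on a simple case. The sketch below assumes the harmonic oscillator \(L = \frac{1}{2}({\dot{q}}^2 - q^2)\) on \(Q = {\mathbb {R}}\) and the midpoint discrete Lagrangian \(L_d(q_0, q_1) = h\, L\big (\tfrac{q_0+q_1}{2}, \tfrac{q_1-q_0}{h}\big )\); both are illustrative choices and not taken from the text. For this quadratic \(L_d\) the DEL equations are linear and can be solved explicitly for \(q_{k+1}\).

```python
def L_d(q0, q1, h):
    """Midpoint discrete Lagrangian for L = (qdot^2 - q^2) / 2."""
    return (q1 - q0) ** 2 / (2 * h) - h * (q0 + q1) ** 2 / 8

def D1(q0, q1, h):
    """Partial derivative of L_d with respect to its first argument."""
    return -(q1 - q0) / h - h * (q0 + q1) / 4

def D2(q0, q1, h):
    """Partial derivative of L_d with respect to its second argument."""
    return (q1 - q0) / h - h * (q0 + q1) / 4

def F_Ld(q_prev, q_cur, h):
    """Discrete Lagrangian map: solve D2(q_prev, q_cur) + D1(q_cur, q_next) = 0."""
    a = 1 / h + h / 4
    q_next = ((2 / h - h / 2) * q_cur - a * q_prev) / a
    return q_cur, q_next

h = 0.1
traj = [1.0, 1.0]  # initial pair (q_0, q_1)
for _ in range(100):
    traj.append(F_Ld(traj[-2], traj[-1], h)[1])

# Residuals of the DEL equations (27) along the computed sequence.
residuals = [D2(traj[k - 1], traj[k], h) + D1(traj[k], traj[k + 1], h)
             for k in range(1, len(traj) - 1)]
print(max(abs(r) for r in residuals), max(abs(q) for q in traj))
```

Here \(D_{12}L_d\) is constant and the linear map \((q_{k-1}, q_k) \mapsto (q_k, q_{k+1})\) has determinant 1, so the preservation of \(\Omega _{L_d}\) can be read off directly; accordingly, the computed sequence stays bounded.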

Alternatively, using the discrete Legendre transformations, we can also define the evolution of the discrete system on the cotangent bundle or Hamiltonian side, \({\widetilde{F}}_{L_d}:T^*Q \longrightarrow T^*Q\), by any of the formulas

$$\begin{aligned} {\widetilde{F}}_{L_d}={\mathbb {F}}^{+}L_{d}\circ ({\mathbb {F}}^{-}L_{d})^{-1}={\mathbb {F}}^{+}L_{d}\circ F_{L_d} \circ ({\mathbb {F}}^{+}L_{d})^{-1}={\mathbb {F}}^{-}L_{d}\circ F_{L_d} \circ ({\mathbb {F}}^{-}L_{d})^{-1} \, , \end{aligned}$$

because of the commutativity of the following diagram:

figure c

The discrete Hamiltonian map \({\widetilde{F}}_{L_d}:(T^*Q,\omega _Q) \longrightarrow (T^*Q,\omega _Q)\) is symplectic where \(\omega _Q\) is the canonical symplectic 2-form on \(T^*Q\).

If we start with a continuous Lagrangian and somehow derive an appropriate discrete Lagrangian, then the DEL equations become a geometric integrator for the continuous Euler-Lagrange system, known as a variational integrator. This method to construct integrators for Lagrangian systems enjoys plenty of nice geometric features such as a symplectic discrete flow and discrete momentum conservation [27].

Hence, given a regular Lagrangian function \(L: TQ \longrightarrow {\mathbb {R}}\), we define a discrete Lagrangian \(L_d\) as an approximation of the action of the continuous Lagrangian. More precisely, for a regular Lagrangian L and appropriate \(h>0\), \(q_0,q_1\in Q\), we can define the exact discrete Lagrangian function \(L_d^{e,h}:Q\times Q\rightarrow {\mathbb {R}}\) giving an exact correspondence between continuous and discrete motions as

$$\begin{aligned} L_d^{e,h}(q_0,q_1)=\int _{0}^{h} L(q_{0,1}(t),{\dot{q}}_{0,1}(t)) \ dt. \end{aligned}$$
(28)

Again, \(q_{0,1}(t)\) is the unique solution of the Euler-Lagrange equations connecting \(q_0\) and \(q_1\) with h small enough. Observe that the solutions of the discrete Euler-Lagrange equations for \(L_d^{e,h}\) lie exactly on the solutions of the Euler-Lagrange equations for L. In fact, in [27], the authors prove the following theorem which gives us the correspondence between discrete and continuous Lagrangian mechanics:

Theorem 4.1

Take a sequence of times \(\{t_k=kh, k=0,\ldots ,N\}\) for a sufficiently small time-step \(h\in {\mathbb {R}}\), a regular Lagrangian L and its corresponding exact discrete Lagrangian function \(L_d^{e,h}\). Let q(t) be a solution of Euler-Lagrange equations for L satisfying the boundary conditions \(q(0)=q_0\) and \(q(t_N)=q_N\). Define a sequence \(\{q_k\}_{k=0}^{N}\) in Q by

$$\begin{aligned} q_k=q(t_k), \ \ \text {for} \ k=0,\ldots ,N. \end{aligned}$$

Then \(\{q_k\}_{k=0}^{N}\) is a solution of the discrete Euler-Lagrange equations for \(L_d^{e,h}\;\).

Conversely, if we let \(\{q_k\}_{k=0}^{N}\) be a solution of the discrete Euler-Lagrange equations for \(L_d^{e,h}\), then the curve \(q:[0,t_N]\rightarrow Q\) defined by

$$\begin{aligned} q(t)=q_{k,k+1}(t), \ \ \text {for} \ t\in [t_k,t_{k+1}], \end{aligned}$$

where \(q_{k,k+1}(t)\) is the unique solution of the Euler-Lagrange equations connecting \(q_k\) and \(q_{k+1}\), is a solution of Euler-Lagrange equations for L on the whole interval \([0,t_N]\).
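The first half of Theorem 4.1 can be verified numerically when the exact discrete Lagrangian is known in closed form. The sketch below assumes the harmonic oscillator \(L = \frac{1}{2}({\dot{q}}^2 - q^2)\) with \(0< h < \pi \), for which the classical action gives \(L_d^{e,h}(q_0, q_1) = \frac{(q_0^2+q_1^2)\cos h - 2 q_0 q_1}{2 \sin h}\); this example and the helper names are illustrative and not taken from the text. It samples a continuous solution at the times \(t_k = kh\) and evaluates the DEL residuals.

```python
import math

def L_exact(q0, q1, h):
    """Exact discrete Lagrangian of the harmonic oscillator (closed form)."""
    return ((q0 ** 2 + q1 ** 2) * math.cos(h) - 2 * q0 * q1) / (2 * math.sin(h))

def D1(q0, q1, h):
    """Partial derivative of L_exact with respect to q0."""
    return (q0 * math.cos(h) - q1) / math.sin(h)

def D2(q0, q1, h):
    """Partial derivative of L_exact with respect to q1."""
    return (q1 * math.cos(h) - q0) / math.sin(h)

def action_quadrature(q0, q1, h, n=200):
    """Simpson approximation of the action (28) along the solution joining q0 to q1."""
    v0 = (q1 - q0 * math.cos(h)) / math.sin(h)
    def lagrangian(t):
        q = q0 * math.cos(t) + v0 * math.sin(t)
        qd = -q0 * math.sin(t) + v0 * math.cos(t)
        return 0.5 * (qd ** 2 - q ** 2)
    dt = h / n
    total = lagrangian(0.0) + lagrangian(h)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * lagrangian(i * dt)
    return total * dt / 3

h, N = 0.2, 25
A, B = 0.7, -0.4
# Samples q_k = q(t_k) of the continuous solution q(t) = A cos t + B sin t.
samples = [A * math.cos(k * h) + B * math.sin(k * h) for k in range(N + 1)]
residuals = [D2(samples[k - 1], samples[k], h) + D1(samples[k], samples[k + 1], h)
             for k in range(1, N)]
print(max(abs(r) for r in residuals))
print(action_quadrature(0.7, 0.9, h), L_exact(0.7, 0.9, h))
```

The residuals vanish up to rounding, and the quadrature of the action agrees with the closed-form expression, consistently with definition (28).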

Following the Hamiltonian formalism, if we have a Hamiltonian problem defined by the Hamiltonian \(H = E_L \circ \left( {\mathbb {F}}L\right) ^{-1}\), then the exact Hamiltonian map \({\widetilde{F}}_{L_d^{e,h}}\) coincides with the Hamiltonian flow \(\phi _{h}^{X_{H}}\) of the continuous Hamiltonian system H for a discrete amount of time h. Now we recall the result of [27] and [32] for a discrete Lagrangian \(L_d:Q\times Q\rightarrow {{\mathbb {R}}}\).

Definition 4.2

Let \(L_{d}:Q\times Q\rightarrow {{\mathbb {R}}}\) be a discrete Lagrangian. We say that \(L_{d}\) is a discretization of order r if there exist an open subset \(U_{1}\subset TQ\) with compact closure and constants \(C_1>0\), \(h_1>0\) so that

$$\begin{aligned} |L_{d}(q(0),q(h))-L_{d}^{e,h}(q(0),q(h))|\le C_{1}h^{r+1} \end{aligned}$$

for all solutions q(t) of the second-order Euler–Lagrange equations with initial conditions \((q_0,{\dot{q}}_0)\in U_1\) and for all \(h\le h_1\).

Following [27, 32], we have the following important result about the order of a variational integrator.

Theorem 4.3

If \({\widetilde{F}}_{L_d}\) is the evolution map of an order r discretization \(L_d:Q\times Q\rightarrow {{\mathbb {R}}}\) of the exact discrete Lagrangian \(L_d^{e,h}:Q\times Q\rightarrow {{\mathbb {R}}}\), then

$$\begin{aligned} {\widetilde{F}}_{L_d}={\widetilde{F}}_{L_{d}^{e,h}}+{\mathcal {O}}(h^{r+1}). \end{aligned}$$

In other words, \({\widetilde{F}}_{L_d}\) gives an integrator of order r for \({\widetilde{F}}_{L_{d}^{e,h}}=\phi _{h}^{X_{H}}\).

This theorem gives us a method to find the order of a symplectic integrator for a mechanical system determined by a regular Lagrangian function \(L: TQ\rightarrow {\mathbb {R}}\). We take a discrete Lagrangian \(L_{d}:Q\times Q\rightarrow {\mathbb {R}}\) as an approximation of \(L_d^{e,h}\) and the order can be calculated by expanding the expression for \(L_d(q(0),q(h))\) in a Taylor series in h and comparing this to the same expansion for the exact discrete Lagrangian. If both series agree up to r terms, then the discrete Lagrangian is of order r (see [22, 27] and references therein).
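Instead of comparing Taylor series by hand, the order in Definition 4.2 can also be estimated numerically by halving the step size. The sketch below assumes the harmonic oscillator \(L = \frac{1}{2}({\dot{q}}^2 - q^2)\), its closed-form exact discrete Lagrangian and the midpoint discrete Lagrangian; these choices and the function names are illustrative and not taken from the text. If the error behaves like \(C h^{r+1}\), the base-2 logarithm of the error ratio approximates \(r+1\).

```python
import math

def L_exact(q0, q1, h):
    """Exact discrete Lagrangian of the harmonic oscillator (0 < h < pi)."""
    return ((q0 ** 2 + q1 ** 2) * math.cos(h) - 2 * q0 * q1) / (2 * math.sin(h))

def L_mid(q0, q1, h):
    """Midpoint discrete Lagrangian h * L((q0 + q1) / 2, (q1 - q0) / h)."""
    return (q1 - q0) ** 2 / (2 * h) - h * (q0 + q1) ** 2 / 8

def error(h, q0=1.0, v0=0.5):
    """|L_d - L_d^e| evaluated on the solution with initial condition (q0, v0)."""
    q1 = q0 * math.cos(h) + v0 * math.sin(h)
    return abs(L_mid(q0, q1, h) - L_exact(q0, q1, h))

h = 0.05
slope = math.log(error(h) / error(h / 2), 2)  # estimates r + 1
r_estimate = slope - 1
print(slope, r_estimate)
```

For the midpoint rule the estimated slope is close to 3, which corresponds to a discretization of order \(r = 2\); a hand expansion gives the leading error term \(\frac{q_0^2 + {\dot{q}}_0^2}{24} h^3\) in this example.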

4.2 Forced discrete mechanics

One of the most important properties of variational integrators is their adaptability to more complex situations, for instance, systems involving forces or constraints (see [27]).

For the case of systems subjected to external forces, given a continuous force \(F: TQ\rightarrow T^*Q\), we introduce the discrete counterpart as two maps \(F_d^{+}: Q\times Q\longrightarrow T^*Q\) and \(F_d^{-}: Q\times Q\longrightarrow T^*Q\) called the discrete force maps. These discrete forces satisfy \(\pi _{Q}\circ F_{d}^{+}=\text {pr}_{2}\) and \(\pi _{Q}\circ F_{d}^{-}=\text {pr}_{1}\), where \(\pi _{Q}\) is the canonical projection of the cotangent bundle, and \(\text {pr}_{1,2}:Q\times Q\longrightarrow Q\) are the canonical projections onto the first and second factors, respectively.

Now, the discrete equations of motion are derived from the discrete Lagrange-d’Alembert principle:

$$\begin{aligned} \delta S_{d}(q_{d})\cdot \delta q_{d}+\sum _{k=1}^{N-1} \left[ F_d^{+}(q_{k-1},q_{k})+F_d^{-}(q_{k},q_{k+1}) \right] \cdot \delta q_{k}=0 \end{aligned}$$
(29)

for all variations \(\delta q_{k}\), with \(\delta q_0=\delta q_N=0\).

The forced discrete Euler-Lagrange equations are given by

$$\begin{aligned} D_2 L_d(q_{k-1},q_{k})+D_1 L_d (q_{k},q_{k+1})+F_d^{+}(q_{k-1},q_{k})+F_d^{-}(q_{k},q_{k+1})=0, \end{aligned}$$
(30)

which implicitly define a discrete forced Lagrangian map \({F}_{L^f_d}: Q\times Q\rightarrow Q\times Q\).
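As an illustration of Eq. (30), consider a damped harmonic oscillator. The sketch below assumes \(L = \frac{1}{2}({\dot{q}}^2 - q^2)\), the continuous force \(F = -c\,{\dot{q}}\), the midpoint discrete Lagrangian, and the discrete forces \(F_d^{\pm }(q_0, q_1) = -\frac{c}{2}(q_1 - q_0)\); these are one possible choice made for this example only and are not taken from the text. With these choices Eq. (30) is linear in \(q_{k+1}\) and can be solved explicitly.

```python
def D1(q0, q1, h):
    """D_1 of the midpoint discrete Lagrangian for L = (qdot^2 - q^2) / 2."""
    return -(q1 - q0) / h - h * (q0 + q1) / 4

def D2(q0, q1, h):
    """D_2 of the midpoint discrete Lagrangian."""
    return (q1 - q0) / h - h * (q0 + q1) / 4

def F_plus(q0, q1, c):
    """Assumed discrete force F_d^+ (half of h * F at the midpoint velocity)."""
    return -0.5 * c * (q1 - q0)

def F_minus(q0, q1, c):
    """Assumed discrete force F_d^-, chosen equal to F_d^+ in this sketch."""
    return -0.5 * c * (q1 - q0)

def forced_step(q_prev, q_cur, h, c):
    """Solve the forced discrete Euler-Lagrange equation (30) for q_next."""
    a = 1 / h + h / 4 + c / 2
    return ((2 / h - h / 2) * q_cur + (c / 2 - 1 / h - h / 4) * q_prev) / a

def energy(q0, q1, h):
    """Midpoint estimate of the energy (qdot^2 + q^2) / 2 on the pair (q0, q1)."""
    return 0.5 * ((q1 - q0) / h) ** 2 + 0.5 * ((q0 + q1) / 2) ** 2

h, c = 0.05, 0.5
traj = [1.0, 1.0]
for _ in range(200):
    traj.append(forced_step(traj[-2], traj[-1], h, c))

residuals = [D2(traj[k - 1], traj[k], h) + D1(traj[k], traj[k + 1], h)
             + F_plus(traj[k - 1], traj[k], c) + F_minus(traj[k], traj[k + 1], c)
             for k in range(1, len(traj) - 1)]
print(max(abs(r) for r in residuals))
print(energy(traj[0], traj[1], h), energy(traj[-2], traj[-1], h))
```

The residuals of Eq. (30) vanish up to rounding, and the energy estimate decays along the discrete trajectory, as expected for a dissipative force.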

As in the unforced case, we can define the corresponding discrete Legendre transformations \({\mathbb {F}}^{f\pm }L_{d}:Q\times Q \rightarrow T^{*}Q\) given by

$$\begin{aligned} {\mathbb {F}}^{f+}L_{d}(q_{k-1},q_{k})= & {} (q_{k},D_2L_d(q_{k-1},q_{k})+F_d^+(q_{k-1}, q_k)) \, ,\\ {\mathbb {F}}^{f-}L_{d}(q_{k-1},q_{k})= & {} (q_{k-1},-D_1L_d(q_{k-1},q_{k})-F_d^-(q_{k-1}, q_k)) \, . \end{aligned}$$

If the discrete forced system is regular, that is, the discrete Legendre transformations \({\mathbb {F}}^{f\pm }L_{d}\) are local diffeomorphisms, then we have an explicit discrete forced Lagrangian map \({F}_{L^f_d}\) which is a local diffeomorphism. In addition, we may consider the discrete forced Hamiltonian map \({\widetilde{F}}_{L^f_d}: T^*Q\rightarrow T^*Q\)

$$\begin{aligned} {\widetilde{F}}_{L^f_d}={\mathbb {F}}^{f\pm }L_{d}\circ {F}_{L^f_d}\circ \left( {\mathbb {F}}^{f\pm }L_{d}\right) ^{-1}\; . \end{aligned}$$

Now suppose that (L, F) is a forced continuous Lagrangian system with regular Lagrangian function \(L:TQ\rightarrow {\mathbb {R}}\) and an external force \(F:TQ\rightarrow T^{*}Q\). Then, as we know (see Sect. 2.2), the dynamical vector field is a SODE \(\Gamma _{(L,F)}\) on TQ which is characterized by condition (8).

We will denote by

$$\begin{aligned} \text {exp}_{h}^{\Gamma _{(L,F)}}:{\mathcal {U}}_{h}\subseteq TQ\rightarrow Q \times Q \end{aligned}$$

the exponential map associated with \(\Gamma _{(L,F)}\) for a sufficiently small positive number h. This map is a local diffeomorphism and so we may consider the exact retraction associated to it, which is its inverse map \(R_{h,F}^{e-}\).

Using the flow \(\phi _{h}^{\Gamma _{(L,F)}}\) of \(\Gamma _{(L,F)}\) and the associated exact retraction we may introduce the forced exact discrete Lagrangian function \(L_{d,F}^{e,h}:Q \times Q\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} L_{d,F}^{e,h}(q_0, q_1)=\int _{0}^{h} \left( L\circ \phi _{t}^{\Gamma _{(L,F)}} \circ R_{h,F}^{e-}\right) (q_{0},q_{1}) \ dt, \end{aligned}$$

and the double exact discrete force \(F_{d}^{e,h}:Q \times Q\rightarrow T^{*}(Q\times Q)\) defined by

$$\begin{aligned} \langle F_d^{e,h}(q_0, q_1), (X_{q_0}, X_{q_1})\rangle =\int ^h_0 \left\langle \left( F\circ \phi _{t}^{\Gamma _{(L,F)}} \circ R_{h,F}^{e-}\right) (q_{0},q_{1}), X_{0,1}(t)\right\rangle \; dt \end{aligned}$$

where \(X_{0,1}(t)=T_{(q_0, q_1)}(\tau _Q\circ \phi _t^{\Gamma _{(L,F)}}\circ R^{e-}_{h,F})(X_{q_0}, X_{q_1})\), for \((X_{q_0}, X_{q_1})\in T_{q_{0}}Q\times T_{q_{1}}Q\).

Then, the exact discrete force maps are just \(F_d^{e,+}: Q\times Q\rightarrow T^*Q\) and \(F_d^{e,-}: Q\times Q\rightarrow T^*Q\) given by

$$\begin{aligned} \langle F_d^{e,+}(q_0, q_1), X_{q_{1}}\rangle= & {} \langle F_d^{e,h}(q_0, q_1), (0_{q_0}, X_{q_{1}})\rangle , \\ \langle F_d^{e,-}(q_0, q_1), X_{q_{0}}\rangle= & {} \langle F_d^{e,h}(q_0, q_1), (X_{q_{0}}, 0_{q_1})\rangle . \end{aligned}$$

Note that, if we denote by \(q:Q\times Q\times [0,h]\rightarrow Q\) the function defined by

$$\begin{aligned} q(q_{0},q_{1},t)=q_{0,1}(t), \end{aligned}$$

where \(q_{0,1}:[0,h]\rightarrow Q\) is the solution of the forced Lagrangian system satisfying \(q_{0,1}(0)=q_{0}\) and \(q_{0,1}(h)=q_{1}\), then it is clear that

$$\begin{aligned} q_{0,1}(t)=\left( \tau _{Q}\circ \phi _{t}^{\Gamma _{(L,F)}} \circ R_{h,F}^{e-}\right) (q_{0},q_{1}). \end{aligned}$$

So, with this notation, the maps \(L_{d,F}^{e,h}\), \(F_d^{e,+}\) and \(F_d^{e,-}\) may be written as follows

$$\begin{aligned} L_{d,F}^{e,h}(q_0,q_1)= & {} \int _{0}^{h} L(q_{0,1}(t),{\dot{q}}_{0,1}(t)) \ dt,\\ F_d^{e,+}(q_0,q_1)= & {} \int _{0}^{h} \left\langle F(q_{0,1}(t),{\dot{q}}_{0,1}(t)),\frac{\partial q_{0,1}}{\partial q_{1}} \right\rangle \ dt \end{aligned}$$

and

$$\begin{aligned} F_d^{e,-}(q_0,q_1)=\int _{0}^{h} \left\langle F(q_{0,1}(t),{\dot{q}}_{0,1}(t)),\frac{\partial q_{0,1}}{\partial q_{0}} \right\rangle \ dt, \end{aligned}$$

where

$$\begin{aligned} \frac{\partial q_{0,1}}{\partial q_{1}}:T_{q_{1}}Q\rightarrow T_{q_{0,1}(t)}Q, \quad \text {and} \quad \frac{\partial q_{0,1}}{\partial q_{0}}:T_{q_{0}}Q\rightarrow T_{q_{0,1}(t)}Q \end{aligned}$$

are given by

$$\begin{aligned} \left\langle \frac{\partial q_{0,1}}{\partial q_{1}},X_{q_{1}} \right\rangle =T_{(q_{0},q_{1},t)}q (0_{q_{0}},X_{q_{1}},0_{t}), \quad \left\langle \frac{\partial q_{0,1}}{\partial q_{0}},X_{q_{0}} \right\rangle =T_{(q_{0},q_{1},t)}q (X_{q_{0}},0_{q_{1}},0_{t}), \end{aligned}$$

for \(X_{q_{0}}\in T_{q_{0}}Q\) and \(X_{q_{1}}\in T_{q_{1}}Q\).

Using the previous definitions, one may prove a forced version of Theorem 4.1 (cf. [27]). Moreover, in [7], the authors give a forced version of Theorem 4.3 using the variational order of the corresponding duplicated system.

In fact, we will need a useful Lemma from [27] in Sect. 5.

Lemma 4.4

Let \((Q, L, F)\) be a forced Lagrangian problem with regular Lagrangian function L. The corresponding exact discrete Legendre transformations satisfy

  1. \({\mathbb {F}}^{f+}L_{d,F}^{e,h}(q_{0},q_{1})={\mathbb {F}}L (q_{0,1}(h),{\dot{q}}_{0,1}(h))\);

  2. \({\mathbb {F}}^{f-}L_{d,F}^{e,h}(q_{0},q_{1})={\mathbb {F}}L (q_{0,1}(0),{\dot{q}}_{0,1}(0))\);

where \(q_{0,1}(t)\) is the solution of the forced Euler-Lagrange equations verifying \(q_{0,1}(0)=q_{0}\) and \(q_{0,1}(h)=q_{1}\).
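
Lemma 4.4 can be checked numerically on a simple example. The following Python sketch uses a one-dimensional system that is not taken from the text: \(L=\frac{1}{2}{\dot{q}}^{2}\) with linear damping \(F=-c\,{\dot{q}}\,dq\), whose boundary value problem has the closed-form solution \(q_{0,1}(t)=q_{0}+(q_{1}-q_{0})(1-e^{-ct})/(1-e^{-ch})\). It assumes the usual forced discrete Legendre transformations \({\mathbb {F}}^{f+}L_{d}=D_{2}L_{d}+F_{d}^{+}\) and \({\mathbb {F}}^{f-}L_{d}=-D_{1}L_{d}-F_{d}^{-}\) of [27]; all function names and the value of c are illustrative.

```python
import math

C = 0.7  # damping coefficient (illustrative choice, not from the text)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3


def weight(t, h):
    """Derivative of q_{0,1}(t) with respect to q_1 for the damped particle."""
    return (1 - math.exp(-C * t)) / (1 - math.exp(-C * h))


def velocity(q0, q1, t, h):
    """Velocity at time t of the solution joining q0 (t=0) to q1 (t=h)."""
    return (q1 - q0) * C * math.exp(-C * t) / (1 - math.exp(-C * h))


def exact_Ld(q0, q1, h):
    """Exact discrete Lagrangian: integral of L along the solution."""
    return simpson(lambda t: 0.5 * velocity(q0, q1, t, h) ** 2, 0.0, h)


def exact_force_plus(q0, q1, h):
    """Exact discrete force F_d^{e,+}: F paired with dq_{0,1}/dq_1."""
    return simpson(lambda t: -C * velocity(q0, q1, t, h) * weight(t, h), 0.0, h)


def exact_force_minus(q0, q1, h):
    """Exact discrete force F_d^{e,-}: F paired with dq_{0,1}/dq_0 = 1 - weight."""
    return simpson(
        lambda t: -C * velocity(q0, q1, t, h) * (1 - weight(t, h)), 0.0, h
    )


def legendre_plus(q0, q1, h, eps=1e-5):
    """D_2 L_d + F_d^+, with D_2 approximated by a central difference."""
    d2 = (exact_Ld(q0, q1 + eps, h) - exact_Ld(q0, q1 - eps, h)) / (2 * eps)
    return d2 + exact_force_plus(q0, q1, h)


def legendre_minus(q0, q1, h, eps=1e-5):
    """-D_1 L_d - F_d^-, with D_1 approximated by a central difference."""
    d1 = (exact_Ld(q0 + eps, q1, h) - exact_Ld(q0 - eps, q1, h)) / (2 * eps)
    return -d1 - exact_force_minus(q0, q1, h)
```

Since \({\mathbb {F}}L(q,{\dot{q}})={\dot{q}}\,dq\) for this Lagrangian, the lemma predicts that `legendre_plus(q0, q1, h)` agrees with `velocity(q0, q1, h, h)` and `legendre_minus(q0, q1, h)` with `velocity(q0, q1, 0.0, h)`, up to quadrature and finite-difference error.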

5 Discrete nonholonomic mechanics

In this section, we introduce a modification of the discrete Lagrange-d’Alembert principle. Later, using the construction of the nonholonomic exponential map in Sect. 3, we will define the exact discrete version of nonholonomic mechanics and show that it satisfies the discrete modified Lagrange-d’Alembert principle.

5.1 Discrete modified Lagrange-d’Alembert principle

Let \({\mathcal {D}}\) be a distribution on the manifold Q. Let \(L_{d}:Q\times Q\longrightarrow {\mathbb {R}}\) be a discrete Lagrangian function, \(F_d^{\pm }:Q\times Q\longrightarrow T^*Q\) discrete forces and \({\mathcal {M}}_{d}\subseteq Q\times Q\) a discrete constraint space. We remark that \(\pi _{Q}\circ F_{d}^{+}=\text {pr}_{2}\) and \(\pi _{Q}\circ F_{d}^{-}=\text {pr}_{1}\), where \(\pi _{Q}:T^{*}Q\rightarrow Q\) and \(\text {pr}_{1,2}:Q\times Q\rightarrow Q\) are the canonical projections.

Definition 5.1

A sequence \((q_{0},\ldots , q_{N})\) in Q satisfies the modified Lagrange-d’Alembert principle if

$$\begin{aligned} \begin{aligned}&\delta S_{d}(q_{d})\cdot \delta q_{d}+\sum _{k=1}^{N-1} \left[ F^{+}(q_{k-1},q_{k})+F^{-}(q_{k},q_{k+1}) \right] \cdot \delta q_{k}=0 \\&(q_{k},q_{k+1})\in {\mathcal {M}}_{d}, \ 0\leqslant k \leqslant N-1 \end{aligned} \end{aligned}$$
(31)

for all variations lying in the distribution \(\delta q_{k}\in {\mathcal {D}}_{q_k}\), \(\delta q_{d}=(\delta q_{0},\ldots ,\delta q_{N})\in T_{q_{d}}{\mathcal {C}}_{d}(q_{0},q_{N})\) and \(\delta q_0=\delta q_N=0\).

Remark 5.2

Observe that this principle coincides with the discrete Lagrange-d’Alembert principle for forced systems when \({\mathcal {D}}=TQ\) and \({\mathcal {M}}_{d}=Q\times Q\). It also reduces to the discrete Lagrange-d’Alembert principle for nonholonomic systems introduced in [6] when \(F^+=F^-=0\). The methods proposed in [11], which use a discretization of the forces for a nonholonomic system and a discrete submanifold derived from the continuous constraints and the forced discrete Legendre transformations, also fit in this context. Recently, a similar principle was introduced in [31] to study discretizations of Dirac mechanics.

Now, as in the case of forced systems, we have that

Proposition 5.3

A sequence \((q_{0},\ldots , q_{N})\) in Q satisfies the modified Lagrange-d’Alembert principle if and only if it satisfies the modified Lagrange-d’Alembert equations

$$\begin{aligned}&D_2 L_d(q_{k-1},q_{k})+D_1 L_d (q_{k},q_{k+1})+F^{+}(q_{k-1},q_{k})+F^{-}(q_{k},q_{k+1}) \in {\mathcal {D}}^{o}_{q_k} \nonumber \\&\omega ^{a}(q_{k},q_{k+1})=0, \ 0\leqslant k \leqslant N-1, \end{aligned}$$
(32)

where \({\mathcal {M}}_d\) is determined by the zeros of a set of constraint functions \(\omega ^{a}:Q\times Q\longrightarrow {\mathbb {R}}\).
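
The equivalence can be checked directly. The following is a sketch, assuming as in [27] that the discrete action is \(S_{d}(q_{d})=\sum _{k=0}^{N-1}L_{d}(q_{k},q_{k+1})\). Since \(\delta q_0=\delta q_N=0\), regrouping the terms of \(\delta S_{d}\) by \(\delta q_{k}\) gives

$$\begin{aligned} \delta S_{d}(q_{d})\cdot \delta q_{d}=\sum _{k=1}^{N-1}\left[ D_2 L_d(q_{k-1},q_{k})+D_1 L_d (q_{k},q_{k+1}) \right] \cdot \delta q_{k}, \end{aligned}$$

so that the first line of (31) becomes

$$\begin{aligned} \sum _{k=1}^{N-1}\left[ D_2 L_d(q_{k-1},q_{k})+D_1 L_d (q_{k},q_{k+1})+F^{+}(q_{k-1},q_{k})+F^{-}(q_{k},q_{k+1}) \right] \cdot \delta q_{k}=0. \end{aligned}$$

As each \(\delta q_{k}\in {\mathcal {D}}_{q_k}\), \(1\leqslant k\leqslant N-1\), can be chosen independently, every bracket must vanish on \({\mathcal {D}}_{q_k}\), that is, it belongs to the annihilator \({\mathcal {D}}^{o}_{q_k}\), which is the first line of (32).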

5.2 Exact discrete nonholonomic flow

If we denote the inclusion of \({\mathcal {D}}\) in TQ by \(i_{{\mathcal {D}}}: {\mathcal {D}}\hookrightarrow TQ\), we induce the dual projection \(i_{\mathcal {D}}^*: T^*Q\rightarrow {\mathcal {D}}^*\) defined by

$$\begin{aligned} \langle i_{{\mathcal {D}}}^*(\mu _q), v_q\rangle =\langle \mu _q, i_{{\mathcal {D}}}(v_q)\rangle , \quad \mu _q\in T^*_qQ, \ v_q\in {\mathcal {D}}_q. \end{aligned}$$

The Legendre transformations of the Lagrangian functions \(L:TQ\rightarrow {\mathbb {R}}\) and \(l=L|_{{\mathcal {D}}}: {\mathcal {D}}\rightarrow {\mathbb {R}}\) satisfy the following relation

$$\begin{aligned} i_{{\mathcal {D}}}^*\circ {\mathbb {F}}L\circ i_{{\mathcal {D}}}={\mathbb {F}}l, \end{aligned}$$
(33)

where \({\mathbb {F}}l: {\mathcal {D}}\rightarrow {\mathcal {D}}^*\) is the restricted Legendre transformation defined from l (see Sect. 2.3).
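
Relation (33) can be made concrete with a small Python sketch for the nonholonomic particle, assuming \(L=\frac{1}{2}({\dot{x}}^{2}+{\dot{y}}^{2}+{\dot{z}}^{2})\) and the frame \(e_{1}=\frac{\partial }{\partial x}+y\frac{\partial }{\partial z}\), \(e_{2}=\frac{\partial }{\partial y}\) of \({\mathcal {D}}\). The function names are hypothetical and chosen only for this illustration.

```python
def include(q, a, b):
    """i_D: components (a, b) in the frame e_1, e_2 -> vector in T_q Q."""
    x, y, z = q
    return (a, b, y * a)


def legendre_L(q, v):
    """FL for L = (vx^2 + vy^2 + vz^2)/2: the momentum equals the velocity."""
    return tuple(v)


def project(q, p):
    """i_D^*: covector (px, py, pz) -> components in the dual frame e^1, e^2."""
    x, y, z = q
    px, py, pz = p
    return (px + y * pz, py)


def legendre_l(q, a, b):
    """Fl computed directly from l(a, b) = ((1 + y^2) a^2 + b^2)/2."""
    x, y, z = q
    return ((1 + y * y) * a, b)
```

For any point `q` and components `(a, b)`, `project(q, legendre_L(q, include(q, a, b)))` agrees with `legendre_l(q, a, b)` up to rounding. The covector \(dz-y\,dx\), which vanishes on \({\mathcal {D}}\), is sent to zero by `project`.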

Now consider the exact discrete nonholonomic Legendre transformations \({\mathbb {F}}^{\pm }_{h,nh} l: {\mathcal {M}}_{h}^{e,nh}\rightarrow {\mathcal {D}}^*\) defined by

$$\begin{aligned} {\mathbb {F}}^{-}_{h,nh} l (q_0, q_1)= & {} {\mathbb {F}}l\circ R_{h,nh}^{e-}(q_0,q_1)\in {\mathcal {D}}^*_{q_0}\\ {\mathbb {F}}^{+}_{h,nh} l (q_0, q_1)= & {} {\mathbb {F}}l \circ R_{h,nh}^{e+}(q_0,q_1)\in {\mathcal {D}}^*_{q_1}. \end{aligned}$$

Note that \({\mathbb {F}}^{\pm }_{h,nh} l\) are (local) diffeomorphisms.

As we will see below, the condition of momentum matching gives the exact discrete nonholonomic equations:

$$\begin{aligned} \begin{aligned} {\mathbb {F}}^{+}_{h, nh} l (q_0, q_1)-{\mathbb {F}}^{-}_{h,nh} l (q_1, q_2)&= 0 \\ (q_0, q_1), (q_{1},q_{2})&\in {\mathcal {M}}_{h}^{e,nh}. \end{aligned} \end{aligned}$$
(34)

We will see in Theorem 5.5 below why they are called "exact".

Remark 5.4

Alternatively we can define the subset

$$\begin{aligned} S_{nh}^e = \{ ({\mathbb {F}}l\circ R_{h, nh}^{e-}(q_0,q_1), {\mathbb {F}}l\circ R_{h, nh}^{e+}(q_0,q_1)) \ | \ (q_0, q_1)\in {\mathcal {M}}_{h}^{e,nh}\} \end{aligned}$$

and we can think of \(S_{nh}^e\subset {\mathcal {D}}^*\times {\mathcal {D}}^*\) as an implicit difference equation [21] producing the exact discrete nonholonomic dynamics.

Observe that, since both \(R_{h,nh}^{e-}\) and \({\mathbb {F}}l\) are local diffeomorphisms, Eq. (34) implicitly defines an exact discrete flow \(\Phi _{h,nh}^e:{\mathcal {M}}_{h}^{e,nh}\rightarrow {\mathcal {M}}_{h}^{e,nh}\) by

$$\begin{aligned} \Phi _{h,nh}^e (q_0, q_1)= \text {exp}_{h}^{\Gamma _{nh}}\circ R_{h,nh}^{e+}(q_0,q_1). \end{aligned}$$
(35)

Moreover, it produces a well-defined flow on \({\mathcal {D}}^*\), denoted by \(\varphi ^e_{h,nh}: {\mathcal {D}}^*\rightarrow {\mathcal {D}}^*\), which is defined by

$$\begin{aligned} \varphi _{h,nh}^e(\mu _{q_0})= {\mathbb {F}}^{+}_{h,nh} l \circ ( {\mathbb {F}}^{-}_{h,nh} l)^{-1}(\mu _{q_0}), \quad \mu _{q_0}\in {\mathcal {D}}^*_{q_0}. \end{aligned}$$

The interplay between both discrete flows and the nonholonomic Legendre transformations may be summarized in the commutative diagram in Fig. 1.

Fig. 1. Commutative diagram. Exact discrete and continuous nonholonomic flows

Having the construction of nonholonomic integrators in mind, it is interesting to observe that the exact discrete nonholonomic dynamics exactly reproduces the continuous flow of the nonholonomic system at any step h.

Theorem 5.5

Given \((q_{0},q_{1})\in {\mathcal {M}}_{h}^{e,nh}\) and \(h>0\), consider the sequence \((q_{0},q_{1},\ldots ,q_{N})\) obtained by multiple iterations of the exact discrete flow \(\Phi ^e_{h,nh}\) and thus, by definition, satisfying the exact discrete nonholonomic equations

$$\begin{aligned} {\mathbb {F}}^{+}_{h,nh} l (q_{k-1}, q_{k})-{\mathbb {F}}^{-}_{h,nh} l (q_{k}, q_{k+1})=0, \quad (q_{k},q_{k+1})\in {\mathcal {M}}_{h}^{e,nh}, \end{aligned}$$
(36)

for \(0\leqslant k \leqslant N-1\).

Then, we have that the sequence \((q_{0},q_{1},\ldots ,q_{N})\) exactly matches the trajectories of \(\Gamma _{nh}\) in the sense that

$$\begin{aligned} q_{k}=q_{0,1}(kh), \end{aligned}$$
(37)

where \(q_{0,1}\) is the unique trajectory of \(\Gamma _{nh}\) satisfying \(q_{0,1}(0)=q_{0}\) and \(q_{0,1}(h)=q_{1}\).

Proof

The theorem is a direct consequence of the definition of the exact discrete flow in (35). \(\square \)
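
Theorem 5.5 can be illustrated on the nonholonomic particle, for which the continuous flow is explicitly integrable. The Python sketch below relies on a closed form that is an assumption of this illustration, derived from \(\ddot{y}=0\) and the conservation of \({\dot{x}}\sqrt{1+y^{2}}\) for \(L=\frac{1}{2}({\dot{x}}^{2}+{\dot{y}}^{2}+{\dot{z}}^{2})\) with constraint \({\dot{z}}=y{\dot{x}}\). It requires \(y_{1}\ne y_{0}\), and the function names are hypothetical.

```python
import math


def exact_z1(q0, x1, y1):
    """z_1 such that (q0, (x1, y1, z1)) lies in the exact discrete space."""
    x0, y0, z0 = q0
    ratio = (x1 - x0) / (math.asinh(y1) - math.asinh(y0))
    return z0 + ratio * (math.sqrt(1 + y1 * y1) - math.sqrt(1 + y0 * y0))


def trajectory(q0, q1, h, t):
    """Point q_{0,1}(t) of the nonholonomic trajectory with q(0)=q0, q(h)=q1."""
    x0, y0, z0 = q0
    x1, y1, _ = q1
    y = y0 + (y1 - y0) * t / h  # y moves with constant velocity
    ratio = (x1 - x0) / (math.asinh(y1) - math.asinh(y0))
    x = x0 + ratio * (math.asinh(y) - math.asinh(y0))
    z = z0 + ratio * (math.sqrt(1 + y * y) - math.sqrt(1 + y0 * y0))
    return (x, y, z)


def exact_flow(q0, q1, h):
    """Exact discrete flow: (q0, q1) -> (q1, q2) with q2 = q_{0,1}(2h)."""
    return q1, trajectory(q0, q1, h, 2 * h)
```

Iterating `exact_flow` from a pair \((q_{0},q_{1})\) built with `exact_z1` produces points that coincide, up to rounding, with `trajectory(q0, q1, h, k * h)`, which is the content of (37) in this example.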

For the construction of geometric integrators we will need an alternative expression of Eq. (36). In particular, using (33) we can rewrite these equations in a form very similar to the modified Lagrange-d’Alembert equations defined in Eq. (32), namely

$$\begin{aligned}&\displaystyle i^*_{{\mathcal {D}}}\left( ({\mathbb {F}}L \circ i_{{\mathcal {D}}} \circ R_{h,nh}^{e+}) (q_0,q_1)-({\mathbb {F}}L \circ i_{{\mathcal {D}}} \circ R_{h,nh}^{e-})(q_1,q_2)\right) =0\\&\displaystyle (q_{0},q_{1}),(q_1, q_2)\in {\mathcal {M}}_{h}^{e,nh}. \end{aligned}$$

Note that the projection \(i^{*}_{{\mathcal {D}}}:T^{*}Q \rightarrow {\mathcal {D}}^{*}\) satisfies

$$\begin{aligned} \text {ker} (i^{*}_{{\mathcal {D}}}) = {\mathcal {D}}^{o}. \end{aligned}$$
(38)

Thus, we conclude that

$$\begin{aligned} \begin{aligned}&({\mathbb {F}}L\circ R_{h,nh}^{e+}(q_{0},q_1)-{\mathbb {F}}L\circ R_{h,nh}^{e-}(q_1,q_{2})) \in {\mathcal {D}}^{o}_{q_1}\\&(q_{0},q_{1}),(q_1, q_{2}) \in {\mathcal {M}}_{h}^{e,nh}, \end{aligned} \end{aligned}$$
(39)

where we omit \(i_{{\mathcal {D}}}\) since \(R_{h,nh}^{e+}(q_{0},q_1)\) and \(R_{h,nh}^{e-}(q_1,q_{2})\) are vectors in the distribution \({\mathcal {D}}\) and may be identified with their images under the inclusion.

5.3 The nonholonomic forced exact discrete Lagrangian function

Given a regular nonholonomic system determined by the triple \((Q, L, {\mathcal {D}})\), we have seen how to derive the nonholonomic force \(F_{nh}: {\mathcal {D}}\rightarrow T^*Q\) by modifying the free dynamics to satisfy the nonholonomic constraints.

Consider now an arbitrary extension \(\widetilde{F_{nh}}: T Q\rightarrow T^*Q\) of \(F_{nh}\). It is clear that the solutions of the forced system determined by \((L, \widetilde{F_{nh}})\) with initial conditions in \({\mathcal {D}}\), remain in \({\mathcal {D}}\) and match the trajectories of the nonholonomic system. In fact, if \(\Gamma _{nh}\) is the nonholonomic dynamics and \(\Gamma _{(L,\widetilde{F_{nh}})}\) is the forced dynamics, then it is clear that \(\Gamma _{nh}=\Gamma _{(L,\widetilde{F_{nh}})}|_{{\mathcal {D}}}\).

If \(R_{h,{\widetilde{F}}_{nh}}^{e-}\) is the exact retraction associated with the forced SODE \(\Gamma _{(L,\widetilde{F_{nh}})}\) then, as in Sect. 4.2, we may define the exact discrete versions

$$\begin{aligned} L_{d,{\widetilde{F}}_{nh}}^{e,h}(q_0, q_1)=\int _{0}^{h} \left( L\circ \phi _{t}^{\Gamma _{(L,{\widetilde{F}}_{nh})}} \circ R_{h,{\widetilde{F}}_{nh}}^{e-}\right) (q_{0},q_{1}) \ dt, \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} \langle ({\widetilde{F}}_{nh})_d^{e,+}(q_0, q_1), X_{q_{1}}\rangle&= \langle F_d^{e,h}(q_0, q_1), (0_{q_0}, X_{q_{1}})\rangle \\ \langle ({\widetilde{F}}_{nh})_d^{e,-}(q_0, q_1), X_{q_{0}}\rangle&= \langle F_d^{e,h}(q_0, q_1), (X_{q_{0}}, 0_{q_1})\rangle , \end{aligned} \end{aligned}$$

where \(F_d^{e,h}:Q\times Q\rightarrow T^{*}(Q\times Q)\) is the double exact discrete force given by

$$\begin{aligned} \langle F_d^{e,h}(q_0, q_1), (X_{q_0}, X_{q_1})\rangle =\int ^h_0 \left\langle \left( {\widetilde{F}}_{nh}\circ \phi _{t}^{\Gamma _{(L,{\widetilde{F}}_{nh})}} \circ R_{h,{\widetilde{F}}_{nh}}^{e-}\right) (q_{0},q_{1}), X_{0,1}(t)\right\rangle \; dt \end{aligned}$$

where \(X_{0,1}(t)=T_{(q_0, q_1)}(\tau _Q\circ \phi _t^{\Gamma _{(L,{\widetilde{F}}_{nh})}}\circ R^{e-}_{h,{\widetilde{F}}_{nh}})(X_{q_0}, X_{q_1})\), for \((X_{q_0}, X_{q_1})\in T_{q_{0}}Q\times T_{q_{1}}Q\).

Following the notation in [27], we may rewrite these maps as

$$\begin{aligned} L_{d,\widetilde{F_{nh}}}^{e,h}(q_0, q_1)= & {} \int _{0}^{h} L(q_{0,1}(t),{\dot{q}}_{0,1}(t)) \ dt\; , \\ (\widetilde{F_{nh}})^{e,+}_d(q_0, q_1)= & {} \int _{0}^{h} \left\langle (\widetilde{F_{nh}})(q_{0,1}(t),{\dot{q}}_{0,1}(t)), \frac{\partial q_{0,1}(t)}{\partial q_1}\right\rangle \ dt\; ,\\ (\widetilde{F_{nh}})^{e,-}_d(q_0, q_1)= & {} \int _{0}^{h} \left\langle (\widetilde{F_{nh}})(q_{0,1}(t),{\dot{q}}_{0,1}(t)), \frac{\partial q_{0,1}(t)}{\partial q_0}\right\rangle \ dt\; . \end{aligned}$$

where now \(q_{0,1}: [0, h]\rightarrow Q\) is the solution of the forced Euler-Lagrange equations for \((L, \widetilde{F_{nh}})\) verifying \(q_{0,1}(0)=q_{0}\) and \(q_{0,1}(h)=q_{1}\).

Theorem 5.6

The exact discrete nonholonomic trajectories satisfy the discrete forced Euler-Lagrange equations (30) associated with the exact discrete Lagrangian and forces, that is,

$$\begin{aligned} \begin{aligned}&D_2 L_{d,{\widetilde{F}}_{nh}}^{e,h}(q_{k-1},q_{k})+D_1 L_{d,{\widetilde{F}}_{nh}}^{e,h} (q_{k},q_{k+1})\\&\quad +(\widetilde{F_{nh}})^{e,+}_d(q_{k-1},q_{k})+(\widetilde{F_{nh}} )^{e,-}_d(q_{k},q_{k+1}) = 0,\\&\quad (q_{k},q_{k+1})\in {\mathcal {M}}_{h}^{e,nh}, \ 0\leqslant k \leqslant N-1. \end{aligned} \end{aligned}$$
(40)

Proof

By construction, the exact discrete forced trajectory associated with the forced system \((L, \widetilde{F_{nh}})\) and initial values \((q_0,q_1)\in {\mathcal {M}}^{e, nh}_{h}\) is precisely the exact discrete nonholonomic trajectory with respect to the nonholonomic system \((L,{\mathcal {D}})\). \(\square \)

Remark 5.7

Observe that the exact discrete nonholonomic Eq. (40) are a particular instance of the modified Lagrange-d’Alembert equations proposed in Eq. (32), in which the corresponding term in \({{\mathcal {D}}}_{q_k}^{o}\) is exactly zero. This is a direct consequence of using the exact discrete force maps corresponding to the continuous nonholonomic external forces \(F_{nh}: {{\mathcal {D}}}\rightarrow T^*Q\) defined in Remark 2.2.

However, when one discretizes the forced Eq. (40) and takes approximations of the exact discrete Lagrangian function \(L_{d,{\widetilde{F}}_{nh}}^{e,h}\) defined above, of the exact discrete forces \((\widetilde{F_{nh}})^{e,-}_d\) and \((\widetilde{F_{nh}})^{e,+}_d\), and of the exact discrete nonholonomic constraint submanifold \({\mathcal {M}}_{h}^{e,nh}\), there is no reason why the discrete forced flow should still satisfy the discrete constraint \((q_{k},q_{k+1})\in {\mathcal {M}}_{h}^{d}\), where \({\mathcal {M}}_{h}^{d}\) is the discrete version of \({\mathcal {M}}_{h}^{e,nh}\). The constraint is guaranteed if we only require the left-hand side of the equations to lie in \({{\mathcal {D}}}_{q_k}^{o}\), as in Eq. (32), and impose the discrete constraint at the same time.

Remark 5.8

The relation between the modified discrete Lagrange-d’Alembert Eq. (32) and the nonholonomic exact discrete flow differs from that of other popular nonholonomic integrators in two respects:

  1. the exact discrete nonholonomic equations are included as a particular case of (32), as described above, but not of DLA (see [6]);

  2. the exact Eq. (40) generate a well-defined discrete flow \(\Phi _{h}^{d}:{\mathcal {M}}_{h}^{e,nh} \rightarrow {\mathcal {M}}_{h}^{e,nh}\), in contrast with the exact integrator proposed in [29].

5.4 Construction of integrators and numerical examples

To construct variational integrators we consider discretizations \((L_d, F_d^-, F_d^+)\) of \((L_{d,\widetilde{F_{nh}}}^{e,h}, (\widetilde{F_{nh}})^{e,-}_d, (\widetilde{F_{nh}})^{e,+}_d)\) as a typical forced integrator and then we consider a discretization \({\mathcal {M}}_{h}^{d}\) of \({\mathcal {M}}^{e,nh}_{h}\) to derive the modified discrete Lagrange-d’Alembert equations:

$$\begin{aligned} \begin{aligned}&D_2 L_d(q_{k-1},q_{k})+D_1 L_d (q_{k},q_{k+1})+F^+_d(q_{k-1},q_{k})+F^-_d(q_{k},q_{k+1}) \in {\mathcal {D}}^{o}_{q_k} \\&(q_{k},q_{k+1})\in {\mathcal {M}}_h^d, \ 0\leqslant k \leqslant N-1, \end{aligned} \end{aligned}$$
(41)

We remark that (41) is equivalent to the projection onto \({\mathcal {D}}^{*}\), i.e.,

$$\begin{aligned} \begin{aligned}&i_{{\mathcal {D}}}^{*}\left( D_2 L_d(q_{k-1},q_{k})+D_1 L_d (q_{k},q_{k+1})+F^+_d(q_{k-1},q_{k})+F^-_d(q_{k},q_{k+1}) \right) =0\\&(q_{k},q_{k+1})\in {\mathcal {M}}_h^d, \ 0\leqslant k \leqslant N-1, \end{aligned} \end{aligned}$$
(42)

This projection motivates the definition of the Legendre transformations \({\mathbb {F}}^{\pm }l_{d}:{\mathcal {M}}^{d}_h\rightarrow {\mathcal {D}}^{*}\) given by

$$\begin{aligned} \begin{aligned} {\mathbb {F}}^{+}l_{d}&=i_{{\mathcal {D}}}^{*}\circ {\mathbb {F}}^{f+} L_{d}|_{{\mathcal {M}}^{d}_h} \\ {\mathbb {F}}^{-}l_{d}&=i_{{\mathcal {D}}}^{*}\circ {\mathbb {F}}^{f-} L_{d}|_{{\mathcal {M}}^{d}_h}. \end{aligned} \end{aligned}$$

Example 3

Consider once more the nonholonomic particle. We introduce a discretization of the exact discrete space \({\mathcal {M}}^{e,nh}_{h}\)

$$\begin{aligned} {\mathcal {M}}^{d}_h=\{ z_{1}=z_{0}+\left( \frac{y_1+y_0}{2} \right) (x_1-x_0) \}, \end{aligned}$$
(43)

and a discrete Lagrangian

$$\begin{aligned} L_{d}(x_0,y_0,z_0,x_1,y_1,z_1)=\frac{1}{2h}\left[ \left( x_1-x_0 \right) ^2+\left( y_1-y_0 \right) ^2 +\left( z_1-z_0 \right) ^2 \right] . \end{aligned}$$

Moreover we need two discrete forces

$$\begin{aligned} F^+_d(q_{0},q_{1})=\frac{2}{h}\frac{(x_1-x_0)(y_1-y_0)}{4+\left( y_1+y_0 \right) ^{2}}\left( -\frac{y_1+y_0}{2}d x_1+d z_1 \right) \end{aligned}$$

and

$$\begin{aligned} F^{-}_d(q_{0},q_{1})=\frac{2}{h}\frac{(x_1-x_0)(y_1-y_0)}{4+\left( y_1+y_0 \right) ^{2}}\left( -\frac{y_1+y_0}{2}d x_{0}+d z_{0} \right) . \end{aligned}$$

The forced discrete Legendre transformations which appear also in the modified Lagrange-d’Alembert equations are

$$\begin{aligned} \begin{aligned} {\mathbb {F}}^{f-} L_{d}(q_{0},q_{1})&=\left( \frac{x_1-x_0}{h}+\frac{1}{h}\frac{(x_1-x_0)(y_1-y_0)(y_1+y_0)}{4+\left( y_1+y_0 \right) ^{2}} \right) dx_{0} \\&\quad +\frac{y_1-y_0}{h}dy_{0} + \left( \frac{z_1-z_0}{h}-\frac{2}{h}\frac{(x_1-x_0)(y_1-y_0)}{4+\left( y_1+y_0 \right) ^{2}} \right) dz_{0} \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} {\mathbb {F}}^{f+} L_{d}(q_{0},q_{1})&=\left( \frac{x_1-x_0}{h}-\frac{1}{h}\frac{(x_1-x_0)(y_1-y_0)(y_1+y_0)}{4+\left( y_1+y_0 \right) ^{2}} \right) dx_{1} \\&\quad +\frac{y_1-y_0}{h}dy_{1} + \left( \frac{z_1-z_0}{h}+\frac{2}{h}\frac{(x_1-x_0)(y_1-y_0)}{4+\left( y_1+y_0 \right) ^{2}} \right) dz_{1}. \end{aligned} \end{aligned}$$

Now projecting the forced Legendre transformations onto \({\mathcal {D}}^{*}\) by means of \(i_{{\mathcal {D}}}^{*}\) and restricting to \({\mathcal {M}}_{h}^d\) we get

$$\begin{aligned} {\mathbb {F}}^{-}l_{d}(q_{0}^{i},q_{1}^{a})=\frac{x_{1}-x_{0}}{h}\left( 1+\frac{1}{2}y_{0}(y_{1}+y_{0})+\frac{(y_{1}-y_{0})^{2}}{4+( y_1+y_0 )^{2}} \right) e^{1}+\left( \frac{y_1-y_0}{h} \right) e^{2} \end{aligned}$$

and

$$\begin{aligned} {\mathbb {F}}^{+}l_{d}(q_{0}^{i},q_{1}^{a})=\frac{x_{1}-x_{0}}{h}\left( 1+\frac{1}{2}y_{1}(y_{1}+y_{0})+\frac{(y_{1}-y_{0})^{2}}{4+( y_1+y_0 )^{2}} \right) e^{1}+\left( \frac{y_1-y_0}{h} \right) e^{2}, \end{aligned}$$

where the local frame \(\{e^{a}\}\subseteq {\mathcal {D}}^{*}\) is dual to the local frame \(\{e_{a}\}\) spanning \({\mathcal {D}}\), where \(e_{1}=\frac{\partial }{\partial x}+y\frac{\partial }{\partial z}\) and \(e_{2}=\frac{\partial }{\partial y}\).

Now solving Eq. (42) for this example we get

$$\begin{aligned} \begin{aligned} x_{2}&=x_{1}+(x_{1}-x_{0})\frac{1+\frac{1}{2}y_{1}(y_{1} +y_{0})+\frac{(y_{1}-y_{0})^{2}}{4+( y_1+y_0 )^{2}}}{1+\frac{1}{2}y_{1} (3y_{1}-y_{0})+\frac{(y_{1}-y_{0})^{2}}{4+( 3y_1-y_0 )^{2}}} \\ y_{2}&=2y_{1}-y_{0}. \end{aligned} \end{aligned}$$

Figs. 2 and 3 show a comparison between the proposed integrator (MLA) and the more standard Discrete Lagrange-d’Alembert (DLA) integrator. We compare the error of both integrators as well as their energy behaviour. We observe that the proposed integrator has good behaviour in both aspects and even behaves slightly better than DLA. Notice that the Hamiltonian function \(H|_{{\mathcal {D}}^{*}}\) given by

$$\begin{aligned} H|_{{\mathcal {D}}^{*}}(x,y,z,p_{1},p_{2})=\frac{1}{2}\left( \frac{p_{1}^{2}}{1+y^{2}}+p_{2}^{2} \right) \end{aligned}$$

becomes constant along the discrete flow, after the first steps. To run the simulation we set the initial position at the origin \(q_{0}=0\) and \(q_{1}=(0.4,0.4,z_{1})\), with \(z_{1}\) being determined by (43). The step is \(h=0.5\) and the total number of steps is \(N=1200\).
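
The update formulas of this example translate into a short program. The following Python sketch is one possible transcription, using the initial data and step quoted above; the function names and the way the momentum bracket is factored out are choices of this illustration, not part of the original construction.

```python
def discrete_z(q0, x1, y1):
    """z_1 given by the discrete constraint (43)."""
    x0, y0, z0 = q0
    return z0 + 0.5 * (y1 + y0) * (x1 - x0)


def momentum_factor(y_base, y_a, y_b):
    """Bracket multiplying (x_b - x_a)/h in the e^1 component of F^{+-} l_d."""
    return (1 + 0.5 * y_base * (y_a + y_b)
            + (y_b - y_a) ** 2 / (4 + (y_a + y_b) ** 2))


def legendre_minus(q0, q1, h):
    """Components of F^- l_d(q0, q1) in the dual frame e^1, e^2."""
    (x0, y0, _), (x1, y1, _) = q0, q1
    return ((x1 - x0) / h * momentum_factor(y0, y0, y1), (y1 - y0) / h)


def legendre_plus(q0, q1, h):
    """Components of F^+ l_d(q0, q1) in the dual frame e^1, e^2."""
    (x0, y0, _), (x1, y1, _) = q0, q1
    return ((x1 - x0) / h * momentum_factor(y1, y0, y1), (y1 - y0) / h)


def mla_step(q0, q1):
    """One step (q0, q1) -> q2 of the update formulas of Example 3."""
    (x0, y0, _), (x1, y1, _) = q0, q1
    y2 = 2 * y1 - y0
    x2 = x1 + (x1 - x0) * (momentum_factor(y1, y0, y1)
                           / momentum_factor(y1, y1, y2))
    return (x2, y2, discrete_z(q1, x2, y2))


def energy(q, p):
    """H restricted to D^* at the point q with momentum p = (p1, p2)."""
    return 0.5 * (p[0] ** 2 / (1 + q[1] ** 2) + p[1] ** 2)


def simulate(h=0.5, steps=1200):
    """Return the points q_0, ..., q_N for the initial data of the text."""
    q0 = (0.0, 0.0, 0.0)
    q1 = (0.4, 0.4, discrete_z(q0, 0.4, 0.4))
    traj = [q0, q1]
    for _ in range(steps - 1):
        traj.append(mla_step(traj[-2], traj[-1]))
    return traj
```

Note that the step size cancels in the update, so `mla_step` does not take `h`. Evaluating `energy(traj[k], legendre_minus(traj[k], traj[k + 1], h))` along the output of `simulate()` is one way to monitor the Hamiltonian along the discrete flow.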

Fig. 2. Comparison of the value of the Hamiltonian function between DLA and MLA integrators

Fig. 3. Evolution of the error in DLA and MLA integrators

We also draw in Fig. 4 the discrete constraint space \({\mathcal {M}}^{d}_h\) and compare it with its exact version \({\mathcal {M}}^{e,nh}_{h}\).

Fig. 4. Graph of the defining function for the respective spaces. We have fixed the origin as the initial point \(q_{0}=0\) and plotted the coordinate \(z_{1}\) as a function of \(x_{1}\) and \(y_{1}\)
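
The two defining functions can also be compared numerically. The Python sketch below fixes \(q_{0}=0\) and evaluates the midpoint constraint (43) against a closed form for the exact space; that closed form is an assumption of this illustration, obtained by integrating the nonholonomic particle using \(\ddot{y}=0\) and the conservation of \({\dot{x}}\sqrt{1+y^{2}}\), and it requires \(y_{1}\ne 0\).

```python
import math


def z_discrete(x1, y1):
    """z_1 from the midpoint constraint (43) with q_0 at the origin."""
    return 0.5 * y1 * x1


def z_exact(x1, y1):
    """z_1 on the exact discrete space with q_0 at the origin (y1 != 0)."""
    return x1 * (math.sqrt(1 + y1 * y1) - 1) / math.asinh(y1)
```

A Taylor expansion of `z_exact` in \(y_{1}\) suggests that the two functions differ by a term of order \(x_{1}y_{1}^{3}\), so the discrete space approximates the exact one well near the diagonal.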

Example 4

Let us introduce another typical example of a nonholonomic system (see [4]): the knife edge. Choosing appropriate constants, its Lagrangian function is described by the function \(L:T(Q\times {\mathbb {S}}^{1})\rightarrow {\mathbb {R}}\)

$$\begin{aligned} L(x,y,\varphi ,{\dot{x}},{\dot{y}},{\dot{\varphi }})=\frac{1}{2}({\dot{x}}^{2} +{\dot{y}}^{2}+{\dot{\varphi }}^{2})+\frac{x}{2}, \end{aligned}$$

and it is subjected to the nonholonomic constraint

$$\begin{aligned} \sin (\varphi ){\dot{x}}-\cos (\varphi ){\dot{y}}=0. \end{aligned}$$

We introduce the following discretization of the constraint space

$$\begin{aligned} {\mathcal {M}}^{d}_h=\left\{ \sin \left( \frac{\varphi _{1}+\varphi _{0}}{2}\right) \frac{x_{1}-x_{0}}{h}-\cos \left( \frac{\varphi _{1}+\varphi _{0}}{2}\right) \frac{y_{1}-y_{0}}{h}=0\right\} . \end{aligned}$$

The natural discretization of the Lagrangian compatible with the above discrete constraint space is then

$$\begin{aligned} L_{d}(x_{0},y_{0},\varphi _{0},x_{1},y_{1},\varphi _{1})= & {} \frac{1}{2h}((x_{1}-x_{0})^{2}+(y_{1}-y_{0})^{2}+(\varphi _{1}-\varphi _{0})^{2})\\&+h\cdot \frac{x_{1}+x_{0}}{4} \end{aligned}$$

Moreover the discrete forces are given by

$$\begin{aligned} F^+_d(q_{0},q_{1})=\frac{h}{2}\lambda \left( \mu _{x}d x_{1}+ \mu _{y} d y_{1} \right) , \quad F^{-}_d(q_{0},q_{1})=\frac{h}{2}\lambda \left( \mu _{x}d x_{0}+ \mu _{y} d y_{0} \right) , \end{aligned}$$

with

$$\begin{aligned} \begin{aligned} \lambda&= -\frac{\varphi _{1}-\varphi _{0}}{h^{2}}\left( (x_{1}-x_{0}) \cos \left( \frac{\varphi _{1}+\varphi _{0}}{2}\right) +(y_{1}-y_{0}) \sin \left( \frac{\varphi _{1}+\varphi _{0}}{2}\right) \right) \\&\quad -\frac{1}{2}\sin \left( \frac{\varphi _{1}+\varphi _{0}}{2}\right) \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \mu _{x}=\sin \left( \frac{\varphi _{1}+\varphi _{0}}{2}\right) , \quad \mu _{y}=\cos \left( \frac{\varphi _{1}+\varphi _{0}}{2}\right) . \end{aligned}$$

With these ingredients we obtain an integrator that nearly preserves the energy (see Fig. 5), where we use the Hamiltonian function

$$\begin{aligned} H|_{{\mathcal {D}}^{*}}(x,\varphi ,y,p_{1},p_{2})=\frac{1}{2}\left( \frac{p_{1}^2}{A(\varphi )} +p_{2}^2-x\right) , \quad A(\varphi )=1+\frac{\sin ^{2}(\varphi )}{\cos ^{2}(\varphi )}. \end{aligned}$$

Fig. 5. Experiment with the knife edge example: the initial positions are the origin \(q_{0}=0\) and \(q_{1}=(0.4,0.4,y_{1})\), the step is \(h=0.5\) and the total number of steps is \(N=600\)

Example 5

We now slightly perturb the knife edge system by introducing the nonholonomic constraint (see [30])

$$\begin{aligned} \sin (\varphi ){\dot{x}}-(\cos (\varphi )-\varepsilon ){\dot{y}}=0, \quad \varepsilon >0. \end{aligned}$$

We obtain an integrator for the perturbed system that no longer preserves energy. Nevertheless, it still behaves clearly better than the standard DLA algorithm (see Fig. 6), for the Hamiltonian function

$$\begin{aligned} H|_{{\mathcal {D}}^{*}}(x,\varphi ,y,p_{1},p_{2})=\frac{1}{2}\left( \frac{p_{1}^2}{A(\varphi ,\varepsilon )}+p_{2}^2-x\right) , \quad A(\varphi ,\varepsilon )=1+\frac{\sin ^{2}(\varphi )}{(\cos (\varphi )-\varepsilon )^2}. \end{aligned}$$

Fig. 6. Experiment with the perturbed knife edge example with \(\varepsilon =0.1\): the initial positions are the origin \(q_{0}=0\) and \(q_{1}=(0.4,0.4,y_{1})\), the step is \(h=0.5\) and the total number of steps is \(N=600\)

6 Towards an intrinsic version of the exact discrete nonholonomic equations

We have recently introduced a formulation of nonholonomic mechanics using a suitable geometric environment, in this case, the skew-symmetric algebroid (cf. [12, 17]) which is a weaker version of the well-known concept of Lie algebroid, where now the Lie bracket may not satisfy the Jacobi identity.

Following the program initiated by Alan Weinstein in [35], it was shown in [23, 25] and [24] how to formulate discrete mechanics in a unified way using the notion of a Lie groupoid. In the future, we want to find and study the equivalent algebraic structures for nonholonomic mechanics.

In this section we will describe some of the ingredients needed to develop this new theory, in particular, the nonholonomic exact discrete Lagrangian defined in \({\mathcal {M}}_{h}^{e,nh}\), its main properties and the relationship with the results of [29].

Assume that we have a nonholonomic system defined by the triple \((Q, L, {\mathcal {D}})\), where \(L: TQ\rightarrow {\mathbb {R}}\) is a regular Lagrangian and \((L,{\mathcal {D}})\) is a regular non-holonomic system.

With the help of the constrained exact retraction, defined by \(R^{e-}_{h,nh}:{\mathcal {M}}_{h}^{e,nh}\rightarrow {\mathcal {U}}_h^{nh}\subseteq {\mathcal {D}}\) introduced in Sect. 3, we define the nonholonomic exact discrete Lagrangian for \((Q, L, {\mathcal {D}})\) as a function on the exact discrete space \(l_{h,nh}^{e}:{\mathcal {M}}_{h}^{e,nh}\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} l_{h,nh}^{e}(q_{0},q_{1})=\int _{0}^{h} \left( L\circ \phi _{t}^{\Gamma _{nh}} \circ R^{e-}_{h,nh} \right) (q_0,q_1) \ dt. \end{aligned}$$
(44)

where \(\{\phi _{t}^{\Gamma _{nh}}\}\) is the flow of \(\Gamma _{nh}\), the solution of the nonholonomic dynamics.

To ease the notation let us introduce the following objects:

  1. given \((q_0, q_1)\in {\mathcal {M}}_{h}^{e,nh}\), define the following curves on \({\mathcal {D}}\) and Q, respectively:

    $$\begin{aligned} \gamma _{0}(t):=\left( \phi _{t}^{\Gamma _{nh}} \circ R^{e-}_{h,nh} \right) (q_0,q_1) \ \text {and} \ c_{0}(t):=\tau _{Q}\circ \gamma _{0}(t); \end{aligned}$$

  2. a variation of the former curve is denoted by

    $$\begin{aligned} \gamma _{s}(t)=\left( \phi _{t}^{\Gamma _{nh}} \circ R^{e-}_{h,nh} \right) (q_0(s),q_1(s))\ \text {and} \ c_{s}(t):=\tau _{Q}\circ \gamma _{s}(t); \end{aligned}$$

  3. the infinitesimal variation vector field on the configuration manifold is

    $$\begin{aligned} X_{0,1}(t)=\left. \frac{d}{ds} \right| _{s=0} c_{s}(t). \end{aligned}$$

Next we will prove a result which we will use later. The proof of this result involves the canonical involution \(\kappa _{Q}:TTQ\rightarrow TTQ\) of the double tangent bundle. We recall that \(\kappa _{Q}\) is a vector bundle isomorphism between the vector bundles \(T\tau _{Q}:TTQ\rightarrow TQ\) and \(\tau _{TQ}:TTQ\rightarrow TQ\). In fact, \(\kappa _{Q}\) is characterized by the following condition: if

$$\begin{aligned} x:U\subseteq {\mathbb {R}}^{2} \rightarrow Q, \quad (s,t)\mapsto x(s,t) \end{aligned}$$

is a smooth map then

$$\begin{aligned} \kappa _{Q}\left( \frac{d}{dt}\frac{d}{ds} x(s,t) \right) =\frac{d}{ds}\frac{d}{dt} x(s,t). \end{aligned}$$

So, \(\kappa _{Q}^{2}=Id\). Moreover, if \(X:Q\rightarrow TQ\) is a vector field on Q then the tangent map \(TX:TQ\rightarrow TTQ\) is a section of the vector bundle \(T\tau _{Q}:TTQ\rightarrow TQ\) and, in addition, \(\kappa _{Q}\circ TX=X^{C}\), where \(X^{C}\) is the complete lift of X to TQ (see [34] for more details).
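
In local terms (a sketch in natural coordinates, with notation chosen here for illustration): if \((q^{i})\) are coordinates on Q, \((q^{i},v^{i})\) the induced coordinates on TQ and \((q^{i},v^{i},{\dot{q}}^{i},{\dot{v}}^{i})\) those on TTQ, then

$$\begin{aligned} \kappa _{Q}(q^{i},v^{i},{\dot{q}}^{i},{\dot{v}}^{i})=(q^{i},{\dot{q}}^{i},v^{i},{\dot{v}}^{i}). \end{aligned}$$

For a vector field \(X=X^{i}\frac{\partial }{\partial q^{i}}\) we have \(TX(q^{i},v^{i})=\left( q^{i},X^{i},v^{i},\frac{\partial X^{i}}{\partial q^{j}}v^{j}\right) \), and therefore

$$\begin{aligned} (\kappa _{Q}\circ TX)(q^{i},v^{i})=\left( q^{i},v^{i},X^{i},\frac{\partial X^{i}}{\partial q^{j}}v^{j}\right) , \quad \text {that is,} \quad X^{C}=X^{i}\frac{\partial }{\partial q^{i}}+v^{j}\frac{\partial X^{i}}{\partial q^{j}}\frac{\partial }{\partial v^{i}}. \end{aligned}$$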

Lemma 6.1

Given a SODE \(\Gamma \), if \(\gamma _{s}\) is a one-parameter family of integral curves of \(\Gamma \), then the infinitesimal variation vector field of \(\gamma _{s}\) is the complete lift of the infinitesimal variation vector field of the one-parameter family of curves formed by the base integral curves of \(\Gamma \), that is \(c_{s}=\tau _{Q}\circ \gamma _{s}\).

Proof

If \(\gamma _{s}\) is a one-parameter family of integral curves of \(\Gamma \), it has the form \(\gamma _{s}=\frac{d}{dt}c_{s}\). Let

$$\begin{aligned} X_{01}(t)=\left. \frac{d}{ds}\right| _{s=0}\tau _{Q}\circ \gamma _{s}(t) \end{aligned}$$

be the infinitesimal variation vector field of \(c_{s}\). Then the infinitesimal variation vector field of \(\gamma _{s}\) is

$$\begin{aligned} \left. \frac{d}{ds}\right| _{s=0}\gamma _{s}(t)= & {} \left. \frac{d}{ds}\right| _{s=0}\frac{d c_{s}}{dt}(t)\\= & {} \kappa _{Q} \left( \frac{d}{dt} \left. \frac{d}{ds}\right| _{s=0}c(s,t) \right) =\kappa _{Q} \left( \frac{d}{dt}X_{01}(t) \right) =X_{01}^{C}(t). \end{aligned}$$

\(\square \)

Next, we will obtain an interesting expression for the differential of the nonholonomic exact discrete Lagrangian function \(l_{h,nh}^{e}\). For this purpose, we will denote by \(F_{nh}:{\mathcal {D}}\rightarrow T^{*}Q\) the continuous-time nonholonomic external force (see Remark 2.2).

Proposition 6.2

The differential of the nonholonomic exact discrete Lagrangian satisfies

$$\begin{aligned} \langle dl_{h,nh}^{e}(q_{0},q_{1}),(X_{q_0},X_{q_1}) \rangle= & {} -\langle \beta _{nh}(q_{0},q_{1}),(X_{q_0},X_{q_1}) \rangle \\&+\langle {\mathbb {F}}L\circ R_{h,nh}^{e+}(q_0, q_1),X_{q_1} \rangle \\&- \langle {\mathbb {F}}L\circ R_{h,nh}^{e-}(q_0,q_1),X_{q_0} \rangle , \end{aligned}$$

where

$$\begin{aligned} \langle \beta _{nh}(q_{0},q_{1}), (X_{q_0}, X_{q_1})\rangle =\int _{0}^{h} \langle F_{nh}(\gamma _0(t)),X_{01}(t)\rangle \ dt \end{aligned}$$

and we are identifying the vector \((X_{q_0},X_{q_1})\in T_{(q_0, q_1)}{\mathcal {M}}_{h}^{e,nh}\) with its image by \(Ti:T{\mathcal {M}}_{h}^{e,nh}\hookrightarrow T(Q\times Q)\), with \(i:{\mathcal {M}}_{h}^{e,nh}\hookrightarrow Q \times Q\) the canonical inclusion. The smooth curve \(X_{01}: [0,h]\rightarrow TQ\) is defined as

$$\begin{aligned} X_{01}(t)= T_{(q_0, q_1)}(\tau _{Q}\circ \phi _{t}^{\Gamma _{nh}} \circ R^{e-}_{h,nh}) (X_{q_0},X_{q_1})\; . \end{aligned}$$

Proof

Let \(v: (-\epsilon ,\epsilon )\rightarrow {\mathcal {M}}_{h}^{e,nh}\) be a smooth curve denoted by \(v(s)=(q_0(s),q_1(s))\) such that \(v(0)=(q_0,q_1)\in {\mathcal {M}}_{h}^{e,nh}\) and \(v'(0)=(X_{q_0},X_{q_1})\in T_{(q_0, q_1)}{\mathcal {M}}_{h}^{e,nh}\) and

$$\begin{aligned} \gamma _{s}(t)=\left( \phi _{t}^{\Gamma _{nh}} \circ R^{e-}_{h,nh} \right) (q_0(s),q_1(s)). \end{aligned}$$

Then, using Lemma 6.1, we have that

$$\begin{aligned} \begin{aligned}&\langle d l_{h,nh}^{e}(q_0,q_1), \left. \frac{d}{d s}\right| _{s=0}(q_{0}(s),q_{1}(s)) \rangle \\&\quad = \int _{0}^{h}\langle dL(\gamma _{0}(t)),\left. \frac{d}{d s}\right| _{s=0} \gamma _{s}(t) \rangle dt\\&\quad = \int _{0}^{h}\langle dL(\gamma _{0}(t)),X_{01}^{C}(t) \rangle dt. \end{aligned} \end{aligned}$$
(45)

Note that \(X_{01}^{C}(t)\) is a vector field on TQ along \(\gamma _{0}(t)\), hence using (17) it follows that

$$\begin{aligned} \begin{aligned} \langle d l_{h,nh}^{e}(q_0,q_1), (X_{q_0}, X_{q_1})\rangle&= X_{01}^{V}(h)(L)-X_{01}^{V}(0)(L)-\int _{0}^{h} \langle F_{nh}(\gamma _0(t)),X_{01}(t) \rangle dt\\&= \langle {\mathbb {F}}L(\gamma _{0}(h)),X_{01}(h)\rangle -\langle {\mathbb {F}}L(\gamma _{0}(0)),X_{01}(0)\rangle \\&\quad -\int _{0}^{h} \langle F_{nh}(\gamma _0(t)),X_{01}(t) \rangle dt. \end{aligned} \end{aligned}$$
(46)

By unwinding the definition of \(X_{01}\) and identifying \((X_{q_0},X_{q_1})\) with its image by \(Ti:T{\mathcal {M}}_{h}^{e,nh}\hookrightarrow T(Q\times Q)\), we see that

$$\begin{aligned} \begin{aligned} X_{01}(h)&= T_{(q_0, q_1)}(\tau _{Q}\circ R^{e+}_{h,nh}) (X_{q_0}, X_{q_1})=X_{q_1}, \\ X_{01}(0)&=T_{(q_0, q_1)}( \tau _{Q} \circ R^{e-}_{h,nh}) (X_{q_0},X_{q_1})=X_{q_0}, \end{aligned} \end{aligned}$$

since

$$\begin{aligned} \tau _{Q}\circ R^{e+}_{h,nh}=\text {pr}_{2}|_{{\mathcal {M}}_{h}^{e,nh}} \quad \text {and} \quad \tau _{Q}\circ R^{e-}_{h,nh}=\text {pr}_{1}|_{{\mathcal {M}}_{h}^{e,nh}}, \end{aligned}$$

where \(\text {pr}_{1,2}:Q\times Q\rightarrow Q\) are the projections onto the first and second factor, respectively. \(\square \)

Observe that, in the previous proposition, the intrinsic discrete objects associated with the nonholonomic problem are the 1-forms \(dl_{h,nh}^{e}\), \(\beta _{nh}\in \Omega ^1({\mathcal {M}}_{h}^{e,nh})\). Hence, \(\sigma _{nh}\) given by

$$\begin{aligned} {\sigma _{nh} = dl_{h,nh}^{e} + \beta _{nh}} \end{aligned}$$
(47)

is also a 1-form on \({\mathcal {M}}_{h}^{e,nh}\).
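The objects entering \(\sigma _{nh}\) can be approximated numerically for a concrete system. The sketch below is only illustrative and rests on several assumptions that are not taken from this section: it uses the nonholonomic particle, \(L=\frac{1}{2}({\dot{x}}^2+{\dot{y}}^2+{\dot{z}}^2)\) with constraint \({\dot{z}}=y{\dot{x}}\), as a test case; it reads \(l_{h,nh}^{e}(q_0,q_1)=\int _0^h L(\gamma _0(t))\,dt\) off Eq. (45); and it replaces the exact flow \(\phi _{t}^{\Gamma _{nh}}\) by a fine Runge-Kutta approximation. All function names are hypothetical.

```python
# Minimal sketch (assumed example, not the authors' code): approximate a pair
# (q0, q1) in M_h^{e,nh} and the value l_{h,nh}^e(q0, q1) for the nonholonomic
# particle L = (vx^2 + vy^2 + vz^2)/2 with constraint vz = y*vx.

def nh_rhs(state):
    """Nonholonomic SODE Gamma_nh, augmented with d(action)/dt = L."""
    x, y, z, vx, vy, vz, action = state
    # Multiplier obtained by differentiating the constraint vz - y*vx = 0.
    lam = vy * vx / (1.0 + y * y)
    lagrangian = 0.5 * (vx * vx + vy * vy + vz * vz)
    # Accelerations are lam times the constraint one-form dz - y dx.
    return (vx, vy, vz, -y * lam, 0.0, lam, lagrangian)

def rk4_step(f, state, dt):
    """One classical Runge-Kutta step (a numerical stand-in for the exact flow)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def exact_discrete_data(q0, vx, vy, h, substeps=200):
    """Return (q1, v1, l): endpoint, final velocity and accumulated action.

    The initial velocity (vx, vy, y*vx) is chosen inside the constraint
    distribution, so (q0, q1) approximates a point of M_h^{e,nh}.
    """
    x, y, z = q0
    state = (x, y, z, vx, vy, y * vx, 0.0)
    dt = h / substeps
    for _ in range(substeps):
        state = rk4_step(nh_rhs, state, dt)
    return state[:3], state[3:6], state[6]

# Usage: one step of length h = 0.1 starting at q0 = (0, 1, 0).
q1, v1, l = exact_discrete_data((0.0, 1.0, 0.0), 1.0, 0.5, 0.1)
print("q1 =", q1)
print("constraint residual =", v1[2] - q1[1] * v1[0])
print("l_h =", l)
```

Since this Lagrangian is purely kinetic and the nonholonomic flow preserves the energy, the accumulated action should agree with \(h\) times the initial energy up to integration error, which gives a simple consistency check for the sketch.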

Finally, we relate the exact discrete objects used in the modified Lagrange-d’Alembert principle in Sect. 5 to the intrinsic exact discrete objects defined above.

Proposition 6.3

The restriction to \({\mathcal {M}}_{h}^{e,nh}\) of the forced exact discrete Lagrangian function \(L_{d,{\widetilde{F}}_{nh}}^{e,h}\) is just the non-holonomic exact discrete Lagrangian function \(l_{h,nh}^{e}\), that is,

$$\begin{aligned} \left. L_{d,{\widetilde{F}}_{nh}}^{e,h} \right| _{{\mathcal {M}}_{h}^{e,nh}}=l_{h,nh}^{e}. \end{aligned}$$

Moreover, if \((q_{0},q_{1})\in {\mathcal {M}}_{h}^{e,nh}\) and \((X_{q_0}, X_{q_1})\in T_{(q_{0},q_{1})}{\mathcal {M}}_{h}^{e,nh}\) then

$$\begin{aligned} \langle ((\widetilde{F_{nh}})^{e,-}_d(q_0, q_1), (\widetilde{F_{nh}})^{e,+}_d(q_0, q_1)), (X_{q_0}, X_{q_1})\rangle =\langle \beta _{nh}(q_0, q_1), (X_{q_0}, X_{q_1})\rangle . \end{aligned}$$

Thus, the 1-form \({\tilde{\sigma }}_{nh} \in \Omega ^1(Q \times Q)\) defined by

$$\begin{aligned} {\tilde{\sigma }}_{nh}=dL_{d,{\widetilde{F}}_{nh}}^{e,h} + ((\widetilde{F_{nh}})^{e,-}_d, (\widetilde{F_{nh}})^{e,+}_d) \end{aligned}$$

satisfies \(i^{*}{\tilde{\sigma }}_{nh} = \sigma _{nh}\), where \(i:{\mathcal {M}}_{h}^{e,nh}\hookrightarrow Q\times Q\).

Proof

Given a pair of points \((q_0, q_1)\in {\mathcal {M}}_{h}^{e,nh}\), the unique trajectory of \(\Gamma _{nh}\) connecting the two points is also the unique trajectory of the forced problem \((L,{\widetilde{F}}_{nh})\) connecting them. Hence the expressions of \(\left. L_{d,{\widetilde{F}}_{nh}}^{e,h} \right| _{{\mathcal {M}}_{h}^{e,nh}}\) and \(l_{h,nh}^{e}\) coincide.

By definition, we have that

$$\begin{aligned} \sigma _{nh}=dl_{h,nh}^{e}+\beta _{nh} \end{aligned}$$

and

$$\begin{aligned} {\widetilde{\sigma }}_{nh}-dL_{d,\widetilde{F_{nh}}}^{e,h}=((\widetilde{F_{nh}})^{e,-}_d, (\widetilde{F_{nh}})^{e,+}_d). \end{aligned}$$

By Proposition 6.2, we have that

$$\begin{aligned} \begin{aligned} \langle \sigma _{nh}(q_{0},q_{1}),(X_{q_0},X_{q_1})\rangle&= \langle {\mathbb {F}}L\circ R_{h,nh}^{e+}(q_0, q_1),X_{q_1} \rangle - \langle {\mathbb {F}}L\circ R_{h,nh}^{e-}(q_0,q_1),X_{q_0} \rangle \\&= \langle i^{*}{\tilde{\sigma }}_{nh}(q_{0},q_{1}),(X_{q_0},X_{q_1})\rangle , \end{aligned} \end{aligned}$$

for \((q_{0},q_{1})\in {\mathcal {M}}_{h}^{e,nh}\) and \((X_{q_0},X_{q_1})\in T_{(q_0, q_1)}{\mathcal {M}}_{h}^{e,nh}\), where the last equality comes from Lemma 4.4. So,

$$\begin{aligned} \beta _{nh}=\sigma _{nh}-dl_{h,nh}^{e}=i^{*}({\widetilde{\sigma }}_{nh} -dL_{d,\widetilde{F_{nh}}}^{e,h}), \end{aligned}$$

from where the result follows. \(\square \)
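The logical dependence among the three claims of Proposition 6.3 can be made explicit with a short computation. It is only a sketch and uses nothing beyond the standard fact that pullback commutes with the exterior derivative; the shorthand \(F_{d}^{e}\) for the pair of exact discrete forces is introduced here for brevity and does not appear elsewhere in the paper.

$$\begin{aligned} \begin{aligned} i^{*}{\tilde{\sigma }}_{nh}&= i^{*}dL_{d,{\widetilde{F}}_{nh}}^{e,h} + i^{*}F_{d}^{e} = d\left( \left. L_{d,{\widetilde{F}}_{nh}}^{e,h} \right| _{{\mathcal {M}}_{h}^{e,nh}}\right) + i^{*}F_{d}^{e} \\&= dl_{h,nh}^{e} + i^{*}F_{d}^{e}, \qquad F_{d}^{e}:=((\widetilde{F_{nh}})^{e,-}_d, (\widetilde{F_{nh}})^{e,+}_d). \end{aligned} \end{aligned}$$

Comparing with \(\sigma _{nh}=dl_{h,nh}^{e}+\beta _{nh}\) shows that, once the first claim holds, the identities \(i^{*}{\tilde{\sigma }}_{nh}=\sigma _{nh}\) and \(i^{*}F_{d}^{e}=\beta _{nh}\) are equivalent, so it suffices to establish one of them.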

From Proposition 6.2, we can recover the equations satisfied by the exact discrete nonholonomic trajectory appearing in [29]. In fact, we obtain a generalized version of these equations, since we may drop the assumption that the Lagrangian is reversible (see [29] for the definition).

Proposition 6.4

Suppose that \((L,{\mathcal {D}})\) is a regular nonholonomic system and that \((q_{0},q_{1}), (q_{1},q_{2})\in {\mathcal {M}}_{h}^{e,nh}\). Then \(q_{1}\) is a point in the intersection \({\mathcal {M}}_{h,q_{0}}^{e,nh} \cap {\mathcal {M}}_{-h,q_{2}}^{e,nh}\), where

$$\begin{aligned} {\mathcal {M}}_{h,q_{0}}^{e,nh}=\{q_{1} \in Q \ | \ (q_{0},q_{1}) \in {\mathcal {M}}_{h}^{e,nh}\}, \quad {\mathcal {M}}_{-h,q_{2}}^{e,nh}=\{q_{1} \in Q \ | \ (q_{2},q_{1}) \in {\mathcal {M}}_{-h}^{e,nh}\}. \end{aligned}$$

Let \(X_{q_{1}}\) be a vector in the intersection \(T_{q_{1}}{\mathcal {M}}_{h,q_{0}}^{e,nh} \cap T_{q_{1}}{\mathcal {M}}_{-h,q_{2}}^{e,nh}.\) Then, along the exact discrete nonholonomic trajectory the following equation is satisfied

$$\begin{aligned} \begin{aligned}&\langle dl_{h,nh}^{e}(q_{0},q_{1}),(0_{q_{0}},X_{q_{1}}) \rangle +\langle \beta _{nh}(q_{0},q_{1}),(0_{q_{0}},X_{q_{1}}) \rangle \\&\quad = \langle dl_{h,nh}^{e}(q_{1},q_{2}),(X_{q_{1}},0_{q_{2}}) \rangle +\langle \beta _{nh}(q_{1},q_{2}),(X_{q_{1}},0_{q_{2}}) \rangle . \end{aligned} \end{aligned}$$

Observe that the exact discrete Eq. (40) might be written as

$$\begin{aligned} \langle {\tilde{\sigma }}_{nh}(q_{0},q_{1}),(0_{q_{0}},X_{q_{1}}) \rangle =\langle {\tilde{\sigma }}_{nh}(q_{1},q_{2}),(X_{q_{1}},0_{q_{2}}) \rangle , \quad \forall X_{q_{1}}\in T_{q_{1}}Q \end{aligned}$$

and \((q_{0},q_{1}), (q_{1},q_{2}) \in {\mathcal {M}}_{h}^{e,nh}\). However, using the intrinsic objects we can only write

$$\begin{aligned} \langle \sigma _{nh}(q_{0},q_{1}),(0_{q_{0}},X_{q_{1}}) \rangle =\langle \sigma _{nh}(q_{1},q_{2}),(X_{q_{1}},0_{q_{2}}) \rangle \end{aligned}$$

for \(X_{q_{1}}\in T_{q_{1}}{\mathcal {M}}_{h,q_{0}}^{e,nh} \cap T_{q_{1}}{\mathcal {M}}_{-h,q_{2}}^{e,nh}\).

So, as the authors noted in [29], the last equation is not a suitable initial value integrator, since the dimension of the intersection \(T_{q_{1}}{\mathcal {M}}_{h,q_{0}}^{e,nh} \cap T_{q_{1}}{\mathcal {M}}_{-h,q_{2}}^{e,nh}\) may be too low to produce a well-defined discrete flow. This is precisely the main reason why, in this paper, we consider the modified Lagrange-d’Alembert principle as the correct discrete analogue of the Lagrange-d’Alembert principle. This is an important difference from the approach followed in [29].
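A rough dimension count illustrates the obstruction. It relies on assumptions not stated in this section: \(n=\dim Q\), the distribution \({\mathcal {D}}\) has constant rank \(n-m\) (the symbols \(n\) and \(m\) are introduced here only for illustration), \({\mathcal {M}}_{h}^{e,nh}\) is diffeomorphic to an open subset of \({\mathcal {D}}\) through the nonholonomic exponential map, and the two tangent spaces span \(T_{q_{1}}Q\).

$$\begin{aligned} \begin{aligned} \dim {\mathcal {M}}_{h,q_{0}}^{e,nh}&= \dim {\mathcal {M}}_{-h,q_{2}}^{e,nh} = n-m, \\ \dim \left( T_{q_{1}}{\mathcal {M}}_{h,q_{0}}^{e,nh} \cap T_{q_{1}}{\mathcal {M}}_{-h,q_{2}}^{e,nh}\right)&= 2(n-m)-n = n-2m. \end{aligned} \end{aligned}$$

Under these assumptions the intrinsic equation supplies at most \(n-2m\) independent conditions, whereas fixing \(q_{2}\) inside the \((n-m)\)-dimensional set \({\mathcal {M}}_{h,q_{1}}^{e,nh}\) requires \(n-m\) of them, so the count falls short by \(m\) whenever \(m>0\).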

7 Conclusion and future work

In this paper, we have precisely identified the exact discrete equations for a nonholonomic system. The main ingredient was the definition of the exponential map for a constrained second-order differential equation, which allowed us to define the exact discrete nonholonomic constraint submanifold. We then defined the main discrete elements that appear in the definition of the exact discrete nonholonomic equations. The special form of these equations allows us to introduce a new family of nonholonomic integrators which, in numerical computations, exhibit excellent energy behaviour.

In a future paper, we will study an intrinsic version of discrete nonholonomic mechanics in \({{\mathcal {M}}_{h}^{e,nh}}\) following the steps given in Sect. 6. We also aim to find a nonholonomic version of Theorem 4.3, now that we know the exact discrete nonholonomic flow and the elements that must be approximated (discrete constraint submanifold, discrete Lagrangian and discrete forces). Knowing these data, we will be in a position to describe the order of the numerical method for a nonholonomic system as in the pure variational case. Moreover, since nonholonomic systems typically admit symmetries [4], we will study the reduction of the discrete counterparts following the results in [20].

On the other hand, one of the advantages of our approach is that it can be extended, in a natural way, to the discretization of Lagrangian systems subjected to nonholonomic constraints which are not necessarily linear. Indeed, for a regular Lagrangian function \(L:TQ \rightarrow {\mathbb {R}}\) and a constraint submanifold \(C\) (not necessarily a distribution in \(Q\)) such that the nonholonomic system \((L, Q, C)\) is regular, there exists a unique SODE \(\Gamma _{nh}\) along \(C\) whose trajectories are the solutions of the nonholonomic dynamics (see, for instance, [10]). In fact, as in the case when \(C\) is a distribution in \(Q\), the nonholonomic system \((L, Q, C)\) may be considered as a restricted forced system. In addition, using Theorem 3.5 and proceeding as in Sect. 3.2, we can introduce the nonholonomic exponential map associated with \(\Gamma _{nh}\), the exact discrete constraint submanifold and the exact retraction maps on it. From these objects, one may also introduce the forced exact discrete Lagrangian function and the exact forces and, then, a version of Theorem 5.6 could be proved. The construction of the corresponding integrators (as in Sect. 5.4) and their application to concrete examples of nonholonomic systems subjected to non-linear constraints (in particular, to affine constraints in the velocities) should be the next step.