What makes nonholonomic integrators work?

A nonholonomic system is a mechanical system with velocity constraints not originating from position constraints; rolling without slipping is the typical example. A nonholonomic integrator is a numerical method specifically designed for nonholonomic systems. It has been observed numerically that many nonholonomic integrators exhibit excellent long-time behaviour when applied to various test problems. The excellent performance is often attributed to some underlying discrete version of the Lagrange–d’Alembert principle. Instead, in this paper, we give evidence that reversibility is behind the observed behaviour. Indeed, we show that many standard nonholonomic test problems have the structure of being foliated over reversible integrable systems. As most nonholonomic integrators preserve the foliation and the reversible structure, near conservation of the first integrals is a consequence of reversible KAM theory. Therefore, to fully evaluate nonholonomic integrators one has to consider also non-reversible nonholonomic systems. To this end we construct perturbed test problems that are integrable but no longer reversible (with respect to the standard reversibility map). Applying various nonholonomic integrators from the literature to these problems we observe that no method performs well on all problems. This further indicates that reversibility is the main mechanism behind near conservation of first integrals for nonholonomic integrators. A list of relevant open problems is given toward the end.

obtained from the Lagrange-d'Alembert principle (see § 2), which is not a variational principle (contrary to Hamilton's principle in Lagrangian mechanics).Therefore, nonholonomic systems do not, in general, preserve a symplectic structure, although total energy is conserved. 1Motivated by the success of symplectic integrators for Hamiltonian systems, various discrete versions of the Lagrange-d'Alembert principle have nevertheless been suggested, with the objective of deriving "structure preserving" time-stepping algorithms for nonholonomic systems [7,6,17,8,10,14,15].Such algorithms are often called nonholonomic integrators; they are observed to nearly conserve first integrals when applied to some standard nonholonomic test problems.The cause of the near conservation is often attributed to the discrete Lagranged'Alembert principle.
In this paper we give numerical and theoretical results suggesting that reversibility lies behind the observed good behaviour of nonholonomic integrators, regardless of any underlying discrete Lagrange-d'Alembert principle.Our results in fact reveal that the standard nonholonomic test problems have a bias: they are all reversible integrable.We therefore construct a family of nonholonomic test problems that are still integrable but not reversible (that is, not reversible with respect to a standard reversibility map).We apply several nonholonomic integrators from the literature on the new test problems.None of them exhibit structure preservation for all problems.
The underlying philosophy of nonholonomic integrators is well summarised by Cortés Monforte and Martínez [7, § 1] as follows: "by respecting the geometric structure of nonholonomic systems, one can create integrators capturing the essential features of this kind of systems."Because of our limited geometric understanding of nonholonomic dynamics, it is, however, unclear whether nonholonomic integrators at all possess special properties making them superior to other methods (in the same sense as symplectic integrators for Hamiltonian systems possess properties making them superior to non-symplectic methods).Except for exact conservation of momentum maps corresponding to horizontal symmetries [7, § 5], there are no theoretical results pertaining to structure preserving properties of nonholonomic integrators (by contrast, the excellent long-time behaviour of symplectic integrators for Hamiltonian systems and reversible integrators for reversible system is fully explained by KAM theory in combination with backward error analysis, see Hairer, Lubich, and Wanner [12] and references therein).Nonholonomic integrators nevertheless often display "very good energy behaviour" [9, § 1].Such statements are based on experimental evidence-how well the integrators perform on standard test problems.Thus, a nonholonomic integrator is "structure preserving" if it nearly conserves the first integrals of such test problems over long integration times.With this definition, if we extend the test problem suite to include our unbiased nonholonomic problems, then, as far as we know, there are no structure preserving nonholonomic integrators (all tested integrators fail to be structure preserving for some of the problems).Our aspiration for the new set of unbiased test problems is to serve the community as a more complete suite for evaluating structure preservation of nonholonomic integrators.
We now summarize the contributions in the paper: • We show that five classical nonholonomic test problems are part of a larger family of nonholonomically coupled systems ( § 2, § 2.2). • We show that a subset of nonholonomically coupled systems (that includes the classical test problems) are foliations over reversible integrable systems (Theorem 3.3, § 4, § 5).We thereby obtain a new family of reversible integrable nonholonomic systems that extends existing systems.• We use the result in Theorem 3.3 together with reversible KAM theorem to explain the excellent long-time behaviour of nonholonomic integrators observed experimentally ( § 3.3).• We propose new, perturbed test problems for nonholonomic integrators ( § 2.2).While still integrable, these problems are not reversible and therefore avoid experimental bias.• We carry out numerical experiments with five commonly used nonholonomic integrators on both reversible (biased) and non-reversible (unbiased) test problems ( § 3).The behaviour in the numerical simulations is consistent with our predictions from the theory ( § 3.4).(One of the methods still yields good long-time behaviour on one of the unbiased problems without explanation, but this method fails for other test problems.) The paper is organised as follows.In § 2 we describe a family of nonholonomic systems; this family contains some of the classical nonholonomic systems in the literature.In § 3 we run numerical experiments on some particular systems in that family, and measure the conservation of some integrals of those systems.In § 3.3 we show how reversible KAM theory in combination with Theorem 3.3 explain near conservation of integrals.In § 3.4 we then verify our theory against the numerical simulations.In § 4 and § 5 we prove results leading to Theorem 3.3.

Nonholonomically coupled systems
The Lagrange-d'Alembert principle states that a motion curve q(t), for a system with Lagrangian function L(q, q) and nonholonomic constraints A(q) q = 0 fulfils δ b a L q(t), q(t) dt = 0, for all virtual displacements δq with A q(t) δq(t) = 0. Throughout this paper we assume that the Lagrangian is of the form L(q, q) = 1 2 q q − V (q), for a potential function V .The equations of motion are then given by q = −∇V (q) + A(q) λ 0 = A(q) q (1) where λ ∈ R r are the Lagrange multipliers (see [6,4] for details).
As we shall see in § 2.2, the continuously varying transmission, the nonholonomic oscillator and nonholonomic particle, the knife edge, vertical rolling disk, are all part of a greater family of nonholonomically coupled systems, where two independent subsystems are coupled through the constraints.
Definition 2.1.A nonholonomically coupled system is a nonholonomic system with Lagrangian of the form and constraints of the form where, for any ξ, the matrix A(ξ) has a kernel of dimension one.The (ξ, ξ) subsystem is called the driving system.Remark 2.2.Note that, in the examples of § 2.2, some components of x, or ξ, may be periodic (see Table 1).In the rest of the paper we will nevertheless assume that x ∈ R n−1 and ξ ∈ R, for the sake of simplicity.
Remark 2.3.Note that the matrix A in equation ( 2) is not the same as the one appearing in (1).The difference is that now A depends only on ξ, and applies only on ẋ.This slight abuse of notation should not be confusing.
Notice that the driving system is a self-contained unconstrained Lagrangian system.As indicated, one may think of it as the "driver" for the remaining system.
We write the total energy H as where h(ξ, ξ) := 1 2 ξ2 + V (ξ) is the energy of the driving system.Given a nonholonomically coupled system, let k(ξ) be a kernel vector of A(ξ) such that k(ξ) = 1.We then define v := ẋT k(ξ).
Since k(ξ) spans the kernel of A(ξ), it follows from the constraint equation ( 2) that ẋ = vk(ξ).Also note that since Both the total energy H and the energy h of the driving system are first integrals, so we obtain that d dt Note that, even though this derivation assumes v = 0, one can check that (4) is still valid when v = 0 by directly computing the Lagrange multiplier from the equation of motion (1).
The equation of motion are thus where ξ is a solution of the independent Lagrangian system (the driving system in Definition 2.1) Thus, every nonholonomically coupled system can be reduced to a system of ordinary differential equations of the form (5), with first integrals given by the passenger energy and the driver energy Notice that the total energy (3) is the sum of the driver and passenger energies.
2.1.Quadratic potentials.We now assume that the potential is quadratic, i.e., where K is a symmetric positive semi-definite matrix and f ∈ R n−1 is a constant vector.The corresponding force F (x) = − ∂U ∂x is given by F (x) = −Kx + f .Using the spectral decomposition, removing eigenvectors corresponding to zero eigenvalues, the matrix K can be factorised as for a rectangular m × (n − 1) matrix κ of full rank.We define and the projection (8) π(x, v, ξ, ξ) := (κx, v, ξ, ξ) = (y, v, ξ, ξ).
From (5) we have v = k(ξ) T (−κ T κx + f ), so we get v = −k 0 (ξ) T y + k(ξ) T f .The projection π therefore intertwines the original system (5) with the reduced system Table 1.Summary of the nonholonomically coupled systems presented in § 2. F is the dimension of the kernel of κ, and hence the dimension of the fibres of the map π defined by equation (8).I is the number of first integrals.θ is the number of angle variables.ρ is the map used to define the reversibility map in (23).
where, again, ξ fulfills equation (5a).There is thus a stack of three systems above one another, summarised by the following chain of projections: The system (9) can be written in matrix notations, using an auxilliary variable ε with initial condition 1, as We observe that the matrix in (10) is an element of se(m + 1), the Lie algebra of the semidirect product Lie group SO(m + 1) R m+1 , where m is the rank of κ.If f is zero, the group reduces to SO(m + 1).If m = 0, the group reduces to R. The Lie algebra structure of equation ( 10) is central for the reversibility analysis in § 4.

2.2.
Examples.Here we give several examples of nonholonomic systems of the form presented in § 2.1.The standard form of these problems are used in the literature as test problems for nonholonomic integrators.We also suggest new modifications of the standard systems, constructed so that they fail to be reversible integrable (as detailed in § 4).
A summary of the problems in this section in terms of the symbols in § 2 is given in Table 1.
2.2.1.CVT and nonholonomic particle.The continuously variable transmission (CVT) problem is a family of coupled nonholonomic system of the form in § 2.1 with Illustration of the principle of the continuous variable transmission (CVT) gearbox.The driving subsystem (ξ, ξ) determines the location of the belt which in turn determines the gear ration between the shafts.A nonholonomic system describing the motion is given in § 2.2.1.and κ = 1 0 0 1 .
It is a simple model for a variable transmission gearbox, as illustrated in Figure 1.
The driving system determines the gear ratio.We consider different driver systems.First, the harmonically driven CVT.In this case, the driver is a harmonic oscillator, in other words, V (ξ) = ξ 2 /2.The nonholonomic constraint is then Under the name contact oscillator this case is considered as a test problem in [17].The system is shown to be reversible integrable in [18]; together with reversible KAM theory this gives a theoretical explanation of the excellent numerical results observed in [17].
The vector spanning the kernel of A(ξ) is Since f = 0 and κ is the identity, the evolution matrix (10) is in this case 0 k(ξ) −k(ξ) T 0 , so the matrix subalgebra is so(3).
The second type of driver system is the pendulum driven CVT.Here, the gear ratio (driving system) is governed by a nonlinear pendulum instead of the harmonic oscillator.The potential for the nonlinear pendulum is now (11) V (ξ) = cos(ξ) − ε sin(2ξ)/2, where ε is an arbitrary perturbation parameter, and the nonholonomic constraint is now given by A(ξ) = 1 sin(ξ) .
The vector spanning the kernel of A(ξ) is We refer to the CVT problem with the driving potential (11) as the pendulum driven CVT.As we shall see in § 3.2, ε = 0 corresponds to a non-reversible perturbation, which tends to destroy the good long-time behavior of reversible integrators.
For convenience, the equations of motion are The last type of driver system is the nonholonomic particle, considered in [7].It is a degenerate case of harmonically driven CVT, where the spring of one of the shafts has zero stiffness, so κ = 0 1 .

This gives
Thus, the evolution matrix (10) is so the Lie subalgebra is g = so(2).
An illustration is given in Figure 2. In terms of the data in § 2, the system is defined by Illustration of the knife edge system (13).The contact point of the knife edge, or skate, is sliding under gravity on the inclined plane.The direction of sliding is determined by the angle ξ; one may think of a "one-legged skater", changing direction of his skate according to the driving system.and κ = 0 (it is a 0 × 2 matrix), so y = [ ] (the empty vector).The kernel vector k(ξ) is and the evolution equation of the reduced system (9) is simply v = k(ξ) T f = cos(ξ).
Since y = [ ], the matrix in (10) is given by 0 cos(ξ) 0 0 , so the underlying group is R.
Consider now a slightly perturbed version of the knife edge, where When ε = 0 this is exactly the knife edge.As we shall see in § 3.1, ε = 0 implies non-integrability.

Vertical rolling disk and mobile robot.
The vertical rolling disk is a standard example of a nonholonomic system [4, Illustration of the vertical disk, or rolling penny, given by ( 15).The rotation of the penny is described by x 3 (measured from the z-axis) and the position of the contact point by (x 1 , x 2 ).The directional angle is described by ξ.Because of conservation of angular momentum, we have ξ = 0.
An illustration is given in Figure 3.In terms of § 2, the data are κ = 0 Since y is the empty vector and f = 0, the reduced equation ( 9) is simply v = 0, so the underlying group is the trivial group 1.
A modification of the vertical rolling disk is the mobile robot [7], which corresponds to Thus, the driving system for the mobile robot is ξ = − cos(ξ).
Everything else is identical to the vertical rolling disk.

Numerical experiments
In this section we give examples and counter-examples of good long-time behaviour for nonholonomic integrators applied to the test problems of § 2.2.The counterexamples stem for the perturbed versions of the knife edge and the CVT.The two perturbed problems correspond to two different mechanisms destroying the good long-time behaviour of nonholonomic integrators: (i) by removal of the integrable structure (perturbed knife edge), and (ii) by removal of the reversible structure (perturbed CVT).
3.1.Knife edge.The initial data is This corresponds to an initially horizontal skate which is rotated by the driving system thereby picking up speed in the direction of the skate due to gravity.The unperturbed (ε = 0) and perturbed (ε = 0.1) systems are integrated using 5 methods: DLA 0.5 , DLA 0.4 , DLA 0,1 , DD, and LF.The stepsize for all methods is ∆t = π/10.The integration interval is [0, 100].
Notice that the dynamics of the driver system (ξ, ξ) is trivial for the knife edge (simply ξ = 0).Thus, all the integrators provide the exact solution of the driver system (by integrator consistency).Consequently, all the integrators exactly preserve the energy h of the driving system.
The evolution of the energy error |H(t) − H(0)| for all 5 methods is given in Figure 4. We make the following observations.For the unperturbed system, all integrators except DLA 0.4 exactly or nearly preserves the energy integral H.For the perturbed system, all integrators except DD give a drift in the total energy H.

Continuously variable transmission (CVT).
The system here is the CVT with potential for the driver system given by (11).We consider two different sets of initial data.First, corresponding to total energy H 0 = 5.0.The two different sets of initial data yield two different types of behaviour for the driver system.When the energy level is low (H 0 = 2.8) the phase diagram of the driver system corresponds to a nonlinear pendulum going back and forth: we call this the oscillating driver.When the energy level is high (H 0 = 5.0) the phase diagram of the driver system corresponds to a nonlinear pendulum with enough kinetic energy so that it does not turn back, but  keep on rotating in the same direction: we call this the rotating driver.The setup is illustrated in Figure 5.
The evolution of the passenger energy error |E(t) − E(0)| for all 5 methods is given in Figure 6.The evolution of the driver energy error |h(t) − h(0)| for all 5 methods is given in Figure 7. Since the CVT system is integrable (as detailed in § 4 and § 5), there is an additional integral that is not available explicitly (see Proposition 5.3).Although an explicit formula is not available, we can study this integral at the Poincaré section given by sampling every period of the driver system.It can be interpreted as the latitude along a certain direction (depending on the initial data) of the vector (x 1 , x 2 , v) with v given by (4); we therefore call it the latitude integral.The evolution of the latitude integral error (at the Poincaré section) for all 5 methods is given in Figure 8.
Remark 3.1.Since the first preprint of this paper, the asymmetrically perturbed CVT problem has been used as a test problem for energy-momentum type integrators [5].In that paper, however, the authors evaluate the performance solely based on conservation of energy.This misses the point; to say if an integrators performs well or not on this problem one has to study all the integrals, in particular the latitude integral (for which there is no simple formula).

3.3.
Mechanism for near conservation of integrals.The perturbation theory of Kolomogorov, Arnold, and Moser (KAM theory) comes in two flavours: symplectic and reversible.In short, it states that Hamiltonian/reversible perturbations of Hamiltonian/reversible integrable systems nearly preserve invariant tori.In combination with backward error analysis of numerical integrators, KAM theory Error in the driver energy (7) for the 5 methods applied to the pendulum driven CVT problem (12) for both the oscillating and rotating driver, and two different values of ε.The data are convoluted by a running mean with a window of 30 time units.The only method that does not nearly conserve the driver energy is DD.Notice that even DLA 0.4 gives a good behaviour, despite not being reversible: this is fully explained by the fact that the driver system in itself is a Hamiltonian system and DLA 0.4 is a symplectic integrator in absence of constraints, so backward error analysis of symplectic integrators (cf.[12, § IX.3]) fully explains the near conservation.
rigorously explains near conservation of first integrals for symplectic/reversible integrators applied to Hamiltonian/reversible integrable systems.For a thorough treatment we refer to the monograph by Hairer et al. [12] and references therein.KAM theory, however, is not readily applicable to nonholonomic systems: it applies to integrable systems of ODE.A main result of this paper is that the class of nonholonomically coupled systems introduced in § 2 can be made compatible with the setting of reversible KAM theory for backward error analysis of reversible integrators, provided that the nonholonomic systems and integrators are reversible with respect to the same reversibility map.We now state this result, which in turn relies on results proved in § 4 and § 5.
As always when analyzing constrained systems, the first step is to reduce the constrained system to an ordinary differential equation on a state space manifold M. Details of how this is done for the nonholonomically coupled systems are given in § 4 below.3).This first integral is 'hidden': we have no explicit formula for it.We can only compute it on a Poincaré section by sampling every period of the driver system.The data is convoluted by a running mean with a window of 30 time units.If the numerical integrator is reversible, then preservation of this integral is fully explained by reversible KAM theory, as discussed in § 3.3.Notice that this explains all but one of the results: For the oscillating driver, where reversibility in the driver system is still preserved (corresponding to the green curve in Figure 5), or when ε = 0, all reversible integrators exhibit no drift.For the rotating driver with ε = 0, every integrator fails except the leap-frog method.
It is an open problem to explain why the leap-frog method works for that particular system (note how leap-frog exhibits drift for the perturbed knife edge problem, see Figure 4).Definition 3.2.Consider a vector field X on M. We assume that there is a surjection π : M → N .(In our case, M and N are vector spaces or cylinders, and this surjection is just a linear map.)Assume that X descends to a vector field Y , i.e., π * X = Y .We now say that X is fibrated over an R-integrable system if there is an involution R defined on N , and the system Y is R-integrable (in the precise sense of Definition 5.1 below).
Proof.The theorem is a consequence of the results presented in § 4 and § 5 below, in particular the final result Proposition 5.4.
We now describe the mechanism by which reversible KAM and backward error analysis can be adopted to ODEs fibrated over integrable systems.Given a nonholonomic system whose state space formulation is fibrated over a reversible integrable system, with a projection π and a reversibility map (an involution) R : N → N as in Definition 3.2, suppose that a nonholonomic integrator Φ h : M → M is compatible with π and R in the following sense: (i) It descends to a method Ψ h : N → N , i.e., (ii) The descending integrator Ψ h preserves the reversibility structure R, i.e., Thus, Ψ h is a reversible integrator for the reversible integrable system on N corresponding to the vector field Y .By backward error analysis of numerical integrators on manifolds (see [12, Ch.IX.2] or [13]) we then get, up to exponentially small terms and for exponentially long times, that the integrator Ψ h corresponds to the exact flow of a reversible vector field Y h .By the reversible KAM Theorem [20, Th. 2], this explains the near conservation of the first integrals for the reversible system on N .Since these integrals are unaffected by the fibre motion in the fibration M → N , and since the numerical integrator preserves the fibration, they are also nearly conserved as first integrals of the system on M.This explains near conservation of integrals for reversible perturbations of reversible integrable nonholonomic systems.The final step is to verify that the conditions (i) and (ii) above for the integrators in Appendix A. This is straightforward.Consider, for example, the knife edge example ( § 2.2.2).In this case, the projection π is just π(x, v, ξ, ξ) = (v, ξ, ξ).The condition (i) is true for all methods in Appendix A except DD (the discrete derivative method).Next, the reversibility map is ρ(v, ξ, ξ) = (v, ξ, − ξ).A direct calculations shows that condition (ii) holds for all DLA methods, except DLA α with α = 1 2 .To summarize our findings: if a nonholonomic integrator is compatible with the fibrated integrable structure of a nonholonomic coupled system, and at the same time respects the reversible structure of the integrable system, then reversible KAM theory fully explains the good long time behaviour.If, however, the nonholonomic integrator fails to preserve either the fibration structure or the reversible structure of the underlying integrable ODE, then one cannot expect good long time behaviour.The perturbed problems are exactly constructed to break these structures: The perturbed Knife Edge breaks the fibration over an integrable system, whereas the perturbed CVT for the high energy level, although still fibrated over an integrable system, breaks reversibility.Let us now discuss how our predictions adhere to the numerical results.Let us now go through the two numerical examples (knife edge and CVT) and relate the performance of each method to our theoretical predictions.
By Theorem 3.3 the unperturbed (ε = 0) knife edge is fibrated over a reversible integrable system.By our theoretical predictions, every method that preserves both the fibration and the reversibility should perform well.And indeed they do, as confirmed in the left diagram of Figure 4. We also see that the non-reversible method, DLA 0.4 , exhibits drift.This constitutes strong evidence that reversibility, not the discrete Lagrange-d'Alembert principle, is behind near conservation.Notice that DD exactly preserves the energy by construction.Consider now the more interesting case of the perturbed (ε = 0.1) knife edge.Recall that this perturbation is constructed to destroy the fibration structure, so in this case our theoretical results cannot be applied.In the right diagram of Figure 4 we see that all methods, except of course DD, exhibit energy drift.This shows that reversibility alone is not enough for near conservation; it is more intricate.
We now turn to the unperturbed (ε = 0) CVT problem.Recall that this problem has three first integrals: (i) the driver energy, (ii) the passenger energy, and (ii) the more complicated latitude integral.Also recall that there are two types of drivers depending on the initial conditions: oscillating and rotation (see Figure 5).By Theorem 3.3 the system is fibrated over a reversible integrable system.As expected from our theory, we see in the left columns of Figures 6, 7, and 8 that the methods that are both fiber preserving and reversible nearly conserve the integrals.Notice also that DD exhibit drift in all the integrals, although total energy is exactly conserved by construction.This illustrates that methods that are "forced" to preserve one of the integrals generally do not perform well on other integrals. 2otice in the left column of Figure 7 that the driver energy is nearly conserved for DLA 0.4 despite this integrator not being reversible.This is expected since the driver subsystem is independent from the rest of the dynamics and is actually a Hamiltonian subsystem.Thus, since DLA 0.4 applied to Hamiltonian systems is a symplectic integrator (albeit not reversible), the standard theory of backward error analysis for symplectic integrators explains the near conservation of the driver energy.(The passenger energy, however, is not conserved for DLA 0.4 .) Finally, we turn to the perturbed (ε = 0.5) CVT problem.It is constructed to still be fibrated over an integrable system, but it is no longer reversible with respect to the standard reversibility map (being integrable it is, of course, still reversible with respect to a different, more complicated reversibility map).Actually, there are two possible reversibility maps, reflection in the q-plane or the p-plane, and the reversible methods preserve both of these.ε = 0 destroys the q-reversibility, but the p-reversibility remains.However, when the driver is rotating the p-reversibility becomes "invisible" to the dynamics as illustrated in Figure 5. Therefore, according to our theoretical predictions, the fibre preserving and reversible methods should perform well with an oscillating driver.Indeed they do, as can be seen in the upper left diagram of Figures 6, 7, and 8: only the non-reversible DLA 0.4 and non-descending DD methods exhibit drift.Since the independent driver subsystem is Hamiltonian, we also see for the rotating driver at the bottom right of Figure 7 that all methods that are symplectic when applied to Hamiltonian systems preserve the driver energy, regardless of weather they are reversible or not.So far, our theory is perfectly aligned with the numerical simulations in "both directions": whenever the theory is applicable we observe near conservation (simply verifying the theoretical results), but also, whenever the theory is not applicable we observe drift.According to this, we expect for a rotating driver (where our theory does not apply) to see drift in the passenger energy in all the methods.In the bottom right of Figure 6 we observe drift for all methods but LF.This fact, that LF conserves the first integrals of a non-reversible nonholonomic problem, came as a surprise.At this stage we have no explanation for it.We predict, however, that there is some reversibility map, or some modified symplectic structure, that is shared by that particular problem and the integrator map, and which would allow us to apply KAM theory.
In the remainder of the paper we prove results leading to Theorem 3.3 above.

Linear, periodic systems: averaging and reversibility
Systems of the form (10) in § 2.2 can be written generally as where ξ is the solution of the driving system (5a) and A(ξ) takes values in a Lie algebra g.Assume now that the energy function of the driving system h(ξ, ξ) = ξ2 /2 + V (ξ) has bounded level sets, so that solutions are periodic.After a change of variables, the system (ξ, ξ) can be rewritten using an angle variable θ such that θ = ω(h) for some function of the energy h (the action variable).We are thus led to study systems of the form where A is periodic in θ.
The first aim of this section is to show that systems of the form ( 16) can be transformed into autonomous linear systems by means of averaging.The second aim is to give conditions under which this averaging transformation is a reversible map.
The results in this section (Theorem 4.2 and Theorem 4.4) are generalizations of Theorem 3.1 and Theorem 3.2 in [18].4.1.Averaging.We first study averaging of systems of the form (16). Averaging means that after a reparametrization, the system ( 16) is equivalent to a system of the same form but with a constant matrix A, which can be interpreted as the average of A. We first reformulate systems of the form (16) as follows: Definition 4.1.Let g be a Lie subalgebra of gl(n, R).A g-periodic differential equation is a system of the form where (u, θ) ∈ R n × R/Z and A : R/Z → g is a smooth mapping.Theorem 4.2.Consider a g-periodic system (Definition 4.1).Then there is a smooth change of variables such that the g-periodic system (17) expressed in the new variables (v, θ) takes the form for a constant "average" element A ∈ g.
Proof.One defines the flow map Φ(τ ) as the solution operator of the differential equation defined for all τ ∈ R by (19) dw(τ ) dτ = A(τ )w(τ ) with initial condition at τ = 0 -the initial time matters because the differential equation is not autonomous.This means that if w is a solution of ( 19), then and vice versa.Since A(τ ) ∈ g for all τ , the flow map after one period, i.e., Φ(1), is an element of G. Let G ⊂ GL(n, R), the connected Lie group corresponding to the Lie algebra g.As the exponential is surjective from g to G, there exists A ∈ g such that (21) Φ(1) = exp(A).
We define the mapping Ψ : Recall that, as A is periodic, for any integer n ∈ Z and any τ ∈ R we have (cf.Floquet theory [1, § 28]) Now, v(u, τ ) is periodic in τ , of period one, because, due to (22), and the definition (21) of A, As a result, the mapping Ψ induces a mapping Ψ : R n × R/Z → R n × R/Z.Note that, as Ψ is a smooth change of variables, so is Ψ.We now proceed to show that Ψ sends solutions of (17) to solutions of (18).Consider a solution u(t), τ (t) of the differential equation , so w is a solution of (19).As a result, w(σ) = Φ(σ)w(0), which, using u(t) = w τ (t) , gives u(t) = Φ τ (t) u(t 0 ).We thus obtain that along a solution As a result, we obtain which proves the result for the mapping Ψ and thus also for Ψ.
4.2.Reversibility.We now turn to the question of whether the mapping Ψ defined in Theorem 4.2 can preserve a reversibility structure.We namely equip the space R n × R/Z with a linear involution R, defined as (23) R(u, θ) := (ρu, −θ), for a given linear involution ρ.
We first observe the following condition on ρ which will turn out to be essential for the preservation of the reversibility structure R in Theorem 4.4.
We now turn to the main result of this section.
Theorem 4.4.Assume that (24) holds.Assume further that the average matrix A ∈ g defined in Theorem 4.2 is such that where Ω is a subset of g where the exponential is injective, and such that −Ω ⊂ Ω, and that ρΩρ ⊂ Ω.Then the averaging mapping Ψ : R n × R/Z → R n × R/Z, defined in Theorem 4.2, preserves reversibility, i.e., R • Ψ = Ψ • R.

N
Figure 9.An illustration of the setting of the examples treated in this paper.The actual system is represented by the spirals in the manifold M. The system, however, sits above an integrable system, which foliation in tori is represented downstairs on N .If the numerical integrator descends to a reversible integrator downstairs, then there is no drift in the first integral of that system.Moreover, as the energy depends only on these invariants, there is no energy drift either.

Fibrations over reversible integrable systems
The goal of this section is to show that all the nonholonomically coupled systems studied in § 2.2 are fibrated over a reversible integrable system, as summarized in Table 1.The situation is illustrated in Figure 9.
5.1.Reversible integrability.We briefly recall the definition of an R-integrable, or reversible integrable, system [20].Definition 5.1.Consider a manifold N , equipped with an involution R. A dynamical system, i.e., a vector field Y on the manifold N , is R-integrable if there is a map ϕ : N → R p × (R/Z) k , which we denote by ϕ(x) = (I(x), θ(x)), such that (i) ϕ • R = (I, −θ) (ii) the induced vector field is İ = 0 and θ = ω(I) for a given frequency map ω (iii) the image of the frequency map ω does not lie in a proper linear subspace.
Remark 5.2.The last condition (iii) is called a nondegeneracy, or non resonance, condition [20].Note that the usual non resonance condition (also called diophantine condition [12, §X.2.1]) is not strong enough for our examples, as discussed in [18].
We first make a statement about some g-periodic systems (Definition 4.1).
Proposition 5.3.Consider a g-periodic system.We assume that the average matrix A from Theorem 4.2 is either zero, or, if n ≤ 3, an element of so(n).Suppose further that there exists a map ρ fulfilling (24).Then the system is R-integrable (Definition 5.1), with R defined in (23).
Proof.The assumptions of Theorem 4.2 and Theorem 4.4 are fulfilled, so we obtain a variable transformation which brings the system to the form (18).If A is zero, there are only action variables.If A is in so(n) for n ≤ 3, we have one more angle variable determined by the only angle of the rotation matrix exp(A).
Proposition 5.4.All the reduced systems defined in § 2.2 fulfill the assumptions of Proposition 5.3.
Proof.The only nontrivial case is that of the knife edge, where the group is R.The average matrix A computed in Theorem 4.4 can be computed explicitly in that case: where which, following (14), is simply zero.We define the reversibility structure R from (23), where ρ is defined as follows.For the CVT, ρ is ρ(y, v) = (y, −v).
For the remaining systems, ρ is In both cases, ρ fulfils (24).

Conclusions and open problems
In this section we summarize the main points of our paper and, based on our findings, list a set of open problems.These problems are aimed for the nonholonomic integrators community; our goal is that work on them shall lead to a better understanding of geometric integration algorithms for nonholonomic systems.
We start with the conclusions.The field of nonholonomic integrators aims to construct numerical methods specifically designed for nonholonomic systems, aspiring to outperform standard integration schemes.The starting point for designing such integrators is most often some discrete emulation of the Lagrange-d'Alembert principle.Contrary, however, to the holonomic case, where Hamilton's principle leads to conservation of symplecticity, the phase space flow structure induced by the Lagrange-d'Alembert principle is not fully understood; total energy is always conserved but are there any additional structures?This lack of understanding is reflected in the literature on nonholonomic integrators.Indeed, several different, incompatible discrete versions of the Lagrange-d'Alembert principles have been suggested and it is unclear if one or the other yield better performance.(Compare, for example, the DLA-principle in [7,17] with the GNI approach in [8].)All the suggested discrete frameworks do, however, have a common feature: if the nonholonomic system is reversible (with respect to the standard reversibility map), then the discrete phase flow map preserves reversibility.Since all standard nonholonomic test problems are reversible (and typically also integrable), one can expect that the observed numerical behaviour is explained by theory for reversible dynamical systems (for example reversible KAM theory).In this paper we have concluded this to be the case for a large class of nonholonomic test problems (fibrations over reversible integrable systems).To verify that reversibility alone is responsible for the good long-time behaviour we have developed a new class of perturbed nonholonomic test problems that are still integrable, but no longer reversible with respect to the standard reversibility map.Our simulations show that none of the tested nonholonomic integrators perform well on all of these problems.Of course, our specific selection of test problems does not cover all nonholonomic problems studied in the literature.They do, however, in many ways represent the simplest nontrivial nonholonomic systems and are therefore worthy for primary investigations of nonholonomic integrators (the notion being that before moving on to more complicated problems, an integrator should perform well on the simplest non-trivial problems).
In addition to nonholonomic integrators based on discrete emulation of the Lagrange-d'Alembert principle, another approach is to start directly from conservation of energy and momentum and construct algorithms that exactly preserve these conservation laws (energy-momentum methods, see e.g.[3]).Our numerical simulations with a commonly used method from this class shows that, although both energy and reversibility are preserved, the fibration structure of the test problems is not preserved, even for the reversible problems, which leads to a drift in the "hidden" integrals.This numerical observation is in agreement with our theory based on reversible KAM theory, which cannot be applied unless the fibration structure is also preserved.We thus conclude that forcing conservation of first integrals is not enough to obtain good long-time behaviour.
Open problems.We have formulated a set of problems that reflect, in our opinion, the core challenges for the nonholonomic integrator community.The problems are listed from most to least general.
(1) Backward error analysis for relevant classes of nonholonomic systems.In lack of a complete characterization of nonholonomic systems, one can still develop a numerical backward error analysis (cf.[12]) for subclasses of nonholonomic dynamics where the (reduced) phase space geometry is understood.We suggest to start with the class of integrable nonholonomically coupled systems described in this paper.Thus, the problem consists in developing nonholonomic integrators such that, when they are applied to integrable nonholonomically coupled systems, their modified vector fields on the reduced phase space preserve the structure of being fibrated over integrable systems.(2) Explain leap-frog performance in Figure 6.The lower right diagram of Figure 6 indicates that the leap-frog method nearly conserves the driver energy of a rotating, non-reversible driver.As pointed out, this is the only near conservation behaviour we have seen that is not (directly) explained by reversible KAM theory.The open problem is to come up with a rigorous understanding of why near conservation is observed in this case.A hypothesis is that reversible KAM still can be used, but that one needs to slightly modify the reversibility map which happen to be conserved by leap-frog (but not the other integrators).A rigorous understanding of the unexplained behaviour is likely to shed light on the structure of nonholonomic systems.
(We remark again that leap-frog yields drift in integrals for other nonreversible integrable systems, for example the perturbed knife edge as seen in Figure 4.) (3) Energy-momentum methods that preserve fibrations As seen in Figure 8, the energy preserving method (DD), despite being reversible, performs poorly on the reversible CVT problem (it exhibits drift in the "hidden" first integral).The reason for the poor performance has to do with this integrator not perserving the fibration structure.The last open problem consists in finding an energy-momentum nonholonomic integrator that also preserves the fibration structure.To base the integrator on the average vector field approach, instead of discrete derivatives, might lead to a solution.
Appendix A. Integration schemes for nonholonomic systems In this section we give a brief description of the numerical integrators specifically developed for nonholonomic systems of the form (1).With integrator we mean a map Φ h : (q k , qk ) → (q k+1 , qk+1 ), depending on the time-step parameter h > 0, such that the sequence q 1 , q 2 , . . .with q 0 = q(0), approximates the exact solution sequence q(h), q(2h), . . .for the nonholonomic system under consideration.
Let us give a word of caution here; the notion of reversibility of an integrator depends on the particular non-holonomic system being approximated.Indeed, a method may be reversible for one non-holonomic system, but fail to be so for another system.
(29) This method is second-order accurate and, contrary to DLA A.2. Nonholonomic leap-frog method.The nonholonomic leap-frog method is given by q 1 = q 0 + h q1 q1 = q0 + h(∇V (q 0 ) + A(q 0 ) T λ) 0 = A(q 0 )( q0 + q1 ) This integrator was suggested already in [17, § 5.1], but discarded as it is not based on the DLA principle.A very important property of this method is that it is only linearly-implicit.
Let us define the auxiliary variable q := q 0 + q 1 2 .
We assume that a function ∇V (q 0 , q 1 ) is a discrete gradient of the potential, that is, it fulfills the requirement ∇V (q 0 , q 1 )(q 1 − q 0 ) = V (q 1 ) − V (q 0 ).The method is

4 Figure 4 .
Figure 4. Evolution of the error |H − H(0)| of the energy integral for 5 methods applied to the knife edge.The data are convoluted by a running mean with a window of about 3 time units.For all methods except DLA 0.4 , the error for the unperturbed system (ε = 0) is bounded in time.For the perturned system (ε = 0.1) all methods except DD show energy drift.

5 Figure 5 .
Figure 5. Phase diagrams for the (ξ, ξ) subsystem of the CVT problem, with ε = 0 (left) and ε = 0.5 (right).The circular (oscillating driver) and upper (rotating driver) paths correspond, respectively, to the low (H 0 = 2.8) and high (H 0 = 5.0) energy levels.Both diagrams are symmetric under the standard reversibility map ξ → − ξ.Notice that the left diagram also is symmetric under the 'non-physical' reversibility map ξ → −ξ, whereas the right diagram does not have this symmetry (due to the perturbation).

Figure 6 .
Figure 6.Error in the passenger energy(6) for the 5 methods applied to the pendulum driven CVT problem(12) for both the oscillating and rotating driver, and two different values of ε.The data are convoluted by a running mean with a window of 30 time units.In the bottom right diagram, a systematic drift of the energy occurs for all methods but LF.

4 Figure 7 .
Figure 7. Error in the driver energy(7) for the 5 methods applied to the pendulum driven CVT problem(12) for both the oscillating and rotating driver, and two different values of ε.The data are convoluted by a running mean with a window of 30 time units.The only method that does not nearly conserve the driver energy is DD.Notice that even DLA 0.4 gives a good behaviour, despite not being reversible: this is fully explained by the fact that the driver system in itself is a Hamiltonian system and DLA 0.4 is a symplectic integrator in absence of constraints, so backward error analysis of symplectic integrators (cf.[12, § IX.3]) fully explains the near conservation.

4 Figure 8 .
Figure 8. Error in the latitude integral corresponding to the extra first integral due to integrable structure of the CVT problem (seeProposition 5.3).This first integral is 'hidden': we have no explicit formula for it.We can only compute it on a Poincaré section by sampling every period of the driver system.The data is convoluted by a running mean with a window of 30 time units.If the numerical integrator is reversible, then preservation of this integral is fully explained by reversible KAM theory, as discussed in § 3.3.Notice that this explains all but one of the results: For the oscillating driver, where reversibility in the driver system is still preserved (corresponding to the green curve in Figure5), or when ε = 0, all reversible integrators exhibit no drift.For the rotating driver with ε = 0, every integrator fails except the leap-frog method.It is an open problem to explain why the leap-frog method works for that particular system (note how leap-frog exhibits drift for the perturbed knife edge problem, see Figure4).