1 Introduction

A geodesic on a (pseudo) Riemannian manifold is uniquely specified by a point and a tangent direction. Moreover, if two points are sufficiently close to each other, then there exists exactly one length minimising (or, in Lorentzian geometry, maximising) curve between these two points, which is a geodesic. This formulation relies on the methods of the calculus of variations and enables a construction of normal neighbourhoods, as well as the analysis of the Jacobi fields determining the existence of conjugate points.

The variational formulation has been missing in conformal geometry, where conformal geodesics arise as solutions to a system of third-order ODEs: a conformal geodesic is uniquely specified by a point, a tangent direction, and a perpendicular acceleration [3, 18, 19]. The odd order of the underlying system of equations is not amenable to the usual methods of the calculus of variations, where the resulting Euler–Lagrange equations for non-degenerate Lagrangians are of even order.

One way around this difficulty, which we explore in this paper, is to consider a more general class of variations. As we shall see, this allows us to terminate the procedure of integration by parts when the integrand depends on a derivative of a variation. This argument relies on a number of technical steps: carefully controlling boundary terms, respecting conformal invariance, and making sure that the fundamental lemma of the calculus of variations can be applied to the extended class of variations.

In the next section, we shall formulate the conformal geodesic equations and summarise the notation. In Sect. 3, we shall propose two ways to deal with third-order equations from the variational perspective. In Sect. 4, we formulate the main result of our paper (Theorem 4.1) and compute the variation of the conformally invariant functional associated to a third-order Lagrangian. While the standard variational procedure leads to a fourth-order conformally invariant equation (4.14), looking at the extended class of variations reduces the order of the Euler–Lagrange equations to 3, and gives conformal geodesics as extremal curves. In Sect. 5, we focus on the conformally flat case, where the link between the third- and fourth-order systems is particularly clear, and the Hamiltonian formalism can be constructed. In particular, we show that logarithmic spirals arise as solutions to the fourth-order system for a particular class of initial conditions. In Sect. 6, we show how the Lagrangian of Theorem 4.1 can be interpreted as the ‘free particle’ Lagrangian on the total space of the tractor bundle. Finally, in Sect. 7, we construct a degenerate Lagrangian which uses a skew-symmetric tensor. This gives rise, via a Legendre transform, to a Poisson structure on the second-order tangent bundle.

2 Conformal geodesic equations

A conformal class on an n-dimensional smooth manifold M is an equivalence class of (pseudo) Riemannian metrics, where two metrics \(\hat{g}\) and g are equivalent if there exists a nowhere zero function \(\Omega \) on M such that

$$\begin{aligned} \hat{g}=\Omega ^2 g. \end{aligned}$$
(2.1)

If a metric \(g\in [g]\) has been chosen, then \({\langle }X, Y{\rangle }\) denotes the inner product of two vector fields with respect to this metric. We also set \(|X|^2\equiv \langle X, X\rangle \) and use the notation \(\psi (X)\) for the \((k-1)\)-form arising as a contraction of the k-form \(\psi \) with the vector field X.

Let \(\gamma \) be a curve of class at least \(C^3\) in M, parametrised by t, and let U be a nowhere vanishing tangent vector to \(\gamma \) such that \(U(t)=1\). The acceleration vector of \(\gamma \) is \(A=\nabla _U U\), where \(\nabla _U \equiv U^a\nabla _a\) is the directional derivative along \(\gamma \), and \(\nabla \) is the Levi-Civita connection of g. The conformal geodesic equations in their conformally invariant form given by Bailey and Eastwood [3] are

$$\begin{aligned} E\equiv \nabla _U A-\frac{3\langle U, A\rangle }{|U|^2}A+\frac{3|A|^2}{2|U|^2}U-|U|^2P^{\sharp }(U)+2P(U,U)U=0. \end{aligned}$$
(2.2)

The Schouten tensor \(P\in \Gamma (T^*M\otimes T^*M)\) of g is given by

$$\begin{aligned} P=\frac{1}{n-2}\Big (r-\frac{1}{2(n-1)}Sg\Big ), \end{aligned}$$

in terms of the Ricci tensor r and the Ricci scalar S of g. The symbol \(P^{\sharp }\) is an endomorphism of TM defined by \(\langle P^{\sharp }(X), Y\rangle =P(X, Y)\) for all vector fields X, Y.

Changing the metric to \(\hat{g}\) as in (2.1) results in changes to the Schouten tensor, the Levi-Civita connection and the acceleration

$$\begin{aligned} \hat{P}= & {} P-\nabla \Upsilon + \Upsilon \otimes \Upsilon -\frac{1}{2}|\Upsilon |^2g,\\ {\hat{\nabla }}_X Y= & {} \nabla _X Y+\Upsilon (Y)X+\Upsilon (X)Y-\langle X, Y\rangle \Upsilon ^{\sharp },\\ \hat{A}= & {} A-|U|^2\Upsilon ^{\sharp }+2\Upsilon (U)U, \end{aligned}$$

where \(\Upsilon \equiv \Omega ^{-1}d\Omega \), and \(\Upsilon ^{\sharp }\) is a vector field defined by \(\Upsilon (X)=\langle \Upsilon ^{\sharp }, X\rangle \). It is now a matter of explicit calculation to verify that the conformal geodesic equations (2.2) are conformally invariant.
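As a quick consistency check (a sketch, assuming only the transformation rule for the Levi-Civita connection stated above and that the parametrisation, and hence U, is kept fixed), the formula for \(\hat{A}\) follows by setting \(X=Y=U\):

$$\begin{aligned} \hat{A}={\hat{\nabla }}_U U=\nabla _U U+\Upsilon (U)U+\Upsilon (U)U-\langle U, U\rangle \Upsilon ^{\sharp }=A+2\Upsilon (U)U-|U|^2\Upsilon ^{\sharp }. \end{aligned}$$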

3 Lagrangians for third-order equations

For a non-degenerate Lagrangian, the order of the resulting Euler–Lagrange equations is equal to twice the order of the highest derivative appearing in the Lagrangian, so that in particular the Euler–Lagrange equations are of even order. Two approaches can be taken to deal with third-order systems (while they will also be applicable to systems of other odd orders, for clarity we focus on order 3):

  1. To allow Lagrangians which are quadratic in the acceleration and terminate the procedure of integration by parts at the level of third-order derivatives, consider variations V which do not keep end points fixed, but only satisfy \({\dot{V}}(t_0)={\dot{V}}(t_1)=0\), where \(t_0, t_1\) are values of the parameter at end points. This enlarges the class of variations of extremal curves, and so reduces the number of these curves as well as the order of the resulting Euler–Lagrange equations from 4 to 3.

  2. Consider degenerate Lagrangians which are linear in acceleration, and necessarily involve an anti-symmetric tensor.

To illustrate both approaches with an elementary example, consider a curve \(\gamma \) in \(\mathbb {R}^n\) parametrised by \(t\rightarrow X(t)\), and aim to obtain the third-order system

$$\begin{aligned} \dddot{X}=0 \end{aligned}$$
(3.3)

from a variational principle. Let \(\Gamma :[-1, 1]\times [t_0, t_1]\rightarrow M=\mathbb {R}^n\) be a one-parameter family of curves parametrised by \(s\in [-1, 1]\), such that

$$\begin{aligned} \Gamma (0, t)=\gamma (t), \quad \frac{\partial \Gamma }{\partial t}|_{s=0}=U, \quad \text{ and }\quad \frac{\partial \Gamma }{\partial s}|_{s=0}=V \end{aligned}$$

so that the variation V is also a vector field along \(\gamma \).

In the first approach, we take

$$\begin{aligned} I[X]=\frac{1}{2}\int _{t_0}^{t_1}|\ddot{X}|^2 {\mathrm {d}}t \end{aligned}$$

so that one integration by parts gives

$$\begin{aligned} I[X+sV]=I[X]-s\int _{t_0}^{t_1}\langle \dddot{X}, {\dot{V}}\rangle {\mathrm {d}}t+o(s). \end{aligned}$$

If the variation \({\dot{V}}\) vanishes at the end points and is otherwise arbitrary, then \(\delta I=0\) iff (3.3) holds.
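The statement above can be illustrated numerically. The following is a minimal sketch, not part of the original argument: the interval \([0, 1]\), the variation \(V(t)=(\cos \pi t, 1+\cos 2\pi t)\) and the two test curves are hypothetical choices made here for illustration. The variation has \({\dot{V}}=0\) at both end points while V itself does not vanish there, and the first variation \(\int \langle \ddot{X}, \ddot{V}\rangle {\mathrm {d}}t\) is evaluated with Simpson's rule.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def first_variation(xdd, vdd, t0=0.0, t1=1.0):
    """dI/ds at s = 0 for I = 1/2 int |X''|^2 dt, i.e. int <X'', V''> dt."""
    return simpson(lambda t: sum(a * b for a, b in zip(xdd(t), vdd(t))), t0, t1)

# Hypothetical variation V(t) = (cos(pi t), 1 + cos(2 pi t)) on [0, 1]:
# V' vanishes at t = 0 and t = 1, but V does not. Only V'' is needed below.
def vdd(t):
    return (-math.pi ** 2 * math.cos(math.pi * t),
            -4 * math.pi ** 2 * math.cos(2 * math.pi * t))

# X(t) = (t^2, 3t - t^2) solves X''' = 0, whereas X(t) = (t^3, 0) does not.
def quadratic_xdd(t):
    return (2.0, -2.0)

def cubic_xdd(t):
    return (6.0 * t, 0.0)

dI_quadratic = first_variation(quadratic_xdd, vdd)
dI_cubic = first_variation(cubic_xdd, vdd)

if __name__ == "__main__":
    print("first variation, quadratic curve:", dI_quadratic)
    print("first variation, cubic curve:    ", dI_cubic)
```

For the quadratic curve the first variation vanishes even though the end points move, while for the cubic curve it does not, in line with the claim that this class of variations singles out solutions of (3.3).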

There appear to be two immediate obstructions to generalising this approach to the conformal geodesic system (2.2), where \({\dot{X}}=U, \ddot{X}=A+\dots \), and \(\dddot{X}=\nabla _U A+\dots \), where \(\dots \) denote the lower-order terms involving the Christoffel symbols and curvature. Firstly, if there is an explicit X-dependence in the Lagrangian, then the undifferentiated variation V appears in the integrand. Secondly \({\dot{V}}=\nabla _U V+\dots \) is not conformally invariant. We shall get around both obstructions by modifying \(\nabla _U V\) to a first-order conformally invariant linear operator along \(\gamma \)

$$\begin{aligned} \mathcal {D}(V)=\nabla _UV+|U|^{-2}(\langle A, V\rangle U-\langle U, V\rangle A-\langle A, U\rangle V). \end{aligned}$$
(3.4)

This operator differs from the derivative \(\nabla _U\) along \(\gamma \) by a linear operator which depends on the second jet of \(\gamma \). It is the unique, up to the reparametrisation of \(\gamma ,\) conformally invariant adjustment of \(\nabla _U\). In the proof of Theorem 4.1, we shall see that the linear operator of order zero, \(\mathcal {D}(V)-\nabla _UV\), has the effect of cancelling some of the V contributions in the variation of the functional, and that all these contributions can be cancelled in the conformally flat case.
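Two elementary properties of \(\mathcal {D}\) follow directly from (3.4) and are worth recording as a sketch; here f denotes an arbitrary function along \(\gamma \), a symbol used for illustration. The velocity lies in the kernel of \(\mathcal {D}\), and \(\mathcal {D}\) satisfies the Leibniz rule:

$$\begin{aligned} \mathcal {D}(U)= & {} A+|U|^{-2}(\langle A, U\rangle U-\langle U, U\rangle A-\langle A, U\rangle U)=A-A=0,\\ \mathcal {D}(fV)= & {} \dot{f}V+f\mathcal {D}(V). \end{aligned}$$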

For the second approach, let \(\Omega \in \Lambda ^2(\mathbb {R}^n)\) be a constant two-form, and set

$$\begin{aligned} I[X]=\int _{t_0}^{t_1}\Omega (\ddot{X}, {\dot{X}}) {\mathrm {d}}t. \end{aligned}$$

If \(\Gamma (s, t)=X(t)+sV(t)+o(s)\), and the variation V and its derivative vanish at the end points, then integrating by parts twice gives

$$\begin{aligned} \delta I=\int _{t_0}^{t_1} 2\Omega (\dddot{X}, V){\mathrm {d}}t, \qquad \text{ so } \text{ that }\quad \Omega (\dddot{X}, \,\cdot )=0. \end{aligned}$$

If the dimension n is even, and \(\Omega \) is non-degenerate (so that \(\Omega \) is a symplectic form), then this implies (3.3).

4 Main theorem

Let E be the vector field along a smooth curve \(\gamma \) defined by equation (2.2). Define a third-order Lagrangian \(\mathcal {L}\), and the corresponding functional \(I[\gamma ]\) by

$$\begin{aligned} \mathcal {L}=\frac{\langle U, E\rangle }{|U|^2}, \end{aligned}$$
(4.5)

and

$$\begin{aligned} I[\gamma ]=\int _{t_0}^{t_1} \mathcal {L}{\mathrm {d}}t. \end{aligned}$$
(4.6)

The Lagrangian \(\mathcal {L}\) is conformally invariant, because the expression E is.

Theorem 4.1

The first variation of the functional (4.6) is given by

$$\begin{aligned} \delta I=\int _{t_0}^{t_1} |U|^{-2}(\langle K, V\rangle -\langle E-2\mathcal {L}U, \mathcal {D}(V)\rangle ){\mathrm {d}}t+\mathcal {B}(V)|^{t_1}_{t_0}, \end{aligned}$$
(4.7)

where K is a vector field along \(\gamma \) given, in terms of the Weyl tensor W, by

$$\begin{aligned} K^e=g^{ec}({W_{bca}}^d U^a U^b A_d-2|U|^2\nabla _{[c} P_{a]b} U^aU^b), \quad a,b, \dots =1, \dots , n \end{aligned}$$
(4.8)

and

$$\begin{aligned} \mathcal {B}(V)=|U|^{-2}(\langle U, \mathcal {D}^2(V)\rangle -\langle E-2\mathcal {L}U, V\rangle ), \end{aligned}$$
(4.9)

where \(\mathcal {D}\) is the operator (3.4).

Proof

The proof is by a cumbersome calculation. We shall list the main steps and give enough details so that the reader can verify our computations.

The third-order term \(|U|^{-2}\langle U, \nabla _U A\rangle \) in (4.5) can be integrated by parts to give

$$\begin{aligned} \frac{{\mathrm {d}}}{{\mathrm {d}}t}\Big (\frac{\langle U, A\rangle }{|U|^2}\Big )- \Big (\frac{|A|^2}{|U|^2}-\frac{2\langle U, A\rangle ^2}{|U|^4}\Big ), \end{aligned}$$

which results in the alternative form

$$\begin{aligned} \mathcal {L}=\frac{{\mathrm {d}}}{{\mathrm {d}}t}\Big (\frac{\langle U, A\rangle }{|U|^2}\Big )+\mathcal {L}_1, \quad \text{ where }\quad \mathcal {L}_1=\frac{1}{2}\frac{|A|^2}{|U|^2}-\frac{\langle U, A\rangle ^2}{|U|^4}+P(U, U). \end{aligned}$$
(4.10)

The term \(\mathcal {L}_1\) coincides, up to a constant multiple, with the Lagrangian considered in [3]. Neither the resulting boundary term nor \(\mathcal {L}_1\) is conformally invariant. We shall therefore focus on \(\mathcal {L}\), but use the variation of \(\mathcal {L}_1\) as an intermediate step.

First disregard the boundary term in (4.10), and consider variations of the functional \(I_1[\gamma ]=\int _{t_0}^{t_1}\mathcal {L}_1 {\mathrm {d}}t\). This yields

$$\begin{aligned} \delta I_1= & {} \int _{t_0}^{t_1}\biggl [\frac{\langle \nabla ^2_UV, A\rangle }{|U|^2}-\frac{\langle A, A\rangle \langle \nabla _UV, U\rangle }{|U|^4}+ \frac{\langle R(V,U)U, A\rangle }{|U|^2}\\&-2\frac{\langle \nabla _U^2V, U\rangle \langle U, A\rangle }{|U|^4} -2\frac{\langle \nabla _UV, A\rangle \langle U, A\rangle }{|U|^4}\\&+4\frac{\langle U, A\rangle ^2\langle \nabla _UV, U\rangle }{|U|^6} +(\nabla _VP)(U,U)+2P(\nabla _UV,U)\biggr ]{\mathrm {d}}t. \end{aligned}$$

The Riemann tensor R arises from commuting the covariant derivatives \(\nabla _V\) and \(\nabla _U\) when varying the acceleration, \(\nabla _VA=\nabla ^2_UV+R(V,U)U\). We eliminate R in favour of the Weyl tensor W and the Schouten tensor P using the formula

$$\begin{aligned} \begin{aligned} \langle W(V,U)U, A\rangle =&\,\,\langle R(V,U)U, A\rangle -\langle U, U\rangle P(V,A)-\langle V, A\rangle P(U,U)\\&+\langle A, U\rangle P(V,U)+\langle V, U\rangle P(A,U). \end{aligned} \end{aligned}$$

We substitute

$$\begin{aligned} 2P(\nabla _UV,U)=2\nabla _U(P(V,U))-2(\nabla _UP)(V,U)-2P(V,A), \end{aligned}$$

and integrate the following terms, with the given coefficients, by parts:

$$\begin{aligned} \frac{\langle \nabla ^2_UV, A\rangle }{|U|^2},\quad - \frac{\langle \nabla _UV, A\rangle \langle U, A\rangle }{|U|^4},\quad -\frac{1}{2}\frac{\langle A, A\rangle \langle \nabla _UV, U\rangle }{|U|^4},\\ -2\frac{\langle \nabla _U^2V, U\rangle \langle U, A\rangle }{|U|^4},\qquad 2\frac{\langle U, A\rangle ^2\langle \nabla _UV, U\rangle }{|U|^6},\quad \nabla _U(P(V,U)). \end{aligned}$$

These terms were selected by a systematic, but somewhat tedious procedure, which starts from the highest-order term and ensures that, apart from the inner product \(\langle K, V\rangle \), only terms involving \(\mathcal {D}(V)\) appear in the integrand. The boundary terms arising from these integrations are combined with the boundary term in (4.10), which gives an expression for the variation \(\delta I\). To arrive at the statement (4.7) in the theorem, we use (3.4) to eliminate \(\nabla _U V\) in favour of \(\mathcal {D}(V)\). \(\square \)

4.1 Conformal geodesic equations

The boundary term \(\mathcal {B}\) given by (4.9) is conformally invariant, as both \(\mathcal {D}\) and E are. Furthermore, the vector field K along \(\gamma \) given by (4.8) is conformally invariant, and consequently the integral

$$\begin{aligned} \mathcal {K}(V)\equiv \int _{t_0}^{t_1}|U|^{-2}\langle K, V\rangle {\mathrm {d}}t, \end{aligned}$$
(4.11)

defines a conformally invariant linear operator acting on variational vector fields along a given curve \(\gamma \). We shall exploit both \(\mathcal {B}\) and \(\mathcal {K}\) to define a class of variations needed in the following corollary.

Corollary 4.2

The functional \(I[\gamma ]\) is stationary under the class of variations such that

$$\begin{aligned} \mathcal {B}(V)|^{t_1}_{t_0}=-\mathcal {K}(V) \end{aligned}$$
(4.12)

if and only if the conformal geodesic equations (2.2) hold.

Proof

The proof relies on a modification of the fundamental lemma of the calculus of variations, which we recall following [13]:

Lemma 4.3

(Fundamental Lemma of Calculus of Variations) If \(Y:[t_0, t_1]\rightarrow \mathbb {R}^n\) is continuous, and such that

$$\begin{aligned} \int _{t_0}^{t_1}\langle W, Y\rangle {\mathrm {d}}t=0 \end{aligned}$$

for all continuous \(W:[t_0, t_1]\rightarrow \mathbb {R}^n\) with \(W(t_0)=W(t_1)=0\), then Y is identically 0.

In the proof, one assumes that W is compactly supported and has continuous derivatives up to some specified order. For example, all components of W may vanish, except the kth component which is given by a bump function \(\rho (t)\), where

$$\begin{aligned} \rho (t)= {\left\{ \begin{array}{ll} (t-a)(b-t) &{} \text {if } t\in (a, b), \quad t_0<a<b<t_1\\ 0 &{}\text {otherwise.} \end{array}\right. }. \end{aligned}$$

We now proceed to proving Corollary 4.2. First observe that formula (4.7) immediately proves that \(I[\gamma ]\) is stationary under the class of variations such that (4.12) holds, provided that \(\gamma \) is a solution to (2.2). In order to establish the converse, we shall proceed by contradiction. For this, assume that \(\delta I[\gamma ]=0\) but \(E(t^*)\ne 0\) at some point \(t^*\in [t_0,t_1]\) (by continuity, we may assume that \(t^*\) is in the interior of the interval). Our goal is to construct a variation V satisfying (4.12), and such that

$$\begin{aligned} \int _{t_0}^{t_1}|U|^{-2}\langle E-2\mathcal {L}U, \mathcal {D}(V)\rangle {\mathrm {d}}t>0 \end{aligned}$$
(4.13)

which, together with (4.7), and the Fundamental Lemma applied to \(W={\mathcal D}(V)\) will give a contradiction with \(\delta I[\gamma ]=0\).

In order to complete this task, let us notice that there is a vector field \(V_0\) in the kernel of the differential operator \(\mathcal {D}\) such that \(V_0(t^*)=E(t^*)-2\mathcal {L}U(t^*)\). Indeed, \(\mathcal {D}\) is a first-order differential operator and one can impose an arbitrary condition on the value of \(V_0\) at a given point \(t^*\) when solving the ODE \(\mathcal {D}(V_0)=0\). Now, let \(\rho \) be a bump function concentrated in a sufficiently small neighbourhood of \(t^*\), and let \({\dot{\phi }}=\rho \). Set \({\widetilde{V}}=\phi V_0\). Then, \(\mathcal {D}({\widetilde{V}})=\rho V_0\) and we find

$$\begin{aligned} \int _{t_0}^{t_1}|U|^{-2}\langle E-2\mathcal {L}U, \mathcal {D}({\widetilde{V}})\rangle {\mathrm {d}}t>\omega \end{aligned}$$

for some real number \(\omega >0\). If \({\widetilde{V}}\) satisfies (4.12), then the proof is complete. However, this condition is not satisfied in general. Therefore, we shall find a correction \(\hat{V}\) such that \(V={\widetilde{V}}+{\hat{V}}\) satisfies (4.12) and (4.13) as well. In order to find \({\hat{V}}\) explicitly, we exploit the fact that the velocity U satisfies \(\mathcal {D}(U)=0\) and \(\langle K, U\rangle =0\). Thus, for any function f and \(\hat{V}=fU\), we find \(\mathcal {K}({\hat{V}})=0\) and consequently \(\mathcal {B}(\hat{V})|^{t_1}_{t_0}+\mathcal {K}({\hat{V}})=\left( \ddot{f} +f|U|^{-2}\langle E, U\rangle \right) |^{t_1}_{t_0}\). Now, f can be taken such that it is zero everywhere outside a small neighbourhood of \(t_0\), the value of \(\ddot{f}(t_0)\) is arbitrarily large, and the values of f and \(\dot{f}\) are arbitrarily small. For instance, \(f(t)=\frac{\kappa }{2}(t-c)^2\) for \(t\in [t_0,c]\) and \(f(t)=0\) otherwise satisfies these conditions for appropriately chosen constants \(\kappa \in \mathbb {R}\) and \(c>t_0\) (the function f can also be regularised to be of class \(C^\infty \)). It follows that \({\hat{V}}\) can be picked such that, on the one hand, \(\mathcal {B}({\hat{V}})|^{t_1}_{t_0}+\mathcal {B}(\widetilde{V})|^{t_1}_{t_0}=-\mathcal {K}({\hat{V}})-\mathcal {K}({\widetilde{V}})\), since the value of \(\ddot{f}(t_0)\) in \(\mathcal {B}(\hat{V})|^{t_1}_{t_0}\) can take any value we want, and, on the other hand, \(\int _{t_0}^{t_1}|U|^{-2}\langle E-2\mathcal {L}U, \mathcal {D}(\hat{V})\rangle {\mathrm {d}}t<\frac{\omega }{2}\), since \(\mathcal {D}({\hat{V}})=\dot{f} U\) and \(\dot{f}\) can be arbitrarily small. This completes the proof. \(\square \)
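The boundary expression for \(\hat{V}=fU\) used in the proof can be recovered as follows. This is a sketch which relies only on \(\mathcal {D}(U)=0\), the Leibniz rule for \(\mathcal {D}\), the definition (4.9), and \(\mathcal {L}=|U|^{-2}\langle U, E\rangle \) from (4.5):

$$\begin{aligned} \mathcal {D}(fU)= & {} \dot{f}U+f\mathcal {D}(U)=\dot{f}U, \qquad \mathcal {D}^2(fU)=\ddot{f}U,\\ \mathcal {B}(fU)= & {} |U|^{-2}\big (\ddot{f}|U|^2-f\langle E, U\rangle +2f\mathcal {L}|U|^2\big )=\ddot{f}+f|U|^{-2}\langle E, U\rangle . \end{aligned}$$

In the same way, \(\langle E-2\mathcal {L}U, \mathcal {D}(fU)\rangle =-\dot{f}\langle E, U\rangle \), which is small whenever \(\dot{f}\) is.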

4.2 Conformal Mercator equation

Integrating (4.7) by parts once more, and neglecting the boundary term then gives

$$\begin{aligned} \delta I=\int _{t_0}^{t_1}{\langle }-\mathcal {D}^*(|U|^{-2}(E-2\mathcal {L}U))+|U|^{-2}K, V{\rangle }{\mathrm {d}}t=0. \end{aligned}$$

The fundamental lemma applied to the arbitrary variation V now gives the fourth-order system which we shall call the conformal Mercator equation (the terminology will be justified in the next section)

$$\begin{aligned} \mathcal {D}^*\Big (\frac{E-2\mathcal {L}U}{|U|^2}\Big )-\frac{K}{|U|^2}=0, \end{aligned}$$
(4.14)

where

$$\begin{aligned} \mathcal {D}^*=-\nabla _U+|U|^{-2}({\langle }U, \cdot {\rangle }A-{\langle }A, \cdot {\rangle }U-{\langle }A, U{\rangle }\text{ Id}) \end{aligned}$$

is the adjoint of \(\mathcal {D}\) with respect to the \(L^2\) inner product, and \(K, E, \mathcal {L}\) are given by (4.8), (2.2) and (4.5), respectively. Unlike \(\mathcal {D}\), the operator \(\mathcal {D}^*\) is not conformally invariant. The fourth-order system (4.14) is nevertheless conformally invariant as the conformal weight \(-2\) of the term \(|U|^{-2}\) balances the contributions from \(\mathcal {D}^*\) if g changes according to (2.1). This equation also arises from the second-order Lagrangian \(\mathcal {L}_1\) in (4.10) under the assumption that the variation V and its derivative \(\nabla _U V\) vanish at the end points \(t_0\) and \(t_1\). Therefore, \(\mathcal {L}_1\) leads to a boundary value problem for (4.14), where two points on the curve \(\gamma \) and two tangent vectors at these points are specified. This, in n dimensions, gives 4n conditions which is what one would expect for a fourth-order system.
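The formula for \(\mathcal {D}^*\) can be checked pointwise. In the following sketch Z denotes an arbitrary vector field along \(\gamma \) (a symbol introduced here only for illustration), and the computation uses the metric compatibility of \(\nabla \) together with (3.4):

$$\begin{aligned} \langle Z, \mathcal {D}(V)\rangle = & {} \frac{{\mathrm {d}}}{{\mathrm {d}}t}\langle Z, V\rangle -\langle \nabla _UZ, V\rangle +|U|^{-2}\big (\langle U, Z\rangle \langle A, V\rangle -\langle A, Z\rangle \langle U, V\rangle -\langle A, U\rangle \langle Z, V\rangle \big )\\ = & {} \frac{{\mathrm {d}}}{{\mathrm {d}}t}\langle Z, V\rangle +\langle \mathcal {D}^*(Z), V\rangle . \end{aligned}$$

The two sides therefore differ by a total derivative, which integrates to the boundary term neglected in deriving (4.14).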

If the metric g is conformally flat, so that \(K=0\), then any solution to the conformal geodesic equation (2.2) is also a solution to (4.14). In the next section, we shall explore this conformally flat case in greater detail.

5 Conformally flat case and spirals

Assume that \(M=\mathbb {R}^n\), the conformal class is flat, and choose a flat metric representative g with vanishing Christoffel symbols. Then, the Schouten tensor P vanishes, and the fourth-order system (4.14) arising, for arbitrary variations V such that V and its first derivatives vanish at the end points, is

$$\begin{aligned} \frac{{\mathrm {d}}C}{{\mathrm {d}}t}= & {} 0, \quad \text{ where }\nonumber \\ C= & {} \frac{1}{|U|^2}\Big ({\dot{A}}-\frac{|A|^2}{|U|^2}U-2\frac{{\langle }A, U{\rangle }}{|U|^2}A+4\frac{{\langle }A, U{\rangle }^2}{|U|^4}U -2\frac{{\langle }{\dot{A}}, U{\rangle }}{|U|^2}U \Big )\nonumber \\ \end{aligned}$$
(5.15)

or, eliminating the \({\langle }{\dot{A}}, U{\rangle }\) term,

$$\begin{aligned} {\dot{A}}+\frac{|A|^2}{|U|^2}U-\frac{2{\langle }A, U{\rangle }}{|U|^2}A+2{\langle }C, U{\rangle }U-|U|^2C=0, \quad C=\text{ const }. \end{aligned}$$
(5.16)
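The elimination step can be made explicit; the following is a sketch using only the definition of C in (5.15). Taking the inner product of \(|U|^2C\) with U gives

$$\begin{aligned} |U|^2{\langle }C, U{\rangle }=-{\langle }{\dot{A}}, U{\rangle }-|A|^2+2\frac{{\langle }A, U{\rangle }^2}{|U|^2}, \qquad \text{ so } \text{ that }\quad {\langle }{\dot{A}}, U{\rangle }=-|U|^2{\langle }C, U{\rangle }-|A|^2+2\frac{{\langle }A, U{\rangle }^2}{|U|^2}. \end{aligned}$$

Substituting this back into the definition of C cancels the term \(4|U|^{-4}{\langle }A, U{\rangle }^2U\), changes the sign of the \(|A|^2\) term, and yields (5.16).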

Picard’s existence and uniqueness theorem implies that a solution curve to (5.15) is determined by specifying initial conditions \(X(0), U(0), A(0), {\dot{A}}(0)\). Specifying these values also determines C in (5.15), and conversely specifying X(0), U(0), A(0), C determines \({\dot{A}}(0)\) by (5.16). There is an advantage in using C instead of \({\dot{A}}(0)\) in the initial conditions, as C stays constant along the integral curves of (5.15). This can be used to verify directly that all conformal geodesics are integral curves of (5.15) for special initial conditions. Indeed, setting

$$\begin{aligned} C=\Big (\frac{1}{2}\frac{|A|^2}{|U|^4}-2\frac{{\langle }A, U{\rangle }^2}{|U|^6}\Big )U+\frac{{\langle }A, U{\rangle }}{|U|^4}A, \end{aligned}$$
(5.17)

and substituting this into (5.16) gives the conformal geodesic equations for the flat metric

$$\begin{aligned} {\dot{A}}-\frac{3{\langle }A, U{\rangle }}{|U|^2}A+\frac{3|A|^2}{2|U|^2}U=0. \end{aligned}$$
(5.18)

We also verify that \({\dot{C}}=0\), as a consequence of (5.18). Thus (5.17) gives a first integral of the conformal geodesic equations. The general solutions to these equations are projectively parametrised circles

$$\begin{aligned} t\rightarrow X(t)=X_0+\frac{tU_0+t^2 A_0}{1+t^2|A_0|^2}, \end{aligned}$$
(5.19)

where \(U_0\) is a constant unit vector, and \({\langle }U_0, A_0{\rangle }=0\). Therefore, (5.17) evaluated at \(t=0\) defines a submanifold in the space of initial conditions which singles out conformal geodesics as integral curves.
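The two claims above, that (5.19) solves (5.18) and that C in (5.17) is constant along conformal geodesics, can be checked numerically. The sketch below is illustrative only: the fourth-order Runge–Kutta integrator, the step size, and the sample initial data \(X_0=0\), \(U_0=(1,0,0)\), \(A_0=(0,\tfrac{1}{2},0)\) are choices made here, not taken from the text. Note that (5.19) has \(\ddot{X}(0)=2A_0\), so the initial acceleration passed to the integrator is \(2A_0\).

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def jerk(U, A):
    """dA/dt from the flat conformal geodesic equation (5.18)."""
    uu, ua, aa = dot(U, U), dot(U, A), dot(A, A)
    return [3.0 * ua / uu * a - 1.5 * aa / uu * u for u, a in zip(U, A)]

def rhs(state):
    """First-order form of (5.18): d/dt (X, U, A) = (U, A, jerk)."""
    X, U, A = state
    return (U, A, jerk(U, A))

def shift(state, k, c):
    """Return state + c * k, component by component."""
    return tuple([s + c * d for s, d in zip(sv, kv)] for sv, kv in zip(state, k))

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(state)
    k2 = rhs(shift(state, k1, h / 2))
    k3 = rhs(shift(state, k2, h / 2))
    k4 = rhs(shift(state, k3, h))
    for k, w in ((k1, h / 6), (k2, h / 3), (k3, h / 3), (k4, h / 6)):
        state = shift(state, k, w)
    return state

def first_integral(U, A):
    """The vector C of (5.17)."""
    uu, ua, aa = dot(U, U), dot(U, A), dot(A, A)
    cu = 0.5 * aa / uu ** 2 - 2.0 * ua ** 2 / uu ** 3
    ca = ua / uu ** 2
    return [cu * u + ca * a for u, a in zip(U, A)]

def circle(t, X0, U0, A0):
    """Closed form (5.19)."""
    denom = 1.0 + t * t * dot(A0, A0)
    return [x + (t * u + t * t * a) / denom for x, u, a in zip(X0, U0, A0)]

def max_errors(X0, U0, A0, t_end=2.0, steps=2000):
    """Integrate (5.18); compare with (5.19) and with the initial value of C."""
    state = (list(X0), list(U0), [2.0 * a for a in A0])  # X''(0) = 2 A0
    C0 = first_integral(state[1], state[2])
    h = t_end / steps
    err_x = err_c = 0.0
    for i in range(1, steps + 1):
        state = rk4_step(state, h)
        exact = circle(i * h, X0, U0, A0)
        err_x = max(err_x, max(abs(p - q) for p, q in zip(state[0], exact)))
        C = first_integral(state[1], state[2])
        err_c = max(err_c, max(abs(p - q) for p, q in zip(C, C0)))
    return err_x, err_c

err_x, err_c = max_errors([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.5, 0.0])

if __name__ == "__main__":
    print("max deviation from (5.19):", err_x)
    print("max drift of C:           ", err_c)
```

Both deviations should be at the level of the integrator's truncation error, which is consistent with (5.17) being a first integral and with the projectively parametrised circles being the solutions.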

For generic initial conditions, the integral curves of (5.15) are not conformal geodesics. For example, choosing arbitrary values of X(0), U(0), A(0) and setting \(C=0\) reduces (5.15) to

$$\begin{aligned} {\dot{A}}-\frac{2{\langle }A, U{\rangle }}{|U|^2}A+\frac{|A|^2}{|U|^2}U =0. \end{aligned}$$
(5.20)

The general solution of this system is

$$\begin{aligned} t\rightarrow X(t)=e^t\cos {(ct)}\;P_0 +e^t\sin {(ct)}\;Q_0+R_0, \end{aligned}$$
(5.22)

where \(P_0, Q_0, R_0\) are constant vectors such that \({\langle }P_0, Q_0{\rangle }=0\) and \(|P_0|=|Q_0|\). The curves (5.22) are logarithmic spirals in the plane spanned by \((P_0, Q_0)\) which spiral towards \(R_0\) as \(t\rightarrow -\infty \). If \((r, \theta )\) are plane polar coordinates in the plane spanned by \(P_0, Q_0\) and centred at \(R_0\), then the unparametrised form of the spirals is \( r=|P_0|e^{\theta /c}. \) This contrasts with the behaviour of conformal geodesics, where the spirals are conjectured not to arise even in curved cases [5, 18]. The conformal invariance of the fourth-order system (4.14) ensures that the inverse images of the logarithmic spirals under the stereographic projection from \(S^{n}\) to \(\mathbb {R}^n\) are solutions to (4.14) on the round sphere. These curves are the loxodromes. They cut all meridians at a fixed angle and correspond to straight lines on the Mercator map, which justifies our terminology.
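That the logarithmic spiral satisfies (5.20) can be verified directly. The sketch below is illustrative, with hypothetical helper names and sample vectors: it uses the observation that the kth derivative of \(e^t\cos (ct)P_0+e^t\sin (ct)Q_0\) is \(\mathrm {Re}(\lambda ^ke^{\lambda t})P_0+\mathrm {Im}(\lambda ^ke^{\lambda t})Q_0\) with \(\lambda =1+ic\), and evaluates the left-hand side of (5.20) on the curve.

```python
import cmath

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def spiral_derivative(k, t, c, P0, Q0):
    """k-th t-derivative (k >= 1) of e^t cos(ct) P0 + e^t sin(ct) Q0 + R0."""
    lam = complex(1.0, c)
    z = lam ** k * cmath.exp(lam * t)  # real and imaginary parts multiply P0, Q0
    return [z.real * p + z.imag * q for p, q in zip(P0, Q0)]

def residual_5_20(t, c, P0, Q0):
    """Left-hand side of (5.20) evaluated on the spiral at parameter t."""
    U = spiral_derivative(1, t, c, P0, Q0)
    A = spiral_derivative(2, t, c, P0, Q0)
    J = spiral_derivative(3, t, c, P0, Q0)  # the jerk dA/dt
    uu, ua, aa = dot(U, U), dot(U, A), dot(A, A)
    return [j - 2.0 * ua / uu * a + aa / uu * u for j, a, u in zip(J, A, U)]

# Sample data with <P0, Q0> = 0 and |P0| = |Q0|, as required in the text.
good = max(abs(r) for t in (-1.0, 0.0, 0.7)
           for r in residual_5_20(t, 3.0, [2.0, 0.0, 0.0], [0.0, 0.0, 2.0]))

# Dropping the condition |P0| = |Q0| gives a curve that fails (5.20).
bad = residual_5_20(0.0, 1.0, [1.0, 0.0], [0.0, 2.0])

if __name__ == "__main__":
    print("residual with |P0| = |Q0|: ", good)
    print("residual with |P0| != |Q0|:", bad)
```

The residual vanishes to rounding error when \(P_0\) and \(Q_0\) are orthogonal and of equal length, and is visibly nonzero otherwise, which illustrates why both conditions are imposed.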

For general initial conditions, the integral curves of (5.15) are, unlike (5.19) and (5.22), no longer planar, and their Serret–Frenet torsion is determined by the choice of the initial jerk \({\dot{A}}(0)\) (see [9] for other occurrences of equations involving a change of acceleration in physics).

Fig. 1 Integral curves of Eq. (5.15) with the same initial values of \(X=(0, 0, 0), {\dot{X}}=(1, 0, 0), \ddot{X}=(0.1, 1, 0)\) but different \(\dddot{X}\): Conformal geodesic (red), logarithmic spiral (blue), and a generic integral curve with nonzero torsion (green)

In Fig. 1, we plot integral curves of (5.15) in \(\mathbb {R}^3\) corresponding to 3 sets of initial conditions. Each set shares the same values of X(0), U(0), A(0) (so the three integral curves are tangent to the second order at X(0)) but has different \({\dot{A}}(0)\). If \({\dot{A}}(0)\) is determined by (5.18) in terms of the remaining initial conditions, then the (red) integral curves are circles. If \({\dot{A}}(0)\) is determined by \(C=0\), or equivalently by (5.20), then the (blue) curves are spirals. Finally, if \(C=(0, 0, 1)\) then the numerical solution of (5.15) yields the non-planar (green) integral curves. The general solution to the conformal Mercator equation (5.15) is given by the special conformal transformation of the logarithmic spiral (5.22):

$$\begin{aligned} t\rightarrow Y(t)=\frac{X(t)-|X(t)|^2B}{1-2{\langle }X(t), B{\rangle }+|B|^2 |X(t)|^2}, \end{aligned}$$
(5.21)

where X(t) is given by (5.22), and B is a constant vector.

5.1 Hamiltonian formalism

The fourth-order system (5.15) arises from a Hamiltonian, and we shall construct the Hamiltonian formulation using the Ostrogradsky approach to higher derivative Lagrangians. Neglecting the boundary term in (4.10) gives the Lagrangian

$$\begin{aligned} \mathcal {L}=\frac{1}{2}\frac{|{\dot{U}}|^2}{|U|^2}-\frac{{\langle }U, {\dot{U}}{\rangle }^2}{|U|^4}-{\langle }\lambda , U-{\dot{X}}{\rangle }, \end{aligned}$$

where the curve \(\gamma \) is parametrised by \(t\rightarrow X(t)\) with \(X\in M=\mathbb {R}^n\), and \(\lambda \in \mathbb {R}^n\) is a Lagrange multiplier. Define the momenta \(\mathcal {P}\) and \(\mathcal {R}\) conjugate to X and U by

$$\begin{aligned} \mathcal {P}\equiv \frac{\partial \mathcal {L}}{\partial {\dot{X}}}=\lambda , \quad \mathcal {R}\equiv \frac{\partial \mathcal {L}}{\partial {\dot{U}}}=\frac{{\dot{U}}}{|U|^2}-2\frac{{\langle }U, {\dot{U}}{\rangle }}{|U|^4}U, \end{aligned}$$

so that \(\mathcal {P}-\lambda =0\) is the set of constraints, and we can eliminate \({\dot{U}}\) by

$$\begin{aligned} {\dot{U}}=|U|^2(\mathcal {R}-2|U|^{-2}{\langle }U, \mathcal {R}{\rangle }U). \end{aligned}$$

In this formula, and below, we abuse notation and use the flat metric to identify the tangent space to \(\mathbb {R}^n\) with its dual, and so regard \(\mathcal {P}\) and \(\mathcal {R}\) as both vectors and co-vectors depending on the context. The Hamiltonian is given by the Legendre transform

$$\begin{aligned} H= & {} {\langle }\mathcal {P}, {\dot{X}}{\rangle }+{\langle }\mathcal {R}, {\dot{U}}{\rangle }-\mathcal {L}\nonumber \\= & {} \frac{1}{2}|U|^2|\mathcal {R}|^2-{\langle }U, \mathcal {R}{\rangle }^2+{\langle }\mathcal {P}, U{\rangle }. \end{aligned}$$
(5.24)

Using the canonical commutation relations

$$\begin{aligned} \{X^a, \mathcal {P}_b\}={\delta ^a}_b, \quad \{U^a, \mathcal {R}_b\}={\delta ^a}_b \end{aligned}$$

gives

$$\begin{aligned} {\dot{X}}^a= & {} U^a, \quad {\dot{U}}^a=|U|^2\mathcal {R}^a-2{\langle }U, \mathcal {R}{\rangle }U^a, \\ \dot{\mathcal {R}}^a= & {} \{\mathcal {R}^a, H\}=-|\mathcal {R}|^2U^a+2{\langle }U, \mathcal {R}{\rangle }\mathcal {R}^a-\mathcal {P}^a, \quad \dot{\mathcal {P}}^a=0. \end{aligned}$$

Turning this system of first-order ODEs to a fourth-order system yields (5.16).
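The equations of motion can be cross-checked against the Hamiltonian (5.24) numerically. The sketch below is illustrative: the central-difference gradient and the sample phase-space point are choices made here. It compares the stated right-hand sides for \({\dot{U}}\) and \(\dot{\mathcal {R}}\) with \(\partial H/\partial \mathcal {R}\) and \(-\partial H/\partial U\).

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hamiltonian(U, R, P):
    """H of (5.24); it has no explicit dependence on X."""
    return 0.5 * dot(U, U) * dot(R, R) - dot(U, R) ** 2 + dot(P, U)

def stated_flow(U, R, P):
    """Right-hand sides for dU/dt and dR/dt as written after (5.24)."""
    uu, ur, rr = dot(U, U), dot(U, R), dot(R, R)
    U_dot = [uu * r - 2.0 * ur * u for u, r in zip(U, R)]
    R_dot = [-rr * u + 2.0 * ur * r - p for u, r, p in zip(U, R, P)]
    return U_dot, R_dot

def gradient(f, v, eps=1e-6):
    """Central-difference gradient of the scalar function f at the point v."""
    grad = []
    for i in range(len(v)):
        up = list(v)
        dn = list(v)
        up[i] += eps
        dn[i] -= eps
        grad.append((f(up) - f(dn)) / (2.0 * eps))
    return grad

def flow_mismatch(U, R, P):
    """Largest deviation of the stated flow from Hamilton's equations."""
    U_dot, R_dot = stated_flow(U, R, P)
    dH_dR = gradient(lambda r: hamiltonian(U, r, P), R)
    dH_dU = gradient(lambda u: hamiltonian(u, R, P), U)
    err_u = max(abs(a - b) for a, b in zip(U_dot, dH_dR))  # dU/dt = dH/dR
    err_r = max(abs(a + b) for a, b in zip(R_dot, dH_dU))  # dR/dt = -dH/dU
    return max(err_u, err_r)

mismatch = flow_mismatch([1.0, 0.5, -0.3], [0.2, -0.7, 0.4], [0.1, 0.0, 1.0])

if __name__ == "__main__":
    print("largest mismatch:", mismatch)
```

Since H does not depend on X, the remaining equation \(\dot{\mathcal {P}}=0\) is immediate, and \({\dot{X}}=\partial H/\partial \mathcal {P}=U\).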

6 Free particle on the tractor bundle

In the tractor approach of [4], the condition for a curve \(\gamma \) to be a conformal geodesic is shown to be equivalent to the condition that the acceleration tractor is constant along this curve, and that its tractor norm is zero. This, at least formally, has a simple mechanical interpretation of a free particle on the total space of the tractor bundle, whose position is given by the velocity tractor \(\mathbf{U}\). The natural kinetic Lagrangian given by the squared tractor norm of the acceleration tractor is equal to (4.5).

The details are as follows. The tractor bundle is a rank \((n+2)\) vector bundle \(\mathbf{T}={\mathcal E}[1]\oplus TM\otimes {\mathcal E}[-1]\oplus {\mathcal E}[-1]\), where \({\mathcal E}[k]\) denotes a line bundle over M of conformal densities of weight k. Under the conformal rescalings (2.1) a section \(\mathbf{X}=(\sigma , \mu ^a, \rho )\) of \(\mathbf{T}\) transforms according to

$$\begin{aligned} \left( \begin{array}{c} {\hat{\sigma }}\\ {\hat{\mu }}^a\\ {\hat{\rho }} \end{array}\right) = \left( \begin{array}{c} {\sigma }\\ {\mu }^a+\Upsilon ^a \sigma \\ {\rho }-\Upsilon _a\mu ^a-\frac{1}{2}|\Upsilon |^2\sigma \end{array}\right) . \end{aligned}$$

The tractor bundle comes equipped with a conformally invariant connection

$$\begin{aligned} {\quad {{D}}_a \left( \begin{array}{c} \sigma \\ \mu ^b\\ \rho \end{array} \right) = \left( \begin{array}{c} \nabla _a\sigma -\mu _a \\ \nabla _a\mu ^b+P_{a}^b\sigma +\delta _a^b\rho \\ \nabla _a\rho -P_{ab}\mu ^b \end{array} \right) ,} \end{aligned}$$
(6.25)

and a metric on the fibres of \(\mathbf{T}\) defined by the norm

$$\begin{aligned} {\langle }{} \mathbf{X}, \mathbf{X}{\rangle }_\mathbf{T}\equiv |\mu |^2+2\sigma \rho , \end{aligned}$$

and preserved by the tractor connection.

With the velocity tractor \(\mathbf{U}\), and the acceleration tractor \(\mathbf{A}\) given by

$$\begin{aligned} \mathbf{U}= \left( \begin{array}{c} 0\\ |U|^{-1}U\\ -|U|^{-3}{\langle }U, A{\rangle } \end{array}\right) , \quad \mathbf{A}=U^aD_a \mathbf{U} \end{aligned}$$

a curve is a projectively parametrised conformal geodesic if

$$\begin{aligned} {\langle }{} \mathbf{A}, \mathbf{A}{\rangle }_\mathbf{T}=0, \quad \frac{{\mathrm {d}} \mathbf{A}}{{\mathrm {d}} t}=0, \end{aligned}$$
(6.26)

where now \({\mathrm {d}}/{\mathrm {d}}t\) is the directional tractor derivative. Given that \(\mathbf{A}=\dot{\mathbf{U}}\), the second equation in (6.26) is the Euler–Lagrange equation \(\ddot{\mathbf{U}}=0\) for the ‘free particle’, with position \(\mathbf{U}\). This equation arises from a Lagrangian \(1/2{\langle }\dot{\mathbf{U}}, \dot{\mathbf{U}}{\rangle }_\mathbf{T}\), and the first equation in (6.26) states that this Lagrangian, or equivalently the kinetic energy taken with respect to the tractor metric, is zero. It can be verified by explicit calculation that this kinetic energy is equal to the third-order Lagrangian (4.5), which, at least at this formal level, therefore appears to be natural.
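The explicit calculation can be sketched as follows. It is carried out in a fixed metric \(g\in [g]\), uses (6.25) with \(\sigma =0\) for the velocity tractor, and leaves the density weights implicit, which is a simplifying assumption made here for illustration. The components of \(\mathbf{A}=U^aD_a\mathbf{U}\) are

$$\begin{aligned} \mathbf{A}=\left( \begin{array}{c} -|U|\\ |U|^{-1}A-2|U|^{-3}{\langle }U, A{\rangle }U\\ -\frac{{\mathrm {d}}}{{\mathrm {d}}t}\big (|U|^{-3}{\langle }U, A{\rangle }\big )-|U|^{-1}P(U, U) \end{array}\right) , \end{aligned}$$

so that the tractor norm becomes

$$\begin{aligned} {\langle }{} \mathbf{A}, \mathbf{A}{\rangle }_\mathbf{T}= & {} \frac{|A|^2}{|U|^2}+2|U|\frac{{\mathrm {d}}}{{\mathrm {d}}t}\big (|U|^{-3}{\langle }U, A{\rangle }\big )+2P(U, U)\\= & {} \frac{2{\langle }U, \nabla _UA{\rangle }}{|U|^2}+\frac{3|A|^2}{|U|^2}-\frac{6{\langle }U, A{\rangle }^2}{|U|^4}+2P(U, U)=\frac{2{\langle }U, E{\rangle }}{|U|^2}. \end{aligned}$$

This is twice the Lagrangian (4.5), in agreement with the kinetic energy \(\frac{1}{2}{\langle }\dot{\mathbf{U}}, \dot{\mathbf{U}}{\rangle }_\mathbf{T}\).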

7 Degenerate Lagrangian

The second approach to the Lagrangian formulation alluded to in Sect. 3 is to allow general variations, but consider a degenerate Lagrangian linear in the acceleration. As explained in Sect. 3, such a Lagrangian must involve a preferred anti-symmetric tensor, which in general is absent in conformal geometry. We shall therefore restrict to structures which admit a Kähler metric in the conformal class. In this case, it is advantageous to use a formulation of the conformal geodesic equations which is due to Yano [19] and Tod [18].

This formulation is equivalent to (2.2) after a change of parametrisation. Decomposing the LHS E of (2.2) into parts orthogonal and parallel to U gives \(E\wedge U\) and \({\langle }E, U{\rangle }\), respectively. The second term can be made zero by reparametrising, using \(s=s(t)\) as a parameter along \(\gamma \). An explicit calculation then verifies that the vanishing of the first term \(E\wedge U\) is invariant. If one takes s to be the arc-length, so that \(|{\mathrm {d}}\gamma /{\mathrm {d}}s|^2=|U|^2=1\) and \(A=\nabla _U U\) is orthogonal to U, then the conformal geodesic equations become

$$\begin{aligned} \nabla _U A=-(|A|^2+P(U, U))U+P^{\sharp }(U). \end{aligned}$$
(7.27)

See [3, 11, 18] for details.

Lemma 7.1

Kähler-magnetic geodesics on a Kähler–Einstein manifold are also conformal geodesics.

Proof

Assuming that the metric g is Einstein reduces (7.27) to

$$\begin{aligned} \nabla _U A=-|A|^2U, \end{aligned}$$

and the norm \(|A|^2\) of the acceleration is a first integral. If g is in addition Kähler with the complex structure J, then the Kähler-magnetic geodesics (i.e., the geodesics of charged particles moving in a magnetic field given by the Kähler form) are also conformal geodesics. Indeed, the equation

$$\begin{aligned} \nabla _U U=eJ(U), \end{aligned}$$
(7.28)

where the constant e is the electric charge, implies

$$\begin{aligned} \nabla _U A= & {} \nabla _U\nabla _U U=e J(\nabla _U U)=e^2 J^2 (U)\\= & {} -e^2 U \end{aligned}$$

where we used \(\nabla J=0\). This is the conformal geodesic equation with \(|A|=e\). \(\square \)
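
As a simple illustration (an example added here, not part of the argument above), assume the flat metric on \(\mathbb {R}^2\) with \(J(u^1, u^2)=(-u^2, u^1)\), which is Kähler and trivially Einstein. For \(e\ne 0\), the unit-speed solutions of (7.28) are then

$$\begin{aligned} \gamma (s)&=\gamma _0+e^{-1}(\sin es, -\cos es), \quad U=(\cos es, \sin es),\\ A&=e(-\sin es, \cos es)=eJ(U), \quad \nabla _U A=-e^2(\cos es, \sin es)=-|A|^2U, \end{aligned}$$

where \(\gamma _0\) is a constant vector. In this flat model, each Kähler-magnetic geodesic is therefore a circle of radius \(1/|e|\), and it satisfies the reduced conformal geodesic equation with \(|A|^2=e^2\).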

The Kähler-magnetic geodesics arise from a Lagrangian

$$\begin{aligned} L=\frac{1}{2}|U|^2+\phi (U), \end{aligned}$$

where the one-form \(\phi \) is the magnetic potential, i.e., \({\mathrm {d}}\phi =\Omega \), and \(\phi (U)\) is the contraction of the vector field U with \(\phi \). In the corresponding Hamiltonian formalism, the Kähler-magnetic geodesics are integral curves of a Hamiltonian vector field on the phase space.

A choice for the phase space in the conformal geodesic context is the second-order tangent bundle \(T^2 M\). It is the union of all second-order tangent spaces \(T^2_xM\)—the space of equivalence classes (also called 2-jets \(j^2(\gamma )\)) of curves \(\gamma :\mathbb {R}\rightarrow M\) which agree at \(x\in M\) up to and including the second derivatives. The space \(T^2M=\cup _{x\in M} T^2_x M\) is a bundle, but not, in general, a vector bundle over M. In the presence of a linear connection, there exists a canonical splitting [20]

$$\begin{aligned} S:T^2 M\rightarrow TM\oplus TM, \quad S(j^2(\gamma ))=(\dot{\gamma }(0), (\nabla _{\dot{\gamma }(0)}\dot{\gamma })(0)). \end{aligned}$$

This splitting equips \(T^2M\) with the structure of a rank-2n vector bundle over M, and allows a definition of the acceleration of the curve at \(\gamma (0)\) as \(A=\nabla _{\dot{\gamma }(0)}\dot{\gamma }(0)\).

We would like to construct a Lagrangian on \(T^2M\) involving \(\phi \), as well as \(\Omega \) which gives rise to all conformal geodesics on a Kähler–Einstein manifold \((M, g, \Omega )\), but even this appears to be out of reach. The following construction works in the flat case with \(M=\mathbb {R}^n\), where n is even, and \(\Omega \) is a (chosen) Kähler form.

Consider a second-order Lagrangian

$$\begin{aligned} L=w^2\phi ({\dot{X}})+\frac{1}{2} \Omega (\ddot{X}, {\dot{X}}) \end{aligned}$$
(7.29)

where \(d\phi =\Omega \), or \(\partial _a\phi _b-\partial _b\phi _a=\Omega _{ab}\), and w is a constant. The Euler–Lagrange equations are

$$\begin{aligned} 0= & {} \frac{\partial L}{\partial x^a}-\frac{{\mathrm {d}}}{{\mathrm {d}}s}\frac{\partial L}{\partial {\dot{x}}^a}+ \frac{{\mathrm {d}}^2}{{\mathrm {d}} s^2}\frac{\partial L}{\partial \ddot{x}^a}\\= & {} w^2{\dot{x}}^b\partial _a\phi _b-\frac{{\mathrm {d}}}{{\mathrm {d}}s}\left( w^2\phi _a+\frac{1}{2}\Omega _{ba}\ddot{x}^b\right) +\frac{{\mathrm {d}}^2}{{\mathrm {d}}s^2}\Big (\frac{1}{2}\Omega _{ab}{\dot{x}}^b\Big )\\= & {} w^2(\partial _a\phi _b-\partial _b\phi _a){\dot{x}}^b+\Omega _{ab}\dddot{x}^b \end{aligned}$$

which gives, with \(x^a, a=1, \dots , n\) denoting the components of \(X\in \mathbb {R}^n\),

$$\begin{aligned} \dddot{X}=-w^2 {\dot{X}} \end{aligned}$$
(7.30)

as \(\Omega _{ab}\) is invertible.
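
The behaviour of solutions of (7.30) can be explored numerically. The following is a minimal sketch, assuming \(n=2\), a hand-written RK4 integrator, and initial data with \(|U|=1\), \(A\) orthogonal to \(U\) and \(|A|=w\); the function names and the sample values are illustrative choices, not taken from the text. Under these assumptions the curve should close up after the parameter interval \(2\pi /w\) and stay at distance \(1/w\) from the point \(X+A/w^2\).

```python
import math

def rhs(state, w):
    """Right-hand side of X' = U, U' = A, A' = -w^2 U for state = (X, U, A)."""
    X, U, A = state
    return (U, A, [-w * w * u for u in U])

def add_scaled(state, deriv, h):
    """Return state + h * deriv, componentwise on the three vectors."""
    return tuple([p + h * q for p, q in zip(v, dv)] for v, dv in zip(state, deriv))

def rk4_step(state, w, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(state, w)
    k2 = rhs(add_scaled(state, k1, h / 2), w)
    k3 = rhs(add_scaled(state, k2, h / 2), w)
    k4 = rhs(add_scaled(state, k3, h), w)
    incr = tuple(
        [(a + 2 * b + 2 * c + d) / 6 for a, b, c, d in zip(v1, v2, v3, v4)]
        for v1, v2, v3, v4 in zip(k1, k2, k3, k4)
    )
    return add_scaled(state, incr, h)

def integrate(state, w, s_end, steps):
    """Integrate from s = 0 to s = s_end with a fixed number of RK4 steps."""
    h = s_end / steps
    for _ in range(steps):
        state = rk4_step(state, w, h)
    return state

def norm(v):
    return math.sqrt(sum(c * c for c in v))

w = 2.0
# Assumed initial data: |U0| = 1, A0 orthogonal to U0, |A0| = w.
X0, U0, A0 = [0.0, 0.0], [1.0, 0.0], [0.0, w]
centre = [x + a / (w * w) for x, a in zip(X0, A0)]

# Integrate over one expected period 2*pi/w.
X1, U1, A1 = integrate((X0, U0, A0), w, 2 * math.pi / w, 2000)
radius = norm([x - c for x, c in zip(X1, centre)])
```

The quantity `centre` is constant along any solution of (7.30), since its derivative is \(U+\dot{A}/w^2=0\); this is why it is used as the reference point.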

Proposition 7.2

The Poisson structure on \(T^2(\mathbb {R}^n)\) with n even, and coordinates \((x^a, U^a, A^a)\) induced by the Lagrangian (7.29) is

$$\begin{aligned} \{x^a, A^b\}=-\Omega ^{ab}, \quad \{U^a, U^b\}=\Omega ^{ab}, \quad \{A^a, A^b\}=w^2\Omega ^{ab}, \end{aligned}$$
(7.31)

where w is a nonzero constant, and all other brackets vanish. Then (7.30) are Hamilton’s equations with the Hamiltonian

$$\begin{aligned} H=\Omega (A, U). \end{aligned}$$
(7.32)

Proof

In order to construct the canonical formalism, rewrite (7.29) as a first-order Lagrangian with constraints

$$\begin{aligned} L=w^2\phi (U)+\frac{1}{2}\Omega ({\dot{U}}, U)-{\langle }\lambda , U-{\dot{X}}{\rangle } \end{aligned}$$

with n Lagrange multipliers \(\lambda \in \mathbb {R}^n\). The conjugate momenta are

$$\begin{aligned} \frac{\partial L}{\partial {\dot{x}}}= \mathcal {P}, \quad \frac{\partial L}{\partial {\dot{U}}}= \mathcal {R}, \quad \frac{\partial L}{\partial \dot{\lambda }}= {\mathcal S}. \end{aligned}$$

This gives rise to 3n constraints

$$\begin{aligned} \mathcal {P}_a-\lambda _a=0, \quad \psi _a\equiv \mathcal {R}_a-\frac{1}{2}\Omega _{ab}U^b=0, \quad {\mathcal S}^a=0. \end{aligned}$$
(7.33)

Impose these constraints in the 6n-dimensional phase space with coordinates \((x, \mathcal {P}, U, \mathcal {R}, \lambda , {\mathcal S})\) with the Poisson structure

$$\begin{aligned} \{x^a, \mathcal {P}_b\}={\delta ^a}_b, \quad \{U^a, \mathcal {R}_b\}={\delta ^a}_b, \quad \{\lambda _a, {\mathcal S}^b\}={\delta ^b}_a \end{aligned}$$

(and all other brackets vanishing), and compute the Dirac brackets [7] on the 3n-dimensional reduced phase space \(TM\oplus T^*M\) with coordinates \((x, U, \mathcal {P})\).

Modifying \(\{U^a, U^b\}=0\) to the Dirac bracket gives

$$\begin{aligned} \{U^a, U^b\}^*\equiv \{U^a, U^b\}-\{U^a, \psi _c\}(C^{-1})^{cd}\{\psi _d, U^b \}, \end{aligned}$$

where \( C_{cd}\equiv \{\psi _c, \psi _d\}=\Omega _{dc}\) and \( (C^{-1})^{cd}=\Omega ^{cd}\), with \(\Omega _{ab}\Omega ^{ac}={\delta _b}^c\). We also find \(\{U^a, \psi _c\}={\delta ^a}_c\), and finally (dropping \(*\))

$$\begin{aligned} \{x^a, \mathcal {P}_b\}={\delta ^a}_b, \quad \{U^a, U^b\}=\Omega ^{ab}. \end{aligned}$$
(7.34)
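
The bracket \(\{U^a, U^b\}=\Omega ^{ab}\) in (7.34) can be checked numerically. The sketch below assumes \(n=2\) and \(\Omega _{ab}\) in the standard form \(\Omega _{12}=1\); it works only on the \((U, \mathcal {R})\) part of the phase space, on the assumption that the constraints \(\mathcal {P}_a-\lambda _a\) and \({\mathcal S}^a\) have vanishing brackets with both \(U\) and \(\psi \) and so do not contribute to this bracket. All function names are hypothetical.

```python
N = 2
OMEGA_LOW = [[0.0, 1.0], [-1.0, 0.0]]   # Omega_{ab}, assumed standard form on R^2
OMEGA_UP = [[0.0, 1.0], [-1.0, 0.0]]    # Omega^{ab}, from Omega_{ab} Omega^{ac} = delta_b^c

def bracket(df, dg):
    """Canonical bracket with {U^a, R_b} = delta^a_b, for functions whose
    gradients df, dg are laid out as (d/dU^1, d/dU^2, d/dR_1, d/dR_2)."""
    return sum(df[a] * dg[N + a] - df[N + a] * dg[a] for a in range(N))

def grad_U(a):
    """Gradient of the coordinate function U^a."""
    g = [0.0] * (2 * N)
    g[a] = 1.0
    return g

def grad_psi(c):
    """Gradient of the constraint psi_c = R_c - (1/2) Omega_{cb} U^b."""
    g = [-0.5 * OMEGA_LOW[c][b] for b in range(N)] + [0.0] * N
    g[N + c] = 1.0
    return g

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

# Matrix of constraint brackets C_{cd} = {psi_c, psi_d} and its inverse.
C = [[bracket(grad_psi(c), grad_psi(d)) for d in range(N)] for c in range(N)]
C_inv = inverse_2x2(C)

def dirac_UU(a, b):
    """Dirac bracket {U^a, U^b}^* built from the psi constraints only."""
    correction = sum(
        bracket(grad_U(a), grad_psi(c)) * C_inv[c][d] * bracket(grad_psi(d), grad_U(b))
        for c in range(N) for d in range(N)
    )
    return bracket(grad_U(a), grad_U(b)) - correction

dirac = [[dirac_UU(a, b) for b in range(N)] for a in range(N)]
```

Comparing `dirac` entry by entry with `OMEGA_UP` tests the second bracket in (7.34) for this choice of \(\Omega \).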

The Hamiltonian is now given by the Legendre transform (see [16] for other possible choices of phase spaces)

$$\begin{aligned} H= & {} {\langle }{\dot{X}},\mathcal {P}{\rangle }+{\langle }{\dot{U}}, \mathcal {R}{\rangle } +{\langle }\dot{\lambda }, {\mathcal S}{\rangle }-L\nonumber \\= & {} {\langle }\mathcal {P}-w^2\phi , U{\rangle } \end{aligned}$$
(7.35)

where the last expression holds on the surface of constraints. Hamilton’s equations are equivalent to (7.30). If we use the equation \({\dot{U}}=\{U, H\}\) to define \(A^a=(\mathcal {P}_b-w^2\phi _b)\Omega ^{ab}\), and instead use \(TM\oplus TM\) as the phase space with coordinates \((x^a, U^a, A^a)\), then eliminating \(\mathcal {P}\) by

$$\begin{aligned} \mathcal {P}_b=\Omega _{ab}A^a+w^2\phi _b \end{aligned}$$

yields the Hamiltonian (7.32). The Poisson brackets (7.34) yield the Poisson structure (7.31). Hamilton’s equations

$$\begin{aligned}&{\dot{x}}^a=\{x^a, H\}=U^a, \quad {\dot{U}}^a=\{U^a, H\}=A^a,\\&{\dot{A}}^a=\{A^a, H\}=\Omega _{bc}U^c\{A^a, A^b\}=-w^2U^a, \end{aligned}$$

are equivalent to (7.30). \(\square \)
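
The statement of Proposition 7.2 can also be checked numerically at a sample point. The sketch below assumes \(n=2\), \(\Omega _{ab}\) in the standard form \(\Omega _{12}=1\), and arbitrary sample values for \((x, U, A)\) and w; it assembles the brackets (7.31) into a \(6\times 6\) matrix, applies it to the gradient of \(H=\Omega (A, U)\), and compares the result with \((U, A, -w^2U)\). The function names are hypothetical.

```python
N = 2
# Omega_{ab} in an assumed standard form on R^2; with the convention
# Omega_{ab} Omega^{ac} = delta_b^c the raised-index matrix has the same entries.
OMEGA_LOW = [[0.0, 1.0], [-1.0, 0.0]]
OMEGA_UP = [[0.0, 1.0], [-1.0, 0.0]]

def poisson_matrix(w):
    """Matrix of brackets {z^i, z^j} from (7.31) for z = (x, U, A)."""
    P = [[0.0] * (3 * N) for _ in range(3 * N)]
    for a in range(N):
        for b in range(N):
            P[a][2 * N + b] = -OMEGA_UP[a][b]                  # {x^a, A^b}
            P[2 * N + b][a] = OMEGA_UP[a][b]                   # {A^b, x^a}
            P[N + a][N + b] = OMEGA_UP[a][b]                   # {U^a, U^b}
            P[2 * N + a][2 * N + b] = w * w * OMEGA_UP[a][b]   # {A^a, A^b}
    return P

def grad_H(z):
    """Gradient of H = Omega_{ab} A^a U^b with respect to (x, U, A)."""
    U, A = z[N:2 * N], z[2 * N:]
    dH_dU = [sum(OMEGA_LOW[a][b] * A[a] for a in range(N)) for b in range(N)]
    dH_dA = [sum(OMEGA_LOW[a][b] * U[b] for b in range(N)) for a in range(N)]
    return [0.0] * N + dH_dU + dH_dA

def hamiltonian_vector_field(z, w):
    """Components dz^i/ds = {z^i, H} = sum_j {z^i, z^j} dH/dz^j."""
    P, g = poisson_matrix(w), grad_H(z)
    return [sum(P[i][j] * g[j] for j in range(3 * N)) for i in range(3 * N)]

w = 1.5
z = [0.3, -0.7, 0.6, 0.8, -1.2, 0.9]   # arbitrary sample point (x, U, A)
flow = hamiltonian_vector_field(z, w)
expected = z[N:2 * N] + z[2 * N:] + [-w * w * u for u in z[N:2 * N]]
```

Because the brackets (7.31) are constant in these coordinates, the Jacobi identity holds automatically in the flat case, so only Hamilton's equations are tested here.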

The Poisson structure (7.31) does not generalise to curved spaces, as the Jacobi identity is obstructed by the Riemann curvature of g. It is nevertheless possible to make contact with the first integrals of the conformal geodesic equations, and the conformal Killing–Yano tensors under the additional assumption that \(w^2=|A|^2\). In this case, the Hamiltonian vector field corresponding to (7.32) is given by

$$\begin{aligned} X_H=U^a\frac{\partial }{\partial x^a}+A^a\frac{\partial }{\partial U^a}-|A|^2U^a\frac{\partial }{\partial A^a} \end{aligned}$$

Consider a function \(Q:T^2 M\rightarrow \mathbb {R}\) of the form

$$\begin{aligned} Q=Y(A, U)+W(U), \end{aligned}$$

where \(Y\in \Lambda ^1(M)\otimes \Lambda ^1(M)\) and \(W\in \Lambda ^1(M)\) are differential forms on \(M=\mathbb {R}^n\) which are not necessarily constant. The function H Poisson-commutes with Q iff the conformal Killing–Yano (CKY) equation

$$\begin{aligned} \nabla _aY_{bc}=\nabla _{[a}Y_{bc]}+2g_{a[b}W_{c]} \end{aligned}$$
(7.36)

holds. Indeed,

$$\begin{aligned} \{Q, H\}= & {} X_H(Q)=(\partial _a Y_{bc}+2g_{a[c}W_{b]})U^aU^cA^b+U^aU^b\partial _a W_b\\&+Y_{bc}(A^bA^c-|A|^2U^bU^c) \end{aligned}$$

so \(\{Q, H\}=0\) iff

$$\begin{aligned} Y_{(bc)}=0, \quad \partial _{(b} W_{c)}=0, \quad \partial _a Y_{bc}= \partial _{[a}Y_{bc]}+ 2g_{a[b}W_{c]}. \end{aligned}$$

Therefore, Q is constant along the conformal geodesics, in agreement with the results of [14, 18].
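
The conservation of Q can be illustrated on an explicit example. The sketch below assumes \(M=\mathbb {R}^2\) and the choice \(Y_{bc}=\frac{1}{2}|x|^2\epsilon _{bc}\), \(W=(-x^2, x^1)\), which is an illustrative pair picked here because it satisfies the three conditions above; it is not taken from the text. The test curve is a unit-speed circle of radius r, which solves (7.30) with \(w=|A|=1/r\). The names `circle` and `Q` and the sample values are hypothetical.

```python
import math

def circle(s, centre, r):
    """Unit-speed circle of radius r: position X, velocity U, acceleration A."""
    c, sn = math.cos(s / r), math.sin(s / r)
    X = [centre[0] + r * c, centre[1] + r * sn]
    U = [-sn, c]
    A = [-c / r, -sn / r]
    return X, U, A

def Q(X, U, A):
    """Q = Y(A, U) + W(U) for the assumed illustrative choice
    Y_{bc} = (1/2)|x|^2 eps_{bc} and W = (-x^2, x^1)."""
    f = 0.5 * (X[0] ** 2 + X[1] ** 2)
    Y_AU = f * (A[0] * U[1] - A[1] * U[0])   # Y_{bc} A^b U^c
    W_U = -X[1] * U[0] + X[0] * U[1]         # W_b U^b
    return Y_AU + W_U

centre, r = [0.4, -1.1], 0.8
# Evaluate Q at several parameter values along the same circle.
values = [Q(*circle(s, centre, r)) for s in (0.0, 0.5, 1.3, 2.9, 4.2)]
```

For this choice, a direct calculation gives the constant value \(r/2-|c|^2/(2r)\), where c is the centre of the circle, so all entries of `values` should agree with it up to rounding.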

8 Conclusions

Conformal geodesics are examples of distinguished curves in parabolic geometries [6, 8, 11, 17]. They also arise naturally in General Relativity as a tool in studying the properties of conformal infinity [2, 12, 15]. Despite their importance, there are few explicit examples known [10, 18], and the underlying mathematical theory is not nearly as well developed as that of geodesics. In this paper, we have proposed a variational formulation of the conformal geodesic equations. We hope that this will shed light on the integrability properties of these equations [14], as well as the global problems such as trapping and spirals in conformal geometry [5].