To Joseph-Louis Lagrange

and Georges Lemaître

1 Introduction

The three-body problem involves the motion of three point masses influenced by mutual forces which depend on their separations.

While the two-body problem is integrable, the three-body problem is in general non-integrable and chaotic. Its solution has been a puzzle ever since it was studied by Newton (1687) in an attempt to understand the motion of the Moon. The system is known to be rather rich, and over the centuries its study has deeply influenced several scientific fields, including perturbation theory, chaos and topology.

It is interesting to seek a dynamical reduction of the system: a simplified formulation, where the dynamical variables are chosen such that the equations of motion simplify. More specifically, the existence of conserved quantities allows one to reduce the order of the system of differential equations. While a dynamical reduction is often applied to integrable systems, it can also be applied to a non-integrable system, in which case the reduced system remains non-integrable.

This work originated from the recent flux-based reduction of the statistical solution to the three-body problem in Newtonian gravity Kol (2021), see also progress on the statistical solution through non-flux-based methods Stone and Leigh (2019); Ginat and Perets (2021). The flux-based theory involves the (regularized) phase space volume as a function of the conserved charges. It is defined by a multi-dimensional integral, yet it was found that a certain change of variables enables analytic integration over numerous variables Kol (2021); Dandekar et al. (2022). This suggested the possibility that certain dynamical reductions may exist, involving the new variables, thereby providing a deeper reason for the success of the analytic integrations.

Over the centuries, much magnificent research was devoted to the dynamical reduction of the three-body problem (3BP). Lagrange (1772) reduced the equations’ order by replacing the position vectors with variables describing the triangle geometry and additional variables. Jacobi (1842) showed how to perform one additional reduction of order, known as the elimination of the nodes. Levi-Civita (1915) was the first to discuss the reduction at the level of the action, namely using a Lagrangian or a Hamiltonian, carrying out the reduction to the instantaneous three-body plane. Murnaghan (1936) performed a reduction of the planar 3BP at the level of the action, choosing side lengths as the dynamical variables. However, the side length variables become singular for collinear configurations. van Kampen and Wintner (1937) generalized Murnaghan (1936) to 3d by adding the average azimuthal angle as a fourth coordinate. Lemaître (1952, 1964) introduced a different set of symmetric coordinates, which remain smooth in the collinear limit, and presented a Hamiltonian for the general, non-planar case. These coordinates were given by a geometric construction and are related to the principal axes of inertia of the configuration. More recently, Moeckel and Montgomery (2013) introduced the term shape sphere for a set of symmetric coordinates for the planar 3BP, see also Montgomery (2015, 2019). These coordinates are essentially equivalent to those introduced by Lemaître, though this is not immediately apparent. In addition, Moeckel and Montgomery (2013) recast the analysis into modern mathematical terminology. See Appendix A for selected quotes from the above-mentioned papers.

The quantum version of the three-body problem was studied in the context of the Helium atom (nucleus + 2 electrons = 3 bodies). Interestingly, some works on the quantum version reached new definitions of the dynamical variables and new forms for the quantum Hamiltonian, which, in hindsight, are related to novel formulations of the non-quantum problem. Hylleraas (1929) presented a Hamiltonian in terms of side lengths for the \(\vec {J}=0\) sector, furthermore setting the two electron masses to be equal and the nucleus mass to be infinite. Gronwall (1932) introduced 3d variables that made the kinetic term conformally Euclidean. Fock (1954) introduced 4d variables that reduce to Gronwall’s variables after imposing the reduction to \(\vec {J}=0\). Finally, Tkachenko (1978) presented a Hamiltonian for nonzero \(\vec {J}\).

In mathematics, the three-body problem inspired considerable development, and one can mention the following contemporary works. McGehee (1974) introduced the triple collision manifold in the context of linear motion, and Waldvogel (1982) generalized it to planar motion. Albouy and Chenciner (1998); Albouy (2004); Chenciner (2013) generalized the reduction and the determination of the shape-preserving time evolutions of the n-body system to arbitrary space dimensions. Lastly, the 3BP is the setup for one of Smale’s problems for 21st century mathematics (no. 6) Smale (1998).

The study of the three-body problem goes beyond the above-mentioned extensive work and includes diverse work within general physics, quantum physics, computer simulations and astrophysics. Forces other than Newtonian gravity were considered, and a sample of such work includes Coulomb forces Liverts and Barnea (2013) and harmonic forces Newton (1687) prop. 64, Saporta Katz and Efrati (2020).

Despite this venerable history, certain features of the reduction are still lacking. First, a choice of coordinates that satisfy the center of mass constraint appears to violate the permutation symmetry (see Sect. 2.1), which is unnatural. Secondly and relatedly, the shape sphere variables display somewhat surprising and undeserved properties (see last paragraph of Sect. 2.6), which suggest that a more direct and natural route to their introduction might exist, which would make them clearer. Thirdly and lastly, extant formulations do not incorporate the angular momentum components as canonical variables, which would be natural given their central standing as conserved quantities.

In this paper, the reduction process is described in Sect. 2. It starts by introducing a solution to the center of mass constraints, which transforms nicely under permutations, and hence is natural. It involves complex numbers in a novel way, and it is inspired by Lagrange’s solution to the cubic. Next, the orientation of the plane defined by the three bodies is separated from motion within the plane simply by using a rotating body frame. We proceed to the separation of rotations within the plane from the geometry of the triangle formed by the three bodies. This requires defining certain spherical coordinates on \(\mathbb {C}^2\) and correctly handling one of the angles, resulting in a famous quotient of it. Finally, we Legendre transform from angular velocities to angular momenta and obtain their Poisson brackets from a general theory for non-coordinate velocities and their conjugate momenta. This leads us to our final result, (2.49). We end the section with an illustration of the resulting formulation and a discussion of it.

In Sect. 3, we present several applications of the formulation. We define Hill-like regions in geometry space. We discuss the total phase volume of the three-body system, which is an ingredient of the statistical theory of Kol (2021). We provide a novel derivation of the uniformly rotating planar solution. Finally, we mention implications for economizing simulations.

This work has interesting implications for the four-body problem, and a sketch of a generalization appears in Sect. 4. Finally, Sect. 5 is a brief epilogue.

Further appendices provide background material on Lagrange’s solution to the cubic, on bi-complex numbers and the isotropic oscillator, and finally, on the equations of motion in terms of non-coordinate velocities and the rotating rigid body.

2 Reduction

Setup. Consider three point masses, \(m_1, m_2, m_3\). Let us take the bodies’ position vectors, \(\vec {r}_a, ~a=1,2,3\), to be our initial dynamical variables. The kinetic energy is given by

$$\begin{aligned} T:= \sum _{a=1}^3 \frac{1}{2}\, m_a\, \dot{\vec {r}}_a^{~2} \end{aligned}$$
(2.1)

For concreteness, take the potential to be given by the Newtonian gravitational potential

$$\begin{aligned} V:= - \frac{G\, m_1\, m_2}{r_{12}} - \frac{G\, m_1\, m_3}{r_{13}} - \frac{G\, m_2\, m_3}{r_{23}} \end{aligned}$$
(2.2)

where G is Newton’s gravitational constant, and \(r_{ab}=\left| \vec {r}_a - \vec {r}_b \right| \), are the inter-body distances. We note that the whole reduction procedure is independent of the choice of V as long as it depends on \(r_{ab}\) only, namely \(V=V(\{ r_{ab} \} )\). Finally, the system can be defined by the Lagrangian

$$\begin{aligned} \mathcal{L}\left( \{\vec {r}_a,\, \dot{\vec {r}}_a \}_{a=1}^3 \right) := T - V ~. \end{aligned}$$
(2.3)

2.1 Translations and the complex position vector \(\vec {w}\)

The system is invariant under translations and hence the total linear momentum is conserved, the center of mass moves in uniform motion, and henceforth, we work in the center of mass frame

$$\begin{aligned} 0=\vec {r}_{CM}:= \frac{1}{M} \sum _a m_a \vec {r}_a \end{aligned}$$
(2.4)

where \(M:= m_1 + m_2 + m_3\) is the total mass.

The center of mass constraint reduces the configuration space, reducing the number of degrees of freedom from 9 to 6. This requires a choice of coordinates on the reduced configuration space. We shall see that this choice creates a tension with the bodies’ permutation symmetry, which will affect the whole reduction process, and hence, we shall describe it in some detail.

The coordinates are chosen to be translation-invariant vectors: vectors, in order to preserve the transformation properties under rotations, and translation-invariant, in order to decouple them from the center of mass coordinates in the expression for the kinetic energy.

In the two-body problem, one chooses the relative coordinate to be

$$\begin{aligned} \vec {r}_1-\vec {r}_2 \end{aligned}$$
(2.5)

It is a translation-invariant vector, and under a \(1 \leftrightarrow 2\) exchange, it changes only by a sign.

In the three-body problem, there are three relative vectors

$$\begin{aligned} \vec {r}_{12}:=\vec {r}_1-\vec {r}_2, \qquad \vec {r}_{23}:=\vec {r}_2-\vec {r}_3, \qquad \vec {r}_{31}:=\vec {r}_3-\vec {r}_1 \end{aligned}$$
(2.6)

They satisfy the constraint \(0= \vec {r}_{12} + \vec {r}_{23} + \vec {r}_{31}\).

Any pair of linear combinations of the relative vectors can serve as coordinates in the center of mass frame. There are two popular choices for that. The first set consists of the planetary coordinates

$$\begin{aligned} \vec {r}_{13}, \qquad \vec {r}_{23} \end{aligned}$$
(2.7)

They are useful for the case where one of the masses, say \(m_3\), is much heavier than the other two: \(m_1,m_2 \ll m_3\). The second set consists of the lunar coordinates

$$\begin{aligned} \vec {r}_{12}, \qquad \frac{1}{m_1+m_2} \left( m_1\, \vec {r}_1 + m_2\, \vec {r}_2 \right) - \vec {r}_3 \end{aligned}$$
(2.8)

These are useful for the hierarchical case where the magnitude of the first vector is much smaller than that of the second, namely bodies 1 and 2 are much closer to each other than to the third body. These coordinates are commonly called Jacobi coordinates, because they appeared in Jacobi (1842), but they are a rather natural choice, and they appeared already in Newton (1687); Lagrange (1772).

Both coordinate choices break the symmetry between the bodies. Hence, they are not optimal for the non-hierarchical case where all distances and all masses are comparable.

Can we define natural, namely symmetric, coordinates in the center of mass frame? A solution is suggested by Lagrange’s solution to the cubic equation. Lagrange realized that the key to a solution of a polynomial equation is to study expressions in terms of the roots that display a certain measure of symmetry among them. For more see Appendix B. Inspired by this, we define

$$\begin{aligned} \vec {w}:= \vec {r}_1 + \eta \, \vec {r}_2 +{\bar{\eta }}\, \vec {r}_3 \end{aligned}$$
(2.9)

where \(\eta \) is a primitive cube root of unity, namely \(\eta =\exp (2 \pi j /3)\), and we denote here the imaginary unit by j, \(j^2=-1\), since i will have a different use below.

The definition (2.9) is one of the central results of this paper. Let us discuss it. By construction \(\vec {w}\) is a complex vector. Naturally, it is composed of 2 real vectors, and so it has the correct number of components to serve as coordinates in the center of mass frame. It is translation-invariant as a result of the identity \(1+ \eta + {\bar{\eta }}=0\). Under a cyclic rotation it is merely multiplied by a \(2 \pi /3\) phase, while the exchange \(2 \leftrightarrow 3\) transforms it into its complex conjugate. These simple transformations under permutations render \(\vec {w}\) a natural coordinate.
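The three transformation properties just listed can be checked numerically. The following is a minimal sketch, not part of the derivation: the helper names `w_vector` and `close`, the random seed and the sample shift are assumptions introduced here, and Python's built-in `1j` stands in for the unit j of (2.9).

```python
import cmath
import random

# Primitive cube root of unity. Python's 1j stands in for the unit j of (2.9).
ETA = cmath.exp(2j * cmath.pi / 3)


def w_vector(r1, r2, r3):
    """Componentwise w = r1 + eta * r2 + conj(eta) * r3 for three real 3-vectors."""
    return [a + ETA * b + ETA.conjugate() * c for a, b, c in zip(r1, r2, r3)]


def close(u, v, tol=1e-12):
    """True if two complex vectors agree component by component."""
    return all(abs(a - b) < tol for a, b in zip(u, v))


rng = random.Random(0)  # arbitrary seed, for reproducibility only
r1, r2, r3 = ([rng.uniform(-1, 1) for _ in range(3)] for _ in range(3))
w = w_vector(r1, r2, r3)

# Translation invariance, a consequence of 1 + eta + conj(eta) = 0.
shift = [0.3, -1.2, 2.5]
shifted = [[x + s for x, s in zip(r, shift)] for r in (r1, r2, r3)]
translation_ok = close(w_vector(*shifted), w)

# Cyclic relabeling (r1, r2, r3) -> (r2, r3, r1): w picks up the phase conj(eta).
cyclic_ok = close(w_vector(r2, r3, r1), [ETA.conjugate() * c for c in w])

# Exchange of bodies 2 and 3: w goes to its complex conjugate.
exchange_ok = close(w_vector(r1, r3, r2), [c.conjugate() for c in w])

print(translation_ok, cyclic_ok, exchange_ok)
```

With this labeling convention the cyclic relabeling yields the phase \({\bar{\eta }}\); the opposite cyclic direction yields \(\eta \).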

Transformed Lagrangian. Given \(\vec {w}\), the relative positions can be expressed by

$$\begin{aligned} \vec {r}_{23}=\frac{1}{j \sqrt{3}} \left( \vec {w} - \vec {\bar{w}} \right) \end{aligned}$$
(2.10)

and similarly, replacing \(w \rightarrow \eta w\) in this expression gives \(\vec {r}_{12}\), and \(w \rightarrow {\bar{\eta }}w\) gives \(\vec {r}_{31}\).
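As an illustration, the inversion (2.10) and its two rotated versions can be tested against directly computed differences. This is a sketch under stated assumptions: the helper names `w_vector`, `relative_from_w` and `max_error` and the random sample are introduced here, and Python's `1j` again stands in for the unit j.

```python
import cmath
import math
import random

ETA = cmath.exp(2j * cmath.pi / 3)  # Python's 1j stands in for the unit j


def w_vector(r1, r2, r3):
    """Componentwise w = r1 + eta * r2 + conj(eta) * r3, as in (2.9)."""
    return [a + ETA * b + ETA.conjugate() * c for a, b, c in zip(r1, r2, r3)]


def relative_from_w(w):
    """Apply (2.10), (w - conj(w)) / (j sqrt(3)), componentwise; the result is real."""
    return [((c - c.conjugate()) / (1j * math.sqrt(3))).real for c in w]


def max_error(u, v):
    """Largest componentwise deviation between two real vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


rng = random.Random(1)  # arbitrary seed
r1, r2, r3 = ([rng.uniform(-1, 1) for _ in range(3)] for _ in range(3))
w = w_vector(r1, r2, r3)

r23 = relative_from_w(w)                               # (2.10) itself
r12 = relative_from_w([ETA * c for c in w])            # w -> eta w
r31 = relative_from_w([ETA.conjugate() * c for c in w])  # w -> conj(eta) w

err23 = max_error(r23, [a - b for a, b in zip(r2, r3)])
err12 = max_error(r12, [a - b for a, b in zip(r1, r2)])
err31 = max_error(r31, [a - b for a, b in zip(r3, r1)])
print(err23, err12, err31)
```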

The kinetic energy in the center of mass frame is given by

$$\begin{aligned} T_{CM}:= & {} T - \frac{M}{2}\, {\dot{\vec {r}}_{CM}}^2 = \frac{1}{2 M} \left( m_2\, m_3\, {\dot{\vec {r}}_{23}}^2 + cyc. \right) \nonumber \\= & {} \frac{1}{6 M} \left( m_2\, m_3\, \Vert \frac{d}{j\, dt}(\vec {w}-\vec {\bar{w}}) \Vert ^2 + cyc. \right) \end{aligned}$$
(2.11)

where the cyclic transformation denotes now \(w \rightarrow {\bar{\eta }}w\), in addition to \(1 \rightarrow 2 \rightarrow 3 \rightarrow 1\).
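Both lines of (2.11) can be compared numerically with \(T - \frac{M}{2}\, {\dot{\vec {r}}_{CM}}^2\) evaluated directly. The sketch below uses arbitrary masses and random velocities; all variable and helper names are assumptions made for illustration, and Python's `1j` stands in for the unit j.

```python
import cmath
import random

rng = random.Random(2)    # arbitrary seed
masses = [1.0, 2.5, 0.7]  # arbitrary illustrative values
vel = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
M = sum(masses)

# Direct evaluation of T - (M/2) v_CM^2 from the individual velocities.
T = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, vel))
v_cm = [sum(m * v[k] for m, v in zip(masses, vel)) / M for k in range(3)]
T_cm_direct = T - 0.5 * M * sum(c * c for c in v_cm)

# First line of (2.11): sum over the pairs (2,3), (3,1), (1,2), zero-based below.
pairs = [(1, 2), (2, 0), (0, 1)]
T_cm_pairs = sum(
    masses[a] * masses[b] * sum((x - y) ** 2 for x, y in zip(vel[a], vel[b]))
    for a, b in pairs
) / (2 * M)

# Second line of (2.11): the same quantity written through the time derivative of w.
ETA = cmath.exp(2j * cmath.pi / 3)
w_dot = [a + ETA * b + ETA.conjugate() * c for a, b, c in zip(*vel)]


def norm2(ws):
    """|| (ws - conj(ws)) / j ||^2 for a complex vector ws."""
    return sum(((c - c.conjugate()) / 1j).real ** 2 for c in ws)


# Each cyclic step relabels the masses and multiplies w by conj(eta).
T_cm_w = (
    masses[1] * masses[2] * norm2(w_dot)
    + masses[2] * masses[0] * norm2([ETA.conjugate() * c for c in w_dot])
    + masses[0] * masses[1] * norm2([ETA * c for c in w_dot])
) / (6 * M)

print(T_cm_direct, T_cm_pairs, T_cm_w)
```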

2.2 Rotations and the orientation–geometry decomposition

One can naturally divide the system’s coordinates into orientation coordinates and coordinates that describe the configuration up to rotations, namely geometry coordinates. At any given moment, the positions of the three bodies define an instantaneous plane (apart from collinear configurations, which are codimension 2 in configuration space and hence can be ignored for now; for more, see the end of Sect. 2.4).

We define a rotating body frame whose z axis is normal to the instantaneous plane (either choice of the positive direction would do, namely, this is a \(\mathbb {Z}_2\) gauge, and once chosen it remains determined throughout time evolution as long as collinear configurations are not reached). 2d vectors \(\vec {\rho }_a\) specify the positions within the rotating (xy) plane. As usual, inertial frame velocities are given by

$$\begin{aligned} \left. \frac{d}{dt} \right| _{inert} \vec {\rho }_{ab} = \dot{\vec {\rho }}_{ab} + \vec {\tilde{\omega }} \times \vec {\rho }_{ab} \end{aligned}$$
(2.12)

where \(\vec {\tilde{\omega }}\) is the angular velocity vector that describes the rotation of the body frame, and \(\omega \) is reserved for later use.

Substituting, the kinetic energy (2.11) becomes

$$\begin{aligned} T = \frac{1}{2}I_{ij}\, \tilde{\omega }^i\, \tilde{\omega }^j + \vec {L}_w \cdot \vec {\tilde{\omega }} + T_w \end{aligned}$$
(2.13)

where \(I_{ij}, ~ i,j=1,2,3\) denotes the inertia tensor in the rotating (and center of mass) frame

$$\begin{aligned} I_{ij} = \frac{1}{M} \left[ m_2\, m_3\, \left( \delta _{ij} \vec {\rho }_{23}^{\;2} - \rho ^i_{23} \rho ^j_{23} \right) + cyc. \right] ~; \end{aligned}$$
(2.14)

\(\vec {L}_w\) denotes the angular momentum due to the \(\vec {\rho }_a\) motion

$$\begin{aligned} \vec {L}_w= & {} L_w \hat{z} \nonumber \\ L_w= & {} \frac{1}{M} \left( m_2\, m_3\, \vec {\rho }_{23} \wedge \dot{\vec {\rho }}_{23} + cyc. \right) ~, \end{aligned}$$
(2.15)

where the wedge product between any 2d vectors \(\vec {u},\vec {v}\) is defined by \(\vec {u} \wedge \vec {v}:= (\vec {u} \times \vec {v}) \cdot \hat{z}\); and finally, the kinetic energy due to the \(\vec {\rho }_a\) motion

$$\begin{aligned} T_w:= \frac{1}{2 M} \left( m_2\, m_3\, \dot{\vec {\rho }}_{23}^{\; 2} + cyc. \right) ~. \end{aligned}$$
(2.16)

We note several properties of \(I_{ij}\). Since the mass distribution is confined to the \(z=0\) plane, one has

$$\begin{aligned} 0= & {} I_{13}= I_{23} \nonumber \\ I:= & {} I_{33} = I_{11} + I_{22} \end{aligned}$$
(2.17)

In addition, since the mass distribution consists of only 3 point masses, one has

$$\begin{aligned} I_{11}\, I_{22} - \left( I_{12}\right) ^2 = \frac{m_1\, m_2\, m_3}{M}\, \Delta ^2 \end{aligned}$$
(2.18)

where

$$\begin{aligned} \Delta := \det \left[ x_1, y_1, 1; x_2, y_2, 1; x_3, y_3,1 \right] \end{aligned}$$
(2.19)

is twice the (signed) area of the triangle formed by the bodies. In other words, the \(2 \times 2\) determinant of \(I_{\alpha \beta }, ~\alpha , \beta =1,2\) factors into a part that depends only on positions, and a part which depends only on the masses.
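The factorization (2.18) is easy to test numerically. The following sketch builds the in-plane block of (2.14) from relative vectors, so no shift to the center of mass is needed; the function names, the masses and the random positions are illustrative assumptions.

```python
import random


def inertia_in_plane(masses, pos):
    """Return (I11, I22, I12) of (2.14) for planar positions given as (x, y) pairs."""
    M = sum(masses)
    i11 = i22 = i12 = 0.0
    for a, b in ((1, 2), (2, 0), (0, 1)):  # pairs (2,3), (3,1), (1,2), zero-based
        dx = pos[a][0] - pos[b][0]
        dy = pos[a][1] - pos[b][1]
        weight = masses[a] * masses[b] / M
        i11 += weight * dy * dy   # rho^2 - x^2 = y^2
        i22 += weight * dx * dx   # rho^2 - y^2 = x^2
        i12 -= weight * dx * dy
    return i11, i22, i12


def twice_signed_area(pos):
    """Delta of (2.19): the determinant with rows (x_a, y_a, 1)."""
    (x1, y1), (x2, y2), (x3, y3) = pos
    return x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)


rng = random.Random(4)    # arbitrary seed
masses = [1.2, 0.4, 3.0]  # arbitrary illustrative values
pos = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3)]

i11, i22, i12 = inertia_in_plane(masses, pos)
lhs = i11 * i22 - i12 ** 2
rhs = masses[0] * masses[1] * masses[2] / sum(masses) * twice_signed_area(pos) ** 2
print(lhs, rhs)
```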

2.3 Rotations within the plane and geometry space

The orientation–geometry decomposition is not complete yet. To see that, let us count the number of generalized coordinates. We know that the 3d problem has 6 degrees of freedom (d.o.f.) in the center of mass frame (CM). The orientation is specified by 3 angles, such as the Euler angles, and the \(\vec {\rho }_a\) at CM (or equivalently the associated 2d \(\vec {w}\)) provide 4 other d.o.f., so altogether we have 7 generalized coordinates. This means that these coordinates are redundant by a single degree of freedom.

Rotations within the (xy) plane are the origin of this redundancy: since a rotation of the body frame around the z-axis is equivalent to a rotation of the 2d vectors, the configuration depends on the two associated angles only through their sum.

Bi-complex w. In order to account for plane rotations, it is useful to complexify the plane, so that rotations would be represented by a phase multiplication. By complexification, one means that any 2d vector \(\vec {\rho }\) is mapped onto a complex number \(\vec {\rho } \rightarrow \rho := \vec {\rho } \cdot \left( \hat{x} + i\, \hat{y} \right) \). Similarly, the 2d j-complex vector \(\vec {w}\), defined in (2.9), is mapped onto a so-called bi-complex number

$$\begin{aligned} \vec {w} \rightarrow w = \vec {w} \cdot \left( \hat{x} + i\, \hat{y} \right) \end{aligned}$$
(2.20)

which is of the form \(w = a + b\, i + c\, j + d \, i j\) with real coefficients a, b, c, d. The imaginary unit i represents a quarter rotation in the (xy) plane. Hence, i and j are commuting imaginary units, namely \(i^2=j^2=-1\) and \( i\, j = j\, i\). For more on the algebraic structure of bi-complex numbers, and their application to the isotropic oscillator of mechanics, see Appendix C.

Let us introduce a natural basis for the w space. Evaluating the w variable (2.20) for right-handed and left-handed equilateral triangles with unit sides motivates the definitions

$$\begin{aligned} e_R:= \frac{\sqrt{3}}{2} \left( 1+ i\, j\right) \qquad e_L:= \frac{\sqrt{3}}{2} \left( 1- i\, j\right) ~, \end{aligned}$$
(2.21)

where for concreteness, the equilateral triangles are chosen to be oriented such that the height from edge 2—3 to vertex 1 is pointed along \(+\hat{x}\) within the (xy) plane. Algebraically, \(e_R, \, e_L\) are characterized as being zero divisors of the bi-complex ring (see Appendix C), and thereby are natural.
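Both statements about \(e_R, \, e_L\) can be made concrete with a small bi-complex arithmetic sketch. The class `Bicomplex`, the helpers `w_planar` and `equilateral`, and the pair representation \(w = z_1 + j\, z_2\) with \(z_1, z_2 \in \mathbb {C}[i]\) are assumptions introduced here for illustration; in this block Python's `1j` stands in for the unit i, while j is carried by the second slot of the pair.

```python
import cmath
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Bicomplex:
    """z1 + j * z2, with z1 and z2 ordinary complex numbers in the unit i."""
    z1: complex
    z2: complex

    def __mul__(self, other):
        # i and j commute, and j * j = -1.
        return Bicomplex(self.z1 * other.z1 - self.z2 * other.z2,
                         self.z1 * other.z2 + self.z2 * other.z1)

    def distance(self, other):
        """Largest deviation between the two slots of two bi-complex numbers."""
        return max(abs(self.z1 - other.z1), abs(self.z2 - other.z2))


H = math.sqrt(3) / 2
E_R = Bicomplex(H, H * 1j)    # (sqrt(3)/2) (1 + i j), see (2.21)
E_L = Bicomplex(H, -H * 1j)   # (sqrt(3)/2) (1 - i j)
ZERO = Bicomplex(0, 0)


def w_planar(rho1, rho2, rho3):
    """w = rho1 + eta * rho2 + conj(eta) * rho3 with eta = exp(2 pi j / 3)."""
    c, s = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)
    return Bicomplex(rho1 + c * (rho2 + rho3), s * (rho2 - rho3))


def equilateral(handedness):
    """Unit-side equilateral triangle with vertex 1 on the +x axis; +1 is counter-clockwise."""
    radius = 1 / math.sqrt(3)  # circumradius of a unit-side equilateral triangle
    return [radius * cmath.exp(handedness * 2j * math.pi * a / 3) for a in range(3)]


w_right = w_planar(*equilateral(+1))
w_left = w_planar(*equilateral(-1))
product = E_R * E_L  # vanishes although neither factor does: zero divisors

print(w_right, w_left, product)
```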

Spherical coordinates for w. Considering w space as a 2d complex vector space over complex numbers involving i, namely \(\mathbb {C}[i]^2\), one can expand a general state w in the \(e_R,\, e_L\) basis as follows

$$\begin{aligned} w = r\, e^{i \psi _0}\, \left[ e^{i \phi /2}\, \cos \frac{\theta }{2}\, e_R + e^{-i \phi /2}\, \sin \frac{\theta }{2}\, e_L \right] ~. \end{aligned}$$
(2.22)

The definition of \(\psi _0\) can be changed into \(\psi = \psi _0 + \chi (\theta , \phi )\), which is a gauge transformation. The variable r sets the overall triangle scale, \(\psi \sim \psi + 2\pi \) is an overall rotation angle, \(\theta \in [0,\pi ]\) determines the relative magnitudes of the right and left components, and finally \(\phi \sim \phi + 2 \pi \) is the relative phase of the right and left components. Altogether \(r,\theta ,\phi ,\psi \) are spherical coordinates for \(w \in \mathbb {C}[i]^2\).

We define G to be the quotient space of planar three-body configurations, up to rotations, where G stands for geometry. In other words, G is the space of equivalence classes of congruent triangles. We can write

$$\begin{aligned} G = \mathbb {C}[i]^2/U(1) \end{aligned}$$
(2.23)

where the U(1) acts by overall phase rotations. We shall later see how to parameterize G in terms of quadratics in w. In coordinates, G is given by the w variable up to \(\psi \)-shifts, and it is parameterized by \(r,\theta ,\phi \). G includes information on both size and shape: the r coordinate determines the triangle size, while the \(\theta , \phi \) coordinates parameterize a sphere that describes triangles up to similarity, and is known as the shape sphere. Coordinates equivalent to \(r,\theta ,\phi \) appeared in Lemaître (1952) in Eqs. (1,2,14), where they were found through a trial and error process: Lemaître (1952) started with the equal mass case, used its increased symmetry, and then generalized to unequal masses, while Lemaître (1964) started from equal moments of inertia and then generalized. The term “shape sphere” appeared in Moeckel and Montgomery (2013), see also Easton (1971); Saari (1984); Moeckel (1988).

Substituting (2.22) into (2.10), the complexified relative position \(\rho _{23}\) becomes

$$\begin{aligned} \rho _{23} = i\, r\, e^{i \psi _0} \left[ e^{i \phi /2}\, \cos \frac{\theta }{2}\, - e^{-i \phi /2}\, \sin \frac{\theta }{2}\, \right] ~. \end{aligned}$$
(2.24)

In particular,

$$\begin{aligned} r_{23}^{~2} \equiv \left| \rho _{23} \right| ^2 = r^2\, \left( 1 - \sin \theta \, \cos \phi \right) ~. \end{aligned}$$
(2.25)

Similarly, \(r_{31}, r_{12}\) are given by the same expression after the substitutions \(\phi \rightarrow \phi - 2 \pi /3\), and \(\phi \rightarrow \phi + 2 \pi /3\), respectively. This means that expressions are invariant under a cyclic permutation combined with a third of a revolution in \(\phi \).
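A short numerical sketch of (2.24), (2.25) and the \(\phi \) shifts follows. The function names and sample values are assumptions made for illustration, and Python's `1j` stands in for the unit i. The two special points evaluated at the end, \(\theta =0\) and \(\theta =\pi /2,\, \phi =0\), give an equilateral triangle and a collision of bodies 2 and 3, respectively.

```python
import cmath
import math

TWO_PI_3 = 2 * math.pi / 3


def sides_squared(r, theta, phi):
    """Return (r23^2, r31^2, r12^2) from (2.25) and the phi shifts quoted after it."""
    return tuple(
        r * r * (1 - math.sin(theta) * math.cos(phi + shift))
        for shift in (0.0, -TWO_PI_3, +TWO_PI_3)
    )


def rho23(r, theta, phi, psi0):
    """Complexified relative position (2.24); Python's 1j stands in for i."""
    return 1j * r * cmath.exp(1j * psi0) * (
        cmath.exp(1j * phi / 2) * math.cos(theta / 2)
        - cmath.exp(-1j * phi / 2) * math.sin(theta / 2)
    )


r, theta, phi, psi0 = 1.7, 0.8, 2.1, 0.35  # arbitrary sample point
s23, s31, s12 = sides_squared(r, theta, phi)

# |rho23|^2 computed from (2.24) should reproduce (2.25), independently of psi0.
check_225 = abs(abs(rho23(r, theta, phi, psi0)) ** 2 - s23)

# The three squared sides always add up to 3 r^2.
sum_residual = abs(s23 + s31 + s12 - 3 * r * r)

pole = sides_squared(1.0, 0.0, 0.4)               # theta = 0: all sides equal to r
collision = sides_squared(1.0, math.pi / 2, 0.0)  # r23 = 0: bodies 2 and 3 coincide

print(check_225, sum_residual, pole, collision)
```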

\(\psi _0\), defined in (2.22), is symmetric between the R and L hemispheres, \(0 \le \theta \le \pi /2\) and \(\pi /2 \le \theta \le \pi \), respectively. However, \(\psi _0\) is singular for both R and L poles. A gauge that is regular at the R pole is given by

$$\begin{aligned} \psi _+:= \psi _0 + \phi /2 \end{aligned}$$
(2.26)

Kinetic energy. In spherical coordinates, and using the \(\psi _+\) gauge (2.26), \(T_w\) (2.16) becomes

$$\begin{aligned} T_w = \frac{1}{2}\, I\, \left( \dot{\psi }_+ +\frac{L_G}{I} \right) ^2 + T_G \end{aligned}$$
(2.27)

where the largest principal moment of inertia (for rotations within the plane) (2.14), is given by

$$\begin{aligned} I:= I_{33} = \frac{r^2}{M} \left[ m_2\, m_3 \left( 1 - \sin \theta \, \cos \phi \right) + cyc. \right] ~. \end{aligned}$$
(2.28)

\(L_G\) denotes the angular momentum in the geometry space G and it is given by

$$\begin{aligned} L_G = -\frac{r^2}{2\, M} \left[ m_2\, m_3 \left( \left( 1 - \cos \theta - \sin \theta \, \cos \phi \right) \dot{\phi } - \sin \phi \, \dot{\theta } \right) + cyc. \right] ~ \end{aligned}$$
(2.29)

where cyc. denotes \(1 \rightarrow 2 \rightarrow 3 \rightarrow 1\) together with \(\phi \rightarrow \phi - 2 \pi /3\), just like the comment after (2.25). \(L_G\) is related to \(L_w\), defined in (2.15), by

$$\begin{aligned} L_w=I\,\dot{\psi }_+ + L_G \end{aligned}$$
(2.30)

\(T_G\) denotes the kinetic energy in G space, and is given by

$$\begin{aligned} T_G:=\frac{1}{8\, I}\, \dot{I}^2 + \frac{3\, M_3}{8\, M} \frac{r^4}{I} \left( \dot{\theta }^2 + \sin ^2 \theta \, \dot{\phi }^2 \right) ~. \end{aligned}$$
(2.31)

Finally, \(M, M_2, M_3\) denote the elementary symmetric functions of the masses

$$\begin{aligned} M:= & {} m_1 + m_2 + m_3 \nonumber \\ M_2:= & {} m_2 \, m_3 + m_3 \, m_1 + m_1 \, m_2 \nonumber \\ M_3:= & {} m_1\, m_2\, m_3 ~. \end{aligned}$$
(2.32)

The derivation proceeds by projecting \(\dot{\rho }_{23}\) along the radial and the tangential directions with respect to \(\rho _{23}\), performed in the \(r,\theta ,\phi \) coordinates, and similarly for the other relative velocities. It is rather lengthy and non-illuminating and I suspect that a better one can be found, perhaps along the lines of Moeckel and Montgomery (2013). For these reasons, it will not be included here.
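Although the derivation is not reproduced, the decomposition (2.27) can be spot-checked numerically. The sketch below is an independent check and not the author's derivation: it differentiates the relative positions by central differences along an arbitrary straight path in \((r,\theta ,\phi ,\psi _+)\) and compares with (2.27) to (2.31). It assumes \(T_w\) is normalized with the prefactor \(\frac{1}{2M}\), as follows from substituting (2.12) into (2.11); all names, masses, rates and the step `h` are illustrative assumptions, and Python's `1j` stands in for the unit i.

```python
import cmath
import math

masses = (1.0, 2.0, 3.5)  # arbitrary illustrative values
M = sum(masses)
M3 = masses[0] * masses[1] * masses[2]
# Weights m2 m3 / M, m3 m1 / M, m1 m2 / M of the relative vectors rho23, rho31, rho12.
MU = (masses[1] * masses[2] / M, masses[2] * masses[0] / M, masses[0] * masses[1] / M)
ALPHA = 2 * math.pi / 3
RATES = (0.2, 0.5, -0.7, 0.3)  # (r_dot, theta_dot, phi_dot, psi_dot), arbitrary


def coords(t):
    """A straight path in (r, theta, phi, psi_+), so that the rates are exact."""
    start = (1.3, 0.9, 0.4, 0.1)
    return tuple(q0 + rate * t for q0, rate in zip(start, RATES))


def rel(t, a):
    """rho23, rho31, rho12 for a = 0, 1, 2 in the psi_+ gauge of (2.26)."""
    r, theta, phi, psi = coords(t)
    phase = (0.0, ALPHA, -ALPHA)[a]  # constant phases from w -> conj(eta) w and eta w
    u = math.cos(theta / 2) - cmath.exp(-1j * (phi - a * ALPHA)) * math.sin(theta / 2)
    return 1j * r * cmath.exp(1j * (psi + phase)) * u


def inertia(t):
    """I of (2.28)."""
    r, theta, phi, _ = coords(t)
    return r * r * sum(mu * (1 - math.sin(theta) * math.cos(phi - a * ALPHA))
                       for a, mu in enumerate(MU))


h = 1e-5  # finite-difference step


def ddt(f, *args):
    """Central-difference time derivative of f at t = 0."""
    return (f(h, *args) - f(-h, *args)) / (2 * h)


r, theta, phi, _ = coords(0.0)
r_dot, theta_dot, phi_dot, psi_dot = RATES
I = inertia(0.0)
I_dot = ddt(inertia)

T_w = sum(0.5 * mu * abs(ddt(rel, a)) ** 2 for a, mu in enumerate(MU))
L_w = sum(mu * (rel(0.0, a).conjugate() * ddt(rel, a)).imag for a, mu in enumerate(MU))
L_G = -0.5 * r * r * sum(                                            # (2.29)
    mu * ((1 - math.cos(theta) - math.sin(theta) * math.cos(phi - a * ALPHA)) * phi_dot
          - math.sin(phi - a * ALPHA) * theta_dot)
    for a, mu in enumerate(MU))
T_G = I_dot ** 2 / (8 * I) + 3 * M3 / (8 * M) * r ** 4 / I * (       # (2.31)
    theta_dot ** 2 + math.sin(theta) ** 2 * phi_dot ** 2)
rhs = 0.5 * I * (psi_dot + L_G / I) ** 2 + T_G                       # (2.27)
closure = abs(rel(0.0, 0) + rel(0.0, 1) + rel(0.0, 2))  # relative vectors sum to zero

print(T_w, rhs, L_w, I * psi_dot + L_G, closure)
```

Agreement at a single sample point is of course not a proof, only a consistency check of the quoted expressions.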

The kinetic energy (2.27) is in a form of a dimensional reduction (also known as Kaluza-Klein reduction) over the coordinate \(\psi \). Therefore, the expression for \(L_G\) depends on the choice of gauge for \(\psi \), while the kinetic metric \(T_G\) is gauge-independent, and represents the metric on the quotient space (2.23). The expression for \(T_G\) appeared essentially in Eq. (4.3.13) of Montgomery (2002), which studied the \(J^2=0\) sector, and in Moeckel and Montgomery (2013).

Fixing the \(\psi \) gauge. Let us return to the issue of coordinate redundancy discussed at the beginning of this subsection. This redundancy can be removed by fixing a gauge for the \(\psi \) coordinate, and we choose to set

$$\begin{aligned} \psi =0 ~. \end{aligned}$$
(2.33)

This can always be achieved through a choice of the xy body axes, which in turn fixes the value of \(\tilde{\omega }_z\). Clearly, the effect of this gauge is directly related to the choice of gauge for \(\psi \).

Relatedly, the kinetic energy depends on \(\tilde{\omega }_z,\, \dot{\psi }\) only through their sum. Indeed, the relevant terms in T are (2.13,2.27)

$$\begin{aligned} T{} & {} \supset \frac{1}{2}I \left( \tilde{\omega }_3 \right) ^2 + L_w \, \tilde{\omega }_3 + \frac{1}{2}I \left( \dot{\psi }+\frac{L_G}{I}\right) ^2 \nonumber \\ {}{} & {} = \frac{1}{2}I \left( \tilde{\omega }_3 + \dot{\psi }+\frac{L_G}{I}\right) ^2 = \frac{1}{2}I \left( \omega _3 +\frac{L_G}{I}\right) ^2 \end{aligned}$$
(2.34)

where in passing to the second line, we have used (2.30), and in the last equality we have used \(\vec {\omega }\) defined by

$$\begin{aligned} \vec {\omega } = \vec {\tilde{\omega }} + \dot{\psi }\, \hat{z} ~. \end{aligned}$$
(2.35)

Altogether, the expression for the kinetic energy in the center of mass frame becomes

$$\begin{aligned} T\left[ \vec {\omega },r,\theta ,\phi \right] = T_G + \frac{1}{2}I \left( \omega _3 +\frac{L_G}{I}\right) ^2 + \frac{1}{2}I_{\alpha \beta }\, \omega ^\alpha \, \omega ^\beta \end{aligned}$$
(2.36)

where \(T_G, I,\, L_G,\, I_{\alpha \beta } ~\alpha ,\beta =1,2\) depend on the G-space variables and were defined in (2.31,2.28,2.29,2.14) respectively.

An alternative gauge choice, suggested by Lemaître (1964), is to employ the principal axes of inertia also within the instantaneous plane. It has the advantage of being natural and of reducing the rotation equations (2.65) to the familiar Euler equations.

Interpretation as a 1d SO(2) gauge theory. The general formulation in a rotating frame (2.12) can be considered to be an SO(3) gauge theory in \(0+1\) dimensions. In fact, this relation is the definition of the covariant derivative D/Dt, where \(\vec {\tilde{\omega }}\) is the gauge field in the adjoint representation of so(3), namely \({\textbf{3}}\), and the charged fields are the position vectors \(\vec {r}_a\), which transform in the same representation. Finite gauge transformations correspond to a frame redefinition through an SO(3) rotation matrix.

In the three-body problem, we partially fix this gauge through \(z_a=0 ~a=1,2,3\). We are left with a 1d \(SO(2) \simeq U(1)\) gauge theory, the residue of the original SO(3) gauge theory. The gauge field is \(\tilde{\omega }_3\), and the charged fields are the 2d position vectors \(\rho _a\).

After transforming to spherical coordinates, \(\psi \) remains the only charged field, such that \(\dot{\psi }+\tilde{\omega }_3\) is gauge invariant. Changing variables into \(\omega _3\) fully eliminates the gauge field (this is possible for 1d gauge theories).

2.4 Geometry of geometry space

Let us study further the geometry of geometry space, namely three-body configurations up to rotations (2.23).

Invariants. So far, geometry space was described by the coordinates \(r,\theta ,\phi \). Alternatively, we can employ the invariants of the quotient. This approach will clarify the geometry at the origin of geometry space.

We define the quadratic invariants

$$\begin{aligned} Q_s:= \left[ \begin{array}{cc} w^*_1&w^*_2 \end{array} \right] \; \tau _s \; \left[ \begin{array}{c} w_1 \\ w_2 \end{array} \right] \qquad s=0,1,2,3,4 \end{aligned}$$
(2.37)

where \(w_1, w_2 \in \mathbb {C}[i] \) denote the j-real and imaginary parts of the bi-complex w, up to normalization, defined as follows

$$\begin{aligned} w = \sqrt{\frac{3}{2}} \left( w_1 + j\, w_2 \right) \end{aligned}$$
(2.38)

and the Pauli-like matrices \(\tau _s\) are given by

$$\begin{aligned} \tau _0 = \left[ \begin{array}{cc} 1 &{} 0 \\ 0 &{} 1 \end{array} \right] ,~ \tau _1 = \left[ \begin{array}{cc} 1 &{} 0 \\ 0 &{} -1 \end{array} \right] ,~ \tau _2 = \frac{1}{2}\left[ \begin{array}{cc} -1 &{} -\sqrt{3} \\ -\sqrt{3} &{} 1 \end{array} \right] ,~ \tau _3 = \frac{1}{2}\left[ \begin{array}{cc} -1 &{} \sqrt{3} \\ \sqrt{3} &{} 1 \end{array} \right] ,~ \tau _4 = \left[ \begin{array}{cc} 0 &{} -i \\ i &{} 0 \end{array} \right] \nonumber \\ \end{aligned}$$
(2.39)

Since \(Q_s\) are of the form \(\sim w^*\, w\), they are manifestly invariant under the rotations \(w \rightarrow \exp (i \psi )\, w\).

The quadratic invariants \(Q_0, \dots , Q_4\) are not all independent, but rather satisfy the relations

$$\begin{aligned} 0&= Q_1 + Q_2 + Q_3 \end{aligned}$$
(2.40a)
$$\begin{aligned} Q_0^2&= \frac{2}{3} \left( Q_1^2 + Q_2^2 + Q_3^2 \right) + Q_4^2 \end{aligned}$$
(2.40b)

Geometrically, relation (2.40a) means that \(Q_1, Q_2, Q_3\) describe a flat 2d space, and relation (2.40b) means that the quotient is a (light) cone in a 3+1 space. The cone singularity at the origin corresponds to a triple collision configuration.
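The relations (2.40) and the rotation invariance of the \(Q_s\) can be checked for a random configuration. In this sketch the names `TAU` and `invariants`, the seed and the rotation angle are assumptions introduced for illustration, and Python's `1j` stands in for the unit i.

```python
import cmath
import math
import random

S3 = math.sqrt(3)
# The matrices tau_0 ... tau_4 of (2.39); Python's 1j stands in for the unit i.
TAU = [
    [[1, 0], [0, 1]],
    [[1, 0], [0, -1]],
    [[-0.5, -S3 / 2], [-S3 / 2, 0.5]],
    [[-0.5, S3 / 2], [S3 / 2, 0.5]],
    [[0, -1j], [1j, 0]],
]


def invariants(w1, w2):
    """Q_s = w^dagger tau_s w of (2.37); each tau_s is Hermitian, so Q_s is real."""
    v = (w1, w2)
    return [
        sum(v[a].conjugate() * tau[a][b] * v[b] for a in range(2) for b in range(2)).real
        for tau in TAU
    ]


rng = random.Random(3)  # arbitrary seed
w1 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
w2 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
Q = invariants(w1, w2)

linear_residual = Q[1] + Q[2] + Q[3]  # relation (2.40a)
cone_residual = (
    Q[0] ** 2 - (2 / 3) * (Q[1] ** 2 + Q[2] ** 2 + Q[3] ** 2) - Q[4] ** 2
)  # relation (2.40b)

# Invariance under the overall rotation w -> exp(i psi) w.
rot = cmath.exp(0.77j)
rotation_shift = max(abs(a - b) for a, b in zip(Q, invariants(rot * w1, rot * w2)))

print(linear_residual, cone_residual, rotation_shift)
```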

We note that the \(Q_0,\dots Q_4\) variables are closely related to the Stokes parameters \(S_0, \dots , S_3\) Stokes (1852), which are used to describe states of polarization of light and are reviewed in Appendix C. More precisely, \(Q_0,\, Q_4\) are identical to \(S_0,\, S_3\), while \(Q_1,Q_2,Q_3\) up to the relation (2.40a) are analogous to \(S_1,S_2\). However, the Q variables are distinguished by being compatible with a symmetry of order 3.

In terms of the \(r,\theta ,\phi \) coordinates, substitution of (2.22) shows that the invariants are given by

$$\begin{aligned} Q_0&= r^2 \end{aligned}$$
(2.41a)
$$\begin{aligned} Q_a&= r^2 \sin \theta \, \cos \phi _a, ~ a=1,2,3 \end{aligned}$$
(2.41b)
$$\begin{aligned} Q_4&= r^2\, \cos \theta \end{aligned}$$
(2.41c)

where

$$\begin{aligned} \phi _a:= \phi -(a-1) 2 \pi /3 ~. \end{aligned}$$
(2.42)

In this form, the invariants are seen to be closely related to Cartesian coordinates \(\vec {G}\) associated with the spherical coordinates \(r^2,\theta ,\phi \), namely \(G_1=r^2 \sin \theta \, \cos \phi , ~G_2=r^2 \sin \theta \, \sin \phi ,\) and \(G_3=r^2 \cos \theta \). The three components of \(\vec {G}\) are independent coordinates, which solve the relations on the Q variables (2.40). Similarly, we define the vector \(\vec {g}\) to be the Cartesian coordinates associated with \(r, \theta , \phi \) (r instead of \(r^2\)).
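As an illustration of (2.41), one can build \(w_1, w_2\) from the spherical coordinates and evaluate the invariants through the matrices (2.39). This is a sketch under stated assumptions: the helper `w_components` reads off \(w_1, w_2\) from (2.22), (2.21) and (2.38), the expression used for \(G_2\) in terms of \(Q_2 - Q_3\) is one possible way to recover it (it follows from (2.41b) but is not stated in the text), and the sample point is arbitrary. Python's `1j` stands in for the unit i.

```python
import cmath
import math

S3 = math.sqrt(3)
TAU = [  # tau_0 ... tau_4 of (2.39)
    [[1, 0], [0, 1]],
    [[1, 0], [0, -1]],
    [[-0.5, -S3 / 2], [-S3 / 2, 0.5]],
    [[-0.5, S3 / 2], [S3 / 2, 0.5]],
    [[0, -1j], [1j, 0]],
]


def invariants(w1, w2):
    """Q_s = w^dagger tau_s w of (2.37)."""
    v = (w1, w2)
    return [
        sum(v[a].conjugate() * tau[a][b] * v[b] for a in range(2) for b in range(2)).real
        for tau in TAU
    ]


def w_components(r, theta, phi, psi0):
    """(w1, w2) of (2.38) for the w of (2.22).

    In the normalization of (2.38), e_R has (w1, w2) = (1, i) / sqrt(2)
    and e_L has (w1, w2) = (1, -i) / sqrt(2).
    """
    right = cmath.exp(1j * phi / 2) * math.cos(theta / 2)
    left = cmath.exp(-1j * phi / 2) * math.sin(theta / 2)
    prefactor = r * cmath.exp(1j * psi0) / math.sqrt(2)
    return prefactor * (right + left), prefactor * 1j * (right - left)


r, theta, phi, psi0 = 1.4, 1.1, 0.6, -0.9  # arbitrary sample point
Q = invariants(*w_components(r, theta, phi, psi0))

expected = (
    [r * r]
    + [r * r * math.sin(theta) * math.cos(phi - a * 2 * math.pi / 3) for a in range(3)]
    + [r * r * math.cos(theta)]
)  # (2.41a), (2.41b) with (2.42), (2.41c)
max_deviation = max(abs(q - e) for q, e in zip(Q, expected))

# Cartesian vector G associated with (r^2, theta, phi), recovered from the invariants.
G = (Q[1], (Q[2] - Q[3]) / S3, Q[4])
G2_residual = abs(G[1] - r * r * math.sin(theta) * math.sin(phi))

print(max_deviation, G, G2_residual)
```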

Using (2.25), the triangle edge lengths can now be expressed as

$$\begin{aligned} r_{23}^{~2} = Q_0 - Q_1 ~, \end{aligned}$$
(2.43)

and similarly for \(r_{31}^{~2}, ~ r_{12}^{~2}\). Summing all three, and using (2.40a,2.41a) we find

$$\begin{aligned} r^2 = \frac{1}{3} \left( r_{23}^{~2} + r_{31}^{~2} + r_{12}^{~2} \right) ~. \end{aligned}$$
(2.44)

This means that the geometric interpretation of the radial coordinate r is the root mean square of the side lengths.

We note that one could take a wider perspective, and rather than study the planar configuration of the bi-complex w up to phase shifts, one could study the 3d complex position vector \(\vec {w}\) (2.9) up to identification by 3d rotations, namely \(\vec {w}/SO(3)\). The quadratic invariants \(Q_0, \dots Q_3\) can be expressed in terms of scalar products among \(\vec {w}, \, \vec {\bar{w}}\) and hence are invariants from the 3d perspective. On the other hand, \(Q_4\) cannot be described in this way, and hence while it is an invariant of \(\mathbb {C}^2/U(1)\), only \(Q_4^{~2}\) is an invariant of \(\vec {w}/SO(3)\).

In addition, we note that \(Q_4\) is proportional to the triangle area, more precisely

$$\begin{aligned} Q_4 = \frac{2}{\sqrt{3}}\, \Delta \end{aligned}$$
(2.45)

where \(\Delta \) is twice the (signed) triangle area, and was defined in (2.19). This can be shown by following the definition of \(Q_4\) and using translation invariance to set \(\vec {\rho }_1=0\). Interestingly, Montgomery (2002) has shown that (2.40b) is equivalent to Heron’s formula for a triangle’s area.
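The relation between \(Q_4\) and the area can be illustrated numerically, up to sign, by computing the area from the side lengths through Heron's formula. The following is a sketch under the assumption that the squared side lengths are \(Q_0 - Q_a\) with the Q's of (2.41); the function names are illustrative only.

```python
import math

def side_lengths_sq(r, theta, phi):
    """Squared side lengths (r23^2, r31^2, r12^2) = Q0 - Q_a."""
    return tuple(
        r**2 * (1.0 - math.sin(theta) * math.cos(phi - k * 2.0 * math.pi / 3.0))
        for k in range(3))

def heron_area_sq(a2, b2, c2):
    """Squared triangle area from squared side lengths (polynomial form of Heron's formula)."""
    return (2.0 * (a2 * b2 + b2 * c2 + c2 * a2) - (a2**2 + b2**2 + c2**2)) / 16.0

def q4_sq_from_area(r, theta, phi):
    """Q4^2 obtained from (2.45), with Delta equal to twice the triangle area."""
    delta_sq = 4.0 * heron_area_sq(*side_lengths_sq(r, theta, phi))
    return (4.0 / 3.0) * delta_sq
```

The result can be compared with \(Q_4^{~2} = r^4 \cos ^2 \theta \) from (2.41c). The sign of \(Q_4\), namely the handedness, is not visible in the side lengths.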

Shape sphere. Geometry up to size is known as shape. For this reason, the unit sphere in geometry space, namely \(r=1\), is known as the shape sphere.

For triangles, shape is the same as classification up to similarity, which is well known to be classified by the values of the three angles which are constrained to sum to \(\pi \). This means that shape space should be a 2d surface. As we have seen, it turns out that this surface has the topology of a sphere. More precisely, the shape sphere is the space of shapes of triangles with labeled vertices.

The shape sphere is shown in Fig. 1. The geometrical interpretation of various locations on the shape sphere is known Moeckel and Montgomery (2013) and will be repeated here for convenience. The vertical \(Q_4\) coordinate is proportional to the triangle area, as stated in (2.45). Hence, the \(Q_4=0\) equator corresponds to collinear configurations. Within the collinear equator, the point \(Q_1=1\) implies \(w_2=0\) and hence it corresponds to the collision of the 2,3 vertices and it is denoted by C1. Similarly for C2, C3. Going away from the equator, the \(Q_4=1\) pole is associated with \(w =1 + i\, j\), which corresponds to a right equilateral triangle, namely, such that the 1,2,3 vertices are oriented in the positive mathematical direction (counter-clockwise). Similarly, the \(Q_4=-1\) pole corresponds to a left equilateral triangle.
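The locations described above can be summarized in a small classifier on the unit shape sphere. This is only an illustrative sketch: the function name `describe_shape`, the tolerance `tol` and the returned labels are assumptions introduced here, and the side lengths are taken from \(r_{ab}^{~2} = Q_0 - Q_c\) at \(r=1\).

```python
import math

def describe_shape(theta, phi, tol=1e-9):
    """Classify a point (theta, phi) on the unit shape sphere (r = 1)."""
    # Squared sides in the order (r23^2, r31^2, r12^2); a vanishing entry k
    # signals the binary collision labeled C(k+1).
    sides_sq = [1.0 - math.sin(theta) * math.cos(phi - k * 2.0 * math.pi / 3.0)
                for k in range(3)]
    q4 = math.cos(theta)
    if min(sides_sq) < tol:
        return "binary collision C%d" % (sides_sq.index(min(sides_sq)) + 1)
    if abs(q4) < tol:
        return "collinear"
    if max(sides_sq) - min(sides_sq) < tol:
        return "right equilateral" if q4 > 0 else "left equilateral"
    return "generic triangle"
```

For example, the pole \(\theta =0\) is reported as a right equilateral triangle, while the equatorial point \(\theta =\pi /2,\, \phi =0\) is reported as the collision C1.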

Fig. 1

The shape sphere describes the space of all possible labeled triangles in a plane up to similarity. See text for the description of various locations on it. The vertical direction represents \(Q_4\), while the horizontal directions represent \(Q_1, Q_2, Q_3\) constrained by (2.40a)

Just as the Q invariants were noted above to be closely related to the Stokes parameters and the Pauli matrices (Pauli 1927), the shape sphere is closely related to the Poincaré sphere, the sphere of polarizations that was introduced in the lectures on optics Poincaré (1892). These notions are also closely related to the Hopf fibration (Hopf 1931) and the Bloch sphere (Bloch 1946). The Pauli matrices are used to describe quantum operators on the states of a spin-half particle. The Hopf fibration represents the 3-sphere as a circle fibration over the 2-sphere. The Bloch sphere describes the states of a two-state quantum system, through an analogy with the spin-half system. Mathematically, the \(\mathbb {C}^2/U(1)\) quotient is at the root of all of the above-mentioned topics.

We record the form of several quantities, which appear in the expression for the kinetic energy, in terms of the Q variables

$$\begin{aligned} I&= \frac{1}{M} \left[ m_2\, m_3 (Q_0 - Q_1 ) + cyc. \right] \nonumber \\ L_G&= -\frac{\left[ m_2\, m_3 \left( (Q_1-Q_0-Q_4)\, \frac{d}{dt} (Q_2-Q_3) - (Q_2-Q_3)\, \frac{d}{dt}(Q_1-Q_0-Q_4) \right) + cyc. \right] }{2\, \sqrt{3} \,M\, (Q_0+Q_4)} \end{aligned}$$
(2.46)

In planar motion, collinear configurations are codimension 1, and hence occur from time to time in generic trajectories, see, e.g., Montgomery (2002). Each time this occurs, the triangle changes its right/left handedness and the trajectory crosses the equator of the shape sphere. On the other hand, in 3d motion collinear configurations are codimension 2, and hence generic trajectories never cross the equator, rather they approach it from time to time only to be eventually repelled by the centrifugal force (2.56). This raises a question regarding continuity of the limit of planar motion. A possible resolution is to identify \(Q_4 \simeq -Q_4\). This would make the Q variables invariant not only with respect to 2d rotations, but also with respect to 3d rotations.

2.5 Angular momentum variables

In order to incorporate into the reduction the conservation of angular momenta, we transform from angular velocity variables into the conjugate momenta, which are none other than the angular momenta

$$\begin{aligned} J_i = \frac{{\partial }T}{{\partial }\omega ^i} \qquad i=1,2,3. \end{aligned}$$
(2.47)

We have arrived at the final set of dynamical variables, namely

$$\begin{aligned} \vec {J},\, \vec {g} \end{aligned}$$
(2.48)

where \(\vec {g}\) is a location in geometry space, which usually would be represented by the spherical coordinates \(r, \theta , \phi \).

Performing a Legendre transform over the Lagrangian (2.3), we obtain

$$\begin{aligned} \mathcal{L}_J:= \mathcal{L}- \vec {\omega } \cdot \vec {J} = \mathcal{L}_0 + \mathcal{L}_1 + \mathcal{L}_2 ~. \end{aligned}$$
(2.49)

\(\mathcal{L}_J\) is a function of the dynamical variables (2.48) (together with the generalized velocities associated with \(\vec {g}\)). The expression for \(\mathcal{L}_J\) is organized into three parts \(\mathcal{L}_0, \mathcal{L}_1, \mathcal{L}_2\) according to powers of \(\vec {J}\), and their values are detailed below.

The transform that defined \(\mathcal{L}_J\) was taken only with respect to part of the velocity variables, and therefore, \(\mathcal{L}_J\) is a hybrid of a Lagrangian and a Hamiltonian (sometimes known as a Routhian). Later, we shall see that the equations of motion can be derived from \(\mathcal{L}_J\). It is useful to have a term that refers to any kind of such function, whether it is a Lagrangian, a Hamiltonian or a hybrid. In mechanics, the standard potential function encodes the forces through derivatives. In thermodynamics, one uses one of several thermodynamic potentials, such as the energy, the free energy, the Gibbs free energy, all related among themselves by Legendre transforms. Similarly, we shall use the term motion potential to refer to any function from which the equations of motion can be derived. From this perspective, the standard potential can be distinguished by the term force potential.

The first term appeared already in (2.2,2.31) and is repeated here for convenience. It depends only on the geometry space variables

$$\begin{aligned} \mathcal{L}_0:= \frac{1}{8\, I}\, \dot{I}^2 + \frac{3\, M_3}{8\, M} \frac{r^4}{I} \left( \dot{\theta }^2 + \sin ^2 \theta \, \dot{\phi }^2 \right) + \left( \frac{G m_2\, m_3}{r_{23}} + cyc. \right) \end{aligned}$$
(2.50)

where

$$\begin{aligned} I = \frac{1}{M} \left( m_2\, m_3\, r_{23}^{~2} + cyc.\right) \end{aligned}$$
(2.51)

and

$$\begin{aligned} r_{23}^{~2}&= r^2 \left( 1 - \sin \theta \, \cos \phi \right) \nonumber \\ r_{31}^{~2}&= r^2 \left( 1 - \sin \theta \, \cos (\phi -2 \pi /3) \right) \nonumber \\ r_{12}^{~2}&= r^2 \left( 1 - \sin \theta \, \cos (\phi -4 \pi /3) \right) \end{aligned}$$
(2.52)

The first two terms of \(\mathcal{L}_0\) specify the kinetic energy on geometry space—in the radial and angular directions, respectively. The last term is minus the potential, and for concreteness we present the case of a Newtonian gravitational potential.

The \(\mathcal{L}_1\) term couples geometry space and the rotating body, and it is given by

$$\begin{aligned} \mathcal{L}_1:= J_3 \frac{L_G}{I} \end{aligned}$$
(2.53)

where \(L_G\) is given by

$$\begin{aligned} L_G = -\frac{r^2}{2\, M} \left[ m_2\, m_3 \left( \left( 1 - \cos \theta - \sin \theta \, \cos \phi \right) \dot{\phi } - \sin \phi \, \dot{\theta } \right) + cyc. \right] \end{aligned}$$
(2.54)

Here, cyc. denotes \(1 \rightarrow 2 \rightarrow 3 \rightarrow 1\) together with \(\phi \rightarrow \phi - 2 \pi /3\). The expression for \(L_G\) is given in the \(\psi _+\) gauge (2.26), and a gauge transformation with a gauge function \(\chi \) could shift it by \(\Delta \mathcal{L}_1 = - J_3\, \dot{\chi }\). In geometry space, \(\mathcal{L}_1\) is analogous in form to a coupling of the motion in G-space to a vector potential, as noticed by Lemaître (1952). This means that the motion in G-space experiences a velocity-dependent force, which is magnetic-like in form, and can be thought to originate from a Coriolis force.

The magnetic-like vector is given by

$$\begin{aligned} \vec {B}:= 2 \frac{J_3}{I^{3/2}}\, r\, {\hat{r}} \end{aligned}$$
(2.55)

where \({\hat{r}}\) is a unit vector in the radial direction. It is obtained by dualizing the two-form associated with \(\mathcal{L}_1\) with respect to the kinetic metric (2.31): \(\vec {B} = *d\left( J_3 \frac{L_G\, dt}{I} \right) = 3 M_3\, J_3 \, r^4 \, / (2\, M\, I^2)\, *\left( d\theta \, \sin \theta d\phi \right) \) and using \(*=1/\sqrt{g}\, {\partial }^3 x = 8\, M\, I^{3/2}/(3 M_3 r^4 \sin \theta ) {\partial }_I {\partial }_\theta {\partial }_\phi \) we obtain \(\vec {B}=4\, J_3 {\partial }_I /\sqrt{I} = 2\, J_3 r {\partial }_r /I^{3/2}\). The expression for the magnetic-like vector field is surprisingly simple. It is radial and inversely proportional to \(r^2\), namely \(\vec {B} \propto {\hat{r}}/r^2\) (since \(I \propto r^2\)). It carries nonzero magnetic monopole charge, proportional to \(J_3\) and located at \(\vec {g}=0\). The monopole charge is a consequence of the intrinsic charge associated with the Hopf fibration, which is used to reduce over \(\psi \) rotations. The magnetic-like vector field has appeared already, though in different form, in Lemaître (1964) and as curvature terms in Moeckel and Montgomery (2013).
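The radial scaling of the field strength can be illustrated with a short numerical sketch. It evaluates I from (2.51) and (2.52) and the magnitude \(2 J_3\, r / I^{3/2}\) from (2.55). The function names and the sample masses are assumptions made for illustration.

```python
import math

def moment_of_inertia(r, theta, phi, masses):
    """I = (m2 m3 r23^2 + m3 m1 r31^2 + m1 m2 r12^2) / M, using (2.51)-(2.52)."""
    m1, m2, m3 = masses
    total = m1 + m2 + m3
    s = math.sin(theta)
    r23_sq = r**2 * (1.0 - s * math.cos(phi))
    r31_sq = r**2 * (1.0 - s * math.cos(phi - 2.0 * math.pi / 3.0))
    r12_sq = r**2 * (1.0 - s * math.cos(phi - 4.0 * math.pi / 3.0))
    return (m2 * m3 * r23_sq + m3 * m1 * r31_sq + m1 * m2 * r12_sq) / total

def b_magnitude(r, theta, phi, masses, j3):
    """Magnitude of the magnetic-like field (2.55): 2 J3 r / I^(3/2)."""
    inertia = moment_of_inertia(r, theta, phi, masses)
    return 2.0 * j3 * r / inertia**1.5
```

At fixed \(\theta , \phi \), tripling r should reduce the magnitude by a factor of 9, in line with the \(1/r^2\) fall-off of a monopole field.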

On the side of the rotating body, \(\mathcal{L}_1\) will be understood to generate a precession of \(\vec {J}\) around the z-axis.

The \(\mathcal{L}_2\) term also couples geometry space and the rotating body; only this term is quadratic in \(\vec {J}\) and is given by

$$\begin{aligned} - \mathcal{L}_2:= \frac{1}{2 I} J_3^{~2} + \frac{M}{2 M_3} \frac{1}{\Delta ^2} \bar{I}^{\alpha \beta }\, J_\alpha J_\beta \end{aligned}$$
(2.56)

where \(\Delta \) is twice the triangle’s area, defined in (2.19), and it can be expressed in G-variables through

$$\begin{aligned} \Delta ^2 = \frac{3}{4} \, r^4\, \cos ^2 \theta ~, \end{aligned}$$
(2.57)

and where \(\bar{I}_{\alpha \beta }\) is proportional to the inverse of the \(I_{\alpha \beta }\) matrix and is given by

$$\begin{aligned} \bar{I}^{\alpha \beta }:= \frac{1}{M} \left( m_2\, m_3\, \rho _{23}^\alpha \, \rho _{23}^\beta + cyc. \right) \end{aligned}$$
(2.58)

where \(\rho _{23}\) is given within the \(\psi _+\) gauge by

$$\begin{aligned} \rho _{23} = i\, r\, \left[ \cos \frac{\theta }{2}\, - e^{-i \phi }\, \sin \frac{\theta }{2}\, \right] ~. \end{aligned}$$
(2.59)

In geometry space, minus \(\mathcal{L}_2\) is interpreted as the centrifugal potential. It can be thought to generalize the familiar 2-body centrifugal potential

$$\begin{aligned} V_\mathrm{cent, 2\, body} = \frac{L^2}{2 \mu \, r^2} \end{aligned}$$
(2.60)

where \(\vec {L}\) is the system’s angular momentum, and \(\mu \) is its reduced mass. Indeed, 2-body motion is necessarily planar and hence \(J_1=J_2=0\). Moreover, \(I=\mu r^2\), thereby the 3-body expression for the centrifugal potential reduces to that of the 2-body.

As usual, a coupling in the motion potential implies several terms in the equations of motion. In this case, in addition to a centrifugal force acting on the geometric variables \(\vec {g}\), we shall see that \(\mathcal{L}_2\) also implies the Euler equations for the rotating body.

Equations of motion. We performed several natural changes of variables in the kinetic energy in order to re-formulate the problem, and thereby re-phrase the equations of motion. One way to obtain the equations of motion would be to express \(\vec {\omega }\) in (2.36) in terms of a set of frame orientation angles, such as the Euler angles, which would be used as the fundamental dynamical variables. However, this procedure requires an arbitrary choice of the Euler-like angles and obscures the rotational symmetry.

Alternatively, it would be nice to derive the equations of motion using \(\vec {\omega }\) as the fundamental velocity variables. However, the standard Euler–Lagrange equations would not produce the correct equations of motion, as can be seen for the example of the rigid body in Appendix D. Indeed, if we write \(\omega ^i = \beta ^i_j\, \dot{q}^j\), where \(q^i\) are generalized coordinates, then \(\beta ^i \equiv \beta ^i_j\, dq^j\) are 1-forms over G-space, such that \(d\beta ^i \ne 0\) and hence \(\beta ^i\) cannot be expressed as a differential of any generalized coordinates. In other words, \(\beta ^i\) define a non-coordinate basis of differential forms, also known as a non-holonomic basis.

In order to derive the equations of motion from a Lagrangian expressed in terms of non-coordinate velocities, we rediscovered the appropriate generalization of the Euler–Lagrange equations. This partially known generalization was originally found by Poincaré (1901) and it is described in Appendix D. Here we only state the result for the case at hand.

In a Lagrangian formulation, one takes

$$\begin{aligned} \mathcal{L}(\vec {\omega },\vec {g}) = T(\vec {\omega },\vec {g}) - V(\vec {g}) \end{aligned}$$
(2.61)

where \(T(\vec {\omega },\vec {g})\) is given by (2.36) and \(V(\vec {g})\) is an arbitrary potential. The above-mentioned differential 1-forms have the following nonzero exterior differentials

$$\begin{aligned} d\beta ^i = \frac{1}{2}\epsilon _{ijk}\, \beta ^k \beta ^j \end{aligned}$$
(2.62)

and hence the equations of motion read

$$\begin{aligned} \frac{d}{dt} \left( \frac{{\partial }\mathcal{L}}{{\partial }\omega ^i} \right) + \epsilon _{ijk}\, \omega ^j \, \frac{{\partial }\mathcal{L}}{{\partial }\omega ^k}= & {} 0 \nonumber \\ \frac{d}{dt} \left( \frac{{\partial }\mathcal{L}}{{\partial }\dot{q}^s} \right)= & {} \frac{{\partial }\mathcal{L}}{{\partial }q^s} \end{aligned}$$
(2.63)

where \(q^s=(r,\theta ,\phi )\), the spherical coordinates in geometry space. The first line describes 3 generalized equations of motion which originate from variations with respect to \(\omega \), while the second line describes 3 standard Euler–Lagrange equations of motion that originate from variation with respect to the G-space variables.

In terms of the partial Legendre transform \(\mathcal{L}_J\) (2.49), the equations of motion are given by

$$\begin{aligned} \frac{d}{dt}\, J_i= & {} -\{ J_i, \mathcal{L}_J \} \nonumber \\ \frac{d}{dt} \left( \frac{{\partial }\mathcal{L}_J}{{\partial }\dot{q}^s} \right)= & {} \frac{{\partial }\mathcal{L}_J}{{\partial }q^s} \end{aligned}$$
(2.64)

where the Poisson brackets among the dynamic variables originate from the non-coordinate nature of the corresponding velocities according to the general rule (D.9) and in our case are given by

$$\begin{aligned} \{ J_i, J_j \} = -\epsilon _{ijk} \, J_k \end{aligned}$$
(2.65)

After a solution \(\vec {J}=\vec {J}(t),\, \vec {g}=\vec {g}(t)\) is found, one can further integrate to obtain the frame orientation angles.

Note that \(\vec {J}^{\;2}\) is conserved as a result of (2.49,2.64). Hence, phase space is essentially 8 dimensional: 6 dimensions for \(\vec {g}\) and its generalized velocities and 2 additional dimensions for \(\vec {J}\) that lies on a sphere of fixed \(\vec {J}^{\;2}\).
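The conservation of \(\vec {J}^{\;2}\) can be illustrated in a simplified setting. The sketch below assumes a frozen geometry (the rigid body limit), principal axes, and no \(\mathcal{L}_1\) term, so that the first line of (2.63) becomes \(\dot{J}_i = -\epsilon _{ijk}\, \omega ^j J_k\) with \(\omega ^i = J_i/I_i\). The principal moments are hypothetical values, and the integrator is a standard fourth-order Runge–Kutta scheme chosen for illustration only.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def j_dot(j, inv_inertia):
    """dJ/dt = J x omega, with omega_i = J_i / I_i (frozen geometry, principal axes)."""
    omega = tuple(ji * wi for ji, wi in zip(j, inv_inertia))
    return cross(j, omega)

def rk4_step(j, inv_inertia, dt):
    """One classical Runge-Kutta step for the Euler-type equation."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = j_dot(j, inv_inertia)
    k2 = j_dot(shifted(j, k1, 0.5 * dt), inv_inertia)
    k3 = j_dot(shifted(j, k2, 0.5 * dt), inv_inertia)
    k4 = j_dot(shifted(j, k3, dt), inv_inertia)
    return tuple(x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for x, a, b, c, d in zip(j, k1, k2, k3, k4))

def norm_sq(v):
    """Squared Euclidean norm of a 3-vector."""
    return sum(x * x for x in v)

# Hypothetical principal moments obeying I3 = I1 + I2, as for a planar body.
inv_inertia = (1.0 / 1.0, 1.0 / 2.0, 1.0 / 3.0)
j_start = (0.3, -0.5, 0.8)
j = j_start
for _ in range(1000):
    j = rk4_step(j, inv_inertia, 0.01)
j_sq_initial = norm_sq(j_start)
j_sq_final = norm_sq(j)
```

Since \(\dot{\vec {J}}\) is orthogonal to \(\vec {J}\) by construction, \(\vec {J}\) stays on a sphere, and the numerical drift in `j_sq_final` relative to `j_sq_initial` reflects only the integrator's truncation error.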

We comment that one could obtain a formulation where the vector potential is replaced by the gauge-invariant field strength by performing a Legendre transform on the remaining velocity variables, and employing “covariant momenta” (in analogy with the covariant derivative).

2.6 Discussion

The motion potential \(\mathcal{L}_J=\mathcal{L}_J(\vec {J},\vec {g})\) given by (2.49) defines a reformulation of the three-body problem, and it is a central result of this paper. The dynamic variables \(\vec {J},\vec {g}\) are compatible with the conserved charges in such a way as to reduce the number of effective degrees of freedom.

The employed dynamical variables decompose into two sets. The first set consists of the \(\vec {J}\) variables that describe the angular momentum in the body frame. They are the momenta canonically conjugate to \(\vec {\omega }\), the angular velocity vector in the body frame. Hence, \(\vec {J}\) describes the rotational motion of the configuration triangle. The second set of variables consists of \(\vec {g}\) or in spherical coordinates \((r,\theta ,\phi )\). These describe the geometry of the triangle, namely its shape and size, see Fig. 1. The two sets are coupled as the geometry of the triangle determines its moment of inertia, which affects its rotation motion.

The corresponding decomposition of the mechanics into a rotational motion and a motion in 3d geometry space is illustrated in Fig. 2.

Let us look for the origin of the chaotic nature of the system in terms of these two components. Already in the limit of planar motion, the rotational component of the motion becomes trivial, yet the system is chaotic. Indeed, the motion in geometry space is non-integrable since it has 3 degrees of freedom and only a single conserved charge, namely the energy. Coupling to a rotating body cannot change this non-integrable nature. Moreover, given that the motion of a rotating body is integrable in the rigid body limit, geometry space can be considered to be the core chaotic component in the three-body system.

Fig. 2

The natural dynamical reduction into orientation and geometry. Left: the three bodies define a triangle. Its dynamics is decomposed into its orientation and its geometry (shape + size). The dynamical variables for the orientation are taken to be the components of the total angular momentum within the rotating system. Right: Mechanics in geometry space. Geometry space is a three-dimensional space in which each point describes the shape and size of a triangle formed by the three bodies. \(r,\theta ,\phi \) are considered to be spherical coordinates in this space. The (yellow) surface shaped like a pipe joint is a surface of constant potential (drawn for the equal mass case), so that motion is confined to be within it. The three solid (black) lines are sources of an attractive potential which originates with the Newtonian gravitational potential. Finally, the (blue) arrows show the radial magnetic-like field, which originates from a Coriolis force

Various ingredients of this formulation have already appeared in the literature. The \(r, \theta ,\phi \) coordinates together with the vector potential appeared in Lemaître (1952). The kinetic term in shape space, which is conformal to the round sphere, appeared in Montgomery (2002); Moeckel and Montgomery (2013) together with the origin of geometry space from a quotient of a 4d space. In hindsight, these elements partially appeared also in the Helium atom context: the 3d geometry space and its conformal equivalence to the sphere in Gronwall (1932), and the 4d predecessor in Fock (1954).

The current paper includes two main novelties: first, the definition of the complex position vector \(\vec {w}\) that solves the center of mass constraint (2.9), and secondly, the formulation in terms of \(\vec {J}\), the angular momentum components in the rotating system, see Sect. 2.5. In comparison to previous work, Moeckel and Montgomery (2013) was limited to planar motion, while we address the full 3d problem. \(\vec {J}\) has not been used as dynamical variables before, and in particular neither in Lemaître (1952) nor in Moeckel and Montgomery (2013). While in hindsight, Tkachenko (1978) can be recognized to have some relation to a formulation in terms of \(\vec {J}\) within the Helium atom context, it is at most partial and special due to the special values of the masses and the distinct quantum context.

Finally, the definition of \(\vec {w}\) naturally explains certain features that were previously observed, yet deemed surprising. Lemaître (1952) introduces the spherical coordinates in geometry space without explanation. It starts with the equal mass case, where the correct definitions are easier to guess, before generalizing to the general case of unequal masses. Moeckel and Montgomery (2013) arrives at these coordinates after a choice of coordinates on \(\mathbb{C}\mathbb{P}^1\) and comments that “Remarkably, it turns out that if we put the binary collisions at the third roots of unity... then the equilateral points are automatically moved to the north and south poles.” The definition of \(\vec {w}\) provides the missing link and makes natural the transition to the geometry space coordinates and their properties.

3 Applications

This section describes certain applications of the formulation in Sect. 2.

3.1 Hill-like region in geometry space

We have described a dynamical reduction onto geometry space, the space of all possible triangle geometries. One of its ingredients is a centrifugal potential, which implies an effective potential in geometry space. This allows one to define the region of allowed motion in geometry space given the conserved charges. It is a generalization of the Hill region from its original context within the hierarchical limit Hill (1877, 1878), into the full non-hierarchical range.

Let us rewrite the motion potential term (2.56) as a centrifugal potential (as remarked above (2.60))

$$\begin{aligned} V_\textrm{cent} = \frac{1}{2}\left( I^{-1}\right) ^{ij}\, J_i \, J_j ~. \end{aligned}$$
(3.1)

The principal moments of inertia are bounded from above by \(I_a \le I, ~a=1,2,3\), where \(I=I(\vec {g})\) is the moment around axis 3 that is perpendicular to the instantaneous plane and was given in (2.28,2.51) as a function of geometry space. This inequality holds since a three-body configuration is necessarily planar. It implies a lower bound on the centrifugal energy

$$\begin{aligned} V_\textrm{cent} \ge V_\textrm{cent,min}:= \frac{J^2}{2\, I} ~. \end{aligned}$$
(3.2)

The bound is saturated when \(\vec {J}\) is in the direction of axis 3, namely \(J^2 = J_3^2\).

Given the conserved quantities \(E,\, J^2\), the following inequality holds \(E \ge V + V_\textrm{cent} \), where the potential V is given by (2.2). Combining with (3.2), the motion in geometry space is restricted to

$$\begin{aligned} E \ge V\left( \vec {g}\right) + \frac{J^2}{2\, I\left( \vec {g}\right) } ~. \end{aligned}$$
(3.3)

In fact, the inequality can be saturated—this happens when both \(\dot{\vec {g}}=0\) and \(J^2 = J_3^2\). Therefore, this relation defines the projection of the allowed phase space into geometry space, which deserves to be called a Hill-like region (or in short, a Hill region).
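A membership test for the Hill-like region follows directly from (3.3). The sketch below assumes the Newtonian potential and the expressions (2.51) and (2.52) for I and the side lengths. The function names, the argument `grav` for Newton's constant and the sample values are illustrative assumptions.

```python
import math

def sides_sq(r, theta, phi):
    """Squared side lengths (r23^2, r31^2, r12^2) from (2.52)."""
    return tuple(
        r**2 * (1.0 - math.sin(theta) * math.cos(phi - k * 2.0 * math.pi / 3.0))
        for k in range(3))

def hill_margin(r, theta, phi, masses, energy, j_sq, grav=1.0):
    """E - V(g) - J^2 / (2 I(g)); non-negative inside the Hill-like region (3.3).

    Not defined at binary collisions, where a side length vanishes.
    """
    m1, m2, m3 = masses
    total = m1 + m2 + m3
    r23_sq, r31_sq, r12_sq = sides_sq(r, theta, phi)
    inertia = (m2 * m3 * r23_sq + m3 * m1 * r31_sq + m1 * m2 * r12_sq) / total
    potential = -grav * (m2 * m3 / math.sqrt(r23_sq)
                         + m3 * m1 / math.sqrt(r31_sq)
                         + m1 * m2 / math.sqrt(r12_sq))
    return energy - potential - j_sq / (2.0 * inertia)
```

For instance, with unit masses, \(G=1\), \(E=-2\) and \(J^2=1\), the equilateral point at \(r=1\) has a positive margin and is allowed, whereas the equilateral point at \(r=10\) has a negative margin and is excluded.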

The Hill region in geometry space is illustrated by the right-hand side of Fig. 2, where the Hill region lies inside of the shown surface, which resembles a pipe joint connecting three pipes. In the figure, the masses are equal and \(J^2=0\). For \(J^2 > 0\), a neighborhood of the origin, \(\vec {g}=0\), is deleted due to the centrifugal term. As \(J^2\) is increased, the Hill region shrinks and can be seen to undergo a topology change.

An analogous discussion of Hill’s region for the planar problem can be found in Moeckel (1988). The dynamical implications found there are hereby generalized to the 3d system.

3.2 The total phase volume

The dynamical reduction was applied to the evaluation of the regularized phase volume of the three-body system.

The phase volume is of intrinsic interest, and its regularized version is an ingredient of the flux-based theory Kol (2021). It is defined through a multi-dimensional integral over all of phase space. The dynamical reduction motivates changes of variables that allow one to perform some of these integrations analytically, thereby enabling a full evaluation, as shown in Dandekar et al. (2022).

In fact, phase volume evaluation serves not only as an application of the dynamical reduction, but also as a signpost on the road to it, as described in the introduction. Therefore, we can consider the two topics to have co-evolved in symbiosis.

3.3 Planar motion

This subsection describes an application of the current formulation to planar three-body motion, and in particular to uniformly rotating configurations.

If the initial velocities are within the three-body plane, then the motion remains planar throughout. For planar motion, \(J_1, J_2\) vanish identically, while \(J_3\) is conserved. Hence, the rotating body motion is trivial, and \(\mathcal{L}_2\) specializes to

$$\begin{aligned} \mathcal{L}_{2,2d} = - \frac{1}{2 I} J_3^{~2} \end{aligned}$$
(3.4)

while \(\mathcal{L}_0, \mathcal{L}_1\) remain as defined in (2.50,2.53).

Equilibria. Equilibria of the reduced planar motion describe rigidly rotating solutions.

The effective potential in geometry space \(V_{eff}^G = V_{eff}^G (r,\theta ,\phi )\) is given by

$$\begin{aligned} V_{eff}^G = V + \frac{J_3^{~2}}{2\, I} \end{aligned}$$
(3.5)

where the second term is the centrifugal potential.

If V is a power law in the inter-body distance, one can solve the equilibrium condition explicitly for r. In the remainder of this section, we assume the Newtonian gravitational potential. Solving the equation \(0={\partial }V_{eff}^G/{\partial }r\) for r and substituting back into \(V_{eff}^G\), one gets a potential that may be called the effective potential in shape space \(V_{eff}^S = V_{eff}^S (\theta ,\phi )\). It is given by

$$\begin{aligned} V_{eff}^S = -\frac{1}{2\, J_3^{~2}} I\, V^2 \end{aligned}$$
(3.6)

Note that \(I\, V^2\) is indeed r-independent, on account of the r-scaling properties of I and V.
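The elimination of r can be checked numerically. Writing \(I = c\, r^2\) and \(V = -v/r\), with c and v depending on \(\theta , \phi \) only, the radial equilibrium is at \(r_* = J_3^{~2}/(c\, v)\). The sketch below uses the hypothetical helper names `shape_factors`, `v_eff_geometry`, `equilibrium_radius` and `v_eff_shape`, and assumes the Newtonian potential together with (2.51) and (2.52).

```python
import math

def shape_factors(theta, phi, masses, grav=1.0):
    """Return (c, v) such that I = c r^2 and V = -v / r."""
    m1, m2, m3 = masses
    total = m1 + m2 + m3
    # x[k] = r_ab^2 / r^2 in the order (23, 31, 12), from (2.52)
    x = [1.0 - math.sin(theta) * math.cos(phi - k * 2.0 * math.pi / 3.0)
         for k in range(3)]
    c = (m2 * m3 * x[0] + m3 * m1 * x[1] + m1 * m2 * x[2]) / total
    v = grav * (m2 * m3 / math.sqrt(x[0]) + m3 * m1 / math.sqrt(x[1])
                + m1 * m2 / math.sqrt(x[2]))
    return c, v

def v_eff_geometry(r, theta, phi, masses, j3, grav=1.0):
    """Effective potential in geometry space (3.5): V + J3^2 / (2 I)."""
    c, v = shape_factors(theta, phi, masses, grav)
    return -v / r + j3**2 / (2.0 * c * r**2)

def equilibrium_radius(theta, phi, masses, j3, grav=1.0):
    """Solution of dV_eff^G / dr = 0 at fixed shape."""
    c, v = shape_factors(theta, phi, masses, grav)
    return j3**2 / (c * v)

def v_eff_shape(theta, phi, masses, j3, grav=1.0):
    """Effective potential in shape space (3.6): -(I V^2) / (2 J3^2)."""
    c, v = shape_factors(theta, phi, masses, grav)
    return -c * v**2 / (2.0 * j3**2)
```

Evaluating `v_eff_geometry` at `equilibrium_radius` should reproduce `v_eff_shape` for any non-collision shape.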

To proceed toward the equilibria, it remains to differentiate \(V_{eff}^S\) with respect to \(\theta , \phi \). It turns out to be convenient to use the side lengths, \(r_{12}, r_{23}, r_{31}\), rather than \(\theta , \phi \). We require

$$\begin{aligned} 0&= \frac{{\partial }}{{\partial }r_{23}} \log V_{eff}^S = \frac{{\partial }}{{\partial }r_{23}} \left( \log I + 2 \log (-V) \right) \nonumber \\ {}&= \frac{2\, m_2\, m_3\, r_{23}}{M\, I} - \frac{2\, G\, m_2\, m_3}{(-V)\, r_{23}^{~2}} \end{aligned}$$
(3.7)

where we have used the expressions for \(V,\, I\) from (2.2, 2.51). Solving for \(r_{23}\) the factors of \(m_2\, m_3\) cancel out on account of the equality of gravitational and inertial masses, and one finds that the side lengths are all equal

$$\begin{aligned} r_{12} = r_{23} = r_{31} = \frac{M\, J_3^2}{G\, M_2^2} ~. \end{aligned}$$
(3.8)

These are the equilateral solutions found by Lagrange (1772). Thus, we presented a derivation of them through the reduced Lagrangian in geometry space.

Note that the angular velocity associated with an equilateral triangle with side length a is given by

$$\begin{aligned} \Omega ^2 \equiv \left( \frac{J_3}{I} \right) ^2 = \frac{GM}{a^3} \end{aligned}$$
(3.9)

which is consistent with Kepler’s third law and the fact that within the equilateral orbit each body moves as though attracted toward the center of mass. (For this reason, these configurations are known as central configurations.)
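The equilibrium property can be verified numerically by treating the three side lengths as local coordinates on geometry space, as in (3.7). The sketch below evaluates \(V_{eff}^G\) of (3.5) as a function of the side lengths, sets \(J_3 = I\, \Omega \) with \(\Omega \) taken from (3.9), and estimates the gradient by central differences. The function names, the step `h` and the sample masses are illustrative assumptions.

```python
import math

def v_eff_sides(sides, masses, j3, grav=1.0):
    """V + J3^2 / (2 I) as a function of the side lengths (r12, r23, r31)."""
    r12, r23, r31 = sides
    m1, m2, m3 = masses
    total = m1 + m2 + m3
    inertia = (m1 * m2 * r12**2 + m2 * m3 * r23**2 + m3 * m1 * r31**2) / total
    potential = -grav * (m1 * m2 / r12 + m2 * m3 / r23 + m3 * m1 / r31)
    return potential + j3**2 / (2.0 * inertia)

def equilateral_j3(a, masses, grav=1.0):
    """J3 = I * Omega for an equilateral triangle of side a, with Omega^2 = G M / a^3."""
    m1, m2, m3 = masses
    total = m1 + m2 + m3
    inertia = (m1 * m2 + m2 * m3 + m3 * m1) * a**2 / total
    return inertia * math.sqrt(grav * total / a**3)

def gradient(sides, masses, j3, grav=1.0, h=1e-6):
    """Central-difference gradient of v_eff_sides with respect to the side lengths."""
    grads = []
    for k in range(3):
        up, down = list(sides), list(sides)
        up[k] += h
        down[k] -= h
        grads.append((v_eff_sides(up, masses, j3, grav)
                      - v_eff_sides(down, masses, j3, grav)) / (2.0 * h))
    return grads
```

With unequal masses, the gradient at the equilateral configuration should vanish to within the finite-difference error, while a different value of \(J_3\) at the same side length gives a nonzero gradient.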

We add some comments.

Collinear solutions. \(V_{eff}^S\) is invariant under a space reflection, which when translated to geometry space becomes a reflection through the equator. Hence, at points on the equator, the gradient of \(V_{eff}^S\) must be tangent to the equator. By restricting \(V_{eff}^S\) to the equator one finds the three collinear solutions found by Euler (1767). These solutions were not found above while differentiating with respect to the \(r_{ab}\) variables, since the transformation between them and the \(r,\theta ,\phi \) variables is singular at the equator.

Type of equilibrium. By noting the behavior of \(V_{eff}^S\) near the equator, one concludes that the extremum associated with equilateral motion is in fact a maximum.

Stability. The condition for the stability of the equilateral solutions is \(M_2/M^2 \le 1/27\). It was found by Gascheau (1843), see also Routh (1875) since I was not able to locate the former reference.

More generally, central configurations yield not only rigidly rotating solutions but also rotating–rescaling elliptic solutions, where the masses form an equilateral triangle and each mass moves on a Keplerian ellipse. In geometry space, these solutions correspond to radial motion along the polar directions, which can be seen to be a consistent truncation of the equations of motion. This validates the term \(\dot{I}^2/(8\, I) \subset T_G\) (2.31). The stability of the elliptic equilateral solutions was found in Danby (1964); Roberts (2002); Sicardy (2010); Martinez et al. (2006).

The current formulation reduces the stability analysis. In particular, the stability of the circular equilateral solutions is reduced to stability around a static solution. It would be interesting to revisit the above-mentioned stability criteria.

3.4 Economizing simulation

Clearly, a dynamical reduction has implications for reducing computation time in simulations (in addition to its implications for theory).

The standard Newtonian formulation involves 9 second-order differential equations (or equivalently, 9 degrees of freedom).

Using the j-complex position vector \(\vec {w}\) (2.9) as a dynamical variable for simulations reduces the equation set to 6 second-order equations (6 degrees of freedom) and guarantees the conservation of the center of mass during the simulation. In other words, it avoids the cost of an unnecessary simulation of the motion of the center of mass.

In addition, transforming the equations into the geometry variables \(\vec {g} \leftrightarrow (r, \, \theta , \, \phi )\) (2.22) and the angular momentum variables in the body frame \(\vec {J}\) (2.47), reduces the equations to 3 second-order equations for \(\vec {g}\), plus 3 first-order equations for \(\vec {J}\) (essentially, Euler’s equations), see (2.64), plus 3 integrations of Euler-like orientation angles, which do not affect the previous equations. In fact, \(\vec {J}\) is constrained to move on the surface of a sphere. Hence, by transforming to coordinates on this sphere, the \(\vec {J}\) equations would reduce to 2 first-order equations, which guarantee the conservation of \({\vec J}^2\).

It would be interesting to implement these equation sets on a computer and to measure or quantify the resulting reduction in simulation time.

4 Four-body problem

In the three-body problem, we introduced symmetric vector coordinates in the center of mass frame, inspired by Lagrange’s solution to the cubic (2.9). The quartic equation also has a general solution, which suggests a generalization to the four-body problem.

Denoting the masses by \(m_a, ~a=1,2,3,4\) and the positions by \(\vec {r}_a\), we define

$$\begin{aligned} \vec {s}_1= & {} \vec {r}_1 - \vec {r}_2 - \vec {r}_3 + \vec {r}_4 \nonumber \\ \vec {s}_2= & {} -\vec {r}_1 + \vec {r}_2 - \vec {r}_3 + \vec {r}_4 \nonumber \\ \vec {s}_3= & {} -\vec {r}_1 - \vec {r}_2 + \vec {r}_3 + \vec {r}_4 \end{aligned}$$
(4.1)

These variables are translation-invariant vectors. They contain 9 degrees of freedom, which are necessary to cover the configuration space in the center of mass frame. As the labels 1, 2, 3, 4 are permuted, the \(\vec {s}\) vectors transform nicely: they permute among themselves and/or change signs.

In order to proceed and decompose the variables into orientation and geometrical variables, one notes that the invariants are the 6 scalar products \(Q_{rt}= \vec {s}_r \cdot \vec {s}_t, ~ r,t=1,2,3\), and that the gauge field and associated field strength become SO(3)-valued, and hence non-Abelian.
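The properties of the \(\vec {s}\) vectors and of the invariants \(Q_{rt}\) can be checked with a short sketch. The names `s_vectors`, `gram` and `rotate_z` and the sample positions are illustrative assumptions, and the rotation is restricted to the z-axis for brevity.

```python
import math

# Sign pattern of (4.1): row r gives the coefficients of r_1..r_4 in s_r.
SIGNS = ((1, -1, -1, 1), (-1, 1, -1, 1), (-1, -1, 1, 1))

def s_vectors(positions):
    """positions: four 3-tuples r_1..r_4; returns (s_1, s_2, s_3) as in (4.1)."""
    return tuple(
        tuple(sum(sign * pos[i] for sign, pos in zip(row, positions))
              for i in range(3))
        for row in SIGNS)

def gram(svecs):
    """Symmetric 3x3 matrix Q_rt = s_r . s_t (6 independent entries)."""
    return [[sum(a * b for a, b in zip(sr, st)) for st in svecs] for sr in svecs]

def rotate_z(v, angle):
    """Rotate a 3-vector about the z-axis by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])
```

A common translation of all four positions leaves the output of `s_vectors` unchanged, exchanging the labels 1 and 2 exchanges \(\vec {s}_1\) and \(\vec {s}_2\), and a common rotation leaves the output of `gram` unchanged.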

5 Epilogue

Lagrange (1772) reduced the formulation of the three-body problem based on mutual distance variables. Lemaître (1952) introduced triangle geometry variables that make the collinear configurations regular.

This paper incorporates two main novelties into the formulation. First, it defines the complex position vector \(\vec {w}\) (2.9) that provides a missing link toward the geometry space variables and a motivation for them. Secondly, it introduces into the formulation the angular momenta in the rotating frame \(\vec {J}\). Several applications were discussed.