1 Introduction

In the brief span of time after the launch of Sputnik, a whole succession of analyses was devoted to the problem posed by the drag-free motion of an artificial satellite about an oblate planet, employing almost every known perturbation method. Although, in a sense, the problem is a classic one that also occurred among the natural satellites, in the applications of artificial satellite motion it was necessary to obtain a more general, detailed, and accurate solution. The most intricate and notable investigations were presented by Brouwer (1959), Garfinkel (1959), and Kozai (1959) in the celebrated 1959 issue of the Astronomical Journal. These authors treat the first- and second-order secular perturbations, as well as the first-order short-periodic (related to the satellite’s mean motion) and long-periodic (related to the evolution of the argument of the perigee) perturbations of the orbital elements, where order here refers to the oblateness parameter.

Astronomical experience amply bears out the notion of separation of perturbing effects into periodic and secular variations and the distinction between fast and slow time variables concerning the motion of satellites (Lara 2019). The device of canonical transformations employed by Brouwer (1959) and Garfinkel (1959) permits a deeper understanding of the difference between periodic and secular perturbations and provides a systematic procedure for the inclusion of higher-order terms (Kozai 1962b). On the other hand, the technique used by Kozai (1959), known formally as the method of averaging (Scheeres 2012), while applicable to any set of orbital elements, canonical or otherwise, can be cumbersome beyond the first order. The Poincaré-von Zeipel method of canonical transformations, though still in use (Nie et al. 2019), was significantly generalized by Hori (1966) and Deprit (1969) based on Lie series; the latter becoming the sine qua non of modern perturbation theory in celestial mechanics (Deprit and Rom 1970; Kaufman 1981).

Kozai’s gravity solution to the artificial satellite problem became the basis for the simplified general perturbations theory, SGP, that would be supplanted by SGP4, an analytical solution for satellite short-term prediction using the two-line element sets that instead has its roots in Brouwer’s gravitational theory (Hoots et al. 2004). Brouwer (1959) developed his original solution in Delaunay variables, the canonically-conjugate counterpart of the Keplerian orbital elements used by Kozai (1959), which, like the classical ones themselves, are singular for circular and equatorial orbits. The mathematical singularities associated with zero eccentricity and vanishing line of nodes that plague these solutions can be removed by reformulating them in nonsingular variables, such as those of Poincaré (Lyddane 1963; Breakwell and Vagners 1970), Hill (Izsak 1963; Aksnes 1972), the equinoctial set (Gim and Alfriend 2003; Alfriend et al. 2009; Le Fèvre et al. 2014), or Euler-parameter-based elements (Alfriend et al. 2009). A simplified form of the Lyddane-modified Brouwer theory, optimized in SGP4 for the rapid propagation of satellite ephemerides of the space object catalog (SOC) (Hoots 1981), is the basis for many tracking and prediction operations. Moreover, SGP4, or its deep space equivalent, SDP4, must be used to convert the mean, orbit-averaged, TLEs of the SOC into an osculating set of elements for use in special perturbation theory to obtain more accurate predictions (Levit and Marshall 2011).

The principal observable features due to Earth’s oblateness are a secular precession of the orbit plane about the polar axis and a steady motion of the major axis in the moving orbit plane. Well-known applications of this particular problem are the two inclined and elliptical orbit systems (12-hr Molniya and 24-hr Tundra), placed at the reputed critical inclination of \(\sim \)63.4\(^\circ \) so that the apsidal precession freezes on average. The critical inclination is an intrinsic singularity in the artificial satellite theories of our forebears (Coffey et al. 1986), and remains to this day a fundamental issue in all modern reformulations of the main problem (Breakwell and Vagners 1970; Aksnes 1972; Lara 2015a, b). The orbit-averaging technique fundamentally involves removing all terms that depend on the fast-varying mean anomaly, thus retaining the (secular and long-period) mean-element motion. As noted by Kozai (1959), the long-periodic perturbations of the first order come from the terms of the second order, and the singularity associated with the critical inclination only arises when such long-period effects are retained. Nevertheless, the distinction between secular and mean elements has often been muddled in the literature, and tabulated mean-to-osculating transformations (Gim and Alfriend 2003; Schaub and Junkins 2018) apply Brouwer’s full periodic corrections. This approach will inevitably lead to significant errors near the critical inclination, but can be properly amended by neglecting the long-periodic terms (Breakwell and Vagners 1970).

An important application of the mean-to-osculating transformation is in the computation of the nominal osculating orbits from the frozen-orbit conditions determined in mean-elements space (Gurfil and Lara 2013). Frozen orbits correspond to equilibria for the averaged equations of motion, and, in the oblateness model, occur when secular effects due to even zonal harmonics are canceled by the long-period perturbations of the odd harmonics. While modern formulations compute frozen orbits directly from the non-averaged equations, based on the underlying quasi-periodic structure of librations around these mean equilibria, explicit analytical solutions form the starting point for the numerical optimization process. Recovering the short-periodic effects needed for this initialization can be troublesome at the critical inclination, whose precise frozen-orbit location is very sensitive to model truncation (Lara 2018).

Here, we present a new formulation of the mean-to-osculating and inverse conversions for first-order oblateness perturbations based on the Milankovitch elements (Rosengren and Scheeres 2013, 2014). We use the direct method of Kozai (1959), as further elucidated by Scheeres (2012), and present an explicit analytical short-period correction in vector form that is valid for all elliptical orbits. We adopt the mean longitude as the fast variable and present a compact power-series solution in eccentricity for its short-periodic perturbations that can be truncated to achieve the necessary accuracy. We establish a numerical averaging approach based on the fast-Fourier transform (Uphoff 1973; Ely 2015) and make detailed comparisons between our vectorial solution and the classical Brouwer–Lyddane (BL) theory. For the latter, we adapt the more streamlined formulas presented in Schaub and Junkins (2018) and Gim and Alfriend (2003).

2 Problem formulation

2.1 Analytical averaging

The basic idea in orbit-averaging methods is to obtain approximate equations for the system evolution that contain only slowly changing variables by exploiting the presence of a small dimensionless parameter \(\epsilon \) that characterizes the size of the perturbation. The tacit assumption is that the perturbing forces are sufficiently weak so that these approximate mean equations of motion can be used to describe the secular and long-period orbital evolution. The perturbation equations in celestial mechanics, relating the time variation of the orbit parameters to the perturbing accelerations, in Gauss or Lagrange form, are nonlinear, nonautonomous, first-order differential equations (Alfriend et al. 2009; Scheeres 2012):

$$\begin{aligned} \dot{{\varvec{x}}} = \epsilon {{\varvec{g}}} ({{\varvec{x}}}, t), \end{aligned}$$
(1)

in which \({{\varvec{g}}} ({{\varvec{x}}}, t)\) is assumed to be T-periodic in time t. Equation (1) is trivially solved when \(\epsilon = 0\), yielding the integrals (Keplerian elements) in the unperturbed problem. The method of averaging consists of replacing Eq. (1) by the averaged autonomous system (Sanders et al. 2007)

$$\begin{aligned} \dot{\overline{{\varvec{x}}}}&= \epsilon \overline{{\varvec{g}}} (\overline{{\varvec{x}}}), \qquad \overline{{\varvec{g}}} (\overline{{\varvec{x}}}) = \frac{1}{T} \int _0^T {{\varvec{g}}} (\overline{{\varvec{x}}}, t)\, \mathrm {d} t, \end{aligned}$$
(2)

where the average is performed over time, and it is understood that \(\overline{{\varvec{x}}}\) in the integrand is to be regarded as a constant during the averaging process. The basis for this approximation is the averaging principle, which states that in the general, non-resonant case, the short-period terms removed by averaging cause only small oscillations that are superimposed on the long-term solution described by the averaged system.

Comparison between numerical integrations and the mean solution will in general show a divergence between the two as a result of an inconsistent choice of initial conditions, as depicted in Fig. 1. This offset can be understood by decomposing the osculating elements into mean and short-period components (Kozai 1959; Scheeres 2012):

$$\begin{aligned} {{\varvec{x}}} (t) = \overline{{\varvec{x}}} (t) + {{\varvec{x}}}^{sp} (t). \end{aligned}$$
(3)

Differentiating, we can obtain an approximate equation governing the short-period dynamics:

$$\begin{aligned} \dot{{\varvec{x}}}^{sp} (t) = \dot{{\varvec{x}}} - \dot{\overline{{\varvec{x}}}} \cong {{\varvec{g}}} (\overline{{\varvec{x}}}, t) - \overline{{\varvec{g}}} (\overline{{\varvec{x}}}), \end{aligned}$$
(4)

in which the mean state \(\overline{{\varvec{x}}}\) has replaced the corresponding osculating elements \({{\varvec{x}}}\) in the dynamical equations and where the small parameter \(\epsilon \) has been omitted without loss of generality. From Eqs. (3) and (4), the short-periodic perturbations can be obtained as (Kozai 1959)

$$\begin{aligned} {{\varvec{x}}}^{sp} (t) = \mathrm {d} {{\varvec{x}}}^{sp} (t) - \overline{\mathrm {d} {{\varvec{x}}}}^{sp} = \int \left[ {{\varvec{g}}} (\overline{{\varvec{x}}}, t) - \overline{{\varvec{g}}} (\overline{{\varvec{x}}}) \right] \, \mathrm {d} t - \frac{1}{T} \int _0^T \int \left[ {{\varvec{g}}} (\overline{{\varvec{x}}}, t) - \overline{{\varvec{g}}} (\overline{{\varvec{x}}}) \right] \, \mathrm {d} t^2. \end{aligned}$$
(5)

Accordingly, it can be seen that \(\overline{{\varvec{x}}}^{sp} = {{\varvec{0}}}\), which is also the implicit assumption in the averaging process.

The interpretation of this result is that given an initial condition for a state \({{\varvec{x}}}_0 = {{\varvec{x}}} (t_0)\), the mean equations have to be initialized at a value provided by Eqs. (3) and (5) as

$$\begin{aligned} \overline{{\varvec{x}}}_0 = {{\varvec{x}}}_0 - {{\varvec{x}}}^{sp} (t_0), \end{aligned}$$
(6)

to have the averaged dynamics track the true evolution more closely.
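The effect of the initialization in Eq. (6) can be illustrated on a toy problem. The following is a minimal sketch, not taken from the paper: it assumes the scalar model \(\dot{x} = \epsilon (-x + \sin t)\), chosen only because its exact, mean, and short-period solutions are available in closed form, and the function names are hypothetical.

```python
import numpy as np

def exact(t, x0, eps):
    """Closed-form solution of dx/dt = eps * (-x + sin t) with x(0) = x0."""
    c = x0 + eps / (1.0 + eps**2)
    return c * np.exp(-eps * t) + eps * (eps * np.sin(t) - np.cos(t)) / (1.0 + eps**2)

def mean_solution(t, xbar0, eps):
    """Solution of the averaged system dxbar/dt = -eps * xbar (Eq. 2 for this toy g)."""
    return xbar0 * np.exp(-eps * t)

def short_period(t, eps):
    """Zero-mean short-period term from Eq. (5): integral of eps * sin t, mean removed."""
    return -eps * np.cos(t)

eps, x0 = 0.01, 1.0
t = np.linspace(0.0, 20.0 * np.pi, 2001)   # ten forcing periods
x_true = exact(t, x0, eps)

# Naive initialization: the mean system is started at the osculating value x0.
x_naive = mean_solution(t, x0, eps) + short_period(t, eps)

# Corrected initialization, Eq. (6): xbar0 = x0 - x_sp(t0).
xbar0 = x0 - short_period(0.0, eps)
x_corrected = mean_solution(t, xbar0, eps) + short_period(t, eps)

err_naive = np.max(np.abs(x_naive - x_true))
err_corrected = np.max(np.abs(x_corrected - x_true))
print(f"naive: {err_naive:.2e}, corrected: {err_corrected:.2e}")
```

In this toy case the naive initialization leaves an offset of order \(\epsilon \), whereas the corrected one reduces the reconstruction error to order \(\epsilon ^2\), consistent with a first-order theory.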

Fig. 1 Schematic showing a comparison between the numerical and mean solutions for an arbitrary orbital element, when starting from the same initial condition. The mean solution changes approximately linearly over one orbit and is referred to as the averaged variation. The short-period oscillations are the fluctuations that happen per orbit in the real motion

The removal of time, or analogously of the mean anomaly, requires computing the quadrature of functions depending implicitly on this variable through the true anomaly. The time averaging is performed over a periodic motion having a period much shorter than the time that characterizes the evolution of the dynamical system; this periodicity necessarily implies that averaging is taken for elliptical orbits. Given a quantity \({{\varvec{g}}} ({{\varvec{x}}}, M)\) representing the right-hand side of the equations of motion, defined as a function of the dimensionless time variable M (the mean anomaly) in addition to the other orbital elements given by \({{\varvec{x}}}\), the average, Eq. (2), can be redefined as

$$\begin{aligned} \overline{{{\varvec{g}}}} (\overline{{\varvec{x}}}) = \frac{1}{2 \pi } \int _0^{2 \pi } {{\varvec{g}}} (\overline{{\varvec{x}}}, M)\, \mathrm {d} M, \end{aligned}$$
(7)

where the orbital elements \(\overline{{\varvec{x}}}\) are held constant in the integration. Although the average is defined with respect to mean anomaly, it is often more easily calculated by means of the true or eccentric anomaly, f and E, respectively, using the differential relationships

$$\begin{aligned} \mathrm {d} M = \frac{r}{a} \mathrm {d} E = \frac{r^2}{a b} \mathrm {d} f, \end{aligned}$$
(8)

in which a and \(b = a \sqrt{1 - e^2}\) are the semi-major and semi-minor axes, respectively, and e is the eccentricity, yielding the equivalent forms for averaging:

$$\begin{aligned} \overline{{{\varvec{g}}}} (\overline{{\varvec{x}}}) = \frac{1}{2 \pi } \int _0^{2 \pi } {{\varvec{g}}} (\overline{{\varvec{x}}}, M)\, \mathrm {d} M = \frac{1}{2 \pi a} \int _0^{2 \pi } {{\varvec{g}}} (\overline{{\varvec{x}}}, E)\, r\, \mathrm {d} E = \frac{1}{2 \pi a b} \int _0^{2 \pi } {{\varvec{g}}} (\overline{{\varvec{x}}}, f)\, r^2\,\mathrm {d} f. \end{aligned}$$
(9)

Note that r can be expressed in terms of f and E as

$$\begin{aligned} r = \left\{ \begin{array}{c} \displaystyle \frac{a (1 - e^2)}{1 + e \cos f} = \frac{H^2/\mu }{1 + e \cos f}, \\ [1.0em] a (1 - e \cos E). \end{array} \right. \end{aligned}$$
(10)

Here, H is the specific angular momentum and \(\mu \) is the gravitational parameter.
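The equivalence of the three averaging forms in Eq. (9) can be checked numerically. The sketch below is illustrative only: it assumes the test function \(g = (a/r)^3\), whose orbit average is the well-known \((1 - e^2)^{-3/2}\), and uses a simple Newton iteration (the helper `kepler_E` is hypothetical) to evaluate the integrand on a uniform grid in M.

```python
import numpy as np

def kepler_E(M, e, iters=50):
    """Solve Kepler's equation M = E - e sin E by Newton iteration (vectorized)."""
    E = M + e * np.sin(M)                      # starting guess
    for _ in range(iters):
        E = E - (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
    return E

def g(r, a):
    """Test function (a/r)^3; an arbitrary choice with a known average."""
    return (a / r) ** 3

a, e = 1.0, 0.3
b = a * np.sqrt(1.0 - e**2)
N = 512
u = 2.0 * np.pi * np.arange(N) / N             # uniform periodic grid on [0, 2*pi)

# Average over mean anomaly: r evaluated through E(M).
r_M = a * (1.0 - e * np.cos(kepler_E(u, e)))
avg_M = np.mean(g(r_M, a))

# Average over eccentric anomaly: weight r / a, from dM = (r/a) dE.
r_E = a * (1.0 - e * np.cos(u))
avg_E = np.mean(g(r_E, a) * r_E) / a

# Average over true anomaly: weight r^2 / (a b), from dM = r^2/(a b) df.
r_f = a * (1.0 - e**2) / (1.0 + e * np.cos(u))
avg_f = np.mean(g(r_f, a) * r_f**2) / (a * b)

exact = (1.0 - e**2) ** -1.5
print(avg_M, avg_E, avg_f, exact)
```

The mean of equally spaced samples is the periodic trapezoid rule, which converges very quickly for smooth periodic integrands; this is why a modest grid suffices here.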

The Milankovitch elements consist of the two fundamental vectorial integrals of motion, namely, the eccentricity vector \({{\varvec{e}}}\) and angular momentum vector \({{\varvec{H}}}\). These vectors can be parameterized in terms of the Keplerian elements relative to an inertial frame:

$$\begin{aligned} {{\varvec{H}}}= & {} H {\hat{{{\varvec{h}}}}} = H (\sin i \sin \varOmega {\hat{{{\varvec{x}}}}} - \sin i \cos \varOmega {\hat{{{\varvec{y}}}}} + \cos i {\hat{{{\varvec{z}}}}}), \end{aligned}$$
(11)
$$\begin{aligned} {{\varvec{e}}}= & {} e {\hat{{{\varvec{e}}}}} = e \big [ (\cos \omega \cos \varOmega - \cos i \sin \omega \sin \varOmega ) {\hat{{{\varvec{x}}}}} \nonumber \\&\qquad + (\cos \omega \sin \varOmega + \cos i \sin \omega \cos \varOmega ) {\hat{{{\varvec{y}}}}} \nonumber \\&\qquad + \sin i \sin \omega {\hat{{{\varvec{z}}}}} \big ], \end{aligned}$$
(12)

where i is the inclination, \(\varOmega \) is the right ascension of the ascending node, and \(\omega \) is the argument of periapsis. Because of the orthogonality constraint (i.e., \({{\varvec{H}}} \cdot {{\varvec{e}}}=0\)), we need a sixth scalar element to fully define an orbit (Roy and Moran 1973). Adopting the mean longitude \(l = \omega + \varOmega + M\), the non-averaged equations of motion for an arbitrary disturbing acceleration \({{\varvec{a}}}_d\) can be stated in ‘dyadic’ form as (Battin 1999; Rosengren and Scheeres 2014):

$$\begin{aligned} \dot{{\varvec{e}}}&= \frac{1}{\mu } \left( {\widetilde{{\varvec{v}}}} \cdot {\widetilde{{\varvec{r}}}} - {\widetilde{{\varvec{H}}}} \right) \cdot {{\varvec{a}}}_d = {{\varvec{g}}}_{{\varvec{e}}} ( {{\varvec{x}}}, t), \end{aligned}$$
(13a)
$$\begin{aligned} \dot{{\varvec{H}}}&= {\widetilde{{\varvec{r}}}} \cdot {{\varvec{a}}}_d = {{\varvec{g}}}_{{\varvec{H}}} ( {{\varvec{x}}}, t), \end{aligned}$$
(13b)
$$\begin{aligned} {\dot{l}}&= n + \left( -\frac{e}{\mu (1 + \sqrt{1 - e^2})} \left[ H ({\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{r}}}}}) {\hat{{{\varvec{r}}}}} + (r + p) ({\hat{{{\varvec{e}}}}} \cdot {{\varvec{v}}}) {\hat{{{\varvec{\theta }}}}} \right] - \frac{2}{n a^2} {{\varvec{r}}} + \frac{{{\varvec{r}}} \cdot {\hat{{{\varvec{z}}}}}}{H (H + {{\varvec{H}}} \cdot {\hat{{{\varvec{z}}}}})} {{\varvec{H}}} \right) \cdot {{\varvec{a}}}_d, \nonumber \\&= n + g_l ({{\varvec{x}}}, t), \end{aligned}$$
(13c)

where \(n^2 = \mu / a^3\), \(p = H^2/\mu \), and \({\hat{{{\varvec{\theta }}}}} = {\widetilde{{\hat{{{\varvec{h}}}}}}} \cdot {\hat{{{\varvec{r}}}}}\). Note that Eq. (13c) consists of terms that collect the contributions due to the three components in the radial, transverse, and normal directions of the disturbing acceleration and is valid for elliptical orbits in which \(e < 1\).

The position and velocity vectors, \({{\varvec{r}}}\) and \({{\varvec{v}}}\), may be expressed as

$$\begin{aligned} {{\varvec{r}}}&= r \left( \cos f {\hat{{{\varvec{e}}}}} + \sin f {\hat{{{\varvec{e}}}}}_\perp \right) , \end{aligned}$$
(14)
$$\begin{aligned} {{\varvec{v}}}&= \frac{\mu }{H} \left[ -\sin f {\hat{{{\varvec{e}}}}} + \left( e + \cos f \right) {\hat{{{\varvec{e}}}}}_\perp \right] , \end{aligned}$$
(15)

where \({\hat{{{\varvec{e}}}}} = {{\varvec{e}}}/e\), \({\hat{{{\varvec{e}}}}}_\perp = {\widetilde{{\hat{{{\varvec{h}}}}}}} \cdot {\hat{{{\varvec{e}}}}}\), and \({\hat{{{\varvec{h}}}}} = {{\varvec{H}}}/H\).
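As a consistency check on Eqs. (11), (12), (14), and (15), the vectors can be built from Keplerian elements and tested against the two-body identities \({{\varvec{r}}} \times {{\varvec{v}}} = {{\varvec{H}}}\) and \({{\varvec{v}}} \times {{\varvec{H}}}/\mu - {\hat{{{\varvec{r}}}}} = {{\varvec{e}}}\). This is a minimal sketch: the function names and the numerical values are hypothetical, and it assumes the tilde denotes the cross-product dyadic, so that \({\hat{{{\varvec{e}}}}}_\perp = {\hat{{{\varvec{h}}}}} \times {\hat{{{\varvec{e}}}}}\).

```python
import numpy as np

def milankovitch_from_kepler(a, e, inc, Omega, omega, mu):
    """Angular momentum and eccentricity vectors, Eqs. (11) and (12)."""
    H = np.sqrt(mu * a * (1.0 - e**2))
    h_hat = np.array([np.sin(inc) * np.sin(Omega),
                      -np.sin(inc) * np.cos(Omega),
                      np.cos(inc)])
    e_hat = np.array([
        np.cos(omega) * np.cos(Omega) - np.cos(inc) * np.sin(omega) * np.sin(Omega),
        np.cos(omega) * np.sin(Omega) + np.cos(inc) * np.sin(omega) * np.cos(Omega),
        np.sin(inc) * np.sin(omega)])
    return H * h_hat, e * e_hat

def position_velocity(H_vec, e_vec, f, mu):
    """Position and velocity at true anomaly f, Eqs. (14) and (15); requires e > 0."""
    H, e = np.linalg.norm(H_vec), np.linalg.norm(e_vec)
    h_hat, e_hat = H_vec / H, e_vec / e
    e_perp = np.cross(h_hat, e_hat)            # assumed meaning of the tilde operator
    r = (H**2 / mu) / (1.0 + e * np.cos(f))    # Eq. (10)
    r_vec = r * (np.cos(f) * e_hat + np.sin(f) * e_perp)
    v_vec = (mu / H) * (-np.sin(f) * e_hat + (e + np.cos(f)) * e_perp)
    return r_vec, v_vec

mu = 398600.4418                               # km^3/s^2, assumed Earth value
H_vec, e_vec = milankovitch_from_kepler(
    26554.0, 0.72, np.radians(63.4), np.radians(40.0), np.radians(270.0), mu)
r_vec, v_vec = position_velocity(H_vec, e_vec, np.radians(35.0), mu)

H_check = np.cross(r_vec, v_vec)
e_check = np.cross(v_vec, H_vec) / mu - r_vec / np.linalg.norm(r_vec)
print(np.dot(H_vec, e_vec), H_check - H_vec, e_check - e_vec)
```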

The Gauss equations have time-varying terms multiplying the accelerations involving the true anomaly, and thus they must each be averaged separately. The mean evolution of these elements can be computed as

$$\begin{aligned} {\dot{\overline{{\varvec{e}}}}}&= \frac{1}{2 \pi } \int _0^{2 \pi } \dot{{\varvec{e}}}\, \mathrm {d}M = \overline{{\varvec{g}}}_{{\varvec{e}}} ( \overline{{\varvec{x}}}), \end{aligned}$$
(16a)
$$\begin{aligned} {\dot{\overline{{\varvec{H}}}}}&= \frac{1}{2 \pi } \int _0^{2 \pi } \dot{{\varvec{H}}}\, \mathrm {d}M = \overline{{\varvec{g}}}_{{\varvec{H}}} ( \overline{{\varvec{x}}}), \end{aligned}$$
(16b)
$$\begin{aligned} \dot{\overline{l}}&= \frac{1}{2 \pi } \int _0^{2 \pi } {\dot{l}}\, \mathrm {d}M = \overline{n} + \overline{g}_l ( \overline{{\varvec{x}}}). \end{aligned}$$
(16c)

The approximate short-period equations of motion for each element can then be formulated by subtracting Eq. (16) from Eq. (13), while holding all orbital elements but f constant. Kozai (1959), in his solution employing the classical elements a, e, i, \(\varOmega \), \(\omega \), and M, obtained the non-averaged, mean, and short-period equations of motion using the Lagrange planetary equations. We note that a more general procedure has been outlined herein.

Particularly, for \({{\varvec{e}}}\) and \({{\varvec{H}}}\), following Eq. (5), we have

$$\begin{aligned} \mathrm {d} {{\varvec{e}}}^{sp} (t)&= \int \left[ {{\varvec{g}}}_{{\varvec{e}}} ( \overline{{\varvec{x}}}, t) - \overline{{\varvec{g}}}_{ {\varvec{e}}} (\overline{{\varvec{x}}}) \right] \, \mathrm {d} t, \end{aligned}$$
(17a)
$$\begin{aligned} \mathrm {d} {{\varvec{H}}}^{sp} (t)&= \int \left[ {{\varvec{g}}}_{{\varvec{H}}} ( \overline{{\varvec{x}}}, t) - \overline{{\varvec{g}}}_{ {\varvec{H}}} (\overline{{\varvec{x}}}) \right] \, \mathrm {d} t, \end{aligned}$$
(17b)

so that

$$\begin{aligned} {{\varvec{e}}}^{sp} (t)&= \mathrm {d} {{\varvec{e}}}^{sp} (t) - \overline{\mathrm {d} {{\varvec{e}}}}^{sp}, \end{aligned}$$
(18a)
$$\begin{aligned} {{\varvec{H}}}^{sp} (t)&= \mathrm {d} {{\varvec{H}}}^{sp} (t) - \overline{\mathrm {d} {{\varvec{H}}}}^{sp}. \end{aligned}$$
(18b)

Special care, however, must be taken in the case of the mean longitude due to the presence of the mean motion appearing without any factor in Eq. (13c) (Kozai 1959). Expanding the osculating n into a first-order Taylor series about the mean elements, and following Eq. (4), we can write the approximate equation governing the short-period dynamics as

$$\begin{aligned} {\dot{l}}^{sp} (t) = {\dot{l}} - \dot{\overline{l}} \cong g_l ( \overline{{\varvec{x}}}, t ) - \overline{g}_l (\overline{{\varvec{x}}}) + \nabla _{{\varvec{e}}} n (\overline{{\varvec{x}}}) \cdot {{\varvec{e}}}^{sp} (t) + \nabla _{{\varvec{H}}} n (\overline{{\varvec{x}}}) \cdot {{\varvec{H}}}^{sp} (t), \end{aligned}$$
(19)

where

$$\begin{aligned} \nabla _{{\varvec{e}}} n (\overline{{\varvec{x}}})&= \frac{\partial n}{\partial {{\varvec{e}}}} \Bigg |_{{{\varvec{x}}} = \overline{{\varvec{x}}}} = -\frac{3 \mu ^2}{\overline{H}^3} \sqrt{1 - \overline{e}^2}\, \overline{{\varvec{e}}}, \end{aligned}$$
(20)
$$\begin{aligned} \nabla _{{\varvec{H}}} n (\overline{{\varvec{x}}})&= \frac{\partial n}{\partial {{\varvec{H}}}} \Bigg |_{{{\varvec{x}}} = \overline{{\varvec{x}}}} = -\frac{3 \mu ^2}{\overline{H}^5} (1 - \overline{e}^2)^{3/2}\, \overline{{\varvec{H}}}. \end{aligned}$$
(21)
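For reference, Eqs. (20) and (21) can be recovered in two steps; the following is a short worked sketch that uses only the definitions above. Writing \(a = p / (1 - e^2)\) with \(p = H^2/\mu \), the mean motion becomes

$$\begin{aligned} n = \sqrt{\frac{\mu }{a^3}} = \frac{\mu ^2}{H^3} (1 - e^2)^{3/2}. \end{aligned}$$

Assuming that the magnitudes e and H are treated as independent, and using \(\partial e / \partial {{\varvec{e}}} = {\hat{{{\varvec{e}}}}}\) and \(\partial H / \partial {{\varvec{H}}} = {\hat{{{\varvec{h}}}}}\), the scalar derivatives are

$$\begin{aligned} \frac{\partial n}{\partial e} = -\frac{3 \mu ^2}{H^3} \sqrt{1 - e^2}\, e, \qquad \frac{\partial n}{\partial H} = -\frac{3 \mu ^2}{H^4} (1 - e^2)^{3/2}, \end{aligned}$$

and multiplying by \({\hat{{{\varvec{e}}}}}\) and \({\hat{{{\varvec{h}}}}}\), respectively, and evaluating at the mean elements reproduces the two gradients.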

Accordingly,

$$\begin{aligned} \mathrm {d} l^{sp} (t)&= \int \left[ g_l ( \overline{{\varvec{x}}}, t ) - \overline{g}_l ( \overline{{\varvec{x}}} ) \right] \, \mathrm {d} t + \nabla _{{\varvec{e}}} n ( \overline{{\varvec{x}}} ) \cdot \int {{\varvec{e}}}^{sp} (t)\, \mathrm {d} t + \nabla _{{\varvec{H}}} n ( \overline{{\varvec{x}}} ) \cdot \int {{\varvec{H}}}^{sp} (t)\, \mathrm {d} t, \end{aligned}$$
(22)

and

$$\begin{aligned} \overline{\mathrm {d} l}^{sp} \nonumber&= \frac{1}{T} \int _0^T \int \left[ g_l ( \overline{{\varvec{x}}}, f ) - \overline{g}_l (\overline{{\varvec{x}}}) \right] \, \mathrm {d} t^2 \nonumber \\& + \nabla _{{\varvec{e}}} n (\overline{{\varvec{x}}}) \cdot \frac{1}{T} \int _0^T \int {{\varvec{e}}}^{sp} (t) \, \mathrm {d} t^2 + \nabla _{{\varvec{H}}} n (\overline{{\varvec{x}}}) \cdot \frac{1}{T} \int _0^T \int {{\varvec{H}}}^{sp} (t) \, \mathrm {d} t^2. \end{aligned}$$
(23)

As a result, the short-periodic perturbations of l are given by

$$\begin{aligned} l^{sp} (t) = \mathrm {d} l^{sp} (t) - \overline{\mathrm {d} l}^{sp}. \end{aligned}$$
(24)

2.2 Numerical averaging

The conversion between mean and osculating elements can be obtained numerically, as previously shown by Walter (1967), Uphoff (1973), and Ely (2015). Here, we will exploit a numerical implementation of the near-identity transformation in Eq. (3) between averaged and osculating Milankovitch elements to validate our subsequent analytical developments.

The mapping is obtained by numerically solving

$$\begin{aligned} \begin{aligned}&\dfrac{\partial \, {{\varvec{x}}}^{sp}}{\partial \, t} = \epsilon \left[ {{\varvec{g}}} \left( \overline{{\varvec{x}}}, t \right) - \overline{{{\varvec{g}}}} \left( \overline{{\varvec{x}}} \right) \right] , \\&\int _0^T {{\varvec{x}}}^{sp} (\overline{{\varvec{x}}}, t) \, \mathrm {d} t = {{\varvec{0}}}. \end{aligned} \end{aligned}$$
(25)

The boundary condition in Eq. (25) guarantees that the oscillations of \({{\varvec{x}}}^{sp} (t)\) are unbiased with respect to \(\overline{{\varvec{x}}}\). The formal solution is then (Sanders et al. 2007)

$$\begin{aligned} {{\varvec{x}}}^{sp} \left( \overline{{\varvec{x}}}, t \right) = - \epsilon \frac{i}{n} \sum _{k=1}^{\infty } \dfrac{{c}_k \left( \overline{{\varvec{x}}} \right) }{k} e^{i k n t}, \end{aligned}$$
(26)

where \({c}_k\left( \overline{{\varvec{x}}} \right) \) are the Fourier coefficients of \({{\varvec{g}}} \left( \overline{{\varvec{x}}}, t \right) \), and i is the imaginary unit.

In this study, we numerically approximate Eq. (26) by truncating the series and by using the fast-Fourier transform (FFT) algorithm as discussed in Ely (2015), under the assumption that \({{\varvec{g}}}\) is analytic on the continuation of nt in a non-vanishing complex strip. This assumption guarantees that the Fourier coefficients exhibit exponential decrease as a function of k. It results in a numerical averaging scheme that we call “FFT transformation” from now on. This scheme yields a first-order approximation of the motion, similarly to the Brouwer–Lyddane or Milankovitch ones. For the mean orbit longitude, we incorporate the additional terms in Eq. (19), resulting from the Taylor series of n, into Eq. (25), before performing the numerical quadrature.
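The idea behind Eqs. (25) and (26) can be sketched with a generic spectral integration. This is not the authors' implementation: the function name, the toy forcing, and the frequency value are hypothetical, \(\epsilon \) is absorbed into g, and the sketch sums over both positive and negative harmonics (implied by the real-input FFT) so that the result is real, which is an assumption about how Eq. (26) is to be read.

```python
import numpy as np

def fft_short_period(g_samples, n):
    """Zero-mean antiderivative of g - mean(g), for g sampled uniformly over one period 2*pi/n."""
    N = g_samples.size
    G = np.fft.rfft(g_samples)            # one-sided spectrum; G[k] = N * c_k
    k = np.arange(G.size)                 # harmonic numbers 0, 1, 2, ...
    X = np.zeros_like(G)
    X[1:] = G[1:] / (1j * k[1:] * n)      # integrate each harmonic; k = 0 (the mean) is dropped
    return np.fft.irfft(X, n=N)           # real output; conjugate harmonics are implied

n = 2.0                                   # hypothetical fast frequency
N = 64
t = (2.0 * np.pi / n) * np.arange(N) / N  # one period, endpoint excluded
g = 2.0 + np.cos(n * t) + 0.5 * np.sin(3.0 * n * t)

x_sp = fft_short_period(g, n)
x_sp_exact = np.sin(n * t) / n - 0.5 * np.cos(3.0 * n * t) / (3.0 * n)
print(np.max(np.abs(x_sp - x_sp_exact)), np.mean(x_sp))
```

Dropping the zeroth harmonic removes the mean rate, and omitting any integration constant enforces the zero-mean condition in Eq. (25) automatically.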

3 Analytical short-period correction for oblateness perturbations

The quadrupolar (i.e., \(J_2\)-truncated) disturbing function arising from an oblate planet can be stated in a general vector expression as (Scheeres 2012)

$$\begin{aligned} {\mathcal {R}} = \frac{\mu J_2 R^2}{2 r^3} \left[ 1 - 3 ({\hat{{{\varvec{r}}}}} \cdot {\hat{{{\varvec{p}}}}})^2 \right] , \end{aligned}$$
(27)

where \(J_2\) is the second zonal harmonic coefficient, R is the mean equatorial radius of the planet, and \({\hat{{{\varvec{r}}}}} = {{\varvec{r}}}/r\) from Eq. (14). Note that the Earth’s spin axis \({\hat{{{\varvec{p}}}}}\) is assumed to be fixed in inertial space, and, as such, is aligned with \({\hat{{{\varvec{z}}}}}\).

The perturbing acceleration is then given by \(\partial {\mathcal {R}} / \partial {{\varvec{r}}}\) as

$$\begin{aligned} {{\varvec{a}}}&= -\frac{3 \mu J_2 R^2}{2 r^5} \left\{ \left[ 1 - 5 ( {\hat{{{\varvec{r}}}}} \cdot {\hat{{{\varvec{p}}}}} )^2 \right] {{\varvec{r}}} + 2 ({{\varvec{r}}} \cdot {\hat{{{\varvec{p}}}}}) {\hat{{{\varvec{p}}}}}\right\} . \end{aligned}$$
(28)
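Equation (28) can be verified against Eq. (27) by differentiating the disturbing function numerically. The sketch below uses central finite differences; the Earth constants, the sample position, and the step size are assumed values chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

MU = 398600.4418                      # km^3/s^2, assumed Earth value
J2 = 1.08262668e-3                    # assumed Earth value
R_EQ = 6378.137                       # km, assumed Earth value
P_HAT = np.array([0.0, 0.0, 1.0])     # spin axis aligned with z

def disturbing_function(r_vec):
    """Quadrupolar disturbing function, Eq. (27)."""
    r = np.linalg.norm(r_vec)
    s = np.dot(r_vec, P_HAT) / r
    return MU * J2 * R_EQ**2 / (2.0 * r**3) * (1.0 - 3.0 * s**2)

def j2_acceleration(r_vec):
    """Perturbing acceleration, Eq. (28)."""
    r = np.linalg.norm(r_vec)
    s = np.dot(r_vec, P_HAT) / r
    coeff = -3.0 * MU * J2 * R_EQ**2 / (2.0 * r**5)
    return coeff * ((1.0 - 5.0 * s**2) * r_vec + 2.0 * np.dot(r_vec, P_HAT) * P_HAT)

def numerical_gradient(func, x, h=0.1):
    """Central-difference gradient with step h (km)."""
    grad = np.zeros(3)
    for j in range(3):
        dx = np.zeros(3)
        dx[j] = h
        grad[j] = (func(x + dx) - func(x - dx)) / (2.0 * h)
    return grad

r_vec = np.array([5000.0, 3000.0, 4000.0])   # km, arbitrary test position
a_ana = j2_acceleration(r_vec)
a_num = numerical_gradient(disturbing_function, r_vec)
print(a_ana, a_num)
```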

Accordingly, following Eq. (13), the perturbation equations can be stated as

$$\begin{aligned} \dot{{\varvec{e}}}&= \frac{3 J_2 R^2}{2 r^5} \Big \{ \Big [ 1 - 5 ( {\hat{{{\varvec{r}}}}} \cdot {\hat{{{\varvec{p}}}}} )^2 \Big ] {\widetilde{{\varvec{H}}}} \cdot {{\varvec{r}}} - 2 ({{\varvec{r}}} \cdot {\hat{{{\varvec{p}}}}}) ({\widetilde{{\varvec{v}}}} \cdot {\widetilde{{\varvec{r}}}} - {\widetilde{{\varvec{H}}}}) \cdot {\hat{{{\varvec{p}}}}} \Big \}, \end{aligned}$$
(29a)
$$\begin{aligned} \dot{{\varvec{H}}}&= -\frac{3 \mu J_2 R^2}{r^5} ({{\varvec{r}}} \cdot {\hat{{{\varvec{p}}}}}) {\widetilde{{\varvec{r}}}} \cdot {\hat{{{\varvec{p}}}}}, \end{aligned}$$
(29b)
$$\begin{aligned} {\dot{l}}&= n + \frac{3 \mu J_2 R^2}{2 r^5} \Bigg \{ \frac{1}{\mu (1 + \sqrt{1 - e^2})} \left[ H ({{\varvec{e}}} \cdot {{\varvec{r}}}) ( 1 - 3 ( {\hat{{{\varvec{r}}}}} \cdot {\hat{{{\varvec{p}}}}} )^2 ) + 2 (r + p)({{\varvec{e}}} \cdot {{\varvec{v}}}) ({{\varvec{r}}} \cdot {\hat{{{\varvec{p}}}}}) ({\hat{{{\varvec{\theta }}}}} \cdot {\hat{{{\varvec{p}}}}}) \right] \nonumber \\& + \frac{2 r^2}{n a^2} ( 1 - 3 ( {\hat{{{\varvec{r}}}}} \cdot {\hat{{{\varvec{p}}}}} )^2 ) - \frac{2 ({{\varvec{r}}} \cdot {\hat{{{\varvec{p}}}}})^2 ({{\varvec{H}}} \cdot {\hat{{{\varvec{p}}}}})}{H (H + {{\varvec{H}}} \cdot {\hat{{{\varvec{p}}}}})} \Bigg \}. \end{aligned}$$
(29c)
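As a numerical illustration of the averaging step, Eq. (29b) can be averaged by quadrature in true anomaly, following Eq. (9), and compared with the closed-form mean rate given in Eq. (30b). This is a hedged sketch: the constants and orbit are assumed values, the function names are hypothetical, and the tilde is again assumed to denote the cross-product dyadic.

```python
import numpy as np

MU, J2, R_EQ = 398600.4418, 1.08262668e-3, 6378.137   # assumed Earth values (km, s)
P_HAT = np.array([0.0, 0.0, 1.0])

def orbit_frame(inc, Omega, omega):
    """Unit vectors e_hat, e_perp, h_hat from Eqs. (11) and (12)."""
    h_hat = np.array([np.sin(inc) * np.sin(Omega), -np.sin(inc) * np.cos(Omega), np.cos(inc)])
    e_hat = np.array([
        np.cos(omega) * np.cos(Omega) - np.cos(inc) * np.sin(omega) * np.sin(Omega),
        np.cos(omega) * np.sin(Omega) + np.cos(inc) * np.sin(omega) * np.cos(Omega),
        np.sin(inc) * np.sin(omega)])
    return e_hat, np.cross(h_hat, e_hat), h_hat

def averaged_H_dot_numeric(a, e, inc, Omega, omega, N=256):
    """Orbit average of Eq. (29b) using the true-anomaly form of Eq. (9)."""
    e_hat, e_perp, _ = orbit_frame(inc, Omega, omega)
    p, b = a * (1.0 - e**2), a * np.sqrt(1.0 - e**2)
    f = 2.0 * np.pi * np.arange(N) / N
    r = p / (1.0 + e * np.cos(f))
    r_vec = r[:, None] * (np.cos(f)[:, None] * e_hat + np.sin(f)[:, None] * e_perp)
    z = r_vec @ P_HAT
    H_dot = (-3.0 * MU * J2 * R_EQ**2 / r**5 * z)[:, None] * np.cross(r_vec, P_HAT)
    return np.mean(H_dot * (r**2)[:, None], axis=0) / (a * b)

def averaged_H_dot_closed(a, e, inc, Omega, omega):
    """Closed-form mean rate, Eq. (30b)."""
    _, _, h_hat = orbit_frame(inc, Omega, omega)
    p, n = a * (1.0 - e**2), np.sqrt(MU / a**3)
    H = np.sqrt(MU * p)
    c = np.dot(h_hat, P_HAT)
    return 3.0 * H * n * J2 * R_EQ**2 / (2.0 * p**2) * c * np.cross(h_hat, P_HAT)

elements = (7000.0, 0.05, np.radians(50.0), np.radians(30.0), np.radians(110.0))
H_dot_avg_num = averaged_H_dot_numeric(*elements)
H_dot_avg_closed = averaged_H_dot_closed(*elements)
print(H_dot_avg_num, H_dot_avg_closed)
```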

The averaged equations of motion for the eccentricity and angular momentum vectors are given by Ward (1962) and Rosengren and Scheeres (2013). Averaging Eq. (29c) directly requires computing the quadrature of various dyadics and higher rank tensors of the dynamical variables, the details of which we omit. The mean equations can be stated as

$$\begin{aligned} {\dot{\overline{{\varvec{e}}}}}&= - \frac{3 n J_2 R^2}{4 p^2} \Big \{ \Big [ 1 - 5 ({\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}})^2 \Big ] {\widetilde{{\hat{{{\varvec{h}}}}}}} + 2 ( {\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}} ) {\widetilde{{\hat{{{\varvec{p}}}}}}} \Big \} \cdot {\varvec{e}}, \end{aligned}$$
(30a)
$$\begin{aligned} {\dot{\overline{{\varvec{H}}}}}&= \frac{3 H n J_2 R^2}{2 p^2} ( {\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}} ) {\widetilde{{\hat{{{\varvec{h}}}}}}} \cdot {\hat{{{\varvec{p}}}}}, \end{aligned}$$
(30b)
$$\begin{aligned} \dot{\overline{l}}&= n + \frac{3 n J_2 R^2}{4 p^2} \Big \{ \sqrt{1 - e^2} \left[ 3 ( {\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}} )^2 - 1 \right] + 5 ( {\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}} )^2 - 2 ({\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}}) - 1 \Big \}, \end{aligned}$$
(30c)

where the bar operator is omitted from the elements because there is no ambiguity in what follows, i.e., all variables are averaged variables. Note that Eq. (30c) can be seen as the sum of the classical secular precession rates arising from planetary oblateness, \(\dot{\overline{l}} = \dot{\overline{M}} + \dot{\overline{\omega }} + \dot{\overline{\varOmega }}\), where

$$\begin{aligned} \dot{\overline{M}}&= n + \frac{3 n J_2 R^2}{4 p^2} \sqrt{1 - e^2} \left( 3 \cos ^2 i - 1 \right) ,\, \dot{\overline{\omega }} = \frac{3 n J_2 R^2}{4 p^2} \left( 5 \cos ^2 i - 1 \right) , \nonumber \\&\quad \dot{\overline{\varOmega }} = -\frac{3 n J_2 R^2}{2 p^2} \cos i. \end{aligned}$$
(31)
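The secular rates in Eq. (31) are straightforward to evaluate. The sketch below checks that their sum reproduces Eq. (30c) and that the apsidal rate vanishes at \(\cos ^2 i = 1/5\); the Earth constants and the Molniya-like orbit are assumed values, and the function names are hypothetical.

```python
import numpy as np

MU, J2, R_EQ = 398600.4418, 1.08262668e-3, 6378.137   # assumed Earth values (km, s)

def secular_rates(a, e, inc):
    """Secular rates of M, omega, and Omega from Eq. (31), in rad/s."""
    n = np.sqrt(MU / a**3)
    p = a * (1.0 - e**2)
    k = 3.0 * n * J2 * R_EQ**2 / (4.0 * p**2)
    c = np.cos(inc)
    M_dot = n + k * np.sqrt(1.0 - e**2) * (3.0 * c**2 - 1.0)
    omega_dot = k * (5.0 * c**2 - 1.0)
    Omega_dot = -2.0 * k * c
    return M_dot, omega_dot, Omega_dot

def mean_longitude_rate(a, e, h_dot_p):
    """Mean longitude rate from Eq. (30c); h_dot_p is h_hat . p_hat = cos i."""
    n = np.sqrt(MU / a**3)
    p = a * (1.0 - e**2)
    k = 3.0 * n * J2 * R_EQ**2 / (4.0 * p**2)
    eta = np.sqrt(1.0 - e**2)
    return n + k * (eta * (3.0 * h_dot_p**2 - 1.0) + 5.0 * h_dot_p**2 - 2.0 * h_dot_p - 1.0)

a, e = 26554.0, 0.72                      # Molniya-like orbit (illustrative values)

# Generic inclination: the sum of the three rates should match Eq. (30c).
inc = np.radians(50.0)
M_dot, omega_dot, Omega_dot = secular_rates(a, e, inc)
l_dot = mean_longitude_rate(a, e, np.cos(inc))

# Critical inclination: 5 cos^2 i - 1 = 0.
i_crit = np.arccos(1.0 / np.sqrt(5.0))
_, omega_dot_crit, Omega_dot_crit = secular_rates(a, e, i_crit)
print(np.degrees(i_crit), omega_dot_crit, Omega_dot_crit)
```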

Following the procedure outlined in Sect. 2.1 and detailed in Appendix B, the short-periodic perturbations in the eccentricity vector, angular momentum vector, and mean longitude can be stated as

$$\begin{aligned} {{\varvec{e}}}^{sp} (t)&= - \frac{3 J_2 R^2}{2 p^2} \bigg \{ \Big [ {\widehat{I}}_{2} + 2 ( I_1 + e II_{11} + III_{111} - III_{122} - 5 IV_{122} ) ( {\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}} ) ( {\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}} ) \nonumber \\& - ( 2 {\widehat{III}}_{112} + 5 {\widehat{IV}}_{112} ) ( {\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}} )^2 + ( 2 {\widehat{I}}_{2} + 2 e {\widehat{II}}_{12} + 2 {\widehat{III}}_{112} - 5 {\widehat{IV}}_{222} ) ( {\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}} )^2 \Big ] {\hat{{{\varvec{e}}}}} \nonumber \\& - \Big [ I_1 - M e + 2 ( {\widehat{I}}_2 - e {\widehat{II}}_{12} - {\widehat{III}}_{112} + {\widehat{III}}_{222} - 5 {\widehat{IV}}_{112} ) ( {\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}} ) ( {\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}} ) \nonumber \\& + ( 3 M e / 2 + 2 I_1 + 2 III_{122} - 5 IV_{111} ) ( {\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}} )^2 + ( 3 M e / 2 - 2 e II_{22} - 2 III_{122} - 5 IV_{122} ) ( {\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}} )^2 \Big ] {\hat{{{\varvec{e}}}}}_\perp \nonumber \\& + M e ( {\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}} ) ( {\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}} ) {\hat{{{\varvec{h}}}}} - 2 e \Big [ {\widehat{II}}_{12} ( {\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}} ) + II_{22} ( {\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}} ) \Big ] {\hat{{{\varvec{p}}}}} \bigg \}, \end{aligned}$$
(32a)
$$\begin{aligned} {{\varvec{H}}}^{sp} (t)&= - \frac{3 H J_2 R^2}{p^2} \bigg \{ \Big [ {\widehat{II}}_{12} ( {\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}} ) ( {\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}} ) + ( II_{22} - M/2 ) ( {\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}} ) ( {\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}} ) \Big ] {\hat{{{\varvec{e}}}}} \nonumber \\& - \Big [ ( II_{11} - M/2 ) ( {\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}} ) ( {\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}} ) + {\widehat{II}}_{12} ( {\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}} ) ( {\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}} ) \Big ] {\hat{{{\varvec{e}}}}}_\perp \nonumber \\& + \Big [ II_{11} - II_{22} + {\widehat{II}}_{12} \left( ( {\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}} )^2 - ( {\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}} )^2 \right) \Big ] {\hat{{{\varvec{h}}}}} \bigg \}, \end{aligned}$$
(32b)
$$\begin{aligned} l^{sp} (t)&= \frac{3 J_2 R^2}{2 p^2} \bigg \{ \frac{e}{1 + \sqrt{1 - e^2}} \Big [ I_{1} - 3 \Big ( IV_{111} ({\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}})^2 + 2 {\widehat{IV}}_{112} ({\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}}) ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}}) + IV_{122} ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}})^2 \Big ) \nonumber \\& + 2 \left( {\widehat{III}}_{222} - {\widehat{III}}_{112} + {\widehat{IV}}_{222} - {\widehat{IV}}_{112} \right) ({\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}}) ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}}) + 2 \left( III_{122} + IV_{122} \right) \left( ({\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}})^2 - ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}})^2 \right) \Big ] \nonumber \\& - 2 \left( 3 \sqrt{1 - e^2} + \frac{{\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}}}{1 + {\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}}} \right) \Big ( II_{11} ({\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}})^2 + 2 {\widehat{II}}_{12} ({\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}}) ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}}) + II_{22} ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}})^2 \Big ) + 2 \sqrt{1 - e^2} I_{0} \nonumber \\& - \frac{1}{2} M \left[ \sqrt{1 - e^2} \left( 3 ({\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}})^2 - 1 \right) + 5 ({\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}})^2 - 2 ({\hat{{{\varvec{h}}}}} \cdot {\hat{{{\varvec{p}}}}}) - 1 \right] \nonumber \\& + \frac{3 e H}{n p^2} \sqrt{1 - e^2} \Big [ 2 \Big ( \widetilde{\mathcal {IV}} + e \widetilde{\mathcal {II}} \Big ) ({\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}}) ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}}) + \widehat{\widetilde{I}}_{2} \Big ( 1 + 2 ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}})^2 \Big ) \nonumber \\& - 5 \widehat{\widetilde{IV}}_{112} ({\hat{{{\varvec{e}}}}} \cdot 
{\hat{{{\varvec{p}}}}})^2 - 5 \widehat{\widetilde{IV}}_{222} ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}})^2 - 2 \Big ( \widehat{\widetilde{III}}_{112} + e \widehat{\widetilde{II}}_{12} \Big ) \Big ( ({\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}})^2 - ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}})^2 \Big ) \Big ] \nonumber \\& + \frac{6 H}{n p^2} (1 - e^2)^{3/2} \Big [ \widetilde{\mathcal {II}} ( {\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}} ) ( {\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}} ) - \widehat{\widetilde{II}}_{12} \Big ( ({\hat{{{\varvec{e}}}}} \cdot {\hat{{{\varvec{p}}}}})^2 - ({\hat{{{\varvec{e}}}}}_\perp \cdot {\hat{{{\varvec{p}}}}})^2 \Big ) \Big ] \bigg \}, \end{aligned}$$
(32c)

where Roman numerals \(I_1\), \(I_2\), \(II_{11}\), \(\ldots \) designate various functions of true anomaly, \({\widehat{I}}_2\), \({\widehat{II}}_{12}\), \(\ldots \) represent the difference of these trigonometric expressions from their averaged values, \(\widetilde{I}_{1}\), \(\ldots \) are indefinite integrals of the previous core functions, and \(\widehat{\widetilde{I}}_{2}\), \(\ldots \), represent differences between the doubly-integrated expressions, their averaged values, and the previous core averages. While somewhat cumbersome, the adopted notation mirrors the derivation of Appendix B and is otherwise systematic and methodical. The needed results pertaining to the solution, Eq. (32), are given by

$$\begin{aligned} I_0&= f + e \sin f, \end{aligned}$$
(33a)
$$\begin{aligned} I_{1}&= \frac{1}{12} \left( 12 e f + (12 + 9 e^2) \sin f + 6 e \sin 2 f + e^2 \sin 3 f \right) , \end{aligned}$$
(33b)
$$\begin{aligned} II_{11}&= \frac{1}{12} \Big ( 6 f + 9 e \sin f + 3 \sin 2 f + e \sin 3 f \Big ), \end{aligned}$$
(33c)
$$\begin{aligned} II_{22}&= \frac{1}{12} \Big ( 6 f + 3 e \sin f - 3 \sin 2 f - e \sin 3 f \Big ), \end{aligned}$$
(33d)
$$\begin{aligned} III_{111}&= \frac{1}{96} \left( 36 e f + 72 \sin f + 24 e \sin 2 f + 8 \sin 3 f + 3 e \sin 4 f \right) , \end{aligned}$$
(33e)
$$\begin{aligned} III_{122}&= \frac{1}{96} \left( 12 e f + 24 \sin f - 8 \sin 3 f - 3 e \sin 4 f \right) , \end{aligned}$$
(33f)
$$\begin{aligned} IV_{111}&= \frac{1}{240} \left( 180 e f + 30 (6 + 5 e^2) \sin f + 120 e \sin 2 f + 5 ( 4 + 5 e^2) \sin 3 f\right. \nonumber \\&\quad \left. + 15 e \sin 4 f + 3 e^2 \sin 5 f \right) , \end{aligned}$$
(33g)
$$\begin{aligned} IV_{122}&= \frac{1}{240} \left( 60 e f + 30 ( 2 + e^2 ) \sin f - 5 ( 4 + e^2 ) \sin 3 f - 15 e \sin 4 f - 3 e^2 \sin 5 f \right) , \end{aligned}$$
(33h)
$$\begin{aligned} {\widehat{I}}_{2}&= -\frac{1}{12} \Big ( (12 + 3 e^2) ( \cos f - X_0^{0,1} ) + 6 e ( \cos 2 f - X_0^{0,2} ) + e^2 ( \cos 3 f - X_0^{0,3} ) \Big ), \end{aligned}$$
(34a)
$$\begin{aligned} {\widehat{II}}_{12}&= -\frac{1}{12} \Big ( 3 e ( \cos f - X_0^{0,1} ) + 3 ( \cos 2 f - X_0^{0,2} ) + e ( \cos 3 f - X_0^{0,3} ) \Big ), \end{aligned}$$
(34b)
$$\begin{aligned} {\widehat{III}}_{112}&= - \frac{1}{96} \Big ( 24 ( \cos f - X_0^{0,1} ) + 12 e ( \cos 2 f - X_0^{0,2} ) + 8 ( \cos 3 f - X_0^{0,3} ) + 3 e ( \cos 4 f - X_0^{0,4} ) \Big ), \end{aligned}$$
(34c)
$$\begin{aligned} {\widehat{III}}_{222}&= - \frac{1}{96} \Big ( 72 ( \cos f - X_0^{0,1} ) + 12 e ( \cos 2 f - X_0^{0,2} ) - 8 ( \cos 3 f - X_0^{0,3} ) - 3 e ( \cos 4 f - X_0^{0,4} ) \Big ), \end{aligned}$$
(34d)
$$\begin{aligned} {\widehat{IV}}_{112}&= - \frac{1}{240} \Big ( 30 (2 + e^2) ( \cos f - X_0^{0,1} ) + 60 e ( \cos 2 f - X_0^{0,2} ) + 5 ( 4 + 3 e^2 ) ( \cos 3 f - X_0^{0,3} ) \nonumber \\& + 15 e ( \cos 4 f - X_0^{0,4} ) + 3 e^2 ( \cos 5 f - X_0^{0,5} ) \Big ), \end{aligned}$$
(34e)
$$\begin{aligned} {\widehat{IV}}_{222}&= - \frac{1}{240} \Big ( 30 ( 6 + e^2 ) ( \cos f - X_0^{0,1} ) + 60 e ( \cos 2 f - X_0^{0,2} ) - 5 ( 4 - e^2 ) ( \cos 3 f - X_0^{0,3} )\nonumber \\& - 15 e ( \cos 4 f - X_0^{0,4} ) - 3 e^2 ( \cos 5 f - X_0^{0,5} ) \Big ), \end{aligned}$$
(34f)
$$\begin{aligned} \widehat{\widetilde{I}}_{2}&= -\frac{1}{12} \sum \limits _{k = 1}^\infty \frac{1}{k} \Big ( (12 + 3 e^2) C_k^{0,1} + 6 e C_k^{0,2} + e^2 C_k^{0,3} \Big )\, \sin k M, \end{aligned}$$
(35a)
$$\begin{aligned} \widehat{\widetilde{II}}_{12}&= -\frac{1}{12} \sum \limits _{k = 1}^\infty \frac{1}{k} \Big ( 3 e C_k^{0,1} + 3 C_k^{0,2} + e C_k^{0,3} \Big )\, \sin k M, \end{aligned}$$
(35b)
$$\begin{aligned} \widehat{\widetilde{III}}_{112}&= - \frac{1}{96} \sum \limits _{k = 1}^\infty \frac{1}{k} \Big ( 24 C_k^{0,1} + 12 e C_k^{0,2} + 8 C_k^{0,3} + 3 e C_k^{0,4} \Big )\, \sin k M, \end{aligned}$$
(35c)
$$\begin{aligned} \widehat{\widetilde{IV}}_{112}&= - \frac{1}{240} \sum \limits _{k = 1}^\infty \frac{1}{k} \Big ( 30 (2 + e^2) C_k^{0,1} + 60 e C_k^{0,2} + 5 ( 4 + 3 e^2 ) C_k^{0,3} + 15 e C_k^{0,4} + 3 e^2 C_k^{0,5} \Big )\, \sin k M, \end{aligned}$$
(35d)
$$\begin{aligned} \widehat{\widetilde{IV}}_{222}&= - \frac{1}{240} \sum \limits _{k = 1}^\infty \frac{1}{k} \Big ( 30 ( 6 + e^2 ) C_k^{0,1} + 60 e C_k^{0,2} - 5 ( 4 - e^2 ) C_k^{0,3} - 15 e C_k^{0,4} - 3 e^2 C_k^{0,5} \Big )\, \sin k M, \end{aligned}$$
(35e)
$$\begin{aligned} \widetilde{\mathcal {II}}&= - \frac{1}{6} \sum \limits _{k = 1}^\infty \frac{1}{k} \Big ( 3 e S_k^{0,1} + 3 S_k^{0,2} + e S_k^{0,3} \Big )\, \cos k M, \end{aligned}$$
(36a)
$$\begin{aligned} \widetilde{\mathcal {IV}}&= -\frac{1}{48} \Bigg [ \sum \limits _{k = 1}^\infty \frac{1}{k} \Big ( 6 ( 2 + e^2 ) S_k^{0,1} + 36 e S_k^{0,2}\nonumber \\&\quad + ( 28 + 9 e^2 ) S_k^{0,3} + 18 e S_k^{0,4} + 3 e^2 S_k^{0,5} \Big )\, \cos k M \Bigg ], \end{aligned}$$
(36b)

where all intermediate terms are given in Appendix B. They are obtained by also using the auxiliary formulas given in Appendix A.
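Equations (35) carry the same coefficients as Eqs. (34), with each \(\cos j f - X_0^{0,j}\) replaced by a sine series in the mean anomaly. The following minimal sketch illustrates that structure numerically. It assumes, since the definitions live in Appendices A and B rather than in this section, that \(X_0^{0,m}\) is the mean-anomaly average of \(\cos m f\) and that \(C_k^{0,m}\) are the Fourier cosine coefficients of \(\cos m f\) in \(M\), so that the series in Eqs. (35) is the integral over \(M\) of the zero-mean part. The function names and the quadrature over the eccentric anomaly are illustrative choices, not part of the derivation.

```python
import math

def true_anomaly(E, e):
    """True anomaly f from eccentric anomaly E, for eccentricity e < 1."""
    eta = math.sqrt(1.0 - e * e)
    return math.atan2(eta * math.sin(E), math.cos(E) - e)

def fourier_in_mean_anomaly(m, e, k_max, n=4096):
    """Mean value and cosine coefficients of cos(m f) as a function of M.

    Returns (x0, c) such that cos(m f) ~ x0 + sum_k c[k] cos(k M), k = 1..k_max.
    Integrals over M are evaluated over E using dM = (1 - e cos E) dE
    with the periodic trapezoidal rule on n nodes.
    """
    x0 = 0.0
    c = [0.0] * (k_max + 1)  # c[0] is unused
    for j in range(n):
        E = 2.0 * math.pi * j / n
        M = E - e * math.sin(E)  # Kepler's equation
        w = math.cos(m * true_anomaly(E, e)) * (1.0 - e * math.cos(E)) / n
        x0 += w
        for k in range(1, k_max + 1):
            c[k] += 2.0 * w * math.cos(k * M)
    return x0, c

def integrated_series(c, M):
    """Truncated sum_k (1/k) c[k] sin(k M), the pattern shared by Eqs. (35)."""
    return sum(c[k] / k * math.sin(k * M) for k in range(1, len(c)))
```

For \(m = 1\) the average is \(-e\) and the integrated series sums to \((1 - e^2) \sin E\), which gives a convenient check on the truncation order `k_max` for a given eccentricity.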

Thus, following Eq. (6), given the initial osculating state \(( {{\varvec{e}}}_0, {{\varvec{H}}}_0, l_0 )\), the mean equations of motion, Eqs. (30), have to be initialized as

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle \overline{{\varvec{e}}}_0 = {{\varvec{e}}}_0 - {{\varvec{e}}}^{sp} (t_0), \\ \displaystyle \overline{{\varvec{H}}}_0 = {{\varvec{H}}}_0 - {{\varvec{H}}}^{sp} (t_0), \\ \displaystyle \overline{l}_0 = l_0 - l^{sp} (t_0). \end{array} \right. \end{aligned}$$
(37)
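As a minimal sketch of Eq. (37), the initialization amounts to a component-wise subtraction of the short-periodic terms evaluated at the initial epoch. The callable `short_periodic` below is a hypothetical placeholder for Eqs. (32), which are not reimplemented here; evaluating it on the osculating state itself is an assumption of this sketch.

```python
from typing import Callable, Tuple

Vector = Tuple[float, float, float]
# (e, H, l): eccentricity vector, angular momentum vector, mean longitude
State = Tuple[Vector, Vector, float]

def sub(a: Vector, b: Vector) -> Vector:
    """Component-wise difference of two 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def initialize_mean_state(osc: State,
                          short_periodic: Callable[[State], State]) -> State:
    """Apply Eq. (37): mean state = osculating state - short-periodic terms at t0.

    `short_periodic` is a hypothetical stand-in for Eqs. (32); it must return
    (e_sp, H_sp, l_sp) for the supplied state.
    """
    e0, H0, l0 = osc
    e_sp, H_sp, l_sp = short_periodic(osc)
    return (sub(e0, e_sp), sub(H0, H_sp), l0 - l_sp)
```

Keeping the short-periodic model behind a callable lets the same initialization step be reused with any of the transformations compared in the next section.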

4 Validation of the Milankovitch scheme

In this section, we consider an extended grid of initial conditions and test the developed Milankovitch formulation against both BL and the fully numerical transformation of Sect. 2.2. For Brouwer–Lyddane, we have verified that the more streamlined formulas presented in Schaub and Junkins (2018) have been correctly transcribed from their original sources, except for the missing \(\sin ( 2 \omega )\) factor in the long-period terms of \(M\), \(\omega \), and \(\varOmega \), which has only been noted in the recent erratum of the latest edition of this widely used monograph (Footnote 3). Furthermore, rather than using Lyddane’s ad hoc modification, Gim and Alfriend (2003) developed a new theory based on Brouwer’s generating function that uses equinoctial elements. Although their theory is still invalid at the critical inclination, Gim and Alfriend (2003) concluded from various numerical simulations that their method produces reasonable results within \(0.25^\circ \) of this small divisor. Because both are rooted in Brouwer’s theory, we expected at the outset that Lyddane (1963) and Gim and Alfriend (2003) would yield the same overall degree of accuracy. Nevertheless, for the sake of completeness, and because it is not superficially apparent that the formulas of Gim and Alfriend (2003) are mathematically equivalent to those systematized in Schaub and Junkins (2018) (Footnote 4), we have also extended our numerical campaign to include a comparison between these (see Appendix C).

Figure 2 shows a numerical confirmation of the validity of the Milankovitch formulation for satellites of the Sun-synchronous and Molniya type. The procedure was to first convert the initial osculating orbit into its corresponding mean elements using the developed formulas. These initial osculating and mean states were then propagated according to their dynamics described by Eqs. (29) and (30). In the former case, the results are equivalent to, though generally more accurate than, a simple Cowell integration in Cartesian space. The time histories of these evolutions at various subintervals were subsequently used as input to the respective osculating-to-mean and mean-to-osculating transformations in order to recover the aforementioned simulated trajectories.

Fig. 2

Evolution of the osculating and mean orbital elements for a low-altitude, nearly circular, and retrograde satellite (top) and a highly elliptical, semi-synchronous, critically inclined satellite (bottom). The initial osculating states \((a, e, i, M, \omega , \varOmega ) = (R + 800\, \text {km}, 0.001, 98^\circ , 0, 90^\circ , 180^\circ )\) (top) and \((26562\, \text {km}, 0.74, 64.3^\circ , 120^\circ , 60^\circ , 30^\circ )\) (bottom) were converted into their corresponding mean elements using the developed Milankovitch formulation, and each set was propagated according to Eq. (29) (denoted “simulated osculating” dynamics in the legend) and Eq. (30) (denoted “simulated mean” dynamics), respectively. The osculating (cyan) and mean (yellow) trajectories were recovered (purple circles and red diamonds) at equal time steps using the aforementioned transformations, taking the respective simulated dynamics as input

Projecting both the simulated and recovered osculating evolutions into the radial, along-track, and cross-track frame, we use the norm root mean square (RMS) of the positional error as a means to assess the accuracy of the transformation. To keep the presentation brief, we do not consider the velocity errors or other metrics. Figure 3 shows the results of this process for Brouwer–Lyddane, Milankovitch, and the FFT transformations, respectively, for Sun-synchronous-like orbits and nearly critically inclined, semi-synchronous ones. The dependence of the resulting errors on the choice of orbit orientation angles is also highlighted by Fig. 3.
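The error metric described above can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the radial, along-track, and cross-track axes are built from the simulated (reference) trajectory, and the "norm RMS" is taken as the square root of the mean squared Euclidean norm of the position difference.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)

def rtn_error(r_ref, v_ref, r_rec):
    """Position error r_rec - r_ref resolved along the radial, along-track,
    and cross-track axes of the reference orbit."""
    r_hat = unit(r_ref)                # radial
    h_hat = unit(cross(r_ref, v_ref))  # cross-track (orbit normal)
    t_hat = cross(h_hat, r_hat)        # along-track completes the triad
    d = sub(r_rec, r_ref)
    return (dot(d, r_hat), dot(d, t_hat), dot(d, h_hat))

def norm_rms(errors):
    """Square root of the mean squared norm over a sequence of error vectors."""
    return math.sqrt(sum(dot(d, d) for d in errors) / len(errors))
```

Because the rotation into the local frame preserves the norm, `norm_rms` returns the same value whether it is applied to inertial or to radial, along-track, and cross-track components.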

Fig. 3

Radial, along-track, and cross-track errors between the recovered and simulated osculating trajectories, using the Brouwer–Lyddane (orange, dash-dot), Milankovitch (blue, dashed), and FFT (gray, solid) transformations, respectively. The initial osculating orbits \((a, e, i) = (R + 800\, \text {km}, 0.001, 98^\circ )\) (top) and \((26562\, \text {km}, 0.75, 63^\circ )\) (bottom), each with \((\omega , \varOmega ) = (90^\circ , 180^\circ )\) and \(M = 0\) (left) or \(M = 45^\circ \) (right), were converted to their corresponding mean states using each respective transformation and subsequently propagated following Eq. (30). At every time step, the simulated mean trajectory was converted to osculating according to each transformation and compared against the simulated dynamics of Eq. (29). The norm RMS of the difference between the recovered and simulated positions (in km) over five orbital periods varied between 0.0556 and 20.9714 for Brouwer–Lyddane, 0.0666 and 0.3114 for Milankovitch, and 0.0533 and 20.9796 for the FFT transformation

For completeness and further validation, Fig. 4 compares the aforementioned Brouwer–Lyddane transformation, which includes the long-period terms, with a BL mean-to-osculating implementation that omits them. These test cases are simulated in the trusted semi-analytical propagator, STELA, using a \(J_2\)-only model. While slight discrepancies with STELA are to be expected owing to the different platforms, mean-element equations, and astronomical constants used in the respective simulations, the results are in good quantitative agreement. Importantly, the long-period terms in the full Brouwer theory do not cause appreciable changes in the recovered solutions anywhere in orbital phase space except in a narrow band centered on the critical inclination.

Fig. 4

Radial, along-track, and cross-track errors between the recovered and simulated osculating trajectories, using the full Brouwer–Lyddane (orange, dash-dot), BL without long-period terms (blue, dashed), and STELA (black, solid) transformations, respectively. The initial osculating orbits \((a, e, i) = (R + 800\, \text {km}, 0.001, 98^\circ )\) (left) and \((26562\, \text {km}, 0.75, 63^\circ )\) (right), each with \((\omega , \varOmega ) = (90^\circ , 180^\circ )\) and \(M = 45^\circ \) (left) or \(M = 0^\circ \) (right), were converted to their corresponding mean states using each respective transformation and subsequently propagated. At every time step, the simulated mean trajectory was converted to osculating according to each transformation and compared against the simulated osculating dynamics. The norm RMS of the difference between the recovered and simulated positions (in km) over five orbital periods was 0.0556 and 20.9714 for BL, 0.0555 and 20.9785 for BL without long-period terms, and 0.0543 and 22.4028 for the STELA transformation

Fig. 5

Error maps in the inclination–semi-major axis plane using the Brouwer–Lyddane (top), Milankovitch (middle), and FFT (bottom) transformations, respectively. Each panel samples an equidistant grid of 250 thousand initial osculating \((i, a)\) values, for initial eccentricities of 0.01 (left) and 0.2 (right), and where the initial mean anomaly, perigee, and node angles were all set to zero. The colorbar represents the norm RMS of the difference between the recovered and simulated positions over five orbital periods, according to each formulation. The colorbar limit was set to the maximum error found between the Milankovitch and FFT schemes, as the Brouwer–Lyddane formulation becomes singular near the critical inclination. Grid points leading to unphysical errors or to values that exceed this limit are represented in white. From left to right, the (maximum, mean) errors (in km) were (1.0626, 0.1967) and (68.9371, 0.1804) for Brouwer–Lyddane, (1.4004, 0.2664) and (0.8627, 0.0268) for Milankovitch, and (0.9673, 0.1933) and (0.3948, 0.0334) for the FFT scheme

Figure 5 shows error maps corresponding to two different initial eccentricities for \(500 \times 500\) grids of initial inclinations and semi-major axes. Each grid point, together with \((M, \omega , \varOmega ) = (0, 0, 0)\), was used to form the full osculating element state vector and propagated for five orbital periods. The same procedure outlined in the previous paragraph was used to characterize the accuracy of each transformation, where the colorbar in Fig. 5 corresponds to the norm RMS of the difference between the recovered and simulated positions over the timescale of the propagations (five orbital periods). Note that prescribed limits were imposed on the colorbar of each map in this numerical campaign so as to more clearly highlight the differences among the various transformations.
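The bookkeeping behind such an error map can be sketched as below for a generic \(n \times n\) grid (\(500 \times 500\) in Fig. 5). The callable `error_fn` is a hypothetical placeholder for the full convert, propagate, and recover pipeline, the grid ranges are left to the caller, and computing the (maximum, mean) summary over all finite entries is an assumption of this sketch.

```python
import math

def linspace(lo, hi, n):
    """n equidistant samples from lo to hi, endpoints included."""
    return [lo + (hi - lo) * j / (n - 1) for j in range(n)]

def error_map(i_values, a_values, error_fn):
    """Evaluate error_fn(i, a) at every grid point (rows: i, columns: a)."""
    return [[error_fn(i, a) for a in a_values] for i in i_values]

def summarize(grid):
    """(maximum, mean) over the finite entries of the map."""
    finite = [x for row in grid for x in row if math.isfinite(x)]
    return max(finite), sum(finite) / len(finite)

def mask_for_display(grid, limit):
    """Replace non-finite entries and entries above `limit` by None,
    mimicking the grid points drawn in white."""
    return [[x if math.isfinite(x) and x <= limit else None for x in row]
            for row in grid]
```

Separating the summary from the display mask keeps the reported statistics independent of the colorbar limit chosen for each panel.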

Figure 6 shows error maps in the semi-major axis–eccentricity plane for four different initial inclinations. For all three transformations, the error is highest at the boundary of allowable orbits, where the perigee radius equals the Earth’s radius and above which the orbit would intersect the Earth.
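Reading the boundary of allowable orbits as the condition that the perigee radius \(a (1 - e)\) does not drop below the Earth’s radius \(R\), the admissible eccentricities for a given semi-major axis satisfy \(e \le 1 - R/a\). A minimal sketch of this check follows; the function names are illustrative, and the numerical value of \(R\) is left to the caller because it is not stated in this section.

```python
def max_eccentricity(a_km, radius_km):
    """Largest eccentricity for which the perigee radius a(1 - e)
    stays at or above the central body's radius."""
    if a_km < radius_km:
        raise ValueError("semi-major axis lies below the surface")
    return 1.0 - radius_km / a_km

def is_allowable(a_km, e, radius_km):
    """True if the orbit is elliptic or circular and its perigee does not
    fall below the surface."""
    return 0.0 <= e < 1.0 and a_km * (1.0 - e) >= radius_km
```

For instance, with an assumed equatorial radius of 6378.137 km, a semi-major axis of 26562 km admits eccentricities up to roughly 0.76.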

Fig. 6

Error maps in the semi-major axis–eccentricity plane using the Brouwer–Lyddane (top panels), Milankovitch (middle panels), and FFT (bottom panels) transformations, respectively. Each panel samples an equidistant grid of 40 thousand initial osculating \((a, e)\) values, for initial inclinations of \(6^\circ \) (top-left), \(63^\circ \) (top-right), \(98^\circ \) (bottom-left), and \(116.6^\circ \) (bottom-right), and where the initial mean anomaly, perigee, and node angles were all set to zero. The colorbar represents the norm RMS of the difference between the recovered and simulated positions over five orbital periods, according to each formulation. The colorbar limit of each map was set to the maximum error found in the Milankovitch scheme to provide a better contrast. Grid points leading to unphysical errors or to values that exceed this limit are represented in white. Clockwise starting from the top left: Brouwer–Lyddane recorded (maximum, mean) errors (in km) of (58.5237, 1.2662), (59.5538, 1.2013), (59.6549, 1.2698), and (82198.6187, 218.0410); Milankovitch recorded errors of (3.7776, 0.2221), (2.2572, 0.0675), (3.4344, 0.0914), and (2.2940, 0.0682); and the FFT scheme recorded errors of (58.4914, 1.0276), (59.5997, 1.1049), (59.5904, 1.0825), and (59.6009, 1.1046)

The remaining slice of the action-like element space is given in Fig. 7, which shows how the errors manifest in the \((i, e)\) plane for two values of the initial semi-major axis (representing LEO satellites at 800 km altitude and GEO satellites).

From this simulation campaign, we can conclude that the Milankovitch method consistently agrees with the classical Brouwer–Lyddane solution, with each approach showing slightly better accuracy in specific orbital regimes. Differences in positional error residuals remain on the order of a few kilometers or less.

Fig. 7

Error maps in the inclination–eccentricity plane using the Brouwer–Lyddane (top), Milankovitch (middle), and numerical (bottom) transformations, respectively. Each panel samples an equidistant grid of 40 thousand initial osculating \((i, e)\) values, for initial semi-major axes of \(R + 800\) km (left) and \(a_\text {GEO}\) (right), and where the initial mean anomaly, perigee, and node angles were all set to zero. The colorbar represents the norm RMS of the difference between the recovered and simulated positions over five orbital periods, according to each formulation. The colorbar limit of each map was set to the maximum error found in the Milankovitch scheme to provide a better contrast. Grid points leading to unphysical errors or to values that exceed this limit are represented in white. From left to right, the (maximum, mean) errors (in km) were (1.2992, 0.2601) and (94.1968, 1.9108) for Brouwer–Lyddane, (1.1378, 0.2973) and (3.0803, 0.0955) for Milankovitch, and (0.7508, 0.2230) and (45.3521, 1.7884) for the numerical scheme

5 Discussion

Our Milankovitch formulation is closed-form in both the eccentricity and angular momentum vectors. We have provided a general series solution for the mean longitude based on Hansen coefficients, which can be taken to any desired order of accuracy (Footnote 5), and shown that our scheme performs in agreement with standard BL theories. We note that our formulation bypasses the critical inclination because we do not consider the long-periodic terms that arise from a second-order perturbation treatment. On this account, we also omitted these terms from Brouwer’s theory in Fig. 4 to emphasize that the long-periodic terms are negligible away from the critical inclination on the timescale of interest.

Because the Milankovitch elements form a non-canonical set, our derivation followed the approach used by Kozai (1959), as further elucidated by Scheeres (2012). We note, however, that, like Gim and Alfriend (2003), we could have simply adopted Brouwer’s generating function and computed the short-periodic corrections using Poisson-bracket operations. Nevertheless, we chose to present an independent derivation that, as a positive outcome, turns out to be applicable to other perturbations for which a convenient generating function is not always available (see Shen et al. 2019). Because our Kozai-like scheme is general and not rooted in canonical perturbation theory, it can potentially be extended to non-conservative forces, such as solar radiation pressure and atmospheric drag, in addition to treating lunisolar third-body gravity and other predominant perturbations. Future work will be devoted to the inclusion of long-periodic terms in our transformation and to the application of this method to modeling non-conservative perturbations.

6 Conclusions

We have developed a new mean-to-osculating and inverse transformation based on the Milankovitch elements with the mean longitude as the fast variable, which is valid for all eccentricity values smaller than one. An extensive numerical campaign was used to validate the vectorial transformation over orbital phase-space grids tailored to the relative distribution of cataloged Earth satellites and debris.