A Gravity Assist Mapping Based on Gaussian Process Regression

We develop a Gravity Assist Mapping to quantify the effects of a flyby in the planar circular restricted three-body problem based on Gaussian Process Regression (GPR). This work is inspired by the Keplerian Map and the Flyby Map. The flyby is allowed to occur anywhere above 300 km altitude at the Earth in the Sun-(Earth+Moon)-spacecraft system, whereas the Keplerian map is typically restricted to cases outside the Hill sphere only. The performance of the GPR model and the influence of the training samples (their number and distribution) on the quality of the prediction of post-flyby orbital states are investigated. The information provided by the training set is used to optimize the hyper-parameters of the GPR model. The trained model can predict the post-flyby state of an object with an arbitrary initial condition and is demonstrated to be efficient and accurate when evaluated against the results of numerical integration. The method is therefore attractive for space mission design.


Introduction
exists at all energies and is more efficient than the retrograde type. The GPR-based Gravity Assist Mapping aims at providing an accurate and efficient method of quantifying flyby effects. The method achieves high accuracy at a low computational cost, because the state of the particle does not need to be numerically integrated during the encounter.
Inspired by the Keplerian map and the flyby map, we develop a Gravity Assist Mapping (GAM) based on Gaussian Process Regression (GPR), a machine-learning technique. The method quantifies flyby effects in the PCR3BP. It works for a broad range of energy levels, i.e. beyond the domain of applicability of the Keplerian map, and requires a lower computational cost.
In this research, the dynamical model of the PCR3BP is presented first in Section Planar Circular Restricted Three-Body Problem. Section Gravity Assist Mapping describes how the GPR model learns the features of flybys from training samples generated with the PCR3BP equations of motion. The covariance function, the key module of the GPR model, is improved based on an analysis of flyby dynamics. The number and distribution of the training samples are selected by taking the Root-Mean-Square Error (RMSE) and the Mean-Absolute-Percentage Error (MAPE) of test samples as criteria. In Section Performance of GPR-based Gravity Assist Mapping, the performance of the mapping technique is assessed: by training the model with the information provided by hundreds of training samples, the GPR-based approach can accurately capture the features of flybys. Finally, conclusions are drawn in Section Conclusions.

Planar Circular Restricted Three-Body Problem
In the preliminary design of a space mission, simplified dynamical models are commonly used to reduce the computational effort. In this research, the PCR3BP model is used for constructing GA trajectories.
The PCR3BP assumes a system with three bodies: a massless particle and two massive primaries P_1 and P_2 with masses M_1 and M_2, respectively. Generally, we refer to body P_2 with the smaller mass as the secondary. The motion of the particle is driven by the simultaneous gravitational pull of both primaries. The primaries are assumed to rotate around the barycenter of the system in circular orbits. The orbital plane of these two massive bodies is taken as the reference plane. Characteristic quantities are commonly defined such that a normalized model can be applied to any three-body system.
The rotating reference frame has the barycenter of the system as origin and rotates at the same angular velocity as the primaries around this barycenter. In this frame, the equations of motion of the particle (spacecraft) in non-dimensional units are given as [17]

$$\ddot{x} - 2\dot{y} = x - \frac{(1-\mu)(x+\mu)}{r_1^3} - \frac{\mu\,(x-1+\mu)}{r_2^3}, \qquad \ddot{y} + 2\dot{x} = y - \frac{(1-\mu)\,y}{r_1^3} - \frac{\mu\, y}{r_2^3} \tag{1}$$

where μ = M_2/(M_1 + M_2) represents the mass parameter, and r_1 and r_2 are the normalized distances from the particle to the primary and the secondary, respectively.
In this paper, the Sun-(Earth+Moon)-spacecraft system is used with a mass ratio of 3.036 × 10⁻⁶. Obviously, Equation 1 cannot be solved analytically. Moreover, the trajectory solutions are very sensitive to the initial conditions, which makes it difficult to assess flyby effects accurately. Given an initial condition, the flyby effects can be computed by numerical propagation of the CR3BP equations of motion.
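The numerical propagation referred to here can be sketched with SciPy's `solve_ivp`. A minimal version, assuming the mass parameter and tolerances of this paper; the initial state and propagation time in the usage example are arbitrary illustrations, not values from the text:

```python
import numpy as np
from scipy.integrate import solve_ivp

MU = 3.036e-6  # mass parameter of the Sun-(Earth+Moon) system

def pcr3bp_rhs(t, s, mu=MU):
    """Right-hand side of Equation 1 in the rotating, non-dimensional frame."""
    x, y, vx, vy = s
    r1 = np.hypot(x + mu, y)        # distance to the primary at (-mu, 0)
    r2 = np.hypot(x - 1 + mu, y)    # distance to the secondary at (1-mu, 0)
    ax = x + 2 * vy - (1 - mu) * (x + mu) / r1**3 - mu * (x - 1 + mu) / r2**3
    ay = y - 2 * vx - (1 - mu) * y / r1**3 - mu * y / r2**3
    return [vx, vy, ax, ay]

def propagate(state0, t_end, mu=MU):
    """Propagate with the tight tolerances quoted in the text (1e-12)."""
    return solve_ivp(pcr3bp_rhs, (0.0, t_end), state0, args=(mu,),
                     rtol=1e-12, atol=1e-12, dense_output=True)
```

A quick sanity check of such a propagator is conservation of the Jacobi constant along the integrated trajectory.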

Gravity Assist Mapping
GPR is a regression method to estimate the relation between inputs and outputs in terms of a specific model. Krige first developed GPR for the estimation of the distribution of gold [18]. For astrodynamics purposes, GPR was adopted to efficiently assess the accessibility of main-belt asteroids, identifying thousands of potential mission targets [19]. It has also been used for modelling the gravity field of small bodies and for the optimization of low-thrust trajectories [20, 21]. GPR can effectively learn the correlation between inputs and outputs of a process. A GPR-based method is developed here to evaluate the post-flyby state of the spacecraft in a PCR3BP system.

GPR Fundamentals
As a supervised learning tool, GPR is used to predict the output at an arbitrary input based on empirical information. Such information is obtained by observing the outputs for a group of independent input data in a specific input space. In this section, we discuss the elementary aspects of GPR; the interested reader is referred to [22]. In this work, both input and output are the Keplerian elements of the spacecraft state in an inertial reference frame. The origin is centered at the largest primary and all bodies are contained in the X-Y plane. The connection between the states of the spacecraft before and after the flyby can be described by a mapping function:

$$[a_B, e_B, \omega_B] = F\left([a_A, e_A, \omega_A]\right) \tag{2}$$

where a is the semi-major axis, e the eccentricity and ω the argument of periapsis. Subscripts A and B denote pre- and post-flyby, respectively. Each of the orbital elements represents a feature of the state vector. One input vector and its corresponding output vector comprise a so-called sample. Fig. 1 shows the osculating orbital elements of the spacecraft at the initial condition in the inertial frame. The frame is centred at the primary and the X-axis points to the vernal equinox. Two assumptions are made for these initial conditions: the spacecraft starts from apoapsis (i.e. θ = −π); the initial true anomaly of the secondary θ_P2 is selected such that it arrives at the negative X-axis when the spacecraft (in terms of its initial osculating orbit) reaches periapsis. The time span before the secondary reaches the negative X-axis is set to half of the orbital period of the initial osculating orbit of the spacecraft. Therefore, the phasing between the spacecraft and the secondary is described by the argument of periapsis ω only. When ω = 180°, the minimum distance between the particle and the secondary occurs at the periapsis passage and a flyby effect can be expected (see Appendix A for more details).
The output is predicted by GPR using a general function f:

$$y = f(\mathbf{x}) \tag{3}$$

where x = [a_A, e_A, ω_A] is the input vector and y represents one of the variables of the post-flyby state [a_B, e_B, ω_B]. The dimension of the output vector determines the number of GPR models. In our case, the output vector consists of three elements, so three GPR models are required to build the Gravity Assist Mapping. Bayesian inference is used to predict the value of the output y for the input x. This procedure works on a Gaussian process, which is essentially an infinite collection of random variables, any finite number of which forms a joint Gaussian distribution [22]. For the GPR method, the random variable is taken to be the value of the function f(x). The Gaussian process based on the function f(x) is specified by [23]

$$f(\mathbf{x}) \sim \mathcal{GP}\left(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\right) \tag{4}$$

where x and x′ are input vectors in the input space. k(x, x′) represents the covariance function if x ≠ x′ and the variance of x if x = x′. m(x) is the mean function. Commonly, the prior value of m(x) is taken to be zero before obtaining more information about the mapping. After training the model, the posterior value will be inferred, which is not confined to zero [22]. Therefore, the covariance function k(x, x′) becomes the only determinant of the process. The information about the Gaussian process in Equation 4 is gained by observing flyby effects for known inputs. The observation of an output (one feature of the post-flyby state) for a given input (pre-flyby state) is generated by propagation in the PCR3BP model. The input and corresponding output compose a so-called training sample. Assuming we have N training samples, the training dataset D is defined as

$$D = \{X, Y\} = \{(\mathbf{x}_i, y_i) \mid i = 1, \ldots, N\} \tag{5}$$

where X and Y are the training input matrix (the states before the flyby) and the training output vector (the selected feature of the post-flyby states), respectively.
The output predicted by the GPR is assumed to differ from the ground truth by a value that follows a Gaussian distribution. Given the training inputs, the joint Gaussian distribution of the outputs obtained from the GPR is:

$$Y \sim \mathcal{N}\left(0,\, K(X, X)\right) \tag{6}$$

where K is the covariance matrix consisting of N² covariance functions, written as

$$K(X, X)_{ij} = k(\mathbf{x}_i, \mathbf{x}_j), \qquad i, j = 1, \ldots, N \tag{7}$$

Assuming that y* is the prediction for a new input vector x*, the training outputs Y and the predicted output y* form a joint Gaussian distribution:

$$\begin{bmatrix} Y \\ y_* \end{bmatrix} \sim \mathcal{N}\left(0,\, \begin{bmatrix} K(X, X) & K(X, \mathbf{x}_*) \\ K(\mathbf{x}_*, X) & k(\mathbf{x}_*, \mathbf{x}_*) \end{bmatrix}\right) \tag{8}$$

where K(X, x*) represents the vector of N covariance functions evaluated between each entry of the training inputs X and the new input x*. Because of symmetry, K(x*, X) = K(X, x*)^T. By conditioning the joint Gaussian distribution in Equation 8 on the observations, the predicted distribution of the output y* has a Gaussian form and is completely specified by the mean μ(y*) and the covariance cov(y*).
The values of μ(y*) and cov(y*) are evaluated by

$$\mu(y_*) = K(\mathbf{x}_*, X)\, K(X, X)^{-1}\, Y \tag{10}$$

and

$$\mathrm{cov}(y_*) = k(\mathbf{x}_*, \mathbf{x}_*) - K(\mathbf{x}_*, X)\, K(X, X)^{-1}\, K(X, \mathbf{x}_*) \tag{11}$$

Generally, the expected value μ(y*) is taken as the predicted output y*. Once a GPR model has been built, the prediction is obtained by multiplying the 1 × N vector K(x*, X), the N × N matrix K(X, X)⁻¹ and the N × 1 vector Y. Equation 10 can be further simplified into:

$$y_* = K(\mathbf{x}_*, X)\, Q_{\mathrm{train}} \tag{12}$$

where Q_train = K(X, X)⁻¹ Y is precalculated. The prediction efficiency improves about N times after this simplification. When performing the evaluations for N* new inputs, the outputs are obtained by

$$Y_* = K(X_*, X)\, Q_{\mathrm{train}} \tag{13}$$

where Y* = [y*₁, y*₂, ..., y*_{N*}] is the vector of predicted outputs and K(X*, X) is an N* × N matrix.
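The precomputation of Q_train and the batch mean prediction of Equations 12 and 13 can be sketched as follows. An illustrative squared-exponential kernel stands in for the paper's covariance functions; all names are ours:

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0, ell=0.5):
    """Illustrative squared-exponential covariance; the paper's RQARD/SUM
    kernels would be substituted here."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sigma**2 * np.exp(-0.5 * d2 / ell**2)

def gpr_fit(X, Y, kernel, jitter=1e-8):
    """Precompute Q_train = K(X, X)^-1 Y once (cf. Equation 12)."""
    K = kernel(X, X) + jitter * np.eye(len(X))
    return np.linalg.solve(K, Y)

def gpr_predict(X_new, X, Q_train, kernel):
    """Batch mean prediction Y* = K(X*, X) Q_train (cf. Equation 13)."""
    return kernel(X_new, X) @ Q_train
```

Because Q_train is reused for every new input, each prediction reduces to a single matrix-vector product, which is the source of the N-fold speed-up mentioned above.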

GPR Covariance Function
The covariance function describes the relation between samples by evaluating a function of their inputs. It is the most basic and essential module of any GPR model. Covariance functions can have various mathematical expressions and parameters, which are called hyper-parameters. A well-chosen covariance function contributes to an exact representation of the mapping between input and output. In addition, the generalization capability of a GPR model depends heavily on the choice of the covariance function. Unfortunately, a universal rule for this choice is not available [24]. In our research, the covariance function needs to capture the dynamics of flybys in the PCR3BP. Two criteria are considered to evaluate its performance. One criterion is to minimize the cost function

$$J = -\log p(Y|X) \tag{14}$$

where the marginal likelihood p(Y|X) describes the probability of obtaining the true training outputs given the training inputs. This is equivalent to maximizing the log marginal likelihood [22]:

$$\log p(Y|X) = -\frac{1}{2} Y^{T} K^{-1} Y - \frac{1}{2}\log |K| - \frac{N}{2}\log 2\pi \tag{15}$$

Equation 15 is also the objective function for optimizing the hyper-parameters using the training samples. The value of log p(Y|X) is always negative; a value closer to zero means that the probability of observing the true outputs is higher. The other criterion is to minimize the difference between the numerically propagated outputs of the PCR3BP and the predictions of the GPR. These differences are evaluated on test samples with inputs generated randomly in the input space.
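The log marginal likelihood of Equation 15, log p(Y|X) = −½ YᵀK⁻¹Y − ½ log|K| − (N/2) log 2π, is usually evaluated through a Cholesky factorisation rather than an explicit inverse; a minimal sketch (function name ours):

```python
import numpy as np

def log_marginal_likelihood(K, Y):
    """Evaluate Equation 15 via a Cholesky factorisation K = L L^T:
    log|K| = 2 * sum(log(diag(L))), and K^-1 Y via two triangular solves."""
    N = len(Y)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))  # K^-1 Y
    return (-0.5 * Y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * N * np.log(2.0 * np.pi))
```

The Cholesky route avoids forming K⁻¹ explicitly and is both cheaper and numerically more stable when this quantity is evaluated repeatedly inside a hyper-parameter optimizer.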
The covariance function initially selected here is the Rational Quadratic Automatic Relevance Determination (RQARD) function [25]. The classical RQ covariance function is stationary, i.e. it depends only on the distances between sample points in Euclidean space. The different influences of the orbital elements on the flyby effect cannot be reflected by the RQ covariance function. To distinguish these effects, the Automatic Relevance Determination (ARD) distance measure is integrated into the RQ covariance function. Three specific weights l_a, l_e and l_ω (Equation 17 below) are optimized based on the sensitivity of the flyby effects to the corresponding orbital elements. The optimized value of a weight reflects the influence of that particular feature. The RQARD function is defined as:

$$k_{RQARD}(\mathbf{x}, \mathbf{x}') = \sigma^2 \left(1 + \frac{(\mathbf{x}-\mathbf{x}')^{T}\, Q\, (\mathbf{x}-\mathbf{x}')}{2\alpha}\right)^{-\alpha} \tag{16}$$

The Journal of the Astronautical Sciences (2021) 8: -272

Here σ and α are the magnitude parameter and the shape parameter of the output distribution, respectively. Q is a symmetric matrix containing the weights:

$$Q = \mathrm{diag}\left(l_a^{-2},\; l_e^{-2},\; l_\omega^{-2}\right) \tag{17}$$

where l_a, l_e and l_ω are the weights for the elements a, e and ω, respectively. Clearly, the RQARD formulation emphasizes the influence of nearby training samples, whereas samples further away have less influence. Some covariance functions can represent specific model characteristics, such as periodicity. For this research, a new covariance function is built by taking into consideration the influence of the initial argument of periapsis ω_A on the variation of the semi-major axis a during a flyby. ω_A turns out to be the most influential parameter because it specifies the phasing between the massless particle and the secondary [6]. Within the predefined boundaries, Fig. 2 shows a typical flyby effect on a_B for initial ω_A values between 174° and 186° at [a_A, e_A] = [1.2591 AU, 0.2] in the Sun-(Earth+Moon)-spacecraft system. The post-flyby a_B follows an approximately sinusoidal curve in terms of ω_A.
This is caused by the phasing of the flyby: the spacecraft gains Keplerian energy if the secondary is ahead of it, and vice versa. The Keplerian energy is computed for the osculating orbit with respect to the primary when the spacecraft is far away from the secondary. For this near-Keplerian orbit, the Keplerian energy of the spacecraft with respect to the primary is proportional to the inverse of the semi-major axis [6]. The three-body energy is invariant during the flyby. In order to model this flyby effect, we incorporate a cosine function of ω_A into the covariance function. The combined covariance function (SUM) is expressed as

$$k_{SUM}(\mathbf{x}, \mathbf{x}') = k_{RQARD}(\mathbf{x}, \mathbf{x}') + p^2 \cos\left(h\,(\omega_A - \omega_A')\right) \tag{18}$$

where σ and α are the hyper-parameters defined in Equation 16, and p and h are hyper-parameters denoting the magnitude and the phase transition, respectively. When ω_A is close to ω′_A, the last term in Equation 18 is approximately equal to its maximum p². As a result, a sample with ω_A gains more information from a sample with ω′_A than from other samples.
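A scalar sketch of the RQARD and SUM covariance functions follows. The exact argument of the cosine term, h·(ω_A − ω′_A), is our assumption; the source only states that the term peaks at p² when the two arguments of periapsis coincide:

```python
import numpy as np

def rqard(x1, x2, sigma, alpha, lengths):
    """Rational Quadratic kernel with ARD weights (cf. Equations 16-17)."""
    q = ((x1 - x2) / lengths) ** 2          # weighted squared differences
    return sigma**2 * (1.0 + q.sum() / (2.0 * alpha)) ** (-alpha)

def k_sum(x1, x2, sigma, alpha, lengths, p, h):
    """SUM covariance: RQARD plus a cosine term in omega_A (cf. Equation 18).
    x = [a_A, e_A, omega_A]; the cosine's phase form is an assumption."""
    w1, w2 = x1[2], x2[2]
    return rqard(x1, x2, sigma, alpha, lengths) + p**2 * np.cos(h * (w1 - w2))
```

At x = x′ the RQARD part reduces to σ² and the cosine term to p², so the prior variance of the SUM kernel is σ² + p².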

Training Samples
As mentioned earlier, the GPR model obtains its empirical information by optimizing the hyper-parameters on training samples. The training set, in particular the number and distribution of its samples, determines the performance, accuracy and efficiency of the GPR model. In principle, the accuracy of the prediction improves with the information provided by an increasing number of training samples. However, a smaller training set reduces the computational effort of Equation 12. In order to balance accuracy and efficiency, the number of training samples is an essential variable, which is analysed in Section Performance of GPR-based Gravity Assist Mapping. Taking the Sun-(Earth+Moon)-spacecraft system as an example, the training inputs are generated within predefined boundaries on the radius of periapsis r_p, the radius of apoapsis r_a and the argument of periapsis ω (Equation 19). The boundaries are selected to meet a number of requirements. First, the osculating orbit of the spacecraft has a Minimum Orbit Intersection Distance (MOID) with the secondary that allows a minimum altitude of 300 km at Earth, taking the Galileo mission as reference [26]. Second, the spacecraft starts outside the Hill sphere, such that the state of the spacecraft with respect to the primary can be defined using Keplerian elements. Third, ω_A is bounded such that an obvious flyby effect is observed; beyond these boundaries effectively nothing happens (cf. Fig. 2). After generating sets of [r_p, r_a, ω], the training inputs are obtained by

$$a = \frac{r_p + r_a}{2}, \qquad e = \frac{r_a - r_p}{r_a + r_p} \tag{20}$$

Once the training inputs have been generated, the true post-flyby states are acquired by numerical propagation using Equation 1. The numerical propagation adopts an adaptive step-size Runge-Kutta method with a maximum step size of 1 × 10⁻⁶ and relative and absolute tolerances of 1 × 10⁻¹², such that the quality of the training samples is ensured. It is important to define a termination condition for the propagation: in principle, one orbital period of the initial osculating orbit is taken as the end epoch.
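The generation of training inputs described above can be sketched as follows. The conversion from (r_p, r_a) to (a, e) is standard; the numerical bounds in the defaults are illustrative placeholders, not the paper's Equation 19 values:

```python
import numpy as np

def sample_training_inputs(n, rng,
                           rp_bounds=(1.01, 1.05),     # placeholder, in AU
                           ra_bounds=(1.10, 1.60),     # placeholder, in AU
                           w_bounds=(np.radians(175), np.radians(185))):
    """Draw [r_p, r_a, omega] uniformly at random and convert to [a, e, omega]
    via a = (r_p + r_a)/2 and e = (r_a - r_p)/(r_a + r_p)."""
    rp = rng.uniform(*rp_bounds, n)
    ra = rng.uniform(*ra_bounds, n)
    w = rng.uniform(*w_bounds, n)
    a = 0.5 * (rp + ra)                  # semi-major axis
    e = (ra - rp) / (ra + rp)            # eccentricity
    return np.column_stack([a, e, w])
```

Each row of the returned matrix is one training input x = [a_A, e_A, ω_A]; the corresponding outputs would then come from the PCR3BP propagation.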
The GPR-based Gravity Assist Mapping is meant to assess the effect of a single flyby, so the propagation must be terminated before a second close encounter occurs. The Tisserand parameter is used as the termination index, inspired by its near-constant value before and after a flyby when the spacecraft is far away from the secondary [28]. Figure 4 illustrates the evolution of the Tisserand parameter for the trajectory in Fig. 3. The Tisserand parameter drops dramatically when the first flyby happens; it then increases rapidly and begins to decrease again as the second close encounter approaches. To exclude the second flyby, the propagation is terminated when the Tisserand parameter is maximised after the first drop, i.e. at the top of the curve after the first flyby. In addition, cases in which an Earth impact takes place are removed for obvious reasons.
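The termination criterion can be sketched as follows: compute the planar Tisserand parameter from the osculating elements (with the secondary's semi-major axis as length unit) and stop at its maximum after the first drop. The detector below assumes, for illustration, that the first drop reaches the minimum of the sampled history:

```python
import numpy as np

def tisserand(a, e):
    """Planar Tisserand parameter with respect to the secondary (a_P2 = 1):
    T = 1/a + 2*sqrt(a*(1 - e^2)) for zero inclination."""
    return 1.0 / a + 2.0 * np.sqrt(a * (1.0 - e**2))

def termination_index(T):
    """Index at which T is maximised after its first drop, i.e. where the
    propagation should stop to exclude a second flyby (assumes the first
    drop is the deepest point of the sampled history)."""
    drop = int(np.argmin(T))             # bottom of the drop
    return drop + int(np.argmax(T[drop:]))
```

For a circular orbit at the secondary's radius (a = 1, e = 0) the parameter equals 3, the classical reference value.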

Performance of GPR-based Gravity Assist Mapping
Fig. 4 The evolution history of the Tisserand parameter for the trajectory of Fig. 3. The X-axis shows the steps of the numerical integration for one orbital period.

In order to evaluate the accuracy of the GPR-based Gravity Assist Mapping, a set of test samples has been generated whose inputs are randomly distributed over the input space. To quantify the quality of any combination of GPR model and number and distribution of training samples, the Root-Mean-Square Error and the Mean-Absolute-Percentage Error are adopted:

$$RMSE = \sqrt{\frac{1}{N_*}\sum_{i=1}^{N_*}\left(y_i^* - y_i\right)^2} \tag{22}$$

and

$$MAPE = \frac{100\%}{N_*}\sum_{i=1}^{N_*}\left|\frac{y_i^* - y_i}{y_i}\right| \tag{23}$$

where y_i is the numerically propagated output and y_i^* the GPR prediction for test sample i. In principle, increasing the number of training samples can be expected to reduce these errors, because the GPR model is offered more empirical information. The acceptable error level therefore serves to choose the size of the training dataset.
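The two error measures of Equations 22 and 23 translate directly into code:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-Mean-Square Error (Equation 22)."""
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

def mape(y_true, y_pred):
    """Mean-Absolute-Percentage Error in percent (Equation 23)."""
    return 100.0 * np.mean(np.abs((y_pred - y_true) / y_true))
```

Note that MAPE is undefined when a true output is zero, which is not an issue here since a, e and ω are strictly positive within the sampled boundaries.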
The procedure of building a GPR-based Gravity Assist Mapping is summarized in the following steps:
(1) Define the mass ratio for a specific three-body system. The Sun-(Earth+Moon)-spacecraft system is taken as an example with a mass ratio of 3.036 × 10⁻⁶. The mass ratio is the only key parameter in the PCR3BP, so it is easy to implement the GPR-based Gravity Assist Mapping for different three-body systems;
(5) Optimize the hyper-parameters. Conjugate gradient ascent is used to find their optimal values, embedded in a multi-start approach to avoid being trapped in local optima (more precisely: the gradient procedure is initiated at 10 different initial combinations of hyper-parameter values);
(6) Determine the optimal size of the set of training data. This is done by adding training samples gradually until a stable error, i.e. the difference between the outputs of the numerical PCR3BP and the GPR models on the test dataset, is observed;
(7) Predict the output given a new input. Based on the training samples, the selected covariance function and the corresponding optimized hyper-parameters, the prediction can be obtained.
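The multi-start hyper-parameter optimization step can be sketched as below. The paper uses conjugate gradient ascent with 10 starts; this sketch substitutes SciPy's bounded L-BFGS-B minimizer applied to the negative log marginal likelihood of an illustrative RBF kernel, so the kernel and parameterisation are our assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marginal_likelihood(theta, X, Y):
    """Negative of Equation 15 for an illustrative RBF kernel,
    theta = [log_sigma, log_ell] (log-parameterised for positivity)."""
    sigma, ell = np.exp(theta)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = sigma**2 * np.exp(-0.5 * d2 / ell**2) + 1e-8 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))
    return 0.5 * Y @ alpha + np.log(np.diag(L)).sum() \
        + 0.5 * len(Y) * np.log(2.0 * np.pi)

def fit_multistart(X, Y, n_starts=10, seed=0):
    """Restart a gradient-based optimiser from several random initial
    hyper-parameter values and keep the best local optimum."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        theta0 = rng.uniform(-1.0, 1.0, size=2)
        res = minimize(neg_log_marginal_likelihood, theta0, args=(X, Y),
                       method="L-BFGS-B", bounds=[(-3, 3), (-3, 3)])
        if best is None or res.fun < best.fun:
            best = res
    return np.exp(best.x)  # (sigma, ell)
```

Keeping only the best of the restarts is exactly the guard against local optima described in step (5).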

Accuracy
We generate a test set meeting the conditions of Equation 19 to assess the accuracy of the method. The test set consists of 500 (N*) input vectors randomly distributed over the input space, and is kept fixed for all evaluations in a given application. The corresponding outputs are obtained both by numerical propagation in the PCR3BP model and by the GPR model. To evaluate the performance of each model, the MAPE and RMSE are calculated using Equations 22 and 23. The Sun-(Earth+Moon)-spacecraft system is considered for all applications.

Systematic Sampling
Fig. 6 shows the MAPE and RMSE of the test set on the semi-major axis for increasing training size, using the RQARD and SUM covariance functions, respectively. In this case, the RQARD function performs better than SUM, but the MAPE is larger than 1000%. Obviously, systematic sampling fails to offer useful information to the GPR model. This is because the GPR model uses Bayesian inference, a type of statistical inference that strongly relies on the assumption of random selection [29]. If this assumption is not satisfied, as is the case with regular sampling, the foundation of the GPR model breaks down and its output must be considered unreliable.

Random Sampling
In this test, we distribute the training inputs randomly over the entire input space. The MAPE and RMSE results are shown in Fig. 7. The SUM function performs better than the RQARD function. For the case of 780 training samples, SUM obtains the lowest MAPE value of 19.7% and an RMSE of 5.8 × 10⁻³ AU. We generate another two sets of training samples using different random seeds and perform the predictions using the SUM function. The trends of MAPE and RMSE for the three training datasets show that the random seed is not an influential parameter for the level of convergence.

Fig. 6 The error of test samples when the training dataset uses systematic sampling.
Fig. 7 The error of test samples when the training dataset has a random distribution.

Latin Hypercube Sampling
LHS is a statistical method to generate random samples, especially for multidimensional problems [30]. It is generally used to avoid clustering of random samples in a specific subspace. As shown in Fig. 8, the SUM function captures the model mechanics more quickly than RQARD. The SUM function reaches its lowest MAPE value of 24.1% and an RMSE of 4.9 × 10⁻³ AU at a training data size of 900. In terms of MAPE, LHS has a poorer accuracy than random sampling.

Stratified Random Sampling
Figure 9 shows the number of MAPE values larger than 10% in the subspaces of ω_A when using random sampling. Due to the stronger flyby effects around ω_A = 180° (cf. Fig. 2), the error for these test points is worse. More training points can therefore be placed in this subspace, which leads to a Stratified Random Sampling (SRS) strategy. Figure 10 shows the corresponding RMSE results. The performance stabilises for an increasing number of training samples. In Figs. 10a and b, the RMSE of these training datasets converges to the same level for a training data size larger than 1500. The prediction of ω needs more training samples than that of a and e, which indicates that the GPR model needs more information to learn the change of ω. We generate another two groups of training samples using the same SRS method with different random seeds. For the prediction of a, the standard deviation among these three groups is about 1.0 × 10⁻⁴ AU when more than 1500 training samples are used. The random seed used for generating the training dataset is thus a negligible influence factor for the performance of the GPR models.
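A stratified random sampling of ω_A as described above can be sketched as follows; the stratum boundaries and the fraction of samples assigned to the high-error stratum are illustrative placeholders, not the paper's exact choices:

```python
import numpy as np

def stratified_omega_samples(n_total, rng, frac_core=0.6,
                             w_core=(np.radians(175), np.radians(185)),
                             w_full=(np.radians(174), np.radians(186))):
    """Stratified random sampling of omega_A: force a fixed fraction of the
    samples into the high-error stratum around 180 deg, and draw the rest
    uniformly over the full range (fractions/bounds are placeholders)."""
    n_core = int(frac_core * n_total)
    core = rng.uniform(*w_core, n_core)          # oversampled stratum
    rest = rng.uniform(*w_full, n_total - n_core)
    return np.concatenate([core, rest])
```

Oversampling the stratum where the flyby effect (and hence the test error) is strongest is what reduces the RMSE relative to plain random sampling.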

Summary of Sampling Methods
We compared four types of sampling methods for the generation of the training samples: Systematic Sampling (SS), Random Sampling (RS), Latin Hypercube Sampling (LHS) and Stratified Random Sampling (SRS). The prediction using RS performs better than those using SS and LHS. Based on RS, we developed an SRS specifically for this case (h > 300 km), because of the larger errors of the test samples in the subspace ω_A ∈ [175°, 185°]. The RMSE using SRS is reduced to less than 1/4 of that of RS. By using different training datasets, the training process of the GPR model is shown to be insensitive to the random seed.

Flybys Outside the Sphere of Influence of the Secondary
In this application, the periapsis of the initial osculating orbit of the spacecraft is chosen to lie above the sphere of influence of the secondary (SoI): r_p > 1.0062 AU (i.e. r_p > r_SoI + r_2) in the Sun-(Earth+Moon)-spacecraft system. Figure 11 shows the errors of the semi-major axis predictions after the flybys. After some fluctuations for training data sizes between 0 and 1000, the error gradually becomes smaller and stabilises when the data size reaches 2400. The RMSE is selected as the main criterion for optimizing the hyper-parameters; using the same set of hyper-parameters, the MAPE can show a slightly different trend than the RMSE. The performance of predicting the eccentricity is presented in Fig. 12. The RMSE of e decreases to a value of 1.0 × 10⁻³ using 2000 training samples. The maximum eccentricity e_A among the test cases is 0.33. In Fig. 13, the RMSE of ω becomes lower than 0.4° with more than 1800 training samples. The convergence of the hyper-parameter optimization for the prediction of the output state is illustrated in Fig. 14 for the hyper-parameters l_a, l_e and l_ω. When the training data size becomes larger than 1000, the GPR model converges gradually to the optimum. Although the optimization process eventually converged to negative values of some hyper-parameters, this has no consequences since their squares are used (Equation 17). RS is used in this case because no significant difference in error between the subspaces is observed.

Fig. 11 The semi-major axis error of test samples when the training dataset uses random sampling. The periapsis of the initial osculating orbit of the spacecraft is outside the sphere of influence of the secondary.

Flybys Outside the Hill Sphere of the Secondary
For the prediction of distant flyby effects, this application investigates the condition that the radius of periapsis is larger than 1.01 AU (i.e. r_p > r_2 + r_Hill). Figure 15 shows the accuracy of the prediction for an increasing number of training samples. For all three orbital elements, the curves flatten out after about 1300 training samples. The accuracy is better than in the SoI case: the MAPE of δa and δe is no more than 1%, and the MAPE of δω has improved by a factor of two. This shows that the low-energy case is easier for the GPR to learn, mainly because the flyby effects at low energies are less complex than in the previous applications, e.g. the retrograde flybys disappear at such energies [16]. Therefore, fewer training samples are required to obtain a reliable model.

Fig. 12 The eccentricity error of test samples when the training dataset uses random sampling. The periapsis of the initial osculating orbit of the spacecraft is outside the sphere of influence of the secondary.
Fig. 13 The ω error of test samples when the training dataset uses random sampling. The periapsis of the initial osculating orbit of the spacecraft is outside the sphere of influence of the secondary. The best MAPE obtained is 5.7%.
The comparison between the performance of our GPR-GAM and the Keplerian Map (KM) model of Ross et al. [6] (fully reproduced here) is shown in Table 1. 1500 training samples are used in our GPR model and the SUM covariance function is selected. The test set has 500 samples, with inputs randomly distributed in the input space and outputs numerically calculated by PCR3BP propagation. Both tests are performed in MATLAB 2017b on a laptop with a Core i7 CPU and 8.00 GB RAM. The RMSE of predicting a using the GPR-GAM is only 1/6 of that of the KM. In addition, the GPR-GAM improves the accuracy of predicting ω to a level of 4.72 × 10⁻³ rad. The KM fails to quantify the variation of ω accurately because it simply evaluates ω in terms of the change in orbital period. For the GPR-GAM, the CPU time spent on a single test sample is the same for a and ω. The KM is much slower in predicting a, because it uses numerical integration, but is faster in the prediction of ω. The time efficiency of the GPR method is demonstrated for up to 300,000 test samples in Fig. 16. When using 1500 training samples for the case outside the Hill sphere, the total prediction time increases linearly with the number of test samples: it costs 0.32 s for the evaluation of 300,000 samples, and the prediction time per sample is always lower than 1.2 × 10⁻⁶ s. For the case outside the SoI in Fig. 16b, the computational effort is larger because 2400 training samples are used; it takes 1.14 s for the same number of test samples and the CPU time per sample is 3.8 × 10⁻⁶ s. When applying GPR to the case outside the Hill sphere, the generation of the training samples (0.28 s per sample) and the training of the GPR model (1530 s for 10 multi-starts) cost 1950 s in total.

Fig. 15 The RMSE and MAPE of test samples when the training dataset adopts stratified random sampling. The SUM covariance function is used.
However, the GPR model is off-the-shelf once the training process is completed, and the values in Table 1 give an accurate estimate of the computation time needed to predict the flyby effect of a single input. Table 2 summarizes the performance of GPR for the various applications. The accuracy of the GPR model improves as the closest passage of the spacecraft moves further from the secondary. The radius of the Hill sphere is close to that of the SoI, yet the accuracy improves notably, especially for a and e. The prediction of ω remains the least accurate and can be investigated further.

Conclusions
A Gravity Assist Mapping was proposed based on a machine-learning method called Gaussian Process Regression. It was trained on a dataset consisting of thousands of samples generated by numerical integration in the planar CR3BP framework. The generation of training data takes into account elliptical orbits of low, moderate and high eccentricity. We determined the size of the training set by requiring a stable RMSE on the test dataset. A covariance function (SUM) was developed that combines a cosine term with the rational quadratic term with automatic relevance determination (RQARD), and it captures the dynamics of flybys well. Given the initial condition of a particle, the GPR-based model can assess the flyby effect more efficiently than methods based on numerical integration. The model can be built for various planar CR3BP systems by simply changing the mass ratio parameter. In contrast to previous contributions, which numerically integrate the equations of motion, this work investigated and predicted the flyby effect using a machine-learning method. The domain of applicability extends beyond that of the Keplerian map. The GPR-GAM does need to be retrained to model the flyby effects of trajectories belonging to a different domain. The results demonstrate that the GPR-based Gravity Assist Mapping performs well in terms of accuracy. Compared to previous work in the literature, significant improvements have been made, in particular when using the combined covariance function and stratified random sampling. This can contribute to two main lines of study in astrodynamics. On the one hand, the characteristics of the GA in the CR3BP can be investigated using the large amounts of accurate data produced by the GPR model, providing a deeper understanding of the third-body effect. On the other hand, the GPR model can be used to design multi-flyby missions, taking advantage of its high efficiency in updating the post-flyby state. The developed GPR-GAM is still a prototype.
It has some limitations, such as being unable to classify collision orbits among test samples, because such orbits are removed from the training samples. The post-flyby state in the neighbourhood of the secondary might be missed when using the Tisserand termination condition. Further improvements include developing a classification model to identify collision orbits for an arbitrary initial condition, and using a distance-based termination condition.

Appendix A: Minimum Distance Between Spacecraft and Secondary
All variables are non-dimensionalized using (M_1 + M_2) for the mass, the semi-major axis of the secondary a_P2 for the length, and √(a_P2³/(G(M_1 + M_2))) for the time. For the circular orbit of the secondary, the true anomaly θ_P2 is set to 0 at the positive X-axis. The initial true anomaly of the secondary θ_P2^initial is selected such that it arrives at the negative X-axis when the spacecraft (in terms of its initial osculating orbit) reaches periapsis. Since the secondary rotates at unit angular rate and the spacecraft takes half of its osculating orbital period, π a_A^{3/2}, to travel from apoapsis to periapsis, this is specified by:

$$\theta_{P_2}^{initial} = \pi - \pi\, a_A^{3/2} \tag{25}$$

where a_A is the semi-major axis of the initial osculating orbit of the spacecraft. The distance d between the spacecraft and the secondary is given by the law of cosines:

$$d = \sqrt{r_1^2 + r_{P_2}^2 - 2\, r_1 r_{P_2}\cos\gamma} \tag{26}$$

where r_1 represents the distance between the spacecraft and the primary, r_P2 the distance between the secondary and the primary, and γ the angle between r_1 and r_P2. Since r_P2 has a constant value of 1, Equation 26 simplifies to:

$$d = \sqrt{f(r_1, \gamma)}, \qquad f(r_1, \gamma) = r_1^2 + 1 - 2\, r_1 \cos\gamma \tag{27}$$

For a fixed value of γ, the partial derivative of f with respect to r_1 is:

$$\frac{\partial f(r_1, \gamma)}{\partial r_1} = 2\left(r_1 - \cos\gamma\right) \tag{28}$$

The right-hand side is always larger than zero for the initial conditions of this paper, so the function f(r_1, γ) is monotonically increasing in r_1. When r_1 equals the radius of periapsis r_p and γ = 0, d attains the minimum value |r_p − 1|. This condition corresponds to ω = π.

Appendix B: Values of Hyper-Parameters
The optimized values of the hyper-parameters of the models in Table 2 are given below.

Table 3 The optimized hyper-parameters of GPR-GAM for the first case in Table 2 (columns: Output, σ, α, l_a, l_e, l_ω, p, h)
Table 4 The optimized hyper-parameters of GPR-GAM for the second case in Table 2 (columns: Output, σ, α, l_a, l_e, l_ω, p, h)
Table 5 The optimized hyper-parameters of GPR-GAM for the third case in Table 2 (columns: Output, σ, α, l_a, l_e, l_ω, p, h)
Table 6 The optimized hyper-parameters of GPR-GAM for the fourth case in Table 2 (columns: Output, σ, α, l_a, l_e, l_ω, p, h)