Closed Loop Guidance During Close Range Rendezvous in a Three Body Problem

The paper presents a novel application of the State Dependent Riccati Equation (SDRE) guidance approach with state constraints for a chaser spacecraft in the close proximity of a passive target. The dynamics are described by full 6 degree of freedom rigid-body relative motion. The final trajectory is defined by a passively safe approaching cone, which acts as path constraint and follows the attitude motion of target. A Near Rectilinear Halo Orbit in the Earth-Moon system is the selected rendezvous scenario to fully validate the proposed solution, even though the parameters related to the constraints and weighting functions are kept as general as possible, thus applicable to other similar missions.


Introduction
In the past few years there has been an increased interest in space exploration. In particular, the International Space Exploration Roadmap was proposed in February 2018 [1] indicating, as objectives, a permanent return to the Moon, and unmanned and manned missions to Mars.
Within this context, an international effort is undergoing to plan a mission to the Moon consisting in a target space station located in a near rectilinear halo orbit (NRHO) about the L2 Lagrangian points of the Earth -Moon system, and a lunar lander equipped with a rover collecting lunar samples and bringing them back to the station, to later be returned to Earth [2]. There are many advantages in using a NRHO, such as stability of the orbit with low ΔV for station-keeping, continuous visibility from Earth for communications, low periselene altitude, etc. However rendezvous and berthing dynamics and control require in depth study primarily because of the non Keplerian environment the vehicles will move in, where classical control strategies based on linearized two-body dynamics no longer apply.
The rendezvous mission considered here consists of a series of orbital maneuvers and controlled trajectories, which successively bring the active vehicle (chaser) close and eventually into contact with the passive vehicle (target). The complexity of the rendezvous approach results from the multitude of conditions and constraints that must be fulfilled. The target station may impose safety zones, approach-trajectory corridors and hold points along the way to verify the chaser's trajectory accuracy, and to switch between appropriate sensor suites. Any dynamic state (position and velocities, attitude and angular rates) of the chaser vehicle outside the nominal limits of the approach trajectory could lead to collision with the target, a situation dangerous for crew and vehicle integrity [3].
The problem of control of rendezvous dynamics in Earth's orbit has been studied since the 1960s' [4,5] and performed in the past, starting with the Apollo program, the historical 1975 Apollo -Soyuz mission, and occurring at the present time with the activities related to the international space station. While long range rendezvous and phasing are generally automated, most of the close range rendezvous is still performed manually [6,7]. Traditionally, rendezvous and proximity operations are performed using open-loop maneuver planning techniques, and ad hoc error corrections. Examples of constrained maneuvers include the thrust magnitude constraints, constraints on the approaching spacecraft to maintain its position within a Line-of-Sight (LOS) cone emanating from the docking port on the target platform, and constraints on the terminal translation velocity for soft-docking are proposed for instance in [8][9][10].
From a guidance and control standpoint, several methods can be found in the literature. Some of the studies use terminal sliding mode control, which enables a time-fixed process with the flight prescribed a priori [11]. A fixed-time glideslope guidance algorithm on a quasi-periodic halo orbit can be found in [12]. Another interesting reference on guidance algorithms for low Earth orbit (LEO) is [13], where linear optimal regulator control combined with proportional navigation was proposed. Hartley and coworkers applied model predictive control techniques for a Keplerian rendezvous [14]. An application of H-infinity control can be found in [5], which shows good performance for the case of elliptic orbit, provided linearization bounds are maintained.
A State Dependent Riccati Equation (SDRE) method provides a systematic approach for solving the infinite horizon optimal control of nonlinear systems, avoiding the solution of the associated Hamilton-Jacobi-Bellman partial differential equation, generally unpractical. The technique guarantees local stability and optimality, robustness with respect to non-modeled dynamics and uncertainties, as well as real time implementation. SDRE effectiveness has been proven extensively on a wide variety of applications, see [15] for instance. The method has been also used for relative motion control in a classical two-body scenarios with good results in the control of translation-attitude coupling [16,17]. The paper proposes and verifies a State Dependent Riccati Equation technique as an effective approach to the close range rendezvous in a three-body problem, with particular reference to the future Artemis program. To the authors' knowledge, this is a novel application due to the nature of the underlying dynamics, except perhaps for the work in [18], where the authors proved the efficiency of SDRE for a station-keeping and reorientation for formation flight in a Sun -Earth scenario, with solar pressure perturbations.
The paper is organized as follow: the mathematical model of relative motion of two spacecraft is provided in Sect. 2. Section 3 describes the motion constraints introduced in the State Dependent Riccati Equation general algorithm. The guidance structure for the problem and numerical examples are presented in Sect. 4 and conclusions in Sect. 5.

Equations of Motion
This section summarizes the relative motion dynamics in the restricted three body problem. More details can be found in [19] and [20], among others. Although general in nature, the application in the paper will be based on the proposed Lunar Orbital Platform Gateway (LOP-G) consisting of a a space station in a lunar NRHO orbit, and a Lunar Ascent Element (LAE) returning from the Moon for berthing with the station. They will be also referred to as target and chaser respectively. The chaser spacecraft is the only actively controlled element.

Relative Translation
Let us consider two spacecraft performing a rendezvous maneuver. The two spacecraft are subjected to the gravitational action of the two primary bodies (in this case Earth and Moon).
The relative motion between the two vehicles is described with respect to a widely used reference system L ∶ t ;̂ ,̂ ,̂ , local-vertical local-horizon (LVLH), which is appropriate for control design, with the unit vector defined as follows: where mt is the target position with respect to the Moon-centered rotating frame, with magnitude r mt = || mt || , t∕m = mt × ̇ mt M is the target angular momentum with respect to the Moon, with magnitude h t∕m = || t∕m || . In general, the unit vectors ̂ ,̂ ,̂ are also known as V-bar, H-bar e R-bar (strictly speaking defined only for Keplerian motion). Note that if we introduce the M ∶ m ;̂ m ,̂ m ,̂ m frame centered in the center of mass of Moon, the unit vectors ̂ m −̂ m lie in the moon orbital plane: Also, em is the Moon position with respect to the Earth, r em = || em || , m∕e = em × ̇ em I is the specific angular momentum of the Moon with respect to the Earth, and h m∕e = || m∕e || . The equations describing the dynamics of the relative position between the two spacecraft, in the LVLH frame, are taken from [19] and are shown below: Referring to Fig. 1a we have: In Eq. (3) we have the relative position , and its derivatives with respect to the LVLH frame; the angular velocity l∕i of the LVLH frame with respect to the inertial frame, and the target and chaser positions with respect to the Moon-centered frame mt , mc , respectively.
The proposed target orbit is shown in Fig. 2, with the average period of 7 days, and aposelene and periselene distances of 70,000 km and 6,000 Km, respectively.
The close range rendezvous and berthing are assumed to occur during aposelene passage for safety reasons, since the target's velocity is the lowest, this also allows for the simplification of the equations. In this case in fact, the approximation of the primary bodies revolving in circular orbits, (Circular Restricted Three-Body Problem CR3BP assumption) appears appropriate [21]. With this assumption, the number of time-varying parameters in Eq. (3) reduce. Indeed em is constant, m∕i = n̂ m and m∕i M = 0 . The use of equations derived from CR3BP, while still nonlinear, allows the reduction of variables with respect to the elliptical case. In fact for target information we only need: mt , mt L .

Relative Attitude
In the study of rendezvous operations, during long range approach, the translational motion is considered sufficient to describe the relative distance propagation and even the design of a reference trajectory. As the chaser moves near the target,

Fig. 1
Reference frames attitude and attitude rates dynamics and control become of paramount importance for the safety of the maneuver as well as the precision required during the final phase (be that either berthing or docking). The procedure adopted here consists of a separate computation of chaser and target attitude, since the latter is undergoing passive motion [22].

Chaser Attitude
The chaser spacecraft (LAE) is the part of the lander, and will depart the Moon's surface once the ground operations are complete. It is assumed to be cylindrical, and its side view is depicted in Fig. 3. Its preliminary configuration can be found in [23]. With reference to the body fixed frame C shown in Fig. 1c, which has the origin in the center of mass of the rigid chaser spacecraft and axes parallel to the principal axes of inertia, the attitude dynamics are given by: where is torque vector, is the inertia matrix and is the angular velocity of the rotating frame. Note that is defined with respect to the inertial frame, and it can be computed as: where c∕l is the angular velocity of chaser with respect to L and l∕i is the angular velocity of L with respect to inertial frame. The kinematic motion can be described by standard Euler angles and by means of quaternions as well. In this work the following definition was used: ⊤ is the Euler rotation eigenaxis and is the rotation angle around .
The differential relationship between quaternions and angular velocity is given by: with c∕l the quaternion that describes the relative attitude between C and L , and [⋅] × denotes the operator that transforms a vector into the associated antisymmetric matrix. The set of differential equations given by Eqs. (4) and (5) provide the nonlinear attitude model of chaser [24].

Target Attitude
For the target's attitude we take as reference the international space station (ISS) dynamics, which is attitude controlled using a two sided limit cycle controller, and has a sawtooth profile. This motion can be modelled as an harmonic oscillator [22] and described in Eq. (6) below using quaternion formulation: where t∕l is the quaternion that describes the attitude of target body frame with respect to LVLH, ̇ t∕l is the time derivative of quaternion, t∕l is the angular velocity of target with respect to LVLH frame; ̇ t∕l is the angular acceleration of target with respect to LVLH frame, (⋅) is the matrix that relates the time derivative of the quaternion with the angular velocity, and qt is a diagonal matrix containing the eigen frequency for each axis. Note that the fixed target frame is defined similarly to that of the chaser (Fig. 1c).

Relative Attitude
The relative attitude between two rotating objects is based on the difference of the respective angular velocities expressed in an appropriate frame [22]. In this case the difference is expressed in the C reference: cl ( ∕ ) is the matrix that transforms the components of a vector from frame L to frame C.
Once the relative angular velocity c∕t has been determined, we can compute the derivative of the associated quaternion as: ̇ c∕t = 1 2 ( c∕t ) c∕t .

Control Synthesis
As described in the introduction, the SDRE methodology is used to synthesize of a closed loop guidance in the final berthing phase of the mission. A short review of the methodology is described below for clarity's sake. The reader can refer to [15] and [25], for more details. The next section specializes the general structure to the specific problem addressed by the paper. Consider a nonlinear regulator problem that minimizes the following quadratic cost function: subjected to nonlinear differential constraints affine in the control of the form: where ∈ ℝ n is the system's state vector (see later for the present application), ∈ ℝ m is the control vector, ∶ ℝ n → ℝ n , ≠ 0 ∀ ∈ ℝ n ; ( ) ≥ 0 and ( ) > 0 are the weight matrices of the state vector and the input vector respectively. If the dynamics of the system in Eq. (10) can be written in a pseudo-linear form by the introduction of a State Dependent Coefficient (SDC) as: with state and input matrices functions of the state, then the SDRE control method assumes the form of a LQR-like controller and can be summarized in the following two steps: In order to obtain a valid solution ( ) of the algebraic Riccati equation, the pair ( ); ( ) must be pointwise stabilizable in the linear sense [15].

• Motion Constraints
The rendezvous maneuvers can be performed by imposing constraints on the relative position and relative velocity of the two spacecraft when they are approaching, especially in the final phase of the rendezvous. This can be incorporated in the general SDRE design as well.
Consider the system described by Eq. (10), with initial conditions ( ) = 0 ∈ Ω and a set of allowable states defined by: it is possible to design a controller such that the closed loop system is stable and does not exceed Ω , the boundary of Ω , defined by: The sufficient condition for to remain in Ω is ∇ ( )̇ = 0 . The controller that satisfies these conditions forces the trajectories of the closed loop system to follow the level curves of the Ω set. Incorporating the constraint as a quadratic term, the cost function becomes: The second term on the RHS of the cost function introduces the constraints, and can be represented by the fictitious output , defined as follow: z is a p × p matrix, selected such that its i-th element has a large value when is near the border of the i-th constraint and small elsewhere. This means that in the cost function the component expressed by J Ω ( , ) is predominant with respect to J 0 ( , ) when the state does not respect the constraint, and becomes negligible when the constraint is satisfied [16].
In cases when ∇ ( ) is orthogonal to ( ) , the term ( ) = 0 . An alternative way of choosing z ( ) is then to penalize the state, that is the i-th element of the weight assumes a large value when we are in a region of the state space to be penalized and zero otherwise [25].
With the introduction of state constraints, the control law becomes: where For the purpose of the present work, the assumption of full state availability was made, when deriving the control law in Eq. (18), that is relative position and rate vectors, attitude quaternions, and angular rate.
If full state is not available to the controller, a discrete state estimate can be considered referring to the work in [26], and improved in [27], in order to avoid possible loss of observability due to the size of selected time intervals. Recalling [27], we consider a stochastic nonlinear system of the form: where is a Gaussian zero-mean white noise associated with the process, is a Gaussian zero-mean measurement noise. By means of Euler's discretization, with step s , we have k = n + s A(x k ) , and k = (x k ).Where (.) is one possible SDC parametrization of a continuous system and (.) is a possible SDC parametrization for the output of the system in Eq. (20). The nonlinear discrete time system can be viewed as a frozen-in-time linear equation. Traditionally, there are two formulations of the discrete SDRE estimator based on Kalman filter. Here we use the two-step recursive update (see [27] for details).

SDRE Guidance Law
The general SDRE controller defined in Eqs. (11), (16) and (18) is now detailed in terms of problem specific state dependent coefficient (SDC) parametrization [15], and the definition of state constraints. Numerical results of the SDRE closed loop guidance applied to the cis-lunar rendezvous will be then presented and discussed in the next section.

SDC Parametrization for Translation
The equations of relative motion described by Eq. (3), can be parametrized since all the conditions of existence are guaranteed. Note that the nonlinearities of the system are due to gravitational terms. The term that takes into account the gravitational attraction due to the Moon can be rewritten as follows:

SDC Parametrization for Attitude
Similarly, it is possible to derive a SDC parameterization for the dynamics and kinematics of the chaser's attitude (see Eqs. (4), and (5)). The parameterization was taken from [28].
where = −0.0001 is a small constant added to the spacecraft quaternion kinematics for numerical reasons in the solution of algebraic Riccati equation. Notice that the addition of correction is only an artifact since the quaternions parameters represent only three independent parameters The coefficients a ij are:

Constraints and Keep-out-Zone
In the final phase and proximity operations of the rendezvous, the chaser should approach the target from a direction bound by a safety zone for collision avoidance mitigation. Typical constraints for close range are selected as spheres of a given radius (for instance in the Heracles-ESA mission the rendezvous sphere has a 10 Km radius, the approach sphere 2 Km, and the keep-out-zone 0.2 Km, respectively). In this work a simpler cone-like final approach corridor is considered as in [9]. Figure 4 shows a qualitative constraint illustration, in which is the unit vector of the path direction, and is the maximum cone angle of the corridor and the main design parameter. The constraint geometry is rewritten using Eq. (14) formalism and becomes: is the system's state vector. The constraint is represented as a fictitious output as in Eq. (17) expressed in a target T frame so, as we can see from Eq. (25), the direction of cone axis depends on target's attitude. Note that in our case ∇l( )⊥ ( ) , so the weight function z was selected to penalize the state when it is far from the imposed constraint. In other words, we chose the weight function that depends on the 3D distance between the chaser's a 11 , a 12 , a 13 center of mass and the line described by the unit vector . In this way, z has a large value when the chaser's center of mass is far from the cone axis and small value when it is close to, as mentioned earlier.

Results
In this section, numerical simulations are presented to validate the proposed method. The target moves on a NRHO and we simulate rendezvous maneuvers both at the aposelene, as well as in the worst case condition, i.e. close to periselene. Table 1 summarizes the initial conditions. We assume that the attitude motion of the target has a maximum amplitude of 5 • and the eigen-frequency of Eq. (6) is equal to k qt = 0.1571 rad s −1 [3]. The chaser is is modeled as a cylinder with inertia matrix = diag(0.0011, 0.0006, 0.0006) kg ⋅ km 2 [20]. The direction vector of the approaching cone is T = [−1, 0, 0] ⊤ and the maximum cone angle is set to = 25 • .
The main equations used for motion propagation are those relative to the circular restricted three body problem. For better numerical stiffness the equations are normalized as in [29]; the distances are normalized to the Moon's orbit semi-major axis, the time in units of the inverse mean motion of the NRHO orbit, and the masses are expressed such as M e + M m = 1 . The terminal conditions selected for the tests are ≤ 1 m for relative position, and ̇ ≤ 0.03 m/s for relative velocity. Note that for ESA's ATV mission concept [30] the constraints were 20 m in relative position, and 0.01 m/s for relative velocity [3]. Simulations were run using Simulink TM , the guidance and navigation algorithm runs at 1 Hz, and the Dorman-Price integration algorithm was used.

Aposelene Approach
The aposelene is considered the most feasible area for the docking/berthing. Current literature indicates that as the most likely location, and it has been shown that CR3P equations are sufficiently accurate for the dynamic description of the relative motion [6,21]. In this case, the target has the slowest orbital velocity. The docking/berthing zone is indicated in red in Fig. 2. The SDRE controller was tested by means of a limited Montecarlo simulation for six different relative distances = 5, 8, 11, 14, 17, 20 Km. For each , 20 random uniformly distributed points were selected.
The weighting matrices coefficients used for the translation are: The constraint matrix depends on distance between cone axis and chaser, and is set as follow: where axis (x, y, z) is the classical formula that defines the distance of one point from line in 3D space. All weights were selected by trial and error in order to maximize the accuracy and minimize the control effort.
The position is assumed to be an available measurement, so ( ) = 3×3 0 3×3 . The measurement error is considered as purely random with Gaussian distribution, zero mean and standard deviation = 1∕3 × 10 −2 m [16]. The process error model is based on [31], and is considered as purely random with Gaussian distribution, zero mean and standard deviation = 1∕3 × 10 −6 Km∕s 2 . The control weighting matrices are the same as in Eq. (27), and the process noise and measurement covariance matrices are given by: The initial condition for the error covariance matrix is given by: The filter initial conditions are: where 0 is the real relative position and velocity of chaser, p and v are 3 × 1 vectors of uniformly distributed random numbers, in the interval (0, 10) [cm] and (0, 1) [cm/s], respectively.
The performance analysis was based on normalized values of position error, error rate and amount of control, over the time of flight period.
, v are the error vectors respectively between real position and estimated position, real velocity and estimated velocity, and equivalent propellant consumption. Figure 5 shows the relative position between chaser and target from the Montecarlo simulation. The time of flight and control effort are shown in Figs. 6 and 7.
As we can see from Figs. 6 and 7, the total control effort v and time of flight t of increase with increasing relative distance, as expected. In addition, the linearity in Fig. 6 confirms the feasibility of aposelene approach in terms of validity of the equations of motion. When the chaser initial conditions are not inside the LOS cone and the relative distance is less than 15 Km, the chaser moves very slowly as it reaches the approach corridor. The corresponding standard deviation could be reduced by increasing the number of tests. The average errors in position and rate, between real and estimated values, are shown in Table 2.  To evaluate the attitude behavior at the aposelene, we consider a sample trajectory from those simulated above and shown in Fig. 8. The rendezvous maneuver lasts about 2 hours starting from relative distance of about 11 Km. The results in terms of ΔV expenditure compare favorably with the results in [32], where different thrust allocation algorithms were used, and the results in [33], where continuous thrust was implemented using the adjoint method.
For the selected example, the control forces and torques are shown in Figs. 9 and 10, respectively.
The large initial values depend on the fact that the chaser is controlling both its translation and rotation, the coupling is noticeable especially along the z axis, with the oscillatory behavior of the z force component. The control amount could be tuned further by changing the weights in the optimization. The angular behavior in terms of quaternions and angular rates is shown in the next figures (See Fig. 11). Figure 9 shows the target and chaser quaternions, while the time histories of chaser and target angular velocities are shown in Fig. 12.
The weighting strategy used to satisfy the constraint imposes a severe penalty on velocity, due to the orthogonality of ∇ ( ) ⟂ ( ) . As matter of fact we have initially a high velocity in R-bar and then, when the chaser reaches the axis of the approaching cone, the R-bar component decreases and the V-bar velocity has a plateau, with the chaser moving towards the target.

Periselene Approach
A rendezvous at the periselene of the NRHO orbit is not considered practical for several reasons: first of all the target vehicle is at its maximum speed, thus making the safety requirements very critical and too restrictive, secondly to maintain a desired relative position and velocity, the requirements on Δ V could be too high for the mission. Thirdly the circular restricted three body problem may lose accuracy during propagation. This case is then used only for the purpose of evaluating the behavior of the proposed guidance in a worst case scenario.  The approach zone at the periselene is shown in red in Fig. 13. The Montecarlo simulations were performed with the same parameters used for the aposelene approach. The propagation equations are based on the elliptical restricted three body problem. The resulting trajectories, time of flight, and control effort are shown in Figs. 14, 15, and 16, respectively.
The most significant difference between the two cases is the increase of control amount required to perform the rendezvous at periselene, as suspected. Table 3 shows very similar errors in position and a slightly increase in rate error for the periselene approach. An interesting comparison is shown in Fig. 17. On the left a zoomed set of trajectories obtained using the ER3BP are shown (taken from    Fig. 14), while on the right the trajectories are computed using the CR3BP equations. This indicates that the CR3BP equations could be sufficient for the relative motion description, see Table 4. Figure 18 shows the influence of the gain in the weighting matrix z described in Eq. (28). The higher the gain and the faster the trajectory moves towards a rectilinear V − bar path (figure to the right). The numerical values for the two cases are shown in Tables 5 and 6, respectively.

Conclusions
The paper presents a State Dependent Riccati Equation approach to closed loop guidance with state constraints. The technique is applied in a rendezvous scenario between two spacecraft around the Moon in a NRHO environment, due to nonlinear nature of the relative dynamics. The constraints are formulated as a conic area that depends on the target's attitude, so the chaser's center of mass must follow the complete target's motion. Although simulations are based on a somewhat limited Montecarlo analysis, the method provides successful control and feasible ΔV requirements at the aposelene, and also a satisfactory behavior at the periselene, with additional control effort. The weighting selection on the constraints allows the designer to modify the trajectory in order to acquire quickly a desired V − bar direction, which appears to be desirable in standard rendezvous maneuvers. The mission scenario used for the synthesis is based on current information on the lunar gateway study, thus the numerical data could be subjected to variations in the future, as well as the computational requirements of SDRE, with respect to mission design.
Acknowledgements The present work was performed with partial support of the European Space Agency under contract No. 4000121575/17/NL/CRS/hh/CCN1. The views expressed in this paper can in no way be taken to reflect the official opinion of the European Space Agency. The contribution of the third author was made while a Ph.D. student in the Department of Information Engineering at the University of Pisa.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission