Turning Motion Control Design of a Two-Wheeled Inverted Pendulum Using Curvature Tracking and Optimal Control Theory

This paper presents a control design method for implementing planar turning motion of a two-wheeled inverted pendulum with an input delay. The control task requires that the inverted pendulum be kept stabilized during the whole turning motion along a prescribed curve. Firstly, by using the theory of planar curves, key observations about the motion law of the two-wheeled mobile chassis are made and used to set up a dynamical trajectory tracking target. Then, by adjusting the parameters in the tracking target and the weights in the quadratic performance criterion, an optimal integral sliding mode controller based on a linear quadratic regulator is designed to keep the vehicle body stabilized while the two-wheeled inverted pendulum tracks a circular path. An illustrative example with numerical simulation is given to demonstrate the validity of the theory.


Introduction
The two-wheeled inverted pendulum (TWIP, for short) is a general term for mechanical models driven by two wheels with a pendulum rod mounted on the chassis. It is a self-balancing system with some remarkable advantages, such as simple structure, good dexterity, true zero turning radius, small footprint, low cost and low energy consumption [1]. Thus, it has been more and more widely used in human transporters and humanoid robots. However, the dynamics analysis and motion control design of TWIP systems are still challenging, because a TWIP is an essentially nonlinear and under-actuated system, and its two wheels are subject to nonholonomic constraints when they move by rolling rather than slipping. The nonholonomic constraints of a TWIP are described by the motion equations of the mobile chassis, which restrain not only the motion displacement but also the motion velocity, the velocity constraints being nonintegrable. Nonholonomic systems are more flexible than holonomic ones, because the state of a mechanical system with nonholonomic constraints can reach any location in the displacement space. Thus, in some applications, nonholonomic structures are intentionally introduced into the manipulating device to implement intricate motion functions [2]. Usually, nonholonomic systems are first transformed into chain normal forms. Then, different kinds of control methods based on chain systems can be used to design controllers for the original nonholonomic systems [3,4]. In addition to the chain normal form, the power form and the Goursat normal form are two other normal forms that can also be used to deal with nonholonomic systems [5,6]. However, controllers designed from these normal forms act on speeds rather than on forces or moments of force, although the latter are closer to the actual motion control problem of a nonholonomic mechanical system.
Thus, the motion equations and dynamics equations of a nonholonomic mechanical system should be simultaneously considered in the control design for implementing a given motion task.
Stabilization of the inverted pendulum is a prerequisite in many control applications of a TWIP, whereas the strong nonlinearity of the inverted pendulum is a major difficulty in the control design. For some nonlinear Lagrangian mechanical systems, Chernousko's decomposition method and its extension [7,8] have been used for designing constrained feedback control to implement prescribed control objectives. In particular, for a pendulum-like system, a time-optimal feedback control whose number of switchings is not greater than one for any initial condition was proposed in [9]. When external disturbances and system uncertainties are taken into consideration, different kinds of robust control design methods have been proposed to stabilize the TWIP, such as combined control with a decoupled LQR controller and two state variable controllers [10,11]; nonlinear disturbance observer-based dynamic surface control [12]; sliding mode control [13,14]; adaptive backstepping control [15] and so on. For the motion control design problem of the TWIP, most of the available works on controller design use given longitudinal and yaw rotational speeds as tracking targets. Based on the dynamics equations of the TWIP, neural network-based control [16], fuzzy logic control [17] and adaptive control combined with some classical control methods have been proposed to design trajectory tracking controllers that track the given longitudinal and yaw rotational speed targets [18][19][20][21][22]. However, few works on the TWIP are devoted to designing controllers for implementing a given motion trajectory curve in the Cartesian frame. The main difficulty of this motion control problem is that the relationship between the target trajectory curve and the forward and yaw rotational speeds of the TWIP is not clear.
In [23,24], for example, the forward and yaw rotational speeds are considered as intermediate variables, which serve simultaneously as the control input of the motion equations and the control output of the dynamics equations. Then, a composite controller for implementing a given motion task is designed by using direct/indirect adaptive fuzzy control to track the trajectory path, plus a sliding mode control to make the stabilization of the vehicle body strongly robust. In [25], two high gain observers are proposed for estimating the forward and yaw rotational speeds of a two-wheeled mobile robot without an inverted pendulum. Then, an adaptive output feedback tracking controller is designed to implement the circular motion by using the estimated velocities. In addition, time delays in controllers and uncertainties from modeling, measurement or perturbation are usually inevitable in real applications, but they are usually ignored in the literature [1]. The most popular methods to deal with input delay are related to the prediction-based compensation control strategy [26][27][28]. Input delay is especially important when robust controls with a high-frequency switching mechanism, such as sliding mode controls, are used, because an ignored input delay can lead to reversed control. In fact, a TWIP is a natural high-frequency vibration system when the pendulum rod is stabilized at the unstable equilibrium point. As shown in [27], a very small input delay may remarkably enlarge the vibration amplitude of the TWIP system in the trajectory tracking problem when integral sliding mode control is used. Therefore, the input delay, although very small, is one of our major concerns in the control design of a TWIP.
In this paper, a controller design method for a TWIP to run along a given trajectory curve is proposed, for which the nonholonomic constraints must be considered. The design consists of two parts: one is the use of curvature theory to track the trajectory path precisely, and the other is the use of integral sliding mode control to stabilize the vehicle body robustly. Section 2 is the problem statement, Sect. 3 presents the key observations of the motion laws, Sect. 4 focuses on the controller design based on the observations, Sect. 5 demonstrates the main results with a numerical example, and finally, Sect. 6 ends with some concluding remarks.

Problem Statement
Figure 1 is a schematic diagram of the TWIP model, with the parameters and variables of the TWIP described in Table 1. Two kinds of equations are used to describe the motion of the TWIP: one is the equations featuring the motion of the chassis, and the other is the equations characterizing the dynamical behaviors of the whole TWIP.
Let q = (x_o, y_o, θ, ϕ, θ_r, θ_l) be the generalized coordinates of the TWIP. If the wheels run under the conditions of pure rolling and nonslipping, then the motion equations are given by the nonholonomic constraints of Eq. (1).

Table 1 The parameters and variables of the TWIP

Notation    Definition
T_l, T_r    Torques provided by wheel actuators acting on the left and right wheels
θ_l, θ_r    Rotational angles of the left and right wheels
ϕ           Tilt angle of the pendulum
θ           Heading angle of the TWIP

Let v_T, v, ω be the lateral speed, forward speed and yaw rotational speed of the mobile chassis, respectively. Then, ẋ_o = v cos θ, ẏ_o = v sin θ, and the motion equations, namely Eq. (1), are equivalent to Eq. (2). The dynamics equations of the TWIP model, Eq. (3), can be obtained by using the Euler-Lagrange equations for nonholonomic systems as done in our previous paper [28], where u_1 and u_2 are the torque controllers acting on the wheels. Equation (3) is nonlinear, so its control design is usually difficult with the prediction-based control methods [26], especially when the input delay is taken into account. However, the control task requires that ϕ be small during the whole motion process, so the control design can be based on the equation linearized with respect to ϕ around ϕ = 0.

Let X be the state vector. The linearized equations are then Eqs. (4) and (5) if the input delay τ > 0 is taken into account. Taking model uncertainty, linearization error and external disturbances into account, Eqs. (4) and (5) are represented as Eqs. (6) and (7), respectively, where σ_1^0(t) and σ_2(t) stand for the combined effect of the model uncertainty, linearization error and bounded external disturbances.
The control objective is to design a delayed controller (u_1(t − τ), u_2(t − τ)), so that the TWIP runs along a prescribed pathway Γ(t) = (x_o(t), y_o(t)) and keeps the tilt angle ϕ small enough.
Proof In a very small time interval [t, t + Δt], the distances travelled by the left and right wheels can be approximated as θ̇_l(t) r Δt and θ̇_r(t) r Δt, respectively. Thus, due to the effect of the nonholonomic constraints, one has Eq. (10), where R_l, R_r are the turning radii of the two wheels, and R_r = R_l + d. The turning radius of the center point O then follows from Eq. (10), and so does the corresponding relative curvature k_o(t), which depends on t and can also be transformed into a function k̃_o(s) of the arc length variable s, as described by Eq. (14). Under the assumption θ̇_r > θ̇_l > 0, and according to planar curve theory [29] and Eq. (14), the smooth function k̃_o(s) uniquely determines a smooth curve r(s), which can be easily calculated.
On the other hand, the required rotational speeds of the two wheels can also be obtained if the motion trajectory curve of the mobile chassis is given. Actually, k_o(t) can be expressed by an explicit formula. Combining this formula with Eq. (12) and differentiating with respect to t leads to Eqs. (16)-(18). Solving for the rotational speeds (θ̇_r, θ̇_l) from Eqs. (16)-(18) gives Eq. (8), and substituting Eq. (8) into Eq. (2) gives Eq. (9).
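The geometric argument of the proof can be checked numerically. The following is a minimal sketch, not the paper's code: it assumes that O is the midpoint of the wheel axle, that d is the distance between the two wheels (consistent with R_r = R_l + d), and that the forward and yaw speeds take the standard differential-drive form v = r(θ̇_r + θ̇_l)/2 and ω = r(θ̇_r − θ̇_l)/d. The function names and numerical values are hypothetical.

```python
def chassis_motion(theta_r_dot, theta_l_dot, r, d):
    """Forward speed, yaw rate and relative curvature of the axle midpoint O.

    Assumes pure rolling, wheel radius r and wheel separation d.
    """
    v = r * (theta_r_dot + theta_l_dot) / 2.0      # speed of the midpoint O
    omega = r * (theta_r_dot - theta_l_dot) / d    # yaw rate of the chassis
    if v == 0.0:
        raise ValueError("curvature is undefined when the forward speed is zero")
    return v, omega, omega / v                     # curvature k_o = omega / v = 1 / R_o


def wheel_rates(v, k_o, r, d):
    """Inverse map: wheel rates that produce speed v along a path of curvature k_o."""
    theta_r_dot = v * (1.0 + k_o * d / 2.0) / r    # right wheel turns on radius R_o + d/2
    theta_l_dot = v * (1.0 - k_o * d / 2.0) / r    # left wheel turns on radius R_o - d/2
    return theta_r_dot, theta_l_dot


if __name__ == "__main__":
    v, omega, k_o = chassis_motion(12.0, 8.0, r=0.1, d=0.4)
    print(v, omega, k_o)
    print(wheel_rates(v, k_o, r=0.1, d=0.4))
```

Under these assumptions the yaw rate equals k_o v, and applying the inverse map to the output of the forward map recovers the original wheel rates.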
Let Γ(s) = (x_o(s), y_o(s)), s ∈ [0, l], be used for describing the prescribed control pathway, where s is the arc length variable. Then, the tangent vector of Γ(s) must be a unit vector, because (dx_o/ds)^2 + (dy_o/ds)^2 = 1. Thus, the initial forward speed target would be a nonzero constant if Γ(t) = (x_o(t), y_o(t)), with s = t, is chosen as the trajectory tracking target. In this case, the initial speed error is not zero, because the initial velocity of the TWIP is zero. The actual location error of the TWIP would then accumulate and become larger and larger due to the errors in the forward speed and the yaw rotational speed.
To reduce the initial speed error, a one-to-one smooth mapping s = φ(t) is introduced. Then, the trajectory tracking target is given by Γ̃(t) = (x_o(φ(t)), y_o(φ(t))). Since the tangent vector of Γ(s) is a unit vector, it follows that the forward speed target is φ̇(t). Thus, in order that the initial speed error is zero and the target does not jump, the function φ(t) is required to satisfy, in particular, φ̇(0) = 0. Such a function can be chosen in different ways. For example, if the speed of the TWIP is expected to be zero when the motion task is finished, the function φ(t) can be chosen to satisfy φ̇(t) = α t e^{−βt}. In this case, α > 0, β > 0 and γ > 0 are parameters to be determined from ∫_0^{+∞} φ̇(t) dt = l and φ(0) = 0.
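For the example profile φ̇(t) = α t e^{−βt}, the normalization ∫_0^{+∞} φ̇(t) dt = l gives α/β^2 = l, that is, α = lβ^2, and integrating with φ(0) = 0 gives φ(t) = l(1 − (1 + βt)e^{−βt}). The short sketch below evaluates both quantities; the function names and the numerical values of l and β are hypothetical and serve only as an illustration.

```python
import math


def speed_target(t, l, beta):
    """Forward speed target: derivative of phi, equal to l * beta**2 * t * exp(-beta * t)."""
    return l * beta ** 2 * t * math.exp(-beta * t)


def arc_length_target(t, l, beta):
    """phi(t) = l * (1 - (1 + beta*t) * exp(-beta*t)), the integral of speed_target from 0 to t."""
    return l * (1.0 - (1.0 + beta * t) * math.exp(-beta * t))


if __name__ == "__main__":
    l, beta = math.pi, 0.5          # hypothetical path length and decay rate
    for t in (0.0, 1.0 / beta, 10.0, 40.0):
        print(t, speed_target(t, l, beta), arc_length_target(t, l, beta))
```

The speed target starts from zero, peaks at t = 1/β with value lβ/e, and decays to zero while φ(t) approaches l, so a smaller β gives a slower motion over the same path length.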

Controller Design Based on Curvature Tracking and Optimal Control
The control problems of a TWIP can be roughly classified into two categories: trajectory planning and controller design, which are usually studied separately in the literature. In this section, trajectory planning plays a very important role in the controller design.

A Dynamical Trajectory Tracking Target
Note that the dynamics equation (3)

Theorem 4.1 Let Γ(s) = (x_o(s), y_o(s)), s ∈ [0, l] be a given trajectory curve, k_o(s) be the relative curvature of Γ(s), and v(t) be the actual forward speed of the TWIP. Then, the dynamical trajectory tracking target for implementing the motion task of walking along the given curve Γ(s) can be designed as ṽ(t) = φ̇(t), ω̃(t) = k_o(φ(t)) v(t), where s = φ(t) is a one-to-one smooth mapping as required in Lemma 3.1.
As a matter of fact, ṽ is a pre-determined tracking target, which can be pre-adjusted for better control effect in an actual problem. In contrast, the so-called dynamical tracking target ω̃ depends on the state through the actual forward speed, and therefore varies dynamically from moment to moment.
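The benefit of a state-dependent yaw target can be illustrated on the chassis kinematics alone. The sketch below is not the paper's simulation: it assumes a circular target path, perfect yaw-rate tracking, and a hypothetical first-order lag on the forward speed as a stand-in for forward-speed tracking error. It compares the dynamical target k_o v(t) with a pre-planned target k_o ṽ(t) and reports the largest radial deviation from the circle.

```python
import math


def simulate(dynamic, radius=1.0, l=math.pi, beta=1.0, lag=0.5, dt=0.002, t_end=20.0):
    """Drive a unicycle chassis along a circle and return its largest radial deviation.

    dynamic=True  uses the yaw target k_o * v(t) with the actual forward speed v.
    dynamic=False uses the pre-planned yaw target k_o * v_target(t).
    The first-order lag on v is a hypothetical stand-in for forward-speed tracking error.
    """
    k_o = 1.0 / radius
    x, y, theta, v = 0.0, 0.0, 0.0, 0.0       # start at rest, tangent to the circle
    cx, cy = 0.0, radius                      # circle centre
    worst = 0.0
    for i in range(round(t_end / dt)):
        t = i * dt
        v_target = l * beta ** 2 * t * math.exp(-beta * t)
        omega = k_o * (v if dynamic else v_target)   # yaw rate assumed to track its target exactly
        theta_mid = theta + 0.5 * omega * dt         # midpoint heading for the position update
        x += v * dt * math.cos(theta_mid)
        y += v * dt * math.sin(theta_mid)
        theta += omega * dt
        v += dt * (v_target - v) / lag               # forward speed lags behind its target
        worst = max(worst, abs(math.hypot(x - cx, y - cy) - radius))
    return worst


if __name__ == "__main__":
    print("dynamic target   :", simulate(True))
    print("nondynamic target:", simulate(False))
```

With the dynamical target the ratio ω/v equals the curvature of the circle at every instant, so the chassis stays on the circle whatever the forward-speed error is; with the pre-planned target the heading runs ahead of the position and the path leaves the circle.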

Optimal Integral Sliding Mode Control Design
Because Eqs. (6) and (7) are decoupled, the controllers u_1(t − τ) and u_2(t − τ) can be designed separately. Let ΔAX(t) denote the modeling error of the linearized system. If slow motion of the TWIP is considered, system (6) can be rewritten as Eq. (22), where the error part ΔAX(t) is combined with σ_1^0(t) into σ_1(t). Let X̃, whose last entry is ṽ, be the trajectory tracking target vector to be designed according to the given motion task, Y(t) = X(t) − X̃(t) be the error vector of system (22), and η_1(t) := AX̃ − dX̃/dt. Then, the tracking error of system (22) is governed by Eq. (23).
In order to have a small tilt angle of the pendulum, so that the control design can be made on the basis of the linearized system, a quadratic performance criterion J with a large weight on the tilt angle is introduced in Eq. (24), where Q_0, Q_1 are nonnegative definite symmetric matrices, R is a positive definite symmetric matrix, and t_f (> 2τ) is the terminal time of the control. With a large weight on the tilt angle error in J, the tilt angle error can be forced to be small enough when an optimal control is applied. Hence, the linearization error is small and can be considered as bounded. Moreover, for clarity, let ṽ = φ̇(t) be the tracking target of the forward speed v(t) with φ̇ = lβ^2 t e^{−βt}. Then v(t) becomes small if the target speed is made small by using a small number β. It follows that the yaw rotational speed ω(t) is small enough, and consequently, ΔAX(t) becomes small enough, when the quadratic performance criterion (24) is minimized. In this way, there is a constant D_1 > 0 that bounds σ_1(t). Usually, it is not an easy task to solve the Riccati differential equation for the LQR optimal control problem when the terminal time t_f in Eq. (24) is finite.
For any given sufficiently small ε > 0, a real number β may be chosen such that the required bound in terms of ε holds. This means that the controller can be designed simply by solving an algebraic Riccati equation if the performance criterion (24) is replaced by a suitably modified criterion. Here, the trajectory tracking target design is very important in the turning motion control problem of the TWIP. According to Lemma 3.1, the designed trajectory tracking target with zero initial velocity and zero terminal velocity is in agreement with the actual problem in most cases. This leads to a small location error in the whole process and a simple algebraic Riccati equation to be solved. Moreover, a small forward velocity of the trajectory tracking target can be designed to keep the tilt angle of the pendulum small enough by choosing a small parameter β.
We are now in a position to design the controller by following the method proposed in [27]. Firstly, by introducing the integral transformation given by Eq. (28), the delayed system (23) is transformed into an equivalent delay-free system, Eq. (29), where B_0 = e^{−Aτ}B, and the new state Z(t) and the error state Y(t) satisfy the relationship [27] Y(t + τ) = e^{Aτ}Z(t).
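The reduction to a delay-free system and the algebraic Riccati step can be illustrated on a scalar example. The sketch below is an assumption-laden illustration rather than the controller of [27]: it assumes the transformation has the standard reduction form Z(t) = Y(t) + ∫_{t−τ}^{t} e^{A(t−τ−s)} B u(s) ds, which is consistent with B_0 = e^{−Aτ}B and Y(t + τ) = e^{Aτ}Z(t) in the disturbance-free case, and it uses an infinite-horizon cost ∫ (q z^2 + r u^2) dt. The unstable scalar plant and all numerical values are hypothetical.

```python
import math


def scalar_are_gain(a, b0, q, r):
    """Positive root p of 2*a*p - (b0**2 / r) * p**2 + q = 0 and the LQR gain k = b0 * p / r."""
    p = r * (a + math.sqrt(a * a + b0 * b0 * q / r)) / (b0 * b0)
    return p, b0 * p / r


def simulate_predictor_lqr(a=1.0, b=1.0, tau=0.1, q=1.0, r=1.0,
                           y0=0.2, dt=0.002, t_end=10.0):
    """Simulate dy/dt = a*y + b*u(t - tau) under the feedback u(t) = -k * z(t)."""
    n = round(tau / dt)                     # input delay in samples
    b0 = math.exp(-a * tau) * b             # input gain of the delay-free system
    _, k = scalar_are_gain(a, b0, q, r)
    # Quadrature weights for z(t) = y(t) + int_{t-tau}^{t} exp(a*(t-tau-s)) * b * u(s) ds
    weights = [math.exp(-a * i * dt) * b * dt for i in range(n)]
    past_u = [0.0] * n                      # past_u[i] = u(t - tau + i*dt); no input before t = 0
    y = y0
    for _ in range(round(t_end / dt)):
        z = y + sum(w * u for w, u in zip(weights, past_u))
        u_now = -k * z                      # LQR feedback on the delay-free state
        u_delayed = past_u[0]               # input that reaches the plant now
        y += dt * (a * y + b * u_delayed)   # explicit Euler step of the delayed plant
        past_u = past_u[1:] + [u_now]
    return y, k


if __name__ == "__main__":
    y_final, k = simulate_predictor_lqr()
    print("gain:", k, "final state:", y_final)
```

In this scalar setting the state z obeys dz/dt = a z + b_0 u(t) without delay, so the gain obtained from the algebraic Riccati equation can be applied directly, and the stored input history plays the role of the predictor.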
With Q̃_1 = (e^{Aτ})^T Q_1 e^{Aτ} and Q̃_0 = (e^{Aτ})^T Q_0 e^{Aτ}, the quadratic performance criterion can be rewritten in terms of Z(t), apart from a part that is fixed because the control does not take effect when t ∈ [0, τ).
The nominal system of (29) is given by Eq. (32). According to [27], the optimal control of system (32) that minimizes the quadratic performance criterion J is expressed through P_z(t) ∈ R^{n×n} and b_z(t) ∈ R^n, which are the solutions of the associated Riccati differential equations. In order to design a robust controller against the effect of σ_1(t + τ), the optimal state of the nominal system (32) is chosen as the integral sliding mode manifold. In the sliding mode function, G ∈ R^{m×n} is a constant matrix, GB_0 is assumed nonsingular, and Z*(0) is the initial value of the nominal system (32). The zero set of the sliding mode function is the sliding mode manifold, which is actually the optimal state of the nominal system (32). Thus, according to Eq. (30), the delayed robust optimal controller of system (23) is given by Eq. (37), in which Ȳ(t) is the predictor state of Y(t).
Similarly, for Eq. (7), assume that there is a constant D_2 bounding σ_2(t), and define a sliding mode function in which ĝ ≠ 0 is a constant and ω̃ is the tracking target. Then, the delayed robust control of system (7) is designed as Eq. (40), where u_10(t − τ) = (dω̃/dt)/B_2 is an open-loop control. The existence of the sliding mode motion and the finite-time reachability of the sliding mode manifold can be proved in a similar way as in [27]. The optimal state of system (32) and the open-loop state of system (5) actuated by the open-loop control u_10(t − τ) are the optimal states for implementing the original turning motion task. The linearization errors and system uncertainties are dealt with by the switched control parts u_21(t − τ) and u_11(t − τ). Thus, from a theoretical perspective, the original turning motion control problem of the TWIP can be well implemented by using the proposed control method.
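The split of the yaw controller into an open-loop part and a switched part can be sketched on a scalar model. The example below is only an illustration under stated assumptions: it takes the yaw subsystem to be of the form dω/dt = B_2 u + σ_2(t) with |σ_2| bounded by D_2, it omits the input delay and the predictor for brevity, it uses a prescribed target instead of the state-dependent one, and it uses the sliding function s = ĝ(e(t) − e(0)), which is what an integral sliding function reduces to when the nominal error dynamics are de/dt = 0. All names and numerical values are hypothetical.

```python
import math


def sign(x):
    """Sign of x as -1, 0 or 1."""
    return (x > 0) - (x < 0)


def max_tracking_error(robust, b2=2.0, d_bound=0.5, mu=0.1, g=1.0, dt=0.001, t_end=10.0):
    """Track omega_target(t) = 0.5*sin(t) for the plant d(omega)/dt = b2*u + sigma(t).

    robust=True adds the switched term; robust=False applies the open-loop part only.
    Returns the largest tracking error observed over the run.
    """
    omega = 0.0
    e0 = omega - 0.0                              # initial tracking error (target starts at 0)
    worst = 0.0
    for i in range(round(t_end / dt)):
        t = i * dt
        target, target_dot = 0.5 * math.sin(t), 0.5 * math.cos(t)
        sigma = 0.4 * math.sin(3.0 * t)           # bounded disturbance, |sigma| <= d_bound
        e = omega - target
        s = g * (e - e0)                          # sliding function for nominal dynamics de/dt = 0
        u_open = target_dot / b2                  # open-loop part, cancels the target derivative
        u_switch = -(d_bound + mu) * sign(g * b2 * s) / abs(b2) if robust else 0.0
        omega += dt * (b2 * (u_open + u_switch) + sigma)
        worst = max(worst, abs(omega - 0.5 * math.sin(t + dt)))
    return worst


if __name__ == "__main__":
    print("open loop only  :", max_tracking_error(False))
    print("with switch term:", max_tracking_error(True))
```

Because the switching gain D_2 + μ exceeds the disturbance bound, the product of s and its derivative is at most −μ|ĝ||s|, so the error stays in a narrow band around zero whose width is set by the time step, while the open-loop part alone lets the disturbance integrate into the yaw speed.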
In summary, the controller for the turning motion of a TWIP can be designed as follows.
Moreover, let G = [0, 1, 0, 1.8978], ĝ = −1/12.5, μ_1 = μ_2 = 0.1 and D_1 = D_2 = 0.51 be used in the above switched control. Then, all the quantities required in the delayed trajectory tracking controllers (37) and (40) are available. Now, the time histories of all the state variables can be simulated. Figure 3 shows that the tilt angle of the pendulum becomes small enough and is stabilized after a short transient. Figures 4 and 5 present the time histories of the tracking errors of the forward speed and the yaw rotational speed, respectively. Figure 6 shows that the actual motion trajectory is extremely close to the target trajectory curve when the dynamical tracking target ω̃ = k_o(t)v(t) is applied, whereas the location error is very large if the rotational speed target is chosen as the nondynamic target ω̃ = (ẋ_o ÿ_o − ẍ_o ẏ_o)/(ẋ_o^2 + ẏ_o^2). The so-called nondynamic target is pre-designed from the pre-determined trajectory curve and cannot vary with the state variables. In this case, the cumulative error becomes larger and larger and cannot be reduced. The simulation results indicate that a given turning motion task of the TWIP is well achieved by using the proposed control method, and that the vibration amplitudes of the state variables would be magnified markedly if the small input delay were ignored in designing the controllers.

Conclusions
A special feature of this paper is the application of the theory of planar curves in designing the controller. Two major points can be deduced from the proposed control design. One is that, in order to keep the TWIP walking along the target trajectory curve accurately, it is important to have the curvature of the target trajectory curve well tracked. Thus, the product of the curvature of the target curve and the forward speed of the TWIP is chosen as a dynamical yaw rotational target. When the yaw rotational target is designed in such a way, the tracking error of the position can be greatly reduced compared with the use of a nondynamical tracking target. The other is that there are almost no limits on the forward speed target if the magnitude of the motion speed over the whole motion process is not a concern. The use of the forward speed as a target allows the accumulated error caused by the initial speed error to be reduced dramatically, and the LQR-based optimal trajectory controller is determined simply by solving algebraic Riccati equations. Numerical simulations show that, with the designed controller, not only is the given trajectory curve well tracked, but the inverted pendulum is also well stabilized.