Stabilization of the cart pole system: by sliding mode control

This paper presents a control strategy designed as a combination of a PD controller and a twisting-like algorithm to stabilize the damped cart pole system, provided that the pendulum is initially placed within the upper-half plane. To develop the strategy, the original system is transformed into a fourth-order chain-of-integrators form, where the damping force is included through an additional nonlinear perturbation. The strategy consists of bringing the position and velocity of the pendulum to within a compact region by applying the PD controller while, simultaneously, the twisting-like algorithm brings the system state variables to the origin. The corresponding convergence analysis is done using several Lyapunov functions. The control strategy is illustrated with numerical simulations.


Introduction
The cart pole system (CPS), also known as the inverted pendulum on a cart, is among the classic mechanical systems that have been studied extensively in control theory during the last four decades. This system was originally used as a benchmark for educational purposes; see [1,6,15,33,39,43,44,46]. Through the years, this system has attracted attention as an important underactuated mechanical system, because the pendulum angular acceleration cannot be directly controlled [18,42]. As the CPS dynamics resembles that of many underactuated robot systems, it has been studied as a simplified model of such systems (see [7,27,30,38,40,43]). This system is made up of a cart that moves forward and backward along a straight line and has a free-moving pendulum hanging from it. The cart is moved by a horizontal force, which is the input of the system. It is well known that several control strategies that were initially conceived for fully actuated systems cannot be applied to drive this system. Actually, the system is not feedback linearizable [23,42]; also, the system loses controllability when the pendulum passes through the horizontal plane [18,41]. However, the system can be controlled when located near the unstable equilibrium point, by applying the direct pole placement procedure [25,42].
In the present authors' opinion, there are two important problems related to the control of the CPS. The first consists of swinging the pendulum up from the hanging position to the upright position. In general, this problem has been tackled by using methods based on energy control and hybrid schemes [2,3,8,9,22,26,31,44,45]. The second issue arises when the pendulum is located somewhere in the upper-half plane, and the goal is to bring it to its unstable equilibrium point. Usually, this control challenge has been solved by applying nonlinear control tools. A full review of these tools is beyond the scope of this work; however, we mention, for instance, the well-known energy-based controller or Lyapunov-based techniques, and methods based on the feedforward or the backforward forms, in conjunction with the saturation function or the bounded function approaches [3–5,13,14,22,30,45].
In this work, we propose a control strategy to stabilize the damped CPS around its unstable equilibrium point, assuming that the pendulum starts moving from some position located inside of the upper-half plane and that the damping coefficient is known. Considering the damping force in the non-actuated coordinate makes controlling this system a challenge, because this force can easily destroy the system stability [21,32,47]. In addition, managing this force is a very difficult task. Several works have neglected the damping coefficient for this reason.
To develop the proposed control strategy, we transformed the original system into a fourth-order chain of integrators, with an additional nonlinear perturbation. Then, we proposed the strategy as a combination of a linear PD controller and a modified version of the twisting algorithm [16,28,34–36], where the first acts on the pendulum position and velocity, while the second brings the whole system state to the origin. The corresponding stability analysis was carried out by using several Lyapunov functions. Numerical simulations were carried out to assess the performance of the proposed control strategy. Finally, we mention that this work was inspired by [35,36]. However, our strategy, when the controller gains are adequately selected, may guarantee global convergence, as long as the pendulum position is initialized in the upper-half plane, though it has the disadvantage of being less robust in the presence of unmodeled perturbations. During the development of this study, we use the following functions: The remainder of the paper is organized as follows. The nonlinear model of the system is presented in Sect. 2. In Sect. 3, we develop the control strategy. The numerical simulations and the conclusions are in Sects. 4 and 5, respectively.

Problem statement
Consider the damped inverted pendulum mounted on a cart. This system can be described by the following set of normalized differential equations [42]: where x is the normalized cart displacement, θ is the angle between the pendulum and the vertical, f is the normalized force applied to the cart, which is also the input to the system, and μ > 0 is a scalar constant dependent on both the cart and pendulum masses. The pendulum viscous friction is considered as a linear function of the angular velocity, dθ̇, with d ≥ 0.
In this work, the physical parameters μ and d are given by [42]: where M and m stand for the cart and pendulum masses, respectively, L is the pendulum length, g is the gravity constant, and γ is the actual dissipation coefficient present in the non-actuated coordinate θ. The damping force present in the actuated coordinate is neglected in order to simplify the methodology presented here. It is important to remark that this force can be easily compensated by using any adaptive control algorithm [20,29].
The control objective consists of bringing the pendulum to its unstable equilibrium point, under the following important considerations:
C1) The system is initialized inside of the following set:

C2) The state variables are available and the parameters are known.
It should be noticed that C1 is not very restrictive, because the pendulum is only assumed to be somewhere inside of the upper-half plane; in fact, this can be easily accomplished by using some suitable controller such as those proposed in [26].
Differential equations are understood in the Filippov sense [19] in order to allow the use of discontinuous control signals. Filippov solutions coincide with the usual solutions when the right-hand sides are Lipschitzian. It is also assumed that all considered inputs allow the existence of solutions and their extension to the whole semi-axis t ≥ 0.

System transformation
After introducing the following feedback law: the system can be rewritten as: Now, in order to represent the system (3) as a fourth-order chain of integrators plus an additional nonlinear perturbation, we define the following change of coordinates: Then, the system (3) can be rewritten as: and υ is the new control variable, defined as: It is important to remark that: Now, to dominate the undesirable term dβ(z_3)z_4, found in the second equation of (5), we use the following change of coordinates and time scaling: where the rescale parameter is a strictly positive free parameter. Hence, the system (5) can be written in the new coordinates as: where ρ(q) is a vanishing perturbation defined by: Here, the symbol "dot" stands for differentiation with respect to the dimensionless time τ. We must underscore that the strictly positive rescale parameter can be tuned as desired. Finally, the above system can be expressed in a compact form as: where v ∈ R, q ∈ R^4 and f(q): R^4 → R^4.

Control of the cart pole system
The control law is proposed as: where v_e(q) is a linear controller devoted to bringing the states q_3 and q_4 close enough to the origin, and v_s(q) is a bounded controller designed using the twisting sliding mode algorithm. The linear part of the controller, v_e, is selected as: Let us introduce the following auxiliary variables: where the constants k_i > 0 should be selected such that they satisfy the following: where δ_i > 0, i ∈ {1, 2, 3}. The inequalities in (14) are referred to in the sequel as assumption A1. Let us propose v_s as a discontinuous injection based on the twisting control algorithm [28,34–36]. That is, where λ_1 > λ_2 > 0. Let us synthesize the main result of this work in the following theorem: where Under the assumption that the control parameters k_i > 0, with i ∈ {1, 2, 3}, and λ_1 > λ_2 > 0 satisfy the inequalities in (14), the closed-loop system is asymptotically stable. In particular, the variable s_1 converges to zero in finite time.
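The discontinuous injection v_s above belongs to the twisting family. As a rough illustration only, the following Python sketch simulates the textbook twisting law, −λ_1 sign(s_1) − λ_2 sign(s_2), on a pure double integrator. This is an assumed stand-in and not the exact law (15) nor the closed-loop system of this paper: the perturbation is dropped, the plant is reduced to two states, and the names `sign`, `twisting` and `simulate_twisting`, the step size and the horizon are illustrative choices. Only the gains λ_1 = 6 and λ_2 = 0.8 are taken from the tuning example in the text.

```python
def sign(x: float) -> int:
    """Signum function with sign(0) = 0."""
    return (x > 0) - (x < 0)


def twisting(s1: float, s2: float, lam1: float, lam2: float) -> float:
    """Textbook twisting injection (assumed form, lam1 > lam2 > 0)."""
    return -lam1 * sign(s1) - lam2 * sign(s2)


def simulate_twisting(s1=1.0, s2=0.0, lam1=6.0, lam2=0.8, dt=1e-4, t_end=12.0):
    """Explicit-Euler simulation of ds1/dt = s2, ds2/dt = twisting(s1, s2).

    The double integrator is an assumption made for illustration;
    no perturbation term is included.
    """
    steps = int(round(t_end / dt))
    for _ in range(steps):
        v = twisting(s1, s2, lam1, lam2)
        # Both states are updated from the values at the start of the step.
        s1, s2 = s1 + dt * s2, s2 + dt * v
    return s1, s2


final_s1, final_s2 = simulate_twisting()
print(f"|s1| = {abs(final_s1):.2e}, |s2| = {abs(final_s2):.2e}")
```

With a fixed-step integrator the states do not reach zero exactly; they end up chattering in a small neighborhood of the origin whose size shrinks with the step size.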
Proof Following the application of the linear control, and after some simple algebra, the dynamics of s_1 and s_2 become: Then, the system composed of (16) and the last two equations of (9) reads as: To be able to carry out the convergence analysis, we first analyze the boundedness of the states q_3 and q_4 when the system (17) is in closed loop with the twisting controller, in order to assure the boundedness of the vanishing nonlinear perturbation ρ(q). That is, the system (17), in closed loop with (15), reads as: where Before formally presenting the corresponding proof, we introduce the following auxiliary lemma:
Lemma 1 Consider the following second-order system: where the constants k_i > 0, for i ∈ {p, d}, with |υ| ≤ ῡ. Then, there exists a finite time, t_0 > 0, such that: where δ_0 > 0 is sufficiently small. The proof of this lemma is omitted because it is straightforward.
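The statement of Lemma 1 can be checked numerically. The sketch below is an illustration under stated assumptions, not the lemma's exact setting: it takes the second-order system to be y'' = −k_p y − k_d y' + υ(t) with |υ(t)| ≤ ῡ, picks gains with real poles (k_p = 2, k_d = 3), and uses a switching input of amplitude ῡ. For such real-pole gains the forced response satisfies |y| ≤ ῡ/k_p, so the code checks that |y(t)| ≤ ῡ/k_p + δ_0 after a finite time t_0. The function name `simulate_pd`, the input signal and all numerical values are hypothetical.

```python
import math


def simulate_pd(y=5.0, dy=-3.0, kp=2.0, kd=3.0, u_bar=1.0,
                dt=1e-3, t_end=30.0, t0=15.0):
    """Euler simulation of y'' = -kp*y - kd*y' + u(t), with |u(t)| <= u_bar.

    Returns the largest |y| observed for t >= t0. The system form and the
    square-wave input are assumptions made for illustration.
    """
    worst = 0.0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        # Bounded switching input of amplitude u_bar.
        u = u_bar if math.sin(3.0 * t) >= 0.0 else -u_bar
        y, dy = y + dt * dy, dy + dt * (-kp * y - kd * dy + u)
        if t >= t0:
            worst = max(worst, abs(y))
    return worst


worst_after_t0 = simulate_pd()
print(f"max |y| for t >= t0: {worst_after_t0:.3f}")
```

The initial condition is deliberately far from the origin, so the check only passes because the PD terms pull the state into the bounded region within the finite time t_0.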
According to this lemma, the last two equations of (17) satisfy the following inequality: where t_0 is a finite period of time and δ_0 is a small positive constant. That is, q_3 and q_4 are bounded for t ≥ t_0. This fact assures that the proposed closed-loop system is Lipschitzian, implying that the states s_1 and s_2 remain bounded during a finite time. Hence, there is no finite escape time; see [24]. On the other hand, from the relations (7) and (10), the following inequality is fulfilled: Having shown that q_3 and q_4 are uniformly bounded after some finite time, we are in a position to finally perform the convergence analysis of the whole system, using a Lyapunov function that is continuous and differentiable almost everywhere. Before proceeding, we recall that these kinds of Lyapunov functions have been used since the late nineties to prove the stability of discontinuous systems and systems with solutions understood in Filippov's sense; see, for example, [10–12]. Let us introduce our Lyapunov function as: with the state vector p = (s_1, s_2, q_3, q_4), whose time derivative along the trajectories of the system (17) is almost everywhere given by: Notice that the derivative of the Lyapunov function (21) exists for all values of s_1 except on the set of measure zero given by s_1 = 0.
Notice that W(p) can be expressed, after using (16), as follows: By using the inequality |q_4 v_s| ≤ (q_4² + v_s²)/2, we have that (22) can be upper bounded as: It is easy to see, after some simple algebra, that the following inequality holds for all t ≥ t_0: where, for simplicity, we introduce ρ, such that After substituting (24) into the relation (23), we obtain the following inequality: Then, according to the conditions in assumption A1, we have that for t ≥ t_0 the following inequality is fulfilled: From the above, it follows that V_T(p) < V_T(p(0)) and, from its own definition, V_T is radially unbounded and differentiable everywhere, except when s_1 = 0. Consequently, the state vector p is bounded. On the other hand, as V_T is bounded from below, with strictly negative definite time derivative, V_T converges and p has a limit. Also, ṗ is bounded, according to (18). That is, p is uniformly continuous. Now, integrating both sides of the last inequality and using simple algebra, we can claim that the inequality (27) holds for t > t_0. It implies that the signals s_2 and q_4 belong to L_1 and L_2, respectively. According to Barbalat's lemma, we have that s_2 → 0 and q_4 → 0 as t → ∞. We proceed to show that s_1 converges to zero in finite time. As the values of |s_2| and |q_4| decrease continuously toward the origin, there always exist a finite time t_1 > 0 and a constant μ > 0 such that λ_1 > | (q(t))| + μ, for all t > t_1, where: because λ_1 >> λ_2. Hence, the first equation of (18) can be read as: Evidently, the dynamics of s_1 coincides with the dynamics of a first-order sliding mode.
To see the convergence of s_1, we propose V_1 = s_1²/2. According to (28), we have: From the above inequality, we conclude that s_1 → 0 in finite time. That is, there is a time t_2 > t_1 such that s_1(t) = 0 for all t > t_2. To prove that q_3 converges to zero, we introduce the auxiliary variable z = k_3 q_4 + k_2 q_3 − s_2, whose time derivative can be written, after using simple algebra, as: According to (20) and the fact that s_1 → 0 and q_4 → 0, the last differential equation turns out to be ż = −k_1 z/k_2. It implies that z → 0 and q_3 → 0. Therefore, the closed-loop system (18) asymptotically converges to the origin, if the control gains are selected in accordance with A1.
Remark 1 The function V_T is continuous but not locally Lipschitz. Therefore, the usual version of the traditional Lyapunov theorem cannot be applied [10,17]. However, it can be shown that function V_T(p) is  (17), implying that V_T(p) is differentiable almost everywhere, monotone decreasing and converges to zero. These are the conditions needed by the theorem of Zubov [28,37].
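The finite-time convergence of s_1 claimed in the proof can be made explicit with the standard first-order sliding mode argument. The derivation below is a sketch that assumes the dominance condition λ_1 > | · | + μ yields s_1 ṡ_1 ≤ −μ|s_1| for t ≥ t_1; under that assumption it also gives an upper estimate of the reaching time t_2.

```latex
\begin{aligned}
V_1 &= \tfrac{1}{2}s_1^2, \qquad
\dot V_1 = s_1\dot s_1 \le -\mu\,|s_1| = -\mu\sqrt{2V_1}, \\
\frac{d}{dt}\sqrt{V_1} &\le -\frac{\mu}{\sqrt{2}}
\quad\text{whenever } V_1 > 0, \\
\sqrt{V_1(t)} &\le \sqrt{V_1(t_1)} - \frac{\mu}{\sqrt{2}}\,(t - t_1), \\
t_2 - t_1 &\le \frac{\sqrt{2V_1(t_1)}}{\mu} = \frac{|s_1(t_1)|}{\mu}.
\end{aligned}
```

Under this assumption, s_1 reaches zero no later than |s_1(t_1)|/μ time units after t_1 and remains there afterwards.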
Remark 2 Should the damping parameter d be very large, assumption A1 becomes a strong condition, because assuring the positiveness of the constants δ_i and k_i, i ∈ {1, 2, 3}, in a way that the inequalities in (14) hold, requires the rescale parameter to be sufficiently small, which converts the controller into a high-gain controller; see (9). Another way to see it is that when the damping force is very strong, strong control actions must be taken.
It is important to note that the proposed controller has a very simple structure and does not present singularities, if the system is initialized inside of the upper-half plane.
Tuning the control parameters
The correct performance of the control strategy requires tuning the control parameters according to the restriction (23). To illustrate this tuning, we fix the pendulum length, mass, and damping as L = 0.35 (m), m = 0.250 (kg), and γ = 4 (kg m²/s), respectively. Then, according to the expression given in comment C1, the normalized damping coefficient is d = 0.9. Now, fixing the control gains as k_1 = 0.9, k_2 = 3.5, k_3 = 4, λ_1 = 6, and λ_2 = 0.8, and restricting the rescale parameter to the interval (0, 0.404), it is easy to see in a plot that the inequalities in (14) hold.
Summarizing, given d > 0 and δ_i ≈ 0.1, we need to find an admissible parameter vector Q ∈ R^6_+, formed by k_1, k_2, k_3, λ_1, λ_2, and the rescale parameter, fulfilling the restrictions given in (23). This problem can be solved using any numerical optimization program.
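One simple way to search for an admissible parameter vector is random sampling over a box, keeping the first sample whose constraint margins all exceed δ. The Python sketch below shows only the search loop. The function `placeholder_margins` is a hypothetical stand-in and does not implement the actual inequalities of assumption A1; apart from λ_1 > λ_2, its margins are invented for illustration and would have to be replaced by the real conditions. The names `find_admissible`, `scale` and the sampling bounds are likewise assumptions.

```python
import random


def find_admissible(margins, bounds, delta=0.1, trials=20000, seed=0):
    """Random search for a parameter tuple whose margins all exceed delta.

    margins: callable mapping a parameter tuple to a list of margin values,
             each inequality being written as margin > delta.
    bounds:  list of (low, high) sampling intervals, one per parameter.
    Returns the first admissible tuple found, or None if none is found.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        q = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        if all(m > delta for m in margins(q)):
            return q
    return None


def placeholder_margins(q):
    """HYPOTHETICAL constraints, NOT the paper's inequalities (14)."""
    k1, k2, k3, lam1, lam2, scale = q
    return [lam1 - lam2, k2 * k3 - k1, 1.0 - scale * (k1 + k2 + k3)]


# Sampling box for (k1, k2, k3, lam1, lam2, scale); all values are illustrative.
bounds = [(0.1, 5.0)] * 3 + [(0.1, 10.0), (0.1, 2.0), (0.01, 1.0)]
q_found = find_admissible(placeholder_margins, bounds)
print(q_found)
```

Random sampling is used here only because it needs no derivatives and no external library; any constrained optimizer could play the same role.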

Numerical simulations
In order to verify the proposed controller performance, we carried out some numerical simulations, where the above proposed control gains were used, with the rescale parameter set to 0.4. To make this experiment more interesting, we assume that the knowledge of the damping force has an accuracy of 85%. We ran two experiments, each with its own initial conditions. As we can see, the control strategy is able to drive the system to the origin after 7 (s) have elapsed, even when the value of the damping coefficient, d, is only partially known.
To provide an idea of how good the proposed control strategy (OC) is, we compared it with the control technique proposed by Riachy et al. in [35], here referred to as RC. The control parameters of RC were tuned heuristically but, to be fair, we tried to find the values that give the best transient response. The initial conditions were fixed as (θ = 0.9, 0, 0, 0). The obtained results are shown in Fig. 2, where we can see that the closed-loop response of the proposed control strategy is as good as that of RC. Furthermore, our strategy presents a better behavior in the angular variable than the RC strategy. However, the cart displacements in our strategy are larger than those in RC. Please keep in mind that this is a numerical comparison; a formal comparison is beyond the scope of this work, as is a comparative study between our control strategy and others found in the literature. We must underscore that all the simulations were carried out in the actual coordinates of the pendulum system. Finally, Fig. 3 shows numerically the asymptotic behavior of the Lyapunov function V_T(p) and its derivative; the plot on the left side corresponds to the first set of initial conditions, and the one on the right side to the second. For this experiment, we used the same setup as before, but in normalized time, with the following two sets of initial conditions: (s_1 = 1, s_2 = 0, q_3 = −5, q_4 = 1) and (s_1 = 0, s_2 = 0, q_3 = 1, q_4 = 4). The numerical simulations shown in this figure are the expected results, because V_T converges asymptotically to zero and V̇_T is always strictly negative, with V̇_T tending to −4.9.

Conclusions
In this work, we introduced a control strategy, based on a PD controller in conjunction with a twisting-like algorithm, to solve the stabilization of the damped cart pole system, assuming that the pendulum is initialized somewhere inside of the upper-half plane. To this end, we first applied some nonlinear transformations to the original pendulum system to express it as a fourth-order chain of integrators, with an additional perturbation that vanishes at the origin. The PD controller was designed to bring the pendulum position and its velocity inside of a compact region simultaneously. At the same time, the twisting-like algorithm drives the whole system state to the origin. For the convergence analysis, we used several Lyapunov functions, thereby assuring that our strategy converges asymptotically once the pendulum position and its velocity are inside of the compact region. The effectiveness and robustness of the strategy were tested by running numerical simulations, where uncertainties in the parameters' values were included. The obtained results allow us to claim that the performance of our controller is satisfactory.