Compensator-critic structure-based event-triggered decentralized tracking control of modular robot manipulators: theory and experimental verification

This paper presents a novel compensator-critic structure-based event-triggered decentralized tracking control method for modular robot manipulators (MRMs). On the basis of the subsystem dynamics under the joint torque feedback (JTF) technique, the proposed tracking error fusion function, which includes the position error and the velocity error, is used to construct the performance index function. By analyzing the dynamic uncertainties, a robust controller based on local dynamic information is designed to compensate for the model uncertainty. Based on the adaptive dynamic programming (ADP) algorithm and the event-triggered mechanism, the decentralized tracking control is obtained by solving the event-triggered Hamilton–Jacobi–Bellman equation (HJBE) with a critic neural network (NN). The tracking error of the closed-loop manipulator system is proved to be uniformly ultimately bounded (UUB) using the Lyapunov stability theorem. Finally, experimental results illustrate the effectiveness of the developed control method.


Introduction
Modular robot manipulators (MRMs) [1,2], which are built from standard modules, adapt to severe working conditions by changing their configurations and by adding or removing modules. Owing to their modularity and light weight, MRMs have potential in numerous unmanned and complex environments, such as aerospace exploration, search-and-rescue operations, and medical assistance. Effective control strategies are therefore needed to ensure security and low energy consumption.
According to the recent literature, the tracking control strategies of MRMs can be classified into centralized control [3,4], distributed control [5,6], and decentralized control [7][8][9]. The centralized control and distributed control are designed by employing the information of all subsystems the complexity of the dynamics and to improve the generality in practice. Zhang et al. [14] presented a modular distributed control technique for MRMs in which the model uncertainties associated with link and payload masses were compensated using joint torque sensor measurements. Nevertheless, the drawback of the mentioned methods is that they ignore the joint optimization of control performance and power consumption, as well as the high energy cost caused by long-running computation and communication. To the best of our knowledge, there have been very few attempts at developing decentralized optimal tracking control methods for robots, in particular decentralized tracking control that integrates adaptive dynamic programming (ADP) and an event-triggered algorithm for MRMs.
Optimal control has received widespread attention from both researchers and engineers since the mid-1950s. As an effective way to solve optimal control problems of nonlinear systems, the ADP algorithm, which was first proposed by Werbos [15], can avoid the "curse of dimensionality". Recently, ADP-based methods have been used to design optimal controllers for continuous-time [16,17] and discrete-time [18,19] nonlinear systems with input/output constraints [20,21], external disturbances [22,23], and mismatched interconnections [24,25]. As the optimal control problems of nonlinear systems have gradually been solved, ADP-based optimal control approaches [26] have been applied to various fields [27,28]. Nevertheless, all the aforementioned control methods were developed based on the time-triggered mechanism, which neglects the large amount of unnecessary computation, communication, and energy cost incurred over a long working time. In the last few years, the event-triggered mechanism [29,30] has been employed to address these problems. Kyriakos et al. [31] proposed a novel optimal adaptive event-triggered control algorithm for nonlinear continuous-time systems. Yang et al. [32] tackled the optimal event-triggered control problem of nonlinear continuous-time systems subject to asymmetric control constraints. Considering interconnected systems, Vignesh et al. [33] presented an approximate optimal distributed control scheme for nonzero-sum games. He et al. [34] designed a decentralized event-triggered control method for nonlinear systems with matched interconnections. For MRM systems, Dong et al. [35] proposed a time-triggered decentralized robust optimal control for MRMs via a critic-identifier structure-based ADP approach. Zhao et al. [36] developed an event-triggered decentralized tracking optimal control approach by employing a local NN observer to estimate unknown model dynamics.
In general, since the components of each MRM module are basically identical in practice, the dynamics of MRMs are usually partially known, for example the specification of the actuators and the reduction ratio. Besides, the training of an NN needs a large amount of online or offline data, which consumes computation, communication, and energy resources. These costs should therefore be taken into account to extend the service time of MRMs. Unfortunately, few ADP-based event-triggered decentralized tracking control approaches for MRMs have been investigated, especially ones that consider model-based real-time compensation of model uncertainties.
Inspired by the above literature, this paper presents an event-triggered decentralized tracking control approach with a compensator-critic structure for MRMs. First, the dynamic model of MRMs, which is described as the integration of all subsystems associated with coupling dynamics, is formulated based on the JTF technique, and a model-based real-time robust compensator is implemented to deal with the model uncertainties. Second, the performance index function, which contains the tracking error and the control torque, is defined, and the system state is sampled according to the event-triggering condition. Based on the ADP algorithm, the event-triggered HJBE is solved by the critic NN, and the event-triggered approximate decentralized optimal tracking control policy is then obtained. By utilizing the Lyapunov stability theorem, the tracking error of the closed-loop manipulator system is proved to be UUB under the proposed control method. Finally, the effectiveness of the proposed compensator-critic structure-based event-triggered decentralized optimal tracking control method is verified by experimental results.
The main contributions of this paper are summarized as follows.
1. We address the ADP-based event-triggered decentralized tracking control problem of MRMs with a compensator-critic structure. On the basis of the JTF technique, the model-based robust compensator and the critic NN are designed to mitigate model uncertainties in real time and to approximate the optimal compensation tracking control policy, respectively.
2. Unlike existing time-triggered control methods [37,38], which ignore the conservation of limited energy resources, this paper proposes a novel compensator-critic structure-based event-triggered decentralized tracking control method for MRMs. It not only makes the actual trajectory of each joint module follow its desired one, but also reduces the computational burden and saves communication and energy consumption at the same time.
The remainder of this paper is arranged as follows. "Dynamic model and preliminaries" sketches the dynamic model and preliminaries of MRM subsystems. In "Compensator-critic structure-based event-triggered decentralized tracking control", the compensator-critic structure-based event-triggered decentralized tracking control of MRMs is proposed, and the stability analysis is given. In

Dynamic model and preliminaries
We consider an n-degree-of-freedom (DOF) serial MRM in which each module consists of a rotary joint with a direct current (DC) motor, a speed reducer, and a joint torque sensor, as shown in Fig. 1. Based on the JTF technique [39], the dynamics of the ith joint subsystem can be modeled as (1), where $I_{ri}$ denotes the rotor moment of inertia about the axis of rotation, $\gamma_i$ is the reduction ratio of the speed reducer, $q_i$ is the joint position, $\dot{q}_i$ and $\ddot{q}_i$ are the joint velocity and acceleration, respectively, $\tau_{ti}$ represents the measurement of the joint torque sensor, $f_{ri}(q_i,\dot{q}_i)$ is the joint friction torque, $Z_i(q,\dot{q},\ddot{q})$ is the dynamic coupling torque among the subsystems, and $\tau_i$ is the control input torque, which is also the motor output torque.
The joint friction torque $f_{ri}(q_i,\dot{q}_i)$ mainly reflects the friction of the motor and the speed reducer. Motivated by [40,41], it is assumed to be a function of the joint position and joint velocity as in (2), where $f_{bi}$ represents the viscous friction coefficient, $f_{si}$ is the static friction, $f_{\tau i}$ denotes a positive parameter corresponding to the Stribeck effect, $f_{ci}$ reflects the Coulomb friction, $f_{pi}(q_i,\dot{q}_i)$ denotes the position dependency of friction and other friction modeling errors, and $\mathrm{sgn}(\cdot)$ is the classical sign function.
Supposing that the nominal values of $f_{bi}$, $f_{si}$, $f_{\tau i}$, and $f_{ci}$ are close to their actual values, then according to the linearization scheme [41], the friction model (2) can be approximated by (3), where $\hat{f}_{bi}$, $\hat{f}_{si}$, $\hat{f}_{\tau i}$, and $\hat{f}_{ci}$ are the approximate values of $f_{bi}$, $f_{si}$, $f_{\tau i}$, and $f_{ci}$, respectively, and
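As an illustration of the friction terms listed above, the following minimal sketch evaluates a Stribeck-type friction torque. It assumes the commonly used form $f_{ri} = f_{bi}\dot{q}_i + \left(f_{ci} + f_{si} e^{-f_{\tau i}\dot{q}_i^2}\right)\mathrm{sgn}(\dot{q}_i) + f_{pi}$, which is consistent with the parameter roles described here but is not quoted from the paper. The function name and all numerical values are hypothetical.

```python
import math


def sgn(v: float) -> int:
    """Classical sign function: -1, 0, or +1."""
    return (v > 0) - (v < 0)


def friction_torque(dq, f_b, f_c, f_s, f_tau, f_p=0.0):
    """Stribeck-type joint friction torque (assumed form, see lead-in).

    dq    : joint velocity
    f_b   : viscous friction coefficient
    f_c   : Coulomb friction
    f_s   : static friction
    f_tau : positive Stribeck parameter
    f_p   : position-dependent and unmodelled friction term (lumped scalar)
    """
    stribeck = f_s * math.exp(-f_tau * dq * dq)
    return f_b * dq + (f_c + stribeck) * sgn(dq) + f_p


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the shape of the curve.
    for dq in (-1.0, -0.1, 0.0, 0.1, 1.0):
        print(dq, friction_torque(dq, f_b=0.2, f_c=0.1, f_s=0.05, f_tau=10.0))
```

At high speed the exponential term vanishes and the torque approaches the viscous plus Coulomb part, which is the behaviour the Stribeck parameter is meant to capture.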

Remark 1
In practice, the joint friction torque $f_{ri}$ is always constant and bounded, and is affected only slightly by temperature and lubrication. Thus, it is reasonable to assume that the estimated error term $\tilde{F}_{ri}$ is also bounded as $|\tilde{F}_{ri}| \le \beta_{Fbi}$, where $\beta_{Fbi}$ is a positive constant vector with $b = 1, 2, 3, 4$. The non-parametric friction term $f_{pi}(q_i,\dot{q}_i)$ has an upper bound $\|f_{pi}(q_i,\dot{q}_i)\| \le \beta_{pi}$, with $\beta_{pi}$ a positive constant.
On the basis of the dynamic model in [42], the dynamic coupling torque $Z_i(q,\dot{q},\ddot{q})$ can be obtained by (4), where $c_{ri}$, $c_j$, and $c_k$ represent unit vectors along the rotation axes of the ith, the jth, and the kth joint, respectively. Accordingly, we define $\Phi_{ij} = c_{ri}^T c_j$ and $\Psi_{ikj} = c_{ri}^T (c_k \times c_j)$. Moreover, we have $\Phi_{ij} = \hat{\Phi}_{ij} + \tilde{\Phi}_{ij}$ and $\Psi_{ikj} = \hat{\Psi}_{ikj} + \tilde{\Psi}_{ikj}$, where $\hat{\Phi}_{ij}$ and $\hat{\Psi}_{ikj}$ are the estimates of $\Phi_{ij}$ and $\Psi_{ikj}$, and $\tilde{\Phi}_{ij}$ and $\tilde{\Psi}_{ikj}$ are the corresponding alignment errors.
Remark 2 From the dynamic coupling torque (4), since $c_{ri}$, $c_j$, and $c_k$ are unit vectors, the products are bounded as $\|\Phi_{ij}\| = \|c_{ri}^T c_j\| \le 1$ and $\|\Psi_{ikj}\| = \|c_{ri}^T (c_k \times c_j)\| \le 1$. Moreover, if the jth and the kth ($1 < j, k < i - 1$) joints are assembled lower, then the dynamic coupling term is bounded as $\|Z_i(q,\dot{q},\ddot{q})\| \le \beta_{Zi}$, with $\beta_{Zi}$ a positive constant. Accordingly, the MRM can be controlled "joint by joint", such that the lower joints are all controlled when the current joint is controlled.
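The bounds in Remark 2 follow directly from the axes being unit vectors, and can be checked numerically. The sketch below computes the two geometric products for hypothetical joint axes; the helper names `phi` and `psi` are illustrative, and the triple product is read as $c_{ri}^T (c_k \times c_j)$.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    """Normalise a 3-vector so that it can serve as a joint axis."""
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def phi(c_ri, c_j):
    """Alignment product c_ri^T c_j of two unit axes."""
    return dot(c_ri, c_j)


def psi(c_ri, c_k, c_j):
    """Triple product c_ri^T (c_k x c_j) of three unit axes."""
    return dot(c_ri, cross(c_k, c_j))


if __name__ == "__main__":
    # Hypothetical axes: three mutually orthogonal joints.
    c_ri, c_k, c_j = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    print(phi(c_ri, c_j), psi(c_ri, c_k, c_j))
```

For any unit axes both magnitudes stay at or below 1, which is what allows the coupling term to be bounded by a constant once the lower joints are under control.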
According to (1) and (3), the dynamic model of the MRM subsystem is described by (5), where $\cdots + \hat{f}_{ci}\,\mathrm{sgn}(x_{2i}) + \frac{\tau_{ti}}{\gamma_i}$, $u_i = \tau_i$ represents the ith joint control torque, and the model uncertainty, which includes the friction model error and the interconnected joint coupling, can be given as:

Assumption 1 The nonlinear system dynamics (5) is Lipschitz continuous for the state $x_i \in \Omega$, each subsystem is controllable, and $x_i(0) = 0$ is an equilibrium of the system.
In this paper, we propose a compensator-critic structure-based event-triggered decentralized tracking control of MRMs based on the ADP algorithm. The aim is to find a decentralized near-optimal control policy $u_i$ that guarantees the stability of the closed-loop MRM subsystem. For the subsystem (5), the improved infinite-horizon performance index function is defined as: where $\vartheta_i$ is the hybrid error function including the position error and the velocity error with $\vartheta_{i0}(x_i(0)) = \vartheta_i(0)$, $a_{ei}$ is a positive constant, $x_{1di}$ and $x_{2di}$ denote the desired position and velocity trajectories, respectively, and $Q_i$ and $R_i$ are positive definite matrices. Then, we develop the time-triggered HJBE for subsystem (5) as (8), where The optimal performance index function is described by (9). Substituting (9) into the HJBE (8), we obtain (10). Kalman [43] strictly demonstrated that if the optimal performance index function $\Xi_i^*(\vartheta_i)$ is continuously differentiable and satisfies (10), the solution of the HJBE, $u_i^*(\vartheta_i)$, exists as the optimal control policy of the corresponding nonlinear continuous system, which can be formulated as (11). Then, the HJBE can be presented as (12). To mitigate the subsystem dynamics by utilizing the partly known model information, we can rewrite the optimal control $u_i^*$ as (13), whose two components $u_{1i}$ and $u_{2i}^*$ are used to deal with the dynamic model terms $\Gamma_{fi}$, $\Theta_i$ and to realize the optimal tracking control, respectively. Thus, combining (12) with (13), and through a simple transformation, we have (14).
Remark 3 On the basis of optimization theory and the dynamic model analysis, in this paper, the decentralized tracking control problem of MRMs is transformed into an optimal compensation control problem (13), which consists of the model-based robust control $u_{1i}(\vartheta_i)$ and the ADP-based optimal control $u_{2i}^*(\vartheta_i)$.
Inspired by the previous works [41,44], the decentralized optimal tracking control is developed with a compensator-critic structure, which can not only mitigate model uncertainties in real time but also realize the satisfactory tracking performance for MRMs.
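To make the role of the hybrid error and the weighting matrices concrete, the following sketch evaluates a quadratic running cost for one joint. It assumes the fusion function has the common form $\vartheta_i = \dot{e}_i + a_{ei} e_i$ with $e_i = x_{1i} - x_{1di}$, and a utility $\vartheta_i^T Q_i \vartheta_i + u_i^T R_i u_i$; both are assumptions consistent with the description above, and any additional terms of the paper's improved index are omitted. All names and values are hypothetical, and the scalar case is used for brevity.

```python
def hybrid_error(x1, x2, x1d, x2d, a_e):
    """Fused tracking error: velocity error plus a_e times position error."""
    e = x1 - x1d          # position error
    de = x2 - x2d         # velocity error
    return de + a_e * e


def utility(theta, u, Q, R):
    """Quadratic running cost in the fused error and the control torque."""
    return Q * theta * theta + R * u * u


def accumulated_cost(thetas, us, Q, R, dt):
    """Rectangle-rule approximation of the cost integral over a finite log."""
    return sum(utility(t, u, Q, R) for t, u in zip(thetas, us)) * dt


if __name__ == "__main__":
    theta = hybrid_error(x1=1.2, x2=0.5, x1d=1.0, x2d=0.3, a_e=2.0)
    print(theta, utility(theta, u=0.4, Q=4.0, R=0.1))
```

A larger $Q_i$ penalizes tracking error more strongly, while a larger $R_i$ penalizes control effort, which is the trade-off the performance index encodes.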
According to [38,45], the HJBE (8) can be solved by the time-triggered ADP algorithm. However, as mentioned in [46], time-triggered optimal control strategies not only suffer from a heavy computational and communication burden, but also waste limited energy resources. To address the above shortcomings, a compensator-critic structure-based event-triggered decentralized tracking control is designed for MRMs as follows.

Compensator-critic structure-based event-triggered decentralized tracking control
In this section, the detailed design procedure of the compensator-critic structure-based event-triggered decentralized tracking control for MRMs is described.

The model-based robust compensator
In practice, the dynamics of each joint module is partially known. Inspired by [13,47,48], we present a robust compensator $u_{1i}$, which consists of the model-based measurable term $u_{1mi}$ and the compensation term for dynamic uncertainties $u_{1ui}$, and can be expressed by: where the compensation term $u_{1ui}$ is designed to deal with the approximated friction model error term $\tilde{F}_{ri}$ and the dynamic coupling term $Z_i(x,\dot{x},\ddot{x})$. Based on our previous works [49,50], $u_{fi}$ and $u_{zi}$ are presented to compensate $\tilde{F}_{ri}$ and $Z_i(x,\dot{x},\ddot{x})$, respectively. The robust compensator $u_{fi}$ can be designed as: where $\kappa_{fi}$ and $\kappa_{fib}$ are positive parameters with $b = 1, 2, 3, 4$. To facilitate the analysis of the dynamic coupling term $Z_i(x,\dot{x},\ddot{x})$, one can rewrite the term as (19). The robust compensator $u_{zi}$ can then be designed as: where $\kappa_{1i}$, $\kappa_{2i}$, $\kappa_{z1oi}$, $\kappa_{z2oi}$, $\xi_{1oi}$, and $\xi_{2oi}$ are positive parameters with $o = 1, 2$. According to (15), (16), (17), (18), (20) and (21), the robust compensator $u_{1i}$ can be presented as (22).
Theorem 1 Consider an MRM working in free space, with the subsystem dynamics (5) and the model uncertainties in (3) and (4). The tracking errors are ensured to be UUB under the robust compensation control law (22).
Proof Choose the Lyapunov function candidate for the MRM subsystem as (23), where In (24), the actual terms $F_{ri}$, $U_{ij}$, and $Z_{ikj}$ are all constants. Therefore, the time derivative of (23) is expressed as: According to the robust compensator $u_{1i}$ in (15), $u_{1mi}$ and $u_{1ui}$ are employed to deal with the known dynamics and the uncertainties, respectively. Through (16) and (22), we obtain: Combining (18), (20) and (21) with (26), we have: By a simple transformation, assuming that According to Lyapunov's direct method, the tracking error $\vartheta_i$ can be guaranteed to be UUB if $\vartheta_i$ lies outside the compact set: This completes the proof.

Decentralized tracking control based on event-triggered mechanism
The event-triggered mechanism is effective in reducing the computational burden and energy cost. Under this mechanism, the decentralized tracking control input is updated only when the triggering condition is violated. Suppose that $\{t_l\}_{l=0}^{+\infty}$ is a monotonically increasing sequence of triggering instants, where $t_l$ satisfies $0 < t_l < t_{l+1}$ and $\lim_{l\to\infty} t_l = \infty$ for $l \in \{0, 1, 2, \ldots\}$. The sampled state is presented as: where $\hat{\vartheta}_{li}(x_{li})$ is the sampled data for $t \in [t_l, t_{l+1})$. To obtain a proper event-triggering condition, the gap function between the sampled state and the actual state is defined as (30). Based on the event-triggering mechanism, the control pol- In this situation, the decentralized tracking control input becomes a piecewise continuous-time signal through a zero-order hold, which is formulated as: during the time interval $[t_l, t_{l+1})$. Based on (11), the event-triggered decentralized optimal control can be formulated by (32). However, $u_i^*(\hat{\vartheta}_{li}(x_{li}))$ is a discrete value obtained by aperiodic sampling; by introducing the zero-order hold, the control signal becomes a continuous-time signal. Substituting (32) into (14), we establish the event-triggered HJBE as:

Assumption 2
The decentralized tracking control $u_i^*$ is Lipschitz continuous for every state $\vartheta_i, \hat{\vartheta}_{li} \in \Omega$, i.e., there exists a positive constant $m_{li}$, such that

Remark 4
In the event-triggered decentralized tracking control policy (32), the subsystem error function $\vartheta_i(x_i)$ is replaced by $\hat{\vartheta}_{li}(x_{li})$ to determine the triggering time instant $t_l$, and the decentralized tracking control policy is updated by u
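The sampling and zero-order-hold logic of this subsection can be sketched on a toy scalar system. The example below is not the paper's controller: the plant $\dot{\vartheta} = u$, the feedback gain `k`, and the fixed threshold standing in for the triggering condition are all hypothetical, chosen only to show how the gap between the sampled and the actual state drives the control updates.

```python
def simulate(theta0=1.0, k=2.0, threshold=0.05, dt=0.001, steps=5000):
    """Event-triggered feedback with a zero-order hold on a toy plant.

    The control is recomputed only when the gap between the last sampled
    state and the current state exceeds a fixed (hypothetical) threshold.
    Returns the final state and the list of triggering instants.
    """
    theta = theta0
    theta_hat = theta0            # sampled state at the first instant
    u = -k * theta_hat            # control held constant between events
    trigger_times = [0.0]
    for n in range(1, steps + 1):
        theta += dt * u           # Euler step of the toy plant d(theta)/dt = u
        gap = theta_hat - theta   # sampled state minus actual state
        if abs(gap) > threshold:  # triggering condition violated
            theta_hat = theta     # take a new sample
            u = -k * theta_hat    # update the held control
            trigger_times.append(n * dt)
    return theta, trigger_times


if __name__ == "__main__":
    final_theta, times = simulate()
    print("final state:", final_theta)
    print("control updates:", len(times), "out of 5000 integration steps")
```

In this toy run the control is updated far less often than the integration step, and the state settles into a small neighbourhood of the origin whose size is set by the threshold, which mirrors the bounded (rather than asymptotic) behaviour expected under event-triggered control.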

Critic-based event-triggered decentralized tracking control
To solve the event-triggered HJBE, an NN, which has a powerful learning ability, is used to approximate the performance index function $\Xi_i^*(\vartheta_i)$ as: where $W_{ci} \in \mathbb{R}^K$ is the desired weight vector, $K$ is the number of neurons in the hidden layer, $\delta_{ci}(\vartheta_i)$ is the activation function, and $\varepsilon_{ci}(\vartheta_i)$ is the critic NN approximation error. Thus, the partial derivative of $\Xi_i^*(\hat{\vartheta}_{li})$ is given by (36), where $\nabla\delta_{ci}(\hat{\vartheta}_{li})$ is the gradient of $\delta_{ci}(\vartheta_i)$ evaluated at the sampled state. According to [51], it is reasonable to assume $\|\nabla\delta_{ci}(\hat{\vartheta}_{li})\| \le \delta_{cid}$ and $\|\nabla\varepsilon_{ci}(\hat{\vartheta}_{li})\| \le \varepsilon_{cid}$, with $\delta_{cid}$ and $\varepsilon_{cid}$ positive constants. Through Assumption 2, we have $\|\nabla\delta_{ci}(\vartheta_i) - \nabla\delta_{ci}(\hat{\vartheta}_{li})\| \le P_i \|E_{li}\|$. Combining (32) with (36), we can obtain: Therefore, the event-triggered HJBE can be rewritten as (38), where $u_i^*(\hat{\vartheta}_{li}) = u_{1i} + u_{2i}^*(\hat{\vartheta}_{li})$ is the ideal control torque.
Since the desired weight vector $W_{ci}$ is unavailable, the critic NN can be approximated by (39), and the partial derivative of $\hat{\Xi}_i(\hat{\vartheta}_{li})$ can be expressed by: The event-triggered approximate decentralized tracking control strategy $\hat{u}_i(\hat{\vartheta}_{li})$ is presented as (40).
Remark 5 Different from traditional ADP-based optimal control approaches that rely on actor NNs, critic NNs, and even model NNs, this paper proposes a compensator-critic structure-based event-triggered decentralized tracking control method for MRMs, which consists of a model-based robust compensator and an approximate optimal controller based only on critic NNs.
Through (38), (39) and (40), the approximate event-triggered Hamiltonian is given by (41). Comparing (38) with (41), the NN weight approximation error can be defined as $\tilde{W}_{ci} = W_{ci} - \hat{W}_{ci}$, and the residual error $\varepsilon_{cHi}$ is: where $\varepsilon_{cHi}$ is bounded as $\|\varepsilon_{cHi}\| \le \varepsilon_{cHiM}$ with $\varepsilon_{cHiM}$ a positive constant, and $\varepsilon_{ZiM}$ is the upper bound of $\varepsilon_{Zi}$ as: To adjust the critic NN weight vector $\hat{W}_{ci}$, we minimize the objective function $E_{ci} = \frac{1}{2}\varepsilon_{cHi}^T \varepsilon_{cHi}$ by the gradient descent algorithm, and the weight vector is updated by (44), where Therefore, the weight approximation error can be updated by:
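A critic of this kind can be sketched for a single joint with a scalar input. The example below assumes Gaussian radial basis activations with five hidden neurons and the standard ADP form $\hat{u}_{2i} = -\frac{1}{2} R_i^{-1} B_i^T \nabla\delta_{ci}^T \hat{W}_{ci}$ for the optimal part of the control; the centres, width, weights, and the input gain `B` are hypothetical and are not taken from the paper.

```python
import math

CENTERS = (-1.0, -0.5, 0.0, 0.5, 1.0)   # hypothetical RBF centres (5 hidden neurons)
WIDTH = 0.5                              # hypothetical common RBF width


def activation(theta):
    """Gaussian RBF activations delta(theta) for a scalar fused error."""
    return [math.exp(-(theta - c) ** 2 / (2.0 * WIDTH ** 2)) for c in CENTERS]


def activation_gradient(theta):
    """Derivative of each activation with respect to theta."""
    return [-(theta - c) / WIDTH ** 2 * d
            for c, d in zip(CENTERS, activation(theta))]


def critic_value(w_hat, theta):
    """Approximate performance index: W_hat^T delta(theta)."""
    return sum(w * d for w, d in zip(w_hat, activation(theta)))


def critic_gradient(w_hat, theta):
    """Gradient of the approximate performance index: grad(delta)^T W_hat."""
    return sum(w * g for w, g in zip(w_hat, activation_gradient(theta)))


def approximate_control(u_1, w_hat, theta_sampled, B, R):
    """Compensator torque plus critic-based term, evaluated at the sampled state."""
    u_2 = -0.5 * B * critic_gradient(w_hat, theta_sampled) / R
    return u_1 + u_2


if __name__ == "__main__":
    w_hat = [0.2, -0.1, 0.4, 0.3, -0.5]   # hypothetical weights
    print(approximate_control(u_1=0.7, w_hat=w_hat, theta_sampled=0.3, B=2.0, R=1.0))
```

Because only the critic gradient enters the control law, no separate actor network is needed, which is the structural point made in Remark 5.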

Remark 6
The critic NN is constructed to approximate the decentralized optimal compensation control based on the powerful learning ability of NNs. Note that the critic NN weight learning law (44) is designed using the local joint modular state without relying on the event-triggered conditions.
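The gradient-descent idea behind the weight learning law can be illustrated on a toy problem. The sketch below uses a normalized gradient step of the kind commonly seen in ADP, $\dot{\hat{W}} = -\alpha\,\sigma\,e / (1 + \sigma^T\sigma)^2$, which is an assumption and may differ from the exact law (44); the regressor `sigma`, the target weights, and the learning rate are hypothetical.

```python
import math


def critic_step(w_hat, sigma, residual, alpha, dt):
    """One Euler step of normalized gradient descent on 0.5 * residual**2."""
    norm_sq = sum(s * s for s in sigma)
    gain = alpha * residual / (1.0 + norm_sq) ** 2
    return [w - dt * gain * s for w, s in zip(w_hat, sigma)]


def run_demo(steps=2000, alpha=5.0, dt=0.01):
    """Track the weight error while learning hypothetical target weights."""
    w_true = [0.5, -1.0, 2.0]     # stands in for the unknown desired weights
    w_hat = [0.0, 0.0, 0.0]
    errors = []
    for n in range(steps):
        # Time-varying regressor, used here only to excite all directions.
        sigma = [math.sin(0.1 * n), math.cos(0.2 * n), 1.0]
        # Residual is linear in the weight error for this toy problem.
        residual = sum(s * (wh - wt) for s, wh, wt in zip(sigma, w_hat, w_true))
        errors.append(math.sqrt(sum((wh - wt) ** 2
                                    for wh, wt in zip(w_hat, w_true))))
        w_hat = critic_step(w_hat, sigma, residual, alpha, dt)
    return w_hat, errors


if __name__ == "__main__":
    w_hat, errors = run_demo()
    print("initial weight error:", errors[0])
    print("final weight error:  ", errors[-1])
```

The normalization keeps the step bounded regardless of the regressor size, so the weight error norm does not grow from one step to the next in this toy setting.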
Thus, the event-triggered approximate decentralized tracking control policy $\hat{u}_i(\hat{\vartheta}_{li})$, which is applied to the MRM as the control torque, is given by (40). The structural diagram of the proposed compensator-critic structure-based event-triggered decentralized tracking control strategy of MRM systems is illustrated in Fig. 2.
Fig. 2 Structural diagram of the proposed optimal control method
Theorem 2 Considering the n-DOF MRM whose subsystem dynamics are described by (5), the weight estimation error $\tilde{W}_{ci}$ of the critic NN can be guaranteed to be UUB with the weight updating law (44).
Proof Select the Lyapunov function candidate as (46). Suppose that $\sigma_{ci}\sigma_{ci}^T \le \lambda_{\max}(\sigma_{ci}\sigma_{ci}^T) \triangleq \sigma_{ciM}$ with a positive constant $\sigma_{ciM}$, where $\lambda_{\max}(\cdot)$ denotes the maximal eigenvalue of a matrix. Then, according to the critic NN weight updating law (44) and Young's inequality, the time derivative of (46) is calculated as: Thus, the weight approximation error $\tilde{W}_{ci}$ can be proved to be UUB with $\alpha_{ci} > \frac{1}{2}$ if $\tilde{W}_{ci}$ lies outside the compact set:

Remark 7
Unlike existing works that presented time-triggered tracking controllers [37,38], in this paper the event-triggered mechanism is introduced to develop the compensator-critic structure-based decentralized tracking control strategy based on the ADP approach, while considering the optimal performance, reducing the computational burden, and saving communication and energy consumption.

Stability analysis of the closed-loop MRM system
In this part, the stability analysis of the closed-loop MRM system under the developed compensator-critic structure-based event-triggered decentralized tracking control is provided using the Lyapunov stability theorem.

Theorem 3
Considering the n-DOF MRM whose subsystem dynamics are described by (5), and Assumptions 1 and 2, the closed-loop MRM system is UUB via the approximate compensator-critic structure-based event-triggered decentralized tracking control law (40) if the following condition is satisfied:
Proof Select the Lyapunov function candidate for the MRM subsystem as (49).
(1) The events are not triggered, i.e., $t \in [t_l, t_{l+1})$. Calculating the time derivative of (49), $\dot{V}_i = \dot{V}_{1i} + \dot{V}_{2i}$, the first term is: In light of the time-triggered HJBE (14) and the optimal control law (32), we obtain: and Then, substituting (51) and (52) into $\dot{V}_{1i}$, we have: According to Theorem 1 and Assumption 2, through Young's inequality, (53) becomes: For the second term $\dot{V}_{2i}$, we have $\dot{V}_{2i} = \nabla\Xi_i^*(\hat{\vartheta}_{li}) = 0$.

Exclusion of Zeno behaviors
In general, the MRM system is a continuous-time system for which the minimum trigger interval $t_{\min} = \min\{t_{l+1} - t_l\}$ may be zero, which is the so-called Zeno behavior. Thus, it is necessary to prove that $t_{\min}$ has a positive lower bound.

Theorem 4
Considering the dynamics of the MRM subsystem (5), the triggering condition (48), and the compensator-critic structure-based event-triggered decentralized tracking control strategy (40), the minimum trigger interval $t_{\min}$ has a positive lower bound given by (61), where and i is a positive constant.
Proof The time derivative of the event-triggered error (30) is: According to Assumptions 1 and 2, the upper bound of $\dot{\vartheta}_i(x_i)$ is derived as: Combining (30), (62) with (63), we can obtain: When $t = t_{l+1}$, the event-triggered condition satisfies: According to (64) and (65), the lth triggering interval $\Delta t_l$ has the lower bound given by (66). It can be seen from (66) that the minimum triggering interval t min = E Li / θ li + i which increases from zero to the positive value Π l,min = min E li θ li t − l+1 / θ li + i , $\forall t \in [t_l, t_{l+1})$. Therefore, the minimum triggering interval $t_{\min}$ satisfies the condition (61), such that $t_{\min}$ has a positive lower bound for arbitrary state $\vartheta_i(x_i)$.
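Theorem 4 bounds the minimum trigger interval analytically. On logged data, the same quantity $t_{\min} = \min\{t_{l+1} - t_l\}$ can be checked empirically with a few lines; the function name and the sample instants below are hypothetical.

```python
def min_inter_event_time(trigger_times):
    """Smallest gap between consecutive triggering instants.

    Raises ValueError if fewer than two instants are given or if the
    sequence is not strictly increasing, since a zero or negative gap
    would indicate Zeno-like behaviour or a logging error.
    """
    if len(trigger_times) < 2:
        raise ValueError("need at least two triggering instants")
    gaps = [b - a for a, b in zip(trigger_times, trigger_times[1:])]
    if any(g <= 0 for g in gaps):
        raise ValueError("triggering instants must be strictly increasing")
    return min(gaps)


if __name__ == "__main__":
    # Hypothetical log of triggering instants in seconds.
    print(min_inter_event_time([0.0, 0.5, 0.75, 2.0]))
```

A strictly positive result on experimental logs is consistent with, but does not replace, the analytical lower bound of the theorem.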

Establishment of experimental platform
A 2-DOF MRM experimental platform has been established, which is composed of two sets of joint modules and connecting rods, as shown in Fig. 3. Each joint module contains a motor, an incremental encoder, a speed reducer, an absolute encoder, and a torque sensor. A brushed DC motor from Maxon Inc. provides the power to drive the MRM, and each joint motor is driven by a linear power amplifier (LPA). The incremental encoder and the absolute encoder are used to measure the displacement of the motor and the position of the link module, respectively. The speed reducer, with a gear ratio of 100:1, increases the motor output torque by reducing the motor speed. The joint torque sensor is mounted between the link module and the joint module to measure the joint torque. Experimental data acquisition and processing rely on the QPIDe data acquisition device and the Matlab/Simulink software installed on the host computer, respectively. The designed control system is built in Simulink, and the packaged QUARC module is used to establish communication between the host computer and the QPIDe device to realize real-time control of the 2-DOF MRM.
In this paper, tracking experiments on a 2-DOF MRM are carried out to verify the effectiveness of the proposed compensator-critic structure-based event-triggered decentralized tracking control strategy. We select the desired trajectories for each joint as $q_{1d} = \frac{\pi}{4}\sin\left(\frac{\pi}{45}t\right) + 0.05$ and $q_{2d} = \frac{\pi}{2}\sin\left(\frac{\pi}{45}t\right) + 0.1$. For the critic NN, we choose the radial basis function neural network (RBFNN) to approximate the optimal performance index function. The 1-5-1 NN structure is selected, with 1 input neuron, 5 hidden neurons, and 1 output neuron for each joint. The NN weights are defined
Table 1 (partial): (1,2): 0.01; $\kappa_{fi(3,4)}$: 0.01; $\kappa_{ai}$: 0.5
Note that the real-time state of MRMs can only be obtained by sensor sampling; therefore, we use the state sampled under the time-triggered mechanism as the system state in the event-triggered control.

Experimental results and analysis
Experimental results under the proposed control method are shown in Figs. 4, 5, 6, 7 and 8, and are compared with those of the ADP-based time-triggered decentralized tracking control method [37]. Figure 4 shows the position tracking curves of each joint under the proposed control method. The red and blue dashed lines present the desired and the actual trajectories, respectively. From this figure, one observes that the actual trajectories converge to the desired ones in a very short time. As shown in Fig. 5, the position tracking errors of each joint stay within an acceptable range (less than $\pm 5\times10^{-3}$ rad) under the proposed event-triggered tracking control approach, which illustrates the effectiveness of the presented control scheme. Figure 6 presents the joint control torque curves of MRMs under the conventional and the proposed control methods. The red and blue lines show the torques of the time-triggered control method [37] and the proposed event-triggered control method, respectively. We can see that the proposed joint control torques are updated only when the event-triggered condition is satisfied, and thus have a lower updating frequency. [20,30] is a piece-wise one depending on the zero-order hold.
The cumulative numbers of sampled states used in the time-triggered control method [37] and the proposed event-triggered control method are shown in Fig. 8. It shows that the number of control updates under the time-triggered control method is nearly five times that under the event-triggered one.
From the experimental results, the developed compensator-critic structure-based event-triggered decentralized tracking control is effective for MRMs. It can not only maintain satisfactory control accuracy, but also effectively reduce the computational burden and save communication and energy consumption.
Fig. 8 Control torque updating times of each joint under the time-triggered control method and the proposed event-triggered control method (legend: time-triggered mechanism, joints 1 and 2; event-triggered mechanism, joint 1; event-triggered mechanism, joint 2)
The model-based robust compensator is utilized to avoid the influence of dynamic uncertainties. The performance index function is constructed to reflect the position error, the velocity error, and the control torque. Thus, the event-triggered decentralized tracking control is obtained, which includes the model-based robust controller and the ADP-based optimal compensation controller. Then, a critic NN is constructed to solve the improved event-triggered HJBE, and the event-triggered approximate decentralized optimal compensation tracking control torque can be derived directly. The Lyapunov stability theorem is utilized to prove that the tracking error of the closed-loop MRM system is UUB. In contrast to the time-triggered optimal controller, the proposed compensator-critic structure-based event-triggered decentralized tracking control method can not only maintain satisfactory control accuracy, but also effectively reduce the computational burden and save communication and energy cost at the same time.