Predictive tracking of an object by a pan–tilt camera of a robot

Tracking a moving object with a pan–tilt camera is a well-known task in robotics. The presented research, however, addresses specific properties of the tracked object and introduces novel features into the pan–tilt camera control strategy. Pan–tilt camera control does not operate in isolation: it is part of a visual servoing system whose goals affect its configuration and functionality. The pan–tilt system aims to keep the visually tracked object in the middle of the image, while the overall visual servoing system recognizes and tracks the object so that the robot arm can grasp it. The pan–tilt controller uses a predictive strategy based on second-order linear models of the pan and tilt joints. Model predictive control (MPC) gives the system the ability to predict camera operation over a specified horizon according to predefined tracking goals. Because the controller anticipates future positions over this horizon, a prediction of the future setpoint, i.e., of the tracked object position, is required. The visual object recognition and tracking system uses dedicated strategies to extrapolate the tracked object trajectory online over the MPC horizon. Therefore, the pan–tilt camera system is intrinsically coupled to camera-based recognition and tracking. Predictive pan–tilt positioning keeps the tracked object in the middle of the image, while the extrapolation performed by the visual system improves the tracking performance. The proposed approach is thoroughly tested in a dedicated Gazebo-based robot simulator. Finally, the system is implemented and validated on the Velma robot. The results and their comparison with other control strategies confirm the initial assumptions, allowing further development of the visual servoing system.


Introduction
Pan-tilt camera systems are frequently used in 3D visual tracking, for instance by autonomous vehicles or robots equipped with a vision head [2,5,20,27]. Such vision systems have to solve two problems: real-time processing of visual data and inducing the correct tracking motion of the camera. Efficient processing of visual information is a significant challenge and is the subject of the majority of the research. Data acquired by an RGB-D camera need to be processed in real time [3]: the image must be segmented, target features must be identified and extracted, useful information must be structured into control data, and finally these data must be delivered to the system component that requires them. Those calculations and data handling actions are time-consuming. Most research focuses on speeding them up with time-efficient algorithms, usually through parallelization of calculations [24] or faster hardware. The task of servopositioning the neck that supports the camera is considered separately. Control issues and very high control accuracy do not lie in the mainstream of research; the main goal is simply to keep the object within the image.
Practical experience shows that tracking and recognition are coupled. Image recognition performance depends on proper camera tracking and vice versa. Once we can keep the tracked object within the same sector of the image frame, we may limit the search and analysis to that sector instead of the whole frame, which speeds up calculations. Conversely, if image processing is not completed within the system sampling period, the information about the target is lost for one or more samples, which may degrade servomechanism tracking or even cause the tracked object to leave the field of view. Therefore, camera image processing and servo tracking are closely coupled, and the dependence works in both directions. Better camera tracking speeds up visual data processing and prevents losing the object from sight. Fast visual data evaluation prevents sample losses and tracking problems. As the tracked object can be located quite far away from the head, and thus from the cameras, even a small improvement in tracking accuracy may help.
Different control strategies are applied in pan-tilt servoing. Classical PID-based feedback control is the primary choice. When the servomotor is overrated and fast, a simple P (proportional) controller might be satisfactory, especially when a steady-state error is allowed [44]. Most frequently, however, the full PID version is used [22,41], sometimes modified by elements similar to those used in process control, such as derivative filtering [44] or lead-lag compensation [40]. Feedback control is improved by feedforward decoupling [27] in situations when a disturbance measurement is available. A similar task can alternatively be solved by active disturbance rejection control [5]. If accurate control is required, feedback control is replaced by more advanced control strategies [4]. Once system nonlinearities affect control performance, multi-regional PID strategies can be applied, such as fuzzy PID [28].
Visual trackers can use various algorithms to evaluate and extrapolate tracked object trajectories. Kalman filters are the most obvious choice [30,39], but the subject literature is much richer [36]. Interestingly, a deep coupling between both tasks is not popular, although it is quite natural. The MPC strategy needs a setpoint trajectory over the prediction horizon, and the tracking algorithm can provide it. The relationship works in the opposite direction as well. Accurate tracking, which keeps the target at the same position in the camera image, allows the image recognition algorithm to be improved: the tracking evaluation time is reduced, and so is the risk of losing a sample due to prolonged calculations. It is important to stress that the whole procedure gives the best results if the tracked object stays in the center of the image frame at all times.
Following the above observations, the proposed research fills the observed gap. It focuses on an efficient, predictive pan-tilt camera servomechanism (neck-MPC). The aim of the neck-MPC is high target-object tracking accuracy within the camera image frame. The application of predictive control algorithms, such as DMC and GPC, in the pan-tilt tracking system enables a practical realization of the coupling between the servomechanism and image recognition. The DMC and GPC control rules represent a relatively simple predictive approach. However, their advantage lies not only in the embedded model or the optimization, but also in the repetitive operation over a horizon. The way the setpoint is formulated is especially important. A PID-type controller uses only the current setpoint and the current control error in its rule, even if the future setpoint trajectory is known. Predictive control enables the knowledge about the future setpoint trajectory to be incorporated into the evaluation of the current manipulated variable. We acknowledge that a more sophisticated control algorithm usually performs better. However, its robustness, its sustainability and the number of engineers who can maintain it generally diminish. Therefore, a simple approach, such as DMC or GPC, is considered in the present research. High tracking performance improves visual data processing, generating a beneficial synergy effect.
The paper is organized as follows. Section 2 presents the theoretical aspects of this research, i.e., the system kinematics (Sect. 2.1), the reference PID control (Sect. 2.2), and the servomotor predictive control strategy (Sect. 2.3). The developed neck-MPC is validated with Matlab, the Gazebo simulator and the real Velma robot, which is described in Sect. 3. Section 4 summarizes the results of the work, draws conclusions, and presents the identified open issues for further research.

Pan-tilt control system
This section describes two main elements of the proposed contribution, i.e., the controlled pan-tilt system and the applied control strategy.

Neck kinematics
The relevant part of the kinematic structure of the Velma robot [35] equipped with a pan-tilt (neck) mechanism is presented in Fig. 1. Table 1 contains the respective D-H parameters [7].
The following frames are relevant to the considerations: 0 (base coordinate frame), 1 (robot torso frame), 2 (pan-tilt first link frame), 3 (pan-tilt second link frame), C (camera frame), and G (goal frame, i.e., the tracked object). Homogeneous matrices represent those frames, e.g., ${}^0_C T$ describes the camera with respect to the base. A homogeneous matrix contains the orientation submatrix, e.g., ${}^0_C R$, and the position vector, e.g., ${}^0_C P$. The task of the visual servo is to track the object of interest G in such a way that its image is located approximately at the center of the camera imaging plane.

Table 1 Denavit-Hartenberg parameters of the kinematic structure between the base and the neck of the Velma robot

For that to happen, the versor of the camera optical axis ${}^C_C z$ and the vector from the camera to the center of the object ${}^C_G P$ should point in the same direction. In such a situation, the cross product of those two vectors is 0. Otherwise, the cross product defines the axis about which the camera should be rotated so that the image of the object becomes centered in the camera image plane. If ${}^C_G P$ is normalized, the length of the resulting vector is

$$\left\| {}^C_C z \times {}^C_G P \right\| = \sin\varphi,$$

where $\varphi$ is the required angle of rotation. If this value is nonzero, the versor of the axis of rotation $\kappa$ is derived as

$$\kappa = \frac{{}^C_C z \times {}^C_G P}{\sin\varphi}.$$

If $\sin\varphi = 0$, no camera motion is required, thus the versor $\kappa$ is irrelevant. Hence, if motion is required, the orientation of the camera has to change by rotating about $\kappa$ by $\varphi$, i.e., the desired rotation is [7]:

$$R_{\kappa,\varphi} = \begin{bmatrix} \kappa_x\kappa_x(1 - c\varphi) + c\varphi & \kappa_x\kappa_y(1 - c\varphi) - \kappa_z s\varphi & \kappa_x\kappa_z(1 - c\varphi) + \kappa_y s\varphi \\ \kappa_x\kappa_y(1 - c\varphi) + \kappa_z s\varphi & \kappa_y\kappa_y(1 - c\varphi) + c\varphi & \kappa_y\kappa_z(1 - c\varphi) - \kappa_x s\varphi \\ \kappa_x\kappa_z(1 - c\varphi) - \kappa_y s\varphi & \kappa_y\kappa_z(1 - c\varphi) + \kappa_x s\varphi & \kappa_z\kappa_z(1 - c\varphi) + c\varphi \end{bmatrix}, \tag{2}$$

where here and further s stands for sin and c for cos. The desired orientation of the camera with respect to the torso is obtained by applying this rotation to the current camera orientation ${}^1_C R(\theta_2, \theta_3)$. This orientation has 3 DOF, while the neck has only 2 DOF; thus an artificial angle $\gamma$ is introduced in (4), where $\theta_2$ and $\theta_3$ are the current values of the neck joint angles, while the superscript SP denotes the desired (setpoint) values of the angles.
The value of ${}^1_C R(\theta_2, \theta_3)$ is produced by solving the direct kinematics problem using the current encoder readings. The value of $R_{\kappa,\varphi}$ is computed based on (2). Hence the product ${}^1_C R(\theta_2, \theta_3)\, R_{\kappa,\varphi}$ is a known matrix, and the desired orientation is given by (6). Equating (6) and (4) and using their last columns, as they do not depend on $\gamma$, solves the inverse kinematics problem for the neck (7). It is important to note that equations (7) assume that $\sin(\theta_3) \neq 0$. Fortunately, the angle $\theta_3$ never equals $k\pi$, k = 0, 1, 2, ..., due to the kinematic limitations. In other words, it is physically impossible for the camera to be pointed vertically upwards.
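The axis-angle step described above can be illustrated with a minimal sketch. It assumes that both vectors are expressed in the camera frame; the function names (`axis_angle_to_target`, `rotate`) are hypothetical and are not taken from the Velma software. The angle is recovered with `atan2` from the cross and dot products, which is an addition of this sketch to keep the result unambiguous for angles above 90 degrees.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def axis_angle_to_target(z_axis, p_target, eps=1e-9):
    """Return (kappa, phi) rotating the optical axis onto the target direction.

    kappa is None when sin(phi) is (numerically) zero, i.e. when no unique
    rotation axis exists.
    """
    z = normalize(z_axis)
    p = normalize(p_target)
    c = cross(z, p)
    sin_phi = math.sqrt(dot(c, c))      # length of the cross product
    cos_phi = dot(z, p)
    phi = math.atan2(sin_phi, cos_phi)  # assumption: atan2 instead of asin
    if sin_phi < eps:
        return None, phi
    kappa = tuple(x / sin_phi for x in c)
    return kappa, phi

def rotate(v, kappa, phi):
    """Rotate v about the unit axis kappa by phi (Rodrigues formula)."""
    kxv = cross(kappa, v)
    kdv = dot(kappa, v)
    c, s = math.cos(phi), math.sin(phi)
    return tuple(v[i] * c + kxv[i] * s + kappa[i] * kdv * (1.0 - c)
                 for i in range(3))
```

For example, with the optical axis along (0, 0, 1) and a target in the direction (1, 0, 1), the sketch yields a rotation of 45 degrees about the y axis, and rotating the optical axis by that amount aligns it with the target direction.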

PID -benchmark control strategy
A PID algorithm has also been formulated as a benchmark, to check whether an MPC algorithm such as DMC gives better results than this standard controller. In continuous time, this algorithm generates new values of the control signal according to formula (8), in which $\theta(0)$ is the initial value of control for the integral part, e is the control error, $K_G$ is the gain, $T_I$ is the time constant of the integral part and $T_D$ is the time constant of the derivative part. The vectors in formulation (8) contain the pan and tilt components, and the control error is the difference between the setpoint and the measured joint angles. Formula (8) can be rewritten into the discrete-time form (10), with the time index k and the sampling period $T_s$. The vectors in the discrete-time notation (10) have a form analogous to the respective vectors in the continuous-time notation.
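As an illustration of such a benchmark, the following is a minimal sketch of a textbook discrete-time PID in the ideal (non-interacting) form with gain $K_G$ and time constants $T_I$ and $T_D$. It is not the exact discretization (10) of the paper: the rectangular integration, the backward-difference derivative and the class name `DiscretePID` are assumptions made for this example. One instance would be used per joint (pan and tilt).

```python
class DiscretePID:
    """Textbook ideal-form PID: u = u0 + K_G * (e + I / T_I + T_D * de/dt)."""

    def __init__(self, k_g, t_i, t_d, t_s, u0=0.0):
        self.k_g = k_g          # controller gain K_G
        self.t_i = t_i          # integral time constant T_I
        self.t_d = t_d          # derivative time constant T_D
        self.t_s = t_s          # sampling period T_s
        self.u0 = u0            # initial value of the control signal
        self.integral = 0.0     # running integral of the error
        self.prev_error = None  # error from the previous sample

    def step(self, setpoint, measurement):
        """Compute the control signal for the current sampling instant."""
        e = setpoint - measurement
        # Assumption: rectangular (backward Euler) integration.
        self.integral += e * self.t_s
        # Assumption: backward difference; zero derivative on the first call.
        if self.prev_error is None:
            de = 0.0
        else:
            de = (e - self.prev_error) / self.t_s
        self.prev_error = e
        return self.u0 + self.k_g * (e + self.integral / self.t_i + self.t_d * de)
```

Note that the rule uses only the current setpoint and the current error, which is the property that distinguishes it from the predictive controllers discussed in the paper.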

Predictive pan-tilt servoing
Model predictive control (MPC) constitutes the focal element of the system. The considered predictive pan-tilt camera control is realized in a hierarchical control structure, with the MPC delivering joint angle setpoints to the low-level servo controllers of the joint motors. Figure 2 presents the pan-tilt control system neck-MPC(θ) tracking the target in the pan θ_pan(k) and tilt θ_tilt(k) joint angle spaces. The neck-MPC controller operates in a direct mode with a high sampling frequency of 500 Hz. It directly controls the pan and tilt joint angles. Its setpoint is generated by the vision system, which acquires the image of the target. Visual object pose estimation operates at a lower sampling rate of 20 Hz. The setpoint value obtained for each 50-ms interval is interpolated to the 2-ms sampling period of the pan-tilt servomechanism. Therefore, the dynamic pan-tilt controller has enough time to realize each setpoint signal. Moreover, some time is left for setpoint signal shaping and smoothing during interpolation.
MPC controllers work with the sampling frequency of 500 Hz (denoted by time instant index k). They obtain their setpoint with 20 Hz (denoted by time instant index τ ), which is further interpolated from 20 Hz to 500 Hz. The employed MPC algorithm and the proposed modifications, i.e., the internal model, cost function and constraints, are described below. The schematic diagram of the implementation procedures of the proposed approach is sketched in Fig. 3.
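The rate conversion described above (one vision sample every 50 ms, one controller sample every 2 ms, hence 25 fine samples per coarse interval) can be sketched as follows. The paper does not state which interpolation scheme is used; linear interpolation and the function name `interpolate_setpoints` are assumptions of this example, and the sketch works offline on a recorded sequence rather than in real time.

```python
def interpolate_setpoints(coarse, ratio=25):
    """Linearly interpolate 20 Hz setpoints onto the 500 Hz controller grid.

    coarse: list of setpoints sampled at the slow rate (index tau).
    ratio:  number of fast samples per slow interval (500 Hz / 20 Hz = 25).
    """
    if len(coarse) < 2:
        return list(coarse)
    fine = []
    for a, b in zip(coarse, coarse[1:]):
        for i in range(ratio):
            # Fraction i / ratio of the way from sample a to sample b.
            fine.append(a + (b - a) * i / ratio)
    fine.append(coarse[-1])  # close the last interval
    return fine
```

In an online setting the next vision sample is not yet available, so such a scheme would either run one vision period behind or rely on the extrapolated trajectory delivered by the tracker.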

Model predictive control-general formulation
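The process input, i.e., the manipulated variable (MV), is denoted by u, and the output, i.e., the controlled variable (CV), is denoted by y. The vector of decision variables calculated at each discrete sampling instant (k = 0, 1, 2, ...) by the MPC [37] is

$$\Delta u(k) = \left[\Delta u(k|k)\ \ldots\ \Delta u(k + N_u - 1|k)\right]^T, \tag{14}$$

where $N_u$ is the control horizon, i.e., the number of calculated future control increments, defined as backward differences: $\Delta u(k|k) = u(k|k) - u(k-1)$ and $\Delta u(k+p|k) = u(k+p|k) - u(k+p-1|k)$ for $p = 1, \ldots, N_u - 1$. The decision variables of MPC (14) are calculated by solving the optimization problem

$$\min_{\Delta u(k)} \left\{ \sum_{p=1}^{N} \left(y^{SP}(k+p|k) - \bar{y}(k+p|k)\right)^2 + \lambda \sum_{p=0}^{N_u-1} \left(\Delta u(k+p|k)\right)^2 \right\} \tag{15}$$

subject to

$$u^{\min} \le u(k+p|k) \le u^{\max}, \quad p = 0, \ldots, N_u - 1,$$
$$-\Delta u^{\max} \le \Delta u(k+p|k) \le \Delta u^{\max}, \quad p = 0, \ldots, N_u - 1,$$
$$y^{\min} \le \bar{y}(k+p|k) \le y^{\max}, \quad p = 1, \ldots, N,$$

where $\lambda$ is the weight of the second term. The role of the first part of the MPC cost function is to minimize the predicted control errors over the prediction horizon N. The setpoint (CV demand) and the predicted value of the CV for the future sampling instant k + p, known or calculated at the current instant k, are denoted by $y^{SP}(k+p|k)$ and $\bar{y}(k+p|k)$, respectively. The predictions $\bar{y}(k+p|k)$ are calculated online using the model of the process, described by the general function $f_{model}(\cdot)$.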
The role of the second part of the cost function is to eliminate excessive MV changes. In general, the constraints may be imposed on:
- future values of the manipulated variable (over the control horizon), limited by the minimal and maximal allowed values $u^{\min}$ and $u^{\max}$,
- future MV changes, with the maximal value denoted as $\Delta u^{\max}$, and
- predicted values of the controlled variable (over the prediction horizon), limited by the minimal and maximal allowed values $y^{\min}$ and $y^{\max}$.
Although the whole sequence of decision variables (14) is calculated at each sampling, only its first component is applied to the process. Measurement of the process output is updated during the next sampling period k + 1, and the procedure is repeated. Appropriate stability proofs exist and can be found in the literature [1,23]. The considered MPC optimization task (15) is common in industrial practice [9].

Applied MPC formulation
The optimization problem (15) can be rewritten for the neck in the form (16). The optimization task (16) minimizes the control error for the estimated target pan and tilt positions as well as excessive changes of their next positions. It is subject to a single constraint: the joints can be positioned between ±90°, i.e., ±1.5708 rad.
The first element of the vector of predicted target positions for both joints, $\Theta(k)$, can be expanded into $\theta(k+1|k) = [\theta_{pan}(k+1|k)\ \theta_{tilt}(k+1|k)]^T$ and the last element into $\theta(k+N|k) = [\theta_{pan}(k+N|k)\ \theta_{tilt}(k+N|k)]^T$. By manipulating the values of the parameters inside the matrix Ψ, an improvement of control performance can be obtained [25]. The result may be significantly better when the parameters of the matrix Λ are changed. However, this approach is less effective if the setpoint trajectory changes constantly, as in our case. For the considered robot neck, two MPC algorithms are used: DMC and GPC.

The DMC (Dynamic Matrix Control) algorithm uses a step-response model to calculate the predictions of the controlled variables. First, the step-response coefficients are gathered from a dynamic model of the process. Next, the step responses are used to create the dynamic matrix M (19) and the past dynamic matrix, which are of dimensionality $2N \times 2N_u$ and $2N \times 2(D-1)$, respectively. Finally, the vector of predicted controlled variables is calculated using the prediction equation (21), whose vectors are of length 2N and 2(D − 1), respectively. The minimized MPC cost function (16), with the use of the prediction equation (21), can be rewritten as (23). The cost function (23) is quadratic with respect to the decision variables. Differentiating it yields the gradient (24). The optimal solution vector is found by setting the gradient (24) to zero; the resulting explicit control law uses a constant gain matrix of dimensionality $2N_u \times 2N$.

The GPC (Generalized Predictive Control) algorithm is the foundation of the second approach. In this approach, the vector of predicted controlled variables is found using a prediction equation in which the matrix M is the same as in the DMC algorithm (19).
The free trajectory vector, denoted as $\Theta^0(k)$, is calculated using the dynamic model of the process. Its first entry can be expanded into $\theta^0(k+1|k) = [\theta^0_{pan}(k+1|k)\ \theta^0_{tilt}(k+1|k)]^T$ and its last entry into $\theta^0(k+N|k) = [\theta^0_{pan}(k+N|k)\ \theta^0_{tilt}(k+N|k)]^T$. The model considered in this algorithm is described by matrices whose entries are polynomials in the backward shift operator $q^{-1}$, e.g.,

$$A_{pan}(q^{-1}) = 1 + a_{1,pan}q^{-1} + \ldots + a_{n_A,pan}q^{-n_A},$$
$$A_{tilt}(q^{-1}) = 1 + a_{1,tilt}q^{-1} + \ldots + a_{n_A,tilt}q^{-n_A}. \tag{32}$$

In GPC, the cost function (23) can be rewritten as (34). Similarly to DMC, as the cost function (34) is quadratic in terms of the decision variables, an analogous explicit solution applies. It should be noted that the only difference between DMC and GPC is in the way the free trajectory is calculated. The respective stability analyses can be found for DMC in [12] and for GPC in [13,42]. Robust stability considerations are addressed in [8,15].
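The DMC steps described above (dynamic matrix, past dynamic matrix, free trajectory, explicit unconstrained control law) can be sketched for a single joint as follows. This is the textbook single-input single-output DMC, not the authors' two-joint implementation: the weighting matrices are assumed to be scalar multiples of the identity, the ±90° constraint is not handled, and the first-order plant, its parameter `alpha` and all function names are hypothetical choices made for this example. NumPy is assumed to be available.

```python
import numpy as np

def dmc_matrices(s, N, Nu):
    """Build the dynamic matrix M (N x Nu) and past matrix MP (N x (D-1)).

    s[j-1] holds the step-response coefficient s_j for j = 1..D.
    """
    D = len(s)

    def sj(j):
        # The step response is assumed to have settled after D samples.
        return s[min(j, D) - 1]

    M = np.zeros((N, Nu))
    for p in range(1, N + 1):
        for c in range(Nu):
            if p - c >= 1:
                M[p - 1, c] = sj(p - c)
    MP = np.zeros((N, D - 1))
    for p in range(1, N + 1):
        for j in range(1, D):
            MP[p - 1, j - 1] = sj(p + j) - sj(j)
    return M, MP

def dmc_gain(M, psi, lam):
    """Gain K = (M^T Psi M + Lambda)^-1 M^T Psi with Psi = psi*I, Lambda = lam*I."""
    N, Nu = M.shape
    Psi = psi * np.eye(N)
    Lam = lam * np.eye(Nu)
    return np.linalg.solve(M.T @ Psi @ M + Lam, M.T @ Psi)

def dmc_step(K, MP, y, du_past, y_sp):
    """Return the first control increment for the current instant.

    du_past[j-1] is the increment applied j samples ago, j = 1..D-1.
    y_sp holds the N future setpoints over the prediction horizon.
    """
    y0 = y + MP @ du_past   # free trajectory
    du = K @ (y_sp - y0)    # whole sequence of future increments
    return du[0]            # receding horizon: apply only the first one

def simulate(n_steps=80, alpha=0.8, N=20, Nu=5, lam=0.1, D=60):
    """Closed loop on the hypothetical plant y(k+1) = alpha*y(k) + (1-alpha)*u(k)."""
    s = np.array([1.0 - alpha ** j for j in range(1, D + 1)])
    M, MP = dmc_matrices(s, N, Nu)
    K = dmc_gain(M, 1.0, lam)
    y, u = 0.0, 0.0
    du_past = np.zeros(D - 1)
    ys = []
    for _ in range(n_steps):
        y_sp = np.ones(N)   # known future setpoint trajectory (unit step)
        du = dmc_step(K, MP, y, du_past, y_sp)
        u += du
        du_past = np.concatenate(([du], du_past[:-1]))
        y = alpha * y + (1.0 - alpha) * u
        ys.append(y)
    return ys
```

The vector `y_sp` is where the knowledge about the future setpoint enters the control law; replacing the constant vector with the extrapolated target trajectory is the coupling with the visual tracker that the paper relies on.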

Simplified pan-tilt camera model
A continuous-time model of the considered robot neck is proposed using second-order transfer functions for pan and tilt joints, respectively.
$$G_{pan}(s) = \frac{K_{pan}(T_{1,pan}s + 1)}{(T_{2,pan}s + 1)(T_{3,pan}s + 1)}, \qquad G_{tilt}(s) = \frac{K_{tilt}(T_{1,tilt}s + 1)}{(T_{2,tilt}s + 1)(T_{3,tilt}s + 1)},$$

where $K_{pan}$ and $K_{tilt}$ are static gains and $T_{1,pan}$, $T_{2,pan}$, $T_{3,pan}$ and $T_{1,tilt}$, $T_{2,tilt}$, $T_{3,tilt}$ are the respective time constants. Pan and tilt joint positions, denoted as θ, measured at the current instant k of the simulation form the model inputs. Target positions estimated for the pan and tilt, denoted as $\theta^{SP}$, are the model outputs. The transfer functions are discretized and rewritten as difference equations, which are applied in the MPC framework.

Fig. 4

Two performance indices are used to compare control strategies: the integral absolute error (IAE), denoted as $E_{IAE}$ (40), and the mean squared error (MSE), denoted as $E_{MSE}$ (41):

$$E_{IAE} = \sum_{k=1}^{k_{max}} |e(k)|, \tag{40}$$
$$E_{MSE} = \frac{1}{N}\sum_{k=1}^{k_{max}} e(k)^2, \tag{41}$$

where e(k) is the control error, N is the total number of samples and $k_{max}$ is the last sampling instant. The IAE index assigns equal weights to errors regardless of their absolute values and is robust with respect to outliers [32]. It is useful for checking the overall excessive changes of joint positions. If these changes are too large, harmful vibrations of the neck of the real robot occur, which can damage the mechanical parts of the robot. Because such changes are small in magnitude, the IAE is a more suitable measure of them than the MSE index, which squares the control error. The MSE is used to take large control errors into account, as they are magnified by the quadratic function.
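The two indices (40) and (41) can be computed with the following minimal sketch. It assumes that the control error is the difference between the setpoint and the measured joint angle and that both sequences have the same length; the function names `iae` and `mse` are chosen for this example only.

```python
def iae(setpoints, outputs):
    """Integral (sum) of absolute control errors over all samples."""
    return sum(abs(sp - y) for sp, y in zip(setpoints, outputs))

def mse(setpoints, outputs):
    """Mean of squared control errors over all samples."""
    n = len(outputs)
    return sum((sp - y) ** 2 for sp, y in zip(setpoints, outputs)) / n

# Example: one large error among otherwise small ones.
setpoint = [1.0, 1.0, 1.0, 1.0]
measured = [1.0, 0.5, 1.0, 3.0]
print(iae(setpoint, measured))  # 2.5
print(mse(setpoint, measured))  # 1.0625
```

In the example, the single error of magnitude 2 contributes 80% of the IAE but about 94% of the MSE, which illustrates why the MSE emphasizes large transient errors.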

Results
The proposed control strategy is validated in two ways. First, the control scheme is implemented in the Gazebo simulator and tested with the Gazebo model of the robot. Secondly, the MPC controller is tested with a real Velma robot. The components of the system are implemented using the FABRIC framework (Framework for Agent-Based Robot Control Systems) [35].

Validation using the Gazebo simulator
The proposed control strategy is validated using the Gazebo simulator. All simulations are performed in discrete time, as the predictive control formulation is in a discrete-time form and the final controller is implemented on the real robot in discrete time as well. The camera tracks a sample recorded object trajectory representing a circular motion in two axes. The setpoint trajectory, $\theta^{SP}_{pan}$ for the pan joint and $\theta^{SP}_{tilt}$ for the tilt joint, is provided by the visual estimation system. Three controllers are compared: PID, DMC and GPC. The PID uses the following parameters: $K_G$ = 0.1, $T_I$ = 0.01 s, $T_D$ = 0.01 s. The DMC is set up with N = 50, $N_u$ = 25, $\lambda_{pan}$ = 100, $\lambda_{tilt}$ = 100, $\psi_{pan}$ = 1 and $\psi_{tilt}$ = 1. The GPC uses the same prediction and control horizons and the same penalty on the control error as the DMC, but differs in the penalty imposed on the control increments: $\lambda_{pan}$ = 100, $\lambda_{tilt}$ = 10000. As the pan-tilt predictive control is implemented in the KUKA LWR 4+ robot, which is envisaged to be used as a companion robot, the system is not expected to track fast-moving objects that rapidly change the direction of their motion. Thus, the parameters of the trajectory comply with the assumed robot operating conditions.

Figure 5 compares the performance of the PID and DMC algorithms, while Fig. 6 depicts the operation of the two MPC algorithms, DMC and GPC. Table 2 presents the respective performance indices for both joints. The IAE obtained for PID control has low enough values for both joints to prevent harmful effects on the robot. DMC control provides similar results in this regard. However, the MSE values obtained with DMC control are much lower for both joints than in the PID control scenario. The DMC algorithm uses the setpoint trajectory prediction over the prediction horizon, which is unavailable to PID.

Fig. 5 Simulation results: trajectories of the control system with the PID controller vs. trajectories of the same system with the DMC controller; controlled variables of both joints in the two top plots, manipulated variables of both joints in the two bottom plots

The GPC algorithm, similarly to DMC, can utilize future changes in the setpoint trajectory. Still, the inclusion of past values of measured signals in the calculations tends to generate high control increments to quickly compensate for measurement noise, disturbances, or modeling inaccuracies. Therefore, a higher value of the penalty imposed on the control signal increments is often required to achieve stability of the controller, as is the case here. Unfortunately, this comes at the cost of control quality. As shown in Fig. 6 and Table 2, even though the smoothness of the control signal is comparable to that of the PID and DMC algorithms, the control quality is lower.
The MSE and IAE performance measures suggest the following interpretation. The IAE index shows quite similar values, but the MSE shows the superiority of the DMC control. This is because each of these indices considers different performance aspects. The IAE compares normal operation, while the MSE, due to the squared error, focuses on the operation in the case of rapid changes and transient periods. The comparison of both measures justifies the interpretation of the results. Hence, the DMC algorithm was implemented and used in the real Velma robot.

Fig. 6 Simulation results: trajectories of the control system with the DMC controller vs. trajectories of the same system with the GPC controller; controlled variables of both joints in the two top plots, manipulated variables of both joints in the two bottom plots

Validation on the Velma robot
The DMC algorithm with the same parameters has been implemented on the Velma robot. A similar target movement has been used for validation purposes. These results confirm that the assumed predictive control is appropriate.
The obtained responses look very similar, and the level of accuracy improvement is quite low. However, we have to keep in mind that pan-tilt control is not used just to move the robot neck. It controls the camera that tracks the object, and the object should be kept in the middle of the image. When the tracked object is in the middle of the image, pattern recognition is improved, in particular because the time needed for object recognition becomes shorter. Moreover, the tracked object can be located quite far away from the head, and therefore even small inaccuracies in the angles start to matter. Thus, the performance of the pan-tilt head has to be assessed in the context of visual object tracking, not simply as a movement.

Conclusions and further research
This paper describes the results of research focused on evaluating predictive control of a pan-tilt camera system. The issue considered in this paper is part of a larger project related to the launch of a whole visual servoing system, comprising recognition and tracking of objects with an RGB-D camera and predictive control of a KUKA LWR 4+ manipulator.
The algorithm evaluation and validation consist of three steps. Matlab simulations are followed by simulations using the Gazebo simulator of the Velma robot. Finally, the selected predictive controller is implemented on the real Velma robot. Initially, three control strategies were taken into account: the PID algorithm and two predictive methods, DMC and GPC. Detailed simulations point to the superiority of the DMC predictive controller, which is eventually implemented on the Velma robot. Interestingly, it matters which predictive strategy is used: the two predictive algorithms are not equivalent and cannot be used interchangeably. They differ in the type of the model used, and, in consequence, the control law is implemented differently, which translates into their specific properties and the resulting control performance. In our case, the properties of the DMC control are better suited to this application. Real-time pan-tilt object tracking confirms the initial assumptions about the DMC predictive strategy and enables further investigation within the entire visual servoing project for the Velma robot. It is shown that even a simple predictive strategy, such as the DMC, improves tracking control. This is mostly due to the fact that the MPC approach allows the knowledge about the future behavior of the setpoint to be incorporated into the control rule. This is the main advantage of any predictive control over PID-type controllers. It is also shown that complex predictive approaches are not necessary: a simple DMC algorithm with a simple dynamical model improves performance. Moreover, we have to be aware that practice prefers simple approaches. Simplicity probably lowers the performance slightly, but it increases robustness and control system sustainability, facilitates human supervision (less knowledge is required) and lowers maintenance requirements. DMC introduces improvement while being simple.
One issue that requires further analysis is the possible impact of disturbances. They have virtually not been considered here, although a possible movement of the robot column can be treated as a disturbance. During the realization of this project, we consciously decided not to analyze this issue. We are aware that such a disturbance can easily be included in the MPC control. However, the comparison with the PID control would then be unfair from the PID perspective, unless the PID control layout were extended with feedforward disturbance decoupling. Nonetheless, this subject is worth considering and is planned for future research. Uncertainties are harder to take into account, especially in simulation, as we cannot evaluate the risk they introduce. We are fully aware that they exist and affect the project, for instance in the form of unknown and unmodeled dynamics, which certainly occur.
Unmodeled dynamics and the impact of unknown disturbances constitute two separate issues. The effect of unmodeled dynamics has already been taken into account. The proposed linear model, which is used as the internal model of the predictive strategy, is a simplification of the real pan-tilt system. It is shown that the controller using such a simplified model achieves good control performance. In this sense, the unmodeled dynamics are considered in the research. Another problem is the inaccurate knowledge of the kinematic parameters of the considered system, e.g., the Denavit-Hartenberg parameters of the kinematic model. However, our system has been calibrated, and thus those inaccuracies do not influence the system performance to a noticeable extent.
Unknown disturbances constitute a different issue. Such disturbances can be produced by quantization effects introduced by the camera. The position of the tracked object is determined on the basis of its position in the image; hence the image resolution is significant. This problem can be solved either by applying a camera with sufficient resolution or by using a second, low-resolution camera mounted on the end effector, as described in [19]. In the latter case, the resolution of the eye-in-hand camera does not significantly influence the precision of locating the object: as the gripper gets nearer to the object, the object occupies a larger part of the image and thus can be located more precisely.
Further operational validation with the robot should be performed to observe these disturbances and to investigate their impact. As algorithm robustness is a significant feature of an embedded control strategy, robust pan-tilt control is an interesting direction for future research.