Introduction

To meet the demands of modern manufacturing for efficiency, visualization and wireless communication [1,2,3], manipulators equipped with various sensors can adapt to extreme ambient conditions by means of non-contact detection. Manipulators with visual sensors emulate human vision, which allows position and orientation feedback to be measured without contact. At present, visual servoing (VS) manipulators have wide application potential in many scenarios such as disaster rescue, medical detection, and space exploration [4,5,6]. It is well known that the tracking control strategy is required to provide higher precision and lower energy consumption for VS systems, especially in unstructured environments, yet only a few advanced techniques have been applied to VS tracking control to ensure optimal tracking performance.

Related work

Generally, the goal of feature trajectory tracking control is to drive the system outputs to track specified desired trajectories in the VS system. Hence, visual feedback signals have been used as significant information in robotics to tackle positioning or motion control in unstructured environments. The visual tracking problem for manipulators has been studied over the past few years, and a wide range of technologies has been explored.

Kang et al. [7] adopted a reinforcement learning method to adaptively adjust the servoing gain, improving the convergence rate and stability. Li et al. [8] combined proportional–derivative (PD) control with sliding mode control (SMC) to tackle disturbances and uncertainties on a 6-degree-of-freedom (DOF) VS manipulator. Sharma et al. [9] proposed a fractional-order SMC method to drive vehicle motion using visual information from the image plane; furthermore, considering the uncertainties, they designed a new adaptive rule to adjust the sliding-surface parameters and ensure finite-time stability of the system. Qiu et al. [10] presented a depth-independent interaction matrix based on a model predictive control method, taking input and output constraints into account.

Unlike the above kinematic VS control problems, dynamic VS control problems are usually solved by establishing a composite Jacobian matrix that maps from the image space to the robot joint space. Based on the obtained system parameters, an effective controller can be designed. For example, Wang et al. [11] presented a new adaptive algorithm based on the estimated image depth and proved global stability by the Lyapunov method. Li et al. [12] addressed the controller design problem for an uncalibrated camera–manipulator system to ensure finite-time convergence. In addition, some scholars have paid attention to the large measurement errors caused by external interference or system modeling deviation. Considering the parameter uncertainties in manipulator kinematics and dynamics, Cheah et al. [13] proposed an adaptive regression strategy to estimate the Jacobian matrix adaptively. Hua et al. [14] adopted an immersion and invariance observer to identify an uncalibrated VS system without measuring the joint velocity, on the basis of a depth-independent matrix. Wang et al. [15] designed a new nonlinear observer to dynamically track the motion of a target in Cartesian space. Wang et al. [16] proposed a novel adaptive observer-based controller which employed the feature velocity term contained in the unknown kinematics; the superiority of the proposed image-space observer lies in its simple structure in handling uncertainties, which avoids the over-parametrization of existing works. Leite et al. [17] developed a cascade control strategy based on an indirect/direct adaptive method, which accounts for uncertainties in robot kinematics and dynamics in the visual tracking problem. Considering output nonlinearity and unknown dynamics, Wang et al. [18] investigated an adaptive neural network control for VS manipulator systems whose dynamic model is not required to be linearly decomposable. Zhang et al. [19] developed an adaptive neural network controller with a Barrier Lyapunov Function (BLF) to overcome nonlinearities and visibility constraints.

In addition, the key point of optimal control for nonlinear systems is to design a suitable controller that handles input/output constraints, external disturbances, uncertainties, etc. [20,21,22,23]. During the past few years, the adaptive dynamic programming (ADP) algorithm proposed by Werbos [24] has extensively advanced optimal control schemes for robots and manipulators, enhancing control performance and reducing the energy consumption of the controller [25]. Kong et al. [26] introduced an approximately optimal strategy to resolve the nonlinear saturation problem of n-DOF manipulators. In [27], an adaptive fuzzy neural network control method with impedance learning was presented for robots with constraints. In [28], an optimal coordination control, applied to multi-robots following expected trajectories, was presented by means of reinforcement learning. Tang et al. [29] employed a reinforcement-learning-based adaptive optimal control method to realize optimal tracking of n-DOF manipulators. Li et al. [30] established a nonlinear discrete-time dynamic model of wheeled mobile robots, where reinforcement learning and the ADP method were adopted to tackle the tracking problem for systems with skidding and slipping constraints. In [31], an artificial potential field scheme combined with the ADP method was proposed for path planning of a bio-mimetic robotic fish, where heuristic dynamic programming was applied to obtain the position and angle. Lian et al. [32] presented a receding-horizon dual heuristic programming algorithm for tracking control of wheeled mobile robots and developed a backstepping kinematic controller. Zhan et al. [33] proposed an ADP-based control approach to deal with the tracking problem for robots with environment interactions. Li et al. [34] proposed a policy-iteration-based fault compensation control for modular reconfigurable robots subject to actuator failures. Zhao et al. [35] developed an event-triggered ADP algorithm for decentralized tracking control, which can reduce communication frequency and extend the service life of mechanical and electronic devices. Dong et al. [36] designed a novel force/position control scheme based on a zero-sum optimal ADP decentralized control strategy for reconfigurable manipulators, considering the influence of unknown and interconnected dynamics. Although the application of ADP-based optimal control in robotics has progressed in recent years, the optimal feature tracking control for VS systems, which is expected in practical systems, remains an open problem.

Motivation and contribution

In recent years, many kinematics-based visual controllers have been proposed under the assumption that the VS manipulator is an accurate positioning device with negligible dynamics [37,38,39]. From the control perspective, however, it is difficult to ensure dynamic performance and stable control when the nonlinearities are neglected in kinematic control, owing to both parameter uncertainties in the robot dynamics and errors in camera calibration. Therefore, controller design is a challenging task when the control error and system stability are considered, especially in robot positioning or trajectory tracking control [12, 40]. Unfortunately, there has been little discussion of these difficulties for VS manipulator systems, especially under uncertain intrinsic parameters, camera calibration errors, external disturbances, friction, etc. From the aforementioned literature, we conclude that the difficulty in designing controllers lies in how to handle unmodeled dynamics and external disturbances without the linearity-in-parameters property. Furthermore, it is desirable to optimize a performance index for VS manipulator systems.

Inspired by the above literature, this paper investigates a feature tracking controller based on an ADP scheme for VS manipulators subject to unknown dynamics, taking an optimal performance index into account. Based on a radial basis function (RBF) neural network (NN) representation of the uncertain dynamics of the VS manipulator model, an adaptive NN observer is proposed to identify the uncertainties (e.g., unmodeled dynamics, external disturbance, joint friction, etc.) in real time. The cost function is improved by inserting the estimated uncertainties, and the visual tracking problem is converted into feature error control, from which the optimal feature tracking control is derived directly. The stability of the VS manipulator system is guaranteed by the Lyapunov stability theorem. Finally, to show the robustness and effectiveness of the designed controller, a 3-DOF eye-to-hand (ETH) manipulator is employed in simulation.

The main contributions of the presented scheme can be summarized as follows.

  1.

    The proposed feature tracking control strategy acts directly on image features, which makes it more feasible and intuitive. Thus, the controller designed on the basis of the camera–manipulator model does not need a regression matrix and avoids complicated calculation.

  2.

    To the best of our knowledge, this is the first time the ADP technique has been developed for feature-based visual tracking control of VS manipulator systems with unknown dynamics. Unlike existing visual tracking control approaches, the critic-NN-based controller is designed in an optimal manner, which saves energy cost and is significant in practice.

  3.

    The major advantage of the improved cost function lies in the fact that the estimated uncertainties are introduced and fully considered in the controller design. Simultaneously, the closed-loop VS manipulator system is guaranteed to be uniformly ultimately bounded (UUB) under the proposed ADP control scheme.

The remainder of this paper is organized as follows. In “Preliminaries and problem statement”, the basic preliminaries and dynamic model are presented. In “ADP-based feature tracking controller”, the unknown dynamics of VS manipulator systems is approximated by an adaptive NN observer, and the optimal error controller is designed in detail. Then, the stability is analyzed. In “Simulation tests”, simulation examples are provided to illustrate the effectiveness of the proposed control scheme. Finally, a brief conclusion is given in “Conclusion”.

Preliminaries and problem statement

Camera–robot kinematics model

In this paper, the ETH structure shown in Fig. 1 is selected for the VS system, and an n-DOF VS manipulator is employed to construct the forward kinematics. Denote the image coordinates of a feature point as \(f_{uv}=[u,v]^\mathrm{{T}}\). The mapping from the feature point to the robot position [14] can be expressed as

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {f_{uv}(t)}\\ 1 \end{array}} \right] = \frac{1}{{{D_\mathrm{{epth}}}(t)}}{M_\mathrm{{c}}}\left[ {\begin{array}{*{20}{c}} {r(t)}\\ 1 \end{array}} \right] , \end{aligned}$$
(1)

where \(r(t) \in {\mathbb {R}}^3\) is the Cartesian coordinate of robot end-effector with respect to the base frame, \({{D_\mathrm{{epth}}}(t) \in {{{\mathbb {R}}}}}\) is the depth of feature point in the camera frame, \({M_\mathrm{{c}}} \in {\mathbb {R}}^{{3}\times {4}}\) is the perspective projection matrix which can be expressed as

$$\begin{aligned} {M_\mathrm{{c}}} = {M_\mathrm{{in}}}{M_\mathrm{{ex}}}, \end{aligned}$$

where \({M_\mathrm{{in}}} \in {\mathbb {R}}^{{3}\times {4}}\) is the intrinsic matrix of the camera, and \({M_\mathrm{{ex}}} \in {\mathbb {R}}^{{4}\times {4}}\) is the homogeneous transformation matrix computed via forward kinematics, which also represents the extrinsic matrix.

Fig. 1
figure 1

VS manipulator system

By separating \(f_{uv}(t)\) from (1), we can obtain

$$\begin{aligned} f_{uv}(t) = \frac{1}{{{D_\mathrm{{epth}}}(t)}}{M_\mathrm{{sub}}}r(t), \end{aligned}$$
(2)

where \({M_\mathrm{{sub}}} \in {\mathbb {R}}^{{2}\times {3}}\) is the sub-matrix of perspective projection matrix \({M_\mathrm{{c}}}\), which is given by

$$\begin{aligned} {M_\mathrm{{sub}}} = \left[ {\begin{array}{*{20}{c}} {\begin{array}{*{20}{c}} {{m_{11}}}&{}{{m_{12}}}&{}{{m_{13}}} \end{array}}\\ {\begin{array}{*{20}{c}} {{m_{21}}}&{}{{m_{22}}}&{}{{m_{23}}} \end{array}} \end{array}} \right] , \end{aligned}$$

where \(m_{ij}\) is the (i, j)th entry of \({M_\mathrm{{c}}}\).

The depth of the feature point can be given by

$$\begin{aligned} {D_\mathrm{{epth}}}(t) = M_\mathrm{{D}} r(t) , \end{aligned}$$
(3)

where \(M_\mathrm{{D}}=[m_{31}, m_{32}, m_{33}]\) is composed of the third-row entries of \({M_\mathrm{{c}}}\). Assume that \({D_\mathrm{{epth}}}(t)\) is positive and bounded; i.e., there exist positive constants \(D_{\min }\) and \(D_{\max }\) such that

$$\begin{aligned} D_{\min } \le {D_\mathrm{{epth}}}(t) \le D_{\max }. \end{aligned}$$
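The projection model (1)–(3) can be sketched numerically. The following is a minimal numpy illustration under assumed parameters (a hypothetical focal length of 800 px, principal point (320, 240), and a camera frame coincident with the base frame); the numbers are made up for demonstration only.

```python
import numpy as np

# Hypothetical camera parameters for illustration only.
M_in = np.array([[800.0,   0.0, 320.0, 0.0],
                 [  0.0, 800.0, 240.0, 0.0],
                 [  0.0,   0.0,   1.0, 0.0]])   # 3x4 intrinsic matrix
M_ex = np.eye(4)                                # 4x4 extrinsic matrix
M_c = M_in @ M_ex                               # perspective projection M_c = M_in M_ex

def project(r):
    """Map an end-effector position r (3-vector) to pixel features, eqs. (1)-(3)."""
    p = M_c @ np.append(r, 1.0)                 # homogeneous image point [u*D, v*D, D]
    depth = p[2]                                # D_epth(t), third-row product
    return p[:2] / depth, depth                 # f_uv = M_sub r / D_epth, eq. (2)

f_uv, depth = project(np.array([0.1, -0.05, 1.0]))
```

With the end-effector 1 m in front of this hypothetical camera, the feature lands at (400, 200) px, i.e., the 0.1 m lateral offset times the focal length plus the principal point.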

By differentiating (2) and (3), one obtains

$$\begin{aligned} \dot{f}_{uv}(t) = \frac{1}{{{D_\mathrm{{epth}}}(t)}}({M_\mathrm{{sub}}} - f_{uv}(t)M_\mathrm{{D}})\dot{r}(t) = {J_\mathrm{{I}}}\dot{r}(t), \end{aligned}$$
(4)

where \({J_\mathrm{{I}}}\in {\mathbb {R}}^{{2s}\times {3}}\) is the feature Jacobian matrix (or interaction matrix), s is the number of feature points. Let \(q(t)\in {\mathbb {R}}^n\) be the joint angle vector. From the robot kinematics, the velocity relationship of joint space to Cartesian space can be expressed as

$$\begin{aligned} \dot{r}(t) = {J_\mathrm{{R}}}\dot{q}(t), \end{aligned}$$
(5)

where \(J_\mathrm{{R}}\in {\mathbb {R}}^{{3}\times {n}}\) is the robot Jacobian matrix. Combining (4) with (5), we obtain

$$\begin{aligned} \dot{f}_{uv}(t) = {J_\mathrm{{I}}}{J_\mathrm{{R}}}\dot{q}(t) = {J_\mathrm{{com}}}\dot{q}(t), \end{aligned}$$
(6)

where \(J_\mathrm{{com}}\in {\mathbb {R}}^{{2s}\times {n}}\) denotes the compound Jacobian matrix. We can rewrite (6) as

$$\begin{aligned} \dot{q}(t) = J_\mathrm{{com}}^ + \dot{f}_{uv}(t), \end{aligned}$$
(7)

where \(J_\mathrm{{com}}^ + = {(J_\mathrm{{com}}^\mathrm{{T}}{J_\mathrm{{com}}})^{ - 1}}J_\mathrm{{com}}^\mathrm{{T}}\) is the pseudo-inverse of the compound Jacobian matrix. In practice, the manipulator is required to perform a servoing task in a reachable finite task-space [41]. To avoid the Jacobian matrix singularity, \(M_\mathrm{{c}}\) should be full rank. Hence, \(J_\mathrm{{com}}\) is full rank and its pseudo-inverse matrix exists, whose detailed illustration can be found in [42, 43].

By differentiating (7), the acceleration of joint angle q(t) is formulated as

$$\begin{aligned} \ddot{q}(t) = J_\mathrm{{com}}^ + \ddot{f}_{uv}(t) + \frac{\mathrm{{d}}}{{{\mathrm{{d}}t}}}(J_\mathrm{{com}}^ + )\dot{f}_{uv}(t). \end{aligned}$$
(8)
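The differential kinematics (6)–(7) reduce to matrix products. Below is a minimal numpy sketch with made-up Jacobian entries for two feature points (s = 2) on a 3-DOF arm, so that \(J_\mathrm{com}\) has full column rank and the left pseudo-inverse formula applies.

```python
import numpy as np

# Illustrative numbers only: stacked image Jacobian for two feature points (4x3).
J_I = np.array([[500.0,   0.0, -200.0],
                [  0.0, 500.0, -100.0],
                [480.0,   0.0, -150.0],
                [  0.0, 480.0,  -90.0]])
J_R = np.array([[-0.2, -0.1, 0.0],
                [ 0.3,  0.0, 0.0],
                [ 0.0,  0.2, 0.1]])             # 3x3 robot Jacobian
J_com = J_I @ J_R                               # compound Jacobian, eq. (6)

# Left pseudo-inverse J+ = (J^T J)^{-1} J^T, eq. (7); for a full-column-rank
# matrix this coincides with the Moore-Penrose pseudo-inverse.
J_plus = np.linalg.inv(J_com.T @ J_com) @ J_com.T

f_dot = np.array([2.0, -1.0, 2.0, -1.0])        # feature velocities (pixels/s)
q_dot = J_plus @ f_dot                          # joint velocities via eq. (7)
```

Note that for a single feature point (s = 1, so 2 rows and 3 columns) the product \(J_\mathrm{com}^\mathrm{T}J_\mathrm{com}\) is singular; the formula in (7) presumes \(2s \ge n\) with full column rank, consistent with the full-rank discussion in the text.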

Camera–manipulator dynamic model

Considering a general n link manipulator, whose dynamic model can be mathematically formulated as

$$\begin{aligned} N(q)\ddot{q} + B(q,\dot{q})\dot{q} + G(q) + A(\dot{q}) + {F _\mathrm{{d}}} = \tau , \end{aligned}$$
(9)

where \(N(q)\in {\mathbb {R}}^{{n}\times {n}}\) is the inertia matrix, \(B(q,\dot{q})\in {\mathbb {R}}^{{n}\times {n}}\) is the centrifugal and Coriolis matrix, \(G(q)\in {\mathbb {R}}^n\) is the gravitational term, \(A(\dot{q})\in {\mathbb {R}}^n\) denotes the friction term, \(F_\mathrm{{d}}\in {\mathbb {R}}^n\) indicates the external disturbance, and \(\tau \in {\mathbb {R}}^n\) denotes the output torque.

Combining (7)–(9), the dynamics of VS manipulators can be expressed as

$$\begin{aligned}&N(q)J_\mathrm{{com}}^ + \ddot{f}_{uv} + N(q)\frac{\mathrm{{d}}}{{{\mathrm{{d}}t}}}(J_\mathrm{{com}}^ + )\dot{f}_{uv} + B(q,\dot{q})J_\mathrm{{com}}^ + \dot{f}_{uv}\nonumber \\&\quad + G(q) + A(\dot{q}) + {F _\mathrm{{d}}} = \tau . \end{aligned}$$
(10)

Pre-multiplying both sides of (10) by \({(J_\mathrm{{com}}^ + )^\mathrm{{T}}}\), the dynamics of the VS manipulator is expressed in the workspace as

$$\begin{aligned}&N_\mathrm{{o}}(f_{uv})\ddot{f}_{uv} + {C_\mathrm{{o}}}(f_{uv},\dot{f}_{uv})\dot{f}_{uv} + {G_\mathrm{{o}}}(f_{uv})\nonumber \\&\quad + {A_\mathrm{{o}}}(\dot{f}_{uv}) + {F _\mathrm{{do}}} = {\tau _\mathrm{{o}}}, \end{aligned}$$
(11)

where \(N_\mathrm{{o}} (f_{uv}) = {(J_\mathrm{{com}}^ +)^\mathrm{{T}}}N(q)J_\mathrm{{com}}^ + \), \({C_\mathrm{{o}}}(f_{uv},\dot{f}_{uv}) = {(J_\mathrm{{com}}^ + ) ^\mathrm{{T}}}(N(q)\frac{\mathrm{{d}}}{{{\mathrm{{d}}t}}}(J_\mathrm{{com}}^ + ) + B(q,\dot{q})J_\mathrm{{com}}^ +)\), \({G_\mathrm{{o}}}(f_{uv}) = {(J_\mathrm{{com}}^ +) ^\mathrm{{T}}}G(q)\), \({A_\mathrm{{o}}}(\dot{f}_{uv}) = {(J_\mathrm{{com}}^ +)^\mathrm{{T}}}A(\dot{q})\), \({F _\mathrm{{do}}} = {(J_\mathrm{{com}}^ + )^\mathrm{{T}}}{F _\mathrm{{d}}}\), and \({\tau _\mathrm{{o}}} = {(J_\mathrm{{com}}^ + )^\mathrm{{T}}}\tau \).
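To make the workspace mapping in (11) concrete, here is a minimal numpy sketch with made-up nominal matrices for a 3-DOF arm. It illustrates only the congruence transform for the inertia term and the analogous gravity/torque mappings, not the full Coriolis expression.

```python
import numpy as np

# Illustrative nominal joint-space terms (made-up values).
N = np.diag([2.0, 1.5, 0.5])                    # inertia N(q)
G = np.array([0.0, 9.0, 1.2])                   # gravity G(q)
tau = np.array([0.5, -0.2, 0.1])                # joint torques

J_com = np.array([[-100.0, -90.0, -20.0],
                  [ 150.0, -20.0, -10.0],
                  [ -96.0, -78.0, -15.0],
                  [ 144.0, -18.0,  -9.0]])      # 4x3 compound Jacobian (assumed)
J_plus = np.linalg.pinv(J_com)                  # 3x4 pseudo-inverse

N_o = J_plus.T @ N @ J_plus                     # image-space inertia, eq. (11)
G_o = J_plus.T @ G                              # image-space gravity
tau_o = J_plus.T @ tau                          # image-space torque
```

The congruence transform preserves symmetry and positive semi-definiteness of the inertia term, which is what Property 1 later relies on.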

Due to uncertain kinematic parameters and unmodeled dynamics, the actual system parameters can be decomposed into nominal parts and uncertainties, so (11) can be written as

$$\begin{aligned}&({\bar{N}}_\mathrm{{o}}(f_{uv}) + \varDelta N_\mathrm{{o}}(f_{uv}))\ddot{f}_{uv} + ({{{\bar{B}}}_\mathrm{{o}}}(f_{uv},\dot{f}_{uv})\nonumber \\&\quad + \varDelta {B_\mathrm{{o}}}(f_{uv},\dot{f}_{uv}))\dot{f}_{uv}\nonumber \\&\quad + ({{{\bar{G}}}_\mathrm{{o}}}(f_{uv})+ \varDelta {G_\mathrm{{o}}}(f_{uv})) + ({{{\bar{A}}}_\mathrm{{o}}}(\dot{f}_{uv})\nonumber \\&\quad + \varDelta {A_\mathrm{{o}}}(\dot{f}_{uv})) + {F _\mathrm{{do}}} = {\tau _\mathrm{{o}}}, \end{aligned}$$
(12)

where \({\bar{N}}_\mathrm{{o}}(f_{uv})\), \({{{\bar{B}}}_\mathrm{{o}}}(f_{uv},\dot{f}_{uv})\), \({{{\bar{G}}}_\mathrm{{o}}}(f_{uv})\) and \({{{\bar{A}}}_\mathrm{{o}}}(\dot{f}_{uv})\) are the nominal parts, and \(\varDelta N_\mathrm{{o}}(f_{uv})\), \(\varDelta {B_\mathrm{{o}}}(f_{uv},\dot{f}_{uv})\), \(\varDelta {G_\mathrm{{o}}}(f_{uv})\) and \(\varDelta {A_\mathrm{{o}}}(\dot{f}_{uv})\) are the uncertainties.

By separating the uncertainties from the dynamics of camera–manipulator model, (12) can be reformulated as

$$\begin{aligned}&{\bar{N}}_\mathrm{{o}}(f_{uv})\ddot{f}_{uv} + {{\bar{B}}_\mathrm{{o}}}(f_{uv},\dot{f}_{uv})\dot{f}_{uv}\nonumber \\&\quad + {{\bar{G}}_\mathrm{{o}}}(f_{uv}) + {{\bar{A}}_\mathrm{{o}}}(\dot{f}_{uv}) + D(f_{uv}) = {\tau _\mathrm{{o}}}, \end{aligned}$$
(13)

where the lumped uncertainty \(D(f_{uv})\) is given as

$$\begin{aligned} D(f_{uv})= & {} \varDelta N_\mathrm{{o}}(f_{uv})\ddot{f}_{uv} + \varDelta {B_\mathrm{{o}}}(f_{uv},\dot{f}_{uv})\dot{f}_{uv}\nonumber \\&+ \varDelta {G_\mathrm{{o}}}(f_{uv}) + \varDelta {A_\mathrm{{o}}}(\dot{f}_{uv}) + {F _\mathrm{{do}}}. \end{aligned}$$
(14)

Before designing and analyzing the optimal feature tracking controller, the camera–manipulator dynamic system (13) is assumed to satisfy the following properties.

Property 1

The inertia matrix \({N_\mathrm{{o}}}(f_{uv})\) is symmetric and positive definite, and satisfies

$$\begin{aligned} {\lambda _1}\left\| \xi \right\| ^2 \le {\xi ^\mathrm{{T}}}{N_\mathrm{{o}}}(f_{uv})\xi \le {\lambda _2}\left\| \xi \right\| ^2,\quad \forall \xi \in {\mathbb {R}}^n, \end{aligned}$$

where \(\lambda _1\) and \(\lambda _2\) are positive constants.

Property 2

The time derivative of the inertia matrix \({N_\mathrm{{o}}}(f_{uv})\) and the centripetal–Coriolis matrix \({B_\mathrm{{o}}}(f_{uv},\dot{f}_{uv})\) satisfy the skew-symmetry property

$$\begin{aligned} {x^\mathrm{{T}}}\left\{ {\frac{1}{2}{{\dot{N}}_\mathrm{{o}}}(f_{uv}) - {B_\mathrm{{o}}}(f_{uv},\dot{f}_{uv})} \right\} x = 0,\quad \forall x \in {\mathbb {R}}^n. \end{aligned}$$

Assumption 1

The friction torque \({A_\mathrm{{o}}}(\dot{f}_{uv})\) is bounded by \(\left\| {{A_\mathrm{{o}}}(\dot{f}_{uv})} \right\| \le {\varphi _1}\) , where \({\varphi _1} \in {\mathbb {R}}\) is an unknown positive constant.

Assumption 2

The uncertain dynamics \({D}(f_{uv})\) is bounded by \(\left\| {{D}( f_{uv})} \right\| \le {\varphi _2}\) , where \({\varphi _2} \in {\mathbb {R}}\) is an unknown positive constant.

From the above properties, the VS manipulator system can be rewritten to facilitate the ADP design. By transforming the dynamic model (13), the state-space expression of the VS system is given as

$$\begin{aligned} \left\{ \begin{array}{l} {{\dot{x}}_1} = {x_2}\\ {{\dot{x}}_2} = k\left( x \right) + g\left( x \right) \left( {{\tau _\mathrm{{o}}} - D\left( x \right) } \right) \\ y = {x_1} \end{array} \right. , \end{aligned}$$
(15)

where \(x = {[{x_1},{x_2}]^\mathrm{{T}}} = {[f_{uv},\dot{f}_{uv}]^\mathrm{{T}}}\) with \({x_1},{x_2} \in {\mathbb {R}}^{2s}\) is the system state vector, y is the output vector, and k(x) and g(x) are defined as

$$\begin{aligned} k\left( x \right)= & {} - {{\bar{N}}_\mathrm{{o}}}^{ - 1}(f_{uv})\left( {{{{\bar{B}}}_\mathrm{{o}}}(f_{uv},\dot{f}_{uv})\dot{f}_{uv}(t) + {{{\bar{G}}}_\mathrm{{o}}}(f_{uv})} \right) , \\ g(x)= & {} {{\bar{N}}_\mathrm{{o}}}^{ - 1}(f_{uv}). \end{aligned}$$
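The state-space form (15) with the definitions of k(x) and g(x) above can be sketched as a right-hand-side function suitable for numerical integration. This is a minimal illustration with generic arguments; note that, as in the paper's definition of k(x), the nominal friction term is not included in k(x).

```python
import numpy as np

def state_space_rhs(x2, tau_o, D, N_bar, B_bar, G_bar):
    """Right-hand side of the state-space model (15):
       x1_dot = x2,  x2_dot = k(x) + g(x)(tau_o - D(x)),
    with k(x) = -N_bar^{-1}(B_bar x2 + G_bar) and g(x) = N_bar^{-1}."""
    g_x = np.linalg.inv(N_bar)                  # g(x) = inverse nominal inertia
    k_x = -g_x @ (B_bar @ x2 + G_bar)           # k(x)
    return x2, k_x + g_x @ (tau_o - D)          # (x1_dot, x2_dot)
```

For instance, with unit inertia, zero Coriolis/gravity terms and zero uncertainty, the feature acceleration equals the applied image-space torque.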

Assumption 3

k(x) and g(x) are locally Lipschitz continuous in their arguments with \(k(0)=0\).

Remark 1

As observed from (13), the input and output of the dynamic model are mapped from the indirect form (i.e., \(\tau \rightarrow \ddot{q},\dot{q} \rightarrow \dot{r} \rightarrow \dot{f}_{uv}\); see (4), (6) and (9)) to the direct form (i.e., \({\tau _\mathrm{{o}}} \rightarrow f_{uv},\dot{f}_{uv},\ddot{f}_{uv}\)). Moreover, a camera–manipulator dynamic model is established by taking the uncertainties \(D(f_{uv})\) into account. Consequently, the linearity-in-parameters property cannot be employed for the VS manipulator system. In this paper, an ADP-based control approach is presented to solve the feature tracking control problem of VS manipulator systems with uncertainties. The proposed scheme guarantees that the tracking errors of the closed-loop VS manipulator system converge to a small neighborhood of zero, i.e., the actual trajectories follow their desired trajectories.

ADP-based feature tracking controller

Optimal visual control

As is well known, the aim of optimal feature tracking control is to design an effective tracking control policy that makes the feature trajectory follow its desired trajectory. To achieve this objective, the feature tracking control can be obtained by combining the desired visual tracking control with the feature tracking error control.

Assumption 4

The desired feature trajectory \(f_{uv_\mathrm{{d}}}\) , the desired feature velocity \(\dot{f}_{uv_\mathrm{{d}}}\) and the desired feature acceleration \({\ddot{f}_{uv_\mathrm{{d}}}}\) are all bounded and known.

Letting \({x_{f_{uv_\mathrm{{d}}}}} = {[{f_{uv_\mathrm{{d}}}},{\dot{f}_{uv_\mathrm{{d}}}}]^\mathrm{{T}}}\) and \({\dot{x}_{f_{uv_\mathrm{{d}}}}} = {[{\dot{f}_{uv_\mathrm{{d}}}},{\ddot{f}_{uv_\mathrm{{d}}}}]^\mathrm{{T}}}\), the desired feature trajectory can be described as

$$\begin{aligned} {\dot{x}_{f_{uv_\mathrm{{d}}}}} = k({x_{f_{uv_\mathrm{{d}}}}}) + g({x_{f_{uv_\mathrm{{d}}}}}){\tau _{f_{uv_\mathrm{{d}}}}}, \end{aligned}$$
(16)

where \(\tau _{f_{uv_\mathrm{{d}}}}\) denotes the desired control torque. Then, the desired visual tracking controller can be obtained by

$$\begin{aligned} {\tau _{f_{uv_\mathrm{{d}}}}} = {g^ + }({x_{f_{uv_\mathrm{{d}}}}})({\dot{x}_{f_{uv_\mathrm{{d}}}}} - k({x_{f_{uv_\mathrm{{d}}}}})). \end{aligned}$$
(17)

From the state space expression of system (15), the feature tracking error dynamics can be expressed by

$$\begin{aligned} \left\{ {\begin{array}{*{20}{c}} {{e_\mathrm{{f}}} = x - {x_{f_{uv_\mathrm{{d}}}}}}\\ {{{\dot{e}}_\mathrm{{f}}} = \dot{x} - {{\dot{x}}_{f_{uv_\mathrm{{d}}}}}} \end{array}} \right. , \end{aligned}$$
(18)

where \(e_\mathrm{{f}}\) indicates the feature error and \({\dot{e}}_\mathrm{{f}}\) denotes its time derivative. For the state-space expression of the camera–manipulator system (15), the optimal objective is to derive the control law by minimizing the following infinite-horizon cost function

$$\begin{aligned} U\left( {{e_\mathrm{{f}}}\left( t \right) } \right)= & {} \int _t^\infty P\left( {{e_\mathrm{{f}}}(\sigma ),{\tau _{f_{e}}}(\sigma )} \right) + \alpha \left( {{\hat{D}}}{{(\sigma )}^{\mathrm{T}}}{{\hat{D}}}(\sigma )\right. \nonumber \\&\left. + {{\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}(\sigma )} \right) } \right) }^2} \right) \mathrm{{d}}\sigma , \end{aligned}$$
(19)

where \(P\left( {{e_\mathrm{{f}}},{\tau _\mathrm{{fe}}}} \right) = e_\mathrm{{f}}^\mathrm{{T}}Q{e_\mathrm{{f}}} + \tau _\mathrm{{fe}}^\mathrm{{T}}R{\tau _\mathrm{{fe}}}\) denotes the utility function with \(P\left( {{e_\mathrm{{f}}},{\tau _\mathrm{{fe}}}} \right) \ge 0\) and \(P\left( {0,0} \right) = 0\), \({\tau _\mathrm{{fe}}} = {\tau _\mathrm{{o}}} - {\tau _{f_{uv_\mathrm{{d}}}}}\) is the control input error, \(Q\in {\mathbb {R}}^{{2s}\times {2s}}\) and \(R\in {\mathbb {R}}^{{2s}\times {2s}}\) are positive definite matrices, \({{\hat{D}}}(t)\) is the estimate of the uncertainties, and \(\alpha > 0\) is a constant.
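The utility function and the integrand of the improved cost (19) are direct quadratic forms. The sketch below is a minimal numpy rendering; `D_hat` and `grad_U_sq` stand in for quantities produced by the observer and critic discussed later, and all numeric values are illustrative.

```python
import numpy as np

def utility(e_f, tau_fe, Q, R):
    """Utility term P(e_f, tau_fe) = e_f^T Q e_f + tau_fe^T R tau_fe of eq. (19)."""
    return e_f @ Q @ e_f + tau_fe @ R @ tau_fe

def cost_integrand(e_f, tau_fe, Q, R, alpha, D_hat, grad_U_sq):
    """Integrand of the improved cost (19): the utility plus the penalty
    alpha * (D_hat^T D_hat + (grad U*)^2) on the estimated uncertainties."""
    return utility(e_f, tau_fe, Q, R) + alpha * (D_hat @ D_hat + grad_U_sq)
```

The added \(\alpha \)-term is what distinguishes (19) from a standard quadratic cost: the estimated uncertainty magnitude is penalized alongside the tracking and control effort.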

Definition 1

[44] For the dynamic system (15), a control policy \(\tau _\mathrm{{fe}}\) is admissible with respect to the cost function (19) on a compact set \(\varOmega \) if \(\tau _\mathrm{{fe}}\) is continuous on \(\varOmega \) with \(\tau _\mathrm{{fe}}(0)=0\), \(\tau _\mathrm{{fe}}\) stabilizes the system on \(\varOmega \), and \(U\left( {{e_\mathrm{{f}}}} \right) \) is finite for all \({e_\mathrm{{f}}} \in \varOmega \).

Given an admissible control policy \({\tau _\mathrm{{fe}}} \in \varXi (\varOmega )\), the infinitesimal version of (19) is the so-called Lyapunov equation

$$\begin{aligned} 0= & {} P\left( {{e_\mathrm{{f}}}(\sigma ),{\tau _\mathrm{{fe}}}(\sigma )} \right) + \alpha \left( {{\hat{D}}}{{(\sigma )}^{\mathrm{T}}}{{\hat{D}}}(\sigma )\right. \nonumber \\&\left. + {{\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}(\sigma )} \right) } \right) }^2} \right) + {\left( {\nabla U({e_\mathrm{{f}}})} \right) ^\mathrm{{T}}}{\dot{e}_\mathrm{{f}}}, \end{aligned}$$
(20)

where \(U(0) = 0\) and \(\nabla U({e_\mathrm{{f}}}) = \frac{{\partial U({e_\mathrm{{f}}})}}{{\partial {e_\mathrm{{f}}}}}\) is the partial derivative of \(U({e_\mathrm{{f}}})\) with respect to \(e_\mathrm{{f}}\). The Hamiltonian and the improved cost function can be given by

$$\begin{aligned}&H\left( {{e_\mathrm{{f}}},{\tau _\mathrm{{fe}}},\nabla U\left( {{e_\mathrm{{f}}}} \right) } \right) = \alpha \left( {{{{{\hat{D}}}}^{\mathrm{T}}}{{\hat{D}}} + {{\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}} \right) } \right) }^2}} \right) \nonumber \\&\quad + P\left( {{e_\mathrm{{f}}},{\tau _\mathrm{{fe}}}} \right) + {\left( {\nabla U\left( {{e_\mathrm{{f}}}} \right) } \right) ^{\mathrm{T}}}{\dot{e}_\mathrm{{f}}}, \end{aligned}$$
(21)
$$\begin{aligned}&{U^*}\left( {{e_\mathrm{{f}}}} \right) = \mathop {\min }\limits _{{\tau _\mathrm{{fe}}} \in \varXi (\varOmega )}\! \int _0^\infty \! P\left( {{e_\mathrm{{f}}}(\sigma ),{\tau _\mathrm{{fe}}}(\sigma )} \right) \nonumber \\&\quad + \alpha \left( {{{\hat{D}}}{{(\sigma )}^{\mathrm{T}}}{{\hat{D}}}(\sigma )\! +\! {{\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}(\sigma )} \right) } \right) }^2}} \right) \mathrm{{d}}\sigma . \end{aligned}$$
(22)

Thus, the Hamilton–Jacobi–Bellman (HJB) equation can be written as

$$\begin{aligned} 0 = \mathop {\min }\limits _{{\tau _\mathrm{{fe}}} \in \varXi (\varOmega )} H\left( {{e_\mathrm{{f}}},{\tau _\mathrm{{fe}}},\nabla {U^*}\left( {{e_\mathrm{{f}}}} \right) } \right) , \end{aligned}$$
(23)

where \(\nabla {U^*}\left( {{e_\mathrm{{f}}}} \right) = \frac{{\partial {U^*}\left( {{e_\mathrm{{f}}}} \right) }}{{\partial {e_\mathrm{{f}}}}}\). If \({U^*}\left( {{e_\mathrm{{f}}}} \right) \) is continuously differentiable, the optimal feature tracking error controller of the VS system will be derived as

$$\begin{aligned} \tau _\mathrm{{fe}}^* = - \frac{1}{2}{R^{ - 1}}{g^{\mathrm{T}}}\left( x \right) \nabla {U^*}\left( {{e_\mathrm{{f}}}} \right) . \end{aligned}$$
(24)
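The closed-form policy (24) is a single linear-algebra step once the critic gradient is available. A minimal numpy sketch, with the gradient and system matrices passed in as generic arguments:

```python
import numpy as np

def optimal_policy(grad_U_star, g_x, R):
    """Optimal feature-tracking error control of eq. (24):
       tau_fe* = -(1/2) R^{-1} g^T(x) grad U*(e_f).
    Uses a linear solve instead of forming R^{-1} explicitly."""
    return -0.5 * np.linalg.solve(R, g_x.T @ grad_U_star)
```

For example, with \(R = I\) and \(g(x) = I\), a critic gradient of \([2, 4]^\mathrm{T}\) yields the control \([-1, -2]^\mathrm{T}\).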

According to (21) and (23), we can obtain

$$\begin{aligned} {\left( {\nabla U({e_\mathrm{{f}}})} \right) ^\mathrm{{T}}}{\dot{e}_\mathrm{{f}}}= & {} - P\left( {{e_\mathrm{{f}}}(\sigma ),{\tau _\mathrm{{fe}}}(\sigma )} \right) - \alpha \left( {{\hat{D}}}{{(\sigma )}^{\mathrm{T}}}{{\hat{D}}}(\sigma )\right. \nonumber \\&\left. + {{\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}(\sigma )} \right) } \right) }^2} \right) . \end{aligned}$$
(25)

Adaptive neural network observer design

The uncertainties are estimated by an adaptive NN observer, which can be formulated by

$$\begin{aligned} {\dot{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}} = k\left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) + g\left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) \left( {{\tau _\mathrm{{o}}} - {{\hat{D}}}({{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}})} \right) + \beta {O_\mathrm{{e}}}, \end{aligned}$$
(26)

where \({{{\hat{x}}}_{\mathrm{{fo}}_{uv}}}\) denotes the observation of the system state x, \(\beta \) is the positive definite observation gain matrix, and \({O_\mathrm{{e}}} = x - {{{\hat{x}}}_{\mathrm{{fo}}_{uv}}}\) denotes the state observer error.

Combining (15) with (26), we can present the observation error dynamics as

$$\begin{aligned} \begin{aligned} {{\dot{O}}_\mathrm{{e}}}\!&=\!k(x) \!+\! g(x)({\tau _\mathrm{{o}}}\! -\! D(x))\! -\! k\left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) \\&\quad - g\left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) \!\left( {{\tau _\mathrm{{o}}}\! -\! {{\hat{D}}}({{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}})} \!\right) \! -\! \beta \left( {x \!-\! {{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) \\&= \varGamma (x,{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}) - g\left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) (D(x) - {{\hat{D}}}({{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}})) - \beta {O_\mathrm{{e}}}\\ \end{aligned},\nonumber \\ \end{aligned}$$
(27)

where \(\varGamma (x,{{{\hat{x}}}_{\mathrm{{fo}}_{uv}}}) = {k_e}(x,{{{\hat{x}}}_{\mathrm{{fo}}_{uv}}}) + {g_e}(x,{{{\hat{x}}}_{\mathrm{{fo}}_{uv}}})({\tau _\mathrm{{o}}} - D(x))\), \({k_e}(x,{{{\hat{x}}}_{\mathrm{{fo}}_{uv}}}) = k(x) - k({{{\hat{x}}}_{\mathrm{{fo}}_{uv}}})\) and \({g_e}(x,{{{\hat{x}}}_{\mathrm{{fo}}_{uv}}}) = g(x) - g({{{\hat{x}}}_{\mathrm{{fo}}_{uv}}})\) are the observation errors of k(x) and g(x), respectively. From Assumption 3, there exists a positive constant \(\varepsilon \) such that \(\left\| g({{{\hat{x}}}_{\mathrm{{fo}}_{uv}}}) \right\| \le \varepsilon \).

Assumption 5

\(\varGamma (x,{{{\hat{x}}}_{\mathrm{{fo}}_{uv}}})\) is norm-bounded as \(\left\| {\varGamma (x,{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}})} \right\| \le {\omega _1}\), where \(\omega _1\) is a positive constant.

To estimate the uncertainties D(x), an RBFNN is constructed as

$$\begin{aligned} D(x) = W_\mathrm{{D}}^{\mathrm{T}}\varPhi \left( x \right) + {\chi _\mathrm{{D}}}, \end{aligned}$$
(28)

where \({W_\mathrm{{D}}} \in {\mathbb {R}}^{{l_1} \times 2s}\) denotes the ideal weight matrix, \(\varPhi \left( x \right) \in {\mathbb {R}}^{l_1}\) denotes the NN activation function, \(l_1\) indicates the number of neurons in the hidden layer, and \({\chi _\mathrm{{D}}}\) indicates the NN approximation error.

Let \({{\hat{W}}}_\mathrm{{D}} \) be the estimation of \(W_\mathrm{{D}}\). \({{\hat{D}}}({{{\hat{x}}}_{\mathrm{{fo}}_{uv}}})\) is the estimation of D(x) , which can be expressed as

$$\begin{aligned} {{\hat{D}}}({{{\hat{x}}}_{\mathrm{{fo}}_{uv}}}) = {{\hat{W}}}_\mathrm{{D}}^{\mathrm{T}}{{\hat{\varPhi }}} \left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) , \end{aligned}$$
(29)

where \({{\hat{W}}}_\mathrm{{D}}\) can be updated by

$$\begin{aligned} {{\dot{{{\hat{W}}}}}_\mathrm{{D}}} = - \mu {{{\hat{\varPhi }}} ^\mathrm{{T}}}\left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) O_\mathrm{{e}}^{\mathrm{T}}g\left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) , \end{aligned}$$
(30)

where \(\mu \) is a positive definite matrix. From (28) and (29), one obtains

$$\begin{aligned} \begin{aligned} D(x) - {{\hat{D}}}({{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}})&= W_\mathrm{{D}}^{\mathrm{T}}\varPhi \left( x \right) + {\chi _\mathrm{{D}}} - {{\hat{W}}}_\mathrm{{D}}^{\mathrm{T}}{{\hat{\varPhi }}} \left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) \\&= W_\mathrm{{D}}^{\mathrm{T}}{{\tilde{\varPhi }}} \left( {x,{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) + {{\tilde{W}}}_\mathrm{{D}}^{\mathrm{T}}{{\hat{\varPhi }}} \left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) + {\chi _\mathrm{{D}}}, \end{aligned}\nonumber \\ \end{aligned}$$
(31)

where \({{\tilde{W}}}_\mathrm{{D}} = {W_\mathrm{{D}}} - {{{\hat{W}}}_\mathrm{{D}}}\) is the weight estimation error, \({{\tilde{\varPhi }}} \left( {x,{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) = \varPhi \left( x \right) - {{\hat{\varPhi }}} \left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) \) is the estimation error of the activation function.

Assumption 6

The local observation error \({W_\mathrm{{e}}} = W_\mathrm{{D}}^{\mathrm{T}}{\tilde{\varPhi }} \left( {x,}\right. \left. {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) + {\chi _\mathrm{{D}}}\) is norm-bounded as \(\left\| {{W_\mathrm{{e}}}} \right\| \le {\omega _2}\), where \(\omega _2 > 0\) is an unknown constant.

Theorem 1

For the VS manipulator systems with uncertainties, the proposed adaptive NN observer ensures that the observation error is UUB with the help of the NN updating law (30).

Proof

Choose a Lyapunov function candidate as

$$\begin{aligned} {Z_1} = \frac{1}{2}O_\mathrm{{e}}^\mathrm{{T}}{O_\mathrm{{e}}} + \frac{1}{2}tr\left( {{{\tilde{W}}_\mathrm{{D}}}^{\mathrm{T}}{\mu ^{ - 1}}{{{{\tilde{W}}}}_\mathrm{{D}}}} \right) . \end{aligned}$$
(32)

The time derivative of (32) is

$$\begin{aligned} \begin{aligned} {{\dot{Z}}_1}&= O_\mathrm{{e}}^\mathrm{{T}}{{\dot{O}}_\mathrm{{e}}} - tr\left( {{{{{\tilde{W}}}}_\mathrm{{D}}}^{\mathrm{T}}{\mu ^{ - 1}}{{\dot{{{\hat{W}}}}}_\mathrm{{D}}}} \right) \\&= O_\mathrm{{e}}^\mathrm{{T}}(\varGamma (x,{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}) - g({{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}})(D(x) - {{\hat{D}}}({{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}))- \beta {O_\mathrm{{e}}})\\&\quad - tr\left( {{{{{\tilde{W}}}}_\mathrm{{D}}}^{\mathrm{T}}{\mu ^{ - 1}}{{\dot{{{\hat{W}}}}}_\mathrm{{D}}}} \right) \\&\le {\omega _1}\left\| {{O_\mathrm{{e}}}} \right\| - O_\mathrm{{e}}^\mathrm{{T}}g({{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}})({W_\mathrm{{e}}} + {{\tilde{W}}}_\mathrm{{D}}^{\mathrm{T}}{{\hat{\varPhi }}} \left( {{{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}} \right) )\\&\quad - {\lambda _{\min }}(\beta ){\left\| {{O_\mathrm{{e}}}} \right\| ^2} - tr\left( {{{{{\tilde{W}}}}_\mathrm{{D}}}^{\mathrm{T}}{\mu ^{ - 1}}{{\dot{{\hat{W}}}}_\mathrm{{D}}}} \right) \\&\le {\omega _1}\left\| {{O_\mathrm{{e}}}} \right\| - {\lambda _{\min }}\left( \beta \right) {\left\| {{O_\mathrm{{e}}}} \right\| ^2} - {\omega _2}\varepsilon \left\| {{O_\mathrm{{e}}}} \right\| \\&\quad - O_\mathrm{{e}}^\mathrm{{T}}g({{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}){{\tilde{W}}}_\mathrm{{D}}^{\mathrm{T}}{{\hat{\varPhi }}} ({{{{\hat{x}}}}_{\mathrm{{fo}}_{uv}}}) - tr\left( {{{{{\tilde{W}}}}_\mathrm{{D}}}^{\mathrm{T}}{\mu ^{ - 1}}{{\dot{{{\hat{W}}}}}_\mathrm{{D}}}} \right) \end{aligned}, \end{aligned}$$
(33)

where \({\lambda _{\min }}\left( \beta \right) \) is the minimum eigenvalue of the matrix \(\beta \). Thus, substituting (29) into (33), we have

$$\begin{aligned} {\dot{Z}_1} \le - (({\lambda _{\min }}(\beta )\left\| {{O_\mathrm{{e}}}} \right\| ) - ({\omega _1} + {\omega _2}\varepsilon ))\left\| {{O_\mathrm{{e}}}} \right\| . \end{aligned}$$
(34)

It can be seen that \({\dot{Z}_1} \le 0\) when \(O_\mathrm{{e}}\) lies outside the compact set \({\varOmega _1} = \left\{ {{O_\mathrm{{e}}}:\left\| {{O_\mathrm{{e}}}} \right\| \le \frac{{{\omega _1} + {\omega _2}\varepsilon }}{{{\lambda _{\min }}\left( \beta \right) }}} \right\} \). According to Lyapunov's direct method, the state observation error is guaranteed to be UUB. This concludes the proof.

Critic NN and implementation

As a powerful tool for learning nonlinear functions, an NN is widely used to approximate the cost function (22). Thereby, the improved cost function can be expressed by a critic NN on the compact set \({\varOmega _2}\), which is given by

$$\begin{aligned} {T_U}({e_\mathrm{{f}}}) = W_U^\mathrm{{T}}{\varPhi _U}({e_\mathrm{{f}}}) + {\chi _U}, \end{aligned}$$
(35)

where \({W_U} \in {\mathbb {R}}{^{{l_2}}}\) denotes the ideal weight vector, \({\varPhi _U}({e_\mathrm{{f}}}) \in {\mathbb {R}}{^{{l_2}}}\) indicates the NN basis function, \(l_2\) denotes the number of neurons in the hidden layer, and \(\chi _U\) is the NN approximation error. The partial derivative of \({T_U}({e_\mathrm{{f}}})\) with respect to \(e_\mathrm{{f}}\) is

$$\begin{aligned} \nabla {T_U}({e_\mathrm{{f}}}) = {(\nabla {\varPhi _U}({e_\mathrm{{f}}}))^\mathrm{{T}}}{W_U} + \nabla {\chi _U}, \end{aligned}$$
(36)

where \(\nabla {\varPhi _U}({e_\mathrm{{f}}}) = \frac{{\partial {\varPhi _U}({e_\mathrm{{f}}})}}{{\partial {e_\mathrm{{f}}}}}\) and \(\nabla {\chi _U}\) are the partial derivatives of the basis function \({\varPhi _U}({e_\mathrm{{f}}})\) and the NN approximation error \(\chi _U\), respectively. A critic NN is utilized to approximate the improved cost function as

$$\begin{aligned} {{{\hat{T}}}_U}({e_\mathrm{{f}}}) = {{\hat{W}}}_U^\mathrm{{T}}{\varPhi _U}({e_\mathrm{{f}}}). \end{aligned}$$
(37)

Thus, the partial derivative of \({{{\hat{T}}}_U}({e_\mathrm{{f}}})\) with respect to \(e_\mathrm{{f}}\) is

$$\begin{aligned} \nabla {{{\hat{T}}}_U}({e_\mathrm{{f}}}) = {(\nabla {\varPhi _U}({e_\mathrm{{f}}}))^\mathrm{{T}}}{{{\hat{W}}}_U}. \end{aligned}$$
(38)

Considering (23), the ideal optimal feature tracking error control policy can be described by

$$\begin{aligned} \tau _\mathrm{{fe}}^* = - \frac{1}{2}{R^{ - 1}}{g^{\mathrm{T}}}(x)({(\nabla {\varPhi _U}({e_\mathrm{{f}}}))^\mathrm{{T}}}{W_U} + \nabla {\chi _U}). \end{aligned}$$
(39)

Thus, according to (37) and (38), the approximate optimal feature tracking error control can be given by

$$\begin{aligned} {{\hat{\tau }}} _\mathrm{{fe}}^* = - \frac{1}{2}{R^{ - 1}}{g^{\mathrm{T}}}(x){(\nabla {\varPhi _U}({e_\mathrm{{f}}}))^\mathrm{{T}}}{{{\hat{W}}}_U}. \end{aligned}$$
(40)
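A minimal numerical sketch of (40) is given below. The quadratic basis, \(g(x)\), and \(R\) are toy choices for a two-dimensional error, not the paper's \(l_2 = 10\) basis; only the formula itself is taken from (40):

```python
import numpy as np

def grad_phi_U(e_f):
    """Gradient of a toy quadratic basis [e1^2, e1*e2, e2^2]
    (rows = basis functions, columns = components of e_f)."""
    e1, e2 = e_f
    return np.array([[2.0 * e1, 0.0],
                     [e2,       e1],
                     [0.0,      2.0 * e2]])

def tau_fe_hat(e_f, W_hat_U, g_x, R):
    """Approximate optimal error controller, Eq. (40):
    tau = -1/2 * R^{-1} g(x)^T (grad Phi_U(e_f))^T W_hat_U."""
    return -0.5 * np.linalg.solve(R, g_x.T @ (grad_phi_U(e_f).T @ W_hat_U))
```

Using `np.linalg.solve` instead of forming \(R^{-1}\) explicitly is the standard numerically stable choice.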

For the uncertain system (15), considering (20) and (36), one can obtain

$$\begin{aligned} \begin{aligned}&0 = \alpha \left( {{\hat{D}}}{{(\sigma )}^{\mathrm{T}}}{{\hat{D}}}(\sigma ) + {{\left( {{{(\nabla {\varPhi _U}({e_\mathrm{{f}}}))}^\mathrm{{T}}}{W_U} + \nabla {\chi _U}} \right) }^\mathrm{{T}}}\right. \\&\quad \times \left. \left( {{{(\nabla {\varPhi _U}({e_\mathrm{{f}}}))}^\mathrm{{T}}}{W_U} + \nabla {\chi _U}} \right) \right) \\&\quad +P\left( {{e_\mathrm{{f}}}(\sigma ),{\tau _\mathrm{{fe}}}(\sigma )} \right) + {\left( {{{(\nabla {\varPhi _U}({e_\mathrm{{f}}}))}^\mathrm{{T}}}{W_U} + \nabla {\chi _U}} \right) ^\mathrm{{T}}}{{\dot{e}}_f}. \end{aligned}\nonumber \\ \end{aligned}$$
(41)

Therefore, the Hamiltonian can be expressed by

$$\begin{aligned} \begin{aligned}&H\left( {{e_\mathrm{{f}}},{\tau _\mathrm{{fe}}},{W_U}} \right) \\&\quad = \alpha \left( \begin{array}{l} {{\hat{D}}}{(\sigma )^{\mathrm{T}}}{{\hat{D}}}(\sigma ) + \\ {\left( {{{(\nabla {\varPhi _U}({e_\mathrm{{f}}}))}^\mathrm{{T}}}{W_U} + \nabla {\chi _U}} \right) ^\mathrm{{T}}}\left( {{{(\nabla {\varPhi _U}({e_\mathrm{{f}}}))}^\mathrm{{T}}}{W_U} + \nabla {\chi _U}} \right) \end{array} \right) \\&\qquad + P\left( {{e_\mathrm{{f}}}(\sigma ),{\tau _\mathrm{{fe}}}(\sigma )} \right) + {\left( {{{(\nabla {\varPhi _U}({e_\mathrm{{f}}}))}^\mathrm{{T}}}{W_U}} \right) ^\mathrm{{T}}}{{\dot{e}}_f}\; - {E_{UH}}, \end{aligned} \end{aligned}$$
(42)

where \({E_{UH}} = - {(\nabla {\chi _U})^\mathrm{{T}}}{\dot{e}_\mathrm{{f}}}\) is the approximation residual. The approximate Hamiltonian is derived in the same manner, which is expressed as

$$\begin{aligned} \begin{aligned}&{{\hat{H}}}\left( {{e_\mathrm{{f}}},{\tau _\mathrm{{fe}}},{{{{\hat{W}}}}_U}} \right) \\&\quad = \alpha \left( \begin{array}{l} {{\hat{D}}}{(\sigma )^{\mathrm{T}}}{{\hat{D}}}(\sigma ) + \\ {\left( {{{(\nabla {\varPhi _U}({e_\mathrm{{f}}}))}^\mathrm{{T}}}{{{{\hat{W}}}}_U}} \right) ^\mathrm{{T}}}\left( {{{(\nabla {\varPhi _U}({e_\mathrm{{f}}}))}^\mathrm{{T}}}{{{{\hat{W}}}}_U}} \right) \end{array} \right) \\&\qquad + P\left( {{e_\mathrm{{f}}}(\sigma ),{\tau _\mathrm{{fe}}}(\sigma )} \right) + {\left( {{{(\nabla {\varPhi _U}({e_\mathrm{{f}}}))}^\mathrm{{T}}}{{{{\hat{W}}}}_U}} \right) ^\mathrm{{T}}}{{\dot{e}}_f} \end{aligned}.\nonumber \\ \end{aligned}$$
(43)

Defining the error function as \({E_U} = H\left( {{e_\mathrm{{f}}},{\tau _\mathrm{{fe}}},{W_U}} \right) - {{\hat{H}}}\left( {{e_\mathrm{{f}}},{\tau _\mathrm{{fe}}},{{{{\hat{W}}}}_U}} \right) \) and combining (42) with (43), we have

$$\begin{aligned} {E_U} = {E_{UH}} - {{\tilde{W}}}_U^\mathrm{{T}}\nabla {\varPhi _U}({e_\mathrm{{f}}}){\dot{e}_\mathrm{{f}}}, \end{aligned}$$
(44)

where \({{{\tilde{W}}}_U} = {W_U} - {{{\hat{W}}}_U}\) is the weight estimation error.

Assumption 7

The NN function \(\delta = \nabla {\varPhi _U}({e_\mathrm{{f}}}){\dot{e}_\mathrm{{f}}}\) is norm-bounded as \(\left\| \delta \right\| \le {\delta _e}\), where \(\delta _e\) is a positive constant.

To adjust the critic NN weight vector \({{{\hat{W}}}_U}\), we can minimize the objective function \({E_\mathrm{{obj}}} = \frac{1}{2}E_U^\mathrm{{T}}{E_U}\) with the updating law as

$$\begin{aligned} {\dot{{{\hat{W}}}}_U} = - {\mu _U}\left( {\frac{{\partial {E_\mathrm{{obj}}}}}{{\partial {{{{\hat{W}}}}_U}}}} \right) = - {\mu _U}{E_U}\delta , \end{aligned}$$
(45)

where \(\mu _U\) is the learning rate of the critic NN. Hence, considering (44) and (45), one can obtain the updating law of the weight estimation error as

$$\begin{aligned} {\dot{{{\tilde{W}}}}_U} = - {\dot{{{\hat{W}}}}_U} = {\mu _U}{E_U}\delta . \end{aligned}$$
(46)
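The updating law (45) is a gradient step on \(E_\mathrm{{obj}}\). A sketch of one step follows; note that \(E_U\) itself is not directly measurable, so this sketch assumes some computable residual (in practice the approximate Hamiltonian (43) is driven toward zero) is supplied by the caller:

```python
import numpy as np

def critic_step(W_hat_U, residual, delta, mu_U=0.5, dt=1e-3):
    """One Euler step of the updating law (45):
    dW/dt = -mu_U * residual * delta, i.e. gradient descent on 0.5 * residual**2,
    where delta = grad Phi_U(e_f) @ e_f_dot (Assumption 7)."""
    return W_hat_U - dt * mu_U * residual * delta
```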

Theorem 2

For the uncertain camera–manipulator system, the weight vector approximation error of the critic NN can be guaranteed to be UUB with the updating law (45).

Proof

Choose a Lyapunov function candidate as

$$\begin{aligned} {Z_2} = \frac{1}{{2{\mu _U}}}{{\tilde{W}}}_U^\mathrm{{T}}{{{\tilde{W}}}_U}. \end{aligned}$$
(47)

The time derivative of (47) is

$$\begin{aligned} \begin{aligned} {{\dot{Z}}_2}&= - \frac{1}{{{\mu _U}}}{{\tilde{W}}}_U^\mathrm{{T}}{{\dot{{{\tilde{W}}}}}_U}\\&= {{\tilde{W}}}_U^\mathrm{{T}}{E_U}\delta \\&= {{\tilde{W}}}_U^\mathrm{{T}}{E_{UH}}\delta - {\left\| {{{{{\tilde{W}}}}_U}\delta } \right\| ^2}. \end{aligned} \end{aligned}$$
(48)

According to Young’s inequality, we can obtain

$$\begin{aligned} {\dot{Z}_2} \le \frac{1}{2}E_{UH}^2 - \frac{1}{2}{\left\| {{{\tilde{W}}_U}\delta } \right\| ^2}. \end{aligned}$$
(49)

Therefore, \({\dot{Z}_2} < 0\) when the weight approximation error \({{{{\tilde{W}}}}_U}\) lies outside the compact set \({\varOmega _3} = \left\{ {{{{{\tilde{W}}}}_U}:\left\| {{{{{\tilde{W}}}}_U}} \right\| \le \left\| {\frac{{{E_{UH}}}}{{{\delta _e}}}} \right\| } \right\} \). Thus, the weight approximation error can be guaranteed to be UUB. This concludes the proof.

Stability analysis

Unlike existing visual tracking control methods, which neglect optimal control performance, this paper improves the cost function with information from an adaptive NN observer. Furthermore, via the ADP approach, we develop a novel optimal feature tracking error control method that optimizes the control performance and ensures the system stability.

The optimal feature tracking controller, which is composed of the desired tracking controller \(\tau _\mathrm{{fd}}\) and the feature tracking error controller \(\tau _\mathrm{{fe}}\), is derived as

$$\begin{aligned} {\tau _\mathrm{{o}}} = {\tau _\mathrm{{fd}}} + {{{\hat{\tau }}} _\mathrm{{fe}}}. \end{aligned}$$
(50)

Theorem 3

Consider the system dynamics of the VS manipulator (15) and the improved cost function (19). The closed-loop VS system is UUB under the optimal tracking control policy (50).

Proof

Choose a Lyapunov function candidate as

$$\begin{aligned} {Z_3} = \frac{1}{2}{e_\mathrm{{f}}}^\mathrm{{T}}{e_\mathrm{{f}}} + {U^*}\left( {{e_\mathrm{{f}}}} \right) . \end{aligned}$$
(51)

Considering (15), (16) and (24), the time derivative of (51) is expressed as

$$\begin{aligned} \begin{aligned} {{\dot{Z}}_3}&= {e_\mathrm{{f}}}^\mathrm{{T}}{{\dot{e}}_f} + {\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}} \right) } \right) ^\mathrm{{T}}}{{\dot{e}}_f}\\&= {e_\mathrm{{f}}}^\mathrm{{T}}({k_\mathrm{{d}}}(x,{x_\mathrm{{fd}}}) + {g_\mathrm{{d}}}(x,{x_\mathrm{{fd}}}){\tau _\mathrm{{o}}} + g(x)D(x))\\&\quad - P\left( {{e_\mathrm{{f}}},{\tau _\mathrm{{fe}}}} \right) - \alpha \left( {{{{{\hat{D}}}}^{\mathrm{T}}}{{\hat{D}}} + {{\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}} \right) } \right) }^2}} \right) \end{aligned},\nonumber \\ \end{aligned}$$
(52)

where \({k_\mathrm{{d}}}(x,{x_\mathrm{{fd}}}) = k(x) - k({x_\mathrm{{fd}}})\) and \({g_\mathrm{{d}}}(x,{x_\mathrm{{fd}}}) = g(x) - g({x_\mathrm{{fd}}})\). According to Assumption 3, \(\varepsilon _f\) is a positive constant such that \(\left\| {{k_\mathrm{{d}}}(x,{x_\mathrm{{fd}}})} \right\| \le {\varepsilon _f}\left\| {{e_\mathrm{{f}}}} \right\| \). Assuming \(\left\| {g(x)} \right\| \le {\kappa _1}\), \(\left\| {g({x_\mathrm{{fd}}})} \right\| \le {\kappa _2}\) and \(\left\| {g(x) - g({x_\mathrm{{fd}}})} \right\| \le {\kappa _3}\), we have

$$\begin{aligned} \begin{aligned}&{{\dot{Z}}_3} \le {\varepsilon _f}{\left\| {{e_\mathrm{{f}}}} \right\| ^2} + {\kappa _{\mathrm{{3}}}}\left\| {{\tau _\mathrm{{fd}}}} \right\| \left\| {{e_\mathrm{{f}}}} \right\| + ({\kappa _{\mathrm{{3}}}} + {\kappa _2})\left\| {{\tau _\mathrm{{fe}}}} \right\| \left\| {{e_\mathrm{{f}}}} \right\| \\&\qquad + {\kappa _1}\left\| {D(x)} \right\| \left\| {{e_\mathrm{{f}}}} \right\| - {\lambda _{\min }}(Q){\left\| {{e_\mathrm{{f}}}} \right\| ^2}\\&\qquad - {\lambda _{\min }}(R){\left\| {{\tau _\mathrm{{fe}}}} \right\| ^2} - \alpha ({{{{\hat{D}}}}^{\mathrm{T}}}{{\hat{D}}} + {\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}} \right) } \right) ^2})\\&\quad \le {\varepsilon _f}{\left\| {{e_\mathrm{{f}}}} \right\| ^2} + \frac{3}{2}{\left\| {{e_\mathrm{{f}}}} \right\| ^2} + \frac{1}{2}{({\kappa _{\mathrm{{3}}}} + {\kappa _2})^2}{\left\| {{\tau _\mathrm{{fe}}}} \right\| ^2}\\&\qquad + \frac{1}{2}\kappa _1^2{\left\| {D(x)} \right\| ^2} + \frac{1}{2}\kappa _3^2{\left\| {{\tau _\mathrm{{fd}}}} \right\| ^2}\\&\qquad - {\lambda _{\min }}(Q){\left\| {{e_\mathrm{{f}}}} \right\| ^2} - {\lambda _{\min }}(R){\left\| {{\tau _\mathrm{{fe}}}} \right\| ^2}\\&\qquad - \alpha ({{{{\hat{D}}}}^{\mathrm{T}}}{{\hat{D}}} + {\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}} \right) } \right) ^2})\\&\quad \le - \left( {\lambda _{\min }}(Q) - {\varepsilon _f} - \frac{3}{2}\right) {\left\| {{e_\mathrm{{f}}}} \right\| ^2} - \alpha {\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}} \right) } \right) ^2}\\&\qquad - \left( {{\lambda _{\min }}(R) - \frac{1}{2}{{({\kappa _{\mathrm{{3}}}} + {\kappa _2})}^2}} \right) {\left\| {{\tau _\mathrm{{fe}}}} \right\| ^2}\\&\qquad - \left( \alpha - \frac{1}{2}\kappa _1^2\right) {{\hat{D}}}{({{\hat{x}}})^\mathrm{T}}{{\hat{D}}}({{\hat{x}}}). \end{aligned}\nonumber \\ \end{aligned}$$
(53)

Assume \(\left\| {{\tau _\mathrm{{fd}}}} \right\| \le {\zeta _1}\) and \(\left\| {D(x) - {{\hat{D}}}({{\hat{x}}})} \right\| \le {\zeta _2}\), where \(\zeta _1\) and \(\zeta _2\) are positive constants. Then we have

$$\begin{aligned} \begin{aligned}&{{\dot{Z}}_3} \le - \left( {({\lambda _{\min }}(Q) - {\varepsilon _f} - \frac{3}{2})\left\| {{e_\mathrm{{f}}}} \right\| - \frac{{\kappa _3^2\zeta _1^2 + \kappa _1^2\zeta _2^2}}{{2\left\| {{e_\mathrm{{f}}}} \right\| }}} \right) \\&\quad \left\| {{e_\mathrm{{f}}}} \right\| - \alpha {\left( {\nabla {U^*}\left( {{e_\mathrm{{f}}}} \right) } \right) ^2}\\&\qquad - \left( {{\lambda _{\min }}(R) - \frac{1}{2}{{({\kappa _{\mathrm{{3}}}} + {\kappa _2})}^2}} \right) {\left\| {{\tau _\mathrm{{fe}}}} \right\| ^2}\\&\qquad - \left( \alpha - \frac{1}{2}\kappa _1^2\right) {{\hat{D}}}{({{\hat{x}}})^\mathrm{T}}{{\hat{D}}}({{\hat{x}}}). \end{aligned}\nonumber \\ \end{aligned}$$
(54)

Therefore, it can be seen that \({\dot{Z}_3} < 0\) when \(e_\mathrm{{f}}\) lies outside the compact set \({\varOmega _4} = \left\{ {{e_\mathrm{{f}}}:\left\| {{e_\mathrm{{f}}}} \right\| \le \sqrt{\frac{{\kappa _3^2\zeta _1^2 + \kappa _1^2\zeta _2^2}}{{2({\lambda _{\min }}(Q) - {\varepsilon _f} - \frac{3}{2})}}} } \right\} \), provided that the following conditions hold:

$$\begin{aligned} \left\{ {\begin{aligned}&{{\lambda _{\min }}(Q) \ge {\varepsilon _f} + \frac{3}{2}}\\&{{\lambda _{\min }}(R) \ge \frac{1}{2}{{({\kappa _3} + {\kappa _2})}^2}}\\&{\alpha \ge \frac{1}{2}\kappa _1^2} \end{aligned}} \right. . \end{aligned}$$
(55)
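The gain conditions (55) are easy to verify numerically before running the controller. A sketch, assuming \(Q\) and \(R\) are symmetric and the bound constants are known:

```python
import numpy as np

def gains_satisfy_55(Q, R, alpha, eps_f, kappa1, kappa2, kappa3):
    """Check the sufficient stability conditions (55) on the design parameters."""
    lam_Q = np.linalg.eigvalsh(Q).min()   # lambda_min(Q)
    lam_R = np.linalg.eigvalsh(R).min()   # lambda_min(R)
    return (lam_Q >= eps_f + 1.5
            and lam_R >= 0.5 * (kappa3 + kappa2) ** 2
            and alpha >= 0.5 * kappa1 ** 2)
```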

Simulation tests

In this section, we employ a 3-DOF humanoid manipulator with one feature point marked on the end-effector for simulation tests [45, 46]. The performance of the proposed ADP-based feature tracking control is evaluated in two cases, i.e., without and with uncertainties.

Table 1 VS system parameters
Table 2 Controller and observer parameters
Fig. 2

Image feature trajectories a the first element of feature vector. b The second element of feature vector

Fig. 3

Image feature error a the first element of feature vector. b The second element of feature vector

Fig. 4

Image feature velocity trajectories a the first element of feature vector. b The second element of feature vector

The 3-DOF manipulator system and the control parameters are presented in Tables 1 and 2. The intrinsic matrix \(M_\mathrm{{in}}\) and the extrinsic matrix \(M_\mathrm{{ex}}\) are given by

$$\begin{aligned} {M_\mathrm{{in}}}= & {} \left[ {\begin{array}{*{20}{c}} {100}&{} \quad 0&{} \quad {25}&{} \quad 0\\ 0&{} \quad {100}&{} \quad {29}&{} \quad 0\\ 0&{} \quad 0&{} \quad 1&{} \quad 0 \end{array}} \right] , \\ {M_\mathrm{{ex}}}= & {} \left[ {\begin{array}{*{20}{c}} {0.5}&{} \quad { - 0.3}&{} \quad 0&{} \quad {0.2}\\ 0&{} \quad 0&{} \quad { - 0.2}&{} \quad {0.2}\\ {0.2}&{} \quad {0.5}&{} \quad 0&{} \quad {0.9}\\ 0&{} \quad 0&{} \quad 0&{} \quad 1 \end{array}} \right] . \end{aligned}$$

Define the desired feature trajectories as

$$\begin{aligned} {f_\mathrm{{d}}}(1)= & {} 450 + 20\sin (t), \\ {f_\mathrm{{d}}}(2)= & {} 65 + 20\cos (t). \end{aligned}$$
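For reference, the desired trajectory above is a circle of radius 20 pixels centred at (450, 65) in the image plane; a direct transcription:

```python
import numpy as np

def f_d(t):
    """Desired image feature trajectory: a circle of radius 20 centred at (450, 65)."""
    return np.array([450.0 + 20.0 * np.sin(t),
                     65.0 + 20.0 * np.cos(t)])
```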

In the adaptive NN observer, a Gaussian-type function is selected as the activation function. The center matrix of the basis functions \(\mathrm{{Bc}}\) is

$$\begin{aligned} \mathrm{{Bc}} = \left[ {\begin{array}{*{20}{c}} {420}&{} \quad {435}&{} \quad {450}&{} \quad {465}&{} \quad {480}\\ {40}&{} \quad {55}&{} \quad {65}&{} \quad {75}&{} \quad {90}\\ {30}&{} \quad {20}&{} \quad 0&{} \quad {20}&{} \quad {30}\\ {10}&{} \quad 5&{} \quad 0&{} \quad { - 5}&{} \quad {10} \end{array}} \right] , \end{aligned}$$

and the width of the activation function is \(\mathrm{{Bb}} = 80\). The improved cost function (19) is approximated by a critic NN, whose weight vector is \({{{{\hat{W}}}}_U} = {[{{{{\hat{W}}}}_{U1}},{{{{\hat{W}}}}_{U2}},{{{{\hat{W}}}}_{U3}},{{{{\hat{W}}}}_{U4}},{{{{\hat{W}}}}_{U5}},{{{{\hat{W}}}}_{U6}},{{{{\hat{W}}}}_{U7}},{{{{\hat{W}}}}_{U8}},{{{{\hat{W}}}}_{U9}},{{{{\hat{W}}}}_{U10}}]^\mathrm{{T}}}\) with the initial value \({{{{\hat{W}}}}_U} = {[7,3,50,5,4,5,15,1,0.5,1.5]^\mathrm{{T}}}\). The activation function is chosen as \({\varPhi _U} = {[e_{f1}^2,{e_{f1}}{e_{f2}},{e_{f1}}{e_{f3}},{e_{f1}}{e_{f4}},e_{f2}^2,{e_{f3}}{e_{f2}},{e_{f2}}{e_{f4}},e_{f3}^2,e_{f4}^2,{e_{f3}}{e_{f4}}]^\mathrm{{T}}}\).
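The listed quadratic basis and initial weights translate directly into code; the ordering below follows the activation vector given above:

```python
import numpy as np

def phi_U(e_f):
    """Quadratic critic basis for e_f in R^4 (l2 = 10 monomials)."""
    e1, e2, e3, e4 = e_f
    return np.array([e1 * e1, e1 * e2, e1 * e3, e1 * e4,
                     e2 * e2, e3 * e2, e2 * e4,
                     e3 * e3, e4 * e4, e3 * e4])

# Initial critic weights as given in the simulation setup.
W_U0 = np.array([7, 3, 50, 5, 4, 5, 15, 1, 0.5, 1.5])

def T_U_hat(e_f):
    """Approximate cost \\hat{T}_U(e_f) = \\hat{W}_U^T Phi_U(e_f), Eq. (37)."""
    return W_U0 @ phi_U(e_f)
```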

Case 1: VS system without uncertainties

We employ three different initial feature points to verify the visual tracking performance of the proposed ADP scheme. Moreover, it is assumed that the uncertainties can be neglected. The initial feature points are given by

$$\begin{aligned} f_1= & {} [460,70]^\mathrm{{T}}, \\ f_2= & {} [455,90]^\mathrm{{T}}, \\ f_3= & {} [445,65]^\mathrm{{T}}. \end{aligned}$$

Simulation results are shown in Figs. 2, 3, 4 and 5. The visual tracking trajectories are illustrated in Fig. 2: the actual feature trajectories follow their desired ones from all three initial feature points. The image tracking errors are displayed in Fig. 3 to illustrate the visual tracking performance intuitively; the trajectory starting from the initial point \(f_2\) converges faster than the others. The velocity trajectory curves of the VS manipulator system are smooth and continuous, except for a slight oscillation at the beginning, as shown in Fig. 4. Feature curves on the image plane are depicted in Fig. 5. From the feature tracking trajectories and their error curves, the VS system is shown to be asymptotically stable.

Case 2: VS system with uncertainties

To test the robustness of the proposed method, we consider a simple servoing task with different uncertainties. Let the initial state be [460, 70, 0, 0] and the initial observation state be [461, 69, 0, 0]. The uncertainties, a constant vector and a sinusoidal noise, are given by

$$\begin{aligned} D_1= & {} {\left[ {50,50} \right] ^\mathrm{{T}}}, \\ D_2= & {} \left[ {\begin{array}{*{20}{c}} {100\sin (t) + 100\cos (t)}\\ {150\sin (0.5t) + 50\cos (t)} \end{array}} \right] . \end{aligned}$$
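In code, the two disturbance profiles used in Case 2 are simply:

```python
import numpy as np

D1 = np.array([50.0, 50.0])  # constant disturbance

def D2(t):
    """Sinusoidal disturbance of Case 2."""
    return np.array([100.0 * np.sin(t) + 100.0 * np.cos(t),
                     150.0 * np.sin(0.5 * t) + 50.0 * np.cos(t)])
```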
Fig. 5

Image feature trajectories on the image plane

Simulation results are shown in Figs. 6, 7, 8 and 9. The velocity tracking trajectories still follow the desired ones in both uncertain cases, as shown in Figs. 6 and 8. As shown in Figs. 7 and 9, the uncertainties are estimated well within a short period of time by the developed scheme.

Fig. 6

Image feature velocity trajectories a the first element of feature vector. b The second element of feature vector

Fig. 7

Estimated uncertainties \({{\hat{D}}}_1\) a \({{\hat{D}}}_1(1)\). b \({{\hat{D}}}_1(2)\)

Fig. 8

Image feature velocity trajectories a the first element of feature vector. b The second element of feature vector

Fig. 9

Estimated uncertainties \({{\hat{D}}}_2\) a \({{\hat{D}}}_2(1)\). b \({{\hat{D}}}_2(2)\)

Fig. 10

Image feature trajectories of three schemes a the first element of feature vector. b The second element of feature vector

Fig. 11

Image feature velocity trajectories of three schemes a the first element of feature vector. b The second element of feature vector

Fig. 12

Image feature errors of three schemes a the first element of feature vector. b The second element of feature vector

Fig. 13

Image feature trajectories on the image plane a ADP scheme. b ANN scheme. c ASMC scheme

To further exhibit the performance of the proposed ADP-based controller, comparison results with an adaptive neural network (ANN) scheme [47] and an adaptive sliding mode control (ASMC) scheme [48] are also provided. The uncertainty is set as the constant vector \(D_1\). The image feature positions and velocity trajectories of the proposed scheme, the ANN scheme and the ASMC scheme are illustrated in Figs. 10 and 11. The settling time of the VS system under the proposed scheme (about 1.8 s) is longer than that under the ANN scheme (about 0.4 s) and the ASMC scheme (about 0.2 s). The image tracking errors of the three methods are depicted in Fig. 12. The error curve of the ASMC scheme has the fastest convergence rate, despite an obvious fluctuation in Fig. 12b, and the error curve of the ANN scheme has a small overshoot compared to our method. In contrast, the feature tracking responses of the proposed scheme exhibit no oscillation or overshoot and show smooth transient performance in Fig. 12c. The comparison of the image feature positions of the three methods on the image plane is shown in Fig. 13; the features are tracked over a complete circle period, which validates the accuracy of our method. To quantify the tracking accuracy, three performance indices are defined: \(E_\mathrm{{max}}\), the maximum absolute image feature error; \(E_\mathrm{{min}}\), the minimum absolute image feature error; and the mean-square error (MSE) of the image feature error:

$$\begin{aligned} E_\mathrm{{max}}= & {} \max (|f_{uv}-f_\mathrm{{d}}|), \\ E_\mathrm{{min}}= & {} \min (|f_{uv}-f_\mathrm{{d}}|), \\ \mathrm{{MSE}}= & {} \frac{1}{N}\sum _{t=1}^N (f_{uv}-f_\mathrm{{d}})^2. \end{aligned}$$
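These three indices translate directly to NumPy. Here `f_uv` and `f_d` stand for arrays of sampled actual and desired features (the names follow the formulas above; the array layout is an assumption of this sketch):

```python
import numpy as np

def tracking_metrics(f_uv, f_d):
    """Return (E_max, E_min, MSE) of the image feature error f_uv - f_d."""
    err = f_uv - f_d
    return np.abs(err).max(), np.abs(err).min(), np.mean(err ** 2)
```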

After 2 s, the numerical comparison results of the proposed ADP-based scheme, the ANN scheme and the ASMC scheme are listed in Table 3, where the underline marks the minimum value in each row. The results clearly indicate that the proposed ADP-based scheme achieves higher tracking accuracy than the other two methods.

Table 3 Numerical comparison of three methods

In summary, the proposed scheme can fulfill the tracking tasks well in the presence of uncertainties.

Conclusion

In this paper, a feature tracking control scheme based on ADP has been proposed for VS manipulators with uncertainties. With the uncertainties effectively estimated by the adaptive NN observer, an improved cost function is designed to account for their influence. The improved HJB equation is solved by a critic NN, from which the approximate optimal feature tracking error controller is derived directly. The feature tracking controller is then obtained by combining the optimal feature error controller and the desired controller. Moreover, the VS system is guaranteed to be UUB by Lyapunov stability analysis. Simulation results illustrate the effectiveness of the proposed feature tracking control scheme and show that the proposed controller successfully controls the VS manipulator, a highly nonlinear dynamic system.

In this study, we investigated the visual servoing control problem for manipulators subject to unknown dynamics with energy cost optimization. In future work, the dynamic control of manipulators with time delay, uncertainties in the Jacobian matrix, and depth information, as well as VS control with image processing, are potential research topics.