Planning and visual-servoing for robotic manipulators in ROS

This article presents a path planning strategy based on a probabilistic roadmap (PRM) and visual servo control (visual servoing) that allows a Motoman HP20D industrial robot to move from an initial position to a random final position in the presence of fixed obstacles. The process begins by applying the PRM algorithm to take the robot from an initial position to a point in space with a free line of sight to the target; visual servoing is then applied to bring the robot to the desired position, where an image captured by a camera located at the robot's end effector matches a reference image located on the upper surface of a rectangular prismatic object. Algorithms and experiments were developed in simulation; in particular, the visual servo controller, which includes the dynamic model of the robot and an image sensor subject to realistic lighting, was developed in the Robot Operating System (ROS) environment.


Introduction
Automatic robot movement is a complex task requiring the integration of different areas of engineering, including computer vision, dynamic systems modeling, and robot control. Autonomous robotics is developed mainly using two approaches, path planning and visual servoing, both of which allow guiding a robot through a series of positions in the joint or task space, from an initial position to a desired one.
The first approach is path planning, a field of research that seeks collision-free movement for a robot without considering its dynamic model. The techniques addressed by path planning include soft computing-based algorithms (Beom and Cho 1995; Hu et al. 2004; Garcia et al. 2009); analytical path definition-based techniques such as potential fields (Borenstein and Koren 1991), where field lines are mathematically modeled to connect points in space and are deformed by the objects present therein; and sampling-based planning (SBP) (Barraquand and Latombe 1991). This notion of sampling the configuration space gave rise to the most widely used planning algorithms, namely the probabilistic roadmap method (PRM), which searches for paths over a roadmap (Kavraki et al. 1996), and rapidly-exploring random trees (RRT) (LaValle 1998).
The PRM algorithm is used to solve for multiple paths in an environment, whereas RRT performs single-query planning and, as such, is faster at resolving a single path. Both algorithms are probabilistically complete. These base algorithms have been modified to work in unstructured, dynamic environments and to generate smooth, optimal paths; one example is the smoothly RRT (S-RRT) planner (Wei and Ren 2018).
Applications that use path planning include molecular modeling and interaction (Al-Bluwi et al. 2012; Gipson et al. 2012) and autonomous navigation (Yang et al. 2013).
The second approach is robot control, which is where visual servoing is found. This is a joint control technique that defines a robot's movement based on information captured by a visual sensor in the robot's workspace. Three types of visual servoing exist, depending on how this information is used: image-based visual servoing (IBVS), position-based visual servoing (PBVS), and hybrid visual servoing (HVS). IBVS control (Espiau et al. 1992; Malis 2004; Marey and Chaumette 2008) uses coordinates on an image plane corresponding to observation points S that vary over time; these are used to calculate an error with respect to the reference, from which a compensation signal is computed to guarantee convergence of the error to zero. PBVS control (Thuilot et al. 2002; Allen et al. 1993; Wilson et al. 1996) takes the object's position and orientation as parameters that are compared to the reference to calculate the error, while the HVS method (Malis et al. 1999) integrates the two approaches to formulate the control law. Visual control can also be classified according to the camera position: if the camera is static in the robot's workspace, the configuration is called eye-to-hand; if the camera is mounted on the robot, so that the robot's movement moves the camera, the configuration is called eye-in-hand.
Another consideration related to the visual controller architecture is the control loop structure. Two cases are considered here. In the first, the visual control law provides set points that are input to a controller internal to the robot, which drives the robot's joints; this is known as dynamic look-and-move. In the second architecture, the visual controller sends commands directly to the joints; this is known as direct visual servo control.
This article is structured as follows: Sect. 2 presents the base path planning and visual servoing methods and algorithms for industrial robots. It also explores the theory related to homographic decomposition, used to estimate a camera's position depending on observed visual features, and the sensor that will be used for visually controlling the HP20D robot. Then, Sect. 3 shows the implementation of the algorithms and experiments to validate operation and performance. Finally, Sect. 4 contains a discussion of the results obtained and posits conclusions from the automatic path planning strategy for a manipulator robot.

Robot model environment
The proposed path planning strategy requires, for its obstacle avoidance, the HP20D robot's direct and inverse kinematic models. In other words, given the joint position Q of a robot with six joints, we need to establish the position P of its end effector, defined by its coordinates {x, y, z} and its spatial orientation {α, β, γ} with respect to a base reference frame {base}. These models, together with their mathematical derivation using the Denavit-Hartenberg approach, can be found in Camacho et al. (2018).
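The direct kinematic model referred to above can be sketched with the standard Denavit-Hartenberg composition. The HP20D's actual DH table is given in Camacho et al. (2018); the two-link planar parameters in the usage below are placeholders, not the robot's, and the function names are ours.

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform for one link, standard Denavit-Hartenberg convention."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(q, dh_table):
    """Chain the link transforms for joint vector q.

    dh_table holds one (d, a, alpha) triple per joint; returns the 4x4
    base-to-end-effector pose, from which {x, y, z} and the orientation
    angles can be extracted.
    """
    T = np.eye(4)
    for qi, (d, a, alpha) in zip(q, dh_table):
        T = T @ dh_transform(qi, d, a, alpha)
    return T
```

For a placeholder planar two-link arm with unit link lengths, `forward_kinematics([0.0, 0.0], [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)])` places the end effector at x = 2 on the base frame's x-axis.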
The visual servoing phase, on the other hand, is implemented based on the direct and inverse Jacobian and rigid body motion theory. The robot's physical description is a core element of this model, as its materials, inertias, and link lengths intervene in the dynamic phenomena that arise between links when the robot is in motion. This dynamic model is incorporated in the ROS Gazebo simulator (Koenig and Howard 2004), as is the camera, whose images are processed to feed back the velocities of the visual features detected on the target to which we wish to take the robot's end effector. Figure 1 contains a view of the robot, its working environment, and the image sensor.

PRM-based path planning algorithm
The probabilistic roadmap path planning algorithm was initially presented in Kavraki et al. (1996). It includes two phases. The first is the construction of an undirected graph G = (V, E), where the nodes V are collision-free parametric positions of the robot within the free workspace Q_free, and the edges E are collision-free paths, also contained in Q_free, described by a local planner Δ, which in its simplest form is a linear interpolator in the robot's joint space. The second phase, or query phase, consists of connecting an initial node q_init and a final node q_goal to G and searching for the shortest possible collision-free path. This query is performed with an informed search algorithm such as A*, although a viable path cannot always be obtained from G if the roadmap does not completely cover the free space Q_free. If this occurs, the graph G must be extended with new seed nodes placed using different techniques, for example as a function of the number of connections of the existing nodes in G.
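The two phases can be sketched compactly. `sample_free` and `edge_free` below stand in for the collision checker the text assumes (illustrative names); the k-nearest connection rule is one common choice for building V and E, and the query phase is A* with a Euclidean heuristic.

```python
import heapq
import numpy as np

def prm_build(sample_free, n_nodes, k, edge_free):
    """Construction phase: sample collision-free nodes, then connect each node to
    its k nearest neighbours when the local-planner segment is also collision-free."""
    nodes = [sample_free() for _ in range(n_nodes)]
    edges = {i: [] for i in range(n_nodes)}
    for i, q in enumerate(nodes):
        dists = [np.linalg.norm(q - p) for p in nodes]
        for j in np.argsort(dists)[1:k + 1]:          # skip self (index 0)
            if edge_free(q, nodes[j]):
                edges[i].append((int(j), dists[j]))
                edges[int(j)].append((i, dists[j]))
    return nodes, edges

def prm_query(nodes, edges, start, goal):
    """Query phase: A* over the roadmap; returns a node-index path or None
    when the roadmap does not cover Q_free well enough (seed nodes needed)."""
    h = lambda i: np.linalg.norm(nodes[i] - nodes[goal])
    open_set = [(h(start), 0.0, start, [start])]
    closed = set()
    while open_set:
        _, g, i, path = heapq.heappop(open_set)
        if i == goal:
            return path
        if i in closed:
            continue
        closed.add(i)
        for j, w in edges[i]:
            if j not in closed:
                heapq.heappush(open_set, (g + w + h(j), g + w, j, path + [j]))
    return None
```

With a fully connectable sampling (no obstacles), the query returns a path from the start node to the goal node.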
A well-known variation of the basic PRM is the so-called Lazy PRM (Bohlin and Kavraki 2000). This version builds G, including q_init and q_goal, over the entire space Q. It searches for the shortest path between the nodes of interest, T_shortest, and then checks the nodes and edges included in that path for collisions. If it detects points in the solution path that interfere with obstacles, these nodes or edges are disconnected from G, and the process is repeated until a minimum collision-free path T_free is found, or until seed nodes need to be introduced, as can be seen in Fig. 2.
In our approach, the Lazy PRM algorithm introduces some improvements over the version described by Bohlin and Kavraki, specifically in the "Check path for collision" step and in the node enhancement block.
The standard procedure for collision checking is to discretize each edge in T_shortest at a fixed step size and test every interpolated point along every segment of the feasible path. In this work, we instead define a variable step size whose magnitude decreases gradually as iteration n advances, according to a function f_ss(n). The step size is saved for each segment in E and recovered when the collision checker needs it, adjusting its value depending on the current iteration.
This procedure speeds up the algorithm and quickly generates a feasible early path. Nevertheless, this path could still be in collision, because the test resolution might skip a small area of Q_obs; therefore, the feasible path under test is checked again at the minimum step size before validating T_shortest as T_free.
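The variable step-size check can be sketched as follows. The exact schedule f_ss(n) is not specified in the text, so a geometric decay clamped at the minimum validation resolution is assumed here; `point_in_collision` stands in for the collision detector, and all values are illustrative.

```python
import numpy as np

def f_ss(n, s_max=0.5, s_min=0.02, decay=0.5):
    """Assumed step-size schedule: starts coarse and shrinks geometrically with
    iteration n, never below the minimum resolution used for final validation."""
    return max(s_min, s_max * decay ** n)

def edge_in_collision(q_a, q_b, step, point_in_collision):
    """Discretise the segment q_a -> q_b at the given step and test each sample."""
    q_a, q_b = np.asarray(q_a, float), np.asarray(q_b, float)
    length = np.linalg.norm(q_b - q_a)
    n_samples = max(2, int(np.ceil(length / step)) + 1)
    for t in np.linspace(0.0, 1.0, n_samples):
        if point_in_collision(q_a + t * (q_b - q_a)):
            return True
    return False
```

Early iterations call `edge_in_collision` with the coarse `f_ss(0)`; before promoting T_shortest to T_free, the same segments are re-checked with the minimum step size.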
Furthermore, in the node enhancement block, the original version establishes that when a path from q_init to q_goal is not feasible, several nodes are added to the roadmap. Some are distributed uniformly in Q and the others around cue points called seeds, saved while checking the edges of feasible paths for collisions; the first point where a collision occurs along a feasible path segment is marked as a seed. In this work, we additionally set every vertex of Q_obs as a seed. This assumption is easy to meet, because a 3D reconstruction of the robot's workspace describes each obstacle as a set of vertices.

Visual servoing
Visual control encompasses different areas, including robot modeling, control theory, and image processing. Implementing this type of methodology can make robotic systems more flexible. The technique can be implemented with different types of cameras, whether monocular, stereo, or RGB-D. The information captured by the camera is compared to a reference to generate an error signal used as input by the controller.
The work presented here is based on an IBVS controller with an eye-in-hand configuration and a dynamic look-and-move architecture, given their simplicity and advantages. A monocular camera is positioned at the end effector, from which sufficient information is obtained for the control law. Some basic foundations of visual control theory useful for developing this work are provided below.
First, formulating the visual control law requires a relationship between the robot's motion and the changes in image features induced by an object observed in the camera's field of view. Consider the angular velocity ^cΩ(t) = [ω_x, ω_y, ω_z]^T and the translational velocity ^cT(t) = [T_x, T_y, T_z]^T of the end effector, transferred to a point ^cP = [x, y, z]^T attached to an object and described in the camera frame {c}, in the form shown in (1) and (2):

^cṖ = −^cΩ(t) × ^cP − ^cT(t),  (1)

which, written out in components, gives

ẋ = −T_x − ω_y z + ω_z y,
ẏ = −T_y − ω_z x + ω_x z,  (2)
ż = −T_z − ω_x y + ω_y x.

The point ^cP is projected on the image plane of the camera with pixel coordinates s = [u, v]^T; the relationship between the image-plane coordinates and the coordinates of ^cP is expressed by the pinhole camera model as

u = λ x / z,  v = λ y / z,  (3)

where λ is the focal length of the camera.
Velocities on the image plane can be obtained by differentiating (3). Substituting (2) into these derivatives, we obtain the following expression:

[u̇, v̇]^T = L_s(u, v, z) [T_x, T_y, T_z, ω_x, ω_y, ω_z]^T,  (4)

where

L_s(u, v, z) = [ −λ/z    0      u/z    u v/λ          −(λ² + u²)/λ    v
                  0     −λ/z    v/z    (λ² + v²)/λ    −u v/λ         −u ].
The relationship between the movement of the features on the image plane and the velocity of a point with respect to the camera is given in (4). The matrix that relates them is commonly known as the image Jacobian or interaction matrix. To control an n-DOF robot with n = 6, at least i ≥ 3 feature points are required, where i is the number of detected feature points. This work used i = 4: four pairs of visual features corresponding to the corners of the top face of a box on which an image was painted and used as a template to guide the robot's motion control. The package, our object of interest, is seen by the camera held at the robot's end effector. The four visual reference points s_ri used for calculating the interaction matrix are estimated using a homography, a matrix that transforms visual features from the template image to the observed image. This matrix is built from dozens of SIFT feature matches using a parametric estimation algorithm that rejects wrong matches. Hence, the quality of the interaction matrix, as measured by its condition number, improves directly with the quality of the homography estimation, which in turn grows with the number of feature matches detected, producing better visual parameters for building the interaction matrix.
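The homography estimation step can be sketched as follows. The work uses SIFT matches filtered by a robust parametric estimator; the minimal version below shows only the direct linear transform (DLT) fit on clean synthetic correspondences, omitting the outlier-rejection stage, and the function names are illustrative.

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct linear transform: solve for H (up to scale) such that dst ~ H @ src,
    from n >= 4 point correspondences, via the SVD null vector of the stacked
    constraint matrix."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(A, float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]                 # fix the scale for readability

def project(H, pts):
    """Apply homography H to 2D points, including homogeneous normalisation."""
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

In the article's pipeline, `project` would map the four template corners into the current image to obtain the reference points s_ri; a RANSAC loop around `estimate_homography` would reject wrong SIFT matches.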
Thus, the reference velocity for the internal control loop can be determined. To do this, we first solve (4) for V_c, with which we obtain:

V_c = L_s^−1 ṡ.  (5)

Because L_s^−1 does not exist when 2i ≠ n, and here 2i > n, L_s^−1 must be approximated by the pseudo-inverse L_s^+ = (L_s^T L_s)^−1 L_s^T, which solves a least-squares problem. To guarantee that 2i is greater than n, we propose using the path planning and visual servoing techniques in cooperation: the robot reaches a pose at the end of path planning where the camera captures the object of interest, and at this point the sensor retrieves enough image feature parameters to solve for L_s^+. Finally, the objective of the visual controller is to minimize an error e(t), which can be expressed as:

e(t) = s(t) − s_r.  (6)

Substituting (6) into (5), we obtain:

V_c = L_s^+ ė(t).  (7)

To ensure an exponential decrease of the error to zero, we impose ė(t) = −K e(t), which yields:

V_c = −L_s^+ K e(t),  (8)
where K is an adjustable gain matrix that allows determining the convergence velocity. Given that four points of interest were used, the interaction matrix has the stacked form

L_s = [L_s1^T, L_s2^T, L_s3^T, L_s4^T]^T,

where each L_si is the 2 × 6 pair of rows of the matrix in (4) evaluated at feature point i.
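The control law can be sketched in a few lines, stacking one 2 × 6 interaction block per feature point and pseudo-inverting the result. The row-pair form below is the classical IBVS convention (one common sign choice); the focal length, depths, and scalar gain (a scalar stand-in for the gain matrix K) are illustrative values, and the function names are ours.

```python
import numpy as np

def interaction_row_pair(u, v, z, lam):
    """Interaction (image Jacobian) rows for one feature at pixel (u, v),
    depth z, focal length lam; classical IBVS form, one common convention."""
    return np.array([
        [-lam / z, 0.0, u / z, u * v / lam, -(lam**2 + u**2) / lam, v],
        [0.0, -lam / z, v / z, (lam**2 + v**2) / lam, -u * v / lam, -u],
    ])

def ibvs_velocity(features, refs, depths, lam, gain):
    """Camera velocity command: stack the 2x6 blocks of the feature points,
    pseudo-invert, and feed back the image-plane error e = s - s_ref
    (scalar-gain case of the adjustable gain matrix K)."""
    L = np.vstack([interaction_row_pair(u, v, z, lam)
                   for (u, v), z in zip(features, depths)])
    e = (np.asarray(features, float) - np.asarray(refs, float)).ravel()
    return -gain * (np.linalg.pinv(L) @ e)   # pinv computes (L^T L)^-1 L^T here
```

When the observed features coincide with the reference features, the error is zero and the commanded 6-DOF camera velocity vanishes, which is the controller's equilibrium.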
Having defined the control law and the interaction matrix, note that this matrix depends on the coordinates of the points of interest on the image plane, the camera's focal length, and z, the normal distance from the image plane to the feature position in ℝ³. Depth z must therefore be obtained for complete knowledge of the interaction matrix.
A homography decomposition method, described in Malis and Vargas (2007), was used for this work. Its analytical development returns four possible solutions, two of which can be discarded. Homography decomposition methods obtain the components of the perspective transformation between two different views, the rotation R, the scaled translation t, and the normal n of the observed plane, from the homography relating two captures taken at different positions, which can be described as:

H = R + t n^T.  (9)

To obtain these components {R, t, n}, we first define a symmetric matrix S in terms of the homography transformation.
Auxiliary variables, (10) and (11), are then introduced and substituted into (9); by matching analogous terms and using an auxiliary variable, three second-order equations are obtained and solved, with the index n running from one to three. Introducing a further variable, defined in terms of the previous ones, leads to a second-order equation whose solution gives its value, and n is then determined by analogy between this equation and the definition of the auxiliary variable y. From the possible solutions to the homography decomposition problem (Malis and Vargas 2007), four are arrived at, each expressed, in (16), in terms of the components s_hij and the minors M_hij of the symmetric matrix, together with ε = sign(M_h13), for the two cases e = a, b. Dividing (16) by its norm yields n_e; with a similar procedure, the solutions for the translation and rotation vectors are obtained. The translation vector is obtained from (10) and (11) expressed in terms of the components of the symmetric matrix.
The rotation matrix can be obtained from the definition of the homography matrix, (9), together with (10).

PRM
To test and improve the Lazy PRM algorithm before its implementation in the 6 degree-of-freedom (DOF) space of the HP20D robot, we first started with a simplified version of the environment in 2 DOF. The 2D version seeks paths made up of collision-free straight segments in a rectangular environment containing convex obstacles. The 6D version extends the roadmap idea to three-dimensional space and is applied in a simulated environment to the movement of a Motoman HP20D industrial robot, taking its end effector from a start point to a target point in the presence of objects built from cubes within its workspace.

PRM 2D

Figure 3a shows a synthetic environment with static polygonal obstacles, with the start point q_init represented as a green node on the graph and the end point q_goal as a magenta node. An initial graph, generated randomly over the entire space, connects the start and end nodes; links to the starting node are shown in green, links to the ending node in magenta, and all other links in red. Figure 3 shows the iterative process that seeks a collision-free path from q_init to q_goal in 2 DOF. The sequence of Fig. 3a, b shows the initial roadmap, a feasible path in collision, and the updated map without the nodes and edges of the obtained path found in C_obs. Likewise, the sequence of Fig. 3b, c shows the refined map after the first iteration, a second path found with interference, and its removal from the roadmap, together with the addition of nodes due to the impossibility of finding a third feasible path. Finally, in Fig. 3d, a new viable, collision-free path is found.
To measure the performance of the Lazy PRM 2D planner, an experiment was performed in which the algorithm was run 10 times per combination of the maximum number of connections [5, 10, 15] and the number of start nodes [30, 50, 100, 200, 500]. The process recorded the algorithm's overall processing time and the number of nodes in each solution path (Fig. 4), as well as the number of calls to the collision detector due to the nodes and edges of the feasible paths generated (Fig. 5). Finally, Fig. 6 shows the time used by each of the main stages of the Lazy PRM algorithm: roadmap generation, local planner, and collision check.

PRM 3D
This section covers path planning for the 6-DOF Motoman HP20D industrial robot using the Lazy PRM algorithm. The robot and its workspace are simplified using prismatic links and primitives, which allow efficient collision checking between the robot's links themselves and between the links and the objects located within the workspace (Rodriguez-Garavito et al. 2018). This object modeling is conservative and adjustable to the robot's real shape. Figure 7 contains a visual description of the workspace, showing the robot's starting position q_init and ending position q_goal. Figure 8 shows the HP20D robot's path planning process for moving from q_init to q_goal: Fig. 8a shows a feasible path in collision, and Fig. 8b contains a representation of its associated roadmap. Figure 8c then shows a feasible, collision-free path obtained after eliminating the nodes and edges in collision, planning new paths, and progressively refining the roadmap. The final path is obtained with a linear interpolator in the joint space; in other words, the path shown in Fig. 8d is the movement the robot's end effector actually follows, while Fig. 8c is an interpretation of the succession of points in XYZ space that makes up the path found.
Finally, just as in the 2D case, the performance of the Lazy PRM algorithm is evaluated for a manipulator robot in a six-dimensional joint space, where collision-free paths are expected to be found. The proposed experiment consists of running the Lazy PRM planner 40 times for different numbers of initial nodes [200, 300, 400, 500, 700, 1000, 3000] on the roadmap, and then measuring the average overall processing time of the algorithm and the number of nodes in the solution paths (Fig. 9). The number of calls to the collision detector to evaluate nodes and segments in feasible paths is also counted (Fig. 10). Finally, Fig. 11 records the processing time of each stage of the Lazy PRM 3D algorithm: local planner, collision detector, and manipulation of the graph representing the roadmap.

Visual control
To demonstrate the effectiveness of the visual control technique, the ROS simulated model of the HP20D manipulator was used together with a box-type object of interest and a monocular camera in an eye-in-hand configuration. A reference image is first taken, which is used to determine the error required to calculate the controller input. Since the control law requires this error, the positions of a group of visual features on the image plane at instant i must be related to the reference image. For this, the SIFT algorithm together with the RANSAC parametric estimator was used to determine a homography transformation matrix, which relates the visual features of the reference image to those of the image captured at instant i and thus yields these positions. Figure 12 shows the evolution of the robot's position and the camera's perspective at six instants of a visual control trajectory (Fig. 13); these images show a basic configuration with an object on a table, where Fig. 12a corresponds to the robot's starting position and Fig. 12f to the final position once the system is stable.
During the execution of this experiment, the joint velocities and errors tend to zero, as can be seen in Figs. 14 and 15.

Conclusions
This work proposes a flexible strategy to provide a robot with the ability to move freely within its workspace, avoiding obstacles by combining motion techniques, namely path planning and visual servoing. The two algorithms cooperate to guarantee that the robot, working under an eye-in-hand configuration, has the object of interest within its field of view when starting the visual servoing phase. In this fashion, the visual features are available to calculate an accurate homography and an interaction matrix with low uncertainty. Each motion technique was tested independently. The Lazy PRM implementation produces some findings from the 2 and 6 DOF experiments. First, Figs. 4 and 9 show an elbow effect, from which we can establish a suitable number of initial nodes with respect to processing time: around 100 for 2D planning and 500 for 3D planning. In addition, it is possible to verify that the maximum number of connections allowed for each node does not correlate directly with the processing time. Second, the Lazy PRM planner reaches a solution in a minimum of 50 ms with a path size of 23 nodes in the 2 DOF environment, while in the 6 DOF environment it does so in 15 s with a path size of 5 nodes, which is reasonable for an online planning query in an environment subject to uncertainty. Finally, to conclude the analysis of the global Lazy PRM planner, we observe that algorithm processing times change with the application environment. In the 2 DOF case, as the number of nodes increases, the task that takes up the most run time is the local search for feasible paths. In contrast, in 6 DOF, as the number of nodes on the connection graph increases, the costliest task is the collision detector; its implementation needs to be efficient and based on a conservative simplification of the environment.
Implementing the camera, manipulator, and environment in ROS provides a suitable framework for cooperatively testing both visual servoing and planning, making possible their future migration to a real industrial setting.
In Fig. 12 we can see the sequence of camera frames and manipulator positions evolving along a visual servoing trajectory, which shows smooth changes in robot position. However, the graph in Fig. 13, which describes the trajectories of the four feature positions taken as reference, shows an overshoot at the end of the motion. In principle, this seems to be related to the time used by the visual control algorithm to calculate its output, although the overshoot does not appear significant. Considering that the vision algorithm required to determine the homography takes the most time in a control inference, a more efficient vision algorithm would produce smoother position and velocity changes and a shorter stabilization time.
The proposed technique also addresses the loss of field of view (FOV) issue, since the planner can be rerun so that the object of interest becomes visible again.
As future work, based on the motion techniques developed here, visual servoing and path planning for industrial robots, we propose transferring these algorithms to a real implementation based on an RBX1 robot integrated in ROS. In previous works, we have developed a vision sensor for package-like objects; this will be the link to close the control loop and tackle several autonomous manipulation tasks.

Funding: Open Access funding provided by Colombia Consortium.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.