The main purpose of the presented framework is to enable robots to acquire a novel set of skills to explore various environments, identify their characteristics, adapt to them accordingly, build knowledge, and use that knowledge in future interactions. This methodology is based on the concept of self-adaptability, which remains active even after the knowledge about a task or an environment has been built. In this way, the robot can adapt starting from a reasonable initial condition, even if the environmental conditions change.
The theoretical and technological components required to build such a framework are integrated into five main modules, illustrated in Fig. 2: (1) a Cartesian impedance controller whose parameters can be tuned online, (2) a multi-axes self-tuning impedance unit that tunes the aforementioned parameters when an interaction with the environment is predicted, (3) a trajectory planner that calculates the spatial points to be reached by the controller, (4) a visual perception module that locates the materials’ positions in the robot workspace, and (5) a Finite State Machine (FSM) that, based on the data provided by (4), triggers units (2) and (3) and is also responsible for detecting system faults.
Cartesian impedance controller
Cartesian impedance control techniques provide the ability to achieve an arbitrary quasi-static behavior at the robot end-effector (Schindlbeck and Haddadin 2015; Ajoudani et al. 2017). This is, however, subject to the positive definiteness and symmetry of the impedance matrices, to the robot torque boundaries, and to a sufficient number of degrees of freedom (\(\ge 6\)).
This control technique relies on torque sensing and actuation, with the vector of robot joint torques \(\varvec{\tau } \in \mathbb {R}^{n}\) calculated as follows:
$$\begin{aligned}&\varvec{\tau } = \varvec{M(q)\ddot{q}} + \varvec{C(q,\dot{q})\dot{q}} + \varvec{g(q)} + \varvec{\tau }_{\mathrm{ext}}, \end{aligned}$$
(1)
$$\begin{aligned}&\varvec{\tau }_{\mathrm{ext}} = \varvec{J(q)}^T \varvec{F}_c + \varvec{\tau }_{\mathrm{st}}, \end{aligned}$$
(2)
where n is the number of joints, \(\varvec{q} \in \mathbb {R}^{n}\) is the vector of joint angles, \(\varvec{J} \in \mathbb {R}^{6 \times n}\) is the robot arm Jacobian matrix, \(\varvec{M} \in \mathbb {R}^{n \times n}\) is the mass matrix, \(\varvec{C} \in \mathbb {R}^{n \times n}\) is the Coriolis and centrifugal matrix, \(\varvec{g} \in \mathbb {R}^{n}\) is the gravity vector and \(\varvec{\tau }_{\mathrm{ext}}\) is the external torque vector. \(\varvec{F}_c\) represents the vector of forces in the Cartesian space and \(\varvec{\tau _{\mathrm{st}}}\) the secondary-task torques projected onto the null-space of \(\varvec{J}\).
Forces \(\varvec{F}_c \in \mathbb {R}^{6}\) are calculated as follows:
$$\begin{aligned} \varvec{F}_c = \varvec{K}_c (\varvec{X}_d - \varvec{X}_a) + \varvec{D}_c (\varvec{\dot{X}}_d - \varvec{\dot{X}}_a), \end{aligned}$$
(3)
where \(\varvec{K}_c \in \mathbb {R}^{6 \times 6}\) and \(\varvec{D}_c \in \mathbb {R}^{6 \times 6}\) represent respectively the Cartesian stiffness and damping matrix, \(\varvec{X}_d\) and \(\varvec{X}_a \in \mathbb {R}^{6}\) the Cartesian desired and actual position, \(\varvec{\dot{X}}_d\) and \(\varvec{\dot{X}}_a \in \mathbb {R}^{6}\) their corresponding velocity profiles. The Cartesian desired position and velocity are given in input by the Trajectory planner (see Sect. 2.3).
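As a concrete illustration, the impedance law of (3) can be sketched in a few lines of Python; the numeric gains below are hypothetical placeholders, not the values used in our experiments:

```python
import numpy as np

def cartesian_force(K_c, D_c, X_d, X_a, Xd_dot, Xa_dot):
    """Impedance law of Eq. (3): spring on the 6-DoF pose error
    plus damper on the velocity error."""
    return K_c @ (X_d - X_a) + D_c @ (Xd_dot - Xa_dot)

# Placeholder gains: 300 N/m translational, 30 Nm/rad rotational
K_c = np.diag([300.0] * 3 + [30.0] * 3)
D_c = 2 * 0.7 * np.sqrt(K_c)          # diagonal case, zeta = 0.7
X_d, X_a = np.zeros(6), np.full(6, 0.01)
F_c = cartesian_force(K_c, D_c, X_d, X_a, np.zeros(6), np.zeros(6))
```

With a 1 cm error on every axis and zero velocity error, the restoring force is simply the stiffness times the (negative) error on each axis.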
Self-tuning impedance unit
In the near future, robots will enter several application scenarios to collaborate with humans, in environments designed and built to match their human counterparts’ needs and characterised by high uncertainty levels. To respond to the varying task conditions and uncertainty levels, we aim to develop a novel self-tuning impedance controller that is able to adapt its parameters when an interaction is expected. The adaptation is limited to the expected direction(s) of interaction/movement, to avoid unnecessary stiffening or complying of the remaining axes (Fig. 3).
Since our method adapts the impedance parameters only for interactions that are expected, we need to distinguish when this is the case. To this end, we define a Boolean value, named Interaction expectancy (\(\varvec{I}_e\)), that results from the Boolean logic rule \(\varvec{I}_e = \varvec{I}_m \wedge \varvec{I}_f\). The first value, \(\varvec{I}_m\), produced by the FSM, is True only in the states where an interaction with the environment is expected. The second one, \(\varvec{I}_f\), set by the visual perception module, is True when the tool attached to the end-effector is inside the material that has to be manipulated. The importance of this consideration has already been shown in our preliminary work (Balatti et al. 2018) and will not be repeated here.
When no interaction is expected, and therefore \(\varvec{I}_e\) is False, the Cartesian stiffness matrix \(\varvec{K}_c\) is set to a default diagonal matrix with all the non-zero coefficients set to \(\varvec{k}_{\mathrm{min}}\), to deliver a compliant behavior. This is because the base condition of the presented self-tuning impedance controller is to be soft in all Cartesian and redundant axes, unless an interaction is expected to occur. The impedance values that render softness, however, have to be chosen as a trade-off between the position tracking accuracy (affected by unmodelled dynamics such as friction) and the force response in case an unexpected interaction occurs.
The damping matrix \(\varvec{D}_{{\varvec{c}}}\) is derived from \(\varvec{K}_{c}\) by:
$$\begin{aligned} \varvec{D}_{{\varvec{c}}} = \varvec{\Lambda }_* \varvec{D}_{\mathrm{diag}} \varvec{K}_{\mathrm{adj}*} + \varvec{K}_{\mathrm{adj}*} \varvec{D}_{\mathrm{diag}} \varvec{\Lambda }_{*}, \end{aligned}$$
(4)
where \(\varvec{D}_{\mathrm{diag}}\) is the diagonal matrix containing the damping factor (\(\zeta = 0.7\)), \(\varvec{K}_{\mathrm{adj}*}\varvec{K}_{\mathrm{adj}*} = \varvec{K}_{c}\) and \(\varvec{\Lambda }_* \varvec{\Lambda }_* = \varvec{\Lambda }\), where \(\varvec{\Lambda }\) is the desired end-effector mass matrix (Albu-Schaffer et al. 2003).
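A minimal numerical sketch of this damping design, computing the matrix square roots via an eigendecomposition (the input matrices below are assumed, for illustration only):

```python
import numpy as np

def spd_sqrt(A):
    """Principal square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.sqrt(w)) @ V.T

def damping_design(K_c, Lam, zeta=0.7):
    """Eq. (4): D_c = Lam* D_diag K_adj* + K_adj* D_diag Lam*,
    with K_adj* K_adj* = K_c and Lam* Lam* = Lam."""
    K_adj_s = spd_sqrt(K_c)
    Lam_s = spd_sqrt(Lam)
    D_diag = zeta * np.eye(K_c.shape[0])
    return Lam_s @ D_diag @ K_adj_s + K_adj_s @ D_diag @ Lam_s
```

For diagonal matrices with unit desired mass, this reduces to the familiar scalar relation \(d = 2 \zeta \sqrt{k}\).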
When, instead, an interaction is expected (\(\varvec{I}_e\) is True), the Cartesian stiffness matrix \(\varvec{K}_c\), and consequently the damping matrix \(\varvec{D}_c\), are updated by increasing (or decreasing) the impedance parameters only along the direction of the desired movement, defined by:
$$\begin{aligned} \overrightarrow{\varvec{P}} = \varvec{X}_{\varvec{d},\varvec{t}} - \varvec{X}_{\varvec{d},\varvec{t}-\mathbf{1}}, \end{aligned}$$
(5)
(which can also be calculated from \(\varvec{\dot{X}}_d\)), while keeping a compliant behavior, set to \(\varvec{k}_{\mathrm{min}}\) and \(\varvec{d}_{\mathrm{min}} = 2 \zeta \sqrt{\varvec{k}_{\mathrm{min}}}\) (Albu-Schaffer et al. 2003), along the other axes. To achieve this, the stiffness and damping matrices, being symmetric and positive definite, can be expressed by:
$$\begin{aligned} \varvec{A} = \varvec{U} \varvec{\Sigma } \varvec{V}^*, \end{aligned}$$
(6)
which is known as the Singular Value Decomposition (SVD). Such a decomposition enables us to project the desired stiffness and damping, calculated w.r.t. the reference frame of the desired motion vector \(\varvec{\overrightarrow{P}}\), onto the reference frame of the robot base. \(\varvec{U} \in \mathbb {R}^{3 \times 3}\) and \(\varvec{V} \in \mathbb {R}^{3 \times 3}\) are orthonormal bases, and \(\varvec{\Sigma } \in \mathbb {R}^{3 \times 3}\) is a diagonal matrix whose elements are the singular values of matrix \(\varvec{A}\), sorted in decreasing order and representing the principal axes amplitudes of the resulting geometric ellipsoid. The columns of matrix \(\varvec{U}\) form a set of orthonormal vectors, which can be regarded as basis vectors. In this work, the first column of \(\varvec{U}\) represents the desired motion vector \(\varvec{\overrightarrow{P}}\), while the second and the third ones are derived from the first in such a way that they form an orthonormal basis. Since \(\varvec{V}^*\) is the Hermitian transpose of a real matrix, and the resulting matrix \(\varvec{A}\), which represents the impedance values, is symmetric positive definite, we have that:
$$\begin{aligned} \varvec{V}^*= & {} \varvec{V}^T, \end{aligned}$$
(7)
$$\begin{aligned} \varvec{V}= & {} \varvec{U}. \end{aligned}$$
(8)
Combining (6), (7) and (8), we can derive the stiffness and the damping matrix:
$$\begin{aligned} \varvec{K}_c= & {} \varvec{U} \varvec{\Sigma }_k \varvec{U}^T, \end{aligned}$$
(9)
$$\begin{aligned} \varvec{D}_c= & {} \varvec{U} \varvec{\Sigma }_d \varvec{U}^T, \end{aligned}$$
(10)
where the coefficients of the diagonal matrices \(\varvec{\Sigma }_k\) and \(\varvec{\Sigma }_d\) are, respectively, the desired stiffness and damping coefficients along the directions of the vectors composing the basis \(\varvec{U}\). They are defined by:
$$\begin{aligned} \varvec{\Sigma }_k= & {} \mathrm{diag}(\varvec{k}_{\mathrm{st}},\varvec{k}_{\mathrm{min}},\varvec{k}_{\mathrm{min}}) \end{aligned}$$
(11)
$$\begin{aligned} \varvec{\Sigma }_d= & {} \mathrm{diag}(\varvec{d}_{\mathrm{st}},\varvec{d}_{\mathrm{min}},\varvec{d}_{\mathrm{min}}) \end{aligned}$$
(12)
where \(\varvec{k}_{\mathrm{st}}\) is the self-tuning stiffness coefficient to be set along the motion vector \(\varvec{\overrightarrow{P}}\) and \(\varvec{d}_{\mathrm{st}}\) its corresponding damping element. \(\varvec{k}_{\mathrm{st}}\) is defined at every time t as:
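The construction of (9)–(12) amounts to completing the motion direction to an orthonormal basis and composing the stiffness matrix; a sketch, restricted for simplicity to the translational (3D) part:

```python
import numpy as np

def stiffness_from_direction(P, k_st, k_min):
    """Compose K_c = U Sigma_k U^T (Eq. 9): stiffness k_st along the
    motion direction P, k_min along the two orthogonal directions."""
    u1 = P / np.linalg.norm(P)
    # Complete u1 to an orthonormal basis: project out u1 from the
    # canonical axis least aligned with it, then take a cross product
    a = np.eye(3)[np.argmin(np.abs(u1))]
    u2 = a - (a @ u1) * u1
    u2 /= np.linalg.norm(u2)
    u3 = np.cross(u1, u2)
    U = np.column_stack([u1, u2, u3])
    return U @ np.diag([k_st, k_min, k_min]) @ U.T
```

For a vertical motion vector, for instance, this yields a diagonal stiffness matrix that is stiff along z and compliant along x and y.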
$$\begin{aligned} \varvec{k}_{\mathrm{st},t}= & {} \varvec{k}_{\mathrm{st},t-1} + \varvec{\alpha } \varvec{\Delta P} \varvec{\Delta T}, \end{aligned}$$
(13)
$$\begin{aligned} \varvec{\Delta P}= & {} | \varvec{\Delta X} \cdot \varvec{\widehat{P}} |, \end{aligned}$$
(14)
$$\begin{aligned} \varvec{\Delta X}= & {} \varvec{X_d} - \varvec{X_a}, \end{aligned}$$
(15)
where \(\varvec{\alpha }\) is the update parameter, \(\varvec{\Delta P}\) is the absolute value of the Cartesian error \(\varvec{\Delta X}\) projected onto \(\varvec{\widehat{P}}\), the normalized direction of the motion vector \(\varvec{\overrightarrow{P}}\), and \(\varvec{\Delta T}\) is the control loop sample time. It is important to notice that \(\varvec{k}_{\mathrm{st}}\) changes only when \(\varvec{\Delta P}\) exceeds a threshold, defined as \(\varvec{\Delta P}_t\). We introduce this threshold because, in impedance controlled robots, a small error between the desired and actual position is usually hard to achieve unless the gains, i.e. stiffness and damping, are so high that the behavior becomes comparable to position control. In this way we allow a small margin of error, so as not to increase the impedance parameters when the task does not require it, and to arrest their growth once the desired accuracy is achieved. Moreover, we want to avoid unnecessary adaptation caused by unmodeled robot dynamics and small amounts of joint friction, where the error is not related to the task.
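The update rule of (13)–(15), including the dead-band threshold \(\varvec{\Delta P}_t\), can be sketched as follows (all parameter values here are illustrative assumptions, not the tuned ones):

```python
import numpy as np

def update_stiffness(k_st, X_d, X_a, P_hat, alpha, dT, dP_t):
    """Error-driven stiffening of Eqs. (13)-(15): grow k_st only when
    the tracking error projected on the motion direction exceeds dP_t."""
    dP = abs((X_d - X_a) @ P_hat)   # Eqs. (14)-(15)
    if dP > dP_t:                   # dead-band: tolerate small errors
        k_st = k_st + alpha * dP * dT
    return k_st
```

Errors below the threshold leave the stiffness untouched; errors above it grow the stiffness proportionally to the projected error and the sample time.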
In order to increase robot autonomy, the desired impedance parameters need to be adapted in a reasonably short time. This implies an accurate choice of \(\varvec{\alpha }\). A high value of this parameter leads to rapid convergence in materials with high density; nevertheless, the same high value in non-dense materials causes needless stiffening of the robot, which must be avoided. Therefore, to obtain an average \(\varvec{\alpha }\) value as a trade-off between fast convergence and stiffening performance, we performed experiments on different materials such as soil, sand, rocks of different density, air and water.
To achieve this, for every material m, \(\varvec{\alpha }_m\) was estimated and the average value was defined through the arithmetic mean of all the n materials taken into account in the analysis:
$$\begin{aligned} \varvec{\alpha } = \frac{1}{n} \sum _{m=1}^n \varvec{\alpha }_m. \end{aligned}$$
(16)
There are also situations in which the impedance parameter adaptation has to be carried out in the opposite direction, i.e. decreasing the parameters, and (13) cannot be applied. An example is the case where the tool attached to the robot end-effector exits the material while still being inside the interaction expectancy area. We define \(\varvec{\Delta F}_{\mathrm{ext},t}\) as the variation of the external forces, along the motion vector, detected at the robot end-effector at time t w.r.t. the ones measured at time \(t-1\):
$$\begin{aligned} \varvec{\Delta F}_{\mathrm{ext},t} = (\varvec{F}_{\mathrm{ext},t} - \varvec{F}_{\mathrm{ext},t-1}) \cdot \varvec{\widehat{P}}. \end{aligned}$$
(17)
In the aforementioned situations, \(\varvec{\Delta F}_{\mathrm{ext},t}\) is positive and \(\varvec{k}_{\mathrm{st}}\) is defined at every time t as:
$$\begin{aligned} \varvec{k}_{\mathrm{st},t} = \varvec{k}_{\mathrm{st},t-1} - \varvec{\beta } \varvec{\Delta F}_{\mathrm{ext},t} \varvec{\Delta T}, \end{aligned}$$
(18)
where \(\varvec{\beta }\) is given by \(\varvec{\alpha }\) scaled by a factor of \(10^{-2}\), to implement a rate of adaptation similar to that of (13)–(15). To avoid unnecessary changes caused by negligible force sensing differences, \(\varvec{\Delta F}_{\mathrm{ext},t}\) is considered positive only when it exceeds a small threshold \(\varepsilon \). A pseudo-code of the proposed method is presented in Algorithm 1.
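Algorithm 1 is not reproduced here, but the two adaptation branches can be condensed into one control-loop step; the default thresholds and the lower bound `k_min` below are illustrative assumptions:

```python
def self_tune_step(k_st, dP, dF_ext, alpha, dT,
                   dP_t=0.005, eps=0.5, k_min=50.0):
    """One control-loop step of the adaptation law: Eq. (13) stiffens
    on the projected tracking error dP, Eq. (18) softens when the
    projected external force variation dF_ext is positive (tool
    exiting the material). Thresholds are placeholder values."""
    beta = 1e-2 * alpha              # scaled update rate, Eq. (18)
    if dF_ext > eps:                 # force dropping along the motion
        k_st = k_st - beta * dF_ext * dT
    elif dP > dP_t:                  # tracking error too large
        k_st = k_st + alpha * dP * dT
    return max(k_st, k_min)          # never softer than the default
```

The final clamp keeps the axis at least as compliant as the default \(\varvec{k}_{\mathrm{min}}\), so the softening branch cannot undershoot the base condition.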
Trajectory planner
The framework offers different kinds of motion planning trajectories. The Trajectory planner unit receives as input from the FSM (see Sect. 2.5) the target pose, the desired period to reach it, and the type of trajectory, which can be selected among:
-
point-to-point motion between two via points, i.e. starting from the actual pose and reaching the desired one given as input. To achieve smoother trajectory profiles and prevent impulsive jerks, this kind of motion is implemented with a fifth-order polynomial, so that the initial and final velocities and accelerations are set to zero.
-
scooping motion, which reaches the target pose through a half-circular motion on the vertical axis, replicating a scooping movement. This type of motion, designed with a constant angular rate of rotation and constant speed, is helpful for collecting materials when a scoop-like end-effector is attached to the robot flange.
-
shaking motion, based on a rapid sinusoidal movement performed in place. Given the shaking direction as input, this motion is needed to completely pour the materials into the pot during the “Task” state, without leaving any residue on the spoon.
Note that the design and implementation of the trajectories are not explained in detail, since they are well known in the robotics literature and do not contribute to the novelty of this work. However, a short introduction is useful to better understand the different phases explained in Sect. 2.5.
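For completeness, the fifth-order point-to-point profile mentioned above can be sketched with the standard minimum-jerk time scaling (a textbook formulation, not our exact implementation):

```python
import numpy as np

def quintic_point_to_point(x0, xf, T, t):
    """Fifth-order (quintic) polynomial from x0 to xf in time T,
    with zero velocity and acceleration at both endpoints."""
    s = np.clip(t / T, 0.0, 1.0)
    # Time scaling: s(0)=0, s(1)=1, s'(0)=s'(1)=s''(0)=s''(1)=0
    h = 10 * s**3 - 15 * s**4 + 6 * s**5
    return x0 + (xf - x0) * h
```

Applied per Cartesian axis, this yields the desired position \(\varvec{X}_d\) (and, by differentiation, \(\varvec{\dot{X}}_d\)) fed to the impedance controller.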
Visual perception module
Visual perception plays a key role in the Finite State Machine, providing information about the materials that are going to be manipulated by the robotic arm. The module is split into two sub-systems, both using as input the RGB-D data of a range sensor (ASUS Xtion PRO) placed in a fixed position with respect to the arm and facing the materials. In the first sub-system, the different types of materials are detected in the scene using RGB data, and their three-dimensional (3D) surface convex hull polygon is calculated using depth data (materials localization sub-module). In the second one, a peak point for each material is localized in the robot base frame using the depth data (peaks localization sub-module).
Materials Localization To localize the different materials in 3D, we use color-based region growing segmentation. The RGB-D sensor provides colored point clouds that are first transformed into the robot’s base frame (z-axis upwards and y-axis towards the left). Point cloud filtering, such as pass-through and box-cropping, is applied to keep only the points of interest that are inside the working space of the robot. Keeping the point cloud in its original grid organization (i.e. setting discarded points to NaNs instead of removing them) makes the method more efficient. Then, region growing based on color is applied, in order to classify similar points into clusters. This is a two-step algorithm. In the first step, points are sorted based on their local curvature, and regions grow from the points with minimum curvature (flatter surfaces), which reduces the number of segments. Two neighboring points are considered part of the same material if their colors are similar. The process continues for each seed’s neighboring points, until no further neighbors can be assigned to the same segment. In the second step, clusters with similar average colors or small size are merged. The result of this process is visualized in Fig. 4. To localize the convex hull polygon of each material, we identify the extreme points in the xy-plane (i.e. the 2.5D bounding box), which results in the vertices \(V_1, V_2, V_3, V_4\) (Fig. 4), stored and passed to the Finite State Machine.
Peaks Localization Identifying each material’s peak point is straightforward once the materials themselves have been localized. The peak point \(p_i\) of a material i is the point with the maximum z-value. Similarly, each material’s center is the average of the encapsulating polygon’s vertices.
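Assuming each segmented material is available as an (N, 3) point array already expressed in the robot base frame, both quantities can be computed in a few lines (a simplified sketch, not the implementation of the module):

```python
import numpy as np

def peak_and_center(points):
    """Peak (max-z point) and polygon center for one segmented
    material, given an (N, 3) cloud in the robot base frame."""
    peak = points[np.argmax(points[:, 2])]
    # 2.5D bounding-box vertices V1..V4 in the xy-plane
    x_min, y_min = points[:, :2].min(axis=0)
    x_max, y_max = points[:, :2].max(axis=0)
    vertices = np.array([[x_min, y_min], [x_max, y_min],
                         [x_max, y_max], [x_min, y_max]])
    center = vertices.mean(axis=0)
    return peak, center
```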
Finite state machine
To increase the autonomy of the system, we designed a Finite State Machine that is responsible for managing the transitions between the different phases of the framework (see Fig. 2). This unit receives as input the data sent by the Visual perception module and, by processing them, defines the target poses that are sent to the Trajectory planner. Moreover, this module is in charge of determining whether an interaction with the environment is about to happen, in order to activate the Self-tuning impedance unit.
The FSM is formed by four states. The “Workspace definition” state is responsible for acquiring the knowledge about the environment where the robot will operate in the next steps. To do so, it receives as input the vertices of the polygons that shape the materials within the robot workspace. These data are provided by the “Materials localization” unit and stored in an appropriate data structure. To identify the \({\varvec{k}_{\mathrm{st}}}\) parameter for every material, the FSM then switches to the “Exploration” state. At this step, the robot end-effector grasps a stick-like tool and compliantly reaches the material placed in the leftmost part of the workspace, previously identified by the Visual Perception Module. After the tool has dunked into the matter, the Self-tuning impedance unit is triggered: both Boolean values \(\varvec{I}_m\) and \(\varvec{I}_f\) are set to True, since a contact with the environment is expected and the end-effector tool is inside the material. Then, the robot follows a point-to-point motion towards the polygon center while adapting the impedance parameters as described in Sect. 2.2. The framework stores the resulting \(\varvec{k}_{\mathrm{st}}\) in the data structure, associating it with the corresponding material. After that, the tool is pulled out of the substance and the impedance parameter identification process is repeated for all the materials in the workspace, until the rightmost material has been analyzed. Notice that it is not necessary to continuously track the tool pose to check whether it is inside the material and therefore to activate \(\varvec{I}_m\). In fact, during the “Workspace definition” state, the Visual Perception Module localizes and defines the areas of interaction; from then on, since the tool is attached to the robot end-effector, we can identify through the robot forward kinematics whether the tool is located within an interaction expectancy area. In such a way, we can define whether \(\varvec{I}_m\) has to be set to True or False at every time step.
To enhance the robustness of the system to unexpected events, we designed a “Fault Detection” sub-unit, within the “Exploration” state, that is triggered in case of a collision with the boundary (possibly due to a perception unit fault). If the sensed external forces projected along the motion vector experience an abrupt increase, the robot ends its motion and goes back to its homing position. The trend of the external forces w.r.t. the robot displacement is estimated through a linear regression algorithm, computed every n samples. When the linear regression slope m exceeds a threshold \(m_{\mathrm{fault}}\), the fault is triggered.
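A sketch of this fault check (window size and threshold below are placeholder values, not the tuned ones):

```python
import numpy as np

def fault_detected(displacement, force, m_fault):
    """Linear regression of sensed force vs. end-effector displacement
    over the last n samples; a slope above m_fault (e.g. contact with
    a rigid boundary) triggers the fault."""
    m, _ = np.polyfit(displacement, force, 1)
    return m > m_fault
```

Soft materials yield shallow force-displacement slopes, while a rigid boundary produces a steep one, which makes a single slope threshold a simple and effective discriminator.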
Next, in the “Materials distribution” state, the vision unit detects the highest point of every matter through the “Peaks localization” unit. The FSM receives these points and associates them with the corresponding materials. During the final state, named “Task”, the robot needs to scoop some material and pour it into a pot held by another robot. The scooping trajectories are designed to start at the identified peak points, to ensure that some material is found in that part of the container. While holding a scooping tool, such as a scoop or a small shovel, the robot starts to carry out the task by reaching the first scheduled material with the default compliance \(\varvec{k}_{\mathrm{min}}\) set along all the Cartesian axes. When it dunks into the matter to scoop some of it, thereby activating the interaction expectancy value, the impedance parameters are adapted in the direction of the motion by setting the \(\varvec{k}_{\mathrm{st}}\) relative to each material. This value is retrieved from the data structure, where it was stored during the learning phase in the “Exploration” state. In this way, from the very beginning of the task, the robot does not lag behind and can execute the task more precisely. When the scooping motion, which goes from the highest point towards the polygon center, is over, the material is poured into the pot and the process starts over with the next material, as scheduled by the task sequence. There are cases in which the substances’ viscous properties change over time, either because of the material’s intrinsic properties or due to external circumstances. For this reason, even in the “Task” state, the impedance parameters remain subject to adaptation. Nevertheless, we set a maximum value \(\varvec{k}_{\mathrm{st}\_\mathrm{max}}\) that can be reached by \(\varvec{k}_{\mathrm{st}}\) in this state, to avoid significant variations w.r.t. the value computed in the “Exploration” state:
$$\begin{aligned} \varvec{k}_{\mathrm{st}\_\mathrm{max},m}&= \varvec{k}_{\mathrm{st}\_\mathrm{exploration},m} * (1+p) \nonumber \\ 0&\le p \le 0.5 \quad \forall m, \end{aligned}$$
(19)
where \(\varvec{k}_{\mathrm{st}\_\mathrm{exploration},m}\) is the value stored in the “Exploration” state for material m, and p is the configurable percentage of variation. If this value is exceeded in the “Task” state, the “Fault Detection” sub-unit is triggered and the robot goes back to its homing position. This method can be useful for detecting both a collision with the boundary (for instance due to a fault of the perception unit) and a substantial change in the material properties, since in an industrial production setting these would be symptoms of system malfunction or material inconsistency.
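In code form, the bound of (19) and its check are straightforward (the value of p below is only an example, as it is task-dependent):

```python
def k_st_max(k_exploration, p):
    """Eq. (19): maximum stiffness allowed in the Task state, at most
    a fraction p (0 <= p <= 0.5) above the Exploration value."""
    assert 0.0 <= p <= 0.5
    return k_exploration * (1.0 + p)

def fault_on_stiffness(k_st, k_exploration, p=0.25):
    """True when the adapted k_st exceeds the allowed bound."""
    return k_st > k_st_max(k_exploration, p)
```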