A method for autonomous robotic manipulation through exploratory interactions with uncertain environments

Expanding robot autonomy can deliver functional flexibility and enable fast deployment of robots in challenging and unstructured environments. In this direction, significant advances have recently been made in visual-perception-driven autonomy, mainly thanks to the availability of rich sensory datasets. However, the physical interaction autonomy of current robots still remains at a basic level. Towards providing a systematic approach to this problem, this paper presents a new context-aware and adaptive method that allows a robotic platform to interact with unknown environments. In particular, a multi-axes self-tuning impedance controller is introduced to regulate the quasi-static parameters of the robot based on previous experience in interacting with similar environments and on real-time sensory data. The proposed method is also capable of differentiating internal and external disruptions, and of responding to them accordingly and appropriately. An agricultural experiment with different deformable materials is presented to validate the robot interaction autonomy improvements, and the capability of the proposed methodology to detect and respond to unexpected events (e.g., faults).


Introduction
To respond to the rapidly increasing demand for high levels of flexibility in manufacturing and service applications, recent research has focused on endowing robots with the ability to react and adapt to their environments. On the one hand, robotic systems based on torque sensing and actuation or on variable impedance mechanisms have been developed to make robots compliant to their surroundings (Albu-Schaffer et al. 2003; Tsagarakis et al. 2016). On the software level, instead, a great deal of attention has been devoted to the perception autonomy of robots, to capture the effects of appearance and context (Kotseruba et al. 2016; Harbers et al. 2017).
Although these two directions have seen significant advancements over the past decade, the bridging action, i.e., associating perception with interaction in an autonomous way, still remains at a non-satisfactory level. This fundamental shortcoming has limited the application of robots in out-of-the-cage scenarios, making a framework to enhance their physical interaction autonomy a critical requirement.
Previous attempts to endow robots with adaptive interaction skills have pursued different directions. A well-known approach is based on learning from human demonstrations (Katz et al. 2008, 2014; Kronander and Billard 2014), which has shown promising results when sufficient training data is available. In fact, the high dependency of observation-based approaches on the quality of the training data sets has been a limiting factor. Moreover, while performing complex manipulation tasks, accurate sensory measurements related to physical interactions (e.g., forces and torques) may not be obtainable through wearable sensory systems, which is why most learning-by-demonstration techniques function on a kinematic level.
To provide a solution to this problem, analytical techniques have focused on the use of impedance control (Dietrich et al. 2016; Lee et al. 2014; Ajoudani et al. 2014b), force control for both fixed manipulators (Righetti et al. 2014) and mobile-base applications (Roveda 2018), or hybrid interaction controllers (Anderson and Spong 1988; Schindlbeck and Haddadin 2015). However, in the majority of cases, the control parameters are pre-selected by robot programmers based on their experience in carrying out analogous tasks. In such a way, the framework cannot adapt when task conditions change, and hence the full potential of such powerful control techniques cannot be exploited (Yang et al. 2011; Ferraguti et al. 2013).
To provide a solution to this shortcoming, adaptive learning techniques have been proposed. In (Xu et al. 2011), an adaptive impedance controller for upper-limb rehabilitation, based on an evolutionary dynamic fuzzy neural network, was proposed to regulate the impedance profile between the impaired limb and the robot. However, this method lacks versatility, since the algorithm is limited to a specific task. In a similar work (Gribovskaya et al. 2011), empirical constants had to be set, which reduced the flexibility of the framework. In addition, the desired impedance matrices were assumed to be diagonal, resulting in limited adaptability along selective Cartesian axes. More generic methods have been introduced by reducing impedance control to position control (through high position-loop gains) when there is no interaction (Li et al. 2012), with the aim of minimizing the error between the desired and actual trajectories (He and Dong 2018). This significantly reduces the system's ability to deliver a distinct response to desired (task) and undesired (e.g., collision) interactions (see also Nemec et al. 2018).
To address these challenges and towards bridging the autonomy gap in the perception-reaction chain, we propose a novel manipulation framework that integrates various components to achieve a context-aware and adaptive robot physical interaction behavior.
The main component of our framework is a multi-axes self-tuning impedance controller that tunes the Cartesian stiffness and damping profiles (see Fig. 1) along any arbitrary direction coinciding with the direction of interaction, based on previous experience in interacting with similar environments and on real-time robot sensory data. As Song et al. (2019) report in their survey and comparison of impedance control techniques for robotic manipulation, variable impedance controllers need to determine when and how to vary the impedance parameters. Autonomously carrying out this non-trivial task falls within the primary aims of this work.

Fig. 1 The robotic arm explores the materials in its workspace, identifying and self-tuning its impedance parameters along the directions of interaction. These are represented by the principal axes of the geometric ellipsoids depicted in the figure. Longer arrows represent higher Cartesian stiffness and damping values
To recognise and localize different materials in the robot workspace and associate their newly/previously identified characteristics with the robot interaction knowledge (i.e., impedance control and self-tuning gains), a visual perception module has been developed. This module, embedded in our multi-axes self-tuning controller, enables the robot to explore an environment, identify its characteristics, and effectively interact with it. This behavior is inspired by the way humans adapt to their surroundings, constantly building internal models of the external world while exploring and identifying it. When interacting with new or similar environments, the prior knowledge is used as a preparatory strategy, while keeping open the possibility of adaptation, to update the internal knowledge (Kawato 1999). Another similarity between our method and human behavior is the default compliant behavior of the robot: when no interaction is expected, humans tend to relax their muscles to comply with unexpected external disturbances (and to minimize energy consumption).
The organization of the individual components and their inter-connections is achieved through a Finite State Machine (FSM). The proposed FSM fuses the data received from the robot sensors and the vision module to distinguish expected interactions from disturbances and faults, and accordingly to self-tune control gains and trajectories in real-time. Note that the presented FSM is tailored to the main experiment we conducted to validate the proposed method. Nevertheless, the algorithm can be applied to many other applications by modifying its states, as described in a section dedicated to the algorithm scalability.
We prepared an agricultural experimental setup to demonstrate the potential of the presented methodology in one of the most promising but somewhat disregarded fields in robotics. Two Franka Emika Panda robotic arms, one equipped with the Pisa/IIT SoftHand (Ajoudani et al. 2014a) and the other with a standard gripper, were used to perform the experimental task.

Fig. 2 The framework is composed of 5 modules, each of them developed as a ROS node. Data between them are exchanged via ROS messages on the ROS topics illustrated with dotted lines. The messages in blue are published by the proposed software architecture, while the others represent the robot state (in green) and the vision data provided by the camera (in red) (Color figure online)
The original idea of this work was presented in a short conference paper. However, to advance the methodology and demonstrate its full potential, in this article we have added significant contributions w.r.t. the original work. Here, we introduce a novel "Fault Detection" unit, able to recover from unexpected collisions with the boundary (for instance due to a fault of the perception unit). This can also be useful to detect substantial changes in the material properties, since these would be symptoms of system malfunction in industrial production. Next, the performance of the multi-axes self-tuning impedance controller is thoroughly evaluated in detail. Additionally, we performed new experiments to compare our method with non-adaptive techniques, to highlight the full potential of the framework in interacting with uncertain environments. Furthermore, we conducted an analysis to guarantee the stability of the controller, since it is based on continuous variations of the impedance parameters. To illustrate the generality and scalability of the presented approach, we also included a new experiment showing the controller adaptation while grasping and handling a pallet jack handle.

Method
The main purpose of the presented framework is to enable robots to acquire an original set of skills: to explore various environments, identify their characteristics and adapt to them accordingly, build knowledge, and use it for future interactions. This methodology is based on the concept of self-adaptability, even after knowledge about a task or environment has been built. In this way, the robot retains the ability to adapt, starting from a reasonable initial condition, even if the environmental conditions are subject to change.
The required theoretical and technological components to build such a framework are integrated into five main modules, illustrated in Fig. 2: (1) a Cartesian impedance controller whose parameters can be tuned online, (2) a multi-axes self-tuning impedance unit to tune the aforementioned parameters when an interaction with the environment is predicted, (3) a trajectory planner to calculate the spatial points to be reached by the controller, (4) a visual perception module that locates the materials' positions in the robot workspace, and (5) a Finite State Machine (FSM) that, based on the data provided by (4), triggers units (2) and (3), and is also responsible for detecting system faults.

Cartesian impedance controller
Cartesian impedance control techniques provide the ability to achieve any arbitrary quasi-static behavior at the robot end-effector (Schindlbeck and Haddadin 2015; Ajoudani et al. 2017). This is, however, limited by the positive definiteness and symmetry of the impedance matrices, the robot torque boundaries, and the number of degrees of freedom (≥ 6).
This control technique relies on torque sensing and actuation. The robot dynamics read

M(q) q̈ + C(q, q̇) q̇ + g(q) = τ + τ_ext,

and the vector of robot joint torques τ ∈ R^n is calculated as

τ = J(q)^T F_c + τ_st,

where n is the number of joints, q ∈ R^n is the joint angles vector, J ∈ R^{6×n} is the robot arm Jacobian matrix, M ∈ R^{n×n} is the mass matrix, C ∈ R^{n×n} is the Coriolis and centrifugal matrix, g ∈ R^n is the gravity vector and τ_ext is the external torque vector. F_c represents the forces vector in the Cartesian space and τ_st the secondary-task torques projected onto the null-space of J.

Fig. 3 The self-tuning impedance algorithm flow chart: if no interaction is expected, the robot keeps a compliant behavior, otherwise the impedance parameters can be subject to changes
Forces F_c ∈ R^6 are calculated as

F_c = K_c (X_d − X_a) + D_c (Ẋ_d − Ẋ_a),

where K_c ∈ R^{6×6} and D_c ∈ R^{6×6} represent respectively the Cartesian stiffness and damping matrices, X_d and X_a ∈ R^6 the Cartesian desired and actual positions, and Ẋ_d and Ẋ_a ∈ R^6 their corresponding velocity profiles. The Cartesian desired position and velocity are given as input by the Trajectory planner (see Sect. 2.3).
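As a minimal sketch of how the two laws above compose, the following Python fragment evaluates the Cartesian impedance force and maps it to joint torques. The Jacobian and dimensions here are placeholder toy values, not the Panda's; the function name is ours, introduced only for illustration.

```python
import numpy as np

def cartesian_impedance_torques(J, K_c, D_c, X_d, X_a, Xd_dot, Xa_dot, tau_st):
    """Cartesian impedance law F_c = K_c (X_d - X_a) + D_c (Xd_dot - Xa_dot),
    mapped to joint torques as tau = J^T F_c + tau_st (null-space torques)."""
    F_c = K_c @ (X_d - X_a) + D_c @ (Xd_dot - Xa_dot)
    return J.T @ F_c + tau_st

# Toy 7-DoF example with an arbitrary (hypothetical) Jacobian
n = 7
J = np.random.default_rng(0).standard_normal((6, n))
K_c = np.diag([500.0, 500.0, 500.0, 50.0, 50.0, 50.0])  # k_min on translations
D_c = 2 * 0.7 * np.sqrt(K_c)                            # d = 2*zeta*sqrt(k)
X_d, X_a = 0.1 * np.ones(6), np.zeros(6)
tau = cartesian_impedance_torques(J, K_c, D_c, X_d, X_a,
                                  np.zeros(6), np.zeros(6), np.zeros(n))
```

Note that with zero tracking error and zero null-space torques the commanded torque vanishes, as expected from the control law.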

Self-tuning impedance unit
In the near future, robots will enter several application scenarios to collaborate with humans, in environments designed and built to match their human counterparts' needs and characterised by high uncertainty levels. To respond to the varying task conditions and uncertainty levels, we aim to develop a novel self-tuning impedance controller that is able to adapt its parameters when an interaction is expected. The adaptation is limited to the expected direction(s) of interaction/movement, to avoid unnecessary stiffening/complying of the remaining axes (Fig. 3). Since our method implies the adaptation of the impedance parameters only for interactions that are expected, we need to distinguish when this is the case. To this end, we define a Boolean value, named Interaction expectancy (I_e), that results from the Boolean logic rule I_e = I_m ∧ I_f. The first value, I_m, set by the FSM, is True only in the states where an interaction with the environment is expected. The second one, I_f, set by the visual perception module, is True when the tool attached to the end-effector is inside the material that has to be manipulated. The importance of this consideration has already been shown in our preliminary work (Balatti et al. 2018) and will not be repeated here.
When no interaction is expected, and therefore I_e is False, the Cartesian stiffness matrix K_c is set to a default diagonal matrix with all the non-zero coefficients set to k_min, to deliver a compliant behavior. That is because the base condition of the presented self-tuning impedance controller is to be soft along all Cartesian and redundant axes, unless an interaction is expected to occur. The impedance values that render softness, however, have to be chosen based on a trade-off between the position tracking accuracy (affected by the existence of unmodelled dynamics such as friction) and the force response, should an unexpected interaction occur.
The damping matrix D_c is derived from K_c by

D_c = Λ_adj D_diag K_adj + K_adj D_diag Λ_adj,

where D_diag is the diagonal matrix containing the damping factor (ζ = 0.7), K_adj K_adj = K_c and Λ_adj Λ_adj = Λ, with Λ the desired end-effector mass matrix (Albu-Schaffer et al. 2003). Instead, when an interaction is expected, I_e being True, the Cartesian stiffness matrix K_c, and consequently the damping matrix D_c, are subject to changes, increasing (or decreasing) the impedance parameters only along the direction of the desired movement, defined by

P⃗ = (X_d,t − X_d,t−1) / ‖X_d,t − X_d,t−1‖

(which can also be calculated from Ẋ_d), while keeping a compliant behavior, set to k_min and d_min = 2ζ√k_min (Albu-Schaffer et al. 2003), along the other axes. To achieve this, the stiffness and damping matrices, being symmetric and positive definite, can be expressed by

A = U Σ V*,    (6)

which is known as the Singular Value Decomposition (SVD). Such a decomposition enables us to project the desired stiffness and damping, calculated w.r.t. the reference frame of the desired motion vector P⃗, onto the reference frame of the robot base. U ∈ R^{3×3} and V ∈ R^{3×3} are orthonormal bases, and Σ ∈ R^{3×3} is a diagonal matrix whose elements are the singular values of matrix A, sorted in decreasing order and representing the principal-axis amplitudes of the resulting geometric ellipsoid. The columns of matrix U form a set of orthonormal vectors, which can be regarded as basis vectors. In this work, the first column of U represents the desired motion vector P⃗, while the second and the third are derived from the first in such a way that they form an orthonormal basis. Since the matrix A, which represents the impedance values, is symmetric and positive definite, its Hermitian transpose satisfies

V = U,    (7)

so that

A = U Σ U^T.    (8)

Combining (6), (7) and (8), we can derive the stiffness and the damping matrices:

K_c = U Σ_k U^T,    D_c = U Σ_d U^T,

where the coefficients of the diagonal matrices Σ_k and Σ_d are respectively the desired stiffness and damping coefficients along the directions of the vectors composing the U basis.
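The projection above can be sketched in a few lines of numpy. This is an illustrative simplification, not the paper's implementation: the orthonormal basis U is completed from P⃗ via a QR decomposition, and the damping uses the simplified d = 2ζ√k rule (i.e., a unit desired mass) rather than the full double-diagonalization design; the function name is hypothetical.

```python
import numpy as np

def impedance_from_direction(p, k_st, k_min=500.0, zeta=0.7):
    """Build 3x3 translational stiffness/damping matrices with k_st along the
    motion vector p and the compliant k_min on the two orthogonal axes:
    K_c = U Sigma_k U^T, D_c = U Sigma_d U^T (cf. Eq. (8)-style projection).
    The orthonormal basis U, whose first column is p, is completed via QR."""
    p = np.asarray(p, dtype=float)
    p = p / np.linalg.norm(p)
    # QR of [p | I] yields an orthonormal Q whose first column is +/- p;
    # the sign cancels in U Sigma U^T, so it does not affect the result.
    U, _ = np.linalg.qr(np.column_stack([p, np.eye(3)]))
    Sigma_k = np.diag([k_st, k_min, k_min])
    Sigma_d = np.diag(2 * zeta * np.sqrt(np.array([k_st, k_min, k_min])))
    return U @ Sigma_k @ U.T, U @ Sigma_d @ U.T

# Stiffen along x only: K_c reduces to diag(k_st, k_min, k_min)
K_c, D_c = impedance_from_direction([1.0, 0.0, 0.0], k_st=1100.0)
```

For a non-axis-aligned P⃗ the resulting K_c is a full symmetric positive-definite matrix whose stiffest principal axis coincides with the motion direction.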
They are diagonal matrices defined by

Σ_k = diag(k_st, k_min, k_min),    Σ_d = diag(d_st, d_min, d_min),

where k_st is the self-tuning stiffness coefficient to be set along the motion vector P⃗ and d_st its corresponding damping element. k_st is defined at every time t as

k_st,t = k_st,t−1 + α ΔP̄ T,    (13)

where α is the update parameter, ΔP is the absolute value of the Cartesian error ΔX projected onto the direction of the motion vector P⃗, ΔP̄ its normalized value, and T the control loop sample time. It is important to notice that k_st is subject to changes only when ΔP is beyond a limit. To this end, we introduce a threshold, defined as ΔP_t, since in impedance-controlled robots it is usually hard to achieve a small error between the desired and actual position, unless the gains, i.e. stiffness and damping, are so high that the controller becomes comparable to position control. In this way we allow a small margin of error, so as not to increase the impedance parameters when the task does not require it, and to arrest their growth when the desired accuracy is achieved. Moreover, we want to avoid unnecessary adaptation caused by unmodeled robot dynamics and small amounts of friction at the joints, in which case the error is not related to the task.
In order to increase robot autonomy, the desired impedance parameters need to be adapted in a reasonably short time. This implies an accurate choice of α. A high value of this parameter leads to rapid convergence in materials with high density. Nevertheless, choosing a high value for non-dense materials causes needless stiffening of the robot, which must be avoided. Therefore, to obtain an average α value, as a trade-off between fast convergence and stiffening performance, we performed experiments on different materials such as soil, sand, rocks of different densities, air and water.
To achieve this, for every material m, α_m was estimated, and the average value was defined through the arithmetic mean over all the n materials taken into account in the analysis:

α = (1/n) Σ_{m=1}^{n} α_m.    (16)

There are also situations in which the impedance parameter adaptation has to be carried out in the opposite way, i.e. decreasing the parameters, and (13) cannot be applied. An example is given by the case where the tool attached to the robot end-effector exits the material while still being inside the interaction expectancy area. We define ΔF_ext,t as the variation of the external forces, along the motion vector, detected at the robot end-effector at time t w.r.t. the ones measured at time t − 1:

ΔF_ext,t = (F_ext,t − F_ext,t−1) · P⃗.    (17)

In the aforementioned situations, ΔF_ext,t is positive and k_st is defined at every time t as

k_st,t = k_st,t−1 − β ΔF_ext,t T,    (18)

where β is given by α scaled by a factor of 10^−2, to implement a rate of adaptation similar to that of (13). To avoid unnecessary changes caused by negligible force sensing differences, the positiveness of ΔF_ext,t is assessed considering a small ε, i.e. ΔF_ext,t > ε. A pseudo-code of the proposed method is presented in Algorithm 1.
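One control cycle of the self-tuning rule, combining the increasing update (13), the decreasing update (18), and the compliant fallback, can be sketched as below. This is a hedged simplification: the projected error dP is used both for the threshold test and, unnormalized, in the update; the numeric defaults (α = 20,000, T = 1 ms, 1 cm threshold, k_min = 500 N/m) echo the experimental section, while k_max and ε are illustrative assumptions.

```python
def self_tune_stiffness(k_st, dP, dF_ext, interaction_expected,
                        alpha=20_000.0, T=0.001, dP_thresh=0.01,
                        k_min=500.0, k_max=2000.0, eps=0.5):
    """One cycle of the self-tuning rule: increase k_st with the Cartesian
    error projected on the motion vector when it exceeds the threshold
    (cf. Eq. (13)); decrease it when the external force along the motion
    vector rises, e.g. when the tool exits the material (cf. Eq. (18));
    fall back to the compliant k_min when no interaction is expected."""
    if not interaction_expected:          # I_e = I_m AND I_f is False
        return k_min
    if dP > dP_thresh:
        k_st += alpha * dP * T            # increasing rule
    elif dF_ext > eps:
        k_st -= (alpha * 1e-2) * dF_ext * T  # decreasing rule, beta = alpha/100
    return max(k_min, min(k_st, k_max))   # keep within sane bounds
```

Clamping to [k_min, k_max] mirrors the compliant base condition and the k_st_max safeguard discussed later in the FSM section.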
Algorithm 1 Self-tuning impedance algorithm

Trajectory planner
The framework offers different kinds of motion planning trajectories. The Trajectory planner unit receives as input from the FSM (see Sect. 2.5) the target pose, the desired period to reach it, and the type of trajectory planner, which can be selected among:

– point-to-point motion between two via points, i.e. starting from the actual one and reaching the desired one given as input. To achieve smoother trajectory profiles and prevent impulsive jerks, this kind of motion is implemented with a fifth-order polynomial. In this way, the initial and final velocity/acceleration values are set to zero.
– scooping motion, which reaches the target pose through a half-circular motion on the vertical axis, replicating a scooping movement. This type of motion, designed with a constant angular rate of rotation and constant speed, is helpful to collect materials when a scoop-like end-effector is attached to the robot flange.
– shaking motion, which is based on a rapid sinusoidal movement performed in place. Given the shaking direction as input, this motion is needed to completely pour the materials into the pot during the "Task" state, without leaving any residuals on the spoon.
Note that the design and implementation of the trajectories are not explained in detail, since they do not contribute to the novelty of this work and are well known in the robotics literature. However, a short introduction is useful to better understand the different phases explained in Sect. 2.5.
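For completeness, the fifth-order point-to-point profile with zero boundary velocity and acceleration reduces to the classic smooth-step polynomial; a minimal sketch (function name ours) is:

```python
def quintic_point_to_point(x0, xf, T, t):
    """Fifth-order polynomial interpolation from x0 to xf over period T,
    with zero velocity and acceleration at both ends (10-15-6 profile)."""
    s = min(max(t / T, 0.0), 1.0)            # normalized time, clamped to [0, 1]
    h = 10 * s**3 - 15 * s**4 + 6 * s**5     # h(0)=0, h(1)=1, h'(0)=h'(1)=0
    return x0 + (xf - x0) * h
```

Applying it per Cartesian coordinate yields the jerk-limited desired positions X_d fed to the impedance controller.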

Visual perception module
Visual perception plays a key role in the Finite State Machine, providing information about the materials that are going to be manipulated by the robotic arm. The module is split into two sub-systems, using as input RGB-D data from a range sensor (ASUS Xtion PRO) placed in a fixed position with respect to the arm and facing the materials. In the first sub-system, the different types of materials are detected in the scene using RGB data, and their three-dimensional (3D) surface convex hull polygon is calculated using depth data (materials localization sub-module). In the second one, a peak point for each material is localized in the robot base frame, using the depth data (peaks localization sub-module).
Materials Localization To localize the different materials in 3D, we use color-based region growing segmentation. The RGB-D sensor provides colored point clouds that are first transformed into the robot's base frame (z-axis upwards and y-axis towards the left). Point cloud filtering, such as passthrough and box-cropping, is applied to keep only the points of interest that lie inside the working space of the robot. Keeping the structure of the point cloud in its original grid organization (i.e. setting points to NaNs instead of removing them) makes the method more efficient. Then, region growing is applied based on color, in order to classify similar points into clusters. This is a two-step algorithm. In the first step, points are sorted based on their local curvature. Regions grow from the points with minimum curvature (flatter surfaces), which reduces the number of segments. Two neighboring points are considered part of the same material if their colors are similar. The process continues for each seed's neighboring points, until no further neighbors can be classified into the same segment. In the second step, clusters with similar average colors or small size are merged. The result of this process is visualized in Fig. 4. To localize the convex hull polygon of each material, we identify the extreme points in the xy-plane (i.e. the 2.5D bounding box), which results in the vertices V1, V2, V3, V4 (Fig. 4), stored and passed to the Finite State Machine.
Peaks Localization Identifying each material's peak point is straightforward after having localized the materials themselves. The peak point p i , for a material i, is the point with the maximum z-value. Similarly, each material's center is the average of the encapsulating polygon's vertices.
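Once a material's points are segmented, the geometric quantities used by the FSM (bounding-box vertices, peak point, and polygon center) follow from simple reductions over the cloud. A hedged numpy sketch (the function name and array layout are ours, not the paper's implementation):

```python
import numpy as np

def material_geometry(points):
    """Given an (N, 3) segmented material point cloud in the robot base frame,
    return the four xy bounding-box vertices (the 2.5D bounding box), the
    peak point (the cloud point with maximum z), and the polygon center."""
    pts = np.asarray(points, dtype=float)
    xmin, ymin = pts[:, 0].min(), pts[:, 1].min()
    xmax, ymax = pts[:, 0].max(), pts[:, 1].max()
    # V1..V4, counter-clockwise in the xy-plane
    vertices = np.array([[xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax]])
    peak = pts[np.argmax(pts[:, 2])]          # scooping start point
    center = vertices.mean(axis=0)            # scooping end point
    return vertices, peak, center
```

The vertices populate the "Polygon vertices" column of the data structure, while the peak seeds the scooping trajectory start.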

Finite state machine
To increase the autonomy of the system, we designed a Finite State Machine that is responsible for managing the transitions between the different phases of the framework (see Fig. 2). This unit gets as input the data sent by the Visual perception module and, by processing them, defines the target poses that are sent to the Trajectory planner. Moreover, this module is in charge of determining whether an interaction with the environment is about to happen, in order to activate the Self-tuning impedance unit.
The FSM is formed by four states. The "Workspace definition" state is responsible for acquiring the knowledge about the environment where the robot will operate in the next steps. To do so, it gets as input the vertices of the polygons that delimit the materials within the robot workspace. These data are received from the "Materials localization" unit and stored in an appropriate data structure. In order to identify the k_st parameter for every material, the FSM then switches to the "Exploration" state. At this step, the robot end-effector grasps a stick-like tool and compliantly reaches the material placed in the leftmost part of the workspace, previously identified by the Visual Perception Module. After dunking into the matter, the Self-tuning impedance unit is triggered: both the Boolean values I_m and I_f are set to True, since a contact with the environment is expected and the end-effector tool is inside the material. Then, the robot follows a point-to-point motion towards the polygon center while adapting the impedance parameters as described in Sect. 2.2. The framework stores the resulting k_st in the data structure, associating it with the corresponding material.
After that, the tool is pulled out of the substance, and the impedance parameter identification process is repeated for all the materials in the workspace, until the rightmost material has been analyzed. Notice that it is not necessary to continuously track the tool pose to check whether it is inside the material and therefore to activate I_f. In fact, during the "Workspace definition" state, the Visual Perception Module localizes and defines the areas of interaction, and from then on, through the robot forward kinematics, we can identify whether the tool is located within an interaction expectancy area or not. This is possible since the tool is attached to the robot end-effector. In such a way, we can define whether I_f has to be set to True or False at every time step.
To enhance the robustness of the system to unexpected events, we designed a "Fault Detection" sub-unit, within the "Exploration" state, that is triggered in case of a collision with the boundary (possibly due to a perception unit fault). If the sensed external forces projected along the motion vector experience an abrupt increase, the robot ends its motion and goes back to its homing position. The trend of the external forces w.r.t. the robot displacement is estimated through a linear regression, computed every n samples. When the linear regression slope m goes beyond a threshold set to m_fault, the fault is triggered.
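The slope test above can be sketched as follows; the threshold value and function name are illustrative assumptions, not the paper's tuned parameters.

```python
import numpy as np

def fault_detected(displacement, force, m_fault=1000.0):
    """Fit the external force (projected along the motion vector) against the
    robot displacement over the last n samples with a linear regression; an
    abrupt rise, i.e. a slope m beyond m_fault, flags a boundary collision."""
    m, _ = np.polyfit(displacement, force, 1)   # slope and intercept
    return m > m_fault
```

In practice this runs over a sliding window of the last n samples, so a stiff boundary (steep force-displacement slope) is distinguished from the gentler trend produced by a deformable material.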
Next, in the "Materials distribution" state, the vision unit detects the highest point for every matter through the "Peaks localization" unit. The FSM receives these points and associates them to the relative material. During the final state, named "Task", the robot needs to scoop some material and pour it in a pot held by another robot. The scooping trajectories are designed to start in the Euclidean points identified as peak points to ensure that some material is found in that part of the container. While holding a scooping tool, as a scoop or a small shovel, the robot starts to carry out the task by reaching the first scheduled material with the default compliance k min set in all the Cartesian axes. When it dunks inside the matter in order to scoop some of it, and therefore activating the interaction expectancy value, the impedance parameter are adapted in the direction of the motion setting the relative k st for every material. This value is retrieved from the data structure, where it was stored during the learning phase in the "Exploration" state. Like this, since the very beginning of the task, the robot does not lag behind and can execute the task in a more precise manner. When the scooping motion, that goes from the highest point towards the polygon center, is over, the material is poured in a pot and the process starts over with the next material, as scheduled by the task sequence. There are cases in which the substances' viscous properties are subject to changes over time. This can be given either because of the material intrinsic properties or due to external circumstances. That is why, even in the "Task" state, the impedance parameters are exposed to changeability. Nevertheless, we decided to set a maximum value k st_max that can where k st_exploration,m is the value stored in the "Exploration" state for material m, and p is the percentage of variation that can be set. 
If, in the "Task" state, this value is exceeded, the "Fault Detection" sub-unit is triggered, and the robot goes back to its homing position. This method can be useful for detecting both a collision with the boundary (for instance due to a fault of the perception unit) and a substantial change in the material properties, since in industrial production these would be symptoms of system malfunction or material inconsistency.

Experimental setup
The framework software architecture is developed with the robotics middleware Robot Operating System (ROS), using C++ as the client library language. The modules illustrated in Sect. 2 are implemented as ROS nodes, and they communicate through the ROS topics depicted in Fig. 2 by means of the publisher/subscriber design pattern. The experimental setup (Fig. 5) includes two Franka Emika Panda robotic arms. The main robot carries out all the steps of the described method, while the other one acts as a support, providing the pot where the materials are poured after the scooping task. The presented architecture relies upon a tailored version of the franka_ros metapackage, the ROS integration for Franka Emika research robots. This package integrates libfranka, Franka Emika's open-source C++ interface, into ROS and ROS Control. This interface communicates with the robot through the Franka Control Interface (FCI), which provides the current robot status and enables its direct control from an external workstation PC. The communication is realized via an Ethernet cable in real-time, with a communication rate of 1 kHz.
An underactuated robotic hand, i.e., the Pisa/IIT Soft-Hand (Ajoudani et al. 2014a), is used as end-effector. We installed a pole in front of the robot arm, and we mounted on top an ASUS Xtion Pro RGB-D sensor to provide the perception data. The camera calibration is conducted w.r.t. the robot base frame.

Experiments
We conducted experiments within an agricultural robotics scenario, in order to validate the presented method. The setup included a container, placed between the camera pole and the robot, where three different materials were placed. To demonstrate distinct behaviors in the impedance parameter self-tuning, we considered materials with substantial differences in their viscoelastic properties and with large-scale use in agriculture: seeds, soil and expanded clay. A video of the experiment is available in the multimedia extension.
We follow the FSM state sequence to describe the experiments. The materials' container had a rectangular shape, and the materials inside were equally separated into three rectangular areas. Therefore, in the "Workspace definition" state, the FSM receives from the "Materials localization" perception unit the 12 Euclidean points delimiting the areas of the three materials. Accordingly, it updates the column of the data structure proposed in Table 1 related to the polygon vertices. Afterwards, the robot grasps a 27 cm-long metal stick to carry out the next phase, i.e. the "Exploration" state. Based on (16) and on the values reported in Table 2, α was set to 20,000. To ensure a good level of compliance in case of unpredicted collisions, the value of k_min was set to 500 N/m. During this state, the robot reaches the leftmost material, formed by seeds (material 1), dunking the metal stick into it. An interaction with the environment is expected to happen, and therefore the I_e value is activated, as shown at t = 1.5 s in the fourth plot of Fig. 6. Consequently, the Self-tuning impedance unit is enabled. While keeping the tool immersed in the material, the robot performs an 18 cm-long movement along the x axis. Since ΔP goes beyond the threshold ΔP_t, set to 1 cm, k_st increases following (13), as shown in the first plot of Fig. 6. As a consequence, the Cartesian stiffness along the direction of interaction is adapted; in this case the movement is performed only along the x axis. With the increase of the impedance values, we can notice that ΔP, which represents the Cartesian error along the motion vector, gets reduced and goes below the threshold ΔP_t. The maximum value reached by k_st in this case is equal to 1100 N/m, and it is associated with the relative material, as reported in the right column of Table 1. To complete the "Exploration" state, the robot repeats the same procedure for the other two materials.
As expected, the soil (material 2) turns out to be the stiffest material, with k_st reaching a value of 1650 N/m, while the expanded clay (material 3) lies between the other two, i.e. 1330 N/m. In the third plot of Fig. 6, we can notice how these values are tuned. After the identification of the impedance parameters, the FSM transits to the "Material distribution" state. The starting points of the scooping trajectories are detected by the "Peaks localization" unit of the perception module and stored in the corresponding column of Table 1.

Table 1 The data structure containing the data coming from the Self-tuning impedance unit, relative to the stiffness k_st (rounded to integers), and from the Visual perception module, relative to the polygon vertices of the materials and their peak points
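The per-material record described by Table 1 can be sketched as a simple data structure. The field names below are assumptions for illustration; the stored stiffness values are the ones identified during the "Exploration" state.

```python
from dataclasses import dataclass

@dataclass
class MaterialEntry:
    """One row of the Table 1 data structure (field names are assumed).

    Polygon vertices come from the "Materials localization" unit, the
    peak point from the "Peaks localization" unit, and k_st from the
    Self-tuning impedance unit after the "Exploration" state.
    """
    name: str
    polygon_vertices: list      # 4 Euclidean points delimiting the area
    peak_point: tuple = None    # scooping start point, filled later
    k_st: float = 500.0         # defaults to k_min before exploration

materials = [
    MaterialEntry("seeds", polygon_vertices=[]),
    MaterialEntry("soil", polygon_vertices=[]),
    MaterialEntry("expanded clay", polygon_vertices=[]),
]

# stiffness values identified during the "Exploration" state
materials[0].k_st = 1100.0
materials[1].k_st = 1650.0
materials[2].k_st = 1330.0
```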
Then, the robotic hand grasps a scooping tool in order to carry out the "Task" state, subdivided into four substates. The robot scoops the three materials and pours them into a plant pot, held by the second robotic arm, in the following sequence: soil (a), plant seeds (b), more soil (c), and expanded clay (d). These four substates are depicted in the plots of Fig. 7 and Fig. 8. In the latter, the green triangles represent the highest point of each substance, provided by the "Peaks localization" perception module. To foster a deeper understanding, the axes of this figure are oriented so as to analyze the task from a lateral view. In this way, it is easy to see how the stiffness value k_st is adapted along the direction of the motion P inside the interaction expectancy area. Faint, shorter arrows symbolize lower stiffness values, while longer, more vivid arrows represent higher stiffness values. The direction of the motion vector in Cartesian space is also specified in the plot related to the three components of the normalized motion vector P in Fig. 7.
When no interaction is expected, i.e. outside the containers, the robot keeps a compliant profile, and k_st is always set to k_min, i.e. 500 N/m. Entering the interaction expectancy area leads to a rapid adaptation of k_st, which assumes the value stored in Table 1 for each material. This can be noticed in the sudden growth of the arrows' length and color intensity. If P keeps its value below the threshold, the viscoelastic properties of the material did not change, and so there is no need for further adaptation. When the scooping is over, but the tool is still inside the container, k_st is reduced according to (18), as can be seen in the last part of the scooping. Notice that negligible variations could lead to unnecessary changes, so we designed a moving average window to calculate F_ext. In the last part of the depicted motion, the robot exits the interaction expectancy area, and k_st is restored to its default compliant value, i.e. k_min. To show that the impedance self-tuning also occurs in case of changes in the viscoelastic properties, we poured some water into the soil between the "Exploration" and the "Task" states. This adaptation is visible when the scooping tool enters the soil during "Task" (a), and it is caused by P exceeding the threshold P_t, as shown in the third plot of Fig. 7 at t = 6.3 s, when the Self-tuning impedance unit is activated again. The value of k_st for material 2 is increased from 1650 N/m to 1750 N/m. This is highlighted by the difference between the first and the other arrows inside the leftmost container in Fig. 8.
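The moving average window mentioned above can be sketched as a fixed-size filter over the force samples; the window length of 5 samples below is an illustrative choice, not the value used in the experiments.

```python
from collections import deque

class MovingAverage:
    """Fixed-size moving-average filter for the measured external force.

    Smoothing F_ext this way prevents negligible force variations from
    triggering unnecessary stiffness adaptations. The window length is
    an assumed, illustrative parameter.
    """
    def __init__(self, window=50):
        self.buf = deque(maxlen=window)

    def update(self, f_ext):
        self.buf.append(f_ext)
        return sum(self.buf) / len(self.buf)

# a single spiky sample barely moves the filtered force estimate
filt = MovingAverage(window=5)
out = [filt.update(f) for f in [10.0, 10.0, 10.0, 10.0, 20.0]]
```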
In Fig. 9 we show how the tuning of the Cartesian stiffness is achieved only in the directions of movement P, when the tool is inside the materials, in two of the "Task" sub-phases. In "Task" (a), k_st of material 2 is adapted at t = 6.5 s, since the soil viscoelastic properties were changed as explained above. We notice that the sum of the three Cartesian stiffness diagonal components is always equal to k_st. In "Task" (d), we see the adaptation also on K_c(y).

Fig. 7 "Task" state: starting from the impedance values already tuned in the "Exploration" state, the framework allows the robot not to lag behind, so that P remains below the P_t threshold. Only in "Task" (a) does P go beyond the threshold, since the material viscoelastic properties have been intentionally changed

Fig. 8 A lateral view of the four "Task" substates. k_st is projected onto the direction of motion and represented here by means of red arrows, whose intensity grows with their value (Color figure online)

Fig. 9 "Task" state: the tuning of the Cartesian stiffness is achieved only in the directions of movement P. The vectorial sum of the three diagonal components is always equal to k_st
To show that the framework reliability has been increased by means of the two "Fault Detection" sub-units, we repeated the experiment simulating a fault in the perception unit, by changing the pose of the box containing the materials during both the "Exploration" state and the "Task" state execution. In this way, while following the desired trajectory, the tool grasped by the robot end-effector collides with one of the container sides. Figure 10 shows an execution of the "Exploration" state performed to retrieve the first material's k_st. As can be noticed in the third subplot, by performing a linear regression (red solid curve) on the measured external force data (blue scatter), the collision is detected. On the other hand, Fig. 11 shows the behavior of the "Fault Detection" sub-unit associated with the "Task" state. Since in the "Exploration" state (performed without faults) k_st for material 1 reached 1100 N/m, by applying (19) with p = 0.3 we obtain k_st_max,1 = 1433 N/m. As shown in the plots, at the time of the collision, i.e. t = 6 s, P increases suddenly and k_st,1 goes beyond the k_st_max,1 computed above. The execution halts, and the robot exits the material and returns to its initial configuration.
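The "Task"-state fault check can be sketched as a simple bound test. Eq. (19) is not reproduced in this text, so a plain proportional margin (1 + p) over the exploration-phase stiffness is assumed here for illustration; only p = 0.3 and the 1100 N/m exploration value come from the experiment.

```python
def stiffness_fault(k_st_current, k_st_explored, p=0.3):
    """Hedged sketch of the "Task"-state fault detection check.

    The allowed maximum is the stiffness identified during "Exploration"
    enlarged by a margin p (the exact form of the paper's Eq. (19) is
    assumed to be this proportional bound). Exceeding it signals an
    unexpected collision, upon which the FSM halts the execution,
    retracts the tool, and returns to the initial configuration.
    """
    k_st_max = (1.0 + p) * k_st_explored
    return k_st_current > k_st_max

# material 1 was explored at 1100 N/m; a sudden rise past the bound
# (e.g. when hitting the container wall) flags a fault
collision = stiffness_fault(1500.0, 1100.0)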
To demonstrate further the validity and the effectiveness of the proposed method, we decided to perform an other experiment inserting an obstacle inside one of the materials, so as to simulate an uncertain environment, and carrying out one more time the FSM "Task" state. We put a piece of wood inside the seeds (material 1), placing it in the middle of the path of the expected reference trajectory. In this way, the tool held by the robot had to react adapting to the wood shape. We repeated the experiment three times. At first, we removed the "Self-tuning impedance unit" from the framework, so that the impedance parameters were not subjected to changes even if an interaction was expected. With this configuration, we performed the experiment with high and low impedance parameters and we compared the obtained results. Afterwards, we carried out the task with the same setup, but with the impedance regulation enabled. To evaluate the three described trials, whose plots are illustrated in Fig. 12, we decided to compare the external interaction forces acting on the robot end-effector, i.e. F ext , and the Cartesian error projected onto the direction of the movement, i.e. P, under the different conditions of the impedance parameters.
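(placeholder removed)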
The first column represents the data acquired while keeping always a high level of the impedance parameters, i.e. 1100 N/m (see bottom plot), that is the value reached by k st for material 1 during the "Exploration" state. In this case, although the tracking of the error P does not exceed excessively the imposed threshold P t , we measured interaction forces on z axis, represented by the blue line on the top plot, reached a quite high value (≈ 13 N). Therefore, this approach could lead to a system failure caused by a tool/robot damage. Notice that, with higher obstacle curvatures, the external forces measurements could scale quite rapidly easily leading to more substantial failures.
The second column shows the plots of the trial with lower impedance parameters, set at 500 N/m, as no interaction was ever expected. The robot is able to better comply with the external environment, as highlighted by lower interaction forces on z axis, that reach a maximum value of ≈ 10 N, and therefore damages are more likely prevented. Nevertheless, complying both with the expected and unexpected interactions with the environment leads also to a loss in terms of performances. This can be seen in the plots representing P, where the robot lags behind the desired trajectory up to 3 cm. This behavior can not be considered desirable, since the task is not carried out as expected.
Lastly, the third column depicts the data logged applying the method presented in this work. Stiffness and damping are updated on-line, based on the interaction expectancy and on the direction of the movement P. The external interaction forces are further reduced, reaching at maximum ≈ 8 N. P is significantly less w.r.t. to the case with low impedance.
In Fig. 13, we report the setup used in this experiment enhanced with the reference trajectory (blue curve) and the measured path (red curve) logged during the last trial, when the "Self-tuning impedance unit" was enabled.

Fig. 12
The plots show three different executions of the "Task" state with an object placed inside material 1 (setup shown in Fig. 13): at the left without applying the presented method and always keeping high impedance values, in the center as the previous case but with low impedance parameters, and at the right the trial with the Self-tuning impedance unit enabled. In the latter case, there are less external interaction forces w.r.t. the first case, and the error in the direction of the movement P is substantially less w.r.t. the second trial

Tank-based system passivity observer
Since the presented controller is based on continuous variations of the impedance parameters, we must demonstrate that the passivity of the system, and so its stability, is guaranteed. Following the approach presented in (Ferraguti et al.  Fig. 14 The analysis reveals that the tank energy is above the lower boundT l set to 0.5 J for the entire experiment, thus guaranteeing there is no loss of stability 2013), we implemented a tank-based approach to monitor the system stability. Formally, the model of the robot in the task space is given by: (a) (b) Fig. 15 The self-tuning impedance algorithm applied to the handle pulling phase of a standard industrial pallet jack. When an interaction is expected and P t is beyond a threshold, the impedance parameters are tuned along the motion vector P (a). On the right, the initial and final configuration assumed by the cobot are depicted, with a sketch representing the trajectory followed by the robot to pull the handle down (b) where the desired stiffness is equal to K d (t) = K const + K (t), being K const ∈ R 6×6 the constant stiffness term and K ∈ R 6×6 the time-varying stiffness. (x) ∈ R 6×6 and μ(x,ẋ) ∈ R 6×6 are the Cartesian inertia and Coriolis/centrifugal matrices respectively. The scalar x t ∈ R is the state associated with the tank and the tank energy T ∈ R + and T = 1 2 x 2 t . σ ∈ R and ω ∈ R 6 respectively are where,T u ∈ R + is a suitable, application dependent, upper bound that avoid excessive energy storing in the tank whilē T l ∈ R + is a lower bound below which energy cannot be extracted by the tank for avoiding singularities in (20) and thus the time-varying stiffness K (t) will be removed. For a detailed analysis of the system passivity, please refer to (Ferraguti et al. 2013). Figure 14 shows the stability analysis performed during the entire duration of the experiment, i.e. including the "Exploration" and the "Task" states, when no faults occurred.
As we can see from the bottom subplot, the tank energy was above the lower bound (T l = 0.5J ) during all the phases, which means that the full stiffness including the constant (set to the compliant value, 500 N) and varying parts can be realised without loss of stability.

Algorithm scalability
The self-tuning impedance algorithm has extensively demonstrated its validity in the previous sections. However, its applicability is not limited to the experimental setup described so far. In fact, by changing the FSM states illustrated in Sect. 2.5, various applications can gain advantages thanks to the proposed methodology.
An example is given by pick and place tasks, where the robot keeps a compliant profile on all the Cartesian axes before picking and after placing, while adjust the impedance parameters while carrying the task. Preliminary results, although tuning the parameters only along Cartesian axes, are presented in (Balatti et al. 2018) for a debris removal task.
Another application field is represented by the logistics sector, where cobots are expected to automatize repetitive and physically demanding works, as transporting airport mobile stairways, heavy industrial carts, and warehouses pallet jacks. To this end, hereafter we show a demonstration of the self-tuning impedance algorithm applied to the pulling task of a standard industrial pallet jack. We define a new FSM, that includes a "Handle pull down" state where I m gets triggered. On the other hand, vision comes into play triggering I f whenever the end-effector is within the area around the pallet jack handle (similarly as in (Balatti et al. 2018)), which is recognized through a standard template matching perception algorithm. The "Handle pull down" state is performed with a circular trajectory as depicted in the lower part of Fig. 15b. In this sketch the red arrows represent the online stiffness parameters adaption as in Fig. 8. In the plots of Fig. 15a, it is represented the tuning of k st along the motion vector, triggered only when I e is set to True and P is beyond the threshold P t , set to 3 cm. The top right part of Fig. 15 represents the initial and final configuration of the MOCA robot (Wu et al. 2019) used in this experiment. A video showing the pallet jack handle pulling results is part of the multimedia extension.

Conclusion and discussion
This paper presented a novel framework to enhance robot adaptability in unknown and unstructured environments. The system relied on a self-tuning impedance controller, able to regulate the impedance parameters only on the direction of the motion vector, and activated just when an interaction with the external environment was predicted. It additionally included a visual perception module that improved the situation-awareness of the robot, localizing the surrounding materials and their peak points. Notice that the material detection process (i.e. color-based region growing segmentation) could be replaced with other RGB-based methods, such as deep learning, to improve the time complexity. Given that the visual results were accurate and fast enough for our application, we leave this as future work.
To detect faults, we also presented a novel unit capable of recovering from unexpected collisions with the boundary. A Finite State Machine was introduced to manage the transitions between the system states. We experimentally validated the presented framework in an agricultural task. Although we are working towards making the method autonomous, it still needs an offline tuning of some parameters, such as α and β, whose online adaptation will be the focus of our future work. Nevertheless, using non-optimal α and β values does not imply failure in task execution, but rather introduces some variations to the convergence speed of the self-tuning control gains. An analysis of the stability of the system is reported, since the algorithm includes time-varying tuning of the impedance parameters.