1 Introduction

Human-Robot Collaboration (HRC) is expected to significantly advance manufacturing by introducing high flexibility in assembly cells [1], and it also promises to enhance human capabilities in other fields, including domestic welfare and assistance to medical doctors [2, 3]. To achieve smooth collaboration between a human operator and a collaborative robot, reciprocal awareness is fundamental: the robot has to be aware of the human's actions, and the human has to know the robot's state, for the collaborative task to proceed fluently. This need was underlined by Drury et al. in a review on awareness in Human-Robot Interaction [4].

Recent advances in interfaces for improved human and robot perception in HRC were surveyed in [5]. On the one hand, human sensorimotor information can be used to monitor human behaviour and plan appropriate robot responses in different phases of a collaborative task [6,7,8]. Peternel et al., for example, used a vision system and EMG electrodes to detect human motion and muscular activity [6], whereas Ishida et al. used a wearable vibration sensor to discriminate human actions [7]. On the other hand, visual, auditory, and tactile feedback can be employed to improve human situation awareness in HRC [1, 9]. In [9], for instance, human intention was inferred through visual monitoring, and mutual understanding was achieved by alerting the human through haptic cues when the robot understood the human's intention with a certain level of confidence.

In this paper, we present a human-robot collaborative set-up where the human sensorimotor system is virtually connected to the system of sensors and actuators of the robot through wearable devices. Human actions are recognized by means of a wearable skin vibration sensor, and successful recognition is communicated to the human through the activation of a vibrotactile ring. The proposed collaboration paradigm is sketched in Fig. 1. The idea is to integrate the benefits of shared human perception [7] with those of operator awareness [9], creating a bilateral haptic connection between humans and robots, which we call shared haptic perception. The effectiveness of the proposed communication paradigm was demonstrated through an experimental validation involving 8 trained volunteers performing a complex collaborative task with a robot arm.

Fig. 1.

Shared haptic perception between humans and robots: general idea. Human perception is shared because the same vibrations that are sensed by human touch receptors during the collaborative task are also detected by the wearable sensor and, thus, by the robot. Robot perception (enhanced by an action recognition algorithm) is shared with the human thanks to the tactile signal sent by the robot through the haptic device.

2 Methodology

The proposed collaboration paradigm is based on the use of two wearable devices, a vibration sensor and a vibrotactile ring, and on an action recognition algorithm.

Wearable Devices. In this study, the wearable skin vibration sensor developed by Tanaka et al. [10] is used to send tactile information from the human to the robot. The sensor uses a polyvinylidene difluoride (PVDF) film and detects vibrations propagating on the skin surface. The acquired data are used to detect the current human action. The PVDF sensor does not hinder the natural movements of the hand and allows the wearer to touch objects directly, because it is light (about 20 g) and is worn wrapped around a finger, like a ring. In [7], the authors showed the advantages of placing the sensor on the finger rather than on the manipulated object. The sensor output “directly represents operator’s perception” [7], and instrumenting the human also makes it possible to apply the proposed framework in different situations without having to modify the environment around the user.

To send tactile information from the robot to the human, a wearable vibrotactile ring embedding a HAPTIC™ Reactor (ALPS ALPINE CO., LTD.) is used. Two vibration bursts separated by an interval of 20 ms were sent to the participant as an alert that the action had been recognized. We chose a vibration frequency of 200 Hz because in [9] this kind of feedback was found to be easily recognizable and helpful for proceeding smoothly with an HRC task.

Action Recognition. A paradigmatic task in which the human closes an envelope and the robot applies a stamp over it was chosen to show the effectiveness of the proposed tactile communication strategy. A Support Vector Machine (SVM) was used to recognize, based on the PVDF sensor output, the three different human actions involved in the task (see Fig. 2-(left)): gluing (human applies the glue on the envelope), tracing (human traces the envelope opening with index fingernail), and no contact (state other than the above two states). Note that a vibration sensor is particularly suited to recognize actions that imply interaction with the environment. It might be difficult, for example, to infer whether the human is actually tracing the paper with some strength or is just moving over it without even touching it, using only a vision system.

Similarly to [7], to distinguish the different states with the SVM, we used two features: vibration intensity (\(i_{RMS}\)) and frequency ratio (\(r\)). They were computed from the power spectral density (PSD) of the sensor output, calculated in the range between \(f_1=100\) Hz and \(f_2=1000\) Hz: \( i_{RMS}=\log \sqrt{\int ^{f_2}_{f_1}{\mathrm{PSD}(f)\,\mathrm{d}f}}\), \(r=\frac{A}{B}\). The value \(i_{RMS}\) is the logarithm of the root mean square (RMS) of the PSD of each sample, \(A\) is the log(RMS) of the PSD in the range [850–1000 Hz], and \(B\) is the log(RMS) of the PSD in the range [\(F_{peak} \pm 75\) Hz], where \(F_{peak}\) is the frequency at which the PSD reaches its maximum value.

Before each experiment, participants were asked to wear the vibration sensor and perform the three different actions, five times each. The collected data were used to create a linear SVM model based on the values of the two indices defined above. Figure 2-(right) presents an example of the resulting SVM model. The top panels of the graphs in Fig. 3 show examples of complete acquisitions from the sensor for the gluing and the tracing state. From a total of 2 s, a central interval lasting 1 s was selected and divided into 5 samples of 0.2 s each (middle panel). For each sample, \(i_{RMS}\) and \(r\) were computed from the PSD (lower panel). The gluing action generates vibrations with a lower amplitude than those related to tracing.
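The feature extraction described above can be sketched as follows. This is only an illustrative reading of the definitions, not the implementation used in the study: the sampling rate `FS`, the base-10 logarithm, the plain periodogram used as PSD estimate, the interpretation of log(RMS) over a band as \(\log \sqrt{\int \mathrm{PSD}(f)\,\mathrm{d}f}\) over that band, and all function names are assumptions.

```python
import numpy as np

FS = 5000.0                  # sampling rate in Hz (assumed; not given in the text)
F1, F2 = 100.0, 1000.0       # analysis band from the text
TINY = np.finfo(float).tiny  # keeps log10 finite for empty or silent bands


def central_samples(signal, fs=FS, keep_s=1.0, sample_s=0.2):
    """Keep the central `keep_s` seconds and split them into equal samples."""
    signal = np.asarray(signal, dtype=float)
    n_keep = int(round(keep_s * fs))
    n_sample = int(round(sample_s * fs))
    if len(signal) < n_keep:
        raise ValueError("acquisition shorter than the interval to keep")
    start = (len(signal) - n_keep) // 2
    centre = signal[start:start + n_keep]
    return [centre[i:i + n_sample]
            for i in range(0, n_keep - n_sample + 1, n_sample)]


def psd(sample, fs=FS):
    """One-sided periodogram (rectangular window) as a simple PSD estimate."""
    n = len(sample)
    p = np.abs(np.fft.rfft(sample)) ** 2 / (fs * n)
    # Double every bin that has a negative-frequency twin (not DC, not Nyquist).
    if n % 2 == 0:
        p[1:-1] *= 2.0
    else:
        p[1:] *= 2.0
    return np.fft.rfftfreq(n, d=1.0 / fs), p


def log_rms(freqs, p, lo, hi):
    """log10 of the square root of the PSD integrated over [lo, hi]."""
    band = (freqs >= lo) & (freqs <= hi)
    df = freqs[1] - freqs[0]
    return np.log10(np.sqrt(np.sum(p[band]) * df) + TINY)


def features(sample, fs=FS):
    """Return (i_RMS, r, F_peak) for one sample."""
    freqs, p = psd(sample, fs)
    i_rms = log_rms(freqs, p, F1, F2)
    in_range = (freqs >= F1) & (freqs <= F2)
    f_peak = freqs[in_range][np.argmax(p[in_range])]  # peak searched in [F1, F2]
    a = log_rms(freqs, p, 850.0, F2)
    b = log_rms(freqs, p, f_peak - 75.0, f_peak + 75.0)
    return i_rms, a / b, f_peak
```

Each 2 s acquisition yields five (\(i_{RMS}\), \(r\)) pairs, which would then be passed to a linear SVM. Note that \(r\) is undefined when \(B = 0\), i.e., when the RMS around the peak equals 1 in the chosen units.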

Fig. 2.

(Left) Human states recognized by the SVM: no contact, gluing, tracing. (Right) Example of SVM model where data are well separated considering the intensity and ratio parameters (black: no contact, blue: gluing, red: tracing). (Color figure online)

Fig. 3.

Examples of PVDF sensor output and power spectral density (PSD) for gluing state (left) and tracing state (right). Top panel: complete acquisitions, middle and lower panels: sensor output and PSD for a sample of 0.2 s.

3 Experiments

3.1 Experimental Procedure

To design our experimental set-up (Fig. 4), we took inspiration from the previously described prototypical task of closing and stamping an envelope (Fig. 2) and made it more complex, so as to better study the effectiveness of the proposed communication paradigm. In particular, we wanted to investigate how vibrotactile awareness signals affect the performance of well-trained participants. This is an advancement with respect to [9], where participants underwent only a brief training and were not expert in the performed task.

Fig. 4.

Experimental set-up for the chosen HRC task.

Participants wore the haptic ring on the left hand and the PVDF sensor on the right, and listened to white noise while conducting the experiments. They sat in front of a collaborative robot arm, the open-source manipulator Mikata arm (ROBOTIS Co., Ltd.), which has four actuators and had a stamp attached to its end-effector. In each experimental trial, the human operator had to trace a long piece of paper (size: \(60\times 1000\) mm) with the right index finger, and the robot had to put a stamp in a predefined position upon recognition of the tracing state. In particular, the current PVDF sensor output was classified with the trained SVM model every 0.02 s. If the result of the classification was “tracing state” 50 consecutive times (i.e., for 1 s), the robot started its stamping task.
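The triggering rule above (50 consecutive “tracing” classifications at one classification every 0.02 s) can be sketched as a simple streak counter. The function name, the label strings, and the list-based interface are hypothetical and chosen only for illustration; the actual robot-side implementation is not described in the text.

```python
CLASSIFICATION_PERIOD_S = 0.02  # one SVM classification every 0.02 s (from the text)


def stamping_trigger(labels, target="tracing", required=50):
    """Return the index of the classification that triggers stamping, or None.

    `labels` is the sequence of SVM outputs in time order. The streak is
    reset to zero whenever a label other than `target` is observed.
    """
    streak = 0
    for i, label in enumerate(labels):
        streak = streak + 1 if label == target else 0
        if streak >= required:
            return i
    return None


# Usage: a single misclassification restarts the 1 s count.
labels = ["no contact"] * 10 + ["tracing"] * 30 + ["gluing"] + ["tracing"] * 50
index = stamping_trigger(labels)
elapsed_s = None if index is None else (index + 1) * CLASSIFICATION_PERIOD_S
```

Requiring an uninterrupted streak trades latency (at least 1 s) for robustness against isolated misclassifications.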

Trials were conducted under two conditions, one including vibrotactile feedback from the robot after tracing action recognition (awareness signal), and one without it. To make participants aware of the fact that the collaboration was mediated by an action recognition algorithm, they were instructed to trace slowly until the robot recognized the tracing motion, and then to complete the task as soon as possible. In other words, the goal was to finish the job as quickly as possible, but participants had to take into account the communication with the robot to be sure to get the paper stamped and thus successfully accomplish the task. Coordination between human and robot was important for two main reasons: i) the robot could put a stamp only after the human had traced the part of the paper where the stamp had to be applied, and ii) the human had to be sure that the robot recognized the tracing action before it was actually completed. Note that when haptic feedback was not active, the user could infer the robot state only by looking at the robot and waiting to see it move towards the stamping position.

A within-subjects experimental design with complete counterbalancing was adopted. Each participant tested both conditions, with randomly assigned orderings. In particular, half of the participants initially conducted the experiment with feedback and then without, and the other half did the opposite. Eight volunteers (6 males, 2 females, average age 26.5) participated in the study. They all had previous experience with wearable haptics. Informed consent was obtained from all of them, and the experimental evaluation protocol followed the Declaration of Helsinki. Participants did not receive any payment and were able to leave the experiment at any moment. First, they were asked to record data to create the SVM model, as described in Sect. 2. Then, each of them performed 15 trials per condition as a training phase and, lastly, 5 trials per condition as a test phase. In the test phase, users’ performance in terms of execution time was recorded. At the end of each trial, users had to press a button on the keyboard of a laptop placed on their right and then wait for a fixed amount of time (shown through a countdown on the screen) before starting the new trial.
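As an illustration of the counterbalancing described above, the following sketch randomly assigns the two condition orders so that exactly half of the participants start with feedback. The function name, condition labels, and seed handling are hypothetical; the text does not specify how the random assignment was carried out.

```python
import random


def counterbalanced_orders(participant_ids, seed=None):
    """Randomly assign condition orders, half of the participants to each."""
    ids = list(participant_ids)
    if len(ids) % 2 != 0:
        raise ValueError("complete counterbalancing needs an even group size")
    rng = random.Random(seed)  # local generator, so global state is untouched
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for pid in ids[:half]:
        orders[pid] = ("feedback", "no feedback")
    for pid in ids[half:]:
        orders[pid] = ("no feedback", "feedback")
    return orders


# Usage with eight participants, as in the study.
orders = counterbalanced_orders(range(1, 9), seed=0)
```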

In the first part of the training phase (10 trials), we were more interested in making the users learn the task, and thus we kept the robot stationary until the recognition of the tracing state. However, in real applications, the robot is never left idle and usually executes other actions while waiting for human operations. This is why, in the second part of the training phase (5 trials) and in the test phase, the arm was programmed to randomly reach four different poses, emulating other possible tasks, while waiting for the action recognition.

3.2 Experimental Results

The execution time of the 5 trials of the test phase, in the two conditions, is displayed for all participants in Fig. 5. The empty circles show the execution time of each trial, and the filled ones indicate the average execution time for each participant over five trials. The mean and standard deviation of these average data (\(3.20\,\pm \,0.25\) s with vibrotactile feedback, and \(3.64\pm 0.31\) s without vibrotactile feedback) are used to draw the bar plots on the right, labeled as “average”. The Shapiro-Wilk test indicated that these data were normally distributed, and a paired t-test showed a significant difference between the average execution times in the two conditions (\(t_7 = 3.8\), \(p = 7.0\times 10^{-3}\)). In other words, when haptic feedback was active, participants took significantly less time to complete their task than when there was no haptic feedback.
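For reference, the paired t statistic used above is computed from the per-participant differences between the two conditions. The sketch below uses only the standard library; the function name is hypothetical, and the per-participant averages themselves are not reproduced here since they are reported only graphically in Fig. 5.

```python
import math
from statistics import mean, stdev


def paired_t(with_feedback, without_feedback):
    """Return (t, degrees of freedom) for a paired t-test.

    Both arguments hold one average execution time per participant, in the
    same participant order. A positive t means longer times without feedback.
    """
    if len(with_feedback) != len(without_feedback):
        raise ValueError("both conditions need one value per participant")
    n = len(with_feedback)
    if n < 2:
        raise ValueError("at least two participants are required")
    diffs = [b - a for a, b in zip(with_feedback, without_feedback)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1
```

With eight participants the test has 7 degrees of freedom, which matches the subscript in \(t_7\).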

Fig. 5.

Task execution time in two conditions (with/without feedback): single trial (empty circle) and average (filled circle) for each participant, and bar plot of mean and std of the averages. ** indicates \(p < 0.01\) with the paired t-test.

Fig. 6.

Recognition time in two conditions (with/without feedback): single trial (empty circle) and average (filled circle) for each participant, and bar plot of mean and std of the averages.

In both conditions, the PVDF sensor worn by participants was active and was used to recognize user actions. The recognition was successful in all the trials. To ensure the validity of this result, we analysed the recognition time of the robot, i.e., the time that it took to recognize that the human was tracing, in the two conditions. Figure 6 shows the recognition time for each participant for each trial (empty circles) and on average (filled circles). As before, the bar plots are built from the mean and standard deviation of the average values for all participants (\(1.13\,\pm \,0.08\) s with vibrotactile feedback, and \(1.13\pm 0.12\) s without vibrotactile feedback). In this case, no significant difference was found between the two conditions at a significance level of 5%; that is, the recognition time did not vary significantly with the presence of feedback.

4 Discussion

The results presented in Sect. 3.2 show not only that the proposed communication paradigm offers a viable solution for implementing human-robot collaborative tasks, but also, and more importantly, that the vibrotactile feedback significantly improves human performance. The vibrotactile awareness signal allows operators to understand whether their action was successfully recognized, without having to wait to see the robot moving towards the stamping position. Besides, the fact that the robot performs other actions before the recognition makes it even more difficult for users to understand the robot's next movements just from sight.

The advantages of enhancing operator awareness were initially observed in [9], and in this paper we show that awareness is also important in a completely different scenario, where human actions are not predicted but recognized, using skin vibration sensing rather than visual monitoring, and, above all, where participants are not novices but are well trained to perform the task.

5 Conclusions

This work presents a new human-robot communication paradigm based on the concept of shared haptic perception: the user sends to the robot haptic cues that allow the robot to recognize human actions, and the robot informs the human through symbolic vibrotactile signals (awareness signals) about the successful interpretation of the received data. This bilateral communication, achieved through the use of unobtrusive wearable sensing and actuation devices, makes it possible to reach reciprocal awareness and mutual understanding between the two partners.

An experimental validation with 8 participants was conducted and showed that awareness signals allow well-trained users to complete their task in significantly less time than without haptic feedback. Future work will focus on investigating other tactile feedback modalities (e.g., continuous exchange of tactile information), on finding other collaborative tasks that can benefit from the proposed communication strategy, and on studying whether shared haptic perception can also improve the learning process of a task for untrained operators.