1 Introduction

Historically, industrial manipulators were designed to automate the execution of repetitive or dangerous tasks and worked in dedicated cages (denoted as robotic cells) for safety reasons. The technological revolution known as ‘Industry 4.0’ has radically changed the global industrial scenario and the field of robotics, making the collaborative robot (cobot, in short) one of the enabling technologies of the smart factory. A cobot is a lighter and smaller manipulator, specifically designed to help shop-floor workers accomplish industrial tasks by safely working in close proximity to them. Human-robot cooperation (HRC) fosters the efficiency and flexibility of industrial production systems, since it effectively combines the advanced cognitive skills and adaptability of the human with the superior accuracy and repeatability of the cobot. Therefore, whereas cobots can relieve shop-floor workers from alienating or tiring tasks, humans can focus on high-level decision-making operations that can hardly be automated. Nowadays cobots are widely employed in small- and medium-sized enterprises (SMEs) for collaborative manufacturing tasks [3].

The problem of improving the effectiveness of HRC, i.e. minimizing the task cycle time or maximizing the productivity, has always played a pivotal role in the related scientific research. However, other aspects of HRC have recently emerged as crucial for improving the quality of the cooperation. An example is the prevention or mitigation of the cognitive and physical distress that might be experienced by the shop-floor worker [1]. Indeed, humans by nature have a limited capability to endure fatigue and are also vulnerable to workload. Furthermore, high levels of physical and cognitive fatigue negatively affect human capabilities, not only increasing the risk of work-related musculoskeletal disorders (MSDs) or burn-out phenomena, but also undermining the team performance [6]. Recently, some studies have investigated the possibility of adapting certain robot behavioral features according to the operator’s mental or physical distress, to help him/her accomplish the collaborative task, or to reach a smoother interaction [5, 7, 9]. However, the existing literature in HRC has focused either on the mitigation of the human physical and cognitive workload or on the optimization of the human productivity, without addressing the problem of simultaneously optimizing both human performance and well-being. Moreover, non-invasive methods to estimate the fatigue accumulated by the human during cooperation are still at an early stage in the literature.

In this chapter we first analyze how the robot interaction role (being leader or follower during cooperation) influences the human stress and productivity. Based on that, a game-theoretic (GT) model of the trade-off between stress and performance is proposed. Then, based on the real-time monitoring of this trade-off, we describe a novel control strategy that enables the cobot to simultaneously optimize online the human well-being and performance, by suitably varying its role. Moreover, we propose a strategy to estimate in real time and non-invasively the work-related fatigue experienced by the human co-worker during HRC. Based on that, we describe a method that dynamically decides which task activities are better allocated to the cobot and which ones to the human to minimize his/her musculoskeletal workload. The remainder of this chapter is structured as follows. Section 2 analyzes how the robot interaction role influences the stress and productivity of the human co-worker. Section 3 describes the GT-based robot strategy to simultaneously optimize the human performance and well-being. Finally, Sect. 4 addresses the dynamic task allocation strategy to mitigate the physical fatigue accumulated by the human.

2 Understanding the Impact of the Robot Interaction Role on the Human Physiological Stress and Productivity

A pivotal feature of HRC is the robot interaction role, i.e. being leader or follower during cooperation. Whereas in the first case the robot is responsible for driving the task progression, in the second case it complies with the human decision. To the best of our knowledge, previous works in the literature have mainly analyzed the feasibility of the leader-follower paradigm in HRC. However, existing research does not provide a complete analysis of how the robot leader-follower interaction modes might impact the human co-worker in terms of stress and performance. To fill this gap we carried out an exploratory study examining whether and how the leader-follower robot interaction strategy influences the productivity and physiological (cognitive) stress of the human co-worker. The ultimate goal of this research is to lay the foundations for determining how the robot should behave when interacting with a generic human co-worker in order to obtain the best compromise between the mitigation of his/her cognitive stress and the maximization of his/her production performance.

To perform this evaluation, we set up a task consisting of a tower-building exercise, which was jointly performed by the human and the robot (ABB YuMi cobot). Moving alternately, they each had to place a box on the top of a stack, with the sole constraint that a new block must be of the same color as the previous one. The leader agent was in charge of initiating the task and driving, at each step, the selection of the color of the next box. Conversely, the follower agent had to react by complying with the partner’s choice. To carry out the analysis, a between-subjects design was adopted: the overall population of 33 volunteers was divided into two similar and homogeneous groups. The subjects belonging to the first group were leaders during the interaction with the robot, whereas those of the second group had to follow the task progression decided by the robot. In both cases the robot performed the same motion trajectory to reach each box and the volunteers were asked to perform the task as fast as possible. We evaluated the physiological stress induced on each subject based on the analysis of the most reliable stress indicators according to the related literature, i.e., LF/HF and RMSSD [8]. These indicators are extracted by analyzing the heart-rate variability (HRV) of the subject’s electrocardiographic (ECG) signal. The ECG signal was acquired from each user via 3 disposable electrodes placed on the subject’s thorax and abdomen. The subject’s production rate was instead estimated based on the measured user’s cycle time.
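As an illustration of one of the HRV indicators mentioned above, the following minimal sketch computes RMSSD using its standard definition, i.e. the root mean square of the successive differences between consecutive RR intervals. The function name `rmssd` and the sample RR values are hypothetical and not taken from the study; the LF/HF ratio is not covered here because it requires a spectral estimate of the RR series.

```python
import math


def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals (in ms)."""
    if len(rr_ms) < 2:
        raise ValueError("at least two RR intervals are required")
    # Differences between each RR interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Made-up RR intervals (ms), for illustration only.
rr_example = [800.0, 810.0, 790.0, 800.0]
print(rmssd(rr_example))
```

A lower RMSSD is commonly read as reduced short-term variability, which is why it appears in the denominator of the stress statistic used later in the chapter.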

The results showed that, whereas the robot leadership entails a higher productivity of the team at the expense of an increase in the physiological stress, the human leadership induces lower stress but also determines a lower productivity.

3 Real-Time GT-Based Method to Maximize the Human Performance and Mitigate the Cognitive Stress in HRC

In Sect. 2 the ‘open-loop’ effects of the robot interaction role on the human productivity and stress have been presented. The expression ‘open-loop’ indicates that the effects of the robot role are simply evaluated, but not exploited to vary the robot behavior accordingly (i.e. in closed-loop). The results of this exploratory study have highlighted two important aspects. The first is that there exists a trade-off between the maximization of human productivity and the minimization of the related cognitive stress. The second is that the robot interaction role might serve as a suitable control variable to effectively manage this trade-off. Based on these outcomes, we developed a novel closed-loop control strategy that enables the robot to simultaneously stimulate the human productivity and mitigate the cognitive stress, by autonomously adapting its role online. Hence, this strategy allows the joint optimization of human performance and well-being. The difficulties in developing such an optimization strategy are manifold:

  • the dynamics of the human response (in terms of stress and performance) to a certain external stimulus (i.e. stressor), such as the variation of the robot behavior, is neither a priori known nor predictable;

  • due to the intrinsic complexity of the stress regulatory mechanism, the stress response cannot be considered a fully observable phenomenon and thus cannot be properly modeled through mathematical equations;

  • the human response to an external stressor is subject-specific;

  • in the related literature, no guidelines are provided to determine the optimal stress-performance state that a certain human being can achieve;

  • the relationship between the dynamics of stress and performance is not known and varies from individual to individual.

For these reasons, a model-based control strategy would have been unsuitable. To realize the optimization strategy, the human stress \(s_k\) and performance \(p_k\) are continuously monitored by the robot through the following statistics:

$$\begin{aligned} s_k = - \biggl (\frac{LF_k/HF_k}{RMSSD_k}\biggr ), \qquad p_k = -(\bar{t}_k + \sigma _k) \end{aligned}$$
(1)

where \(\bar{t}_k\) and \(\sigma _k\) are the average human cycle time and related standard deviation, respectively. To express the existing trade-off between the minimization of the human stress and the maximization of his/her production performance, a GT-based approach has been developed. In other words, game theory provides the theoretical foundations for modeling how the interaction between human and robot influences the stress and performance of the human co-worker. In particular, to model the above-mentioned trade-off, we propose to interpret the overall interaction between the human and the robot as a repeated non-cooperative normal-form game between two players [10], i.e., the human (H) and the robot (R). Moreover, we assume the players H and R to be rational (self-interested), i.e. each one of them aims at maximizing its own profit throughout the game. Hence, these two players allow us to account for the two competing aspects of any working environment: on the one hand, the importance of increasing the productivity; on the other hand, the need to mitigate the worker stress. Thus, we consider that the goal of player R is to maximize the productivity and that of player H is to minimize the stress. Then, we exploit the concept of game actions [10] to express the potential attitudes of human and robot during the interaction. More specifically, we assume that each player might adapt to the other player (i.e. play action D), if doing so meets the goal of the other player, or not adapt (i.e. play action ND), in the opposite case. Hence the GT-based model makes it possible to account for the admissible states that can be achieved during HRC. In the GT framework these states are denoted as the outcomes (of the game), which result from the combination of the two players’ simultaneous game actions. 
Besides, according to the principles of GT, each outcome is associated with a pair of values, known as utility functions (or payoffs), each of which expresses how much the related player is satisfied with that outcome. In the considered case, at each step, the payoffs of player H and R associated with the current game outcome are updated with the measured \(s_k\) and \(p_k\), respectively. Thus, the proposed HRC game model can be exploited to evaluate the admissible compromises in terms of stress and performance, that might result from the interaction between the robot and the specific human co-worker. This formulation allows us to identify a particular outcome of the game, known as Nash Equilibrium (NE), that is a local equilibrium state of the game and represents a “no-regrets” situation for the players. Indeed, by definition, an outcome of the game is a NE if, once reached, no player has incentive to unilaterally change its action to increase its payoff. So, in the proposed formulation, the utility functions associated with the NE are considered the stress-performance equilibrium trade-off against which to locally evaluate the quality of the current stress-performance compromises that are resulting from the HRC and thus realize the optimization. Based on these evaluations, a Learning Automaton (LA) [4], which is the full-fledged decisional part of the proposed control strategy, is exploited to command the robot how to adjust its interaction role to optimize this trade-off. More specifically, at each step, we assume that the robot can be commanded to apply one of the following actions: \(\alpha _F\) or \(\alpha _L\). The first indicates that the robot has the role of follower during cooperation and, as such, it has to synchronize with the production rhythm dictated by the human. The latter, instead, indicates that the robot becomes the leader of the cooperation and thus increases the production pace to stimulate the human. 
In other words, the GT model is used to evaluate how the robot action affects the human co-worker in terms of stress and performance. The LA, by rewarding or penalizing the robot action proportionally to its positive or negative effect on the human, is then able to determine the best next action (interaction role) that the robot has to do to optimize the stress-performance state of the human co-worker.
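The pieces described above can be sketched in a few lines of Python. The sketch evaluates the statistics of Eq. (1), lists the pure-strategy Nash equilibria of a two-player game with actions D and ND, and updates the probability of the robot taking the leader role. Several points are assumptions made only for illustration: the payoff numbers are invented, the sample standard deviation is used for \(\sigma _k\), and the learning automaton uses a linear reward-penalty update with a hypothetical learning rate, since the text does not state which reinforcement scheme is adopted.

```python
import statistics

ACTIONS = ("D", "ND")  # adapt / do not adapt, as named in the text


def stress_stat(lf_hf, rmssd):
    """s_k of Eq. (1); values closer to zero indicate lower stress."""
    return -(lf_hf / rmssd)


def performance_stat(cycle_times):
    """p_k of Eq. (1); sample standard deviation is an assumption."""
    return -(statistics.mean(cycle_times) + statistics.stdev(cycle_times))


def pure_nash_equilibria(payoff_h, payoff_r):
    """Outcomes (a_H, a_R) where no player gains by deviating unilaterally.

    payoff_x maps an outcome (a_H, a_R) to the payoff of player x.
    """
    equilibria = []
    for a_h in ACTIONS:
        for a_r in ACTIONS:
            h_ok = all(payoff_h[(a_h, a_r)] >= payoff_h[(alt, a_r)] for alt in ACTIONS)
            r_ok = all(payoff_r[(a_h, a_r)] >= payoff_r[(a_h, alt)] for alt in ACTIONS)
            if h_ok and r_ok:
                equilibria.append((a_h, a_r))
    return equilibria


def la_update(p_leader, chose_leader, rewarded, rate=0.1):
    """Linear reward-penalty update of P(alpha_L); P(alpha_F) = 1 - P(alpha_L).

    A rewarded action becomes more likely, a penalized one less likely.
    """
    target = 1.0 if chose_leader == rewarded else 0.0
    return p_leader + rate * (target - p_leader)


# Invented payoffs: H is paid in stress terms, R in performance terms.
payoff_h = {("D", "D"): -0.5, ("D", "ND"): -0.9, ("ND", "D"): -0.3, ("ND", "ND"): -0.6}
payoff_r = {("D", "D"): -12.0, ("D", "ND"): -10.0, ("ND", "D"): -14.0, ("ND", "ND"): -11.0}

print(pure_nash_equilibria(payoff_h, payoff_r))
print(la_update(0.5, chose_leader=True, rewarded=True))
```

In a closed-loop use, the payoff tables would be refreshed at each step with the measured \(s_k\) and \(p_k\), and the reward signal would come from comparing the current payoffs with those at the equilibrium; how that comparison is scaled is left open here.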

The proposed method was validated in a realistic collaborative assembly task involving the ABB YuMi cobot and 15 volunteers. During the experimental campaign, the cycle time and the ECG signal of each subject were acquired as described in Sect. 2. In this case, however, each user was asked to perform a collaborative task consisting of assembling the components of a box that exemplifies an electrical circuit. Here too, a between-subjects design was adopted. The subjects were evenly distributed by gender and age into 3 groups: NC, CP and CPS. Each group was associated with a different testing condition, i.e. a different robot behavior. However, the experimental protocol was the same for all groups. Volunteers of group NC (i.e. No Control) were leaders of the cooperation, namely the robot simply followed their production pace without optimizing productivity or stress. Volunteers of group CP (i.e. Control Productivity) tested a strategy where the robot always dictated the production rhythm to solely stimulate the productivity, regardless of the subject’s stress. Finally, group CPS (i.e. Control Productivity and Stress) experienced the full strategy just described, where the robot adjusted its role, and thus the production pace, to jointly optimize the human well-being and productivity. In view of the outcomes of the study described in Sect. 2, strategy CP, which coincides with the robot being leader of the HRC, is considered the reference situation in terms of maximum productivity (throughput) that can be achieved by the worker. Strategy NC, which coincides with the human being leader of the HRC, is regarded as the reference situation in terms of minimum stress that the worker can experience during cooperation.

3.1 Results

The results of the experimental campaign and the related statistical analysis highlighted that, as regards productivity (see Fig. 1b), the average throughput of groups CP and CPS could not be considered statistically different. Thus the proposed strategy (CPS) achieves a productivity comparable to the maximum-productivity reference. Regarding the stress (see Fig. 1a), the average ECG-based stress statistics of groups NC and CPS could not be considered statistically different. Hence, the proposed method also achieves a stress level comparable to the minimum-stress reference. To conclude, the novel control strategy, by suitably adjusting the robot role during cooperation, effectively enhances the productivity of the human-robot team, while significantly mitigating the work-related cognitive stress.

Fig. 1 (a) Relative ECG-based stress statistics obtained for volunteers of Groups NC, CP and CPS. (b) Productivity rate obtained for volunteers of Groups NC, CP and CPS

4 A Dynamic Task Allocation Strategy to Mitigate the Human Physical Fatigue During HRC

Besides the mitigation of the cognitive stress, it is becoming increasingly important to develop methods that also mitigate the physical workload undergone by the shop-floor worker. Indeed, in the last few years, the incidence rate of MSDs cases in the manufacturing industry has registered very high levels. In these frameworks, typically, the human operator is involved in several repetitive manual activities that might overload his/her musculoskeletal apparatus, undermining his/her health (risk of MSDs) and work-related performance.

In this section we present a novel dynamic task allocation strategy aiming at relieving the physical workload accumulated by the worker during cooperation. Given a certain task composed of a number of activities that human and robot have to do, the proposed method dynamically decides which task activities are better allocated to the robot and which ones to the human to minimize his/her accumulated fatigue. This dynamic allocation method is based on the possibility of evaluating in real time the muscle fatigue associated with each human movement. The proposed method assumes that, as is typically the case, the collaborative workstation is equipped with an external vision system that identifies the positions of a set of relevant points along the human silhouette (so-called ‘skeletal positions’) and thus tracks the human motions. To evaluate the fatigue associated with the human motions, we first developed, in the OpenSim digital human modeling software, a new musculoskeletal model of the human upper body including the full musculoskeletal description of the two arms, the neck, and the most relevant muscles of the thorax/back. In the developed model, we also placed a set of virtual markers at the locations of the skeletal positions retrieved by the vision system. This makes it possible to replicate the real human motions on the virtual human model and evaluate the related muscle effort. The latter is evaluated through the OpenSim Static Optimization (SO) technique, which estimates the muscle activations that are consistent with the positions, velocities and accelerations of a certain movement. So, the SO technique can be interpreted as an extension of the inverse dynamics method that, at each time step, transforms the net joint moments of the model into estimates of single muscle forces, by minimizing the sum of the squared muscle activations. However, the SO tool is not fast enough to be used online. 
To solve this problem, we artificially generated a large, generic motion dataset (i.e. input) and we exploited the SO method offline to retrieve the related muscle activations (i.e. output). We then trained offline a Deep Neural Network (DNN) to learn the underlying mapping between the inputs and outputs of this dataset. To reduce the complexity of this mapping, the upper-body muscles have been clustered into 6 groups according to the body joint they contribute to actuating. The groups thus obtained (from now on called joints) are: head, shoulders (left and right), elbows (left and right) and trunk. Hence, the DNN is used online to estimate the muscle activations associated with the human motions tracked during the execution of each activity. Then, given the average joint activations and the duration of each task activity, a joint-specific model denoted as Power Model [2] is used to estimate the fatigue experienced by the considered joint after the execution of the current activity. Indeed, the Power Model relates the muscle activation to the activity duration (which corresponds to the effort endurance time) and provides a curve that identifies, with increasing muscle activation, the maximum time before the considered joint experiences overloading. Then, the fatigue undergone by each joint is estimated as the product of the muscle activation and the activity duration. An estimate of the maximum fatigue that can be sustained by the joint before overloading is then obtained as the product of the maximum activation and the activity duration. By computing the ratio between these two estimated fatigue quantities, a relative indicator is obtained that expresses the actual fatigue experienced by the joint with respect to the maximum fatigue that can be sustained for that activity. 
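The relative fatigue indicator described above can be sketched as follows. The endurance curve is assumed here to have a generic power-law form, `T_end = a * activation**(-b)`, with coefficients `a` and `b` that are purely hypothetical placeholders; the actual joint-specific Power Model coefficients are given in [2] and are not reproduced here. The function names are also illustrative.

```python
def max_activation(duration_s, a, b):
    """Largest activation sustainable for duration_s under the assumed curve.

    Inverts T_end = a * activation**(-b) and caps the result at full activation.
    """
    return min(1.0, (a / duration_s) ** (1.0 / b))


def relative_fatigue(mean_activation, duration_s, a, b):
    """Ratio between the actual fatigue and the maximum sustainable fatigue."""
    actual = mean_activation * duration_s
    sustainable = max_activation(duration_s, a, b) * duration_s
    return actual / sustainable


# Hypothetical numbers: mean activation 0.2 over an 80 s activity.
print(relative_fatigue(0.2, 80.0, a=20.0, b=2.0))
```

Because both fatigue quantities are multiplied by the same activity duration, the indicator reduces to the ratio between the measured activation and the maximum activation that the curve allows for that duration.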
Given the above, in the proposed formulation, each i-th task activity is associated with a vector, \(\hat{\boldsymbol{f}}^i\), of 6 features that describes the fatigue to which each joint is subject at the end of that activity. An estimate of the actual fatigue accumulated by the worker, \(\hat{\boldsymbol{f}}_{acc}\), due to the sequence of activities done so far is then obtained by applying the Exponentially Weighted Moving Average (EWMA) filter to the fatigue vectors associated with the last N activities performed by the human. To suggest to the human the best activity to do to minimize his/her accumulated fatigue, we propose to evaluate how much the execution of the i-th activity would increase the norm of \(\hat{\boldsymbol{f}}_{acc}\). This contribution is computed for each i-th activity as follows:

$$\begin{aligned} c^i\{\hat{\boldsymbol{f}}^i, \hat{\boldsymbol{f}}_{acc}\} = ||\hat{\boldsymbol{f}}^i||_2 \, \left( \frac{|\hat{\boldsymbol{f}}^i \cdot \hat{\boldsymbol{f}}_{acc}|}{||\hat{\boldsymbol{f}}^i||_2 ||\hat{\boldsymbol{f}}_{acc}||_2}\right) = \frac{|\hat{\boldsymbol{f}}^i \cdot \hat{\boldsymbol{f}}_{acc}|}{||\hat{\boldsymbol{f}}_{acc}||_2} \; \forall i=1,\dots , Nt \end{aligned}$$
(2)

where \(||.||_2\) indicates the vector \(L^2\) norm, |.| the absolute value, \(\cdot \) the scalar product and Nt is the number of task activities. Then, the human is suggested to do the activity \(\alpha ^H_k\) that minimizes this contribution, whereas the robot is commanded to do the activity \(\alpha ^R_k\) that maximizes it (see Eq. (3)).

$$\begin{aligned} \alpha ^H_k = \mathop {\textrm{arg}\,\textrm{min}}_{i=1,\dots ,Nt} \; c^i\{\hat{\boldsymbol{f}}^i, \hat{\boldsymbol{f}}_{acc}\} \qquad \alpha ^R_k = \mathop {\textrm{arg}\,\textrm{max}}_{i=1,\dots ,Nt} \; c^i\{\hat{\boldsymbol{f}}^i, \hat{\boldsymbol{f}}_{acc}\} \end{aligned}$$
(3)
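A minimal sketch of the allocation rule in Eqs. (2) and (3) is given below. The recursive EWMA form and its smoothing factor `lam` are assumptions, since the filter parameters are not stated in the text; the activity names and fatigue vectors are invented, and the joint ordering of the 6 features is arbitrary.

```python
import math


def ewma(vectors, lam=0.3):
    """Exponentially weighted moving average of fatigue vectors, oldest first."""
    acc = list(vectors[0])
    for v in vectors[1:]:
        acc = [lam * x + (1.0 - lam) * y for x, y in zip(v, acc)]
    return acc


def contribution(f_i, f_acc):
    """Eq. (2): magnitude of the projection of f_i onto the direction of f_acc."""
    dot = sum(x * y for x, y in zip(f_i, f_acc))
    norm_acc = math.sqrt(sum(y * y for y in f_acc))
    if norm_acc == 0.0:
        return 0.0  # no accumulated fatigue yet, so no preferred direction
    return abs(dot) / norm_acc


def allocate(activity_fatigue, f_acc):
    """Eq. (3): human gets the least-loading activity, robot the most-loading one."""
    scores = {name: contribution(f, f_acc) for name, f in activity_fatigue.items()}
    human = min(scores, key=scores.get)
    robot = max(scores, key=scores.get)
    return human, robot


# Invented 6-feature fatigue vectors (one entry per joint group).
activities = {
    "A": [0.0, 0.5, 0.5, 0.0, 0.0, 0.0],
    "B": [0.4, 0.0, 0.0, 0.0, 0.0, 0.3],
    "C": [0.0, 0.1, 0.2, 0.3, 0.0, 0.0],
}
f_acc_example = [0.0, 0.6, 0.8, 0.0, 0.0, 0.0]

print(allocate(activities, f_acc_example))
```

In this example activity B loads joints that carry no accumulated fatigue, so it is suggested to the human, while activity A, which loads the already fatigued joints the most, is assigned to the robot.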

The proposed strategy was validated in a realistic collaborative task, composed of 4 packaging activities. The task involved the ABB YuMi cobot and 14 volunteers. During each activity the user was requested to perform manual handling of small and lightweight loads at high frequency. However, each activity mainly entailed the activation of a different body region. For the experimental campaign a between-subjects design was adopted: volunteers were evenly distributed by gender and age into two homogeneous groups, i.e. group S and group D. The latter experienced the proposed dynamic allocation strategy, while group S tested a static allocation of the task activities consisting of six consecutive repetitions of a fixed sequence of all the activities. The proposed dynamic allocation method requires an initial (online) calibration phase during which the user is requested to do two consecutive repetitions of the sequence of activities 1-2-3-4 (i.e. the ‘baseline’ phase).

4.1 Results

The results of the experimental campaign and the related statistical analysis highlighted that the static allocation strategy entails an increase of the average accumulated fatigue by about 7% with respect to the fatigue accumulated during the baseline (see Fig. 2a). Conversely, the dynamic allocation strategy does not increase the average accumulated fatigue (w.r.t. the baseline). Thus, the proposed method is effective in mitigating the human accumulated fatigue. As regards the production performance, evaluated in terms of cycle time (see Fig. 2b), the results of the experimental campaign showed that the dynamic method entails a reduction of the average cycle time by more than 16% with respect to the static method. Hence, the proposed strategy is also effective in improving the user performance.

Fig. 2 (a) Relative fatigue accumulated by the volunteers of Group S and Group D. (b) Relative cycle time obtained for volunteers of Group S and Group D

5 Conclusions

In this chapter we first presented an exploratory study aimed at analyzing how the robot interaction role influences the cognitive stress and production performance of the human co-worker. This study showed that the robot leadership entails a higher human productivity, but also a higher human stress with respect to the case where the human takes the lead. This highlights the existing trade-off between the maximization of the human productivity and the mitigation of the human stress and opens the door to the development of an online closed-loop robot control strategy aiming at simultaneously optimizing both aspects. This strategy was then described in Sect. 3. The proposed method exploits game theory to model the above-mentioned trade-off and evaluate how the robot behavior (role) influences the stress and performance of the human co-worker. A control unit, represented by a learning automaton, was used to reward or penalize the robot behavior proportionally to its positive or negative effect on the human. In this way, the robot was enabled to autonomously determine the best action to take (which role to take on) to optimize the stress-performance state of the human co-worker. The proposed strategy turned out to effectively increase the human performance while significantly mitigating the stress, solely by adjusting the robot production rhythm according to the specific human co-worker. Finally, we proposed an innovative dynamic task allocation strategy that, by relying on a novel human upper-body model, estimates online and non-invasively the muscle fatigue accumulated by the user during each activity and suggests to the human the best next activity to do to minimize his/her accumulated fatigue. By suitably distributing the admissible task activities between human and robot, the method effectively reduced the physical workload accumulated by the worker and improved his/her performance.