An experimental focus on learning effect and interaction quality in human–robot collaboration

In the landscape of the emerging Industry 5.0, human–robot collaboration (HRC) represents a solution to increase the flexibility and reconfigurability of production processes. Unlike classical industrial automation, HRC allows direct interaction between humans and robots. Consequently, in order to effectively implement HRC it is necessary to consider not only technical aspects related to the robot but also human aspects. The focus of this paper is to expand on previous results by investigating how the learning process (i.e., the experience gained through the interaction) affects the user experience in HRC in conjunction with different configuration factors (i.e., robot speed, task execution control, and proximity to robot workspace). Participants performed an assembly task in 12 different configurations and provided feedback on their experience. In addition to perceived interaction quality, self-reported affective state and stress-related physiological indicators (i.e., average skin conductance response and heart rate variability) were collected. An in-depth quantitative analysis of the response variables revealed a significant influence of the learning process on the user experience. In addition, the perception of some configuration factors changed during the experiment. Finally, a significant influence of participant characteristics also emerged, pointing to the necessity of promoting a human-centered HRC.


Introduction
The emerging Industry 5.0 focuses attention on human-centricity, promoting the development and implementation of technologies that can support humans in their activities. Human-robot collaboration represents one solution for achieving these goals. Human-robot collaboration is a form of direct interaction between a human and a robotic system that combines the abilities of the entities involved to achieve a common goal [1]. On the one hand, robots contribute power, precision, and repeatability; on the other hand, humans have flexibility, problem-solving capabilities, and intelligence. The robots involved in HRC are called collaborative robots (or cobots), which, unlike classical industrial robots, are designed to allow physical interaction with human operators. This feature allows for the removal of barriers separating the workspace of robots from that of operators, making the industrial layout more flexible and enabling new forms of interaction [2]. In addition, this perspective makes it possible to promote greater adaptability and reconfigurability of production processes [3,4]. However, working closely with a robot can induce stressful situations for operators, which can impact the performance and quality of production processes, as well as the well-being of the operator [5]. From the perspective of Industry 5.0, in order to promote a human-centred industry, it is therefore also necessary to analyze and take human factors into consideration to effectively implement HRC [6,7].
Understanding the operator's mental and physical state during HRC is the first requirement for providing optimal support for their wellbeing. In addition to traditional evaluation tools, such as questionnaires, the analysis of physiological response is a valuable resource for investigating user experience. Through physiological response, it is possible to monitor the operator's state in real time as the task is being carried out, even discovering potential unconscious reactions. Only a relatively small number of publications have so far included physiological measurements to comprehend the operator's state in HRC [8-10]. In addition, a gap is present in the literature on how user experience and perception of HRC factors may evolve over time due to the experience gained from interacting with the cobot. The study of these aspects can be helpful for outlining guidelines for the introduction of a collaborative task to an operator.
In a previous work [11], an original experimental setting aimed at investigating how different configurations of a cobot impact the user experience was presented, involving 42 participants. The objective of this paper is to expand on previous findings by addressing the following research question: How does the learning process influence the operator's user experience and stress?
Through a deep quantitative analysis, the joint effects of learning process (i.e., the experience gained by interacting with the cobot during the experiment) and different HRC configuration factors on self-reported affective state, perceived interaction quality and physiological response are investigated. In addition, the influence of operator characteristics (e.g., gender, age, previous experience with cobots, and attitude towards robots) on the different response variables is explored.
The main contributions of this paper can be summarized as follows: (i) A deep quantitative analysis on the learning process and its joint effect with different HRC configuration factors on response variables (i.e., perceived interaction quality, affective state, and physiological response). (ii) Exploration of the influence of operator characteristics on response variables.
The findings of this study can have an impact on the design of collaborative tasks as well as the way they are implemented.
The paper structure is as follows. A review on HRC and the human factors involved is provided in Sect. 2. Section 3 provides a summary of the methodology and experimental setting. Section 4 presents the quantitative analysis results, highlighting the relationships with the response variables of the joint effect of learning process and configuration factors. In Sect. 5, discussion of the study results and potential implications for the improvement of HRC are presented. Finally, Sect. 6 focuses on conclusions and future work.

Literature analysis
The HRC paradigm is characterized by several aspects related to both the robotic system and humans [6]. For the interaction to effectively support humans in the most difficult operations, careful planning is necessary [12]. Inkulu et al. [13] offered an overview of HRC, outlining some of the critical challenges and prospects. Natural ways of communicating with robots, such as voice and gestures, enable natural engagement and may cut down on idle time; however, these recognition techniques need to be strengthened to withstand environmental disturbances. Power and force limitation strategies are helpful for effectively collaborating with low-payload robots, but they might not be appropriate for high-speed and high-payload robots, necessitating the deployment of additional flexible safety methods. Collaborative robots are enabling technologies for reconfigurable production systems; however, to minimize potential production downtime, more research is required on robot adaptive systems.
Human factors related to HRC have received more attention in recent years [14]. Psychological and cognitive ergonomics are as important to the successful implementation of HRC as physical ergonomics [15,16]. Gualtieri et al. [17] presented an overview of the human factors and cognitive ergonomics aspects involved in the design of HRC assembly systems. Aspects such as cognitive workload, stress, usability, perceived enjoyment, acceptance, trust, and frustration were highlighted as being of particular interest for the enhancement of the operator's work conditions.
The idea of symbiotic HRC was introduced by Wang et al. [18]. In traditional automation practice, humans are required to adhere to rigid work processes. Symbiotic HRC aims at promoting: (i) natural interaction with the robot; (ii) a multi-modal, user-friendly programming environment that does not demand in-depth system expertise; (iii) an increased context dependency; (iv) an immersive collaboration that enables the operator to participate in the tasks through wearable technology (e.g., smart watches, augmented reality (AR) glasses).
Recent efforts have concentrated on presenting techniques for improving the adaptiveness of robot systems during HRC in order to enhance HRC potentials [19,20]. Neves and Neto [21] proposed a reinforcement learning approach for assembly sequence planning that included user preferences. Mohammed et al. [22] introduced a method for efficient online collision avoidance in an augmented environment. Buerkle et al. [23] presented a sensor framework to model humans, which incorporated subjective, objective, and physiological metrics.
The literature has extensively covered concepts like stress, fatigue, and mental workload, which are particularly relevant in the context of manufacturing [5,24,25]. Self-reporting methods like the Subjective Workload Assessment Technique (SWAT) [26] and the NASA-TLX [27] are frequently used to assess these dimensions [28]. However, these methods are not well suited to gathering real-time data and require the subject to recall an event, which may introduce some bias [29,30]. In order to work through these constraints, recent years have seen a growing emphasis on physiological measures for the study of the operator's condition [31,32]. Various works on this topic have been presented, but only a limited fraction is specifically focused on industrial HRC. Koppenborg et al. [33] investigated the effects of movement speed and path predictability on the operator through virtual reality. In addition to subjective responses, heart rate was collected by a chest band. Arai et al. [8] used an industrial manipulator to assess how various speeds and distances from the operator affected mental strain, measured through electrodermal activity (EDA). Kühnlenz et al. [9] investigated how various robot trajectory profiles affected operators' stress by analyzing EDA and heart rate variability (HRV).

Experimental setup
In the "Mind 4 Lab" (Manufacturing Industry 4.0 Laboratory) of the "Politecnico di Torino", a collaborative assembly task was implemented to emulate an HRC setting [11]. With the assistance of the UR3e cobot, the task involved attaching two mechanical flanges to a base by tightening two pairs of screws (Fig. 1). The operations can be divided into the following phases: (1) The cobot picks the square flange and places it in the correct position on the base. (2) The operator takes the screws, inserts them into the holes and tightens them. A within-subjects experimental design was implemented in this study to examine the effects of three fixed factors (i.e., robot's movement speed, proximity to robot workspace, and execution time control) and their interactions with the learning process on affective state, interaction quality, and physiological response [11].
Three levels of the robot's joint speed (Speed) were implemented: 30°/s (Low), 90°/s (Medium), and 270°/s (High). These values represent the maximum speed that all the robot's joints could reach. Two levels of proximity of the robot workspace to the operator (Distance) were introduced: 30 cm (Close) and 40 cm (Far). The distances refer to the minimum distance between the operator's chest and the robot workspace. Lastly, two levels of control for the task execution time (Control) were implemented: one in which the operator had a push-button to command the robot to continue with the task (Human) and another in which the cobot proceeded automatically with the operations after waiting 25 s (NoHuman). Each participant performed the task in all 12 possible configurations in random order.
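As a minimal sketch of the full-factorial design described above, the 12 configurations can be enumerated as the Cartesian product of the three factors and then shuffled for each participant. The function name `trial_order`, the string labels, and the use of a seed are illustrative assumptions; the randomization procedure actually used in the experiment is not described here.

```python
import itertools
import random

# Factor levels as described in the text.
SPEED = ["Low", "Medium", "High"]   # 30, 90, 270 deg/s
DISTANCE = ["Close", "Far"]         # 30 cm, 40 cm
CONTROL = ["Human", "NoHuman"]      # push-button vs. 25 s automatic wait


def trial_order(seed):
    """Return the 12 configurations in random order (hypothetical helper)."""
    configs = list(itertools.product(SPEED, DISTANCE, CONTROL))
    rng = random.Random(seed)  # the seed only makes this sketch reproducible
    rng.shuffle(configs)
    return configs


order = trial_order(seed=1)
for trial, (speed, distance, control) in enumerate(order, start=1):
    print(trial, speed, distance, control)
```

The position of a configuration in this sequence corresponds to the trial index, which is what allows the learning process to be separated from the configuration factors.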
Forty-two participants, with an average age of 28.2 years (standard deviation = 8.1), were recruited from the "Politecnico di Torino" and the surroundings for the study (28.6% females and 71.4% males). In order to collect information about the participants and their user experience, a number of questionnaires were administered. Before beginning the experiment, each participant was given an initial questionnaire to collect age, gender, and degree of prior experience with cobots (according to the following scale: L0-"I have never interacted with a cobot and I did not know them before now"; L1-"I have never interacted with a cobot but I know what they are"; L2-"I have interacted at least once with a cobot"; L3-"I have already programmed and interacted with a cobot"). Afterwards, the Negative Attitude toward Robots Scale (NARS) [34] was administered to assess the participant's general attitude toward robots. The NARS items are divided into the following three sub-scales and, for each of them, a score can be obtained: Negative Attitudes toward Situations and Interactions with Robots (S1) (scoring between 6 and 30); Negative Attitudes toward Social Influence of Robots (S2) (scoring between 5 and 25), and Negative Attitudes toward Emotions in Interaction with Robots (S3) (scoring between 3 and 15) [34].
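The sub-scale score ranges above are consistent with summing 6, 5, and 3 items rated from 1 to 5. The sketch below illustrates that scoring under this assumption; the item counts are inferred from the ranges, the function name is hypothetical, and the responses are assumed to be already reverse-coded where the questionnaire requires it.

```python
# Items per NARS sub-scale, inferred from the score ranges in the text
# under the assumption of a 5-point response scale (1 to 5).
ITEMS_PER_SUBSCALE = {"S1": 6, "S2": 5, "S3": 3}


def nars_scores(responses):
    """Sum the item ratings of each sub-scale (hypothetical helper).

    `responses` maps a sub-scale name to its list of item ratings (1-5).
    """
    scores = {}
    for name, n_items in ITEMS_PER_SUBSCALE.items():
        items = responses[name]
        if len(items) != n_items:
            raise ValueError(f"{name} expects {n_items} items")
        if any(not 1 <= rating <= 5 for rating in items):
            raise ValueError(f"{name} ratings must be between 1 and 5")
        scores[name] = sum(items)
    return scores


# Made-up ratings, for illustration only.
example = {"S1": [2, 1, 3, 2, 2, 1], "S2": [4, 3, 3, 5, 2], "S3": [3, 2, 4]}
print(nars_scores(example))
```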
At the end of each trial, the Self-Assessment Manikin (SAM) [35,36] was administered to participants to gather their affective state in the different task configurations by evaluating three dimensions on a 9-point scale: valence, arousal, and dominance. Valence represents the pleasantness relative to a stimulus (for instance, happiness and relaxation are associated with a high valence, while anxiety or anger with a low valence). Arousal refers to the intensity of emotion provoked by a stimulus (e.g., fear and anger are usually associated with a high arousal, while relaxation and boredom with a low arousal). Dominance represents the degree of control felt relative to a stimulus (e.g., relaxation or anger are usually associated with a high dominance, while fear or anxiety with low dominance).
In addition, at the end of each trial, an interaction quality questionnaire (Table 1) based on Hoffman [37] and Baraglia et al. [38] and composed of seven items was used to collect participants' perception on different dimensions related to the interaction with the cobot [11]. The items were evaluated on a 7-point scale (from "strongly disagree" to "strongly agree").
Finally, at the end of the experiment, participants were asked to provide overall unstructured feedback regarding the experience.
In addition to subjective evaluations, physiological signals were also collected to gain deeper insight into the participant's state during the experimental trials. EDA data and heart data through photoplethysmogram (PPG) were obtained, respectively, at 4 Hz and 64 Hz using the non-invasive biosensor Empatica E4 wristband [39]. From EDA and PPG signals, stress and arousal indicators were derived for each HRC configuration, as explained in the following sub-section. Table 2 provides a summary of all the dependent and independent variables included in the analysis. Potential artifacts were identified and removed from the physiological signals. By using the MATLAB software "Ledalab", EDA signals were decomposed into a tonic component and a phasic component through Continuous Decomposition Analysis (CDA) [40]. The tonic component is characterized by the Skin Conductance Level (SCL), which represents the long-term fluctuations in EDA that are not directly derived from external stimuli. Short-term EDA fluctuations elicited by an external stimulus represent the phasic component. From the phasic component, Skin Conductance Responses (SCRs), which are amplitude differences between the SCL and response peaks, are detected. In the present study, the average SCR (Mean-SCR) was calculated for each HRC configuration, representing a stress and arousal indicator. Regarding heart data, NN-intervals (i.e., time intervals between systolic peaks) were obtained from PPG. As an HRV measure for stress, the Root Mean Square of Successive Differences between adjacent NN-intervals (RMSSD) was included due to its widespread usage [5,41].
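The two physiological indicators can be illustrated with a short sketch. RMSSD follows its standard definition; the Mean-SCR helper simply averages a list of already detected SCR amplitudes, which is an assumption about how the indicator is aggregated (the decomposition itself was carried out with Ledalab and is not reproduced here). Function names and sample values are illustrative only.

```python
import math


def rmssd(nn_intervals_ms):
    """Root mean square of successive differences between NN-intervals [ms]."""
    if len(nn_intervals_ms) < 2:
        raise ValueError("at least two NN-intervals are required")
    diffs = [b - a for a, b in zip(nn_intervals_ms, nn_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def mean_scr(scr_amplitudes_us):
    """Average amplitude of the detected SCRs [uS].

    Returning 0.0 when no SCR was detected is an assumption of this sketch.
    """
    if not scr_amplitudes_us:
        return 0.0
    return sum(scr_amplitudes_us) / len(scr_amplitudes_us)


# Made-up values for one configuration, for illustration only.
print(rmssd([800, 810, 790, 800]))
print(mean_scr([0.02, 0.04]))
```

Lower RMSSD values are interpreted in the text as indicating more stressful conditions, whereas higher Mean-SCR values indicate greater stress and arousal.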

Data processing and analysis
A series of Mixed-effect Ordinal Logistic Regression (MOLR) models were implemented in order to investigate the relationship between the fixed factors of the experiment, together with their interactions, and the subjective responses (i.e., interaction quality and SAM dimensions). The MOLR model was chosen for its suitability in (i) modelling dependent variables defined on an ordinal scale, and (ii) handling the participant effect as a random block effect. The "ordinal" package from the software R was used to fit these models [42]. Additional details on MOLR are provided in Appendix A.
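To make the link between model coefficients and response probabilities concrete, the sketch below assumes the common cumulative logit parameterization, in which P(Y ≤ j) = logistic(θ_j − η), with θ_j the category thresholds and η the linear predictor (fixed effects plus the participant's random intercept). The threshold values are hypothetical and are not estimates from this study.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(thresholds, eta):
    """Probability of each ordinal category under a cumulative logit model.

    `thresholds` are the increasing cut-points theta_1 < ... < theta_{K-1};
    `eta` is the linear predictor. Returns K probabilities that sum to 1.
    """
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs = [cumulative[0]]
    for prev, cur in zip(cumulative, cumulative[1:]):
        probs.append(cur - prev)
    return probs


# Hypothetical thresholds for a 7-point item.
thetas = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0]
baseline = category_probabilities(thetas, eta=0.0)
shifted = category_probabilities(thetas, eta=1.0)
print([round(p, 3) for p in baseline])
print([round(p, 3) for p in shifted])
```

Under this parameterization, a positive coefficient increases η and therefore shifts probability mass towards the higher response categories.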
Since the selected indicators for physiological response were continuous variables, Linear Mixed Models (LMMs) were implemented to explore the relationship with the considered factors and to handle the participant effect as a random block effect. Models were fitted using the "lmerTest" package from the software R.
The formula used for the models is reported below using the Wilkinson notation [43]: […]

Table 1 Questionnaire for interaction quality [11]

To explore the potential effect of participant characteristics, the same models were also fitted including the independent variables Gender, Age, Experience, NS1, NS2, and NS3.

Results
This section contains the analysis results for each considered response variable.

Perceived interaction quality
The relationships between the learning process, the experimental factors and the different aspects of the perceived quality of interaction with the robot are discussed in this sub-section. In Appendix B, the estimated parameters of the fitted models are reported. Table 3 contains the results of the analysis of deviance (ANODE) for each MOLR model, showing the significance of each model term. Figures 2, 3, 4, 5, 6, 7, 8, and 9 provide a representation of the effects of the independent variables on the interaction quality dimensions. For each configuration of the fixed factors (i.e., Speed, Distance, and Control), the evolution of the expected response probabilities over the course of the experiment is shown. Through this representation, it is also possible to notice the significance of the fixed factors by comparing the patterns of the configurations (e.g., in Fig. 2, a distinct difference in patterns between the Human and NoHuman configurations can be seen).
Table 2 (excerpt) Response variables
Q2_NotSafe: Assessment of perceived interaction unsafety (7-point scale)
Q3_Natural: Assessment of perceived interaction naturalness (7-point scale)
Q4_Efficient: Assessment of perceived team efficiency (7-point scale)
Q5_Fluid: Assessment of perceived team fluency (7-point scale)
Q6_Uncomfortable: Assessment of perceived discomfort (7-point scale)
Q7_Trustworthy: Assessment of perceived robot trustworthiness (7-point scale)
Valence: SAM dimension assessing how positive the emotion is (9-point scale)
Arousal: SAM dimension assessing how agitated a person feels (9-point scale)
Dominance: SAM dimension assessing how strong the feeling of dominance is (9-point scale)
MeanSCR: Average of amplitudes of Skin Conductance Responses [μS]
RMSSD: Root Mean Square of Successive Differences between adjacent heart rate NN-intervals [ms]

Speed, Control, Trial, and the interaction term Control⋅Trial had a significant effect on the perceived robot helpfulness (Q1_Helpful) and their effect can be seen in Fig. 2:
- Speed had a general positive effect (β SpeedMed = 0.95, β SpeedHigh = 0.99), meaning that when the robot moved faster it was perceived as being more helpful in completing the task.
- Trial had a significant positive effect (β Trial = 0.07), meaning that the perceived robot helpfulness slowly increased due to the learning process.
- The cobot was rated as being less helpful when the person had no control over the task execution time (β ControlNoHum = −2.17). However, it is interesting to note that the interaction term Control⋅Trial had a positive effect (β ControlNoHum⋅Trial = 0.16). As the experiment progressed, the negative effect due to the lack of control was partially offset by the progressive acquisition of experience with the collaborative task.
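The partial offset described for robot helpfulness can be checked with simple arithmetic on the reported coefficients. The sketch below assumes that Trial enters the model as a numeric index from 1 to 12 and that the coefficients are on the log-odds scale, as is usual for ordinal logistic models; the function name is illustrative.

```python
import math

# Coefficients for Q1_Helpful as reported in the text (log-odds scale).
BETA_CONTROL_NOHUM = -2.17
BETA_CONTROL_NOHUM_TRIAL = 0.16


def nohuman_log_odds(trial):
    """Net effect of NoHuman relative to Human at a given trial index."""
    return BETA_CONTROL_NOHUM + BETA_CONTROL_NOHUM_TRIAL * trial


for trial in (1, 6, 12):
    effect = nohuman_log_odds(trial)
    odds_ratio = math.exp(effect)
    print(f"trial {trial:2d}: log-odds {effect:+.2f}, odds ratio {odds_ratio:.2f}")
```

Under these assumptions, the net effect at the twelfth trial is −2.17 + 12 × 0.16 = −0.25, which is still negative, in line with a partial rather than complete compensation.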
Concerning the interaction unsafety (Q2_NotSafe), Speed, Control, and Trial were found to be significant, and Fig. 3 shows their effect on the response probabilities:
- The perceived safety degraded when the robot movement speed was high (β SpeedMed = −0.02, β SpeedHigh = 0.30).
- An increase of the perceived unsafety was observed when the participant had no control of task execution time (β ControlNoHum = 1.31).
- As the experiment progressed (i.e., Trial increased), the participant became increasingly familiar with the collaborative task and the perceived unsafety gradually decreased (β Trial = −0.15).

[…] that as the participant became more familiar with the task, this negative effect was gradually and completely compensated (β SpeedMed⋅Trial = 0.11, β SpeedHigh⋅Trial = 0.16).
- Participants felt the interaction less natural when they had no control over task execution (β ControlNoHum = −2.42). However, as the participant became more familiar with the task, this negative effect was partially offset through the learning process (β ControlNoHum⋅Trial = 0.13).
Regarding perceived team efficiency (Q4_Efficient), the terms Speed, Control, Trial, Speed⋅Control, and Control⋅Trial were found to be significant, and […]ally mitigated by the experience gained by the participant as the experiment progressed (β ControlNoHum⋅Trial = 0.15).
Regarding team fluency (Q5_Fluid), Speed, Control, Trial, Speed⋅Control, and Control⋅Trial were found to be significant and their effect can be seen in Fig. 6:
- Compared to low robot speed, medium and high robot speeds were related to increased team fluency (β SpeedMed = 1.63, β SpeedHigh = 1.49). However, the interaction term Speed⋅Control had a negative effect (β SpeedMed⋅ControlNoHum = −1.64, β SpeedHigh⋅ControlNoHum = −1.04), meaning that a medium or high robot movement speed with no execution time control by participants was related to a degradation of team fluency. It is worth noting that a medium robot speed was associated with slightly greater team fluency than a high one; however, the penalty due to the absence of execution time control was significantly greater for the medium speed.

Likelihood Ratio (LR) tests with only fixed-term models were performed to check if the random effect Participant was significant. From the obtained results, the random effect Participant was highly significant (p < 0.001) for every fitted model.
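The interplay between Speed and Speed⋅Control on team fluency can be made explicit by adding the reported coefficients. The sketch below is plain arithmetic on the estimates quoted above (log-odds relative to Low speed within the same control condition); the helper name is illustrative.

```python
# Team-fluency (Q5_Fluid) estimates reported in the text, relative to Low speed.
BETA_SPEED = {"Medium": 1.63, "High": 1.49}
BETA_SPEED_NOHUM = {"Medium": -1.64, "High": -1.04}


def speed_effect(speed, control):
    """Net speed effect (log-odds vs. Low speed) within a control condition."""
    effect = BETA_SPEED[speed]
    if control == "NoHuman":
        effect += BETA_SPEED_NOHUM[speed]
    return round(effect, 2)


for control in ("Human", "NoHuman"):
    for speed in ("Medium", "High"):
        print(control, speed, speed_effect(speed, control))
```

Without execution time control, the Medium speed advantage essentially vanishes (1.63 − 1.64 = −0.01), while the High speed retains a smaller positive effect (1.49 − 1.04 = 0.45).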

Self-reported affective state
This subsection examines how the experimental variables influence the self-reported affective state collected through the SAM. In Appendix B, the estimated parameters of the fitted models are reported. Table 4 contains the results of the ANODE for each MOLR model, showing the significance of each model term. Figures 10, 11, and 12 provide a representation of the effects of the independent variables on the SAM dimensions.
Regarding the Valence dimension, the terms Speed, Control, Trial, Speed⋅Control, and Speed⋅Trial were found to be significant and Fig. 9 shows their effect: […]

The terms Speed, Control, and Trial were significant in explaining the variability of Arousal and their effect can be seen in Fig. 10: […]

With respect to Dominance, the terms Speed, Control, and Trial were significant and their effects are shown in […]

LR tests with only fixed-term models were performed to check if the random effect Participant was significant. From the obtained results, the random effect Participant was highly significant (p < 0.001) for every fitted model.

Physiological measures
The influence of the learning process and the experimental factors on physiological responses is presented in this sub-section. In Appendix B, the estimated parameters of the fitted models are reported. Table 5 contains the results of the analysis of variance (ANOVA) for each linear mixed model, showing the significance of each model term.
Regarding the EDA response, the most significant terms for the average SCR (mean = 0.0497, sd = 0.0576) are Speed, Control and Trial. Control was still included among the significant variables because its p-value was close to the significance level (α = 0.05). With reference to Fig. 12, the effects of the significant terms can be interpreted as follows:
- An increase of the average SCR was observed when robot speed was higher (β SpeedMed = 0.03, β SpeedHigh = 0.01), meaning that participants experienced more mental strain as the robot speed increased.
- As the experiment progressed, mental strain gradually decreased (β Trial = −0.0004), mainly due to the participant's learning process.
- Configurations with no task execution control tended to lead to more stressful conditions, which raised the mean SCR (β ControlNoHum = 0.03).
With respect to HRV, Distance, Control, and Control⋅Trial were the most significant terms in explaining the variability of RMSSD (mean = 56.89, sd = 40.01). Distance was still included among the significant variables because its p-value was close to the significance level (α = 0.05). Referring to Fig. 13, the effects of the significant terms can be interpreted as follows:
- More stressful situations arose when participants had no control of time execution, which led to a significant decrease of RMSSD (β ControlNoHum = −14.25). However, this negative effect was gradually compensated by the learning process of the participant (β ControlNoHum⋅Trial = 2.31).
- A small positive effect on RMSSD was observed when participants were more distant from the robot's workspace (β DistanceFar = 1.80), implying potentially slightly less stressful conditions.
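Because the linear mixed model works directly in milliseconds, the reported RMSSD coefficients allow a rough estimate of when the learning process balances the lack of control. The sketch assumes that Trial is a numeric index starting at 1 and that the effect is linear over the whole experiment; both are assumptions of this illustration.

```python
BETA_CONTROL_NOHUM = -14.25       # ms, reported main effect on RMSSD
BETA_CONTROL_NOHUM_TRIAL = 2.31   # ms per trial, reported interaction


def rmssd_gap(trial):
    """Expected RMSSD difference (NoHuman minus Human) at a given trial [ms]."""
    return BETA_CONTROL_NOHUM + BETA_CONTROL_NOHUM_TRIAL * trial


# Trial index at which the expected gap reaches zero under linearity.
break_even = -BETA_CONTROL_NOHUM / BETA_CONTROL_NOHUM_TRIAL

print(f"gap at trial 1: {rmssd_gap(1):.2f} ms")
print(f"break-even trial index: {break_even:.1f}")
```

Under these assumptions the gap closes at roughly the sixth trial (14.25 / 2.31 ≈ 6.2); values beyond that point rely on extrapolating the linear trend and should be read with caution.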
LR tests with only fixed-term models were performed to check if the random effect Participant was significant. From the obtained results, the random effect Participant was highly significant (p < 0.001) for every fitted model.
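The LR tests mentioned above compare a model with the Participant random effect against the corresponding fixed-effects-only model. A minimal sketch of the computation is given below, assuming a single additional variance parameter (1 degree of freedom) and the usual chi-square reference distribution; the log-likelihood values are hypothetical and do not come from the fitted models.

```python
import math


def lr_test(loglik_reduced, loglik_full):
    """Likelihood ratio test for one extra parameter (1 degree of freedom).

    Returns the statistic 2 * (ll_full - ll_reduced) and its chi-square
    p-value. For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    statistic = max(statistic, 0.0)  # guard against tiny negative values
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = lr_test(loglik_reduced=-760.0, loglik_full=-700.0)
print(f"LR = {stat:.1f}, p = {p:.3g}")
```

Since a variance component is tested on the boundary of its parameter space, this chi-square p-value tends to be conservative, so a result below 0.001 remains significant under the stricter reading.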

Influence of participant's characteristics
Differences between participants turned out to be an important factor in explaining the variability of the response variables considered. To gain some additional insights, the participant characteristics collected during the experiment (i.e., age, gender, prior experience with cobots, and attitudes towards robots) were included in the models. Table 6 contains the ANODE for each interaction quality dimension. Only interaction naturalness (Q3) and team fluency (Q5) have at least one significant term related to participant characteristics. For interaction naturalness, Age had a positive effect (β Age = 0.10) and was considered significant because its p-value was close to 0.05. Regarding team fluency, Age had a positive effect (β Age = 0.08), with its p-value close to 0.05, and NS1 had a significant negative effect (β NS1 = −0.23). This means that participants with a more negative attitude towards situations and interactions with robots tended to rate the team fluency lower. Table 7 shows the ANODE for each SAM dimension. Valence was negatively affected by Experience (β Experience = −0.87), meaning that participants with a high level of prior experience with cobots tended to report a less positive feeling, likely due to a decreased sense of novelty.
Dominance was found to be positively affected by Age (β Age = 0.09). This suggests that older individuals may tend to feel more in control of the situation during HRC.
In Table 8, the results of the ANOVA for the physiological responses are reported. None of the participant characteristics considered appeared to be significant for the average SCR and RMSSD.
LR tests with only fixed-term models were also carried out, revealing that the random effect Participant was still highly significant (p < 0.001) for every model. This suggests that there are important participant characteristics that have not been considered, and further study will be needed to identify them.

Discussion
The quantitative analysis revealed the influence of the learning process in conjunction with the configuration factors on the response variables, highlighting some interesting relationships and expanding the preliminary results of a previous work [11].

Effects of configuration factors and learning process
Among the fixed factors of robot configuration, Speed and Control were found to be most significant overall. In addition, the experience gained by interacting with the cobot during the experiment (Trial) significantly influenced the response variables. The main effects of the terms considered are summarized below.
• Speed. The robot movement speed had a general positive effect on the perceived robot helpfulness and efficiency, meaning that higher speeds were generally more appreciated by the participants from a performance viewpoint. However, High robot speed was also associated with slightly greater perceived unsafety and less interaction naturalness and comfort. Regarding the participants' affective state, the robot movement speed had a generally high positive effect on arousal and a negative one on dominance. This means that participants, in general, felt slightly less in control of the situation and more aroused when robot movement speed was higher, leading to potentially more stressful situations. This was also confirmed from a physiological point of view, by an increase of average SCRs (MeanSCR) for higher speeds. Valence was positively influenced by higher robot movement speeds, leading to more positive feelings in general.
• Control. The absence of participant's control of the task execution time had a significant negative effect on the interaction quality. In this kind of configuration, the robot was perceived by participants as less helpful and trustworthy. In addition, the interaction was perceived as less safe, natural, efficient, fluid, and comfortable. Regarding the participants' affective state, the lack of time execution control had a general negative effect: valence was slightly degraded, arousal increased, and dominance significantly decreased. This led to more stressful situations for the participants and a physiological confirmation was found in a significant decrease of HRV.
• Speed⋅Control. The interaction term Speed⋅Control played an interesting role. The lack of time execution control by the participant amplified or degraded the robot movement speed effect on some aspects of the interaction quality. Perceived team fluency was greater for higher speeds, with a slightly higher effect for the Medium speed. However, without time execution control the positive effect was almost nullified for higher speeds. Speed⋅Control also had a negative effect on perceived efficiency and especially on comfort. Concerning the affective state, the term Speed⋅Control had a significant negative effect on valence, meaning that the lack of time execution control by the participant led to more unpleasant emotions with higher robot speeds.
• Distance. The distance from the robot workspace was found not to be particularly influential in general; just a slight decrease in robot trustworthiness and a slight increase in HRV were associated with a higher distance. This result could also be influenced by the relatively small size of the UR3e robot and future research will be required to verify this hypothesis.
• Trial. The experimental progress had a significant overall positive effect on the interaction quality. As the experiment progressed, participants tended to perceive the robot as more helpful, safe, and trustworthy. In addition, the interaction was felt to be gradually more fluid, efficient, and comfortable. However, a constant slight decrease of perceived interaction naturalness emerged when robot speed was Low. Concerning the participants' affective state, the learning process had a negative effect on arousal and a positive effect on dominance, leading to a gradual relaxation of the participant. This effect was also confirmed physiologically by the constant decrease of the average SCR as the experiment progressed. The learning process negatively affected the valence when the robot speed was Low. This can be interpreted as an increasing sense of boredom in participants when the robot movement speed is perceived to be too slow. The overall positive effect of Trial on the response variables was further enhanced by the interaction terms Speed⋅Trial and Control⋅Trial.
• Speed⋅Trial. The negative effect of Speed on interaction naturalness and comfort was gradually compensated by the interaction term Speed⋅Trial, meaning that the learning process particularly helped in improving these aspects for higher speeds. Regarding the affective state, valence was positively influenced by the interaction term Speed⋅Trial. Medium speed was initially associated with a slightly more positive feeling than the High one. However, as the experiment progressed, participants were in general more pleased with the High speed level.
• Control⋅Trial. As the experiment progressed, the interaction term Control⋅Trial partially compensated the negative effects on interaction naturalness, robot helpfulness, team fluency and efficiency due to the absence of execution time control. In addition, the interaction term Control⋅Trial also showed a significant gradual compensation of the negative effect of Control on HRV due to the learning process.
The interpretations of the results were also reflected in the unstructured interviews with participants. Initially, most participants were more comfortable with the Low and Medium robot movement speeds. However, towards the last few trials, the Low speed was perceived to be particularly tedious, even leading some participants to become more distracted. This may be due to the repetitiveness of the task and, consequently, the learning process that led most participants to prefer a High robot speed by the end of the experiment. The distance from the robot's workspace, instead, was found to be more a matter of preference, depending on how comfortable the participant felt in performing their operations. Controlling the task execution time was generally preferred by participants, primarily due to less psychological pressure. Towards the end of the experiment, however, some participants were able to finish their tasks well in advance when they were not in control of the task execution time, implying a longer wait time before the robot proceeded with the next task. This resulted in a partial boredom effect but also in a degradation of the perceived efficiency, naturalness, and fluency of the interaction. Interestingly, at the end of the experiment, some participants expressed a preference for configurations in which the robot continued with its operations automatically, without waiting for a command, because this implied one less operation for the operator. This observation points to the need to make collaborative robots more situationally aware, so that they can also support the operator from a psycho-cognitive point of view. Moreover, giving the cobot the ability to take the initiative would allow it to better support the operator and to create a symbiotic collaboration, characterized by greater naturalness and fluency.

Importance of the operator's characteristics
The random effect Participant was found to be significant in all models, highlighting the importance of a more personalized HRC. Moreover, this finding suggested that some user characteristics may significantly influence the perceived interaction quality and affective response. To investigate this further, all previous models were re-fitted by including the variables Age, Gender, Experience, and the three NARS-related scores (i.e., NS1, NS2, NS3) (see Tables 12, 13, and 14). Results revealed that, of the characteristics considered, only a few were significant for some response variables. Age was associated with a positive effect on self-reported dominance and perceived interaction naturalness and fluency. This may imply that older participants feel more confident and in control of the situation during the HRC. However, this effect may be due to the fact that most of the older people involved in the experiment knew at least what a cobot was. Future investigation will be needed, involving people with more diverse backgrounds and a wider, more evenly distributed age range. Experience had a significant negative effect on self-reported valence. This effect could be due to a decreased sense of novelty and excitement for those more familiar with collaborative robots, which can also lead to boredom. Finally, NS1 was associated with a significant negative effect on perceived team fluency. This means that participants with a more positive attitude towards situations and interactions with robots tended to perceive greater team fluency. It is worth noting that even in the new models the Participant random effect remained significant, indicating that there are potentially relevant aspects of the participant that were not accounted for. Further investigation of user characteristics is needed and will be the focus of future work.

Conclusions
The aim of this paper was to expand on the previous preliminary results [11] by examining in greater depth the joint effect of the learning process and several HRC setting factors on the operator's affective state, perceived interaction quality, and physiological response. Among the configuration factors, cobot movement speed (Speed) and control of the task execution time (Control) were found to be the most influential ones. The learning process (Trial) also played a key role, especially in improving user experience and decreasing stressful conditions. It also emerged that the learning effect can change the perception of certain factors.
Regarding the robot movement speed, initially most participants preferred a low speed. However, as the experiment progressed, there was a greater preference toward higher speeds, as they helped participants to be more engaged and efficient. This result highlights that when the robot trajectories are predictable and known, there is a preference to maintain a high speed of the robot's movements, mainly for performance and satisfaction reasons. On the other hand, when the operator is not aware of the robot's trajectory, a lower movement speed may be preferred.
The absence of the participant's control of the task execution time led to a general degradation of perceived interaction quality and an increase in stress. However, it is interesting to note that these negative effects were mitigated through the learning process. In fact, at the end of the experiment some participants even said that they appreciated the absence of the control button, as it meant one less operation for them to perform. This suggests that a system able to automatically recognize the completion of human operations may be particularly appreciated in HRC, helping to establish a more natural and human-like interaction.
The distance between operator and robot workspace turned out to be merely a question of preference and to be not particularly influential. This fact might be due to the UR3e robot's relatively small size, although further research is needed to confirm this hypothesis. Future work will focus on introducing cobots of different payload (e.g., UR10e or UR16e) to investigate whether the cobot size may influence the perceived interaction quality and the overall user experience.
The non-invasive acquisition of physiological responses provided objective information about the participant's state and unconscious reactions. By analyzing HRV and EDA, confirmation of the participants' self-reported affective state and unstructured feedback was found. Such concordance is certainly promising for the design of a real-time monitoring system of operator well-being in HRC.
The differences among participants were found to be significant, highlighting the need to design and implement a more customizable HRC. An initial investigation into the effect of certain participant characteristics on the response variables was conducted; however, a more in-depth study is needed. Future work will also focus on a more in-depth study of the link between physiological responses and the affective state of the operator in the HRC.

Appendix A-Mixed-effect ordinal logistic regression (MOLR)
The Ordinal Logistic Regression (OLR) model is a regression model for ordinal dependent variables based on Cumulative Link Models (CLMs). CLMs are a powerful class of Generalized Linear Models (GLMs) for ordinal response variables, allowing them to be treated properly as categorical variables as well as exploiting their ordinal nature [42,44]. In CLMs, the cumulative probability for each level $j$ of the ordinal response (i.e., $\mathbb{P}(Y \le j)$) is modeled via a link function. In this context, a commonly used link function (i.e., a function that links the probability to the linear function of the predictor variables) is the logit link. The logit link maps probability values from $(0, 1)$ to real numbers in $(-\infty, +\infty)$ and is defined as the logarithm of the odds $p/(1-p)$, where $p$ is the probability. The logit link is often preferred to others (such as probit or log-log) because of mathematical convenience and higher interpretability of results. CLMs with the logit link function are called OLR models. Therefore, the OLR model can be thought of as an extension of the logistic regression model (which applies to binary response variables), allowing for more than two (ordered) response categories.
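As an illustration of the logit link and of how cumulative probabilities translate into probabilities for the individual rating levels, a minimal Python sketch is given below. The threshold values, the 5-level scale, and the function names are hypothetical and chosen only for illustration; the sketch also assumes the common parameterization in which the linear predictor is subtracted from the thresholds. It is not the code used to fit the models in this paper.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Inverse of the logit link (the logistic function)."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(thresholds: list[float], eta: float) -> list[float]:
    """Probabilities of each ordinal level in a cumulative logit model.

    thresholds: increasing cut-points, one per level except the last.
    eta: value of the linear predictor (hypothetical in this sketch).
    """
    # Cumulative probabilities P(Y <= j); the last level always has P = 1.
    cumulative = [inv_logit(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs


# Hypothetical thresholds for a 5-level rating scale.
thresholds = [-2.0, -0.5, 0.5, 2.0]
baseline = category_probabilities(thresholds, eta=0.0)
shifted = category_probabilities(thresholds, eta=1.0)
```

In this sketch, a larger linear predictor moves probability mass towards the higher rating levels, so `shifted` assigns more probability to the top level than `baseline`.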
Other, more common methods than OLR can be used to analyze an ordinal response variable; however, they have several limitations. For instance, by treating the numerically coded ordinal variable as quantitative, typical least squares regression can be used. However, this common method, in addition to introducing a scale promotion for the response variable, often violates the assumptions of homoscedasticity and normality of the residuals. Another common method of modeling an ordinal variable is to treat it as nominal and use a multinomial logistic regression. This choice also introduces limitations, most notably the loss of the information carried by the ordering, which results in a loss of power for the model.
Mixed-effect Ordinal Logistic Regression (MOLR) models are based on Cumulative Link Mixed Models (CLMMs), which are an extension of CLMs that allow the inclusion of normally distributed random effects. The MOLR model can be specified in terms of cumulative logits as follows:

$$\operatorname{logit}\big(\mathbb{P}(Y_{ik} \le j)\big) = \theta_j - \mathbf{x}_{ik}^{T}\boldsymbol{\beta} - u_k, \qquad j = 1, \ldots, J-1,$$

with $i = 1, \ldots, n$ and $k = 1, \ldots, K$, where $J$ is the number of levels of the ordinal response variable, $n$ the number of observations, and $K$ the number of participants (i.e., 42 in this case). The term $\mathbb{P}(Y_{ik} \le j)$ represents the probability that observation $i$ of participant $k$ is associated with a rating equal to or below $j$. Note that the logit is not defined for $j = J$, since $\mathbb{P}(Y_{ik} \le J) = 1$. The vector $\boldsymbol{\beta}$ contains the model parameters, the vector $\mathbf{x}_{ik}^{T}$ contains the data of the independent variables of observation $i$ of participant $k$, and the terms $\theta_j$ are called threshold parameters and serve as intercepts of the model. The term $u_k$ represents the random effect for participant $k$, where $u_k \sim N(0, \sigma^2)$. Note that the negative sign in front of $\mathbf{x}_{ik}^{T}\boldsymbol{\beta}$ ensures that positive parameters are associated with increased probability for higher ratings as the explanatory variables increase.
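The role of the participant-level random effect and of the sign convention can be illustrated with a minimal Python sketch. All numerical values (thresholds, coefficient, random-effect standard deviation) and function names are hypothetical and are not estimates from this study; the sketch also assumes that the random intercept enters the cumulative logit with the same sign as the fixed effects. It only evaluates and simulates from a model of this form and does not perform any estimation.

```python
import math
import random


def inv_logit(x: float) -> float:
    """Inverse of the logit link (the logistic function)."""
    return 1.0 / (1.0 + math.exp(-x))


def molr_probs(thresholds, beta, x, u):
    """Rating-level probabilities for one observation of one participant.

    Assumed form: logit P(Y <= j) = threshold_j - (x . beta) - u,
    where u is the participant's random intercept.
    """
    eta = sum(b * xi for b, xi in zip(beta, x)) + u
    cumulative = [inv_logit(t - eta) for t in thresholds] + [1.0]
    lower = [0.0] + cumulative[:-1]
    return [c - l for c, l in zip(cumulative, lower)]


def simulate_rating(thresholds, beta, x, u, rng):
    """Draw one rating (1..J) from the model by inverse-CDF sampling."""
    probs = molr_probs(thresholds, beta, x, u)
    r = rng.random()
    accumulated = 0.0
    for level, p in enumerate(probs, start=1):
        accumulated += p
        if r < accumulated:
            return level
    return len(probs)  # guard against floating-point round-off


rng = random.Random(0)
thresholds = [-2.0, -0.5, 0.5, 2.0]  # hypothetical, 5-level scale
beta = [0.8]                          # hypothetical effect of one predictor
sigma = 1.0                           # hypothetical random-effect std. dev.

# One random intercept per (hypothetical) participant.
random_effects = [rng.gauss(0.0, sigma) for _ in range(3)]
ratings = [simulate_rating(thresholds, beta, [1.0], u, rng)
           for u in random_effects for _ in range(10)]
```

Under these assumptions, a positive coefficient or a positive random intercept shifts probability towards higher ratings, which is the behaviour that the negative sign in the cumulative logit is meant to produce.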

Appendix B-List of estimated parameters of the models
In this section, the estimated parameters for the fitted models are reported. Tables 9, 10, and 11 contain the estimated parameters of the models for interaction quality dimensions, SAM dimensions, and physiological signals, respectively. In Tables 12, 13, and 14, instead, the estimated parameters of the models with participant characteristics for interaction quality dimensions, SAM dimensions, and physiological signals are reported.

Conflict of interest
The authors declare that they have no conflict of interest.

Ethical Standards
The authors respect the Ethical Guidelines of the Journal. Informed consent was obtained from all individual participants included in the study.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.