1 Introduction

Human-robot interaction (HRI) refers to the process of conveying a human operator's intentions and interpreting task descriptions into a sequence of robot motions that comply with the robot's capabilities and the working requirements. The identification of suitable interaction methods and interfaces for HRI has been a challenging issue in robotics, as it is essential for the widespread adoption of robots supporting humans in key areas of activity.

Robots can be classified into two general categories, namely, industrial and service robots. Industrial robots are used in various industrial processes where the tasks are often executed in structured environments. These industrial robots, often with little autonomous capability, need to be re-programmed for a new task, for which the robots may need a different tool, fixture or environment [1]. Service robots usually operate semi- or fully autonomously for the well-being of humans or equipment in unstructured environments. Therefore, suitable human-robot interfaces developed for robots of different categories should cater to the various target applications as well as the environments.

Many industrial robotic systems have adopted semi-automatic programming approaches in which HRI is a vital component to bridge the communication between the operators and the robots. Robot safety is of utmost priority and thus needs to be addressed in HRI. Recent developments in robotics have introduced haptic interaction, through which the users can feel both virtual and real environments, such as in tele-operations and tele-surgeries [2, 3]. One of the key requirements that has been identified for effective HRI is the overlapping space that can be perceived by both the human user and the robot programming system [4].

Augmented reality (AR) can enhance human-machine interaction by overlaying virtual computer-generated information, in the form of text, graphics, or animation, on a real-world environment. In particular, it has found good applications in robotics to improve HRI, such as tele-operations [5–7], industrial operations [8, 9], etc., where AR can assist the operators in planning and simulating a task through interaction with the spatial environment prior to actual task execution. It has been reported that AR-based interfaces provide the means to maintain situational awareness [7], as well as to facilitate different levels of HRI, such as the understanding of the robot's perception of the world during debugging and development of robot programs [10], the extension of the operator's perception of the real environment during robotic task planning and manipulation [11, 12], and the integration of various interaction modalities in the localization, navigation and planning of mobile robots [13, 14], etc. In most cases, AR-based human-robot interfaces permit the operators to visualize both the virtual information and the real-world environment simultaneously, where the virtual elements serve as visual cues and enhancements for a better understanding of the environment in robot task planning and execution [6, 12, 13].

In this paper, a novel interface for HRI based on AR is proposed and presented. The interface aims at assisting the users in their interaction with a virtual robot in a real operating environment for the planning of two basic types of robotic operations, namely, pick-and-place operations and path following operations, during which path planning and end effector (EE) orientation planning are involved. The rest of the paper is organized as follows. Sections 2 and 3 review the various levels of HRI and the current HRI methods. Section 4 presents a novel AR-based interface for human-virtual robot interaction, where a number of interaction methods have been developed in terms of the operations associated with different robotic applications. Section 5 presents a monitor-based visualization in which distinctive visual cues are used to assist the users during their interactions with the virtual robot. Section 6 presents the implementation results with two case studies. In Sec. 7, the conclusions and suggestions for future work are given.

2 HRI

Robots have been applied in areas ranging from manufacturing operations to human daily activities. Therefore, the levels of HRI required for these applications vary accordingly. For a robotic system, the two prominent principles adopted in the identification of the HRI level are the level of autonomy (LOA) [15] achievable by the robotic system, and the proximity of the human and the robot during operation [16, 17].

For a robotic system, the LOA describes the degree to which the robot can act on its own accord, or alternatively, the degree to which HRI is involved in completing a robotic task [15]. Industrial robots generally have a lower LOA, and the interaction with the human operator is vital as the motion of an industrial robot needs to be pre-programmed and re-programmed, which normally requires a considerably long period of time for testing and tuning. Comparatively, service robots have a higher LOA as they have to anticipate changes in an unprepared environment and act accordingly, e.g., an automatic guided vehicle for material transportation in a factory environment needs to recognize the pathway and avoid obstacles [18].

The proximity between the human user and the robot is used to classify the level of HRI into direct and indirect interaction. Service robots exhibit direct interaction with operators. They often adopt high-level interfaces, such as tactile-based or vocal-based sensors, to facilitate intuitive and efficient HRI. For industrial robots, it is advisable to adopt indirect HRI due to safety concerns. However, as more industrial robots are being used in the SME environment, where the operators are frequently engaged in direct interaction with the robots, this has raised challenges in the development of efficient and suitable interfaces through the integration of suitable sensors [19, 20]. ABB has developed a control system that prompts for interaction between a human operator and a robot while allowing the robot to continue with the task [20].

Human factors issues need to be addressed in the design and development of efficient and effective HRI [21, 22]. The users of industrial robots are normally professionals in robotics or have expertise in task planning, while the users of personal robots may not necessarily have knowledge of how the functions of the personal robots are implemented. Situational awareness is closely related to the acquisition, by the humans or the robot, of the surrounding knowledge of interest for a given task [23, 24]. Users of different target user groups normally possess different levels of situational awareness. The HRI developed for industrial robots normally requires a lower level of situational awareness as there is always a prepared working environment. Service robots, such as rescue and search robots, or professional robots for inspection and repairs, require a higher level of situational awareness, as their operations are meant to be conducted in hazardous/unknown surroundings that are normally inaccessible to humans [25, 26]. In this case, the integration of suitable sensors with the robots will allow the information/data of interest relevant to the task to be accessed even if the robot is out of sight of the operators.

3 HRI methods

HRI in industrial robotics has been largely confined to finding ways to reconfigure or program the robots [16]. The use of controller-specific languages is the original method for programming industrial robots as each robot controller may employ different machine languages to create executable robot programs. Icon-based programming methods are developed based on the controller languages, where an icon usually consists of one or more common robot functions to represent a robot program in the form of a flow-chart. Lead-through and walk-through programming methods represent the two forms of direct interaction where the users need to be present within the working environment of the industrial robots. In lead-through programming, a teaching pendant is used to prompt the HRI. These two methods usually adopt a text editor-based or graphics-based interface to facilitate the storing of robot configurations being taught and the generation of robot programs.

Numerous research efforts have been reported on the development of more efficient and suitable interfaces for HRI as more enabling technologies are being made available, such as multimodal interaction, programming by demonstration (PbD), virtual reality (VR), AR, etc.

3.1 HRI modalities and devices

In industrial robotics, interaction takes place with human operators who usually possess sufficient knowledge of specific controller programming languages, or at least have expertise in task planning and task automation. However, service robots mandate new forms of HRI, as the users may not have any knowledge of robotics. Natural human interfaces for richer HRI have been explored for service robots. Vision-based interfaces in mobile robot navigation [18, 27–29], haptic feedback interfaces in tele-operations and surgeries [3, 30, 31], voice-based interfaces [32, 33], and multimodal interfaces which integrate two or more interaction modalities [34, 35] are some of the most commonly used approaches.

Apart from multimodal interaction, many interactive devices have been developed to facilitate intuitive HRI. In industrial robotics, handheld devices, such as digital pens [36] and interactive styluses [8], or mobile devices such as PDAs [37], have been used in task planning and robot programming.

3.2 PbD

PbD is an on-line robot programming approach which has been used in both industrial and service robot applications, in which a user performs the task manually and lets the robot observe, follow and learn the human demonstrations in real-time. This enables a user who may not have any robot programming skills to program a robot. One key issue in PbD is the sub-optimality which often exists in the demonstrations with respect to both the robot and the learning system [38]. Another issue is the presence of noise in the collected data due to variations in human demonstrations. Multiple demonstrations are often required, i.e., the task has to be executed many times during data collection to achieve a higher quality of performance, which justifies the additional effort put into obtaining the sample data needed for learning [39].

3.3 VR

In VR-based HRI, a virtual environment aided by the necessary sensors provides the operator with an immersive sense of his/her presence at the real location where the tasks are undertaken. VR-based HRI allows the operator to project actions that are carried out in the virtual world onto the real world by means of robots. From the perspective of LOA, VR-based operations are usually conducted with robots having less autonomy, or working at their lower LOA even if the robot systems have adjustable autonomy. For those robots that exhibit a high LOA, such as humanoids, VR can provide an intuitive simulation environment which allows the user to interactively control a virtual model of the robot. The behaviors of the virtual humanoid, resulting from the users' inputs or natural effects (e.g., gravity), can be observed through the VR interface [40].

From the perspective of HRI, a major advantage of using VR is the provision of intuitive interfaces due to its scalable capability to model the entire environment in which a robot operates [41]. However, an accurate VR environment requires dedicated content development to represent the actual working scenario. Another issue that needs to be addressed is the delay between the VR display of the movements of a remote robot and its physical movements [35].

3.4 AR

In robotics, AR can assist the users in interacting intuitively with the spatial information for pre-operative planning within the actual working environment. The integration of various types of sensors into AR-based interfaces can assist the users in understanding the environment to facilitate HRI, e.g., vision-based sensors, which have been commonly used to acquire information on the work cell, human gestures, etc. However, the types of interactions that can be achieved depend on the progress in the computer vision field [42].

Similar to VR, AR can be used to enhance HRI with robotic systems of a wide range of LOA. AR-based visualization offers the possibility to spontaneously perceive information that is useful during robot programming and task planning, as well as tele-operations [11, 12, 35]. Some AR-based systems have been reported in industrial robotics. Chong et al. [43] presented an interface through which the users could guide a virtual robot using a probe attached with a single marker. Fang et al. [44] improved this HRI interface such that the user could interact with the spatial environment using a marker-cube based interaction device. Zaeh and Vogl [8] introduced a laser-projection-based approach in which the operators could manually edit and modify the planned paths projected over a real workpiece using an interactive stylus. Robot systems can benefit significantly from the use of AR technology with intuitive visual cues, which provide rich information and enhance the situational awareness of the users during their interactions in a robotic task. In these applications, humans usually provide supervision or actual actions to the robotic systems. In mobile robot applications, AR-based visualization can provide situational awareness to the users for navigation and localization, such as the museum tour guide robot [45], etc. In these cases, humans usually act as partners to the robot, where interaction needs to occur at a higher level, such as through natural dialogues. The use of AR in robots with a higher LOA, such as autonomous humanoid robots [14], allows quick re-localization and navigation in lost situations by detecting active landmarks available in the surroundings of the robot.

4 An AR-based interface for HRI

A novel intuitive interface for HRI based on AR has been developed in this research. Figure 1 shows an overview of this AR-based interface. The AR environment in which HRI takes place consists of the physical entities in the robot working environment, such as the robot arm, the tools, workpieces, etc., and a parametric virtual robot model. An ARToolKit-based tracking method is adopted for tracking a handheld device, which is a marker-cube with an attached probe, and for virtual robot registration. The tracked interaction device allows the users to interact with the spatial information of the working environment. It can be used to guide the virtual robot and to intervene in the path planning and EE orientation planning processes. The actual working environment, the virtual robot model, the trajectory information, as well as the interaction processes are visualized through a monitor-based display.

Fig. 1  AR-based interface for HRI

4.1 Interaction device

The handheld device with an attached marker-cube offers an effective way for the manual input of spatial coordinates and six degrees-of-freedom (DOF) interaction. To guide the EE of a robot with a reduced wrist configuration, e.g., the Scorbot-ER II type manipulator, a pose tracked using this interaction device needs to be mapped to an alternative pose which permits valid inverse kinematic solutions. Figure 2(b) gives a valid robot pose mapped from an arbitrary pose tracked using the device (see Fig. 2(a)), in which the positional elements of the pose remain unchanged and the rotational elements are adjusted adequately.

Fig. 2  Coordinate mapping based on a tracked marker-cube
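The exact mapping depends on the wrist configuration of the manipulator and is not detailed here. The following C++ sketch only illustrates the idea under a simplifying assumption, namely that the tracked position is retained while the rotational part is replaced with a fixed "approach from above" orientation (tool Z-axis parallel to the negative Z-axis of the robot base frame); the pose layout and function name are illustrative, not the actual mapping used by the system.

```cpp
#include <array>

// A 4x4 homogeneous transform stored row-major; purely illustrative.
using Pose = std::array<std::array<double, 4>, 4>;

// Map an arbitrary tracked pose to a pose a reduced-wrist manipulator can
// reach: keep the translation, replace the rotation with a fixed
// "approach from above" orientation (tool Z-axis pointing down).
// This is only one possible mapping, assumed here for illustration.
Pose mapToReachablePose(const Pose& tracked)
{
    Pose mapped = {};
    // Fixed rotation (180 deg about the base X-axis): tool Z points downward.
    mapped[0] = {1.0,  0.0,  0.0, tracked[0][3]};
    mapped[1] = {0.0, -1.0,  0.0, tracked[1][3]};
    mapped[2] = {0.0,  0.0, -1.0, tracked[2][3]};
    mapped[3] = {0.0,  0.0,  0.0, 1.0};
    return mapped;
}
```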

4.2 Euclidean distance-based method

In the AR environment, it may be difficult for a user to locate a point among a number of spatial points defined in the spatial workspace using the probe on the interaction device. Therefore, a Euclidean distance-based method is proposed to assist the user in selecting the point of interest. Unlike the methods reported earlier [8, 9], in which a point is selected when it is closer than a predefined distance from the tip of a probe, this Euclidean distance-based method computes the distances between the probe and each spatial point, and associates this value with the corresponding point. The values are updated automatically when the probe moves in the workspace and the one that has the minimum distance to the probe will be highlighted to the user as a candidate point for selection.

In a pick-and-place operation, the definition of a spatial point (e.g., \( {V}_{\text{poi}} \left( {x,y,z} \right) \)) to be selected is given as

$$ V_{\text{poi}} :\; S\left( O_{0}, V_{\text{poi}} \right) = \min \left\{ S\left( O_{0}, V_{i} \right);\; i = 0, 1, 2, \cdots, N_{\text{p}} \right\}, $$
(1)

where \( O_{0}\left( x_{0}, y_{0}, z_{0} \right) \) defines the origin of the coordinate system of the interaction device (the tip of the probe); \( V_{i}\left( x_{i}, y_{i}, z_{i} \right) \) is the ith spatial point; \( S\left( O_{0}, V_{i} \right) \) is the Euclidean distance between \( O_{0} \) and \( V_{i} \); and \( N_{\text{p}} \) is the number of spatial points that have been created.

In a path following operation, the definitions of the parameters given in Eq. (1) are slightly different. In this case, \( {V}_{i} \left( {x_{i} ,\;y_{i} ,\;z_{i} } \right) \) will be the ith sample point of the curve model, and \( N_{\text{p}} \) the total number of the sample points.
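A minimal C++ sketch of this selection rule, following Eq. (1), is given below; the point type and function names are illustrative. The distances are recomputed every frame so that the highlighted candidate follows the probe as it moves.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };

// Euclidean distance S(a, b) between two spatial points.
double distance(const Point3& a, const Point3& b)
{
    return std::sqrt((a.x - b.x) * (a.x - b.x) +
                     (a.y - b.y) * (a.y - b.y) +
                     (a.z - b.z) * (a.z - b.z));
}

// Return the index of the candidate point closest to the probe tip O0,
// i.e., the point V_poi of Eq. (1); returns -1 if the list is empty.
// Intended to be called once per frame so the highlight tracks the probe.
int closestPointIndex(const Point3& probeTip, const std::vector<Point3>& points)
{
    int best = -1;
    double bestDist = 0.0;
    for (std::size_t i = 0; i < points.size(); ++i) {
        const double d = distance(probeTip, points[i]);
        if (best < 0 || d < bestDist) { best = static_cast<int>(i); bestDist = d; }
    }
    return best;
}
```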

4.3 Spatial interaction mechanisms

Various spatial interaction mechanisms have been provided to the users for efficient and intuitive planning of a robotic task in an AR environment, as shown in Fig. 1. These can be achieved through real-time tracking of the interaction device and the monitor-based visualization, which allows the users to perceive the virtual elements instantaneously while interacting with them. In a robotic pick-and-place operation, the orientation of the target frame for the EE of the robot may not be as critical as its position. In a robotic path following operation, the EE of the robot is constrained to follow a visible path on a workpiece at permissible inclination angles with respect to the path. A de-coupled method is adopted in the definition of a target frame for the EE of the robot, i.e., the positional and rotational elements of the target frame are determined separately, since the orientation of the tracked interaction device cannot be used directly as the orientation of the target frame, as described in Sec. 4.1. The procedures for AR-based human-virtual robot interaction, which consist of a series of interaction methods to facilitate the two types of robotic operations, are shown in Fig. 3, namely, a pick-and-place operation (see Fig. 3(a)) and a path following operation (see Fig. 3(b)).

Fig. 3  Procedures for AR-based human-virtual robot interaction in a a pick-and-place operation, and b a path following operation

The detailed interaction methods are presented as follows.

(i) Collision-free-volume (CFV) generation

A virtual sphere with a known radius is defined where the center is located at the tip of the interaction device (probe). A CFV can be generated by recording the position of the tip while the interaction device is moved around the space relative to the task to be planned [43].
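The following is a minimal sketch of how the CFV recording could be structured, assuming the probe-tip position is sampled once per video frame and stored as the center of a sphere of fixed radius; the types, the radius value and the minimum-spacing filter are illustrative assumptions rather than details given in the paper.

```cpp
#include <cmath>
#include <vector>

struct Point3 { double x, y, z; };

// CFV modeled as a union of equal-radius spheres whose centers are the
// recorded probe-tip positions (cf. Fig. 6).
struct CollisionFreeVolume {
    double radius = 0.03;          // sphere radius in metres; illustrative value
    std::vector<Point3> centers;   // probe-tip samples collected while sweeping
};

// Record the probe tip once per frame while the device is swept through the
// task-relevant region. The minimum spacing between consecutive centers is an
// assumed detail, used here only to avoid storing near-duplicate samples.
void recordSample(CollisionFreeVolume& cfv, const Point3& tip, double minSpacing = 0.01)
{
    if (!cfv.centers.empty()) {
        const Point3& last = cfv.centers.back();
        const double dx = tip.x - last.x, dy = tip.y - last.y, dz = tip.z - last.z;
        if (std::sqrt(dx * dx + dy * dy + dz * dz) < minSpacing) return;
    }
    cfv.centers.push_back(tip);
}
```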

(ii) Path tracing from demonstrations

In a path following operation, by moving the interaction device along a visible path/curve, a set of discrete points can be tracked and recorded. Multiple sets of points are used to obtain a parametric model of the original path/curve [46].

(iii) Spatial point creation/selection

The positions of a target frame for the EE of the robot are defined by pointing to the desired target positions in space. The spatial points that are created to form a collision-free path should be accessible by the EE of the robot and lie within the CFV. In particular, the Euclidean distance-based method, as described in Sec. 4.2, is used to select a point of interest from a list of existing spatial points. During the definition of a number of spatial points on the curve model, the Euclidean distance-based method can be applied to all the parameterized points of the curve model.

(iv) Spatial point deletion

A spatial point can be deleted by first specifying this point, using the Euclidean distance-based method, within the list of spatial points that have been created. The numbering sequence of the remaining spatial points will be updated accordingly.

(v) Spatial point insertion

Through specifying two consecutive spatial points within a list of spatial points that have been created, a new spatial point can be created and inserted between these two points to form a new spatial points list. The numbering sequence of the spatial points in the new list will be updated accordingly.

(vi) EE orientation specification at each spatial point

Given a parametric curve model, a coordinate frame at the start of the curve model can be defined with respect to the coordinate frame at the base of the robot [46]. The coordinate frames with origins at the rest of the spatial points selected along the curve can be defined by applying the transformations reflecting the changes in the curve direction. The orientations of the EE at the spatial points are defined according to the sequence of selection with respect to the coordinate frame at the corresponding spatial points. The EE orientation of a spatial point can also be represented with respect to the robot base frame.
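One simple way to realize this frame propagation is sketched below, assuming the curve is available as a sequence of sampled points and that the user-defined start frame supplies the initial Z-axis; the scheme re-aligns the X-axis with the local tangent at each point and carries the previous Z-axis forward. This is only an illustrative approximation (it assumes the tangent never becomes parallel to the carried Z-axis), and, as discussed later in Sec. 6.3, it approximates rather than reproduces the surface normal.

```cpp
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };

Vec3 sub(Vec3 a, Vec3 b)   { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 cross(Vec3 a, Vec3 b) { return {a.y * b.z - a.z * b.y,
                                     a.z * b.x - a.x * b.z,
                                     a.x * b.y - a.y * b.x}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 normalize(Vec3 v)     { double n = std::sqrt(dot(v, v)); return {v.x / n, v.y / n, v.z / n}; }

// Right-handed frame: X = curve tangent, Z = normal-like axis, Y = Z x X.
struct Frame { Vec3 origin, xAxis, yAxis, zAxis; };

// Propagate the user-defined start frame along the sampled curve points by
// re-aligning X with the local tangent and carrying the previous Z forward.
// The last sample is omitted for brevity; degenerate tangents are assumed
// not to occur in this sketch.
std::vector<Frame> framesAlongCurve(const std::vector<Vec3>& pts, const Frame& start)
{
    std::vector<Frame> frames;
    Vec3 prevZ = start.zAxis;
    for (std::size_t i = 0; i + 1 < pts.size(); ++i) {
        Frame f;
        f.origin = pts[i];
        f.xAxis  = normalize(sub(pts[i + 1], pts[i]));   // local tangent
        f.yAxis  = normalize(cross(prevZ, f.xAxis));     // keep Z close to previous Z
        f.zAxis  = cross(f.xAxis, f.yAxis);
        prevZ    = f.zAxis;
        frames.push_back(f);
    }
    return frames;
}
```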

(vii) Spline modeling

In a pick-and-place application, a cubic-spline representation of the robot's path is generated from the spatial points that have been created. By modifying the existing spatial points, an updated path can be fitted. In a path following application, the angle between the orientation at each spatial point selected on the curve and the Z-axis of the robot base frame is interpolated to form an orientation profile for the EE of the robot.

With these interaction mechanisms, the user can interact with the environment and the spatial points that are of interest in a robot task. For spline modeling, the parameterized points of the path are generated by taking their normalized accumulative path lengths (Euclidean distances) to the start of the path as the interpolation parameter. The same parameter is used to generate the interpolated angles associated with the parameterized points.
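A minimal sketch of this parameterization is shown below; the normalized accumulative chord lengths serve as the interpolation parameters, and a simple linear interpolation of the inclination angles is used here for brevity, whereas the system itself fits a cubic spline over the same parameters. Types and names are illustrative, and at least two distinct points are assumed.

```cpp
#include <cmath>
#include <vector>

struct Point3 { double x, y, z; };

// Normalized accumulative chord lengths: t[0] = 0, t[last] = 1.
// These are the interpolation parameters described in Sec. 4.3.
std::vector<double> chordLengthParameters(const std::vector<Point3>& pts)
{
    std::vector<double> t(pts.size(), 0.0);
    for (std::size_t i = 1; i < pts.size(); ++i) {
        const double dx = pts[i].x - pts[i - 1].x;
        const double dy = pts[i].y - pts[i - 1].y;
        const double dz = pts[i].z - pts[i - 1].z;
        t[i] = t[i - 1] + std::sqrt(dx * dx + dy * dy + dz * dz);
    }
    const double total = t.back();
    for (double& ti : t) ti /= total;   // normalize to [0, 1]
    return t;
}

// Interpolate the EE inclination angle at parameter u from the angles defined
// at the selected spatial points (angles[i] corresponds to t[i]). Linear
// interpolation is used here for brevity only.
double interpolateAngle(const std::vector<double>& t, const std::vector<double>& angles, double u)
{
    std::size_t i = 1;
    while (i < t.size() - 1 && t[i] < u) ++i;
    const double w = (u - t[i - 1]) / (t[i] - t[i - 1]);
    return (1.0 - w) * angles[i - 1] + w * angles[i];
}
```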

5 Visualization

In this research, a monitor-based visualization has been adopted to facilitate intuitive AR-based HRI. It enables the users to perceive the virtual contents associated with the different interaction methods. The virtual content augmented onto the real environment is informative and useful to the users during their interaction with the virtual robot. In particular, the 2D workspace of the robot and the visual content/cues used to facilitate the interactions are presented.

5.1 Robot 2D workspace

The workspace of a robot is defined as the volume of space which the robot can reach in at least one orientation. It can be represented as a cluster of 3D points in Cartesian space. There are many research studies on robot workspace modeling and analysis. In this system, however, it is not necessary to calculate the entire workspace of the robot and augment it onto the real scene, as the movement of the robot is related to the workplace. For a given Z (referenced to the robot base coordinate system), the workspace is reduced to a 2D region reachable by the EE of the robot. This region can be characterized by two variables, namely, the position (represented by Z) and the radius (R) of the region, as shown in Fig. 4, which depicts the boundary (dashed curve in red) of the workspace of a robot from the side view. Considering the operating range of each joint and the length of each link, for a feasible Z, the corresponding 2D workspace can be determined as follows (a simplified code sketch is given after the steps):

Fig. 4  Determining the radius of the 2D workspace given its position (in Z)

(i) Locate the given Z and determine the region it falls into;

(ii) If \( Z \in \left[ Z_{1}, Z_{\max} \right] \) (i.e., within region I), the radius of the 2D workspace is determined by \( \theta_{2} \) (in this case, \( \theta_{3} = 0 \));

(iii) If \( Z \in \left[ Z_{\min}, Z_{1} \right) \) (i.e., within region II), the radius is determined by \( \theta_{3} \) (in this case, \( \theta_{2} \) is chosen such that joint 2 is at the boundary of its operating range);

(iv) Determine the angle of the fan-shaped region according to the robot base rotation range;

(v) Register the fan-shaped region onto the real working environment.
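The following C++ sketch illustrates steps (i)-(iii) for a simplified planar two-link model of the arm (shoulder height z0, link lengths l2 and l3, and a lower limit on joint 2); the numerical values and the model itself are assumptions for illustration and do not correspond to the actual Scorbot-ER II parameters.

```cpp
#include <cmath>

// Simplified planar arm model used only to illustrate steps (i)-(iii):
// shoulder at height z0 above the base, upper-arm length l2 (joint 2),
// forearm length l3 (joint 3). All values are assumed, not actual parameters.
struct ArmModel {
    double z0       = 0.35;   // shoulder height above base, m
    double l2       = 0.22;   // upper-arm length, m
    double l3       = 0.22;   // forearm length, m
    double theta2Lo = -0.6;   // lower limit of joint 2, rad
};

// Radius R of the fan-shaped 2D workspace at height Z (robot base frame).
// Region I: the fully extended arm (theta3 = 0) reaches the slice directly.
// Region II: joint 2 is held at its lower limit and the forearm sweeps the slice.
// Returns a negative value if Z is outside the reachable height range.
double workspaceRadius(const ArmModel& a, double Z)
{
    const double reach = a.l2 + a.l3;
    const double dz = Z - a.z0;
    // Z1: lowest slice still reachable with the arm fully extended, i.e.,
    // the fully extended arm held at the joint-2 lower limit.
    const double z1 = a.z0 + reach * std::sin(a.theta2Lo);
    if (std::fabs(dz) > reach) return -1.0;               // above Zmax or unreachable
    if (Z >= z1)                                          // region I
        return std::sqrt(reach * reach - dz * dz);
    // Region II: elbow position fixed by the joint-2 limit, forearm gives R.
    const double ex = a.l2 * std::cos(a.theta2Lo);        // elbow radial offset
    const double ez = a.z0 + a.l2 * std::sin(a.theta2Lo); // elbow height
    const double d  = Z - ez;
    if (std::fabs(d) > a.l3) return -1.0;                 // below Zmin
    return ex + std::sqrt(a.l3 * a.l3 - d * d);
}
```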

Figure 5 shows the registration of a 2D workspace onto the real scene. The fan-shaped region enables the user to visualize whether the starting and goal positions of the workpiece are in the reachable range of the EE. If the goal position is adjustable, the 2D workspace can assist the user in adjusting the goal position such that it is accessible by the EE.

Fig. 5  Fan-shaped region of the robot at the ground level (Z = 0 referenced in the coordinate system given by the planar marker)

5.2 Visual cues

Distinctive visual cues have been used to facilitate the different processes during the users' interaction with the virtual robot model. This allows the users to quickly perceive their interactions with the spatial environment. The detailed visual cues are described as follows.

(i) Coordinate frame

A number of coordinate systems are defined and displayed for easier perception of the spatial environment, namely, the universal coordinate frame defined on the base marker, the coordinate frame defined at the EE of the virtual robot, the coordinate systems defined at the start and goal points (for pick-and-place operations), the coordinate systems defined on the tracked curve (for path following operations), and the coordinate frame defined on the interaction device.

(ii) CFV

A series of virtual spheres are used to represent the CFV constructed (see Fig. 6(a)). Once a CFV has been generated, it will be represented by the centers of the virtual spheres (see Fig. 6(b)), which allows the users to view the working environment at all times, and enables a visual inspection of the quality of the CFV [43].

Fig. 6  CFV represented by a semi-transparent virtual spheres, and b centers of virtual spheres

(iii) Spatial point of interest

During spatial point creation or modification (i.e., insertion and deletion), the point of interest is highlighted with a distinctive color so the users will be able to differentiate it easily from other spatial points. For example, a candidate spatial point being selected can be represented in red color (see Fig. 7), while the rest in the spatial point dataset can be represented in green color.

Fig. 7  Highlight of the spatial point of interest in red color: a creation of a spatial point, b deletion of a spatial point

(iv) EE orientation cone

A virtual cone is used to represent the EE orientation range at a point. For example, in the planning of the EE following a path on a surface, at each sample point of the path, as shown in Fig. 8, the point defines the vertex, and the surface normal at this point defines the axis of symmetry of the cone. The open angle of the cone is usually task dependent.
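A candidate EE orientation can then be checked against this cone by comparing the angle between the EE approach direction and the cone axis with the task-dependent half-angle, as in the minimal sketch below; the vector type and function names are illustrative.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
double norm(const Vec3& v)               { return std::sqrt(dot(v, v)); }

// Return true if the candidate EE approach direction lies inside the
// orientation cone at a sample point, i.e., the angle between the direction
// and the cone axis (surface normal at the point) does not exceed the
// task-dependent half-angle. Neither vector needs to be normalized.
bool insideOrientationCone(const Vec3& approachDir, const Vec3& coneAxis, double halfAngleRad)
{
    const double cosAngle = dot(approachDir, coneAxis) / (norm(approachDir) * norm(coneAxis));
    return std::acos(std::fmax(-1.0, std::fmin(1.0, cosAngle))) <= halfAngleRad;
}
```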

Fig. 8  Virtual cone representing the EE orientation range

(v) Rendering of paths

The path formed by a list of spatial points is registered onto the workspace. In a path following operation, the orientation profile of the EE of the robot, in the form of a ruled surface, can be registered onto the workpiece.

(vi) Exception notification

During spatial point creation or insertion, if a candidate spatial point is outside the CFV, the color of the CFV will be altered (e.g., changing from white to red) as a notification. In the path rendering stage, if a segment of the path is outside the CFV, this segment will be highlighted with a different color from the rest of the path. This guides the users to perform spatial point modification in the neighborhood of this path segment to obtain a collision-free path.
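The check that triggers this notification can be sketched as follows, assuming the CFV is stored as the union of the recorded equal-radius spheres (see item (ii) above); a candidate point, or a sampled point of a path segment, is flagged when it does not lie inside any sphere. Types and names are illustrative.

```cpp
#include <cstddef>
#include <vector>

struct Point3 { double x, y, z; };

// CFV as a union of equal-radius spheres centered at the recorded probe-tip samples.
struct CollisionFreeVolume {
    double radius;
    std::vector<Point3> centers;
};

// A point is inside the CFV if it lies within at least one recorded sphere.
bool insideCFV(const CollisionFreeVolume& cfv, const Point3& p)
{
    const double r2 = cfv.radius * cfv.radius;
    for (const Point3& c : cfv.centers) {
        const double dx = p.x - c.x, dy = p.y - c.y, dz = p.z - c.z;
        if (dx * dx + dy * dy + dz * dz <= r2) return true;
    }
    return false;
}

// Collect the indices of sampled path points that fall outside the CFV so the
// corresponding segments can be re-colored as an exception notification.
std::vector<std::size_t> segmentsOutsideCFV(const CollisionFreeVolume& cfv,
                                            const std::vector<Point3>& pathSamples)
{
    std::vector<std::size_t> outside;
    for (std::size_t i = 0; i < pathSamples.size(); ++i)
        if (!insideCFV(cfv, pathSamples[i])) outside.push_back(i);
    return outside;
}
```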

6 Implementation and discussion

This section presents two case studies and a user study on the proposed AR-based interface for intuitive HRI in a robotic pick-and-place application and a path following application. The proposed interface was implemented using the C/C++ programming language in the Visual C++ 2005 environment on a 1 GHz PC. Two external packages are used, namely, "Roboop", which provides robot kinematics and dynamics modeling, and "gnuplot", which provides various plotting routines.

6.1 Case studies

Figure 9 illustrates the use of the proposed spatial interaction method in planning a task for transferring an object from a start point to a goal point, which aims to find a collision-free path between the two points. In this example, the spatial points are created to be within a predefined CFV, and the orientations of the EE at these points are predetermined (e.g., parallel to the Z-axis of the robot base frame). Figure 10 demonstrates the use of the interaction mechanisms in a task where the EE of the robot is required to follow a U-shaped curve and the orientation of the EE needs to be planned appropriately to avoid the edge along the curve. This case study is designed to demonstrate robot path following operations, such as robotic gluing, arc welding, etc.

Fig. 9  A pick-and-place task: a setup, b creation of spatial points, c selection of a spatial point to be deleted, d selection of two consecutive spatial points, e insertion of a spatial point, f path re-generation

Fig. 10  A path following task: a curve model, b selection of spatial points on the curve, c selection of a point to be deleted, d selection of a point to be inserted, e definition of the target frame at the start of the curve model, f definition of the EE orientation at a spatial point, g selection of a spatial point at which the orientation of the EE needs to be modified, h modification of the orientation of the EE at a spatial point, i orientation profile generation

6.2 User studies

This section presents the user study carried out to test the proposed AR-based interface. Twelve researchers, nine male and three female, from the Department of Mechanical Engineering were invited to conduct the experiments. None of the participants were familiar with robotic systems, particularly with robot path and task planning, while eight of them had experience in the use of AR-based systems.

The user study is composed of two parts, namely, a system experiment and a questionnaire-based survey. The questionnaire comprises two sets of questions. One set of questions, to be filled out by every participant before the test, evaluates the participants' background in terms of their experience in the use of AR-based systems and their familiarity with robotic task planning. The other set of questions concerns the participants' evaluation of the AR-based interface, as well as the use of visual cues as visualization enhancements, upon their completion of each planning task.

Two robot tasks were carried out for the user study. The first task, a robot pick-and-place task, is designed to evaluate the proposed HRI interface for geometric path planning and generation; the setup is given in Fig. 9(a). In this task, the participants were asked to select a number of spatial points between the starting and goal points, and a path is generated from these points. The second task is a robot path following task, which emphasizes the performance of the HRI interface in robot EE orientation planning and adjustment. In this task, as illustrated in Fig. 7(a), the participants were asked to select a series of spatial points on a visible curve, and then to define the EE orientations at these points needed to follow a known curve. By doing so, an EE orientation profile along the visible curve can be generated. Each of the two tasks was carried out under two different conditions, which can be differentiated by the increasing levels of situational awareness of the working environment, as follows.

(i) Limited suite of the functions of the proposed AR-based HRI interface, which allows the users to view the real environment and interact with the virtual robot without spatial point modification or robot EE orientation adjustment.

(ii) Full suite of the functions of the proposed HRI interface, which allows the users to view the augmented environment and perform robotic task planning, EE orientation planning and modification. The planned paths can be simulated and reviewed prior to actual execution.

The first condition can be adopted to mimic or simulate the planning process using traditional “teach-in” robot programming method, in which it is difficult or even impossible to modify the selected spatial points during planning process. If a planned path is not successful, the spatial points will be re-created. Comparatively, the full suite of the proposed method permits the users to adjust the spatial points in case the generated path based on these points is unsatisfactory.

A monitor-based visualization is used to present the augmented view of the working environment as well as necessary visual cues to the users. The objective and the sequence of each task were explained to the participants. Every participant was first allowed to learn and practice the use of the interaction device in guiding the EE of the virtual robot moving around the workspace, and to get familiar with the sequence of the tasks. Before each trial, a CFV had already been generated, and the participants were only responsible for spatial point selection or EE orientation definition. This is to ensure that under the first condition, each point selected is within the CFV and thus the corresponding robot configuration is collision-free. This mimics the process of spatial point selection in “teach-in” robot programming in which an operator operates the real robot using a teaching pendant.

Since the selected spatial points cannot be modified under the first condition, it is obvious that the more spatial points are created, the higher the possibility that the path generated from these points is satisfactory (i.e., collision-free). Therefore, under this condition, each participant was asked to select ten spatial points, considering the complexity of the work environments shown in Fig. 9(a) and Fig. 10(a). Comparatively, there is no such constraint under the second condition, since the spatial points can be modified in case the path generated from these points is unsatisfactory. Each participant performed four trials, i.e., the two tasks, each under the two distinct conditions. For each trial, the time to completion, possible collisions and the task completion rate were measured. A collision is defined as an occurrence where a path segment is outside the generated CFV, or where the swept model of the EE is outside the CFV when moving along the visible path.

Figure 11 shows the time to completion of the pick-and-place task under the two conditions, where the experimental conditions have a significant effect on the time to task completion, i.e., approximately 206 s (standard error of 45 s) under the first condition, and 125 s (standard error of 18 s) under the second condition. In addition, under the first condition, only one out of the ten participants was able to select the spatial points yielding a collision-free path at the first attempt. Another three participants completed the task in their second attempt, and the rest needed more than two attempts to create a set of suitable spatial points to form a satisfactory path. It should also be noted that in an actual application, it will take more time than in the experiments to create the spatial points, as the user needs to manipulate the real robot arm, move it to a series of desired positions and record them accordingly.

Fig. 11  Maximum, minimum and average time for the participants to complete the pick-and-place task

For the path following task, similarly, a CFV has been generated in advance around the visible curve. In the first trial, each participant was asked to select ten robot configurations along the visible curve, each consisting of both a robot EE position and an orientation, at which the robot EE is within the CFV. If the EE orientation profile generated from these configurations results in a collision (i.e., part of the swept EE model lies outside the CFV), the ten configurations need to be re-generated until the resulting orientation profile is satisfactory. In the second trial, the participants can first select some spatial points on the curve, and then define the EE orientation associated with each spatial point correspondingly. In case the resulting orientation profile of the robot EE is unsatisfactory, the participant can adjust the EE orientation at the relevant spatial points, or edit the list of spatial points if he/she feels the need to. Figure 12 shows the time to completion of the EE orientation planning task under the two conditions. The average time for completion of the EE orientation planning is nearly 605 s (standard error of 87 s) under the first condition and about 337 s (standard error of 74 s) under the second condition. It was observed that all participants failed at their first attempt to plan a satisfactory EE orientation profile along a given path.

Fig. 12  Maximum, minimum and average time for the participants to complete the path following task

From the user studies, the participants felt that they were able to interact with the virtual robot in its working environment using the interaction device. In particular, they could achieve the creation, selection and modification of the spatial points quickly and easily in the robotic pick-and-place task. They also found it intuitive and convenient to carry out these operations on the visible curve model in the robotic path following task, although they reported some difficulties in determining a suitable EE orientation at each spatial point. Such difficulty may be caused by the misalignment between the virtual EE and the interaction tool, as the virtual robot model has a reduced wrist configuration. In addition, they reported that it was time-consuming to perform the tasks under the first condition, as it required them to remember the unsatisfactory segments on the previously generated path or orientation profile and redefine the spatial points or orientations carefully in the neighborhood of these segments. Unsurprisingly, as demonstrated in planning the robot path following task, particularly in the first attempt of each participant, most collisions occurred at the path segments that were adjacent to the obstacles (as shown in Fig. 7(a)).

With regard to the visual cues and feedback presented on the monitor screen, the participants reported that these helped them understand the planning process and the correction operations. Particularly under the second condition, they felt that the use of such virtual contents as cues made their interaction with the virtual environment easy and flexible, which facilitated the completion of the path and EE orientation planning tasks. However, they found it distracting that the virtual EE disappeared when the interaction device moved out of the working range of the robot. The participants who were using AR systems for the first time tended to occlude the base marker with the marker-cube or to move the marker-cube out of the field of view of the camera, making the virtual robot disappear from their view. In addition, they experienced distraction and fatigue as they needed to alternate their attention between the perception of the augmented environment through the monitor screen and the manipulation of the interaction device in the real working environment. Meanwhile, the participants tended to focus on the monitor view when performing the tasks. This may, to some extent, lead to improper guidance of the virtual robot, as it is not easy for the operator to perceive depth information from the display on the monitor. The use of a head-mounted display instead of the monitor could alleviate these issues and improve the performance of the proposed method significantly.

6.3 Discussion

The two case studies demonstrate the successful implementation of the proposed AR-based interface for human-virtual robot interaction. The average tracking errors in the two case studies are both approximately 11.0 mm, with the camera installed 1.5 m away from the workplace. The errors are largely caused by the tracking method adopted. In the first case study, the error is introduced during the generation of the CFV and the creation and insertion of spatial points, while in the second case study, the error is introduced during the acquisition of the parametric model of the spatial U-shaped curve. However, in the first case study, user demonstrations are used to generate a suitable CFV instead of a path for the EE of the robot to follow. Thus, the path generated within the CFV is not affected by the jitter and noise present in the demonstration. In addition, the users can hardly perceive the misalignment between the actual path and the path model in the second case study. From this perspective, the tracking errors do not significantly affect the intuitiveness of the interface in facilitating human-virtual robot interaction. It is also worth noting that in planning the path following task, the coordinate frame at the start of a curve model is defined by the user, whereas the coordinate systems at the rest of the spatial points are determined by applying the transformations reflecting the changes in the curve direction, which may not properly reflect the changes in the surface normal. Therefore, the direction chosen as the axis of symmetry of the virtual cone representing the EE orientation range may have a significant error with respect to the surface normal due to the errors existing in the curve model. Hence, the performance of the proposed system in planning the path following task relies on the quality of the curve model.

The results from the user study suggest some advantages of the proposed AR-based HRI method over the conventional "teach-in" method, in which a teaching pendant is normally used to assist the operator in robot task planning. Firstly, inexperienced users are able to learn the method quickly and interact with the virtual robot using the proposed HRI interface. Secondly, the proposed interface facilitates faster robot programming and path planning. Thirdly, the Euclidean distance-based method allows the users to select a spatial point of interest easily for insertion or deletion. The use of visual cues increases the intuitiveness of the AR-based HRI and guides the users during their interaction with the virtual robot, e.g., spatial point selection and modification, as well as EE orientation definition and adjustment, in planning a given robotic operation. In the pick-and-place task, the path formed by the spatial points is updated simultaneously once one or more points are modified. In the path following task, the EE orientation profile is re-generated immediately once the EE orientation at a spatial point has been modified. This enables the users to be immediately aware of the results of their interactions with the virtual robot and the working environment.

7 Conclusions and future work

In this research, an AR-based interface for intuitive HRI has been proposed and presented. A brief review of the various levels of HRI and the current HRI methods has been given. A number of interaction methods have been defined in terms of the various types of operations needed in an AR environment for human-virtual robot interaction in different robotic applications. A Euclidean distance-based method has been developed to assist the users in the selection of spatial points in spatial point deletion or insertion operations. The monitor-based visualization mode adopted allows the users to perceive the virtual contents augmented onto the real environment through the different interaction metaphors. The two case studies show the successful implementation of the proposed interface in planning robotic pick-and-place operations and path following operations.

A number of areas can be further explored and developed to improve the AR-based HRI interface presented in this paper. A more accurate and robust tracking method can be developed to improve the performance of the interface. Improvements can be made to develop an easier, more intuitive and less distracting interface for the users to perform EE orientation definition and modification. The current interface can be further enhanced to assist the users in tele-operations or tele-manipulations by integrating suitable sensors and devices at the remote operating sites and the control rooms.