1 Introduction

In modern manufacturing, part feeding ensures that production lines never stop due to a shortage of parts. Fixed industrial manipulators, often placed in a fenced cell for safety, are supplied by conveyors and human operators (Bøgh et al. 2012). However, these feeding solutions lack the flexibility and efficiency needed to deal with dynamic tasks (Fathi et al. 2016). To achieve greater flexibility as well as efficient production, it is therefore essential to develop flexible and autonomous part feeding systems. Today, mobile robots, as flexible and movable platforms, are a popular choice for transporting materials along production lines, and considerable progress has been made on using them to transport and feed parts. However, existing mobile-robot feeding systems must set stop locations for pick-up (Dang et al. 2014), which is time-consuming and limits the overall performance of advanced production. By contrast, non-stop part feeding of a robotic manipulator by a mobile robot improves feeding efficiency and flexibility, and is therefore worth investigating as a means of improving industrial production processes. Furthermore, a mobile robot is normally driven to a predefined spot with few interactions with the robotic manipulator; consequently, the interaction between a mobile robot and a fixed robotic manipulator sharing the same workspace has received little attention. In addition to achieving dynamic grasping, accurate and robust perception and estimation of the state of the mobile robot are necessary for the robotic manipulator, and constitute one of the key focuses of this work.

In the last few decades, the multi-sensor fusion approach, which integrates multiple information sources to obtain robust and reliable sensing performance, has been extensively researched (Nomura and Naito 2000) as a typical option for robot state estimation. With the rapid development of machine vision techniques, image-based positioning systems using cameras have been developed as a promising solution for industrial applications, including robotic positioning (Mautz and Tilch 2011). Nevertheless, no sensor is free of limitations. On one hand, it has been demonstrated that vision-based positioning systems can provide accurate and reliable state information, especially with fiducial visual markers, which have been widely used in the literature. The well-known “ArUco” visual marker, developed by Garrido-Jurado et al. (2014), has been used extensively as a low-cost and straightforward solution for obtaining position information. Babinec et al. (2014) tested the positioning accuracy of the ArUco marker: their experiments demonstrated a positioning error of less than 0.005 m using a webcam when the distance between the camera and the marker is less than 1.2 m. Thus, vision-based positioning using the ArUco marker has been verified as a highly accurate positioning method and is used in this work. On the other hand, visual positioning methods face challenges, such as a randomly blocked view and low-quality or distorted images, which result in positioning failures. In this case, it is reasonable to combine the vision system with other sensors for robotic positioning (Ben-Afia et al. 2014). An IMU offers a robust signal at a high sampling rate but suffers from accumulated errors owing to the integration performed during state estimation. Ultrasonic sensor systems offer superior accuracy for indoor positioning but operate at a low rate and produce occasional outliers.
To address these limitations and exploit the specific strengths of the standalone sensors, the IMU and ultrasonic sensors are fused to provide auxiliary state estimation for positioning. Thanks to this backup positioning system, seamless switching between positioning modes is enabled, so the robotic manipulator can robustly reposition the mobile robot.

Recent years have seen growing interest in multi-sensor fusion algorithms for robot state estimation. Kalman-based filtering is a classical method for estimating the state of a robot, and various variants have been developed to improve different aspects, such as the unscented Kalman filter and the extended Kalman filter (EKF). Among emerging methods, neural-network-based approaches have been a research focus in recent years and have been extensively applied to state estimation. The researchers in Guo et al. (2018) and Guo and Chen (2021) constructed neural-network-based controllers to estimate the state of a robotic manipulator and handle unknown uncertainties to improve tracking performance. However, neural-network-based algorithms usually incur a large computational load and require a training process, which makes them unsuitable for real-time operation. In this work, to accurately grasp moving objects, the robotic arm must obtain the real-time position of the mobile robot. The EKF can handle a nonlinear system through Taylor expansion while remaining computationally efficient, and is therefore implemented in this work to fuse IMU data with data from the ultrasonic positioning system for real-time state estimation.

As part of a preliminary study into the feasibility and efficiency of part feeding for a cooperative mobile robot and manipulator system (Co-MRMS), this paper investigates the interaction mode between a fixed-base robotic manipulator and a mobile robot, together with a multi-sensor fusion positioning strategy between them. Accurate and robust positioning with seamless switching, enabling the robotic manipulator to identify a moving mobile robot, is achieved through the following three main technical contributions:

  • A novel flexible and efficient interaction mode, based on the relative distance between the fixed-base robotic manipulator and the mobile robot, is designed for the Co-MRMS.

  • A new multi-sensor fusion positioning system is developed for the fixed industrial manipulator to perceive and interact with the mobile robot. To accurately and robustly acquire the positions of the mobile robot, two different indoor positioning methods are investigated: an EKF-based approach fusing ultrasonic sensors with an IMU, and a vision-based approach with an ArUco marker. In addition, with an outlier rejection strategy, the EKF-based approach can efficiently eliminate outliers in the ultrasonic sensor measurements.

  • A novel two-stage positioning strategy is proposed to allow the fixed-base robotic manipulator to reposition the mobile robot seamlessly in scenarios where the visual sensor is occluded.

The remainder of the paper is organised as follows. Sect. 2 reviews related work. Sect. 3 presents the designed interaction mode of the Co-MRMS and validates it through simulation-based experiments. The positioning approaches are described in Sect. 4. Sect. 5 evaluates the system performance in a number of physical static and dynamic experiments and demonstrates the feasibility of the proposed switching strategy. Finally, Sect. 6 provides conclusions and outlines directions for future research.

2 Related works

Many approaches have been proposed in the literature to address the part feeding problem. Conveyor-based solutions have been developed and applied in industry over the years (Akella et al. 2000; Lynch 1999; Mirtich et al. 1996; Carlisle et al. 1994; Causey et al. 1997). Given the advantages of greater robotic mobility and task flexibility over conventional approaches, interest in deploying mobile robots to supply materials to robotic manipulators has increased significantly (Han et al. 2019; Andersen et al. 2017). However, the mobile robots in these works are rather inflexible, as they are driven to and stopped at a predefined location for part feeding. In this work, the problem considered is an interactive part-feeding robotic system in which a robotic manipulator dynamically fetches an object from a moving mobile robot. To achieve this, continuously identifying and handling moving objects is indispensable; this has been studied with different types of robots, such as standalone robotic manipulators and mobile manipulators (Zhang et al. 2019; Dewi et al. 2015; Zhu and Ren 2020; Chang et al. 2015; Salehian et al. 2016; Luu and Tran 2015; Zhang et al. 2018).

The interaction model and coordinated control are significant for a workspace-sharing system, especially one that contains moving agents. Liu and Tomizuka (2014) established a workspace-sharing interaction model between a robot and a human performing collaborative tasks. In Li et al. (2022), a kinematic model between a human and an exosuit was established; to adapt to different terrains, impedance learning was used in the inner loop to regulate the impedance parameters of the exosuit, while human-in-the-loop control was deployed in the outer loop to adjust speeds. Yu et al. (2020, 2021) used visual and force sensors to obtain human motion and the interaction force for a human-robot co-transportation task; an impedance-based control strategy and an advanced robot end-effector controller were proposed successively to deal with uncertainties in the robot's dynamics. Loria et al. (2015) derived the relative kinematic interaction model of swarms of mobile robots for leader-follower formation control. The authors in Muszynska et al. (2016) combined artificial neural networks and fuzzy logic systems to develop a motion controller that deals with the nonlinearity and modelling uncertainty of robotic systems. The proposed neuro-fuzzy controller was validated on a wheeled mobile robot following a predefined loop trajectory and on a robotic manipulator applying hybrid force/position control on a surface; the experiments validated the advantage of the neuro-fuzzy motion controller, showing fast error convergence. In spite of these advances, cooperative/hybrid robotic systems involving a mobile robot platform and a fixed-base robotic manipulator have received little attention in the context of part feeding and interaction mode design. By contrast, this work addresses the interaction between a mobile robot and a fixed-base robotic manipulator in a single cooperative robotic system. By establishing a novel interaction mode between the mobile robot and the robotic manipulator, the efficiency and flexibility of material detection and grasping can be improved.

An equally important concern for the cooperative/hybrid robotic system is that the robotic manipulator must robustly and accurately perceive the mobile robot, as failing to do so would undermine the performance of picking up objects from the moving platform. Over the years, a plethora of literature applying vision systems to localise moving objects has been presented (de Farias et al. 2021; Papanikolopoulos et al. 1993; Ding et al. 2021; Allen et al. 1993). In Mane and Mangale (2018), a standalone vision system with a neural-network-based algorithm was investigated for object detection and tracking. For fixed-base robotic manipulators, a camera is generally mounted on the end-effector in a hand-eye configuration to obtain a superior viewpoint for tracking the moving object. Researchers in Zabalza et al. (2019) placed an overhead camera as the main camera and a fixed side camera as a complementary camera to detect obstacles. While relying on vision alone is fairly accurate for identifying the location of a moving object, it is not robust to sudden “blind spots” or distorted captured images, which may cause tracking to fail. Thus, visual sensor occlusion is another challenging problem specific to fixed-base robotic manipulator tracking systems.

Multi-sensor fusion, a prominent subject in state-of-the-art robotics research, is adopted to reduce uncertainty and enhance the robustness of a system. A number of positioning techniques based on multi-sensor fusion have been presented to date (Ebner et al. 2015; Xu et al. 2018). Researchers in Dobrev et al. (2016) proposed a positioning system based on the fusion of radar, ultrasonic and odometry data, using an EKF to determine the position and orientation of a mobile robot in an indoor environment. The work in Alatise and Hancke (2017) utilised an EKF to integrate IMU data with vision data extracted using speeded-up robust features (SURF) and random sample consensus (RANSAC) algorithms to estimate a mobile robot's pose indoors. The authors in Chen et al. (2016) proposed a hybrid-mode data fusion approach: IMU data was fused with ultrasonic data by an EKF when the latter was available, while a trained least-squares support vector machine (LS-SVM) corrected the inertial navigation system during outages. In Coelho et al. (2018), the researchers applied an EKF to fuse wheel encoder data with a computer vision system using an augmented reality code; in a simulation environment, the approach successfully dealt with unknown initial positions and robot kidnapping problems, in which the robot is moved to a new position without prior information. In Kaltiokallio et al. (2018), the relative performance of the particle filter and the EKF was compared in terms of localisation and tracking; the experimental results showed that the two have similar tracking accuracy, but the particle filter is much more computationally demanding than the EKF.

Although a variety of research on multi-sensor fusion has been conducted, visual positioning using the ArUco marker is either integrated into the fusion, which degrades positioning accuracy and precision, or used individually, which lacks sufficient robustness. Additionally, ultrasonic-sensor-aided positioning systems merit further exploration owing to their low cost and superior positioning accuracy in indoor environments.

Sensor suite switching has been used and validated to perform robust state estimation (Hausman et al. 2016). GPS and cameras were added to an inertial navigation system in Mourikis and Roumeliotis (2007) to handle different cases during the experiment. In Lynen et al. (2013), an EKF-based framework was employed to seamlessly handle additional or lost sensor signals. A backup operation mode based on airspeed measurement was proposed in Leutenegger and Siegwart (2012) for an unmanned airplane to deal with GPS outages. Hausman et al. (2016) introduced an approach based on statistical signal quality analysis for seamless sensor suite switching on agile aerial vehicles. In this work, owing to the existence of a backup positioning module, the positioning system allows seamless switching between different sensor modalities according to the conditions of the robotic tracking system.

This work deals with the specific problem of positioning a mobile robot for a fixed-base robotic manipulator. In contrast to other multi-sensor fusion positioning methods, this research designs a new two-stage positioning strategy that combines two different positioning methods, an EKF-based approach fusing ultrasonic sensors with an IMU and a vision-based approach with an ArUco marker, thereby incorporating both accuracy and robustness. When the vision sensor is available, fusing the ArUco marker pose with data from other sensors such as the IMU and ultrasonic sensors would undermine positioning accuracy and precision, given the superb positioning performance of the marker. Therefore, the ArUco marker is taken as the principal positioning method and used individually, without integration with other sensors: when available, the pose extracted from the marker is used directly for positioning. However, relying solely on the visual sensor is not robust, as sudden “blind spots” or distorted captured images may cause detection to fail. A second method, fusing the IMU and ultrasonic sensors via an EKF, is therefore adopted as the auxiliary positioning method. Thanks to this backup positioning approach, seamless switching is enabled, so the fixed-base robotic manipulator can robustly reposition the mobile robot.

3 Interaction of Co-MRMS

3.1 Interaction design of Co-MRMS

The objective of designing the interaction mode for the Co-MRMS is to enhance the flexibility and efficiency of part feeding for the fixed-base robotic manipulator by the mobile robot. Given that obtaining the accurate position of the object is the primary target, the interaction mode is simplified by not considering orientation information. Assume that the state vector is \(X = \begin{bmatrix} x&y&z \end{bmatrix}^{T}\). The states of the robotic manipulator and the mobile robot are represented as \(X_{R}\) and \(X_{M}\), respectively. The relative distance between the robotic manipulator and the mobile robot, denoted by \(d_{rel(t)}\), is

$$\begin{aligned} \begin{aligned} d_{rel(t)} = \sqrt{(x_{R}-x_{M})^{2}+(y_{R}-y_{M})^{2}+(z_{R}-z_{M})^{2}} \end{aligned} \end{aligned}$$
(1)

The interaction mode between the mobile robot and the fixed-base robotic manipulator in the Co-MRMS has been designed; its framework is shown in Fig. 1. The relative distance \(d_{rel(t)}\) is calculated, and the interaction between the two robots is divided into three stages: if the mobile robot is out of the scope of the fixed-base robotic manipulator, the manipulator remains static; if the relative distance is smaller than the value set for safety and object identification, both robots stop and wait for subsequent operations such as object detection and material grasping; if the mobile robot is within the scope of the manipulator and the relative distance is larger than the set value, the manipulator moves quickly towards the mobile robot.
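The three-stage logic above can be sketched as a simple classifier over the relative distance of Eq. (1). The threshold names `reach` (workspace radius) and `d_safe` (safety/identification distance) are illustrative and not taken from the paper:

```python
import math

def relative_distance(X_R, X_M):
    # Eq. (1): Euclidean distance between the manipulator state and the mobile robot state
    return math.sqrt(sum((r - m) ** 2 for r, m in zip(X_R, X_M)))

def interaction_stage(d_rel, reach, d_safe):
    """Classify the interaction stage from the relative distance d_rel."""
    if d_rel > reach:
        return "idle"      # mobile robot out of scope: the manipulator stays static
    if d_rel < d_safe:
        return "stop"      # both robots stop for object detection and grasping
    return "approach"      # in scope and far enough: move towards the mobile robot
```

For example, with an illustrative reach of 5 m and safety distance of 1 m, a robot 3 m away triggers the "approach" stage.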

Fig. 1
figure 1

Interaction design of the Co-MRMS

The interaction mode was validated in a simulated environment; the results are shown in Figs. 2 and 3. For convenience of validation, the mobile robot is simplified as a moving point and the fixed-base robotic manipulator as a two-link robotic arm. As shown in the figures, the moving point travels at a constant speed in the vertical and horizontal directions, respectively. The interaction between the fixed-base robotic manipulator and the moving point is adjusted according to the relative distance between the two agents: the manipulator remains static when the moving point is out of scope; when the moving point is within scope and the relative distance is larger than the set value, the manipulator promptly moves towards it; and when the relative distance is smaller than the set value, both robots stop. The feasibility of the designed interaction mode is thereby validated.

Fig. 2
figure 2

Demonstration of the designed interaction with vertically moving point

Fig. 3
figure 3

Demonstration of the designed interaction with horizontally moving point

So far, the interaction mode of the integrated robotic system has been designed and validated. The movement of the fixed-base robotic manipulator depends on the relative distance between it and the mobile robot. To obtain this distance, the industrial robotic manipulator must continually perceive the positions of the mobile robot. Therefore, positioning the mobile robot accurately and robustly for the manipulator in the Co-MRMS is a critical technique and the focus of this research.

3.2 Coordinate frame transformation

Fig. 4
figure 4

Schematic diagram of the system and coordinate frames

As illustrated in Fig. 4, five sets of coordinate frames are defined in the whole system for the coordinate transformations that derive the prediction and measurement equations for positioning. Four stationary ultrasonic sensors are arranged around the workspace and form a positioning system frame \((X_{u},Y_{u},Z_{u})\). The base coordinate frame \((X_{b},Y_{b},Z_{b})\) is defined at the base axis, while the end-effector coordinate frame \((X_{e},Y_{e},Z_{e})\) is located at the end-effector of the robotic manipulator. The mobile robot coordinate frame \((X_{m},Y_{m},Z_{m})\) has its origin at the centre point between the two wheels of the mobile robot. Correspondingly, the raw IMU readings are expressed in the mobile robot coordinate frame and aligned with the direction of the robot's motion. To unify the coordinate frames for EKF-based positioning, the accelerations from the IMU's accelerometer are transformed into the ultrasonic sensor coordinate frame. Since the end-effector of the fixed robotic manipulator moves along with the mobile robot, the transformation from the ultrasonic sensor coordinate frame to the end-effector coordinate frame must be calculated; it can be expressed as

$$\begin{aligned} \begin{aligned} X^{E} = f(\textbf{q})T_{UB}X^{U} \end{aligned} \end{aligned}$$
(2)

where \(T_{UB}\) denotes the transformation matrix from the ultrasonic sensor frame to the base coordinate frame and \(\textbf{q}\) is the vector of joint angles. The joint angles can be obtained by inverse kinematics, as the end-effector target position is known. Based on the dynamic model of the fixed robotic manipulator, the corresponding torque is calculated and applied to each joint, so that the end-effector can handle objects.
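The chained transformation of Eq. (2) can be sketched with 4x4 homogeneous matrices. Representing \(f(\textbf{q})\) by a base-to-end-effector matrix `T_BE` from forward kinematics is an assumption of this sketch; the paper does not fix a matrix representation:

```python
import numpy as np

def translation(t):
    # helper: 4x4 homogeneous transform for a pure translation t = (tx, ty, tz)
    T = np.eye(4)
    T[:3, 3] = t
    return T

def to_end_effector(X_u, T_BE, T_UB):
    """Eq. (2): X^E = f(q) T_UB X^U, with f(q) represented here by the
    base-to-end-effector transform T_BE evaluated at the joint angles q."""
    p = np.append(np.asarray(X_u, float), 1.0)   # homogeneous coordinates
    return (T_BE @ T_UB @ p)[:3]
```

With rotation-free placeholder transforms, a point at (1, 2, 3) in the ultrasonic frame maps through `T_UB` and `T_BE` by composing the two translations.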

For the visual positioning approach, an eye-in-hand camera is fixed on the end-effector of the robotic manipulator, to which the camera coordinate frame \((X_{c},Y_{c},Z_{c})\) is attached. The ArUco marker coordinate frame \((X_{a},Y_{a},Z_{a})\) is attached to the ArUco marker. The marker positions obtained from image processing are relative to the eye-in-hand camera coordinate frame and must be transformed into the end-effector coordinate frame to drive the movement. The transformation of the marker position from the camera coordinate frame to the end-effector coordinate frame can be expressed as

$$\begin{aligned} \begin{aligned} X^{E} = T_{CE}X^{C} \end{aligned} \end{aligned}$$
(3)

where \(T_{CE}\) denotes the transformation matrix from the eye-in-hand camera frame to the end-effector coordinate frame.

4 Multi-sensor positioning approaches

In this work, information fusion has two aspects: the fusion of IMU and ultrasonic sensor data via the EKF algorithm for estimating the position of the mobile robot, and the fusion of two different positioning methods for repositioning. This section first discusses the two positioning methods deployed for localising a mobile robot in an indoor environment: the EKF-based approach fusing the IMU with the ultrasonic positioning system, and the vision-based approach using a camera with an ArUco marker. Each positioning technique has both capabilities and limitations. The camera-and-ArUco-marker method has high accuracy and precision but is not robust, since sudden “blind spots” or distorted captured images cause failures, while the EKF-based fusion of ultrasonic sensor and IMU data is robust to occlusion and viewing conditions but is not as accurate or precise as the visual method. The seamless switching strategy for repositioning the mobile robot is then introduced to deal with the case in which the camera becomes unavailable.
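The switching logic between the two methods reduces to a simple fallback rule. In this sketch, `None` stands for a failed marker detection, which is an assumed convention rather than one stated in the paper:

```python
def select_position(aruco_pose, ekf_estimate):
    """Two-stage switching sketch: the ArUco pose is the principal source when
    the marker is detected; otherwise the auxiliary EKF (IMU + ultrasonic)
    estimate is used, so the output never drops out."""
    return aruco_pose if aruco_pose is not None else ekf_estimate
```

Because the auxiliary estimate is always available, the consumer of `select_position` sees a continuous position stream, which is the "seamless" property the strategy targets.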

4.1 EKF-based state estimation

As a classic sensor fusion approach for nonlinear systems, the EKF is employed to estimate the positions of the mobile robot. The position prediction is obtained from IMU readings, and the correction comes from the position data of the ultrasonic sensor. The EKF algorithm exploits the specific strengths of each sensor while overcoming the limitations of standalone sensors, yielding better performance than any individual sensor. The general forms of the prediction and correction models in the discrete-time domain are, respectively:

$$\begin{aligned} X_{i+1} = f(X_{i},u_{i+1})+\varepsilon _{i+1}. \end{aligned}$$
(4)
$$\begin{aligned} Z_{i+1} = h(X_{i+1})+ \upsilon _{i+1}. \end{aligned}$$
(5)

where X is the state vector of positions and velocities in the X-Y plane, i.e. \(\begin{bmatrix} x&y&v_{x}&v_{y} \end{bmatrix}^{T}\). \(\varepsilon _{i+1}\) and \(\upsilon _{i+1}\) represent the system noise and measurement noise, respectively. Both noises are modelled as zero-mean Gaussian distributions with the associated covariance matrices:

$$\begin{aligned} \begin{aligned} \varepsilon _{i+1}\sim N(0,Q_{i+1}). \\ \upsilon _{i+1}\sim N(0,R_{i+1}). \end{aligned} \end{aligned}$$
(6)

The covariance matrices used in the EKF method are the input-noise covariance matrix Q and the output-noise covariance matrix R. The input-noise covariance stems from the IMU noise, while the output-noise covariance stems from the ultrasonic sensor noise. Q and R are defined as:

$$\begin{aligned} Q=\begin{bmatrix} \sigma _{ax} ^{2} & 0\\ 0 & \sigma _{ay} ^{2} \end{bmatrix} \end{aligned}$$
(7)
$$\begin{aligned} R=\begin{bmatrix} \sigma _{ux} ^{2} & 0\\ 0 & \sigma _{uy} ^{2} \end{bmatrix} \end{aligned}$$
(8)

\(\sigma _{ax} ^{2}\) and \(\sigma _{ay} ^{2}\) represent the variances of the accelerations obtained from the IMU in the x and y directions, respectively. \(\sigma _{ux} ^{2}\) and \(\sigma _{uy} ^{2}\) represent the variances of the positions obtained from the ultrasonic sensor in the x and y directions, respectively. These variances vary under different experimental conditions; thus, data from the IMU and ultrasonic sensor are collected before the experiment to calculate them.
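The pre-experiment calibration step can be sketched as follows: sample variances of the logged readings fill the diagonals of Q (from acceleration logs) and R (from ultrasonic position logs), matching Eqs. (7)-(8). The function name and log format are assumptions of this sketch:

```python
import numpy as np

def noise_covariance(samples_x, samples_y):
    """Build a diagonal covariance matrix (as in Eqs. 7-8) from sensor logs
    recorded before the experiment, one list of samples per axis."""
    return np.diag([np.var(samples_x), np.var(samples_y)])
```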

The control vector u corresponds to the mobile robot's accelerations along the x and y directions and is represented by \(\begin{bmatrix} a_{x}&a_{y}\end{bmatrix}^{T}\). The following equation expresses the acceleration along the x or y axis:

$$\begin{aligned} \widetilde{a}= a +b_{a} +\varepsilon _{a} \end{aligned}$$
(9)

where \(\widetilde{a}\) and a are the measured and true accelerations, respectively, \(b_{a}\) represents the bias and \(\varepsilon _{a}\) is the measurement noise. Practical measurements show that the IMU bias changes only slightly within an hour, so it can be assumed constant during short-term movement. Moreover, the bias can be estimated by recording readings while the IMU is stationary, then subtracting those values as offsets from subsequent acceleration readings.
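The bias calibration described above can be sketched in two lines. Averaging stationary readings to estimate \(b_{a}\) is the procedure the paper describes; the function names are illustrative:

```python
def estimate_bias(stationary_readings):
    # the mean of readings taken while the IMU is at rest approximates
    # the constant bias b_a (the zero-mean noise averages out)
    return sum(stationary_readings) / len(stationary_readings)

def correct_acceleration(raw, bias):
    # Eq. (9) rearranged: a ≈ ã - b_a (only the zero-mean noise term remains)
    return raw - bias
```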

Let \(\Delta T\) denote the sampling time interval. The state prediction of the mobile robot in this work is defined by the following kinematic equation:

$$\begin{aligned} X_{i+1} = f(X_{i}, u_{i+1})+\varepsilon _{i+1} = \begin{bmatrix} x_{i}+v_{x(i)}\Delta T+\frac{1}{2}a_{x(i+1)}\Delta T^{2}\\ y_{i}+v_{y(i)}\Delta T+\frac{1}{2}a_{y(i+1)}\Delta T^{2}\\ v_{x(i)}+a_{x(i+1)}\Delta T\\ v_{y(i)}+a_{y(i+1)}\Delta T\\ \end{bmatrix} +\varepsilon _{i+1} \end{aligned}$$
(10)
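The noise-free part of the constant-acceleration motion model of Eq. (10) translates directly into code:

```python
def predict_state(x, y, vx, vy, ax, ay, dt):
    """One prediction step of Eq. (10) without the noise term: positions advance
    by velocity plus half-acceleration terms, velocities by acceleration * dt."""
    return (x + vx * dt + 0.5 * ax * dt**2,
            y + vy * dt + 0.5 * ay * dt**2,
            vx + ax * dt,
            vy + ay * dt)
```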

4.1.1 Measurement modelling

The ultrasonic positioning system used in this work consists of four stationary beacons, a mobile beacon, a router and the dashboard beacon software. Each beacon has five transceivers. The distance between the mobile beacon affixed to the mobile robot and each stationary beacon is calculated by the router from the received ultrasonic signals. The position of the mobile beacon can then be calculated using the equation below:

$$\begin{aligned} \begin{aligned} p_{i+1} = \sqrt{(x_{i+1}-x_{s})^{2}+(y_{i+1}-y_{s})^{2}+(z_{i+1}-z_{s})^{2}} \end{aligned} \end{aligned}$$
(11)

where (\(x_{i+1}\), \(y_{i+1}\), \(z_{i+1}\)) represents the position of the mobile beacon at time i+1, and (\(x_{s}\), \(y_{s}\), \(z_{s}\)) represents the coordinates of a stationary beacon. With the position readings from the ultrasonic positioning system, the correction model of the EKF is given by:

$$\begin{aligned} Z_{i+1} = h(X_{i+1})+ \upsilon _{i+1} = \begin{bmatrix} x_{i+1}\\ y_{i+1}\\ \end{bmatrix}+ \upsilon _{i+1} \end{aligned}$$
(12)
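A position fix can be recovered from the range equations of Eq. (11) by linearising them: subtracting the squared range equation of the first beacon from each of the others yields a linear system in the unknown position. This solver is purely illustrative, as the commercial router computes its own fix internally:

```python
import numpy as np

def trilaterate(beacons, ranges):
    """Linearised least-squares position estimate from the beacon ranges of
    Eq. (11). For each beacon i > 0: 2 (b_i - b_0) . p = r_0^2 - r_i^2 + |b_i|^2 - |b_0|^2."""
    b0 = np.asarray(beacons[0], float)
    A, b = [], []
    for bi, ri in zip(beacons[1:], ranges[1:]):
        bi = np.asarray(bi, float)
        A.append(2.0 * (bi - b0))
        b.append(ranges[0] ** 2 - ri ** 2 + bi @ bi - b0 @ b0)
    sol, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
    return sol
```

With four beacons (as in this work) the linearised system has three equations in three unknowns, so the least-squares solution is exact for noise-free ranges.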

4.1.2 EKF process

The nonlinear system in state-space form has now been obtained. Given the prediction and correction models, the EKF procedure operates as follows:

  • Initialisation with state \(X_{0}\) and covariance matrix \(P_{0}\)

  • State prediction by motion model:

    $$\begin{aligned} \hat{X}_{i+1\mid i}= f(\hat{X}_{i\mid i},u_{i+1})+\varepsilon _{i+1} \end{aligned}$$
    (13)

    where \(\hat{X}\) represents the estimate of the state vector X. Covariance matrix prediction:

    $$\begin{aligned} P_{i+1\mid i} = F_{i}P_{i\mid i}F^{T}_{i}+Q_{i} \end{aligned}$$
    (14)

    where \(F_{i}\) is the Jacobian matrix of prediction model and can be written as:

    $$\begin{aligned} F_{i}= \frac{\partial f}{\partial X}\mid _{\hat{X}_{i\mid i},u_{i+1}} \end{aligned}$$
    (15)

    Outlier rejection: Since the ultrasonic signal may be delayed or reflected by obstacles in real scenarios, stale readings or erroneous values with drastic changes are occasionally obtained from the ultrasonic sensors. Thus, a straightforward strategy to reject outliers in the ultrasonic sensor data is proposed. In each iteration, the difference between the ultrasonic readings at times i+1 and i is calculated and compared with a constant positive threshold. If the following condition is satisfied, \(Z_{i+1}\) is considered an outlier:

    $$\begin{aligned} \left| Z_{i+1} -Z_{i} \right| \ge \rho \times v\times \Delta T \end{aligned}$$
    (16)

    where v is the driving speed of the mobile robot, \(\Delta T\) is the sampling time interval, and \(\rho\) is the adjustment factor. Likewise, if the difference between \(Z_{i+1}\) and \(Z_{i}\) equals 0, the reading is taken as not updated, and the measurement at time i+1 is discarded. Otherwise, the ultrasonic data are adopted in the correction step of the EKF algorithm, provided they satisfy:

    $$\begin{aligned} 0<\left| Z_{i+1} -Z_{i} \right| < \rho \times v\times \Delta T \end{aligned}$$
    (17)

    With this EKF-based estimation and outlier rejection method, the effect of erroneous ultrasonic readings is effectively reduced, and only the validated measurement data are used in the correction model.
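    The gate of Eqs. (16)-(17) amounts to a one-line acceptance test per reading:

    ```python
    def accept_reading(z_new, z_prev, rho, v, dt):
        """Eqs. (16)-(17): accept an ultrasonic reading only if
        0 < |Z_{i+1} - Z_i| < rho * v * dT. A zero difference indicates a
        stale (non-updated) reading; a larger jump indicates an outlier."""
        delta = abs(z_new - z_prev)
        return 0.0 < delta < rho * v * dt
    ```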

  • Measurement data update:

    $$\begin{aligned} Z_{i+1} = h(X_{i+1})+\upsilon _{i+1} \end{aligned}$$
    (18)
  • Calculation of Kalman gain:

    $$\begin{aligned} K_{i+1}= P_{i+1\mid i}\times H^{T}_{i+1}\times (H_{i+1}\times P_{i+1\mid i}H^{T}_{i+1}+R_{i+1})^{-1} \end{aligned}$$
    (19)

    where \(H_{i}\) is the Jacobian matrix of measurement model and can be written as:

    $$\begin{aligned} H_{i+1}= \frac{\partial h}{\partial X}\mid _{\hat{X}_{i+1\mid i},u_{i+1}} \end{aligned}$$
    (20)
  • State update:

    $$\begin{aligned} \hat{X}_{i+1\mid i+1}= \hat{X}_{i+1\mid i}+K_{i+1}(Z_{i+1}-\hat{Z}_{i+1\mid i}) \end{aligned}$$
    (21)

    where \(\hat{Z}_{i+1\mid i}\) is the estimated measurement and can be written as:

    $$\begin{aligned} \hat{Z}_{i+1\mid i} = h(\hat{X}_{i+1\mid i}) \end{aligned}$$
    (22)
  • Covariance matrix update:

    $$\begin{aligned} P_{i+1\mid i+1} = (I - K_{i+1}H_{i+1})P_{i+1\mid i} \end{aligned}$$
    (23)
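The steps above can be condensed into a single predict/correct iteration. Because the motion and measurement models used here are linear, the Jacobians F and H of Eqs. (15) and (20) are constant. Mapping the 2x2 acceleration-noise matrix Q into the state space via a control matrix B is a modelling choice of this sketch, not a detail stated in the paper:

```python
import numpy as np

def ekf_step(x, P, a, dt, Q, R, z=None):
    """One EKF iteration for the model of Eqs. (10)-(23). State x = [px, py, vx, vy],
    control a = [ax, ay] from the IMU, measurement z = [px, py] from the ultrasonic
    system; pass z=None to run the prediction only (no valid measurement)."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], float)
    B = np.array([[0.5 * dt**2, 0],
                  [0, 0.5 * dt**2],
                  [dt, 0],
                  [0, dt]], float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], float)
    # prediction step, Eqs. (13)-(14)
    x = F @ x + B @ a
    P = F @ P @ F.T + B @ Q @ B.T
    if z is None:
        return x, P
    # correction step, Eqs. (19), (21), (23)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P
```

The `z=None` branch matches the asynchronous fusion used in this work: predictions run at every IMU sample, while corrections run only when a validated ultrasonic reading is available.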

Afterwards, by setting the estimated state as the target position \(X^{E}\) defined in Sect. 3.2, robotic interaction based on the EKF fusion method can be realised:

$$\begin{aligned} \hat{X}_{i+1\mid i+1}= X^{E} \end{aligned}$$
(24)
Fig. 5: EKF-based algorithm flow chart

The flow chart of the EKF-based algorithm with the outlier rejection method is shown in Fig. 5. It should be noted that the update frequency of the IMU is around 200 Hz while that of the ultrasonic sensor is around 3.65 Hz, which causes data unsynchronisation during fusion. To solve this issue, the EKF-based fusion strategy is employed as follows: the predicted state and covariance matrix obtained from the motion model are corrected whenever new measurement data is received; otherwise, the position of the mobile robot is estimated with the motion model alone. Accordingly, the prediction time interval is determined by the update frequency of the acceleration. Since the IMU updates at around 200 Hz, the resulting 0.005s time interval implies that adopting the uniform acceleration equation has little effect on the accuracy of the state prediction of the mobile robot.
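The fusion loop above can be sketched in a simplified 1-D form: the IMU drives the high-rate prediction step with the uniform-acceleration model, and an ultrasonic reading triggers the correction step only when it passes the outlier-rejection gate of Eqs. (16)-(17). This is an illustrative sketch, not the authors' implementation: the state is reduced to one axis, the measurement model is linear (so the EKF update reduces to the standard Kalman form), and the noise levels and \(\rho\) are assumed values.

```python
# Simplified 1-D sketch of the fusion loop (assumed values, not the
# paper's implementation): the IMU drives a high-rate prediction with
# the uniform-acceleration model, while a low-rate ultrasonic reading
# corrects the estimate only if it passes the outlier-rejection gate.

RHO = 3.0            # adjustment factor rho (assumed)
DT_IMU = 0.005       # IMU sampling interval, ~200 Hz
DT_US = 1.0 / 3.65   # ultrasonic update interval, ~3.65 Hz

def predict(x, v, P, a, q=1e-4):
    """Prediction step: propagate [position, velocity] and covariance."""
    x = x + v * DT_IMU + 0.5 * a * DT_IMU ** 2
    v = v + a * DT_IMU
    p00, p01, p10, p11 = P           # 2x2 covariance, row-major
    # P <- F P F^T + Q  with  F = [[1, DT], [0, 1]]
    p00 = p00 + DT_IMU * (p01 + p10) + DT_IMU ** 2 * p11 + q
    p01 = p01 + DT_IMU * p11
    p10 = p10 + DT_IMU * p11
    p11 = p11 + q
    return x, v, (p00, p01, p10, p11)

def accept(z_new, z_prev, v):
    """Outlier gate, Eqs. (16)-(17): keep 0 < |dZ| < rho * v * dT."""
    return 0.0 < abs(z_new - z_prev) < RHO * abs(v) * DT_US

def correct(x, v, P, z, r=1e-3):
    """Correction step with h(X) = position, hence H = [1, 0]."""
    p00, p01, p10, p11 = P
    k0 = p00 / (p00 + r)             # Kalman gain, position component
    k1 = p10 / (p00 + r)             # Kalman gain, velocity component
    innov = z - x                    # innovation Z - h(X_predicted)
    x, v = x + k0 * innov, v + k1 * innov
    P = ((1 - k0) * p00, (1 - k0) * p01,
         p10 - k1 * p00, p11 - k1 * p01)
    return x, v, P
```

In each IMU cycle `predict` is called; whenever a fresh ultrasonic reading arrives and `accept` holds, `correct` is applied, otherwise the motion model runs alone, mirroring the flow of Fig. 5.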

4.2 Vision system based estimation

In addition to the EKF-based algorithm fusing ultrasonic sensors and the IMU, another positioning approach is utilised in this work: an auxiliary camera detecting an ArUco marker. The marker is generated with the OpenCV library and affixed on top of the mobile robot. A standard ArUco marker is defined by a 7 × 7 square grid, where an inner 5 × 5 matrix encodes data and error-detection bits in each row, forming a binary pattern. Each ArUco marker is identified by its unique encoded pattern, which is robust with a low failure rate. Additionally, the marker orientation can be determined from the layout of the four corners. Image distortion occurs commonly in computer vision applications and affects measurement accuracy. To correct it, camera calibration is conducted with OpenCV to determine the intrinsic and extrinsic parameters of the camera. Intrinsic parameters correspond to the internal characteristics of the camera itself, such as the focal length and optical centre, while extrinsic parameters describe the coordinate transformation between the camera frame and the world frame. Once the frame of the marker is identified, multiple computational steps are performed to obtain the relative pose between the camera and the marker; the process is described in Fig. 6.
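As a small illustration of how the intrinsic parameters enter the pipeline, the pinhole model below projects a camera-frame 3-D point to pixel coordinates; the focal lengths and optical centre are placeholder values, not the calibrated parameters of the camera used in this work.

```python
# Pinhole projection using intrinsic parameters (focal lengths fx, fy
# and optical centre cx, cy).  The default values are placeholders for
# illustration, not the calibrated parameters of the actual camera.

def project(point, fx=800.0, fy=800.0, cx=320.0, cy=240.0):
    """Project a camera-frame 3-D point (X, Y, Z) to pixel (u, v)."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera")
    return fx * X / Z + cx, fy * Y / Z + cy
```

A point on the optical axis, e.g. (0, 0, 1), maps to the optical centre; lens distortion is assumed to have already been removed by the calibration step described above.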

Fig. 6: Marker detection process

The original image is first converted to a grey-scale image, and image binarisation is then fulfilled through thresholding. Contours in the image are detected by means of a Canny edge detector (Canny 1986) and extracted with the Suzuki algorithm (Suzuki 1985), after which shapes other than squares, as well as closely adjacent shapes, are discarded. Each remaining candidate marker is analysed by segmenting the marker image into cells and examining the black and white pixels. With Otsu's method (Otsu 1979), a binarised image is generated again by setting a threshold value; the pixels of both colours are counted in each cell, and a grid map is obtained from the average binary value of each cell. To enhance the accuracy of ArUco marker detection, the corners of the marker are refined with sub-pixel interpolation using the cornerSubPix function of the OpenCV library. Afterwards, the pose of the camera is estimated with high accuracy using the Levenberg-Marquardt optimisation algorithm (Marquardt 1963).
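The cell-segmentation and majority-vote step of this pipeline can be sketched as follows. Assuming an already binarised, axis-aligned marker image, each of the 7 × 7 cells is reduced to a single bit by counting its black and white pixels, and the black border is then stripped to leave the inner 5 × 5 data matrix; the grid size and the toy image layout are illustrative assumptions.

```python
# Sketch of the cell-counting step: reduce a binarised marker image to
# a 7 x 7 bit grid by majority vote over each cell's pixels, then strip
# the black border to keep the inner 5 x 5 data matrix.  Grid size and
# image layout are assumptions for illustration.

def cells_to_bits(image, grid=7):
    """Majority-vote each cell of a square 0/1 image into one bit."""
    step = len(image) // grid
    bits = []
    for r in range(grid):
        row = []
        for c in range(grid):
            # count white (1) pixels inside this cell
            ones = sum(image[y][x]
                       for y in range(r * step, (r + 1) * step)
                       for x in range(c * step, (c + 1) * step))
            row.append(1 if 2 * ones > step * step else 0)
        bits.append(row)
    return bits

def inner_bits(bits):
    """Drop the border cells, keeping the inner data matrix."""
    return [row[1:-1] for row in bits[1:-1]]
```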

4.3 Seamless switching repositioning strategy

The design of the proposed localising system enables the robotic manipulator to estimate the auxiliary state of a mobile robot with additional sensors, which allows the seamless switching between different sensor suites. On one hand, including additional sensors benefits the enhancement of estimating the robot state. On the other hand, the elements of the different sensor suites are capable of seamless switching, which improves the fail-safety and versatility of the positioning system. In this work, the measurement from vision acts as the core state while the estimation by EKF-based algorithm involved with IMU and ultrasonic sensors is taken as the auxiliary state. As shown in Fig. 7, a seamless switching repositioning strategy with two main stages is proposed for the positioning system.

Fig. 7: Two-stage seamless switching strategy for repositioning

The first stage processes vision data to estimate the positions, while the second stage fuses the IMU and ultrasonic sensors by the EKF with the outlier rejection method for seamless switching. As one of the most popular fiducial markers, the ArUco marker presents high accuracy and speed in pose tracking (Kam et al. 2018). Therefore, the common case is the change from vision-based position estimation to EKF-based positioning when vision data is unavailable. At each time step, a check is performed to determine whether the ArUco marker is detected. If the position outputs from vision are zero, the vision system is labelled as disabled and the positioning system seamlessly switches to the backup positioning stage, i.e. the EKF-based positioning method fusing IMU and ultrasonic positioning data. As a result of the seamless switching repositioning strategy, the positioning of the integrated robotic system is robust against an occluded visual sensor and can achieve reliable and accurate positioning performance.
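The switching rule itself is a simple check, sketched below under the convention stated above that an all-zero vision output means the marker was not detected; the function name and pose tuples are hypothetical, used only for illustration.

```python
# Hypothetical sketch of the switching rule: prefer the vision pose
# when the marker is detected; an all-zero (or missing) vision output
# marks the vision system as disabled and falls back to the EKF state.

def select_position(vision_pos, ekf_pos):
    """Return (source, position) according to the two-stage strategy."""
    detected = vision_pos is not None and any(vision_pos)
    if detected:
        return "vision", vision_pos
    return "ekf", ekf_pos
```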

5 Evaluations

5.1 System hardware and experimental setup

Extensive experiments including static and dynamic state estimation were performed to evaluate the performance of adopted positioning approaches and the proposed switching strategy for repositioning. The overall experimental flowchart is given in Fig. 8.

Fig. 8: Overall experiment flowchart

The hardware system design for the experimental implementation is given in Fig. 9, where a TurtleBot3 Burger and a UR10e were used as the mobile robot and robotic manipulator respectively. The four stationary ultrasonic beacons and the mobile beacon affixed on the mobile robot were supplied by the Starter Set HW v4.9 of Marvelmind. As a high-performance optical motion capture system, an OptiTrack Trio camera was placed on the flat terrain to determine the real-time positions of the mobile robot, and its measurement was taken as the ground truth. A low-cost monocular camera was mounted on the end-effector of the fixed robotic manipulator to identify the positions of the ArUco marker. In this work, the large 1.6m × 1.6m experimental table was set up based upon two considerations. On one hand, the beacons of the ultrasonic positioning system can be conveniently arranged at a certain distance from each other to ensure positioning accuracy. On the other hand, sufficient space is required for mobile robot motion. Robot control and computations were processed on a laptop with an Intel i7-8750H and 8 GB RAM.

Fig. 9: Experiment layout

5.2 Static state estimation experiment

The static state estimation experiment aims to compare the EKF-based fusion approach and the stand-alone ultrasonic sensor system in terms of positioning accuracy, precision, and update frequency. Firstly, the four stationary ultrasonic beacons were installed at the four corners of the platform to ensure that the mobile robot remains within the effective coverage area of the ultrasonic positioning system. The IMU was placed statically on the table and 10,000 acceleration samples were collected; since the ground truth of acceleration is zero, the bias and noise of the IMU data were obtained by calculating the average and standard deviation of the collected data, which were used to determine Q for the EKF algorithm. The bias and noise of the ultrasonic sensor were determined by first recording readings before the experiment, after which the bias was applied as an offset to subsequent readings; the variance of the data was used to obtain R for the EKF algorithm. Then the mobile robot was placed successively at four predefined positions and the positioning data from the two approaches were acquired. Next, the root-mean-square errors (RMSE) of the positions were calculated and quantified by the cumulative distribution function (CDF). Fig. 10 depicts the comparison of the position error CDF between the EKF-based approach and the stand-alone ultrasonic sensor system at the four defined points. The median (p50) and 95th percentile (p95) errors and the standard deviation (STD) are summarised in Table 1.
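The noise-calibration step described above amounts to taking the mean of the static recordings as the sensor bias and their variance as the noise level, which then seed Q (from the IMU) and R (from the ultrasonic sensor); a minimal sketch, with illustrative numbers in the usage note:

```python
# Sketch of the calibration arithmetic: the mean of static readings is
# the sensor bias and their variance the noise level, seeding Q (IMU)
# and R (ultrasonic) in the EKF.  Any concrete numbers are illustrative.

def bias_and_variance(samples):
    """Bias (mean) and population variance of static sensor readings."""
    n = len(samples)
    bias = sum(samples) / n
    var = sum((s - bias) ** 2 for s in samples) / n
    return bias, var
```

For instance, `bias_and_variance([1.0, 3.0])` yields a bias of 2.0 and a variance of 1.0; subsequent readings would then have the bias subtracted as an offset, as described above.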
As expected, the ultrasonic sensor exhibits high accuracy in indoor positioning, with 95th percentile errors of less than 0.04m. It can be observed that the median positioning errors of both the EKF-based approach and the ultrasonic sensor system are below 0.04m with little difference, which suggests that the EKF-based approach attains accuracy comparable to the ultrasonic method. Furthermore, both approaches achieve high precision, as reflected by the millimetre-level STD. Additionally, practical measurement shows that the positions were updated by the ultrasonic sensors at 3.65 Hz, whereas the EKF-based approach can reach 181.9 Hz. Thus, the EKF-based positioning approach fusing the IMU with ultrasonic sensors delivers high accuracy and precision while maintaining a high update frequency in static positioning.
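Statistics of the kind reported in Table 1 can be reproduced from a list of positioning errors as sketched below; nearest-rank percentiles and the population standard deviation are one reasonable convention, assumed here rather than taken from the paper.

```python
# Reproducing Table-1 style statistics (median, 95th percentile, STD)
# from a list of positioning errors.  Nearest-rank percentiles and the
# population STD are an assumed convention, chosen for simplicity.

def error_stats(errors):
    """Return (p50, p95, std) of a non-empty list of errors."""
    s = sorted(errors)
    n = len(s)
    def pct(p):                      # nearest-rank percentile
        rank = max(1, int(p / 100.0 * n + 0.5))
        return s[min(n, rank) - 1]
    mean = sum(s) / n
    std = (sum((e - mean) ** 2 for e in s) / n) ** 0.5
    return pct(50), pct(95), std
```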

Fig. 10: Positioning error CDF comparison

Table 1: Positioning errors and STD at four predefined test points

To evaluate the convergence rate of the EKF-based method, a simulation was implemented in MATLAB. The initial estimation error was set to 5cm, the update frequency of the IMU to 200 Hz and that of the ultrasonic sensor to 3.65 Hz. The EKF-based positioning process is shown in Fig. 11. As can be seen from the figure, the EKF-based positioning method requires around 0.5 seconds to reach the steady state, demonstrating its fast convergence rate.

Fig. 11: EKF-based positioning process

As described in the EKF process section, a potential issue of an ultrasonic sensor system is that outliers occur randomly during operation, so that a measurement reading may change drastically compared to the previous one, as illustrated in Fig. 12. In this work, such outliers in the ultrasonic sensor measurements are effectively eliminated by the outlier rejection method proposed in Sect. 4.1.2, which is another competitive advantage in positioning.

Fig. 12: Positioning with outliers of ultrasonic sensor system

5.3 Dynamic state estimation experiment

The dynamic experiments were conducted to determine the accuracy and robustness of the proposed positioning system and the seamless switching repositioning strategy while the mobile robot moves. The motion paths of the mobile robot were predefined to ensure that the robotic arm can reach all positions of the mobile robot during the movement. The robotic arm identifies the positions of the mobile robot depending on the state of the sensors, whereby the feasibility of the seamless switching repositioning strategy can be evaluated. Here, the mobile robot was driven along a complex path involving linear motions and a U-turn while the end-effector of the manipulator kept perceiving the positions of the mobile robot. While the mobile agent moved along the path, the (x, y) positions of the mobile beacon were determined by trilateration from the ranges obtained to the four stationary beacons.
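The trilateration step can be sketched by linearising the circle equations: subtracting the first beacon's equation from the others yields a linear system in (x, y). The minimal version below uses three beacons and Cramer's rule; the beacon coordinates in the test are illustrative, and the actual setup with four beacons would give an overdetermined least-squares variant of the same idea.

```python
# Sketch of 2-D trilateration: subtracting the first beacon's circle
# equation from the others yields a linear system in (x, y), solved
# here for three beacons via Cramer's rule.  The real setup uses four
# beacons, which leads to an overdetermined least-squares variant.

def trilaterate(beacons, ranges):
    """beacons: three (x, y) tuples; ranges: distances to each beacon."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = ranges
    # Linearised circle equations: a_i * x + b_i * y = c_i
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1 ** 2 - d2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1 ** 2 - d3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("beacons must not be collinear")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det
```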

The experimental results for the trajectories, x-axis positions and y-axis positions obtained with the different approaches are shown in Figs. 13, 14 and 15 respectively. The ultrasonic sensor, marked as a green line, represents the ultrasonic receiver data for mobile robot positioning; the end-effector, marked as a yellow line, represents the end-effector positions of the robotic manipulator; the EKF-based approach, marked as a purple line, represents the robot positions after fusing the IMU with the ultrasonic receiver data by the EKF; the ground truth, marked as a red line, represents the robot positions obtained from OptiTrack; and the wheel encoder, marked as a blue dashed line, represents the robot positions from the wheel encoder embedded in the mobile robot. The results show that outliers from the ultrasonic positioning system occur randomly while the mobile robot is moving, which leads to dramatic positioning errors. When fusing the ultrasonic receiver data with the IMU by the EKF method, the trajectory is smoothed and the abnormal values are eliminated even when incorrect ultrasonic readings are obtained.

Fig. 13: Comparisons of trajectories

Fig. 14: Comparisons of x-axis position

Fig. 15: Comparisons of y-axis position

Fig. 16: Comparisons of positioning error

Comparing the five trajectories, the vision system shows outstanding stability and accuracy whenever the ArUco marker can be recognised by the camera, while the positioning with the wheel encoder gradually deviates from the ground truth as errors accumulate over time. The RMSE of the different positioning approaches during dynamic motion is shown in Fig. 16. At the end point, the RMSE of the wheel encoder is 0.0608m, which is much larger than the vision system's 0.015m, the ultrasonic sensor system's 0.0095m and the EKF-based approach's 0.0057m. It is observed that the EKF-based approach with the ultrasonic sensor and IMU performs well without accumulated error but exhibits a deviation below 0.05m at the U-turn, which is attributed to the varying transceiver geometry and the differing arrival times of the ultrasonic signals. Nevertheless, the EKF-based approach achieves accurate millimetre-level positioning without outliers or accumulated positioning error.

Moreover, to demonstrate the capability of the seamless switching repositioning strategy, the ArUco marker was covered at two random phases to cause drop-outs of the visual measurements; the active sensor suite is indicated in the figures, where US represents the ultrasonic system. As expected, once a drop-out of the visual update occurs, the end-effector of the robotic manipulator seamlessly switches to the EKF-based state estimation, which shows the robustness and capability of the proposed seamless switching repositioning strategy.

6 Conclusions

In this work, an interaction mode between the mobile robot and the fixed-base robotic manipulator was first designed for the Co-MRMS. Based on the relative distance between the mobile robot and the fixed-base robotic manipulator, the Co-MRMS performs different forms of interaction, and simulation-based experiments were conducted to validate the feasibility of the proposed interaction mode. Then, two different indoor positioning methods were adopted: an EKF-based approach fusing ultrasonic sensors with an IMU, and a vision-based approach with an ArUco marker. In addition, with the outlier rejection strategy, the EKF-based approach can eliminate outliers in the ultrasonic sensor measurements.

Furthermore, a two-stage positioning strategy allows the fixed robotic manipulator to reposition the mobile robot seamlessly in scenarios where the visual sensor is occluded. Through a series of static and dynamic experiments, it is demonstrated that the EKF-based multi-sensor fusion positioning approach achieves millimetre-level accuracy comparable to the ultrasonic sensor system while maintaining a high update frequency. In contrast to wheel encoder positioning, the proposed positioning system suppresses positioning drift over time thanks to the camera and ultrasonic sensors, and the fixed robotic manipulator achieves robust positioning even when the visual sensor fails. In future work, the positioning of the mobile robot before it arrives near the robotic manipulator will be addressed, and the complete positioning system will be validated for non-stop part feeding. Future work will also compare the performance of the two positioning methods in different cases, especially when the robot exhibits fast dynamic movement. In conclusion, by adopting the new two-stage multi-sensor fusion positioning method for the Co-MRMS, the fixed-base robotic manipulator can obtain the positions of the mobile robot accurately and robustly.