1 Introduction

Positioning is essential for various activities, such as navigation, asset tracking, and emergency services (El-Sheimy and Li, 2021). As people spend a significant portion of their time indoors, accurate and reliable indoor positioning becomes increasingly important for various applications (Zhou et al., 2021).

The methods for indoor pedestrian positioning can currently be classified into passive techniques, which utilize existing information such as WiFi signals, RFID tags, and sensors, and active techniques such as ultrasonic waves, Bluetooth Low Energy, and Visible Light Communication (Brena et al., 2017; Lluvia et al., 2021). Positioning based solely on an inertial navigation system can result in significant drift and accumulated errors over time. Researchers often counteract this by fusing the inertial navigation system with methods like radio frequency-based positioning. However, these methods necessitate pre-installation, which not only leads to extra expenses but also renders them inappropriate for dynamic or changing environments (Pan et al., 2022).

In the Internet of Things era, the rise of robots calls for an optimization of traditional indoor positioning systems based on fixed base stations. Robots, now commonplace in settings such as shopping malls, factories, and subterranean spaces, harness their architectural design and a broad range of sensors to accomplish precise localization. Emerging methods of vehicle-mounted sensor-based global localization, such as LiDAR-SLAM, offer an adaptable and pragmatic approach to environmental perception and robot localization in uncharted terrains. These techniques, rooted in the 1980s concept of Simultaneous Localization and Mapping (SLAM), have experienced considerable and rapid development and stand on the brink of a promising future (Leonard and Durrant-Whyte, 1991; Durrant-Whyte and Bailey, 2006).

The precise localization of robots equipped with radio frequency signals can be integrated with the positioning results of Pedestrian Dead Reckoning (PDR). This fusion becomes increasingly feasible with the development of smartphones with integrated multi-sensors. Studies on PDR have addressed step detection, step length estimation, heading estimation, and position estimation using step length and heading information (Zhang et al., 2022). There are many methods for ranging, such as ultrasonic, WiFi RTT, and Bluetooth. Ultra-Wideband (UWB), which originated in the 1960s, stands out for several compelling reasons: high interference resistance, strong penetration capability, low power consumption, and robustness against multipath effects (Kim and Pyun, 2021). For indoor positioning based on the UWB system, the position of a pedestrian is determined by measuring, in real time, the distances between the base stations and tags (Yuan et al., 2021). The combination of PDR and UWB can not only mitigate the drift issues associated with inertial navigation but also address the challenges of pre-installation requirements and occasional weak signals associated with radio frequency positioning.

In a departure from indoor positioning systems that depend on fixed base stations, our method leverages the capabilities of mobile base stations. This not only delivers enhanced precision with the presence of more stations but also ensures consistent stability in areas where fixed stations might be lacking, making it particularly effective in dynamic environments where conventional systems often fall short. Figure 1 shows a diagram of indoor collaborative positioning, in which three robots serve as mobile base stations that one user can exploit for self-localization. In this paper, we contribute to the field of indoor positioning by introducing a novel approach through human-robot collaboration. Our contributions are as follows:

  a) Proposed a method to enhance positioning accuracy through human-robot collaboration when there are few fixed base stations.

  b) Introduced an EKF algorithm that integrates PDR and radio frequency signals such as UWB. This method can effectively reduce the trajectory drift of PDR and the positioning error caused by UWB being blocked.

  c) Conducted experiments in different scenarios, demonstrating that the proposed methodology can improve pedestrians’ positioning accuracy in human-robot collaboration scenarios.

Fig. 1 Application scenario diagram

The structure of this paper is as follows. The related works are briefly reviewed in Section 2. Section 3 introduces the proposed human-robot collaboration. Section 4 provides the experimental results. Section 5 concludes the paper.

2 Related works

2.1 Robot localization

Robot localization advancements, such as LiDAR-SLAM, have been crucial for operating in unknown environments since the 1980s. LiDAR’s high sampling rates and resolution have led to increasing interest (Nguyen et al., 2007). An essential stride was the LOAM framework, which uses consecutive laser data for robot motion estimation and environment mapping (Zhang and Singh, 2014). The emergence of LeGO-LOAM then optimized LOAM with advanced feature extraction and layered optimization techniques for superior perception and back-end efficiency (Shan and Englot, 2018). Further improvements combined LOAM with Lidar-Inertial Odometry to form a robust real-time algorithm suitable for LiDARs with a small FoV and irregular sampling, achieving more accurate and robust localization and mapping (Lin and Zhang, 2020). Additionally, the LeGO-LOAM-SC algorithm integrated LeGO-LOAM with Scan Context technology, enhancing performance in underground coal mine environments (Xue et al., 2022).

2.2 Indoor pedestrian position

As mentioned earlier, indoor pedestrian positioning methods can be classified into passive and active techniques, including UWB, WiFi signals, RFID tags, acoustic signals, BLE, and Visible Light Communication. Over recent years, UWB technology has garnered increasing attention, a trend largely driven by the emergence of affordable UWB chips in the marketplace. UWB has been the subject of extensive research, and its capability to deliver centimeter-level positioning accuracy is well recognized (Penggang et al., 2022). With the advancement of smartphones, Micro-Electro-Mechanical Systems (MEMS) have provided a multitude of powerful sensors for indoor positioning. PDR is a technique that estimates position using gyroscopes and accelerometers (Kamisaka et al., 2011). This method can output relative positions in the short term, but system errors caused by orientation estimation, step detection, and other factors accumulate over time, leading to significant errors. Therefore, PDR is often combined with other technologies to limit error accumulation (Kang and Han, 2014).

2.3 Human-robot collaboration

Traditional research in indoor positioning primarily enhances individual terminal capabilities through methods like multi-source data fusion positioning and integration of dedicated positioning modules. The former involves synthesizing data from various sources, such as Wi-Fi, Bluetooth, ultrasound, or inertial sensors, to improve accuracy and reliability (Fischer et al., 2022). The latter employs specialized hardware or software modules, designed to use technologies like UWB or BLE, to enhance positioning capabilities (Han et al., 2021). Further advancements encompass techniques like employing artificial intelligence and machine learning algorithms for data processing, using real-time kinematic (RTK) systems for high-precision positioning (Zhang et al., 2023), and leveraging unique features of the built environment for positioning via methods like SLAM.

In the era of the Internet of Things, traditional positioning methods are evolving towards more cooperative models. Collaborative Indoor Positioning Systems (CIPS) have transformed indoor positioning by leveraging the collective power of various independent actors sharing sensory data. The cornerstone of CIPS lies in exchanging crucial positioning data from a diverse set of sensors, aiming to optimize the positioning of each participant (Pascacio et al., 2021).

Compared to traditional Indoor Positioning Systems, CIPS expands the coverage area while minimizing reliance on costly and complex infrastructures. This innovation strikes a balance between extended geographical positioning and cost-efficiency. Additionally, the collaborative nature of CIPS and the variety of positioning data from different actors enhance user positioning accuracy and counteract the ambiguities often associated with sub-optimal anchor placements in conventional systems.

The advent and growing adoption of collaborative systems represent a pivotal shift in the indoor positioning landscape. They harness the potential of diverse sensor data and independent actors to deliver improved positioning accuracy and efficiency, pushing the boundaries of what was previously considered possible.

3 Methodology

3.1 Overview

In this section, we propose a methodology for human-robot collaborative indoor localization. First, we use a smartphone’s accelerometer and gyroscope to perform step detection, step length estimation, and heading estimation for PDR. Second, the robot uses LiDAR for positioning, followed by a coordinate system transformation. Then, we combine the distance information measured between the robot and the user via UWB to obtain an additional position estimate. Finally, we fuse the position estimates through the EKF to obtain the final result. Figure 2 shows the system architecture.

Fig. 2 System architecture

3.2 Proposed methodology

3.2.1 Coordinate system transformation

In order to unify the robot’s coordinate system (local coordinate system) with a predefined global coordinate system, we perform a coordinate system transformation. This process typically involves two steps: rotation and translation.

We first determine the rotation angle \(\theta\) between the robot’s coordinate system and the global coordinate system, usually by comparing the directions of the two coordinate systems. Rotating the robot’s coordinate system by this angle aligns the two coordinate systems, eliminating the angle difference between the robot’s x-axis and the global x-axis. The rotation matrix is given by:

$$\begin{aligned} \left[ \begin{array}{cc} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{array}\right] \end{aligned}$$
(1)

Following rotation, we also need to translate the robot’s coordinate system so that it coincides with the origin of the global coordinate system. The offset is usually determined by comparing the positions of the two origins: we translate the robot’s coordinate system by the difference between its origin and the origin of the global coordinate system. The translation vector is given by:

$$\begin{aligned} \left[ \begin{array}{c} t_x \\ t_y \end{array}\right] \end{aligned}$$
(2)

These two steps can be achieved by a transformation matrix, often referred to as a homogeneous transformation matrix. The form of this matrix is as follows:

$$\begin{aligned} \left[ \begin{array}{ccc} \cos \theta & -\sin \theta & t_x \\ \sin \theta & \cos \theta & t_y \\ 0 & 0 & 1 \end{array}\right] \end{aligned}$$
(3)
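Combining the two steps, the homogeneous matrix of Eq. (3) can be applied in a few lines of NumPy. This is a minimal sketch; the function names are ours, not from any particular robotics library:

```python
import numpy as np

def make_transform(theta, tx, ty):
    """Homogeneous matrix combining the rotation (Eq. 1) and translation (Eq. 2)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0,  0,  1]])

def to_global(point_local, theta, tx, ty):
    """Map a 2-D point from the robot's local frame into the global frame."""
    x, y = point_local
    return (make_transform(theta, tx, ty) @ np.array([x, y, 1.0]))[:2]

# A point 1 m ahead of a robot rotated 90 degrees and offset by (2, 3):
print(to_global((1.0, 0.0), np.pi / 2, 2.0, 3.0))  # ~[2., 4.]
```

The homogeneous form lets a chain of frames (e.g. robot odometry followed by the global alignment) be composed by simple matrix multiplication.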

3.2.2 UWB-based indoor localization

For indoor positioning based on the UWB system, the ranging information is obtained through bidirectional ranging wireless communication between the base station and the tag. The indoor positioning studied in this article is 2-dimensional, with the position to be solved denoted \(X=(x_k,y_k)\). The positioning calculation requires 3 or more UWB base stations, and the solution model is given by:

$$\begin{aligned} \left\{ \begin{array}{l} d_1 = \sqrt{(x_k - x_1)^2 + (y_k - y_1)^2 + \Delta h^2} \\ d_2 = \sqrt{(x_k - x_2)^2 + (y_k - y_2)^2 + \Delta h^2} \\ \hspace{10mm}\vdots \\ d_n = \sqrt{(x_k - x_n)^2 + (y_k - y_n)^2 + \Delta h^2} \end{array} \right. \end{aligned}$$
(4)

where \(d_i\) is the distance from base station i to the user tag, \((x_i,y_i)\) is the 2-dimensional coordinate of base station i in the navigation coordinate system, \((x_k,y_k)\) is the 2-dimensional position of the user tag in the navigation coordinate system, and \(\Delta h\) is the height difference between the UWB base station and the user, which is a fixed value in this article, obtained by measurement.

The UWB positioning equation forms the observation equation in the EKF model. Since the original UWB positioning equation is non-linear, it is linearized and transformed into:

$$\begin{aligned} Z_k=H\cdot X_k+V_k \end{aligned}$$
(5)

In the formula, \(Z_k\) is the ranging information vector between the base stations and the user at time k, H is the observation matrix after linearization, \(X_k\) is the 2-dimensional coordinate vector at time k, and \(V_k\) is the observation noise vector, following a Gaussian distribution, at time k. Its dimension is determined by the number of UWB base stations, and each component represents the stability of the ranging between each base station and the tag. The calculation formula for the linearized observation matrix H is as follows:

$$\begin{aligned} H = \frac{\partial h}{\partial X} = \left[ \begin{array}{ccc} \frac{x_k - x_1}{\sqrt{(x_k - x_1)^2 + (y_k - y_1)^2 + \Delta h^2}} & \frac{y_k - y_1}{\sqrt{(x_k - x_1)^2 + (y_k - y_1)^2 + \Delta h^2}} & 0 \\ \frac{x_k - x_2}{\sqrt{(x_k - x_2)^2 + (y_k - y_2)^2 + \Delta h^2}} & \frac{y_k - y_2}{\sqrt{(x_k - x_2)^2 + (y_k - y_2)^2 + \Delta h^2}} & 0 \\ \vdots & \vdots & \vdots \\ \frac{x_k - x_n}{\sqrt{(x_k - x_n)^2 + (y_k - y_n)^2 + \Delta h^2}} & \frac{y_k - y_n}{\sqrt{(x_k - x_n)^2 + (y_k - y_n)^2 + \Delta h^2}} & 0 \end{array}\right] \end{aligned}$$
(6)
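As a sketch of how the range system of Eq. (4) can be solved in practice (the paper does not specify a standalone solver), a Gauss-Newton iteration whose Jacobian rows match the position columns of H in Eq. (6) might look as follows; the anchor layout and tag position are made up for illustration:

```python
import numpy as np

def uwb_position(anchors, ranges, dh, x0=(1.0, 1.0), iters=20):
    """Gauss-Newton solution of the range equations in Eq. (4):
    find (x, y) whose predicted anchor distances best match the
    measured UWB ranges, with the height difference dh held fixed."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        diff = x - anchors                          # (n, 2)
        pred = np.sqrt((diff ** 2).sum(axis=1) + dh ** 2)
        J = diff / pred[:, None]                    # rows match the position columns of Eq. (6)
        dx, *_ = np.linalg.lstsq(J, ranges - pred, rcond=None)
        x = x + dx
    return x

# Hypothetical layout: three anchors, tag at (2, 3), 1 m height offset.
anchors = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
dh = 1.0
ranges = np.sqrt(((np.array([2.0, 3.0]) - anchors) ** 2).sum(axis=1) + dh ** 2)
est = uwb_position(anchors, ranges, dh)
print(est)  # ≈ [2., 3.]
```

With noisy ranges, the same least-squares step simply returns the best-fit position rather than an exact intersection.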

3.2.3 PDR

A PDR system keeps track of a user’s position in a relative manner. Given an initial position \(p_0\), the PDR iteratively updates the position \(p_k=[p_{x,k},p_{y,k}]\) of a user as follows:

$$\begin{aligned} \left\{ \begin{array}{l} p_{x,k} = p_{x,k-1} + L_k \cos (\theta _k), \\ p_{y,k} = p_{y,k-1} + L_k \sin (\theta _k). \end{array} \right. \end{aligned}$$
(7)

where \(p_{x,k}\) and \(p_{y,k}\) are respectively the x and y coordinates of the user at the end of the kth step, and \(L_k\) and \(\theta _k\) are respectively the length and heading of the kth step. Hence, to update the position, the PDR must accomplish the following three subtasks:

  1. Step detection. The detection process is divided into the following three parts:

    1) \(a_k\) is the acceleration value at time k. First, it is checked whether \(a_k\) is a peak, that is, whether the acceleration value at time k is larger than the acceleration values at the adjacent times \(k-1\) and \(k+1\);

    2) After successful peak detection, it must also satisfy \(a_k > a_{th}\), where \(a_{th}\) is an empirical threshold obtained through multiple experiments. Peaks below the threshold are judged as noise disturbances;

    3) It is checked whether the time difference between time k and the time of the last successful step detection satisfies \(t_k - t_{\text {step}}(i-1) > T_{\text {th}}\), where \(t_{\text {step}}(i-1)\) is the time of the successful detection of the \((i-1)\)th step, and \(T_{\text {th}}\) is the empirical value of the time a pedestrian takes for one step.

    If all the above conditions are met at the same time, time k is judged as the \(i\)th step of the pedestrian’s walk. Figure 3 shows the change curve of the acceleration data collected during the actual test process; the time threshold \(T_{th}\) and acceleration amplitude threshold \(a_{th}\) are marked in Fig. 3.

  2. Step length estimation. For step length estimation, the following nonlinear step length estimation model is selected:

    $$\begin{aligned} S_k = K \cdot \root 4 \of {a_{\text {max}} - a_{\text {min}}} \end{aligned}$$
    (8)

    where \(a_{\text {max}}\) and \(a_{\text {min}}\) are the maximum and minimum acceleration values within the step and \(K\) is an empirical coefficient.

  3. Step heading estimation. The heading calculation in PDR depends on the following heading estimation model:

    $$\begin{aligned} \theta _k = \theta _{k-1} + \Delta T \cdot \sum \omega \end{aligned}$$
    (9)
Fig. 3 Accelerometer signal results during normal walking and step detection

Where \(\theta _k\) and \(\theta _{k-1}\) are the headings at times k and \(k-1\) respectively. In this article, the initial heading \(\theta _0\) is known, \(\Delta T\) is the gyroscope data interval time, and \(\sum \omega\) is the sum of the gyroscope output values between times \(k-1\) and k.

Errors in the estimation of the step length \(L_k\) and heading \(\theta _k\) will eventually lead to a large drift in the estimated position. As a consequence, the PDR system can only provide very accurate incremental position updates for a short period.
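The three detection conditions and the position update of Eq. (7) can be sketched as follows. This is a simplified illustration with hypothetical sample values and thresholds, not the authors' implementation:

```python
import numpy as np

def detect_steps(acc, t, a_th, T_th):
    """Peak-based step detection following the three conditions above:
    local maximum, amplitude above the threshold a_th, and at least
    T_th seconds since the previously accepted step."""
    steps, last_t = [], -np.inf
    for k in range(1, len(acc) - 1):
        is_peak = acc[k] > acc[k - 1] and acc[k] > acc[k + 1]
        if is_peak and acc[k] > a_th and t[k] - last_t > T_th:
            steps.append(k)
            last_t = t[k]
    return steps

def pdr_update(p, L, theta):
    """One position update of Eq. (7): advance by step length L along heading theta."""
    return (p[0] + L * np.cos(theta), p[1] + L * np.sin(theta))

# Synthetic acceleration magnitudes sampled at 1 Hz (hypothetical values and thresholds).
acc = [0.0, 1.2, 0.0, 1.3, 0.0, 1.1, 0.0]
t = [0, 1, 2, 3, 4, 5, 6]
steps = detect_steps(acc, t, a_th=0.5, T_th=1.5)
print(steps)  # [1, 3, 5]

p = (0.0, 0.0)
p = pdr_update(p, 0.7, 0.0)        # one 0.7 m step heading along x
p = pdr_update(p, 0.7, np.pi / 2)  # one 0.7 m step heading along y
print(p)  # ≈ (0.7, 0.7)
```

In a full pipeline, each detected step index would be paired with the step length of Eq. (8) and the heading of Eq. (9) before the update.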

3.2.4 Data fusion strategy of human-robot collaboration

The position vector \([x \ y]^T\) and the angle \(\alpha\) between the heading calculated by PDR and the absolute heading of navigation are selected as the system estimation state. \(\alpha\) is composed of two parts: the first is the angle between the real heading and the PDR-calculated heading, which, after the system starts, remains constant as long as the installation angle does not change; the second is the angle caused by the accumulated error of the sensor, whose variation is related to the performance of the MEMS sensor and the duration of navigation.

Algorithm 1 Extended Kalman Filter for Position Estimation

$$\begin{aligned} X_k=\left[ \begin{array}{c} x_k \\ y_k \\ \alpha _k \end{array}\right] \end{aligned}$$
(10)

The change in the position calculated by PDR, \(\left[ \begin{array}{ll} \Delta x_p,&\Delta y_p \end{array}\right]\), is converted into the velocity \(\left[ \begin{array}{ll} v_x^{pdr},&v_y^{pdr} \end{array}\right]\), and the speed calculated directly by PDR is used as the system’s dynamic model information to solve the state vector estimate. Therefore, the 3\(\times\)3 system state transition matrix \(\phi _k\) is:

$$\begin{aligned} \phi _k= \left[ \begin{array}{ccc} 1 & 0 & (v_x^{pdr}\cos (\alpha _{k-1})-v_y^{pdr}\sin (\alpha _{k-1}))\Delta t \\ 0 & 1 & (-v_x^{pdr}\sin (\alpha _{k-1})-v_y^{pdr}\cos (\alpha _{k-1}))\Delta t \\ 0 & 0 & 1 \end{array}\right] \end{aligned}$$
(11)

The EKF algorithm includes two steps, prediction and correction. The prediction stage obtains the a priori estimate \(\hat{X}_k^-\) of the state vector and the corresponding prior error covariance matrix \(P_k^-\) through the state equation, that is:

$$\begin{aligned} \hat{X}_k^- = \phi _k \hat{X}_{k-1} \end{aligned}$$
(12)
$$\begin{aligned} P_k^- = \phi _k P_{k-1} \phi _k^T + Q_k \end{aligned}$$
(13)

where \(Q_k\) is the process noise covariance matrix.

In the correction stage, the ranging information \(Z_k\) obtained between the UWB base stations and the tag is used to eliminate the error introduced in the prediction step, yielding the posterior estimate \(\hat{X}_k\) of the position and the corresponding error covariance matrix \(P_k\), that is:

$$\begin{aligned} K_k = P_k^- H^T \left(HP_k^- H^T+R\right)^{-1} \end{aligned}$$
(14)
$$\begin{aligned} \hat{X}_k = \hat{X}_k^-+K_k \left(Z_k-H\hat{X}_k^-\right) \end{aligned}$$
(15)
$$\begin{aligned} P_k =(I-K_k H)P_k^- \end{aligned}$$
(16)

Finally, \(\hat{X}_k\) is the final estimate of the user’s two-dimensional coordinate vector.
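The prediction-correction cycle above can be sketched in NumPy. This is an illustrative implementation under simplifying assumptions (a stationary toy trajectory, hand-picked Q and R, noise-free ranges), not the authors' code:

```python
import numpy as np

def ekf_step(X, P, phi, z, anchors, dh, Q, R):
    """One EKF cycle fusing the PDR prediction with UWB ranges.
    X = [x, y, alpha] as in Eq. (10); phi is the transition matrix of
    Eq. (11); z holds the measured ranges; H follows Eq. (6)."""
    # Prediction: propagate the state with the PDR-derived transition matrix
    X_pred = phi @ X
    P_pred = phi @ P @ phi.T + Q
    # Predicted ranges and linearized observation matrix H
    diff = X_pred[:2] - anchors                       # (n, 2)
    pred = np.sqrt((diff ** 2).sum(axis=1) + dh ** 2)
    H = np.hstack([diff / pred[:, None], np.zeros((len(anchors), 1))])
    # Correction (Eqs. 14-16)
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    X_new = X_pred + K @ (z - pred)
    P_new = (np.eye(3) - K @ H) @ P_pred
    return X_new, P_new

# Toy run: user fixed at (2, 3), noise-free ranges, identity transition.
anchors = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
dh = 1.0
z = np.sqrt(((np.array([2.0, 3.0]) - anchors) ** 2).sum(axis=1) + dh ** 2)
X, P = np.array([1.5, 2.5, 0.0]), np.eye(3)
for _ in range(10):
    X, P = ekf_step(X, P, np.eye(3), z, anchors, dh,
                    0.01 * np.eye(3), 0.01 * np.eye(3))
print(X[:2])  # ≈ [2., 3.]
```

Note that the third column of H is zero, so the heading error \(\alpha\) is corrected only indirectly through the transition matrix, mirroring Eq. (6).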

4 Experiment and results

4.1 Experimental setup

Our investigation was carried out in an office building at Shenzhen University. The chosen venue exhibits the typical limitations of indoor spaces for GPS navigation, making it a suitable testbed for our indoor positioning system. We established three different scenarios to evaluate the performance of our fusion algorithm and demonstrate the validity of our method.

As depicted in Fig. 4, pedestrians holding smartphones traversed the coverage area at a uniform speed in a counter-clockwise direction. They started from the yellow pentagon and moved along a black dotted rectangular path with dimensions of 5.7m by 5.65m.

Fig. 4 Experiment setup

In Fig. 4a, Scenario 1 acts as the control group, with black triangles representing three fixed UWB base stations. The pedestrian’s position was determined using a fusion of PDR and UWB positioning.

Figure 4b introduces Scenario 2, where a mobile robot base station, represented by the green triangle and tracked by Lidar positioning, is added to the system of Scenario 1. The robot follows the direction indicated by the green arrow. Adding a robot base station is practical in the Internet of Things era, which invites the use of robots as base stations to enhance the accuracy of positioning built on existing stations.

Finally, as shown in Fig. 4c, Scenario 3 builds upon Scenario 1 by retaining only two fixed base stations and adding two mobile robot base stations. Although positioning would typically be impossible with only two fixed base stations, utilizing robot base stations makes our indoor positioning more robust, minimizing instances of positioning loss.

4.2 Performance of the human-robot collaboration

As shown in Fig. 5a, the positioning results for Scenario 1 demonstrate that the integrated positioning outcome significantly outperforms the independent UWB and PDR methods. The trajectory calculated using only UWB information occasionally aligns well with the actual path, yet it often exhibits considerable jumps between some computed points, and there are instances of missing location data.

As illustrated in Fig. 5b, the blue line represents the results for Scenario 1, while the red line denotes those for Scenario 2. It’s apparent that the red line is closer to the ground truth than the blue line, indicating a more accurate positioning outcome, and the continuity of the positioning result is notably stable. Figure 5c displays the positioning results for Scenario 3. Here, even with only two fixed base stations, basic positioning capabilities are maintained, although there is some drift evident.

Fig. 5 Positioning result

To evaluate the efficacy of our proposed method, we use the Root Mean Square Error (RMSE), which gives a direct measure of how the method performs in comparison to the standard setups.
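For reference, the RMSE over a 2-D trajectory can be computed as follows (a small sketch with made-up sample points):

```python
import numpy as np

def rmse(estimated, truth):
    """Root Mean Square Error between estimated and ground-truth 2-D tracks."""
    err = np.linalg.norm(np.asarray(estimated) - np.asarray(truth), axis=1)
    return np.sqrt((err ** 2).mean())

est = [[0.0, 0.1], [1.0, 0.9], [2.1, 2.0]]
gt = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
print(round(rmse(est, gt), 3))  # 0.1
```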

Table 1 shows the RMSE values for each method under Scenario 1. The regular PDR method experiences drift issues, resulting in an error of 6.890 meters. The regular UWB has an error of 1.178 meters, with challenges like human interference and multi-path effects in dense areas. Our proposed method reduces this error to 0.478 meters, showing a precision improvement of about 59.38% over regular UWB. The integration not only mitigates the PDR drift but also rectifies the intermittent inconsistencies presented by UWB.

Table 1 Comparison of RMSE(m) based on different methods

Table 2 shows the comparison of RMSE for the three scenarios. In Scenario 1, utilizing standard positioning with three fixed base stations, the error measured was 0.478 meters. In Scenario 2, introducing an additional mobile base station decreased the positioning error to 0.355 meters, a notable 25.7% improvement over Scenario 1. Scenario 3 simulates an environment with insufficient fixed base stations; while positioning would typically be unfeasible with just two fixed base stations, our method, incorporating two mobile base stations, maintained positioning with an error of 0.437 meters. By leveraging mobile base stations, we can substantially enhance indoor positioning accuracy, ensuring more reliable results even in settings with fewer fixed base stations.

Table 2 Comparison of RMSE(m) based on different scenarios

Figure 6 shows the cumulative distribution function (CDF) of the positioning results for the three scenarios. In Scenario 1, approximately 80% of the localization results have an error within 0.56 meters, and around 90% are within 0.68 meters. In Scenario 2, roughly 80% of the results are within 0.44 meters, and 90% are within 0.51 meters. In Scenario 3, about 80% of the results are within 0.53 meters, and 90% are within 0.62 meters. The effectiveness of our proposed methodology is confirmed by the enhanced positioning performance of both Scenario 2 and Scenario 3, where collaborative robot positioning is employed, in comparison to Scenario 1, which uses three fixed base stations.

Fig. 6 Comparison of CDF

5 Conclusion

This paper introduces a novel methodology that leverages robot localization to enhance pedestrian indoor positioning accuracy when using smartphones. The core innovation lies in utilizing mobile robots equipped with UWB as substitutes for fixed base stations. This effectively addresses the prevalent challenge of insufficient base stations in many indoor settings.

The proposed approach exploits the complementary strengths of PDR using the inertial sensors on smartphones and UWB distance measurements from the robots. To integrate the data, coordinate system transformations are first performed to unify the robot and pedestrian coordinate systems. Then, an EKF fuses these heterogeneous sources of positioning information, considering various error sources.

Experimental results validate the effectiveness of the proposed methodology. Three scenarios were set up to assess the performance of the fusion algorithm, demonstrating that positioning accuracy was significantly improved when mobile robot base stations were added to the system. Even with only two fixed base stations, basic positioning capabilities were maintained, suggesting that the proposed method enhances the robustness of indoor positioning systems. The CDF of the positioning results further attests to the superior performance of the proposed method: scenarios incorporating collaborative robot positioning demonstrated improved accuracy compared to those with fixed base stations alone. One limitation of our study is that we did not optimize for multi-path effects. Additionally, the scenarios explored here may not cover all possible real-world conditions. Potential directions for future research include node optimization for multi-robot scenarios, exploring a more diverse range of scenarios, and improving the fusion methods. Besides, the methods for measuring distance are not confined to UWB; options such as ultrasonic, WiFi RTT, and Bluetooth could also be incorporated for enhanced versatility.

In conclusion, the research showcases a promising advancement in indoor positioning. The proposed method harnesses the potential of human-robot collaboration, leading to improved positioning accuracy and overall system robustness. These findings mark a significant step towards the optimization of indoor positioning systems in the IoT era. Future research can build on these insights, further refining the methodology and broadening its applicability in diverse real-world scenarios.