1 Introduction

Distributed robotics is often characterized by sensing, computing and communication capabilities distributed to each robot, so that many robots can be deployed with ease of implementation, robustness against single-point failure, and scalability. Applications expected to benefit from distributed robotics include: (1) transport of materials and goods (Miyata et al. 2002; Guizzo 2008); (2) search, exploration and sensor networks (Sinopoli et al. 2003; Kumar et al. 2004); and (3) collective manipulation and assembly (Rus et al. 1995; Pereira et al. 2004; Wang and Schwager 2014). The advantage of distributed robotics for these applications becomes even more prominent when the operation takes place in a large environment cluttered with objects, where a single large robot cannot efficiently cover the whole environment by itself, or is too large to navigate through some of those objects.

Fig. 1

Platoon of distributed nonholonomic WMRs: one “smart” leader WMR (with LiDAR, IMU and FPV camera) and three “simple” follower WMRs (with monocular camera and IMU) (top). Peer-to-peer communication architecture among the WMRs and the information flowing through it (bottom)

In this paper, we consider the problem of deploying a platoon of distributed nonholonomic wheeled mobile robots (WMRs) in such a large, object-cluttered environment. Here, we say the robots are distributed in the sense that their sensing and computation are all onboard and the communication among them is only peer-to-peer. We also assume the environment to be unknown and uncharted. In this case, it is typically very difficult, if not impossible, to achieve fully-autonomous operation. We therefore consider the problem of their tele-driving by a single remote user.

More precisely, we designate one of the WMRs as the leader of the platoon and furnish it with a LiDAR (light detection and ranging) sensor and LiDAR-SLAM (simultaneous localization and mapping) capability, an FPV (first-person view) camera to complement this LiDAR-SLAM information (e.g., to recognize glass walls or low-height bumps missed by LiDAR, or the traversability of regions behind scattered architectural objects), and ample computing power to collect/process the pose information of all the WMRs and present it to the user. This leader WMR serves as the “eyes” of the remote user. All the other WMRs are “simple” followers, each equipped with a monocular camera with limited FOV (field-of-view) to sense the relative distance/bearing from its respective fore-running WMR, and onboard computing power just enough to compute its pose estimate (with the fore-running WMR pose information received via communication) and control action. The communication among all the WMRs is also assumed to be only peer-to-peer. See Fig. 1.

We then aim to enable the remote user to tele-drive the leader WMR, while the other WMRs follow their respective fore-running WMR in a distributed manner, each maintaining a certain desired relative distance with its limited-FOV camera always facing the axle-center of the fore-running WMR (with omni-directional fiducial markers attached there). For simplicity, here we mainly focus on the line graph formation shape to render the platoon as the (familiar) n-trailer system (Rouchon et al. 1993; Murray and Sastry 1993; Altafini et al. 2001), although other formation shapes with any tree graph topology (with the leader WMR as its globally reachable node; Ren and Beard 2008; Lee 2016) are also possible. For this, we utilize nonholonomic passive decomposition (Lee 2010; Lee and Lui 2017; Lui and Lee 2017) to split the platoon kinematics into the inter-WMR formation aspect and the collective tele-driving aspect. We can then design the controls for these two aspects individually and separately, ensuring that the desired n-trailer formation is maintained regardless of the (arbitrary) user command. We can also distribute these controls to each WMR while fully respecting their nonholonomic constraints and the distribution requirement of sensing, computing and communication.

Even if the platoon is reduced to the (familiar) n-trailer system as stated above, it is in general still difficult for typical users to tele-navigate this line-topology platoon amid obstacles while avoiding collision of any of the WMRs with them. This is particularly so as the n-trailer platoon can exhibit complex internal serpentine articulation when it traverses with a large curvature change (see Fig. 4). To assist a remote user in overcoming this difficulty, in this paper, we propose a novel predictive display for the tele-driving of distributed robots. By providing the user with the estimated current and predicted future pose information of each WMR along with the possibility of collision, this predictive display turns out to substantially enhance the efficiency and performance of the tele-driving, by allowing the user, e.g., to better explore collision-free tele-driving strategies or to adjust the command if a collision is about to occur in the near future. We also fully incorporate the uncertainty inherent to the sensing/computation distribution into this predictive display, namely: (1) the local pose estimation uncertainty of each WMR, accumulated downstream to the last WMR with the relative pose measurement uncertainty; and (2) the current pose estimation uncertainty of each WMR, accumulated upstream through the communication delay, which is further forward-propagated to obtain the predicted future pose information. See Fig. 2.

Numerous techniques have been proposed for the formation control of multiple distributed nonholonomic WMRs, some experimentally demonstrated with onboard sensing and estimation (e.g., Desai et al. 2001; Das et al. 2002; Marshall et al. 2006; Mariottini et al. 2009). On the other hand, many results have been reported for the teleoperation of multiple mobile robots (e.g., centralized: Lee and Spong 2005; Lee 2008; Rodriguez-Seda et al. 2010; Fong et al. 2003; distributed: Franchi et al. 2012; Lee et al. 2013; Ochiai et al. 2014) and also for teleoperation with predictive display (e.g., Kim and Bejczy 1993; Chong et al. 2002; Mitra and Niemeyer 2008; Kelly et al. 2011). Yet, to our knowledge, there has been no result so far that systematically utilizes the predictive display for the teleoperation of multiple distributed mobile robots to allow the user to better handle the complex kinematics of their collective, while fully taking into account the uncertainty inherently arising from their sensing, computation and communication being distributed. In fact, we believe our framework proposed here is the very first predictive display teleoperation result for distributed mobile robots with the distribution-induced uncertainty fully incorporated and its efficacy fully manifested by a human subject study. The holistic framework for distributed robot teleoperation, encompassing the behavior decomposition, the control design distribution, the distributed estimation and the predictive display, is also, to our knowledge, proposed in this paper for the first time.

Fig. 2

Platoon of distributed WMRs navigating in a large hall with an array of plant pots (top); and predictive display view with the LiDAR-SLAM map and the current and future pose estimates of the WMRs with uncertainty (i.e., size ellipsoids and direction cones) (bottom). Possible collision of a WMR is indicated by a white circular shade, if its future pose estimate collides with non-traversable scanned regions of the SLAM map (see the bottom middle two pictures of Fig. 11)

The rest of the paper is organized as follows. The nonholonomic passive decomposition and the distributed control design to achieve the n-trailer platoon formation are presented in Sect. 2. The distributed pose estimation of each WMR using onboard sensing and peer-to-peer communication is detailed in Sect. 3. The predictive display, which fully incorporates the uncertainty inherently arising in distributed robotics, is designed and derived in Sect. 4. Experiments and a human subject study to verify and demonstrate the efficacy of our proposed framework are presented in Sect. 5. A summary and some remarks on future research directions are given in Sect. 6.

2 Distributed control design

2.1 Nonholonomic passive decomposition

Here, our goal is to design the control action for each WMR in such a way that the platoon can maintain the n-trailer formation (i.e., line graph topology) regardless of the (arbitrary) user command, while fully respecting the nonholonomic constraint and the distribution requirement. For this, we model the n WMRs by their kinematic equations. This kinematic modeling is valid because the WMR operating speeds in this work are not too fast (see, e.g., the average speeds reported in the first paragraph of Sect. 5.2 and the second paragraph of Sect. 5.3), so that the wheel slip and drift effects are negligible.

Let us denote the inertial frame by \(\{ \mathcal {O} \}\) and the body-fixed frame of the j-th WMR by \(\{ \mathcal {G}_{j} \}\), with \(j=1\) being the “smart” leader WMR and \(j=2,\ldots ,n\) the “simple” follower WMRs. Here, we attach the origin of \(\{ \mathcal {G}_{j} \}\) at the axle-center of each j-th WMR. See Fig. 3. The pose of each WMR in SE(2) can then be parameterized by its axle-center position \(p_j:=(x_j,y_j) \in \mathfrak {R}^2\) and heading angle \(\phi _j \in S\) of \(\{ \mathcal {G}_{j} \}\) w.r.t. \(\{ \mathcal {O} \}\). Define the configuration of the j-th WMR by \(q_j=[x_j;y_j;\phi _j] \in \mathfrak {R}^3\). It is then well-known that the no-slip/drift condition can be written by the following Pfaffian constraint:

$$\begin{aligned} A_j(q_j)\dot{q_j}=0 \end{aligned}$$

with \(A_j(q_j):= \left[ \begin{array}{ccc} \sin \phi _j&-\cos \phi _j&0 \end{array} \right] \in \mathfrak {R}^{1 \times 3}\), which is well-known to be completely nonholonomic (Murray et al. 1993).
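As a quick sanity check, the Pfaffian constraint above can be verified numerically: the unicycle velocity \(\dot{q}_j = (v_j \cos \phi _j,\, v_j \sin \phi _j,\, w_j)\) annihilates \(A_j(q_j)\) for any inputs. A minimal NumPy sketch (ours, not from the paper's code):

```python
# Numeric check that unicycle motion satisfies the Pfaffian no-slip
# constraint A_j(q_j) qdot_j = 0 for arbitrary (v_j, w_j).
import numpy as np

def A(phi):
    """Pfaffian constraint matrix A_j(q_j) of a single WMR."""
    return np.array([[np.sin(phi), -np.cos(phi), 0.0]])

def qdot(phi, v, w):
    """Unicycle kinematics respecting the no-slip/drift condition."""
    return np.array([v * np.cos(phi), v * np.sin(phi), w])

rng = np.random.default_rng(0)
for _ in range(100):
    phi, v, w = rng.uniform(-np.pi, np.pi), rng.normal(), rng.normal()
    assert abs((A(phi) @ qdot(phi, v, w)).item()) < 1e-12
```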

Here, we aim to design the formation control to be distributed to each pair of two fore-running and following WMRs. For this, let us define the configuration of the pair of the j-th and \((j+1)\)-th WMRs by

$$\begin{aligned} q_{j,j+1} :=&\left[ \begin{array}{cc} q_j;&q_{j+1} \end{array} \right] \\ =&\left[ \begin{array}{cccccc} x_j;&y_j;&\phi _j;&x_{j+1};&y_{j+1};&\phi _{j+1} \end{array} \right] \in \mathfrak {R}^6 \end{aligned}$$

with their no-slip/drift conditions given by

$$\begin{aligned} A_{j,j+1} \dot{q}_{j,j+1}=0 \end{aligned}$$

with \(A_{j,j+1}(q_{j,j+1}):=\text {diag} [A_j(q_j), A_{j+1}(q_{j+1})] \in \mathfrak {R}^{2 \times 6}\). The unconstrained distribution \(\mathcal{D}^\top _{j,j+1}(q)\) (Lee 2010), which characterizes the sub-space of the velocity respecting the nonholonomic constraint (1), can then be identified by

$$\begin{aligned} \mathcal {D}^{\top }_{j,j+1} := \begin{bmatrix} {\text {c}}\phi _j&\quad {\text {s}}\phi _j&\quad 0&\quad 0&\quad 0&\quad 0\\ 0&\quad 0&\quad 1&\quad 0&\quad 0&\quad 0\\ 0&\quad 0&\quad 0&\quad {\text {c}}\phi _{j+1}&\quad {\text {s}}\phi _{j+1}&\quad 0\\ 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 1\\ \end{bmatrix}^{T} \in \mathfrak {R}^{6 \times 4} \end{aligned}$$

where \({\text {c}}\phi _j := \cos \phi _j\) and \({\text {s}}\phi _j := \sin \phi _j\). Here, note that \(\mathcal{D}^\top _{j,j+1}\) identifies the null-space of \(A_{j,j+1} \in \mathfrak {R}^{2 \times 6}\) in (1). Evolution of the two WMRs under the nonholonomic constraint (1) can then be written by the following drift-less nonlinear control equation:

$$\begin{aligned} \dot{q}_{j,j+1}=\mathcal{D}^\top _{j,j+1} u_{j,j+1} \end{aligned}$$

where \(u_{j,j+1}:=[v_j;w_j;v_{j+1};w_{j+1}] \in \mathfrak {R}^4\) is the control input, with \(v_j,w_j \in \mathfrak {R}\) being the forward and angular velocity commands of the j-th WMR. The constraint (1) is also completely nonholonomic (Murray et al. 1993).
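The unconstrained distribution \(\mathcal{D}^\top _{j,j+1}\) above can be checked numerically: its four columns must span the null space of \(A_{j,j+1}\). A small NumPy sketch (ours, under the frame conventions of this section):

```python
# Verify that D_top (6x4) annihilates the pairwise Pfaffian constraint
# matrix A_{j,j+1} (2x6) and has full column rank 4 = 6 - 2.
import numpy as np

def D_top(phi_j, phi_j1):
    c, s = np.cos, np.sin
    return np.array([
        [c(phi_j), s(phi_j), 0, 0,         0,         0],
        [0,        0,        1, 0,         0,         0],
        [0,        0,        0, c(phi_j1), s(phi_j1), 0],
        [0,        0,        0, 0,         0,         1],
    ]).T                                   # columns span the feasible velocities

def A_pair(phi_j, phi_j1):
    Ablk = np.zeros((2, 6))                # block-diagonal A_{j,j+1}
    Ablk[0, :3] = [np.sin(phi_j),  -np.cos(phi_j),  0.0]
    Ablk[1, 3:] = [np.sin(phi_j1), -np.cos(phi_j1), 0.0]
    return Ablk

rng = np.random.default_rng(1)
phi_j, phi_j1 = rng.uniform(-np.pi, np.pi, 2)
assert np.allclose(A_pair(phi_j, phi_j1) @ D_top(phi_j, phi_j1), 0)
assert np.linalg.matrix_rank(D_top(phi_j, phi_j1)) == 4
```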

The control objective to render the n-WMRs as a n-trailer system (i.e., line graph topology) can be written by the following pairwise/distributed virtual constraint:

$$\begin{aligned} h_{j,j+1} := \begin{pmatrix} x_j-x_{j+1}-L_j {\text {c}}\phi _{j+1}\\ y_j-y_{j+1}-L_j {\text {s}}\phi _{j+1} \end{pmatrix} =0 \end{aligned}$$

where \(L_j>0\) is the desired distance between the j-th WMR and the \((j+1)\)-th WMR. This constraint (3) implies that the distance between the two WMRs’ axle-centers is maintained to be \(L_j\), with the camera of the ensuing WMR always pointing to the axle-center of the fore-running WMR. This then means that, if \(h_{j,j+1}=0\), the axle-center of the fore-running WMR will always be within the limited FOV (field-of-view) of the following WMR’s camera, regardless of the platoon formation shape and curvature. This further implies that, if \(h_{j,j+1}=0\), with omni-directional fiducial markers (e.g., a cube with tags on each side) attached at those axle-centers, each following WMR can always measure the relative distance and bearing from its fore-running WMR with its onboard camera, while the platoon moves/undulates as an n-trailer system. See Fig. 3. This \(h_{j,j+1}\) is called the formation map (Lee 2010). The control objective \(h_{j,j+1} =0\) needs to be attained while respecting the distribution requirement.
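The formation map (3) is straightforward to evaluate; the following NumPy sketch (ours, not the paper's implementation) illustrates that \(h_{j,j+1}=0\) exactly when the fore-running axle-center sits at distance \(L_j\) along the follower's heading direction:

```python
# Evaluate the pairwise formation map h_{j,j+1} of Eq. (3).
import numpy as np

def h(qj, qj1, L):
    """qj = (x_j, y_j, phi_j): fore-runner; qj1: follower; L: desired gap."""
    xj, yj, _ = qj
    x1, y1, p1 = qj1
    return np.array([xj - x1 - L * np.cos(p1),
                     yj - y1 - L * np.sin(p1)])

# Follower at the origin heading along +x, fore-runner exactly L ahead: h = 0
# (the fore-runner's own heading does not enter h).
L = 0.8
assert np.allclose(h([L, 0, 0.3], [0, 0, 0], L), 0)
# A heading offset of the follower breaks the constraint:
assert np.linalg.norm(h([L, 0, 0.3], [0, 0, 0.2], L)) > 0
```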

Fig. 3

Geometry of platoon of the distributed nonholonomic WMRs, when they collectively behave as the n-trailer system under the virtual constraint \(h(q_{j,j+1})=0\), that is, each following WMR maintains the distance \(L_j\) from, and also faces toward the axle-center of, their respective fore-running WMR

For this, following (Lee and Li 2013), we define tangential distribution \(\varDelta ^\top _{j,j+1}\) of \(h_{j,j+1}\) to be the null-space of the one-form \(\frac{\partial h_{j,j+1}}{\partial q_{j,j+1}} \in \mathfrak {R}^{2 \times 6}\) as identified by the following matrix:

$$\begin{aligned}&\varDelta ^{\top }_{j,j+1}\\&\quad := \begin{bmatrix} L_j {\text {s}}\phi _{j+1}&\quad -L_j{\text {c}}\phi _{j+1}&\quad 0&\quad 0&\quad 0&\quad -\,1\\ 1&\quad 0&\quad 0&\quad 1&\quad 0&\quad 0\\ 0&\quad 1&\quad 0&\quad 0&\quad 1&\quad 0\\ 0&\quad 0&\quad 1&\quad 0&\quad 0&\quad 0 \end{bmatrix}^T \in \mathfrak {R}^{6 \times 4} \end{aligned}$$

and normal distribution \(\varDelta ^\perp _{j,j+1}(q_{j,j+1}) \), which is the orthogonal complement of \(\varDelta ^\top _{j,j+1}\) w.r.t. the Euclidean metric s.t.,

$$\begin{aligned} \varDelta ^{\bot }_{j,j+1} := \begin{bmatrix} 1&\quad 0&\quad 0&\quad -\,1&\quad 0&\quad L_j {\text {s}}\phi _{j+1}\\ 0&\quad 1&\quad 0&\quad 0&\quad -\,1&\quad -L_j {\text {c}}\phi _{j+1} \end{bmatrix}^T \in \mathfrak {R}^{6 \times 2} \end{aligned}$$

with \(\varDelta ^\perp _{j,j+1} =\left( \frac{\partial h_{j,j+1}}{\partial q_{j,j+1}} \right) ^T\). As explained in Lee and Li (2013), the velocity component in \(\varDelta ^\top _{j,j+1}\) characterizes the motion of the two WMRs tangential to the (current) level set:

$$\begin{aligned} \mathcal{H}_{h_{j,j+1}}(q_{j,j+1}):= \{ q^\prime \in \mathfrak {R}^6 ~|~ h(q^\prime )=h(q_{j,j+1}) \} \end{aligned}$$

(i.e., collective motion of the two WMRs with \(h_{j,j+1}\) kept intact); whereas that in \(\varDelta ^\perp _{j,j+1}\) the motion normal to the level set \(\mathcal{H}_{h_{j,j+1}}\) w.r.t. the Euclidean metric (i.e., internal motion of the two WMRs changing the inter-WMR coordination \(h_{j,j+1}\)).
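The tangential/normal split can be verified numerically: the columns of \(\varDelta ^\top _{j,j+1}\) must be annihilated by \(\partial h_{j,j+1}/\partial q_{j,j+1}\), and together with \(\varDelta ^\perp _{j,j+1} = (\partial h_{j,j+1}/\partial q_{j,j+1})^T\) they must span \(\mathfrak {R}^6\). A NumPy sketch (ours):

```python
# Check that Delta_top is the null space of dh/dq, and that
# Delta_top together with Delta_perp = (dh/dq)^T spans R^6.
import numpy as np

def dh_dq(phi_j1, L):
    """One-form dh_{j,j+1}/dq_{j,j+1} of the formation map (3)."""
    return np.array([[1, 0, 0, -1,  0,  L * np.sin(phi_j1)],
                     [0, 1, 0,  0, -1, -L * np.cos(phi_j1)]])

def Delta_top(phi_j1, L):
    """Tangential distribution (columns of the 6x4 matrix)."""
    c, s = np.cos(phi_j1), np.sin(phi_j1)
    return np.array([
        [L * s, -L * c, 0, 0, 0, -1],
        [1,      0,     0, 1, 0,  0],
        [0,      1,     0, 0, 1,  0],
        [0,      0,     1, 0, 0,  0],
    ]).T

phi, L = 0.7, 1.2
J = dh_dq(phi, L)
T = Delta_top(phi, L)
assert np.allclose(J @ T, 0)                            # tangential to level set
assert np.linalg.matrix_rank(np.hstack([T, J.T])) == 6  # Delta_top + Delta_perp = R^6
```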

Following Lui and Lee (2017), we can then achieve the nonholonomic passive decomposition of the two WMRs under the physical nonholonomic constraint (1) and the virtual holonomic constraint (3) s.t.,

$$\begin{aligned} \dot{q}_{j,j+1} = \underbrace{\left[ \begin{array}{cc} (\mathcal {D}^{\top }_{j,j+1} \cap \triangle ^{\top }_{j,j+1})&\mathcal{D}^c_{j,j+1} \end{array} \right] }_{=:S_{j,j+1}(q_{j,j+1}) \in \mathfrak {R}^{6 \times 4}} \begin{pmatrix} u_{L_{j,j+1}}\\ u_{C_{j,j+1}} \end{pmatrix} \end{aligned}$$


$$\begin{aligned} (\mathcal {D}^{\top }_{j,j+1} \cap \varDelta ^{\top }_{j,j+1})= \begin{bmatrix} {\text {c}}\phi _j&\quad 0\\ {\text {s}}\phi _j&\quad 0\\ 0&\quad 1\\ {\text {c}}(\phi _j-\phi _{j+1}){\text {c}}\phi _{j+1}&\quad 0\\ {\text {c}}(\phi _j-\phi _{j+1}){\text {s}}\phi _{j+1}&\quad 0\\ \frac{1}{L_j}{\text {s}}(\phi _j-\phi _{j+1})&\quad 0 \end{bmatrix} \in \mathfrak {R}^{6 \times 2} \end{aligned}$$

is the (unconstrained) locked distribution, representing the motion tangential to the level set \(\mathcal{H}_{h_{j,j+1}}\) (4) (i.e., collective motion with \(h_{j,j+1}\) locked) among the permissible motion in \(\mathcal{D}^\top _{j,j+1}\) [i.e., satisfying the nonholonomic constraint (1)]; and

$$\begin{aligned} \mathcal{D}^c_{j,j+1}&=\mathcal{D}^\top _{j,j+1} \setminus ( \mathcal{D}^\top _{j,j+1} \cap \varDelta ^\top _{j,j+1}) \nonumber \\&= \left[ \begin{array}{cc} {\text {c}}\phi _j &{} 0 \\ {\text {s}}\phi _j &{} 0 \\ 0 &{} 0 \\ -{\text {c}}(\phi _j-\phi _{j+1}) {\text {c}}\phi _{j+1} &{} {\text {s}}(\phi _j-\phi _{j+1}) {\text {c}}\phi _{j+1} \\ -{\text {c}}(\phi _j-\phi _{j+1}) {\text {s}}\phi _{j+1} &{} {\text {s}}(\phi _j-\phi _{j+1}) {\text {s}}\phi _{j+1} \\ -L_j {\text {s}}(\phi _j-\phi _{j+1}) &{} -L_j {\text {c}}(\phi _j-\phi _{j+1}) \end{array} \right] \end{aligned}$$

is the quotient distribution with \(\mathcal{D}^c_{j,j+1} \notin \varDelta ^\top _{j,j+1}\) and \(\mathcal{D}^c_{j,j+1} \notin \varDelta ^\perp _{j,j+1}\) (i.e., weakly decomposable Lee 2010) and also \((\mathcal{D}^\top _{j,j+1} \cap \varDelta ^\perp _{j,j+1}) \subset \mathcal{D}^c_{j,j+1}\) (see Lui and Lee 2017). This then means that, for the two WMRs under the control objective (3): (1) we can ensure the inter-WMR coordination \(h_{j,j+1}=0\) simply by stabilizing the motion of the WMRs in this \(\mathcal{D}^c_{j,j+1}\); yet, (2) it is not possible to adjust the inter-WMR coordination \(h_{j,j+1}\) without affecting the collective motion (i.e., tele-driving) aspect, since \(\mathcal{D}^c_{j,j+1}\) contains both the normal and tangential components. Complete decoupling between these two aspects can be attained only if strong decomposability is granted (Lee 2010), which is not the case here.
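The defining properties of the two blocks in (5) can be checked numerically: both the locked distribution (6) and the quotient distribution (7) satisfy the nonholonomic constraint (1), but only the locked distribution is annihilated by \(\partial h_{j,j+1}/\partial q_{j,j+1}\). A NumPy sketch (ours):

```python
# Numeric check of the nonholonomic passive decomposition blocks (6), (7).
import numpy as np
c, s = np.cos, np.sin

def locked(pj, p1, L):                     # Eq. (6): D_top ∩ Delta_top
    d = pj - p1
    return np.array([
        [c(pj), s(pj), 0, c(d)*c(p1), c(d)*s(p1), s(d)/L],
        [0,     0,     1, 0,          0,          0     ],
    ]).T

def quotient(pj, p1, L):                   # Eq. (7): D^c
    d = pj - p1
    return np.array([
        [c(pj), s(pj), 0, -c(d)*c(p1), -c(d)*s(p1), -L*s(d)],
        [0,     0,     0,  s(d)*c(p1),  s(d)*s(p1), -L*c(d)],
    ]).T

def A_pair(pj, p1):                        # pairwise Pfaffian constraint (1)
    A = np.zeros((2, 6))
    A[0, :3] = [s(pj), -c(pj), 0]
    A[1, 3:] = [s(p1), -c(p1), 0]
    return A

def dh(p1, L):                             # one-form of the formation map (3)
    return np.array([[1, 0, 0, -1, 0,  L*s(p1)],
                     [0, 1, 0,  0, -1, -L*c(p1)]])

pj, p1, L = 0.9, 0.2, 1.5
# both blocks respect the nonholonomic constraint (1) ...
assert np.allclose(A_pair(pj, p1) @ locked(pj, p1, L), 0)
assert np.allclose(A_pair(pj, p1) @ quotient(pj, p1, L), 0)
# ... but only the locked distribution leaves h_{j,j+1} invariant
assert np.allclose(dh(p1, L) @ locked(pj, p1, L), 0)
assert not np.allclose(dh(p1, L) @ quotient(pj, p1, L), 0)
```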

2.2 Control design and distribution

The matrix \(S_{j,j+1}\) in (5) is called the decomposition matrix. In (5), \(u_{C_{j,j+1}}=[u_{C_{j,j+1}}^1; u_{C_{j,j+1}}^2] \in \mathfrak {R}^2\) and \(u_{L_{j,j+1}}=[u_{L_{j,j+1}}^1; u_{L_{j,j+1}}^2] \in \mathfrak {R}^2\) are respectively the control inputs of the quotient and locked systems, with the former to be utilized to regulate the inter-WMR coordination (i.e., \(h_{j,j+1} \rightarrow 0\)) while the latter to tele-drive the two WMRs while not perturbing the inter-WMR coordination aspect \(h_{j,j+1}\). These controls \((u_{L_{j,j+1}}\), \(u_{C_{j,j+1}}) \in \mathfrak {R}^4\) should also be distributable to each WMR according to their sensing, computing and communication architecture (see Fig. 1).

The fact that \(\mathcal{D}^c_{j,j+1}\) contains some components in \(\varDelta ^\perp _{j,j+1}\) implies that, by using \(u_{C_{j,j+1}}\), we can affect and regulate the inter-WMR coordination \(h_{j,j+1}\). For this, we design \(u_{C_{j,j+1}} \in \mathfrak {R}^2\) s.t.,

$$\begin{aligned} u_{C_{j,j+1}} =&-\mathcal{D}^{cT}_{j,j+1}(q_{j,j+1}) \cdot \left[ \frac{\partial h_{j,j+1}}{\partial q_{j,j+1}} \right] ^T \left[ \frac{\partial \varphi _{h_{j,j+1}}}{\partial h_{j,j+1}} \right] ^T \end{aligned}$$

where \(\varphi _{h_{j,j+1}}(h_{j,j+1}) \ge 0\) is a suitably defined positive-definite potential function to enforce \(h_{j,j+1}=0\) (i.e., \(\varphi _{h_{j,j+1}}=0\) iff \(h_{j,j+1}=0\), positively quadratic around \(h_{j,j+1}=0\), and \(\frac{\partial \varphi _{h_{j,j+1}}}{\partial h_{j,j+1}}=0\) iff \(\varphi _{h_{j,j+1}}=0\) in a neighborhood of \(h_{j,j+1}=0\)). We then have:

$$\begin{aligned} \frac{d \varphi _{h_{j,j+1}}}{dt}&= \frac{\partial \varphi _{h_{j,j+1}}}{\partial h_{j,j+1}} \frac{\partial h_{j,j+1}}{\partial q_{j,j+1}} \dot{q}_{j,j+1} \nonumber \\&= \frac{\partial \varphi _{h_{j,j+1}}}{\partial h_{j,j+1}} \frac{\partial h_{j,j+1}}{\partial q_{j,j+1}} \mathcal{D}^c_{j,j+1} u_{C_{j,j+1}} \nonumber \\&= - ||u_{C_{j,j+1}}||^2 \le 0 \end{aligned}$$

where we use the fact that \(\frac{\partial h_{j,j+1}}{\partial q_{j,j+1}} \cdot [ \mathcal{D}^\top _{j,j+1} \cap \varDelta ^\top _{j,j+1} ] \cdot u_{L_{j,j+1}} = 0\) \(\forall u_{L_{j,j+1}}\) from the geometric construction of (5). This then means that \(\varphi _{h_{j,j+1}}(t) \le \varphi _{h_{j,j+1}}(0)\), \(\forall t \ge 0\), that is, if the inter-WMR coordination error starts small (i.e., \(||h_{j,j+1}(0)|| \approx 0\)), it will stay small (i.e., \(||h_{j,j+1}(t)|| \approx 0\), \(\forall t \ge 0\)). Further, this error \(\varphi _{h_{j,j+1}}\) will be strictly decreasing as long as \(u_{C_{j,j+1}} \ne 0\). This also implies that, if we start with small enough \(h_{j,j+1}(0)\) with \(|\phi _j(0)-\phi _{j+1}(0)| < \pi \), then \(|\phi _j(t)-\phi _{j+1}(t)| < \pi \) \(\forall t \ge 0\) (i.e., the heading angle \(\phi _j\) in Fig. 3 will not flip around), since, if not, \(\varphi _{h_{j,j+1}}(t)\) should increase, contradicting \(\varphi _{h_{j,j+1}}(t) \le \varphi _{h_{j,j+1}}(0)\), \(\forall t \ge 0\). Here, note that the control \(u_{C_{j,j+1}}\) is distributed, as it is a function of only \(q_{j,j+1}\).
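The decrease argument above can be reproduced in simulation. The following NumPy sketch (ours) uses the simplest admissible potential \(\varphi = \frac{k}{2}||h_{j,j+1}||^2\) (the gain \(k\) is our choice, not the paper's) and Euler-integrates the closed loop \(\dot{q}_{j,j+1} = \mathcal{D}^c_{j,j+1} u_{C_{j,j+1}}\); \(\varphi\) is non-increasing and \(h_{j,j+1} \rightarrow 0\):

```python
# Simulate the coordination control (8) for one WMR pair and verify
# that the potential phi = k/2 ||h||^2 is monotonically non-increasing.
import numpy as np
c, s = np.cos, np.sin

def Dc(pj, p1, L):                          # quotient distribution (7)
    d = pj - p1
    return np.array([
        [c(pj), s(pj), 0, -c(d)*c(p1), -c(d)*s(p1), -L*s(d)],
        [0,     0,     0,  s(d)*c(p1),  s(d)*s(p1), -L*c(d)],
    ]).T

def dh(p1, L):                              # one-form of (3)
    return np.array([[1, 0, 0, -1, 0,  L*s(p1)],
                     [0, 1, 0,  0, -1, -L*c(p1)]])

def h(q, L):                                # q = [x_j;y_j;phi_j;x_{j+1};y_{j+1};phi_{j+1}]
    return np.array([q[0]-q[3]-L*c(q[5]), q[1]-q[4]-L*s(q[5])])

L, k, dt = 1.0, 1.0, 0.01
q = np.array([1.0, 0.3, 0.2, 0.0, 0.0, 0.0])   # small initial error h(0) = (0, 0.3)
phis = []
for _ in range(500):
    D = Dc(q[2], q[5], L)
    uC = -D.T @ dh(q[5], L).T @ (k * h(q, L))  # control (8) with phi = k/2 ||h||^2
    q = q + D @ uC * dt                        # closed loop qdot = D^c u_C
    phis.append(0.5 * k * np.linalg.norm(h(q, L))**2)
assert np.all(np.diff(phis) < 1e-12)           # phi is non-increasing
assert phis[-1] < 1e-4                         # h_{j,j+1} -> 0
```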

It should be noted that the convergence of the coordination between the WMRs is only guaranteed locally, due to the nonlinearity of the system [i.e., multiple equilibria in (3)] and the possibility of local minima arising from the use of potential functions in (8). In other words, if one spreads out the robots with a large formation error and activates the control, they will in general converge to the desired formation as long as each fore-running WMR is within the FOV of its following WMR, but this is not theoretically guaranteed. Nevertheless, we believe that, for most applications, it is not difficult in practice to start with small formation errors.

With the inter-WMR coordination \(||h_{j,j+1}|| \rightarrow 0\) for each pair among the n-WMRs attained by \(u_{C_{j,j+1}}\) in (8), the remaining task is then to tele-drive the collective of those pairs without perturbing \(h_{j,j+1} \rightarrow 0\) in a distributed fashion. This can be done (and, in fact, only) by driving them via the locked system control \(u_{L_{j,j+1}}\), as can be seen from the nonholonomic passive decomposition (5). Here, note that, if \(h_{j,j+1} \rightarrow 0\), \(u_{C_{j,j+1}} \rightarrow 0\) from the construction of \(\varphi _{h_{j,j+1}}\); and, if \(u_{C_{j,j+1}} \rightarrow 0\), \(u_{L_{j,j+1}} \rightarrow (v_j,w_j)\) from (5) with (6). To design \(u_{L_{j,j+1}}\), let us start with the leader WMR with \(j=1\). This leader WMR serves as the “eyes” of the remote user and will be directly tele-controlled by them, so that \((v_1,w_1)\) is given. From (5) with (6) and (7), we then have

$$\begin{aligned}&v_1 = u^1_{L_{1,2}} + u^1_{C_{1,2}} \\&w_1 = u^2_{L_{1,2}} \end{aligned}$$

from which we can define \(u_{L_{1,2}}\) s.t.,

$$\begin{aligned} \begin{pmatrix} u^1_{L_{1,2}} \\ u^2_{L_{1,2}} \end{pmatrix} = \begin{pmatrix} v_1 - u^1_{C_{1,2}} \\ w_1 \end{pmatrix} \end{aligned}$$

where recall that \(u_{C_{j,j+1}}\) is already specified by (8) as a function of only \(q_{j,j+1}\).

Now, suppose that the control input \((v_j,w_j)\) of the j-th WMR is given. Then, from (5) with (6) and (7), similar to (10), the locked system control \(u_{L_{j,j+1}}\) can be obtained by

$$\begin{aligned} \begin{pmatrix} u^1_{L_{j,j+1}} \\ u^2_{L_{j,j+1}} \end{pmatrix} = \begin{pmatrix} v_j - u^1_{C_{j,j+1}} \\ w_j \end{pmatrix} \end{aligned}$$

where again \(u_{C_{j,j+1}}\) is fully specified by (8) as a function of only \(q_{j,j+1}\). With \((u_{L_{j,j+1}},u_{C_{j,j+1}})\) now both determined, again from (5) with (6) and (7), we can compute the forward and angular velocity inputs of the \((j+1)\)-th WMR by

$$\begin{aligned} v_{j+1} =\,&u^1_{L_{j,j+1}}{\text {c}}(\phi _j-\phi _{j+1}) \nonumber \\&- \left[ \begin{array}{cc} {\text {c}}(\phi _j-\phi _{j+1})&-{\text {s}}(\phi _j-\phi _{j+1}) \end{array} \right] u_{C_{j,j+1}} \end{aligned}$$
$$\begin{aligned} w_{j+1} =\,&\frac{1}{L_j}u^1_{L_{j,j+1}} {\text {s}}(\phi _j-\phi _{j+1}) \nonumber \\&-L_j \left[ \begin{array}{cc} {\text {s}}(\phi _j-\phi _{j+1})&{\text {c}}(\phi _j-\phi _{j+1}) \end{array} \right] u_{C_{j,j+1}} \end{aligned}$$

Given \((v_{j+1},w_{j+1})\) of the \((j+1)\)-th WMR, we can also compute \(u_{L_{j+1,j+2}}\) for the pair of the \((j+1)\)-th WMR and \((j+2)\)-th WMR similar to (11) by

$$\begin{aligned} \begin{pmatrix} u^1_{L_{j+1,j+2}} \\ u^2_{L_{j+1,j+2}} \end{pmatrix} = \begin{pmatrix} v_{j+1} - u^1_{C_{j+1,j+2}} \\ w_{j+1} \end{pmatrix} \end{aligned}$$

where, again, \(u_{C_{j+1,j+2}}\) is already specified by (8) as a function of only \(q_{j+1,j+2}\). By repeating this process down to the n-th WMR, we can specify the control input \((v_j,w_j)\) for all the WMRs, \(j=1,\ldots ,n\), which will ensure the \((j+1)\)-th WMR to follow the j-th WMR (moving with \((v_j,w_j)\)) while enforcing the inter-WMR coordination requirement (i.e., \(h_{j,j+1} \rightarrow 0\)) regardless of the user command \((v_1,w_1)\) of the leader WMR.

Here, note that the control (12) and (13) of the \((j+1)\)-th WMR is only a function of \(q_{j,j+1}=[q_j;q_{j+1}]\) and \(u^1_{L_{j,j+1}}\): \(q_j\) and \(u^1_{L_{j,j+1}}\) are already known by the j-th WMR [with (11)], thus can be transmitted to the \((j+1)\)-th WMR via the peer-to-peer communication, while \(q_{j+1}\) can be estimated by the onboard sensor of the \((j+1)\)-th WMR with \(q_{j}\) received from the j-th WMR via the communication. This then shows that the control (12) and (13) is indeed distributed, requiring only onboard sensing and peer-to-peer communication. Here, we also assume the transmission delay between two WMRs via their peer-to-peer communication (with the data loss also included) to be negligible as compared to the operation speed of the WMRs. This is necessary for the proper functioning of the distributed control (12) and (13), and is granted for our experimental setup as well (i.e., WMR speed slower than 0.5 m/s with 250 Hz communication rate and near-zero data loss—see Sect. 5). This transmission delay effect will, however, be substantial for the predictive display when the number of the WMRs is large—see Sect. 4.
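The cascade (11)–(13) is easy to implement per pair. The following NumPy sketch (ours, with a hypothetical quadratic-potential gain \(k\) in the coordination control) computes \((v_{j+1},w_{j+1})\) from \((v_j,w_j)\) and \(q_{j,j+1}\), and checks that, on the desired formation (\(h_{j,j+1}=0\)), it reduces to the pure n-trailer propagation (15):

```python
# One step of the distributed control cascade (11)-(13):
# (v_j, w_j), q_{j,j+1}  ->  (v_{j+1}, w_{j+1}).
import numpy as np
c, s = np.cos, np.sin

def u_C(qj, qj1, L, k=1.0):
    """Coordination control (8) with phi = k/2 ||h||^2 (k: hypothetical gain)."""
    pj, p1 = qj[2], qj1[2]
    d = pj - p1
    Dc = np.array([
        [c(pj), s(pj), 0, -c(d)*c(p1), -c(d)*s(p1), -L*s(d)],
        [0,     0,     0,  s(d)*c(p1),  s(d)*s(p1), -L*c(d)],
    ]).T
    dh = np.array([[1, 0, 0, -1, 0,  L*s(p1)],
                   [0, 1, 0,  0, -1, -L*c(p1)]])
    hv = np.array([qj[0]-qj1[0]-L*c(p1), qj[1]-qj1[1]-L*s(p1)])
    return -Dc.T @ dh.T @ (k * hv)

def follower_cmd(qj, qj1, vj, L):
    """(v_{j+1}, w_{j+1}) via (11)-(13); w_j enters only through u^2_L = w_j,
       which drives the j-th WMR itself, so it is not needed here."""
    uC = u_C(qj, qj1, L)
    uL1 = vj - uC[0]                                  # Eq. (11)
    d = qj[2] - qj1[2]
    v1 = uL1*c(d) - (c(d)*uC[0] - s(d)*uC[1])         # Eq. (12)
    w1 = uL1*s(d)/L - L*(s(d)*uC[0] + c(d)*uC[1])     # Eq. (13)
    return v1, w1

# On the desired formation (h = 0), u_C vanishes and the cascade reduces to
# the n-trailer propagation (15): v2 = v1 cos(dphi), w2 = v1 sin(dphi)/L.
L, dphi = 1.0, 0.4
qj1 = np.array([0.0, 0.0, 0.0])
qj  = np.array([L, 0.0, dphi])                        # exactly L ahead of follower
v2, w2 = follower_cmd(qj, qj1, 1.0, L)
assert np.allclose(u_C(qj, qj1, L), 0)
assert np.isclose(v2, np.cos(dphi)) and np.isclose(w2, np.sin(dphi)/L)
```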

Now, suppose that we start with \(||h_{j,j+1}(0)|| \approx 0\) for all \(j=1,\ldots ,n-1\). Then, as stated before (11), \(u_{C_{j,j+1}} \approx 0\) as well, \(\forall j=1, 2, \ldots , n-1\). This then implies from (5) with (6) and (7) that

$$\begin{aligned} u^1_{L_{j,j+1}} \approx v_j,~~~ u^2_{L_{j,j+1}} \approx w_j \end{aligned}$$

which, from (12), further implies that

$$\begin{aligned} v_{j+1} \approx v_j {\text {c}}(\phi _j-\phi _{j+1}),~ w_{j+1} \approx \frac{v_j}{L_j} {\text {s}}(\phi _j-\phi _{j+1}) \end{aligned}$$

or, equivalently,

$$\begin{aligned}&v_{j+1} \approx v_1 \cdot \prod _{p=1}^j {\text {c}}(\phi _p-\phi _{p+1}) \end{aligned}$$
$$\begin{aligned}&w_{j+1} \approx \frac{v_1}{L_j}{\text {s}}(\phi _j-\phi _{j+1}) \cdot \prod _{p=1}^{j-1} {\text {c}}(\phi _p-\phi _{p+1}) \end{aligned}$$

Here, note that if all the WMRs are aligned (i.e., \(\phi _j=\phi _{j+1}\)), \(v_j \rightarrow v_1\) with \(w_j=0\) \(\forall j=1,2,..,n\), i.e., pulling the straight-line platoon by its leader WMR. On the other hand, if \(\phi _j-\phi _{j+1}= \pi /2\), \(w_{j+1} = \frac{v_j}{L_j}\) with \(v_{j+1}=0\), i.e., pulling the upright stem of the reversed “L”-shape normal to its bottom horizontal line with only the \((j+1)\)-th WMR (i.e., horizontal line end) instantaneously rotating to keep its heading directed to the axle-center of the j-th WMR (i.e., stem end). This is in stark contrast with the work of Lui and Lee (2017), where a pair of two WMRs is “pushed” by the follower WMR in a centralized manner with an external MOCAP (motion capture system).

Although developed here only for the line graph topology for simplicity, our framework can also incorporate any directed tree graph topology among the WMRs, by attaching a line branch to its preceding stem branch with the leader WMR as the globally reachable root of the whole tree graph (Ren and Beard 2008; Lee 2016). Separation among the branches can then easily be achieved by using the same formation map (3), yet with some offset between the onboard camera direction and the heading direction of the WMR, i.e.,

$$\begin{aligned} h_{j,j+1} := \begin{pmatrix} x_j-x_{j+1}-L_j {\text {c}}(\phi _{j+1} + \alpha _{j+1})\\ y_j-y_{j+1}-L_j {\text {s}}(\phi _{j+1} + \alpha _{j+1}) \end{pmatrix} =0 \end{aligned}$$

where \(\alpha _{j+1} \in \mathfrak {R}\) is the constant offset angle. If \(h_{j,j+1}=0\), the branch will then be “slanted” by the angle \(\alpha _{j+1}\) from its stem branch. See Fig. 4, where this slanting is used to attain a triangular formation among the WMRs. Note also that, since the sensing, computation and communication are all distributed and the control (12) and (13) is uni-directional, any tree sub-graph can be removed from, or added to, the downstream of the platoon without affecting the performance of its upstream WMRs at all. This implies the scalability of our proposed framework.
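The offset formation map (18) differs from (3) only by the constant angle \(\alpha _{j+1}\) added to the follower's heading. A short NumPy sketch (ours) showing the slanted equilibrium:

```python
# Evaluate the offset formation map (18) with camera-heading offset alpha.
import numpy as np

def h_offset(qj, qj1, L, alpha):
    """h_{j,j+1} of Eq. (18); alpha = 0 recovers the original map (3)."""
    x, y, p = qj1
    return np.array([qj[0] - x - L*np.cos(p + alpha),
                     qj[1] - y - L*np.sin(p + alpha)])

# With alpha = 30 deg, the fore-runner sits L away at a 30-deg-slanted bearing
# relative to the follower's heading:
L, a = 1.0, np.pi/6
qj1 = np.array([0.0, 0.0, 0.0])
qj  = np.array([L*np.cos(a), L*np.sin(a), 0.0])
assert np.allclose(h_offset(qj, qj1, L, a), 0)
assert not np.allclose(h_offset(qj, qj1, L, 0.0), 0)   # alpha = 0 would not hold
```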

Fig. 4

Simulation result of the platoon of 30 WMRs making a triangular formation under the distributed control (12) and (13) and the modified formation map (18) with the camera-heading offset angle \(\alpha _j\) to slant the two branches of the formation from each other. The red circle indicates the leader WMR, the blue-dotted arrow means the desired command, and the colored traces are the past trajectories of the WMRs (Color figure online)

The nonholonomic passive decomposition and its behavior decomposition [e.g., (12) and (13)] turn out to greatly simplify the uncertainty propagation computation for the predictive display in Sect. 4, as they allow us to consider only the collective driving aspect (i.e., with \(u_{L_{j,j+1}}\)), not the inter-WMR coordination aspect, which is regulated locally by \(u_{C_{j,j+1}}\). This is in stark contrast to Desai et al. (2001), where the same control objective (3) is attained for distributed nonholonomic WMRs, yet without such a behavior decomposition; there, the uncertainty propagation computation would have to use the full kinematics with both the collective and coordination aspects, which can substantially increase the computational complexity, as the closed-loop kinematics contains many nonlinearities. The passive decomposition was also used in Lee and Spong (2005) (centralized holonomic robots), Lee and Li (2007) (partially-distributed holonomic robots), Lee (2012) (distributed holonomic robots), Lee (2008) (nonholonomic, yet centralized), and Lui and Lee (2017) (nonholonomic, yet only two WMRs and centralized). Its application to distributed nonholonomic WMRs is, however, done for the first time in this paper.

3 Distributed pose estimation

3.1 EKF pose estimation of leader WMR

The leader WMR runs the SLAM algorithm of Grisettiyz et al. (2005) and Grisetti et al. (2007) with the LiDAR sensor. This LiDAR-SLAM runs at 40 Hz, the LiDAR scanning rate. For smoother and more accurate pose estimation, we fuse this LiDAR-SLAM pose information with the IMU data via extended Kalman filtering (EKF). For this, following Hesch et al. (2014) and Lee et al. (2016), we utilize the IMU data (only the (x, y)-accelerometer and yaw gyroscope are used) for the EKF propagation step (at 250 Hz), and the LiDAR-SLAM data for the EKF measurement update (at 40 Hz). We also adopt the error-state EKF technique for faster and more robust estimation performance (Hesch et al. 2014).

More precisely, we define the state for this LiDAR-IMU sensor fusion s.t.,

$$\begin{aligned} \chi _1 := \begin{bmatrix} p_1;&\xi _1;&\phi _1;&b_{a_1};&b_{g_1} \end{bmatrix} \in \mathfrak {R}^{8} \end{aligned}$$

where \(p_1 := [x_1;y_1] \in \mathfrak {R}^2, \xi _1 := [\dot{x}_1;\dot{y}_1] \in \mathfrak {R}^2\), and \(b_{a_1} =[b_{a_1,x};b_{a_1,y}] \in \mathfrak {R}^2 \) and \(b_{g_1} \in \mathfrak {R}\) are the biases of the accelerometer and gyroscope measurements, each modeled as a random-walk process driven by zero-mean white Gaussian noise \(n_{wa_1} \in \mathfrak {R}^{2}\) and \(n_{wg_1} \in \mathfrak {R}\), respectively. The continuous-time state equation is then given by

$$\begin{aligned} \dot{p}_1(t)&= \xi _1 (t), ~~\dot{\xi }_1(t) = a_1(t), ~~\dot{\phi }_1(t) = \omega _1(t) \nonumber \\ \dot{b}_{a_1}(t)&= n_{wa_1}(t), ~~\dot{b}_{g_1}(t) = n_{wg_1}(t) \end{aligned}$$

where \(a_1(t) \in \mathfrak {R}^2\) is the acceleration of the leader WMR expressed in \(\{ \mathcal {O} \}\). The accelerometer and gyroscope sensor models are also given by

$$\begin{aligned} a_{m_1}(t)&= R^{\mathcal {G}_1}_{\mathcal {O}} (\phi _1(t))a_1(t) + b_{a_1}(t) + n_{a_1}(t)\nonumber \\ \omega _{m_1}(t)&= \omega _1(t) + b_{g_1}(t) + n_{g_1}(t) \end{aligned}$$

where \(n_{a_1}(t) \in \mathfrak {R}^2\) and \(n_{g_1}(t) \in \mathfrak {R}\) are the zero mean, white Gaussian measurement noise of the accelerometer and gyroscope sensing, and \( R^{\mathcal {G}_1}_{\mathcal {O}} \in \text {SO}(2)\) is the rotation matrix from the inertial frame \(\{ \mathcal {O} \}\) to the body-fixed frame of leader WMR \(\{ \mathcal {G}_{1} \}\).

Let us define the error state for the EKF s.t.,

$$\begin{aligned} \tilde{\chi }_1 := \begin{bmatrix} \tilde{p}_1;&\tilde{\xi }_1;&\tilde{\phi }_1;&\tilde{b}_{a_1};&\tilde{b}_{g_1} \end{bmatrix} \in \mathfrak {R}^{8} \end{aligned}$$

where \((\tilde{\cdot }) = (\cdot ) - (\hat{\cdot })\) is the difference between the true state (19) and the estimated state [obtained by applying the expectation to (20)]. Then, the linearized continuous-time error state equation is given by

$$\begin{aligned} \dot{\tilde{\chi }}_1 = F_1(t)\tilde{\chi }_1 + G_1(t)n_1 \end{aligned}$$

where \(n_1 = \begin{bmatrix} n_{a_1};&n_{wa_1};&n_{g_1};&n_{wg_1} \end{bmatrix} \in \mathfrak {R}^{6}\) is the system noise vector, and

$$\begin{aligned} F_1(t)&:= \begin{bmatrix} 0_{2 \times 2}&\quad I_2&\quad 0_{2 \times 1}&\quad 0_{2 \times 2}&\quad 0_{2 \times 1}\\ 0_{2 \times 2}&\quad 0_{2 \times 2}&\quad S \hat{a}_1(t)&\quad -R_{\mathcal {G}_1}^{\mathcal O}(\hat{\phi }_1(t))&\quad 0_{2 \times 1}\\ 0_{1 \times 2}&\quad 0_{1 \times 2}&\quad 0&\quad 0_{1 \times 2}&\quad -\,1\\ 0_{2 \times 2}&\quad 0_{2 \times 2}&\quad 0_{2 \times 1}&\quad 0_{2 \times 2}&\quad 0_{2 \times 1}\\ 0_{1 \times 2}&\quad 0_{1 \times 2}&\quad 0&\quad 0_{1 \times 2}&\quad 0 \end{bmatrix}\\&\qquad \in \mathfrak {R}^{8 \times 8}\\ G_1(t)&:= \begin{bmatrix} 0_{2 \times 2}&\quad 0_{2 \times 2}&\quad 0_{2 \times 1}&\quad 0_{2 \times 1} \\ -R_{\mathcal {G}_1}^{\mathcal O}(\hat{\phi }_1(t))&\quad 0_{2 \times 2}&\quad 0_{2 \times 1}&\quad 0_{2 \times 1}\\ 0_{1 \times 2}&\quad 0_{1 \times 2}&\quad -\,1&\quad 0\\ 0_{2 \times 2}&\quad I_2&0_{2 \times 1}&\quad 0_{2 \times 1}\\ 0_{1 \times 2}&\quad 0_{1 \times 2}&\quad 0&\quad 1 \end{bmatrix} \in \mathfrak {R}^{8 \times 6} \end{aligned}$$

with \(\hat{a}_1(t) := R^{\mathcal {O}}_{\mathcal {G}_1} (\hat{\phi }_1(t)) (a_{m_1}(t)-\hat{b}_{a_1}(t)) \) and \(S:=[0,-1; 1,0]\in \mathfrak {R}^{2\times 2}\). This continuous-time Eq. (22) can then be discretized at time \(t_k\) by using \(\frac{\tilde{\chi }_{1,k+1} - \tilde{\chi }_{1,k}}{dt} = \dot{\tilde{\chi }}_{1,k}\), \(dt = t_{k+1} - t_k\), with the IMU information (21) as done in Hesch et al. (2014) and Lee et al. (2016). From this discretized equation, we can further obtain the error state covariance propagation equation s.t.,

$$\begin{aligned}&P^{\chi _1}_{k+1 \vert k} \simeq (I_8 + F_{1,k} dt)P^{\chi _1}_{k \vert k} (I_8 + F_{1,k} dt)^T \\&\quad \quad \quad \quad + (G_{1,k} dt) Q_1 (G_{1,k}^Tdt) \end{aligned}$$

where \(F_{1,k}:=F_1(t_k)\) and \(G_{1,k}:=G_1(t_k)\) are the matrices evaluated at time \(t_k\), \(P^{\chi _1}_{k |k}\) is the a posteriori covariance of the leader WMR at k, \(P^{\chi _1}_{k+1 |k}\) is the a priori covariance at \(k+1\), and \(Q_1 := E[n_1n_1^T ]\) is the covariance of the noise vector \(n_1\). As the measurement for the EKF, we use the pose measurement obtained by the LiDAR-SLAM, i.e.,

$$\begin{aligned} z_{\text {SLAM}} = \begin{pmatrix} p_1\\ \phi _1 \end{pmatrix}+ \begin{pmatrix} n_{p_1}\\ n_{\phi _1} \end{pmatrix} \end{aligned}$$

where \((n_{p_1}\),\(n_{\phi _1})\) are the zero mean, white Gaussian noise of the LiDAR-SLAM pose measurement. Then, the measurement model of the error state is given by

$$\begin{aligned} \tilde{z}_{\text {SLAM}}&= \begin{pmatrix} \tilde{p}_1\\ \tilde{\phi }_1 \end{pmatrix} + \begin{pmatrix} n_{p_1}\\ n_{\phi _1} \end{pmatrix}\\&= \underbrace{\begin{bmatrix} I_2&\quad 0_{2 \times 2}&\quad 0_{2 \times 1}&\quad 0_{2 \times 3}\\ 0_{1 \times 2}&\quad 0_{1 \times 2}&\quad 1&\quad 0_{1 \times 3}\\ \end{bmatrix}}_{=:H_1} \tilde{\chi }_1 + \begin{pmatrix} n_{p_1}\\ n_{\phi _1} \end{pmatrix} \end{aligned}$$

which defines the update equation for (22). To correct the error state \(\tilde{\chi }_1\) with this measurement \(z_\text {SLAM}\) received at \(k+1\), the Kalman gain \(K_{k+1}\) is computed s.t.,

$$\begin{aligned} K_{k+1} = P^{\chi _1}_{k+1 |k} H_1^T S_{k+1}^{-1} \end{aligned}$$

where \(S_{k+1} = H_1 P^{\chi _1}_{k+1 |k} H_1^T + R_1\) is the covariance of the residual, and \(R_1\) is the covariance of the measurement. We then update the propagated error state and its covariance s.t.,

$$\begin{aligned} \hat{\chi }_{1,k+1 |k+1}&= \hat{\chi }_{1,k+1 |k} + K_{k+1} \tilde{z}_{\text {SLAM},k+1}\nonumber \\ P^{\chi _1}_{k+1 |k+1}&= P^{\chi _1}_{k+1 |k} - P^{\chi _1}_{k+1 |k} H_1^T S^{-1}_{k+1} H_1 P^{\chi _1}_{k+1 |k} \end{aligned}$$
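As a concrete illustration, the covariance propagation and measurement update steps above can be sketched numerically as follows. This is a minimal sketch only: the function names and the toy matrix values in the test are ours, not part of the onboard implementation.

```python
import numpy as np

def propagate_covariance(P, F, G, Q, dt):
    """Error-state covariance propagation: first-order discretization
    P_{k+1|k} = (I + F dt) P_{k|k} (I + F dt)^T + (G dt) Q (G dt)^T."""
    Phi = np.eye(P.shape[0]) + F * dt        # state transition ~ I + F*dt
    return Phi @ P @ Phi.T + (G * dt) @ Q @ (G * dt).T

def ekf_update(x_hat, P, z_residual, H, R):
    """Standard EKF update with measurement residual z_residual."""
    S = H @ P @ H.T + R                      # residual covariance
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    x_new = x_hat + K @ z_residual           # corrected state estimate
    P_new = P - P @ H.T @ np.linalg.inv(S) @ H @ P
    return x_new, P_new
```

Here `propagate_covariance` mirrors the first-order discretization used above, and `ekf_update` applies the gain, state, and covariance corrections in one call.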

3.2 EKF pose estimation of follower WMRs

For the pose estimation of the “simple” follower WMRs, we use only low-cost sensors (i.e., monocular camera and IMU). Each follower WMR obtains relative position and bearing measurements of the fore-running WMR by using the monocular camera and fiducial markers similar to Olson (2011) attached on the fore-running WMR. This camera information is then fused with the IMU data via the error-state EKF as done for the leader WMR in Sect. 3.1. For this, similar to (19), define the sensor fusion state of the j-th follower WMR s.t.,

$$\begin{aligned} \chi _j := \begin{bmatrix} p_j;&\xi _j;&\phi _j;&b_{a_j};&b_{g_j} \end{bmatrix} \in \mathfrak {R}^{8} \end{aligned}$$

The propagation step of the error-state EKF is then the same as that of the leader WMR. The only difference is the update step, where the measurement of the camera and the pose estimate of the \((j-1)\)-th WMR received from the communication are used.

First, the camera measurement model is given by

$$\begin{aligned} z_{j,\text {camera}} = \begin{pmatrix} z_p \\ z_{\phi } \end{pmatrix} + \begin{pmatrix} n_{p_j}\\ n_{\phi _j} \end{pmatrix} \end{aligned}$$

where \(z_p := R_{\mathcal {O}}^{\mathcal {G}_j}(\phi _j)\) \((p_{j-1} - p_{j}) \in \mathfrak {R}^2\) and \(z_{\phi }:= (\phi _{j-1} - \phi _{j}) \in \mathfrak {R}\) are the relative position and bearing of the \((j-1)\)-th WMR from the j-th WMR expressed in the j-th WMR frame \(\{ \mathcal {G}_{j} \}\), and \(n_{p_j} \in \mathfrak {R}^2\) and \(n_{\phi _j} \in \mathfrak {R}\) are the zero mean, white Gaussian noise of the camera measurement. Then the pose measurement of the j-th WMR expressed in \(\{ \mathcal {O} \}\) is given by

$$\begin{aligned} z_{j,\text {pose}}&= \begin{pmatrix} p_{j-1}\\ \phi _{j-1} \end{pmatrix} - \begin{pmatrix} R^{\mathcal {O}}_{\mathcal {G}_j}(\hat{\phi }_j) (z_p + n_{p_j}) \\ z_{\phi } + n_{\phi _j} \end{pmatrix} \end{aligned}$$

where \((p_{j-1}, \phi _{j-1})\) is the pose of the fore-running WMR (to be received from the communication). The measurement output can then be rewritten for the error state s.t.,

$$\begin{aligned} \tilde{z}_{j,\text {pose}}&= \begin{pmatrix} \tilde{p}_{j-1} \\ \tilde{\phi }_{j-1} \end{pmatrix} - \begin{pmatrix} R^{\mathcal {O}}_{\mathcal {G}_j}(\hat{\phi }_j) n_{p_j}\\ n_{\phi _j} \end{pmatrix}\\&= \underbrace{\begin{bmatrix} I_2&\quad 0_{2 \times 2}&\quad 0_{2 \times 1}&\quad 0_{2 \times 3}\\ 0_{1 \times 2}&\quad 0_{1 \times 2}&\quad 1&\quad 0_{1 \times 3}\\ \end{bmatrix}}_{=:H_j} \tilde{\chi }_j \\&\quad - \begin{pmatrix} R^{\mathcal {O}}_{\mathcal {G}_j}(\hat{\phi }_j(t)) n_{p_j}\\ n_{\phi _j} \end{pmatrix} \end{aligned}$$

with its measurement covariance given by

$$\begin{aligned} R_j :&= E[\tilde{z}_{j,\text {pose}} \tilde{z}_{j,\text {pose}}^T ]\\&= \begin{bmatrix} P^{\chi _{j-1}}_{ p p} + R^{\mathcal {O}}_{\mathcal {G}_j}(\hat{\phi }_j) R_{p_j} (R^{\mathcal {O}}_{\mathcal {G}_j}(\hat{\phi }_j))^T&\quad P^{\chi _{j-1}}_{p \phi }\\ P^{\chi _{j-1}}_{\phi p}&\quad P^{\chi _{j-1}}_{\phi \phi } + R_{\phi _j} \end{bmatrix} \end{aligned}$$

where \((P^{\chi _{j-1}}_{ p p},P^{\chi _{j-1}}_{p \phi }, P^{\chi _{j-1}}_{\phi p},P^{\chi _{j-1}}_{\phi \phi })\) are the blocks of the covariance of the \((j-1)\)-th WMR pose estimate (received from the communication), and \(R_{p_j} := E[n_{p_j} n_{p_j}^T]\) and \(R_{\phi _j} := E[n_{\phi _j} n_{\phi _j}^T]\) are the covariances of the relative position and bearing measurements obtained from the monocular camera, respectively. When the measurement is received, we can then update the state and covariance of the error-state EKF of the j-th WMR as done in (23) and (24). The covariance \(P^{\chi _j}_{k+1 |k+1}\) plays a crucial role for the predictive display developed in Sect. 4 via its propagation through the kinematics of the WMRs and their communication.
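The assembly of the measurement covariance \(R_j\) above, combining the received fore-runner pose covariance with the camera noise rotated into the world frame, can be sketched as follows. The function name and the toy values in the test are our own; the received \((j-1)\)-th covariance is passed in as a single 3 \(\times \) 3 pose block.

```python
import numpy as np

def rot2d(phi):
    """Planar rotation matrix R^O_G(phi)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s], [s, c]])

def follower_measurement_covariance(P_prev_pose, R_p, R_phi, phi_hat):
    """Assemble R_j: the 3x3 pose covariance of the fore-running WMR
    (received over the network) plus the camera noise rotated into the
    world frame via the estimated heading phi_hat."""
    R = rot2d(phi_hat)
    Rj = P_prev_pose.copy()
    Rj[:2, :2] += R @ R_p @ R.T     # relative-position noise, world frame
    Rj[2, 2] += R_phi               # relative-bearing noise
    return Rj
```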

4 Predictive display for distributed robots teleoperation

Even when the n WMRs are reduced to the (familiar) n-trailer platoon by the distributed control of Sect. 2, it is still difficult for typical users to tele-navigate this platoon through an obstacle-laden environment, as the platoon can exhibit complex internal serpentine articulation, particularly when it experiences a large curvature change. See Fig. 4. To help the remote user overcome this difficulty, here we propose a novel predictive display, which, by providing the user with the estimated current and predicted future pose of each WMR along with the possibility of collision, can significantly enhance the tele-driving performance while substantially reducing the user's cognitive load, as manifested by the human subject study in Sect. 5.4. Into this predictive display, we also fully incorporate the uncertainty inherent to distributed robots, whose sensing, computation and communication are all distributed, that is: (1) the local pose estimation uncertainty of each WMR, starting from that of the leader WMR, with the relative pose measurement uncertainty accumulated downstream to the last WMR; and (2) the current pose estimation uncertainty of each WMR, stemming from the uncertainty of their motions during the transmission delay, accumulated upstream to the leader WMR.

Fig. 5
figure 5

Predictive display consists of the two propagation stages: (1) estimation propagation stage to estimate the pose of each WMR at the current time from its reception with the transmission delay; and (2) prediction propagation stage to predict the future course of platoon motion over prediction time horizon via forward propagation

To address this complex platoon motion and the distribution-inherent uncertainty, here, we design our predictive display to be composed of the following two stages: (1) estimation propagation stage, where the (old) pose estimate of each WMR, received at the current time \(t_k\), is propagated through their transmission delay, so that we can probabilistically estimate its pose at the current time \(t_k\) with the associated uncertainty; and (2) prediction propagation stage, where the estimated current pose of each WMR at \(t_k\), obtained via the above estimation propagation, is forward-propagated with the current user command so that we can predict the future course of the platoon motion over the prediction time horizon. See Fig. 5.

For this, we assume that all the delays (e.g., processing, data conversion, etc.) are lumped into the transmission delay, which is small compared to the WMR speed (so that the distributed control (12) and (13) works properly) and can also be made constant with a suitable buffering algorithm. Note that this assumption can easily be satisfied, at least approximately, if the communication rate is much faster than the WMR speed with a negligible data loss rate. This is in fact true for our experimental setup in Sect. 5, where all the estimation, control and communication run at 250 Hz with near-zero data loss, which is much faster than the WMR operation speed (\(\le 0.5\) m/s). Thus, below, we assume the transmission delay and the computation time to be the same (i.e., \(t_{k+1}-t_k=4\) ms for the setup in Sect. 5). Of course, the derivation below can easily be extended when the transmission delay is constant yet longer than the computation time. We also assume the LiDAR-SLAM map uncertainty to be much less than that of the pose estimation of each WMR, implying that, for collision detection, we need to consider only the uncertainty of the pose estimates of the WMRs.

4.1 Estimation propagation

Note that the pose information of the j-th WMR received at the current time \(t_k\) by the MCS (main control station) is given by

$$\begin{aligned} \hat{q}_{j,k-j} \approx {\mathcal N}(\bar{q}_{j,k-j},P_{j,k-j}) \end{aligned}$$

i.e., the estimated state \(\hat{q}_{j}(t_{k-j})\) computed via the EKF sensor fusion by the j-th WMR and transmitted via j hops of the peer-to-peer communication, first to the \((j-1)\)-th WMR and all the way to the leader WMR and the MCS. Following the EKF sensor fusion of Sect. 3, this estimated state \(\hat{q}_{j,{k-j}}\) is characterized by a Gaussian distribution with the mean \(\bar{q}_{j,{k-j}}\) and the covariance \(P_{j,{k-j}}\). Given this “delayed” information, we then attempt to estimate the pose of the WMRs at the current time \(t_k\) by using the unscented transformation as follows.

Let us start with the leader WMR, from which, at time \(t_k\), the MCS receives its estimated pose

$$\begin{aligned} \hat{q}_{1,k-1} \approx {\mathcal N}(\bar{q}_{1,k-1},P_{1,k-1}) \end{aligned}$$

which is assumed to be Gaussian as above. During the interval \([t_{k-1},t_k)\), the control input for the leader WMR is simply given by \((v_{1,k-1},w_{1,k-1})\), which is directly received from the MCS. We then use the following equation to estimate \(\hat{q}_{1,k}\):

$$\begin{aligned} \hat{q}_{1,k} = \hat{q}_{1,k-1} + dt \cdot \underbrace{\left[ \begin{array}{cc} {\text {c}}\hat{\phi }_{1,k-1} &{} 0 \\ {\text {s}}\hat{\phi }_{1,k-1} &{} 0 \\ 0 &{} 1 \end{array} \right] }_{=:A(\hat{q}_{1,k-1})} \begin{pmatrix} v_{1,k-1} \\ w_{1,k-1} \end{pmatrix} \end{aligned}$$

which can be written by the following nonlinear map

$$\begin{aligned} \hat{q}_{1,k} = g(\hat{q}_{1,k-1},v_{1,k-1},w_{1,k-1}) \end{aligned}$$

Here, note that, via this nonlinear map, the random variable \(\hat{q}_{1,k-1}\) propagates to another random variable \(\hat{q}_{1,k}\). This can in fact be computed by using the unscented transformation (UT) s.t.,

$$\begin{aligned} \hat{q}^{ut}_{1,k} := {\text {UT}}_{g,v_{1,k-1},w_{1,k-1}}(\hat{q}_{1,k-1}) \approx {\mathcal N}(\bar{q}^{ut}_{1,k}, P^{ut}_{1,k}) \end{aligned}$$

where \({\text {UT}}_{g,\star }\) is the unscented transformation via the nonlinear map \(g( \cdot ,\star )\).
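A minimal sketch of this unscented transformation through the leader kinematic map g above is given below. We use the basic symmetric sigma-point set; the scaling parameter \(\kappa \) and the function names are our own design choices, not specified in the text.

```python
import numpy as np

def g(q, v, w, dt=0.004):
    """Discretized unicycle kinematic map: q = [x, y, phi]."""
    x, y, phi = q
    return np.array([x + dt * v * np.cos(phi),
                     y + dt * v * np.sin(phi),
                     phi + dt * w])

def unscented_transform(q_bar, P, v, w, dt=0.004, kappa=0.0):
    """Propagate N(q_bar, P) through g via symmetric sigma points."""
    n = len(q_bar)
    L = np.linalg.cholesky((n + kappa) * P)       # sigma-point spread
    sigmas = [q_bar] + [q_bar + L[:, i] for i in range(n)] \
                     + [q_bar - L[:, i] for i in range(n)]
    weights = [kappa / (n + kappa)] + [1.0 / (2 * (n + kappa))] * (2 * n)
    ys = [g(s, v, w, dt) for s in sigmas]
    mean = sum(wt * y for wt, y in zip(weights, ys))
    cov = sum(wt * np.outer(y - mean, y - mean)
              for wt, y in zip(weights, ys))
    return mean, cov
```

For small pose covariances the UT mean stays close to the deterministic propagation of the mean, while the returned covariance captures how the heading uncertainty spreads into position.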

Let us then move on to the second WMR, for which the MCS receives its estimated pose s.t.,

$$\begin{aligned} \hat{q}_{2,k-2} \approx {\mathcal N}(\bar{q}_{2,k-2}, P_{2,k-2}) \end{aligned}$$

at the time instance \(t_k\). We then first propagate the kinematics of this second WMR from \(t_{k-2}\) to \(t_{k-1}\) using (25). For this, from (15), we have

$$\begin{aligned} \hat{q}_{2,k-1}&= \hat{q}_{2,k-2} + dt \cdot A(\hat{q}_{2,k-2})\\&\quad \times \begin{pmatrix} v_{1,k-2} {\text {c}}(\bar{\phi }_{1,k-2}-\bar{\phi }_{2,k-2}) \\ \frac{v_{1,k-2}}{L} {\text {s}}(\bar{\phi }_{1,k-2}-\bar{\phi }_{2,k-2}) \end{pmatrix} \end{aligned}$$

where \(\bar{\phi }_{1,k-2}\) is available at the MCS, since it is received from the leader WMR at the (past) time instance \(t_{k-1}\). Then, we can do the “estimation propagation” via the unscented transformation s.t.,

$$\begin{aligned} \hat{q}^{ut}_{2,k-1} := {\text {UT}}_{g,v_{1,k-2},\bar{\phi }^{k-2}_{1,2}}(\hat{q}_{2,k-2}) \end{aligned}$$

where \(\bar{\phi }^{k-2}_{1,2}:=\bar{\phi }_{1,k-2}-\bar{\phi }_{2,k-2}\). To estimate-propagate from \(k-1\) to k, we also use:

$$\begin{aligned} \hat{q}_{2,k} = \hat{q}^{ut}_{2,k-1} + dt \cdot A(\hat{q}^{ut}_{2,k-1}) \begin{pmatrix} v_{1,k-1} {\text {c}}(\bar{\phi }_{1,k-1}-\bar{\phi }^{ut}_{2,k-1}) \\ \frac{v_{1,k-1}}{L} {\text {s}}(\bar{\phi }_{1,k-1}-\bar{\phi }^{ut}_{2,k-1}) \end{pmatrix} \end{aligned}$$

from which we estimate \(\hat{q}^{ut}_{2,k}\) via the unscented transformation s.t.,

$$\begin{aligned} \hat{q}^{ut}_{2,k} := {\text {UT}}_{g,v_{1,k-1},\bar{\phi }^{ut^\prime ,k-1}_{1,2}}(\hat{q}^{ut}_{2,k-1}) \end{aligned}$$

where \(\bar{\phi }^{ut^\prime ,k-1}_{1,2}:=\bar{\phi }_{1,k-1}-\bar{\phi }^{ut}_{2,k-1}\) with \(\bar{\phi }_{1,k-1}\) and \(v_{1,k-1}\) available via the communication from the leader WMR to the MCS at \(t_k\).

We can then generalize this for the j-th WMR as the following sequential estimation-propagation procedure: with (15),

$$\begin{aligned}&\hat{q}^{ut}_{j,k-j+1} := {\text {UT}}_{g,v_{1,k-j},\bar{\phi }^{k-j}_{1,2},\ldots \bar{\phi }^{k-j}_{j-2,j-1},\bar{\phi }^{k-j}_{j-1,j}}(\hat{q}_{j,k-j})\nonumber \\&\hat{q}^{ut}_{j,k-j+2} := {\text {UT}}_{g,v_{1,k-j+1},\bar{\phi }^{k-j+1}_{1,2},\ldots \bar{\phi }^{k-j+1}_{j-2,j-1},\bar{\phi }^{ut^\prime ,k-j+1}_{j-1,j}}(\hat{q}^{ut}_{j,k-j+1})\nonumber \\&\hat{q}^{ut}_{j,k-j+3} := {\text {UT}}_{g,v_{1,k-j+2},\bar{\phi }^{k-j+2}_{1,2},\ldots \bar{\phi }^{ut^\prime ,k-j+2}_{j-2,j-1},\bar{\phi }^{ut,k-j+2}_{j-1,j}}(\hat{q}^{ut}_{j,k-j+2})\nonumber \\&\vdots \nonumber \\&\hat{q}^{ut}_{j,k} := {\text {UT}}_{g,v_{1,k-1},\bar{\phi }^{ut^\prime ,k-1}_{1,2},\ldots \bar{\phi }^{ut,k-1}_{j-2,j-1},\bar{\phi }^{ut,k-1}_{j-1,j}}(\hat{q}^{ut}_{j,k-1}) \end{aligned}$$

where \(\bar{\phi }^{ut^\prime ,k}_{j,j+1}:=\bar{\phi }^{ut}_{j,k}-\bar{\phi }^{ut}_{j+1,k}\), and all the terms are available at time k from the same procedure already carried out from the leader WMR to the \((j-1)\)-th WMR. Here, note that the nonholonomic passive decomposition and its behavior decomposition [i.e., (15)] greatly simplify this estimation-propagation computation, as they allow us to consider only the collective motion behavior. This computation would be more complex if we used the scheme of Desai et al. (2001), which requires us to consider both the collective motion and the inter-WMR coordination behaviors.
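The sequential estimation propagation above can be sketched as follows, propagating only the mean for brevity (in the full scheme each step is an unscented transform, so the covariance is propagated alongside). Here `g_follower` is our shorthand for the collective-motion kinematics (15), and the inter-WMR distance L is an illustrative value.

```python
import numpy as np

def g_follower(q, v_lead, dphi, L=0.8, dt=0.004):
    """Collective-motion kinematics (15) for a follower: the heading
    difference dphi to the fore-running WMR drives its trailer-like motion."""
    x, y, phi = q
    v = v_lead * np.cos(dphi)          # forward speed along own heading
    w = (v_lead / L) * np.sin(dphi)    # heading rate toward fore-runner
    return np.array([x + dt * v * np.cos(phi),
                     y + dt * v * np.sin(phi),
                     phi + dt * w])

def estimation_propagation(q_j_delayed, inputs):
    """Roll the j-step-delayed estimate of WMR j forward to the current
    time t_k; inputs is a list of (v_lead, dphi) pairs, oldest first,
    one per elapsed communication step (mean propagation only)."""
    q = np.asarray(q_j_delayed, dtype=float)
    for v_lead, dphi in inputs:
        q = g_follower(q, v_lead, dphi)
    return q
```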

4.2 Prediction propagation

Once we obtain the current pose estimates \((\hat{q}^{ut}_{1,k},\hat{q}^{ut}_{2,k}\), ...,\(\hat{q}^{ut}_{n,k})\) of all the n WMRs via the estimation propagation, we then perform the “prediction propagation”, that is, predict the pose of each WMR when the same current human command \((v_{1,k}, w_{1,k})\) is continuously applied during the prediction horizon \([t_k,t_{k_p})\). Note that this does not mean the predicted poses are computed only once per prediction horizon \([t_k,t_{k_p})\): they are recomputed at every calculation step \([t_k,t_{k+1})\). The human command is therefore reflected smoothly in the predictive display, excluding only commands that change faster than the calculation time (4 ms in our experiment). For this, similarly to the above, we perform the forward propagation sequentially from the leader WMR to the n-th WMR and from \([t_k,t_{k+1})\) to \([t_{k_p-1},t_{k_p})\) using the unscented transformation. More precisely, first, for the leader WMR, we have, using (15) with \(u_{C_{1,2}} \approx 0\) as assumed above:

$$\begin{aligned}&\hat{q}^{ut}_{1,k+1} := {\text {UT}}_{g,v_{1,k},w_{1,k}}(\hat{q}^{ut}_{1,k}) \\&\hat{q}^{ut}_{1,k+2} := {\text {UT}}_{g,v_{1,k},w_{1,k}}(\hat{q}^{ut}_{1,k+1}) \\&\vdots \\&\hat{q}^{ut}_{1,k_p} := {\text {UT}}_{g,v_{1,k},w_{1,k}}(\hat{q}^{ut}_{1,k_p-1}) \end{aligned}$$

On the other hand, for the second WMR, again with \(u_{C_{1,2}} \approx 0\) and (16) and (17), we have:

$$\begin{aligned}&\hat{q}^{ut}_{2,k+1} := {\text {UT}}_{g,v_{1,k},\bar{\phi }^{ut,k}_{1,2}}(\hat{q}^{ut}_{2,k}) \\&\hat{q}^{ut}_{2,k+2} := {\text {UT}}_{g,v_{1,k},\bar{\phi }^{ut,k+1}_{1,2}}(\hat{q}^{ut}_{2,k+1}) \\&\vdots \\&\hat{q}^{ut}_{2,k_p} := {\text {UT}}_{g,v_{1,k},\bar{\phi }^{ut,k_p-1}_{1,2}}(\hat{q}^{ut}_{2,k_p-1}) \end{aligned}$$

and, similarly, for the j-th WMR,

$$\begin{aligned}&\hat{q}^{ut}_{j,k+1} := {\text {UT}}_{g,v_{1,k},\bar{\phi }^{ut,k}_{1,2}\ldots ,\bar{\phi }^{ut,k}_{j-2,j-1},\bar{\phi }^{ut,k}_{j-1,j}}(\hat{q}^{ut}_{j,k}) \\&\hat{q}^{ut}_{j,k+2} := {\text {UT}}_{g,v_{1,k},\bar{\phi }^{ut,k+1}_{1,2}\ldots ,\bar{\phi }^{ut,k+1}_{j-2,j-1},\bar{\phi }^{ut,k+1}_{j-1,j}}(\hat{q}^{ut}_{j,k+1}) \\&\vdots \\&\hat{q}^{ut}_{j,k_p} := {\text {UT}}_{g,v_{1,k},\bar{\phi }^{ut,k_p-1}_{1,2}\ldots ,\bar{\phi }^{ut,k_p-1}_{j-2,j-1},\bar{\phi }^{ut,k_p-1}_{j-1,j}}(\hat{q}^{ut}_{j,k_p-1}) \end{aligned}$$

where \(\bar{\phi }^{ut,k^\prime }_{p-1,p}:=\bar{\phi }^{ut}_{p-1,k^\prime }-\bar{\phi }^{ut}_{p,k^\prime }\) with \(\bar{\phi }^{ut}_{p,k^\prime }\) known \(\forall k^\prime \in \{k, k+1,\ldots ,k_p \}\) and \(\forall p \in \{ 1, 2, \ldots , j-1 \}\) from performing this prediction propagation sequentially from the leader WMR to the \((j-1)\)-th WMR, each over \([t_k, t_{k_p})\). Similarly to the above, the nonholonomic passive decomposition [i.e., (15)] greatly simplifies this prediction propagation computation as well, which would be more complex with the result of Desai et al. (2001), as it requires the full kinematics with both the collective motion and the inter-WMR coordination behaviors.
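The prediction propagation for the leader, with the current command held constant over the horizon, can be sketched as follows (mean only; in the full scheme each step is an unscented transform so the covariance also grows). The function name and defaults are illustrative.

```python
import numpy as np

def predict_leader(q_k, v, w, dt=0.004, horizon_s=15.0):
    """Forward-propagate the leader pose over the prediction horizon
    with the current command (v, w) held constant."""
    q = np.asarray(q_k, dtype=float)
    steps = int(horizon_s / dt)
    traj = [q.copy()]
    for _ in range(steps):
        x, y, phi = q
        q = np.array([x + dt * v * np.cos(phi),
                      y + dt * v * np.sin(phi),
                      phi + dt * w])
        traj.append(q.copy())
    return np.array(traj)
```

With a straight-line command at the typical operation speed of 0.15 m/s, a 15 s horizon carries the predicted leader pose roughly 2 m ahead of its current pose, consistent with the horizon choice discussed in Sect. 5.3.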

We then present these estimated current poses \(\hat{q}^{ut}_{j,k} \approx \mathcal {N}(\bar{q}^{ut}_{j,k},\) \(P^{ut}_{j,k})\) and the predicted future poses \(\hat{q}^{ut}_{j,k_p} \approx \mathcal {N}(\bar{q}^{ut}_{j,k_p},P^{ut}_{j,k_p})\) to the user by overlaying them on the LiDAR-SLAM map. We also render their position and orientation estimation uncertainties by enlarging the size and the heading angle cone of each WMR, with the center position and angle corresponding to the estimation means and the sweeping size and angle to their covariances. See Fig. 2. From this predictive display, human users can anticipate the procession of the distributed n WMRs through the obstacle-cluttered environment, examine the likelihood and location of collisions, and adjust their tele-driving command if, e.g., a collision is likely to occur. This predictive display turns out to be crucial here: without it, it is fairly difficult for human users to predict and properly control the motion of the platoon with its complex internal serpentine articulation through obstacles while avoiding collisions, even if the WMRs number only four—see Sect. 5.

Fig. 6
figure 6

Predictive display with estimated current and predicted future poses of WMRs. Uncertainty of pose estimate is denoted by enlarged size of the WMR and the heading angle cone, with their nominal size and center angle determined by the means: (1) uncertainty is the largest for the last WMR (left); and (2) uncertainty still substantial even with fairly precise sensors if the WMRs are many (right)

Here, note that the uncertainty in both the current and future pose estimation is largest for the last WMR. This is because the pose estimation uncertainty of the leader WMR is propagated through the platoon, with the uncertainty of the relative pose measurements of each WMR accumulated downstream to the very last WMR. This uncertainty of the last (and each) WMR further increases through the estimation propagation (over the transmission delay) and the prediction propagation (over the prediction time horizon). If this pose uncertainty (e.g., \(3\sigma \)-value) of the last WMR becomes larger than the minimum inter-obstacle distance on the course of operation, the user can no longer rely on the predictive display to tele-drive the platoon. Note that this limitation of our predictive display would be more severe with (1) less precise onboard sensors; (2) longer transmission delay; (3) a larger number of WMRs; (4) larger WMRs; and (5) narrower gaps among the obstacles. Theoretical analysis to elucidate the analytical relations among these factors of our predictive display is a topic for future research. See also Fig. 6, where some of these relations are shown.

Our proposed predictive display can be applied to general distributed robot systems as well, as it fully incorporates the uncertainty inherently arising from any systems with distributed sensing, computation and communication. Our predictive display framework may also be useful for the problem of driving a platoon of autonomous vehicles with a human driver sitting on the first vehicle while monitoring the state of the vehicles and environment. The idea of predictive display may also be expanded for the general problem of “teleoperation with uncertainty”, by indicating the best possible control direction (e.g., more precise pushing direction for peg-in-hole task) given the sensor uncertainty, parameter estimation error, actuation inaccuracy, etc. All of these constitute interesting topics for ensuing research of our framework presented here.

5 Experiments

5.1 Test setup

We implement one “smart” leader WMR and three “simple” follower WMRs as shown in Fig. 1. All of them are based on unicycle-type nonholonomic platforms, each with one passive front caster to prevent tipping over and two rear differential-drive wheels. The wheels are driven by Maxon\(^\circledR \) BLDC motors under velocity control mode with the command received from an Arduino Uno MCU (micro-controller unit). The leader WMR has a LiDAR sensor (Hokuyo UTM-30LX-EW, scan rate 40 Hz) and an IMU sensor (PhidgetSpatial Precision 3/3/3 IMU, 250 Hz, only the (x, y)-accelerometer and yaw gyroscope used). The LiDAR-SLAM and all the other computations (i.e., EKF sensor fusion, control computation, predictive display propagation) run on an Intel-NUCi7 on this leader WMR, at 40 and 250 Hz, respectively. The three follower WMRs each have a front-view monocular web-cam (Logitech C922, 640 \(\times \) 480, 30 Hz) and the same IMU sensor as the leader WMR. Known patterns similar to Olson (2011) are also attached at the rear of each WMR for the relative pose sensing via the monocular camera. The follower WMRs run this relative pose measurement at 30 Hz and all the other computations at 250 Hz on their own Intel-NUCi7.

Robot Operating System (ROS) is deployed as the OS on all the WMRs, and OpenCV is used for the pattern recognition. We also use RVIZ (the 3D visualization tool of ROS) to render the LiDAR-SLAM map at 1 Hz. On this map, we also render the current and future pose estimates of the WMRs at 50 Hz. This predictive display and the LiDAR-SLAM map are rendered on the Intel-NUCi7 of the leader WMR, which is then remotely accessed by the master PC. Peer-to-peer communication among the WMRs and with the MCS is implemented using the UDP protocol at 250 Hz. We also attach a CONNEX ProSight HD to the leader WMR as the FPV camera. We also use the Omega3 haptic device as the commanding device with the following position-velocity mapping:

$$\begin{aligned} \begin{pmatrix} v_1 \\ w_1 \end{pmatrix} = \begin{pmatrix} \eta _1 \cdot y_h \\ \eta _2 \cdot x_h \end{pmatrix} \end{aligned}$$

where \(\eta _1, \eta _2 >0\) are the scaling factors and \(x_h, y_h\) are the device positions. The desired inter-WMR distances are also set to be \((L_1,L_2,L_3)=(1,0.8,0.8)\) m—see Fig. 2. Haptic feedback of the device is turned off during the experiment—it is used purely as a commanding device.
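The position-velocity mapping above is a one-line computation; as a sketch, with illustrative (not the actual experimental) scaling factors:

```python
def command_from_device(x_h, y_h, eta1=0.5, eta2=1.0):
    """Map the haptic device position (x_h, y_h) to the leader command
    (v_1, w_1); eta1, eta2 are illustrative scaling factors."""
    return eta1 * y_h, eta2 * x_h
```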

Fig. 7
figure 7

Circular and s-shape trajectories of the three WMRs during the performance experiments: dotted lines are the ground truth position measured by MOCAP and solid lines the estimated position computed by the distributed estimation of Sect. 3

5.2 Performance experiment

We first evaluate the performance of our distributed estimation and control with MOCAP (VICON, 240 Hz). Due to the limited size of the MOCAP environment, we conduct this experiment with only one leader WMR and two follower WMRs. A total of ten experiments are performed, five with the circular trajectory and five with the s-shape curve. Note that the average velocity along the desired trajectories is about 0.1 m/s, which satisfies the slow-operation-speed assumption of Sect. 2.1; indeed, we did not observe any slip or drift during the experiments. The results of one trial of those experiments are shown in Figs. 7 and 8.

First, as shown in Fig. 8, the RMSE (root mean square error) between the MOCAP data and our estimation data for the leader WMR and the two follower WMRs (over all ten experiments) is found to be 2.11, 2.19 and 2.56 cm for the circular trajectory and 1.86, 2.18, and 3.32 cm for the s-curve trajectory. This level of estimation performance is precise enough for our experiments given the size of the platoon formation (i.e., \((L_1,L_2,L_3)=(1,0.8,0.8)\) m). The error growing toward the rear WMRs is due to the downstream accumulation of uncertainty, as stated in Sect. 4.
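The RMSE values above follow the standard definition over the planar position; a sketch (array shapes and the function name are ours):

```python
import numpy as np

def position_rmse(est, gt):
    """Position RMSE between estimated and MOCAP ground-truth
    trajectories, both (N, 2) arrays of (x, y) samples."""
    err = np.linalg.norm(est - gt, axis=1)   # per-sample Euclidean error
    return np.sqrt(np.mean(err ** 2))
```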

To evaluate the combined performance of our distributed estimation and control, we also measure the inter-WMR coordination error \(||h_{j,j+1}(t)||\) as shown in Fig. 8. All ten experiments exhibit a trend similar to that of Fig. 8. We observe that the maximum of this coordination error is less than 4 cm, which is again deemed precise enough given the size of the platoon and the environment. This also shows that, thanks to the (distributed) quotient control \(u_{C_{j,j+1}}\) (8), the inter-WMR coordination can be maintained fairly well regardless of the collective platoon motion (e.g., tele-driving).

Fig. 8
figure 8

Inter-WMR coordination error \(||h_{j,j+1}||\) among the three WMRs during the performance experiments with the circular and s-shape trajectories

5.3 Teleoperation experiment with predictive display

We conduct a teleoperation experiment with the predictive display and the FPV in a real office environment with no MOCAP. The four WMRs shown in Fig. 1 are used. The environment is shown in Fig. 9, consisting of the hall, the office room, and the corridor connecting them. The task goal is to check the four markers via the FPV camera: one on the corridor wall and three in the office. For this, a human user tele-drives the platoon starting from the hall, passing the corridor with three obstacles, going around the office room while avoiding a table in the middle of it, and returning to the start point. One obstacle in the corridor is too short for the LiDAR to detect—the user must rely on the FPV camera for this. Collision is prohibited throughout this teleoperation experiment. Before conducting the user study, four authors and two novices each performed the teleoperation experiment twice: once with and once without the predictive display.

Fig. 9
figure 9

The environment map for the teleoperation experiment and human subject study test: yellow boxes indicate four markers that the user should check; the green boxes are obstacles, with the dotted green box a short obstacle that cannot be detected by the LiDAR but should be noticed by the FPV camera. The blue dotted path presents the trajectory of the leader WMR of a successful trial (Color figure online)

Fig. 10
figure 10

Teleoperation master interface consisting of predictive display with LiDAR-SLAM map and FPV camera video-feed, and haptic device (used only as positioning device without haptic feedback)

Fig. 11
figure 11

Snapshots of the office-corridor teleoperation experiment: (Top row) External camera third person view of the environment with scattered box obstacles and the table in the middle of the office room; (Bottom row) Predictive display with LiDAR-SLAM map, showing the current (solid) and future (opaque) pose estimate of the WMRs and possible collision (white circular shade) along with the FPV camera video-feed

The master interface consists of one monitor and one haptic device, as depicted in Fig. 10. The monitor displays the FPV camera view and the LiDAR-SLAM map. The haptic device is used only as a pointing device, with its haptic feedback turned off. The predictive display shows the current and future pose estimates of all the WMRs as explained in Sect. 4. The predictive display and LiDAR-SLAM map are also rotated to be consistent with the orientation of the FPV camera view to avoid user confusion. We scale up the size of the WMRs in the predictive display according to their uncertainty obtained as explained in Sect. 4. For this, we use the \(3\sigma \)-value of the covariance, which corresponds to the 99.7% probability. We then measure the distance from these “future sized-up” WMRs to the LiDAR-SLAM map, and notify the user of a possible (future) collision when any of these distances becomes less than a certain threshold. We do this by overlaying a white circular shade on top of the colliding WMRs. This collision notification is not provided when a future/sized-up WMR overlaps a region missing LiDAR scans, as such a region may still be traversable. To decide whether these “blank” regions are traversable, the user instead needs to rely on the FPV camera information. We choose the prediction horizon to be 15 s, which corresponds to about 2 m of travel from the “current” WMRs given the average operating speed of about 0.15 m/s. This prediction horizon turns out to be adequate for our teleoperation, although, depending on the complexity of the environment and the driving speed, this value should be adjusted.
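The uncertainty-aware collision notification above can be sketched as follows. This is only a minimal illustration: the nominal WMR radius, the distance threshold, and the point-list map representation are our assumptions, not the actual implementation.

```python
import math

def inflated_radius(robot_radius, cov2x2, n_sigma=3.0):
    """Size up the WMR footprint by the 3-sigma bound of its 2x2
    position covariance (99.7% confidence)."""
    # largest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]]
    a, b, c = cov2x2[0][0], cov2x2[0][1], cov2x2[1][1]
    lam_max = 0.5 * (a + c) + math.hypot(0.5 * (a - c), b)
    return robot_radius + n_sigma * math.sqrt(lam_max)

def predicted_collision(pose_xy, cov2x2, map_points,
                        robot_radius=0.15, threshold=0.05):
    """Return True if the sized-up future WMR comes within `threshold`
    of a LiDAR-mapped obstacle point.  Regions with no LiDAR return are
    simply absent from `map_points`, so no notification is raised there
    and the user must judge their traversability from the FPV camera."""
    r = inflated_radius(robot_radius, cov2x2)
    return any(
        math.hypot(pose_xy[0] - ox, pose_xy[1] - oy) - r < threshold
        for ox, oy in map_points)  # True -> overlay white circular shade
```

With a 0.15 m nominal radius and a position covariance of \(0.01\,I\), for instance, the sized-up radius becomes \(0.15 + 3\sqrt{0.01} = 0.45\) m.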

The results of this teleoperation experiment are shown in Fig. 11. Throughout this experiment, we observe that: (1) both the FPV camera and the predictive display with the LiDAR-SLAM information are necessary to successfully complete the teleoperation task, as they provide complementary information (e.g., detecting the short obstacle in the corridor); (2) the predictive display is crucial to complete this task, particularly for such challenging operations as navigating through the (narrow) door of the office room from the corridor, which requires a large change of the platoon curvature, making the platoon’s serpentine motions difficult for the human user to understand and control; and (3) the platoon of the four nonholonomic WMRs keeps behaving as a 4-trailer system throughout the teleoperation experiment, even when driven by arbitrary human commands. For a more rigorous justification of the importance of the predictive display, we then perform the human subject study presented in Sect. 5.4.

5.4 Human subject study

To rigorously verify the efficacy of the predictive display, here we perform a human subject study, using the same setting as the teleoperation experiment of Sect. 5.3. We hypothesize that tele-driving the WMRs without the predictive display remains difficult even after the users have learned the operation well. To test this hypothesis, two groups of subjects are formed: (1) the experimental group, where the subjects perform the same teleoperation task twice, first with the predictive display and then without it; and (2) the control group, where the subjects perform the same teleoperation task twice, both times with the predictive display. This design controls for the learning effect: if the performance improvement in the second trial of the experimental group is less than that of the control group, the efficacy of the predictive display can be concluded.

5.4.1 Participants

Thirteen right-handed subjects (one female), aged 21 to 27 and with no known neurological disorders, participate in this study. The experimental group consists of eight subjects and the control group of five. None of them contributed to the design or implementation of the experiment. The experiments are conducted in accordance with the requirements of the Declaration of Helsinki.

5.4.2 Task

Motivated by the office or factory material-transport scenario, we consider the same scenario as in the teleoperation experiment of Sect. 5.3. The subjects in the experimental group perform this task first with the predictive display (Task E1: w/ PD) and then without it (Task E2: w/o PD), whereas those in the control group perform the same task with the predictive display twice in succession (i.e., Task C1: w/ PD, Task C2: w/ PD).

Fig. 12
figure 12

Number of successful/failed trials and average completion time across the subjects of each Task for the experimental group (eight subjects) and control group (five subjects)

5.4.3 Procedure

First, the supervisor explains to each subject the purpose of the experiment and the system configuration. The supervisor also explains the behavior of the n-trailer system, i.e., the platoon behavior under the condition \(h_{j,j+1} =0\) in (3). Then, each subject is allowed to play with the test setup as much as they want before performing the actual trials. During this practice, the predictive display is turned on, although the WMRs are confined within the hall of Fig. 9. The supervisor also briefly walks each subject around the environment, since, without this, it is too difficult for the subjects to perform the task while simultaneously constructing the environment information. Although this provides partial information about the environment, we believe the human subject study can still clearly manifest the efficacy of the predictive display for the teleoperation, as stated below.

All the subjects start with the trial with the predictive display (i.e., Task E1: w/ PD or Task C1: w/ PD). If any WMR collides with the obstacles or environment during a trial, the user stops the task, the operation time is recorded, and the subject starts a new trial of the same task all over again from the start point in Fig. 9. If a subject fails three successive trials, that task is aborted and the subject moves on to the next task or gives the seat to the next subject. When Task E1 or C1 is completed (or aborted after three failed trials), the subjects of the experimental group then move on to the same task without the predictive display (i.e., Task E2: w/o PD), whereas those in the control group repeat the same task with the predictive display (i.e., Task C2: w/ PD). For each task, the number of failed trials is recorded along with that of the successful trial, which is always one. During these experiments, each subject wears earplugs to further reduce the (already fairly small) sound from the experiment. For the experimental group, each subject is asked to fill in the NASA TLX (Task Load indeX) type questionnaire after the two tasks are finished.

Fig. 13
figure 13

NASA TLX questionnaire result and the user disfavor (i.e., 1 − preference) rating with and without the predictive display for the human subject study experiments. The closer the graph is to the center, the lower the workload, the better the performance, and the stronger the preference

5.4.4 Results and discussion

The results of this human subject study are summarized in Figs. 12 and 13. First, note from Fig. 12 the substantial reduction of the completion time from Task C1 to Task C2. Since there is no change in the experimental conditions between the two tasks, this performance improvement can only be due to the learning effect. We also conduct a one-way analysis of variance (ANOVA) for the statistical analysis of the results. An ANOVA on the completion time between Task C1 and Task C2 shows that the difference between the two tasks is statistically significant (\(F(1,8) = 6.67,~ p = 0.0324\)), which confirms the learning effect. The learning effect on the number of trials is also positive, but does not reach statistical significance (\(F(1,8) = 2,~ p = 0.195\)).
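As a sanity check, the reported p-values can be recovered from the F-statistics alone: with one numerator degree of freedom, an F(1, ν) test is equivalent to a two-sided t-test with \(F = t^2\), so the tail of the t density can be integrated numerically. This sketch is only illustrative; the integration grid and upper bound are our choices.

```python
import math

def t_pdf(x, nu):
    """Student-t probability density with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1.0 + x * x / nu) ** (-(nu + 1) / 2)

def f_test_pvalue(F, nu, steps=100000, upper=60.0):
    """p-value of an F(1, nu) statistic via F = t^2, integrating the
    two-sided tail of the t(nu) density with the trapezoidal rule."""
    t = math.sqrt(F)
    h = (upper - t) / steps
    tail = 0.5 * (t_pdf(t, nu) + t_pdf(upper, nu))
    tail += sum(t_pdf(t + i * h, nu) for i in range(1, steps))
    return 2.0 * tail * h

# Task C1 vs C2 completion time: F(1, 8) = 6.67 recovers p close to 0.0324
```

Running `f_test_pvalue(6.67, 8)` reproduces the reported p ≈ 0.032, and `f_test_pvalue(2, 8)` similarly gives p ≈ 0.195.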

On the other hand, the performance of the experimental group does not improve in Task E2; see also Fig. 12. An ANOVA on both the completion time and the number of trials between Task E1 and Task E2 shows that the two tasks do not differ significantly in performance (\(F(1,14) = 0.01,~ p = 0.9184\) for the completion time and \(F(1,14)=0.12,~ p=0.7342\) for the number of trials), i.e., there is no performance improvement. As the only difference between Task E1 and Task E2 is the presence of the predictive display, we can affirm that the absence of the predictive display cancels (nullifies) the learning effect. Thus, we can conclude that, despite the learning effect, the teleoperation of the distributed WMRs is not easy without the predictive display, which implies the efficacy of the predictive display.

One may suspect a bias among the participants. However, an ANOVA on the completion time and the number of trials shows no statistically significant difference between Task E1 and Task C1 (\(F(1,11)=0.33,~ p=0.5793\) for the completion time and \(F(1,11)=0.35,~ p=0.5683\) for the number of trials). Therefore, individual differences between the users do not bias the results significantly.

In addition, the NASA-TLX result of Fig. 13 shows that, on top of the objective performance improvement of Fig. 12, the predictive display also enhances the subjective “feel” of the tele-driving system, while substantially reducing the cognitive load of the human users. Keep in mind that, in Fig. 13, the center indicates the “positive” aspect: the lower the workload and effort, the better the performance, and the stronger the preference. The predictive display is also favored by nearly all the subjects (i.e., all except one).

6 Conclusion

Whenever one attempts to teleoperate a platoon of distributed robots, they encounter the following issues: (1) how to design the control, which is distributed, yet still able to maintain a certain desired formation regardless of arbitrary human commands, particularly when the robot mechanics is complex and constrained; and (2) how to enable the user to teleoperate all the robots intuitively and efficiently, even when the environment is cluttered with obstacles and the pose information of each robot is imperfect due to distribution-induced uncertainty. This paper proposes a novel framework to resolve these issues of distributed robot teleoperation. The backbone of our framework is the nonholonomic passive decomposition, which allows us not only to split the platoon kinematics into the inter-WMR coordination and collective tele-driving aspects, but also to design distributed controls for these two aspects separately. To enable the user to tele-drive the platoon through obstacles even with its complex internal serpentine articulation, we also propose a novel predictive display. By providing the user with the estimated current and predicted future poses of the platoon, this predictive display allows the user to adjust their command whenever a collision is about to occur, thereby substantially enhancing teleoperation performance and ease. An experiment and a human subject study are also performed to manifest the performance and efficacy of our proposed framework.

Some possible future research topics include: (1) development of a teleoperation strategy for platoon reconfiguration by using the leader WMR (e.g., backward escape from a dead-end); (2) extension of the proposed framework to other types of distributed mobile robots (e.g., quadrotors; Ha et al. 2014; Lee et al. 2013); (3) extension to dynamic environments and robustification against various system failures; (4) consideration of the (possibly different) maximum speeds of the WMRs in the distributed controls; and (5) extension to fast operation speeds at which dynamics effects and tire slip are no longer negligible.