Abstract
In this paper, we propose a scheme for autonomous exploration in unknown environments using a mobile robot. To reduce the storage consumption and speed up the search of frontiers, we propose a wavefeaturesbased rapidly exploring random tree method, which can inhibit or promote the growth of sampling trees regionally. Then, we prune the frontiers with mean shift algorithm and use the pruned frontiers for decisionmaking. To avoid the repeated exploration, we develop a decision making method with consistency assessment, in which the status of the robot and frontiers are explicitly encoded and modeled as a fixed start open traveling salesman problem (FSOTSP). Furthermore, a redecision mechanism is build to reduce the extra computing cost. Simulations and realworld experiments show the significant improvement of the proposed scheme.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
Autonomous exploration technology of mobile robots has been widely applied in various fields. Amongst the main application is active mapping [1]. Its main task is to determine the next expected movement position of the robot, and finally achieve the shortest collisionfree path in the global scope to obtain the most unknown environmental information. The robotics autonomous threedimensional reconstruction [2] has similar goals. In addition, autonomous exploration has also been applied to disaster relief [3], terrain exploration [4] and other fields. Because the environment of these tasks is unknown and changeable, the implementation process requires closely integrated excellent navigation and decision making [5,6,7].
This paper mainly focuses on the autonomous exploration of active mapping tasks which aims to improve the exploration efficiency under ensuring that the robot completes environment mapping. The important measures are the length of the exploration path, the exploration time or even computing consumption. In the process of exploration, the determination of navigation targets is the most critical step, which is a complex task, including searching for frontiers and decisionmaking of exploration order. Afterwards, the robot performs path planning and moves. For searching, the existing methods can be divided into two categories, which are samplingbased and boundarybased. Both algorithms update entire map constantly in real time. The samplingbased method is widely used because of effectively avoiding falling into search dead areas. However, the existing methods usually global sampling and lack regional tendency due to excessive pursuit of probability completeness. This leads to a waste of computing resources and storage, which is unacceptable on lowcost hardware. For decision making, in the existing approaches, the robot preferentially navigates to the point with the smallest value of the utility function which usually considering path length, information gain, positioning accuracy and so on. However, the realtime updating of these utility function values causes the robot to frequently change navigation goals, leading to repeated exploration. In this paper, we propose a consistent decision making method to avoid the decrease of exploration efficiency caused by the indecision. In addition, we emphasize the importance of the pruning of searched frontiers, and present an efficient technical framework for autonomous exploration in unknown environments. Three main contributions are summarized as following.

(1)
Inspired by the natural phenomenon of sea wave, we propose a regional biased sampling search method. It considers the peculiarities of exploration and actively exploits map information to suppress unnecessary expansion of nodes in the explored area.

(2)
We propose a decision making updating method with consistency, including designing a utility function to measure consistency, extending it to the solution of global optimal FSOTSP, and develop a redecision mechanism.

(3)
All modules are effectively integrated into the autonomous exploration system for mobile robots. Experiments show the effectiveness and robustness of the system. We also verify the necessity of pruning frontiers through experiments.
The structure of this paper is as follows. In “Related work”, the review of related research progress in autonomous exploration is given. To facilitate the research, “Problem formulation” carries on the mathematical modeling of the autonomous exploration problem. Then “Proposed method” proposes the detailed method of autonomous exploration. In “Simulation” and “Realworld deployment”, the simulation results and practical experiment are discussed. Finally, “Conclusion” states the conclusions and discusses the future development research direction.
Related work
At present, there have been some autonomous exploration schemes which use mobile robots to map unknown environments. Some of the works focus on exploring the space quickly [8, 9], as this paper does. Meanwhile, other methods place more emphasis on accurate reconstruction [10, 11]. The functional modules of autonomous exploration are often subdivided into simultaneous localization and mapping (SLAM), frontier points searching (Search), exploration decision making (Decision) and motion planning (Planning).
The goal of Search is to generate frontier points. the algorithms for generating frontier points mainly include boundarybased and samplingbased methods. The boundarybased search algorithm generates points to be explored by detecting the boundary, represented by Frontier Detector (WFD) [12], Fast Frontier Detector (FFD) [12], and edge detection algorithms for graphics and images. WFD is based on BreadthFirst Search (BFS) and only scans known instead of unknown regions. The work in FFD uses lidar information to find the edge of the map to obtain possible forward detection points without processing the map information. Senarathne et al. [13] give proof and experimental support to these two algorithms, and further optimized them. Senarathne and Wang [14] apply the digital image processing technology of edge detection for boundary search. In each step of map update, the modified part of the map is processed, and the boundary information is managed and extracted in an incremental manner. Nevertheless, the computational complexity of image processing is still large. The samplingbased exploration algorithm is represented by the rapidly exploring random tree (RRT) and its derivative algorithms. Umari and Mukhopadhyay [15] apply RRT to search for frontier points. The main idea was to randomly sample on the map and use independently grown trees to speed up the search for frontier points. If the edge of the tree falls on the unknown area and the free area at the same time, the unknown points adjacent to the free area are the frontier points. Qiao et al. [16] change the way of storing trees to improve the exploration efficiency of narrow environments. All the above methods search globally and iteratively, without effective mechanisms to reduce storage and computing consumption. The method in [17] utilizes the semantic information of the environment as the heuristics to guide the robot exploration. However, it is premised on collecting enough data sets for training, and its generalization is poor.
After the robot obtains the frontier points, it needs Decision module to decide which point to go first. Robot usually makes decisions based on the value of the utility function. Makarenko et al. [18] pioneer the use of information entropy to represent maps, and proposed to use navigation cost, information gain, and positioning accuracy to construct utility functions. The following researchers continued to conduct indepth research based on this. Isler et al. [2] propose a new probabilistic volume map for 3D reconstruction to quantify the visibility of voxels. The work in [5] combines the fast online replanning function of planning based on trajectory optimization with the idea of maximizing the exploration gain in the sensor field of view to improve the decisionmaking speed of realtime exploration. Meng et al. [19] samples the viewpoints and used the heuristic TSP solver to calculate the visiting order of the frontier points, and visited the frontier points in turn. However, none of the abovementioned exploration decision algorithms consider the problem of repeated exploration in realtime decisionmaking. The method [20] proposes a temporal memorybased RRT (TMRRT) to make robot decisions more firm. But decision updates become slow because robots are assigned a fixed amount of time.
Some works focuses on active SLAM during exploration, which aims to improve the location accuracy and mapping quality [21, 22]. Other works improve the Planning method [23, 24] to make the planned trajectory satisfy the dynamic feasibility of robots or meet some trajectory optimization requirements. The method [25] considers the predicted actuation of a vehicle and uses nonlinear model predictive control (NMPC) to follow a generated path. As an extension of single robotic exploration, some works focus on multirobot cooperative exploration system [26,27,28], which cast their eyes on solving task allocation or communication problem. There are also some robots use different sensors [24, 29] from the using in this paper. All the above methods have contributed to the development of the field of autonomous robotic exploration, but their focus is different from that of this article. In summary, an excellent exploration often requires shorter time, low energy consumption (short paths), and low computational and memory consumption. SLAM and Planning have been widely studied, while there are few researches on the Search and Decision.
Problem formulation
The environment is modeled using occupancy grid map \({\mathcal {M}}\) which is a metric map. The basic idea of occupancy grid Map is to represent the environment map as uniformly distributed cells. According to the state of the cells, the Occupancy Grid Map can be divided into known cells \(S_{\textrm{known}}\) and an unknown cells \(S_{\textrm{unknown}}\). Known cells are further divided into free cells \(S_{\textrm{free}}\) and occupied cells \(S_{\textrm{occupied}}\) which indicate obstacle. We represent the grid set of \({\mathcal {M}}\) as \(S={s(1),s(2),s(3)\ldots s(N)}\), and every grid stores the value of occupied probability. The occupancy probability of each grid is expressed as information entropy:
where \(C=1,2,3\ldots ,\) and P(s(c)) is the occupancy probability of grid s(c) in the occupancy grid map. Then the information entropy of the area to be explored \(W={w(1), w(2), w(3)\ldots w(N)}\) can be expressed as:
The higher the average entropy of the area w(i), the higher the uncertainty of the area’s information which reflects the potential increase of map information. It represents the attractiveness of the mobile robot.
Usually, The robot decides which area to explore first according to the utility function value of the frontier points. The utility function needs to consider the navigation cost \(U_N\), the information gain \(U_I\) [30] and decisionconsistent cost \(U_C(i)\). The navigation cost is the expected distance for the robot to reach the frontier points. To simplify the calculation, the navigation cost can be modeled as a Euclidean distance between the current position of the robot and the position of frontiers. The information gain is the average entropy of the area W to be explored. Consistency is evaluated using the methods presented in “Decision making with consistency”. Each time the robot selects the frontier with the smallest utility value as the next target:
In the exploration process, the goal is to improve the exploration efficiency of the robot as much as possible when the map entropy is reduced to the expectations. Moreover, we hope to reduce the energy consumption of the robot, which means that the total path of the robot’s walking is as small as possible:
where T is the total exploration time. \(C_{{\mathcal {M}}}\) is the map entropy budget for completing the exploration and mapping.
Proposed method
We propose a technical framework for autonomous exploration and mapping in 2D environment. As shown in Fig. 1, the Search module randomly samples on the map constructed by the SLAM module. The searched frontier points are inputted to the Prune module which is a preprocessing before decision. The frontier points outputted by the Prune module will be used for the next Decision module. After making a decision, the robot will select the best forward frontier with the smallest value of the utility function as the target and send it to the Planning module.
Wavefeaturesbased RRT search
The Search module aims to generate frontier points. In this process, we optimize RRT exploration algorithm [15]. RRT randomly samples in the environment and connects the sampling points with the surrounding nodes. The nodes and edges will form a treelike structure. When the tree branch grows across the explored and unknown areas and does not pass the obstacle area, the node of the branch is marked as a frontier.
Since the growth of the tree is random, and all branches in the sampling area grow with a uniform probability. As time goes by, more and more branches will grow in the explored area (free area), which will cause unnecessary memory consumption. To reduce unnecessary memory consumption, we are inspired by wave. For ease of description, the following concepts are defined and visualized in Fig. 2.
Definition 1
(Exploration area) The area close to obstacles or borders, as shown in the green part. The tree that we expand when sampling should be mainly concentrated in this area.
Definition 2
(Safety area) The explored area far away from obstacles and frontiers, as shown in the yellow part. We propose a suppression method in the safe area to inhibit the growth of RRT.
Definition 3
(Potential area) The area between the exploration area and the safety area, as shown in the blue part. The trees in the potential area grow slower than the trees in the exploration area, but the inhibitory effect is weaker than that in the safe area.
It can be observed that the safe area is surrounded by the exploration area. The nodes in the safe area are parent of the nodes in the exploration area. A mathematical model is expected to describe the state of the nodes in the safe zone. The obstacles and frontiers are judged by whether the tree branches cross the known and unknown areas as they grow. The work in this paper is inspired by the interference phenomenon of waves: the nodes in the exploration area will produce an invisible wavelike energy, which is transmitted to the nodes in the safe area and producing an inhibitory effect. This is very similar to the phenomenon of wave interference: each wave propagates independently, and interference occurs only when interference conditions are met, and then the energy increases after interference. The transmission process of the water wave is shown in Fig. 3a.
For ease of description, we represent the RRT search process as a tree structure, as shown in Fig. 3b and define the following concepts for wavefeaturesbased rapidly exploring random tree (WFRRT) algorithm:
Definition 4
(Inhibition potential) The potential energy that inhibits the expansion of the node. The unified unit of measurement is E.
Definition 5
(Inhibition energy) The energy that inhibits the expansion of the node, which is generated when the inhibitory potential is activated.
Definition 6
(Wave source node) When sampling, there are some nodes located in the exploration area \(O=\{o_1, o_2, o_3 \ldots \}\) and the child nodes to be expanded pass through obstacles. These nodes are called wave source nodes. The wave source node has inhibition potential E(O), which means that the growth of its root node may be inhibited.
Definition 7
(Interference node) The common parent node \(o_i (g)\) for the suppression potential of multiple wave source nodes for successful interference(trigger activation mechanism), where g is the generation of the parent node. Interference nodes will produce inhibition energy, which suppresses the expansion.
Algorithm 1 lists the details of WFRRT exploration search algorithm. Compared with the RRT exploration algorithm, in addition to the node set V and the branch set E, it also maintains the wave source node set \(V_{\textrm{source}}\) and the interference node set \(V_{\textrm{int}}\). The WFRRT exploration search algorithm proposed in this paper has the following properties:
(1) Independent propagation of the inhibition potential
We learn from the independent propagation principle of waves which is that waves excited by each wave source can propagate independently in the same medium. After they meet and separate, they are the same as before, and do not interfere with each other. The inhibition potential is regarded as a wave, nodes and branches as the propagation medium. The suppression potential has the characteristics of independent propagation: when a node and its child nodes touch an obstacle, the node will produce the inhibition potential \(E(o_i )=1E\) and spread towards the parent node as the wave source node \(o_i\) is marked as a wave source. In Fig. 3a, \(o_1\), \(o_2\), and \(o_3\) are all wave source nodes whose inhibition potential spreads independently. But \(o'\) is not a source node because it does not extend the child nodes that touch the obstacle.
(2) Superposition of the inhibition potential
Inspired by the superposition principle of wave, in the area where waves overlap, if the particle of the medium participates in the vibration caused by these waves at the same time, the displacement of the particle is equal to the vector sum of the displacements caused when these waves propagate alone. As shown in Fig. 3a, the wind generates new waves, and the waves will be superimposed on each other, eventually making the waves higher and higher. We regard the inhibition potential as the waves caused by wind energy. When the same wave source node \(o_i\) expands the child node, if the connection with the child node passes through an obstacle, it will produce new inhibitory potential that will be superimposed with the previously generated inhibitory potential (Line 9):
(3) Interference of the inhibition potential
We are inspired by the interference principle of wave which is that the superposition of two waves of the same frequency makes the vibration in some areas strengthened under certain conditions (Fig. 4). As shown in Fig. 3a, interference occurs between waves, making the waves higher. Similarly, the suppression potential also has interference phenomena. When the node triggers the activation mechanism (receives two wave source nodes separated by the same generation to suppress the potential), the node is marked as an interference node, and the inhibition energy will inhibit the expansion of the node and its child nodes is generated. As shown in Fig. 3b, \(o_3 (4)\) is the node where \(o_3\) and \(o_2\) interfere in the 4 generations backtracking, which is an interference node. It is worth noting that \(o_1 (2)\) is the point where \(o_1\) goes back to the second generation, which is the same point where \(o_2\) goes back to the first generation. Because of its different generation, this node does not cause interference. The user can adjust the generation g according to different situations. We recommend that the larger the exploration environment, the larger the g setting. The interference energy will affect the expansion probability of the interfering node and its child nodes (Line 10).
where \(P(o_i (g))\) is the expansion probability of the node, \(f_{\textrm{int}}\) is the interference function, and k is a constant. When more than two interference potentials are transmitted to the interference node through the same generation, the interference function of the interference node is given by:
Among them, \(E(O(g))={E(o_1 (g)), E(o_2 (g))\ldots },\) and n is the number of the same backtracking generation from interference nodes. The inhibition function represents the degree to which the growth of the node is inhibited.
(4) Linear diffusion of inhibition energy
Waves spread to the shore to form swells. As the swells receive resistance from the shore and wind resistance during its propagation to the shore, they will become lower and lower. We regard the propagation of inhibition energy as the propagation of swells. When interference nodes have inhibition energy, the inhibition energy will spread to the child nodes of the interference node and get smaller and smaller (Line 11). As shown in Fig. 3b, \(o_3 (4)\) (also \(o_4 (4)\)) the interference node diffuses the interference energy in the direction of the child node respectively (i.e., the direction of \(o_3\) and \(o_4\)). The inhibition energy is:
where \(f_{\textrm{int}}(0)\) is the inhibition function value of the interference node, \(E_{\textrm{int}} (m)\) is the inhibition energy of the child node m generation from the interfering node. \(\Delta E\) is the debilitating energy, which represents the inhibition energy lost by each node that diffuses, and is a fixed value.
Prune based on mean shift
After search module, we will search out many frontier points to form a set \(F={f_1,f_2,f_3\ldots f_n }\). With the time going, there are more and more frontier points, which will increase the computational cost of the forward optimal exploration point decision and cause unnecessary consumption of computing resources. To save decisionmaking costs, we propose a prune method based on mean shift (MSP) which is considered as the postprocessing process of RRT search. The goal of pruning is to retain only one or a small number of frontier points on the same frontier (the junction of the explored area and the unknown area). The process is shown in Fig. 4. We firstly select the best point that has most exploratory potential among the frontier points at similar distances through clustering, and then discard the remaining points.
In clustering, we use the Mean Shift algorithm [31] is used, which has a wide range of applications in target tracking, image segmentation, etc. We extract the frontier points from the map and express them in the form of points in a twodimensional image. For the selected points \(f_i\) of the frontier point set F in the 2D grid map, the general form of the Mean Shift vector is:
where \(S_h\) refers to the circular clustering area with radius h. \(f_j\) is the frontier point adjacent to the selected \(f_i\). K is the number of frontier points in the area. Because the frontier points are randomly sampled, the distance between the sample and the offset point is different, and the contribution of the offset to the mean offset vector is also different. Therefore, we add a kernel function to the general form of the mean shift vector:
Among them, \( G(\frac{f_i  f}{h_i})\) is the kernel function, and we use Epanechnikov, which is the Euclidean distance from the frontier point to the center of the region of interest (ROI). Through iteration, the mean shift will cause the \(S_h\) clustering area to move continuously to the denser area until it converges. We keep the frontier points at the center of the \(S_h\) area after convergence, and discard the other frontier points in the area. Finally we obtain the pruned frontier point set \(F_{\textrm{new}}\), and input this point set into the Decision module.
Decision making with consistency
The frontier points from the Prune module will be used for the Decision module. When making the decision, the frontier point with the smallest utility value will be selected as the next exploration point which is called the best forward point (BFP). The utility function usually includes navigation cost and information gain, but lacks overall plan. This kind of utility function will make the robot’s exploration time local optimal rather than global optimal, which leads to repeated exploration. Furthermore, we hope that the robot has a global view to make the overall exploration more efficient which is similar to obtain a shorter global path. In this paper, we propose a new utility function based on FSOTSP is proposed to achieve the expectations.
The task of autonomous exploration and mapping requires us to explore the unknown area from the current position, and there is no need to return to the starting point after the task ends. Therefore, unlike the closedloop TSP problem, we model the decisionmaking process of autonomous exploration and mapping is modeled as Fixed Start Open TSP (FSOTSP) problem [19]. The Lin–Kernighan–Helsgaun Heuristic (LKH) TSP solver [32] is used to solve the problem. The difference from the closedloop TSP solution process is that the result of FSOTSP only contains the openloop route of all exploration points. In this process, the current position of the robot is the starting point, and the frontier point is the node to be visited. We first calculate the cost matrix of the robot’s navigation cost N for different frontier points in 2D space, and input this matrix into the solver of FSOTSP. We take the total cost of exploration is taken as the objective function and the genetic algorithm (GA) [33] is used to solve it. The cost of navigation is inversely proportional to the utility function evolved from Eq. (3):
Among them, w is the weight coefficient. \(U_I\) is the information gain, which is equal to H(w(i)) in Eq. (2). \(U_N\) is the path cost. To simplify the calculation, we take the Euclidean distance between the location of the robot and the point to be explored. \(U_{\textrm{FSOTSP}}\) is a penalty item that deviates from the FSOTSP decision sequence. This item will make the robot more inclined to choose the frontier point that conforms to the last global optimal decision sequence as the FBP. To achieve this goal, we need to quantify the degree of departure from the FSOTSP decision sequence is quantified, so we propose a decision difference matrix A to measure the degree of difference before and after the decision update:
where \(F_{t_i}\) and \(F_{t_j}\) respectively represent the set of frontier points before and after a decision is made. To facilitate calculation, we express the 2D position information of each point in the set of frontier points in matrix [x, y] and [k, m]. Because the number of frontier points before and after the decision may change, we use 0 to fill in the matrix with the number of elements misplaced.
As the distance from the robot’s current position to the BFP continues to decrease, we hope the robot not to change its destination, because this may lead to longer repeated paths. Then an attenuation coefficient \(\text {exp}^{\omega *{\textrm{Dist}}}\) is introduced, where \(\text {Dist}\) is the Euclidean distance from the robot’s current position to BFP. This coefficient will decrease as the destination is approached, and the penalty for deviating from the original decision sequence will keep increasing, making the robot more determined to explore this direction to the end. Finally, we refine the utility function (11) is refined into:
where \(f_i\) is the BFP’s position, and p is robot’s current position. When heading to the BFP, the robot needs to make overall decisions based on new updated map information. However, realtime update of the decision will consume unnecessary computing resources. Therefore, a Redecision Mechanism is proposed. As shown in Fig. 5, the decision is updated every time the robot advances by a step length S which is compared to odometer integration over time.
Simulation
Experimental setup
We use Robot Operating System (ROS) as the running platform of algorithms, and build simulation environments using Gazebo for autonomous robotics exploration. The Turtlebot robot is used as the simulation experiment platform, and is equipped with a \(180^{\circ }\) viewing angle Hokuyo laser, which has a visual range of 50 m. The localization of the mobile robot is done using Gmapping [34], while the navigation stack uses TEB planner [35]. We construct two simulation environments, which are home and office scene. The total size of the home simulation environment is \(196~\text {m}^2\), including 5 rooms, corridors, and doors. The smallest room is only \(7~\text {m}^2\) while the largest room is \(70~\text {m}^2\). The rooms present different shapes, and each room has only one door. The office simulation environment is \(225~\text {m}^2\) in total, including one lobby and 5 offices. When the robot is in the lobby, it often faces multiple objectives, which greatly tests the robustness of decision making method. All the sampling area is set to \(20~\text {m}*20~\text {m}\). Note that the simulation is visualized using RVIZ.
Benchmark
In this subsection, we compare the performance of our proposed WFRRT against RRT exploration [15] and TMRRT [20] which is stateoftheart. The focus of TMRRT is to improve two modules, which are mapmerging module and temporal memorybased goal assigner module. The mapmerging module are used for multirobotic exploration, therefore it is not considered in this article. Temporal memorybased goal assigner module allocates an adaptive time period for the robot to explore the BFP based on the robot’s desired movement velocity. During this period, the robot does not change its target, which makes robot’s decisionmaking more firm.
In addition, to verify the effectiveness of the main contributing Search and Decision modules, we design an ablation experiment. A simplified version of WFRRT without consistent decision making called WFRRTLite is also used for autonomous exploration. In this way, we can see the effect gain when different modules are stacked together.
For home and office scene, 10 groups of experiments are respectively done using RRT [15], TMRRT [20], WFRRTLite and WFRRT. Due to the randomness of sampling search, the experiment may have extreme results. We remove the best and the worst experiment results and remain 8 groups of experimental results are finally averaged, shown in Table 1.
Comparison and analysis
To comprehensively measure the methods proposed in this article, we compare the following indicators of RRT, TMRRT, WFRRT and WFRRTLite: (1) nodes number which reflects the consumption of memory and computing resource, (2) exploration time which directly represents the exploration efficiency, (3) path length which is usually used to measure energy consumption. Then we discuss in detail the reasons for the results in Table 1.
For the number of nodes, taking the home scene as an example, WFRRTLite’s nodes are 22.32% less than that of RRT. The number of nodes in WFRRT has decreased by 28.20%. This means that WFRRT has only increased by 5.88% compared to the Lite version. It can be concluded that the regionbiased search based on wavefeaturesbased sampling has a greater contribution to reducing the number of nodes, while the consistent decision has a minor contribution. Compared with TMRRT, our method also exhibits excellent performance.
In the comparison of exploration time, WFRRTLite reduced the time by 10.53% in home scene, indicating that it has a contribution to saving time and improving exploration efficiency. It can be noted that TMRRT improves more than WFRRTLite because it has an advantage in making decisions about the BFP. Nevertheless, our complete version of WFRRT has stronger performance. The reason is that TMRRT requires the robot to reach the target point within a time period calculated from the ideal speed and distance to the target, without external interference. Therefore, robots do not hesitate, reducing the time consumption caused by turning. WFRRT’s decisions have a more global perspective, not only considering the next one target, thus saving time.
For the path length of the robotic autonomous exploration, WFRRTLite reduces by 7.76% compared to RRT in the home scene. The improvement is due to WFRRT Lite affecting the update of frontiers. Compared to WFRRT, the reduction is not significant. This is because WFRRT has a greater impact on robot navigation decisions. It can be noticed that the trajectory of TMRRT is the longest, although the exploration time is not. This is because only one goal is set during the memory time, while other algorithms update frontiers in realtime. The shorter exploration time is due to reduced steering, which is not included in the path length.
By comparing the two scenarios, it can also be seen in Table 1 that our method achieves better results in the office scenario than in the home scenario. This is because each room in the home has only one door, while the office scene has multiple doors, so frontiers are distributed in various directions, which further tests the decisionmaking module. Moreover, it is more spacious and easy to grow more nodes during sampling search.
To display the results more intuitively, a few representative experimental visualization graphs are shown in Fig. 6. It can be seen that the node density of the improved random search tree is reduced, and the growth is obviously inhibited, which promotes the exploration efficiency. It also can be seen intuitively that the improved algorithm makes the robot more determined and less change to the BFP. Therefore, there are fewer repeated paths due to the change of the target.
Prune’s necessity verification
We design experiments to evaluate the impact of the Prune module on exploration. 20 groups of experiments are done to compare “searchdecision” (SD) framework with “searchprunedecision” (SPD) framework. One of the explorations is visualized as Fig. 7.
It can be seen from Table 2 that the number of nodes, exploration time, and path length have increased significantly after removing the Prune module, indicating that the Prune module is very important to our technical framework (Fig. 8). We can find that in the office scene, the disadvantages of the SD framework are more obvious than that in the home. This is because that there are more boundaries of explored and unexplored areas in the office. The distribution of frontiers are more scattered. This brings difficulties to the decisionmaking. More swings and repeated paths will arise as shown in Fig. 7, leading to longer exploration time and more generated nodes. The Prune module can improve the efficiency of exploration and save energy losses caused by longdistance driving.
Realworld deployment
After testing WFRRT exploration effectiveness in the simulation, we deploy it in a real robotic exploration system using Kobuki mobile robot. For perception, RPLidar A1 laser range scanner is used which provides range measurements of up to 6 m with a coverage area of \(360^{\circ }\). We use Gmapping [34] for SLAM module. A star [36] and TEB [35] are used as the global and local planner. Intel NUC i7 is used as Kobuki’s computing unit for processing all the modules. A personal laptop is used to monitor robot status and visualize the map by RVIZ.
We construct realworld experiments in two scenarios. One is the laboratory scene, as shown in Fig. 9. It is about \(52~\text {m}^2\) and is separated to different fields of view with cardboard boxes. The sampling area is set to \(8~\text {m}*10~\text {m}\). The whole exploration process take around 153.38 s and the Kobuki robot travels 24.49 m. The final number of nodes is 4218. Another scene is an area enclosed by fences and cube boxes in a robot experimental venue. Another scene is an area enclosed byfences and cube boxes in a robot experimental venue, as shown in Fig. 8. The area is approximately \(82~\text {m}^2\). The sampling area is set to \(15~\text {m}*20~\text {m}\). The robot’s dynamic parameters are as follows: maximum linear velocity \(0.3~\text {m}/\text {s}\), maximum linear acceleration \(0.3~\text {m}/\text {s}^2\), maximum angular velocity \(0.5~\text {rad}/\text {s}\) and maximum angular acceleration \(0.2~\text {rad}/\text {s}^2\).
To describe the experiment process in more detail, we record the video during the experiment and capture some pictures of first person view (FPV) and third person view (TPV) at some key positions, as shown in Fig. 8. Due to the larger and more spacious venue, we relax the dynamic restrictions. In this experiment, we set the maximum linear velocity of the mobile robot to \(0.5~\text {m}/\text {s}\), the maximum linear acceleration to \(1.0~\text {m}/\text {s}^2\), the maximum angular velocity to \(1.0~\text {rad}/\text {s}\), and the maximum angular acceleration to \(1.0~\text {rad}/\text {s}^2\). To adapt bigger environment, the map resolution is adjusted to 0.15. The robot move 19.22 m to complete the mapping. During this period, a total of 3594 nodes grow. The implementations of the WFRRT exploration show successful attempts in exploring unknown regions.
Conclusion
In this paper, we propose an autonomous robotic exploration system with regionbiased sampling and consistent decision making, which is called WFRRT. The core is (1) rapidly random tree search method based on wave characteristics, which can suppress the growth of random trees in areas that have already been explored (called the safety zone in this article) to save memory and computational consumption, and (2) decisionmaking with allocation consistency. Consistency means that every time the robot makes a new decision on the order of exploration frontiers, it takes into account historical decision results and global optimality. It has two benefits. First, it can help the robot avoid frequent changes in direction leading to an increase in exploration time. Second, by encoding the states of robots and frontiers into the FSOTSP, the global optimality of access order can be maintained. In addition, we verify the necessity of the Prune module via experiments.
The proposed exploration method has been tested in simulation and realworld deployment. Results have shown it performs better than conventional RRT exploration and stateoftheart TMRRT algorithms in terms of memory consumption, exploration duration and travel distance. In the future, we will conduct further research on the mathematical interpretability of the framework of autonomous exploration and motion planning. Another noteworthy aspect is that WFRRT can be extended to multirobotic collaborative exploration systems.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
References
Yamauchi B, Schultz A, Adams W (1998) Mobile robot exploration and mapbuilding with continuous localization. In: Proceedings. 1998 IEEE international conference on robotics and automation (Cat. No.98CH36146)(ICRA), Leuven, Belgium vol 4, pp 3715–3720
Isler S, Sabzevari R, Delmerico J, Scaramuzza D (2016) An information gain formulation for active volumetric 3D reconstruction. In: Proceedings. 2016 IEEE international conference on robotics and automation (ICRA), Stockholm, Sweden pp 3477–3484
Delmerico J, Mueggler E, Nitsch J, Scaramuzza D (2017) Active autonomous aerial exploration for ground robot path planning. IEEE Robot Autom Lett 2(2):664–671
Dimastrogiovanni M, Cordes F, Reina G (2020) Terrain estimation for planetary exploration robots. Appl Sci 17:6044
Oleynikova H, Taylor Z, Siegwart R, Nieto J (2018) Safe local exploration for replanning in cluttered unknown environments for microaerial vehicles. IEEE Robot Autom Lett 3(3):1474–1481
Zhou B, Pan J, Gao F, Shen S (2021) RAPTOR: robust and perceptionaware trajectory replanning for quadrotor fast flight. IEEE Trans Robot 37(6):1992–2009
Wang Z, Li Y, Zhang H, Liu C, Chen Q (2020) Samplingbased optimal motion planning with smart exploration and exploitation. IEEE/ASME Trans Mechatron 25(5):2376–2386
Cieslewski T, Kaufmann E, Scaramuzza D (2017) Rapid exploration with multirotors: a frontier selection method for high speed flight. In: Proceedings. 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS), Vancouver, BC, Canada, pp 2135–2142
Dharmadhikari M et al (2020) Motion primitivesbased path planning for fast and agile exploration using aerial robots. In: Proceedings. 2020 IEEE international conference on robotics and automation (ICRA), Paris, France, pp 179–185
L. Schmid, M. Pantic, R. Khanna, L. Ott, R. Siegwartand J. Nieto, “An Efficient SamplingBased Method for Online Informative Path Planning in Unknown Environments,” in IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 1500–1507, April 2020
Song S, Jo S (2017) Online inspection path planning for autonomous 3D modeling using a microaerial vehicle. In: Proceedings. 2020 IEEE international conference on robotics and automation (ICRA), Singapore, pp 6217–6224
Keidar M, Kaminka GA (2012) Robot exploration with fast frontier detection: theory and experiments. In: Proceedings. The 11th international conference on autonomous agents and multiagent systems, vol 1, pp 113–120
Senarathne PGCN, Wang D, Wang Z, Chen Q (2013) Efficient frontier detection and management for robot exploration. In: Proceedings. 2013 IEEE international conference on cyber technology in automation, control and intelligent systems, Nanjing, China pp 114–119
Senarathne PGCN, Wang D (2015) Incremental algorithms for safe and reachable frontier detection for robot exploration. Robot Auton Syst 72:189–206
Umari H, Mukhopadhyay S (2017) Autonomous robotic exploration based on multiple rapidlyexploring randomized trees. In: 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS), Vancouver, BC, Canada, pp 1396–1402
Qiao W, Fang Z, Si B (2018) Samplebased frontier detection for autonomous robot exploration. In: IEEE international conference on robotics and biomimetics (ROBIO), 2018, Kuala Lumpur, Malaysia pp 1165–1170
Liu J, Lv Y, Yuan Y, Chi W, Chen G, Sun L (2023) An efficient robot exploration method based on heuristics biased sampling. IEEE Trans Ind Electron 70(7):7102–7112
Makarenko AA, Williams SB, Bourgault F, DurrantWhyte HF (2002) An experiment in integrated exploration. In: IEEE/RSJ international conference on intelligent robots and systems (IROS), vol 1, Lausanne, Switzerland pp 534–539
Meng Z et al (2017) A twostage optimized nextview planning framework for 3D unknown environment exploration, and structural reconstruction. IEEE Robot Autom Lett 2(3):1680–1687
Lau BPL et al (2022) MultiAGV’s temporal memorybased RRT exploration in unknown environment. IEEE Robot Autom Lett 7(4):9256–9263
Papachristos C, Khattak S, Alexis K (2017) Uncertaintyaware receding horizon exploration and mapping using aerial robots. In: IEEE international conference on robotics and automation (ICRA), Singapore, pp 4568–4575
Qin H et al (2019) Autonomous exploration and mapping system using heterogeneous UAVs and UGVs in GPSdenied environments. IEEE Trans Veh Technol 68(2):1339–1350. https://doi.org/10.1109/TVT.2018.2890416
Charrow B, Kahn G, Patil S, et al. “InformationTheoretic Planning with Trajectory Optimization for Dense 3D Mapping,” in Proceedings. Robotics: Science and Systems. 2015, Vol. 11, pp. 3–12
Zhou B, Zhang Y, Chen X, Shen S (2021) FUEL: fast UAV exploration using incremental frontier structure and hierarchical planning. IEEE Robot Autom Lett 6(2):779–786. https://doi.org/10.1109/LRA.2021.3051563
Lindqvist B, AghaMohammadi A, Nikolakopoulos G (2021) ExplorationRRT: a multiobjective path planning and exploration framework for unknown and unstructured environments. In: 2021 IEEE/RSJ international conference on intelligent robots and systems (IROS), Prague, Czech Republic, pp 3429–3435
Gao Y et al (2022) Meetingmergingmission: a multirobot coordinate framework for largescale communicationlimited exploration. In: 2022 IEEE/RSJ international conference on intelligent robots and systems (IROS), Kyoto, Japan, pp 13700–13707. https://doi.org/10.1109/IROS47612.2022.9981544
Andre T, Bettstetter C (2016) Collaboration in multirobot exploration: to meet or not to meet? J Intell Robot Syst 82(2):325–337
Amigoni F, Banfi J, Basilico N (2017) Multirobot exploration of communicationrestricted environments: a survey. IEEE Intell Syst 32(6):48–57
Dang T, Papachristos C, Alexis K (2018) Visual saliencyaware receding horizon autonomous exploration with application to aerial robotics. In: IEEE international conference on robotics and automation (ICRA) Brisbane, QLD, Australia.
Wang C, Chi W, Sun Y, Meng MQ (2019) Autonomous robotic exploration by incremental road map construction. IEEE Trans Autom Sci Eng 16(4):1720–1731
Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell 24(5):603–619
Helsgaun K (2000) An effective implementation of the Lin–Kernighan traveling salesman heuristic. Eur J Oper Res 126:106–130
Man KF, Tang KS, Kwong S (1996) Genetic algorithms: concepts and applications in engineering design. IEEE Trans Ind Electron 43(5):519–534
Grisetti G, Stachniss C, Burgard W (2007) Improved techniques for grid mapping with Rao–Blackwellized particle filters. IEEE Trans Robot 23(1):34–46
Roesmann C, Feiten W, Woesch T, Hoffmann F, Bertram T (2012) Trajectory modification considering dynamic constraints of autonomous robots. In: 7th German conference on robotics, pp 1–6
Hart PE, Nilsson NJ, Raphael B (1968) A formal basis for the heuristic determination of minimum cost paths. IEEE Trans Syst Sci Cybern 4(2):100–107
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant 52175032, the Key R & D Program of Zhejiang Province under Grant 2021C01065, the “Pioneer” and “Leading Goose” R &D Program of Zhejiang under Grant 2023C01070, and Robotics Institute of Zhejiang University under Grant K12107 and K11808.
Author information
Authors and Affiliations
Corresponding authors
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, J., Yu, H., Zheng, Z. et al. Autonomous robotic exploration with regionbiased sampling and consistent decision making. Complex Intell. Syst. 9, 6023–6035 (2023). https://doi.org/10.1007/s4074702301143y
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s4074702301143y