REF: A Rapid Exploration Framework for Deploying Autonomous MAVs in Unknown Environments

Exploration and mapping of unknown environments is a fundamental task in applications for autonomous robots. In this article, we present a complete framework for deploying Micro Aerial Vehicles (MAVs) in autonomous exploration missions in unknown subterranean areas. The main motive of exploration algorithms is to depict the next best frontier for the MAV such that new ground can be covered in a fast, safe yet efficient manner. The proposed framework uses a novel frontier selection method that also contributes to the safe navigation of autonomous MAVs in obstructed areas such as subterranean caves, mines, and urban areas. The framework presented in this work bifurcates the exploration problem in local and global exploration. The proposed exploration framework is also adaptable according to computational resources available onboard the MAV which means the trade-off between the speed of exploration and the quality of the map can be made. Such capability allows the proposed framework to be deployed in subterranean exploration and mapping as well as in fast search and rescue scenarios. The performance of the proposed framework is evaluated in detailed simulation studies with comparisons made against a high-level exploration-planning framework developed for the DARPA Sub-T challenge as it will be presented in this article.


Introduction and Background
Rapid exploration and mapping of unknown subterranean environments has become significant interest in the field of autonomous deployment of robots.MAVs have the potential in being a viable solution in terms of mines inspection [25], exploration and mapping [29], [30], [39] and inspection of infrastructures [32] due to their high degree of freedom and fast traversability.The applications of MAVs have also been discussed in developing next generation rotor crafts for mars exploration in [37] and [38].Deploying MAVs for exploration and mapping of dark, dusty and hostile mines and caving systems is particularly challenging because at the beginning of the exploration process, the environment is completely unknown for navigation.In order to map surrounding for safe navigation in such environments, vision-only based navigation techniques are insufficient [36].The unstructured and rocky environment of mines and caves is a major challenge that contribute in uncertainty in sensor measurements [1].In order to explore and map such environments, the crucial requirements for autonomous navigation problem are: a) detecting the frontiers, b) selecting the optimal frontier, and c) safe navigation to such frontier in order to successfully build a map of the environment.In order to safely navigate in an unknown environment, it is crucial that the MAV is backed by a sophisticated on board autonomy for Guidance, Navigation and Control (GNC).In Figure 1 and Figure 2 exploration instances of the proposed method is shown in different environments.The capability of the proposed method to handle exploration of narrow passages as well as wide tall void like structures is evident in Figure 2. The framework introduced in this work selects optimal frontiers based on the idea of continuing the exploration in one direction until there is no new potential information to gain in the particular direction.Planning a safe path to such selected frontiers is crucial when exploring a large environment.The path planning method used in this work takes into account the safety margin of such paths based on the size of the MAV and its ability to traverse through the obstructed areas.The MAVs are also constrained in terms of their limited time of flight.Therefore the proposed framework also accounts for cost based frontier selection while evaluating next optimal area to visit.The proposed framework also complements the idea of efficiently utilising the resources of the vehicle by rapid yet safe navigation.This work presents a rapid exploration framework for safe autonomous navigation of MAVs in caves.The point cloud map of the explored virtual cave environment with the MAV's trajectory is presented in Figure 3.

Related Works
In the original work of frontier based exploration [49], the points lying at the boundary between known (free) space and unknown space are defined as frontier points.In [49] a closest frontier from the robots position is selected to move to such that the boundary at which frontiers lie will also progress towards more unexplored space.The same approach was also extended for the case of multiple robots, as presented in [50].In [21] and [23] frontier based exploration strategies are studied extensively for comparison against different exploration approaches.A 3D Frontier Based Exploration Tool (FBET) for aerial vehicles is presented in [53].The FBET framework uses similar approach to [49] for frontiers generation and the generated frontier are clustered for selection of candidate frontier goal based on cost function that takes into account the cost of moving to such goal point.A Stochastic Differential Equation (SDE) based exploration approach is presented in [45].In the SDE based exploration strategy the authors consider simulating Fig. 3: MAV trajectory while exploring a wide-large cave environment using the proposed approach expansion of system of particles with Newtonian dynamics for evolution of SDE.In [45] the authors consider the region showing significant expansion of particles as a region that would lead the MAV to more unexplored space.In [19] A vision based exploration-mapping problem solving technique is presented that also utilizes MAV to navigate in unexplored areas using continuously updating frontiers.Exploration of unknown environments are also extended to legged or ground robots.Probabilistic Local and Global Reasoning on Information roadMaps (PLGRIM) as presented in [27], discusses a hierarchical value learning strategy for autonomous exploration of large subterranean environments.The methodology presented in [27] uses a hierarchical learning to address local and global exploration of large scale environments while focusing on near optimal coverage plans.A Frontloaded Information Gain Orienteering Problem (FIG-OP) based strategy is presented in [40] that uses topological maps to plan exploration paths in fixed time budget exploration scenario.The method presented in [40] is tested with ground robots in multi-kilometers subterranean environment targeted at time constrained exploration missions.Separated from frontier-selection methods are the methods with integrated exploration behavior in the path planning problem, often based on trying to plan a path in order to maximize the information gain, while minimizing distance travelled or similar metrics.These planners generally fall under the Next-best-view approaches as in [41][4] [9] and have seen great application success, but other methods in similar directions exists, such as ERRT [29] takes into consideration also actuation effort along with information gain in order to yield more efficiency towards exploration of unknown and unstructured areas.Additionally, the Rapid exploration method proposed in [7] is developed to maintain a high MAV velocity, while exploring.Autonomous inspection of structures by utilizing a frontier based algorithm, along with a Lazy Theta * path planner, is presented in [17].Finally, an information driven frontier exploration method for MAV, which uses a hybrid approach between control sampling and frontier based is presented in [8].As state-of-the-art exploration method presented in [12] is tailored and deployed in large-scale exploration mission both in simulations and real world experiments.The developed planner is structured around motion primitives that search for admissible paths, taking advantage of efficient volumetric mapping with collision checks and future-safe path search that evaluates the variation of speed along the path, while also maximizing the exploration gain for an overall fast navigation scheme.Moreover, in [44] an exploration approach that combined frontiers with receding horizon next-best-view planning has been proposed.The frontiers are part of the global planning part, while the next-best-view is responsible for the local exploration part.In [48] a dynamic exploration planner (DEP) for MAV exploration, based on Probabilistic road map has been presented.The sampling nodes are added incrementally and distributed evenly in the explored region, while the planner uses Euclidean Signed Distance Function map to optimize and refine local paths.The exploration scheme in [5] presented the Permutohedral Frontier Filtering, which is based on bilateral filtering with permutohedral lattices to extract the score-based spatial density of the selected frontiers.Multiple studies have also incorporated visual servoing based path planning and control architectures for mobile robots as presented in [13].
The authors in [16] have formulated gaussian functions based control architecture for mobile robot that rely on mainly visual information of surrounding.The authors have extended the work further in [14] that uses decision trees as well as adaptive potential area methods to achieve autonomous control of mobile robots in real life applications.
In the field of sampling based space mapping area the research studies presented in [15], uses bi RRT method to smooth the RRT path using curve fitting methods.In [15] the Ability to navigate from start to goal position using the smooth path by curve fitting also addresses the problem of actuation of robot if extended for MAV in future.More path planning and control approaches have been discussed in a case study presented by the authors in [35] discuss in detail the planning and control of automated ground vehicles in industry.Various planning algorithms have been developed for navigation of aerial platforms in unknown environments, where in general they can be divided in map-based or memory-less approaches or their combination.In [43] a hierarchical planning framework that combines map building from fused depth data, as well as instantaneous depth data, both organized into separate K-D trees has been proposed.The planner relies on a slower global planner to get a goal location, which is evaluated using motion primitives against the K-D trees with the lowest cost candidate primitive to be selected.In [51] a motion planning method for fast navigation of autonomous MAVs has been developed.The algorithm divides the environment modeling in two parts: i) the deterministically visible area within the on board sensor range, and ii) the probabilistically known area beyond the sensor range from a-priory map.The planning method maximizes the likelihood of reaching a goal, where a finite set of candidate trajectories are separated into groups and evaluated for collisions.In [33] a navigation method for MAVs based on disparity image processing has been proposed.More specifically, the disparity image is used for direct collision checking, incorporating C-space expansion of obstacles.The motion planning part verifies obstacle free trajectory, projecting them into the disparity image and comparing their disparity values with the C-space disparity values for collision checking.In [6] a memory-less planner that is partitioning free space in pyramids, using direct depth image measurements has been demonstrated.The use of spatial generation of pyramids of the free spaces, allows for labeling obstacle free trajectories that lie inside the pyramids, while achieving fast generation of large number of candidate trajectories and performs collision checks.In [2] the authors present a reactive navigation system for MAV exploration.The developed algorithm is based on a two layered planning architecture that leverages the global environment map for frontier generation and local instantaneous sensor data for obstacle avoidance based on artificial potential fields.In [46] "FASTER" has been developed, an optimization based planning approach for fast and safe motion in unknown environments.The planner leads to high-speed navigation by allowing to plan in known and unknown configuration space using a convex decomposition in a two trajectory design approach, a fast and a safe trajectory.In [31] a reactive navigation and collision avoidance scheme for MAVs that combines a layer of obstacle detection based on 2D LiDAR with NMPC constraints was proposed for agile local navigation.In [24] a collection of sensor based heading regulation methods have been proposed for aerial platform navigation along underground tunnel areas.In this work the heading regulation methods using i) image centroid calculation from either single image depth estimation, or dark area contour extraction, or CNN for dark area extraction and ii) from processing 2D lidar measurements have been described.In [18] a mapping for motion planning architecture that queries for the minimum-uncertainty view of a point in space, searching a set of recent depth measurements under noisy relative pose transforms has been presented.This work enables the identification of local 3D obstacles in the presence of significant state estimation uncertainty, evaluating motion plans.Table 1 summarizes the SoA exploration strategies, while highlighting the contribution of REF.

Contributions
The exploration, global planning and navigation architecture of this work is part of the development efforts within the COSTAR team [1], [34] related with the DARPA Sub-T competition [11], while it is directly applicable for cave environments.Based on the above mentioned state-of-the-art, the key contributions to this article are listed in the following manner.
-The main contribution of this work stems from the development of safe frontier points generation and local as well as global cost based candidate frontier point selection method.In the presented work we extend the classical and rapid frontier exploration approaches with the improvements concerning safety of MAVs in field as well as maintaining agile nature of exploration.The proposed approach focuses on local frontier selection that take into account the position of such frontier relative to any static or dynamic obstacle in the field of view while also minimizing yaw movement of MAV.When no such frontier exist in the local field of view, the global re-positioning of the MAV is triggered in order to lead the MAV to global frontiers that lead the MAV to more unexplored space.The global re-positioning method is formulated such that it associate cost based on the overall actuation effort required by the MAV to move to a global frontier.The proposed global re-positioning of the MAV considers various factors such as MAV safety, actuation cost as well as how much of the unexplored space will be seen from a potential global frontier.Such contribution differentiate our method from other rapid frontier exploration approaches that directly switch to classical frontier approach, instead in our method MAV globally re-positions itself based on multi layer cost assignment in global frontier selection.As it will be presented, such contribution is particularly important in multi branched tunneling or caving system exploration scenarios.-The second contribution presented in this article is the development of overall autonomy framework which addresses the problem of exploration, safety margin based path planning and reactive navigation through nmpc based control of MAVs.The dedicated risk aware path planning and potential fields based avoidance scheme incorporated within the proposed framework allows to push the limits of exploration in the candidate goal selection process in wide, narrow and obstructed environments.
Extensive simulations are performed for validating the proposed framework in multiple difficult scenarios in order to benchmark the safety, speed and versatility the presented autonomy modules.
The rest of the article is structured as follows.section 2 presents the problem formulation considered in this work.section 3 presents the proposed safe frontier points generation as well as intelligent goal selection with the focus on safe yet fast autonomous exploration addressing the minimizing actuation effort of the MAV.The section also describes the overall autonomy framework which is the combination of exploration, global path planning as well as nmpc based reactive navigation.In section 4, detailed analysis on simulation experiments are presented that prove the efficacy of the proposed scheme.
Finally section 5 provides a discussion with concluding remarks of the proposed work.

Problem Formulation
The problem considered in this work is exploration of bounded 3D space denoted as Occupancy probability based modelling is adapted in order to model free, occupied and unknown space around the robot.The provided framework is targeted for a direct use case in field robotics where autonomous robots are deployed in cave and mines for rapidly mapping the areas.In the case of exploration of bounded space, the exploration will be considered complete when V occupied V f ree = V while V unknown = ∅.In the proposed approach, exploration process is subject to vehicle actuation effort, limitations in time of flight as well as risk margin for safe path generation.
In order to be deployed in real scenario, the exploration and planning framework should adapt based on the available computational resources.The performance evaluation of the proposed framework will be based on exploration time as well as actuation efforts of the MAV.

Proposed Approach
The proposed approach employs a frontier based exploration technique which is modified with the focus on making exploration fast, safe and versatile based on available computational resources.We use occupancy grid maps as a mapping framework, which can generate a 2D or 3D probabilistic map.A value of occupancy probability is assigned to each cell that represents a cell to be either free or occupied in the grid.In this work we are targeting 3D exploration of bounded and unbounded space therefore using the baseline framework of OctoMap [22] we build on top of it in order to develop the proposed 3D occupancy checking framework used in this work.Let us denote a voxel as v(x, y, z).The voxel v is subdivided into eight smaller voxels until a minimum volume is reached.The minimum volume is same as the octree resolution vres.Corresponding to each sensor update if a certain volume in the octree is measured and if it is observed to be occupied, the node containing that particular voxel is marked as occupied.Using ray casting operation for the nodes between the occupied node and the origin (sensor), in the line of ray, can be initialized and marked as free.This process leaves the uninitialized nodes to be marked unknown until the next update in the octree.Let us denote the estimated value of the probability P (N | z 1:t ) of the node N to be occupied for the sensor measurement z 1:t by: where Pn is the prior probability of node N to be occupied.Let us denote the occupancy probability as for node N to be occupied as

MAV current position
Let us define the sensor range R and a sphere of radius r around the MAV.This radius r will be denoted as cleaning radius from here after.The after each update in current octree, if a frontier lie inside this sphere, the frontier is marked as seen and such frontier is deleted from {F}.The cleaning radius is defined such that r < R therefore new frontiers will always be generated at distance R and as the MAV navigate towards frontier, the frontiers lying within the sphere of radius r are deleted and less number of frontiers need to be iterated through in candidate goal selection process.The iterator is defined as it.The meanings of the important notations used in this work is presented in Table 2 The exploration framework presented in this work is is made up of three essential modules, namely the safe frontier point generator, the cost based frontier point selection incorporating also collision check and finally candidate goal publisher as presented in Figure 5.
The first module takes the Lidar point cloud as input and based on the occupancy probability formulation as mentioned earlier, converts the sensor measurement in order to form an octree.The octree is defined as tree data structure in which each sub node is further divided into eight quadrants until minimum volume is reached.The safe frontier point generator module generates all safe frontiers based on the octree as depicted in algorithm algorithm 1.Let us define a risk margin parameter m related to the voxel grid resolution vres.At any instance in the exploration if node N being currently checked for to be considered as a safe frontier then we also check the neighbouring adjacent nodes defined as N adj within the safety margin m.
In our approach we formulate an additional layer of requirement in which we check the neighbouring Voxels of an uninitialized (Unknown) Node N as described earlier and N adj ≤ Pn) than the Node N is considered as safe frontier node and is added to {SF}, where {SF} is a set containing all safe frontiers.This means that a particular node N , it's adjacent node N adj as well as all nodes in the neighbourhood of node N within the range of m * vres are checked and if all such nodes are seen to be free than the node N is considered to be a safe frontier.To be marked as a frontier, each node should have at least n number of minimum unknown or free adjacent nodes.This process makes a big difference in the computational complexity of the process because by specifying a certain risk margin m and minimum unknown or free neighbours k at the start of exploration, the trade off can be made between number of iterations and coverage quality.Another improvement our approach presents is that by not allowing any frontier to be close enough to an occupied node in context of risk margin, we guarantee that inaccessible frontiers can be eliminated which are generated due to the error in probabilistic occupancy mapping.The inaccessible frontiers are defined as the frontiers that are not safe to reach or impossible to reach in terms of MAV size and dynamics to pass through small openings in the map.This simply implies that risk margin can be set in correspondence with the size of the MAV such that the inaccessible areas can be patched and modelled as occupied in the map.The parameters m as well as k are proposed with the focus of testing the proposed approach in extremely difficult areas such as caves and mines where safety of the robot is a major concern.
As defined in algorithm 2, corresponding to each new sensor measurement we check if a N ∈ {F } is still a frontier.We define a candidate frontier set denoted as {C} ⊂ {F} which contains all the valid frontiers which will be examined based on the MAV's position.A 3D Lidar is used in the proposed method to get sensor point cloud and thus, the framework generates frontiers in all directions surrounding the MAV but limited in the vertical directions with field of view V β .In algorithm 2, we classify the frontier nodes in two further sets {L}, {G} {C} named as Local and Global set respectively.Such Local and Global sets contain frontier nodes classified based on the selected horizontal and vertical field of view H θ and V β respectively as shown in Figure 4.
This process allows us to prioritize the unknown space lying ahead of the MAV and if there exist no unknown space ahead of the MAV, the candidate goal is selected based on the global cost based goal assignment.
The frontier points from occupancy formulations are generated in the world frame (W) but the frontier vector f is calculated relative to the MAV body frame {B}.As shown in Figure 4, the angle α is calculated with respect to B. If a frontier f and MAV's current position in inertial frame W is defined as f (x, y, z) and C(x, y, z) respectively then the angle α and γ with respect to body frame B can be computed as, where, ψ is the heading angle of the MAV and h is the vertical height of the footprint of 3D LiDAR field of view.
As discussed previously the algorithm 1 also outputs a list of occupied nodes {O} which has occupancy probability P o higher than 0.5 thus considering the the cluster of occupied points lying in the field of view, the the frontier nodes having a lesser avoidance cost are also favoured to be the N extBestF rontier.The cost formulation for selecting a local or global candidate goal is as follows.If we define the current position of the MAV as C(x, y, z) then the costs for local and global frontier selection can be formulated as, Where, Wo, W h , Wz and W d ∈ R are defined as weights associated to avoidance, heading, height difference and distance cost respectively.We define the actuation effort E as a function of cost such that where, T hover is the minimum thrust required for hovering with zero torques about the MAV arms.Thus, by optimally selecting the next pose reference command for the MAV the actuation effort can be minimized.The MAVs consume high energy to produce yaw torque due to the motor saturation constraints while also keeping the MAV hovering.The overall autonomy scheme of the proposed work is presented in Figure 5.As discussed earlier, the framework uses 3D LiDAR or a camera depth cloud as point cloud input and upon point cloud filtering, the framework generates an octree of occupied, free and unknown nodes.Using the workflow described in algorithm 1, the framework detects frontier points and classifies set of safe frontiers.As presented in the autonomy and navigation scheme (Figure 5), based on local or global frontier, the risk aware global planning module plans a collision free path to the next best frontier.The N BF is then fed into the reactive navigation and control framework to actuate the MAV to navigate to selected frontier point.In Figure 5 APF stands for Artificial Potential Fields that we have incorporated with Nonlinear Model Predictive Control for collision avoidance.The baseline framework for reactive navigation and control used in this framework is inspired from our previous work [30].The Next Best Frontier is sent to a risk aware global planning module which is the extension of D * Lite algorithm but implemented with octomap framework in this case.The global planning module D * + uses the modelled occupied space in order to plan a safe path to the N BF .The risk margin formulation in an expandable octomap grid for global planning is presented in detail in our previous work [26].

Exploration Mission Experiments
In order to validate and test the performance of our proposed exploration approach we use the M100 MAV provided in the open source Rotors Simulator [20] framework.Next-Best-view [4] has been widely used for bench-marking the exploration-planning algorithms.In this work we compare our framework with the latest version of NBV, State-of-the-Art Motion Primitive Based planner (mbplanner) [12] which is developed also as part of the development efforts within DARPA Sub-T challenge.We use a custom cave model with multiple junctions, obstructed walls, narrow openings, steep slopes as well as tunnels with dead-ends for simulation.The cave environment has been made open sourced for public [3].For fair comparison all simulations are performed with same computational unit having Intel core i7 processor and 16 GB memory on ROS Melodic running on Ubuntu 18.04.For mbplanner also the simulations are performed using the cave virtual world where the tuning of parameters such as MAV velocity, mapping resolution and sampling time, was similar to the ones used for the proposed method.In Figure 6 different exploration instances are shown.As described in section 3 the proposed framework (REF) also uses frontier cleaning radius and due to which coverage of large cave like voids can also be performed while exploring.Using the proposed framework the MAV is also able to navigate in narrow and obstructed passages and at the end of such passages if a void like area can also be covered efficiently.The simulation experiment is also carried out to explore a multi-branched virtual cave environment having narrow passages continuing in different heights for a true 3D exploration.The environment is also made open source [28].In Figure 7 the exploration of the virtual cave environment is shown.
In Figure 8 and Figure 9 the explored volume and distance covered by the two exploration frameworks is presented.Figure 8 and Figure 9 depict that our method performs significantly close to the State-of-the-Art mbplanner in terms of exploration volume of the cave environment and distance covered respectively.The proposed approach achieves slightly higher explored volume for the same mission time, this is Fig. 7: Multi-branched large 3D virtual cave world exploration using proposed framework because of the novel Next Best Frontier selection approach as adapted in section 3.As presented in Figure 12, MAV trajectory in our approach is significantly inline with the goal of maximizing the movement into unknown areas while limiting repeated visits to already mapped areas.In Table 3 the exploration volume and distance travelled by the MAV in multiple different runs with different mission duration is presented for both planning framework.As it is evident from Table 3 that the proposed Rapid Exploration Framework (REF) shows higher exploration volume as well as ground covered by the MAV in multiple different runs because of the nature of computing next paths while navigating to current path.All missions considered in Table 3 have same start positions for both planning framework and the MAVs do not return to base in considered cases therefore, showing the exploration capability comparison in given time with same configuration.However, it is also important to mention that even though the V unknown sampling approach in both method is different, the next way points in both case are selected with the focus on maximizing the information gain and exploration volume in same time.
In Figure 11 the exploration mission trajectories are shown for REF and mbplanner with same mission duration (400 s).In Figure 11 it is evident that the MAV covers more ground in given time using the proposed framework.
In Figure 12 in both method the overlap in trajectory is seen.This overlap is mainly due to the lower information gain (corresponding to mbplanner) and {L} = ∅  In this article we proposed a Rapid Exploration Framework for deploying autonomous MAVs in unknown areas such as caves and mines.We present a novel candidate goal selection method with the focus of minimizing the actuation effort of the MAV by employing Look Forward Move Ahead approach.We compare the exploration scenario in the same environment with the motion primitive based planner which is a remarkable extension to Next Best View approach.In terms of volumetric gain and distance travelled, we achieve similar results to that of the mbplanner.We also address the trajectory overlap issue by introducing a simple yet efficient cost based goal selection approach that prevents the MAV to Unnecessarily travel to previously visited areas while also keeping the look forward move ahead approach as priority.As future development efforts we plan to conduct some field experiments to explore abandoned mines and underground cave structures.

Fig. 1 :
Fig. 1: DARPA Sub-T world : Exploration instance.(1) Rapid local exploration behaviour (2) local exploration in very narrow as well as wide cave-void like areas (3) Safe Next Best Frontier (NBF) in obstructed narrow tunnels

Fig. 2 :
Fig. 2: Exploration behaviour using the proposed framework in multiple exploration scenarios in DARPA Sub-T virtual world

Fig. 4 :
Fig. 4: Frontier classification and notations used in the proposed framework

Fig. 5 :
Fig. 5: The proposed overall autonomy and navigation scheme

Fig. 6 :
Fig.6: DARPA-Sub-T virtual world: Exploration of narrow-confined passages as well as large cave like voids using proposed framework.In (1,2,3) the rapid exploration-coverage nature of the proposed framework is shown.In (4) the safe way point selection and risk aware planning to safe frontier is shown

Table 1 :
Different exploration frameworks and their corresponding explorationplanning approach [REF] Safe frontier generation for local and global exploration & local frontier selection based on heading and avoidance cost & heading regulation, height difference and travel to frontier cost based global re-positioning when local exploration gain is low

Table 2 :
Description of the notations used in proposed methodology .

Table 3 :
Exploration volume and distance from multiple runs