1 Introduction

The field of underwater robotics has grown significantly in recent decades. Nowadays, not only are Remotely Operated Vehicles (ROVs) safely and routinely used, for example in the offshore industry, but Autonomous Underwater Vehicles (AUVs) are also increasingly deployed. Their ability to carry out missions without close human intervention and supervision, and without a tether, makes them extremely valuable. They can reach locations otherwise not accessible (e.g. under ice), they do not require a support vessel, and they can carry out missions autonomously, without constant input from a human operator. Marine robotics has reached a level of maturity at which new areas are emerging, for example its role in sustainable development Maurelli et al. (2020). There are, however, still many open research areas on the way to making those vehicles persistently autonomous. One of these areas is safe and reliable autonomous navigation. An autonomous vehicle indeed needs to demonstrate high autonomy in order to keep track of its position and orientation and to build a map of its environment while successfully completing autonomous tasks. In this paper the authors focus on the localisation problem, as it is an important basic functionality that autonomous vehicles need to be able to perform seamlessly.

The progress in deploying advanced localisation techniques comes at the same time as a growing range of applications for automated survey, for which autonomous vehicles are an ideal platform, such as:

  • marine habitat assessment and monitoring, invasive species monitoring;

  • mine countermeasures and protecting harbours;

  • natural disaster and oil spill response;

  • operations around man-made underwater infrastructures;

  • underwater archaeological missions.

The common denominator of these missions is the need for the vehicle to know its own location, in order to be able to fulfil them.

Most vehicles can be equipped with an Inertial Measurement Unit (IMU) and/or with a Doppler Velocity Log (DVL), which provide noisy estimates of the vehicle motion. Although this represents a significant improvement with respect to using the hydrodynamic model and the command history, the use of proprioceptive sensors alone leads to an error increasing over time, due to the integration of noisy measurements. It is therefore important to bound the uncertainty. This can be done in various ways. Equipping the vehicle with sensors to perceive the environment is one possibility, which works if there are enough distinctive features. In this case, the usual sensor used in the underwater world is the sonar, as sound waves can propagate in water over long distances. Vision can also be used when the water is sufficiently clear, mostly at short distances. As those sensors may not always be helpful for localisation, in such cases the use of external aids and off-the-shelf external positioning systems is necessary.

The problem of self-localising a robot can be formalised as the determination of its pose in the operating environment, given the observation history, the command history, and the knowledge of the environment. Analytically, it can be expressed as estimating the probability distribution:

$$\begin{aligned} p(x_t|z_{0:t}, u_{0:t}, m) \end{aligned}$$
(1)

where \(x_t\) is the robot position in the environment at time t, \(z_{0:t}\) is the sensing history \(\{z_0, z_1, \ldots, z_t\}\) up to time t, \(u_{0:t}\) is the command history \(\{u_0, u_1, \ldots, u_t\}\) up to time t, and m is the map of the environment Thrun et al. (2005).

Although the general pose of an underwater vehicle has six components, three spatial and three angular, additional sensors such as a compass, an altimeter and a depth sensor, together with the vehicle configuration, can help to reduce the state space, thus improving the efficiency of the various localisation algorithms.
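The recursive estimation of the distribution in Eq. (1) alternates a prediction step, using the command \(u_t\), with an update step, using the observation \(z_t\) and the map m. A minimal one-dimensional histogram-filter sketch of this recursion is given below; the corridor map, motion model and noise values are illustrative assumptions, not taken from any cited work.

```python
import math

def normalize(belief):
    """Scale a belief vector so its entries sum to one."""
    total = sum(belief)
    return [b / total for b in belief]

def predict(belief, slip=0.1):
    """Motion update for a commanded one-cell step along a corridor:
    most probability mass moves one cell forward, some stays put
    (modelling slip/drift)."""
    n = len(belief)
    new = [0.0] * n
    for i, p in enumerate(belief):
        new[min(i + 1, n - 1)] += p * (1.0 - slip)
        new[i] += p * slip
    return new

def update(belief, measured_range, cell_ranges, sigma=1.0):
    """Measurement update: reweight each cell by the Gaussian
    likelihood p(z_t | x_t, m) of the measured range to the end wall."""
    weighted = [p * math.exp(-0.5 * ((measured_range - r) / sigma) ** 2)
                for p, r in zip(belief, cell_ranges)]
    return normalize(weighted)
```

Alternating predict and update yields the recursive Bayes filter of Thrun et al. (2005), of which the Kalman and particle filters discussed throughout this review are the parametric and sampling-based realisations.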

This work will review current approaches, highlighting an important classification of localisation approaches: passive and active. By the word passive, the authors mean without the vehicle control in the loop. It signifies that the robot tries its best to estimate its state, given the sensing history and the command history. There is no decision making involved to facilitate the localisation process: the module takes the sensor values as input and gives the best state estimate as output.

It is to be noted that some of the related work cited in the Passive Techniques section presents algorithms which employ active techniques. This is due to the different semantics attributed to the adjectives passive and active. All the work cited in that section complies with the above definition of passive: no decision making involved. This ambiguity derives from the subjects to which the adjectives refer. In the case of this article, passive and active relate to the vehicle. As long as there is no decision making, the vehicle is passive, whilst when the control is in the loop and a specific decision related to localisation is made, the vehicle is active. In other works, the adjectives refer to devices placed in the environment to facilitate the task of localisation. Therefore, according to those authors, a passive localisation approach deals with passive devices - for example a cat's eye acoustic buoy (a device which reflects tuned sonar signals at specific frequencies depending on its composition, so as to be more easily detected Smith and Williams 2005; Canning 2008) - whilst an active localisation approach deals with active devices - for example an acoustic pinger, which actively sends an acoustic wave. Both approaches are classified as passive for the scope of this review.

This article is organised as follows:

  • section 2 will present work in the field of passive localisation. Several approaches to the localisation problem are presented, with a particular emphasis on the use of filtering techniques;

  • section 3 will present work in the field of active localisation, and summarise the current body of knowledge in active decision making for localisation.

Finally, the closing section will summarise the main techniques, present the shortfalls of the main approaches and outline the open questions addressed in this review.

2 Passive techniques

This section will review current approaches in passive underwater localisation, with a focus on filtering techniques. The various techniques presented are clustered by type of approach, with the aim of providing a useful overview for the reader.

2.1 Seabed transponders

The earliest method using transponders, put forward by Smith & Cheeseman, uses an Extended Kalman Filter (EKF) to jointly estimate the state of the AUV and the positions of the transponders Smith and Cheeseman (1986). This approach has been used with success in various cases, for example by Kantor and Singh (2002), Kurth et al. (2003), Djugash et al. (2005), but the filter suffers from the linearisation of non-linear models, which can quickly lead to divergence as the covariance estimates become unreliable. In addition, the complexity of the EKF grows quadratically with the number of beacons (in the two-dimensional case). For example, the prediction step of the EKF only affects the state of the AUV, but involves the computation of the Jacobian of the motion model, which also takes the transponders into account.

Several methods have been put forward to reduce the complexity of the EKF algorithm, such as the linear state augmentation principle or the partitioned update approach, both presented by Bailey & Durrant-Whyte in Bailey and Durrant-Whyte (2006). These methods have a complexity linear in the number of beacons.
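The linear cost can be seen directly in the structure of the covariance prediction: since the motion model moves only the vehicle, only the vehicle block and the vehicle-beacon cross-covariances change, while the beacon-beacon block is untouched. The pure-Python sketch below illustrates this (the 2-D vehicle state, matrices and dimensions are assumptions for illustration, not any cited implementation):

```python
def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def ekf_predict_sparse(P, Fv, Q):
    """EKF covariance prediction exploiting sparsity: the motion model
    moves only the vehicle, so only the vehicle block Pvv and the
    vehicle-beacon cross terms Pvm change -- O(n) in the number of
    beacons, instead of the O(n^2) dense product F P F^T + Q."""
    nv = len(Fv)                      # vehicle state dimension
    n = len(P)
    Pvv = [row[:nv] for row in P[:nv]]
    Pvm = [row[nv:] for row in P[:nv]]
    new_Pvv = matmul(matmul(Fv, Pvv), transpose(Fv))
    new_Pvv = [[new_Pvv[i][j] + Q[i][j] for j in range(nv)]
               for i in range(nv)]
    new_Pvm = matmul(Fv, Pvm)         # cross terms: one pass over beacons
    new_Pmv = transpose(new_Pvm)
    out = [new_Pvv[i] + new_Pvm[i] for i in range(nv)]
    for i in range(n - nv):           # beacon-beacon block copied as-is
        out.append(new_Pmv[i] + [P[nv + i][nv + j] for j in range(n - nv)])
    return out
```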

Scherbatyuk analysed the use of a single transponder in an LBL (Long Base Line) configuration. In addition to successfully demonstrating his approach in simulation, he also pointed out the main disadvantages of this technique: transponder installation, calibration, recovery and possible losses Scherbatyuk (1995).

The theory behind a single beacon LBL approach is well described in the work of Webster et al. (2010). The work presented also extends the use of this technique to multiple vehicles. Synchronous clocks on the reference beacon and the vehicles enable the measurement of one-way travel-times, whereby the time of launch of the acoustic signal at the reference beacon is encoded in the acoustic broadcast and the time of arrival of the broadcast is measured by each vehicle. The decentralised navigation algorithm, running independently on each vehicle, is implemented using the information form of the Extended Kalman Filter.
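As a back-of-the-envelope illustration of the one-way travel-time principle: the beacon encodes its launch time in the broadcast, the vehicle's synchronised clock timestamps the arrival, and the difference maps to a range. The nominal sound speed below is an assumption; in practice it varies with temperature, salinity and depth, and a measured sound speed profile is used.

```python
SOUND_SPEED = 1500.0  # m/s, nominal value for sea water (assumed)

def one_way_range(t_launch, t_arrival, sound_speed=SOUND_SPEED):
    """Range implied by a synchronised one-way travel time."""
    return sound_speed * (t_arrival - t_launch)

def range_bias_from_clock_offset(clock_offset, sound_speed=SOUND_SPEED):
    """Any residual clock desynchronisation maps directly into a range
    bias, which is why synchronised clocks are essential here."""
    return sound_speed * clock_offset
```

Note how tight the synchronisation requirement is: a 1 ms clock offset already produces a 1.5 m range bias.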

Other relevant work using a single beacon for localisation has been presented by Ferreira et al. in Ferreira et al. (2010). To determine its horizontal position, the AUV fuses distances to the single beacon with dead reckoning data, heading and longitudinal velocity. The authors implemented both a Particle Filter and an Extended Kalman Filter and compared the two techniques in terms of performance. A merged solution combining Particle Filters and Extended Kalman Filters exists, and is a subset of the solution proposed in Maurelli et al. (2009).

Willumsen et al. additionally showed the feasibility for the AUV to locate itself with the aid of a single transponder, analysing the efficiency of different approaches, together with the quality and quantity of available information Willumsen et al. (2006). Both active (range) and passive (differential range) measurements achieve a good accuracy. As expected, using both range and differential range gives a quicker convergence of the position estimate. All these aids, however, need time and movement to obtain good navigation.

The idea of underwater transponders can evolve into the idea of a proper underwater station, which not only helps the AUV to localise itself, but can also provide tools for recharging and data exchange. The work proposed by Maki et al. deals with AUV localisation with respect to a seafloor station Maki et al. (2011). The key idea is acoustic mutual orientation and communication. The AUV first sends a signal to the station and then receives the reply with a hydrophone array to estimate the direction and distance to the station. On the other hand, the station estimates the direction to the AUV by receiving the signal with a hydrophone array, and sends the information back to the AUV. Then, the AUV can estimate its position and orientation in station-fixed coordinates without drift. Furthermore, these measurements are fused with other on-board sensors, such as a DVL, an angular rate gyroscope and a depth sensor, by a particle filter, a probabilistic approach, in order to realise stable positioning robust against sensor noise and missing measurements.
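The geometry of such a mutual measurement reduces to a spherical-to-Cartesian conversion: from the range and the direction (bearing and elevation) that the station measures with its hydrophone array, the AUV position in station-fixed coordinates follows directly. The sketch below is illustrative only, and its angle conventions are assumptions.

```python
import math

def auv_position_station_frame(range_m, bearing_rad, elevation_rad=0.0):
    """Position of the AUV in station-fixed coordinates, given the range
    and the direction (bearing measured in the horizontal plane,
    elevation measured upward from it) estimated by the station's
    hydrophone array."""
    horiz = range_m * math.cos(elevation_rad)
    return (horiz * math.cos(bearing_rad),
            horiz * math.sin(bearing_rad),
            range_m * math.sin(elevation_rad))
```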

A different approach to acoustic sensor fusion for AUV navigation was presented by Rigaud et al., showing the advantages of acoustic data fusion versus a dead reckoning strategy Rigaud et al. (1990). The matching of the absolute and relative reckoning modules permits the computation of a new hypothesis, and the resetting of the relative estimation in an asynchronous way with the help of the operator.

Luo et al. presented a localisation system for Underwater Sensor Networks (UWSN), with the use of an AUV Luo et al. (2008). Based on directional signals, which are transmitted by an autonomous underwater vehicle (AUV), the system is able to correctly estimate the node positions. The AUV is considered as a precise source of localisation information, with bounded error. This problem can be mirrored, with uncertain AUV position and with active beacons, as explained by Petillot et al. in Petillot et al. (2010).

2.2 Communication with surface vehicles/buoys

Being able to use GPS would however be very beneficial, so some other approaches have been explored, arriving at the so-called Surface-LBL. This approach uses third entities on the surface, with which the submerged vehicle communicates acoustically. In most of the cases involving a support vessel, the technique is called SBL, which stands for Short Base Line, or USBL, which stands for Ultra-Short Base Line.

Storkensen et al. proposed to use a support ship equipped with a high-frequency directional emitter able to accurately determine the AUV's position with respect to the mother ship Storkensen et al. (1998). This approach requires, however, the support of a ship Vestgard et al. (2001). Furthermore, it cannot be used in many situations, as it requires the ship and the vehicle to be close to each other. This approach is therefore not suitable for navigation around deep off-shore structures.

Watanabe et al. also studied the use of SBL and SSBL (Super Short Base Line, similar to USBL), analysing their multiple advantages, linked to data transmission, for solving the AUV localisation problem Watanabe et al. (2007). It is notable that the proposed approach has been validated at a depth of 1,200 m, with the ROV Kaiko.

The French company IXSEA presented a few commercial solutions to fuse USBL with inertial technologies Willemenot et al. (2009). The GAPS positioning system takes advantage of the rigid mounting of the Inertial Navigation System (INS) and the USBL, carefully synchronising the data coming from an innovative 3D antenna. The coupling of the RAMSES and PHINS systems makes use not only of ranges and inertial sensors, but also of SLAM techniques adapted to subsea positioning needs. Although the article is very commercially oriented, the presented results for AUV localisation are promising.

Ura and Kim also analyse the fusion of USBL and INS techniques Ura and Kim (2004). After the vehicle has reached the maximum altitude at which the DVL is able to track the seabed, the position error is calculated as the difference between the SSBL position and the corresponding position in the INS. The proposed approach has been validated in an in-water mission at Kuroshima Knoll, showing great advantages for the navigation system.

Caiti et al. developed an acoustic localisation technique using freely floating acoustic buoys equipped with a GPS connection Caiti et al. (2005). This system requires the buoys to emit a ping at regular time intervals with the coded information of their GPS position. The vehicle can locate itself using time-of-flight measurements of the acoustic pings from each buoy. Figure 1 shows the proposed framework. The limitations of this approach are the necessity to deploy enough floating buoys in the mission area, the need to collect them after the end of the mission, and an inefficient communication scheme, as the buoys periodically send acoustic messages. Limitations for deep water missions are also evident.

Fig. 1

Use of freely floating acoustic buoys equipped with GPS connection, communicating with the AUV Caiti et al. (2005)

A similar approach is presented by Yang et al., using three surface buoys or vessels Yang et al. (2010). Acoustic ranges are calculated using the equivalent sound speed profile and the travel times of the sound rays. The linear distances can be obtained through ray tracing, and the position thus calculated through geometric computation. Simulation results have shown an accuracy of 20 cm in the area covered by the three transducers at a depth of 500 m.
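The geometric computation from three such ranges can be illustrated with a planar trilateration sketch: subtracting the first range equation from the other two linearises the problem into a 2x2 system. The positions and ranges are made-up values; a real implementation would use ray-traced effective distances, the depth from a pressure sensor, and a least-squares solution over redundant measurements.

```python
def trilaterate(buoys, ranges):
    """Planar position from ranges to three buoys at known positions.
    Subtracting the first range equation from the other two yields a
    linear 2x2 system, solved here by Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = buoys
    r1, r2, r3 = ranges
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1          # zero if the buoys are collinear
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)
```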

Yang et al. also proposed a sonar-based approach for passive localisation of an AUV Yang et al. (2010). It is important to note, however, that they use the terms passive and active with a different meaning than the one used in this article. Whilst in this article those terms relate to the vehicle decision making system, as outlined in the Introduction, Yang et al. focus on passive and active sensing. Passive localisation is then related to techniques where passive receivers listen for a signal broadcast by the AUV and then infer its position. For the avoidance of doubt, in the scope of this article passive localisation means that the vehicle has no way to actively link the control loop and the robot actions with the localisation process, which only processes the data produced by the sensors.

A data communication-centred review by Saeed et al. is a reminder that acoustic sensors are not the only focus in the underwater domain: the article discusses at length the possibility of data transfer and localisation using LED-based underwater light sources Saeed et al. (2018). The latter is mainly discussed in the context of a distributed network of underwater buoys/sensors which localise themselves, which is further from the scope of this article. While using light shortens the useful range of the devices, the authors discuss different strategies for relaying information and optimising transmission.

A more recent approach, employed by Fanelli et al., shows improvements in the localisation of an underwater target through an ultra-short baseline-aided buoy Fanelli et al. (2019). Such a buoy relies on an ultra-short baseline device for the localisation and is aided by a proper sensor set in order to compensate for variations in its pose. The measurement errors entailed by the buoy motion are therefore compensated beforehand, before filtering techniques are applied with the aim of further increasing the accuracy of the measurements. The proposed solutions have been validated through experimental tests.

Allotta et al. propose an algorithm to exploit the positioning measurements provided by a USBL transducer on board the vehicle to aid the navigation task. In the considered framework, the acoustic measurements are embedded in the communication network scheme, causing time-varying delays in ranging with the fixed nodes Allotta et al. (2015).

2.3 Terrain-based navigation

Terrain-based navigation is another very important area for AUV localisation. The estimated position is not calculated by trying to find an acoustic fix, either with transponders or with floating buoys, but is relative to the terrain. The AUV measures the topography of the bottom using on-board sensors, and correlates those measurements with an existing bathymetry map in order to estimate its position on this map during the mission, as shown in Fig. 2. The same concept can be extended to navigation around underwater structures. The underlying idea is the use of sensors to perceive the environment and to match a previously known map, be it a bathymetry map or a map of the environment with structures and objects.

Fig. 2

Terrain-based navigation: the AUV measures the topography of the bottom and estimates its location in the map. Picture from https://web.stanford.edu/group/arl/projects/terrain-relative-navigation

This produces a non-trivial advantage in terms of the operational cost (including time and money) avoided for the deployment of aiding devices. Terrain-based navigation has been widely used for decades in aircraft and cruise missiles. An example is the early work of Hostetler and Andreas, who explored the application of non-linear Kalman filtering techniques to radar terrain-based navigation Hostetler and Andreas (1983).

One of the first works in the underwater domain is presented by Newman and Durrant-Whyte (1998). Analysing the nature of the sonar beam, they successfully estimated the ocean floor gradient, as a world feature. Applying a simple target extraction algorithm, the AUV was able to identify and rely on natural features for navigation.

Nakatani et al. proposed a terrain-based approach using a Particle Filter for stochastic estimation Ura et al. (2006). The approach has been successfully validated through sea experiments and the work pointed the way to active techniques, which are the core of section 3 of this paper.

The work presented by Sarma represents an important contribution in the field of map-matching-based self-localisation. The AUV is equipped with a multi-beam high-frequency sonar and matches the data received from the sensor against an a priori known bathymetry map. The estimated position is computed using the Maximum Likelihood Estimator, widely used in land robotics, for example in Olson and Matthies (1998), Swevers et al. (1997), Maeyama et al. (1994), Nishizawa et al. (1995), Howard et al. (2002).
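The maximum-likelihood matching step can be sketched as a search over candidate positions: under independent Gaussian measurement noise, maximising the likelihood is equivalent to minimising the sum of squared differences between the measured bathymetry profile and the map prediction at each candidate. The 1-D profile and map function below are schematic assumptions, not the cited author's data.

```python
def ml_position(predict_profile, measured_profile, candidates):
    """Return the candidate position whose predicted map profile best
    matches the measurement in the least-squares sense (the maximum
    likelihood estimate under Gaussian noise)."""
    best, best_cost = None, float("inf")
    for pos in candidates:
        predicted = predict_profile(pos)
        cost = sum((m - p) ** 2
                   for m, p in zip(measured_profile, predicted))
        if cost < best_cost:
            best, best_cost = pos, cost
    return best
```

Note that over self-similar terrain several candidates may score almost equally well, which is one motivation for the probabilistic (multi-hypothesis) filters discussed next.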

Carreno et al. present several works using Particle Filters for terrain-based navigation Carreno et al. (2010). Karlsson et al. proposed a particle filter approach for terrain-based AUV navigation, based on a SIR Particle Filter (Sequential Importance Resampling, Thrun et al. 2005). An INS was used for measuring the robot displacement and an echo sounder was employed for measuring the altitude. The focus was however more on the mapping problem than on the localisation one Karlsson et al. (2003).

An analysis of particle filters for terrain-based navigation in the presence of tides is given by Anonsen & Hallingstad, who also compared standard Monte Carlo approaches with a Bayesian Point-Mass Filter (PMF) Anonsen and Hallingstad (2006). Although the results of the PMF were slightly better, its computational needs grow dramatically when estimating a 3D state, whereas the problem remains tractable for particle filters.
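A minimal SIR step in this spirit weights each particle by the agreement between the measured altitude and a terrain map, then resamples. The DEM, noise levels and motion model below are toy assumptions for illustration, not any of the cited systems.

```python
import math
import random

def terrain_pf_step(particles, motion, altitude_meas, dem, sigma=1.0):
    """One Sequential Importance Resampling step for terrain-based
    navigation: propagate each particle with the noisy dead-reckoned
    motion, weight it by the likelihood of the echo-sounder altitude
    given the terrain map at that position, then resample.
    dem(x, y) returns the altitude expected at that horizontal position
    (the vehicle depth is folded into the map for simplicity)."""
    moved = [(x + motion[0] + random.gauss(0.0, 0.2),
              y + motion[1] + random.gauss(0.0, 0.2))
             for x, y in particles]
    weights = [math.exp(-0.5 * ((altitude_meas - dem(x, y)) / sigma) ** 2)
               + 1e-300                      # avoid all-zero weights
               for x, y in moved]
    return random.choices(moved, weights=weights, k=len(moved))
```

Repeating this step as measurements arrive concentrates the particle cloud on terrain compatible with the altitude history, which is what allows convergence from a large initial uncertainty.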

Silver et al. presented a particle filter merged with scan matching techniques Silver et al. (2004). They use a particle filter and an approximation of the likelihood of sensor readings, based on nearest neighbour distances, to approximate the probability distribution over possible poses. The initial robot location is not considered as an issue in this work, as it focuses mainly on trajectory tracking and mapping.

The choice of sensors always plays a major role, due to the limited payload capability of AUVs. As many survey-class AUVs are equipped with a sidescan sonar to acquire images of the seabed, Zerr et al. investigated the use of the same sensor data to improve navigation Zerr et al. (2005). It is assumed that the AUV surveys a known area, so a previous map of the environment is available for pre-processing and feature extraction. Matching the results of real-time payload data processing against a knowledge database has demonstrated potential to aid AUV navigation when performing routine missions.

Teixeira, Quintas and Pascoal have consequently developed methods of terrain-based localisation built upon particle filters. While their earlier works Teixeira and Pascoal (2005), Teixeira et al. (2012) use multi-beam sonar or a combination thereof with a pencil-beam sonar, a recent publication proposes a scheme which is much more economical in terms of hardware, as it relies entirely on a DVL coupled with inertial measurements Teixeira et al. (2015). The authors compare two filtering strategies: a prior-correction particle filter Teixeira et al. (2012) and a complementary filter Teixeira et al. (2012). When these two methods were compared side by side in Teixeira et al. (2015), in the context of harbour navigation in closed loop, a small advantage in terms of accuracy could be attributed to the latter approach.

The approach of terrain-aided navigation based on particle filters finds a refined expression in the work of Salavasidis et al. (2019). Experimental results are reported for an open-ocean deployment of the Autosub Long Range AUV, featuring both extreme mission duration (77 h) and great depths (close to 4000 m). The authors present a very streamlined and energy-efficient algorithm. It was found that the algorithm could maintain convergence despite very scarce altitude measurements, since bottom lock occurred only when the vehicle was 180 m or closer to the sea floor. During certain segments of the journey the AUV came within range of a USBL transponder, thus producing an imperfect ground truth; the terrain-aided estimation was found to coincide with this data much more accurately than dead reckoning alone.

An extension to terrain-based navigation is represented by the work of Kimball and Rock (2008). Their proposed approach can be used for AUV navigation relative to an iceberg. Navigation techniques based on inertial sensors or GPS cannot provide ice-relative navigation accounting for the full motion of free-floating icebergs, especially iceberg rotation. The proposed approach has been validated by post-processing multi-beam sonar data acquired by a ship circumnavigating the iceberg. The unknown movement of the iceberg introduces a higher level of uncertainty than standard terrain-based navigation. It is however to be noted that a transposition from a side-looking sonar mounted on a ship to a fully autonomous vehicle exploring under ice would not be an easy task and would certainly bring new challenges.

Claus & Bachmayer have used terrain-based navigation for glider localisation, especially useful under ice or in heavily trafficked areas where surfacing to acquire GPS data would be impossible or unsafe Claus and Bachmayer (2015). They have used a type of particle filtering technique, called the jittered bootstrap algorithm, that makes use of the vehicle's dead-reckoned navigation solution, the onboard altimeter and a local digital elevation model (DEM). Their approach has shown significant improvement with respect to a standard dead-reckoning-only approach, reducing the error over a long transit (> 90 km) from a few kilometres to a few tens of metres.

2.4 Localisation around structures

Similarly to terrain-based navigation, navigation around structures uses features from the surrounding environment to determine the robot location. The approaches are very often based on Bayesian filters, with Extended Kalman Filters and Particle Filters among the most popular choices. The work of Maurelli et al. focuses on Particle Filters for AUV localisation around man-made environments. In Maurelli et al. (2008b, 2009b), improvements to the standard particle filter algorithm are presented. Additionally, a combination of an Extended Kalman Filter with a Particle Filter exploits the strong points of each algorithm, switching between them according to the modelled uncertainty and sensor data.

Set-membership methods, albeit less mainstream than Bayesian filtering, have often been considered for the localisation of underwater robots Caiti et al. (2002). In situations where strong non-linearities are involved, interval analysis has been shown to be useful (see, e.g., Meizel et al. 2002, where the first localisation of an actual robot was solved with interval methods). Another strong point of set-membership methods is the ability to deal with outliers (see Jaulin 2009). In the set-membership formulation, both the input data and the computed robot position are represented by their respective belonging sets. Constraints between the position of the robot and the sensor observations are used to contract the actual position set, i.e. reducing its size and thus increasing the precision of the estimates. Sliwka successfully applies this method to localisation around structures in an abandoned marina Sliwka (2011). The localisation results using particle filters in the same environment can be found in Maurelli et al. (2009a).

Kondo et al. propose a solution employing a Particle Filter for global localisation and position tracking, while the planner issues waypoints once convergence has been reached. This is an important aspect, linking the localisation results into action.
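A very coarse version of the set-membership contraction can be sketched as follows: a range measurement to a beacon, known only up to an upper bound, contracts the box enclosing the robot position, and an empty intersection flags an inconsistent measurement (an outlier). This is a deliberately simplified sketch; real interval solvers propagate through the full distance constraint and obtain much tighter sets.

```python
def contract_with_range(box, beacon, r_max):
    """Contract a position box [x] x [y] with the constraint that the
    distance to a beacon is at most r_max.  Only the outer bound is
    used here (the box cannot extend beyond beacon +/- r_max on each
    axis): coarse, but guaranteed never to discard the true position."""
    (xlo, xhi), (ylo, yhi) = box
    bx, by = beacon
    xlo, xhi = max(xlo, bx - r_max), min(xhi, bx + r_max)
    ylo, yhi = max(ylo, by - r_max), min(yhi, by + r_max)
    if xlo > xhi or ylo > yhi:
        return None   # empty set: the measurement is inconsistent
    return ((xlo, xhi), (ylo, yhi))
```

Applying the contractor sequentially over several beacons shrinks the guaranteed enclosure of the robot position, which is the set-membership counterpart of a Bayesian measurement update.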
Whenever the environment surrounding the vehicle can be represented by specific high-level concepts, semantic-aided localisation can be performed. As outlined by Maurelli and Krupiński, a semantic-aided localisation approach can have several advantages, among them computational efficiency Maurelli and Krupiński (2018). Semantic-aided localisation is still largely unexplored in the underwater domain, whilst it has become more popular in indoor robotics. The authors believe that this will change in time, as underwater vehicles work closer and closer to complex man-made structures, where a semantic understanding would be very beneficial in contrast to geometric-only approaches. Lane et al. have shown the increasing importance of semantic representation and an ontological knowledge base as key enablers for persistent autonomy Lane et al. (2015); Maurelli et al. (2013, 2016).

2.5 Magnetic navigation

Another field explored in subsea navigation is related to magnetic fields. Researchers have explored either how a strong magnetic field can disturb navigation, or - on the other hand - how it can help the localisation process.

Kondo and Ura analysed the difficulties for an AUV in correctly localising itself in the presence of strong magnetic disturbances, which are typical near steel underwater structures or in the presence of particular geological configurations Kondo and Ura (2002). To prevent the navigation from being affected by the disturbances, they proposed a relative approach, based on images from a CCD camera and laser pointers. Both cameras and laser pointers can work only in clear water, so this approach is not transferable to deep water applications without changes. However, changing the main external sensor to sonar, and navigating around the structure with visual servoing techniques, allows the vehicle to prevent unwanted jumps in the navigation state (and thus in the control) due to magnetic disturbances.

Magnetic disturbances have also been a research subject for Huang and Hao (2010). They investigated the dipole magnetic anomaly caused by ferromagnetic objects or geological structural changes, mixed with the geomagnetic field, and its effect on AUV localisation aided by geomagnetic anomalies. A novel localisation algorithm is presented and validated in simulation.

2.6 Vision-based localisation

Vision-based algorithms are very common in land and aerial robotics applications. In the underwater domain, however, there are several limitations. First of all, light propagation in water is not uniform. Most of its components are absorbed in very shallow water, whilst only the blue component can reach a depth of one to two hundred metres, depending on the water purity, as outlined in Fig. 3. There are therefore significant differences between coastal and ocean waters. Techniques to reconstruct the original colour may be very useful. For example, Bryson et al. propose an automated approach to recovering the true colour of objects on the seafloor in images collected from multiple perspectives by an autonomous underwater vehicle, using dynamic estimation of the model parameters Bryson et al. (2016).

Fig. 3

Difference in light penetration between oceanic conditions dominated by absorption, and coastal and estuarine conditions which are dominated by refraction, from Renema (2017)
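The wavelength dependence of light penetration shown in Fig. 3 is often modelled with a Beer-Lambert exponential decay. The coefficients below are rough clear-ocean values assumed purely for illustration, not data from the cited work.

```python
import math

# Illustrative diffuse attenuation coefficients (1/m) for clear oceanic
# water; real values vary strongly with water type and turbidity.
ATTENUATION = {"red": 0.35, "green": 0.07, "blue": 0.03}

def remaining_fraction(channel, depth_m):
    """Beer-Lambert decay: fraction of surface light of a given colour
    channel still present at a given depth."""
    return math.exp(-ATTENUATION[channel] * depth_m)
```

With these assumed coefficients, at 10 m the red channel drops to roughly 3% of its surface intensity while blue keeps about 74%, which is why uncorrected underwater imagery looks blue-green.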

Suspended particles in the water further reduce visibility, which is greatly affected by water purity. As no natural light arrives in the deep sea, artificial light sources are required when operating at depth. Besides lighting up all the particles in the water, the visual appearance of a structure is highly dependent on the incident angle of the light source. Additionally, the light source is generally moving, as it is usually mounted on the vehicle, and consequently the visual appearance of the structure may vary significantly over time. An overview of the various challenges in underwater vision is given by Köser and Frese (2020).

In addition to physical properties of the light, the environment itself is often featureless thus making underwater vision very challenging.

Nevertheless, many approaches can be identified using underwater vision for localisation.

Corke et al. use a stereo camera system to compute visual odometry, by integrating scaled optical flow Corke et al. (2007). Experimental results in the field show that visual odometry is indeed a valid possibility. There is however no global localisation and, as in all odometry approaches, the error is not bounded.

In order to perform global localisation, some sort of map needs to be available. Carreras et al. present a vision-based localisation approach for an underwater robot in a structured environment Carreras et al. (2003). The system is based on a coded pattern placed on the bottom of a water tank and an onboard down-looking camera. This approach allows absolute and map-based localisation, landmark detection and tracking, and real-time computation (12.5 Hz). The proposed system provides three-dimensional position and orientation of the vehicle along with its velocity.

The approach presented works in the presence of a known coded pattern on the seabed. An alternative solution is to place artificial landmarks in the environment. Kim et al. propose a novel vision-based object detection technique based on artificial landmarks and apply it to the Monte Carlo localisation (MCL) algorithm, a map-based localisation technique Kim et al. (2014). In the image processing step, a novel correlation coefficient using a weighted sum, multiple-template-based object selection, and colour-based image segmentation methods are proposed to improve the conventional approach. In the localisation step, to apply the landmark detection results to MCL, dead-reckoning information and landmark detection results are used for the prediction and update phases, respectively.
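A simplified 2D sketch of such an MCL cycle (an illustration, not Kim et al.'s implementation: the correlation-based detector is abstracted into a range measurement to a landmark of known identity, and all noise values are assumptions):

```python
import math
import random

def mcl_step(particles, odom, landmark_obs, landmarks,
             motion_noise=0.05, sensor_sigma=0.5):
    """One predict/update/resample cycle of Monte Carlo localisation.
    particles: list of (x, y) hypotheses; odom: (dx, dy) dead-reckoning step;
    landmark_obs: (id, measured_range) to a detected landmark;
    landmarks: dict id -> (x, y) from the known map."""
    # predict: apply noisy dead-reckoning to every particle
    dx, dy = odom
    moved = [(x + dx + random.gauss(0, motion_noise),
              y + dy + random.gauss(0, motion_noise)) for x, y in particles]
    # update: weight each particle by the likelihood of the observed range
    lid, r_meas = landmark_obs
    lx, ly = landmarks[lid]
    weights = []
    for x, y in moved:
        r_pred = math.hypot(lx - x, ly - y)
        weights.append(math.exp(-0.5 * ((r_meas - r_pred) / sensor_sigma) ** 2))
    total = sum(weights)
    if total == 0:
        return moved                     # degenerate case: keep the prediction
    weights = [w / total for w in weights]
    # resample proportionally to weight
    return random.choices(moved, weights=weights, k=len(moved))
```

After a few such cycles the particle cloud concentrates around poses consistent with both the dead reckoning and the landmark detections.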

Moving into the deeper sea, when artificial lights are needed, Nuske et al. model the light source mounted on the AUV to predict the appearance of the structure (the legs of a platform in the example analysed in their work) from different viewing poses Nuske et al. (2010). The process is to use an a priori 3D-surface model of the permanent structure being navigated with the light model to generate artificial images which are compared against the real camera image to localise the AUV.

As vision can generally be used at short distance from the obstacle, it is often used to track objects and features. Therefore, the localisation problem addressed with vision sensors is often not the global localisation one, but relative localisation with respect to a specific structure, object or feature. Relative localisation is generally then used to feed controllers that are responsible for governing the AUV motion, such that the motion is relative to the structure, object or feature. For example, Figueiredo et al. develop vision-based target tracking for an AUV Figueiredo et al. (2016). Lee et al. study the problem of object detection and tracking “in depth” Lee et al. (2012). Tracking is performed with optical flow and mean-shift algorithms. Marani and Choi propose a medium-range target localisation algorithm for guiding the vehicle toward the target area Marani and Choi (2010). Based on the use of the dual-frequency identification sonar (DIDSON), the system can transform the relative coordinates of detected targets into Earth-referenced Cartesian coordinates of a known target, allowing precise localisation of the vehicle. Localisation with respect to a docking station is also one of the fields where vision is widely used. Palmer et al. use model-based pose estimation techniques that rely on a list of detected features, considering the prior knowledge of the structure where the vehicle has to dock Palmer et al. (2009). The system is mainly based on a sensor fusion scheme that produces pose estimation in 6 degrees of freedom using the information from different sensors and the vision module. Krupiński et al. present several approaches to the challenges of underwater docking, with a focus on visual techniques (Krupiński et al. 2008, 2009; Maurelli et al. 2008a). A work from Ortiz et al. addresses the use of particle filters for tracking undersea narrow telecommunication cables with a vision system Ortiz et al. (2009).

2.7 Localisation in partially known map

The problem of localisation usually assumes complete knowledge of the environment. Few publications deal with localisation in partially known maps, although it is a very important topic, due to unforeseeable obstacles in the sensor field of view, and discrepancies which may arise between a map of underwater structures and their real position and orientation after years, especially in the presence of strong currents. Cristi analysed this problem, presenting tank results of an AUV successfully localising itself even in the presence of unexpected objects Cristi (1994). The approach chosen by Cristi was the use of neural networks to model potential functions for the environment. Maurelli et al. also considered the problem of localisation in partially known maps and analysed the error curves based on the sensor field of view and range Maurelli et al. (2009b). This aims to measure the impact of the unknown objects in the map using standard localisation techniques.

2.8 Cooperative localisation

In multi-vehicle scenarios, cooperative localisation can play an important role. In this setting, each vehicle may exchange information with other vehicles in order to improve its state estimation. Whilst cooperative tasks in robotics are not so uncommon on land, this field is still in its infancy in the underwater domain, due to the challenges of underwater communication Maurelli et al. (2012c). However, the advantages are tangible, considering that a marine environment does not always offer the possibility to detect landmarks. Sullivan et al. analysed the general performance of cooperative localisation while varying several parameters Sullivan et al. (2018). The goal of their study was to find out when cooperative localisation is worthwhile, and how the performance is affected if the system changes. This information is particularly important for robots with limited power and processing, which cannot afford to constantly perform (cooperative) localisation. This is often the case in marine robotics, where vehicles generally have less processing power and more constraints than land robots. The study concluded that yaw accuracy has a substantial effect on performance, that a communication rate that is too fast can be detrimental, and that heterogeneous systems are better candidates for cooperative localisation than homogeneous ones. These conclusions are very supportive of the use of cooperative localisation for AUVs. Yaw can often be accurately estimated not only with a magnetic compass, but also with a gyroscope, which reduces the error. Underwater communication is much slower, due to the physics of acoustic communication Maurelli et al. (2012b), hence there is no risk of a too-fast communication rate. Finally, heterogeneous vehicles are often used in multi-vehicle collaboration in the marine domain, e.g. Maurelli et al. (2012a), Saigol et al. (2013), Jörg et al. (2016).

Huang et al. base their work on cooperative localisation on a new adaptive extended Kalman filter Huang et al. (2018). The overall framework is based on a leader-follower scenario. The leader is a surface vessel with GPS access while the AUVs communicate with the leader to estimate their state. Acoustic range measurements are used to estimate the relative pose with respect to the leader.
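As an illustrative sketch of how a single acoustic range to a leader at a known position corrects a follower's estimate (a standard EKF range update, not Huang et al.'s adaptive filter; the 2D state, noise values and beacon position are invented for the example):

```python
import math

def ekf_range_update(x, P, beacon, r_meas, r_var):
    """EKF update of a 2D position estimate x = [x, y] (covariance P as a
    2x2 nested list) from one acoustic range r_meas to a beacon at a known
    position, as in leader-follower cooperative localisation."""
    dx, dy = x[0] - beacon[0], x[1] - beacon[1]
    r_pred = math.hypot(dx, dy)
    H = [dx / r_pred, dy / r_pred]                 # Jacobian of the range model
    # innovation covariance S = H P H^T + R (scalar for a single range)
    PHt = [P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1]]
    S = H[0] * PHt[0] + H[1] * PHt[1] + r_var
    K = [PHt[0] / S, PHt[1] / S]                   # Kalman gain
    innov = r_meas - r_pred
    x = [x[0] + K[0] * innov, x[1] + K[1] * innov]
    # covariance update P = (I - K H) P
    P = [[P[0][0] - K[0] * PHt[0], P[0][1] - K[0] * PHt[1]],
         [P[1][0] - K[1] * PHt[0], P[1][1] - K[1] * PHt[1]]]
    return x, P
```

Note that one range only constrains the position along the beacon direction; ranges from different relative geometries (or over time, while moving) are needed to constrain the full position.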

A similar approach is proposed by Willners et al. (2019). An autonomous surface vessel acts as a Communication and Navigation Aid, while the AUVs rely on range measurements for state estimation. The contribution of this work is mainly a path planning algorithm, to position the vessel at strategic positions to transmit ranging messages at optimal times. This reduces the uncertainty and error in the AUVs’ state estimation. It also shows how planning can play a role in increasing localisation accuracy, as the second part of this article shows.

An alternative algorithmic approach is presented by Sabra and Fung (2020). Instead of using EKF as the backbone of the localisation system, it uses a fuzzy decision support system. An interesting aspect of this work is the simulation results for a swarm of up to 150 AUVs.

Ferri et al. present cooperative navigation in underwater surveillance applications Ferri et al. (2020). With a focus on a track management module, the reported results are one of the first examples that show the potential of cooperative autonomy and data fusion in realistic underwater scenarios characterised by limited communications.

2.9 A brief glance at simultaneous localisation and mapping

The problem of localisation can be solved together with the mapping problem, arriving at a full Simultaneous Localisation and Mapping (SLAM) approach. Although the scope of this article focuses on localisation, a very brief review of relevant work on SLAM in the underwater domain is given, considering the topics are closely related. A thorough review of the core probabilistic techniques for SLAM has been performed by Birk and Pfingsthorn (2016). It focuses primarily on the mathematical formulations and implementation of the various techniques presented. Specific to the underwater domain, Paull et al. present a review of various approaches, based on different sensors Paull et al. (2014).

The underwater group at the Australian Centre for Field Robotics (ACFR) is a leading group in underwater SLAM. In the work presented by Williams et al., a stereo-visual SLAM is presented in order to estimate the vehicle position during survey tasks Williams et al. (2010). The presented results of the mapping, obtained by data post-processing, show the capabilities of the algorithm proposed. Barkby et al. propose a featureless approach using an a priori low-resolution map, in order to enforce consistency between the prior map and the AUV bathymetry Barkby et al. (2009).

Palomeras et al. present a vision-based SLAM to detect known landmarks, using an Extended Kalman Filter, with the goal of estimating the relative pose with respect to a docking station Palomeras et al. (2013). Mallios et al. explored the use of sonar scan matching coming from a mono-beam rotating sonar head (Mallios et al. 2009, 2010, 2014). Consecutive scans are cross-registered under a probabilistic scan matching technique for estimating the displacements of the vehicle, including the uncertainty of the scan matching result. This is then combined with an augmented-state extended Kalman filter to estimate and keep the registered scan poses. This has been further demonstrated on a long transit exploring an underwater cave Mallios et al. (2016). Using forward-looking sonar, Feder et al. explored the issue of long-term performance of SLAM Feder et al. (1998).

Using underwater acoustic transponders, the work of Newman and Leonard presents a pure range-only approach to underwater SLAM Newman and Leonard (2003). In this approach, there is no a priori knowledge about the transponders’ or the vehicle’s location.

Wang et al. present a new SLAM algorithm for AUVs based on a support vector machine (SVM) adaptive EKF, to reduce the influence of changes in the statistical characteristics of the system noise and the observation noise Wang et al. (2011).

Li et al. propose a range sonar array based SLAM method Li et al. (2012). The vehicle’s operational environment is a water tank whose dimensions are known to the AUV. Aided by this knowledge of the general environment, the EKF-based SLAM is able to map unknown objects in the tank.

Other solutions to the general SLAM problem are the Extended Information Filter Thrun et al. (2005), the Unscented Kalman Filter, presented by Julier & Uhlmann, in Julier and Uhlmann (1997), the PHD Filter Lee et al. (2012); Abdella et al. (2014) and the FastSLAM algorithms, developed by Montemerlo et al. (2002).

Clark et al. applied Particle-Filter based FastSLAM for data post-processing collected by a small ROV for archaeology purposes Clark et al. (2008).

Silveira et al. develop a SLAM solution which is, in a theoretical sense, far from the well-established filtering methods Silveira et al. (2015). Inspired by the groundwork done by Milford, Wyeth and Prasser and their algorithm RatSLAM Milford et al. (2004), the method, aptly named DolphinSLAM, tries to replicate the working of the animal brain. A neural network is set up that processes images and sonar data as input, and the mapping is done by specialised neurons. The sensor data is processed using feature extractors and passed to the neural network in the form of bags-of-words. By reading neuron activation, the algorithm performs position tracking and is capable of detecting and benefiting from loop closures. Given the explosion of machine learning applications, it is almost surprising that few authors pursue this track in the context of localisation. On the other hand, machine learning techniques are generally computationally very expensive. This means that presently there are several challenges for those algorithms to run embedded in an AUV, considering all the constraints in available computational power. This, however, may change significantly in the near future.

Huang et al. proposed a novel inertial-SLAM algorithm for AUVs, fusing IMU data with information from the sonar Huang et al. (2010). The AUV can estimate its velocity and pose by inertial-SLAM using exclusively on-board IMUs and sonar sensors, feeding the data into a particle filter which additionally estimates the gyros’ bias and keeps track of terrain features. According to the simulation results, the proposed system is more efficient and accurate than a standard EKF-SLAM.

Graph-based SLAM Grisetti et al. (2010) has been proposed by several researchers. Lee et al. present experiments on vision-based localisation using graph-based SLAM Lee et al. (2013). Relative range and bearing values of each landmark are obtained from image processing results. A graph structure is built using the landmark detection results and dead-reckoning data of the AUV. The structured graph is optimised by a graph-based SLAM algorithm. Kim and Eustice apply vision-based algorithms to the problem of autonomous ship hull inspection, using a pose-graph SLAM approach, which is data-driven and based upon extracting relative-pose constraints Kim and Eustice (2009). A combination of SIFT and Harris feature detectors is used for image registration, to calculate camera-derived relative-pose constraints. Pfingsthorn and Birk propose a Generalized Graph SLAM framework which can represent ambiguous data on both local and global scales Pfingsthorn and Birk (2016). This allows the system to handle multiple mutually exclusive choices in registration results and potentially erroneous loop closures.

2.10 Critical analysis

This section has presented several approaches for AUV localisation and for the use of filtering techniques. Relying only on state sensors, such as DVL and compass, the state estimation diverges over time from the real trajectory. In order to correct the estimation, there are two main approaches:

  • use of acoustic beacons: the vehicle communicates with the beacons to compute its location;

  • use of sensor data: the vehicle observes the environment (either the seabed, or underwater structures) and matches the sensor data with a previously known map. This can be done both with acoustic or vision-based sensors, depending on the application scenario.

Whilst the first one is already routinely used, it does have disadvantages linked to deployment and recovery. The second approach is certainly more challenging, and requires a high level of autonomy in the robotic system, without counting on any external support. For example, the emerging sector of low-cost vehicles Corbeanu et al. (2020); Athar et al. (2020) needs to employ on-board localisation techniques, as acoustic beacons would be disproportionately expensive. Looking at the available literature in the area, it appears that filtering techniques are well established, with frequent, relatively small innovations, but overall building on top of well-known mathematical foundations. Regarding the sensors used, several approaches based on vision and acoustic sensors have been presented. Acoustic sensors like sonar can be used in the field. Vision-based systems are mainly used in test tanks, in shallow water or in very clear water. For navigation around structures, they can also be used in deeper water, with the aid of artificial lights. With similar but stricter constraints than vision, underwater laser scanners are still a niche in the sensor suites of underwater vehicles. There have been a few approaches, for example (Karras et al. 2006; Massot-Campos and Oliver-Codina 2014; Palomer et al. 2019), but the physical constraints narrow the application to specific scenarios, generally with a short distance between the vehicle and the objects and clear water. Technical improvements in sensor design, however, have been significant in the last decade, and therefore increased use may be expected.

Table 1 A summary of the various techniques that can be employed for underwater localisation

A summary of the various approaches is presented in Table 1. It is important to note that all the techniques presented have been tested in the field, in a relevant environment, so it is not only about simulation or limited lab pool tests. The determination of the best technique is heavily influenced by the type of environment the vehicle is moving in, and the type of sensors available. The distinction highlighted above is also clear in the table, with Sections 2.1 and 2.2 requiring external aid and the other sections presenting solutions based on on-board sensors. Recovery from a wrong convergence and global localisation are not a given in most of the perception-based approaches. It depends on the type of filtering technique which is employed. Standard filters in the Kalman family are great at position tracking, but cannot solve global localisation and generally cannot recover from a wrong convergence. On the contrary, multi-hypothesis filters - and here the Particle Filter family excels - can generally solve the global localisation problem. There are also algorithmic variations to equip them to recover from wrong convergences. Some algorithms are more robust to noise in the sensor data and in the prior knowledge, but mostly the localisation techniques in this section are best-effort techniques. A deliberative layer would be needed to expand the vehicle’s autonomy in understanding and correcting wrong information.

Although deep learning is becoming more and more important in robotics research, to date very few works show reliable applications for AUV localisation. It is the authors’ belief that this will change radically in the next few years, representing a substantial paradigm shift in the topic.

The next section will deal with active techniques for AUV localisation.

3 Active Techniques

Several techniques for active localisation have been proposed in the past years. This section aims to critically present them, outlining strengths and weaknesses. Although they are mainly in the framework of land robotics, many techniques can actually be easily adapted to the underwater domain. Here is an analysis of the most relevant work.

3.1 Active Landmark Choice

The first presented approach deals with active landmark choice. The vehicle’s navigation system is implemented using an Extended Kalman Filter (EKF). As discussed, the Extended Kalman Filter is a landmark-based localisation system. The algorithm can be divided into two phases: prediction and update. Observation of a landmark triggers an update in the filter, to correct the predicted state estimation. Instead of passively searching for landmarks in the measurement, this approach actively steers the vehicle to look for landmarks, in order to reduce the uncertainty. It is therefore applied whenever there is a predicted location and an associated uncertainty. From the position uncertainty ellipse, the visible landmarks are selected, and the best landmark is then chosen.

Arsenio & Ribeiro applied this technique to mobile robots in a controlled indoor environment Arsenio and Ribeiro (1998). This approach helps track the position of the robot, especially in the presence of landmarks which are not easily observable (in the specific case: a glass wall, with a laser scanner). The system therefore chooses to look at more observable landmarks. If the uncertainty grows, a specific module is called for global localisation. There is, however, no specific strategy to discriminate between different possible locations of the vehicle.

This idea is also proposed by Tessier et al., in Tessier et al. (2006), where the landmarks are actively selected by a supervisor module. The proposed strategy of active landmark detection optimises a combination of tools - a landmarks bank, sensors and detectors - by introducing the notion of perceptive triplets. Figure 4 presents the localisation approach and the active detection principle. This approach is very useful and can be ported to the underwater world, in case distinctive landmarks can be detected. Similar to the previously described approach, it does not address the initial pose estimation or the disambiguation between multiple possible locations of the vehicle.

Fig. 4

Localisation system proposed by Tessier et al. (2006), based on active landmark selection

Olson presents techniques to optimally select landmarks for performing mobile robot localisation by matching terrain maps Olson (2002). The method is based upon a maximum-likelihood robot localisation algorithm that efficiently searches the space of possible robot positions. A sensor error model is used in order to estimate a probability distribution over the terrain expected to be seen from the current robot position. The estimated distribution is compared to a previously generated map of the terrain, and the optimal landmark is selected by minimising the predicted uncertainty in the localisation. In order to predict the uncertainty obtained by localisation using various landmarks, the proposed method constructs a probabilistic representation of the terrain expected to be sensed at any position in the global map. Treating the patches of this probability map of the terrain as a local map allows the uncertainty expected by sensing the terrain patch to be estimated using surface fitting techniques. The presented results in rocky terrain were quite promising. Similarly to the work from Tessier presented above, this work can be applied in the underwater domain with the necessary adaptations in cases where distinctive landmarks can be selected. It suffers, however, from the typical limitations of the use of a single Kalman Filter already described in this section.

3.2 Multiple-Hypothesis Kalman Filter

Remaining in the Kalman Filter domain, a possibility to consider multiple possible locations for the vehicle is the use of a multiple-hypothesis Kalman filter. This is based on multiple Kalman filters running in parallel, each carrying a possible location, with related uncertainty. Each hypothesis is represented by a pose estimate \(\hat{x}_i = (\hat{x}, \hat{y}, \hat{z}, \hat{\varphi }, \hat{\psi }, \hat{\theta })^T_i\), with an associated covariance matrix, \(\Sigma _i\), and information about the probability of the hypothesis being the correct one \(P(H_i)\). In many cases, it is possible to simplify the state space, based on the vehicle configuration and sensor availability. For example, most AUVs cannot control roll and, by design, the robot tends to return to zero roll in case of disturbances. The consideration of multiple hypotheses is extremely important when the problem is not only position tracking, but also global localisation. Figure 5 shows an example where multiple hypotheses are essential. The robot is in a simple environment with one room and four doors. The perception system recognises a door. Therefore there are eight possible locations, or hypotheses, for the robot. In this framework, the active approach consists in determining the best move that maximises the expected number of new features observed.
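A scalar toy sketch of the multiple-hypothesis idea (an illustration under invented numbers, not any cited implementation): each hypothesis runs its own Kalman correction, \(P(H_i)\) is reweighted by the measurement likelihood, and hypotheses whose probability collapses are pruned:

```python
import math

class Hypothesis:
    def __init__(self, x, var, prob):
        self.x, self.var, self.prob = x, var, prob   # pose estimate, variance, P(H_i)

def mhkf_update(hyps, z, r, prune_below=1e-3):
    """Scalar-state multiple-hypothesis Kalman update: each hypothesis runs
    its own Kalman correction against measurement z (noise variance r), and
    the hypothesis probabilities are reweighted by the measurement likelihood."""
    for h in hyps:
        innov = z - h.x
        s = h.var + r                      # innovation covariance
        k = h.var / s                      # Kalman gain
        h.x += k * innov
        h.var *= (1 - k)
        # likelihood of z under this hypothesis reweights P(H_i)
        h.prob *= math.exp(-0.5 * innov ** 2 / s) / math.sqrt(2 * math.pi * s)
    total = sum(h.prob for h in hyps) or 1.0
    for h in hyps:
        h.prob /= total
    # prune hypotheses whose probability has collapsed
    return [h for h in hyps if h.prob > prune_below]
```

A measurement compatible with only one hypothesis rapidly concentrates the probability mass there, which is exactly the disambiguation behaviour the active strategies in this section try to provoke.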

Fig. 5

The need for multiple pose hypotheses shown in a simple environment with only one room with four doors. The robot can see a door, thus there are eight possible locations (or hypotheses) Jensfelt and Kristensen (2001)

Jensfelt & Kristensen proposed an approach based on a multiple-hypothesis Kalman filter Jensfelt and Kristensen (2001). They used a topological map to represent the environment. The decision on the robot move is determined by the maximisation of the expected number of new features observed in the next possible moves. Starting from N possible hypotheses, the most probable is considered. A search in the topological graph is performed from the current location of this hypothesis, in order to find the neighbouring node not previously visited that provides the largest number of features. This node is then selected as the next node to go to. This approach works well when it is possible to identify a clear set of features. The exploration is driven by some heuristics, including the avoidance of visiting the same location twice. This is an important optimisation, as visiting the same location twice does not provide any new information.

Fig. 6

The selection of the waypoint in Gasparri et al. (2007), selecting a point close to the nearest possible obstacle

The work of Gasparri et al. is also based on a multiple-hypothesis Kalman filter Gasparri et al. (2007). The algorithm relies on two steps: hypothesis generation, and a safe planning and tracking technique. The first step exploits a particle filter to find out the most likely hypotheses, under the assumption that the robot is still. The second step plans safe trajectories to reduce the remaining ambiguities, using an extended Kalman filter for each hypothesis while the robot is moving. Figure 6 shows the active selection of a waypoint in the second step of the algorithm. It is close to the nearest possible obstacle, to allow safe navigation. A strength of this approach is the use of two steps, one to identify a finite number of poses and another to disambiguate among them. The simulation results are very promising. It is, however, unclear how the algorithm can avoid deadlock situations, which can arise in some environments where the selection of a single waypoint might not be enough. Surely the algorithm can be applied sequentially, which would help in many cases, though not solving the possible deadlock in the disambiguation problem.

3.3 Entropy minimisation

When robot localisation is performed with Markov-based techniques (for example particle filters), a different approach to active localisation can be chosen. Consider a location l, defined as \(l = (x,y,\theta )\). The distribution, denoted by Bel(l), expresses the robot’s subjective belief of being at l. Bel(l) is updated in two cases. In the first one, the update is triggered by a robot motion (or, more generally, an action). Modelling the motion in probabilistic terms, \(P_a(l/l^1)\) represents the probability of being at location l after executing the action a from position \(l^1\). As explained by Fox et al. in Fox et al. (1998), the belief is then updated using the following formula:

$$\begin{aligned} Bel(l) \leftarrow \int P_a(l/l^1) Bel(l^1) dl^1 \end{aligned}$$
(2)

The second case when the belief is updated happens when the robot sensors provide a measurement. Considering s as a sensor reading, and P(s/l) the likelihood of perceiving s at l, Bel(l) is updated using the following formula:

$$\begin{aligned} Bel(l) \leftarrow \frac{P(s/l)Bel(l)}{P(s)} \end{aligned}$$
(3)

To eliminate uncertainty in the position estimate Bel(l), the robot must choose actions which help it distinguish different locations. The entropy of the belief measures the uncertainty in the robot position and is obtained by the following formula:

$$\begin{aligned} H = - \int Bel(l) log(Bel(l)) dl \end{aligned}$$
(4)

If \(H=0\), Bel(l) is centred on a single position. In this framework, actions are selected in order to minimise the expected future entropy. Denoting by \(Bel_{a,s}(l)\) the belief after executing action a and sensing s, the expected entropy can be represented by:

$$\begin{aligned} E_{a,s}[H] = - \int Bel_{a,s}(l) log(Bel_{a,s}(l)) dl \end{aligned}$$
(5)

The expected entropy after executing action a is obtained by integrating over all possible sensor values s, weighted by their likelihood, and by applying the update rule defined in Eq. 3:

$$\begin{aligned} \begin{aligned} E_{a}[H]&= \int E_{a,s}[H] P(s) ds \\&= - \int \int Bel_{a,s}(l) log(Bel_{a,s}(l)) P(s) dl ds \\&= - \int \int P(s/l) Bel_a(l) log \frac{P(s/l)Bel_a(l)}{P(s)} dl ds \end{aligned} \end{aligned}$$
(6)

Solving the problem of active localisation therefore means minimising \(E_{a}[H]\).
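Discretising the beliefs, motion models and sensor values (all matrices below are invented for illustration), Eqs. 2-6 turn into a direct entropy-minimising action selector:

```python
import math

def entropy(bel):
    """Eq. 4 over a discrete belief."""
    return -sum(p * math.log(p) for p in bel if p > 0)

def expected_entropy(bel, motion, sensor, action):
    """Discrete version of Eq. 6: E_a[H] for one action.
    bel[l]: belief over locations; motion[a][l1][l]: P_a(l | l1);
    sensor[l][s]: P(s | l)."""
    n = len(bel)
    # Eq. 2: predicted belief after executing the action
    bel_a = [sum(motion[action][l1][l] * bel[l1] for l1 in range(n))
             for l in range(n)]
    e_h = 0.0
    for s in range(len(sensor[0])):
        p_s = sum(sensor[l][s] * bel_a[l] for l in range(n))      # P(s)
        if p_s == 0:
            continue
        # Eq. 3: posterior belief after also sensing s
        bel_as = [sensor[l][s] * bel_a[l] / p_s for l in range(n)]
        e_h += p_s * entropy(bel_as)                              # weight by P(s)
    return e_h

def best_action(bel, motion, sensor):
    """Active localisation: choose the action minimising expected entropy."""
    return min(range(len(motion)), key=lambda a: expected_entropy(bel, motion, sensor, a))
```

For instance, with two equally likely locations, an action that moves the robot somewhere the two hypotheses predict different sensor readings has expected entropy zero, while staying put (identical readings) leaves the entropy at log 2, so the discriminating action is chosen.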

Very important work using this technique was carried out by Burgard et al. (1997) and Fox et al. (1998). In their work, they selected actions by maximising the weighted sum of the expected decrease in uncertainty (entropy) and the cost of moving to the target point. Target points are specified relative to the current robot position and can represent an arbitrary point in space. Path planning is not involved in the active localisation module. The result of the algorithm is only a single point to be reached by the robot. Position probability grids are used to estimate the vehicle position.

Fig. 7

The algorithm proposed by Kodaka et al. (2009), based on a pre-calculated entropy map

Kodaka et al. proposed an approach for mobile robots based on an entropy map Kodaka et al. (2009). RFID tags are placed in the environment, with the entropy map precalculated based on the arrangement of the tags. After a pose prediction using particle filtering, the robot is attracted to the target using a dynamic model, the fundamental unit of which is rotation-based angular velocity. Figure 7 shows the steps of the algorithm and the relations with the environment. This solution is too specific to be applied in the underwater domain at similar scales. Using active beacons, with greater distances due to the physical nature of the sensors, it would be possible to design a similar solution. However, the different constraints on the vehicles would suggest an alternative solution that better suits the underwater world.

Mariottini & Roumeliotis presented an active vision-based localisation technique in a large-scale image map, represented as a vocabulary tree Mariottini and Roumeliotis (2011). They adopted a sequential Bayesian approach in order to eliminate the localisation ambiguity by exploiting additional camera measurements over an extended time horizon, while navigating towards a target image, and along the least-ambiguous (i.e., low entropy) visual path.

Kümmerle et al. use this approach involving active sensing, i.e. the possibility for the robot to decide where to point the sensor. Active sensing represents a subset of the full active localisation problem, where the possible actions the robot can undertake are limited to pointing the sensor. They use particle filters for vehicle localisation. They cluster the particles into groups and calculate the total expected entropy for the particle filter as a weighted average of the expected entropy of each cluster Kummerle et al. (2007).

3.4 Selection of best action

This section presents a selection of approaches which are based on the selection of the single best action for the robot to undertake in order to localise itself. The method to discriminate between the different actions may differ, but they all share the same framework: given a set of n possible actions \(A = \{a_1, a_2, a_3,..., a_n\}\), the algorithm selects the action \(a_i\) which maximises the following formula:

$$\begin{aligned} i = \underset{i}{argmax} (reward_{a_i} - cost_{a_i}) \end{aligned}$$
(7)
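A minimal sketch of this framework (the candidate actions and the scoring functions below are hypothetical placeholders for application-specific reward and cost models):

```python
def select_best_action(actions, reward, cost):
    """Eq. 7 as code: return the index i maximising reward(a_i) - cost(a_i).
    reward and cost are callables scoring each candidate action, e.g. the
    expected reduction in pose uncertainty versus the energy/time to execute."""
    scores = [reward(a) - cost(a) for a in actions]
    return max(range(len(actions)), key=scores.__getitem__)
```

The approaches below differ mainly in how reward and cost are computed, e.g. discriminative power of a heading over a particle subset, or information gain versus travel distance.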

The work carried out by Fairfield & Wettergreen represents an important contribution using this approach, also because it is one of the few examples in the underwater domain Fairfield and Wettergreen (2008). It uses active localisation on top of a map previously constructed by a SLAM approach. The set of possible actions is represented by the heading of the vehicle for the following 30 m. The most discriminative action is selected. The vehicle state is represented with a particle filter, and only a subset of particles is used to evaluate the best action. In many cases, however, a single action - in this case, setting the heading for the following 30 m - is not enough to discriminate between multiple hypotheses.

Solberg et al. propose an approach based on the active movement of an electric field emitter Solberg et al. (2007). Their approach relies on electric fields for vehicle navigation, a biologically inspired concept. The vehicle state is estimated with a particle filter, and the chosen control option minimises the expected variance of the particles at the next iteration.
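A toy illustration of this variance-minimisation criterion, under the simplifying assumption of a deterministic one-dimensional motion model in which a wall collapses the hypotheses together, could be:

```python
import numpy as np

def expected_variance(particles, control, motion_model):
    """Predicted spread of the particle cloud after applying `control`."""
    predicted = np.array([motion_model(p, control) for p in particles])
    return float(np.var(predicted))

def choose_control(particles, controls, motion_model):
    """Pick the control minimising the expected particle variance,
    in the spirit of Solberg et al.'s selection criterion."""
    return min(controls,
               key=lambda u: expected_variance(particles, u, motion_model))

# Four pose hypotheses along a 1-D corridor.
particles = np.array([2.0, 4.0, 6.0, 8.0])

def motion_model(p, u):
    # Toy model: a wall at x = 5 blocks forward motion, so driving
    # into it brings the hypotheses together.
    return min(p + u, 5.0)

controls = [-1.0, 0.0, 3.0]
best_u = choose_control(particles, controls, motion_model)
# u = 3.0 pushes every particle against the wall, giving zero variance.
```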

Seifzadeh et al. propose some modifications to standard Monte Carlo localisation, in order to solve the kidnapped robot problem and to initialise the particles more efficiently Seifzadeh et al. (2009). In their active approach, the robot chooses the action which represents the best trade-off between cost and gain. Again, similarly to other approaches discussed above, choosing only one action does not guarantee any result in complex environments, and thus the method can only be used in very specific cases.

Yi et al. developed an active-semantic localisation method Yi et al. (2009). A Bayesian model for robot localisation is applied, incorporating the spatial context among objects, which are described using symbols. The robot's action selection is based on a greedy approach: only the best action is considered.

Murtra et al. also present an approach based on the selection of the best action to execute Murtra et al. (2008). It relies on a rational criterion that selects the action minimising the expected number of remaining position hypotheses, using a particle filter.

3.5 Concatenation of actions

Whilst the selection of the best action provides a greedy approach to the active localisation problem, a significant extension is to look for the best concatenation of several actions. Maurelli et al. developed a tree-style action planning approach, where the robot can choose any combination of several available actions (Maurelli et al. 2010; Maurelli and Petillot 2010). Each node of the tree is evaluated according to a cost-benefit analysis, with the stopping condition being either a threshold or a maximum tree depth. In this scenario the goal is to minimise the expected uncertainty after the sequence of actions. Localisation is based on particle filter techniques, and is triggered when a clear particle clusterisation is detected. Minimising the expected uncertainty means looking for the set of actions which would best discriminate among the various clusters. This means, in a prediction - observation - update loop, looking for diversity in the expected measurements, as shown in Fig. 8.

Fig. 8

The reward calculation for each node of the tree, in the active localisation approach proposed by Maurelli et al. (2010)
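A much-simplified sketch of this tree-style search (exhaustive rather than cost-pruned, with an assumed evaluation function returning the expected uncertainty of a sequence) might look as follows:

```python
from itertools import product

def best_action_sequence(actions, evaluate, depth):
    """Depth-limited search over action sequences, a simplified stand-in
    for the tree expansion of Maurelli et al.: every sequence of up to
    `depth` actions is scored and the best one returned.
    `evaluate(seq)` is assumed to return the expected uncertainty
    (e.g. entropy over the particle clusters) after executing `seq`;
    lower is better."""
    best_seq, best_score = None, float("inf")
    for d in range(1, depth + 1):
        for seq in product(actions, repeat=d):
            score = evaluate(seq)
            if score < best_score:
                best_seq, best_score = seq, score
    return best_seq, best_score

# Toy evaluation: only the sequence right-right reaches a viewpoint
# that separates the two pose hypotheses (cf. Fig. 10).
def evaluate(seq):
    return 0.1 if seq == ("right", "right") else 1.0

seq, score = best_action_sequence(["left", "right"], evaluate, depth=2)
```

A greedy one-step selector would be unable to find this sequence, since at depth one both actions score equally.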

3.6 Beacon-aided localisation

Some approaches to active localisation can be very specific, tailored to a particular robot configuration or to specific environmental constraints. This is the case for the approach of Olson et al., who use active beacons deployed in the environment in order to help the localisation process Olson et al. (2004). The use of active beacons (i.e. acoustic emitters) is quite common in underwater robotics. With this approach, when a vehicle moves on a straight segment and beacons lie off the line of travel, two solutions for the beacons’ locations (or, conversely, for the vehicle’s location) are always possible: one on either side of the vehicle. The active approach to localisation therefore aims to disambiguate between these two solutions through a specific path to be followed. Figure 9 shows the exploration gradient with two possible beacon locations. The best disambiguating motion is a function of the AUV’s location: the vehicle maximises the difference between the range measurements by travelling along the arrows, whose length indicates how rapidly the difference in range changes. This approach is capable of performing localisation without relying on carefully surveyed beacon locations. The ability to localise a beacon is tightly coupled to the path travelled by the AUV; the robot’s path should therefore be chosen to optimally resolve ambiguous data. Although simulation results were very promising, this approach cannot be easily generalised or adapted if beacons are not present in the environment.

Fig. 9

Exploration gradient with the beacon location at (-1;0) or (1;0). The best disambiguating motion is a function of the AUV’s location. Olson et al. (2004)
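The disambiguating motion can be illustrated with a small sketch; the candidate beacon positions at (-1, 0) and (1, 0) follow the example of Fig. 9, while the greedy grid step is an illustrative simplification of following the exploration gradient:

```python
import math

def range_difference(pos, beacon_a, beacon_b):
    """Difference between the ranges to the two candidate beacon
    locations; it is zero everywhere on the original line of travel."""
    return abs(math.dist(pos, beacon_a) - math.dist(pos, beacon_b))

def disambiguating_step(pos, beacon_a, beacon_b, step=0.5):
    """Greedy step over a small grid of candidate moves, taking the one
    that maximises the range difference (a crude stand-in for the
    exploration gradient of Fig. 9)."""
    candidates = [(pos[0] + dx, pos[1] + dy)
                  for dx in (-step, 0, step) for dy in (-step, 0, step)]
    return max(candidates,
               key=lambda p: range_difference(p, beacon_a, beacon_b))

a, b = (-1.0, 0.0), (1.0, 0.0)
start = (0.0, 3.0)  # on the line of travel: both ranges are equal
nxt = disambiguating_step(start, a, b)  # steps off the ambiguous line
```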

3.7 Mission planning and active localisation

Active localisation can also be seen not as an isolated problem, but as linked to the overall robotic system, which has tasks to perform. It might not always be possible to focus on localisation while neglecting any other goal of the robot. The opposite, on the other hand, is almost always impossible: knowledge of the robot state is very often - if not always - a necessary condition. A way to address this area is to perform active actions which contribute both to the localisation process and to the end goal. In trajectory planning, for example, a localisation-aware trajectory would not necessarily be the shortest one to the goal point, but the one which allows the robot to see features and arrive at the goal point with reduced uncertainty. The work of Bauer is in this area and presents an approach to support the data acquisition for the localisation process of an autonomous robot through well-aimed manoeuvres Bauer (1995). The task of localisation is linked with a specific goal to be reached. In the case of path planning, the proposed approach mediates between the different tasks: localisation and the user-defined mission. This mediation is performed by analysing the estimated benefit and cost of each task and then selecting the optimal manoeuvre. The work was tested in simulation with line features as the landmark type for the robot to localise itself. Important assumptions of this work are the knowledge of the start position and the limitation to line features.
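A minimal sketch of such a mediation, with purely illustrative benefit and cost figures, could be:

```python
def select_manoeuvre(manoeuvres):
    """Mediate between localisation and the user-defined mission by
    scoring each manoeuvre on the combined estimated benefit minus its
    cost, loosely following Bauer's mediation scheme (the fields and
    numbers are illustrative, not from the paper)."""
    def score(m):
        return m["localisation_benefit"] + m["mission_benefit"] - m["cost"]
    return max(manoeuvres, key=score)

manoeuvres = [
    {"name": "straight_to_goal", "localisation_benefit": 0.0,
     "mission_benefit": 1.0, "cost": 0.2},
    {"name": "detour_past_wall", "localisation_benefit": 0.8,
     "mission_benefit": 0.8, "cost": 0.5},
]
chosen = select_manoeuvre(manoeuvres)  # the detour wins: 1.1 vs 0.8
```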

3.8 Cooperative active localisation

In some cases the problem of localisation and active localisation can be addressed in a multi-robot scenario, showing the benefits of active approaches over passive ones. Bhuvanagiri & Krishna designed a system to guide several robots whose states are ambiguous to locations where as many of them as possible can resolve their ambiguity by localising to a unique hypothesis state Bhuvanagiri and Krishna (2008). It presents a unified probabilistic framework that takes into account the role of measurements between robots, as well as measurements of the local map structure, in deciding the best locations to move to. The robots move towards those locations where the probability of localising to a unique hypothesis is maximal. This work shows the advantages of a multi-robot system in addressing problems such as state estimation. Davison & Kita demonstrated accurate localisation for an inspection team consisting of a robot with stereo active vision and a companion with an active lighting system Davison and Kita (2000). In this case a single sensor can be used for measuring the position of known or unknown scene features, measuring the relative location of the two robots, and actually carrying out an inspection task. The active vision system is based on active landmark choice, described in Sect. 3.1.

3.9 Other approaches

In this section, other specific approaches which cannot be grouped into the categories described above are presented. They often provide a customised solution for a specific problem, which makes them difficult to port to other domains or to scenarios with different constraints.

Antonelli et al. focus on the improvement of observability for the relative localisation of AUVs Antonelli et al. (2010). The case of cooperation between two vehicles is analysed, and numerical simulations have shown path configurations which avoid singularities. The proposed approach is purely mathematical, dealing with system observability. A system is said to be observable if, for any possible sequence of state and control vectors, the current state can be determined in finite time using only the outputs. The approach evaluates valid paths which yield a full-rank observability matrix for the linearised system, i.e. all variables are fully observable. Those paths are, however, defined in advance and are specific to the situation. The behavioural control techniques described in the paper do not provide a general solution to the localisation problem in cases where the robot needs to choose the best trajectory (or, in a wider sense, the best set of actions).
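The full-rank condition can be checked numerically; the sketch below uses a generic linear pair (A, C) rather than the vehicle models of Antonelli et al.:

```python
import numpy as np

def observability_matrix(A, C):
    """Stack C, CA, CA^2, ..., CA^(n-1) for the linearised system
    x' = Ax, y = Cx; the pair (A, C) is observable iff this matrix
    has full column rank."""
    n = A.shape[0]
    blocks = [C @ np.linalg.matrix_power(A, k) for k in range(n)]
    return np.vstack(blocks)

def is_observable(A, C):
    return np.linalg.matrix_rank(observability_matrix(A, C)) == A.shape[0]

# Toy double integrator: measuring only position still makes velocity
# observable through the dynamics, while measuring only velocity
# leaves the position unobservable.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
C_pos = np.array([[1.0, 0.0]])  # position measurement: observable
C_vel = np.array([[0.0, 1.0]])  # velocity-only measurement: not observable
```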

O’Kane & LaValle analysed three robot configurations with limited sensing, in order to study the possibility for a simple robot to localise itself in a polygonal environment O’Kane and LaValle (2007). A discretisation of the state space is applied. No uncertainty is considered; all possible states lie in a finite set. In order to disambiguate, two random possible states are considered, and a list of actions is computed so as to arrive at a point where only one of the two states is admissible. This approach, justified with algorithmic proofs, successfully determines the robot pose. Among its limitations, the environment needs to be polygonal, no uncertainty is taken into account, and the two candidate poses among the set of all possible poses are chosen randomly, which can require applying the algorithm recursively up to \(n-1\) times, with n being the number of possible initial states.

Dudek et al. presented a method for minimum-distance traversal for localisation in polygonal environments without holes, a problem they show to be NP-hard Dudek et al. (1998). A randomised version of the same method was presented in Rao et al. (2007).

The work proposed by Kondo et al. on localisation around underwater structures is an example of linking localisation with the planning system Kondo et al. (2004). The localisation itself is based on particle filters, but the planning system considers the state of the filter in its planning. Once convergence is reached, it issues waypoints in order to inspect the structure at a fixed distance. In the proposed system, the link between localisation and planning is not very strong: the planner waits for a stable navigation status before moving the vehicle, which is very reasonable, but it does not consider the possibility that a consistent status might be achieved only after a specific set of actions.

3.10 A brief glance at active SLAM

As in the previous section's analysis of passive techniques and passive SLAM, it is also possible to look at active SLAM approaches, which extend the problem formulation to jointly address the autonomous exploration problem together with standard SLAM. The active part of these approaches aims to select the next move (or the next set of actions) in order to build the map as efficiently as possible, or in order to bound the uncertainty by actively looking for loop-closure opportunities. Active SLAM approaches have been used in the robotics community for many years, but only recently has the underwater robotics research community started to investigate them more systematically.

Palomeras et al. introduce a new exploration framework, combining a view planner with a pose-graph-based localisation and mapping algorithm Palomeras et al. (2019). The proposed mechanism decides the next viewpoint to visit by selecting the one expected to jointly reduce the entropy of the map and of the vehicle state. This entropy-minimisation principle is shared by several active localisation techniques presented in the previous sections. The SLAM part is implemented with a pose-graph approach, using the ICP algorithm to register the point clouds gathered from different viewpoints. Whenever the uncertainty in the vehicle state is small, the active SLAM maximises the exploration of new regions, whereas when this uncertainty is large, the vehicle tries to reduce it by revisiting known regions with small uncertainty.
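The explore/revisit switching can be caricatured as follows (the threshold and the candidate viewpoint scores are illustrative assumptions, not values from the paper):

```python
def next_viewpoint(candidates, vehicle_uncertainty, threshold=0.5):
    """Caricature of the explore/revisit trade-off: with a
    well-localised vehicle, favour viewpoints that reduce map entropy
    (exploration); otherwise favour viewpoints expected to reduce the
    vehicle-state entropy (revisiting well-known regions)."""
    if vehicle_uncertainty < threshold:
        key = lambda v: v["expected_map_entropy_drop"]
    else:
        key = lambda v: v["expected_pose_entropy_drop"]
    return max(candidates, key=key)

candidates = [
    {"name": "frontier", "expected_map_entropy_drop": 0.9,
     "expected_pose_entropy_drop": 0.0},
    {"name": "known_wall", "expected_map_entropy_drop": 0.1,
     "expected_pose_entropy_drop": 0.7},
]
explore = next_viewpoint(candidates, vehicle_uncertainty=0.1)
relocalise = next_viewpoint(candidates, vehicle_uncertainty=0.9)
```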

Suresh et al. focus on the volumetric exploration of 3D underwater environments with multibeam sonar Suresh et al. (2020). In order to identify the next location to revisit, a 3D visual dictionary is built from real-world sonar data and a metric of submap saliency is computed. Revisit actions are chosen based on the propagated pose uncertainty and the sensor information gain. As in the previously presented work, the SLAM part is implemented with a pose-graph approach.

Chaves and Eustice use the Bayes tree data structure for efficient planning Chaves and Eustice (2016). In particular, they focus on reducing the computational complexity by eliminating redundant computations between similar candidate actions. The work focuses on visual SLAM and has been tested successfully in a hybrid simulation environment.

In the same year, Chaves et al. proposed an active SLAM framework for performing large-scale inspections with an underwater robot, showing results not only from the hybrid simulation but also from field trials Chaves and Eustice (2016). The central idea is to plan loop-closure paths in order to decrease navigation uncertainty. While loop-closing revisit actions bound the robot's uncertainty, they also lead to redundant area coverage and increased path length. Gaussian process regression is used to model the prediction of camera registrations, and a two-step optimisation procedure selects the revisit actions.

Kim et al. focus on active visual SLAM applied to ship hull inspection Kim and Eustice (2015). They introduce perception-driven navigation, which automatically balances exploration and revisitation using a reward framework. This framework accounts for SLAM localisation uncertainty, area coverage performance, and the identification of good candidate regions in the environment for visual perception. Results are shown both for a hybrid simulation and for a real-world demonstration of a visual SLAM system for autonomous underwater ship hull inspection.

Hollinger et al. focus on an inspection scenario with the aim of improving the quality of the inspection, rather than maximising the accuracy of a given data stream Hollinger et al. (2013). The inspection planning problem is formulated as an extension of Bayesian active learning. By changing the formulation of the problem from cost minimisation with a constraint on information gain to variance reduction with a constraint on cost, the potential benefit of adaptivity can be reduced from an exponential to a constant factor.

Overall, this research area is very important for advanced AUV missions, and only recent work has addressed it in the underwater domain.

3.11 Critical analysis

Among the various presented techniques, it is important to highlight their strengths and the boundary conditions under which they work well. Actively choosing the landmark to observe represents a good strategy for landmark-based localisation; however, it does not address the disambiguation among multiple possible locations. Multiple-hypothesis Kalman filters handle multiple possible locations, but work well only when a clear set of features can be identified. The concept of entropy minimisation is considered key by the authors. Considering the localisation function as a probability distribution, any active localisation technique needs, directly or indirectly, to minimise the expectation of the future entropy when selecting actions. A clear drawback, however, is the complex mathematical formulation, which is computationally intensive, as also outlined by Fairfield and Wettergreen (2008). Additionally, the solution provided by Burgard et al. (1997) yields one map point, relative to the robot, which the robot should reach to minimise the entropy of the localisation distribution. However, that point might not be accessible, and the path planning to reach it is not made explicit. The selection of the best action works well only in simple environments. In many cases a series of actions is needed in order to correctly estimate the location in the map: executing one action only gives no guarantee of improving the localisation. A simple greedy approach of building a new action on top of the previous best action is again not suitable, due to the possibility of local minima and maxima. Figure 10 shows an example. Considering two possible states, A and B, the single best action is to move towards the left, as moving towards the right would not discriminate at all between the two possible locations. If the system were to build a set of actions by concatenating the best single actions, it would again go towards the left. However, if the system chooses to go right from the beginning, it arrives at a much better location to discriminate between the two hypotheses. The work of Gasparri et al. (2007) is very similar to the selection of the best action, though the selection process is more elaborate, and it therefore has similar limitations. The selection of a point close to the first possible obstacle was mainly justified on safety grounds, but it actually helps the whole localisation process. The idea of looking for places which differ according to the different hypotheses is key. The work on the concatenation of actions solves the problems identified with best-single-action selection and is able to handle multiple hypotheses. As it is based on tree exploration, optimisation strategies are very useful to constrain the computational complexity. Table 2 summarises the main approaches described in this section.

Fig. 10

An example of why building a set of actions by greedily concatenating the best single action is not a powerful solution. A and B represent two possible locations of the robot. (a) initial situation; (b) the best single action is to move towards the left; (c) the best single action after the previous best single action is again a move towards the left. However, if the robot moves towards the right twice from the initial situation, it arrives at a much better location to discriminate between the two hypotheses

Similar considerations on deep learning, outlined at the end of the previous section, are valid for active localisation. Very little is available in this domain, but the expectation is that this will change significantly in the near future.

Table 2 A summary of the various techniques that can be employed for active localisation. The concatenation of actions is robust against local minima and can be efficiently generalised

4 Conclusions

This article has reviewed the main techniques currently used for localisation, providing an analysis of the related work, the mathematical formulation of Bayesian filtering, and some numerical and experimental results using standard techniques such as Kalman filters, scan matching and set-membership approaches. From the analysis of the different approaches, this review identifies the need for a localisation system for underwater vehicles which:

  • is not dependent on external aid;

  • solves both global localisation and position tracking;

  • is able to cope with high levels of noise in the data and with imprecision in the prior knowledge of the map;

  • is able to recover from wrong convergences;

  • is robust and reliable not just in simulated tests, but also integrated in the robot architecture and tested over long trajectories in the field.

A special focus was given to active localisation techniques which go one step further in minimising the uncertainty of the position estimate by actively moving the vehicle or sensor to a favourable position. Depending on the exact strategy chosen, the main advantages of this approach can be the following:

  • in the case of equally valid localisation hypotheses, the tie can be deliberately broken and the true hypothesis chosen;

  • landmarks contributing higher information content are used;

  • a trajectory is chosen between desired endpoints with the localisation certainty satisfying certain criteria.

Many of the characteristics mentioned above are critical for navigating in challenging environments, such as around repetitive man-made structures or over vast bottom zones with clustered or sparse landmarks. The advances in the techniques presented in this article will also affect the design of future positioning systems, which will be able to offer the same quality of localisation at a lower price and with less initial setup difficulty.

A new area that will surely flourish in AUV localisation is deep learning, which has already had a remarkable impact in other robotics fields, though significant breakthroughs in the underwater domain are yet to come.

Interesting theoretical results are beginning to accumulate in the domain of swarm and cooperative robot navigation and localisation. Practical results are still relatively difficult to obtain and analyse, but this is bound to change in light of the numerous companies announcing that the affordable (under $10,000) and easy-to-deploy AUVs they develop are about to enter the market. Without any doubt, more and more AUVs will swim in our seas in the near future.