1 Introduction

Welding is a critical manufacturing technology, particularly in the construction of large-scale steel structures such as ships, offshore wind turbines, and nuclear power facilities (Eren et al. 2021; Slobodyan 2021). Underwater steel structures such as spent nuclear fuel pools and submerged steel pipe piles operate in complex environments and are prone to damage, so underwater welding repair is essential. Manual underwater welding, however, is hazardous and inefficient (Barnabas et al. 2021; Vashishtha et al. 2022). Underwater welding robots (UWRs) therefore represent the future direction of underwater welding technology.

Presently, underwater robots predominantly comprise autonomous underwater vehicles (AUVs) and remotely operated vehicles (ROVs) (Schramm et al. 2020; Shen and Shi 2020; Cai et al. 2021; Honaryar et al. 2022; Maity et al. 2023). ROVs are tethered to surface ships for power and controlled remotely via cables. In contrast, AUVs operate autonomously with onboard power, managing tasks independently. Due to the demanding power requirements and wire feed systems of underwater welding, coupled with the need for robust sealing against water ingress, current UWRs rely on cable transmission for AC power, wiring, and control signals (Fu et al. 2020; Sun et al. 2022a). Despite efforts focusing on ROV-based underwater welding, challenges persist in achieving optimal welding quality due to stringent requirements on robot posture estimation during welding operations (Lv et al. 2014; Luo et al. 2018). ROVs are also susceptible to fluid dynamics fluctuations, posing challenges for precise underwater welding tasks. Moreover, existing UWRs lag behind in software algorithms, relying primarily on conventional methods for pose estimation, weld feature extraction, and trajectory planning (Xiang et al. 2018; Duan 2020; Chi et al. 2023c).

In recent years, rapid advancements in humanoid robots and artificial intelligence (AI) technologies have redirected the development trajectory of traditional welding robots. International competitions such as the RoboCup League (Kitano et al. 1997) and FIRA competitions (Baltes et al. 2017; Tu et al. 2020), alongside initiatives like DARPA (Clark 1988), have spurred innovation in humanoid robot technologies. Institutions have also developed prototypes such as ASIMO (Sakagami et al. 2002), TORO (Englsberger et al. 2014), ATLAS (Feng et al. 2014), PR-2 (Bohren et al. 2011), and Tesla Optimus (Artificial Intelligence and Autopilot 2021), emphasizing AI-driven task completion and human-robot interaction. Furthermore, humanoid robots aim to replace humans in hazardous and physically demanding tasks, addressing labor shortages and safety concerns in industrial settings.

Underwater welding, due to its perilous nature and demanding technical requirements, faces a shortage of skilled underwater welders. Humanoid robots, equipped with dexterous arms akin to human welders, offer promising potential to revolutionize the stagnant landscape of UWRs. However, the intricate structure and ongoing development of decision-making, communication, and control systems in humanoid robots present substantial challenges. Notably, no precedent exists for underwater welding using humanoid robots, underscoring the pioneering nature of this endeavor. Consequently, this paper concentrates on the future prospects of underwater humanoid welding robots (UHWR), focusing on their hardware platforms, AI algorithm applications in software systems, and prospective development directions. The three main focus points are as follows:

  1. What hardware structure and sensors should a UHWR system have?

  2. Which AI algorithms can be applied to UHWR?

  3. To adapt to UHWR, what challenges and future trends do AI algorithms have?

Based on a comprehensive literature review and theoretical analysis, this paper makes the following contributions:

  • The concept of the UHWR is proposed for the first time in this paper, and its magnetic adsorption chassis, dual robotic arms, underwater camera, computing platform, and onshore support equipment are analyzed.

  • The applications of AI methods in underwater environments are systematically reviewed, and their potential applications in UHWR are explored.

  • The challenges associated with UHWR and the future trends of related AI methods are analyzed, which provide a structured technology roadmap for practitioners to overcome existing limitations and improve the capabilities of UHWR.

The remainder of this paper is organized as follows. Sect. 2 explains the research methodology. Sect. 3 introduces the hardware system of the UHWR, including onshore support equipment and underwater equipment. Sect. 4 introduces and compares the software system of the UHWR, including multi-sensor calibration, visual-based 3D reconstruction, weld seam feature extraction, weld repair decision-making, robot trajectory planning, and dual-arm robot motion planning. Finally, conclusions and recommendations are summarized in Sect. 5.

2 Research methodology

This literature review adheres to the Preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines (Moher et al. 2010), as shown in Fig. 1. We conducted a literature search through Scopus, Google Scholar, Web of Science (WOS), IEEE Xplore, ScienceDirect, and the Institute for Scientific Information (ISI). The search covered title (TITLE), abstract (ABS), and keywords (KEY), and the initial search obtained a total of 1898 documents. Subsequently, articles on welding processes were manually excluded based on the content, leaving 835 articles focused on welding scenarios or technological advances. The screening process included evaluating abstracts, methods, and practical applications. The final inclusion criteria were: 1. Articles that detail underwater multi-sensor calibration methods based on AI; 2. Research on underwater 3D reconstruction and feature extraction using AI methods; 3. Papers discussing underwater mobile robots or robotic arm path planning algorithms based on AI methods; 4. Articles that make significant contributions to underwater robot research or demonstrate practical underwater welding applications. Ultimately, 159 articles were identified as most relevant to the objectives of this review.
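The screening funnel described above can be checked with a few lines of arithmetic. The sketch below uses only the three counts reported in the text; the variable and function names are illustrative and do not come from the PRISMA guidelines.

```python
# Screening counts reported in the text above.
identified = 1898   # records returned by the initial search
screened = 835      # records left after excluding welding-process articles
included = 159      # records judged most relevant to the review


def retention(kept: int, total: int) -> float:
    """Return the share of records kept, as a percentage."""
    return 100.0 * kept / total


excluded_first_pass = identified - screened
excluded_second_pass = screened - included

print(f"Excluded in first pass:  {excluded_first_pass}")
print(f"Excluded in second pass: {excluded_second_pass}")
print(f"Kept after first pass:   {retention(screened, identified):.1f}%")
print(f"Kept overall:            {retention(included, identified):.1f}%")
```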

The retrieved papers include not only journal papers but also important conference papers, because many classic AI algorithms were first proposed at conferences. We screened for high-quality papers using citation counts and relevance filters. Because AI evolves quickly, we focused on documents from the past five years (2019–2023). Fig. 2 shows the number of papers published per year for seven keywords: “underwater welding robot”, “underwater humanoid robot”, “underwater 3D reconstruction”, “underwater camera calibration”, “weld seam feature extraction”, “underwater robot decision making”, and “underwater robot motion planning”. The number of papers on underwater humanoid robots and underwater welding robots has remained essentially flat since 2019, which suggests that a clear development direction has not yet been found. The number of papers for the remaining five keywords has gradually increased, indicating that continued progress in these underlying technologies can bring new vitality to underwater welding robots.

Fig. 1

Systematic review based on the PRISMA method

Fig. 2

The number of academic papers published on 7 keywords from 2019 to 2023. The keywords are: 1-“underwater welding robot”, 2-“underwater humanoid robot”, 3-“underwater 3D reconstruction”, 4-“underwater camera calibration”, 5-“weld seam feature extraction”, 6-“underwater robot decision making”, 7-“underwater robot motion planning”

3 Hardware of UHWR

Underwater robots are the product of integrating AI, automation, mechanical engineering, and other fields (Petillot et al. 2019). Welding likewise lies at the intersection of physics, chemistry, and materials science, so a welding robot system is extremely complex (Wang et al. 2019). Previous work indicates that the hardware platform of an underwater welding robot should include onshore support equipment and underwater equipment (Lei et al. 2020; Chi et al. 2023a, b). At present, the most common underwater robot design is the ROV, such as the JHUROV proposed by Martin and Whitcomb (2016) (Fig. 3) and the Kaiko ROV (Barry and Hashimoto 2009) (Fig. 4). These underwater robots adopt a box-type structure and operate free-floating while tethered by cables. Although they have many advantages in practical applications, they also suffer from limited maneuverability and low positioning accuracy, which makes them difficult to apply to high-precision underwater welding. As mentioned above, the UHWR is an important way out of the current underwater welding dilemma. Given the characteristics of the underwater welding environment and the requirements of actual welding operations, the humanoid robot has also been given a new hardware structure (Saeedvand et al. 2019; Fan et al. 2020; Bu et al. 2022). As shown in Fig. 5, the basic hardware of the UHWR should comprise the following modules. The underwater equipment requires an underwater camera, a dual-arm robot, a mobile chassis, and a computing platform. For the onshore support equipment, the human-robot interaction system, welding power source, and wire feeding system are the keys to ensuring underwater welding.

Fig. 3

JHUROV proposed by Martin and Whitcomb (2016)

Fig. 4

The design of Kaiko ROV (Barry and Hashimoto 2009)

Fig. 5

The hardware of UHWR

3.1 Underwater equipment of UHWR

The underwater module is mainly used for four functions: environment perception, computing, movement, and welding. Among them, due to the impact of underwater optical refraction and sealing costs, underwater cameras are used for environment perception. The computing platform is mainly used to run AI algorithm models. Movement refers to using a mobile chassis to control the movement of the UHWR. Welding refers to the use of dual-arm robots to complete underwater welding work.

3.1.1 Underwater camera

Underwater cameras can directly capture color images of the underwater environment, and binocular cameras can additionally provide depth estimation. However, unlike terrestrial images, underwater images are degraded by wavelength-selective absorption and scattering of light, which vary with water quality and lighting conditions, so underwater image restoration is considerably more challenging (Zhou et al. 2023a). Therefore, the design of underwater cameras mainly involves two parts: underwater optical camera communication (UOCC) and underwater image enhancement (UIE).

Over the years, underwater communications have advanced significantly, from underwater acoustic communications (UAC), characterized by limited bandwidth and high latency, to radio frequency (RF) communications with reduced latency (Hufnagel et al. 2019; Lin et al. 2022). However, RF communications suffer from high attenuation, which restricts their effective range. In response to these challenges, UOCC has emerged as a promising solution, offering higher data throughput and lower latency over short distances (typically a few meters). However, implementing UOCC on underwater robots requires careful consideration of power requirements. Table 1 summarizes the advantages and disadvantages of these three communication methods. Extensive research has explored the capabilities and applications of UOCC. Akram et al. (2020) discussed overall system design considerations for optical camera communication (OCC), synchronization, and a novel decoding approach, and elaborated on the optimization techniques used in the image processing algorithms. Shigenawa et al. (2022) proposed an OCC predictive equalization technique that assumes color shift keying (CSK), in which the light signal is modulated by varying the intensity of a three-color LED luminaire. They verified the feasibility of their method in an underwater environment at 3.5 m depth. Li et al. (2020) developed a real-time video transmission system based on a field-programmable gate array (FPGA) using binary frequency shift keying (2FSK) modulation. They used the transmission control protocol (TCP) and forward error correction (FEC) to achieve full-duplex communication. The full-duplex system achieved a data rate of 1 Mbps at a signal-to-noise ratio of 10.1 dB over a 10 m underwater link. Majlesein et al. (2020) proposed using multispectral or hyperspectral cameras as OCC receivers in short-range underwater links to exploit the high spectral resolution of such devices relative to conventional cameras. In summary, OCC is the most feasible communication solution for short-range underwater links, and relatively mature solutions already exist.
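To make the 2FSK idea mentioned above concrete, the following minimal sketch maps each bit to one of two tones and recovers the bits by comparing the energy at each tone. It illustrates the modulation principle only and is not the FPGA system of Li et al. (2020): the sample rate, bit rate, and tone frequencies are hypothetical, the channel is noise-free, and TCP and FEC are omitted.

```python
import math

# All numeric settings below are hypothetical and chosen so that each bit
# spans a whole number of cycles of both tones.
SAMPLE_RATE = 8000         # samples per second
BIT_RATE = 100             # bits per second
F0, F1 = 1000.0, 2000.0    # tone frequency for bit 0 and bit 1, in Hz
SAMPLES_PER_BIT = SAMPLE_RATE // BIT_RATE


def modulate(bits):
    """Map each bit to a burst of samples at F0 (bit 0) or F1 (bit 1)."""
    samples = []
    for i, bit in enumerate(bits):
        freq = F1 if bit else F0
        for n in range(SAMPLES_PER_BIT):
            t = (i * SAMPLES_PER_BIT + n) / SAMPLE_RATE
            samples.append(math.sin(2 * math.pi * freq * t))
    return samples


def tone_energy(chunk, freq, start):
    """Energy of `chunk` at `freq`, from correlation with cosine and sine."""
    i_sum = q_sum = 0.0
    for n, sample in enumerate(chunk):
        t = (start + n) / SAMPLE_RATE
        i_sum += sample * math.cos(2 * math.pi * freq * t)
        q_sum += sample * math.sin(2 * math.pi * freq * t)
    return i_sum ** 2 + q_sum ** 2


def demodulate(samples):
    """Decide each bit by comparing the energy at F1 with the energy at F0."""
    bits = []
    for start in range(0, len(samples), SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        is_one = tone_energy(chunk, F1, start) > tone_energy(chunk, F0, start)
        bits.append(1 if is_one else 0)
    return bits


message = [1, 0, 1, 1, 0, 0, 1, 0]
recovered = demodulate(modulate(message))
print(recovered == message)
```

In an optical link the two tones would modulate light intensity rather than an electrical carrier, and a practical receiver would also need synchronization and error correction.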

Table 1 Comparison of UAC, RF, and UOCC (Hufnagel et al. 2019; Lin et al. 2022)

UIE is an important way for AI to improve underwater images. In recent years, many image enhancement methods have been proposed to restore the color of underwater images (Wang et al. 2023; Zhou et al. 2023b). As shown in Table 2, the number of papers published in the field of UIE increased year by year from 2019 to 2023, reaching 632 papers in 2023. Notably, with the continuous development of AI technology, the proportion of deep-learning-based methods exceeded 50% in 2022. Zhou et al. (2023c) proposed a UIE method based on multi-interval sub-histogram perspective equalization to address the degradation of underwater images. They implemented adaptive feature enhancement by extracting statistical features of the image to estimate the degree of feature drift in each image region, thereby improving the visual quality of the degraded image. Zhou et al. (2023d) proposed an efficient and fully guided information flow network (UGIF-Net) for UIE. They used a multicolor space-guided color estimation module that accurately approximates color information by merging features from two color spaces into a unified network. Islam et al. (2020) proposed a real-time UIE model based on conditional generative adversarial networks. They formulated an objective function for supervised adversarial training that evaluates perceptual image quality based on global content, color, local texture, and style information. Overall, UIE has gradually matured with the development of AI technology, which greatly helps the UHWR perceive the underwater environment.
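As a point of reference for the histogram-based methods above, the sketch below implements plain global histogram equalization on a small grayscale image. It is a deliberately simplified stand-in, not the multi-interval sub-histogram method of Zhou et al. (2023c); the function name and the list-of-rows image format are assumptions made for illustration.

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization for a grayscale image (list of rows)."""
    pixels = [p for row in image for p in row]

    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Flat image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Lookup table that spreads the occupied intensities over the full range.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# A low-contrast patch: all values sit in a narrow band around 100.
patch = [[100, 101], [102, 103]]
enhanced = equalize_histogram(patch)
print(enhanced)
```

The narrow band of input intensities is stretched over the whole output range, which is the basic effect that the more elaborate underwater methods apply per region or per color channel.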

Table 2 The number of underwater image enhancement papers published from 2019 to 2023 and the percent of deep-learning-based methods

3.1.2 Dual-arm robot

UHWRs have a structure similar to that of humans: they are equipped with dual redundant robotic arms of roughly human size. The dual-arm robot is the core structure that allows a humanoid robot to take over human work. Compared with a traditional single-arm welding robot, it offers high flexibility, strong adaptability, and high efficiency, as shown in Table 3 (Buhl et al. 2019; Garabini et al. 2020). Many academic papers on dual-arm robots have been published in recent years, mainly focusing on machine design and motion planning, as shown in Table 4.

Numerous studies have been conducted to support humanoid robot arms. Buhl et al. (2019) studied the application of integrated dual-arm collaborative robot systems in smart factories and Industry 4.0. The human-like functionality of their design stems from dual-arm robots, dual tools, and vision, allowing simultaneous control and synchronized movement of the robot arm and product components. Garabini et al. (2020) proposed a dual-arm robot WRAPP-up, analyzed human picking strategies, and then used these as design, planning, and control guidelines for WRAPP-up. They proposed a solution that is flexible enough to handle goods with different shapes, sizes, and physical properties and requiring different gripping modes. Wang et al. (2021a) developed a new dual-arm robot and used modular actuators. The dual-arm robot consists of two 7-degree-of-freedom (7-DOF) robotic arms. The robot’s high degree of freedom enables it to effectively avoid joint limitations and singularities of the arm. Imanberdiyev et al. (2022) developed a lightweight, low-inertia dual-arm robot and conducted experimental verification. The robotic arm has a center of gravity (COG) balancing mechanism and is designed for aerial maneuvering tasks. Each robotic arm can dynamically adjust its center of gravity for better performance. The development of dual-arm robots is extremely important for humanoid robots, and the accumulated dual-arm robot technology currently provides a good technical foundation for the application of UHWR.

Table 3 Comparison of single-arm robot and dual-arm robot (Buhl et al. 2019; Garabini et al. 2020)
Table 4 The number of papers on dual-arm robot published from 2019 to 2023 and the number of papers on machine design and motion planning within it

3.1.3 Mobile chassis

According to Maity et al. (2023), an ROV equipped with a welding manipulator can be called an underwater welding robot. Besides the dual-arm robot in Sect. 3.1.2, the most important hardware of the UHWR is the mobile chassis. Although the floating mobile mechanism of an ROV has advantages in travel distance and diving depth, it is difficult to keep the robot stationary relative to the working surface during operation by relying solely on its propellers. The underwater wall-climbing mobile chassis is an important way to improve the positioning accuracy of underwater welding robots (Xu et al. 2021). Underwater wall-climbing mobile chassis can be divided by adsorption method into thrust adsorption (FAN et al. 2022), pressure adsorption (FAN et al. 2022), electromagnetic adsorption (Huang et al. 2019), and permanent magnet adsorption (Fan et al. 2020), and by contact form into crawler (Hu et al. 2022) and wheel (Bisht et al. 2023) types. Combining the four adsorption methods with the two contact forms therefore gives 8 types of underwater wall-climbing mobile chassis. As shown in Table 5, a comparison of the reliability, flexibility, immunity to interference, adaptability, and deployment of underwater wall-climbing robots indicates that the permanent magnet wheeled adsorption robot is the most suitable mobile chassis for the UHWR.
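The count of 8 chassis types follows from pairing each of the four adsorption methods with each of the two contact forms. A short enumeration, with labels taken from the paragraph above and list names chosen only for illustration, makes the combinations explicit:

```python
from itertools import product

# Categories named in the text; the variable names are illustrative.
ADSORPTION_METHODS = ["thrust", "pressure", "electromagnetic", "permanent magnet"]
CONTACT_FORMS = ["crawler", "wheel"]

# Every adsorption method paired with every contact form.
chassis_types = [
    f"{adsorption} adsorption, {contact}"
    for adsorption, contact in product(ADSORPTION_METHODS, CONTACT_FORMS)
]

for index, name in enumerate(chassis_types, start=1):
    print(f"{index}. {name}")
```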

In terms of underwater permanent magnet-wheeled adsorption robots, many scholars have carried out excellent work. Fan et al. (2020) introduced the mechanical design of an underwater wall-climbing robot (UWCR) for inspecting ship hulls. They discussed the novel design of the UWCR magnetic wheel in detail. A static model of UWCR was established to optimize the magnetic wheel structure. Shahrami et al. (2022) designed and analyzed the permanent magnet mechanism of a hull inspection robot and reviewed various types of inspection robots and cleaning robots. They analyzed the static and dynamic forces related to the robot in different directions on the vertical wall and conducted experimental tests and simulations using magnetostatic analysis in ANSYS software. Lin et al. (2023) proposed a new magnetic wheel wall-climbing robot with a passive-compliant suspension mechanism. They fully studied its curvature adaptability and kinematic transformation process and verified the movement flexibility, stability, and reliability of the magnetic wheeled wall-climbing robot through experimental results. All in all, the permanent magnet adsorption wheeled chassis is the key hardware for the work of UHWR. The development of this technology can directly promote the progress of underwater robot welding technology.

Table 5 Comparison of reliability, flexibility, robustness, adaptability and deployment of underwater wall-climbing robot

3.1.4 Computing platform

The computing platform is the “brain” of the UHWR and directly affects the quality of underwater welding work. In the early days of robotics, control software commonly ran on microcontroller boards such as the STM32 and Arduino Uno. These were low-cost but lacked computing power and could run only basic AI algorithms (Araújo et al. 2015; López-Rodríguez and Cuesta 2016). As hardware developed, robot computing platforms based on the Raspberry Pi gradually emerged. They use CPUs for computation and can run AI algorithms, but their computing power is low (Al-Sahib and Azeez 2015; Oltean 2019; Karahan and Hökelek 2020). It is worth mentioning that the Raspberry Pi, although a very small single-board computer, can run a Linux system and provides a human-computer interaction interface.

In recent years, with the further development of hardware technology, robot computing platforms based on GPUs and FPGAs have emerged and become the two main types of computing platform. Among embedded GPU platforms, Nvidia offers leading algorithms and hardware (Kolhatkar and Wagle 2021). Many mobile robot systems now use Nvidia GPUs as computing platforms, allowing them to run more complex deep learning networks (Bokovoy et al. 2019; Alexey et al. 2021). The main products include the Jetson Nano, Jetson TX2, Jetson Xavier NX, and Jetson Orin. The Jetson Orin delivers up to 275 trillion operations per second (TOPS), which is enough to run most existing deep learning networks. However, in many applications, requirements other than raw processing power may be more important. For example, in mobile or embedded applications, energy efficiency matters more, especially when the device is battery-powered. Additionally, latency is critical in many edge devices and embedded systems. In these cases, FPGAs are more competitive because they offer very low latency and better energy efficiency (Wan et al. 2021). Nitta et al. (2019) developed ZytleBot, a ROS-based robot utilizing an FPGA. ZytleBot performs all the processing required for autonomous driving in real time within a programmable SoC. As a result, it can, without external intervention, simulate actual road detours, detect signals and obstacles, and take appropriate actions. Fu and Yu (2019) designed an FPGA-based system to accelerate the MTCNN face detection algorithm. They evaluated the system on the Zynq 706 board and compared its performance with the Jetson TX2, a popular edge computing platform with an embedded CPU and GPU. The results showed that the FPGA system had 40x lower latency and 2.5x higher power efficiency than the Jetson TX2. Mazzetto and Castanho (2023) developed an FPGA-based convolutional neural network accelerator and used it for steering control of mobile robots. Their tests showed that the FPGA outperformed general-purpose processors, significantly shortening the processing time. It is worth mentioning that in embedded robot systems, in addition to the Linux system commonly used to run neural networks, the Robot Operating System (ROS) (Quigley et al. 2009) and ROS2 (Macenski et al. 2022) are commonly used as auxiliary systems. They are installed on top of Linux and provide data communication and basic function packages for each module of the robot. In summary, GPU/FPGA embedded computing platforms and ROS can support the intelligence of the UHWR and are important media through which AI algorithms act on robots. Table 6 summarizes the advantages and disadvantages of different computing platforms in the field of underwater robots (Araújo et al. 2015; Oltean 2019; Alexey et al. 2021; Mazzetto and Castanho 2023).

Table 6 Comparison of cost, performance, flexibility, difficulty of computing platforms (Araújo et al. 2015; Oltean 2019; Alexey et al. 2021; Mazzetto and Castanho 2023)

3.2 Onshore support equipment of UHWR

The onshore support equipment is essential for the UHWR to complete welding tasks and is an important part of underwater welding robot technology. Unlike the underwater equipment in Sect. 3.1, onshore support equipment does not need to be sealed and relies only on cable connections to provide control instructions (human-robot interaction system), energy (welding power), and material (wire feeding system).

3.2.1 Human-robot interaction system

Although the UHWR is a rarely explored field, underwater human-robot interaction (U-HRI) is not an emerging field. As stated by Birk (2022), typical underwater tasks require extensive communication between humans and robots and have therefore received extensive attention in U-HRI research. Water as a medium has special properties, such as the strong attenuation of radio frequency (RF) signals and the physics of underwater image formation, which make U-HRI systems particularly important. As shown in Fig. 6, the human-robot interaction system used in the UHWR should allow the operator to intervene at every stage of the underwater welding process. For visual-based 3D reconstruction and weld seam feature extraction using underwater cameras, it should support manually assisted damage detection. For weld repair decision-making on the computing platform, the ability to manually adjust the welding angle, welding speed, welding voltage, and wire feed speed is essential. Finally, as with other underwater robots, for dual-arm robot motion planning and mobile chassis trajectory planning, manual adjustment of the planned path and an emergency stop are the keys to safely completing the welding task.
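One possible way to organize the manual overrides listed above is sketched below: operator adjustments are clamped to safe ranges, and a latched emergency stop rejects any further change. Every class name, field name, and numeric limit here is hypothetical and chosen only for illustration; the source does not specify an interface.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WeldParameters:
    """Parameters the operator may override (hypothetical field names)."""
    torch_angle_deg: float
    travel_speed_mm_s: float
    voltage_v: float
    wire_feed_m_min: float


# Hypothetical safe ranges; real limits depend on the process and equipment.
LIMITS = {
    "torch_angle_deg": (0.0, 90.0),
    "travel_speed_mm_s": (1.0, 20.0),
    "voltage_v": (10.0, 40.0),
    "wire_feed_m_min": (1.0, 15.0),
}


class OperatorConsole:
    """Holds the active parameters and applies operator overrides."""

    def __init__(self, planned: WeldParameters):
        self.active = planned
        self.stopped = False

    def adjust(self, **changes: float) -> WeldParameters:
        """Apply manual changes, clamping each value to its safe range."""
        if self.stopped:
            raise RuntimeError("emergency stop is latched; adjustment rejected")
        clamped = {}
        for name, value in changes.items():
            low, high = LIMITS[name]
            clamped[name] = min(max(value, low), high)
        self.active = replace(self.active, **clamped)
        return self.active

    def emergency_stop(self) -> None:
        """Latch the stop so that no further adjustment is accepted."""
        self.stopped = True


console = OperatorConsole(WeldParameters(45.0, 12.6, 28.0, 8.0))
print(console.adjust(voltage_v=55.0))  # request above the limit is clamped
console.emergency_stop()
```

Clamping rather than rejecting out-of-range requests is a design choice made for this sketch; a real system might instead refuse the request and alert the operator.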

In recent years, many AI-based underwater human-robot interaction methods have been proposed. Jiang et al. (2021) proposed a method to recognize gesture instructions and applied it to fuzzy control of an AUV. Their contribution is a gesture recognition framework for human-computer interaction that includes a gesture detection network and an AUV control algorithm. Zhang et al. (2022) proposed a visual text model for underwater gesture recognition (VT-UHGR). They encoded images of underwater divers into visual features and category text into textual features, and generated visual-textual features through multimodal interaction. Islam et al. (2019) designed and developed two autonomous diver-tracking algorithms. The first algorithm uses spatial and frequency domain features related to human swimming patterns to visually track divers. The second algorithm uses a convolutional neural network-based model to achieve robust detection and tracking. Furthermore, they proposed a gesture-based human-robot communication framework that is syntactically simpler and more computationally efficient than existing grammar-based frameworks. As AI technology continues to develop, underwater human-robot interaction technology is expected to advance further and thereby assist the UHWR.

Fig. 6

The human-robot interaction system applied to UHWR

3.2.2 Welding power and wire feeding system

The UHWR uses a welding power source to supply the energy that melts the wire delivered by the wire feeding system and thereby completes the underwater welding task (Wang et al. 2017, 2020a). Welding power sources directly control welding parameters such as welding voltage and current (Wang et al. 2020a). Traditional welding power supplies use Si-based power devices (Si insulated-gate bipolar transistors (IGBTs), Si metal-oxide-semiconductor field-effect transistors (MOSFETs), Si Schottky barrier diodes (SBDs), and Si fast recovery diodes (FRDs)), whose switching performance is close to the theoretical limit set by the properties of Si, and the inverter frequency can only reach 20 kHz (Colmenares et al. 2014). Third-generation wide-bandgap silicon carbide (SiC) power devices offer higher operating voltage and temperature ranges, higher thermal conductivity, stronger radiation resistance, and better high-frequency characteristics (Alves et al. 2017). Therefore, SiC-based power devices have great potential to improve the overall performance of welding power supplies. The wire feeding system controls the type of wire material and the wire feeding speed, which directly affect the quality of weld seam formation. Terrenoir et al. (2023) studied the relationship between wire feed speed (WFS) and torch speed (TS) in wire arc additive manufacturing (WAAM) to derive the mechanical properties of 316LSi thick parts. Panchenko et al. (2020) developed a high-performance controlled short-circuit metal transfer WAAM process with a wire feed speed of 12 m/min for an Al-Mg-Mn alloy system. Ding et al. (2015) reviewed the quality and accuracy of parts produced by wire-fed additive manufacturing (AM), identified current challenges in wire-fed AM, and pointed out future research directions. In short, process parameters such as welding speed, welding voltage/current, and wire feeding speed fundamentally determine whether the UHWR can complete a weld. Therefore, practitioners should pay close attention to the selection of welding power sources and wire feeding systems.

Underwater welding can be divided by heat source into arc welding and laser welding, which use welding power sources and lasers respectively (Liao et al. 2023; Mei et al. 2023). In terms of process type, underwater local dry welding has great advantages and has become a hot topic in underwater welding in recent years (Wang et al. 2021c; Liao et al. 2023). The underwater local dry arc welding method is the first choice for the UHWR because of its easy operation, high forming quality, and good compatibility with robotic arms. Liao et al. (2022) studied the effects of different pulse peak currents on underwater weld formation. At a peak current of 240 A, the weld formed irregularly and large spatter particles appeared around the weld. When the peak current was increased to 260 A, the weld formation quality improved. At a peak current of 280 A, the weld formed satisfactorily, with less spatter, the largest weld forming coefficient, and the most stable projected transfer mode. As the peak current increased further, the instability of the welding arc led to poor weld formation, serious spatter, and undercut. The performance of the welding power circuit also directly affects the welding results. Wang et al. (2022b) noted that the semiconductor characteristics of Si IGBTs limit circuit performance, while the size and weight of SiC MOSFET modules limit power density. Therefore, they proposed a circuit scheme based on parallel discrete SiC MOSFET devices. For a given optimal output voltage of the welding power source, the welding speed also needs to be matched to its optimal value. Liao et al. (2022, 2023) found that as the welding speed increased from 9.0 mm/s to 16.2 mm/s, the stability of the welding process first improved and then deteriorated. Weldments obtained at 12.6 mm/s showed the best performance. A typical underwater local dry welding system is shown in Fig. 7 (Liao et al. 2022). Although the system does not use an underwater mobile robot, it uses the welding power source and wire feeding system as onshore support equipment.
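One reason welding speed and electrical output must be matched is that together they set the heat delivered per unit length of weld. The sketch below uses the standard arc-welding relation, heat input = efficiency × voltage × current / travel speed, evaluated at the three speeds mentioned above. The voltage, mean current, and arc efficiency are hypothetical placeholders, not values reported by Liao et al.

```python
def heat_input_j_per_mm(voltage_v, current_a, speed_mm_s, efficiency=0.8):
    """Arc heat input per unit weld length, in J/mm.

    Standard relation: efficiency * voltage * current / travel speed.
    The default efficiency is an assumed placeholder.
    """
    if speed_mm_s <= 0:
        raise ValueError("travel speed must be positive")
    return efficiency * voltage_v * current_a / speed_mm_s


# Hypothetical electrical settings, held fixed while the speed varies.
VOLTAGE_V = 28.0
MEAN_CURRENT_A = 200.0

for speed in (9.0, 12.6, 16.2):  # travel speeds from the text, in mm/s
    heat = heat_input_j_per_mm(VOLTAGE_V, MEAN_CURRENT_A, speed)
    print(f"{speed:5.1f} mm/s -> {heat:6.1f} J/mm")
```

With the electrical settings fixed, heat input falls as travel speed rises, which is consistent with an intermediate speed giving the best weld: too slow deposits excess heat, too fast deposits too little.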

Fig. 7

The typical underwater local dry welding system with underwater welding power source and wire feeding system (Liao et al. 2022)

4 Software of UHWR

In addition to the hardware platform, software algorithms are also an important part of the UHWR. As shown in Fig. 8, after the analysis in Sect. 3, the software of the UHWR should at least include the following items: multi-sensor calibration, visual-based 3D reconstruction, weld seam feature extraction, welding repair decision making, dual-arm motion planning, robot trajectory planning, and motion control. Although these are areas that have been studied in robotics for many years, new challenges will be introduced in underwater environments with great resistance and optical refraction.

Fig. 8 The software of UHWR

4.1 Multi-sensor calibration

Because of the particular requirements of the UHWR, binocular cameras and an IMU form its key sensor combination, and underwater multi-sensor calibration is the first step (Chi et al. 2023b). As shown in Fig. 9, the number of academic papers on underwater multi-sensor calibration has increased markedly in recent years, with a total of 219 papers published from 2019 to 2023. As shown in Table 7, we screened 132 papers published in journals and listed the journals with the highest number of publications. The top three contributors are IEEE Transactions on Instrumentation and Measurement (IEEE T INSTRUM MEAS), Sensors, and IEEE Sensors Journal. Most other journals have also started publishing articles on underwater multi-sensor calibration, but each has published only a single article so far.

The calibration accuracy of the inertial measurement unit (IMU) and of the camera intrinsic and extrinsic parameters directly affects the accuracy of underwater pose estimation. Therefore, many scholars have studied underwater multi-sensor calibration. Chi et al. (2023b) proposed an underwater multi-camera-IMU calibration system for welding scenes. They used prediction-detection methods and joint optimization of intrinsic and extrinsic parameters to improve the accuracy and efficiency of calibration. Gu et al. (2021) developed a medium-driven underwater camera calibration method (MedUCC), which can accurately calibrate underwater camera parameters, including the orientation and position of the transparent glass. They used the changes in optical paths caused by refraction between different media to obtain calibration data, and then estimated the initial values of the underwater camera parameters through geometric constraints. Wang et al. (2022c) proposed an underwater structured light vision calibration method that accounts for an unknown refractive index. They developed a checkerboard method based on the Aquila optimizer (AO) to avoid the complex calibration and solution process of the nonlinear underwater structured light vision model. Liu et al. (2023b) developed a method based on an underwater refraction camera model. They used the refraction coplanarity constraint to solve for the initial values of the refraction parameters and the common point constraint to refine these initial values by nonlinear optimization. In underwater calibration, calibration plate detection is of utmost importance. To this end, Wang et al. (2022a) proposed a new algorithm using a deep neural network (DNN), polar constraints, and specially designed equipment to address the problem of unclear underwater images. Beaudoin et al. (2022) proposed three different methods for underwater calibration plate detection: a traditional method, a hybrid method, and a deep learning method. Table 8 compares the accuracy, time cost, and complexity of different methods for underwater calibration checkerboard detection, including traditional methods (Gu et al. 2021; Chi et al. 2023b), refraction model methods (Wang et al. 2022c; Liu et al. 2023b), and deep learning methods (Beaudoin et al. 2022; Wang et al. 2022a). Among them, the deep learning methods offer high accuracy and fast running speed. However, their training takes a long time and requires high-performance computing resources, and the model complexity is relatively high. Research on underwater multi-sensor calibration provides the prerequisites for the development of UHWR, yet most underwater calibration research has chosen traditional methods rather than AI methods. By contrast, in in-air scenes, AI methods were proposed long ago and have been put into practical application (Gordon et al. 2019; Wong and Soatto 2021). This suggests that the difficulty of estimating the pose of an underwater camera is an important obstacle to applying AI methods in underwater multi-sensor calibration.
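To illustrate why refraction matters for the camera models discussed above, the following minimal Python sketch traces a camera ray through a flat air-glass-water interface using the vector form of Snell's law. It is not the model of any cited work. The function names `refract` and `ray_into_water`, the flat port perpendicular to the optical axis, and the refractive indices (1.0, 1.49, 1.333) are assumptions chosen for illustration.

```python
import numpy as np


def refract(d, n, n1, n2):
    """Refract direction d at an interface with normal n (vector Snell's law).

    d  : incident ray direction (any length, normalized internally)
    n  : interface normal pointing back toward the incident medium
    n1 : refractive index of the incident medium
    n2 : refractive index of the transmitting medium
    Returns the unit refracted direction, or None on total internal reflection.
    """
    d = np.asarray(d, dtype=float)
    n = np.asarray(n, dtype=float)
    d = d / np.linalg.norm(d)
    n = n / np.linalg.norm(n)
    ratio = n1 / n2
    cos_i = -np.dot(n, d)
    sin2_t = ratio**2 * (1.0 - cos_i**2)
    if sin2_t > 1.0:
        return None  # total internal reflection, no transmitted ray
    return ratio * d + (ratio * cos_i - np.sqrt(1.0 - sin2_t)) * n


def ray_into_water(d_air, n_air=1.0, n_glass=1.49, n_water=1.333):
    """Trace a camera ray through a flat port: air -> glass -> water.

    Assumption: the port is perpendicular to the optical axis (+z), so both
    interfaces share the normal (0, 0, -1). The index values are illustrative.
    """
    normal = np.array([0.0, 0.0, -1.0])
    d_glass = refract(d_air, normal, n_air, n_glass)
    if d_glass is None:
        return None
    return refract(d_glass, normal, n_glass, n_water)
```

Under these assumptions, a ray leaving the camera at 30° to the optical axis continues in water at about 22°, which is the bending that an in-air pinhole model cannot represent and that refraction-aware calibration has to estimate.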

Fig. 9 The number of papers on underwater multi-sensor calibration published in 2019–2023

Table 7 Contribution of journals on underwater multi-sensor calibration
Table 8 Comparison of accuracy, time cost, and complexity of different underwater checkerboard detection methods (Gu et al. 2021; Beaudoin et al. 2022; Wang et al. 2022a, 2022c; Chi et al. 2023b; Liu et al. 2023b)

4.2 Visual-based 3D reconstruction

In recent years, 3D reconstruction has been widely used in fields such as robot pose estimation and digital twin construction (Wang et al. 2020b). Among the available approaches, vision-based reconstruction can restore the color of objects while constructing a 3D point cloud map, which makes the map more intuitive (Chi et al. 2023a). As shown in Table 9, we studied 64 academic papers published in journals and listed the journals with the top contributions. The top two journals by number of papers on underwater 3D reconstruction are Measurement: Journal of the International Measurement Confederation and IEEE T INSTRUM MEAS. Taken together with Sect. 4.1, this indicates that IEEE T INSTRUM MEAS is a leading venue for both topics. As shown in Fig. 10, 23 of these papers were published in 2023, accounting for about 36% of the output of the past five years. We therefore have reason to believe that underwater 3D reconstruction has become an important direction in computer vision and that considerable progress has been made.

According to existing research, many scholars have tried to overcome the limitations that underwater optical refraction imposes on traditional visual-based 3D reconstruction (Hu et al. 2023). Fan et al. (2021) used underwater laser triangulation as a reference to correct the global surface shape distortion caused by uneven close-range illumination, addressing the illusion errors caused by underwater light refraction and attenuation. They also considered using underwater camera refraction models to eliminate nonlinear refractive distortion. Ou et al. (2023) designed an underwater active vision measurement system based on binocular structured light to achieve high-precision 3D reconstruction. They fused binocular cameras and lasers to address underwater optical attenuation and feature sparseness. They also considered the effects of multiple media and proposed underwater refraction models, including a monocular imaging model, a binocular ranging model, and a binocular polar curve constraint model. Fan et al. (2023) proposed a method for underwater pipeline tracking and 3D reconstruction for underwater vehicles based on structured light vision (SLV). They developed a dual-line laser SLV system and proposed a method that simultaneously obtains the lateral, height, and heading deviations between the underwater vehicle and the pipeline in low-light water environments. They also combined laser stripe image feature points, a refractive underwater SLV model, and Doppler velocity log (DVL) information to achieve underwater pipeline tracking and dense 3D reconstruction. Sanao et al. (2023) used a light propagation model to analyze the convergence of near-field point light sources in water and designed an underwater photometric stereo system to verify the feasibility of the proposed method. Their research helps achieve accurate underwater 3D reconstruction of objects and is suitable for underwater surface microdefect detection. Ubiña et al. (2021) proposed object-based stereo matching to reduce the heavy computation needed to produce depth data at high frame rates while maintaining smoothness. They used convolutional neural networks (CNN) for underwater 3D fish reconstruction. Chen et al. (2022) achieved better 3D reconstruction by improving image quality. They also optimized the transmission of 3D reconstruction results through machine learning methods. Table 10 compares the accuracy, time cost, and complexity of traditional methods (Ou et al. 2023; Sanao et al. 2023) and AI methods (Ubiña et al. 2021; Chen et al. 2022). AI methods, especially those based on CNN, can learn the complex features of underwater scenes from large amounts of training data, thereby capturing more details during reconstruction. Their inference is fast, and they can perform 3D reconstruction in real time or near real time. However, the accuracy of the model depends heavily on the quality and quantity of training data, and the training process is complex and resource-intensive. Judging from current research results, researchers tend to use binocular cameras as sensors for underwater 3D reconstruction, which is consistent with the sensor configuration of UHWR.

Table 9 Contribution of journals on underwater visual-based 3D reconstruction from 2019 to 2023
Fig. 10 The number of journal papers on underwater visual-based 3D reconstruction published in 2019–2023

Table 10 Comparison of accuracy, time cost, and complexity of different underwater 3D reconstruction methods (Ubiña et al. 2021; Chen et al. 2022; Ou et al. 2023; Sanao et al. 2023)

4.3 Weld seam feature extraction

Weld seam feature extraction is one of the important means of making the welding process intelligent and has attracted many scholars. As shown in Table 11, a total of 176 academic papers were published in journals in different fields from 2019 to 2023. The three journals with the highest contribution are the International Journal of Advanced Manufacturing Technology, Measurement: Journal of the International Measurement Confederation, and the Journal of Manufacturing Processes. Generally speaking, scholars mainly publish their research results in the fields of materials science and computer science. Therefore, weld seam feature extraction is a product of cross-disciplinary integration. It is also worth noting that deep learning-based methods have become dominant in this field (Rout et al. 2019). Gao et al. (2021) proposed a position detection and weld gap feature extraction method for a variable-gap fillet weld gas metal arc welding (GMAW) seam tracking system. Exploiting the reflection characteristics of fillet weld stripes, they proposed a variable-gap weld feature extraction method based on column grayscale difference operators. Xiao et al. (2019) proposed an adaptive feature extraction algorithm based on laser vision sensors. They classified typical welds into continuous and discontinuous welds and trained a Faster R-CNN model to automatically identify the weld type and locate the laser stripe ROI. Deng et al. (2023) proposed a weld feature extraction method based on an improved CenterNet object detection model. They used the weld feature points on the laser stripe as the center points of the bounding boxes, eliminating the need for an initial positioning frame. When dealing with multiple welds, they used independent classifiers to predict weld types and avoid false detections. Xiao et al. (2021) proposed a feature extraction algorithm based on an improved Snake model. By carefully redesigning the energy function according to the unique gray value distribution of the laser stripes and optimizing the minimization procedure to accelerate convergence, they achieved good adaptability and robustness to multiple welds. Thanks to this body of research, weld seam feature extraction can already realize weld start and end point detection, weld edge detection, weld width measurement, and determination of the weld path position relative to the robot coordinate system (Rout et al. 2019). Although the underwater environment is complex, once fill-in lighting and underwater image enhancement have been applied, underwater weld feature extraction is essentially the same as on land, which is a major boost to the development of UHWR.
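As a concrete illustration of the classical laser-stripe pipeline that the works above build on, the following Python sketch extracts the stripe centerline column by column with an intensity-weighted centroid and then locates a groove feature point as the largest deviation from a straight-line fit. It is a simplified sketch, not the method of any cited paper. The function names, the threshold `min_intensity`, and the assumption of a single stripe running roughly horizontally across the image are hypothetical choices for illustration.

```python
import numpy as np


def stripe_centerline(img, min_intensity=50.0):
    """Return the sub-pixel stripe row for each image column.

    Each column's center is the intensity-weighted mean row of the pixels at
    or above min_intensity. Columns without such pixels are set to NaN.
    """
    img = np.asarray(img, dtype=float)
    rows = np.arange(img.shape[0])[:, None]
    weights = np.where(img >= min_intensity, img, 0.0)
    total = weights.sum(axis=0)
    centers = np.full(img.shape[1], np.nan)
    valid = total > 0
    centers[valid] = (weights * rows).sum(axis=0)[valid] / total[valid]
    return centers


def groove_feature_column(centers):
    """Return the column where the stripe deviates most from a fitted line.

    A straight line is fitted to the valid centerline points. On a flat plate
    with one groove, the largest residual marks the groove bottom.
    """
    centers = np.asarray(centers, dtype=float)
    cols = np.arange(centers.size)
    ok = ~np.isnan(centers)
    slope, intercept = np.polyfit(cols[ok], centers[ok], 1)
    residual = np.abs(centers - (slope * cols + intercept))
    return int(np.nanargmax(residual))
```

On real underwater images, the sketch would be applied after the image enhancement step mentioned above, since backscatter and uneven lighting can otherwise push non-stripe pixels over the threshold.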

Table 11 Contribution of journals on weld seam feature extraction from 2019 to 2023

4.4 Weld repair decision making

As mentioned in Sect. 3, the welding process is a decision-making process (Wardana et al. 2020). Based on the target characteristics obtained in Sects. 4.2 and 4.3, it is extremely important to select the appropriate welding voltage/current, wire feed speed, welding speed, and welding angle. Post-weld quality assessment is also a main component of decision-making (Liu et al. 2023a).

Because welding is a complex process and detailed optimal parameters are difficult to find, decision-making during the welding process is usually implemented using fuzzy rules (Saluja and Singh 2020). Wardana et al. (2020) proposed a combination of the p-robust technique, fuzzy credibility constrained programming, and data envelopment analysis (DEA) to select welding processes for crusher hammer repair. They used the p-robust technique to control the relative regret of the data and fuzzy credibility to constrain confidence. Saluja and Singh (2020) proposed an improved multi-attribute decision-making (MADM) MULTIMOORA method based on fuzzy concepts, the best-worst method (BWM), and half-quadratic (HQ) theory, which can be applied to welding process selection. Omar et al. (2022) developed a multi-criteria decision-making system named FAQT. The system integrates the Fuzzy-AHP-TOPSIS system with the quality tool QFD, can incorporate other factors and their interdependencies into the problem model, and can be used to solve welding process selection problems at both academic and industrial levels. Decision-making in the welding process remains a difficult problem, and many scholars are trying to combine fuzzy rules with other methods. However, because of the complexity of the problem, deep neural networks have so far not proven capable of handling it.
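The TOPSIS ranking step that appears in Fuzzy-AHP-TOPSIS systems can be sketched in its plain (crisp) form as follows. This is an illustrative sketch rather than the FAQT system of Omar et al. (2022): the fuzzy numbers, the AHP weighting, and the QFD stage are omitted, and the function name `topsis` and its argument layout are assumptions.

```python
import numpy as np


def topsis(scores, weights, benefit):
    """Rank alternatives with crisp TOPSIS.

    scores  : (alternatives x criteria) matrix of raw criterion values
    weights : criterion weights (normalized internally to sum to 1)
    benefit : per criterion, True if larger is better, False if smaller is
    Returns the closeness coefficient of each alternative in [0, 1], where
    higher is better. Assumes the alternatives are not all identical.
    """
    X = np.asarray(scores, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    benefit = np.asarray(benefit, dtype=bool)

    # Vector-normalize each criterion column, then apply the weights.
    V = w * X / np.linalg.norm(X, axis=0)

    # Ideal and anti-ideal points depend on the direction of each criterion.
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))

    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)
    return d_neg / (d_pos + d_neg)
```

For example, three hypothetical candidate processes could be scored on weld quality (a benefit criterion) and on cost and setup time (cost criteria); the process with the highest closeness coefficient would be selected. In a fuzzy variant, the crisp scores and weights would be replaced by fuzzy numbers derived from expert judgments.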

In terms of post-weld quality inspection, AI algorithms have become the mainstream method. Asif et al. (2022) introduced acoustic emission (AE) as a real-time monitoring method for gas metal arc welding. Their AE system covers a wide frequency range from 5 to 400 kHz and records welding parameters simultaneously. Several features extracted from the AE signals and welding parameters are then fed into a machine learning algorithm to assess the weld. Gyasi et al. (2019) developed a commercial adaptive smart welding system prototype with integrated prediction and control of welding quality attributes. They discussed, as case studies, preliminary results from experimental work employing infrared thermography (IRT)-based devices and AI systems. He et al. (2021) studied a Bayesian network model (BNM) to implement autonomous decision-making on the welding position for gas metal arc multi-pass welding of T-joints in automated manufacturing. They used a laser vision sensor to perform profile analysis on the weld and adopted a novel scheme based on the scale-invariant feature transform and directional feature detection to extract the weld profile. Both machine learning and deep learning methods have thus been used for post-weld quality assessment. Real-time assessment systems can significantly reduce the consumption of human, financial, and material resources. Table 12 shows the advantages and disadvantages of machine learning and deep learning methods in post-weld quality inspection (Gyasi et al. 2019; He et al. 2021; Asif et al. 2022). Machine learning-based methods excel in flexibility and scalability but may be limited by data quality and algorithm complexity. Deep learning occupies an important position in post-weld quality inspection owing to its high accuracy, adaptability, and robustness, but it places higher demands on data, computing resources, and model complexity.

Table 12 The advantages and disadvantages of machine learning and deep learning methods in post-weld quality inspection (Gyasi et al. 2019; He et al. 2021; Asif et al. 2022)

4.5 Robot trajectory planning and motion control

Robot trajectory planning and motion control are core topics in robotics and have been studied for decades. Because robot structures differ, planning methods differ as well. Following the analysis in Sect. 3.1.3, we focus here only on the trajectory planning and control algorithms of wheeled mobile robots. Because of the large number of researchers in this area, many review papers have been published, as shown in Table 13. These reviews show that robot motion planning methods mainly include traditional planning algorithms, machine learning algorithms, optimal value reinforcement learning, and policy gradient reinforcement learning (Zhang et al. 2020; Tagliavini et al. 2022; Zhou et al. 2022). The last three belong to the category of AI algorithms and have been research hot spots in recent years. Yang et al. (2022) proposed an N-step priority double DQN (NPDDQN) path planning algorithm, which effectively achieved obstacle avoidance in complex environments. They verified the superior performance of reinforcement learning over various traditional methods in three-dimensional underwater path planning. Chu et al. (2022) proposed a deep reinforcement learning (DRL) path planning method based on the double deep Q-network (DDQN). The method is built on an improved convolutional neural network with two input layers to handle high-dimensional environments. Hadi et al. (2022) proposed an AUV adaptive motion planning and obstacle avoidance technique based on DRL. The study adopted the twin delayed deep deterministic policy gradient (TD3) algorithm, which is suitable for Markov decision processes with continuous actions. Table 14 compares the four methods in terms of accuracy, time cost, and robustness (Saeedvand et al. 2019; Fragapane et al. 2021; Fang et al. 2022; Yang et al. 2022). Traditional methods and machine learning are scenario-dependent in terms of accuracy and robustness and perform poorly in unstructured scenes. DRL-based robot trajectory planning methods achieve higher performance but have seen fewer practical applications. Although traditional methods perform slightly worse than AI-based methods, they work well in practice and have been extensively tested. Using AI algorithms to optimize traditional methods may be the key to the rapid development of UHWR.
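The core idea shared by the double DQN variants mentioned above is the way the learning target is formed: the online network selects the next action and a separate target network evaluates it, which reduces the overestimation of plain Q-learning. The following Python sketch shows only that target computation on arrays of Q-values. It is not the implementation of any cited work: the networks, the replay buffer, and any n-step or priority extensions are omitted, and the function and argument names are assumptions.

```python
import numpy as np


def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute one-step double DQN targets for a batch of transitions.

    rewards       : (batch,) immediate rewards
    dones         : (batch,) True where the episode ended at this transition
    q_online_next : (batch, actions) online-network Q-values of next states
    q_target_next : (batch, actions) target-network Q-values of next states
    gamma         : discount factor
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=bool)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    # Action selection uses the online network ...
    best_actions = np.argmax(q_online_next, axis=1)
    # ... while the value of that action comes from the target network.
    evaluated = q_target_next[np.arange(best_actions.size), best_actions]

    # Terminal transitions do not bootstrap from the next state.
    return rewards + gamma * np.where(dones, 0.0, evaluated)
```

In a path planning setting, the actions would be discrete motion commands, and the reward would typically penalize collisions and reward progress toward the goal; those design choices are problem-specific and are not shown here.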

Table 13 Review papers on mobile wheeled robot trajectory planning and motion control from 2019 to 2023
Table 14 Comparison of accuracy, time cost, and robustness of different underwater trajectory planning methods (Saeedvand et al. 2019; Fragapane et al. 2021; Fang et al. 2022; Yang et al. 2022)

4.6 Dual-arm robot motion planning

The motion planning of the dual-arm robot is an important step for the UHWR to complete the welding task autonomously. This type of planning usually takes the feature extraction results of Sect. 4.3 as input and uses the welding speed and welding angle determined in Sect. 4.4 as constraints to plan the trajectory of the dual-arm robot (Abbas et al. 2023). As shown in Fig. 11, scholars have published a total of 262 academic papers in the past five years, and the number of papers published each year is stable. As shown in Table 15, a total of 132 articles were published in journals, and the top three journals by number of publications are IEEE Robotics and Automation Letters, Applied Sciences (Switzerland), and IEEE Transactions on Automation Science and Engineering. In terms of journal type, such articles are mainly published in journals in the fields of robotics and automation.

From the perspective of planning methods, scholars have used a variety of approaches to solve such planning problems. Zhang and Jia (2023) developed a three-criteria coordinated motion planning scheme for a redundant dual-arm robot at the velocity level to solve the problems of joint angle drift and motion singularities of the redundant manipulator. They exploited multiple indicators and physical constraints to establish the optimization equation and solved it as a constrained convex quadratic programming problem. Zhang et al. (2023) developed a multi-agent path planning reinforcement learning algorithm that integrates an experience replay strategy, shortest path constraints, and policy gradient methods. They adopted a mechanism called “reward cooperation, punishment competition” during training to improve the cooperation of the dual-arm robot. Sun et al. (2022b) proposed a hybrid control method for a dual-arm mobile robot. They used finite state machines to design a hierarchical task planner that enables the robotic system to process events for active visual perception. They also developed a motion planning method based on the rapidly exploring random tree (RRT) to achieve a master–slave coordination mechanism for the dual-arm robot while avoiding obstacles. Yu et al. (2020) proposed a novel dual-arm manipulation path planning framework based on coordinated RRT*. They sampled from a separated inertial space when running the RRT* algorithm and used quartic splines to smooth the resulting path. Given the high dimensionality of robot motion patterns, Ying et al. (2021) proposed a bidirectional rapidly exploring random tree method integrated with long short-term memory (LSTM-BiRRT) to improve the effectiveness and efficiency of the planning process. Tang et al. (2022) used a deep reinforcement learning algorithm to study the trajectory planning of the two arms of a dual-arm robot in complex environments. Wong et al. (2021) designed a motion planning method based on Soft Actor-Critic (SAC) for a dual-arm robot, which enables the robot to effectively avoid self-collision while avoiding the joint limits and singularities of the arms. They used dual-agent training, a distributed training structure, and a progressive training environment to train the neural networks. Table 16 compares the accuracy, time cost, and complexity of dual-arm motion planning algorithms based on RRT* with those of AI-based algorithms. Where high accuracy and adaptability to complex environments are required, AI-based algorithms may have more advantages; where fast response and lower computing costs are required, the RRT* algorithm may be more suitable. Owing to the particularity of the robotic arm and the high performance of the RRT algorithm, many studies have attempted to combine the RRT algorithm with AI algorithms to seek more efficient motion planning solutions for dual-arm robots. This is also an important solution for the dual-arm robot motion planning of UHWR.
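For readers unfamiliar with the RRT family referred to above, the following Python sketch shows a basic RRT with goal biasing for a point moving in a 2D plane with circular obstacles. It is deliberately minimal and is not the planner of any cited work: a dual-arm robot would be planned in a much higher-dimensional joint space with arm-to-arm collision checks, and RRT* would additionally rewire the tree to shorten paths. The function name `rrt`, its parameters, and the 10% goal bias are assumptions.

```python
import math
import random


def rrt(start, goal, obstacles, bounds, step=0.5, goal_tol=0.5,
        max_iters=5000, seed=0):
    """Grow a rapidly exploring random tree from start toward goal.

    obstacles : list of (cx, cy, radius) circles to avoid
    bounds    : (xmin, xmax, ymin, ymax) sampling region
    Returns a list of (x, y) points from start to near goal, or None.
    Simplification: only new nodes are collision-checked, not the edges,
    so step should be small compared with the obstacle size.
    """
    rng = random.Random(seed)
    nodes = [tuple(start)]
    parent = {0: None}

    def collision_free(p):
        return all(math.dist(p, (cx, cy)) > r for cx, cy, r in obstacles)

    for _ in range(max_iters):
        # With 10% probability sample the goal itself (goal biasing).
        if rng.random() < 0.1:
            target = tuple(goal)
        else:
            target = (rng.uniform(bounds[0], bounds[1]),
                      rng.uniform(bounds[2], bounds[3]))

        # Find the tree node nearest to the sampled target.
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], target))
        near = nodes[i_near]
        d = math.dist(near, target)
        if d == 0.0:
            continue

        # Steer from the nearest node toward the target by at most one step.
        s = min(step, d)
        new = (near[0] + s * (target[0] - near[0]) / d,
               near[1] + s * (target[1] - near[1]) / d)
        if not collision_free(new):
            continue

        nodes.append(new)
        parent[len(nodes) - 1] = i_near

        # Stop once the tree reaches the goal region and backtrack the path.
        if math.dist(new, goal) <= goal_tol:
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None
```

Hybrid approaches of the kind surveyed above typically keep this sampling-and-extension loop and replace the uniform sampler or the steering step with a learned component.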

Table 15 Contribution of journals on dual-arm motion planning from 2019 to 2023
Fig. 11 The number of papers on dual-arm motion planning published in 2019–2023

Table 16 Comparison of accuracy, time cost, and complexity of different dual-arm robot motion planning methods (Saeedvand et al. 2019; Fragapane et al. 2021; Fang et al. 2022; Yang et al. 2022)

5 Conclusion

This paper represents the first comprehensive review on UHWR, where we introduced and extensively discussed the concept of UHWR, focusing on its hardware and software aspects. By reviewing the applications of traditional algorithms and AI algorithms in key UHWR technologies, we explored the potential applications, challenges, and future trends of AI in UHWR.

AI has significantly impacted various domains of UHWR’s key technologies, yet several challenges remain before its practical application:

  • In terms of structural design, current underwater robots predominantly utilize buoyant Remotely Operated Vehicles (ROVs), which exhibit poor positioning accuracy and are susceptible to fluctuations in ocean currents, thereby hindering their ability to perform high-precision underwater welding operations. Consequently, there is an urgent need for a new generation of UHWR to overcome these limitations. Essential underwater equipment for UHWR includes underwater cameras, dual-arm robots, mobile chassis, and computing platforms, while onshore equipment such as human-computer interaction systems, welding power supplies, and wire feeding systems are crucial for ensuring successful welding operations by UHWR.

  • AI technology holds substantial potential in enhancing underwater camera image processing algorithms and conducting model-based reasoning on computational platforms. The advancement of decision-making AI algorithms plays a critical role in promoting collaborative scheduling. However, the complexity of the welding process necessitates considerations of self-positioning amidst the intricate underwater environment, target feature extraction, motion control, and other factors. Existing intelligent decision-making algorithms struggle to achieve real-time decision-making capabilities for complex welding tasks.

  • Multi-sensor calibration, vision-based 3D reconstruction, weld feature extraction, robot trajectory planning, and dual-arm robot motion planning constitute the core algorithms for achieving autonomous motion in UHWR. While current AI algorithms play pivotal roles in these algorithms, most consume extensive computational resources. The limited computing power of underwater platforms poses challenges in supporting real-time reasoning across multiple models, thereby significantly impeding the development of UHWR.

In response to these challenges, this paper presents future trends in AI for the development of the next generation of UHWR:

  • Multi-functional Models: Integration of diverse functional models such as underwater image enhancement, 3D reconstruction, weld feature extraction, and trajectory planning into unified models using techniques like online learning and incremental learning. This integration reduces the computational demands on UHWR’s computing platforms.

  • Compact Edge Computing Models: Optimization of algorithms and model structures to achieve smaller and more efficient models. This minimizes data transmission latency and bandwidth consumption, thereby enhancing the reliability and stability of UHWR systems.

  • Large Model Inference and Decision-Making: Advancement of AI capabilities with the proliferation of data and continual algorithmic improvements, enabling large models to handle more complex and diversified tasks. With improved computational capabilities and optimized algorithms, large AI models will facilitate faster adaptation to decision-making tasks in underwater welding operations, thereby enhancing decision-making efficiency.

This paper synthesizes insights from AI applications across various underwater domains to delineate current challenges and future developmental trends in UHWR. It is noteworthy that UHWR represents a nascent technology, and revolutionary advances in any related field will inevitably influence its future trajectory. This inherent limitation underscores the necessity for future verification through practical UHWR implementations.