1 Article highlights

  1. Development of power line inspection technology based on UAV image vision is reviewed.

  2. The UAV power line detection platform and its system structure are described in detail.

  3. The fundamental process and development of power line image recognition with deep learning are discussed.

2 Introduction

Transmission lines play a vital role in the power system, and their smooth operation is key to keeping the power supply system running. Because long-distance high-voltage transmission lines are exposed to the outdoors, they must withstand wind and sun as well as birds and other activity on the line, so their safe operation cannot be taken for granted. The detection and maintenance of transmission lines is therefore a key issue for the power transmission sector. Regular inspection of the line, manually or with automatic equipment, makes it possible to evaluate the operating status of the line and to find faults and hidden dangers in time, ensuring the safe and effective operation of the power line and a stable power supply for production and daily life.

There are many detection methods for transmission lines, including traditional manual line inspection, line inspection robots, and aircraft inspection. These methods approach transmission line inspection from different aspects. Ground line detection generally relies on inspection staff who walk the route and track and observe the line with the naked eye or with detection equipment. The method is highly adaptable, and experienced inspectors can discover various faults and hidden dangers through field observation. However, its efficiency is relatively low and its demands on personnel are relatively high. It is also prone to missed and false detections caused by fatigue and negligence. Therefore, as science and technology progress, it needs to be improved and replaced.

In the line patrol robot approach, a specially designed detection robot moves along the transmission line. As it moves, it inspects the line, and real-time or follow-up analysis of the detection results is used to find faults and hidden dangers and to check the safety and reliability of the line. Compared with manual ground detection, this approach has a higher level of automation and can get close to various power facilities for detailed detection. However, crossing obstacles and fault positions on the line remains technically difficult for the robot, and its climbing and movement cause a certain degree of wear and tear to the line itself. Therefore, this solution can only be used for some specific lines.

Aircraft-based power transmission line inspection uses helicopters, drones and other types of aircraft to inspect the line in flight. The aircraft flies along the line according to a set plan and parameters. On the way, the line is inspected with visible light, infrared light, etc., and the collected data are transmitted and processed to complete the inspection. The aircraft line patrol solution has the advantages of fast detection speed, comprehensive detection data, and a high degree of automation. It overcomes the poor adaptability of traditional ground detection to terrain and avoids the line wear caused by climbing line patrol robots. Moreover, because fixed-wing aircraft, helicopters, and lighter-than-air craft such as balloons and airships can all be used, the line can be inspected from different angles and positions at different speeds, so the detection effect is much better than that of traditional methods. However, general aircraft are relatively expensive. Only by using these aircraft and related equipment efficiently can the detection cost be effectively controlled.

With the development of unmanned aerial vehicle (UAV) technology, there is a brand new option for power transmission line inspection [1, 2]. UAV inspection is relatively low in cost, flexible in use, and expandable. With the corresponding inspection equipment installed and operated by remote control, most of the power lines currently erected can be inspected. In addition, since neither manual inspection along the line nor piloted aircraft are needed, UAV line inspection greatly enhances work safety and protects the lives of maintenance personnel in the power industry. UAV detection not only inherits the advantages of general aircraft detection, but also overcomes the high cost and inconvenience of manned aircraft detection, and moves transmission line inspection toward full automation, which makes it a technology with a bright future. With the further advancement of power grid construction and transformation, UAV line detection has increasingly shown great scientific research value and application value. To ensure the safety and stability of power supply in today's huge and complex power grid, UAV detection has become an urgent need for the development of the power system.

Campoy and Mejias et al. conducted in-depth research on the navigation problem in UAV line inspection [3]. Based on GPS technology combined with computer vision analysis, they improved the positioning and tracking of the UAV during transmission line inspection, which makes fault detection and location more accurate. This research also helps the drone perform automatic obstacle avoidance and safe landing, so that it can fly closer to the line when taking pictures and collect line operation information from different angles, improving the detection effect. The GSIRO Research Institute in Australia has designed the T21 line-following unmanned aerial vehicle, which is capable of fully automatic line-following [4]. The aircraft is powered by a small gas turbine to ensure fast and long-range operation of the UAV platform, and its base balance system ensures clear image capture. In addition, because the platform is equipped with a laser ranging system, it performs well in positioning and obstacle avoidance, ensuring the safety of the detection platform. The platform has outstanding detection performance and represents the current world advanced level. Jones and Golightly et al. developed an unmanned helicopter for power line inspection [5]. It is shaped as a ducted fan to improve wind resistance and reduce noise during flight. In addition, a protective layer on the surface of the wing enhances the safety of the airframe. The drone can draw its energy from the power line, which ensures its endurance and allows it to fly close to the power line, eliminating the cumbersome occupation of air routes and the need for flight applications.

The Shandong Research Institute of State Grid Corporation of China has also carried out related research and achieved fruitful results [6]. In 2009, the institute took the lead in proposing a UAV intelligent line patrol system in China. The system consists of a UAV line inspection part and a detection-processing part, and the two parts work together to keep the system operating efficiently. In this system, the unmanned helicopter platform mainly controls the flight along a given route, and the on-board detection and processing module performs real-time detection and monitoring of the route during the flight. The detection and processing module has the following innovative features. First, the system can inspect the line in different ways, such as visible light and ultraviolet light, which greatly increases the chance of finding faults during inspection. Second, the system can transmit detection video in real time for monitoring by terminal monitoring station personnel and computer systems. Third, the system has made significant progress in infrared fault detection: by comparing infrared and visible-light images, it can find faults and hidden dangers that cannot be detected by conventional means. Shaoxing Electric Power Bureau of China has also explored this technology [7], and achieved a high degree of integration of the UAV line inspection system after a short period of research, development and debugging. The system combines the functions of line inspection by carrying various detection, processing and transmission equipment on the UAV, and accomplishes line target tracking and capture, real-time detection of hidden dangers, and long-term data analysis. Technological innovations such as image-based line recognition and power facility ranging have greatly expanded the functions of helicopter line inspection and moved UAV line inspection from theoretical research to practical application. In the field of practical application, China Hubei Electric Power Company put a UAV line inspection system into actual maintenance work in 2011 [8]. During winter snow disasters, UAV line inspection provided excellent technical support for power supply and helped guarantee electricity safety. In 2012, Qinghai Province of China applied drone line inspection to power lines in a snow-capped mountain area [9] and conducted targeted inspections of insulators along the line, which marked the extension of UAV power line detection from ordinary line maintenance to harsh high-altitude environments. UAVs replace manpower in difficult and dangerous tasks, which greatly eases the work pressure of manual line inspection.

This paper systematically summarizes the development of power line recognition technology based on UAV images. In Sect. 3, we give a comprehensive discussion of the working mode of the UAV platform. We first clarify the types of UAVs, and then discuss the structure of the UAV power line detection system and the role played by each component. In Sect. 4, we review the development of power line image processing technology. We first present the basic methods of power line image processing, and then discuss advanced power line recognition technology based on deep learning. Section 5 is the summary of this review.

3 UAV inspection system

3.1 Types of UAVs

UAV-based transmission line inspection technology is developing rapidly [10,11,12,13]. UAVs used for power line inspection can be divided into fixed-wing UAVs, rotary-wing UAVs, unmanned helicopters, and vertical take-off and landing fixed-wing UAVs [14]. A representative fixed-wing UAV is shown in Fig. 1a; its flying speed can exceed 100 km/h. Its aerodynamic structure makes it consume the least power during flight and gives it the longest cruising range, so fixed-wing UAVs can perform long-distance line detection. Because of the high flight speed, a fixed-wing UAV cannot analyze each part of the line in detail and can only take separate shots for coarse detection. Fixed-wing UAVs generally work like military reconnaissance aircraft, passing over the power line and taking pictures to assess its general condition. In addition, a fixed-wing aircraft needs a certain initial speed to take off. The traditional way is to take off from a runway, but because power inspection is strongly constrained by terrain, the commonly used methods are hand-thrown or catapult take-off and parachute landing. A rotary-wing UAV is shown in Fig. 1b. It has a simple structure, low cost, and convenient operation. It is generally controlled in real time by radio and can hover at a fixed point in the air for shooting. This kind of UAV generally flies slowly and lacks endurance, so it can only be used for inspection of specific parts. As shown in Fig. 1c, compared with fixed-wing aircraft, unmanned helicopters have better maneuverability in the air, can take off and land in short distances, and can hover at key detection locations for sampling. However, unmanned helicopters lack endurance and are somewhat slower than fixed-wing UAVs, so they are only suitable for short-distance line patrol. Helicopters can carry a larger load than rotary-wing UAVs during inspection and can be equipped with special inspection pods to obtain more comprehensive inspection information, but they are also more expensive. A vertical take-off and landing fixed-wing UAV is shown in Fig. 1d. It combines the easy take-off and landing of the rotorcraft with the long flight time of the fixed-wing UAV, and is increasingly favored for power inspection. Vertical take-off and landing fixed-wing UAVs have two flight states, a rotary-wing state and a fixed-wing state. Under the action of the automatic control system, the two states can be switched smoothly. However, the working mode, namely line patrol, still uses the fixed-wing state, and the rotary-wing state is used only for take-off and landing.

Fig. 1 UAVs used for power line inspection: (a) fixed-wing UAV, (b) rotary-wing UAV, (c) unmanned helicopter, and (d) vertical take-off and landing fixed-wing UAV

3.2 Structure and system of UAV inspection platform

UAV is a general term for unmanned aerial vehicles that are controlled by wireless signals or by preset programs. Because UAVs can work efficiently and conveniently at high altitude and over long range, they have been widely used in military, surveying and mapping, photography, monitoring and other fields. UAVs are simple to control, operate stably, and are more economical and less affected by climate, so they are widely used as operating platforms. Structurally, a UAV system can be divided into the flight control system, power system, energy system, task load, communication system and ground part [15,16,17,18,19].

  • Flight control system The flight control system is the core that keeps the drone flying and lets it complete its tasks. It controls the flight status and flight mode of the drone. Depending on the control mode, the flight control module works in manual mode or automatic mode. The flight control system usually receives control signals from the communication module and adjusts the flight status of the UAV according to the flight settings. In automatic mode, the drone must detect and handle some emergencies during flight by itself. In manual mode, the aircraft does not need to make decisions itself and only has to fly stably according to the signals sent from the ground.

  • Power system The drone can be driven by an electric motor or an internal combustion engine. The internal combustion engine is powerful, but it is generally noisy, its power system is relatively complex, and it often struggles to deliver its rated performance at high altitude where the air is thin. Electric drive is relatively quiet, structurally simple and light, adapts well to temperature and air pressure, and can share its power supply with other parts.

  • Energy system The energy system of the UAV provides energy for the power system and for the on-board monitoring and control equipment. In an electrically driven UAV, the main component of the energy system is the battery, which can be a nickel-metal hydride or nickel-cadmium battery. The output power should meet the requirements of the equipment on the aircraft, and the stored energy should be sufficient for the line patrol drone to complete a one-way or round-trip flight along the planned route.

  • Task load equipment The task load is the equipment the UAV must carry to complete the line inspection flight. In power line inspection, it generally includes power line shooting equipment and information processing equipment. The task load is the key to completing the line inspection.

  • Communication system The communication system of the UAV is the way for the UAV to keep in touch with the ground station and ensure that the UAV is controlled in real time. It is also the transmission channel for the detection information of the transmission line. The communication system shall meet the corresponding bandwidth requirements and transmit information according to the communication protocol set between the UAV and the ground.

  • Ground station It is difficult for UAVs to fly with relatively heavy loads, so heavy equipment is set up on the ground. Detection information is transmitted to the ground for subsequent processing, where computer equipment is used for monitoring and for in-depth mining of the detection features. The control signals during flight are also sent by the ground station, which allows human intervention in difficult situations so that the UAV can complete its task better.

According to where its parts operate, the UAV line inspection system can be divided into two parts: the on-board system and the ground station system. The overall structure of the system is shown in Fig. 2. The on-board system is mainly responsible for collecting flight and line information, while the ground station system receives the information sent by the UAV and then performs processing, control and computation. Since the ground station system is not carried by the UAV, it has almost no weight restrictions and can complete complex tasks with computers and control equipment, including flight control, route selection, data transmission and reception, and image processing.

Fig. 2 Structure and system of UAV inspection platform

The on-board system contains the UAV and various functional modules. The UAV module includes the drone body and the flight control system. The airframe consists of wings and a fuselage. The wings provide lift to the entire on-board system, enabling it to fly at the height of the power lines. The fuselage is the hub that connects the various parts of the aircraft and carries the functional modules. The flight control system governs the normal flight of the drone during detection. It must control the whole flight process to keep it stable and to hold the drone at a set speed, height and angle. During a general inspection, the flight control system needs to meet the following requirements: collecting flight parameters, stabilizing the flight altitude and angle, resisting changes in airflow, changing the flight state according to control instructions, and handling emergencies. The detection system is distributed between the aircraft and the ground, and the two sides are connected by wireless communication. The detection system uses an established algorithm to photograph the power equipment along the line, the images are transmitted to the ground terminal through the data management system, and the ground terminal completes the final assessment of the line running state. The wireless communication system works in two directions, uplink and downlink. During operation, separate channels are established for data and for images so that both information flows can be transmitted simultaneously. The transmission is usually carried out by radio, and the information is sent to the ground terminal through the set communication protocol for subsequent processing and analysis. The data management system spans the on-board system and the ground system, and completes the extraction, identification and storage of transmission line detection information through data communication between the two sides and separate processing on each side. The airborne data management system uses an established algorithm to extract, compress, and store the information in the visible light or infrared images captured by the detection system, and selects the fault and hidden danger information that is worthy of further analysis. These data, together with the currently collected original image information, are sent to the ground station, where a computer with stronger computing power conducts in-depth analysis and extracts the useful information.

4 Image recognition

When drones are used for line inspection, the amount of transmission line image and video data collected is very large, and it is impractical for staff to judge manually from the images whether the transmission line is normal. Computer vision and intelligent algorithms such as image processing are needed to realize the inspection of transmission lines, and the identification of transmission line components is an important part of intelligent inspection.

4.1 Image edge detection

The power line image has the following characteristics [20]. (1) In general, the background of the power line image is natural scenery such as forests, grasslands, and rivers, and the background texture is very complex. (2) The main components of the power line, such as shock-proof hammers and insulators, are generally attached to the transmission lines. (3) In the inspection image formed by helicopter inspection, the transmission line is the most common straight line in the image. (4) The insulator has rich texture features and periodicity. (5) The contrast between the target area and the background differs for different parts of the transmission line, but in general the contrast is not high and the target is not strongly distinguished from the background. Image edge detection is an important way to identify power transmission line components. The edge of the image is the pixel area where the grayscale and texture of the image change drastically. An edge marks the end of one area and the beginning of another, reflecting the contour of the object. The above characteristics make edge detection for power line images different from traditional image edge detection, and the detection algorithms differ accordingly. The extracted edge information can be continuous or discontinuous. According to whether the extracted edge is closed, edge detection can be divided into closed edge detection and non-closed edge detection. The advantage of closed edge detection algorithms is that closed boundaries are always obtained regardless of image quality. The basic idea is to treat edge detection mathematically: a contour line is defined on the image and is gradually moved toward the edge by minimizing an energy function.

Non-closed edge detection can be divided into the following four types. The first is edge detection by differential operators. In a grayscale image, the gray value jumps at an edge, so the first derivative has an extreme value there and the second derivative crosses zero. The classic operators based on this theory include the Sobel, Roberts, Prewitt, and Laplacian operators. This method is fast, but its edge positioning accuracy is not high. The second is edge detection by optimal operators. The optimal operator is proposed on the basis of the differential operator and detects the edge by optimizing the signal-to-noise ratio. The commonly used optimal operators are the LOG operator and the Canny operator. Their advantage is good noise resistance, and their disadvantage is a large amount of calculation. The third is edge detection by fitting. The basic idea is to use a linear combination of a set of basis functions to perform a least squares fit on a local area of the image, and then to use the fitting parameters to obtain the edge detection result. The choice of basis functions is very important, and the computational cost of the fitting method is also very large. The fourth is edge detection based on emerging mathematical theories. Typical representatives are edge detection algorithms based on wavelet transform, artificial neural networks, fractal theory, genetic algorithms, and support vector machines. These algorithms perform differently on different images. For example, edge detection based on fractal theory works well for artificial objects in natural backgrounds, while in other applications its results are unsatisfactory.
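The behavior of the differential operators described above can be checked on a synthetic step edge. The following is a minimal sketch, assuming NumPy only; the helper `filter2d`, the test image, and the edge-replicated border are illustrative choices, not taken from the reviewed methods. It shows the Sobel gradient magnitude peaking at the edge and the Laplacian changing sign across it.

```python
import numpy as np

def filter2d(image, kernel):
    """Slide a small kernel over the image (cross-correlation, replicated border)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros(image.shape, dtype=float)
    rows, cols = image.shape
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    return out

# Classic first-derivative (Sobel) and second-derivative (Laplacian) kernels.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def sobel_magnitude(image):
    """Gradient magnitude from the horizontal and vertical Sobel responses."""
    gx = filter2d(image, SOBEL_X)
    gy = filter2d(image, SOBEL_Y)
    return np.hypot(gx, gy)

# Hypothetical test image: dark left half, bright right half (vertical step edge).
step = np.zeros((5, 8))
step[:, 4:] = 100.0

magnitude = sobel_magnitude(step)      # large only in the two columns at the step
laplacian = filter2d(step, LAPLACIAN)  # positive on one side of the step, negative on the other
```

On a real inspection image the same two responses would be thresholded (first derivative) or scanned for sign changes (second derivative) to mark edge pixels.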

Image recognition technology processes the original image to obtain its results, and generally places high requirements on the original image. At present, the original image of the transmission line is usually a monitoring image captured by a camera device. Owing to the line itself, the camera angle and the weather, identifying the transmission line in a monitoring image has the following difficulties. (1) The line has an obvious linear shape but a very small width in the picture, which makes identification difficult. (2) The appearance depends on the angle of view; from some shooting angles (such as looking down from the top of the transmission line) several lines may appear as a single straight line. (3) The transmission line has a large span, generally crosses the whole picture, and cannot be completely displayed in one picture. (4) The multiple lines of the same tower are approximately parallel and do not intersect, but from certain angles they may overlap and appear to intersect. (5) The background of overhead line monitoring is generally complex and is strongly affected by external conditions such as weather; rain and snow may blur and distort the pictures.

In view of the above situation, as shown in Fig. 3, the following procedure can generally be used to optimize the identification of transmission lines [21,22,23,24]. (1) Grayscale processing: the original image is converted to grayscale, which removes the color information, reduces the storage required for the image data, and makes the contour of the object easier to detect. (2) Filter enhancement processing: the grayscale image is first filtered to reduce the noise introduced by the initial imaging, transmission and grayscale processing. Enhancement is then applied to emphasize the target transmission line, so that the feature difference between the line and the background becomes more obvious and the success rate of image recognition improves. (3) Edge contour extraction and filtering: the features enhanced in the previous step are used to identify the contour of the target in the image, and the identified contour is then filtered to reduce artifacts introduced by the algorithm, giving a clean contour image of the transmission line. (4) Marking the transmission lines: the contour of the transmission lines obtained in the previous step is marked on the original image, and the marked original image is displayed.

Fig. 3 Flow diagram of image edge detection

Now in conjunction with a specific example, each step in Fig. 3 is demonstrated as follows.

4.1.1 Grayscale processing

After the original image of the transmission lines is obtained, it can be converted to a gray-scale image using any existing method. Figure 4a shows the grayscale version of an original image. As can be seen from this figure, the transmission line is very narrow and is not fully displayed. In addition, owing to factors such as weather and environment, the features of the transmission lines in the image are not very clear, and against some backgrounds the human eye cannot distinguish them well.
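The text does not name a specific conversion, so the following is only a minimal sketch of one common choice: a weighted sum of the R, G and B channels using the ITU-R BT.601 luminance weights. The weights, the function name `to_grayscale`, and the tiny test array are assumptions made for illustration.

```python
import numpy as np

# Assumed luminance weights (ITU-R BT.601); the source does not specify a method.
BT601_WEIGHTS = np.array([0.299, 0.587, 0.114])

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB array (values 0..255) to an H x W uint8 gray image."""
    gray = rgb.astype(float) @ BT601_WEIGHTS  # weighted sum over the channel axis
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# Hypothetical 2 x 2 image: white, black, pure red, pure blue.
rgb = np.array([[[255, 255, 255], [0, 0, 0]],
                [[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
gray = to_grayscale(rgb)
```

The result keeps one value per pixel instead of three, which is the storage reduction mentioned in the grayscale processing step.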

Fig. 4 (a) Grayscale image of the original image, (b) filtered image, (c) enhanced image, (d) edge contour map of the enhanced image, (e) filtered edge contour map

4.1.2 Filter enhancement processing

There are many ways to filter a grayscale image, and they fall into two categories: linear filtering and nonlinear filtering. Linear filtering is more commonly used because it is simple and effective. However, while removing noise it also destroys edge information and loses image information, so it is not well suited to edge recognition of transmission lines. Nonlinear filtering greatly alleviates this problem, so many papers choose the median filtering algorithm, a nonlinear filter. While excluding extreme singular points in the image, it preserves the step edges and topological structure of the image. The filtered gray value is computed over a window around each pixel, as written in Eq. 1:

$${\text{g}}(x,y) = \frac{{\sum\nolimits_{s = - a}^{a} {\sum\nolimits_{t = - b}^{b} {w(s,t)f(x + s,y + t)} } }}{{\sum\nolimits_{s = - a}^{a} {\sum\nolimits_{t = - b}^{b} {w(s,t)} } }},$$
(1)

where g(x, y) is the gray value of pixel (x, y) in the filtered gray image; w(s, t) is the weight at offset (s, t) from (x, y), with s and t the offsets along the abscissa and ordinate, respectively; and f(x + s, y + t) is the gray value of the grayscale image at coordinate (x + s, y + t). Filtering the grayscale image gives the filtered image shown in Fig. 4b. Then, differential enhancement is applied to the grayscale images before and after filtering to obtain the enhanced image shown in Fig. 4c. In the identification and monitoring of power transmission lines, the differential enhancement is the difference between the grayscale image before filtering and the grayscale image after filtering, that is, h(x, y) = f(x, y) − g(x, y), where h(x, y) is the gray value of the enhanced image, f(x, y) is the gray value of the grayscale image before filtering, and g(x, y) is the gray value of the filtered gray image.
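The two formulas above can be tried on a small array. The following is a minimal sketch, assuming NumPy, uniform weights w(s, t) = 1 over a 3 × 3 window, and edge-replicated borders; none of these choices come from the source. Note that Eq. 1 as written is a normalized weighted window average; a median filter would replace the weighted sum with the median of the same window.

```python
import numpy as np

def weighted_filter(f, w):
    """Eq. 1: g(x, y) = sum_{s,t} w(s, t) f(x+s, y+t) / sum_{s,t} w(s, t)."""
    a, b = w.shape[0] // 2, w.shape[1] // 2
    # Replicated border (an assumption) so the window is defined at the image edge.
    padded = np.pad(f.astype(float), ((a, a), (b, b)), mode="edge")
    g = np.zeros(f.shape, dtype=float)
    rows, cols = f.shape
    for s in range(w.shape[0]):
        for t in range(w.shape[1]):
            g += w[s, t] * padded[s:s + rows, t:t + cols]
    return g / w.sum()

def differential_enhance(f, g):
    """h(x, y) = f(x, y) - g(x, y): keeps what the filter smoothed away."""
    return f.astype(float) - g

# Hypothetical image: flat background (90) crossed by a one-pixel-wide dark line (30).
f = np.full((5, 5), 90.0)
f[:, 2] = 30.0
w = np.ones((3, 3))  # assumed uniform weights

g = weighted_filter(f, w)
h = differential_enhance(f, g)
```

In this toy case the flat background maps to zero in h, while the thin line and its immediate neighbors keep non-zero values, which is why the difference image emphasizes narrow structures such as conductors.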

4.1.3 Extract the edge contour and filter processing

There are many ways to extract the edge contour image from the enhanced image, such as the prior-knowledge method, the mathematical morphology method, the gradient-based method, and the level set method. Here, we demonstrate edge contour extraction with the gradient-based, threshold-adaptive Canny operator, as follows. (1) Gradient map solution: the Sobel difference operator is used to find the gradient of each point in the gray-scale map and obtain the gradient map corresponding to the original image. (2) Determination of strong and weak edge pixels: high and low thresholds are set; a pixel whose gradient value is higher than the high threshold is a strong edge point, a pixel whose gradient value lies between the low and high thresholds is a weak edge point, and a pixel whose gradient value is lower than the low threshold is a non-edge point. (3) Non-maximum suppression: after the first judgment and screening of edge points, non-maximum suppression is performed along the gradient direction. (4) Edge determination: after non-maximum suppression of the strong edge points, the weak edge points near the retained strong edge points are searched to obtain the final edge.
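Steps (1) and (2) above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, border pixels are left at zero gradient, and non-maximum suppression and the final edge-linking search (steps 3 and 4) are omitted for brevity.

```python
import math

# Standard 3 x 3 Sobel kernels for horizontal and vertical differences.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Step (1): gradient magnitude at interior pixels (borders stay 0)."""
    rows, cols = len(img), len(img[0])
    mag = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = sum(SOBEL_X[i][j] * img[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * img[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            mag[r][c] = math.hypot(gx, gy)
    return mag


def classify_edges(mag, low, high):
    """Step (2): label pixels as 2 (strong), 1 (weak) or 0 (non-edge)."""
    return [[2 if m > high else 1 if m >= low else 0 for m in row]
            for row in mag]


# A vertical step edge between a dark and a bright region.
step = [[0, 0, 100, 100, 100] for _ in range(5)]
labels = classify_edges(sobel_magnitude(step), low=100, high=300)
```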

Because of weather, environment and other factors, the thresholds required to identify the same transmission line differ from one time to another, so a fixed, manually chosen threshold cannot be used. An adaptive threshold method is needed to adjust the threshold as conditions change. The commonly used methods are the Otsu and Canny operator methods. The Canny operator method is demonstrated here, and the specific calculation steps are as follows. (1) Find the gradient of each point in the original grayscale image, draw its gradient image, and save the maximum gradient value. (2) Compute the histogram of the gradient map. (3) Set the proportion of non-edge pixels in the whole grayscale image. (4) Set the threshold, that is, the corresponding number of non-edge pixels. (5) Traverse the histogram, summing and saving the number of pixels corresponding to each gradient value. (6) Exit the traversal of the histogram once the summed value exceeds the threshold. (7) Calculate the low and high thresholds of Canny from the gradient value at which the traversal stopped.
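The seven steps above can be sketched as a short Python function. The text does not give concrete parameter values, so the non-edge proportion (0.7), the ratio between the low and high thresholds (0.4), and the number of histogram bins (64) are illustrative assumptions, as is the function name.

```python
def adaptive_canny_thresholds(mag, non_edge_ratio=0.7, low_ratio=0.4, bins=64):
    """Derive (low, high) Canny thresholds from a gradient-magnitude map.

    non_edge_ratio, low_ratio and bins are assumed defaults, not values
    taken from the reviewed works.
    """
    values = [m for row in mag for m in row]
    max_grad = max(values)                       # step 1: maximum gradient
    if max_grad == 0:
        return 0.0, 0.0                          # flat image: no edges
    hist = [0] * bins                            # step 2: gradient histogram
    for v in values:
        hist[min(int(v / max_grad * bins), bins - 1)] += 1
    target = non_edge_ratio * len(values)        # steps 3-4: pixel-count threshold
    cumulative = 0
    high_bin = bins - 1
    for i, count in enumerate(hist):             # step 5: accumulate counts
        cumulative += count
        if cumulative > target:                  # step 6: stop when exceeded
            high_bin = i
            break
    high = (high_bin + 1) * max_grad / bins      # step 7: upper edge of that bin
    low = low_ratio * high
    return low, high


gradients = [[0, 0, 0, 0, 0], [50, 50, 50, 100, 100]]
low, high = adaptive_canny_thresholds(gradients, bins=10)
```

Because the high threshold follows the gradient histogram of each image, the same routine adapts to images captured under different lighting or weather without manual retuning.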

As shown in Fig. 4d, the edge contour image usually contains burrs caused by the complex background. The edge contour image must therefore be filtered to remove these burrs and obtain a clean transmission line contour image. Here the image target is a transmission line whose outline is bounded by a circumscribed rectangle, so the edge contour image can be filtered according to the following function:

$${\text{r}}_{i} (x,y) = \left\{ \begin{array}{ll} 1, & \max_{x} {\text{r}}_{i} (x,y) - \min_{x} {\text{r}}_{i} (x,y) > W/C \;\; \& \& \;\; \max_{y} {\text{r}}_{i} (x,y) - \min_{y} {\text{r}}_{i} (x,y) > H/C \\ 0, & {\text{else}} \end{array} \right..$$
(2)

In Eq. 2, r_i(x, y) represents the point coordinates (x, y) of contour i in the edge contour image, W represents the width of the circumscribed rectangle, H represents the height of the circumscribed rectangle, C is a constant (for example, C = 10), and “&&” represents the logical operator “and”. The meaning of Eq. 2 is: if the difference between the maximum and minimum values of r_i(x, y) in the x direction is greater than W/C, and the difference between the maximum and minimum values of r_i(x, y) in the y direction is also greater than H/C, the value of r_i(x, y) is set to 1; otherwise it is set to 0. The filtered contour image is shown in Fig. 4e. After filtering, the burrs of the complex background have been removed from the edge contour image, and the remaining transmission line contour is clear.
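A literal reading of Eq. 2 can be sketched in Python as below. Each contour is assumed to be a list of (x, y) points, and W, H and C are passed in as parameters; the function names and data layout are hypothetical.

```python
def keep_contour(points, W, H, C=10):
    """Eq. 2: keep a contour only if it is large enough in both x and y."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs) > W / C) and (max(ys) - min(ys) > H / C)


def filter_contours(contours, W, H, C=10):
    """Drop small burr contours, keeping those that satisfy Eq. 2."""
    return [c for c in contours if keep_contour(c, W, H, C)]


contours = [
    [(0, 0), (25, 25), (50, 50)],   # long diagonal contour: kept
    [(3, 3), (5, 4)],               # small burr: removed
    [(0, 5), (90, 6)],              # long but nearly horizontal: removed
]
kept = filter_contours(contours, W=100, H=100)
```

The third contour shows a consequence of requiring both conditions at once: a long line that runs almost parallel to one image axis fails the test in the other direction, so the choice of C may need care for such views.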

4.1.4 Mark the transmission lines

The filtered contour image is redrawn on the original image, so that the user can accurately identify the power transmission line in the original image. This marking function can be realized by combining a contour point extraction unit and a contour point drawing unit. In engineering practice, the two units work relatively independently to achieve efficient and accurate drawing. The contour point extraction unit obtains the contour points of the transmission line, together with their coordinates, from the contour image of the transmission line. The contour point drawing unit takes the original image as the background image and the contour points as the foreground image, and then draws the contour points on the original image according to their coordinates.
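The two units can be sketched as two independent Python functions. The representation is an assumption made for illustration: the contour image is a 2-D list of 0/1 values, the original image is a 2-D list of RGB tuples, and the marking colour is arbitrary.

```python
def extract_contour_points(contour_image):
    """Extraction unit: return (row, col) coordinates of all contour pixels."""
    return [(r, c)
            for r, row in enumerate(contour_image)
            for c, value in enumerate(row)
            if value]


def draw_contour_points(original, points, color=(255, 0, 0)):
    """Drawing unit: overlay contour points on a copy of the original image."""
    marked = [list(row) for row in original]     # leave the input untouched
    for r, c in points:
        marked[r][c] = color
    return marked


contour_image = [[0, 1, 0],
                 [0, 1, 0],
                 [0, 1, 0]]
original = [[(50, 50, 50)] * 3 for _ in range(3)]
marked = draw_contour_points(original, extract_contour_points(contour_image))
```

Keeping the two functions separate mirrors the independence of the two units: the extraction result is just a list of coordinates, so it can be reused for other purposes without touching the drawing code.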

4.2 Deep learning

Improvements in computer hardware performance have allowed deep learning to develop rapidly, and deep learning is now widely used in image recognition. Girshick et al. put forward the R-CNN algorithm [25], a convolutional neural network recognition method based on a region proposal approach, which has become a typical recognition scheme based on deep learning. Unlike previous target detection algorithms, it extracts target features through a deep convolutional network. When little data is available, a network trained elsewhere can be fine-tuned through transfer learning to adapt to the smaller data set. However, the R-CNN algorithm suffers from long training time and a slow prediction stage, because the CNN, the SVM and the bounding-box regression must be trained separately, and the CNN must extract and store the features of every region of all images for the SVM and the bounding-box regression. These defects can be partly remedied by the SPPnet algorithm [26], which uses shared computation to speed up prediction, but only the layers behind the spatial pyramid pooling layer are fine-tuned, so the accuracy of the deep network is limited. Ren et al. proposed the Faster R-CNN method [27]. By introducing a region proposal network (RPN) that shares convolution operations, it reduces the computing time and moves toward real-time identification and detection. However, the number of real (ground-truth) boxes is smaller than the number of candidate boxes, which limits detection efficiency. Ko et al. applied deep learning to the recognition of linear objects [28]. They combined point estimation and point instance segmentation to propose PINet and successfully used it for traffic line detection. PINet detects only the few points necessary to describe a traffic line. The network is composed of several hourglass modules and four loss functions, and each hourglass module applies these loss functions at the same time. PINet achieves high performance and a low false alarm rate on the CULane and TuSimple data sets. With a low false alarm rate, wrongly predicted lanes rarely occur, which supports the safety of autonomous vehicles.

In terms of power equipment identification, various algorithms have been proposed [29,30,31,32,33,34,35,36,37] and have achieved good results. Some results are shown in Table 1. Yao et al. proposed an improved Faster R-CNN [33]. By using "Hot Anchors" sampling instead of uniform sliding-window sampling of anchor points, and an improved RPN layer that avoids many additional calculations, the detection performance for power equipment is improved. The data set used in Yao’s work is composed of about 4200 image samples and covers three types of components: power transformer, spacer and circuit breaker. Each type of component has about 1400 samples, and 400 samples are randomly selected from each type as the test set, giving 1200 test images in total. The detection accuracy reaches 78.2%. Guo et al. proposed an improved AlexNet anomaly detection model [34], with an average accuracy of 83.55%. For feature extraction, the model extracts the features of transmission line equipment through a deep convolutional neural network (DCNN). For recognition, an SVM classification method combined with deep learning is proposed to exploit the advantages of traditional machine learning. Finally, the improved AlexNet model and the SVM classification method are used to classify the images of various power equipment. The data set in Guo’s report consists of 32,000 transmission line inspection images obtained from State Grid Corporation of China, 30,000 of which are used as training samples and the remaining 2000 as test samples. The YOLO algorithm further improves the detection speed and makes predictions from a global perspective, which results in fewer background errors and allows it to learn features with high generalization ability. Wang et al. proposed a YOLO9000-based method for detecting insulator self-explosion zones in UAV inspection [36]. The experiment shows that the detection accuracy reaches 84.8%. Compared with traditional detection and machine detection algorithms, power component recognition based on deep learning has made significant progress in detection accuracy and speed, and can meet the requirements of real-time detection as well as the strict conditions for autonomous inspection of power lines using UAVs. Liu et al. proposed an optimized YOLOv3 network [37], which can detect insulators in aerial images with complex backgrounds. They collected aerial images of common aerial scenes using a UAV and built an insulator data set of 5000 images, 3000 of which are used as training samples and the remaining 2000 as test samples. Based on YOLOv3 and dense blocks, a YOLOv3-dense network is used for insulator detection to enhance feature reuse and propagation. A multi-scale feature fusion structure is applied to the YOLOv3-dense network to improve the detection accuracy for insulators of different sizes, and the highest accuracy reached is 94.47%.

Table 1 Power equipment identification report based on multiple algorithms and its performance comparison

High-precision, real-time detection of overhead transmission lines by the UAV vision system can not only guide the machine to patrol the line automatically and collect inspection data, but also serve as a basis for obstacle avoidance and early warning to ensure safe flight [38]. Power line detection methods based on visible light images are mainly divided into two categories: methods based on traditional features [39,40,41,42,43] and methods based on deep learning [44,45,46,47,48]. Existing methods have achieved many research results, but the detection accuracy still cannot meet practical requirements. The traditional methods are mainly based on the edge features of the image, as described in Sect. 3.1. According to the straight and parallel characteristics of the power line cluster, the target line segments are selected from the candidate edges and the result is fitted. Sun et al. proposed a parameter correlation method [39], which successively uses the Hessian transform and the Hough transform to extract edge pixels and filter out straight lines, and then uses a coordinate transformation to select the parallel power lines whose parameters satisfy a sine relationship. Yang et al. removed the background noise in the candidate edges by region growing filtering [40], and then fitted the power lines with expectation maximization. Zhou et al. combined RGB and gray value attributes to classify images [41], proposed three filtering methods and an adaptive piecewise histogram equalization method for image preprocessing, used Canny to extract edges, and fitted the target through the Hough transform and morphological closing operations. These methods rely on linearity and parallelism, which makes them unsuitable for real scenes in which power line clusters intersect, and they involve many processing steps and time-consuming operations, so they cannot meet real-time requirements. Deep learning methods are the mainstream for power line cluster detection. Pan et al. used a seven-layer binary classification neural network to identify sub-regions containing power lines from edge features [44], and used the gradient asymptotic probabilistic Hough transform to extract targets. In view of the simple structure of VGG19 and its good performance in classification tasks, Lee et al. extracted sub-regions from the image through sliding windows and fed them into VGG19 [45, 49]. If a sub-region is judged to contain power lines, the output of the middle layer of the network and the output of the end are multiplied element by element, and a Gaussian kernel mask is then used to suppress the surrounding responses and strengthen the central region as the detection result. Choi et al. used a back-propagation visualization network to extract the power line features as the calibration map of an FCN target detection network [48], and used the LSC method to fit the power lines to the FCN detection results. The above deep learning-based power line detection methods have low accuracy, and none of them solves the problem of many false breakpoints in output images under complex backgrounds. In essence, power line detection can be regarded as a research sub-direction in the field of semantic segmentation. The mainstream semantic segmentation methods include PSPNet and DeepLab [50, 51], which respectively propose a spatial pyramid pooling layer and multi-scale dilated convolution to combine multi-scale local information with global information. FCRN uses the residual module to design a reverse mapping block that improves the output resolution [52, 53]. Exploiting the complementary characteristics of high-level and low-level features (the former contain semantic information and the latter detail information) [54,55,56], UNet proposes a skip connection structure to combine the two kinds of features.

The mainstream semantic segmentation methods improve power line detection accuracy, but still do not solve the problem of many false breakpoints in the output image. The root cause of false breakpoints is that the spatial continuity of power line clusters is not fully exploited. To this end, Cheng et al. designed a special network structure to perform multi-scale enhanced extraction of the continuous linear features of power lines [57]. The linear features of power lines are of three types: backbone features (including internal texture and color information), edge features (including structural information) and fusion features. Cheng’s research proposed an end-to-end multiple linear-feature enhanced detector (MLED) network, which uses a two-way convolution structure to extract the three types of multi-scale linear features of the power line. Through a multi-feature fusion module that includes residual connection, scale transformation and element-level processing, the features are fused gradually and the global context information of continuous linear objects is fully extracted. The proposed network was tested and analyzed on a visible light image data set of overhead transmission line inspection in Nanning, Guangxi, China. In addition, a multi-feature loss function was designed to supervise the network in learning the three types of multi-scale features. The multi-feature loss function is described below.

The multi-feature loss function that supervises the learning of backbone features, edge features and fusion features at different scales is L_MLED = L_fuse + L_trunk + L_edge, where L_fuse, L_trunk and L_edge are the loss of the fusion features, the loss of the backbone features, and the loss of the edge features, respectively. All three loss terms are computed with the cross-entropy loss:

$$\left\{ \begin{gathered} L_{{\text{fuse}}} = - L_{{\text{bce}}} \left( {{\mathbf{O}}_{{\text{fuse}}} ,{\mathbf{T}}_{{\text{lab}}} } \right) \hfill \\ L_{{\text{trunk}}} = - \sum\nolimits_{i = 1}^{5} {\alpha^{i} } L_{{\text{bce}}} \left( {{\mathbf{O}}_{{\text{trunk}}}^{i} ,{\mathbf{T}}_{{\text{lab}}} } \right) \hfill \\ L_{{\text{edge}}} = - \sum\nolimits_{i = 1}^{5} {\alpha^{i} } L_{{\text{bce}}} \left( {{\mathbf{O}}_{{\text{edge}}}^{i} ,{\mathbf{E}}_{{\text{lab}}} } \right) \hfill \\ \end{gathered} \right.,$$
(3)

where T_lab and E_lab are the power line trunk calibration map and edge calibration map, respectively, and O_fuse is the final detection result. {O_trunk | O^i_trunk; i = 1, 2, …, 5} and {O_edge | O^i_edge; i = 1, 2, …, 5} are the backbone and edge features at five scales extracted by residual down-sampling convolution. For each scale, the number of channels is reduced to 1 by a 1 × 1 convolution, the result is activated by a sigmoid function, and the feature maps are then up-sampled by 2×, 4×, 8×, 16× and 32×, respectively. {α | α^i; i = 1, 2, …, 5} is the weight vector, whose five elements each lie in [0, 1.0]; it is used to balance the effect of the trunk and edge feature losses at different scales on L_MLED. Combining the complementary relationship between low-level and high-level features with the effect of edge features on suppressing noise, the overall structure of a single-input multiple linear-feature enhancement network is proposed, as shown in Fig. 5. A dual-branch ResNet50 is used as the backbone of the feature extraction network to extract the multi-scale trunk, edge and high-level fusion features of the power lines. A 1 × 1 convolutional layer (1 × 1 convolution + BN + ReLU) is used to realize the skip connections and is combined with four multi-feature fusion modules to gradually aggregate the above three features and enlarge the fusion output. Finally, a detection result with the same resolution as the input is obtained, which meets the visualization requirements of power line inspection. The gradual fusion of the multiple linear features fully aggregates the detail, structure and semantic information of power lines, and effectively enhances the ability of the network to extract power lines.
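The structure of the multi-feature loss in Eq. 3 can be illustrated with a small pure-Python sketch. Two assumptions are made here and are not confirmed by the text: L_bce is taken to be the mean log-likelihood t·log(o) + (1 − t)·log(1 − o), which is non-positive, so the leading minus sign in Eq. 3 yields a non-negative loss; and each side output has already been up-sampled to the resolution of the calibration maps. The function names are hypothetical.

```python
import math


def bce_log_likelihood(output, target, eps=1e-7):
    """Mean of t*log(o) + (1 - t)*log(1 - o) over all pixels (non-positive).

    Assumed reading of L_bce; Eq. 3 negates it to obtain a loss.
    """
    total, count = 0.0, 0
    for o_row, t_row in zip(output, target):
        for o, t in zip(o_row, t_row):
            o = min(max(o, eps), 1.0 - eps)      # avoid log(0)
            total += t * math.log(o) + (1 - t) * math.log(1 - o)
            count += 1
    return total / count


def mled_loss(o_fuse, o_trunk, o_edge, t_lab, e_lab, alpha):
    """L_MLED = L_fuse + L_trunk + L_edge, following Eq. 3.

    o_trunk and o_edge are lists of per-scale probability maps;
    alpha holds one weight in [0, 1] per scale.
    """
    l_fuse = -bce_log_likelihood(o_fuse, t_lab)
    l_trunk = -sum(a * bce_log_likelihood(o, t_lab)
                   for a, o in zip(alpha, o_trunk))
    l_edge = -sum(a * bce_log_likelihood(o, e_lab)
                  for a, o in zip(alpha, o_edge))
    return l_fuse + l_trunk + l_edge


t_lab = [[0, 1], [1, 0]]                          # toy trunk calibration map
e_lab = [[1, 0], [0, 1]]                          # toy edge calibration map
uncertain = [[0.5, 0.5], [0.5, 0.5]]              # uninformative prediction
loss = mled_loss(uncertain, [uncertain] * 5, [uncertain] * 5,
                 t_lab, e_lab, alpha=[1.0] * 5)
```

With all weights set to 1.0, each of the eleven terms (one fusion, five trunk, five edge) contributes equally; lowering α for a given scale reduces the influence of that scale on the total loss.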

Fig. 5 Multi-linear feature enhancement network structure diagram

The network structure of MLED, from input to output, includes the following steps. (1) The trunk features {T | T_i; i = 1, 2, …, 5} and edge features {E | E_i; i = 1, 2, …, 5} of the input image are extracted by five convolutional layers. To control the number of parameters while ensuring the quality of feature extraction, the stride and the number of convolution kernels are adjusted so that the resolution of the output features of convolutional layer Con_{i+1} is 1/2 that of Con_i, and the number of channels is twice that of Con_i. The output feature size of Con_1 is 1/2 of the input, with 64 channels; the output feature size of Con_5 is 1/32 of the input, with 1024 channels. Element-level processing (denoted as ⊕) is then performed on T_5 and E_5, which carry complementary information: a 1 × 1 convolutional layer halves the number of channels of T_5 and E_5, whose scale is 1/32 of the input, and the results are added element by element to obtain the high-level fusion features F_5 = T_5 ⊕ E_5. (2) Aggregate the backbone, edge and fusion features. The 1 × 1 convolutional layer is used as a skip connection for T_i, E_i and F_{i+1}, i = 1, 2, 3, 4, which have the same number of channels and different scales. Four consecutive multi-feature fusion modules then gradually enlarge the feature resolution and refine the detail and semantic information of the target: F_i = f(F_{i+1}, T_i, E_i), i = 1, 2, 3, 4, where f(·) is the multi-feature fusion module and F_i is the output of the i-th multi-feature fusion module. (3) Up-sample F_1 by a factor of two using bilinear interpolation, and output the prediction result.
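The resolution and channel bookkeeping described in these steps can be checked with a short Python sketch. It only tracks tensor shapes and is not an implementation of the network. The function names are hypothetical, the input size is assumed to be divisible by 32, and the channel counts of F_1 to F_4 are left out because the text does not state them.

```python
def mled_feature_shapes(height, width, base_channels=64, levels=5):
    """Shapes of T_i / E_i: resolution halves and channels double per layer."""
    shapes = []
    for i in range(1, levels + 1):
        shapes.append({
            "level": i,
            "height": height // 2 ** i,
            "width": width // 2 ** i,
            "channels": base_channels * 2 ** (i - 1),
        })
    return shapes


def mled_fusion_resolutions(height, width, levels=5):
    """Resolution of each fusion output F_i and of the final prediction.

    F_5 has the scale of T_5 and E_5; each fusion module F_i = f(F_{i+1},
    T_i, E_i) is assumed to output at the scale of T_i; the final 2x
    up-sampling of F_1 restores the input resolution.
    """
    shapes = mled_feature_shapes(height, width, levels=levels)
    result = {f"F{s['level']}": (s["height"], s["width"]) for s in shapes}
    result["output"] = (shapes[0]["height"] * 2, shapes[0]["width"] * 2)
    return result


shapes = mled_feature_shapes(512, 512)
f5_channels = shapes[-1]["channels"] // 2         # 1 x 1 conv halves T_5, E_5
resolutions = mled_fusion_resolutions(512, 512)
```

For a 512 × 512 input this gives 256 × 256 features with 64 channels at Con_1 and 16 × 16 features with 1024 channels at Con_5, and the final up-sampling returns a 512 × 512 prediction, in line with the 1/2 and 1/32 ratios stated in the text.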

MLED is compared with three mainstream networks, PSPNet, FCRN and UNet, and the results are shown in Fig. 6. PSPNet directly enlarges the resolution of the high-level semantic features by a factor of eight, which loses much of the power line information and leads to serious missed detections and breakpoints. FCRN enlarges the high-level features gradually, which reduces the loss of information, but its context information is still insufficient, and breakpoints and false detections remain obvious. UNet fuses detail and semantic information by combining low-level and high-level features, but its network structure is simple and does not use edge features, and its loss function considers only one target feature, so it is prone to misjudging background interference and to producing breakpoints. MLED uses a dual-branch residual convolution network to extract the multi-scale backbone, edge and fusion features of power lines, and gradually integrates them through multi-feature fusion modules. Compared with FCRN, which uses traditional residual connection blocks, it extracts power line information more fully. MLED also uses a multi-feature loss function to enhance the ability of the network to learn linear features. Compared with the three mainstream networks, which use a single-feature loss, it is more resistant to disturbances from trees and roads, effectively reduces power line breakpoints, and can also identify small power lines farther away.

Fig. 6 Comparison between MLED and mainstream networks

5 Summary

This paper systematically summarizes the application and development of UAVs in the field of power transmission line detection. UAVs used for power line inspection can be divided into fixed-wing UAVs, rotary-wing UAVs, unmanned helicopters, and vertical take-off and landing fixed-wing UAVs. The UAV platform consists of the flight control system, power system, task load, communication system and ground part. According to working position, the UAV line inspection system can be divided into two parts: the on-board system and the ground station system. The on-board system is mainly responsible for collecting the flight and line information of the UAV, while the ground station system is responsible for receiving the information sent by the UAV and for processing, control and computation. When UAVs are used for line inspection, the amount of transmission line image and video data collected is very large, so transmission line inspection must be realized through intelligent algorithms such as computer vision and image processing. Two types of image processing techniques are mainly discussed: methods based on image edge detection and methods based on deep learning. Image edge detection is mainly based on image edge features. According to the straight and parallel characteristics of power line clusters, the target line segments are selected from candidate edges and the result is fitted. This method is simple and commonly used, but it depends on the linearity and parallelism of power lines, so it is not applicable to real scenarios in which power line clusters cross. The deep learning method is the mainstream of power line cluster detection; it is more resistant to interference from trees and roads and effectively reduces power line identification errors.