3.1 Research Status of Information Collection Technology

Foreign countries and regions, including the United States, Europe, Japan, and South Korea, started earlier in the application, research, and development of wireless sensor networks, and their overall strength in the field is considerable. The United States' "smart grid" and "smart Earth" programs, the European "Internet of Things Action Plan," and the IoT-based "u-Society" strategies of Japan and South Korea have all been implemented, making the Internet of Things an important means for countries to enhance their comprehensive competitiveness in the "post-crisis" era. In China, wireless sensor networks began to develop after the concept of smart dust was put forward. As research deepened, their applications gradually expanded from national defense and military uses to environmental monitoring, healthcare, seabed exploration, forest firefighting, and other fields; they were included in plans for future emerging technologies, with emphasis on applications in biotechnology, chemistry, and related areas. The scientific community subsequently concentrated its research on secure and scalable networks, sensor systems, and related topics, prompting scholars from many disciplines to join the research and development of wireless sensor networks.

The development of wireless sensor networks in China started at about the same time as in developed countries, and the related research work has gradually received extensive attention from the government. As a key research project, their basic theories and key technologies have been included in planned research programs. In recent years, research on wireless sensor networks in China has continued to advance and has produced substantial results. At the same time, with the continuous development and improvement of communication and electronic technologies, wireless sensor networks have developed rapidly, their application scope has become wider and wider, and their development prospects are broad.

3.2 Disaster Loss Collection Technology of Power Equipment Based on Internet of Things

3.2.1 Overview

Research on the rapid and automatic collection of multiple types of power equipment disaster loss information focuses on field environment and equipment perception and collection technology, including machine vision intelligent identification and wireless sensor collection. First, machine vision is used to create an indoor environment map and to identify the damage state of power equipment from the perspective of emergency treatment, so as to achieve accurate perception of the field environment and correlation of information. Second, the identification of field power equipment based on mobile terminals is studied to realize the recognition of power grid equipment and related components. Finally, a wireless communication module is used to achieve fast networking and data acquisition among power equipment sensors.

3.2.2 Intelligent Recognition and Acquisition Technology of Machine Vision

The main task of machine vision intelligent recognition and collection technology is to identify the overall damage to equipment in power facilities. A typical internal damage scenario of a power facility is shown in Fig. 3.1. Research on the identification algorithm is carried out mainly from the perspective of emergency disposal needs: the identification results are used both to estimate the resources required for repairing the power equipment and to support subsequent research, thereby providing direct reference information for emergency decision-making.

Fig. 3.1
A photograph of a damaged power facility. It appears to have been burnt. Various materials are burnt and the remaining parts are covered with dark soot.

Internal damage scene of power facilities

First, an RGB-D sensor (fusing an RGB image with a depth image) is used to obtain visual information about the indoor environment in real time, and features are extracted separately for each subspace of the RGB image using the ORB operator (an algorithm for fast feature point extraction that quickly creates feature vectors for key points in the image; these feature vectors can then be used to identify objects in the image). According to the characteristics of the ORB algorithm, the image features are described as visual words in binary form and stored in a tree-structured model to construct a binary visual dictionary that incorporates spatial information. Closed-loop detection is then performed on the visual information acquired in real time, incorporating both temporal continuity and geometric consistency constraints, to determine whether the closed-loop condition is satisfied. If the closed-loop condition is not met, the image is stitched together with the depth information through spatial mapping using the RANSAC algorithm (an algorithm that estimates the parameters of a mathematical model from a sample data set containing anomalous data in order to obtain valid samples). If the closed-loop condition is met, intelligent identification of on-site power equipment can be carried out, and the background operating data of the equipment can be quickly retrieved once the equipment number is obtained. Field operators use the equipment to intelligently identify grid equipment and related components, laying the foundation for data collection.
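As a concrete illustration of the ORB feature extraction and RANSAC-based matching steps described above, the following minimal sketch uses OpenCV; the file names, parameter values, and the simple inlier-ratio check standing in for the closed-loop condition are illustrative assumptions rather than the project's actual implementation.

```python
# Minimal sketch (OpenCV assumed): ORB feature extraction on the current frame and
# RANSAC-filtered matching against a stored keyframe, as in the closed-loop /
# stitching step described above. File names and thresholds are illustrative.
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=1000)               # binary ORB feature extractor

frame = cv2.imread("current_frame.png", cv2.IMREAD_GRAYSCALE)
keyframe = cv2.imread("stored_keyframe.png", cv2.IMREAD_GRAYSCALE)

kp1, des1 = orb.detectAndCompute(frame, None)      # keypoints + binary descriptors
kp2, des2 = orb.detectAndCompute(keyframe, None)

# Hamming-distance matching is appropriate for binary ORB descriptors
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC rejects outlier correspondences before the spatial mapping / stitching step
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=3.0)
inlier_ratio = inlier_mask.mean() if inlier_mask is not None else 0.0

# A simple geometric-consistency check standing in for the closed-loop condition
is_loop_candidate = inlier_ratio > 0.5
```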

Some physical quantities or states of power equipment can be obtained by visual methods. Therefore, analyzing the object under test with image processing technology makes it possible to measure or identify the physical quantities that characterize power equipment or its state and to discover abnormal phenomena and potential faults in time. Meanwhile, multi-band imaging can reveal subtle image changes that are difficult for the human eye to distinguish, enabling equipment fault diagnosis and early warning of the equipment state. The overall principle of machine vision intelligent recognition and acquisition technology is shown in Fig. 3.2.

Fig. 3.2
A process diagram. It has a convolution layer and non-linearity, max pooling, vector, fully connected layer, and binary classification. The classification gives wire breakage, guide wire wind deflection, wire extraction, tower collapse, insulator damage, tower head lost, foreign object separation.

Machine vision intelligent recognition acquisition technology principle

The process of identifying electrical equipment from image information is as follows.

Image pre-processing: the acquisition and transmission processes introduce noise and other factors unfavorable to image analysis, so the first step is pre-processing to improve image quality, mainly using low-pass filtering to remove noise.

Image registration: registration is needed not only to align the infrared image of the electrical equipment with the visible image; when images obtained during patrol inspection are compared with images in the historical database, differences in shooting angle and in local areas also make it necessary to register these deviating images to facilitate later feature extraction. A feature-matching-based method can be used: the SIFT algorithm extracts stable feature points and handles matching under translation, rotation, affine transformation, perspective transformation, and so on between two images; it is robust to illumination changes and matches with high probability.

Image feature extraction: image features are the original characteristics or attributes of the image. Some are natural features that can be perceived directly from the image, while others are artificial features that must be obtained by transformation or measurement. During inspection, color, shape, and texture can be used as natural features, while grayscale, histogram, and infrared temperature differences can all be used as artificial features for recognition. Feature extraction should focus on how the extracted features improve the accuracy and speed of the subsequent recognition process.
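Returning to the registration step above, a short OpenCV sketch of SIFT matching followed by a RANSAC homography might look as follows; the file names, ratio-test threshold, and other parameters are illustrative assumptions.

```python
# Minimal sketch (OpenCV assumed): SIFT-based registration of a patrol image against a
# historical-database image, following the feature-matching approach described above.
import cv2
import numpy as np

patrol = cv2.imread("patrol_image.png", cv2.IMREAD_GRAYSCALE)      # illustrative paths
reference = cv2.imread("history_image.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(patrol, None)
kp2, des2 = sift.detectAndCompute(reference, None)

# Lowe's ratio test keeps only distinctive matches, improving robustness
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2) if m.distance < 0.75 * n.distance]

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# Estimate the perspective transform and warp the patrol image onto the reference frame
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
registered = cv2.warpPerspective(patrol, H, (reference.shape[1], reference.shape[0]))
```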

An image is distinguished from other images mainly by its features, which include color features, texture features, shape features, etc.

  1. (1)

    Color features

Color is the most significant feature of an image, and different electrical devices have different colors. Some electrical devices have obvious color characteristics, which can be used as a basis for judgment: when the color feature of the equipment is distinct, it can serve as a recognition feature. For example, for reddish transformers, gray transformers, and the like, the color range of the equipment in the image is first extracted; the range in which it falls is used as the main basis for identification, and the identification target in the image is then located and analyzed.

The shortcoming of the color feature is that it expresses only the global properties of the image and does not describe the local characteristics of the object well. Therefore, color feature methods usually have to be combined with other methods to provide sufficient image information. The extraction of color features mainly involves the following key issues: choosing an appropriate color space (not all color spaces are consistent with human perception); defining and quantifying the color features; and similarity measurement and matching, i.e., how to define the similarity of color features and match them quickly.
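As a hedged illustration of color-based localization, the following OpenCV sketch thresholds a hypothetical reddish color range in HSV space and keeps the largest matching region; the range values and file name are assumptions, not calibrated settings.

```python
# Minimal sketch (OpenCV assumed): locating a device by its characteristic colour range
# in HSV space, as described above; the range values are illustrative, not calibrated.
import cv2
import numpy as np

img = cv2.imread("equipment.png")                    # illustrative file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)           # HSV is closer to human colour perception

# Example range for a reddish device body (hypothetical thresholds)
lower, upper = np.array([0, 80, 60]), np.array([10, 255, 255])
mask = cv2.inRange(hsv, lower, upper)

# Keep the largest connected colour region as the candidate target location
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
```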

  1. (2)

    Texture features

A texture feature is a global feature of an image that describes the surface properties of the scene corresponding to the image or image region. Unlike color features, texture features are not based on individual pixels; they require statistical calculation over regions containing many pixels. As statistical features, texture features are often rotationally invariant and highly resistant to noise. The insulators and porcelain sleeves of electrical equipment have distinctive textures. In pattern matching, such regional characteristics are advantageous and do not cause matching to fail because of local deviations.

  1. (3)

    Shape features

The target in an image generally has a specific shape, and shape features are represented in two ways: contour features and region features. Contour features mainly describe the outer boundary of the object, while region features relate to the entire shape region; a lightning tower, for instance, generally takes the form of a tall tower. The registration of the visible-light image of the electrical equipment is also the basis for initial recognition: if the matched feature points are sparse, the image can be considered not to match the database image or to be unsuitable for analysis, so it can be discarded and the next image captured.

The difficulty of the image processing and analysis methods used for machine vision is that machine vision is affected by multiple factors such as environment, lighting, production processes, and noise. The signal-to-noise ratio of detection is generally low, and weak signals are hard to detect or cannot be effectively distinguished from noise. The problem to be solved is how to build a stable, reliable, and robust detection method that withstands the interference of illumination changes, noise, and other adverse external conditions while completing target detection and identification. When the target to be identified is complex, the task must be accomplished through several stages, integrating information from different aspects. The first consideration in recognizing and extracting targets is how to separate them automatically from the background. The complexity of target extraction generally lies in the fact that the features of targets and non-targets differ little, and once the target extraction scheme is determined, the target features need to be enhanced. The effect of machine intelligent recognition using feature recognition is shown in Fig. 3.3.

Fig. 3.3
A photograph and a funnel diagram. A. The photograph has a tower and trees around it. The labeled or highlighted parts are glass insulators and connection fixtures. B. The 3 features of machine vision intelligent recognition acquisition technology are color, texture, and shape characteristics.

Machine vision intelligent recognition

3.2.3 Multiple Information Collection Technology for Power Equipment

  1. (1)

    Power sensor technology and applications

Sensor technology converts analog signals into digital signals so that computers can process the data and information. For the Internet of Things, sensor technology plays the role of a sensory system, much like the five senses of the human body.

  1. 1)

    Under the power IoT architecture, the application of sensor technology includes six main types:

  1. a)

    Liquid level sensor

The liquid level sensor works mainly on the principle of hydrostatics; it is a kind of pressure sensor and is suitable for monitoring liquids in power equipment.

  1. b)

    Speed sensor

The speed sensor converts a non-electrical change into an electrical change, thus enabling speed-based monitoring. Speed sensors also include acceleration sensors, electronic devices that measure acceleration forces and are mainly used for monitoring electrical environments.

  1. c)

    Humidity sensor

The humidity sensor works mainly through a moisture-sensitive material applied to a substrate to form a moisture-sensitive film. When the material adsorbs water molecules from the air, properties of the component such as impedance and dielectric constant change, producing humidity sensitivity. It is suitable for monitoring humidity in the environment where power equipment is located. At present, humidity sensors are mainly divided into resistive and capacitive types.

  1. d)

    Gas-sensitive sensor

Gas-sensitive sensors monitor specific gases and are suitable for carbon monoxide monitoring of components such as transformers.

  1. e)

    Infrared sensor

Infrared sensors mainly utilize the physical properties of infrared light to achieve non-contact monitoring, monitoring objects such as temperature and gas composition.

  1. f)

    Vision sensor

Vision sensors are suitable for high pixel capture, and are mostly used in industry for measurement orientation and defect detection, and in power equipment management for theft prevention, tower tilt prevention, breeze vibration prevention, and fault location and diagnosis.

  1. 2)

    Overview of common sensor devices for the power IoT

  1. a)

    Transmission line three span monitoring device

The transmission line three-span monitoring device is a field data collection device installed on the towers of high-voltage transmission lines, as shown in Fig. 3.4. The device mainly consists of a camera, a solar panel, and a main chassis. The camera captures real-time video, and the main control unit handles on-site communication; the data are transmitted wirelessly in real time so that the administrator can keep track of the situation around the monitored tower and effectively ensure the safety of the line.

Fig. 3.4
2 photographs. A presents the equipment composition of the transmission line three-span monitoring device. It consists of a camera, solar panel, and main chassis. B has the equipment installed on a tower of high-voltage transmission lines.

Transmission line three-span monitoring device

  1. b)

    Distributed fault diagnosis device

The distributed fault diagnosis device adopts distributed traveling-wave measurement technology, with monitoring terminals installed in a distributed manner on the transmission line conductors, as shown in Fig. 3.5. At the moment of a line fault, the power-frequency fault signal and the traveling-wave fault signal are acquired at high potential near the fault point and comprehensively analyzed by the data center to achieve fault interval positioning, accurate fault location, fault cause identification, and lightning stroke characteristic monitoring.

Fig. 3.5
2 photographs. A has a distributed fault diagnosis device. B has distributed fault diagnosis devices installed on the transmission line conductors.

Distributed fault diagnosis device

  1. c)

    Ice cover online monitoring device

The transmission line ice cover online monitoring device is mainly used to monitor the icing of conductors, towers, and insulator strings at transmission line sites, as shown in Fig. 3.6. Its monitoring means include micro-meteorological monitoring, image monitoring, and conductor equivalent ice thickness monitoring. Monitoring of the conductor equivalent ice thickness is achieved by measuring parameters such as the wind deflection angle, tilt angle, and axial tension of the insulator string: the device collects these on-site data at preset times and, after calculation, transmits the conductor equivalent ice thickness, unbalanced tension, and other data back to the backend server through the power IoT.

Fig. 3.6
2 photographs and a screenshot. A has the components of an ice cover online monitoring device. B has an ice cover online monitoring device installed on a tower. C. The screenshot presents the data details of ice monitoring units A, B, and C. It has a line graph with 4 fluctuating curves.

Online monitoring device for ice cover

  1. d)

    Video surveillance device

The video surveillance device is mainly designed to achieve a comprehensive understanding of the surrounding environment such as the transmission line corridor area and tower base area, and can perform preset timing capture monitoring and real-time retrieval of video data according to needs. It supports horizontal 360° and vertical 180° rotation, and is currently mainly installed in areas prone to mountain fire hazards, as shown in Fig. 3.7.

Fig. 3.7
2 photographs and a screenshot. A has a video surveillance device. B has a video surveillance device installed on a tower. C. The screenshot presents the effect of the use of the device. It has an aerial view of a mountainous region with mountains and buildings. The text is in a foreign language.

Video surveillance device

  1. e)

    Weather online device

The meteorological online device measures meteorological data (temperature, humidity, wind speed, wind direction, rainfall, air pressure, solar radiation, etc.) and their changes in the transmission line area and transmits them in real time to the monitoring center's main station system through the GPRS/4G network. The main station system performs accurate analysis; if abnormal conditions occur, the system immediately issues early warning information in various ways, and management personnel make early decisions on the warning area based on this information in order to carry out key maintenance. The device is shown in Fig. 3.8.

Fig. 3.8
2 photographs and a screenshot. A has the components of a weather online device. B has a weather online device installed on a tower. C. The screenshot presents the effect of the use of the device. It has a line graph with 4 fluctuating curves. The text is in a foreign language.

Weather online device

  1. f)

    Transmission line fusion-type intelligent terminal

The transmission line fusion-type intelligent terminal uses the image monitoring device installed on the tower, obtains monitoring point information in real time through 4G wireless communication, performs intelligent analysis on the collected images, and immediately pushes the results to the personnel of the panoramic intelligent monitoring center through platform alarms, as shown in Fig. 3.9.

Fig. 3.9
2 photographs and a screenshot. A has a transmission line fusion-type intelligent terminal. B has the equipment installed on a tower. C. The screenshot presents labeled menu options and the effect of the use. It has a photograph of a tower, trees, and buildings. The text is in a foreign language.

Transmission line fusion-type intelligent terminal

  1. 3)

    The application of electric power IOT sensor technology in online monitoring of electric power equipment is mainly reflected in the monitoring application management for transmission equipment and substation equipment, specifically analyzed as follows:

  1. a)

    For online monitoring applications of power transmission equipment

Online monitoring of power transmission equipment is the key application of power IoT sensor technology in the online monitoring of power equipment. The main monitoring content is the operating condition and operating environment of the transmission equipment, which effectively enhances the ability to perceive and warn of line icing, tower tilt, conductor wind deflection and sag, and so on, and plays an important role in realizing dynamic, full-time monitoring of the operating state of transmission equipment.

  1. b)

    For online monitoring application of substation equipment

Online monitoring of substation equipment mainly covers transformer core current, oil chromatography, GIS partial discharge, and arrester insulation. Online monitoring based on power IoT sensor technology can, on the one hand, realize highly sensitive collection of equipment operating information and, on the other hand, allow online management of substation equipment pre-test projects, so as to carry out online diagnosis and evaluation of the equipment operating status. It is important for promoting the development of substations toward intelligence and for enhancing the monitoring of substation safety performance.

  1. (2)

    Rapid and automatic information collection

The rapid power equipment information collection function builds a power equipment ledger database and a backend transmission server through the sensor nodes deployed in the smart grid, and uses the dedicated power network to collect quickly and effectively the operating status of smart grid equipment and environmental information, namely the power equipment information together with the temperature, humidity, oxygen, carbon dioxide, carbon monoxide, sulfur dioxide, methane, and other information about the operating environment, and then transfers them to the system to sense ambient air quality. Information collection and sensing for power equipment is also the most basic function of the system. To save the energy of the wireless sensor system, the deployment of the relevant sensor nodes needs to be optimized. The collected information is saved to the relevant database server so that the logical business processing system can analyze the data and check whether the collected values exceed preset warning values; the results can then be displayed on the monitoring interface of the power equipment information system for analysis by power system managers.
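A minimal sketch of the warning-value check described above is given below; the field names and threshold values are illustrative assumptions rather than the system's actual configuration.

```python
# Minimal sketch: comparing collected environmental readings against preset warning
# values before they are written to the database and shown on the monitoring interface.
# The threshold values and field names are illustrative assumptions.
WARNING_LIMITS = {
    "temperature_c": 85.0,
    "humidity_pct": 95.0,
    "co_ppm": 50.0,
    "so2_ppm": 5.0,
    "ch4_pct_lel": 10.0,
}

def check_reading(reading: dict) -> list:
    """Return the names of all quantities that exceed their preset warning value."""
    return [k for k, limit in WARNING_LIMITS.items()
            if k in reading and reading[k] > limit]

# Example: one sample reported by a sensor node attached to a transformer (hypothetical)
sample = {"device_id": "TX-0012", "temperature_c": 91.2, "humidity_pct": 60.0, "co_ppm": 3.1}
alerts = check_reading(sample)      # -> ["temperature_c"], flagged for the operator UI
```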

  1. (3)

    Information transmission

After the power equipment sensor nodes collect the relevant equipment operating information and environmental information, they use the wireless network to transmit the data to the aggregation node, ensuring that the power equipment environmental information is accurately saved in the backend database server. The specific transmission process is as follows: the sensor nodes are deployed on the power equipment and use their own temperature, humidity, and equipment status sensing functions to collect the equipment data and transmit them to the aggregation node, which then forwards them to the server through the 4G wireless network and the fiber optic network for logical business processing.

  1. (4)

    Information processing

After the power equipment and environmental data collected by the power IoT sensors are transmitted to the server, the power equipment information processing module is responsible for processing the collected equipment information. If the sensor server finds that some power equipment information exceeds the warning value set by the administrator, the sensing and detection equipment can send control commands to the power equipment through the ZigBee network and store the collected information in the server database. In power equipment, many sensors work together in a coordinated manner, and the information sensed by each sensor usually has different characteristics: it may be fuzzy, erroneous, mutually contradictory, complementary, or competitive. Because of these characteristics, no single piece of information may be sufficient to judge the real state of the measured object accurately, so multiple sensors of the same type, or several different types of sensors, are often needed for cooperative sensing to achieve comprehensive and accurate detection of the target. Further processing of the sensor information through cooperative sensing technology eventually yields the various equipment ledgers and completes information acquisition. Tables 3.1, 3.2, 3.3, 3.4, 3.5 and 3.6 show some of the multi-source disaster damage information collection equipment ledgers for power equipment in Wenzhou, Zhejiang.

Table 3.1 Distributed ledger
Table 3.2 Micro-meteorological accounts
Table 3.3 Mountain fire video ledger
Table 3.4 Converged intelligent terminal wire genie ledger
Table 3.5 Image surveillance ledger
Table 3.6 Ice cover online monitoring device account

3.2.4 Disaster Damage Multiple Information Collection Technology

To address the system fusion of the multiple IoT sensors deployed in the power grid and the need for rapid collection of disaster loss information, callback functions can be designed on the terminal nodes to achieve emergency data collection at the disaster site, and the collected data can be transferred back to the master node through a remote-function-call callback scheme. By unifying the data from the existing online monitoring sensors, fusing them with the image data acquired by UAVs, and feeding everything into the data center for analysis and processing, networked control of each sensor on the field power equipment can be realized; the processing results are finally displayed in the form of graphics and reports, and the data can be stored or forwarded to other system interfaces as required.

The ZigBee standard is used to achieve wireless data collection and sharing among the wireless sensors of power equipment. ZigBee is a short-range, low-rate, low-power, low-cost wireless sensor network standard. The ZigBee protocol uses the IEEE 802.15.4 standard at the physical and link layers and adds the network layer, security module, application support sub-layer, and other modules on top of it to enable large-scale networking. Devices in IEEE 802.15.4 networks are classified into two types according to their communication capabilities: full-function devices (FFD) and reduced-function devices (RFD). FFDs have all the features defined by IEEE 802.15.4 and can play any role in the network, whereas RFDs, because of their limited functionality, can communicate only with FFDs.

The ZigBee transmission network and the data center host computer communicate with the wireless sensors of on-site power equipment through networking. ZigBee defines three device roles: ZigBee coordinators, ZigBee routers, and ZigBee end devices. The coordinator is responsible for initializing, maintaining, and controlling the network; routers are responsible for data collection, relaying messages, and providing routing information; and the end nodes are responsible for data collection. Each network must have exactly one coordinator, the coordinator and routers must be FFDs, and the end nodes can be FFDs or RFDs. The ZigBee standard supports star, tree, and mesh network topologies. The ZigBee tree network is the most commonly used topology: the coordinator initializes the network, routers form the network branches and relay messages, and end nodes, as leaf nodes, do not participate in message routing. The collection terminal is responsible for collecting and pre-processing the power information of the switchgear in the substation and sending the data to the data center host computer using ZigBee's multi-hop technology. The data center completes the analysis and processing of the data, displays the final results in the form of graphs and reports, and stores the data as needed.

The wireless ZigBee data acquisition in this project consists of ZigBee data acquisition nodes and a master node (coordinator) responsible for communication with the computer, the OPC server, and the force control configuration software. The ZigBee data acquisition node is developed with the SNAP network protocol stack, a ZigBee protocol stack with an embedded Python virtual machine that allows application-layer scripts to be written, compiled, and downloaded over the air. It provides a simple, reliable, intelligent, and complete networking solution for complex ZigBee data transmission module networks, with significant power optimization and excellent redundancy thanks to its peer-to-peer network concept. The function of the end node is ZigBee data acquisition and data transmission. In the SNAP network stack, data transfer is achieved through remote function calls: on the terminal node, a callback function is designed to carry out emergency data collection at the disaster damage site, and the collected data are passed back to the master node by means of remote function calls.
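The callback-based collection pattern described above can be sketched as follows. This is a simplified, generic Python illustration of the idea (collect on request, return the data to the master node), not the actual SNAP/SNAPpy API; node IDs, sensor names, and the network address are hypothetical.

```python
# Illustrative sketch of the callback pattern: an end node collects emergency data and
# returns it to the master node through a remote call. Simplified stand-in, not SNAP.
import json
import socket

SENSOR_REGISTRY = {}

def register_sensor(name, read_fn):
    """Register a sensor read function under a logical name."""
    SENSOR_REGISTRY[name] = read_fn

def on_emergency_poll(request: dict) -> dict:
    """Callback invoked when the coordinator requests disaster-site data."""
    return {
        "node_id": request["node_id"],
        "readings": {name: read() for name, read in SENSOR_REGISTRY.items()},
    }

def reply_to_master(master_addr: tuple, payload: dict) -> None:
    """Send the collected data back to the master node (coordinator) over the uplink."""
    with socket.create_connection(master_addr, timeout=5) as sock:
        sock.sendall(json.dumps(payload).encode("utf-8"))

# Example wiring on a terminal node (sensor read functions are hypothetical)
register_sensor("temperature_c", lambda: 23.4)
register_sensor("tilt_deg", lambda: 0.7)
data = on_emergency_poll({"node_id": "EN-07"})
# reply_to_master(("192.0.2.10", 9000), data)   # commented out: address is illustrative
```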

3.2.5 Collection Data Processing

  1. (1)

    Causes of missing data and bad data

The communication capacity of wireless sensor network nodes is limited, so in power system tests or specific production operations, obstacles may cause temporary disconnection of the link or packet loss during data transmission. In addition, since wireless sensor networks are battery-powered, insufficient or exhausted energy may cause abnormal or missing data. Nodes can also be damaged for human reasons, which again leads to missing data, and instability of the working environment may likewise cause missing and abnormal data. Because of these inherent characteristics of wireless sensor networks, the generation of missing data and bad data is inevitable. Missing data and bad data affect the results of data analysis, so they must be pre-processed, that is, cleaned: missing data are handled by estimating values and filling in the missing entries with the estimates, while bad data are detected and corrected.

  1. (2)

    Relevant definitions

  1. 1)

    Bad data: data that deviate or are corrupted because of problems in the network and equipment are referred to as bad data. Bad data detection and identification means finding out whether the data sequence under test contains bad data, determining the location of the bad data, and finally correcting them. Bad data detection is important in the operation of power systems and even affects the corresponding decisions of power personnel.

  2. 2)

    Bad data detection: determining whether there is bad data during sampling for a specific measurement is called bad data detection.

  3. 3)

    Bad data identification: bad data identification occurs after bad data detection, when the presence of bad data is detected, the location of the data is determined, and this process is called bad data identification.

  4. 4)

    Bad data estimation: the estimation of bad data occurs after the identification of bad data, and the corresponding estimate is given for the bad data. The smaller the error between the estimated value and the true value is, the higher the accuracy of the estimation is.

Based on the above definitions of bad data detection, bad data identification, and bad data correction, Fig. 3.10 gives the specific processing flow.

Fig. 3.10
A flow chart with 8 steps. It begins with a readout quantity measurement data followed by 2 steps and decision box 1. If the answer is yes, it proceeds followed by 2 steps, decision box 2, and ends if the answer is no to condition 2. If the answer is no to condition 1, it loops back to step 3.

Flowchart of defective data detection and identification

  1. (3)

    Missing data interpolation algorithm

Because of the inherent characteristics of wireless sensor networks, missing data are unavoidable in a power wireless sensor network, so studying algorithms for handling missing data is very meaningful. Data interpolation originates from the concept of a function in mathematics: actual production and measurement yield only discrete data, and a function expresses the relationship between the variables more intuitively by turning the discrete data into a continuous model. In essence, data interpolation is function approximation, and its theory was gradually refined after the emergence of calculus. Data interpolation is now widely used in geology, meteorology, image processing, wireless sensor networks, and other fields.

Interpolation of missing data means establishing a functional relationship from the known information and then estimating the unknown value from that function. There are many interpolation methods for missing data, mainly the mean method, the plurality (mode) method, linear interpolation, nearest neighbor interpolation, and kriging interpolation.

  1. 1)

    Mean method

The mean method fills all missing values in a series with the mean of the non-missing values. For example, for a known series \(\left\{{a}_{1},{a}_{2},{a}_{3},\dots ,{a}_{19},{a}_{20}\right\}\), if the values of \({a}_{1}\) and \({a}_{19}\) are missing, the average of the non-missing values in the series is computed; if this average is \(b\), then \({a}_{1}\) and \({a}_{19}\) are both assigned the value \(b\). This method is simple, has low time complexity, and can be used when the data series is relatively smooth, in which case the accuracy of the estimates is high. However, when the data series is unstable, the estimated value may deviate greatly from the true value; results estimated with this method then provide no useful information and may even interfere with the data analysis.

  1. 2)

    Plurality method

The plurality (mode) method is another frequently used estimation algorithm for missing data. It selects the most frequently occurring value in the series to fill in the missing data. Simply put, for the series \(\{5,4,5,7,6,5,5,8,9,5,x\}\), where \(x\) is the missing value, the most frequent value in the series is 5, so \(x\) is assigned the value 5. The plurality method applies the principle of probability, using the value with the highest probability of occurrence to estimate the missing value. Like the mean method, it is simple and easy to implement, but when the data series oscillates, the estimated value can deviate greatly from the actual value.

  1. 3)

    Linear interpolation method

Linear interpolation uses the values of the data adjacent to the missing data to estimate the size of the current missing data. The calculation is as follows: \({y}_{ij}^{*}={y}_{im}+\frac{{y}_{in}-{y}_{im}}{n-m}\left(j-m\right)\), where \({y}_{ij}^{*}\) is the estimate of the missing data \({y}_{ij}\), \({y}_{im}\) and \({y}_{in}\) are the two non-missing data that are nearest neighbors to \({y}_{ij}\) and \(m\le j\le n\).

Like the plurality method and the mean method, the linear interpolation method is simple to compute and has low time complexity, using only the values of the two moments closest to the missing data. However, it too is applicable only to smooth time series; if the series is unstable or data are missing over long stretches, the bias is larger and the estimation accuracy is insufficient.
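A direct implementation of the linear interpolation formula above might look like the following sketch; the sample values are illustrative.

```python
# Direct implementation of the linear interpolation formula given above: estimate a
# missing sample y_ij from its nearest non-missing neighbours y_im and y_in (m <= j <= n).
def linear_interpolate(y_im: float, m: int, y_in: float, n: int, j: int) -> float:
    return y_im + (y_in - y_im) / (n - m) * (j - m)

# Example: samples at t=2 and t=6 are known, the sample at t=4 is missing
estimate = linear_interpolate(y_im=10.0, m=2, y_in=18.0, n=6, j=4)   # -> 14.0
```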

  1. 4)

    Kriging interpolation method

Kriging interpolation is a better interpolation method that achieves linear optimality and is unbiased. It mainly uses the spatial distribution of the data and is suitable for estimating missing data among variables that are spatially correlated. The kriging method originated early, but in recent years many experts have studied it in more depth and combined it with other disciplines, developing a number of new methods: some combine it with fuzzy theory, leading to the fuzzy kriging method, and some combine it with trigonometric functions, a method called trigonometric kriging.

  1. 5)

    Nearest neighbor interpolation method

The nearest-neighbor interpolation method exploits the property that physically neighboring nodes are similar, an approach proposed by the meteorologist A. H. Thiessen. The method is mainly used in meteorology, where it developed from using rainfall data from scattered weather stations to calculate average rainfall, and it is often used in GIS for fast assignment by nearest-neighbor interpolation. In theory, applying nearest-neighbor interpolation assumes (as its implicit condition) that the data value of any grid point \(p\left(x,y\right)\) can be replaced by the value of the nearest point with data; that is, each grid node takes the value of its closest node as its own. The algorithm can be applied to data that are distributed at uniform time intervals and have been converted to a grid file, as well as to files that contain only a small number of missing values, in which case the points with no values are filled with the nearest values.

Based on the performance analysis of the above interpolation methods, this project uses the nearest-neighbor interpolation method to interpolate the grid data.
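A minimal, brute-force sketch of the nearest-neighbor rule adopted here is shown below; the coordinates and values are illustrative, and a real implementation would typically use a spatial index for large grids.

```python
# Minimal sketch of the nearest-neighbour rule: each missing grid point takes the value
# of the closest point that has a value. Pure-Python, brute-force version for illustration.
import math

def nearest_neighbour_fill(points: dict, missing: list) -> dict:
    """points: {(x, y): value} for known grid points; missing: [(x, y), ...]."""
    filled = {}
    for mx, my in missing:
        nearest = min(points, key=lambda p: math.hypot(p[0] - mx, p[1] - my))
        filled[(mx, my)] = points[nearest]
    return filled

known = {(0, 0): 1.2, (0, 3): 1.8, (4, 0): 2.5}
print(nearest_neighbour_fill(known, [(1, 1), (3, 3)]))   # {(1, 1): 1.2, (3, 3): 1.8}
```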

  1. (4)

    Detection and identification of bad data

  1. 1)

    Traditional bad data detection and identification methods

The commonly used methods for detecting bad data are the target extreme value detection method, the weighted residual method, the standard residual method, and the measurement mutation detection method. The target extreme value detection method and the weighted residual method both have certain disadvantages, namely residual contamination and residual flooding. The measurement mutation detection method can largely avoid residual contamination and flooding, but it places restrictions on the samples: for example, the structure of the system must not change between two adjacent samples, and the measured quantity must not change suddenly either. Residual contamination means that, in addition to the residuals of bad data points identifying those points as bad, normal values are also identified as bad data; residual flooding refers to the interaction between multiple bad data points, which makes the residuals at some bad data points close to normal residual values. Residual contamination and residual flooding can cause both missed detection of bad data and false detection (i.e., normal values identified as bad data). Other bad data detection methods include the pseudo-measurement mutation detection method.

The traditional methods commonly used for bad data identification include the estimation identification method, the residual search algorithm, and the zero-residual method. Their principles are similar: weighted residuals or standard residuals are taken as the reference quantity, a threshold is calculated at a certain confidence level using the principle of hypothesis testing, and the measured values are then judged against it.
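In the spirit of the residual-based tests described above, the following sketch flags samples whose standardized residual exceeds a threshold chosen from the confidence level; it uses a robust median/MAD scale estimate so that a bad point does not inflate its own detection threshold. The data, the threshold, and the robust variant itself are illustrative assumptions, not the project's method.

```python
# Minimal sketch of residual-based bad-data detection: standardise the residuals between
# measured and estimated values and flag samples exceeding the threshold (e.g. 3.0).
import statistics

def detect_bad_data(measured, estimated, threshold=3.0):
    residuals = [m - e for m, e in zip(measured, estimated)]
    med = statistics.median(residuals)
    mad = statistics.median(abs(r - med) for r in residuals) or 1e-9
    sigma = 1.4826 * mad              # MAD scaled to be comparable with a standard deviation
    return [i for i, r in enumerate(residuals) if abs(r - med) / sigma > threshold]

measured  = [10.1, 10.2, 10.0, 14.9, 10.1, 9.9]
estimated = [10.0, 10.1, 10.1, 10.0, 10.0, 10.0]
print(detect_bad_data(measured, estimated))   # -> [3], the suspect measurement
```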

  1. 2)

    New method of bad data detection and correction

At present, as research on the detection and correction of bad data deepens, some scholars have proposed applying algorithms from data mining, machine learning (neural networks), and statistics to the detection and identification of bad data.

After preliminary analysis of the collected images, the project team found that the data were unevenly distributed and that some disaster samples were scarce. The main problems were as follows. First, the unbalanced distribution of the data set and the long-tail problem, especially the small number of negative samples of abnormal defects, affect the convergence of model training and the process of data clustering and analysis, which is unfavorable to training disaster damage identification models. Second, target detection for small targets with few samples requires a certain degree of data augmentation and data expansion, which can also reduce model accuracy. Third, there are server resource and computing power problems: large-scale image and video data acquisition requires support for high-bandwidth concurrent transmission, placing high demands on server throughput, and data cleaning also requires corresponding computing resources. Fourth, there is a conflict between the coverage of data collection and the generalization of the model: accuracy decreases if the model is migrated to other regions or trained with weakly related data. The project team therefore adopted a multi-pronged approach. First, it conducted several rounds of field collection in Zhejiang in October, November, and December 2020 and January 2021 to accumulate images of equipment in the normal state; second, it visited grassroots units and contacted Wenzhou Transmission and Transformation Company directly to collect disaster damage images of power equipment gathered by front-line power sensors and mobile terminals; third, it drew on resources from other provincial companies to obtain additional disaster damage data of power equipment. In the end, 5620 valid images were obtained for training, completing the multi-dimensional collection of grid damage information and supporting the construction of subsequent damage recognition models.

3.3 Disaster Identification Technology of Power Grid Line Based on UAV

3.3.1 Overview

In this section, a UAV automatic detection system is used to identify power disaster scenes and to realize automatic identification and survey of the disaster site. UAV route survey mainly involves the UAV automatic detection system and UAV image recognition technology, with the aim of quickly acquiring damage information on power facilities and equipment over a wide area.

UAV automatic detection is mainly intended to detect damage to power facilities and equipment quickly from a distance and to identify serious scenarios, such as UHV tower collapse and substation flooding, at the UAV end, as shown in Figs. 3.11 and 3.12.

Fig. 3.11
A photograph of a power facility filled with flood water. The power transmission transformers are immersed in water. There are 2 warning sign boards around them. The text on the boards is in a foreign language.

Flooding scenario of power facilities

Fig. 3.12
A photograph of a collapsed power tower. The upper part of the tower is bent and has fallen down.

Damage scene of power tower

The UAV automatic inspection system consists of two major parts: a digital airborne component performance inspection subsystem and an analog airborne component performance inspection subsystem. After a DJI M300 RTK aircraft carries the airborne components on a flight survey, ground-based automatic test equipment is used for testing and analysis. The automatic test equipment performs test control, data acquisition, and fault inference and is built on a PXI bus architecture; it includes an 8-slot PXI chassis assembly, a PXI-8106 zero-slot controller, a DA card, an AD card, a relay control card, a serial communication card, and a multimeter card, for a total of six PXI plug-in cards. The interface adapter is an integrated control box that combines power supply, filtering, driving, and level conversion. The device under test is excited by the excitation system, and the response signal converges at the interface adapter, after which the automatic test equipment collects, processes, analyzes, and displays it.

UAV target image recognition mainly uses image segmentation (threshold judgment), classifier-based, feature point-based, inter-frame difference, and background difference methods for target image segmentation and recognition. The so-called "threshold" is the boundary of a field or a system, and its value is called the threshold value. Threshold-based image segmentation is a region segmentation technique that is especially effective for images with strong contrast between the objects and the background. It is simple to compute and always yields non-intersecting regions with closed, connected boundaries. Threshold segmentation is performed using methods such as grayscale histogram peak-and-valley localization and image filtering, as shown in Fig. 3.13. The basic principle is to select one or more grayscale thresholds within the grayscale range of the image, compare the grayscale value of each pixel with the thresholds, and, according to the result of the comparison, divide the corresponding pixels into two or more classes, thereby partitioning the image into a collection of non-overlapping regions and achieving segmentation. As shown in the figure, in some simple images the grayscale distribution of the target is fairly regular: the background and each target each form a peak in the grayscale histogram of the image, i.e., regions and peaks correspond one to one. Since a valley forms between peaks, two regions can be separated by selecting the gray value corresponding to the valley between their peaks as the threshold. This project realizes pre-processing, segmentation, and recognition of UAV target images based on image segmentation (threshold judgment) and achieves automatic survey and recognition of the disaster site. The recognition effect is shown in Fig. 3.14.
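One common way to realize the peak-and-valley idea automatically is Otsu's method, sketched below with OpenCV; the file name and smoothing parameters are illustrative assumptions.

```python
# Minimal sketch (OpenCV assumed) of threshold-based segmentation: compute the grey-level
# histogram, then pick a single threshold separating the two histogram peaks with Otsu's
# method, one common realisation of the peak-and-valley idea described above.
import cv2

gray = cv2.imread("uav_frame.png", cv2.IMREAD_GRAYSCALE)    # illustrative file name

# Optional smoothing suppresses noise so the histogram valleys are better defined
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
hist = cv2.calcHist([blurred], [0], None, [256], [0, 256])  # histogram, e.g. for inspection

# Otsu picks the threshold that best separates the two histogram modes
thresh_value, binary = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# 'binary' now partitions the image into target and background regions
```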

Fig. 3.13
A histogram of the y-axis values versus the x-axis values. It has a fluctuating trend of bars.

Grayscale histogram peak and valley method

Fig. 3.14
An aerial photograph of a tower with surrounding trees and roads with vehicles. The labeled or highlighted parts are glass insulators 0.96, glass insulators 0.64, connection fixture 0.92, connection fixture 0.99, connection fixture 1.00, connection fixture 0.92, and connection fixture 0.97.

UAV target recognition

3.3.2 Principle of Intelligent Analysis Algorithm for UAV Main Network Disaster Detection

A mixture of intelligent algorithms will be used to achieve the best results for each type of disaster damage pattern. The different classes of algorithms proposed for use are described below.

  1. (1)

    Contrastive Learning

Contrastive learning is a deep learning paradigm that deals with the task of telling similar and dissimilar things apart. Using this approach, we can train machine learning models to distinguish between similar and different images. The basic concepts are as follows.

  1. 1)

    Unsupervised representation learning for images/text.

  2. 2)

    Motivation: algorithms often learn to distinguish things by comparison, as shown in Fig. 3.15. The model does not need to learn overly specific details (pixel level for images, word level for text), but only features at a high enough level to distinguish between objects.

    Fig. 3.15
    2 aerial photographs of 2 towers and the surrounding area with trees and ground. A is the photograph before comparative analysis. It appears slightly hazy. B shows high-level features after comparison. It appears clear.

    Contrast learning does not require learning overly specific details

The difference from the previous supervised/unsupervised approach is shown in Fig. 3.16.

Fig. 3.16
2 process diagrams. A. In generation or prediction, data x 0 goes to a neural network to give data x 1. It includes output space loss calculation. B. In contrast learning, data x 0 and x 1 go to a neural network to give classification. It includes representational space loss calculation.

Difference between supervised/unsupervised methods

  1. (2)

    Contrast learning in the image field

Contrastive learning is widely used for unsupervised representation learning in the image domain, represented by MoCo (ICML 2020) and SimCLR (2020), and has achieved significant improvements on the ImageNet dataset. The core of contrastive learning lies in how to construct the sets of positive and negative samples. In the image field, operations such as rotation and cropping are generally used, while in the text field, methods such as back-translation and character insertion and deletion are common. These methods rely on domain experience and lack diversity and flexibility; data augmentation through adversarial attacks can also be performed to obtain positive and negative samples.

SimCLR uses a simple contrastive learning framework that encourages views sampled from the same image to be as close as possible in the representation space and views sampled from different images to be as far apart as possible. The contrastive loss is the in-batch cross-entropy (NT-Xent) loss:

$${l}_{i,j}=-\log \frac{\exp \left(\mathrm{sim}\left({z}_{i},{z}_{j}\right)/\tau \right)}{{\sum }_{k=1}^{2N}{1}_{\left[k\ne i\right]}\exp \left(\mathrm{sim}\left({z}_{i},{z}_{k}\right)/\tau \right)}$$
(3.1)

The numerator contains the positive pair; the denominator contains one positive pair and 2N−2 negative pairs. The model structure is shown in Fig. 3.17.
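A minimal PyTorch sketch of the in-batch contrastive (NT-Xent) loss of Eq. (3.1) is given below for reference; it is an illustrative implementation, not SimCLR's own code, and the batch layout (rows 2k and 2k+1 are the two views of image k) and the temperature value are assumptions.

```python
# Minimal PyTorch sketch of the NT-Xent loss in Eq. (3.1): z is a (2N, d) tensor in which
# rows 2k and 2k+1 are the two augmented views of the same image.
import torch
import torch.nn.functional as F

def nt_xent_loss(z: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    z = F.normalize(z, dim=1)                     # cosine similarity via dot product
    sim = z @ z.t() / tau                         # (2N, 2N) similarity matrix
    n2 = z.size(0)
    sim.fill_diagonal_(float("-inf"))             # exclude k == i terms from the denominator
    pos = torch.arange(n2) ^ 1                    # positive partner index: (0,1), (2,3), ...
    return F.cross_entropy(sim, pos)              # -log softmax picks out the positive pair

loss = nt_xent_loss(torch.randn(8, 128))          # 2N = 8 views, i.e. N = 4 images
```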

Fig. 3.17
A flow diagram. x gives x i tilde and x j tilde via t and t prime approximately equal to tau. X i tilde and x j tilde give h i and h j via f of dot. H i and h j give z i and z j via g of dot. H i and h j represent the data representation. z i and z j represent the maximization criterion.

simCLR contrastive learning model structure

In addition, various image augmentation methods are used, including cropping, flipping, rotation, Gaussian noise, masking, color transformations, filters, and so on.

Different augmentation methods have different effects on an image of the same scene, and choosing them reasonably and at the right time plays an important role in the subsequent recognition of disaster damage from images, as shown in Fig. 3.18.
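As an illustration, a torchvision pipeline combining several of the augmentations listed above might be written as follows; the specific parameter values are examples only.

```python
# Illustrative torchvision pipeline combining several of the augmentations listed above
# (crop, flip, rotation, colour transform, Gaussian blur); parameters are examples only.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.6, 1.0)),    # cropping
    transforms.RandomHorizontalFlip(p=0.5),                 # flipping
    transforms.RandomRotation(degrees=15),                  # rotation
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),             # colour transformation
    transforms.GaussianBlur(kernel_size=5),                 # filter / blur
    transforms.ToTensor(),
])
# augmented = augment(pil_image)   # each call yields a differently perturbed view
```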

Fig. 3.18
5 aerial photographs of a tower and the surrounding area with grass, trees, buildings, and roads. The photographs are the original photograph, cover photograph, filter photograph, rotation photograph, and cutting photograph of the same area.

Various image enhancement methods

  1. (3)

    Foreground and background separation

Semantic segmentation techniques from machine vision are used to separate the foreground and background of images.

  1. 1)

    Semantic segmentation definition

Semantic segmentation is classification at the pixel level: pixels of the same class are grouped into one category, so semantic segmentation understands the image at the pixel level. For example, in the photos below, pixels belonging to the tower pole are assigned to one category, pixels belonging to cross-arms, insulators, and bird's nests are also grouped into one category, and the background pixels form another category. Semantic segmentation differs from instance segmentation: if there are several people in a photo, semantic segmentation simply groups all of their pixels together, whereas instance segmentation also separates the pixels of different people into different classes. In this sense, instance segmentation goes a step further than semantic segmentation. Figure 3.19 shows a before-and-after comparison of semantic segmentation.

Fig. 3.19
A photograph and an illustration. The photograph has a tower pole with cross arms, insulators, wires, and a bird's nest. The illustration presents the semantic segmentation. It has wires and the background in different color gradients. The nest and the tower pole are in the same color gradient.

Comparison of before and after semantic segmentation

  1. 2)

    Deep learning for semantic segmentation scheme

Deep learning methods have proved effective for semantic segmentation, and the deep learning approaches to semantic segmentation can be summarized as several ideas, introduced in detail below.

  1. a)

    Full convolution method

The fully convolutional network (FCN) replaces the fully connected layers of the network with convolutions, which makes it possible to use an input image of any size and is much faster than the patch classification approach.

Even with the fully connected layers removed, CNN models used for semantic segmentation still face a problem: the downsampling operations (e.g., pooling). Pooling expands the receptive field and integrates contextual information well; intuitively, more information is integrated into each decision, which is very effective for high-level tasks such as classification. At the same time, however, pooling reduces the resolution and therefore weakens the location information, whereas semantic segmentation requires the score map to be aligned with the original image and thus needs rich location information.

  1. b)

    Encoder-decoder architecture

The encoder-decoder architecture is based on the FCN. The encoder gradually reduces the spatial dimensions through pooling, and the decoder gradually restores the spatial dimensions and detail information. Usually there are also shortcut connections from the encoder to the decoder (also called cross-layer connections), as shown in Fig. 3.20.
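The encoder-decoder idea with a single shortcut connection can be sketched in PyTorch as follows; the channel sizes and layer choices are illustrative, not a reproduction of any particular network.

```python
# Minimal PyTorch sketch of the encoder-decoder idea with one shortcut (skip) connection,
# in the spirit of Fig. 3.20; channel sizes are illustrative.
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    def __init__(self, in_ch=3, num_classes=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)                         # encoder reduces spatial size
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)   # decoder restores resolution
        self.head = nn.Conv2d(32, num_classes, 1)           # per-pixel class scores

    def forward(self, x):
        e = self.enc(x)                                     # high-resolution features
        d = self.up(self.mid(self.pool(e)))                 # downsample, then upsample
        d = torch.cat([d, e], dim=1)                        # shortcut: concatenate encoder features
        return self.head(d)

logits = TinyEncoderDecoder()(torch.randn(1, 3, 64, 64))    # -> (1, 2, 64, 64)
```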

Fig. 3.20
An illustration presents input picture information and output segmentation information. It has multiple steps with convolved 3 by 3 and relu is activated, copy and cut, maximum pooling 2 by 2, and ascending convolution 2 by 2.

Shortcut connection

  1. c)

    Dilated convolution (dilated/atrous convolution)

The dilated (atrous) convolution architecture replaces pooling. On the one hand, it maintains spatial resolution; on the other hand, it integrates contextual information well because it enlarges the receptive field, as shown in Fig. 3.21.

Fig. 3.21
3 3 by 3 convolution kernel matrices with different ratios that represent feature maps. The first feature map has a ratio of 1. The second feature map has a ratio of 6. The third feature map has a ratio of 24.

Dilated convolution

This technical service project will verify the practical effect of each of the above solutions and adopt the optimal one to solve the problem of intelligent disaster damage analysis.

  1. (4)

    Target detection

Target detection is a more practical and challenging computer vision task that can be seen as a combination of image classification and localization. Given a picture, the target detection system has to be able to identify the target of the picture and give its location. Since the number of targets in the image is variable and the exact location of the target is to be given, target detection is more complex than the classification task.

  1. 1)

    A review of commonly used Object Detection algorithms

The common classical target detection algorithms are shown in Fig. 3.22.

Fig. 3.22
A diagram presents a classical target detection algorithm. It has a new trend. The new trend has 2 sections. Section 1 has first order, classical deep learning model, second order, with R C N N, Fast and faster R C N N, YOLO, and S S D. Section 2 has foundation and norm and gaudy.

Classical target detection algorithm

The basic idea of target detection is to solve localization and recognition simultaneously; it is a multi-task learning problem with two output branches. One branch performs image classification through fully connected layers + softmax to determine the target class; unlike plain image classification, an additional "background" class is needed here. The other branch determines the target location by outputting four numbers that mark the bounding-box location (e.g., center coordinates plus width and height), completing a regression task. The output of this branch is used only when the classification branch does not judge the region to be "background". The detailed structure is shown in Fig. 3.23, and a minimal sketch of the two branches follows the figure.

Fig. 3.23
A 3-D diagram presents multi-task learning. Pre-treatment of the image is followed by feature extraction. Feature extraction has suggestion generation with object classification and box regression. Box regression has frame classification with composite multi-classification and frame purification.

Multi-task learning
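
A minimal sketch of the two output branches described above, with an extra "background" class and a four-number box regression; the feature dimension and class count are illustrative assumptions, not the architecture used in this project.

```python
import torch
import torch.nn as nn

class DetectionHead(nn.Module):
    """Shared features feed a classification branch and a box-regression branch."""
    def __init__(self, feat_dim=256, num_classes=20):
        super().__init__()
        # +1 for the extra "background" class
        self.cls_branch = nn.Linear(feat_dim, num_classes + 1)
        # four numbers: center x, center y, width, height of the bounding box
        self.box_branch = nn.Linear(feat_dim, 4)

    def forward(self, feat):
        cls_scores = self.cls_branch(feat).softmax(dim=-1)
        box = self.box_branch(feat)   # only meaningful if the class is not "background"
        return cls_scores, box

head = DetectionHead()
scores, box = head(torch.randn(1, 256))
print(scores.shape, box.shape)  # torch.Size([1, 21]) torch.Size([1, 4])
```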

A traditional target detection framework consists of three main steps:

  1. 1)

    Use sliding windows of different sizes to frame a certain part of the image as a candidate area;

  2. 2)

    Extract visual features from the candidate regions, for example the Haar features commonly used in face detection, or the HOG features commonly used in pedestrian detection and general object detection;

  3. 3)

    Use classifiers for recognition, such as the commonly used SVM models.

Current deep learning methods in the field of target detection fall into two categories: two-stage (Two Stages) target detection algorithms and one-stage (One Stage) target detection algorithms.

Two Stages: First, the algorithm generates a series of candidate frames as samples, and then the samples are classified by convolutional neural network.

Common algorithms are R-CNN, Fast R-CNN, Faster R-CNN, etc.

One Stage: This class of methods does not require the generation of candidate boxes, and directly converts the problem of target box location into a regression problem processing (Process).

Common algorithms are YOLO, SSD, etc.

Candidate-region (Region Proposal) based methods: R-CNN, SPP-net, Fast R-CNN, Faster R-CNN, R-FCN.

End-to-end methods without candidate regions (Region Proposal): YOLO, SSD.

Between the two approaches, the Region Proposal-based methods are superior in detection accuracy and localization precision, while the end-to-end methods are superior in speed: YOLO only needs to "look once", whereas the R-CNN series needs to "look twice" (candidate-box extraction and then classification). In summary, the Region Proposal-based approach still prevails for now, but the end-to-end approach has a clear speed advantage, and its subsequent development remains to be seen.

  1. 2)

    Target detection candidate frame generation mechanism

With the rapid development of deep learning, papers such as R-CNN, SPP-Net, and Fast R-CNN all describe how candidate bounding boxes are generated and filtered. So how are the candidate boxes created, and how are they filtered? In practice, object candidate-box generation currently relies mainly on image segmentation and region-growing techniques. Region growing (merging) exploits the fact that objects in an image exhibit local region similarity (color, texture, etc.). The development of target recognition and image segmentation techniques further promotes the effective extraction of information from images.

According to how target candidate regions are extracted, traditional target detection algorithms can be divided into sliding-window-based algorithms and selective-search-based algorithms. The sliding-window method is a classical object detection approach: windows of different sizes slide over the image, and an already trained classifier discriminates whether an object is present in each window. Selective search mainly uses image segmentation techniques for object detection.

  1. a)

    Sliding Window target detection

The sliding window approach is a simple target detection algorithm that transforms the detection problem into an image classification problem. The basic principle is to slide windows of different sizes and aspect ratios (width-to-height ratios) over the entire image with a certain stride. Then, image classification is performed on the corresponding regions within these windows, enabling detection of objects in the entire image. However, this method has a fatal drawback: you do not know the scale of the target object that needs to be detected. Therefore, you need to set windows of different sizes and aspect ratios to slide, and also choose an appropriate stride. This approach generates many sub-regions that all need to be passed through the classifier for prediction, which requires significant computational power. Hence, the classifier cannot be too complex to ensure speed. Next, let's take a look at the flowchart of the object detection process using the sliding window method, as shown in Fig. 3.24.

Fig. 3.24
A flowchart with photographs. It begins with an input picture of a tower followed by generating slide windows. There is a trained classifier. The classifier scores the window for insulation and exceptions and filters the window with the highest score picture followed by non-maximum suppression.

Sliding window method object detection flow chart

The specific steps of the sliding window method can be analyzed through the flowchart as follows: First, the input image is subjected to sliding window operation with different window sizes, sliding from left to right and top to bottom. Each time the window slides, the classifier (pre-trained) is applied to the current window. If the current window obtains a high classification probability, it is considered as detecting an object. After detecting with different window sizes, there will be overlapping regions with high repetitions among the detected windows. Finally, non-maximum suppression (NMS) is applied to filter out redundant detections. As a result, the detected objects are obtained after the NMS filtering.

The sliding window method is simple and easy to understand, but searching the entire image with windows of different sizes is inefficient, and designing the window sizes also requires considering the aspect ratio of the objects. Therefore, for applications with high real-time requirements, the sliding window method is not recommended. A short code sketch of the procedure follows.
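
A minimal Python sketch of the sliding-window flow in Fig. 3.24; `classify` stands in for any pre-trained classifier, and the window sizes, stride, and threshold are illustrative assumptions.

```python
import numpy as np

def sliding_window_detect(image, classify, sizes=((64, 64), (128, 64)), stride=16, thresh=0.8):
    """Slide windows of several sizes/aspect ratios over the image and keep
    the windows whose classifier score exceeds the threshold."""
    h, w = image.shape[:2]
    detections = []
    for win_h, win_w in sizes:
        for top in range(0, h - win_h + 1, stride):        # top to bottom
            for left in range(0, w - win_w + 1, stride):   # left to right
                patch = image[top:top + win_h, left:left + win_w]
                score = classify(patch)                    # pre-trained classifier
                if score >= thresh:
                    detections.append((left, top, win_w, win_h, score))
    return detections  # overlapping boxes still need non-maximum suppression

# dummy classifier ("detects" bright patches), just to exercise the loop
dets = sliding_window_detect(np.random.rand(256, 256), classify=lambda p: p.mean())
print(len(dets))
```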

  1. b)

    Selective Search

The sliding window method amounts to an exhaustive search of image sub-regions, but in most cases the majority of sub-regions in the image do not contain objects. Researchers therefore naturally thought of searching only the regions most likely to contain objects in order to improve computational efficiency. Selective search is such a method and is a widely used algorithm for extracting candidate bounding boxes from images; it is shown in Fig. 3.25.

Fig. 3.25
A flow chart with a photograph and 4 illustrations of transmission wires. It begins with an aerial photograph of 4 transmission wires over a road and trees. It presents a procedure for generating detection information, a segmentation algorithm, followed by merges.

Selective search method object detection flow chart

The main idea of the selective search algorithm is that regions of the image where objects may exist should exhibit some similarity or continuity. Therefore, candidate bounding boxes are extracted by merging sub-regions. First, the input image is segmented with a segmentation algorithm into numerous small sub-regions (on the order of 2000). Then, based on the similarity between these sub-regions (measured by color, texture, size, etc.), regions are merged iteratively; during each iteration, bounding boxes are generated for the merged sub-regions, and these are the candidate boxes.
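
As a hedged usage sketch (it assumes the opencv-contrib-python package, whose ximgproc module bundles a selective-search implementation; the file name is a placeholder), candidate boxes can be generated as follows.

```python
import cv2

image = cv2.imread("disaster_site.jpg")   # placeholder path for a survey photo

ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(image)
ss.switchToSelectiveSearchFast()          # trade some recall for speed

boxes = ss.process()                      # candidate boxes as (x, y, w, h)
print(len(boxes), "candidate boxes, e.g.", boxes[0])
```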

  1. c)

    Advantages of selective search

①:

The computational efficiency is better than that of the sliding window method.

②:

Since the sub-region merging strategy is used, it can contain various sizes of suspected object frames.

③:

The diversity of similar indicators in merged regions improves the probability of detecting objects.

  1. 3)

    Overlap of the predicted and manually labeled boxes

To evaluate how well the Bounding-box Regression model localizes target objects, we introduce the concept of Intersection over Union (IoU). Object detection requires not only localizing the bounding box of the object, as illustrated in Fig. 3.26, but also identifying the object within that bounding box, here the vehicle itself. Because the algorithm cannot match the manually annotated boxes perfectly, an evaluation measure of localization accuracy is needed; this measure is the IoU, which defines the degree of overlap between two bounding boxes, as depicted in Fig. 3.26.

Fig. 3.26
2 photographs and a diagram. The photographs have a tower with transmission lines outlined by bounding boxes A and B. The diagram has 2 rectangular boxes A and B. Boxes A and B overlap each other and the overlapped region represents A intersection B.

Overlap between the predicted and manually labeled boxes

The overlap IoU of rectangular boxes A and B is calculated as:

$$\textit{IOU}=\frac{\left(A\cap B\right)}{\left(A\cup B\right)}$$
(3.2)

Equivalently, writing SI for the intersection area of A and B, and SA and SB for their respective areas, the IoU is the ratio of the overlapping area to the area of the union of A and B (a short implementation sketch follows the equation):

$$\textit{IOU}=\frac{SI}{\left(SA+SB-SI\right)}$$
(3.3)
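
Equations (3.2) and (3.3) translate directly into code; the (x, y, w, h) box format below is an assumption for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))   # overlap width
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))   # overlap height
    si = ix * iy                                         # intersection area SI
    sa, sb = aw * ah, bw * bh                            # areas SA and SB
    return si / (sa + sb - si) if si > 0 else 0.0        # Eq. (3.3)

print(iou((0, 0, 10, 10), (5, 5, 10, 10)))  # 25 / 175, about 0.143
```
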
  1. 4)

    Non-Maximum Suppression (NMS)

When studying the R-CNN algorithm, it is crucial to understand a significant concept called Non-Maximum Suppression (NMS). For example, we aim to identify numerous Bounding-boxes that potentially contain objects from an image. Subsequently, we compute the probabilities of each rectangular box belonging to a specific category, as shown in Fig. 3.27.

Fig. 3.27
2 photographs of a tower over a mountainous region. The first photograph has multiple rectangular boxes around the tower. It undergoes a non-maximum suppression to give the same photograph with only one rectangular box around the tower.

Non-extreme value suppression

As shown in the above image, if we want to locate a vehicle, the algorithm ultimately finds a pile of boxes, each corresponding to a probability of belonging to the car category. We need to determine which rectangles are redundant. The method employed is Non-Maximum Suppression (NMS): let’s assume there are six bounding boxes, and we sort them in ascending order based on the probabilities of belonging to the vehicle class, denoted as A, B, C, D, E, and F.

Starting from the maximum probability rectangle F, we can determine whether the overlap IOU between A~E and F is greater than a set threshold.

If the overlap of B and D with F exceeds the threshold, throw away B and D, and mark F as the first rectangle we keep.

From the remaining rectangles A, C, and E, select E, which has the highest probability, and then compute the overlap of E with A and C. If the overlap is greater than the threshold, throw that rectangle away, and mark E as the second rectangle we keep.

This cycle is repeated until no rectangles remain; the rectangles that are kept are those considered most likely to contain the car. A minimal implementation sketch follows.
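
A minimal, self-contained sketch of this greedy procedure; the box format (x, y, w, h) and the threshold value are assumptions for illustration.

```python
def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS over boxes given as (x, y, w, h): keep the highest-scoring
    box, discard boxes whose overlap with it exceeds the threshold, repeat."""
    def iou(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
        ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
        inter = iw * ih
        return inter / (aw * ah + bw * bh - inter) if inter > 0 else 0.0

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # highest remaining probability (e.g., F)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (50, 50, 10, 10)]
print(non_max_suppression(boxes, scores=[0.9, 0.8, 0.7]))  # [0, 2]
```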

  1. a)

    The specific approach of NMS in R-CNN algorithm:

Suppose there are 20 categories and 2000 suggestion boxes, so the final output matrix has dimension 2000 × 20; each column corresponds to one category and each row contains the scores of one suggestion box. The NMS algorithm steps are as follows:

①:

Sorting each column in the 2000 × 20 dimensional matrix from largest to smallest;

②:

Starting from the highest-scoring suggestion box in each column, compute the IoU with each lower-scoring suggestion box in that column. If IoU > threshold, the suggestion box with the smaller score is excluded; otherwise, it is considered that the image contains multiple objects of the same class;

③:

Repeat step ② starting with the next largest scoring suggestion box in each column;

④:

Repeat step ③ until all suggestion boxes in that column are traversed;

⑤:

Iterate through all columns of the 2000 × 20 dimensional matrix, i.e., do non-maximal suppression for all object categories;

⑥:

Finally, eliminate the remaining suggestion boxes in each category that have scores less than the threshold value of that category.

  1. 5)

    Specific Crop/Warp practices for Region Proposal suggestion boxes

In the R-CNN paper, anisotropic scaling with padding = 16 achieved the highest accuracy. The authors employed a simple transformation where, regardless of the size of the candidate region, a context region (referring to the surrounding pixels of the Region of Interest or RoI in the image) with a padding of 16 pixels, filled with the average pixel value of the RoI, is added. Then, the region is directly transformed to a size of 227 × 227.

  1. a)

    Anisotropic scaling (non-isometric scaling)

This method is simple: regardless of the image's aspect ratio and of whether distortion results, every region is scaled directly to the CNN input size of 227 × 227, so the shape may be proportionally distorted.
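
A hedged OpenCV/numpy sketch of this anisotropic scaling with 16 pixels of context padding filled with the RoI mean; the function and its details are our simplified illustration, not the original R-CNN code.

```python
import cv2
import numpy as np

def warp_proposal(image, box, out_size=227, pad=16):
    """Crop a proposal (x, y, w, h assumed to lie inside the image), add 16 px
    of context filled with the RoI mean pixel value, and warp to out_size x out_size."""
    x, y, w, h = box
    roi = image[y:y + h, x:x + w]
    fill = roi.reshape(-1, roi.shape[-1]).mean(axis=0)            # RoI mean pixel value
    padded = np.empty((h + 2 * pad, w + 2 * pad, roi.shape[-1]), dtype=image.dtype)
    padded[...] = fill
    padded[pad:pad + h, pad:pad + w] = roi
    # anisotropic scaling: aspect ratio is ignored, so the shape may distort
    return cv2.resize(padded, (out_size, out_size))
```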

  1. b)

    Isotropic scaling

Because image distortion affects the training accuracy of the subsequent CNN, the authors also tested an "isotropic scaling" scheme. There are two ways to do this.

①:

Directly in the original image, extend the bounding-box border into a square and then crop it; if the square already extends beyond the boundary of the original image, fill the missing part with the average pixel value of the bounding box;

②:

First crop out the bounding box image, then fill it with a fixed background color to form a square image (the background color is also the pixel color average of the bounding box).

  1. 6)

    Bounding-box Regression Method

The regressor used in R-CNN is linear, with its input being the output of AlexNet's pool5 layer. Bounding-box regression assumes a linear relationship between the candidate regions and the ground-truth bounding boxes (since the regions selected by the SVM closely approximate the ground truth). The input for training the regressor consists of N pairs of values, representing the coordinates of the candidate boxes and the corresponding ground-truth boxes; in the following explanation, the index i is omitted unless necessary. Only proposals whose IoU (Intersection over Union) with the ground truth is greater than 0.6 are used as positive samples. The bounding-box pairs and input features are as follows:

Each bounding box in \({\left\{\left({P}^{i},{G}^{i}\right)\right\}}_{i=1,\dots ,N}\) is described by its center location (x, y) and its width and height (w, h):

$${P}^{i}=\left({P}_{x}^{i},{P}_{y}^{i},{P}_{w}^{i},{P}_{h}^{i}\right)$$
(3.4)
$$G=\left({G}_{x},{G}_{y},{G}_{w},{G}_{h}\right)$$
(3.5)

The input feature of the regressor is the Conv5 (pool5) feature \({\varphi }_{5}\left(P\right)\) of the CNN.

The basic idea of going from the candidate box P to the prediction box is as follows:

After obtaining the candidate boxes P following classification, we define the representation of the boxes using the variables x and y for the center coordinates, and w and h for the width and height of the boxes. In the following explanation, all box localization is defined using this representation. Once we have the representation of the candidate boxes, we can estimate the translation and scale factors between the candidate boxes and the ground truth boxes. With these factors, we can compute our estimated boxes.

The training phase of the regression model is represented as:

$${w}_{*} =arg\,\underset{{\widehat{w}}_{*}}{\mathrm{min}}\sum_{i}^{N}{({t}_{*}^{i}-{\widehat{w}}_{*}^{T}{\varphi }_{5}({P}^{i}))}^{2}+\lambda {\Vert {\widehat{w}}_{*}\Vert }^{2}$$
(3.6)
$${t}_{x} =({G}_{x}-{P}_{x})/{P}_{w}$$
(3.7)
$${t}_{y} =({G}_{y}-{P}_{y})/{P}_{h}$$
(3.8)
$${t}_{w} =\log\left(\frac{{G}_{w}}{{P}_{w}}\right)$$
(3.9)
$${t}_{h} =\log\left(\frac{{G}_{h}}{{P}_{h}}\right)$$
(3.10)

Based on the aforementioned loss function model, we can solve for the optimal weight W. Multiplying the weight with the features from pool 5 yields the translation and scale parameters. During the testing phase of bounding box regression, the weight parameters have already been trained and the results are obtained.

$${d}_{*}(P) ={w}_{*}^{T}{\varphi }_{5}(P)$$
(3.11)

In the above equation, \({\varphi }_{5}(P)\) is the pool5 output feature, so the four transformations can be derived. The predicted bounding box containing the object is then obtained from the following relations (a small numeric sketch follows the equations):

$${\widehat{G}}_{x} ={P}_{w}{d}_{x}\left(P\right)+{P}_{x}$$
(3.12)
$${\widehat{G}}_{y} ={P}_{h}{d}_{y}\left(P\right)+{P}_{y}$$
(3.13)
$${\widehat{G}}_{w} ={P}_{w}\mathrm{exp}({d}_{w}\left(P\right))$$
(3.14)
$${\widehat{G}}_{h} ={P}_{h}\mathrm{exp}({d}_{h}\left(P\right))$$
(3.15)
$${d}_{*}\left(P\right)={w}_{*}^{T}{\mathrm{\varphi }}_{5}(P)$$
(3.16)
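
Equations (3.12) to (3.15) can be transcribed directly; the numpy sketch below applies predicted transforms d = (d_x, d_y, d_w, d_h) to a proposal P, with example numbers chosen arbitrarily.

```python
import numpy as np

def apply_bbox_regression(P, d):
    """P = (Px, Py, Pw, Ph) proposal; d = (dx, dy, dw, dh) predicted transforms."""
    Px, Py, Pw, Ph = P
    dx, dy, dw, dh = d
    Gx = Pw * dx + Px            # Eq. (3.12): translate the center x
    Gy = Ph * dy + Py            # Eq. (3.13): translate the center y
    Gw = Pw * np.exp(dw)         # Eq. (3.14): scale the width
    Gh = Ph * np.exp(dh)         # Eq. (3.15): scale the height
    return Gx, Gy, Gw, Gh

print(apply_bbox_regression((100, 100, 50, 80), (0.1, -0.05, 0.2, 0.0)))
```
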
  1. (5)

    Image Recognition

Image recognition technology is an important field in the information age, with the aim of allowing computers to replace humans in processing a large amount of physical information. With the advancement of computer technology, our understanding of image recognition has deepened. The process of image recognition involves information acquisition, preprocessing, feature extraction and selection, classifier design, and classification decision-making. This text briefly analyzes the introduction of image recognition technology, its technical principles, and pattern recognition. It also discusses neural network-based image recognition techniques, nonlinear dimensionality reduction techniques, and the applications of image recognition technology. From this, we can conclude that image processing technology has wide-ranging applications, and human life has become inseparable from image recognition technology. Therefore, researching image recognition technology holds significant importance.

  1. 1)

    Introduction of image recognition technology

Image recognition is an important field of artificial intelligence. The development of image recognition has gone through three stages: text recognition, digital image processing and recognition, and object recognition. As the name suggests, image recognition involves various processing and analysis of images to ultimately identify the objects of interest. Today, image recognition refers to the use of computer technology to perform recognition, rather than relying solely on human visual perception. Although human recognition capabilities are powerful, they are insufficient to meet the demands of a rapidly developing society. Thus, computer-based image recognition technology has emerged. It is similar to how humans study biological cells. It is unrealistic to rely solely on naked-eye observation for accurate cell examination, which is why instruments like microscopes have been invented for precise observation. When inherent techniques in a field cannot address certain demands, new technologies are developed accordingly. Image recognition technology is no exception. Its purpose is to enable computers to process large amounts of physical information, solving problems that are difficult or nearly impossible for humans to recognize.

  1. 2)

    Principle of image recognition technology

In fact, the underlying principles of image recognition technology are not very difficult. It's just that the information it deals with can be complex. Any processing technique in computer science is not created out of thin air but rather inspired by practical experiences of scholars, who then simulate and implement them through programs. The principles of computer-based image recognition technology are fundamentally similar to human image recognition. The main difference is that machines lack the influence of human sensory and visual perception. Human image recognition is not solely based on memorizing the entire image in our minds. Instead, we recognize images by identifying the inherent features of the images themselves and categorizing them based on these features. However, we are often unaware of this process. When we see an image, our brain quickly senses whether we have seen the image before or if it resembles similar images. In fact, there is a rapid recognition process that occurs between “seeing” and “sensing,” which is similar to searching. During this process, our brain recognizes the image by comparing it with the stored memories and checking for similar or identical features. Machine image recognition works in a similar way, where it categorizes and extracts important features while excluding irrelevant information to identify the image. The features extracted by machines can sometimes be very distinctive, while other times they may be quite ordinary, which greatly affects the speed of machine recognition. In computer vision, the content of an image is typically described using image features.

  1. 3)

    Pattern Recognition

Pattern recognition is an important component of artificial intelligence and information science. It refers to the process of analyzing and processing different forms of information that represent objects or phenomena in order to obtain descriptions, recognition, classification, and other outcomes related to those objects or phenomena.

Computer's image recognition technology simulates the human process of pattern recognition. Pattern recognition is originally a fundamental ability of humans. However, with the development of computers and the rise of artificial intelligence, human pattern recognition alone cannot meet the demands of daily life. Therefore, humans have sought to use computers to replace or augment human cognitive abilities. This gave rise to computer-based pattern recognition. In simple terms, pattern recognition involves classifying data and is a science closely integrated with mathematics, with a significant focus on probability and statistics. Pattern recognition can be broadly categorized into three types: statistical pattern recognition, syntactic pattern recognition, and fuzzy pattern recognition.

  1. 4)

    The process of image recognition technology

Since computer image recognition technology follows the same principles as human image recognition, their processes are quite similar. The process of image recognition technology can be divided into the following steps: information acquisition, preprocessing, feature extraction and selection, classifier design, and classification decision.

Information acquisition refers to the process of capturing the basic information of the research object, such as light or sound, and converting it into electrical information through sensors. It involves obtaining the fundamental data about the object of study and transforming it into information that can be understood by the machine through certain methods.

Preprocessing mainly refers to the operations in image processing, such as denoising, smoothing, and transformation, to enhance the significant features of the image. It aims to improve the quality and extract relevant information from the image.

Feature extraction and selection refer to the process of extracting and selecting relevant features in pattern recognition. In simple terms, when dealing with various types of images, we need to identify and differentiate them based on their inherent features. The process of obtaining these features is called feature extraction. However, not all the features extracted may be useful for the specific recognition task at hand. In such cases, feature selection is performed to choose the most relevant and informative features. Feature extraction and selection are critical techniques in the image recognition process, and understanding this step is essential in image recognition.

Classifier design refers to the process of training a recognition rule that enables the image recognition technology to achieve high recognition accuracy. It involves developing a classifier model that can effectively categorize and classify the extracted features of the objects being studied. On the other hand, classification decision refers to the process of assigning the recognized objects to specific classes within the feature space. This step helps determine the exact category or class to which the studied objects belong.

  1. 5)

    Analysis of image recognition technology

With the rapid development of computer technology and advancements in science and technology, image recognition technology has been applied in various fields. On February 15, 2015, Sina Technology reported: “Microsoft recently published a research paper on image recognition, and in a benchmark test of image recognition, the computer system has surpassed humans. The human error rate in classifying images in the ImageNet database is 5.1%, while Microsoft’s research team achieved an error rate of 4.94% with their deep learning system.” This news indicates that image recognition technology has shown a trend of surpassing human capabilities in image recognition. It also highlights the significant research significance and potential of image recognition technology in the future. Furthermore, computers have advantages that surpass human capabilities in many aspects, which is why image recognition technology can bring more applications to human society.

  1. 6)

    Neural networks for image recognition

Neural network image recognition technology is a relatively new approach that combines neural network algorithms with traditional image recognition methods. Here, neural network refers to artificial neural networks, which are generated by humans to mimic the neural networks found in animals. In neural network image recognition technology, the fusion of genetic algorithms and backpropagation (BP) networks is a classic neural network image recognition model that finds applications in various fields. In an image recognition system that utilizes neural networks, the general process involves first extracting features from the images and then mapping these features to the neural network for image recognition and classification. Let’s take the example of automatic license plate recognition technology for vehicles. When a vehicle passes by, the detection devices installed on the vehicle will sense its presence. The image acquisition device is then activated to capture images of the front and rear sides of the vehicle. Once the images are obtained, they need to be uploaded to a computer for storage and further processing. The license plate localization module will extract the license plate information and recognize the characters on the license plate, displaying the final result. In the process of character recognition on the license plate, both template matching algorithms and artificial neural network algorithms are used.

  1. 7)

    Nonlinear dimensionality reduction for image recognition

Computer image recognition technology is an exceptionally high-dimensional recognition technique. Regardless of the resolution of the image itself, the data generated from it often exhibits a high degree of dimensionality, which poses significant challenges for computer recognition. To enable efficient recognition capabilities in computers, the most direct and effective approach is dimensionality reduction. Dimensionality reduction can be classified into linear and nonlinear methods. For example, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are common linear dimensionality reduction methods, known for their simplicity and ease of understanding. However, linear dimensionality reduction methods operate on the entire dataset and seek the optimal low-dimensional projection for the entire dataset. After verification, it has been found that linear dimensionality reduction strategies have high computational complexity and consume relatively more time and space. As a result, image recognition techniques based on nonlinear dimensionality reduction have emerged as highly effective methods for feature extraction. These techniques can discover the nonlinear structure of images and perform dimensionality reduction without disrupting their intrinsic structure. This allows computer image recognition to be conducted in as low-dimensional space as possible, thereby improving the recognition speed. For example, face recognition systems typically require a high number of dimensions, which presents a significant “curse” in terms of complexity for computers. Due to the uneven distribution of face images in high-dimensional space, humans can utilize nonlinear dimensionality reduction techniques to obtain compactly distributed face images, thus enhancing the efficiency of face recognition technology.

  1. 8)

    Application and prospect of image recognition technology

Computer image recognition technology has applications in various fields such as public safety, biology, industry, agriculture, transportation, and healthcare. For instance, in transportation, there are license plate recognition systems. In public safety, there are technologies like facial recognition and fingerprint recognition. In agriculture, there are seed recognition and food quality inspection technologies. In the medical field, there are techniques like electrocardiogram recognition. With the continuous development of computer technology, image recognition technology is being optimized, and its algorithms are constantly improving. Since images are a primary source of information acquisition and exchange for humans, it is evident that image recognition technology related to images will be a major focus of future research. Computer image recognition technology is likely to make further advancements in various fields, with limitless potential for applications. Human life will become increasingly dependent on image recognition technology.

Although image recognition technology is a relatively new field, its applications are already quite extensive. Furthermore, image recognition technology continues to evolve, and with the progress of science and technology, our understanding of it will deepen. In the future, image recognition technology will become even more powerful and intelligent, making significant contributions to various domains of human society. In this era of information technology in the twenty-first century, it is unimaginable how our lives would be without image recognition technology. It is an indispensable technology for our present and future lives.

  1. (6)

    Edge detection

Edge detection is a fundamental problem in image processing and computer vision, and its goal is to identify points in a digital image where there is a significant change in brightness. Prominent changes in image properties often reflect important events and variations in the attributes, including (i) discontinuities in depth, (ii) discontinuities in surface orientation, (iii) variations in material properties, and (iv) changes in scene illumination. Edge detection is a research area in image processing and computer vision, particularly in the context of feature detection.

Image edge detection significantly reduces the amount of data and eliminates potentially irrelevant information, while preserving the important structural attributes of the image. There are various methods for edge detection, most of which can be categorized into two types: search-based methods and zero-crossing-based methods. Search-based methods detect edges by looking for maxima and minima in the first-order derivatives of the image, typically locating edges in the direction of maximum gradient. Zero-crossing-based methods detect edges by finding the zero crossings of the second-order derivatives of the image, often using the zero crossings of the Laplacian or of a nonlinear differential expression.

  1. 1)

    Edge Properties

Edges can be viewpoint-dependent, meaning that they may vary with different perspectives. This is typically reflected in the occlusion of one scene or object by another, or in the presence of attributes such as surface texture and shape of the observed object. In 2D or higher-dimensional spaces, the effects of perspective projection need to be considered. A typical boundary might be the edge between a red region and a yellow region, for example. Conversely, an edge might consist of a few differently colored points against a background that remains constant. Edges have significant importance in many applications of image processing. However, in recent years, computer vision processing methods that do not heavily rely on explicit edge detection as a preprocessing step have achieved substantial research progress and success.

  1. 2)

    Simple Edge Model

The edges of natural images are not always ideal stepped edges. Instead, they are usually affected by one or more of the following factors listed below:

  1. a)

    Focus blur from limited scene depth;

  2. b)

    Penumbra blur from shadows produced by non-zero radius light sources;

  3. c)

    Shadows at the edges of smooth objects;

  4. d)

    Local specular or diffuse reflections near the edges of objects.

Although the model below is not perfect, the error function erf is often used in practice to model the effect of edge blurring.

Thus, a one-dimensional image with an edge at position 0 can be represented by the following model, where \({I}_{l}\) and \({I}_{r}\) are the intensities to the left and right of the edge and \(\sigma \) is the blur scale (evaluated in the short sketch after the equation):

$$f\left(x\right)=\frac{{I}_{r}-{I}_{l}}{2}\left(\mathrm{erf}\left(\frac{x}{\sqrt{2}\sigma }\right)+1\right)+{I}_{l}$$
(3.17)
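
Equation (3.17) can be evaluated directly; in the minimal sketch below the intensities and blur scale are arbitrary example values.

```python
import numpy as np
from scipy.special import erf

def blurred_edge(x, I_l=10.0, I_r=200.0, sigma=2.0):
    """One-dimensional blurred step edge of Eq. (3.17), centered at x = 0."""
    return (I_r - I_l) / 2.0 * (erf(x / (np.sqrt(2.0) * sigma)) + 1.0) + I_l

x = np.linspace(-10, 10, 5)
print(np.round(blurred_edge(x), 1))  # rises smoothly from about 10 to about 200
```
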
  1. 3)

    Detecting edges is not a simple problem

If we consider an edge to be a location where there is a significant change in brightness across a certain number of points, then edge detection can be seen as computing the derivative of this brightness variation. To simplify the explanation, let’s analyze edge detection in one-dimensional space. In this example, our data consists of a series of brightness values along a line. For instance, in the 1D data below, we can intuitively say that there is an edge between the fourth and fifth points, as shown in Fig. 3.28.

Fig. 3.28
An array represents a 1-D data. The data has numbers, 5, 7, 6, 4, 152, 148, and 149. An edge is between the fourth and fifth points as there is a significant change between 2 adjacent points.

One-dimensional spatial analysis edge detection example

Unless the objects in the scene are very simple and the lighting conditions are well-controlled, it is not an easy task to determine the threshold for what constitutes a significant change in brightness between two adjacent points to be considered an edge. In fact, this is one of the reasons why edge detection is not a straightforward problem.

  1. 4)

    Methods of edge detection

There are many methods for edge detection, which can be broadly categorized into two types: search-based and zero-crossing-based.

Search-based edge detection methods first compute edge strength, often represented by the first-order derivative, such as the gradient magnitude. Then, they estimate the local direction of the edge, typically using the direction of the gradient, and use this direction to find the maximum value of the local gradient magnitude.

Zero-crossing-based methods locate edges by finding the zero-crossing points of the second-order derivative obtained from the image. Typically, the zero-crossings of the Laplacian operator or zero-crossings of nonlinear diffusion equations are used. We will describe these in later sections.

Filtering, often using Gaussian filtering, is commonly performed as a preprocessing step in edge detection.

Most published edge detection methods compute a measure of edge strength, usually based on image gradients, and differ mainly in the type of filter used to estimate the gradients in the x- and y-directions.

  1. a)

    Calculating the first order derivative

Many edge detection operations are based on the first derivative of brightness, which yields the gradient of the original data’s brightness. With this information, we can search for peaks in the brightness gradient of the image.

If \(I\left(x\right)\) represents the brightness of point x, \({I}^{\prime}\left(x\right)\) represents the first derivative (brightness gradient) of point \(x\), then we find:

$${I}^{\prime}\left(x\right)=-\frac{1}{2}\cdot I\left(x-1\right)+0\cdot I\left(x\right)+\frac{1}{2}\cdot I(x+1)$$
(3.18)

For higher-performance image processing, the first derivative can be computed by convolving the raw one-dimensional data with a mask.

  1. b)

    Calculating the second order derivative

Other edge detection operations are based on the second derivative of brightness. This essentially measures the rate of change of the brightness gradient. In the ideal continuous case, detecting zero crossings in the second derivative will yield local maxima in the gradient.

On the other hand, peak detection in the second derivative is edge line detection, as long as the image operation uses an appropriate scale representation. As mentioned above, an edge line is a double edge, so we can observe a brightness gradient on one side and an opposite gradient on the other side. If there is an edge line in the image, we will see a significant change in the brightness gradient. To find these edge lines, we can search for zero crossings in the second derivative of the image brightness gradient.

If \(I\left(x\right)\) represents the brightness of point x, and \({I}^{{\prime}{\prime}}\left(x\right)\) represents the second derivative of the brightness at point \(x\), then:

$${I}^{{\prime}{\prime}}\left(x\right)=1\cdot I\left(x-1\right)-2\cdot I\left(x\right)+1\cdot I(x+1)$$
(3.19)

Similarly, many algorithms use convolution masks to process image data quickly. Figure 3.29 illustrates an example of a convolution mask, and a small numerical sketch follows the figure.

Fig. 3.29
An array represents a convolutional mask. The data has numbers, positive 1, negative 2, and positive 1.

Example of a convolutional mask representation
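
Both derivative masks can be applied with a one-dimensional convolution; the numpy sketch below reproduces Eqs. (3.18) and (3.19) on the example data of Fig. 3.28.

```python
import numpy as np

brightness = np.array([5, 7, 6, 4, 152, 148, 149], dtype=float)

# np.convolve flips the kernel, so [0.5, 0, -0.5] realizes (I(x+1) - I(x-1)) / 2
first = np.convolve(brightness, [0.5, 0.0, -0.5], mode="valid")   # Eq. (3.18)
second = np.convolve(brightness, [1.0, -2.0, 1.0], mode="valid")  # Eq. (3.19)

print(first)   # large positive values where the edge lies (4th/5th points)
print(second)  # changes sign (zero crossing) around the same location
```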

  1. c)

    Threshold determination

Once we have computed the derivatives, the next step is to apply a threshold to determine the edge locations. A lower threshold will detect more edge lines, but it is also more susceptible to noise in the image and can pick up irrelevant features. On the other hand, a higher threshold will result in the loss of fine or short line segments.

One commonly used approach is thresholding with hysteresis. This method employs different thresholds to search for edges. It starts by using an upper threshold to locate the beginning of an edge line. Once a starting point is found, the algorithm tracks the edge path point by point on the image, recording the edge positions as long as they are above the lower threshold, and stops recording when the value falls below the lower threshold. This method assumes that edges are continuous boundaries and allows us to track the fuzzy parts of edges observed earlier without labeling noise points in the image as edges.
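
In practice, this double-threshold idea is what OpenCV's Canny detector implements; a hedged usage sketch follows, in which the file name and the two threshold values are placeholders.

```python
import cv2

gray = cv2.imread("tower.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder file name
blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)          # Gaussian pre-smoothing step

# 50 is the lower and 150 the upper threshold of the hysteresis step
edges = cv2.Canny(blurred, 50, 150)
cv2.imwrite("tower_edges.jpg", edges)
```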

  1. 5)

    Edge Detection Operators

There are several commonly used edge detection operators:

First-order operators: Roberts Cross operator, Prewitt operator, Sobel operator, Canny operator, Compass operator.

Second-order operators: Marr-Hildreth operator, which detects zero-crossings in the second derivative along the gradient direction.

Among these, the Canny operator (or its variants) is the most commonly used edge detection method. In Canny’s groundbreaking work, he studied the problem of designing an optimal pre-smoothing filter for edge detection. He later demonstrated that this filter could be effectively optimized using a first-order Gaussian derivative kernel. Additionally, Canny introduced the concept of non-maximum suppression, which states that edges are defined as points with the maximum gradient value in the gradient direction.

In a discrete matrix, non-maximum suppression can be implemented by using a method that involves predicting the direction of the first-order derivative, approximating it to a multiple of 45°, and then comparing the gradient magnitude in the predicted gradient direction.

An improved implementation that obtains edges with sub-pixel accuracy detects the zero crossings of the second-order directional derivative along the gradient direction:

$${L}_{x}^{2}{L}_{xx}+2{L}_{x}{L}_{y}{L}_{xy}+{L}_{y}^{2}{L}_{yy}=0$$
(3.20)

Its third-order directional gradient in the direction of the gradient satisfies the sign condition:

$${L}_{x}^{3}{L}_{xxx}+3{L}_{x}^{2}{L}_{y}{L}_{xxy}+3{L}_{x}{L}_{y}^{2}{L}_{xyy}+{L}_{y}^{3}{L}_{yyy}<0$$
(3.21)

where \({L}_{x}\), \({L}_{y}\), \({L}_{xy}\), etc. denote the partial derivatives computed from the scale-space representation obtained by smoothing the original image with a Gaussian kernel. Following this method, continuous curve edges with sub-pixel accuracy can be obtained automatically.

Hysteresis thresholding can also be applied to these differential edge segments.

3.3.3 Intelligent Analysis Algorithm Architecture of UAV Power Grid Disaster Loss Detection

We propose an intelligent analysis algorithm for UAV power grid disaster detection. The main detection categories cover seven types of fault conditions, including insulator damage, tower body damage, tower head damage, foreign objects, broken wires, wind deviation, and tower tilt. The survey images are processed with foreground-background separation and image enhancement and, combined with trained libraries of normal and abnormal images, contrastive learning and anomaly labeling are used to investigate and identify line disasters. The algorithm pre-training phase flow is shown in Fig. 3.30.

Fig. 3.30
A flowchart. It presents a normal and an anomaly gallery followed by 4 steps to give a front and back separation algorithm, recommendation algorithm, contrast learning anomaly detection algorithms, and anomaly classification algorithm.

Algorithm pre-training stage flowchart

Figure 3.31 shows the process of running the stage after the algorithm is deployed.

Fig. 3.31
A flowchart. It presents a drone shot and a photograph from a normal gallery that undergo a recommendation algorithm followed by separation of front and rear scenes, anomaly detection, a condition, exception classification manual review of annotations, and exception type to give anomaly gallery.

UAV power grid disaster image collection process

Two ways are proposed to collect and produce datasets: one is web crawling plus data cleaning, and the other is on-site drone photography. When it is inconvenient to shoot on site, the first method is preferred for collecting and sorting samples, to ensure that enough sample photos are gathered and later verified against actual disaster photos from the field.

Researching the related systems places certain requirements on the amount of data. The resource library is built according to the principle of "unified planning, classified construction, associated libraries, full sharing, and redundant growth", with an overall design capacity of about 5 million images, including:

  1. 1)

    Build a tower foundation image library

    For the data of several tower types commonly used on high-voltage power lines, an image upload and query function is opened to users, and an image comparison service interface is opened to the background service system. Based on the tower image resource library, users can obtain targeted image comparison and image detection services according to their actual business needs.

  2. 2)

    Establish a negative-sample image library of poles and towers: for each pole and tower type, collect negative-sample images of the same target at each shooting angle, and establish the corresponding index in the background database to facilitate image retrieval and comparison and to prepare for subsequent sample training.

  3. 3)

    The data is updated to the image library

    The system supports the loading and updating of dynamic incremental templates, so that dynamically updated incoming photo data can participate in comparison in time. The system automatically monitors the various business photo databases; if there is an update, it automatically ingests the photos according to the update time set by the user and compares them with the existing image feature database to ensure uniqueness before they enter the library.

  1. (1)

    Separation of front and rear scenes

Semantic segmentation technology from machine vision is used to separate the foreground and background (front and rear scenes) of the image. This is shown in Fig. 3.32.

Fig. 3.32
2 photographs of 3 power towers with transmission wires. The first is the original photograph with the clear sky in the background. The second is the photograph after the separation of the front and back scenes.

Image front and rear separation

  1. (2)

    Recommended image retrieval

The wide&deep recommendation model is adopted: when images are collected, the UAV's pose (relative position in space), the camera's intrinsic parameters (imaging parameters), the weather, the time, and other sensor information are used as wide features, while the image content is used as deep features. Through extensive training and validation, the model is fitted to the data along both the wide and the deep dimensions to obtain an image retrieval model. During anomaly detection, the image taken at that moment is used to retrieve the most similar image from the normal image library. The wide&deep model combines the single-layer wide part with the deep part, which consists of an embedding layer and multiple hidden layers, and feeds both into the final output layer. The single-layer wide part is good at handling large numbers of sparse fault-class features, while the deep part uses the strong expressive power of the neural network to perform deep feature crossing and mine the data patterns hidden behind the features. Finally, the output layer combines the wide part and the deep part with a logistic regression model to form a unified model. The wide&deep (wide and deep) recommendation model is shown in Fig. 3.33, and a minimal sketch follows the figure.

Fig. 3.33
A neural network. It presents an input layer with one node, 2 hidden layers with 3 nodes each, a fully connected layer with 8 nodes, and each sparse feature.

Wide and deep recommendation model
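
A minimal PyTorch sketch of the wide&deep structure described above (our illustration; all dimensions are assumptions): the wide part is a single linear layer over sparse sensor-derived features, the deep part is an embedding layer followed by hidden layers, and the two are combined by a logistic output.

```python
import torch
import torch.nn as nn

class WideAndDeep(nn.Module):
    """Wide (sparse sensor features) + deep (embedded features) -> logistic output."""
    def __init__(self, wide_dim=32, num_ids=1000, emb_dim=16, deep_hidden=64):
        super().__init__()
        self.wide = nn.Linear(wide_dim, 1)               # single-layer wide part
        self.embedding = nn.Embedding(num_ids, emb_dim)  # deep part: embedding layer
        self.deep = nn.Sequential(
            nn.Linear(emb_dim, deep_hidden), nn.ReLU(),
            nn.Linear(deep_hidden, deep_hidden), nn.ReLU(),
            nn.Linear(deep_hidden, 1),
        )

    def forward(self, wide_x, deep_ids):
        deep_x = self.embedding(deep_ids).mean(dim=1)    # pool embedded features
        logit = self.wide(wide_x) + self.deep(deep_x)    # combine both parts
        return torch.sigmoid(logit)                      # logistic regression output

model = WideAndDeep()
score = model(torch.randn(4, 32), torch.randint(0, 1000, (4, 10)))
print(score.shape)  # torch.Size([4, 1])
```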

  1. (3)

    Manual labeling

An anomaly detection model is trained through contrastive learning: the positive examples, negative examples, and loss function are designed appropriately, and the model is then trained. The final model can determine whether there is a significant difference between the current picture and the image retrieved in the previous step. This is shown in Fig. 3.34, and a hedged sketch of such a loss follows the figure.

Fig. 3.34
A horizontal chart classifies deep learning into supervised learning and unsupervised learning. Unsupervised learning splits into generative learning and contrastive learning. Generative learning splits into GAN and VAE. Contrastive learning splits into homogeneous and non-homogeneous samples.

Anomaly detection model
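
A hedged sketch of a simple pairwise contrastive loss of the kind described above; the margin and embedding size are assumptions, and the project's actual positive/negative pair design may differ.

```python
import torch
import torch.nn.functional as F

def pairwise_contrastive_loss(emb_a, emb_b, is_same, margin=1.0):
    """Pairwise contrastive loss sketch: pull embeddings of the same scene
    together (positive pairs) and push normal/abnormal pairs apart by at
    least `margin` (negative pairs)."""
    dist = F.pairwise_distance(emb_a, emb_b)
    pos = is_same * dist.pow(2)                            # positive pairs
    neg = (1 - is_same) * F.relu(margin - dist).pow(2)     # negative pairs
    return (pos + neg).mean()

loss = pairwise_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128),
                                 torch.randint(0, 2, (8,)).float())
print(loss.item())
```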

Then, according to the classification standard, the collected sample data pictures are classified and labeled.

  1. (4)

    Classification of abnormal scenarios

The manually labeled pictures of the various abnormal situations are used as training data. Since the amount of such data is expected to be small, transfer learning is adopted: a recent machine-vision classification model is fine-tuned to obtain the abnormal-situation recognition model. Figure 3.35 shows the abnormal situation identification model, and a hedged fine-tuning sketch follows the figure.

Fig. 3.35
A process diagram with photographs. It has a photograph of a collapsed tower which is convolutional and nonlinear. It is followed by max pooling, actor, and a fully connected layer to give classification. The classification presents 8 defects of the towers.

Abnormal situation identification model
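
A hedged sketch of the transfer-learning step: a placeholder backbone stands in for any pretrained vision model, its weights are frozen, and only a new classification head over the seven disaster categories is trained; all details are illustrative.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(                     # placeholder for a pretrained model
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
for p in backbone.parameters():
    p.requires_grad = False                   # freeze the pretrained weights

num_abnormal_classes = 7                      # seven disaster categories
head = nn.Linear(32, num_abnormal_classes)    # new head trained on labeled anomalies

model = nn.Sequential(backbone, head)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)  # fine-tune the head only
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 7])
```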

3.3.4 Typical Disaster Loss Scenario Identification

We have developed a drone-based main grid disaster detection algorithm and a typical disaster investigation plan for unmanned aerial vehicles (UAVs). The algorithm primarily detects various types of disaster scenarios, including wire breakage, conductor deviation, tower collapse, insulator damage, tower top loss, and foreign object suspension, totaling seven categories. Intelligent analysis and explanations are provided for each type of disaster. The recognition rates and examples corresponding to the functionalities of each algorithm are shown in Table 3.7.

Table 3.7 Recognition rate for each type of algorithm function
  1. (1)

    Wire break identification solution

  1. 1)

    Testing ideas

The detection of wire breakage, which refers to the breakage of wires between towers due to aging or external forces, can be performed in several steps. As the wires on towers are typically connected to insulators, the first step is to train a deep learning object detection model specifically for insulator detection. This model is used to locate the starting point of the wire by identifying the position of the insulators. Since wires are thin and may lose their distinctive features when downscaled in a large-scale view, we perform image cropping after detecting the insulator positions, preserving the original resolution and extracting regions where wires may appear. Next, a wire detection model is developed to detect the wires within the cropped images. Finally, by analyzing the orientation of the detected wires, we can determine if there is a wire breakage. The specific workflow is illustrated in Fig. 3.36.

Fig. 3.36
A flow diagram. From the input picture, cut out a pair of pictures that may contain wires and reduced resolution. It is followed by wire detection and insulation detection, identification of wire direction, and checking whether the cable is disconnected.

Wire break detection process

  1. 2)

    Insulator labeling instructions

The insulator labeling is divided into two steps: first, the general location of the insulator is marked in the original image, and then the detailed location is marked after pixel-level processing of the original image. This is shown in Fig. 3.37.

Fig. 3.37
2 photographs of insulators on a tower. A. A rectangular box outlines the insulator and marks the general position of the insulator. B. A zoomed view of the insulator detailed position marking.

Insulator labeling schematic

  1. 3)

    Insulator Inspection Model Introduction

The insulator detection is based on YOLOv4, which redefines object detection as a regression problem. It applies a single convolutional neural network (CNN) to the entire image, dividing it into a grid and predicting class probabilities and bounding boxes for each grid cell. For each grid cell, the network predicts a bounding box and probabilities corresponding to each class (e.g., car, pedestrian, traffic signal). Each bounding box is described using four descriptors: the mapped values of the center, height, and width of the bounding box, and the confidence of an object being present in the bounding box. If an object’s center falls within a grid cell, that grid cell is responsible for detecting the object. There can be multiple bounding boxes within each grid cell. During training, we want each object to be assigned to only one bounding box, so we assign a bounding box that has the highest overlap with the ground truth box to predict the object. Finally, a technique called “Non-Max Suppression” is applied to filter out bounding boxes with confidence scores below a threshold for each class. This gives us the final image predictions. YOLO is known for its speed. Since the detection problem is treated as a regression problem, it does not require a complex pipeline. It is 1000 times faster than “R-CNN” and 100 times faster than “Fast R-CNN”. It can handle real-time video streams with a delay of less than 25 ms. Its accuracy is more than twice that of previous real-time systems. Equally important, YOLO follows the practice of “end-to-end deep learning”.

  1. 4)

    Wire labeling instructions

Wire labeling starts with pixel dot labeling of the wires of the same line to identify abnormal lines. This is shown in Fig. 3.38.

Fig. 3.38
2 photographs of wire labeling on a tower. A. A rectangular box outlines wire pixel point labeling. B. A rectangular box outlines abnormal line labeling.

Wire labeling

  1. 5)

    Introduction to wire detection model

The wire detection algorithm is an end-to-end approach, borrowed from lane-line detection, that consists of two network models: LaneNet and H-Net. LaneNet is a multi-task model that combines semantic segmentation with pixel-level vector embedding and finally uses clustering to achieve instance segmentation of the lines. H-Net is a small network responsible for predicting the transformation matrix H; all pixels belonging to the same wire are re-modeled under this transformation, with the x-coordinate expressed as a function of the y-coordinate.

  1. (2)

    Wire strand drawing identification scheme

In recent years, unmanned aerial vehicle (UAV) inspection technology has been developed as a solution to address the issues of low efficiency, poor reliability, and potential risks associated with manual inspections. Implementing UAVs for transmission line inspections can effectively reduce inspection costs and enable large-scale deployment across various levels of State Grid Corporation of China (SGCC). Additionally, it can significantly reduce the reliance on manpower and resources.

  1. 1)

    Basic requirements for UAV flight and shooting

  1. a)

    The drone cannot shake;

  2. b)

    The UAV must fly below the wire and stay as level with it as possible, neither too low nor too high; if part of the view is blocked, it should try to bypass the obstruction;

  3. c)

    The wire must be photographed against the sky.

  1. 2)

    Detection idea

  1. a)

    Based on the overall idea of LSD (Line Segment Detector) for line detection, the aerial wire sag detection based on unmanned aerial vehicles (UAVs) begins with the UAV capturing video images at specific locations. After the inspection is completed, the collected images are processed by recognition algorithms to obtain the inspection results.

  2. b)

    Linear detection based on principal curvature and principal direction.

①:

Overview of principal curvature and principal direction principles

In 3D Euclidean space, let X be any point on a differentiable surface, and take a unit normal vector of the surface at X. The infinitely many planes containing this normal vector intersect the surface in a family of plane curves. Except for some special shapes, the curvature of these curves differs from one cutting plane to another, so maximum and minimum curvature values are attained on certain cutting planes; these extreme values are called the principal curvatures, and the corresponding radii of curvature are the principal radii of curvature. The detection and annotation of power lines based on principal curvatures and principal directions are illustrated in Fig. 3.39.

Fig. 3.39
A flowchart with 10 steps. The steps are input image, Gaussian downsampling, the Sobel operator extracts the edge, gradient pseudo-sorting, select the seed point area, rectangular fitting, rectangular correction, validation of rectangles, extract a valid line segment, and output line information.

Marking lines based on the main curvature and direction

Given the curve (C): \(r=r\left[u\left(t\right),v\left(t\right)\right]\) on the surface S: \(r=r\left(u,v\right)\), then:

$$dr={r}_{u}du+{r}_{v}dv$$
(3.22)

Order:

$$E={r}_{u}\cdot {r}_{u}$$
(3.23)
$$F={r}_{u}\cdot {r}_{v}$$
(3.24)
$$G={r}_{v}\cdot {r}_{v}$$
(3.25)

Then the linear characteristics are shown in Table 3.8.

Table 3.8 Shows the linear features

In addition, the ideal linear characteristics should satisfy:

$$\left|{k}_{1}\right|\gg \left|{k}_{2}\right|$$
(3.26)
$${k}_{2}\approx 0$$
(3.27)

And:

$$L={r}_{uu}\cdot n$$
(3.28)
$$M={r}_{uv}\cdot n$$
(3.29)
$$N={r}_{vv}\cdot n$$
(3.30)
$$\left|\begin{array}{ccc}d{v}^{2}& -dudv& d{u}^{2}\\ E& F& G\\ L& M& N\end{array}\right|=0$$
(3.31)

Or:

$$\left(EM-FL\right){du}^{2}+\left(EN-GL\right)dudv+(FN-GM)d{v}^{2}=0$$
(3.32)

This holds except at umbilical points, where:

$$\frac{E}{L}=\frac{F}{M}=\frac{G}{N}$$
(3.33)
  1. c)

    Line segment detection extraction based on principal curvature and principal direction

①:

Potential linear target screening based on principal curvature:

The principal curvatures represent the bending behavior of a surface in two perpendicular directions. On a grayscale surface, the set of pixels that form a straight line typically exhibit minimal variations along the line direction, while significant variations are observed along the direction perpendicular to the line. As a result, the principal curvatures of image pixels on a straight line usually have significant differences, whereas the principal curvatures of background pixels are similar.

②:

Line segment detection based on main direction

The principal curvatures and principal directions are obtained by calculating the eigenvalues and eigenvectors of the Hessian matrix. Let \({D}_{m}\) denote the principal direction corresponding to the smaller curvature, which is taken as the pixel direction vector; then the pixel direction angle \(\theta \) is:

$$\theta =arctan\,{D}_{m}$$
(3.34)
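A minimal sketch of how the per-pixel principal curvatures and direction angle could be approximated from the Hessian of the grayscale intensity surface, as described above; the function name and the Sobel-based second derivatives are illustrative choices, not the authors' exact implementation:

```python
import numpy as np
import cv2

def principal_curvature_direction(gray, sigma=2.0):
    """Per-pixel principal curvatures and principal-direction angle of the
    grayscale intensity surface, via eigen-decomposition of the 2x2 Hessian."""
    img = cv2.GaussianBlur(gray.astype(np.float64), (0, 0), sigma)
    # Second-order derivatives approximate the Hessian entries.
    Ixx = cv2.Sobel(img, cv2.CV_64F, 2, 0, ksize=3)
    Iyy = cv2.Sobel(img, cv2.CV_64F, 0, 2, ksize=3)
    Ixy = cv2.Sobel(img, cv2.CV_64F, 1, 1, ksize=3)
    # Closed-form eigenvalues of the symmetric Hessian [[Ixx, Ixy], [Ixy, Iyy]].
    tmp = np.sqrt((Ixx - Iyy) ** 2 + 4.0 * Ixy ** 2)
    k1 = 0.5 * (Ixx + Iyy + tmp)   # larger eigenvalue (principal curvature), k1 >= k2
    k2 = 0.5 * (Ixx + Iyy - tmp)   # smaller eigenvalue (principal curvature)
    # Eigenvector of the larger eigenvalue lies at 0.5*atan2(2*Ixy, Ixx-Iyy);
    # the direction for the smaller curvature is perpendicular to it.
    theta = 0.5 * np.arctan2(2.0 * Ixy, Ixx - Iyy) + np.pi / 2.0
    return k1, k2, theta
```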

By selecting a seed point region, a group of pixels with similar principal directions can be merged to form a potential line region, achieving an initial segmentation of the line. In this case, the density of homogeneous points is computed as the criterion for region growing. Pixels whose principal direction deviation from the fitted rectangle direction is less than \(\tau \) are considered homogeneous points. Let n be the number of homogeneous points in the rectangular approximation region r, then the density of homogeneous points in that region is defined as follows:

$$D(r)=\frac{n}{length(r)\cdot width(r)}$$
(3.35)

If the density of homogeneous points \(D\left(r\right)\) in the rectangular region r exceeds a predefined threshold \({D}_{0}\), the region is considered to meet the requirements for a line region. The NFA (Number of False Alarms) calculation is performed to filter the line pixels and output the detected line segments. The flowchart for this process is shown in Fig. 3.40.

Fig. 3.40
A flowchart with 9 steps. It begins with candidate power lines sorted by intercept. It has 3 decision boxes with conditions. If condition 1 is no, power lines are connected within each group and it ends. It has 2 loopbacks to step 2 through 2 equations if condition 2 is yes and condition 3 is no.

Line segment detection based on main direction
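The homogeneous-point density of Eq. (3.35) and the acceptance test against the threshold \({D}_{0}\) described in the preceding paragraph can be sketched as follows; the array names, the default tolerance \(\tau \), and the example threshold are assumptions for illustration:

```python
import numpy as np

def homogeneous_point_density(theta_map, rect_mask, rect_angle,
                              rect_length, rect_width, tau=np.deg2rad(22.5)):
    """Density of homogeneous points D(r) = n / (length(r) * width(r)), Eq. (3.35).

    theta_map:  (H, W) per-pixel principal-direction angles (radians).
    rect_mask:  (H, W) boolean mask of the candidate rectangle r.
    rect_angle: direction of the fitted rectangle (radians).
    """
    diff = np.abs(theta_map[rect_mask] - rect_angle)
    diff = np.minimum(diff, np.pi - diff)      # directions are defined modulo pi
    n = int(np.count_nonzero(diff < tau))      # homogeneous points within tolerance tau
    return n / (rect_length * rect_width)

# A region is accepted as a line region when D(r) exceeds a chosen threshold D0, e.g.:
# accept = homogeneous_point_density(theta_map, mask, angle, length, width) > 0.7
```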

  1. d)

    Screening fit for transmission lines

Image preprocessing such as grayscale transformation is applied to the images collected during inspection, which effectively reduces the interference of the image background. In practice, however, the aerial images collected by the UAV inevitably contain background elements with obvious linear characteristics, such as the edges of buildings, road lane markings and power towers, which interfere with the detection of transmission lines in the video image sequence. In addition, because of complex backgrounds and lighting, the detected transmission lines may be incomplete, with breaks in the middle of the extracted straight-line information, so the detected straight lines need to be screened and fitted. Analysis of the aerial images collected during UAV inspection and of the line information obtained in the previous section shows that the transmission lines are the dominant linear features in the image: most of them run through the whole image, are parallel to one another, and have an obvious direction. After potential line segments are screened with the principal-curvature-based threshold, the remaining long linear targets are mainly transmission lines. Because the inclination angles of transmission lines in the same image differ little, a slope (inclination) interval threshold can be set around the slope of the longest group of detected linear targets to filter the transmission line information. Because of lighting and occlusion, a detected transmission line may be broken and must be fitted. All transmission lines are traversed, and grouped fitting is performed when the detected line information has the following characteristics:

①:

The distance between adjacent transmission lines is less than a certain distance, i.e., two transmission lines are nearly on the same extension line.

②:

The angle between transmission lines is less than a certain threshold.

After filtering, the transmission lines are sorted based on intercept information. The adjacent transmission lines are then sequentially examined to determine if they belong to the same transmission line. In Fig. 3.41, transmission lines \({l}_{1}\) and \({l}_{2}\) are close in distance and have similar angles. According to the aforementioned criteria, it can be approximated that these two transmission lines are different parts of the same transmission line. Therefore, these two line segments are connected and fitted as a single transmission line. Continuing the search for other connectable lines, it is observed that \({l}_{3}\) is far from \({l}_{1}\), and there is a significant angle deviation between \({l}_{4}\) and \({l}_{3}\). Hence, in the graph, the four candidate transmission lines can be fitted into three separate transmission lines. The process flowchart for grouping and fitting the transmission lines is shown in Fig. 3.42.

Fig. 3.41
4 acute angles. The angles are theta 1, theta 2, theta 3, and theta 4, formed with transmission lines l 1, l 2, l 3, and l 4.

Fitting judgment characteristics of transmission lines

Fig. 3.42
A flowchart with 9 steps. It begins with candidate power lines sorted by intercept. It has 3 decision boxes with conditions. If condition 1 is no, power lines are connected within each group and it ends. It has 2 loopbacks to step 2 through 2 equations if condition 2 is yes and condition 3 is no.

Group fitting of transmission lines
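A possible sketch of the grouping and fitting rule described above, merging segments whose endpoint gap and angle difference fall below chosen thresholds; the threshold values and helper names are illustrative, not the project's exact parameters:

```python
import numpy as np

def group_and_fit_segments(segments, max_gap=40.0, max_angle_diff=np.deg2rad(5)):
    """Merge detected line segments that belong to the same transmission line.

    segments: list of ((x1, y1), (x2, y2)) endpoints, pre-sorted by intercept.
    Two segments are merged when their endpoint gap and angle difference are
    both below the thresholds; the merged line keeps the outermost endpoints.
    """
    def angle(seg):
        (x1, y1), (x2, y2) = seg
        return np.arctan2(y2 - y1, x2 - x1)

    def gap(a, b):
        # distance between the closest endpoints of the two segments
        return min(np.hypot(p[0] - q[0], p[1] - q[1]) for p in a for q in b)

    fitted = []
    for seg in segments:
        merged = False
        for i, ref in enumerate(fitted):
            d_angle = abs(angle(seg) - angle(ref))
            d_angle = min(d_angle, np.pi - d_angle)
            if gap(seg, ref) < max_gap and d_angle < max_angle_diff:
                pts = list(ref) + list(seg)
                # keep the two endpoints farthest apart as the fitted line
                pairs = [(p, q) for p in pts for q in pts]
                fitted[i] = max(pairs,
                                key=lambda pq: np.hypot(pq[0][0] - pq[1][0],
                                                        pq[0][1] - pq[1][1]))
                merged = True
                break
        if not merged:
            fitted.append(seg)
    return fitted
```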

  1. 3)

    Straight line or line segment extraction

The Radon transform and the Hough transform are combined to extract the line segments.
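One way such an extraction step might look in practice, sketched here with OpenCV's probabilistic Hough transform only (the Radon-based screening is omitted, and all parameters are illustrative):

```python
import cv2
import numpy as np

def extract_line_segments(image_bgr, canny_lo=50, canny_hi=150):
    """Extract straight line segments with the probabilistic Hough transform."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, canny_lo, canny_hi)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=100, maxLineGap=10)
    # Each segment is returned as (x1, y1, x2, y2).
    return [] if lines is None else [tuple(l[0]) for l in lines]
```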

  1. 4)

    Wire broken-strand and foreign body detection

  1. a)

    The methods for detecting broken or loose wire strands are as follows:

①:

Extract line segments to determine whether the wire is broken.

②:

Extract line segments to locate abnormal bumps.

③:

Use edge detection to extract the maximal edges.

④:

Extract connected blocks by binarization and judge whether a connected block crosses the wire.

⑤:

By template matching.

For pixel-level judgment of wire anomalies, the abnormal region in the image is relatively small, so a pixel-count threshold can be set as the benchmark (a minimal sketch of the connected-block check follows Fig. 3.43). The foreign body detection method is shown in Fig. 3.43.

Fig. 3.43
4 photographs. Photos 1 and 2 have a transmission wire with split stock. A rectangular box outlines the split stock. Photos 3 and 4 have a transmission wire with a foreign body. A rectangular box outlines the foreign body.

Wire broken-strand and foreign body detection
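A minimal sketch of the binarization and connected-block check (method ④ above) combined with the pixel-count threshold; `wire_mask` is a hypothetical mask of the extracted wire region and the threshold value is illustrative:

```python
import cv2
import numpy as np

def suspect_foreign_body(image_bgr, wire_mask, min_pixels=200):
    """Binarize the image, extract connected blocks, and flag blocks that both
    exceed a pixel-count threshold and overlap (cross) the wire region."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    num, labels = cv2.connectedComponents(binary)
    suspects = []
    for lbl in range(1, num):              # label 0 is the background
        block = labels == lbl
        if block.sum() >= min_pixels and np.logical_and(block, wire_mask).any():
            suspects.append(lbl)
    return suspects
```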

  1. 5)

    Abnormal weather handling

Three types of abnormal weather conditions are handled: overcast and rainy days, snowy days, and backlit situations.

Overcast and rainy days: Rain may leave water droplets on the wires, which can interfere with the algorithm computation and lead to missed or erroneous detections. In such cases, a water droplet removal algorithm is applied to eliminate the impact of the droplets.

Snowy days: Prolonged snowfall can result in ice formation or the accumulation of small snowballs on the wires, which can also cause interference during algorithm computation. Since these samples are relatively rare, they are processed when sufficient samples are available.

Backlit situations: When capturing images towards the sky, backlit conditions are often encountered. First, the brightness value is evaluated, and if it exceeds a specified threshold, it is considered as a backlit situation. Then, a white balance algorithm is used to reduce the brightness.
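A rough sketch of the backlight handling just described: the mean brightness is checked against a threshold and, if exceeded, a simple gray-world white balance with brightness scaling is applied (the threshold value and the specific correction are illustrative, not the authors' exact algorithm):

```python
import cv2
import numpy as np

def correct_backlight(image_bgr, brightness_threshold=180):
    """Treat the frame as backlit when mean brightness exceeds the threshold,
    then apply gray-world white balance plus brightness scaling."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    if gray.mean() <= brightness_threshold:
        return image_bgr                                 # normal exposure, unchanged
    img = image_bgr.astype(np.float64)
    mean_per_channel = img.reshape(-1, 3).mean(axis=0)
    img *= mean_per_channel.mean() / mean_per_channel    # gray-world balance
    img *= brightness_threshold / gray.mean()            # pull brightness back down
    return np.clip(img, 0, 255).astype(np.uint8)
```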

  1. 6)

    Algorithm verification and testing

Based on actual sampling, a portion of the samples are used as the algorithm tuning set, while another portion is used as the algorithm testing set. A 3:7 ratio is employed as the basis for algorithm validation. The detection criterion states that an accuracy rate of 80% or higher within 70% of the samples is considered acceptable. Otherwise, further algorithm optimization is required to achieve the desired results.

  1. (3)

    Guide wire wind deflection identification program

  1. 1)

    Wind deflection overview

Wind-induced displacement fault, also known as wind deviation fault, is one of the most common wind-related hazards on power transmission lines. It occurs when conductors and insulators sway under the influence of wind, and the resulting inadequate electrical clearance leads to discharge and tripping.

Wind deviation detection primarily involves checking whether the distance between insulator strings and tower poles exceeds the limit of electrical air gap, as well as ensuring the normal distance between conductors. Drone-based wind deviation detection can be divided into two parts: detecting the suspended insulators on tower poles and detecting any abnormalities in the conductors during line inspection. The wind deviation detection model is illustrated in Fig. 3.44.

Fig. 3.44
A diagram of a tower with a wind deviation detection model. The labeled parts are cross-arm length d, wind deflection angle theta, electrical gap d, X 0, and horizontal offset of the insulator.

Wind deflection detection of guide wire

  1. 2)

    Testing ideas

The overall idea is shown in Fig. 3.45. For UAV-based wind deflection detection of the conductor, the UAV first collects images at fixed points; after the inspection is completed, the collected images are processed by the recognition algorithm to obtain the inspection results.

Fig. 3.45
A process diagram with 7 steps. The steps are picture acquisition, pretreatment, insulator string segmentation and extraction, calculation of wind deflection of insulator, calculation of wind deflection of insulator, communication network, and unified monitoring platform.

Ideas for wind deflection detection of guide wire

The camera gimbal of the UAV is calibrated before image acquisition to reduce the influence of lens distortion, improve the quality of the acquired images, and further increase the accuracy of recognition. The camera calibration uses Zhang's calibration method, and the process is shown in Fig. 3.46.

Fig. 3.46
A process diagram with 6 steps. The steps are read calibration picture, rough extraction of corner information, corner precise extraction, obtain camera calibration parameters, dedistortion, and evaluation and calibration results.

Calibration process of guidewire wind deflection detection camera
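A minimal sketch of Zhang's calibration with OpenCV, assuming checkerboard calibration photos are available; the board size, square size and file pattern are illustrative:

```python
import glob
import cv2
import numpy as np

def calibrate_camera(image_glob, board_size=(9, 6), square_size=25.0):
    """Zhang's calibration from checkerboard photos: detect corners, refine them,
    then solve for the intrinsic matrix and distortion coefficients."""
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_size
    obj_points, img_points, img_shape = [], [], None
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3)
    for path in glob.glob(image_glob):
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
        img_shape = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:
            corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
            obj_points.append(objp)
            img_points.append(corners)
    rms, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points, img_shape, None, None)
    return rms, K, dist   # images can later be undistorted with cv2.undistort(img, K, dist)
```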

Fig. 3.47
A flowchart with 10 steps. The 8 steps are input image, acquisition image, image filtering, image segmentation, expansion corrosion, skeleton extraction, obtain the coordinates of both ends, and calculate the insulator offset. If it exceeds the threshold, severe wind deviation, if not, unbiased.

Process of judging wind deflection of guide wire

The recognition is mainly divided into two stages: localization and judgment. Localization is implemented with a high-precision, lightweight deep learning object detection model: the images collected by the UAV are classified and labeled as samples, the samples are fed into the detection network, and the trained model can then accurately detect the position of the suspended insulator.

After the position of the hanging insulator is obtained, the wind deflection can be determined by image processing. The main steps include image preprocessing, morphological processing and wind deflection angle calculation. The detailed steps are shown in the flowchart in Fig. 3.47.
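The deflection angle computed from the two skeleton endpoints can be sketched as follows; the example coordinates and the severity threshold are illustrative assumptions:

```python
import numpy as np

def insulator_deflection_angle(top_xy, bottom_xy):
    """Wind-deflection angle of a suspended insulator from the two endpoints of
    its extracted skeleton: the angle between the insulator axis and vertical."""
    dx = bottom_xy[0] - top_xy[0]
    dy = bottom_xy[1] - top_xy[1]
    return np.degrees(np.arctan2(abs(dx), abs(dy)))   # 0 deg = hanging straight down

# Example: flag severe wind deviation when the angle exceeds a design threshold.
# severe = insulator_deflection_angle((412, 120), (436, 388)) > 20.0
```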

  1. (4)

    Tower collapse identification program

The following is a block diagram of the overall process, as shown in Fig. 3.48.

Fig. 3.48
A flowchart with 6 steps. The steps are images taken by a drone, image preprocessing, the insulator target detection model obtains the insulator region, insulator damage discrimination model judgment, normal insulator, and the insulator is damaged.

Tower collapse identification process

  1. 1)

    Image segmentation algorithm

  1. a)

    Feature encoder based

The self-encoder consists of two parts:

The encoder: this part compresses the input into a latent space representation and can be represented by the encoding function h = f(x).

The decoder: this part reconstructs the input from the latent space representation and can be represented by the decoding function r = g(h).

Thus, the entire self-encoder can be described by the function g(f(x)) = r, where the output r is similar to the original input x.

An autoencoder can be regarded as a neural network with a single hidden layer, in which features are reconstructed by compression and restoration.

The input data serve as the features: the encoder, from the input layer to the hidden layer, compresses the input into a latent space representation, and the decoder, from the hidden layer to the output layer, reconstructs the input from that representation. The numbers of input and output neurons of the autoencoder both equal the feature dimension. Training aims to make the output features as similar as possible to the input features; since the autoencoder attempts to reproduce its original input, the target output during training is the input itself, i.e., y = x, so the input and output of an autoencoder have the same structure. By training this network on the training data, it learns the mapping x → h → x.
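A minimal single-hidden-layer autoencoder sketch in PyTorch illustrating the x → h → x training target described above; the layer sizes, activation and training loop are illustrative assumptions:

```python
import torch
from torch import nn

class AutoEncoder(nn.Module):
    """Single-hidden-layer autoencoder: the encoder h = f(x) compresses the
    input, the decoder r = g(h) reconstructs it, and training drives r toward x."""
    def __init__(self, n_features, n_hidden):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, n_hidden), nn.ReLU())
        self.decoder = nn.Linear(n_hidden, n_features)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train_autoencoder(model, data, epochs=50, lr=1e-3):
    """Fit the autoencoder so that the reconstruction matches the input (y = x)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(data), data)   # reconstruction target is the input itself
        loss.backward()
        opt.step()
    return model
```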

  1. b)

    Regional proposal based

Region proposal is a very common technique in computer vision, especially in object detection. The core idea is to group pixels into candidate regions based on color space and similarity measures and thereby locate the regions to be detected; classification prediction is then performed on the proposed regions.

  1. c)

    Image segmentation applications

By using machine vision algorithms, the target subject can be identified, allowing for the separation of foreground and background. This enables image segmentation, where the target object in the image is distinguished from the background scene. As a result, background interference can be eliminated, and the focus can be solely on processing the objects of interest for recognition and detection.

  1. 2)

    Image retrieval algorithms

  1. a)

    Image Library Retrieval Images

Image retrieval means finding, from a large collection of images, the images similar to the image to be matched, i.e., searching for images by image. As the image repository grows and is kept up to date, an image retrieval database gradually forms between the images and the associated information. The user can then retrieve the required images from the existing image database according to a specific query image.

  1. b)

    Contrast image features

On the basis of the image retrieval library, when the UAV takes aerial photographs of the inspected poles and towers, the user can retrieve, according to the background information, pictures taken in the nearest time period or at a reference point, and compare the captured pictures with the retrieved ones in terms of similarity or feature extrema; the comparison result is used as an evaluation parameter for the anomaly detection algorithm. If the evaluation parameter satisfies the set condition, anomaly detection is not required and the recognition result is fed back directly; otherwise, anomaly detection is performed.

  1. 3)

    Abnormal image detection algorithm

  1. a)

    Abnormal image classification algorithm

We build the corresponding network model and train it on the image library in batches with a deep learning framework; after training, the weight file corresponding to the number of iterations is output. On this basis, inference is performed according to the preset labeled classes, and in this way the classification information of the abnormal image can be obtained.

  1. b)

    Anomaly detection algorithm

In this case, the analysis is performed on the input using a residual network binary classification model. After obtaining the abnormal classification information, further anomaly detection is required to determine whether the tower has collapsed. Tower collapse is considered an anomaly, and if a collapsed tower is detected, an abnormality message is generated, and the corresponding results are sent back to the service interface.

  1. (5)

    Insulator Loss Identification Program

  1. 1)

    Insulator Damage Overview

Insulators, as crucial insulation devices in power systems, play a vital role in providing mechanical support and preventing current from flowing to the ground in overhead transmission lines. As one of the components prone to faults, insulators are susceptible to damage and loss due to natural factors such as disasters, temperature, and humidity. These faults directly pose a threat to the transmission stability of the entire power system.

In recent years, with the rapid development of unmanned aerial vehicle (UAV) technology and computational image processing, the fault inspection of key electrical equipment in transmission lines based on aerial images has become a major direction to replace traditional manual line inspections.

In line with the development direction of UAV inspections, a detection method for insulator damage and loss has been proposed. This method takes advantage of the fixed number of insulators in each section of the transmission line. By preprocessing the insulator images, extracting their contours, and counting the number of insulators, a simple and efficient fault detection approach is applied to identify insulator faults. This method provides a reference for fault detection in transmission lines.

  1. 2)

    Algorithm ideas

First, images are captured by a UAV at fixed navigation points. The captured images are then subjected to image preprocessing to enhance their quality. Next, insulator localization is performed on the processed images to identify the target regions containing insulators. Finally, a fault analysis is conducted on the target regions to detect any damages or faults. The algorithmic flowchart is illustrated in Fig. 3.49.

Fig. 3.49
A flowchart with 6 steps. The steps are images taken by a drone, image preprocessing, the insulator target detection model obtains the insulator region, insulator damage discrimination model judgment, normal insulator, and the insulator is damaged.

Insulator loss identification process

  1. 3)

    Image target detection

  1. a)

    One-stage based target detection

Yolo is a representative algorithm for one-stage object detection. This algorithm directly performs classification regression and bounding box regression using a feature extraction network. Compared to two-stage algorithms, it eliminates the need for the region proposal generation stage, resulting in faster detection speed. However, the detection accuracy may be slightly compromised. The Yolo series networks employ a feature pyramid structure that combines deep and shallow features, enriching the information extracted by the network and improving accuracy.

The simple structure of Yolov3 is illustrated in Fig. 3.50.

Fig. 3.50
A framework of Yolo v 3 based on a darknet 53 framework. It presents the type, filters, size, and output of layers and residual boxes, scales 1, 2, and 3 with convolutional layer and YOLO detection. It also has a Darknet 53 framework with steps, 5 convolutions, up-sampling, and 3 outputs.

A simple illustration of the structure of Yolov3

  1. b)

    Two-stage based target detection

Two-stage target detection is represented by Faster RCNN. A feature extraction network produces a feature map, a region proposal network generates regions of interest (ROIs) on the feature map, and classification and bounding box regression are then performed on those ROIs. Because of the additional ROI generation stage, the detection accuracy is higher than that of one-stage algorithms, but the detection speed is lower.

The structure of the Faster RCNN model is shown in Fig. 3.51.

Fig. 3.51
A 3-D diagram of a faster R C N N model. It presents a convolution layer, feature map, area generation network, generate, return pool, and classification.

Structure of faster RCNN model
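As a rough illustration of how such a two-stage detector might be set up for insulator detection, the following sketch adapts torchvision's Faster R-CNN; the pretrained weights, class count and score threshold are assumptions, not the project's actual model:

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_insulator_detector(num_classes=2):
    """Two-stage detector (Faster R-CNN) with its box head replaced so that it
    predicts background vs. insulator; weights and class count are illustrative."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

def detect(model, image_tensor, score_threshold=0.7):
    """Run inference on a single CHW float image tensor scaled to [0, 1]."""
    model.eval()
    with torch.no_grad():
        out = model([image_tensor])[0]
    keep = out["scores"] > score_threshold
    return out["boxes"][keep], out["scores"][keep]
```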

  1. 4)

    Anomaly detection algorithm

  1. a)

    Image based classification model

After the object detection in the image, clear and high-resolution images are extracted based on the detection information and saved. These images are then divided into positive and negative samples, which are fed into the classification model for training.

The insulator damage classification model is constructed as shown in Fig. 3.52.

Fig. 3.52
A process diagram with 3 steps. A picture of a damaged insulator and a picture of a normal insulator go to the classification model to give 0 and 1 for the damaged and normal insulators, respectively.

Insulator damage classification model construction idea

Insulator damage detection is shown in Fig. 3.53.

Fig. 3.53
A process diagram with 3 steps. The image to be detected goes to the classification model to give 0 and 1 for the damaged and normal insulators, respectively.

Insulator damage classification model construction

  1. b)

    Based on image restoration model

Based on the object detection, the positional information of the detected targets is obtained. High-resolution target images are then extracted from the original image based on the position coordinates. These target images are fed into an image restoration model. When the input image represents a complete target, the model generates an image that closely resembles the original image. However, when the input image represents a damaged target, the model repairs the image and generates a completed image of the target, which may have significant differences from the original image. The degree of difference between the restored image and the original image is used to determine the presence of anomalies. The construction of the image restoration model is illustrated in Fig. 3.54.

Fig. 3.54
A process diagram with 5 steps. The picture of a damaged insulator goes to the generator followed by the generation of the repaired image. The repaired image and picture of a normal insulator go to a discriminator that gives 0 and 1 for damaged and normal insulators, respectively.

Image restoration model construction

Insulator damage detection is shown in Fig. 3.55.

Fig. 3.55
A process diagram with 5 steps. The image to be detected is followed by a generator and a picture of a damaged insulator. The image to be detected and the damaged insulator picture go to a comparator that gives 0 and 1 for damaged and normal insulators, respectively.

Detecting damaged insulators
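The decision based on the difference between the restored image and the original patch can be sketched as follows; the threshold value and the mean-absolute-difference measure are illustrative choices, and the restored image is assumed to come from a hypothetical restoration (generator) model:

```python
import numpy as np

def is_damaged(original, restored, diff_threshold=0.08):
    """Decide damage from the difference between the target patch and the image
    produced by the restoration model: a large reconstruction error suggests the
    target was incomplete and had to be 'repaired'."""
    a = original.astype(np.float64) / 255.0
    b = restored.astype(np.float64) / 255.0
    mean_abs_diff = np.abs(a - b).mean()
    return mean_abs_diff > diff_threshold, mean_abs_diff
```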

  1. (6)

    Tower head loss identification solution

The overall flow block diagram is shown in Fig. 3.56.

Fig. 3.56
A flowchart with 9 steps. It has an input image followed by 3 steps. If the answer to decision box 1 is no, the output is true and the chart ends. If yes, it proceeds. After one step, if the answer to decision box 2 is yes, the output is true and the chart ends. If no, the output is false and it ends.

Tower head loss identification process

  1. 1)

    Image target detection

Based on the image database, supervised labeling of samples is performed to train a deep neural network model. This process enables the learning of the main target object, and as a result, the bounding box of the outermost perimeter of the object is obtained, thereby determining the position information of the target.

  1. 2)

    Image retrieval algorithms

During the inspection process, the tower top position information is manually determined or automatically located using a deep neural network model. This information is then stored in the corresponding database location, and the data is continuously recorded and updated. This enables easy retrieval and comparison of captured images in the database for subsequent analysis. If a significant difference is observed during the comparison, it is considered as a missing tower top component. If no significant difference is found, the input is analyzed using an anomaly detection algorithm for further analysis.

  1. 3)

    Abnormal site detection algorithm

The captured images are matched and compared with the feature and sample libraries stored in the database. Traditional computer vision techniques, such as calculating contour area and other relevant information, are used as criteria. By analyzing these criteria, it is determined whether there are any anomalies in the identified parts. The detection results are then generated based on this analysis.

  1. (7)

    Foreign body suspension identification solution

  1. 1)

    Overview of foreign body suspension

Severe natural conditions directly contribute to incidents of insulator stringing, while human activities lead to frequent power line failures and foreign object intrusions. These incidents not only severely impact the safety of power grid operations but also pose a significant threat to the lives and property of people.

In recent years, short circuit and tripping incidents caused by foreign objects such as plastic films, kites, and bird nests on power transmission lines have become another common form of external damage. These incidents also pose a risk to the safety of pedestrians and vehicles under the power lines. Therefore, it is of great significance to promptly detect foreign objects and take appropriate measures.

The situations involving insulator stringing, foreign objects in the power grid, and accidents caused by large vehicles during construction are highly complex and widely distributed. They are characterized by their suddenness, randomness, and dispersal, posing significant challenges to preventive and control work.

Considering the drawbacks of low efficiency, poor reliability, and risks to personnel safety in manual inspections, unmanned aerial vehicle (UAV) line inspection robots are used to move along power transmission lines, inspecting and documenting potential hazards caused by foreign objects. These robots significantly improve the level of automation in inspections.

  1. 2)

    Testing ideas

  1. a)

    Foreign object detection based on generative adversarial networks and deep residual neural networks:

The structure of the generative adversarial network based on the sample set expansion of the generative adversarial neural network is shown in Fig. 3.57.

Fig. 3.57
A process diagram with 5 steps. Random noise is followed by a generator and the generation of a picture. The generated picture and the real picture go to the real picture that gives 0 and 1 for real data and false data, respectively.

Generating an adversarial network

The loss function in the training process of GAN (Generative Adversarial Network) is:

$$\underset{G}{\mathrm{min}}\,\underset{D}{\mathrm{max}}\,V(D,G)={E}_{x\sim {p}_{data}\left(x\right)}[\mathrm{log}D(x)]+{E}_{z\sim {p}_{z}\left(z\right)}[\mathrm{log}(1-D(G(z)))]$$
(3.36)

Specific steps (a minimal training-step sketch follows this list):

  1. m noise samples are selected from the noise prior distribution \({p}_{z}\left(z\right)\).

  2. m real data samples are also selected from the training data samples.

  3. For the discriminant network D, the gradient change of its loss function \({J}^{D}({\theta}^{(G)},{\theta}^{(D)})\) with respect to parameter \({\theta }^{\left(D\right)}\) is:

    $${\theta }^{\left(D\right)}\leftarrow {\theta }^{\left(D\right)}+\nabla \frac{1}{m}\sum_{i=1}^{m}\left[\mathrm{log}D\left({x}^{\left(i\right)}\right)+\mathrm{log}\left(1-D\left(G\left({z}^{\left(i\right)}\right)\right)\right)\right]$$
    (3.37)
  4. For the generated network G, the gradient change of its loss function \({J}^{G}({\theta}^{(D)}, {\theta }^{(G)})\) with respect to parameter \({\theta }^{\left(G\right)}\) is:

    $${\theta }^{\left(G\right)}\leftarrow {\theta }^{\left(G\right)}-\nabla \frac{1}{m}{\sum }_{i=1}^{m}\left[\mathrm{log}\left(1-D\left(G\left({z}^{\left(i\right)}\right)\right)\right)\right]$$
    (3.38)
  5. Use the ReLU function as the activation function in the G network, use the tanh activation function in the last layer, and use the LeakyReLU as the activation function in the D network. The three activation function expressions are as follows:

    $$ReLU\left(x\right)=\mathrm{max}\left(0,x\right)$$
    (3.39)
    $$\mathrm{tanh}\left(x\right)=\frac{{e}^{x}-{e}^{-x}}{{e}^{x}+{e}^{-x}}$$
    (3.40)
    $$LeakyReLU\left(x\right)=\mathrm{max}\left(0.01x,x\right)$$
    (3.41)
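A minimal sketch of one alternating training step corresponding to Eqs. (3.37) and (3.38), assuming hypothetical generator G and discriminator D modules (D returning a single logit per sample) and pre-built optimizers; the binary cross-entropy form is the usual numerically stable way of writing the log terms:

```python
import torch
from torch import nn

def gan_training_step(G, D, real_batch, opt_G, opt_D, noise_dim):
    """One alternating update of discriminator D and generator G following the
    minimax loss of Eq. (3.36)."""
    bce = nn.BCEWithLogitsLoss()
    m = real_batch.size(0)
    z = torch.randn(m, noise_dim)

    # Discriminator: ascend log D(x) + log(1 - D(G(z))), Eq. (3.37).
    opt_D.zero_grad()
    loss_D = bce(D(real_batch), torch.ones(m, 1)) + \
             bce(D(G(z).detach()), torch.zeros(m, 1))
    loss_D.backward()
    opt_D.step()

    # Generator: minimize log(1 - D(G(z))), Eq. (3.38), written here in the
    # common non-saturating form -log D(G(z)).
    opt_G.zero_grad()
    loss_G = bce(D(G(z)), torch.ones(m, 1))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```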
  1. 3)

    Residual neural network

Figure 3.58 shows the residual structure module.

Fig. 3.58
2 flow diagrams. A. Input X goes to the weight layer followed by R e L U and the weight layer to give H of x. B. Input X goes to the weight layer followed by R e L U and the weight layer to give F of x that goes to a summing point. X also goes to the summing point. H of x equals F of x plus x.

Residual structure module

The formula is as follows:

$$H\left(x\right)=F\left(x\right)+x$$
(3.42)
$${x}_{K}={x}_{0}+\sum_{i=1}^{K}F\left({x}_{i-1}\right)$$
(3.43)
$$\frac{\partial Loss}{\partial {x}_{0}}=\frac{\partial Loss}{\partial {x}_{K}}\left(1+\frac{\partial }{\partial {x}_{0}}\sum_{i=1}^{K}F({x}_{i-1})\right)$$
(3.44)
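A minimal residual module sketch corresponding to H(x) = F(x) + x; the two-convolution body and channel count are illustrative:

```python
import torch
from torch import nn

class ResidualBlock(nn.Module):
    """Basic residual module: H(x) = F(x) + x (Eq. 3.42); the identity shortcut
    keeps gradients flowing as in Eq. (3.44)."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)   # F(x) + x
```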

The overall foreign body detection process is shown in Fig. 3.59 and consists of three main parts: a deep residual neural network for feature extraction from the input image, a region proposal network, and a region-of-interest-based classifier.

Fig. 3.59
A flowchart. A photograph goes to a deep residual neural network followed by a feature map, 3 by 3, 1 by 1, and 1 by 1 layers, softmax layer, recommended area, area of interest pooled layer, multiple layers that give window and class probabilities to give a final photograph.

Flowchart for detecting foreign bodies

  1. 4)

    Sample expansion

In addition to the sample pictures generated by the generative adversarial network, samples produced by augmentation methods commonly used in digital image processing are also added (a minimal augmentation sketch follows this list).

  1. a)

    2D rotation. Rotate the input picture by a random angle.

  2. b)

    Mirror image. It includes horizontal mirror image and vertical mirror image.

  3. c)

    Dimensional transformation. Shrink the target image appropriately.

  4. d)

    Image filtering. The method used is to add Gaussian filtering. Let the size of the picture be N × N, and establish the mapping of \(f\left(i,j\right)\to g\left(x,y\right)\) as:

    $$\left\{\begin{array}{c}g\left(x,y\right)=\frac{1}{\mid S\mid }{\sum }_{i,j\in S}w\left(i,j\right)f\left(i,j\right)\\ w\left(i,j\right)=A{e}^{-\frac{{i}^{2}+{j}^{2}}{2{\sigma }^{2}}}\end{array}\right.$$
    (3.45)
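A rough sketch of the four augmentation operations listed above using OpenCV; the angle range, scale factor and kernel size are illustrative:

```python
import cv2
import numpy as np

def augment(image_bgr, rng=np.random.default_rng()):
    """Simple augmentations used to expand the sample set: random rotation,
    horizontal/vertical mirroring, scaling, and Gaussian filtering."""
    h, w = image_bgr.shape[:2]
    out = []
    # a) 2D rotation by a random angle about the image center
    angle = float(rng.uniform(-30, 30))
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    out.append(cv2.warpAffine(image_bgr, M, (w, h)))
    # b) horizontal and vertical mirror images
    out.append(cv2.flip(image_bgr, 1))
    out.append(cv2.flip(image_bgr, 0))
    # c) dimensional transformation (shrink the target image)
    out.append(cv2.resize(image_bgr, (0, 0), fx=0.8, fy=0.8))
    # d) Gaussian filtering (cf. Eq. 3.45 with a normalized Gaussian kernel)
    out.append(cv2.GaussianBlur(image_bgr, (5, 5), sigmaX=1.5))
    return out
```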

3.4 Disaster-Related Social Information Collection Technology

3.4.1 Summarize

Research the intelligent collection, sharing and integration technology of social public emergency information related to disaster loss areas, support the integration of disaster information, and realise the integration with the government's emergency management data platform. By sharing information and resources with the emergency platforms of various regions and departments, the overall emergency plan can be linked with special plans, departmental plans, local plans and grassroots plans, vertically involving the national, provincial, municipal, county and grassroots levels, and horizontally involving natural disasters, accidents and calamities, public health and social security, truly realising a coordinated response between multiple departments and localities.

3.4.2 Disaster Damage Geographical Related Social Public Emergency Information Collection

Study the intelligent collection, sharing and integration technology of social and public emergency information related to disaster-loss regions, support the fusion of disaster-loss information, and realize integration with the big data platform of government emergency management. The integrated framework for intelligent collection of public emergency information is shown in Fig. 3.60. Through information exchange and resource sharing with the emergency platforms of various regions and departments, the overall emergency plan can be linked with special plans, departmental plans, local plans and grassroots plans, vertically reaching the national, provincial, municipal, county and grassroots levels and horizontally covering natural disasters, accidents and calamities, public health and social security, truly achieving a collaborative response among multiple departments and regions.

The disaster-loss regional information acquisition subsystem differs from general acquisition, which attends only to signal access and ignores automatic processing of the signal; it should also provide preliminary information analysis and processing functions. Take image monitoring as an example. Both China and the United Kingdom have successfully developed technology that uses image monitoring systems to automatically monitor fire conditions in large spaces, large outdoor venues and urban areas; after a fire occurs, the system automatically raises an alarm and switches the scene image of the fire to the control center. Urban traffic image monitoring systems automatically analyze license plate numbers at key checkpoints in and out of the urban area, compare them with data in the database, take photos and give early warnings, assisting road management departments and public security organs in investigating cases. With the development of society, more and more data sensor networks and image monitoring networks are being organically integrated: when abnormal signals such as pressure, temperature and concentration appear, the relevant on-site images are immediately and automatically sent to the control center for processing and confirmation by the on-duty personnel. Using satellite remote sensing images, precipitation analysis, typhoon analysis, heavy fog monitoring, sandstorm information, water regime monitoring, sea ice monitoring, vegetation change, drought monitoring, forest fire information, snow information, urban heat island analysis, estuarine sediment analysis, land desertification analysis and so on can be realized. Some of these analyses can be completed automatically, while others require more in-depth data processing and analysis.

This project adopts a combination of peacetime and emergency operation. Data distributed among the departments involved in power emergency work are to be exchanged through the data center, forming a data exchange and sharing system. The exchange and warning data of professional government departments (such as meteorology, land and forestry) are accessed through the data access platform, so as to provide an effective technical solution for the data integration of the system.

Fig. 3.60
A framework. Grid data access with 4 components gives a provincial power company data center. Public information collection gives a data access platform. Data exchange and sharing occur between both sides which fuses to give power grid perception and emergency command system.

Integrated framework diagram of intelligent collection of public emergency information

This project has realized the comprehensive access integration of meteorological data in disaster areas, as shown in Fig. 3.61, giving an example of intelligent collection and sharing integration of public emergency information, and the specific access data include:

Fig. 3.61
A photograph of a monitor and 4 screenshots. The monitor has a photograph, an aerial view, and some details on the screen. The screenshots present headline data on power outages with the help of 2 line graphs, news data collection, Weibo data on power outages, and meteorological data collection.

Intelligent collection and sharing and integration technology of public emergency information

  1. (1)

    Real-time meteorological monitoring: visually display the real-time situation of meteorological information such as rainfall, temperature, wind speed, wind direction, air pressure and humidity in the disaster area; sort the list of meteorological elements, highlight the characteristic values (maximum, minimum and average), and display the change process of meteorological elements as a process line. Meteorological stations are displayed by grade, and rainfall and other elements of the stations are displayed as bar-chart statistics by day, month, year and other statistical periods;

  2. (2)

    Weather forecast: obtain authoritative weather forecast information through the local meteorological station in the disaster area. Data of meteorological stations in disaster areas: typhoon forecast path, short, medium and long-term weather forecast documents, city and county weather forecast, prefecture-city numerical forecast, prefecture-city three-day forecast;

  3. (3)

    Meteorological warning: based on public meteorological information coupled with the on-site situation of power grid equipment, monitoring results are displayed in real time, and a warning notice can be automatically generated after abnormal weather is analyzed, realizing real-time early warning and warning-level forecasting of disasters. The real-time warning function automatically reports the number of abnormal meteorological data items, meteorological disaster warnings and other information.

3.4.3 Multi-source Social Public Emergency Information Collection

The main sources of social public emergency information include:

  1. (1)

    Government departments. At present, a relatively complete system of government emergency response departments has been established in China, and the central government and local governments undertake the tasks of emergency monitoring, early warning, decision-making and command. For example, after the occurrence of a disaster, the Ministry of Emergency Management shall collect and release disaster information; the sources of epidemic emergency information monitoring mainly include information collection of national, provincial, municipal and county epidemic prevention and control agencies.

  2. (2)

    Private institutions. The emergency information held by consulting agencies, community organizations, enterprises, schools, public welfare organizations, etc., is often targeted and has its own characteristics, and the knowledge and experience of institutional experts is also a valuable asset.

  3. (3)

    The masses of the people. It can fill the omissions in the scope of information collection of government departments and non-governmental institutions, and play the role of collective prevention and control.

  4. (4)

    The Internet. Social and public emergency information collection channels can be divided into institutional channels and non-institutional channels.

Institutional channels mainly include:

  1. (1)

    Reporting channels. The government sets up reporting channels so that emergency information can be obtained in a timely manner through reports from units and individuals.

  2. (2)

    Reporting channel. After the occurrence of an emergency, each region and department shall report immediately, and continue to report the relevant situation in time.

  3. (3)

    Document channels. Collect information through administrative document channels.

Non-institutional channels mainly include: (1) Media channels, including newspapers, magazines, radio, television, the Internet and so on; mass communication media are fast, wide-ranging and highly influential. (2) Oral channels. (3) Literature retrieval channels. (4) Communication and interaction channels, including regional and inter-institutional communication.

Emergency information collection methods include traditional methods and modern methods. Traditional methods, which are mainly manual, include personnel on duty, emergency investigation and consultation. Modern methods rely on advanced science and technology, including docking with the collection systems of specialized monitoring institutions and public opinion information collection; other methods include regional communication. Traditional collection methods consume considerable human and financial resources and cannot fully meet the requirements of various natural disasters, so this project focuses mainly on modern collection methods.

The technologies for collecting and utilizing social public emergency information include: (1) Sensor technology, which collects information from a large number of sources and supports its processing and identification; in emergency management it can be used to obtain information about emergencies and the affected objects. (2) 3S technology, which is applied to geological disaster early warning and monitoring: field investigation of the disaster zone is carried out first, remote sensing information is then extracted, GIS is used to analyze the disaster distribution, GPS is used to monitor landslide activity, and finally the collected information is used to evaluate and warn of disasters. (3) Internet of Things technology. At present, the application scope of the Internet of Things in geological disaster emergency response is still limited and mainly supports fixed-point information collection; it can be used to build a disaster monitoring platform and realize comprehensive monitoring of disaster information through model simulation and quality management. (4) Radio-frequency identification (RFID) technology. This radio-frequency information technology works mainly by exchanging data between readers and tags, using timing and energy to communicate, and is widely used in libraries and archives. After information archives encounter a flood or other crisis event, the barrier-free reading capability of RFID can be used to quickly collect the relevant information, locate the geographical position of the archives and save them in time. (5) Web crawler technology. A web crawler is a program or script that automatically captures massive amounts of information on the Internet according to certain rules. Focused or topic-oriented crawlers selectively crawl pages related to social public emergency information to meet the information needs of emergency staff.
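As a rough illustration of focused crawling for public emergency information, the following sketch fetches pages, keeps those containing assumed emergency keywords, and follows links breadth-first; the keyword list, URL filtering and libraries used are assumptions for illustration:

```python
import requests
from bs4 import BeautifulSoup

EMERGENCY_KEYWORDS = ("power outage", "typhoon", "flood", "landslide", "blackout")

def crawl_emergency_pages(seed_urls, max_pages=50):
    """Focused crawl: fetch pages, keep only those whose text mentions emergency
    keywords, and follow in-page links breadth-first up to max_pages."""
    seen, queue, hits = set(), list(seed_urls), []
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue
        soup = BeautifulSoup(html, "html.parser")
        text = soup.get_text(" ", strip=True).lower()
        if any(k in text for k in EMERGENCY_KEYWORDS):
            hits.append({"url": url, "title": soup.title.string if soup.title else ""})
        queue.extend(a["href"] for a in soup.find_all("a", href=True)
                     if a["href"].startswith("http"))
    return hits
```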

With society's increasing demand for public emergency information collection, information collection relying on advanced modern science and technology is becoming dominant. As monitoring institutions, monitoring stations and monitoring terminals are brought online, the collection of front-line information is becoming professional, digital and intelligent. Although traditional collection methods have some disadvantages, they should not be ignored and remain an important auxiliary means of collection. The emergency information collection mode is gradually moving toward integrated emergency management, with the operations of emergency information collection, organization, reporting and early warning gradually integrated in one management system; establishing special databases of emergency information resources is another development trend, which helps to call up relevant geographic information, disaster information, cases, etc. in a timely manner. In terms of emergency information collection technology, 3S technology, the Internet of Things, wireless communication, and topic detection and tracking are widely used, making multi-source social emergency information collection gradually mature.

3.4.4 Research on Technical Standards of Information Collection and Sharing Push Interface

Modern emergency management attaches great importance to the collection and management of emergency information and resources. Only by mastering the relevant information and resources in advance can the most scientific and effective decisions and actions be taken when an emergency occurs. Power grid emergencies require panoramic state data, and the massive, heterogeneous and polymorphic data generated in the operation, maintenance and management of the power grid have typical big data characteristics; without organization, such massive and complex data cannot effectively support power emergency work. Power emergency response therefore needs technical specifications for emergency data collection and standardized information collection, sharing and push interfaces, so as to provide strong data and information support for power emergency disposal.

By standardizing the power emergency data classification, data collection scope, data collection requirements, etc., the internal information data collection regulations and external information data collection regulations are formulated, as shown in Tables 3.9 and 3.10, to realize the information collection sharing and push.

Table 3.9 Internal data collection regulations table
Table 3.10 External data acquisition regulations table

Through long-term, in-depth cooperation between the Wisdom Research Institute and the Energy Industry Electricity Emergency Standardisation Committee, and drawing on the standardisation experience related to emergency business data, emergency plan data and emergency data collection in the energy industry, the data access interface of the emergency disaster data system and the interface standard for pushing data to the emergency data system were designed, standardising the interface and data standard for collecting emergency data from social public information in the electricity industry. A power industry standard, Technical Specification for Electricity Emergency Data Collection, has been developed and submitted for approval; through this standard the information collection and sharing interface can be unified. Standardised technical solutions for the active collection of public information by emergency systems and interface solutions for the passive reception of pushed information were proposed, enabling the research results to influence the emergency work of various units across the power industry.