1 Introduction

Keeping traffic infrastructure in good working condition is essential for driving safety. Cracks are one of the main factors that threaten the normal and safe operation of infrastructure [1]. Detecting crack development and propagation promptly and accurately can help prevent major accidents. Crack detection is therefore of considerable value in the field of transportation facilities.

In recent years, advances in image processing technology have led to significant progress in computer-based crack identification and detection. Peng et al. [2] proposed an improved Otsu threshold segmentation algorithm to remove marks from the road image, and then applied an adaptive iterative method to the mark-free image to obtain the crack image. Xu et al. [3] improved the selection of filtering parameters in the Canny iteration method, which effectively improved the detection accuracy of bridge cracks. Fernandez et al. [4] used a decision tree heuristic algorithm to classify crack images, but this work remained in a simulation environment and was not integrated into a real system. Shi et al. [5] proposed the CrackForest model, which uses a random forest method to reduce the influence of noise on crack detection accuracy.

This paper proposes a fast pixel-level crack size detection method based on convolutional neural networks. First, the YOLOv3 recognition algorithm quickly locates and identifies the crack. The Mask R-CNN segmentation algorithm then processes the resulting target image a second time to segment the crack and extract its topological characteristics. Finally, the pixel size information of the crack is obtained.

2 Principle of YOLOv3 Algorithm

Image recognition with the YOLOv3 algorithm [6] consists of three main parts: image input, convolutional network processing, and output of the prediction feature layers. First, a 3-channel image of size 416 × 416 is input. The Darknet-53 network structure is used: convolution operations are carried out in the first 52 layers, and a large number of residual layers are used throughout the network. Finally, detection feature layers of size 13 × 13, 26 × 26, and 52 × 52 are obtained by up-sampling and feature fusion and are used for image recognition. Figure 16.1 shows the implementation process of the YOLOv3 algorithm.

Fig. 16.1 YOLOv3 crack identification process
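The three detection grid sizes quoted above follow directly from the input size. The short sketch below reproduces that arithmetic. The strides of 32, 16, and 8 and the three anchor boxes per grid cell are the standard YOLOv3 settings and are assumed here, since the text does not state them. The function names are illustrative only.

```python
# Illustrative arithmetic only, not the detector itself.
INPUT_SIZE = 416        # input width/height in pixels, as stated in the text
STRIDES = (32, 16, 8)   # assumed: standard YOLOv3 downsampling factors
ANCHORS_PER_CELL = 3    # assumed: standard YOLOv3 setting


def detection_grids(input_size=INPUT_SIZE, strides=STRIDES):
    """Return the side length of each detection grid."""
    return [input_size // s for s in strides]


def total_predictions(input_size=INPUT_SIZE, strides=STRIDES,
                      anchors_per_cell=ANCHORS_PER_CELL):
    """Count the candidate boxes predicted over all three scales."""
    return sum(g * g * anchors_per_cell
               for g in detection_grids(input_size, strides))


print(detection_grids())
print(total_predictions())
```

Under these assumptions the grids are 13, 26, and 52 cells wide, which gives 10647 candidate boxes per image before confidence filtering and non-maximum suppression.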

3 Principle of Mask R-CNN Algorithm

Figure 16.2 shows the framework of the crack detection network based on the Mask R-CNN segmentation algorithm [7], which mainly consists of a CNN feature extraction network, a region proposal network (RPN), an RoI Align layer, and a mask branch. First, the image is input into the feature extraction network to obtain the feature map. Fixed ROIs are then set for each pixel position in the feature map, and these ROIs are passed to the RPN for binary classification to obtain candidate boxes for the target object regions. Finally, the candidate regions undergo multi-class classification and candidate box regression, and a fully convolutional network (FCN) generates the mask, which completes the segmentation task.

Fig. 16.2 Mask R-CNN crack segmentation process
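The fixed ROIs placed at each feature-map position correspond to what the Faster R-CNN family usually calls anchor boxes. The sketch below shows one common way to generate them. The stride, scales, and aspect ratios are illustrative assumptions rather than values from this chapter, and `generate_anchors` is a hypothetical helper, not part of any library.

```python
import math


def generate_anchors(feat_h, feat_w, stride, scales, ratios):
    """Return anchor boxes (x1, y1, x2, y2) in input-image pixels,
    len(scales) * len(ratios) boxes for every feature-map cell."""
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Centre of this feature-map cell mapped back to the image.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:  # ratio = height / width
                    # Keep the box area equal to scale ** 2.
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


# Assumed example values: a 2 x 3 feature map with stride 16.
anchors = generate_anchors(feat_h=2, feat_w=3, stride=16,
                           scales=[32], ratios=[0.5, 1.0, 2.0])
print(len(anchors))
```

The RPN would then score each of these boxes as object or background, which is the binary classification step described above.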

Fig. 16.3 Two-stage process of crack identification

4 Two-Stage Decision Method

To address the low detection accuracy of the YOLOv3 algorithm and the slow segmentation speed of the Mask R-CNN algorithm, the author adopts a two-stage decision method [8]. First, the fast YOLOv3 algorithm is applied to the collected sample video to quickly locate the crack frame. Then, the Mask R-CNN algorithm processes the crack frame found in this initial recognition step to obtain the specific topology information of the crack in the target frame. Finally, the pixel size of the crack is extracted. Because the Mask R-CNN algorithm has high recognition accuracy, this secondary processing of crack frames improves the accuracy of crack recognition. Figure 16.3 shows how the two-stage decision method is applied to crack identification.
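The control flow of the two-stage method can be sketched as follows. This is a minimal illustration under several assumptions. The "crack frame" is read as a bounding box within a video frame. The detector and segmenter are passed in as plain callables, with hypothetical stubs standing in for trained YOLOv3 and Mask R-CNN models. The "pixel size" is taken to mean mask area plus vertical and horizontal extent, because the text does not define the measurement.

```python
def crop(image, box):
    """Cut the region (x1, y1, x2, y2) out of a row-major 2-D image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def measure_mask(mask):
    """Pixel-level size of a binary mask (1 = crack, 0 = background)."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return {"area_px": 0, "height_px": 0, "width_px": 0}
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return {
        "area_px": sum(sum(row) for row in mask),
        "height_px": rows[-1] - rows[0] + 1,
        "width_px": cols[-1] - cols[0] + 1,
    }


def two_stage(frame, detect, segment):
    """Stage 1: a fast detector proposes crack boxes.
    Stage 2: the segmenter runs only inside those boxes,
    and each resulting mask is measured in pixels."""
    results = []
    for box in detect(frame):
        mask = segment(crop(frame, box))
        results.append({"box": box, **measure_mask(mask)})
    return results


# Hypothetical stand-ins so the sketch runs without trained models.
def detect_stub(frame):
    return [(1, 0, 4, 4)]  # one box: columns 1..3, rows 0..3


def segment_stub(patch):
    # Treat dark pixels as crack pixels.
    return [[1 if v < 50 else 0 for v in row] for row in patch]


frame = [
    [200, 200, 10, 200, 200],
    [200, 200, 20, 200, 200],
    [200, 200, 30, 15, 200],
    [200, 200, 200, 25, 200],
]
print(two_stage(frame, detect_stub, segment_stub))
```

Because the segmenter only sees the cropped boxes, its cost scales with the number and size of detections rather than with the full frame, which is the motivation for placing the faster detector first.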

5 Methods Application and Results

To verify the feasibility of the fast pixel-level crack size detection method, cracks on several highways were tested. The experimental results are shown in Table 16.1. The average recognition speed of YOLOv3 reaches 45 FPS, which meets the requirements of real-time detection. The method also covers the full workflow from crack recognition to the extraction of crack pixel size information.

Table 16.1 Crack test results
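For reference, an average frame rate of this kind can be computed from per-frame processing times, as in the sketch below. The 30 FPS source rate used as the real-time threshold is an assumption, since the text does not state what frame rate the requirement refers to. Both function names are illustrative.

```python
def average_fps(frame_times_s):
    """Average frames per second from per-frame processing times (seconds)."""
    total = sum(frame_times_s)
    if total <= 0:
        raise ValueError("total processing time must be positive")
    return len(frame_times_s) / total


def is_real_time(fps, source_fps=30.0):
    """True if processing keeps up with the video source (assumed 30 FPS)."""
    return fps >= source_fps


# Time available per frame at the reported 45 FPS, in milliseconds.
budget_ms = 1000.0 / 45
print(round(budget_ms, 1))
print(is_real_time(45.0))
```

At 45 FPS each frame has a budget of about 22 ms, which leaves headroom over a 30 FPS source under the assumption above.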

6 Conclusions

This paper proposes a fast method for detecting and measuring pixel-level crack size based on convolutional neural networks. The experimental results show that the recognition speed of the method is 45 FPS, which meets the requirements of real-time crack recognition. In addition, image segmentation with the Mask R-CNN algorithm allows the topological characteristics of the crack to be extracted and the crack pixel information to be obtained. In future work, the author will study how to obtain the real crack area from the camera calibration ratio and will develop corresponding road crack detection equipment.