1 Introduction

The study of crack image processing algorithms has become a hot research topic worldwide, and a large number of image processing algorithms have been proposed for this purpose. Most of them handle different kinds of pavement distress well, which provides a direction for future research and the corresponding applications [1].

Among the traditional algorithms, Otsu thresholding is applied to segment crack images [3] after image enhancement [2]. To limit the effect of uneven illumination, various local and adaptive thresholding algorithms are also used [4, 5]. For pavement images with low noise, a simple two-dimensional empirical mode decomposition can be set up for crack detection [6], and for complicated images, cracks can be tracked by fusing color and depth information [7]. Region growing and splitting has also been used for this purpose [8]. A review and comparison of different multi-scale algorithms for pavement distress is given in [9].

With the development of computer technology, the traditional algorithms have been updated from time to time. Random Structured Forests were applied to crack tracing [10]. In the same year (2016), comparative supervised classification was used to detect pavement cracks [11]. In 2017, more effective pavement crack detection and classification algorithms were studied [12]: the pavement image is first enhanced, and cracks are then extracted and classified with a heuristic segmentation algorithm. The authors reported a crack detection accuracy of 88% and a classification accuracy of up to 80% in their experiments. Wang et al. used the Fractional differential to extract rock fractures, which are similar to cracks; compared with differential operators of order 1–2, the algorithm achieves clearly better results [13]. Graph theory has also been applied to image processing more and more in recent years [14, 15]. Hoang and Nguyen (2019) proposed a method for asphalt pavement crack classification based on machine learning [16]. For shadowed crack images, some new algorithms were developed in 2020 and 2021 [17, 18].

Many algorithms exist to classify and evaluate cracks, such as the support vector machine (SVM), artificial neural networks and fuzzy comprehensive evaluation, and most of them have been used in various applications. For example, Fuzzy mathematics and Dempster-Shafer theory were applied to crack identification [19]. After more than 30 years of development, 1D, 2D and 3D Fractal theories have been widely used in different areas [20, 21]. Wang et al. (2017) proposed an improved 1D Fractal algorithm to evaluate the grade of rock fractures: it considers the whole rock fracture network, forms a curve, and then computes the dimension and intercept values; the testing result is satisfactory [20]. Xichen et al. (2018) also applied a Fractal algorithm to evaluate the quality of blind images on gray fluctuation [21].

In the last ten years, Deep learning has been applied in many research areas, including pavement crack detection. For example, in 2017, Cha and Choi studied Deep Learning-based crack damage detection using a convolutional neural network [22], and in 2019, they carried out pixel-level crack detection on 3D asphalt pavement images through the Deep Learning-based CrackNet-V [23].

For combining 2D with 3D, a number of researchers have also worked on crack detection. For example, Damjan and Boštjan (2019) improved a decision-making geo-information system for continuous monitoring of deformations on airport infrastructure [24]; Li et al. (2019) automatically segmented and enhanced pavement cracks on 3D pavement images [25]; Cao and Wang (2018) performed 3D image enhancement and 3D object detection [26]; Wang et al. (2020) carried out tunnel centerline detection based on the Fractional differential and 3D invariant moments [27]; and the network can also be used for object detection [28].

Although automatic crack image acquisition technology has matured and been successfully commercialized, automatic pavement damage identification algorithms are still at a relatively active research stage. Because pavement damage is so diverse, object identification systems in real applications still work in a semi-automatic mode that combines recognition with manual labeling; fully automatic recognition algorithms, especially for micro-cracks and material-filled or degraded cracks, still face great challenges. Although the research and development of automatic recognition algorithms have made some progress in the past 30 years, it is still difficult for the existing algorithms to perform well in three aspects: recognition accuracy, versatility and real-time performance.

Nevertheless, crack detection research keeps making progress. Currently, many researchers have studied Deep learning methods in different applications (including crack image recognition) [29], and some researchers have applied different image acquisition sensors to measure crack depth, such as the combination of infrared and visible sensors [30]. These may be two of the main subjects of focus in the near future.

Image recognition algorithms have gradually changed from single-feature recognition to multi-feature recognition. External factors, such as environmental illumination intensity, material type, and the degree of damage and aging, cause targets of the same type to differ in intensity across feature dimensions (gray level, edge strength, shape), which makes it difficult to identify damage targets accurately with a single feature. Multi-feature fusion methods can make full use of the complementarity of multiple features of damaged targets to achieve accurate recognition and pseudo-target filtering during damaged target extraction. These methods have attracted more and more attention from scholars and have achieved good results.

In this study, we try to break through the limitations of the traditional image processing and classification algorithms in crack identification by applying Fracture mechanics to image recognition (crack segment identification). To reach this goal, the image is first shrunk to reduce the computation burden, remove noise and preserve cracks. Then, a new valley edge detection algorithm based on the Fractional differential is used to detect crack segments roughly in the gray level image. Subsequently, a number of post-processing functions remove noise and link segment gaps in the binary image. Finally, a Fracture mechanics rule is used to judge whether the image is a crack image.

2 Edge detection on Fractional differential

In a pavement crack image, the characteristics of the pavement surface are as shown in Fig. 1a, where the gray levels increase from bottom to top, which means that the illumination is globally uneven in the image. In local regions (different parts of the image), however, the areas with low gray levels are cracks, e.g. the 5 marked cracks in Fig. 2, and the non-crack areas have higher gray levels. We draw a profile line AB in (a), which crosses 5 cracks. In the gray level histogram of line AB, the 5 cracks appear as 5 obvious valleys. The question is how to find the cracks without being affected by uneven illumination and noise. To do this, we studied a special edge detection algorithm for crack detection, as follows.

Fig. 1

Crack image with a profile line AB crossing 5 cracks. a Crack image with a profile line AB; b Gray level histogram of line AB

Fig. 2

Histogram of a profile line with 9 different types of valley points

Valley edges can be regarded as narrow edges [14]. To detect cracks clearly and disregard the edges caused by noise, the new valley edge detection algorithm checks each pixel to see whether it is the lowest valley point in a certain direction within a region. If it is, the pixel is regarded as a valley edge candidate point, and both its direction and location can be marked. The key is how to judge correctly whether a valley pixel is a real valley point of the kind we need. In Fig. 2, we assume that there are 9 valley points on a profile line of an image. Point “1” is a valley point whose gray level is much lower than that of its neighboring pixels; the valley is deep, with a sharp angle. Point “2” also lies in a deep valley, but its neighboring pixels have gray levels similar to its own. The valley at point “3” is not very deep, and some neighboring pixels also have similar gray levels. The valleys at points 4 and 5 may be shallow and cannot be fully regarded as valley points; this depends on the size and shape of the detecting template. Points 6–9 are shallow valley points, and whether they count as valley points also depends on the template size and shape. Hence the design of the valley point detection algorithm is important for correctly judging which points are real valley points.

Fig. 3a shows a 9 × 9 template with four square regions of different sizes around the central pixel; the template should be large enough to decide whether the central pixel is a valley point candidate. In our tests, however, using all 81 pixels made the computation too heavy, and the detection was not accurate even though so much information was used. Based on (a), we therefore also tested a 7 × 7 template, shown in Fig. 3b. It uses circular regions, which suit real cases better and need fewer pixels than square regions: three circular regions (3 × 3, 5 × 5 and 7 × 7 kernels) surround the central pixel. Since a valley point has its own orientation, the detection should be performed separately in four directions, which are marked in (b). As one example, Fig. 3c marks two trapezoidal regions based on (b), which can be used for valley point detection in the vertical direction (AB in (b)). Three lines, “1”, “2” and “3”, are marked both in the top trapezoidal region (red) and in the bottom region (blue). If the detected pixel “0” is a valley point, its gray level should be lower than that of line “1”, the gray level of line “1” should be lower than that of line “2”, and the gray level of line “2” should be lower than that of line “3”. The question is how to calculate the weighted average value for each line; an example is given in the following.

Fig. 3

Valley point detection regions and directions: a 9 × 9 template with 4 regions; b 4 directions and 4 circle regions; and c Detection region in AB direction

Assume that there is a valley point p in the vertical direction in Fig. 3c. We have three detection lines, ab, cd and ef in Fig. 4, corresponding to line “1”, line “2” and line “3” in Fig. 3c. Referring to the top trapezoidal region (Fig. 3c), in Fig. 4 we have the orthogonal lines ap, cp and ep, which must meet the angle condition ap < cp < ep; otherwise p is not a valley point. This alone is not enough: p also cannot be accepted as a valley point when the angle condition bp < dp < fp is not met. To decide whether the point p is a valley point, we proceed as follows.

Fig. 4

Diagram for Valley-edge detection algorithm

The gray level of each line in Fig. 3c or Fig. 4 should be a weighted average in which the central pixel of a kernel has the largest weight and the remaining pixels have smaller weights. Since many studies have reported that the Fractional differential is good for smoothing, especially for images with thin edges, we calculate the coefficients based on the Fractional differential. For this, we use the Grünwald-Letnikov definition [12, 26].

For \(\forall v \in R\) with integer part \([v]\), suppose that the signal \(s(t)\), defined on \([a,t]\) (\(a < t\), \(a \in R\), \(t \in R\)), has continuous derivatives up to order \(m + 1\), where \(m \in Z\) and Z is the set of integers. If \(v > 0\) and m is equal to \([v]\), then the derivative of order v can be written as:

$$ {}_{a}D_{t}^{v} s(t) = \mathop {\lim }\limits_{h \to 0} s_{h}^{v} (t) = \mathop {\mathop {\lim }\limits_{h \to 0} }\limits_{nh \to t - a} h^{ - v} \sum\limits_{r = 0}^{n} {C_{r}^{ - v} } s(t - rh) $$
(1)

where \(C_{r}^{ - v} = ( - v)( - v + 1) \cdots ( - v + r - 1)/r!\)

If the duration of \(s(t)\) is \(t \in [a,t]\), the signal duration \([a,t]\) can be divided equally with the unit interval \(h = 1\):

$$ n = \left[ {\frac{t - a}{h}} \right]\mathop = \limits^{h = 1} [t - a] $$
(2)

In this way, the fractional differential of order v of a one-dimensional signal \(s(t)\) can be derived as:

$$ \begin{gathered} \frac{{d^{v} s(t)}}{{dt^{v} }} \approx s(t) + ( - v)s(t - 1) + \frac{( - v)( - v + 1)}{2}s(t - 2) \hfill \\ + \frac{( - v)( - v + 1)( - v + 2)}{6}s(t - 3) + \cdots + \frac{\Gamma (n - v)}{{n!\,\Gamma ( - v)}}s(t - n) \hfill \\ = a_{0} s(t) + a_{1} s(t - 1) + a_{2} s(t - 2) + a_{3} s(t - 3) + \cdots + a_{n} s(t - n) \hfill \\ \end{gathered} $$
(3)

These n + 1 non-zero coefficient values are in order as:

$$ \left\{ {\begin{array}{*{20}l} {a_{0} = 1} \hfill \\ {a_{1} = - v} \hfill \\ {a_{2} = ( - v)( - v + 1)/2 = (v^{2} - v)/2} \hfill \\ {a_{3} = ( - v)( - v + 1)( - v + 2)/6 = ( - v^{3} + 3v^{2} - 2v)/6} \hfill \\ {a_{4} = ( - v)( - v + 1)( - v + 2)( - v + 3)/24 = (v^{4} - 6v^{3} + 11v^{2} - 6v)/24} \hfill \\ { \cdots } \hfill \\ {a_{n} = \Gamma (n - v)/\left( {n!\,\Gamma ( - v)} \right)} \hfill \\ \end{array} } \right. $$
(4)

We take the absolute values: \(a_{0} = 1\), \(a_{1} = \left| { - v} \right|\), \(a_{2} = \left| {\left( {v^{2} - v} \right)/2} \right|\). When \(v = 0.5\), we have \(a_{1} = 0.5\) and \(a_{2} = 0.125\). To remove the decimals, for line “1” we multiply all values by 2, which gives \(b_{0} = 2a_{0} = 2\) and \(b_{1} = 2a_{1} = 1\); for line “2” we multiply by 8, which gives \(c_{0} = 8a_{0} = 8\), \(c_{1} = 8a_{1} = 4\) and \(c_{2} = 8a_{2} = 1\).
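As a quick check of the integer weights above, the following minimal sketch computes the coefficients of Eq. (4) with exact fractions. The function name `gl_coefficients` and the use of the recurrence \(a_{r} = a_{r - 1} (r - 1 - v)/r\), which follows from the product form of \(C_{r}^{ - v}\), are choices made for this illustration and not part of the original method description.

```python
from fractions import Fraction

def gl_coefficients(v, n):
    """Return a_0..a_n of Eq. (4) using a_r = a_{r-1} * (r - 1 - v) / r."""
    coeffs = [Fraction(1)]
    for r in range(1, n + 1):
        coeffs.append(coeffs[-1] * (Fraction(r - 1) - v) / r)
    return coeffs

v = Fraction(1, 2)
a = [abs(c) for c in gl_coefficients(v, 2)]  # absolute values, as in the text
b = [2 * c for c in a[:2]]                   # line "1": multiply by 2
c = [8 * c for c in a]                       # line "2": multiply by 8
print(a, b, c)
```

Exact fractions are used so that the scaled weights come out as integers without rounding.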

Fig. 5 illustrates the templates for the four directions. We denote the detecting point (central pixel) as \(x_{0}\), line “1” as \(x_{1}\) and line “2” as \(x_{2}\). In the vertical direction (Fig. 5b), the top part (the red trapezoidal region in Fig. 3c) is taken as the example for valley point detection. In the following, \(f\left( {i,j} \right)\) is the gray scale input image and \(g\left( {i,j} \right)\) is the binary output image.

$$ x_{0} = f\left( {i,j} \right) $$
$$ x_{1} = \frac{{b_{0} f\left( {i - 1,j} \right) + b_{1} \left[ {f\left( {i - 1,j - 1} \right) + f\left( {i - 1,j + 1} \right)} \right]}}{{b_{0} + 2b_{1} }} $$
(5)
$$ x_{2} = \frac{{c_{0} f\left( {i - 2,j} \right) + c_{1} \left[ {f\left( {i - 2,j - 1} \right) + f\left( {i - 2,j + 1} \right)} \right] + c_{2} \left[ {f\left( {i - 2,j - 2} \right) + f\left( {i - 2,j + 2} \right)} \right]}}{{c_{0} + 2c_{1} + 2c_{2} }} $$
(6)
$$ y = \left( {x_{2} + x_{1} } \right)/2 - x_{0} $$
(7)
Fig. 5

Template for lines marking in the four directions

In the vertical direction, we have two values, one from the top region and one from the bottom region, which we call \(y_{ + 90}\) and \(y_{ - 90}\). If \(y_{ + 90} > 0\) and \(y_{ - 90} > 0\), we set \(y_{90} = y_{ + 90} + y_{ - 90}\). In the same way, we calculate the y values of the other three directions. Then, we calculate:

$$ z = \max \left( {y_{0} ,y_{45} ,y_{90} ,y_{135} } \right) $$
(8)

If we output a gradient magnitude image, we do:

$$ {\text{If}}\,\,z > 0,\;g(i,j) = z,\;{\text{otherwise }}g(i,j) = 0 $$
(9)

To output a binary image directly, we set a threshold T and do:

$$ {\text{If}}\,\,z > T,\;g(i,j) = 255,\;{\text{otherwise }}g(i,j) = 0 $$
(10)
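The detection rule of Eqs. (5)–(10) can be sketched with NumPy as below. This is an illustration under stated assumptions rather than the authors' implementation: only the vertical template is coded explicitly, the horizontal response is obtained by applying the same template to the transposed image, the two diagonal directions of Fig. 5 are omitted because their templates are not reproduced in the text, the centre term of line “2” is read from row \(i - 2\), a direction contributes 0 when either of its two sides is not positive, and border pixels are handled by edge replication. The function names are hypothetical.

```python
import numpy as np

B = (2.0, 1.0)       # b0, b1 for line "1" (v = 0.5)
C = (8.0, 4.0, 1.0)  # c0, c1, c2 for line "2"

def _half_response(f, sign):
    """y of Eq. (7) on one side of each pixel; sign=-1 is the top region, +1 the bottom."""
    h, w = f.shape
    p = np.pad(f.astype(np.float64), 2, mode="edge")

    def s(di, dj):  # array of f(i + di, j + dj) for all (i, j)
        return p[2 + di:2 + di + h, 2 + dj:2 + dj + w]

    x0 = s(0, 0)
    x1 = (B[0] * s(sign, 0) + B[1] * (s(sign, -1) + s(sign, 1))) / (B[0] + 2 * B[1])
    x2 = (C[0] * s(2 * sign, 0)
          + C[1] * (s(2 * sign, -1) + s(2 * sign, 1))
          + C[2] * (s(2 * sign, -2) + s(2 * sign, 2))) / (C[0] + 2 * C[1] + 2 * C[2])
    return (x2 + x1) / 2.0 - x0

def directional_response(f):
    """Sum of the two side responses where both are positive, otherwise 0."""
    top, bottom = _half_response(f, -1), _half_response(f, +1)
    return np.where((top > 0) & (bottom > 0), top + bottom, 0.0)

def valley_edges(f, T):
    """Binary output of Eq. (10) using the vertical and horizontal directions only."""
    y90 = directional_response(f)      # neighbours above and below
    y0 = directional_response(f.T).T   # same template along the rows
    z = np.maximum(y0, y90)            # Eq. (8), restricted to two directions
    return np.where(z > T, 255, 0).astype(np.uint8)

# Synthetic example: a dark horizontal line on a bright background.
f = np.full((11, 11), 100, dtype=np.uint8)
f[5, :] = 20
g = valley_edges(f, T=50)
```

In this synthetic case only the pixels of the dark row exceed the threshold, because the pixels next to it have a negative response on the side facing the line.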

An original pavement crack image normally includes much noise, which can strongly affect the valley edge detection result. One simple way to reduce the noise is to use a smoothing filter such as the Gaussian smoothing function, which has a width parameter sigma, often referred to as the scale space parameter. The choice of sigma depends on the white spot size distribution.

The valley edge detection algorithm studied in this paper uses only a 5 × 5 template, which in our tests suits cracks of about 8 pixels in width. For other types of cracks, the template size can be changed; for instance, if the cracks are 1–3 pixels wide, the template size can be 3 × 3. The algorithm assumes that the gray levels of cracks are lower than those of the pavement. If the crack gray levels are higher than those of non-crack areas, the image should be inverted first, and if an image includes both dark and light cracks, the crack detection may be carried out twice, once on the original image and once on the inverted image. The new algorithm is a kind of edge detector: it can output a binary image or a gradient magnitude image, chosen according to the application needs. The weighted average gray levels of the lines can be defined in different ways, depending on the type of cracks. For example, for cracks on a metal surface, because the surface is uniform, line “1” can be represented by the gray level of the central pixel alone, and for line “2” the gray levels of three pixels are used to calculate the weighted gray level. The threshold T is related to the gray level variation, the gray level difference between cracks and background (non-cracks), the crack width, etc., and it can be determined in different ways. One way is based on experience: in our case, the images are taken in the same environment, by the same equipment and with the same pre-processing, so it is simple to set the threshold T by experience. Other ways might be based on image quality evaluation algorithms according to the crack image types or characteristics.

3 Segment detection on Fracture mechanics

The first step in an application is to identify, quickly or online, where the cracks are in the pavement. The images are acquired by a vehicle equipped with a line scanning system or a matrix camera system, which acquires images continually while the vehicle runs at a velocity of up to 50 km/h. The image resolution is 4096 × 2048 pixels and the normal width of a road lane is 3700 mm; therefore, each pixel represents about 0.9 mm. The image sequence can cover a long stretch of road, and each image contains the information for 1843.2 mm of road length. When the position of the first crack image is known, the position of each subsequent crack image can be easily calculated. Since the image is too large for quickly identifying whether it has cracks, it should be shrunk to the minimum size that preserves crack information and removes a part of the image noise. To do this, we designed an image shrinking procedure as follows.
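The scale figures quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes that consecutive images neither overlap nor leave gaps along the road, which the text does not state; the frame rate it derives and the helper `frame_start_mm` are therefore illustrative only.

```python
ROAD_WIDTH_MM = 3700.0   # lane width covered by the 4096-pixel side
PIXELS_ACROSS = 4096
PIXELS_ALONG = 2048
SPEED_KMH = 50.0

mm_per_pixel = ROAD_WIDTH_MM / PIXELS_ACROSS          # about 0.903 mm
frame_length_mm = PIXELS_ALONG * 0.9                  # 1843.2 mm with the rounded 0.9 mm
speed_mm_per_s = SPEED_KMH * 1e6 / 3600.0
frames_per_second = speed_mm_per_s / frame_length_mm  # rate needed for gap-free coverage

def frame_start_mm(first_start_mm, index):
    """Road position of the start of frame `index`, assuming back-to-back frames."""
    return first_start_mm + index * frame_length_mm

print(round(mm_per_pixel, 3), frame_length_mm, round(frames_per_second, 2))
```

Under these assumptions the camera has to deliver between 7 and 8 full-resolution frames per second at 50 km/h, which is why a fast pre-screening step is needed.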

3.1 Image shrinking

Multi-scale representations are more or less related to scale-space theory, notably the theories of pyramids, Wavelets and multi-grid methods. This paper does not discuss the theory; details are given in [31]. For complicated pavement crack images, the studied algorithm proved very useful in our tests.

The image scale can be reduced as follows: let \(x = 1,2,3,...,n\), \(y = 1,2,3,...,m\), and let \(f\left( {x,y} \right)\) be the original image. Then the shrunk image is:

$$ f\left( {x_{k} ,y_{k} } \right) $$
(11)

where \(x_{k} = 1,...,n/2^{k}\), \(y_{k} = 1,...,m/2^{k}\), \(k = 1,2,3, \ldots\), \(k \le K\), \(m \ge 2^{K}\), \(n \ge 2^{K}\).

To obtain a useful scaled \(f\left( {x_{k} ,y_{k} } \right)\) simply and conveniently, we disregard complicated algorithms such as a Gaussian filter with a 5 × 5 template. Instead, we use only the gray levels of four neighboring pixels in the original image to calculate the gray level of the corresponding pixel in the shrunk image. The following lists several image shrinking algorithms for comparison, i.e., the Maximum, Top-left, Average, Median and Minimum filters.

The simplest image reduction chooses the gray level of one of the four pixels, in a certain order or by a certain rule, as the gray level of the corresponding pixel in the shrunk image. With Top-left, for example, we scan the original image from top-left to bottom-right and take the gray level of the top-left pixel of each group of four as the gray level of the corresponding pixel in the shrunk image. In Fig. 6, for example, we have a = A, b = E, c = I, d = M. One can also choose Top-right, Bottom-left or Bottom-right in a similar way. For the remaining filters, referring to Fig. 6, we use the following equations, which are convenient for programming.

Fig. 6

Image shrink diagram

For Average filter:

$$ \left\{ {\begin{array}{*{20}c} {a_{av} = \left( {A + B + C + D} \right),a = a_{av} /4} \\ {b_{av} = \left( {E + F + G + H} \right),b = b_{av} /4} \\ \end{array} } \right.. $$
(12)

For Maximum filter:

$$ \left\{ {\begin{array}{*{20}c} {a_{\max } = \max \left( {A,B,C,D} \right),a = a_{\max } } \\ {b_{\max } = \max \left( {E,F,G,H} \right),b = b_{\max } } \\ \end{array} } \right., $$
(13)

For Minimum filter:

$$ \left\{ {\begin{array}{*{20}c} {a_{\min } = \min \left( {A,B,C,D} \right),a = a_{\min } } \\ {b_{\min } = \min \left( {E,F,G,H} \right),b = b_{\min } } \\ \end{array} } \right., $$
(14)

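For Median filter (the mean of the two middle values, with \(a_{av}\) and \(b_{av}\) the four-pixel sums defined in Eq. (12)):

$$ \left\{ {\begin{array}{*{20}c} {a = \left( {a_{av} - a_{\max } - a_{\min } } \right)/2} \\ {b = \left( {b_{av} - b_{\max } - b_{\min } } \right)/2} \\ \end{array} } \right., $$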
(15)
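The five shrinking filters can be written compactly by grouping the image into 2 × 2 blocks. The sketch below is one possible NumPy formulation, not the authors' code: the function names are hypothetical, odd trailing rows or columns are simply dropped, results are returned as floating-point arrays, and the median of four values is taken as the mean of the two middle values.

```python
import numpy as np

def shrink_once(img, mode="min"):
    """Halve both image dimensions using one of the 2 x 2 block filters."""
    h2, w2 = img.shape[0] // 2, img.shape[1] // 2
    # blocks[i, j] holds the four pixels mapped to output pixel (i, j),
    # ordered top-left, top-right, bottom-left, bottom-right.
    blocks = (img[:2 * h2, :2 * w2].astype(np.float64)
              .reshape(h2, 2, w2, 2).transpose(0, 2, 1, 3).reshape(h2, w2, 4))
    if mode == "top_left":
        return blocks[..., 0]
    if mode == "max":
        return blocks.max(axis=-1)
    if mode == "min":
        return blocks.min(axis=-1)
    if mode == "average":
        return blocks.sum(axis=-1) / 4.0
    if mode == "median":
        return (blocks.sum(axis=-1) - blocks.max(axis=-1) - blocks.min(axis=-1)) / 2.0
    raise ValueError(f"unknown mode: {mode}")

def shrink(img, k, mode="min"):
    """Apply the chosen filter k times, giving an image 2**k times smaller per side."""
    for _ in range(k):
        img = shrink_once(img, mode)
    return img

img = np.array([[10, 200, 50, 60],
                [30, 40, 70, 80],
                [5, 5, 5, 5],
                [5, 5, 5, 9]], dtype=np.uint8)
print(shrink_once(img, "min"))
print(shrink_once(img, "median"))
```

In the first block (10, 200, 30, 40) the minimum filter keeps the darkest value 10, while the maximum filter would keep 200, which illustrates why dark cracks survive the minimum filter and vanish under the maximum filter.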

Figure 7 is one example showing the differences among the crack image shrinking algorithms (see also Table 1). The original image is in Fig. 2a, and there are 10 cracks in the image, as marked in Fig. 7e. We applied each of the above shrinking filters to the original image twice. With the Maximum filter, the cracks in the shrunk image almost disappear (a). With the simplest Top-left filter, the cracks are lost or blurred (b). With the Average filter, cracks 1, 7 and 10 are very vague (c). The Median filter result (d) is better than those in (a–c), but the cracks are not sharp. The Minimum filter preserves all the cracks, as shown in (e). More detailed comparison results are listed in Table 1.

Fig. 7

Shrinking image two times on image in Fig. 2a: a Maximum; b Top-left; c Average; d Median; and e Minimum filter

Table 1 Crack quality after image shrinking in Fig. 7

Many image shrinking algorithms have been proposed in previous studies, each designed for its own application requirements. In our case, the objects are cracks on a rough pavement surface, the gray levels of cracks are lower than those of their local non-crack regions, and fast processing is needed to judge whether the image has cracks. A large template, e.g., 5 × 5 or 7 × 7, is therefore not suitable, and we use only simple filters for image shrinking; the result is that the Minimum filter is the best for our case. After image shrinking, we identify crack segments roughly with the valley edge detection algorithm described in the above Section. However, that algorithm detects only valley edge points, not cracks or even crack segments, so crack segments have to be identified by post-processing functions after the valley edge detection. We then use Fracture mechanics to recognize whether a segment can represent a part of a crack, which is very important for quickly identifying which images have cracks among a huge number of continuous standard road pavement images (e.g., thousands or tens of thousands). If this works, the crack locations can be calibrated, and the cracks can easily be processed and classified further.

3.2 Crack identification by Fracture mechanics

Fracture mechanics is the study of crack propagation in materials. It uses methods of analytical solid mechanics to calculate the driving force on a crack and those of experimental solid mechanics to characterize the material’s resistance to cracking [32]. It is a science that studies the strength and crack propagation of cracked bodies, and is also called Crack mechanics. Its task is to obtain the crack toughness of all kinds of materials, to determine whether an object breaks under a given external force, that is, to establish a crack criterion, and to study the law of crack propagation during the loading process.

For a solid planar material object, if the stresses at each point are \(\sigma_{x}\), \(\sigma_{y}\), \(\tau_{xy}\), the following equilibrium equations are obtained:

$$ \frac{{\partial \sigma_{x} }}{\partial x} + \frac{{\partial \tau_{xy} }}{\partial y} = 0;\quad \frac{{\partial \sigma_{y} }}{\partial y} + \frac{{\partial \tau_{xy} }}{\partial x} = 0 $$
(16)

If the displacements in the horizontal and vertical directions are u and v, and the corresponding strains are \(\varepsilon_{x}\), \(\varepsilon_{y}\), \(\gamma_{xy}\), the following equations hold:

$$ \varepsilon_{x} = \frac{\partial u}{{\partial x}};\quad \varepsilon_{y} = \frac{\partial v}{{\partial y}};\quad \gamma_{xy} = \frac{\partial u}{{\partial y}} + \frac{\partial v}{{\partial x}} $$
(17)

With \(\mu\) the shear modulus, E Young’s modulus and \(\nu\) Poisson’s ratio, we get:

$$ E\varepsilon_{x} = \sigma_{x} - \nu \sigma_{y} ;\quad E\varepsilon_{y} = \sigma_{y} - \nu \sigma_{x} ;\quad \mu \gamma_{xy} = \tau_{xy} $$
(18)

where \(\mu = E/2\left( {1 + \nu } \right)\).

If a function \(\varphi\) is chosen so that the following equations hold:

$$ \sigma_{x} = \frac{{\partial^{2} \varphi }}{{\partial y^{2} }};\quad \sigma_{y} = \frac{{\partial^{2} \varphi }}{{\partial x^{2} }};\quad \tau_{xy} = - \frac{{\partial^{2} \varphi }}{\partial x\partial y} $$
(19)

Then the above equilibrium Eq. (16) is automatically satisfied.

By substituting Eq. (17) and Eq. (19) into Eq. (18) and differentiating them twice, the compatibility equation can be derived:

$$ \frac{{\partial^{4} \varphi }}{{\partial x^{4} }} + 2\frac{{\partial^{4} \varphi }}{{\partial x^{2} \partial y^{2} }} + \frac{{\partial^{4} \varphi }}{{\partial y^{4} }} = 0;\quad \nabla^{2} (\nabla^{2} \varphi ) = 0 $$
(20)

The linear elastic plane problem can be solved by finding a function \(\varphi\) that satisfies the above Eq. (20).

On the basis of the above equations, by derivation with complex functions and under some assumed conditions, for a crack of length 2a subjected to a tensile stress \(\sigma\) applied at a far distance, the stress intensity T at the two tips of the crack increases with the crack length 2a and with \(\sigma\). The simplest case is:

$$ T = c\sigma \sqrt a $$
(21)

For complex crack images, Fracture mechanics can be applied to crack segment identification. For a crack image, the crack segments can be found by the valley edge detection algorithm and the post-processing functions, the short edge gaps should be roughly linked based on some rules, and the crack segment length 2a can be determined from the gradient magnitude and gray level information in the crack and the bending degree of the crack. As for the other parameters, c can be determined simply from the percentage of edge pixels in the image (edge density) β, and \(\sigma\) is related to the gradient magnitude and gray level values in the crack area: the darker the crack (the lower its gray level) and the greater its gradient magnitude, the higher \(\sigma\) is. The value of T can be used to determine whether a crack is a pseudo-crack and whether the image has cracks or not.

In our case, for one group of original pavement crack images, the image acquisition equipment and environment do not change, so we do not consider lighting and other hardware effects. The original image resolution is 2048 × 4096 pixels (1840 × 3700 mm). When the image is shrunk k times (k = 0, 1, 2, 3, 4, 5, …), we set \(b = 2^{k}\). For a crack with average gray level \(G_{av}\) and average gradient magnitude \(M_{av}\), we can set:

$$ \sigma = M_{av} /\left( {G_{av} + {1}} \right) $$
(22)

where, \({0} \le M_{av} \le 255\), \({0} \le G_{av} \le 255\).

Hence, for a crack of 2a length (Euclidean distance), we re-write Eq. (21) as:

$$ T = c\sigma \sqrt a = \left( {{100}\beta } \right)\left( {M_{av} /\left( {G_{av} + 1} \right)} \right)\sqrt {ba} $$
(23)

where, since we shrank each original image 4 times, the image width is 256 pixels, and since the shortest crack has a Euclidean distance of 10 pixels, we set \(5 \le a \le 128\); the ranges of the other variables are: \(0 \le \left( {100\beta } \right) \le 100\); \(0 \le \left( {M_{av} /\left( {G_{av} + 1} \right)} \right) \le 255\); \(9 \le \sqrt {ba} \le 45\); \(T \ge 0\).

In Eq. (23), if \(M_{av} = 0\), then \(T = 0\). This means that if the gradient magnitude on the crack is zero or very low, the crack is a spurious crack, which may be caused by wrong edge tracing or by connected noise points. T does not change as k changes because of the factor \(\sqrt {ba}\). The larger \(\left( {100\beta } \right)\) is, the more cracks or crack pixels are in the image. In most cases, \(\left( {M_{av} /\left( {G_{av} + 1} \right)} \right)\) dominates the value of T.
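Eq. (23) and the scale behaviour just described can be checked numerically. In the minimal sketch below, the function name `crack_score` and all input values are hypothetical, and the edge density and the average gray level and gradient magnitude are assumed to stay the same when the image is shrunk one more time, which holds only approximately in practice.

```python
import math

def crack_score(beta, m_av, g_av, a, k):
    """T of Eq. (23); `a` is the half-length measured in the image shrunk k times."""
    sigma = m_av / (g_av + 1.0)  # Eq. (22)
    return (100.0 * beta) * sigma * math.sqrt((2 ** k) * a)

# Hypothetical segment: 2a = 320 pixels at k = 3, hence 2a = 160 pixels at k = 4.
t_k3 = crack_score(beta=0.004, m_av=80.0, g_av=63.0, a=160.0, k=3)
t_k4 = crack_score(beta=0.004, m_av=80.0, g_av=63.0, a=80.0, k=4)
t_flat = crack_score(beta=0.004, m_av=0.0, g_av=63.0, a=160.0, k=3)
print(t_k3, t_k4, t_flat)
```

Halving the measured length while doubling \(b = 2^{k}\) leaves the product ba unchanged, so both calls return the same T, and a segment with zero gradient magnitude scores 0.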

From our experience of testing hundreds of road pavement crack images, for an image to be judged as having cracks, an object should contain more than 44 pixels and its segment should have a Euclidean distance of at least 10 pixels. In this way, the crack segment can be identified and the image can be judged to be a crack pavement image or not. Before this stage, in order to trace the crack segment, the gaps between valley edge pixels need to be linked. This task requires extracting information about the attributes of endpoints, in particular their orientation and neighborhood relationships. As usual, after image shrinking and valley edge detection, the valley edges are thinned to a width of one pixel, but some gaps in the valley edges remain and noise still exists in the image. To link the gaps, the new algorithm first detects the significant endpoints of curves (or lines). Then, it estimates the direction of each endpoint based on the local directions of valley edge pixels. Finally, it traces the segments according to the direction information of each newly detected pixel (new endpoint) and an intensity cost function. The valley edge tracing starts from a detected endpoint and looks for the neighbor with the lowest gray level; when a new pixel is found to be a valley edge point, it is connected to the detecting point and becomes the new endpoint. The tracing continues until the segment is fully traced before it starts from another detected endpoint. When no detected endpoint is left for tracing, the valley edge tracing procedure stops.
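The first step of the tracing procedure, endpoint detection on the thinned valley edges, can be sketched as follows. The criterion used here, a skeleton pixel with exactly one 8-connected neighbour, is a common convention assumed for this illustration; the text does not specify how endpoints are detected. The helper `passes_size_rule` applies the two size conditions quoted above to an image that contains a single open curve, and both function names are hypothetical.

```python
import math
import numpy as np

def find_endpoints(skel):
    """Return (row, col) of skeleton pixels that have exactly one 8-connected neighbour."""
    s = (np.asarray(skel) > 0).astype(np.int32)
    h, w = s.shape
    p = np.pad(s, 1)  # zero border so that image-edge pixels also have 8 neighbours
    neighbours = np.zeros_like(s)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) != (0, 0):
                neighbours += p[1 + di:1 + di + h, 1 + dj:1 + dj + w]
    rows, cols = np.nonzero((s == 1) & (neighbours == 1))
    return [(int(r), int(c)) for r, c in zip(rows, cols)]

def passes_size_rule(skel, min_pixels=44, min_distance=10.0):
    """More than 44 pixels and an endpoint-to-endpoint distance of at least 10 pixels."""
    ends = find_endpoints(skel)
    if len(ends) != 2:
        return False
    (r0, c0), (r1, c1) = ends
    n_pixels = int(np.count_nonzero(skel))
    return n_pixels > min_pixels and math.hypot(r1 - r0, c1 - c0) >= min_distance

long_curve = np.zeros((5, 60), dtype=np.uint8)
long_curve[2, 5:55] = 1    # 50-pixel straight segment
short_curve = np.zeros((5, 7), dtype=np.uint8)
short_curve[2, 1:6] = 1    # 5-pixel straight segment
print(find_endpoints(short_curve), passes_size_rule(long_curve), passes_size_rule(short_curve))
```

Direction estimation and the intensity cost function used during tracing are not covered by this sketch.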

4 Experiments

Using the above valley edge detection, image shrinking, post-processing and crack identification based on Fracture mechanics, we tested more than 400 pavement images with a resolution of 2048 × 4096 pixels. The accuracy of the testing results is 97%. In the following, we present two of the images, including the valley edge detection images, as shown in Figs. 8 and 9.

Fig. 8

Crack detection for an image with a single crack: the images in the first row are the Sobel and Canny results, respectively, and the images in the second row are the results of the three steps of the new algorithm

Fig. 9

Processing procedure for a complicated multiple crack image

One example is shown in Fig. 8, illustrating the tracing procedure. The original image in (a) includes one long crack; the image is shrunk three times. Because the image quality is poor, the widely used Sobel and Canny edge detectors cannot produce satisfactory results: the Sobel result is too rough for crack extraction, and the Canny operator produces a large amount of noise.

When the new algorithm is applied to the image, the crack is detected well, but the result in (d) still includes some noise. In Fig. 8e, junction pixels and objects smaller than 3 pixels are removed and the short gaps are filled; after opening and labeling the objects by color, the central red curve is the largest object with 81 pixels, the left blue curve has 47 pixels, and the remaining three large objects have 38, 34 and 33 pixels, respectively. After gradually linking gaps shorter than 5, 9 and 15 pixels, and removing objects smaller than 5, 9, 15, 30 and 44 pixels step by step, the largest object is the blue curve with 418 pixels, and the second largest object, the green curve, has 93 pixels. For the largest curve in Fig. 8f, the Euclidean distance of the blue curve is 310 pixels (2a), the gradient magnitude value (Sobel edges) is 75 (\(M_{av}\)), the gray level is 72 (\(G_{av}\)), the edge density β is 0.0042, and the original image is shrunk k = 3 times, so we have:

$$ \begin{aligned} T_{418} &= \left( 100\beta \right)\left( \frac{M_{av}}{G_{av} + 1} \right)\sqrt{2^{k} a} \\ &= \left( 0.42 \right)\left( \frac{79}{62} \right)\sqrt{8 \times 155} = 18.9 \end{aligned} $$
(24)

For the second largest curve in Fig. 8f, the length of the green curve is 70 pixels, the gradient magnitude value is 70, and the gray level value is 75; T is:

$$ \begin{aligned} T_{93} &= \left( 100\beta \right)\left( \frac{M_{av}}{G_{av} + 1} \right)\sqrt{2^{k} a} \\ &= \left( 0.42 \right)\left( \frac{78}{60} \right)\sqrt{8 \times 35} = 9.1 \end{aligned} $$
(25)

These results confirm that the image has cracks, because two segments have T > 9 and one of them has T > 18. We therefore set the threshold of T to t = 9: if \(T \ge t\), the image is judged to have cracks.
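
The decision rule can be checked numerically with a short Python sketch. It evaluates the expression of Eqs. (24) and (25) with the values as substituted there, so \(G_{av} + 1\) is passed as 62 and 60; the function and argument names are hypothetical and serve only as an illustration.

```python
import math


def crack_indicator(beta, m_av, g_av, k, a):
    """T = (100 * beta) * (M_av / (G_av + 1)) * sqrt(2**k * a), as in Eqs. (24)-(25)."""
    return (100.0 * beta) * (m_av / (g_av + 1.0)) * math.sqrt((2 ** k) * a)


def has_crack(t_values, t=9.0):
    """An image is judged to have cracks if any segment reaches T >= t."""
    return any(value >= t for value in t_values)


# Values taken from the substitution steps of Eqs. (24) and (25):
# 100 * beta = 0.42, ratios 79/62 and 78/60, k = 3, a = 155 and a = 35.
t_418 = crack_indicator(beta=0.0042, m_av=79, g_av=61, k=3, a=155)
t_93 = crack_indicator(beta=0.0042, m_av=78, g_av=59, k=3, a=35)
print(t_418, t_93, has_crack([t_418, t_93]))
```

Both segments give values above the threshold t = 9, so the image is classified as a crack image under this rule.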

The algorithm also identifies cracks well when an image contains multiple cracks. In Fig. 9, the original image is in (a). The Sobel operator yields a gradient magnitude image, for which a complex tracking procedure is needed, as shown in (b). The Canny edge detector produces double edges and a large amount of noise, as shown in (c), which makes further crack extraction difficult. The valley edge detector extracts the main crack segments (see (d)), the post functions remove noise and fill gaps step by step (e), and the labeled final result in (f) is satisfactory.
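
As an illustration of the noise removal part of the post functions, the following Python sketch labels 8-connected objects in a binary image and removes those below a size threshold. It is a simplified sketch with hypothetical function names; the gap linking performed between the removal steps is not included.

```python
from collections import deque


def label_objects(binary):
    """Return the 8-connected objects, each as a list of (row, col) pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        if (0 <= yy < rows and 0 <= xx < cols
                                and binary[yy][xx] and not seen[yy][xx]):
                            seen[yy][xx] = True
                            queue.append((yy, xx))
            objects.append(pixels)
    return objects


def remove_small_objects(binary, min_pixels):
    """Clear, in place, every object with fewer than `min_pixels` pixels."""
    for pixels in label_objects(binary):
        if len(pixels) < min_pixels:
            for y, x in pixels:
                binary[y][x] = 0
    return binary
```

Calling `remove_small_objects` repeatedly with the increasing size thresholds used for Fig. 8 (5, 9, 15, 30 and 44 pixels) only differs from a single call with the largest threshold if gaps are linked between the calls, which is why the removal is applied step by step.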

5 Conclusion

In this study, a valley edge detection algorithm is studied for pavement crack images. For rough surfaces with wide cracks, an image shrinking algorithm with multi-scale analysis is applied before the valley edge detection in order to reduce the computational burden, remove noise and preserve cracks. The studied valley edge detection based on Fractional differential performs much better than traditional algorithms such as Sobel and Canny. Based on the valley edge detection result, a number of post functions, such as object thinning, junction detection, endpoint detection and object labeling in a binary image, are used for removing noise and filling segment gaps. After this procedure, the cracks in the images can be extracted well. The aim of our application is to quickly check which images have cracks among a huge number of road pavement images; hence we apply Fracture (Crack) mechanics to the information in the binary images to output the identification result. The algorithm is also applicable to the detection of other linear objects. The next step is to extend the Fracture mechanics approach to crack tracing and classification.