Efficient evaluation of the Number of False Alarm criterion
Abstract
This paper proposes a method for efficiently computing the significance of a parametric pattern inside a binary image. On the one hand, a-contrario strategies avoid user involvement in tuning detection thresholds and allow one to account fairly for different pattern sizes. On the other hand, a-contrario criteria become intractable when the pattern complexity in terms of parametrization increases. In this work, we introduce a strategy which relies on the use of a cumulative space of reduced dimensionality, derived from the coupling of a classic (Hough) cumulative space with an integral histogram trick. This space allows us to store the partial computations required by the a-contrario criterion and to evaluate the significance at a lower computational cost than a straightforward approach. The method is illustrated on synthetic examples with patterns of various parametrizations, up to five dimensions. In order to demonstrate how to apply this generic concept in a real scenario, we consider a difficult crack detection task in still images, which has been addressed in the literature with various local and global detection strategies. We model cracks as bounded segments, detected by the proposed a-contrario criterion, which allows us to introduce additional spatial constraints based on their relative alignment. On this application, the proposed strategy yields state-of-the-art results and underlines its potential for handling complex pattern detection tasks.
Keywords
Image analysis · Number of False Alarms · Cumulative space · A-contrario decision · Crack detection
Abbreviations
FPGA
Field-programmable gate array
GPU
Graphics processing unit
NFA
Number of false alarms
1 Introduction
Since the seminal articles of Desolneux et al. [1, 2], detection approaches based on the Number of False Alarms (NFA) criterion have become increasingly popular in the field of image processing over the last decade. In these approaches, the words “a-contrario” refer to the fact that detection is performed by contradicting a “naive” model that represents the statistics of the outliers (the null hypothesis in statistical decision theory). The inliers are then detected as being too regular to appear “by chance” according to the naive model. The main asset of such approaches is their independence from threshold parameters, since they cast the detection as an optimization problem by maximizing the significance, defined from the deviation relative to the naive model. Then, to interpret this maximum of significance (or, equivalently, minimum NFA value) in terms of the presence or absence of a structured pattern, one refers to the NFA definition itself: the NFA of a pattern candidate is the expected number of false positives (random patterns) occurring in the search space when accepting all patterns at least as significant as the candidate. By setting the NFA detection threshold to 1 irrespective of the detection task, the a-contrario framework simply states that the upper bound for detecting a random rare event is at most one occurrence.
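To make the decision rule concrete, the classical a-contrario criterion of Desolneux et al. can be sketched as the number of tests multiplied by a binomial tail probability under the naive model. This is a minimal illustration, not the exact expressions used later in this paper; function names and the example numbers are ours:

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the probability that at least k of
    the n pixels of a candidate pattern are 1-valued under the naive model."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def nfa(n_tests, n, k, p):
    """NFA = (number of tests) x (tail probability of the candidate).
    A candidate is accepted when nfa(...) <= 1, i.e., at most one false
    alarm is expected over the whole search space."""
    return n_tests * binomial_tail(n, k, p)

# A pattern of 100 pixels containing 40 ones is significant under p = 0.1,
# even among a million candidate patterns:
print(nfa(1e6, 100, 40, 0.1) <= 1)
```

The threshold 1 is the task-independent setting discussed above: it bounds the expected number of random detections, not their probability.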
After the introductory illustration of a-contrario methods on alignment detection [1], grounded in the Gestalt continuity principle, a number of works have developed around the idea of using this fundamental pattern to detect derived structures such as segments [3, 4, 5], vanishing points [6, 7], or scratches [8], while recent works [9, 10] show the ongoing interest in the detection of basic alignments. Meanwhile, a-contrario methods have also evolved to deal with the detection of more complex patterns, such as circles and ellipses [11, 12], as well as coherent clusterings in a broader sense [13, 14, 15, 16, 17, 18].
In pattern recognition problems, in order to find the most significant subset among the ones representing patterns, one should theoretically compute the significance of every possible pattern. If the sought patterns correspond to parametric objects (e.g., lines, ellipses), the dimension (and thus the actual cardinality) of the solution space to explore grows with the number of parameters. Then, in order to maintain tractability, discretizing the parameter space or relying on heuristics [4] for exploration are commonly employed strategies. In this work, we show how a cumulative space may be used to compute efficiently the significance of a given parametric pattern. Cumulative approaches, widely used in pattern recognition and introduced in [19], rely on a quantization of the entire feasible parameter space, denoted as the accumulator, in which every observation increments the count of every discrete cell (i.e., pattern) consistent with the existence of that observation. At the end of the process, each cell records the total number of supporting observations.
In our work, we show that in some favorable cases there is an equivalence between considering cells in a high-dimensional cumulative space and recasting them as n-orthotopes in an alternative cumulative space of decreased dimensionality (by 1 or 2 in the proposed examples). Using a trick similar to the integral histogram [20], considering n-orthotopes then allows for an efficient computation of the required NFA values. Specifically, the integral histogram is the result of the propagation of an aggregated histogram from the origin through the whole image lattice. In this way, the histogram of any rectangular region may be computed by simple arithmetic operations between four points of the integral histogram. By leveraging this cumulative space of reduced dimensionality, we are thus able to lower significantly the complexity of the pattern search and to push back the limits that the available computer memory places on the parameter space dimension.
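The integral histogram principle can be sketched in a few lines of NumPy: after one cumulative pass over the lattice, the sum over any axis-aligned rectangle follows from at most four lookups. Function names are ours, for illustration only:

```python
import numpy as np

def integral_image(J):
    # Cumulative sums along both axes: entry (y, x) holds the sum of J[:y+1, :x+1].
    return J.cumsum(axis=0).cumsum(axis=1)

def rect_sum(II, y0, x0, y1, x1):
    # Sum of J over the inclusive rectangle [y0, y1] x [x0, x1],
    # via arithmetic between four points of the integral image II.
    s = II[y1, x1]
    if y0 > 0:
        s -= II[y0 - 1, x1]
    if x0 > 0:
        s -= II[y1, x0 - 1]
    if y0 > 0 and x0 > 0:
        s += II[y0 - 1, x0 - 1]
    return s
```

The same four-point arithmetic reappears below as Eq. (6), with accumulator axes in place of image axes.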
Then, as a second contribution, we show how crack detection in still images can benefit from the proposed coupling between an a-contrario criterion and a cumulative space.
2 Methods
2.1 Related work
where η_{2} is the “number of tests” coefficient that depends on the number of possible patterns of ν pixels.
2.2 Proposed approach: NFA computation using a cumulative space
According to Eq. (1) or (2), in order to compute the significance of a given pattern, we need both its geometric area (or its number of pixels) and its number of 1-valued pixels.
The main idea of this work is to use a cumulative space to store partial sums of numbers of points in order to decrease the computational cost. Then, the number of points in any pattern of given parameters can be directly retrieved from the values stored in the cumulative space, allowing us to accelerate the algorithm and/or to cope with a finer discretization of the parameter space.
In the remainder of this section, we explain our approach and illustrate how it works through four classic examples, namely the detection of rectangular tiles, strips, rings, and bounded strips.
2.2.1 Use of cumulative space
The cumulative space on which we focus varies with the considered pattern. Specifically, it arises from the chosen parametric form for the pattern of interest. Not all parametric forms (of a given pattern) are equivalent in terms of the involved cumulative space. Let us point out the representations as a set of “simpler” patterns (simpler in the sense that they involve fewer parameters), such that the set is defined by varying one (or two) parameter(s) of the simpler pattern within an interval. For instance, a strip may be represented either as a straight line having a strictly positive width, or as a set of parallel lines whose respective distances to the origin (ρ parameter in polar representation) vary between two bounds. Now, let us remark that, for such a parametric form, when two parameters represent the bounds of an interval, it is possible to handle them both on a single axis/dimension of the associated cumulative space. In the example of the strip, the first representation involves a 3D cumulative space, whereas the second representation allows us to use the same 2D cumulative space as that of straight lines, namely the classic Hough transform space.
Therefore, among the several parametric forms of a given pattern, denoted b, we favor the one that is a set of simpler patterns, denoted a, potentially having several parameters that can be represented on the same axis of the cumulative space. Such a representation allows us to reduce the dimensionality of the cumulative space and to save processing time by storing partial sums, in a similar fashion to integral histograms [20]. Specifically, it allows us to compute the number of votes for a pattern as follows.
Let l denote the number of parameters required to determine the considered pattern b. We denote by β=(β_{i})_{i∈[1,l]} the tuple of these parameters which take values in \(\mathcal {A}\) (\(\mathcal {A}\) depends on the image lattice and on the application that may introduce some specific constraints on the parameters).
Let \(\mathcal {C}\) be the considered cumulative space. If pattern b has been parametrized as a set of “well-chosen” patterns a, \(\mathcal {C}\) is the cumulative space associated to a parametrization of a. Since a has fewer parameters than b, some pairs of b parameters are bounds for intervals of a parameters. Then, we distinguish in \(\mathcal {C}\) the axes that represent only one parameter β_{i} and the axes that represent two different β_{i} (playing the role of bounds for some a parameters). For example, in Fig. 1, \(\mathcal {C}\) is the polar representation space, denoted by the two parameters (θ,ρ), with one axis that represents the angle θ, and the other axis that represents the two bounds for the ρ parameter. Since this last axis in fact carries two parameters, it is called a bi-parameter axis, in opposition to an axis that carries only one parameter, such as the θ-axis in this example. If m is the number of bi-parameter axes, with \(0\leq m\leq \frac {l}{2}\), the dimensionality of \(\mathcal {C}\) is l−m, and l−2m is the number of mono-parameter axes. Note that in \(\mathcal {C}\), a simpler pattern is represented by a point and a pattern of interest by an n-orthotope (also called a hyperrectangle). Indeed, any pattern of interest b is then represented by an n-orthotope of \(\mathcal {C}\) having l−2m dimensions reduced to a single point and the other dimensions being non-null intervals. Now, since \(\mathcal {C}\) is a cumulative space associated to pattern a, the number of votes for a given pattern a is provided by the value of the corresponding point in \(\mathcal {C}\), and the number of votes for a given pattern b is the sum of the point values in \(\mathcal {C}\) over the corresponding n-orthotope.
Without loss of generality, the elements of the β tuple are mapped to the \(\mathcal {C}\) axes, denoted (α_{i})_{i∈[1,l−m]}, as follows: the l−2m first components are mapped to the l−2m first \(\mathcal {C}\) parameter axes (that are thus mono-parameter axes) and the 2m last components are mapped to the m last \(\mathcal {C}\) parameter axes (that are thus bi-parameter axes), so that the β_{i} parameters are reordered as \(\left ((\alpha _{i})_{i\in \left [ 1,l-2m \right ]}, \left (\underline {\alpha }_{j}\right)_{j \in \left [l-2m+1,l-m\right ]}, \left (\overline {\alpha }_{j}\right)_{j\in \left [l-2m+1,l-m\right ]}\right)\), where \(\underline {\alpha }_{j}\) denotes an interval lower bound and \(\overline {\alpha }_{j}\) denotes an interval upper bound.
 if m=1,$$ \kappa\left(\mathbf{b}\right) = \sum_{\alpha_{l-1}=\underline{\alpha}_{l-1}}^{\overline{\alpha}_{l-1}} J_{\mathcal{C}} \left(\vec{b}_{1},\alpha_{l-1}\right) = \mathbb{J}_{\mathcal{C}} \left(\vec{b}_{1}, \overline{\alpha}_{l-1} \right) - \mathbb{J}_{\mathcal{C}} \left(\vec{b}_{1}, \underline{\alpha}_{l-1}-1 \right), $$(5)
 if m=2,$$ {\begin{aligned} \kappa\left(\mathbf{b}\right) =& \sum\limits_{\alpha_{l-3}=\underline{\alpha}_{l-3}}^{\overline{\alpha}_{l-3}} \sum\limits_{\alpha_{l-2}=\underline{\alpha}_{l-2}}^{\overline{\alpha}_{l-2}} J_{\mathcal{C}} \left(\vec{b}_{1}, \alpha_{l-3}, \alpha_{l-2} \right), \\ =& \mathbb{J}_{\mathcal{C}} \left(\vec{b}_{1}, \overline{\alpha}_{l-3}, \overline{\alpha}_{l-2} \right) + \mathbb{J}_{\mathcal{C}} \left(\vec{b}_{1}, \underline{\alpha}_{l-3}-1, \underline{\alpha}_{l-2}-1 \right) \\ & - \mathbb{J}_{\mathcal{C}} \left(\vec{b}_{1}, \underline{\alpha}_{l-3}-1, \overline{\alpha}_{l-2} \right) - \mathbb{J}_{\mathcal{C}} \left(\vec{b}_{1}, \overline{\alpha}_{l-3}, \underline{\alpha}_{l-2}-1 \right). \end{aligned}} $$(6)
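Equations (5) and (6) are instances of inclusion-exclusion over the 2^m corners of the n-orthotope, which generalizes to any m. A NumPy sketch of this generalization (helper names are ours, not the paper's):

```python
import numpy as np
from itertools import product

def prefix_sums(J, m):
    """Cumulate the accumulator J along its last m (bi-parameter) axes,
    yielding the equivalent of the integral histogram JJ."""
    JJ = J
    for axis in range(J.ndim - m, J.ndim):
        JJ = JJ.cumsum(axis=axis)
    return JJ

def kappa(JJ, fixed, bounds):
    """Votes for a pattern b: `fixed` is the tuple of mono-parameter indices,
    `bounds` the list of inclusive (lo, hi) intervals on the bi-parameter
    axes. Inclusion-exclusion over the 2^m corners reproduces Eqs. (5)-(6)."""
    total = 0
    for corner in product(*[(hi, lo - 1) for lo, hi in bounds]):
        # One minus sign per lower-bound corner chosen.
        sign = (-1) ** sum(c == lo - 1 for c, (lo, hi) in zip(corner, bounds))
        if any(c < 0 for c in corner):
            continue  # JJ is zero outside the lattice
        total += sign * JJ[fixed + corner]
    return total
```

For m=2 the loop visits exactly the four terms of Eq. (6), with the two negative terms corresponding to a single lower-bound corner.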
In summary, the main idea of the paper is that the NFA can be computed more efficiently with cumulative space precomputation: for an image of size N^{2}, using the integral histogram trick in a well-chosen cumulative space of reduced dimension l−m (instead of l), with each single dimension of size M, the complexity is roughly N^{2} (\(J_{\mathcal {C}}\) computation) + M^{l−m} (\(\mathbb {J}_{\mathcal {C}}\) computation) + M^{l} (pattern search). Reducing the complexity compared to a complete brute-force approach in N^{2}×M^{l} then allows for more precise results by providing a finer estimation of the pattern parameters.

Rectangular tiles are parametrized as sets of 2D points, using the coordinates of two opposite corners: (x_{UL},y_{UL},x_{LR},y_{LR}) where x_{UL} and y_{UL} (respectively x_{LR} and y_{LR}) denote the image coordinates (column and row) of the upper left (respectively lower right) corner of the tile; x_{UL}∈[1,N_{c}], y_{UL}∈[1,N_{r}], x_{LR}∈[x_{UL},N_{c}] and y_{LR}∈[y_{UL},N_{r}]. The cumulative space \(\mathcal {T}\) has two dimensions: the column axis x representing x_{UL} and x_{LR} and the row axis y representing y_{UL} and y_{LR}. \(J_{\mathcal {T}}\) is the binary image itself and \(\mathbb {J}_{\mathcal {T}}\), the cumulative space containing the partial sums of \(J_{\mathcal {T}}\), is derived from Eq. (4) with l=4, m=2, α_{l−3}=x and α_{l−2}=y.

Strips are parametrized as sets of parallel straight lines, through the polar coordinates of the two border lines: (θ,ρ_{0}) and (θ,ρ_{1}), where θ is the parallel line direction (a single value) and ρ_{0} and ρ_{1} the distances of the border lines to the space origin. Choosing the image center as origin, ρ_{0} and ρ_{1} are signed values so that there is no discontinuity in ρ values for strips containing the origin. Then, θ∈[0,π), ρ_{0}∈[−ρ_{d},ρ_{d}], ρ_{1}∈[ρ_{0},ρ_{d}]. The cumulative space \(\mathcal {S}\) has two dimensions: the angular axis θ and the distance axis ρ representing ρ_{0} and ρ_{1}. \(J_{\mathcal {S}}\) is the classic Hough transform [19], and \(\mathbb {J}_{\mathcal {S}}\) is the cumulative space containing the partial sums of \(J_{\mathcal {S}}\), derived from Eq. (4) with l=3, m=1 and α_{l−1}=ρ.
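The strip case can be sketched end to end: build the line Hough accumulator with signed ρ measured from the image center, cumulate it along ρ, and obtain any strip's vote count in two lookups as in Eq. (5). The discretization choices below (180 angles, rounded ρ) are our own illustrative assumptions:

```python
import numpy as np

def line_hough(img, n_theta=180):
    """J_S(theta, rho): each 1-valued pixel votes, for every quantized theta,
    for the single line at signed distance rho from the image center."""
    rows, cols = img.shape
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    rho_d = int(np.ceil(np.hypot(cy, cx)))
    thetas = np.arange(n_theta) * np.pi / n_theta
    acc = np.zeros((n_theta, 2 * rho_d + 1), dtype=int)
    for y, x in zip(*np.nonzero(img)):
        rhos = np.round((x - cx) * np.cos(thetas)
                        + (y - cy) * np.sin(thetas)).astype(int)
        acc[np.arange(n_theta), rhos + rho_d] += 1
    return acc, rho_d

def strip_votes(cum_acc, t, rho0, rho1, rho_d):
    """1-valued pixels in the strip (theta_t, [rho0, rho1]) via Eq. (5);
    cum_acc is acc.cumsum(axis=1), i.e., the partial sums along rho."""
    lo = rho0 + rho_d
    hi = rho1 + rho_d
    return cum_acc[t, hi] - (cum_acc[t, lo - 1] if lo > 0 else 0)
```

Note that each 1-valued pixel votes exactly once per θ, so summing any θ row of the accumulator recovers the total number of 1-valued pixels.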

Rings are parametrized as sets of concentric circles, through the circle center coordinates (x_{0},y_{0}) and two radii ρ_{0} and ρ_{1}, where x_{0}∈[1,N_{c}], y_{0}∈[1,N_{r}], ρ_{0}∈[0,ρ_{d}], ρ_{1}∈[ρ_{0},ρ_{d}]. The considered cumulative space \(\mathcal {R}\) has three dimensions: the column and row axes x and y for the coordinates of the center, and the radius axis ρ representing ρ_{0} and ρ_{1}. \(J_{\mathcal {R}}\) is the circle Hough transform, and \(\mathbb {J}_{\mathcal {R}}\) is the cumulative space containing the partial sums of \(J_{\mathcal {R}}\), derived from Eq. (4) with l=4, m=1 and α_{l−1}=ρ.

Bounded strips are sets of parallel line segments that can also be represented as unbounded strips with two extremities. They are parametrized by a 5-tuple (θ,ρ_{0},ϕ,ρ_{1},ψ) where (θ,ρ_{0},ρ_{1}) represents the unbounded strip as previously stated, and ϕ, ψ are the angular coordinates of the extremities: θ∈[0,π), (ϕ,ψ)∈[0,2π)^{2}, ρ_{0}∈[−ρ_{d},ρ_{d}], ρ_{1}∈[ρ_{0},ρ_{d}]. The considered cumulative space \(\mathcal {B}\) has three dimensions, namely the strip angle axis θ, the distance axis ρ representing ρ_{0} and ρ_{1}, and the extremity angular coordinate axis ϕ^{′} representing ϕ and ψ. \(J_{\mathcal {B}}\) is the half-line Hough transform (i.e., a 1-valued pixel votes only for the line segments that contain it and whose starting point has the lower angular coordinate), and \(\mathbb {J}_{\mathcal {B}}\) is the cumulative space containing the partial sums of \(J_{\mathcal {B}}\), derived from Eq. (4) with l=5, m=2, α_{l−3}=ρ and α_{l−2}=ϕ^{′}.
2.2.2 Algorithm
Algorithm 1 describes the way the most significant patterns are detected using the NFA criterion coupled with cumulative spaces. Its inputs are as follows: the considered binary image I, the cumulative space \(\mathcal {C}\) determined by the considered pattern (for conciseness, the \(\mathcal {C}\) bi-parameter axes are denoted as “bip-axes”), and the set of possible patterns \(\mathcal {A}\), which is determined by the image dimensions and possibly by some application-specific constraints. The output of Algorithm 1 is the collection of the most significant patterns, \(\mathcal {P}\). Note that here we use the term collection to avoid a possible confusion with the term set, which we already use to refer to a group of simpler patterns. After the initialization step, the algorithm begins a loop that successively detects the patterns to be added, one by one, to \(\mathcal {P}\) as the most significant pattern at the current iteration. At each iteration, \(\mathbb {J}_{\mathcal {C}}\) is computed according to Eq. (3) or Eq. (4), depending on the value of m (in this work, we focus on m∈{1,2} but the generalization is trivial). Then, two vectors \(\vec {\kappa }[.]\) and \(\vec {\boldsymbol {\alpha }}[.]\) of dimensionality N, the number of pixels assumed to be the maximum size of a pattern, are allocated. Note that \(\vec {\kappa }\left [.\right ]\) elements are integer values and \(\vec {\boldsymbol {\alpha }}\left [.\right ]\) elements are l-tuples. They store, for each pattern size j in pixel units, the maximum number of 1-valued pixels (in \(\vec {\kappa }[j]\)) and the corresponding pattern parameter tuple (in \(\vec {\boldsymbol {\alpha }}[j]\)). Indeed, for a given pattern area, the significance increases with the number of 1-valued pixels within it. Then, it is not necessary to compute the significance values for each pattern, but only for the patterns having different areas (in pixels) and achieving the maximum number of 1-valued pixels \(\vec {\kappa }[j]\).
For this reason, the significance computation is done only after having selected, for each different pattern size, the pattern having the highest number of 1-valued pixels: the first loop selects these patterns; then, a second loop computes their associated significance values while at the same time searching for the maximum one, which is stored in S_{max} along with the corresponding pattern stored in b. Finally, having found the most significant pattern \(\hat {\alpha }\) at the current iteration, we add it to the collection of significant patterns \(\mathcal {P}\) only if the global significance of the collection of patterns increases. If this is not the case, the algorithm ends. Otherwise, before reiterating, the located pattern is removed from the image. In our case, when we remove a pattern, we do not set all its pixels to false, but only the exceeding ones relative to the naive model parameter p. Practically, for each 1-valued pixel located in the area of a primary pattern, we redraw its value (0 or 1) randomly according to the probability p. Although suboptimal, this simple adjustment allows us to avoid overly penalizing patterns which overlap other previously detected patterns.
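The pattern-removal step just described can be sketched as a Bernoulli re-draw with the naive model parameter p (the function name and mask representation are ours, for illustration):

```python
import numpy as np

def remove_pattern(img, mask, p, rng=None):
    """Instead of zeroing a detected pattern, re-draw its 1-valued pixels as
    Bernoulli(p), so that the region falls back to the naive model; this
    avoids overly penalizing later patterns overlapping this one."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()
    ones = mask & (img == 1)  # 1-valued pixels inside the detected pattern
    out[ones] = (rng.random(int(ones.sum())) < p).astype(img.dtype)
    return out
```

With p equal to the naive model density, the expected count of 1-valued pixels in the removed region matches the null hypothesis, so the region no longer stands out at the next iteration.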
Note that because of the non-monotonicity of the projection of the extremities onto the strip versus the angular coordinate, in the case of bounded strips, Eq. (6) should be adapted. For instance, let us consider the case of \(\phi \in \left (\frac {\pi }{2},\frac {3\pi }{4} \right ]\) and \(\psi \in \left [ \frac {7\pi }{4},2\pi \right)\). Since theoretically (ϕ,ψ)∈[0,2π)^{2}, an intermediate bound noted by its angle φ is such that φ∈[0,ϕ)∪(ψ,2π); to have φ∈(ϕ,ψ) or (ψ,ϕ), in this case, we change to \(\psi \in \left [ -\frac {\pi }{4},0 \right)\). In the more general case, to get the intermediate bound between ϕ and ψ, we consider as the new bound the preimage of the image of ψ closest to ϕ (thus enforcing the monotonicity of the projection).
Comparison of running times (in seconds) of the proposed approach based on the use of a cumulative space and a naive implementation; case of the toy example patterns involving three objects to detect in each simulated image; running times averaged over 5 realizations on a standard laptop computer (Core i7 2670QM @ 2.2 GHz, 4 GB RAM)
Pattern  Proposed use of cumulative space  Naive implementation  Time gain factor 

Tiles  0.842 s  521.4 s  619 
Strips  1.622 s  6095 s  3757 
Rings  3.622 s  4501 s  1243 
Let us now consider actual data and a real application.
3 Crack detection in still images
The proposed approach is suited to applications involving a significance measure or NFA criterion and benefiting from a finer sampling of the pattern space. For detection tasks that are quite standard a-contrario problems, the basic idea is that the detection relies on pattern significance (or NFA), which itself depends on the number of 1-valued pixels belonging to the considered pattern. For instance, for detection estimated at region level, i.e., in rectangular windows, the refinement of the space would imply sampling with a 1-pixel sliding step in both dimensions, rather than using a non-overlapping or half-overlapping window sampling strategy [21]. Undoubtedly, the benefit of the proposed method is greater for patterns with a high number of parameters, i.e., involving a high-dimensional parametric space.
In this study, we have chosen to illustrate our refinement algorithm on the problem of crack detection.
Crack detection has indeed critical importance for ensuring the security of infrastructures and for minimizing maintenance costs. In terms of appearance, a crack is a discontinuity in the background with respect to the underlying material (e.g., asphalt for roads or concrete for walls). Then, it may be detected based on some photometric and geometric features [23] that are exploited by several proposed approaches (e.g., [24, 25]) which perform rather well on cracks observed on smooth and homogeneous surfaces (e.g., concrete).
However, these methods often fail when the background exhibits a noisy texture like in the case of road pavement.
In this study, we therefore focus on the noisy background case (including texture and various artifacts), considering the dataset proposed in [26] that contains challenging images of road surfaces. Road texture is indeed characterized by the presence of numerous asphalt textons, i.e., sand and gravel aggregates that appear like ridge details inducing small clusters of light and dark pixels in the image background. Since the road surface observations have been acquired by a camera embedded on a vehicle, this aspect is even emphasized by the camera’s pitch angle, which causes these small structures to appear with non-stationary density and size due to perspective distortion, varying at the same time the spatial scale of the noise produced by the rough textons (cf. Fig. 5, first column). Therefore, the radiometric features alone do not allow us to distinguish the cracks from the road background, and geometric features should also be considered, as in [27]. However, whereas [27] focuses on the modeling of the spatial interactions between line segments, in this work, we focus on the detection of these line segments based on significance (or NFA) computation. The background heterogeneity due to asphalt textons indeed fits the null hypothesis well. On such a background, the lines that compose the cracks may be seen as a deviation from the naive model, thus having high significance values.
3.1 Related work
Crack detection approaches generally involve a preprocessing step that aims at computing a new data image on which the detection will be easier. The preprocessing step may consist in removing adverse or clutter features (e.g., shadow removal [26]), in enhancing the pattern, e.g., by subtracting the median-filtered image [25], or in stressing the filiform nature of the cracks, e.g., by using Laplacian of Gaussian or steerable filters [28, 29]. In this work, we consider the same preprocessing step as in [30]. Basically, it involves shadow removal by background subtraction and the estimation of a new radiometric image, whose values gather both gray level information and gradient orientation features.
Then, from the preprocessed image, two analysis scales may be considered for crack detection itself. In [24, 31], the detection is based on a local analysis performed across the whole image space using a sliding window, whereas in [26, 32, 33] the detection relies on a global process related to the expected photometric properties of cracks, based on the computation of minimum cost paths. The local or global search strategy and the parameters required by the algorithms set the scale for the patterns to be detected. Now, we observed that the cracks may be highly variable in terms of scale, thinness, and relative contrast. Thus, both local and global scales seem relevant and will be considered in the proposed approach, through the measure of significance relative to the context in a multiscale reasoning.
Finally, note that since the used NFA criterion applies to binary images, we perform an automatic thresholding operation on the preprocessed image to derive the seed image, i.e., the binary image where 1-valued pixels are very likely to belong to the crack. Then, the objective of the whole algorithm presented in the next section is to remove the false positives and to correct the false negatives in this seed image.
3.2 Crack detection algorithm
Following a multiscale strategy, we adopt a two-step algorithm. The input image is the binary image of the seeds, in which the 1-valued pixels represent (in an incomplete way and including some false positives) the sought patterns. The first step deals with the local scale and aims at determining the most significant local alignments of 1-valued pixels, called elementary strips in the following. Then, the second step is intended to identify the significant straight chains of elementary strips.
Algorithm 2 describes the method based on the two successive steps. In Algorithm 2, a window refers to the rectangular image subarea used for local detection. Its dimensions in columns and rows are given as input parameters. Then, in every considered window, the elementary strips are detected as the most significant unbounded strip(s), following Section 2.2. At the end of this step, a new binary image is derived such that the 1-valued pixels are exclusively located in detected significant local strips. In other words, the maximization of significance at local scale (over each window) is used as a filtering process which removes part of the false positives present in the seed image. Besides, the extremities of the local strips are also stored as possible extremities of the bounded strips estimated in the next step. Then, at image scale, the most significant bounded strip(s) are detected following Section 2.2. Finally, the cracks are approximated by the concatenation of the elementary strips (detected at local scale) that also belong to a bounded strip detected at image scale.
4 Results and discussion
We have applied the proposed algorithm to the public CrackTree dataset provided by [26], which illustrates our approach very well since noisy texture and other degradation artifacts are commonly present on the asphalt surface.
Algorithm 2 inputs are the binary image of the seeds and the window size used for the local analysis. In the CrackTree dataset, the image size is 800×600 pixels. In Section 4.1, we choose to divide each image dimension by 10, so that the resulting window size is 80×60. In Section 4.2, we validate this parameter choice more thoroughly. Regarding the seed image, we use the same preprocessing step as in [30]. Specifically, we use Algorithm 1 of [30], which includes a final thresholding operation with respect to an automatically derived threshold according to an NFA criterion operating at grayscale pixel level. However, this algorithm takes as input a standard deviation parameter that can be either automatically estimated from the grayscale image or carefully chosen. This allows us to modulate the automatic threshold estimation. In Section 4.1, we consider the default value for this standard deviation, whereas in Section 4.2 we also consider results obtained with an alternative value to test the robustness of our results.
To evaluate the obtained results quantitatively, the precision and recall indexes are computed while distinguishing between the misdetection and the mislocation of a crack, as follows. As in [27], the true positives and the false positives are computed by comparing the detection results with an incrementally dilated ground truth, while the false negatives are computed by comparing the ground truth with an incrementally dilated version of the detection results. Then, varying the dilation radius allows us to distinguish between errors due to a slight mislocation of the crack and actual non-detections.
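This tolerant evaluation can be sketched with plain NumPy; we assume a square structuring element for the dilation, which may differ from the exact element used in the evaluation protocol:

```python
import numpy as np

def dilate(mask, r):
    """Binary dilation by a (2r+1)x(2r+1) square structuring element,
    implemented as an OR over all shifts of the padded mask."""
    padded = np.pad(mask, r)
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def tolerant_precision_recall(det, gt, radius):
    """Precision against the dilated ground truth and recall against the
    dilated detection, so slight mislocations are not counted as errors."""
    precision = (det & dilate(gt, radius)).sum() / max(det.sum(), 1)
    recall = (gt & dilate(det, radius)).sum() / max(gt.sum(), 1)
    return precision, recall
```

A detection one pixel away from the ground truth is an error at radius 0 but a true positive at radius 1, which is exactly the mislocation/misdetection distinction made above.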
4.1 Analysis of the crack detection results
Main statistical values on precision and recall indexes achieved on the CrackTree dataset [26]
Dilation radius  1  2  3 

(Precision %, recall %) mean values  (86.8, 86.4)  (89.5, 88.7)  (90.7, 89.8) 
(Precision %, recall %) median values  (87.8, 86.3)  (90.4, 88.9)  (91.6, 89.7) 
(Precision %, recall %) 75th percentile  (90.6, 91.2)  (93.4, 93.3)  (95.0, 93.9) 
(Precision %, recall %) 25th percentile  (82.5, 82.3)  (85.8, 85.1)  (86.6, 86.9) 
Now, to provide a deeper and more specific analysis of the obtained results, we have selected some typical examples that are shown in the next figures. We selected these examples to illustrate the efficiency of the multiscale approach, but also its behavior with respect to crack feature assumptions and to shadow presence. Note that since some of them also correspond to Fig. 7 of [26], the reader may refer to this figure for a qualitative comparison. In these figures, the first column shows the original images; the second column shows in the blue channel the seed image obtained automatically following [30], whereas the results of the local (window-based) strip detection appear in red, with the window grid in green. The third column shows in red the final strips (i.e., the strips belonging to a significant alignment at global scale), overlaid with the ground truth in blue. In the last column, we show the results of a simple postprocessing step which connects the 1-valued pixels validated by the final strips using minimum cost paths and possibly removes the obtained path based on its average cost value. Note that this postprocessing is not the focus of our study, but it was necessary to extract the final cracks (for instance, for the quantitative evaluation).
Due to the camera tilt, the background textons are much more prominent in the lower part of the image (cf. the first-row example, for instance), creating numerous false alarms at pixel level (blue pixels in the images of the second column of Fig. 5). The local analysis captures the non-stationarity of such noise (caused by textons) by selecting the most significant elementary strips within local windows (red segments in the images of the second column of Fig. 5). However, most of the elementary strips are false alarms at global scale. The global analysis then allows for their filtering by checking their consistency at image scale (red segments in the third column versus the second column of Fig. 5). Finally, having detected the rough shape of the cracks, the postprocessing step allows for a finer estimation of the shape as well as the removal of some isolated false alarms. Note also that the local analysis is important not only to improve the resilience to texture non-stationarity, but also to crack width variations. For instance, the second row illustrates the ability of the method to recover not only the main cracks, but also the thinner ones: the window-scale step allows for local significance maximization, whereas the sole use of a global measure would have deemed thin cracks insignificant relative to wider cracks.
In summary, we evaluate our algorithm with respect to five phenomena adverse to crack detection: two that are independent of the cracks themselves, namely the background texture and the presence of shadows, and three that characterize the cracks, namely their thinness, the length of secondary branches, and partial filling (which behaves like partial occlusion). The analysis of the results shows that, even in the presence of these adverse phenomena, the algorithm is able to detect almost every strip that composes the actual cracks and to follow their jagged behavior. The window-level detection removes most of the false alarms introduced at the pixel level (seeds), but introduces some window-level false alarms (on average, one detection per window is not related to an actual crack); these are in turn removed by the image-scale detection, unless they are found to belong to a significant strip at the image level.
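The significance test underlying both detection levels follows the standard a contrario scheme recalled in the introduction: a candidate is significant when its NFA, i.e., the number of tests times the probability of observing at least as many aligned points under the naive model, falls below 1. The following minimal sketch assumes a Bernoulli naive model with a binomial tail; the function names and parameters are illustrative, not the paper's notation.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the probability that a random
    pattern of n points has at least k hits under the naive model."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def nfa(n_tests, n, k, p):
    """Number of False Alarms: expected count of random patterns at
    least as significant as the candidate over the whole search space."""
    return n_tests * binomial_tail(n, k, p)

def is_significant(n_tests, n, k, p, epsilon=1.0):
    """A contrario decision: accept the candidate when NFA < epsilon,
    with epsilon = 1 as the task-independent default."""
    return nfa(n_tests, n, k, p) < epsilon
```

Note that a strip which is locally significant (small local `n_tests`) may fail this test once `n_tests` accounts for the full image-scale search space, which is exactly how the global analysis filters the window-level false alarms.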
4.2 Impact of the scale on the local analysis
In Algorithm 2, the scale of the local analysis is driven by the window size. This parameter depends on the image resolution as well as on the sinuosity of the structures to detect, so it is up to the user to set its value. To illustrate its importance, however, we study its impact on our specific application and data.
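To make the role of this parameter concrete, the window grid over which the local analysis runs can be sketched as follows; this is a generic illustration (the partitioning convention and clipping at the image border are assumptions, not necessarily those of Algorithm 2).

```python
def iter_windows(height, width, win):
    """Yield (row, col, h, w) for a non-overlapping grid of windows of
    nominal size `win`; trailing windows are clipped to the image border
    so that every pixel belongs to exactly one window."""
    for r in range(0, height, win):
        for c in range(0, width, win):
            yield r, c, min(win, height - r), min(win, width - c)
```

A smaller `win` tracks sinuous structures and local noise statistics more closely but raises the per-window false alarm budget, whereas a larger `win` approaches the purely global analysis; this trade-off is precisely what the impact study below examines.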
5 Conclusion
In this paper, we proposed a generic method for computing the NFA criterion for pattern detection. We consider that relying on an advantageous grouping of parameters in the resulting cumulative space is applicable to a wide variety of problems and may facilitate the use of a contrario algorithms in various applications involving parametric pattern detection. Our technique was applied to a crack detection task, which naturally fits the framework since cracks can be seen as a deviation from the naive model in a heterogeneous background; this allowed us to illustrate the pertinence and the benefit of reparametrization and multiscale a contrario analysis for complex patterns.
Future work will be devoted to accelerating the accumulation tasks, since most cumulative-space operations are inherently independent and could therefore take advantage of parallel architectures. On such architectures (GPU, FPGA), the available memory resources are often more constrained than on generic systems; however, the reduced memory footprint of our algorithm should be directly beneficial in such a scenario.
Acknowledgements
We would like to acknowledge the support from the authors of [26] for providing the dataset for benchmarking and for various discussions.
Funding
This work has been funded by authors’ own resources.
Availability of data and materials
The crack image dataset [26] used for the experiments is publicly available at https://sites.google.com/site/qinzoucn/documents.
Authors’ contributions
SLH designed and conceived the research and implemented the core algorithm. SLH, EA, and JV performed the experiments and analyzed the experimental results. SLH and EA drafted the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. A. Desolneux, L. Moisan, J. M. Morel, Meaningful alignments. Int. J. Comput. Vis. 40(1), 7–23 (2000).
2. A. Desolneux, L. Moisan, J. M. Morel, A grouping principle and four applications. IEEE Trans. Pattern Anal. Mach. Intell. 25(4), 508–513 (2003).
3. R. G. von Gioi, J. Jakubowicz, J. M. Morel, G. Randall, On straight line segment detection. J. Math. Imaging Vis. 32(3), 313 (2008).
4. R. G. von Gioi, J. Jakubowicz, J. M. Morel, G. Randall, LSD: A fast line segment detector with a false detection control. IEEE Trans. Pattern Anal. Mach. Intell. 32(4), 722–732 (2010).
5. C. Akinlar, C. Topal, EDLines: A real-time line segment detector with a false detection control. Pattern Recognit. Lett. 32(13), 1633–1642 (2011).
6. A. Almansa, A. Desolneux, S. Vamech, Vanishing point detection without any a priori information. IEEE Trans. Pattern Anal. Mach. Intell. 25(4), 502–507 (2003).
7. J. Lezama, R. Grompone von Gioi, G. Randall, J. M. Morel, Finding vanishing points via point alignments in image primal and dual domains, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (IEEE, Columbus, 2014), pp. 509–515.
8. A. Newson, A. Almansa, Y. Gousseau, P. Pérez, Robust automatic line scratch detection in films. IEEE Trans. Image Process. 23(3), 1240–1254 (2014).
9. J. Lezama, J. M. Morel, G. Randall, R. G. von Gioi, A contrario 2D point alignment detection. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 499–512 (2015).
10. S. Blusseau, A. Carboni, A. Maiche, J. M. Morel, R. G. von Gioi, Measuring the visual salience of alignments by their non-accidentalness. Vis. Res. 126, 192–206 (2016).
11. C. Akinlar, C. Topal, EDCircles: A real-time circle detector with a false detection control. Pattern Recognit. 46(3), 725–740 (2013).
12. V. Pătrăucean, P. Gurdjos, R. G. von Gioi, Joint a contrario ellipse and line detection. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 788–802 (2017).
13. T. Veit, F. Cao, P. Bouthemy, An a contrario decision framework for region-based motion detection. Int. J. Comput. Vis. 68(2), 163–178 (2006).
14. N. Burrus, T. M. Bernard, J. M. Jolion, Image segmentation by a contrario simulation. Pattern Recognit. 42(7), 1520–1532 (2009).
15. J. Rabin, J. Delon, Y. Gousseau, A statistical approach to the matching of local features. SIAM J. Imaging Sci. 2(3), 931–958 (2009).
16. A. Robin, L. Moisan, S. Le Hégarat-Mascle, An a-contrario approach for subpixel change detection in satellite imagery. IEEE Trans. Pattern Anal. Mach. Intell. 32(11), 1977–1993 (2010).
17. G. Palma, I. Bloch, S. Muller, Detection of masses and architectural distortions in digital breast tomosynthesis images using fuzzy and a contrario approaches. Pattern Recognit. 47(7), 2467–2480 (2014).
18. S. Zair, S. Le Hégarat-Mascle, E. Seignez, A-contrario modeling for robust localization using raw GNSS data. IEEE Trans. Intell. Transp. Syst. 17(5), 1354–1367 (2016).
19. P. V. Hough, Method and means for recognizing complex patterns. US Patent 3,069,654 (1962).
20. F. Porikli, Integral histogram: A fast way to extract histograms in Cartesian spaces, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1 (IEEE, San Diego, 2005), pp. 829–836.
21. F. Dibos, S. Pelletier, G. Koepfler, Real-time segmentation of moving objects in a video sequence by a contrario detection, in IEEE International Conference on Image Processing (ICIP), vol. 1 (IEEE, Genova, 2005), p. 1065.
22. F. Dibos, G. Koepfler, S. Pelletier, Adapted windows detection of moving objects in video scenes. SIAM J. Imaging Sci. 2(1), 1–19 (2009).
23. S. Chambon, J. M. Moliard, Automatic road pavement assessment with image processing: Review and comparison. Int. J. Geophys. 2011, 1–20 (2011).
24. T. Yamaguchi, S. Hashimoto, Fast crack detection method for large-size concrete surface images using percolation-based image processing. Mach. Vis. Appl. 21(5), 797–809 (2010).
25. Y. Fujita, Y. Hamamoto, A robust automatic crack detection method from noisy concrete surfaces. Mach. Vis. Appl. 22(2), 245–254 (2011).
26. Q. Zou, Y. Cao, Q. Li, Q. Mao, S. Wang, CrackTree: Automatic crack detection from pavement images. Pattern Recognit. Lett. 33(3), 227–238 (2012).
27. J. Vandoni, S. Le Hégarat-Mascle, E. Aldea, Crack detection based on a marked point process model, in 23rd International Conference on Pattern Recognition (ICPR) (IEEE, Cancún, 2016), pp. 3933–3938.
28. N. Batool, R. Chellappa, Modeling and detection of wrinkles in aging human faces using marked point processes, in European Conference on Computer Vision (Springer, Florence, 2012), pp. 178–188.
29. S. G. Jeong, Y. Tarabalka, J. Zerubia, Marked point process model for facial wrinkle detection, in IEEE International Conference on Image Processing (ICIP) (IEEE, Paris, 2014), pp. 1391–1394.
30. E. Aldea, S. Le Hégarat-Mascle, Robust crack detection for unmanned aerial vehicles inspection in an a-contrario decision framework. J. Electron. Imaging 24(6), 061119 (2015).
31. T. S. Nguyen, S. Begot, F. Duculty, M. Avila, Free-form anisotropy: A new method for crack detection on pavement surface images, in 18th IEEE International Conference on Image Processing (ICIP) (IEEE, Brussels, 2011), pp. 1069–1072.
32. L. Qingquan, Z. Qin, Z. Daqiang, Q. Mao, FoSA: F* seed-growing approach for crack-line detection from pavement images. Image Vis. Comput. 29(12), 861–872 (2011).
33. R. Amhaz, S. Chambon, J. Idier, V. Baltazart, A new minimal path selection algorithm for automatic crack detection on pavement images, in IEEE International Conference on Image Processing (ICIP) (IEEE, Paris, 2014), pp. 788–792.
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.