A fault detection method based on partition of unity and kernel approximation

In this paper, we present a scattered data approximation method for detecting and approximating the discontinuities of a bivariate function and its gradient. The new algorithm is based on partition of unity, polyharmonic kernel interpolation, and principal component analysis. Localized polyharmonic interpolation in a partition of unity setting is applied to detect a set of fault points on or close to discontinuity curves. Then a combination of partition of unity and principal component regression is used to thin the detected points by moving them approximately onto the fault curves. Finally, an ordered subset of these narrowed points is extracted, and parametric spline interpolation is applied to reconstruct the fault curves. A selection of numerical examples with different behaviors and an application to solving scalar conservation law equations illustrate the performance of the algorithm.

as edge detection in image processing, geophysical science, oil industry, and tomography [3-5, 9, 27, 44, 49, 55]. In all cases, a segmentation of an enclosed area that is related to a particular phenomenon is needed. In addition, in some approximation methods, the discontinuity curves and regions should be known in advance in order to obtain accurate solutions [6,8].
For a given bivariate function, the curves on which the function itself or its gradient is discontinuous are called ordinary and gradient faults, respectively. Most fault detection methods cannot distinguish between these two kinds of faults. However, some algorithms have recently been developed for determining both ordinary and gradient faults of an underlying function [11-13].
The basis of all fault detection methods is the use of a derivative operator to indicate a cloud of points near the discontinuity curve. Some algorithms use a pure interpolation operator and measurements of the Gibbs phenomenon near the discontinuity [2,28,36,37,44], while others use gradient and Laplacian operators to define an indicator [12]. In [11], a central difference operator and a statistical procedure are applied, and in [5], a local Taylor expansion and a polynomial annihilation criterion are used to indicate the fault clouds.
Assume that a set of scattered points with corresponding function values is available. In this paper, to deal with the non-uniform nature of a scattered data set, we employ a localized meshfree approximation based on polyharmonic kernels in combination with the partition of unity (PU) method for detecting a non-organized set of points on or close to the (unknown) faults of the function from which the data may have originated. Gradient-based and Laplacian-based indicators are defined for identifying a cloud of fault points within the original data set. In each PU patch, we apply polyharmonic interpolation on scaled data points to prevent instability of the kernel matrices. Note that interpolation by polyharmonic kernels is the only scalable approximation among all radial basis function (RBF) approximations [20,31].
In the first step, our algorithm produces an unorganized cloud of points near the discontinuity curve. To get an accurate reconstruction of the fault, we employ principal component regression (PCR) [34,35,39] in combination with the PU approximation to generate a second set of points which are supposed to be closer to the fault curve than the primarily detected set. We apply PCR instead of linear least-squares approximation for better curve fitting, especially in subregions where the fault curve has a nearly vertical direction.
In the final step, an ordered subset of the previously narrowed fault points is extracted, and smooth parametric spline interpolation is employed to reconstruct the fault curve.
We also address the situations with multiple fault curves and special cases with intersections and multi-branch configurations.
A number of numerical examples illustrate the performance and efficiency of the method, including the detection and reconstruction of multi-branch and closed faults, and an application to solving a conservation law problem via the finite volume method (FVM).

Polyharmonic spline interpolation
Polyharmonic spline (PHS) interpolation is a particular case of interpolation with conditionally positive definite RBFs [24,31,53]. Assume that φ : R^d → R is a conditionally positive definite function of order m+1, i.e., with respect to the polynomial space P_m(R^d). Let Ω ⊂ R^d be a bounded region. The RBF interpolation of a function f : Ω → R on a discrete set X = {x_1, ..., x_N} ⊂ Ω is given by

s_{f,X}(x) = \sum_{j=1}^{N} \alpha_j \varphi(x - x_j) + \sum_{n=1}^{Q} a_n p_n(x),

where {p_1, ..., p_Q} is a basis for P_m(R^d), and α = (α_1, ..., α_N)^T and a = (a_1, ..., a_Q)^T satisfy

\begin{bmatrix} K & P \\ P^T & 0 \end{bmatrix} \begin{bmatrix} \alpha \\ a \end{bmatrix} = \begin{bmatrix} f \\ 0 \end{bmatrix},   (4)

where K ∈ R^{N×N} with K_{jk} = φ(x_j − x_k), k, j = 1, ..., N; P ∈ R^{N×Q} with P_{jn} = p_n(x_j), j = 1, ..., N, n = 1, ..., Q; and f = (f(x_1), ..., f(x_N))^T. We also need to assume N ≥ Q and that X is P_m(R^d)-unisolvent to have a full-rank matrix P. On the other hand, since φ is conditionally positive definite of order m+1, the symmetric matrix K is positive definite on ker(P^T) as a subspace of R^N. All this guarantees that the interpolation system (4) is uniquely solvable. The interpolant s_{f,X} can also be written in Lagrange form as

s_{f,X}(x) = \sum_{j=1}^{N} u_j(x) f(x_j) = u(x)^T f,

where (u_1(x), ..., u_N(x))^T =: u(x) satisfies

\begin{bmatrix} K & P \\ P^T & 0 \end{bmatrix} \begin{bmatrix} u(x) \\ v(x) \end{bmatrix} = \begin{bmatrix} \varphi(x - \cdot)|_X \\ p(x) \end{bmatrix}.

The Lagrange functions possess the property u_j(x_k) = δ_{kj}. In the case of PHS interpolation, the function φ is defined as

\varphi(x) = \|x\|_2^\beta

for a real number β > 0, β ∉ 2N. The PHS function φ is (up to a sign) conditionally positive definite of order m + 1 = ⌈β/2⌉.
Derivatives of f are approximated by the corresponding derivatives of s_{f,X}. Tackling the ill-conditioned system (4) is one of the important issues in RBF approximation methods. As proved in [53, Chap. 12], for polyharmonic kernels the condition number of this system grows algebraically with respect to the minimum spacing distance between interpolation points. To overcome this problem, the polyharmonic interpolation matrix can be formed and solved on scaled data points with a spacing distance of O(1). This is motivated by the classical 5-point star FD formula for Δu(0, 0) on the points (0, 0), (±h, 0), (0, ±h) [20,31,32]. In a more general case, assume that X is a set of points in a local domain D with fill distance h. Assume further that the polyharmonic approximation of Lu(x) for a fixed x ∈ D is sought, where L is a homogeneous operator with scaling order (homogeneity) s. For example, the scaling order of L = Δ is s = 2 and the scaling order of L = D^α is s = |α|. If X is blown up (scaled) to a point set X_h of average fill distance 1 and the Lagrange function values Lu_j are calculated for the blown-up situation, then the Lagrange function values of the original situation are obtained by the scaling h^{−s} Lu_j. When polynomials of degree m are appended and the monomials {x^α}_{|α| ≤ m} are used as a basis for P_m(R^d), it is recommended to shift the points by the center of D and then scale by h to benefit from the local behavior of the monomial basis functions around the origin. See [23,24,33,43] for some applications of this strategy in localized RBF methods for the numerical solution of PDEs and the rational interpolation of singular functions.
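The PHS interpolation with appended polynomials can be sketched as follows; this is a minimal Python/NumPy illustration (the paper's code is in MATLAB), using β = 3 and linear polynomials, and omitting the shift-and-scale preprocessing discussed above. It is not the authors' implementation.

```python
import numpy as np

def phs_interp(X, f, beta=3):
    """Sketch of PHS interpolation with appended linear polynomials:
    s(x) = sum_j alpha_j ||x - x_j||^beta + a_0 + a_1 x + a_2 y,
    with the moment conditions P^T alpha = 0 as in system (4)."""
    N, d = X.shape
    # kernel block K_jk = ||x_j - x_k||^beta
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    K = D**beta
    # polynomial block for the monomial basis {1, x, y}
    P = np.hstack([np.ones((N, 1)), X])
    Q = P.shape[1]
    A = np.block([[K, P], [P.T, np.zeros((Q, Q))]])
    coef = np.linalg.solve(A, np.concatenate([f, np.zeros(Q)]))
    alpha, a = coef[:N], coef[N:]
    def s(x):
        r = np.linalg.norm(X - x, axis=1)
        return alpha @ r**beta + a @ np.concatenate([[1.0], x])
    return s

# linear data is reproduced exactly thanks to the appended polynomial part
rng = np.random.default_rng(1)
X = rng.random((15, 2))
s = phs_interp(X, 2*X[:, 0] - X[:, 1] + 0.5)
```

Because the appended space P_1 is reproduced exactly, interpolating a linear function returns that function everywhere, which gives a quick sanity check of the saddle-point solve.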

Partition of unity approximation
The partition of unity (PU) approximation will be used twice in our detection algorithm. In approximation theory, the PU method is an efficient technique to obtain a sparse global approximation by joining a set of localized and well-conditioned approximations. The first combination of the partition of unity with RBF interpolation goes back to [52] and [53] by Holger Wendland (see also [24]). Since then, this approach has been extensively used for the numerical solution of partial differential equations [14-18, 40, 43, 45]. A recent application to implicit surface reconstruction is provided in [21].
In PU methods, the global domain Ω is covered by a set of open, bounded, and overlapping subdomains Ω_ℓ, ℓ = 1, 2, ..., N_c, with Ω ⊂ ∪_ℓ Ω_ℓ, and a set of PU weights is defined on this covering. The nonnegative and compactly supported functions w_ℓ with supp(w_ℓ) ⊆ Ω_ℓ form a partition of unity, i.e.,

\sum_{\ell=1}^{N_c} w_\ell(x) = 1, \quad x \in \Omega.

If we start with an overlapping covering {Ω_ℓ} of Ω and assume V_ℓ is an approximation space on Ω_ℓ and s_ℓ ∈ V_ℓ is a local approximation of a function f on Ω_ℓ, then

s(x) = \sum_{\ell=1}^{N_c} w_\ell(x) s_\ell(x)

is a global PU approximation of f on Ω. This global approximation is formed by joining the local approximants s_ℓ via the PU weights w_ℓ. A possible choice for w_ℓ is given by Shepard's weights

w_\ell(x) = \frac{\psi_\ell(x)}{\sum_{k=1}^{N_c} \psi_k(x)},   (3)

where ψ_ℓ are nonnegative, nonvanishing, and compactly supported functions on Ω_ℓ.
In the usual way, derivatives of f are approximated by derivatives of s, i.e., D^α f ≈ D^α s, or in a general form, for a linear differential operator L with constant coefficients,

Lf \approx Ls = \sum_{\ell=1}^{N_c} L(w_\ell s_\ell).

So Leibniz's rule must be applied to compute the derivatives of the products w_ℓ s_ℓ. This standard technique is somewhat complicated and requires smooth PU weight functions. In [43], an alternative approach is suggested:

D^\alpha f \approx \sum_{\ell=1}^{N_c} w_\ell \, D^\alpha s_\ell,

where D^α f is directly approximated without any detour via the approximant s of f itself. Analogously, for the general case with operator L, we may write

Lf \approx \sum_{\ell=1}^{N_c} w_\ell \, Ls_\ell.

Theoretical results show the same convergence properties for both the standard and the direct approach [43], while the second approach is simpler and allows discontinuous PU weights, leading to faster algorithms for approximating derivatives and solving partial differential equations.
The RBF-PU method is a special case in which the local approximants s_ℓ (or Ls_ℓ in the direct approach) are obtained by RBF interpolation on trial sets X_ℓ = X ∩ Ω_ℓ in local domains Ω ∩ Ω_ℓ, with global index families J_ℓ := {j ∈ {1, ..., N} : x_j ∈ X_ℓ}. In this case, we have

Lf(x) \approx \sum_{\ell=1}^{N_c} L\Big(w_\ell(x) \sum_{j \in J_\ell} u_j(\ell; x) f(x_j)\Big)

for the standard approach, and

Lf(x) \approx \sum_{\ell=1}^{N_c} w_\ell(x) \sum_{j \in J_\ell} Lu_j(\ell; x) f(x_j)   (7)

for the direct approach. Here, u_j(ℓ; x) are the Lagrange RBF functions on patch Ω_ℓ. A simple covering of Ω can be constructed via a set of overlapping balls Ω_ℓ = B(y_ℓ, ρ_ℓ), where y_ℓ ∈ R^d are patch centers and ρ_ℓ are patch radii. To have a fixed patch radius ρ_ℓ ≡ ρ, we can use the following setup for points, parameters, and domain sizes. We assume the set X has fill distance

h = h_{X,\Omega} := \sup_{x \in \Omega} \min_{x_j \in X} \|x - x_j\|_2.

The fill distance indicates how well the points in the set X fill out the domain Ω. Geometrically, h is the radius of the largest possible empty ball that can be placed among the data locations X inside Ω. A data set {y_1, ..., y_{N_c}} with spacing distance

h_{cov} = C_{cov} h

is used for the patch centers. The constant C_cov controls the number of patches, N_c, compared with the number of interpolation points in X. The radius ρ should be large enough to guarantee the inclusion Ω ⊂ ∪_ℓ B(y_ℓ, ρ), and to allow enough interpolation points in each patch for a well-defined and accurate local approximation. Thus, we assume

\rho = C_{ovlp} h_{cov},

and we let the overlap constant C_ovlp be large enough to ensure the above requirements. It is also possible to assign a variable patch size ρ_ℓ to each patch center y_ℓ. For example, we can choose ρ_ℓ in such a way that each patch Ω_ℓ contains a certain number of interpolation points. In this case, we must make sure that the inclusion property Ω ⊂ ∪_ℓ Ω_ℓ is still satisfied. In the numerical examples of Section 7, we use both the fixed and the variable patch radius strategies.
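Shepard's weights (3) can be realized with any nonnegative compactly supported bump; the sketch below uses a Wendland-type bump as an illustrative choice (the paper does not prescribe a particular ψ), and checks the defining partition-of-unity properties.

```python
import numpy as np

def wendland_c2(r):
    # compactly supported C^2 bump: psi(r) = (1 - r)^4 (4r + 1) for r < 1, else 0
    return np.where(r < 1, (1 - r)**4 * (4*r + 1), 0.0)

def shepard_weights(x, centers, rho):
    """Shepard weights w_l(x) = psi_l(x) / sum_k psi_k(x) on balls B(y_l, rho);
    nonnegative and summing to 1 wherever the balls cover x."""
    psi = wendland_c2(np.linalg.norm(centers - x, axis=1) / rho)
    return psi / psi.sum()

centers = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])
w = shepard_weights(np.array([0.4, 0.0]), centers, rho=0.6)
```

By construction, w vanishes for every patch whose ball does not contain the evaluation point, which is what makes the resulting global approximant sparse.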
Smooth Shepard weight functions are frequently used in PU approximations (see, for example, [40,45,52]). A discontinuous PU weight is also suggested in [43] that highly simplifies RBF-PU algorithms for solving partial differential equations. Assume the PU weight w_ℓ(x) takes the constant value 1 if y_ℓ is the closest center to x, and the constant value 0 otherwise. For the definition, let I_min(x) = arg min_{ℓ ∈ I(x)} ‖x − y_ℓ‖_2 and let I_min,1(x) be the first component of I_min(x), as I_min(x) may contain more than one index ℓ. The weight function is then defined by

w_\ell(x) = \begin{cases} 1, & \ell = I_{min,1}(x), \\ 0, & \text{otherwise}. \end{cases}   (8)

With this definition, we give the total weight 1 to the closest patch and null weights to the other patches. In fact, a local set X_ℓ = Ω_ℓ ∩ X is a common interpolation set for all evaluation points x_k with ‖x_k − y_ℓ‖_2 ≤ ‖x_k − y_j‖_2 for j = 1, ..., N_c, j ≠ ℓ. From another point of view, in a 2D domain, by drawing the Voronoi tiles of the centers {y_1, ..., y_{N_c}}, this means that all evaluation points in tile ℓ use the same local set X_ℓ as their interpolation set [43].
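In code, the discontinuous weight reduces to a nearest-center query; a kd-tree sketch (an assumed implementation detail, not taken from the paper) makes the Voronoi-tile interpretation concrete:

```python
import numpy as np
from scipy.spatial import cKDTree

def closest_patch(eval_pts, centers):
    """Discontinuous PU weight in effect: each evaluation point receives
    weight 1 from its closest patch center and 0 from all others, so all
    points in the same Voronoi tile share one local interpolation set."""
    _, owner = cKDTree(centers).query(eval_pts)
    return owner

centers = np.array([[0.0, 0.0], [1.0, 0.0]])
owner = closest_patch(np.array([[0.1, 0.0], [0.9, 0.2]]), centers)
```

Grouping evaluation points by `owner` then yields the index families k(ℓ) used later in the detection step.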

Principal component regression
Assume that X ∈ R^{d×n} is a data matrix containing n points (n columns) in d-dimensional space. PCR finds the best rank-k approximation to the d-dimensional data matrix X for 0 < k < d. It is known that a low-rank approximation can be computed by the singular value decomposition (SVD). Let μ_X ∈ R^{d×1} be the vector of sample means along each row of X. Subtract μ_X from each column of X (each point) and denote the resulting mean-zero data matrix by X^0. The SVD of X^0 is given by

X^0 = U \Sigma V^T,

where U ∈ R^{d×d} and V ∈ R^{n×n} are orthogonal matrices and Σ ∈ R^{d×n} is a diagonal matrix carrying the singular values σ_j of X^0 decreasingly on its diagonal. Equivalently,

X^0 = \sum_{j} \sigma_j u_j v_j^T,

where the principal components u_j and v_j are the columns of U and V, respectively. Then

X^0_k = U_k \Sigma_k V_k^T

for k < d is the closest rank-k matrix to X^0 (with respect to the 2-norm, the Frobenius norm, and any matrix norm that depends only on singular values), where U_k and V_k consist of the first k columns of U and V, respectively, and Σ_k = diag{σ_1, ..., σ_k} [51]. Clearly, the data matrix X_k obtained by adding the mean vector μ_X to all columns of X^0_k is the closest rank-k matrix to X. The columns of X_k (as points in R^d) are located on a k-dimensional affine subspace of R^d.
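The center-truncate-recenter procedure above is a few lines of NumPy; this sketch follows the paper's column-points convention:

```python
import numpy as np

def best_rank_k(X, k):
    """Closest rank-k approximation to the d x n data matrix X (columns are
    points): center by the row means mu_X, truncate the SVD, add mu_X back."""
    mu = X.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :] + mu

# points already on a line are fixed points of the rank-1 map
t = np.linspace(0, 1, 6)
X = np.vstack([1 + 2*t, 3 - t])
Xk = best_rank_k(X, 1)
```

With d = 2 and k = 1 this is exactly the operation used in the narrowing step: the output columns lie on the principal line through the point cloud.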
In this paper, we need the special case d = 2 and k = 1 for narrowing a cloud of fault points around a discontinuity curve in a two-dimensional domain Ω (see Section 5.2).

Detection algorithm
Assume that f is a piecewise smooth real-valued function with finite jump discontinuities across some curves on a two-dimensional domain Ω. If the union of such curves is denoted by F, we assume that the measure of F is zero and f is smooth in Ω \ F. Assume further that a set of scattered points X ⊂ Ω and associated function values f(x), x ∈ X, are given. The algorithm consists of three steps: (1) picking out a set of points, called fault points, as a subset of X on or close to the discontinuity curves using a partition of unity polyharmonic-based approximation, (2) narrowing the fault points using a partition of unity and PCR algorithm, and (3) reconstructing the fault curve using parametric interpolation. In the following subsections, we describe these steps.

Fault point detection
We aim to detect a small subset F of X consisting of points close to F by considering a procedure based on local kernel interpolation. The points in F are called fault points. In [12], a minimal numerical differentiation formula (MNDF) based on local multivariate polynomial reproduction is given to approximate the function derivatives. This approach is a generalized finite difference (FD) method, and the stencil weights are uniquely determined by minimizing their weighted ℓ1 or ℓ2 norm in a suitable way [19].
In this paper, we use the direct PU approximation for localization and PHS kernels for local approximations. We measure and compare the approximate gradient and Laplacian values with some threshold parameters to find a set of fault points F close to the fault curve F.
In each patch Ω_ℓ, we apply the discontinuous PU weight (8) and the PHS kernels ϕ_1(r) := r and ϕ_3(r) := r^3 to approximate the gradient and Laplacian functions, respectively. Note that ϕ_1 and ϕ_3 are conditionally positive definite of orders 1 and 2, respectively, and thus polynomials of orders at least 1 and 2 need to be appended to their corresponding RBF expansions to obtain well-defined interpolations on unisolvent sets.
According to the PU procedure, each point x_k is associated with a local set X_ℓ = Ω_ℓ ∩ X for the index ℓ ∈ {1, ..., N_c} with ℓ = I_min,1(x_k) =: ℓ(k). The difference between this approach and the RBF-FD method is that in the new method a set X_ℓ may be shared by many points x_k, while in RBF-FD each stencil X_k is associated with a unique evaluation point x_k. This special type of the direct RBF-PU (D-RBF-PU) method [43] is similar to (but not identical with) the overlapped RBF-FD method of [46]. From (7) and (8), we have

Lf(x_k) \approx \sum_{j \in J_{\ell(k)}} \xi^L_{j,k} f(x_j),

where ξ^L_{j,k} = Lu*_j(ℓ(k); x)|_{x=x_k} are the generalized (related to operator L) Lagrange function values on the set X_{ℓ(k)} = {x_j : j ∈ J_{ℓ(k)}} evaluated at x_k. In programming, we loop over patches and look for the indices of points x_k whose closest patch is the prescribed patch. This index family will be denoted by k(ℓ).
Usually, the gradient and Laplacian operators (∇ and Δ) are used in fault curve detection algorithms. An ordinary fault indicator is defined as

I_\nabla(x_k) := \|s_\nabla(x_k)\|_2,

and a gradient fault indicator is defined as

I_\Delta(x_k) := |s_\Delta(x_k)|,

where s_∇ and s_Δ are the D-RBF-PU approximations of ∇f and Δf. A large value of I_∇(x_k) shows that the smoothness of f is deficient at or near x_k, and a large value of I_Δ(x_k) indicates the same behavior for both f and its gradient. For quantification, we define

F(X, L, \delta) := \{x_k \in X : I_L(x_k) > \delta\},

where L stands for either ∇ or Δ, and δ is a proper threshold parameter. Thus, we mark F(X, ∇, δ_1) and F(X, Δ, δ_2) as points close to ordinary and gradient faults, respectively, where δ_1 and δ_2 should be set appropriately to get accurate detections. In [12], a marking method based on the median values of the computed indicators is applied. Since for functions with constant (linear) values in large areas of the domain the indicator I_∇ (I_Δ) is close to zero at points belonging to those areas, the medians fall near zero, and thus many points are indicated as fault points even if they are not close to a fault curve. To overcome this possible problem, a doubly nested marking strategy is used in [12]. Here, we still use medians but with a different strategy. Using the notations I_∇(X) and I_Δ(X) for {I_∇(x_k) : x_k ∈ X} and {I_Δ(x_k) : x_k ∈ X}, respectively, we set the thresholds δ_2 and δ_1 via the median-based rules (11) and (12), where h is the fill distance of X, and C_M, C_G, and C_L are proper constants that should be set manually by the user. In this process, we first obtain the threshold parameter δ_2 by (11) to form the set F(X, Δ, δ_2) containing points close to both ordinary and gradient discontinuities. Then the points close to ordinary faults can be extracted from this set instead of the initial large set X. Thus, the threshold parameter δ_1 is obtained by (12) and the ordinary fault points are obtained as

F_\nabla = F(F(X, \Delta, \delta_2), \nabla, \delta_1).

Obviously, the set of points close to gradient discontinuities is

F_\Delta = F(X, \Delta, \delta_2) \setminus F_\nabla.

Experiments show that F_Δ still contains some points in the neighborhoods of ordinary discontinuities.
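The nested marking can be sketched as follows. This is only an illustration of the two-stage structure: the simple median thresholds below are stand-ins of our own, not the paper's actual rules (11)-(12), which involve the constants C_M, C_G, C_L and the fill distance h.

```python
import numpy as np

def mark_fault_points(I_grad, I_lap, c2=1.0, c1=1.0):
    """Illustrative nested marking: threshold the Laplacian indicator over all
    points first, then threshold the gradient indicator only over the points
    that survived.  c1, c2 are hypothetical tuning constants."""
    delta2 = c2 * np.median(I_lap)
    F2 = I_lap > delta2                    # near ordinary or gradient faults
    delta1 = c1 * np.median(I_grad[F2])    # median over the reduced set only
    F_ord = F2 & (I_grad > delta1)         # ordinary fault points (F_nabla)
    F_grad = F2 & ~F_ord                   # gradient fault points (F_Delta)
    return F_ord, F_grad

# synthetic indicators: 90 smooth points, 5 jump points, 5 kink points
I_lap = np.concatenate([np.full(90, 0.01), np.full(10, 10.0)])
I_grad = np.concatenate([np.zeros(90), np.full(5, 5.0), np.full(5, 0.1)])
F_ord, F_grad = mark_fault_points(I_grad, I_lap)
```

Computing δ_1 on the reduced set F(X, Δ, δ_2) rather than on all of X is the point of the nesting: it avoids the near-zero medians produced by large smooth regions.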
Following [12], we modify F_Δ using a local counting criterion, where B(x, η) is the ball with center x and radius η restricted to the set F(X, Δ, δ_2), and |B| denotes the cardinality of the set B. The parameter η is set proportional to the fill distance of the initial data set such that B(x, η) contains a sufficient number of points. Algorithm 1 presents the fault point detection procedure step by step. Algorithm 2 is called in Algorithm 1, and Algorithm 3 is called in Algorithm 2.

Algorithm 1 Fault points generation.
We note that steps 1 and 2 of Algorithm 1 can be run simultaneously using a proper nearest search algorithm.

Algorithm 2
The RBF-PU subroutine with the constant-generated PU weight.

Narrowing step
In this subsection, we give a narrowing procedure to move a cloud of points approximately onto a fault curve, which may be of either ordinary or gradient type. Let us denote the set of detected fault points, a small subset of the initial set X, by F. To approximate the fault curve, the set F needs to be narrowed further.
In [12], an orthogonal distance least-squares regression is used in this step to handle cases in which the points near a parametric curve are distributed almost vertically. In fact, the least-squares approximation is used twice: first for a coordinate rotation and then for a curve fitting (see also [50]). One advantage of this method is that it can be used to obtain regressions of arbitrary order. In this paper, we use the standard PCR method to obtain a linear regression using the SVD, which seems sufficient for our algorithm. First, we consider the set F as a 2 × M matrix where each column stands for a point in R^2. Then we compute the centered data matrix F^0 = F − μ_F and its reduced SVD F^0 = UΣV^T. Here μ_F is the vector of sample means of the rows of F. Finally, we accept the best rank-1 matrix approximation F_1 = U_1 Σ_1 V_1^T + μ_F (see Fig. 1). But since the fault curves are usually nonlinear, we apply this procedure on local subdomains and blend the local approximants in a proper way to obtain a global nonlinear configuration. For this purpose, we employ a new PU approximation based on a new set of patches {Ω_ℓ}, ℓ = 1, ..., N_c. In this step, we use a constant radius ρ = C_ovlp h_cov for all patches, i.e., Ω_ℓ = B(y_ℓ, ρ) for ℓ = 1, ..., N_c. The set of patch centers {y_ℓ} is a coarsened subset of the detected fault points F, obtained by Algorithm 5 (below) with H_1 = H_2 = ρ and F̃ = F. Note that the size of the PU problem in this step is considerably smaller than that of the first PU approximation, because now we are working on a much smaller set of points around a one-dimensional fault curve.
We assume F_ℓ = F ∩ Ω_ℓ and J_ℓ = {j : x_j ∈ F_ℓ} with |J_ℓ| = n_ℓ. Now, the PCR algorithm is applied to each cloud F_ℓ for ℓ = 1, ..., N_c to get a new narrowed set F̃_ℓ. As described above, if F_ℓ and F̃_ℓ are considered as 2 × n_ℓ matrices and the SVD of F_ℓ − μ_{F_ℓ} is denoted by UΣV^T, then F̃_ℓ = U_1 Σ_1 V_1^T + μ_{F_ℓ}. Up to here, we have N_c fault sets F̃_ℓ, which are supposed to be closer than the set F to the (unknown) fault curve. Depending on the amount of overlap between patches, the cardinality of ∪_ℓ F̃_ℓ is larger than that of the original set F. This means that a fault point x_k which belongs to more than one patch, say n patches, has n different approximation points from n different sets F̃_ℓ. To obtain a unique approximation for each fault point, we apply the PU approximation on the covering {Ω_ℓ}. If we use the smooth PU weights (3), then a smooth combination of these n approximations gives a unique approximation x̃_k for x_k. We prefer to apply the discontinuous weight (8) due to its simplicity. In this case, depending on which center y_ℓ is closest to x_k, the approximation point in the corresponding set F̃_ℓ is marked as the unique solution x̃_k for x_k. Using this approach, we end up with a new set F̃ obtained by thinning the cloud of detected points F and moving the points closer to the fault curve. To narrow the set of obtained points further, we can apply the narrowing procedure once again with F replaced by F̃.

Fig. 1 Data set F = UΣV^T + μ_F (blue, filled circles) and narrowed points F_1 = U_1Σ_1V_1^T + μ_F (red circles)
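The per-patch narrowing with the discontinuous weight can be sketched as below. This is our own minimal rendering (rows are points rather than columns, and kd-trees stand in for the paper's data structures); it projects each patch's cloud onto its local principal line and keeps, for each point, the projection coming from its closest center.

```python
import numpy as np
from scipy.spatial import cKDTree

def narrow(F, centers, rho):
    """Per-patch rank-1 PCR: project the fault points in each ball B(y_l, rho)
    onto their local principal line; each point keeps the projection owned by
    its closest patch center (the discontinuous PU weight)."""
    tree = cKDTree(F)
    owner = cKDTree(centers).query(F)[1]     # closest center per fault point
    F_new = F.copy()
    for l, y in enumerate(centers):
        idx = np.array(tree.query_ball_point(y, rho))
        if idx.size < 2:
            continue
        P = F[idx]
        mu = P.mean(axis=0)
        _, _, Vt = np.linalg.svd(P - mu, full_matrices=False)
        u = Vt[0]                            # local principal direction
        proj = mu + np.outer((P - mu) @ u, u)
        mine = owner[idx] == l               # points owned by this patch
        F_new[idx[mine]] = proj[mine]
    return F_new

# noisy points along the diagonal y = x shrink toward it after narrowing
rng = np.random.default_rng(0)
t = rng.random(200)
F = np.column_stack([t, t]) + 0.02 * rng.standard_normal((200, 2))
centers = np.array([[0.15, 0.15], [0.5, 0.5], [0.85, 0.85]])
F2 = narrow(F, centers, rho=0.45)
```

Running the same routine again with `F2` as input corresponds to the optional second narrowing pass mentioned above.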
In programming, we apply this procedure by looping over patch centers rather than looping over fault points (see Algorithm 4).
Note that we obtain local regressions for each patch and move several points x_k, k ∈ k(ℓ), closer to the fault curve simultaneously.

Fault curve reconstruction
Usually, the narrowed set F̃ includes many points, giving us the opportunity to select an ordered subset of F̃ and reconstruct the fault curve using a parametric approximation method. We use a method similar to that given in [12,41]. A point z from F̃ is selected randomly, and a new ordered set F_ord is introduced which contains only the point z at the beginning but is enlarged by the following procedure.
First, we find the set of fault points F_z in the H_1-neighborhood of z, i.e.,

F_z = \{x \in \tilde F : \|x - z\|_2 \le H_1\}.

Then we obtain the direction u_z along which the variance of the points in F_z is maximized. It is well known that this direction is the first column of the U factor in the reduced SVD F^0_z = UΣV^T, where F^0_z is the mean-zero data matrix. Now, two subsets F^+_z and F^−_z of F_z are formed as

F^\pm_z = \{x \in F_z : \pm (x - z)^T u_z > 0\},

and the points

z^\pm = \arg\max_{x \in F^\pm_z} \|x - z\|_2

are added to the set F_ord. In fact, z^+ and z^− have maximum distance from z along the directions u_z and −u_z, respectively. This process is repeated from the two points z^+ and z^− in both directions until either no point is found in their neighborhoods (see Fig. 2) or the distance between one of the newly selected points and one of the previously selected points in F_ord (or in all previous sets F_ord, for cases with multiple fault curves; see Section 5.4) is less than H_1/2. Using the last condition and checking the newly found points in the last iteration, one can determine whether the fault intersects itself (see the left-hand side of Fig. 3) or whether it approaches another fault (see the right-hand side of Fig. 3); see Algorithm 5 for a more general case.
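A stripped-down, one-sided version of this ordering procedure can be sketched as follows; it is our own simplification (marching in a single direction, with none of the paper's stopping rules for self-intersections or nearby faults), meant only to show the role of the local principal direction.

```python
import numpy as np
from scipy.spatial import cKDTree

def trace(F, start, H1):
    """One-sided greedy ordering sketch: from the current end point, take the
    unvisited neighbours within distance H1, estimate their principal
    direction by an SVD, and jump to the farthest point lying ahead."""
    tree = cKDTree(F)
    order, u = [start], None
    while True:
        z = F[order[-1]]
        idx = [j for j in tree.query_ball_point(z, H1) if j not in order]
        if not idx:
            break
        P = F[idx] - z
        _, _, Vt = np.linalg.svd(P - P.mean(axis=0), full_matrices=False)
        v = Vt[0]
        if (u is None and (P @ v).sum() < 0) or (u is not None and v @ u < 0):
            v = -v                           # keep a consistent marching sense
        ahead = P @ v
        if not (ahead > 0).any():
            break
        order.append(idx[int(np.argmax(np.where(ahead > 0, ahead, -np.inf)))])
        u = v
    return order

# points along a straight segment are visited left to right
F = np.column_stack([np.arange(9) * 0.1, np.zeros(9)])
order = trace(F, 0, 0.25)
```

Picking the farthest point "ahead" of the principal direction is what produces a coarsened, monotone ordering rather than a revisit of every cloud point.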
Finally, we end up with a sequence of ordered points allowing us to reconstruct the fault curve using a parametric approximation method. We will apply parametric cubic spline interpolation for this purpose.

Cases with multiple fault curves
There may be more than one fault (of either ordinary or gradient type) in the domain, so we continue the above algorithm to find the other faults as follows. Consider a new set that includes all points from F̃ whose distance to the points in F_ord is greater than a parameter H_2. If this new set is empty, the algorithm is finished and there is no other fault. Otherwise, we select a new random point z from this new set and repeat the previous procedure to obtain a new ordered set F_ord for the next fault (see the right-hand side of Fig. 3). Again, parametric spline interpolation is applied to reconstruct the new curve. This procedure is repeated until no point in F̃ has distance greater than H_2 to the points of all previous sets F_ord.

Cases with intersections
There may be faults that intersect each other. Suppose that the described algorithm yields m different sequences of ordered points corresponding to m different faults. By construction, the sequences do not share common points on different faults; thus, the reconstructed curves may not intersect even if their corresponding exact faults do. To resolve this problem, we apply the following procedure, similar to that given in [12]:
1. The head and the tail of each fault are reconstructed by linear interpolation (line segments) based on the first two and the last two points, respectively (see the upper panels of Fig. 4).
2.-3. If the distance between a head/tail segment and the line segment between two consecutive points z_p and z_n of another sequence is less than a prescribed threshold H_4, we mark the intersection of (the extension of) the former and the latter as an intersection point (see the upper panels of Fig. 4).
4. Intersection points whose mutual distances are less than a prescribed threshold H_5 are replaced by their average (see the lower panels of Fig. 4).
5. The ordered sequences are updated either by adding one or two intersection points or by joining to another sequence and an intersection point.
In Algorithm 6, by [[x, y]], we mean a line segment with x and y as its end points.

Parameter selection and the main algorithm
The parameter H_1 in Algorithm 5 is the approximate distance between two consecutive fault points upon which the parametric interpolation is built. For an accurate reconstruction, H_1 must be selected small enough; on the other hand, it should be large enough to ensure that, if the narrowed points on a fault are sparsely spaced, the fault does not split into two or more pieces. Usually, H_1 is set proportional to the fill distance of the data set. Note that all parameters H_k, k = 1, ..., 5, are explicitly related to the approximate fill distance h of the initial data set X via the constant C_H. In our experiments, we use C_H = 6. The fill distance h is approximated by h = 1/√N, where N is the number of initial points in X.
To approximate s_∇ and s_Δ for the indicators, in the first PU algorithm we use a variable patch radius ρ_ℓ for each patch center y_ℓ. Using a nearest-neighbor search algorithm, we select the n = 12 points nearest to y_ℓ and then set the radius ρ_ℓ as the maximum distance between y_ℓ and those 12 surrounding points. In the examples, Y = {y_1, ..., y_{N_c}} is assumed to be a grid set in the domain Ω with spacing distance h_cov = C_cov h = 2.5h. This parameter selection guarantees the inclusion Ω ⊂ ∪_ℓ Ω_ℓ for both the random and Halton point sets in our numerical examples. Moreover, for the second PU algorithm in the narrowing step, we use the constant radius ρ = C_ovlp h_cov with C_ovlp = 1.5. We also set C_M = 1/4, C_G = 1, C_L = 1/2, and η = 4h in all examples unless specified otherwise. Finally, the main algorithm of the fault detection method can be written as follows.

Experimental results
In this section, the results of some experiments are given. The efficiency of the method is confirmed by testing it on various kinds of problems: problems with multiple faults of the same type (ordinary or gradient) and problems with faults of different types, with or without intersections. The initial set X is assumed to be a sequence of N random points with uniform distribution on a square domain Ω ⊂ R^2. We use N = 10000 random points (Fig. 5, left) in our test examples unless otherwise stated; in some examples, Halton points or a set of varying-density random points are used instead. The constant-generated weight function (8) is applied in both PU subroutines.
For the final curve reconstruction, we use the csaps function of MATLAB with smoothing parameter p = 0.9999 to obtain a cubic smoothing spline on the ordered fault points. The case p = 1 works as well, but we choose a slightly smaller value to obtain smoother fits. In Example 7.7, we use p = 1 to better capture the high curvatures of the solution at the fins and flukes of the dolphin.
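Outside MATLAB, the same parametric fit can be sketched with SciPy's smoothing splines; `splprep` below is a stand-in for csaps (its smoothing parameter `s` bounds the residual sum rather than playing the role of csaps' p, so the two are not interchangeable).

```python
import numpy as np
from scipy.interpolate import splprep, splev

def reconstruct_curve(P, smooth=1e-4, n_eval=200):
    """Parametric cubic smoothing spline through the ordered fault points
    (rows of P), evaluated at n_eval parameter values in [0, 1]."""
    tck, _ = splprep([P[:, 0], P[:, 1]], s=smooth, k=3)
    x, y = splev(np.linspace(0, 1, n_eval), tck)
    return np.column_stack([x, y])

# ordered points on the parabola y = x^2 are recovered closely
t = np.linspace(0, 1, 20)
C = reconstruct_curve(np.column_stack([t, t**2]))
```

Because the fit is parametric in a chord-length parameter, it handles vertical and closed fault configurations that a function-form fit y = g(x) could not.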
We suppose that the type of faults (ordinary or gradient) is not known in advance. Thus, both gradient and Laplace indicators are used by default for all examples.
In Example 7.1, we compute the root mean distance of the detected points from a fine set of points on the real fault, to measure the closeness of the detected points to the exact fault or to measure the error of the final fault reconstruction. Assume that Z is a set of m points on the exact fault Γ and F is the set of detected points around Γ. We define the root mean distance dist(F, Z) by [12]

\mathrm{dist}(F, Z) = \sqrt{\frac{1}{|F|} \sum_{x \in F} \min_{z \in Z} \|x - z\|_2^2},

where |F| stands for the cardinality of F. In the case of multiple faults, we measure the error for each fault separately. In the experiments, we assume that Z is a set of m = 500 points on each individual fault. All algorithms are implemented in MATLAB and executed on a machine with an Intel Core i7 processor, 4.00 GHz, and 16 GB RAM. The code is freely available on GitHub at https://github.com/ddmirzaei/FaultDetection to facilitate reproduction of the examples presented in this section. The scripts in the GitHub repository correspond to the pseudocodes in the paper, and the remaining subroutines in the repository are called within these MATLAB functions.
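A short helper makes this error measure concrete; here we assume the root-mean-squared form of the nearest-point distances (a sketch, not the paper's MATLAB code).

```python
import numpy as np
from scipy.spatial import cKDTree

def root_mean_distance(F, Z):
    """dist(F, Z): RMS distance from each detected point in F to its nearest
    point in the fine discretization Z of the exact fault."""
    d, _ = cKDTree(Z).query(F)
    return float(np.sqrt(np.mean(d**2)))

# detected points offset by 0.1 from a fault along the x-axis
Z = np.column_stack([np.linspace(0, 1, 500), np.zeros(500)])
F = np.column_stack([np.linspace(0.1, 0.9, 50), np.full(50, 0.1)])
err = root_mean_distance(F, Z)
```

Since m = 500 sample points make the discretization of Γ much finer than the offset being measured, the discretization bias of this estimate is negligible here.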

Example 1
Consider the test function f from [12]. The upper-right panel of Fig. 6 shows the cloud of fault points detected by the algorithm out of N = 10000 random points in Ω. The lower-left panel shows the narrowed points and the ordered points. Finally, in the lower-right panel, the exact and reconstructed curves using smooth spline interpolation are depicted. We also test the algorithm on variable-density random points (Fig. 5, right). The detected fault points and the final reconstructed curves are shown in Fig. 7. The algorithm handles this case well because the patch sizes are selected automatically; the only difference is that more patch centers are used in the part of the region with higher point density.
To verify the accuracy of the method, we first discretize the exact ordinary and gradient faults by Z, a set of m = 500 equidistant points. Then we compute dist(F, Z), where F is the set of primarily detected fault points around the fault curve; dist(F, Z) measures the closeness of the detected points to the exact fault. We also take Z′ to be a set of m = 500 equidistant points on the final reconstructed curve (after narrowing and curve fitting) and measure the root mean distance dist(Z′, Z) to estimate the error of the reconstruction. Results are given in Table 1 for both ordinary and gradient faults for N = 5000, 10000, 20000 random points with uniform distribution on [0, 1]^2. The results are comparable with [12] and seem better than those of the grid-based algorithm of [11].
Finally, we note that the errors for N = 10000 varying-density points (Fig. 5, right) are dist(F, Z) = 5.2e−3 and dist(Z′, Z) = 3.5e−3 for the ordinary fault, and dist(F, Z) = 2.6e−3 and dist(Z′, Z) = 1.9e−3 for the gradient fault.

Table 1 The root mean distance dist(F, Z) between the detected fault points F and the points Z on the exact fault, and the distance dist(Z′, Z), where Z′ is a set of fine points on the reconstructed curve

Example 2
Consider the test function f from [11], defined for (x, y) ∈ [0, 1]². This function has two ordinary faults that intersect each other at a point inside the domain (see the upper-left panel of Fig. 8); both faults admit exact closed-form representations. The upper-right panel of Fig. 8 shows the clouds of fault points detected by the algorithm. The lower-left panel shows the narrowed and the ordered points. Finally, in the lower-right panel, the exact and reconstructed curves, using smooth spline interpolation, are plotted.

Example 3
Consider a test function defined for (x, y) ∈ [0, 1]². As shown in the upper-left panel of Fig. 9, this function is discontinuous across six curves. The upper-right panel of Fig. 9 shows the clouds of detected fault points.

Example 4
In this example, we consider the function [12]

f(x, y) = |x − 0.2 − 0.1 sin(2π(y − 0.13))| + |x² + y² − 0.5| − |(x − 1)² + y² − 0.36|

on [0, 1]² (see Fig. 10). This function has three gradient faults, one of which intersects the other two. The algorithm detects all fault curves and handles the intersections. For this example, we use the value C_M = 1 instead of 1/4. The primarily detected points, the narrowed cloud, the ordered points, and the exact and reconstructed curves are all depicted in Fig. 10.
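For reference, this test function is simple to evaluate directly; the sketch below (Python with NumPy) also illustrates why the faults are gradient faults: each absolute-value term is continuous, but its derivative jumps across the corresponding zero curve.

```python
import numpy as np

def f(x, y):
    """Example 4 test function: continuous everywhere, with gradient jumps
    across the zero sets of the three absolute-value arguments."""
    return (np.abs(x - 0.2 - 0.1 * np.sin(2 * np.pi * (y - 0.13)))
            + np.abs(x**2 + y**2 - 0.5)
            - np.abs((x - 1)**2 + y**2 - 0.36))
```

The three gradient faults are the curves x = 0.2 + 0.1 sin(2π(y − 0.13)), x² + y² = 0.5, and (x − 1)² + y² = 0.36, restricted to [0, 1]².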

Example 5
Consider the test function f from [12], defined for (x, y) ∈ [0, 1]² (see Fig. 11). This function has a gradient fault and an ordinary fault that intersect each other at a small angle. In Fig. 11, the primarily detected clouds, the narrowed and ordered points, and the exact and reconstructed curves are shown.

Example 6
In this example, we have a tangential intersection. Consider the test function f from [12], defined for (x, y) ∈ [0, 1]² (see Fig. 12). This function has two gradient faults which are tangent to each other at the middle of the square. The upper-right panel of Fig. 12 shows the clouds of detected fault points.

Example 7
As a toy example, using a closed parametric curve C(t) = x(t)i + y(t)j, t ∈ [0, 2π), which represents the planar shape of a dolphin, we construct a test function on [0, 1]² that is discontinuous on C. We use N = 30000 Halton points as the initial set X. Results are given in Fig. 13, where the narrowed fault points and the exact and reconstructed curves are illustrated.
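The same kind of construction can be sketched in Python for a generic closed curve; here an ellipse stands in for the dolphin outline (whose parametrization is not reproduced in this section), and the even-odd ray-casting test is a hypothetical helper, not part of the paper's code:

```python
import numpy as np

def inside_closed_curve(px, py, cx, cy):
    """Even-odd (ray casting) test: is (px, py) inside the closed polygon
    with vertices (cx[i], cy[i])?"""
    inside = False
    n = len(cx)
    for i in range(n):
        j = (i - 1) % n
        # edge (i, j) crosses the horizontal line through py
        if (cy[i] > py) != (cy[j] > py):
            x_cross = cx[i] + (py - cy[i]) * (cx[j] - cx[i]) / (cy[j] - cy[i])
            if px < x_cross:
                inside = not inside
    return inside

# closed parametric curve C(t); an ellipse stands in for the dolphin shape
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
cx = 0.5 + 0.3 * np.cos(t)
cy = 0.5 + 0.2 * np.sin(t)

def f(x, y):
    """Two different smooth branches inside and outside C, hence f is
    discontinuous exactly on C."""
    return 1.0 + x * y if inside_closed_curve(x, y, cx, cy) else np.sin(x + y)
```

Evaluating f on a scattered point set X then gives data with an ordinary fault along C, as in the dolphin example.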

An application for solving conservation laws
In this section, an application of the presented fault detection method is given in the process of solving conservation law equations via weighted essentially non-oscillatory (WENO) finite volume methods (FVM). A brief summary of the solution of conservation law equations by the WENO FVM is outlined here; the reader is referred to [1,10,25,30,42,47,54] for a complete explanation.

Spatial discretization
Consider the following scalar conservation law

∂u/∂t + ∇ · F(u) = 0 in Ω × I,  u(0, x) = u₀(x),  (17)

with flux function F. Integrating (17) over each triangle T ∈ T at time t ∈ I, the balance equation

d/dt ū_T(t) = −(1/|T|) ∫_{∂T} F(u(t, x)) · n dx  (18)

is obtained, where ū_T(t) = (1/|T|) ∫_T u(t, x) dx is the cell average value of u on triangle T at time t and |T| is the area of T. Here, the boundary of T is denoted by ∂T and consists of the union of Γ_j for j = 1, 2, 3, where the Γ_j are the edges of triangle T with unit outward normal vectors n_j. Thus (18) can be written as

d/dt ū_T(t) = −(1/|T|) ∑_{j=1}^{3} ∫_{Γ_j} F(u(t, x)) · n_j dx.

The line integrals in the above equation can be approximated by an N_G-point Gaussian integration formula as

∫_{Γ_j} F(u(t, x)) · n_j dx ≈ |Γ_j| ∑_{k=1}^{N_G} ω_k^{(j)} F(u(t, x_k^{(j)})) · n_j,

where the ω_k^{(j)} are Gaussian weights and the x_k^{(j)} are Gaussian points on edge Γ_j of triangle T. Therefore, the cell average values {ū_T(t)}_{T∈T} can be updated via the system of ODEs

d/dt ū_T(t) = L_T({ū_T(t)}_{T∈T}), T ∈ T,  (20)

where L_T collects the quadrature sums of numerical fluxes over the three edges of T. Here the exact flux F(u) · n_j at a Gaussian point x^{(j)} is replaced by the Lax–Friedrichs numerical flux

H(u_in, u_out, n_j) := (1/2) [(F(u_in) + F(u_out)) · n_j − σ (u_out − u_in)],  (21)

where u_in(·, x^{(j)}) is the approximate solution at the Gaussian point x^{(j)} from triangle T itself and u_out(·, x^{(j)}) is the approximate solution at the same point but from the adjacent triangle which shares Γ_j as a common edge with T. The coefficient σ is obtained by

σ = max |F′(u) · n_j|,

where the maximum is taken over the relevant range of u. It is necessary to reconstruct u_in and u_out from the current cell average values {ū_T(t)}_{T∈T}. There are different methods for reconstruction that will be explained later.
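A minimal Python sketch of the scalar Lax–Friedrichs flux; the function names and the sampled wave-speed maximization are our own simplifications, with the flux F and its derivative F′ supplied by the caller:

```python
import numpy as np

def lax_friedrichs_flux(u_in, u_out, n, F, dF, u_range):
    """Scalar Lax-Friedrichs numerical flux at one Gaussian point.

    u_in, u_out : states from the triangle and its neighbor
    n           : unit outward normal, shape (2,)
    F           : flux function, F(u) -> length-2 vector
    dF          : derivative F'(u) -> length-2 vector
    u_range     : sample of u values over which the wave speed is maximized
    """
    # sigma = max |F'(u) . n| over the sampled states
    sigma = max(abs(np.dot(dF(u), n)) for u in u_range)
    central = 0.5 * np.dot(F(u_in) + F(u_out), n)
    return central - 0.5 * sigma * (u_out - u_in)
```

When u_in = u_out = u the flux reduces to F(u) · n (consistency); the σ term adds dissipation proportional to the jump across the edge.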

Time discretization
For hyperbolic equations, an ODE solver which maintains the stability of the problem and avoids oscillations should be employed. Here we use the strong stability preserving Runge–Kutta (SSPRK) method of order 3 [7,26,48]. Consider the time-dependent system (20). For moving from the data {ū_T(t_n)}_{T∈T} at time t_n to the data {ū_T(t_{n+1})}_{T∈T} with time step Δt, the SSPRK3 method consists of the three stages

u^(1) = u^n + Δt L(u^n),
u^(2) = (3/4) u^n + (1/4) u^(1) + (1/4) Δt L(u^(1)),
u^{n+1} = (1/3) u^n + (2/3) u^(2) + (2/3) Δt L(u^(2)).

By applying the CFL condition, Δt is restricted as

Δt ≤ C_CFL min_{T∈T} (r_T / η_T^max),

where r_T is the radius of the incircle of triangle T and η_T^max = max |F′(u) · n|, where the maximum is taken over all Gaussian points on the edges of triangle T.
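The three stages above can be sketched compactly in Python, with u the vector of cell averages and L the spatial operator of (20) (the names are ours):

```python
def ssprk3_step(u, dt, L):
    """One step of the third-order strong stability preserving Runge-Kutta
    (Shu-Osher) method.

    u  : vector of cell averages {u_T(t_n)}
    dt : time step, chosen to satisfy the CFL restriction
    L  : spatial operator, L(u) approximates du/dt
    """
    u1 = u + dt * L(u)                                  # stage 1
    u2 = 0.75 * u + 0.25 * u1 + 0.25 * dt * L(u1)       # stage 2
    return u / 3.0 + 2.0 / 3.0 * u2 + 2.0 / 3.0 * dt * L(u2)
```

For the scalar linear test L(u) = λu, one step reproduces the third-order Taylor polynomial of e^{λΔt}, which confirms the order of the stage combination.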

Reconstruction step
As discussed before, the approximate values u_in and u_out in (21) should be reconstructed from the cell average values {ū_T(t)}_{T∈T} in each time step. An efficient particle reconstruction scheme based on polyharmonic spline interpolation is described in [32]. For some other reconstruction methods, see, for example, [1,42].
Assume that {x_T^c}_{T∈T} is the set of barycenters of the triangles in T. For each reference triangle T ∈ T, consider a stencil S = {T_1, . . . , T_n} ⊂ T with T ∈ S. In each triangle R ∈ S, the cell average value ū_R(t) is considered as an approximation for u at x_R^c at time t. So we are looking for a function s that interpolates u from the given values u(x_R^c, t) ≈ ū_R(t) for R ∈ S, i.e.,

s(x_R^c) = ū_R(t), R ∈ S.

In polyharmonic spline interpolation, s is written as a linear combination of shifts of the polyharmonic kernel plus a low-degree polynomial, and by imposing the interpolation conditions, the same system as (2) results, with f replaced by the vector of cell average values on the stencil S. In order to avoid nonphysical oscillations in the solution, the WENO reconstruction is frequently used in the literature [25,30,38,54]. In WENO schemes, a weighted average of reconstructions from a set of stencils {S_k}_{k=1}^K, with T ∈ S_k for all k = 1, . . . , K, is used. The weights are chosen in such a way that the oscillations are minimized. We use a WENO reconstruction with an oscillation indicator based on the native space norm of the underlying polyharmonic kernel, which is fully described in [32]. Details are omitted here; the reader is referred to the original sources.
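A self-contained Python sketch of the local polyharmonic spline interpolant (thin plate spline kernel φ(r) = r² log r with a linear polynomial part) fitted to cell average values at stencil barycenters; the system assembled here plays the role of system (2), with helper names of our own choosing:

```python
import numpy as np

def tps(r):
    """Polyharmonic (thin plate spline) kernel phi(r) = r^2 log r,
    with phi(0) = 0."""
    safe = np.where(r > 0.0, r, 1.0)
    return np.where(r > 0.0, r**2 * np.log(safe), 0.0)

def fit_polyharmonic(centers, values):
    """Solve for s(x) = sum_R c_R phi(|x - x_R^c|) + a0 + a1 x + a2 y,
    imposing interpolation plus the usual moment (side) conditions."""
    X = np.asarray(centers, dtype=float)   # stencil barycenters, (n, 2)
    f = np.asarray(values, dtype=float)    # cell averages on the stencil
    n = len(X)
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    A = tps(r)                             # kernel matrix, (n, n)
    P = np.hstack([np.ones((n, 1)), X])    # linear polynomial part, (n, 3)
    M = np.block([[A, P], [P.T, np.zeros((3, 3))]])
    rhs = np.concatenate([f, np.zeros(3)])
    return X, np.linalg.solve(M, rhs)

def evaluate(model, x):
    """Evaluate the fitted interpolant s at a point x."""
    X, coef = model
    n = len(X)
    r = np.linalg.norm(x - X, axis=1)
    return tps(r) @ coef[:n] + coef[n] + coef[n + 1] * x[0] + coef[n + 2] * x[1]
```

Because the polynomial part is linear, the interpolant reproduces linear data exactly, which is the minimal consistency one expects from the reconstruction.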

Combination with the fault detection algorithm
In a WENO reconstruction, the process of selecting the stencils {S_k}_{k=1}^K for a triangle T ∈ T is an important part [22,25,29,38]. In polyharmonic kernel reconstruction, the size of each stencil S_k (the number of triangles in S_k) should not be less than Q, the dimension of the polynomial space. Three types of stencils for each triangle T are introduced in [1]: centered, forward sector, and backward sector stencils. Three stencils of size 7, one of each type, are displayed in Fig. 14.
It was concluded in [1] that in WENO polyharmonic spline reconstruction with φ = ‖·‖₂² log(‖·‖₂), the use of 7 stencils (1 centered, 3 forward, and 3 backward), all of size 4, is sufficient for smooth solutions, while for solutions with a discontinuity or a steep gradient at least 7 stencils (1 centered, 3 forward, and 3 backward), all of size 7, are required. For smooth solutions, we can even dispense with WENO and use a simple central stencil approximation instead. Usually, the solution of a conservation law problem has a steep gradient or becomes discontinuous on a curve or a small subregion of the global domain Ω. Thus, it is reasonable to detect such fault curves or regions in advance and use a higher number of stencils, or larger stencil sizes, in those regions only.
To follow this strategy, at each time step, X = {x_T^c}_{T∈T} is considered as a set of scattered points in Ω and {ū_T(t)}_{T∈T} as approximate solution values at these points. Then, using the proposed fault detection algorithm with a Laplace indicator, a set of fault barycenters F = F(X, Δ, δ₂) is detected, where δ₂ is defined in (11). As discussed before, F(X, Δ, δ₂) contains points close to both ordinary and gradient discontinuities. A triangle whose barycenter belongs to F is marked as a fault triangle. We use 7 stencils of size 7 for a fault triangle and a single central stencil of size 7 otherwise. This means that the WENO reconstruction is applied only on those parts of the domain where the solution has a steep gradient or is about to become discontinuous; the other parts are treated with a simple central stencil reconstruction. The algorithm of this hybrid method is as follows.
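The marking logic can be sketched as follows (plain Python; the interface is ours, with the fault detection step abstracted away as the already-computed set F):

```python
def mark_fault_triangles(barycenters, F):
    """A triangle is a fault triangle iff its barycenter was returned by
    the fault detection step, i.e., belongs to F."""
    fault = set(map(tuple, F))
    return [tuple(b) in fault for b in barycenters]

def stencils_per_triangle(marked):
    # fault triangle: 7 stencils (1 centered, 3 forward, 3 backward),
    # each of size 7; otherwise a single central stencil of size 7
    return [7 if is_fault else 1 for is_fault in marked]
```

In practice one would match barycenters by triangle index rather than by floating-point coordinate equality; the tuple lookup here is only for illustration.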
Numerical experiments show that this approach leads to a speedup of around 30% while retaining the accuracy of the solution compared to the original WENO reconstruction (see the next section for an example).

A numerical example
Consider the nonlinear Burgers' equation with periodic boundary conditions, where the initial condition is determined by r₀ = 0.15 and c₀ = (−0.2, −0.2) and is shown in Fig. 15. The solution develops a very steep gradient as time advances.
The MATLAB code of this part is also freely available in the GitHub repository https://github.com/ddmirzaei/FaultDetection Application to facilitate the reproduction of the results. All constants and parameters of the fault detection algorithm are set as before, except the parameter C_M, which is set to 1 in this experiment. The previous value C_M = 1/4 also works, but C_M = 1 slightly reduces the number of fault triangles. Since the exact solution is not available, Table 2 reports the differences between the solution of the full WENO reconstruction and the solution of the hybrid method for mesh sizes h_T ∈ {1/16, 1/32, 1/64, 1/128}. We observe a good agreement between the solutions of both methods, which means that the hybrid method retains the accuracy of the full WENO reconstruction. The next two columns of the table contain the total number of triangles and the average number of fault triangles over all time steps. For this example, and for the mesh sizes h_T = 1/64, 1/128, we observe that 4-5% of all triangles are detected as fault triangles. The total run times are given in the last two columns; we observe a speedup of about 30% with the new hybrid method.
For a better illustration, numerical solutions using the hybrid method at time levels t = 0, t = 0.3, t = 0.5, and t = 1 are shown in Fig. 15, and the fault barycenters at time t = 1 are shown in Fig. 16.

Conclusion
A localized scattered data approximation method, based on polyharmonic spline interpolation in combination with the partition of unity method, was proposed to define gradient-based and Laplacian-based indicators for detecting a cloud of fault points on or close to the discontinuities of a bivariate function. The polyharmonic interpolation was done on scaled data points to prevent instability of the kernel matrices. To obtain an accurate reconstruction of the fault, a localized principal component regression was applied to generate a second set of points lying closer to the fault curve than the primarily detected set. Then an ordered subset of these narrowed points was extracted and a smooth parametric spline interpolation was employed to reconstruct the fault curve. Situations with multiple fault curves and special cases with intersections and multi-branch configurations were addressed. Finally, an application to solving conservation law PDEs was given.
Applications to other areas, such as image processing and geosciences, and the generalization to trivariate functions are left for future study.