Jumping with variably scaled discontinuous kernels (VSDKs)

In this paper we address the problem of approximating functions with discontinuities via kernel-based methods. The main result is the construction of discontinuous kernel-based basis functions. The linear spaces spanned by these discontinuous kernels lead to a very flexible tool which sensibly or completely reduces the well-known Gibbs phenomenon in reconstructing functions with jumps. For the new basis we provide error bounds and numerical results that support our claims. The method is also effectively tested for approximating satellite images.


Introduction
Radial Basis Function (RBF) methods (refer e.g. to [13,14,29,32]) have become one of the most popular tools for solving multidimensional scattered data problems. Thanks to their independence from the mesh and to their easy implementation, they apply in a variety of fields, such as population dynamics, machine (deep) learning, solution of PDEs and image registration. Even if such meshfree approaches have been extensively studied in the recent years, especially focusing on the efficiency and stability of the interpolant (cf. e.g. [3,4,10,16]) not much effort has been addressed to construct robust approximants for functions with jumps. Indeed, infinitely smooth RBFs, such as Gaussians and Multiquadrics, theoretically show spectral accuracy, which is no longer preserved when interpolating functions with discontinuities. This fact, first observed in the context of truncated Fourier expansions and later used to characterize nonphysical oscillations in approximating discontinuous functions, is known as Gibbs phenomenon.
To mitigate this effect for kernel-based approximation, one can use linear RBFs (see e.g. [15] for a general overview). This has been done in [20], where the Multiquadric has been replaced by the linear spline in regions around discontinuities. Alternatively, post-processing techniques, such as Gegenbauer reconstruction procedure [18] or digital total variation [28], are well-established tools. Finally, we point out that also the so-called Variably Scaled Kernels (VSKs) [1] are truly performing when reconstructing functions with gradient discontinuities, as shown also in [27].
Based on the last mentioned paper, and on the considerations about the usage of discontinuous bases for approximation purposes discussed in [30], here we propose a novel method which uses discontinuous kernels. The associated basis, constructed by means of what we call Variably Scaled Discontinuous Kernels (VSDKs), enables us to naturally reconstruct jump discontinuities (even with the family of Gaussians). The only drawback of the procedure lies in the fact that the algorithm needs to know where the discontinuities occur. To this aim, we recall that in [21] a one-dimensional edge detection method based on RBFs has been constructed and subsequently extended in [22] to a multidimensional framework by considering one-dimensional slices. Moreover, very recently in [26] an effective edge detector that analyzes the behaviour of the coefficients of the RBF approximant has been proposed (a similar idea was previously investigated in [20]). In that paper, a kernel-based discontinuos interpolant is empirically applied in the univariate setting. In this work, for edge detection we consider widely used schemes, such as Canny or Sobel edge detection (cf. [2,31]).
After providing a theoretical analysis of the scheme in the one dimensional case, we extend the idea to higher dimensions and provide very general error bounds in terms of the well-known power function. Extensive numerical experiments are devoted to show the effectiveness of the method. We also provide the Matlab software, freely available for the scientific community at http://www.math.unipd.it/~demarchi/RBF/CAARBF.html that can be used to reproduce the tests presented in the paper. To conclude, we also investigate an application to the reconstruction of satellite images, where we deal with edges of irregular shapes; refer e.g. to [24].
The paper is organized as follows. In Sect. 2, we briefly review the main theoretical aspects of kernel-based approximation methods and introduce the VSKs. Section 3 presents our method for constructing VSDKs and in Sect. 4 we provide extensive numerical experiments. Applications to real world data are reported in Sect. 5. Conclusions and future works are discussed in Sect. 6.

Kernel-based approximation methods
Let X = {x i , i = 1, . . . , N } ⊂ Ω be a set of distinct data points (data sites or nodes) arbitrarily distributed on a domain Ω ⊆ R d and let F = { f i = f (x i ), i = 1, . . . , N } be an associated set of data values (measurements or function values) obtained by sampling some (unknown) function f : Ω −→ R at the nodes x i . We can reconstruct f by interpolation, that is by finding a function R : Ω −→ R satisfying the conditions (2.1) The interpolation problem (2.1) has a unique solution if R ∈ span{Φ ε (·, is a strictly positive definite and symmetric kernel and ε > 0 is the so-called shape parameter. The resulting kernel-based interpolant, denoted by R ε,X , assumes the form We can write (2.1) by using the matrix Then, letting f = ( f 1 , . . . , f N ) the vector of data values, we can find the coefficients c = (c 1 , . . . , c N ) by solving the linear system A ε c = f . Since we consider strictly positive definite and symmetric kernels, the existence and uniqueness of the solution of the linear system is ensured. More precisely, we are interested in the class of strictly positive definite and symmetric radial kernels Φ ε defined as follows.
From (2.3) it follows that if Φ ε is radial, then it is completely identified by the function ϕ ε and we can indifferently use Φ ε or ϕ ε for denoting the interpolant in (2.2).
To Φ ε we associate a real pre-Hilbert space equipped with the bilinear form (·, ·) H Φε (Ω) . We then define the native space for details see the monographs [14,32]). The accuracy of the interpolation process is usually expressed in terms of the power function. Let A ε (X ) be the interpolation matrix related to the set of nodes X and to the kernel Φ ε . Also let A ε (Y ) be the matrix associated to the augmented set Y := {x} ∪ X , x ∈ Ω. The power function is a positive function obtained by the ratio of two determinants (cf. [5,11]) The following pointwise error bound, that uses the power function and the norm of f in the native space, holds (see e.g. [14, Th. 14.2, p.117]).
Note that this theorem bounds the pointwise error in terms of the power function which depends on the kernel and on the data points but is independent of the function values.

From RBF to VSK interpolation
As well-known, the choice of the shape parameter ε is a crucial computational issue for RBF interpolation. If it is not properly chosen we might see instability effects.
To overcome such problems for the Gaussian kernel, a solution is provided by the so-called RBF-QR method which is truly effective (see e.g. [17,23]). An alternative, which can be applied to any kernel, is the use of Variably Scaled Kernels (or VSKs), introduced in [1]. We also notice, that VSK have been successfully used to reconstruct functions with gradient discontinuities in [27].
Starting from this idea, we analyse the case of jump discontinuities and we introduce a new family of kernels that we call Variably Scaled Discontinuous Kernels or simply VSDKs.
Considering VSKs, the classical tuning strategy of finding the optimal shape parameter might be substituted by the choice of a scale function which plays the role of a density function. More precisely (cf. [1, Def. 2.1]): Definition 2.2 Letting I ⊆ (0, +∞) and Φ ε a positive definite radial kernel on Ω×I depending on the shape parameter ε > 0. Given a scale function ψ : Defining then the map Ψ (x) = (x, ψ(x)) on Ω, the interpolant on the set of nodes Ψ (X ) := {(x k , ψ(x k )), x k ∈ X } with fixed shape parameter ε = 1 takes the form From now on we take ε = 1, as in [1, Def. 2.1]. Hence, if otherwise state, we omit to indicate the shape parameter.
By analogy with the interpolant in (2.2), the vector of coefficients c = (c 1 , . . . , c N ) in (2.6) is determined by solving the linear system Once we have the interpolant R Ψ (X ) on Ω × I , we can project back on Ω the points (x, ψ(x)) ∈ Ω × I . In this way, we obtain a VSK interpolant V ψ on Ω that is, using (2.5), The error and stability analysis of this varying scale process on Ω coincides with the analysis of a fixed scale kernel on Ω × I . For details and analysis of these continuous scale functions, we refer the reader to [1].

Variably scaled discontinuous kernels
We introduce the VSDKs by considering the one dimensional case and observing that the extension to the multidimensional case is almost straightforward, as we will show later in Sect. 3.2.
Let Ω = (a, b) ⊂ R be an open interval and let ξ ∈ Ω. We consider the discontinuous function f : where f 1 , f 2 are real valued smooth functions such that lim exist finite and Our aim consists in approximating the function f on the set of nodes X ⊂ Ω. Unfortunately the presence of jumps is the cause of oscillations in the reconstructing process.
To approximate f on X we take interpolants of the form (2.7) and we consider discontinuous scale functions in the interpolation process.
Let α, β ∈ R, α = β and S = {α, β}. We propose the following scale function ψ : Ω −→ S defined as: The function ψ is piecewise constant, having a jump discontinuity at ξ as the function f . Let Φ ε be a positive definite radial kernel on Ω × S , possibly depending on a shape parameter ε > 0 or alternatively a variably scaled kernel Φ ψ on Ω as in (2.5). We now analyze the function ϕ related to the kernel Φ with the fixed shape parameter ε = 1, that is This implies Therefore, the interpolant V ψ is a linear combination of functions having a discontinuity at ξ . We can easily generalize this procedure for a set of distinct discontinuity points on Ω; see the next section.

VSDKs: one dimensional case
To generalize the discussion carried out above, we need the following definition.
is odd.
With this choice of the scale function ψ and similarly to (2.5), we call the kernel Φ ψ a VSDK on Ω.
For the analysis of the VSDKs introduced in Definition 3.1 we cannot rely on some important and well-known results of RBF interpolation. Therefore, before stating upper bounds for the VSDK interpolants in terms of the power function, we give a preliminary analysis.
Let Ω and D be as in Definition 3.1 and n ∈ N. We define ψ n : j is even, where γ 1 , γ 2 are continuous, strictly monotone functions so that From Definition 3.1, it is straightforward to verify that ∀x ∈ Ω the following pointwise convergence result holds We point out that for every fixed n ∈ N the kernel Φ ψ n is a continuous VSK, hence it satisfies the error bound of Theorem 2.1. For VSDKs instead we have the following result.
where Φ ψ is the kernel considered in Definition 3.1.
Recalling (3.1), we get This concludes the proof.
, then it can be expressed as a linear combination of basis functions Φ ψ (·, x), x ∈ Ω. From Theorem 3.1, we get that for every and so f is also a linear combination of the functions lim n→∞ Φ ψ n (·, x), x ∈ Ω, as required.
We get an immediate consequence for the interpolant V ψ too.
for every x ∈ Ω.
Proof Since V ψ is a linear combination of the basis functions, the thesis follows from Theorem 3.1 and Corollary 3.1.
To provide error bounds, we now only need to introduce the power function for a VSDK Φ ψ on the set of nodes X . From (2.4), we know that it is defined as From Theorem 3.1 and Corollary 3.1, it easily follows that ∀x ∈ Ω These results allow to state an error bound for interpolation via VSDKs.
Proof For every n ∈ N and x ∈ Ω, since the VSK Φ ψ n is continuous, we know that (see Theorem 2.1) Then, taking the limit n → ∞ and recalling the results of this subsection, the thesis follows.
Proposition 3.1, as the classical bound for the RBF interpolants, limits the error in terms of the power function and consequently takes into account both the kernel and data.

VSDKs: multidimensional case
The VSDKs rely upon the classical RBF bases and therefore in principle they are suitable to be implemented in any dimension. However, since the geometry is more complex than in 1D, we need to carefully define the scale function ψ.
Let Ω ⊂ R d be an open subset with Lipschitz boundary. In our discussion, we consider step functions f : Ω −→ R such that there exists a disjoint partition P = {R 1 , . . . , R m } of regions having Lipschitz boundaries. That is all the jumps of f lie along (d − 1)-dimensional manifolds γ 1 , . . . , γ p such that Then, a suitable scale function ψ for interpolating f via VSDKs can be defined as follows. With this choice of the scale function ψ and referring to (2.5), we call again the kernel Φ ψ a VSDK on Ω.

Remark 3.1
In Definition 3.2 we choose a scale function which emulates the properties of the one-dimensional function of Definition 3.1. The difference is that the multidimensional ψ could be discontinuous not exclusively at the same points as f , but also at other nodes. Precisely, if we are able to choose P so that then f and ψ have the same discontinuities. Otherwise, if The theoretical analysis in the multidimensional case is done along the same path of the one-dimensional setting. Indeed, we consider continuous scale functions ψ n : for every x ∈ Ω.
We omit the easy extension of all results discussed in Sect. 3.1 and we state directly the error bound.

Proposition 3.2 Let Φ ψ be a VSDK as in Definition 3.2. Suppose that
Proof Refer to Proposition 3.1 and to the remarks made in this section.

Numerical experiments
We consider three strictly positive definite RBFs having different regularities To point out the accuracy of our tests, both in 1D and 2D cases, we refer to the Maximum Absolute Error (MAE) and to the Root Mean Square Error (RMSE). Once the interpolant is constructed, we evaluate it on a grid of S evaluation points {z 1 , . . . , z S } so that Further, we also estimate the upper bound for the error by computing the Maximum of the Power Function (MPF) In the 2D case we also deal with grey-scale images and thus we consider the Structural Similarity Index (SSIM), which is a well-known parameter that indicates the similarity between two images. The SSIM index lies in the interval [0, 1] (the value 1 corresponds to identical images).
The numerical experiments have been carried out with Matlab on an Intel(R) Core(TM) i5-4200U CPU @ 2.30 GHz processor.

A toy example
Let Ω = (−1, 1), We evaluate then the interpolant on a grid of equispaced points on Ω with step size 5.00E−4. First, as described in (2.2), we interpolate the function f 1 via classical RBF interpolation on X , using the kernel functions reported in (4.1). For such RBFs we select the optimal shape parameter ε * via Leave One Out Cross Validation (LOOCV) (refer to [13] for a general overview). The resulting interpolants are plotted in Fig. 1.
In Fig. 2, we plot the scale functions ψ n for some values of n and ψ.
Finally, for each of the RBFs considered in (4.1) we first take the scale function ψ n and compute the VSK interpolant for f 1 by considering increasing values of n (n = 10, 50, 500). Then we approximate f 1 using the VSDK determined by ψ (see formulae  From the figures, we can graphically note how the reconstruction via VSDKs is indeed the limit of the continuous case. As expected the C 0 RBF combined with the VSKs is not truly affected by the Gibbs phenomenon. Using the other kernels, we note that such oscillations are progressively reduced as n increases and graphically disappear when using VSDKs.
From the results reported in this subsection it is evident that the maximum value of the power function usually decreases as n increases. However, there are no theoretical results about the fact that the power function for VSDKs assumes smaller values than the one of VSKs and this is confirmed numerically. Indeed, in some cases the maximum value attained by the power function is sensibly higher for VSDKs, even compared to the classical RBF reconstruction.

Tests with artificial data
Let Ω = (−1, 1) 2 . We consider two test functions, We take 1089 Halton points on Ω as interpolation nodes and we evaluate the approximant on equispaced points with mesh size 1.00E−2.
Similarly to the one-dimensional case in Sect. 4.1, we interpolate the functions f 2 and f 3 via classical RBF interpolation on the set of nodes X , using the kernel functions in (4.1) and selecting the optimal shape parameter ε via LOOCV. Finally, we apply VSDKs and we evaluate the final results.
We start our discussion with the function f 2 . The resulting standard RBF interpolants for f 2 are plotted in Fig. 6. As expected, the infinitely smooth Gaussian RBF introduces huge oscillations, while with functions with low regularity, the Gibbs phenomenon is less evident.  Table 5 we report the accuracy indicators. We can observe that ϕ 1 ε outperforms the other two kernels in terms of SSIM index. Indeed, the related reconstruction is less affected by the Gibbs phenomenon, as graphically visible.
Considering the function f 2 , for VSDKs we need a suitable scale function satisfying the Definition 3.2. For this purpose, we consider the function ψ 2 defined as We show the final reconstructions using VSDKs with different kernels in Fig. 7, while in Table 6 we report the values of the considered errors and parameters. We recover also for the 2D case the pattern already discovered about the fact that VSDKs reconstruct the jumps without graphically introducing oscillations, also with C ∞ RBFs.
Considering now the function f 3 , we show in Fig. 8 and in Table 7 the results obtained via classical RBF interpolation.
We can observe a behavior that is similar to what we obtained for f 2 . Switching to VSDKs, we consider the scale function ψ 3 defined as     Fig. 6 The function f 2 and the classical RBF interpolants on X obtained using different kernel functions We point out that the set of discontinuity points of f 3 is strictly contained in the set of discontinuity points of ψ 3 , which is the case considered in Remark 3.1. We present the final results in Fig. 9 and in Table 8.  We can observe that the VSDK reconstructions using ϕ 2 1 and ϕ 3 1 are not affected by the Gibbs phenomenon as in the classical RBF reconstructions and they outperform the reconstruction obtained using ϕ 1 1 . Furthermore, for both the test functions considered in this section, the SSIM with VSDKs is about 1, which means that graphically there is a high similarity between the original and reconstructed image.
Concerning the maximum value of the power function for VSDKs, we can note that for both f 2 and f 3 it is comparable to the one obtained via standard RBFs. However, the reader should note that usually it reaches lower values than the ones achieved via  In general, for both 1D and 2D, the most promising results are the ones obtained via VSDKs and the Gaussian function. Indeed, it is well known that C ∞ RBFs introduce the Gibbs phenomenon, which is sensibly reduced via VSDKs.

Application to satellite images
The modeling and analysis of data, for instance, coming from distributed measurements of physical quantities and satellite images is a challenging computational issue.  Because of the huge size that some of these data sets achieve, reduced models such as the one presented in [24] are strongly advised. Nevertheless, the Gibbs phenomenon might affect also in this case the accuracy of the approximation. Thus, in this example, we show how VSDKs can intervene in this direction, sensibly reducing the oscillations. We consider the satellite image reported in Fig. 10, consisting of soil moisture data taken by NASA Soil Moisture Active Passive (SMAP) mission in April 2015 [12]. It is composed by a grid of 3856 × 1624 pixels. For dealing with the whole image, one needs to use reduced models, such as the one investigated in [25]. Moreover, if one only concentrates on a small portion of the image, e.g. a portion of Europe, the high resolution is lost (trivially due to zooming). In this case, a reconstruction scheme is necessary. Focusing on a portion of of Europe, we obtain an image composed by N = 144×191 pixels. After using these data to reconstruct the image, we evaluate it on a finer grid of evaluation points, composed by 216 × 286 pixels. Such a computation can be seen as a standard zoom, which might introduce Gibbs oscillations. They are indeed visible if, for instance, we reconstruct the image with the Wendland's C 2 RBF defined by: where (·) + denotes the truncated power function. We consider the Wendland's compactly supported C 2 RBF because it is well-known that by properly scaling the support of the basis function, it might lead to sparse interpolation systems and thus it gains in terms of stability, reducing the usual high condition number of the interpolation matrix. Despite this ability, the reconstruction via the classical method suffers from the Gibbs phenomenon, see Fig. 11 (left). Such oscillations are removed by VSDKs; refer to Fig. 11 (right). This example with real data is also devoted to show that VSDKs are performing also when the discontinuity is analytically unknown. In this case the curve defining the discontinuity has been approximated by means of Sobel detection scheme; see e.g. [31].

Conclusions
In this paper we have presented a robust method for sensibly reducing non-physical oscillations due to the Gibbs phenomenon. The accuracy of the proposed method has been studied theoretically and many numerical experiments confirm its effectiveness, showing that the reconstruction via VSDKs outperforms the standard techniques when jumps occur. Future work consists in investigating on the detection of discontinuities when dealing with real data and to extend the current work to other bases [8]. We stress that the current study might be useful for image reconstruction in the context of Magnetic Particle Imaging [6,7]. Finally, for smooth RBFs, we should study the behaviour of VSDKs when rational RBF interpolants are used [9,19].