1 Introduction

Surface texture, hereinafter referred to as roughness or rough surface, is the scale-limited surface obtained by removing small-scale components (S-filter), followed by a form removal (F-operation) and the removal of large-scale components (L-filter) [1]. Historically, roughness has been used to control manufacturing processes [2, 3]. For example, if the roughness deviates from a reference, unwanted changes may have occurred on the machining tool. Another use of roughness, which is growing rapidly, is its direct contribution to various physical properties, e.g., fatigue [4] or heat transfer [5]. To study these functional contributions of roughness, a model-based approach for rough surfaces can be used. Such roughness models should also provide an efficient simulation procedure so that they can be used to accelerate functional studies.

Stochastic process modeling is commonly considered the gold standard in roughness modeling. Recently, a novel stochastic process approach emerged that models a broad spectrum of rough surfaces [6]. This approach utilizes Gaussian processes (GPs) to model rough surfaces. Even though it can model and simulate more varieties of rough surfaces, automatically or with predefined information, than traditional approaches, its computational bottleneck is the simulation procedure. Other approaches, in contrast, admit efficient implementations for rough surface simulations because their structure allows the application of the fast Fourier transform (FFT) [7, 8].

A simulation of a rough surface with the GP approach corresponds to a sample drawn from a multivariate Gaussian distribution. Using traditional methods for sampling multivariate Gaussian distributions is computationally expensive [6]. In particular, sampling \(N\) data points with the Cholesky decomposition has a computational cost of \(\mathcal{O}\left({N}^{3}\right)\) and a memory cost of \(\mathcal{O}\left({N}^{2}\right)\). On the one hand, several recent works in the machine learning literature attend to this gap [9, 10], since efficient sampling is, for example, also beneficial in Bayesian optimization [11, 12]. On the other hand, we find that other approaches of traditional roughness modeling are complementary to the GP approach due to its linear process view.

In this paper, we address the computationally expensive roughness simulation with a discrete filter approach using the FFT for efficient GP sampling. We show that this approach approximates a stationary GP, and we compare it with two matrix factorization methods, the Cholesky decomposition and contour integral quadrature (CIQ) [10]. Moreover, we show that a suitable non-Gaussian noise model can reduce the number of GP samples required for a honed surface, further reducing the computational cost. These additions build on the GP and noise model approach of [6]. Our contributions result in a model that applies traditional methods to model rough surfaces and aims to make the GP approach more applicable.

2 Background

2.1 Roughness Model with Gaussian Processes

We follow the treatment given by [6, 13] and summarize their suggested roughness model. Their model-based approach follows [14] and characterizes a surface with two quantities, the autocovariance function (ACVF) and the probability density function (PDF). An ACVF describes the covariance or similarity between two locations of the surface (1) and, therefore, expresses a space-dependent influence in the model.

$$r\left({{\varvec{x}}}_{i},{{\varvec{x}}}_{j}\right)={\mathbb{E}}\left[\left(Z\left({{\varvec{x}}}_{i}\right)-{\mathbb{E}}\left[Z\left({{\varvec{x}}}_{i}\right)\right]\right)\cdot (Z({{\varvec{x}}}_{j})-{\mathbb{E}}[Z({{\varvec{x}}}_{j})])\right],$$
(1)

where \({\mathbb{E}}[\cdot ]\) is the expectation operator and \(Z\left({\varvec{x}}\right)\in \mathcal{Z}\subseteq {\mathbb{R}}\) is the surface height at position \({\varvec{x}}\in \mathcal{X}\subseteq {\mathbb{R}}^{D}\). For rough surfaces, the traditional assumption is a zero mean, which simplifies the ACVF via \({\mathbb{E}}\left[Z\left({\varvec{x}}\right)\right]=0\) [14]. In contrast, a PDF does not take the locations of the surface into account but rather considers the surface height directly.
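As a small illustration (our own sketch, not part of [6, 14]), the zero-mean ACVF can be estimated from a sampled profile as an average of lagged products; the function name and the white-noise test profile are placeholders.

```python
import numpy as np

def empirical_acvf(z, max_lag):
    """Biased empirical ACVF of a 1-D profile, assuming E[Z(x)] = 0 as is
    traditional for rough surfaces, i.e., r_v = (1/N) * sum_n z_n * z_{n+v}."""
    N = len(z)
    return np.array([np.dot(z[: N - v], z[v:]) / N for v in range(max_lag + 1)])

# Placeholder profile: white noise, whose ACVF is close to a scaled Kronecker delta.
rng = np.random.default_rng(0)
z = rng.standard_normal(10_000)
print(empirical_acvf(z, max_lag=3))  # approx. [1, 0, 0, 0]
```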

In the proposed model [6], these two quantities are assigned to two components of the roughness model. The first component is a GP, which is described by a zero-mean ACVF. The second component is a noise model specified by a PDF. The diagram in Fig. 1 illustrates this roughness model regarding a roughness simulation.

Fig. 1. Simulation procedure for a roughness simulation with the model of [6].

A specified model with given ACVF and PDF can simulate a rough surface by sampling from the GP and processing it with noise. The noise is sampled from the noise model given a sampled GP. The model and both sampling procedures are described as follows

$$G\left({\varvec{x}}\right)\sim \mathcal{G}\mathcal{P}\left(0,r\left({{\varvec{x}}}_{i},{{\varvec{x}}}_{j}\right)\right),\qquad Z\left({{\varvec{x}}}_{n}\right)|g\left({{\varvec{x}}}_{n}\right)\sim p\left(z\left({{\varvec{x}}}_{n}\right)|g\left({{\varvec{x}}}_{n}\right)\right),$$
(2)

where \(G({\varvec{x}})\) denotes the latent output and \(Z({\varvec{x}})\) denotes the roughness.

If the noise model is Gaussian white noise, as in [6], the part critical for computational resources is sampling from the GP. Thus, computational improvements in GP sampling lead to efficient simulations of rough surfaces with this model.

2.2 Simulation of Rough Surfaces

A GP is an infinite set of random variables \(\left\{G\left({\varvec{x}}\right)|{\varvec{x}}\in \mathcal{X}\right\}\), any finite subset of which is multivariate Gaussian distributed [13]. Hence, a GP on a finite set \({\mathcal{X}}_{N}={\left\{{{\varvec{x}}}_{n}\right\}}_{n=1}^{N}\) is a multivariate Gaussian distribution, cf. (3). Usually, this Gaussian distribution is high-dimensional and has a non-diagonal covariance matrix. In the roughness model, the latent output is sampled from the following Gaussian distribution

$${\varvec{G}}\sim \mathcal{N}\left({\varvec{G}};0, {\varvec{R}}\right),$$
(3)
$${\left[{\varvec{R}}\right]}_{ij}=r\left({{\varvec{x}}}_{i},{{\varvec{x}}}_{j}\right), \qquad \left({{\varvec{x}}}_{i},{{\varvec{x}}}_{j}\in {\mathcal{X}}_{N}\right),$$
(4)

where \({\varvec{G}}=\{G({\varvec{x}})|{\varvec{x}}\in {\mathcal{X}}_{N}\}\) is the finite set of a GP and \({\varvec{R}}\) is the covariance matrix.

Fig. 2. Sampling procedure for latent outputs with a given ACVF by a matrix factorization or a linear filter.

Matrix Decomposition.

The sampling from a high-dimensional Gaussian distribution traditionally includes the reparameterization trick [15]

$${\varvec{G}}={\varvec{A}}\cdot{\varvec{\varepsilon}}, \qquad {\varvec{\varepsilon}}\sim \mathcal{N}\left({\varvec{\varepsilon}};0,{\varvec{I}}\right),$$
(5)

where the covariance matrix is decomposed as \({\varvec{R}}={\varvec{A}}{{\varvec{A}}}^{\top }\). We assumed a zero-mean Gaussian distribution in (5). Thus, sampling a GP amounts to a matrix-vector multiplication of the matrix \({\varvec{A}}\) with a sampled Gaussian white noise vector (5) (see Fig. 2). Since the covariance matrix \({\varvec{R}}\) is positive-semidefinite, the Cholesky decomposition is often applied to obtain the matrix \({\varvec{A}}\) [13]. This matrix decomposition requires \(\mathcal{O}({N}^{3})\) computational time and \(\mathcal{O}({N}^{2})\) memory for a matrix \({\varvec{R}}\) of size \(N\times N\) [16]. The finite set \({\mathcal{X}}_{N}\) is often an evenly spaced grid in roughness simulations; hence, the covariance matrix of a stationary ACVF is a Toeplitz matrix, for which the decomposition requires a computational cost of \(\mathcal{O}\left({N}^{2}\right)\) [17].
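For concreteness, a minimal NumPy sketch of this sampling route (the exponential ACVF, the grid, and the jitter term are our illustrative choices; a small jitter is added because the numerical Cholesky factorization requires strict positive definiteness):

```python
import numpy as np

def sample_gp_cholesky(acvf, X, rng, jitter=1e-10):
    """Draw one zero-mean GP sample on the points X via R = A A^T, eqs. (3)-(5)."""
    tau = X[:, None, :] - X[None, :, :]                  # pairwise lags x_i - x_j
    R = acvf(tau)                                        # covariance matrix, eq. (4)
    A = np.linalg.cholesky(R + jitter * np.eye(len(X)))  # O(N^3) time, O(N^2) memory
    eps = rng.standard_normal(len(X))                    # white noise vector, eq. (5)
    return A @ eps

# Illustrative 1-D example with an exponential ACVF (hyperparameters are placeholders).
rng = np.random.default_rng(0)
X = np.linspace(0.0, 100.0, 200)[:, None]
acvf = lambda tau: np.exp(-np.linalg.norm(tau, axis=-1) / 5.0)
g = sample_gp_cholesky(acvf, X, rng)                     # one latent GP sample
```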

Linear Filter.

Models that do not consider GPs directly compute samples from a given stationary ACVF with a linear filter. This linear filter is designed from the ACVF, takes a white noise series as input, and outputs the samples (see Fig. 2), e.g., [7]. In the roughness literature, mostly a finite impulse response (FIR) filter is used, which has the following discretized form

$${G}_{p,q}=\sum_{k=0}^{V-1}\sum_{l=0}^{U-1}{h}_{k,l}{\varepsilon }_{p-k,q-l},\quad p\in {\mathcal{I}}_{V-1}, q\in {\mathcal{I}}_{U-1},$$
(6)
$${r}_{v,u}=\sum_{k=0}^{K-v-1}\sum_{l=0}^{L-u-1}{h}_{k,l}{h}_{k+v,l+u}, \quad v\in {\mathcal{I}}_{K-1}, u\in {\mathcal{I}}_{L-1},$$
(7)

where we denote by \({\mathcal{I}}_{c}\subset {\mathbb{N}}\) an index set enumerated from \(0\) to \(c\), \(G\left({x}_{p}^{(0)},{x}_{q}^{(1)}\right)={G}_{p,q}\) is the latent output series, \(r\left({\tau }_{v}^{(0)},{\tau }_{u}^{(1)}\right)={r}_{v,u}\) are the ACVF coefficients, \({h}_{k,l}\) are the filter coefficients, and \({\varepsilon }_{p,q}\) is a white noise series with unit variance. This filter shall mimic the ACVF (7) so that filtering a white noise series results in an output series with the imitated ACVF. To compute the filter coefficients from the ACVF, the nonlinear equations (7) can be solved with Newton's method or via the Fourier transform. More precisely, applying the discrete Fourier transform (DFT) with the FFT algorithm allows computing the filter coefficients and the linear filter efficiently [8].

We found that a roughness simulation with this discrete filter can be applied within the GP roughness model. The FFT algorithm is numerically fast, and in-place implementations of the algorithm exist. The resulting speed-up is significant compared to traditional methods.

2.3 Related Work

One example of computing kernel-based methods and drawing samples efficiently, e.g., in the Bayesian optimization literature, is the random Fourier features approach [18]. It approximates a stationary ACVF \(r\left({{\varvec{x}}}_{i},{{\varvec{x}}}_{j}\right)\approx{\varvec{\zeta}}{\left({{\varvec{x}}}_{i}\right)}^{\top }{\varvec{\zeta}}({{\varvec{x}}}_{j})\) with a low-dimensional feature map built from Fourier features, such that a simulation is efficiently computed by a scalar product between the low-dimensional feature map and a white noise vector. To reduce approximation errors, other methods have emerged recently in the machine learning literature. For example, matrix root decompositions have been computed efficiently with Lanczos variance estimates [9] or CIQ [10], which are Krylov approaches. These methods are exact if the number of iterations is \(N\); nevertheless, fewer iterations incur only small errors.

The FFT has already been applied to simulate stationary GPs in the literature. One commonly known method is circulant embedding [19]. This approach extends the covariance matrix to a \(2M\times 2M\) circulant matrix with \(M\ge N\), on which an FFT can be applied, leading to an efficient matrix root decomposition. This method is exact and helps to simulate efficiently. However, the required positive-semidefiniteness of the circulant matrix is not guaranteed for a minimal embedding. Similar to our method, [7, 20, 21] suggested simulating GPs with a filter via a Fourier transform approach. However, they simulated roughness only with GPs, whereas we also consider a noise model. Furthermore, only [20] stated that their approach simulates a stationary GP.

3 Gaussian Process Filter

The zero-mean GP \(G\left({\varvec{x}}\right)\in {\mathbb{R}}\) has the following linear process view [6]

$$G\left({\varvec{x}}\right)=\int _{{\mathbb{R}}^{D}}h\left({\varvec{s}}\right)\varepsilon ({\varvec{x}}-{\varvec{s}})\mathrm{d}{\varvec{s}},$$
(8)
$$r\left({{\varvec{x}}}_{i},{{\varvec{x}}}_{j}\right)=\int _{{\mathbb{R}}^{D}}h\left({{\varvec{x}}}_{i}-{\varvec{s}}\right)h({{\varvec{x}}}_{j}-{\varvec{s}})\mathrm{d}{\varvec{s}},$$
(9)

where \(\{\varepsilon \left({\varvec{x}}\right),{\varvec{x}}\in \mathcal{X}\}\) is a continuous Gaussian white noise process, and \(h:{\mathbb{R}}^{D}\mapsto {\mathbb{R}}\) is a map that characterizes the ACVF. The Gaussian white noise process has a GP representation [22] denoted as

$$\varepsilon \left({\varvec{x}}\right)\sim GP\left(0,\delta \left({{\varvec{x}}}_{i}-{{\varvec{x}}}_{j}\right)\right),$$
(10)

with the Dirac delta function \(\delta (\cdot )\) as its ACVF.

Considering a stationary GP, the ACVF is shift-invariant, and (9) has the following form

$$r\left({\varvec{\tau}}\right)=\int _{{\mathbb{R}}^{D}}h\left({\varvec{s}}\right)h({\varvec{s}}+{\varvec{\tau}})\mathrm{d}{\varvec{s}},$$
(11)

where \({\varvec{\tau}}={{\varvec{x}}}_{i}-{{\varvec{x}}}_{j}\) is the lag between two positions. Applying the Fourier transform to the above equation transforms the ACVF into the power spectral density (PSD) \(\tilde{r }({\varvec{f}})\) [23, 24] and turns the integral into a multiplication

$$\tilde{r }\left({\varvec{f}}\right)=\tilde{h }\left({\varvec{f}}\right)\cdot \overline{\tilde{h }\left({\varvec{f}}\right)}={\left|\tilde{h }\left({\varvec{f}}\right)\right|}^{2},$$
(12)

where \(\overline{\left(\cdot \right)}\) denotes the complex conjugate and \(\tilde{h }({\varvec{f}})\) is the Fourier transform of \(h({\varvec{s}})\). Thus, an approximate filter map \(\widehat{h}\left({\varvec{s}}\right)\approx h({\varvec{s}})\) can be derived by the inverse Fourier transform

$$\widehat{h}\left({\varvec{s}}\right)=\int _{{\mathbb{R}}^{D}}{\left(\tilde{r }\left({\varvec{f}}\right)\right)}^\frac{1}{2}\cdot {\mathrm{e}}^{i2\pi {{\varvec{f}}}^{\boldsymbol{\top }}{\varvec{s}}}\mathrm{d}{\varvec{f}}.$$
(13)

If \(|\tilde{h }\left({\varvec{f}}\right)|=\tilde{h }\left({\varvec{f}}\right)\) holds, i.e., if \(\tilde{h }\left({\varvec{f}}\right)\) is real and nonnegative, then the approximated filter map equals the true filter map. Hence, a stationary GP with zero mean can theoretically be simulated by the continuous linear process

$$\widehat{G}\left({\varvec{x}}\right)=\int _{{\mathbb{R}}^{D}}\widehat{h}\left({\varvec{s}}\right)\varepsilon ({\varvec{x}}-{\varvec{s}})\mathrm{d}{\varvec{s}}.$$
(14)

Obtaining the PSD \(\tilde{r }\left({\varvec{f}}\right)\) directly, or the approximated filter map from a given PSD, requires solving a Fourier transform or an inverse transform, respectively. Furthermore, the linear filtering process in (14) is only possible if the approximated filter map is square-integrable. Even if the filter map is known exactly, the above representation is not practical due to the continuous white noise process. To avoid this representation of the white noise process, the next section discusses the discretized linear process. With this discretization, a discrete white noise process is considered, and the discrete-time Fourier transform (DTFT) can be used to compute the filter map.

3.1 Discrete Filter

We initially assume one dimension (\(D=1\)) for simplicity and extend the method to multiple dimensions later. Sampling \(G\left(x\right)\) at discrete points \(G\left(n\Delta \right)={G}_{n}\) with step size \(\Delta \in {\mathbb{R}}_{\succ 0}\), the series \(\{{G}_{n}, n\in {\mathbb{Z}}\}\) has the ACVF

$$r\left(v\Delta \right)={\mathbb{E}}\left[{G}_{n}{G}_{n+v}\right], \qquad v\in {\mathbb{Z}},$$
(15)

which is sampled from the ACVF \(r(\tau )\). The DTFT of the ACVF series is a \(\frac{1}{\Delta }\)-periodic summation of \(\tilde{r }(f)\)

$${\tilde{r }}_{\Delta }(f)=\sum_{k=-\infty }^{\infty }\tilde{r }(f+\frac{k}{\Delta })$$
(16)

If the sampling theorem is fulfilled, then the Fourier transform  \(\tilde{r }(f)\) can be reconstructed. Otherwise, the DTFT is affected by errors due to aliasing.

Analogous to the continuous linear process (8), a discrete linear process presentation can be formulated with the ACVF series \(\{{r}_{v}, v\in {\mathbb{Z}}\}\), which we refer to as a discrete filter

$${G}_{n}=\sum_{k=-\infty }^{\infty }{h}_{k}\cdot {\varepsilon }_{n-k},$$
(17)
$${r}_{v}=\sum_{k=-\infty }^{\infty }{h}_{k}{h}_{k+v},$$
(18)

where \(\left\{{\varepsilon }_{k}, k\in {\mathbb{Z}}\right\}\) is a white noise series with the Kronecker delta as its ACVF, and \(\{{h}_{k}, k\in {\mathbb{Z}}\}\) are the filter coefficients that shall mimic the ACVF series. This discrete filter makes the linear process more applicable by avoiding the cumbersome continuous white noise process. If the filter coefficients are given such that (18) holds, \({G}_{n}\) can be computed by a convolution (17). To compute the filter coefficients, the DTFT \({\tilde{r }}_{\Delta }(f)\) can be utilized. The inverse DTFT can be used to estimate these filter coefficients [7]

$${\widehat{h}}_{k}=\int _{0}^{l}{\left({\tilde{r }}_{\Delta }\left(f\right)\right)}^\frac{1}{2}\cdot {\mathrm{e}}^{i2\pi f\frac{k}{l}}\mathrm{d}f,$$
(19)

with \(l=\frac{1}{\Delta }\).

The filter coefficients \(\{{h}_{k}, k\in {\mathbb{Z}}\}\) are approximated by (19) since we assume that the DTFT of the filter coefficients \({\tilde{h }}_{\Delta }(f)\) equals \({\left({\tilde{r }}_{\Delta }(f)\right)}^{1/2}\). Furthermore, due to the periodic nature of \({\tilde{r }}_{\Delta }(f)\), aliasing errors propagate to the filter coefficients. Nevertheless, the filter coefficients and the simulation of a latent surface can be computed efficiently with the FFT in this approach.

3.2 Discrete Filter with FFT

The DTFT cannot be computed exactly because only a finite set \(\left\{{r}_{v},v\in {\mathcal{I}}_{N-1}\right\}\) instead of an infinite set is given. Therefore, the DFT is applied to estimate the DTFT on a discrete set. The DFT of a finitely sampled ACVF is

$${\tilde{r }}_{\Delta ,k}=\frac{1}{N}\sum_{v=0}^{N-1}{r}_{v}\cdot {\mathrm{e}}^{-i\frac{2\pi }{N}kv}, \qquad k\in {\mathcal{I}}_{N-1},$$
(20)

and by inverse DFT the filter coefficients can be estimated by

$${\widehat{h}}_{k}=\frac{1}{N}\sum_{v=0}^{N-1}{\left({\tilde{r }}_{\Delta ,v}\right)}^\frac{1}{2}\cdot {\mathrm{e}}^{i2\pi \frac{kv}{N}}, \qquad k\in {\mathcal{I}}_{N-1}.$$
(21)

With the estimated filter coefficients, a rough surface is simulated by

$${G}_{n}=\sum_{k=0}^{N-1}{\widehat{h}}_{k}{\varepsilon }_{n+k}, \qquad n\in {\mathcal{I}}_{N-1},$$
(22)

where \({\widehat{h}}_{k}\) is assumed to be an \(N\)-periodic series due to the inverse DFT.

This will result in the following relation between the finite sampled ACVF and the estimated filter coefficients

$${r}_{v}=\sum_{k=0}^{N-1}{\widehat{h}}_{k}{\widehat{h}}_{k+v}, \qquad v\in {\mathcal{I}}_{N-1},$$
(23)

with \({\widehat{h}}_{k+N}={\widehat{h}}_{k}\).
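The 1-D procedure (20)–(23) can be sketched as follows (NumPy's FFT normalization is used instead of the \(1/N\) convention written above; the exponential ACVF, its circular symmetrization, and the clipping of small negative spectral values are our illustrative choices):

```python
import numpy as np

N, delta, ell = 256, 0.5, 5.0
v = np.arange(N)
# Circularly symmetric samples of an exponential ACVF, so that r_{N-v} = r_v.
r = np.exp(-np.minimum(v, N - v) * delta / ell)

spec = np.maximum(np.fft.fft(r).real, 0.0)   # DFT of the ACVF series, cf. eq. (20)
h_hat = np.fft.ifft(np.sqrt(spec)).real      # estimated filter coefficients, cf. eq. (21)

# Check the circular relation (23): r_v = sum_k h_hat_k * h_hat_{k+v}
r_check = np.array([np.dot(h_hat, np.roll(h_hat, -s)) for s in range(N)])
print(np.max(np.abs(r - r_check)))           # close to machine precision in this example

# Simulate a latent series, cf. eq. (22): circular correlation with white noise via FFT.
rng = np.random.default_rng(0)
eps = rng.standard_normal(N)
G = np.fft.ifft(np.conj(np.fft.fft(h_hat)) * np.fft.fft(eps)).real
```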

The extension to two dimensions is straightforward

$${\tilde{r }}_{\Delta ,k,l}=\frac{1}{VU}\sum_{v=0}^{V-1}\sum_{u=0}^{U-1}{r}_{v,u}{\mathrm{e}}^{-i2\pi (\frac{kv}{V}+\frac{lu}{U})}, \qquad k\in {\mathcal{I}}_{V-1}, \quad l\in {\mathcal{I}}_{U-1}$$
(24)
$${\widehat{h}}_{k,l}=\frac{1}{VU}\sum_{v=0}^{V-1}\sum_{u=0}^{U-1}{\left({\tilde{r }}_{\Delta ,v,u}\right)}^\frac{1}{2}{\mathrm{e}}^{i2\pi (\frac{kv}{V}+\frac{lu}{U})}, \qquad k\in {\mathcal{I}}_{V-1}, \quad l\in {\mathcal{I}}_{U-1}$$
(25)
$${G}_{p,q}=\sum_{k=0}^{V-1}\sum_{l=0}^{U-1}{\widehat{h}}_{k,l}{\varepsilon }_{p+k,q+l}, \qquad p\in {\mathcal{I}}_{V-1}, \quad q\in {\mathcal{I}}_{U-1}$$
(26)

where \({G}_{p,q}\) is the latent series and has size \(V\times U\).

The DFTs and inverse DFTs can be computed efficiently with the FFT algorithm. Equation (26) is a circular correlation (equivalently, a convolution with reversed filter coefficients) and, thus, can also be computed by the FFT algorithm. Therefore, this simulation procedure samples a latent GP in a computational time of \(\mathcal{O}((3N-2)\mathrm{log}(3N-2))\) with \(N=V\cdot U\).
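A minimal sketch of the full two-dimensional procedure (24)–(26) under the same assumptions as the 1-D sketch above (NumPy FFT normalization, circularly wrapped ACVF samples, clipping of negative spectral values; the isotropic exponential ACVF and its hyperparameters are placeholders):

```python
import numpy as np

def discrete_filter_2d(r_vu):
    """Estimate the 2-D filter coefficients from sampled ACVF values, cf. (24)-(25).
    NumPy's FFT normalization is used; small negative spectral values are clipped."""
    spec = np.maximum(np.fft.fft2(r_vu).real, 0.0)
    return np.fft.ifft2(np.sqrt(spec)).real

def simulate_latent_2d(h_hat, rng):
    """Simulate one latent surface, cf. (26), as a circular correlation computed via FFT."""
    eps = rng.standard_normal(h_hat.shape)
    return np.fft.ifft2(np.conj(np.fft.fft2(h_hat)) * np.fft.fft2(eps)).real

# Illustrative isotropic exponential ACVF on a B x B grid with wrapped lags.
B, delta, ell = 512, 0.5, 10.0
idx = np.arange(B)
d = np.minimum(idx, B - idx) * delta
tau = np.sqrt(d[:, None] ** 2 + d[None, :] ** 2)
r_vu = np.exp(-tau / ell)

rng = np.random.default_rng(0)
G = simulate_latent_2d(discrete_filter_2d(r_vu), rng)   # one latent surface sample
```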

4 Experiments

In this section, we conduct benchmarks to compare the efficiency of the discrete filter with other methods. Initial experiments study the timings of the proposed method with different FFT implementations. Afterward, the discrete filter is compared against the matrix factorization methods Cholesky and CIQ. In all cases, we focus on ground surfaces with their inherent ACVF

$$r\left({\varvec{\tau}};\phi \right)={\sigma }_{\mathrm{k}}^{2}\mathrm{exp}\left(-{\left({{{\varvec{\tau}}}^{\boldsymbol{^{\prime}}}}^{\top }{{\varvec{\Lambda}}}^{-2}{\varvec{\tau}}\boldsymbol{^{\prime}}\right)}^\frac{1}{2}\right), \qquad {{\varvec{\tau}}}^{\boldsymbol{^{\prime}}}={{\varvec{T}}}_{\phi }{{\varvec{\tau}}}^{\boldsymbol{\top }},$$
(27)

where \({\varvec{\Lambda}}\in {\mathbb{R}}_{\succ 0}^{2\times 2}\) is a diagonal matrix and has diagonal elements \({\lambda }_{\mathrm{a}}\) and \({\lambda }_{\mathrm{b}}\), \(\phi \) is the angle of the grinding grooves, \({{\varvec{T}}}_{\phi }\) is the inverse rotation matrix and \({\sigma }_{\mathrm{k}}\) is a scaling hyperparameter. Furthermore, we conducted the experiments on an equidistant mesh

$${\mathcal{X}}_{N}=\left\{\left(v\Delta ,u\Delta \right),\Delta \in {\mathbb{R}}_{\succ 0}, v,u\in {\mathcal{I}}_{B-1}\right\},$$
(28)

where \(\Delta \) is the sample step size in the \(x\)- and \(y\)-directions, and the surfaces are square with \(N={B}^{2}\) points. All timings were performed either on an Intel Xeon Gold 6126 CPU or an NVIDIA Tesla V100 GPU.
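A sketch of evaluating the ground-surface ACVF (27) on wrapped lags of the mesh (28), which can then be fed to the two-dimensional filter sketch of Sect. 3.2 (the sign convention of the rotation matrix \({{\varvec{T}}}_{\phi }\) and the grid size \(B\) are our assumptions):

```python
import numpy as np

def ground_acvf(tau, sigma_k, lam_a, lam_b, phi):
    """Anisotropic exponential ACVF of eq. (27); tau has shape (..., 2) in µm."""
    c, s = np.cos(phi), np.sin(phi)
    T_phi = np.array([[c, s], [-s, c]])        # assumed sign convention for the inverse rotation
    tau_p = tau @ T_phi.T                      # rotate lags into the groove coordinate frame
    q = (tau_p[..., 0] / lam_a) ** 2 + (tau_p[..., 1] / lam_b) ** 2
    return sigma_k ** 2 * np.exp(-np.sqrt(q))

# Signed, circularly wrapped lags of the equidistant mesh (28). B is a placeholder; for a
# faithful simulation B*delta should be large enough that the ACVF has decayed, cf. Sect. 4.2.
B, delta = 1024, 0.5
idx = np.arange(B)
d = np.where(idx <= B // 2, idx, idx - B) * delta
tau = np.stack(np.meshgrid(d, d, indexing="ij"), axis=-1)
r_vu = ground_acvf(tau, sigma_k=1.0, lam_a=500.0, lam_b=5.0, phi=0.0)
```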

4.1 Timings of the Discrete Filter with SciPy and CuFFT

Fig. 3. Timings for computing the discrete filter and the surface with SciPy and cuFFT.

The parameters were chosen as \({\sigma }_{\mathrm{k}}=1\) µm, \({\lambda }_{\mathrm{a}}=500\) µm, \({\lambda }_{\mathrm{b}}=5\) µm, \(\phi =0\), and \(\Delta =0.5\) µm. We utilized the FFT implementation of the SciPy library (version 1.9.3) on the CPU and the CUDA FFT library cuFFT (version 11.8) on the GPU. For the comparison, we measured the time for computing the filter coefficients and the simulation process for different surface sizes. This procedure was repeated 10,000 times for each surface size. We use this comparison, given the aforementioned hardware, to discuss the SciPy and cuFFT implementations of the discrete filter.
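A simplified CPU-only sketch of such a timing loop (SciPy's FFT; the measurements reported here used 10,000 repetitions per surface size and additionally cuFFT on the GPU, which is not reproduced in this sketch):

```python
import time
import numpy as np
from scipy import fft

def time_discrete_filter(r_vu, repeats, seed=0):
    """Average wall-clock time for computing the filter coefficients and one surface."""
    rng = np.random.default_rng(seed)
    t0 = time.perf_counter()
    for _ in range(repeats):
        spec = np.maximum(fft.fft2(r_vu).real, 0.0)
        h_hat = fft.ifft2(np.sqrt(spec)).real                              # filter coefficients
        eps = rng.standard_normal(r_vu.shape)                              # white noise
        G = fft.ifft2(np.conj(fft.fft2(h_hat)) * fft.fft2(eps)).real       # one surface
    return (time.perf_counter() - t0) / repeats

# e.g., time_discrete_filter(r_vu, repeats=100) with r_vu from the previous sketch
```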

Figure 3 shows the timings of the discrete filter for both FFT implementations. Note that no uncertainties are assigned to the data because the uncertainties are relatively small. These experiments show that cuFFT and SciPy's FFT scale equally since both are based on the FFT algorithm. However, cuFFT leads to a significant acceleration over SciPy's FFT implementation for large surfaces. It should also be noted that, because of the CPU's memory advantage over the GPU, surfaces with \(9\cdot {10}^{8}\) points could be generated on the CPU, while the GPU reached its limits.

4.2 Benchmarking Discrete Filter

Since the discrete filter approach is more efficient on the GPU, we benchmark it with the cuFFT implementation. The Krylov approach CIQ is also performant on GPUs due to its inherent matrix-vector multiplications [10].

Fig. 4. Speed-up of the discrete filter over the Cholesky factorization. The achieved speed-up is computed by sampling 10,000 times.

Except for the grinding groove angle \(\phi =\frac{\pi }{6}\), we chose the hyperparameters of the ACVF identical to Sect. 4.1. We then computed the speed-up of the discrete filter over the Cholesky factorization (see Fig. 4). Since the Cholesky decomposition is memory inefficient, we performed it on the CPU, and the surface size was at most \(70\times 70\). The Cholesky method is compared with the cuFFT implementation in Fig. 4; although both were run on different hardware, note that cuFFT was even slower than the SciPy implementation on the CPU for small surfaces (see Fig. 3). The visualization shows that sampling from the high-dimensional Gaussian distribution is much faster with the discrete filter method (more than \(900\) times faster than Cholesky).

For the CIQ approach, we chose the experimental setting according to its publication [10]. For the Krylov method, we selected \(J=100\) iterations and \(Q=8\) quadrature points. To reduce the memory complexity, the authors leverage symbolic tensors [25] for the computation [10]. Figure 5 shows the timings and the errors of the discrete filter and the CIQ method. The error is computed by drawing 1000 surfaces and computing the element-wise mean squared error between the sample covariance matrix and the true covariance matrix.
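Our reading of this error metric as a sketch (feasible only for small surfaces, since it forms the dense \(N\times N\) covariance matrices; names are ours):

```python
import numpy as np

def covariance_mse(samples, R_true):
    """Element-wise mean squared error between the sample covariance of the drawn
    surfaces and the true covariance matrix (zero-mean GP assumed).
    `samples` has shape (n_draws, N); each row is one flattened latent surface."""
    R_hat = samples.T @ samples / samples.shape[0]
    return np.mean((R_hat - R_true) ** 2)
```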

Fig. 5. Computing and sampling timings and error for the discrete filter and CIQ.

Similar to [10], CIQ produces accurate samples of the latent surface. The error of the discrete filter is greater than that of the CIQ approach because the discrete filter method involves approximations of the GP. Firstly, the filter map is approximated by assuming that its Fourier transform equals the square root of the PSD (19). Secondly, aliasing might occur due to the discretization of the Gaussian stochastic process. Even though the error can be reduced by upsampling, at the cost of computational speed, the discrete filter with FFT is much faster than CIQ. We note that simulating a larger latent surface with constant sampling step size reduces the errors, because the given ACVF converges to \(0\) for large \({\varvec{\tau}}\).

5 Applications

We apply the discrete filter to simulate rough surfaces with the standard additive Gaussian noise model \(p\left(z\left({{\varvec{x}}}_{n}\right)|g\left({{\varvec{x}}}_{n}\right)\right)=\mathcal{N}\left(z\left({{\varvec{x}}}_{n}\right);g\left({{\varvec{x}}}_{n}\right),{\sigma }^{2}\right)\), and we show an application of a non-Gaussian noise model.

The usual approach to simulate honed surfaces is to simulate multiple ground surfaces and superpose them [6, 21, 26]. Alternatively, we use a non-Gaussian noise model with a generalized ACVF to simulate a honed surface in a single simulation procedure. This approach can reduce the computational time since a one-step honed surface then needs only one instead of two simulations. The generalized ACVF is

$$r\left({\varvec{\tau}}\right)=\frac{{\sigma }_{\mathrm{k}}^{2}}{2}\left(\mathrm{exp}\left(-{\left({{{\varvec{\tau}}}^{\prime}}^{\top }{{\varvec{\Lambda}}}^{-2}{{\varvec{\tau}}}^{\prime}\right)}^\frac{1}{2}\right)+\mathrm{exp}\left(-{\left({{{\varvec{\tau}}}^{\boldsymbol{*}}}^{\top }{{\varvec{\Lambda}}}^{-2}{{\varvec{\tau}}}^{\boldsymbol{*}}\right)}^\frac{1}{2}\right)\right),$$
(29)

with \({{\varvec{\tau}}}^{\prime}={{\varvec{T}}}_{\phi }{{\varvec{\tau}}}^{\top }\) and \({{\varvec{\tau}}}^{*}={{\varvec{T}}}_{-\phi }{{\varvec{\tau}}}^{\top }\).

In the following, we simulated a one-step honed surface with the proposed approach and an additive non-Gaussian noise model defined as follows

$$Z\left({\varvec{x}}\right)=g\left({\varvec{x}}\right)+\varepsilon , \qquad {\varvec{x}}\in {\mathcal{X}}_{{\varvec{N}}}$$
(30)

where \(\varepsilon \) is an i.i.d. Pearson type III distributed random variable. We use the Pearson type III distribution because it provides a skewness parameter and honed surfaces often exhibit skewed height distributions.
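A sketch of this noise step with SciPy's Pearson type III distribution (the skewness and scale values are placeholders, and \(g\) is assumed to be a latent surface simulated from the generalized ACVF (29), e.g., with the discrete-filter sketch above):

```python
import numpy as np
from scipy import stats

def add_pearson3_noise(g, skew=-1.5, scale=0.1, seed=0):
    """Additive i.i.d. Pearson type III noise, eq. (30); the negative skew mimics the
    plateaued height distribution of honed surfaces (placeholder parameter values)."""
    rng = np.random.default_rng(seed)
    eps = stats.pearson3.rvs(skew, loc=0.0, scale=scale, size=g.shape, random_state=rng)
    return g + eps

# For the standard Gaussian noise model, one would instead add
# rng.normal(0.0, sigma, size=g.shape) to the same latent surface g.
```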

Fig. 6. Simulated honed surfaces with Gaussian and non-Gaussian noise models but the same underlying latent surface.

To isolate the influence of the noise models, both honed surfaces share the same latent surface, simulated once with the generalized ACVF (29). The same sampled latent surface \({\varvec{g}}=\{g\left({\varvec{x}}\right), {\varvec{x}}\in {\mathcal{X}}_{N}\}\) was passed into the standard Gaussian white noise model and into the non-Gaussian noise model to generate a roughness sample \({\varvec{z}}=\{z\left({\varvec{x}}\right), {\varvec{x}}\in {\mathcal{X}}_{N}\}\) in each case. Figure 6 clearly shows that both surfaces share the same latent surface and differ only in the applied additive noise model. However, the surfaces do not have continuous grooves, which is a characteristic of honed surfaces. This error is due to the structure of the generalized ACVF, which has a distinct peak in the center.

Fig. 7. Estimated distributions of a real honed surface and of a simulated two-step honed surface with an additive Gaussian and an additive non-Gaussian noise model.

To emphasize the differences between the two simulations, we compare the sampled distributions of the simulated honed surfaces with that of a real honed surface in Fig. 7. The real surface was measured with a confocal measurement tool and processed with a form operator after a noise filter. We obtained the sampled distribution in each case from the surface heights of the individual surface. The visualization shows that the non-Gaussian noise model leads to a better match with the real surface than the Gaussian noise model, especially at the tails. This is because the Pearson type III noise also leads to a skewed distribution of the roughness. In fact, the skewness estimates of the measured surface and of the roughness with the non-Gaussian noise model have comparable magnitude, whereas the Gaussian noise model yields almost surely no skewness. Explaining skewness by additive noise models alone, however, will most likely lead to incorrect surfaces. Nevertheless, this approach is helpful as an adjustment in case of small skewness.

6 Conclusion

A model of rough surfaces with a GP and a noise model has been suggested by [6] that generalizes current approaches. However, simulations are limited due to the computational complexity associated with sampling from the latent GP.

We addressed this problem for stationary rough surfaces in this paper. Similar to [7, 20, 21], we applied the discrete filter together with the FFT algorithm, since it can be motivated from the GP linear process view. Compared to Cholesky and CIQ, this approach leads to a significant speed-up. However, the discrete filter comes with approximation errors and is limited to stationary surfaces. These errors can be reduced by sampling a larger latent surface. Additionally, we applied an additive non-Gaussian noise model for honed surfaces whose distributions have small skewness. This approach can further reduce the computational complexity.

Even though we benchmarked the discrete filter against Cholesky and CIQ in this paper, an extensive comparison with other state-of-the-art methods is required to further classify this approach. For example, comparisons should be made with circulant embedding [19], random Fourier features [18], Lanczos variance estimates [9], and CIQ [10] with inducing point methods [27]. Moreover, future research should study the approximation errors of this approach when the PSD, or even the continuous filter map, is known from the ACVF. Another focus could be the application of non-additive noise models for roughness simulations, because we discussed only additive noise models in this work.