1 Introduction

Seismic tomography methods reconstruct the velocity distribution of the investigated region of the Earth such that the computed travel times agree with the measurements. In most methods, this is done by solving a least-squares (LSQ) problem. In tomography, least-squares problems are frequently solved by so-called row action methods (Nolet 1987; Herman 2009) such as the Algebraic Reconstruction Technique (ART) or the Simultaneous Iterative Reconstruction Technique (SIRT). Scales (1987) showed that, by taking advantage of the sparsity of the tomographic distance matrix, the least-squares problem can be solved by the Conjugate Gradient (CG) method even in large-scale tomographic inversion. This low-cost CG algorithm of Scales was applied in solving the double trace (DT) tomography problem (Dobróka et al. 1992).

On the other hand, it is well known that the least-squares solution is very sensitive to non-Gaussian noise, especially to sparsely distributed large errors, i.e. outliers in the data set, so robust estimation methods should be used. One of the most frequently used robust optimization procedures is the Least Absolute Deviation (LAD) method, which uses the L1 norm to characterize the misfit between the observed and predicted data. An efficient algorithm was developed for its tomographic use by Scales et al. (1988). Another way to achieve statistical robustness is the use of the Cauchy criterion (Amundsen 1991). In this case, the misfit function is the weighted norm of the deviation between the observed and predicted data vectors (the weights are the so-called Cauchy weights with a priori known scale parameters). In the framework of the Most Frequent Value (MFV) method, Steiner (1988) developed a more flexible weight in which the scale parameters are automatically derived from the data set. Using MFV weights in an iteratively reweighted least-squares procedure, efficient outlier reduction can be achieved (Dobróka et al. 1992; Hering et al. 1995). The W-SIRT, an improved version of the traditional SIRT algorithm in which MFV weights are applied to produce a robust tomography method, was developed by Dobróka et al. (2017).

It is well known that image processing tools can be applied successfully to improve the quality of seismic tomograms. Gersztenkorn and Scales (1987) showed that the distortions caused by outliers can be appreciably reduced by smoothing the tomograms (at the end of the tomographic reconstruction) with the so-called alpha-trimmed mean, which performs a continuous shift from the arithmetic mean to the median depending on the value of the alpha parameter.

In this paper, a new image processing tool called the Steiner filter is introduced, in which MFV weights are applied to further reduce the influence of outliers. In this way the outlier reduction is twofold: a robust W-SIRT inversion method is applied in the tomographic reconstruction, and the tomogram is further improved by the robust Steiner filter. Medium-sized tomographic images are used to analyze the noise reduction capability of the new filter and to compare it to the smoothing filter based on the arithmetic mean and to the robust filter based on the median.

2 Definition of the Steiner filter

In the 2D case, the tomogram is an array of velocity or slowness data on a (usually) regular grid. Within an element of the grid (pixel), the value \(f(i,j)\) of the physical quantity is constant. Here \((i = 1,\ldots,N,\; j = 1,\ldots,M)\), where \(N\) and \(M\) give the tomogram’s size in pixels. To filter the tomogram, one can define a 2D window containing (2k + 1) × (2k + 1) pixels (k = 1, 2, …) placed symmetrically around the \((i,j)\) pixel. With the middle of the window positioned at the \((i,j)\) pixel of the tomogram, the filtered value is given as

$$g(i,j) = \sum\limits_{u = - k}^{k} {} \sum\limits_{v = - k}^{k} {T(u,v)f(i - u,j - v)} ,\,\,\,\,(i = 1, \ldots N - k,j = 1, \ldots ,M - k),$$

where \(T(u,v)\) is the filter function (kernel or mask).
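The windowed filtering defined above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function name `apply_mask`, the list-of-rows image layout and the choice to copy border pixels unchanged are assumptions made here, not details given in the text. A uniform 3 × 3 mask is used as the example kernel.

```python
def apply_mask(f, T):
    """Filter image f (a list of rows) with a (2k+1) x (2k+1) mask T.

    Evaluates g(i,j) = sum_u sum_v T(u,v) * f(i-u, j-v) on interior pixels.
    Border pixels, where the window would leave the image, are copied
    unchanged (an assumption of this sketch).
    """
    k = len(T) // 2
    n_rows, n_cols = len(f), len(f[0])
    g = [row[:] for row in f]
    for i in range(k, n_rows - k):
        for j in range(k, n_cols - k):
            acc = 0.0
            for u in range(-k, k + 1):
                for v in range(-k, k + 1):
                    acc += T[u + k][v + k] * f[i - u][j - v]
            g[i][j] = acc
    return g


# Example: a homogeneous 5 x 5 image with one outlier pixel in the middle.
T_mean = [[1.0 / 9.0] * 3 for _ in range(3)]
image = [[4.0] * 5 for _ in range(5)]
image[2][2] = 13.0
smoothed = apply_mask(image, T_mean)
print(smoothed[2][2])  # approximately 5.0: the outlier is spread over the window
```

A linear mask of this kind only spreads the outlier over its neighbourhood; it does not remove it, which is the motivation for robust alternatives.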

In noise reduction, smoothing filters with the masks (in the 3 × 3 case)

$${\mathbf{T}}_{1} = \frac{1}{9}\left[ {\begin{array}{*{20}c} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \\ \end{array} } \right]\;{\text{and}}\;{\mathbf{T}}_{2} = \frac{1}{16}\left[ {\begin{array}{*{20}c} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \\ \end{array} } \right]$$
(1)

are frequently used, where \({\mathbf{T}}_{1}\) and \({\mathbf{T}}_{2}\) give the arithmetic and the binomial mean, respectively. In image processing, the median filter is also extensively used; here the filtered value is the median of the data in the window defined by the mask

$$g(i,j) = {\text{median}}\left\{ {f(i - u,j - v)} \right\}\,,\,\,(u,v = - P, \ldots ,P)$$
(2)

where P is the half-width of the window, i.e. the window contains (2P + 1) × (2P + 1) pixels. A more general filter can give a weighted average of the noisy pixel values

$$T_{w} = \frac{1}{{\sum\limits_{u,v = 1}^{3} {w_{u,v} } }}\left[ {\begin{array}{*{20}c} {w_{11} } & {w_{12} } & {w_{13} } \\ {w_{21} } & {w_{22} } & {w_{23} } \\ {w_{31} } & {w_{32} } & {w_{33} } \\ \end{array} } \right]$$
(3)
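The median filter of Eq. (2) and the normalized weighted average behind Eq. (3) can be sketched as follows. The helper names (`window_values`, `median_filter`, `weighted_average`) and the unchanged border pixels are assumptions of this sketch.

```python
from statistics import median


def window_values(f, i, j, P):
    """Pixel values in the (2P+1) x (2P+1) window centred on pixel (i, j)."""
    return [f[i - u][j - v]
            for u in range(-P, P + 1)
            for v in range(-P, P + 1)]


def median_filter(f, P=1):
    """Eq. (2): replace each interior pixel by the median of its window."""
    n_rows, n_cols = len(f), len(f[0])
    g = [row[:] for row in f]          # border pixels stay unchanged
    for i in range(P, n_rows - P):
        for j in range(P, n_cols - P):
            g[i][j] = median(window_values(f, i, j, P))
    return g


def weighted_average(values, weights):
    """Eq. (3): weighted mean of the window data, normalized by the weight sum."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


# Example: a homogeneous image with a single outlier pixel.
noisy = [[4.0] * 5 for _ in range(5)]
noisy[2][2] = 13.0
cleaned = median_filter(noisy, P=1)
print(cleaned[2][2])  # the median ignores the single outlier
```

With equal weights, `weighted_average` reduces to the arithmetic mean of the window; robustness has to come from weights that are small for outlying data.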

In the research activity of the Geophysical Department of the University of Miskolc, it was shown that outlier noise can be rejected efficiently by using Steiner weights (Dobróka et al. 1991; Hering et al. 1995; Kale and Dobróka 2016). Applying this tool in image processing, a new and efficient filter, the Steiner filter, can be introduced with the weights

$$w_{k} = \frac{\varepsilon^{2}}{\varepsilon^{2} + (d_{k} - M)^{2}},\,\,(k = 3(v - 1) + u)$$
(4)

where \(d_{k}\) is the k-th datum (pixel value) in the window, and \(\varepsilon\) and \(M\) are the scale and location parameters, respectively, calculated in an iterative procedure over the N data of the window with

$$M_{j + 1} = \frac{{\sum\limits_{k = 1}^{N} {\frac{{d_{k} }}{{\varepsilon_{j}^{2} + r_{k}^{2} }}} }}{{\sum\limits_{k = 1}^{N} {\frac{1}{{\varepsilon_{j}^{2} + r_{k}^{2} }}} }},\quad \varepsilon_{j + 1}^{2} = 3\,\frac{{\sum\limits_{k = 1}^{N} {\frac{{r_{k}^{2} }}{{\left( {\varepsilon_{j}^{2} + r_{k}^{2} } \right)^{2} }}} }}{{\sum\limits_{k = 1}^{N} {\frac{1}{{\left( {\varepsilon_{j}^{2} + r_{k}^{2} } \right)^{2} }}} }}$$
(5)

in the j-th iteration (\(r_{k} = d_{k} - M_{j}\)). The starting value of the location parameter is the mean of the data (in the mask), while the starting value of \(\varepsilon\) is chosen as \(\varepsilon_{0} \le \sqrt 3 \left( {r_{\max } - r_{\min } } \right)/2\) (Steiner 1988).
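A minimal sketch of the MFV iteration and of a Steiner-type window filter is given below, under stated assumptions. The location update is taken here as the weighted mean of the data with weights 1/(ε² + r²), the usual form of the MFV iteration, and the scale update follows Eq. (5). The starting scale uses the upper bound √3(d_max − d_min)/2. The stopping tolerance, the lower floor on ε (which guards against a zero denominator when most window values coincide) and all function names are choices of this sketch rather than details from the text.

```python
import math


def mfv(data, n_iter=50, tol=1e-8):
    """Most Frequent Value iteration: returns the location M and the scale eps."""
    M = sum(data) / len(data)                # starting location: the mean
    spread = max(data) - min(data)
    if spread == 0.0:                        # constant window: nothing to iterate
        return M, 0.0
    eps = math.sqrt(3.0) * spread / 2.0      # starting scale (upper bound)
    eps_floor = 1e-6 * spread                # assumption: keeps eps away from zero
    for _ in range(n_iter):
        e2 = eps * eps
        r2 = [(d - M) ** 2 for d in data]
        inv = [1.0 / (e2 + x) for x in r2]
        M_new = sum(d * w for d, w in zip(data, inv)) / sum(inv)
        eps_new = math.sqrt(3.0 * sum(x * w * w for x, w in zip(r2, inv))
                            / sum(w * w for w in inv))
        eps_new = max(eps_new, eps_floor)
        done = (abs(M_new - M) <= tol * spread
                and abs(eps_new - eps) <= tol * spread)
        M, eps = M_new, eps_new
        if done:
            break
    return M, eps


def steiner_value(data):
    """Weighted average of the window data with the Steiner weights of Eq. (4)."""
    M, eps = mfv(data)
    if eps == 0.0:
        return M
    e2 = eps * eps
    w = [e2 / (e2 + (d - M) ** 2) for d in data]
    return sum(wk * d for wk, d in zip(w, data)) / sum(w)


def steiner_filter(f, k=1):
    """Apply steiner_value in a (2k+1) x (2k+1) window; borders are left unchanged."""
    n_rows, n_cols = len(f), len(f[0])
    g = [row[:] for row in f]
    for i in range(k, n_rows - k):
        for j in range(k, n_cols - k):
            window = [f[i - u][j - v]
                      for u in range(-k, k + 1)
                      for v in range(-k, k + 1)]
            g[i][j] = steiner_value(window)
    return g


# Example: eight values scattered around 4.0 and one outlier.
window = [4.0, 4.1, 3.9, 4.05, 3.95, 4.0, 4.1, 3.9, 13.0]
M, eps = mfv(window)
print(M, eps)                  # M stays close to 4.0 despite the outlier
print(steiner_value(window))
```

In this sketch the outlier receives a weight several orders of magnitude smaller than the bulk of the data, so it barely influences the filtered value, whereas the arithmetic mean of the same window is 5.0.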

3 Application of the Steiner filter in noise reduction of seismic tomography images

For the numerical experiments, a rectangular test area of 100 × 100 cells was defined. The model contains three anomalies with velocities of 6 km/s (red), 5 km/s (yellow) and 3 km/s (magenta), embedded in a homogeneous background of 4 km/s (blue). Sources and receivers were positioned along the x- and y-axes in an arrangement fulfilling the requirement of full tomographic ray coverage, so the theoretical traveltime data were computed along 60,000 ray traces. To model quasi-measured data containing outliers (dataset I.), the theoretical traveltimes were contaminated with 1% Gaussian noise, and an extra 10% noise was added to a randomly selected 20% portion of the data. The tomographic reconstruction was made using the traditional Simultaneous Iterative Reconstruction Technique (SIRT) and its improved version, the W-SIRT, robustified by using MFV weights (Dobróka et al. 2017). Instead of the exact model, its SIRT-reconstructed tomogram obtained from the noise-free data set is presented in Fig. 1.

Fig. 1 The model reconstructed by means of noise-free travel time data (the horizontal coordinates are given in cell-size units)

The Simultaneous Iterative Reconstruction Technique is one of the most frequently used methods in seismic tomography. In a typical step of the algorithm, the arithmetic mean of the so-called ART corrections belonging to the seismic rays crossing the j-th cell is calculated as

$$s_{j}^{(q + 1)} = s_{j}^{(q)} + \frac{1}{{Q_{j} }}\sum\limits_{i = 1}^{{Q_{j} }} {\frac{{D_{ij} r_{i}^{(q)} }}{{\sum\limits_{k} {D_{ik}^{2} } }}}$$
(6)

Here \(s_{j}^{(q)}\) is the slowness of the j-th cell in the q-th iteration, \(Q_{j}\) denotes the number of rays crossing the j-th cell, \(r_{i}^{(q)}\) is the difference between the i-th measured and calculated traveltime, and \(D_{ij}\) is the length of the i-th ray's segment in the j-th cell. If, instead of this simple arithmetic mean, a weighted average of the ART corrections is used,

$$s_{j}^{(q + 1)} = s_{j}^{(q)} + \frac{1}{{\sum\limits_{l = 1}^{{Q_{j} }} {W_{ll} } }}\sum\limits_{i = 1}^{{Q_{j} }} {W_{ii} } \frac{{D_{ij} r_{i}^{(q)} }}{{\sum\limits_{k} {D_{ik}^{2} } }}$$
(7)

a new version of the SIRT algorithm is obtained. Using the Steiner weights of Eq. (4), a robust W-SIRT method can be defined (Dobróka and Szegedi 2014; Kale and Dobróka 2016).

To characterize the accuracy of the reconstruction the (relative) model distance

$$D = \sqrt {\frac{1}{M}\sum\limits_{j = 1}^{M} {\left( {\frac{{s_{j}^{{}} - s_{j}^{(0)} }}{{s_{j}^{(0)} }}} \right)^{2} } }$$
(8)

was used. Here \(s_{j}\) and \(s_{j}^{(0)}\) denote the slowness in the j-th cell of the reconstructed picture and of the model, respectively, and M is the total number of cells. The tomograms given by the SIRT or W-SIRT methods contain the slowness data in each pixel, so each can be considered as a greyscale image in which the grey level is the slowness (or velocity). In this way the SIRT and W-SIRT tomograms were converted to jpg images of 100 × 100 pixels. Fig. 2a, b show the tomograms (colour-coded for display).

Fig. 2 The reconstruction of the noisy travel time data (with outliers) using the (a) SIRT and (b) W-SIRT tomography methods (the horizontal coordinates are given in cell-size units)

Utilizing Eq. (8), the distance between the noise-free (Fig. 1 as reference) and the SIRT reconstructed noisy image (Fig. 2a) is D = 0.1894. The image in Fig. 2b, given by the W-SIRT method (using Steiner weights), is characterized by a model distance of D = 0.0409 (the improvement due to the robust tomography method is 78.4%). It can be seen that, in the tomographic reconstruction of a data set containing outliers, the robust W-SIRT method has a much better noise reduction capability.
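The model distance of Eq. (8) and the quoted improvement can be checked with a short sketch. The improvement is assumed here to be the relative reduction of D with respect to the unfiltered (or non-robust) result, which reproduces the percentages given in the text; the function names are illustrative.

```python
import math


def model_distance(s, s_ref):
    """Eq. (8): RMS of the relative slowness deviations over all cells."""
    M = len(s_ref)
    return math.sqrt(sum(((a - b) / b) ** 2 for a, b in zip(s, s_ref)) / M)


def improvement_percent(d_before, d_after):
    """Relative reduction of the model distance, in percent."""
    return 100.0 * (d_before - d_after) / d_before


# SIRT (D = 0.1894) versus W-SIRT (D = 0.0409) on dataset I.
print(round(improvement_percent(0.1894, 0.0409), 1))  # 78.4
```

Eq. (8) divides by the reference slowness in every cell, so it is only usable when the reference image has no zero values.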

Figure 3a, b show the effect of the Steiner filter on the reconstructed SIRT and W-SIRT images, respectively. The model distance between the noise-free and the Steiner-filtered SIRT reconstruction (Fig. 3a) is D = 0.0528. Relative to Fig. 2a an improvement of 72.1% is found due to the use of the Steiner filter. The same calculation gives D = 0.0241 model distance in the case of the Steiner-filtered W-SIRT picture (Fig. 3b) with 41.1% improvement due to the application of the Steiner filter.

Fig. 3 The effect of the Steiner filter used on the (a) SIRT and (b) W-SIRT pictures (the horizontal coordinates are given in cell-size units)

For the sake of comparison, we applied arithmetic mean and median filtering to the SIRT image in Fig. 2a. The results are shown in Fig. 4a (mean filter) and Fig. 4b (median filter). Relative to the non-filtered tomogram, the mean filter gives D = 0.0603 model distance (68.2% improvement), while the median filter results in essentially the same effect as the Steiner filter with D = 0.0602 model distance (68.2% improvement). The comparison to Fig. 2 shows that Steiner, mean and median filtering all have a smoothing effect (lowering the curvature) at the corners of the anomalies. This distortion increases with the size of the filtering mask.

Fig. 4 The result of filtering the SIRT reconstructed noisy image using the (a) mean filter and (b) median filter (the horizontal coordinates are given in cell-size units)

The same tests were also made with the W-SIRT image of Fig. 2b. The model distance between the noise-free and the mean-filtered W-SIRT reconstruction is D = 0.0257, a 37.2% improvement relative to the non-filtered tomogram in Fig. 2b. The same calculation gives D = 0.0220 model distance in the case of the median-filtered W-SIRT picture (46.0% improvement due to median filtering).

It can be seen that using image processing tools, the quality of noisy tomograms can further be improved. The examples show that the outlier reduction effect of the new Steiner filter is similar to that of the median filter.

4 Edge detection in seismic tomography using image processing tools

In seismic tomography, the geological structure is investigated using seismic traveltime data. To support the interpretation of the tomographic result special transformations can be applied to the tomogram. It is a frequent problem to emphasize the borders of a certain geological structure (layer boundary, fault, etc.). There are commonly used tools for edge detection in image processing: the Prewitt and Sobel operators.

The differences along the x- and y-axes are calculated using the convolution masks of the Prewitt operator (figure a).

It can be seen that the difference is calculated three times and the arithmetic mean of the three values is used as the local difference. In the case of the Sobel operator the difference is also calculated three times, but the binomial mean is used to characterize the local difference (figure b).

Using these convolution masks, the change along the x-axis (an approximation of the x-derivative) can be calculated as

$$\partial_{x} (i,j) = \sum\limits_{u = - k}^{k} {} \sum\limits_{v = - k}^{k} {D_{x} (u,v)f(i - u,j - v)} ,\,\,\,\,(i = 1, \ldots N - k,j = 1, \ldots ,M - k)$$

and similarly

$$\partial_{y} (i,j) = \sum\limits_{u = - k}^{k} {} \sum\limits_{v = - k}^{k} {D_{y} (u,v)f(i - u,j - v)} ,\,\,\,\,(i = 1, \ldots N - k,j = 1, \ldots ,M - k)$$

These quantities can be considered as the two components of the 2D gradient vector. Its direction gives the direction of the maximal change of the slowness function, while its absolute value (the edge gradient \(\sqrt {\partial_{x}^{2} + \partial_{y}^{2} }\)) defines the rate of the total change in the same direction.
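The edge gradient can be sketched with the commonly used 3 × 3 Sobel masks. The masks below are the textbook form; the 1/4 normalization (binomial mean of three central differences), the sign convention, the mapping of the column index to the x-axis and the zero-valued border are assumptions of this sketch, since the masks of the original figures are not reproduced in the text.

```python
import math

# Textbook Sobel masks (normalized by 1/4 inside edge_gradient).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def edge_gradient(f):
    """Return sqrt(dx^2 + dy^2) for interior pixels; border pixels are set to 0."""
    n_rows, n_cols = len(f), len(f[0])
    g = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            dx = dy = 0.0
            for u in (-1, 0, 1):
                for v in (-1, 0, 1):
                    val = f[i - u][j - v]
                    dx += SOBEL_X[u + 1][v + 1] * val / 4.0
                    dy += SOBEL_Y[u + 1][v + 1] * val / 4.0
            g[i][j] = math.hypot(dx, dy)
    return g


# Example: a vertical step edge between columns 2 and 3.
step = [[1.0 if j >= 3 else 0.0 for j in range(6)] for _ in range(5)]
edges = edge_gradient(step)
print(edges[2])  # non-zero only next to the step
```

Because the formula is a true convolution (the mask is flipped relative to the image), the sign of each component depends on the convention, but the edge gradient magnitude does not.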

In Fig. 5 the effect of the Sobel operator (edge gradient) is demonstrated on a test image (“Lena”, frequently used in image processing). As can be seen, the gradient is zero over the homogeneous ranges, so black is the dominant colour in the edge gradient image, and the edges appear as strong lines. In colour images, the Sobel filters should be calculated on all three constituent matrices (red, green and blue) of the image.

Fig. 5 The effect of the edge gradient (Sobel filter)

Using edge filters on tomograms the borders of our geological models can be detected. We demonstrate the effect of Sobel edge detection on filtered and non-filtered SIRT and W-SIRT tomograms. As a first step, the Sobel operator is applied to the noise-free tomogram of Fig. 1. The result is shown in Fig. 6. (The small disturbances are caused by reconstruction errors.)

Fig. 6 The effect of the Sobel edge detector on the noise-free tomogram (the horizontal coordinates are given in cell-size units)

This picture serves as a reference for the later tests: the model distances are calculated relative to this image. Eq. (8) is not applicable for this purpose, because the reference image contains zero values (in the homogeneous segments), which would appear in the denominator. The new distance formula is

$$D = \sqrt {\frac{{\sum\limits_{j = 1}^{M} {\left( {s_{j}^{{}} - s_{j}^{(0)} } \right)^{2} } }}{{\sum\limits_{j = 1}^{M} {\left( {s_{j}^{(0)} } \right)^{2} } }}}$$
(9)

where \(s_{j}\) and \(s_{j}^{(0)}\) denote the difference values in the j-th cell of the actual and the reference images, respectively, and M is the total number of cells.
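Eq. (9) can be sketched as a normalized Euclidean distance between two flattened edge images. The function name and the small example vectors are illustrative assumptions.

```python
import math


def edge_image_distance(s, s_ref):
    """Eq. (9): Euclidean distance of s from s_ref, normalized by the norm of s_ref."""
    num = sum((a - b) ** 2 for a, b in zip(s, s_ref))
    den = sum(b ** 2 for b in s_ref)
    return math.sqrt(num / den)


# Zeros in the reference are harmless here, unlike in Eq. (8).
reference = [0.0, 0.0, 1.0, 1.0, 0.0]
actual = [0.0, 0.5, 1.0, 1.0, 0.0]
print(edge_image_distance(actual, reference))
```

Since the normalization is global rather than cell by cell, this distance is not bounded by 1: spurious edges in a noisy image can carry more energy than the reference edges themselves.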

Figure 7a shows the effect of the Sobel filter on the SIRT reconstructed noisy tomogram (shown in Fig. 2a), while Fig. 7b demonstrates the effect of Sobel edge detection on the Steiner-filtered SIRT tomogram (in Fig. 3a). The model distances relative to the image in Fig. 6 are D = 1.809 in the case of Fig. 7a and D = 0.889 in the case of Fig. 7b.

Fig. 7 The effect of the Sobel edge detector on the (a) non-filtered and (b) Steiner-filtered SIRT tomogram (the horizontal coordinates are given in cell-size units)

A similar test was performed on the W-SIRT images. Figure 8a shows the effect of the Sobel filter on the W-SIRT tomogram (D = 0.593), while Fig. 8b demonstrates the effect of Sobel edge detection on the Steiner filtered W-SIRT tomogram (D = 0.508).

Fig. 8 The Sobel filter on the (a) non-filtered and (b) Steiner-filtered W-SIRT tomogram (the horizontal coordinates are given in cell-size units)

Finally, the Sobel edge detector is used on the mean- and the median-filtered W-SIRT tomograms. The result is shown in Fig. 9. The model distances relative to the reference image (in Fig. 6) are D = 0.533 in the case of the mean filter (Fig. 9a) and D = 0.453 in the case of the median filter (Fig. 9b).

Fig. 9 The effect of the Sobel edge detector on the (a) mean-filtered and (b) median-filtered W-SIRT tomogram (the horizontal coordinates are given in cell-size units)

It can be seen that the combined use of edge detection and noise reduction, by smoothing filters as well as by robust (median and Steiner) filters, improves the quality of the seismic tomographic images. The effect of the newly introduced Steiner filter is similar to that shown by the median filter.

To test the noise dependence of the Steiner filter, we generated dataset II. on the same model and measurement system (perfect ray coverage). The theoretical travel times were contaminated with 2% Gaussian noise, and an extra 20% noise was added to a randomly selected 20% portion of the data (relative to dataset I., the signal-to-noise ratio is reduced by a factor of two). The dataset was reconstructed using the W-SIRT method; the resulting velocity map is shown in Fig. 10a. The model distance defined in Eq. (8) is D = 0.0811 (increased by a factor of 2 in comparison to Fig. 2b). The same calculation gives a model distance of D = 0.0328 for the Steiner-filtered W-SIRT picture shown in Fig. 10b. This result shows that the Steiner filter is also efficient on noisier tomograms and gives an appreciably improved image.

Fig. 10 The velocity map found by (a) reconstructing dataset II. using the W-SIRT method and (b) the picture after Steiner filtering (the horizontal coordinates are given in cell-size units)

The Sobel edge detector acting on the pictures of Fig. 10a, b results in the images shown in Fig. 11a, b.

Fig. 11 The Sobel filter on the (a) non-filtered and (b) Steiner-filtered W-SIRT tomogram found by reconstructing dataset II. (the horizontal coordinates are given in cell-size units)

5 Conclusions

As a new robust tool in image processing, the Steiner filter is introduced, in which the Most Frequent Value method developed by Steiner (1988) is applied to calculate the central element of the convolution mask. The effect of the filter was tested on a medium-sized tomographic map. The tomographic reconstructions were made using the traditional SIRT method and also its robustified version, the W-SIRT; the filter was applied after (and independently of) the tomographic reconstruction. The input travel time data generated on a synthetic model were contaminated with noise of non-Gaussian distribution (containing outliers). It was shown that the quality of the tomogram can be further improved by using the new filter. For comparison, the smoothing filter based on the arithmetic mean as well as the median-based filter were applied. It was found that the Steiner filter acts as a robust tool with an efficiency similar to that of the median filter and can also be applied successfully in edge detection tests.

The good noise rejection power of the Steiner filter makes it applicable inside the tomographic reconstruction (in each or periodically selected iteration). This can be the subject of further investigations in which all the important aspects of tomography and image processing can be jointly investigated.