1 Introduction

Noise reduction is among the most important image processing operations. Image restoration and enhancement methods are particularly relevant because of the miniaturization of high-resolution, low-cost image sensors, which frequently operate in poor lighting conditions.

Quite often, color images are corrupted by various types of noise, introduced by sensor imperfections that influence the image formation process, signal instabilities, aging of the storage material, flawed memory locations, transmission errors in noisy channels and electromagnetic interference. The quality of color images is severely decreased by impulsive noise distortions, and their removal is one of the most frequently performed low-level processing tasks [15].

Reducing the disturbances introduced by impulsive noise is crucial for image preprocessing, as the corruption may have a significant negative impact on the success of the whole processing pipeline. Therefore, numerous filtering techniques for impulsive noise suppression have been developed over many years of intensive research.

Numerous filters designed to deal with impulsive noise in color images are based on order statistics [6–11]. The majority of these algorithms rely on the ordering of a set of color pixels, treated as vectors, belonging to the processed pixel’s neighborhood, represented by a sliding operational window W. For each pixel from W, the sum of distances to the other samples belonging to the same window is computed, and then the cumulated distances are sorted. As a result, an ordered sequence of color pixels is obtained, which is the basis for various filtering methods.

One of the most popular methods based on reduced ordering, used in many filtering designs, is the Vector Median Filter (VMF) [6, 12]. The VMF output is the pixel from W for which the sum of distances to the other samples is minimized. It is always one of the pixels of the filtering window, which is beneficial as the filter does not introduce any new colors to the processed image. However, when all pixels of W are affected, for example by additional Gaussian noise, the output is also noisy. Numerous solutions devoted to the elimination of this undesired behavior were introduced, resulting in significantly better filtering performance [12–15]. To increase the VMF efficiency, weights are assigned to the distances between pixels, which privilege the central pixel of the filtering window, thus diminishing the number of unnecessarily altered pixels [16, 17].
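The VMF selection rule can be illustrated with a minimal sketch. It assumes the Euclidean distance in RGB and represents the window as a plain list of pixel tuples; the function names `euclidean` and `vector_median` are illustrative and are not taken from the cited works.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two color pixels given as tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def vector_median(window):
    """Return the pixel of `window` whose sum of distances to all samples is minimal."""
    sums = [sum(euclidean(x, y) for y in window) for x in window]
    return window[sums.index(min(sums))]

# Four similar pixels and one impulse; the output is always one of the inputs.
window = [(10, 10, 10), (12, 11, 9), (11, 10, 10), (255, 0, 255), (9, 12, 11)]
result = vector_median(window)
```

Because the output is chosen from the window itself, no new color is created, and the impulse is never selected as long as most of the samples are mutually similar.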

The efficiency of the techniques utilizing various vector ordering schemes is limited due to a common feature: every pixel of the image is processed, regardless of whether it is contaminated or not. This results in the inevitable distortion of uncorrupted pixels and degradation of image quality. Therefore, a natural improvement was the introduction of more efficient switching filters [18–23], which aim at the restoration of only the polluted pixels, leaving the uncorrupted ones unaltered. In the majority of the switching techniques, there is a need to determine the dissimilarity between the color pixels. The most intuitive and popular approach is to compute the Euclidean distance in the RGB color space; however, there are many other measures of vector dissimilarity applied in various filtering frameworks [24–26].

Further improvement, resulting in better robustness to outliers, was achieved by calculating only a few of the smallest distances between a pixel and the other samples belonging to the same processing window [27–29]. Such a modification, in which the trimmed cumulative distance is utilized as a measure of pixel corruption, also facilitates the preservation of the original image edges and tiny details.

The decision-making step, differentiating between the distorted and uncorrupted pixels, seems to be more important than the choice of the algorithm used for the replacement of pixels classified as noise. The reason is simple: a more precise impulse detection process results in fewer unwanted alterations of original pixels.

There are numerous noisy pixel detection schemes proposed in the literature, and among switching filters, several groups of filtering designs can be enumerated. The Sigma Vector Median Filter (SVMF) [30, 31] and Adaptive Vector Median Filter (AVMF) [18] can be regarded as popular representatives of techniques based on reduced ordering statistics.

An efficient family of filters based on the peer group framework was proposed in [32]. The idea of this switching strategy can be found in various works [33–35]. A significant improvement was made with the introduction of the Fast Peer Group Filter (FPGF) [34]. It was also the inspiration for the recently proposed Fast Averaging Peer Group Filter (FAPGF) [36], which delivers very good performance for highly contaminated images.

Another group of switching filters dedicated to the suppression of the impulsive noise in color images is based on the elements of the quaternion theory [37, 20, 38]. The color pixels, which are generally represented by three channels in the RGB color space, are expressed as quaternions without the real component. In this way, the similarity between pixels is defined in the quaternion form and is used as an alternative for the Euclidean distance, commonly used in the popular filtering designs.

Methods based on fuzzy set theory were also developed for impulsive noise removal [39–45]. These algorithms proved to be very flexible and offer strong performance not only for single image applications, but also for the enhancement of video sequences.

The filters proposed in this paper belong to the family of switching techniques. The impulse detection step is based on the reduced ordering and computation of trimmed cumulative Euclidean distances. Both Arithmetic Mean Filter (AMF) and VMF will be considered as the filter providing the estimate of the corrupted pixels, to enable a comparison of these two competitive solutions.

2 Adaptive switching filtering design

Most of the filtering techniques determine their output for the pixel located at position (u, v) using n samples belonging to a sliding, operating window W with \({\varvec{x}}_{u,v}\) at its center. In order to simplify further analysis, the pixels belonging to W will be denoted as \({\varvec{x}}_1,\ldots ,{\varvec{x}}_n\), and \({\varvec{x}}_1\) will be the central pixel of W as shown in Fig. 1.

Fig. 1 Notation of the pixels in the filtering window

The reduced ordering scheme operates on the sum of dissimilarity measures (distances) denoted as d, between a given pixel and the samples from the filtering window W. In this way, the cumulated dissimilarity measure D assigned to pixel \({\varvec{x}}_{i}\), (\(i = 1,\ldots ,n\)) from W, is

$$\begin{aligned} D_i = \sum _{j=1}^{n} d({\varvec{x}}_i,{\varvec{x}}_j). \end{aligned}$$
(1)

The distances \(d_{ij} = d({\varvec{x}}_i, {\varvec{x}}_j)\) between \({\varvec{x}}_i\) and all other pixels \({\varvec{x}}_j\) belonging to W, \((i,j = 1,\ldots ,n,\ i \ne j)\), can be sorted in ascending order:

$$\begin{aligned} d_{i1}, \ldots , d_{i\nu } \longrightarrow d_{i(1)}, \ldots , d_{i(\nu )}, \end{aligned}$$
(2)

where \(\nu = n-1\), and instead of the aggregated distances in (1) a trimmed sum of distances \(\hat{D}\) can be used [28]

$$\begin{aligned} \hat{D}_i = \sum _{r=1}^{m} d_{i(r)}, \end{aligned}$$
(3)

where m denotes the number of nearest pixels taken for the calculation of the trimmed sum of distances and \(d_{i(r)}\) is the r-th smallest dissimilarity measure. The trimmed sum \(\hat{D}_i\) is significantly less susceptible to outliers among pixels of W than the standard sum of distances \(D_i\) [46, 47].
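The difference between the full sum (1) and the trimmed sum (3) can be seen in a short sketch. It assumes the Euclidean distance and a window stored as a list of RGB tuples; `trimmed_sum` and `full_sum` are hypothetical helper names introduced only for this illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two color pixels given as tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def full_sum(window, i):
    """Eq. (1): sum of distances from window[i] to every pixel of the window."""
    return sum(euclidean(window[i], x) for x in window)

def trimmed_sum(window, i, m):
    """Eq. (3): sum of the m smallest distances from window[i] to the other pixels."""
    dists = sorted(euclidean(window[i], x) for j, x in enumerate(window) if j != i)
    return sum(dists[:m])

# Pixel 0 has two close neighbors and two impulses in its window.
window = [(100, 100, 100), (101, 100, 100), (100, 102, 100), (0, 255, 0), (255, 0, 0)]
d_full = full_sum(window, 0)           # dominated by the two impulses
d_trim = trimmed_sum(window, 0, m=2)   # uses only the two nearest pixels
```

In this example the two impulses inflate the full sum, while the trimmed sum with m = 2 depends only on the two closest samples.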

The value of \(\hat{D}_1\), which is assigned to the central pixel \({\varvec{x}}_1\) can be treated as a measure of the pixel corruption. This value is low when there exist at least m similar pixels in the neighborhood, otherwise the central pixel \({\varvec{x}}_1\) may be considered as corrupted. If \(\hat{D}_1\) divided by m is greater than a predefined threshold value T

$$\begin{aligned} \frac{\hat{D}_1}{m} > T, \end{aligned}$$
(4)

then the central pixel of W will be considered noisy and will be replaced by the output of a suitable robust filter; otherwise this pixel will be designated as uncorrupted and remains unaltered. The division by m in (4) makes the T value independent of the number of close pixels taken for the calculation of \(\hat{D}\). The decision-making scheme described above will be denoted as Switching Trimmed (ST).
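Condition (4) can be sketched as follows, assuming that the central pixel is stored at index 0 of the window list and that the Euclidean distance is used. The values m = 2 and T = 40 are chosen only for illustration, and `is_corrupted_st` is a hypothetical name.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two color pixels given as tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def trimmed_sum(window, i, m):
    """Sum of the m smallest distances from window[i] to the other pixels."""
    dists = sorted(euclidean(window[i], x) for j, x in enumerate(window) if j != i)
    return sum(dists[:m])

def is_corrupted_st(window, m, T):
    """ST rule, condition (4); window[0] plays the role of the central pixel x_1."""
    return trimmed_sum(window, 0, m) / m > T

# A smooth 3x3 neighborhood, and the same neighborhood with an impulse at the center.
clean = [(100, 100, 100)] + [(100 + k, 100, 100 - k) for k in range(1, 9)]
noisy = [(255, 0, 255)] + clean[1:]
flag_clean = is_corrupted_st(clean, m=2, T=40)
flag_noisy = is_corrupted_st(noisy, m=2, T=40)
```

The clean center has close neighbors, so its normalized trimmed sum stays well below the threshold, whereas the impulse has no similar pixel in the window and is flagged.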

Additionally, for every processed pixel the map of noise array M is updated

$$\begin{aligned} M_{u,v}= \left\{ \begin{array}{ll} 0 & :\quad {\text {if recognised as corrupted}}, \\ 1 & :\quad {\text {otherwise}}. \end{array} \right. \end{aligned}$$
(5)

This map will be later used for the noisy pixels replacement.

Now, in order to address the effect of high values of the trimmed sum of distances \(\hat{D}\) in textured regions, a kind of adaptiveness can be introduced to the ST scheme by subtracting the minimum value \(\hat{D}_{(1)}\) calculated for the pixels of W [27]. In this way, the modified condition (4) for the noisy pixel detection is

$$\begin{aligned} \frac{\hat{D}_1-\hat{D}_{(1)}}{m} > T, \quad \hat{D}_{(1)}=\min \{\hat{D}_i :\, i=1,\ldots ,n \}. \end{aligned}$$
(6)

This decision-making step becomes more robust and accurate in dealing with local image textural features and tiny details. The proposed scheme will be denoted as Adaptive Switching Trimmed (AST).
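The effect of subtracting the minimum trimmed sum can be illustrated with a sketch that compares the ST and AST decisions on a strongly varying but uncorrupted window. It assumes the Euclidean distance, the central pixel at index 0, and illustrative thresholds; the function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two color pixels given as tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def trimmed_sum(window, i, m):
    """Sum of the m smallest distances from window[i] to the other pixels."""
    dists = sorted(euclidean(window[i], x) for j, x in enumerate(window) if j != i)
    return sum(dists[:m])

def detect_st(window, m, T):
    """Condition (4): no adaptation to the local structure."""
    return trimmed_sum(window, 0, m) / m > T

def detect_ast(window, m, T):
    """Condition (6): the smallest trimmed sum in the window is subtracted first."""
    sums = [trimmed_sum(window, i, m) for i in range(len(window))]
    return (sums[0] - min(sums)) / m > T

# Textured window: gray levels spaced by 30, so no pixel has a very close neighbor.
ramp = [(v, v, v) for v in [120, 0, 30, 60, 90, 150, 180, 210, 240]]
# Flat window with an impulse at the center.
impulse = [(255, 0, 255)] + [(100, 100, 100)] * 8

st_ramp, ast_ramp = detect_st(ramp, 2, 40), detect_ast(ramp, 2, 34)
st_imp, ast_imp = detect_st(impulse, 2, 40), detect_ast(impulse, 2, 34)
```

In the textured window every pixel has a large trimmed sum, so ST flags the center while AST does not; the impulse is detected by both rules.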

In the AST scheme, the trimmed sums have to be computed for every pixel in W in order to determine the minimum one. Although this solution is computationally efficient, as will be shown later, it requires a relatively large number of distance computations for every pixel in the processing window, and additionally the minimum value has to be determined. Therefore, a simplified and faster approach has also been considered.

The fast AST (FAST) scheme can be performed in two steps. In the first step, for every image pixel, the computation of the trimmed sum of distances to its neighbors is performed: \(\hat{D}_1 = \sum _{r=1}^{m} d_{1(r)}\) [48].

In the second step, the minimum trimmed sum \(\hat{D}_{(1)}\) is computed from all the values of \(\hat{D}\) assigned to the pixels belonging to W and the decision concerning the central pixel corruption is performed according to (6).

Fig. 2 Construction of the AST and FAST denoising schemes. The AST design (a) requires the calculation of a trimmed sum of distances for every pixel of W to determine their minimum value, which is assigned to the central pixel. The FAST scheme (b) needs only the distance values directly designated to image pixels and the minimum value is calculated in the local neighborhood of the processed sample

The AST and FAST schemes are summarized in Fig. 2. In the AST scheme (a), the trimmed sums of distances have to be computed for each pixel of the filtering window W and then the minimum value is calculated. The FAST scheme (b) requires only the calculation of the trimmed cumulative distance for each central pixel of W (each image pixel), and the minimum value of \(\hat{D}\) is taken from the values previously assigned to the pixels of W. Therefore, the FAST scheme needs n times fewer trimmed sums than AST, as only one trimmed distance measure has to be calculated for each pixel, instead of the n values required in the AST scheme.
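The two passes of the FAST scheme can be sketched on a small image. This is a simplified illustration under several assumptions: a \(3\times 3\) window, windows clipped at the image borders (border handling is not specified in the text), the Euclidean distance, and an image stored as a list of rows of RGB tuples. The names `neighbours` and `fast_noise_map` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two color pixels given as tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def neighbours(h, w, u, v):
    """Coordinates of the 3x3 window around (u, v), clipped at the borders, center excluded."""
    return [(r, c)
            for r in range(max(0, u - 1), min(h, u + 2))
            for c in range(max(0, v - 1), min(w, v + 2))
            if (r, c) != (u, v)]

def fast_noise_map(img, m, T):
    """Return the map M with 0 for pixels detected as corrupted and 1 otherwise."""
    h, w = len(img), len(img[0])
    # Step 1: a single trimmed sum of distances is stored for every image pixel.
    D = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            d = sorted(euclidean(img[u][v], img[r][c]) for r, c in neighbours(h, w, u, v))
            D[u][v] = sum(d[:m])
    # Step 2: the minimum of the stored values in the window is used in condition (6).
    M = [[1] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            d_min = min([D[u][v]] + [D[r][c] for r, c in neighbours(h, w, u, v)])
            if (D[u][v] - d_min) / m > T:
                M[u][v] = 0
    return M

# Uniform 5x5 image with a single impulse in the middle.
img = [[(50, 50, 50)] * 5 for _ in range(5)]
img[2][2] = (255, 0, 0)
M = fast_noise_map(img, m=2, T=28)
```

Step 1 computes one trimmed sum per pixel and step 2 only reads the stored values, which is the source of the saving over AST described above.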

Finally, pixels labeled as corrupted (\(M_{u,v} = 0\)) are processed by one of the two algorithms, which are used to determine the estimate of the noisy pixel:

  • VMF output replaces the central pixel of W with a pixel corresponding to \(\hat{D}_{(1)}\),

  • AMF output replaces the central pixel of W with the average of the pixels from W classified as not corrupted (\(M_{u,v} = 1\)). However, in rare situations, all pixels of W can be detected as corrupted, and in such situations the VMF output is used.
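The replacement step with the AMF output and the VMF fallback can be sketched as below. The sketch assumes that the window and the corresponding entries of the noise map are passed as two parallel lists, and that the fallback picks the pixel with the smallest trimmed sum, following the VMF bullet above; `replace_pixel` is a hypothetical name, and the averaged output is left unrounded.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two color pixels given as tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def trimmed_sum(window, i, m):
    """Sum of the m smallest distances from window[i] to the other pixels."""
    dists = sorted(euclidean(window[i], x) for j, x in enumerate(window) if j != i)
    return sum(dists[:m])

def replace_pixel(window, flags, m):
    """Estimate for a corrupted central pixel; flags[i] is the map value M of window[i]."""
    clean = [x for x, f in zip(window, flags) if f == 1]
    if clean:
        # AMF output: channel-wise mean of the pixels marked as uncorrupted.
        return tuple(sum(x[q] for x in clean) / len(clean) for q in range(3))
    # Fallback: no clean pixel in the window, so the pixel with the
    # smallest trimmed sum is returned instead.
    sums = [trimmed_sum(window, i, m) for i in range(len(window))]
    return window[sums.index(min(sums))]

window = [(255, 0, 0), (10, 10, 10), (20, 20, 20), (0, 255, 0)]
amf_out = replace_pixel(window, flags=[0, 1, 1, 0], m=1)
fallback_out = replace_pixel(window, flags=[0, 0, 0, 0], m=1)
```

Unlike the VMF output, the averaged estimate is generally a new color that does not occur in the window.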

The resulting filters will be denoted as follows:

  • STVMF—Switching Trimmed with VMF output,

  • ASTVMF—Adaptive Switching Trimmed with VMF output,

  • FASTVMF—Fast Adaptive Switching Trimmed with VMF output,

  • STAMF—Switching Trimmed with AMF output,

  • ASTAMF—Adaptive Switching Trimmed with AMF output,

  • FASTAMF—Fast Adaptive Switching Trimmed with AMF output.

There are many approaches to impulsive noise modeling [49, 50]. One of the most popular contamination models is the so-called color salt & pepper noise [51–53], which assumes that a fraction of the image pixels denoted as p is corrupted in such a way that the RGB channels are assigned either the minimum or maximum value of the allowable dynamic range (0 or 255 assuming an 8-bit channel representation). The pixel modified by the impulsive noise can be totally corrupted, so that all three channels are replaced by the extreme values, but also one or two components of a corrupted pixel can remain unchanged. The noise distortion can be fully correlated (three channels are always affected), or it can be modeled according to a predefined correlation of channel contamination.

The removal of the salt & pepper noise is facilitated by the fact that only the extreme values of the corrupted pixels should be restored. Therefore, a more challenging and realistic noise model assumes that all channels of an affected pixel are replaced by a random variable drawn from the uniform distribution. For the experiments reported in this paper, we assume that the affected pixels have the RGB channels changed independently by values from the range \(\langle 0,255\rangle\) [36, 49, 54]. This corruption scheme will be called the uniform noise model (UNM).
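A possible way of generating such contamination is sketched below. It assumes that exactly a fraction p of the pixels (rounded to the nearest integer count) is selected at random and that each channel of a selected pixel is replaced by an independent integer drawn uniformly from 0 to 255; the cited works may differ in these details, and `add_unm_noise` is a hypothetical name.

```python
import random

def add_unm_noise(pixels, p, seed=None):
    """Return a copy of `pixels` (a list of RGB tuples) with a fraction p corrupted."""
    rng = random.Random(seed)
    noisy = list(pixels)
    k = round(p * len(noisy))  # number of pixels to corrupt
    for idx in rng.sample(range(len(noisy)), k):
        # Every channel is replaced independently by a uniform random value.
        noisy[idx] = tuple(rng.randint(0, 255) for _ in range(3))
    return noisy

original = [(128, 128, 128)] * 100
noisy = add_unm_noise(original, p=0.2, seed=1)
changed = sum(a != b for a, b in zip(original, noisy))
```

A fixed seed makes the contamination reproducible, which is convenient when results are averaged over several noise realizations.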

3 Parameter selection

In order to determine the recommended values of the parameters m and T of the described filtering techniques, a large number of simulations were performed on the image database consisting of 100 true color test images of size 640\(\times\)480 depicted in Fig. 3. This set of images is made available as Electronic Supplementary Material and is accompanied by a file which provides for each image its entropy measure and the number of unique colors.

Fig. 3 Image database used for the analysis of parameter selection

The test images were contaminated with UNM impulsive noise of 3 different intensity levels: \(p=\left\{ 0.1, 0.2, 0.3 \right\}\). For the evaluation of the image restoration performance, the following quality metrics were used:

– Peak Signal-to-Noise Ratio (PSNR)

$$\begin{aligned} {\text {MSE}}= & \frac{1}{3N} \sum _{j=1}^{N}\sum _{q=1}^{3} (x_{j,q}-\hat{x}_{j,q})^2,\;\; {\text {RMSE}}=\sqrt{\text {MSE}},\nonumber \\ {\text {PSNR}}= & 10\log _{10} \left( \frac{255^2}{\text {MSE}}\right) = 20\log _{10}{\frac{255}{\text {RMSE}}}, \end{aligned}$$
(7)

where \(x_{j,q}\), \(q=1,2,3\), are the channels of the original image pixels indexed by j, N is the number of image pixels and \(\hat{x}_{j,q}\) are the restored components.

– Mean Absolute Error (MAE):

$$\begin{aligned} {\text {MAE}} = \frac{1}{3N} {\sum_{j=1}^{N}}{\sum\limits_{q=1}^{3}} |x_{j,q}-\hat{x}_{j,q}|, \end{aligned}$$
(8)

– Normalized Color Distance (NCD) [1]:

$$\begin{aligned} {\text {NCD}}= \frac{ {\sum\limits_{j=1}^{N}} \Vert {\varvec{x}}_{\text {Lab}} -\hat{{\varvec{x}}}_{\text {Lab}} \Vert }{\sum\limits_{j=1}^{N} \Vert {\varvec{x}}_{\text {Lab}} \Vert } , \end{aligned}$$
(9)

where \({\varvec{x}}_{\text {Lab}}\) and \(\hat{{\varvec{x}}}_{\text {Lab}}\) are the components of the original and restored image pixels in the CIE Lab color space and \(\Vert \cdot \Vert\) denotes the Euclidean norm.
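The PSNR (7) and MAE (8) measures can be computed as in the following sketch, which assumes 8-bit RGB images stored as flat lists of pixel tuples. The NCD is omitted because it requires a conversion to the CIE Lab space; the function names are illustrative.

```python
import math

def mse(original, restored):
    """Mean squared error over all N pixels and 3 channels, Eq. (7)."""
    n = len(original)
    return sum((a - b) ** 2
               for x, y in zip(original, restored)
               for a, b in zip(x, y)) / (3 * n)

def psnr(original, restored):
    """Peak signal-to-noise ratio in dB for 8-bit channels, Eq. (7)."""
    e = mse(original, restored)
    return math.inf if e == 0 else 10 * math.log10(255 ** 2 / e)

def mae(original, restored):
    """Mean of the absolute channel differences, Eq. (8)."""
    n = len(original)
    return sum(abs(a - b)
               for x, y in zip(original, restored)
               for a, b in zip(x, y)) / (3 * n)

original = [(0, 0, 0), (255, 255, 255)]
restored = [(0, 0, 0), (255, 255, 245)]
psnr_value = psnr(original, restored)   # MSE = 100 / 6
mae_value = mae(original, restored)     # MAE = 10 / 6
```

For identical images the MSE is zero, so the sketch returns an infinite PSNR instead of dividing by zero.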

Due to a large number of the results obtained using different quality metrics and their very similar qualitative characteristics, only the PSNR measure will be used in the analysis presented in this Section.

The images corrupted according to the UNM with 3 different intensity levels were processed with all 6 filtering techniques described above. Each image was contaminated 10 times with different seeds of the random number generator and the obtained quality measures were then averaged. The denoising process was performed for each pair of parameters \(\left\{ m, T \right\}\), where \(m\in \left\langle 1, 8\right\rangle\) and \(T\in \left\langle 1, 100\right\rangle\).

Fig. 4 Images used for the analysis of parameter selection. a MOTOCROSS. b RAFTING

The first (more detailed) step of the filters’ analysis focuses on 2 exemplary, natural test images: RAFTING and MOTOCROSS of size \(640\times 480\) shown in Fig. 4. For these images, the diagrams showing filtering performance for each tested pair of parameters and contamination intensity \(p=0.2\) are presented in Fig. 5 (RAFTING) and in Fig. 6 (MOTOCROSS).

Fig. 5 Dependence of PSNR quality measure on the parameters m and T obtained when applying the analyzed techniques to the test image RAFTING contaminated with impulsive noise of intensity \(p=0.2\). a STVMF. b STAMF. c ASTVMF. d ASTAMF. e FASTVMF. f FASTAMF

Table 1 Comparison of the efficiency of the pixel replacement techniques based on the VMF and AMF using the PSNR quality measure

The visual analysis leads to the following remarks:

  • The filters with AMF output have better peak performance for an optimal pair of parameters. Another argument in favor of such a statement is presented in Table 1, where optimal values of PSNR obtained for both images, and all algorithms and all contamination levels are presented.

  • The filters using VMF output are more tolerant to the selection of parameters (choosing values other than the optimal ones results in a smaller loss of filtering efficiency).

  • The value of the m parameter, optimizing the PSNR measure, in each case was equal to 2.

Fig. 6 Dependence of PSNR quality measure on the parameters m and T obtained when applying the analyzed techniques to the test image MOTOCROSS contaminated with impulsive noise of intensity \(p=0.2\). a STVMF. b STAMF. c ASTVMF. d ASTAMF. e FASTVMF. f FASTAMF

Fig. 7 Dependence of PSNR on the parameter T obtained for test images MOTOCROSS and RAFTING contaminated with noise intensity \(p=0.1, 0.2, 0.3\). a MOTOCROSS, \(p = 0.1\). b RAFTING, \(p = 0.1\). c MOTOCROSS, \(p = 0.2\). d RAFTING, \(p = 0.2\). e MOTOCROSS, \(p = 0.3\). f RAFTING, \(p = 0.3\)

As filters with AMF output achieve better peak performance, a more detailed analysis of the impact of the T parameter on the efficiency of the STAMF, ASTAMF and FASTAMF techniques is shown in Fig. 7. The presented plots allow the following conclusions to be drawn:

  • The PSNR as a function of T is smooth and slowly varying. Therefore, a deviation from optimal T parameter setting does not result in significant loss of filtering performance.

  • The optimal T values for various algorithms are clearly different. The ASTAMF and FASTAMF schemes require a lower value of threshold T than STAMF, which is caused by subtraction of the minimal trimmed accumulated distance measure in (6).

  • It is difficult to determine which filter achieves best efficiency, as their relative performance is different for various contamination levels. The AST scheme yields better results for low noise intensity, while FAST scheme takes a lead for stronger image corruption.

  • Although the ST scheme may achieve the best overall performance for very low and high T values (deviating much from those recommended), there is no evidence that its peak performance may outperform the AST and FAST techniques for low noise intensities (\(p<0.3\)).

The second (more global) step of our analysis considers all of the tested images. The optimal values of parameters \(\left\{ m,T \right\}\), maximizing the PSNR measure, were determined for each of the analyzed filters and noise contamination level.

Fig. 8 Histogram of the optimal m parameter values obtained for all 5400 (3 measures × 3 contamination levels × 6 algorithms × 100 images) tested cases

The histogram of optimal values of m is presented in Fig. 8. It is clear that \(m=2\) is a value to be recommended for all filters and all contamination intensities. The statistically insignificant occurrence of \(m=3\) implies that more detailed analysis of this parameter has no relevance.

Fig. 9 Box-plots of the optimal T values determined using PSNR quality measure for all 100 test images. a \(p = 0.1\). b \(p = 0.2\). c \(p = 0.3\)

On the other hand, the proper recommendation of the T values is not so unequivocal. Figure 9 depicts the box-plots presenting medians and quartiles for all optimal T obtained with regard to different contamination levels and algorithms. The numeric values of medians and interquartile ranges (IQR) are gathered in Table 2. We observed that bigger values of the threshold T are usually needed for highly textured images. This indicates that the local image entropy could be incorporated into the algorithm for adaptive, structure dependent tuning of the thresholding parameter.

Table 2 Medians with interquartile ranges of optimal T values maximizing the PSNR index

The medians of T decline slightly with the increase in noise intensity. Therefore, the medians achieved for \(p=0.2\), which can be regarded as a medium corruption, should be taken as the recommended values of the T parameter. In this way, the suggested values of T are: 34 for ASTAMF, 40 for STAMF and 28 for FASTAMF.

4 Comparison with the state-of-the-art denoising methods

An indispensable final step in the development of any new filtering technique is the comparison with other competitive solutions available in the rich literature. Among filters proposed in this paper, only those with AMF output have been taken for comparison due to their higher efficiency. The described filtering designs were compared with the following methods, which are known to deliver very satisfying denoising performance:

  • Adaptive Center-Weighted VMF (ACWVMF) [55],

  • Fast Averaging Peer Group Filter (FAPGF) [36],

  • Fast Fuzzy Noise Reduction Filter (FFNRF) [56],

  • Fuzzy Ordered Vector Median Filter (FOVMF) [57],

  • Fast Peer Group Filter (FPGF) [34],

  • Ranked Sigma Vector Median Filter (SVMFr) [30].

In addition to the quality measures presented in the previous Section, the Feature Similarity index (FSIMc) [58, 59] was also used to compare the efficiency of the evaluated filters. Structural similarity measures, like FSIMc, are highly correlated with human visual perception, which makes them very useful for the analysis of noise suppression efficiency [60, 61].

Fig. 10 Color test images used for comparison with the state-of-the-art filters. a GIRL. b HAND. c GOLDHILL. d FLOWER

The efficiency tests were performed using the four selected images shown in Fig. 10, each corrupted 10 times with UNM noise of intensities \(p = 0.1, 0.2, \ldots , 0.5\). Every contaminated image was denoised, and the outcome of each method was evaluated using the averaged PSNR, NCD, MAE and FSIMc quality measures. The results obtained for each image are summarized in Tables 3, 4, 5 and 6.

Fig. 11 Comparison of the performance of the filtering algorithms for image GIRL contaminated with impulsive noise of intensity \(p = 0.3\). a Original image. b Noisy image. c ASTAMF. d FASTAMF. e STAMF. f ACWVMF. g FAPGF. h FFNRF. i FOVMF. j FPGF. k SVMFr

Table 3 Comparison of the efficiency of the analyzed algorithms with the state-of-the-art techniques for the GIRL color test image
Table 4 Comparison of the efficiency of the analyzed algorithms with the state-of-the-art techniques for the HAND color test image
Table 5 Comparison of the efficiency of the analyzed algorithms with the state-of-the-art techniques for the GOLDHILL color test image
Table 6 Comparison of the efficiency of the analyzed algorithms with the state-of-the-art techniques for the FLOWER color test image

The visual comparison of the performance of the analyzed filters is presented for image GIRL (Fig. 11) and HAND (Fig. 12). The comparison of the efficiency of the denoising methods is presented in terms of PSNR measure for all four test images in Fig. 13.

Fig. 12 Comparison of the performance of the filtering algorithms for image HAND contaminated with impulsive noise of intensity \(p = 0.3\). a Original image. b Noisy image. c ASTAMF. d FASTAMF. e STAMF. f ACWVMF. g FAPGF. h FFNRF. i FOVMF. j FPGF. k SVMFr

Fig. 13 Comparison of the proposed designs with state-of-the-art denoising algorithms using four test images contaminated with noise of intensity \(p=0.1, \ldots , 0.5\). a GIRL. b HAND. c GOLDHILL. d FLOWER

Finally, the obtained results can be summarized as follows:

  • For low contamination levels (\(p<0.3\)), the ASTAMF algorithm clearly outperforms the other techniques. However, the fast version of this algorithm (FASTAMF) is not far behind.

  • For medium noise intensity (\(p=0.3\)), the FASTAMF algorithm takes a lead for all tested images and this observation is valid for all computed dissimilarity measures.

  • In the case of images corrupted by stronger noise (\(p>0.3\)), the FASTAMF competes with the STAMF algorithm, which becomes surprisingly more efficient than the others. This observation suggests that the adaptation mechanism becomes less important or even undesirable for more extreme noise occurrence.

  • If only PSNR measure is considered, the FAPGF algorithm shows a competitive performance for medium noise levels.

  • For very high noise corruption (\(p=0.5\)), the adaptive designs of ASTAMF and FASTAMF lose their power and the STAMF yields better results.

  • Generally, the thresholding parameter is only slightly dependent on the structure of natural images. Higher values of T may yield better results for images with high frequency texture or containing tiny details.

5 Computational complexity

Although the noise suppression efficiency, expressed by quality measures, seems to be the most obvious criterion for algorithm selection, the computational complexity is very often equally important. Therefore, in this Section, a straightforward analysis of the computational complexity of the Fast Adaptive Switching Trimmed filter with AMF output (FASTAMF) is presented. This filter was chosen as it is among the fastest available filtering designs and because of its very satisfying denoising efficiency. It will be compared with the state-of-the-art fast techniques: FPGF [34], Fast Averaging Peer Group Filter (FAPGF) [36], Fast Fuzzy Noise Reduction Filter (FFNRF) [56], Fast Modified VMF [62] and Vector Median Filter (VMF) [6], which can serve as a reference filter. The analysis of the computational burden will be performed separately for the impulse detection (decision-making) step and the output computation step.

We assume that the color image is encoded with L channels, and the operating window W used by the filter consists of n pixels. The elementary mathematical operations used by an algorithm will be labeled as follows: addition (ADD), multiplication (MULT), division (DIV), exponentiation (EXP), root extraction (SQRT) and comparison (COMP). A detailed analysis of the computational load with commentary is performed for the FASTAMF algorithm only. The complexity of the competitive algorithms is summarized in Table 7.

Table 7 Comparison of computational complexity

The impulse detection step of the FASTAMF requires:

  • Computation of \((n-1)\) Euclidean distances. Each distance requires: \(L\times {\text {MULT}} + 2L \times {\text {ADD}} + 1 \times {\text {SQRT}}\).

  • Calculation of the m smallest distances:

    $$\begin{aligned} m\sum \limits _{i=1}^{m}(n-i)\times {\text {COMP}}, \end{aligned}$$
    (10)
  • Sum of the m smallest distances: \(m\times {\text {ADD}}\),

  • One subtraction (\(1\times {\text {ADD}}\)), division (\(1\times {\text {DIV}}\)) and comparison (\(1\times {\text {COMP}}\)).

The noisy pixel replacement relies on the map of noise array M, which holds the value 1 for uncorrupted pixels (condition (6) is not satisfied) and 0 for pixels found to be corrupted. The output computation step of the AMF requires \(n\times L \times {\text {MULT}}\) and \(n\times L \times {\text {ADD}}\) to acquire the sum of the channel values of the uncorrupted pixels, and \(n \times {\text {ADD}}\) and \(1 \times {\text {DIV}}\) to obtain the final color pixel estimate.

As can be derived from Table 7, the proposed FASTAMF algorithm belongs to the group of the fast switching filters and its computational complexity is comparable only to FPGF [34] and newly proposed FAPGF technique [36].

Table 8 Processing times of the filters taken for comparisons when restoring the MOSAIC test image depicted in Fig. 14 contaminated with impulsive noise of increasing intensity p

For the assessment of the practical usability of the proposed filtering framework in real-time applications, we took measurements of the processing time using a large MOSAIC test image of size \(3200\times 2400\) pixels depicted in Fig. 14. To make the results independent of the structural content of the processed data and also to present the acceleration achieved using the CUDA parallel programming platform, this benchmark image is composed of 25 pictures taken from the database shown in Fig. 3.

Fig. 14 Test image MOSAIC of size \(3200\times 2400\) pixels composed of 25 pictures from the dataset depicted in Fig. 3

The first group of tests was performed on a machine equipped with an Intel i7-3632QM processor (2.2 GHz) and 4 GB of memory. The 64-bit Debian 8.3 was installed as the operating system. For the purpose of the speed tests, all examined filtering techniques were implemented in the ANSI C programming language (gcc 4.9.2). To assure the fairness of the results and to avoid inefficient algorithm implementations, all of them, with the exception of FASTAMF and FAPGF, which we prepared ourselves, were taken from the “Fourier 0.8” library provided by M.E. Celebi [63]. The speed measurements were also taken using a routine from this well-known library.

All comparative tests of the analyzed filters were performed using single-thread processing. The filtering window was consistently \(3\times 3\), and the parameters of the evaluated techniques were adjusted according to the recommendations of their authors. Each filter was run 200 times on the MOSAIC test image, contaminated with impulsive noise of intensity \(p=0, 0.1, \ldots , 0.5\), to assure the statistical significance of the comparisons. The medians of the independent execution times of all tested filters are presented in Table 8 and Fig. 15. The FOVMF algorithm was omitted in the presentation of results due to its very poor performance.

Fig. 15 Processing times of the evaluated filters using the MOSAIC test image depicted in Fig. 14 contaminated with impulsive noise of intensity p

As can be observed, the proposed FASTAMF method is about 3 times faster than the standard VMF. The execution time of the FASTAMF measured on the MOSAIC image with medium noise level was about 800 ms. The time needed to process a \(640\times 480\) image was on average 34 ms, which is consistent with a computational complexity that grows linearly with the number of image pixels, since the MOSAIC image contains 25 times as many pixels. The processing time of FASTAMF is only slightly dependent on the noise intensity, and this behavior is also exhibited by other filtering approaches, with the exception of FPGF, which is the fastest filter for very low contamination ratios (\(p<0.1\)), but slows down substantially with increasing noise corruption.

In Fig. 16, we present the relation between the efficiency of the evaluated algorithms expressed in terms of PSNR and the processing time obtained for the MOSAIC test image contaminated by impulsive noise of \(p=0.1\) and \(p=0.3\). Analyzing the results presented in the plots and also in Table 8 and Fig. 15, it can be stated that the proposed FASTAMF is one of the most efficient algorithms among those taken for comparisons in the whole range of contamination ratios and its overall efficiency is comparable with the FAPGF and also with FPGF when the noise intensity is quite low.

Fig. 16 Relation between PSNR measure and the processing time of the evaluated filters obtained using the MOSAIC test image contaminated with impulsive noise of intensity \(p=0.1\) and \(p=0.3\)

Although the proposed FASTAMF is slower than FPGF at low contamination levels, its efficiency expressed in terms of PSNR and other quality measures is significantly better. The FASTAMF technique is generally slightly slower than the FAPGF, but for low and medium noise corruption it is superior in terms of denoising efficiency.

The new technique can be further accelerated by omitting the computation of already determined distances between pixels. Such a scheme was successfully applied in the construction of fast VMF implementations [64, 65]. A further substantial decrease in the computational time can be achieved using hardware/software implementations, which are being developed very rapidly, especially for image processing applications [66–68].
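The simplest form of such reuse follows from the symmetry of the distance, \(d(a,b)=d(b,a)\): each pair of pixels in the window needs to be evaluated only once. The sketch below illustrates this for the cumulated distances of a \(3\times 3\) window. It is not a reproduction of the schemes in [64, 65]; the Euclidean distance and the names `Rgb`, `colorDistance` and `cumulatedDistances` are assumptions made for the example.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

struct Rgb { double r, g, b; };

// Euclidean distance between two colour vectors (the choice of norm is an assumption).
double colorDistance(const Rgb& a, const Rgb& b) {
    const double dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
    return std::sqrt(dr * dr + dg * dg + db * db);
}

// Sum of distances from every window pixel to all the others.
// Each unordered pair is evaluated once and added to both sums,
// so a 9-pixel window needs 36 distance evaluations instead of 72.
template <std::size_t N>
std::array<double, N> cumulatedDistances(const std::array<Rgb, N>& window,
                                         std::size_t* evaluations = nullptr) {
    std::array<double, N> sums{};
    std::size_t count = 0;
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = i + 1; j < N; ++j) {
            const double d = colorDistance(window[i], window[j]);
            sums[i] += d;
            sums[j] += d;
            ++count;
        }
    }
    if (evaluations != nullptr) *evaluations = count;
    return sums;
}
```

Further savings are possible because neighbouring window positions share most of their pixels, so distances computed for one position could be cached and reused at the next. That bookkeeping is omitted here.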

In our algorithm, the image is processed in 3 steps which must be performed sequentially: computation of the trimmed sum of distances (3), noise detection (6) and noisy pixel replacement. However, within a particular step every pixel of the image can be processed independently. Therefore, each of those steps can be implemented in parallel, which substantially decreases their execution times.
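The structure of three sequential passes, each of them independent across pixels, can be sketched as below. This is a simplified illustration and not the authors' implementation: the Euclidean distance, the border clamping, the parameters `alpha` and `threshold`, and in particular the detection rule in `detectNoise` are placeholder assumptions standing in for (3) and (6).

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

struct Rgb { double r, g, b; };

struct Image {
    int width, height;
    std::vector<Rgb> px;
    // Linear index of (x, y); coordinates are clamped at the borders (assumption).
    std::size_t idx(int x, int y) const {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        return static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
               static_cast<std::size_t>(x);
    }
};

double dist(const Rgb& a, const Rgb& b) {
    const double dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
    return std::sqrt(dr * dr + dg * dg + db * db);
}

// Step 1: for every pixel, sum the `alpha` smallest distances to its 8 neighbours.
std::vector<double> trimmedSums(const Image& img, int alpha) {
    std::vector<double> out(img.px.size());
    for (int y = 0; y < img.height; ++y)
        for (int x = 0; x < img.width; ++x) {  // iterations are independent
            std::array<double, 8> d{};
            std::size_t k = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    if (dx != 0 || dy != 0)
                        d[k++] = dist(img.px[img.idx(x, y)],
                                      img.px[img.idx(x + dx, y + dy)]);
            std::sort(d.begin(), d.end());
            double s = 0.0;
            for (int i = 0; i < alpha && i < 8; ++i) s += d[static_cast<std::size_t>(i)];
            out[img.idx(x, y)] = s;
        }
    return out;
}

// Step 2 (placeholder rule): a pixel is flagged when its trimmed sum exceeds
// the smallest trimmed sum in its window by more than `threshold`.
std::vector<char> detectNoise(const Image& img, const std::vector<double>& sums,
                              double threshold) {
    std::vector<char> noisy(img.px.size(), 0);
    for (int y = 0; y < img.height; ++y)
        for (int x = 0; x < img.width; ++x) {  // needs all results of step 1
            double lowest = sums[img.idx(x, y)];
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    lowest = std::min(lowest, sums[img.idx(x + dx, y + dy)]);
            noisy[img.idx(x, y)] = (sums[img.idx(x, y)] - lowest > threshold) ? 1 : 0;
        }
    return noisy;
}

// Step 3: replace each flagged pixel by the mean of the unflagged window pixels.
// If no unflagged pixel exists, the pixel is left unchanged (assumption).
Image replaceNoisy(const Image& img, const std::vector<char>& noisy) {
    Image out = img;
    for (int y = 0; y < img.height; ++y)
        for (int x = 0; x < img.width; ++x) {  // needs all results of step 2
            if (!noisy[img.idx(x, y)]) continue;
            Rgb sum{0.0, 0.0, 0.0};
            int n = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    const std::size_t j = img.idx(x + dx, y + dy);
                    if (noisy[j]) continue;
                    sum.r += img.px[j].r; sum.g += img.px[j].g; sum.b += img.px[j].b;
                    ++n;
                }
            if (n > 0) out.px[img.idx(x, y)] = Rgb{sum.r / n, sum.g / n, sum.b / n};
        }
    return out;
}

// The three steps run one after another; each one only reads the previous results.
Image filterSketch(const Image& img, int alpha, double threshold) {
    const std::vector<double> sums = trimmedSums(img, alpha);
    const std::vector<char> noisy = detectNoise(img, sums, threshold);
    return replaceNoisy(img, noisy);
}
```

Within each function, the two outer loops over `x` and `y` write only to the output element of the current pixel and read only data produced by the preceding step. This is what makes a one-thread-per-pixel mapping possible, with a synchronization point between the steps.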

We implemented the VMF and FASTAMF using the CUDA technology on a GeForce GTX 970 GPU equipped with 1664 CUDA cores (1250 MHz) and 4 GB of 256-bit GDDR5 memory. The FASTAMF algorithm was written in C++ and compiled with the NVIDIA CUDA compiler.

The grid configurations were chosen dynamically depending on the image size. We have chosen a block size of 128 threads in a \(1\times 128\times 1\) configuration. The crucial part is the right selection of the grid and block configuration, depending on the size of the image and the GPU parameters, so that the GPU computation ability is maximized.
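A common way to derive the grid from the image size is a ceiling division of each image dimension by the corresponding block dimension. The host-side sketch below shows that computation for the \(1\times 128\times 1\) block. The one-thread-per-pixel mapping (block \(x\) along columns, block \(y\) along rows) is an assumption, since the text does not state how threads were assigned to pixels; `Dim3` is a plain stand-in for CUDA's `dim3` so that the example compiles without the CUDA toolkit.

```cpp
// Stand-in for CUDA's dim3 (assumption, to keep the sketch host-only).
struct Dim3 { unsigned x, y, z; };

// Smallest number of blocks of size `b` that covers `a` elements.
constexpr unsigned ceilDiv(unsigned a, unsigned b) { return (a + b - 1) / b; }

// Grid covering a width x height image with one thread per pixel.
constexpr Dim3 gridFor(unsigned width, unsigned height, Dim3 block) {
    return Dim3{ceilDiv(width, block.x), ceilDiv(height, block.y), 1u};
}
```

For the \(3200\times 2400\) MOSAIC image this gives a grid of \(3200\times 19\times 1\) blocks. Since \(19\times 128 = 2432 > 2400\), a kernel launched with such a grid would have to check that its row index lies inside the image before touching memory.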

The second essential part is the optimization of memory reads and writes. In the first step, we copy the image from the host into the GPU memory, so that the slow transfers between host and device are kept to a minimum. Another optimization is the correct way of reading global memory, which is the slowest memory on the device but is accessible by every thread. We use access patterns suited to the compute capability of the GPU and data types whose size and alignment are optimal for the given device.

Three kernels, which were run successively on each image pixel, were implemented, and the number of reads from and writes to the global memory was reduced to a minimum.

Fig. 17 Comparison of the FASTAMF and VMF execution time performed on the noisy MOSAIC test image using the CUDA implementations

The speed tests of the CUDA FASTAMF implementation were performed on the MOSAIC image, and the results are presented in Fig. 17. It can be observed that parallel computing yields impressive speed gains, which allow the FASTAMF to be used in real-time image processing. As only a few milliseconds are needed to process the relatively large MOSAIC image (\(3200\times 2400\)), the algorithm can be applied to video denoising with frame rates exceeding 100 fps, or much higher for smaller resolutions.

6 Conclusions

The evaluation of the performance of the described filter family provided in the previous sections confirmed its high efficiency. The proposed filters are competitive with known fast filtering techniques intended for impulsive noise removal. Especially useful is the Fast Adaptive Switching Trimmed filter with AMF output (FASTAMF), which efficiently restores the corrupted pixels even for strong noise contamination. Its performance is comparable with that of the recently proposed Fast Averaging Peer Group Filter (FAPGF) [36]. The beneficial feature of the FASTAMF is its low computational complexity, which makes the filter attractive for real-time color image denoising.

The proposed concept of the trimmed sum of ordered distances is a very efficient way of determining whether a pixel is corrupted or not. Also, the AMF output computed using only pixels recognized as uncorrupted proved to be a very efficient and computationally inexpensive solution.

Additionally, the adaptation mechanism implemented in the AST and FAST decision-making schemes substantially improves the performance of the filters when the image is contaminated by noise of low and medium intensity (\(p<0.3\)). For higher noise intensity levels, this mechanism fails to detect the outliers, due to a small number of the uncorrupted samples in the filtering window.

The performed experiments confirmed the low computational complexity of the proposed filtering technique and its attractiveness for real-time image processing applications. An additional speed gain was obtained using a parallel implementation on the CUDA platform, which allows the proposed algorithm to be applied to video denoising.