A nonlocal gradient concentration method for image smoothing

It is challenging to consistently smooth natural images, yet smoothing results determine the quality of a broad range of applications in computer vision. To achieve consistent smoothing, we propose a novel optimization model making use of the redundancy of natural images, by defining a nonlocal concentration regularization term on the gradient. This nonlocal constraint is carefully combined with a gradient-sparsity constraint, allowing details throughout the whole image to be removed automatically in a data-driven manner. As variations in gradient between similar patches can be suppressed effectively, the new model has excellent edge preserving, detail removal, and visual consistency properties. Comparisons with state-of-the-art smoothing methods demonstrate the effectiveness of the new method. Several applications, including edge manipulation, image abstraction, detail magnification, and image resizing, show the applicability of the new method.


Introduction
Image smoothing is a fundamental and important issue in computer vision.Natural images contain both clear structural edges of objects and abundant details caused by lightness, textures, and so on.Psychological studies show that human beings tend to pay more attention to the outlines of objects than trivial details [1].Indeed, images containing main structures but without details can be of use in many applications such as edge extraction, image abstraction, and tone mapping.Image smoothing aims to produce images which discard insignificant details while preserving the main structural edges.Because of the complexity of natural images, it still remains difficult to give explicit definitions that a computer can use to distinguish between main edges and trivial details: human beings can make flexible decisions.
During the past decades, image smoothing has attracted much research.In terms of approaches, previous methods can be loosely classified into two groups, edge-preserving methods and structurepreserving methods.Edge-preserving methods [2][3][4][5][6][7] consider that human eyes are sensitive to color changes between neighboring pixels, so they aim to preserve strong-contrast edges.However, they cannot remove fine-scale details with large or oscillatory gradient amplitudes.Structurepreserving methods [8][9][10][11][12][13][14][15][16][17] design smoothing models based on the assumption that sliding windows containing different patterns, such as structure and oscillatory details, behave differently with suitable measurement.Although fine-scale details can be smoothed out, edges tend to be shifted from their original positions for natural images because of the patch-wise operator.In fact, all previous works share a common defect that the quantization measure cannot reflect visual importance properly.Using contrast or local statistical responses, computer cannot distinguish details from structures exactly.Therefore, the main goals of image smoothing, detail removal, and edge preserving, are often in conflict.
It is worth noting again that we aim to smooth out details while preserving main structures in a consistent manner.The output image should be composed of sharp structural edges and homogeneous regions.After analyzing the properties of gradient maps in natural images, we find it is reasonable and practical to assume the ideal gradient map of a smoothed image should be sparse in space and nonlocally concentrating in amplitude.In order to achieve consistent smoothing performance, we propose a novel image smoothing method which we call the nonlocal gradient concentration (NGC) method.
NGC is data-driven and can be formulated as an optimization problem.In the new model, a nonlocal self-similarity property is assumed, leading to a nonlocal concentration constraint on the gradient map.Specifically, under the guidance of nonlocally similar patch groups, unstable details can be reduced, and meaningful structures which are lower in contrast can be kept.Nonlocal self-similarity is an intrinsic and useful property for natural images, arising due to redundancy.Typically, it is easy to find similar patches to a given patch.In computer vision and image processing, this property has been adopted as prior knowledge to achieve significant and surprising improvements.Exploiting self-similarity in color and intensity space and in the transform domain, has led to many state-of-the-art algorithms for various applications [18].However, it has not been fully adopted in image smoothing.As far as we know, our work is the first to use nonlocal similarity on the gradient map for image smoothing.Some results produced by our method are illustrated in Fig. 1, showing that our method can remove details while keeping structural edges.
The main contributions of our work are as follows: • We introduce a nonlocal constraint on the gradient map for the first time in image smoothing, and use it as a basis for a new optimization framework.• We present an efficient iterative algorithm for optimizing the new energy model.• We demonstrate the ability of our method in several applications.

Previous work
This section reviews some representative edgepreserving and structure-preserving smoothing methods and analyzes them with an example shown in Figs. 2 and 3.

Edge-preserving methods
Filtering using a weighted-averaging operation is a common scheme.Tomasi and Manduchi [2] proposed the simple and popular bilateral filter (BLF) in 1998.
For each pixel i in input image f , the output u i is computed as where {w i,j } are weights for all pixels in f .These are in inverse proportion to the spatial distance, and color or intensity difference.This prevents   strong edges from being over-smoothed.He et al. [3] proposed a guided filter in 2010, where a separate image is adopted to guide the linear translationvariant filtering procedure.It is similar to BLF in its edge-preserving properties, but it solves the gradient reversal problem in BLF and performs better near edges.The local Laplacian filter [4]  Optimization methods with regularization involving edges are another popular approach.The general form used in such optimization methods is to solve arg min where R(u) is a regularization term and λ is the regularization parameter.Rudin et al. [5] proposed the well-known total variation (TV) regularizer in 1992, which uses the L 1 norm of the gradient map.Farbman et al. [6] proposed an alternative edge-preserving operator based on weighted least squares (WLS) in 2008.WLS involves an L 2 norm.However, both TV and WLS penalize large gradient amplitudes, so image contrast is affected, as analysed in Ref. [7].As shown in Fig. 2(e), image intensity varies significantly from the input image.In order to get rid of the impact of gradient amplitudes, Xu et al. [7] proposed a new regularizer using the L 0 gradient norm in 2011, arg min . It can preserve salient edges globally without blurring, as shown in Fig. 2(f).
Edge-preserving methods assume that pixels with large gradient magnitudes are located on important edges explicitly or implicitly.As shown in Figs.2(b)-2(f), these methods are unable to remove small-scale textural details.

Structure-preserving methods
Structure-preserving methods rely on statistical features within local sliding windows to remove oscillations and extract structures.
Mode filters can suppress details by analyzing the histogram within a local sliding window; the median filter is the classic example.Subr et al. [8] proposed use of local extrema envelopes in 2009.First, maximum and minimum envelops are constructed respectively using extrema detected in local sliding windows.Highly contrasting oscillations can be removed simply by computing the smoothed mean envelope.Xu et al. [11] proposed the relative total variation (RTV) method in 2012.They observed that the inherent variation (aggregation of signed gradient) in sliding windows with texture is much smaller than in windows with structure.RTV can remove textures in mosaic images well, but it performs less well for natural images because complex lighting and perspective distortion make RTV over-smooth details.Karacan et al. [12] proposed a region covariance (RC) method in 2013.They made use of covariance matrices of simple image features to capture local structure and texture information in local patches.It can preserve prominent edges and shading while removing texture, but the resulting structure edges lack sharpness.A novel image decomposition method [13] was proposed by Su et al. in 2013.They applied a Gaussian decomposition and an asymmetric sampling operator to separate texture from structure, and then used a joint bilateral correction to suppress blurring.However, it does not perform well in practice because too many variables are involved.The tree filter [15] was proposed by Bao et al. in 2014.It is a trilateral filter in which for weighted-averaging, a tree distance is adopted as well as the two factors in BLF.The minimum tree can deal with fine-scale details.
As shown in Figs.2(g)-2(j), with these methods, the edges are either blurred heavily or shifted mistakenly.As shown in Fig. 2(l), details on the ostrich are removed and edges of the ostrich are kept consistently by our method.Figure 3 analyzes a scanline further.Figure 3(l) shows that the oscillations on the neck are smoothed thoroughly, while contrast near the two main features is retained well without being over-smoothed.

Model
As noted in Section 2, smoothing methods with gradient constraints have been studied, but their performance is unsatisfactory.To achieve consistently smoothed results, we propose a nonlocal optimization model in Section 3.3 based on the observations made in Section 3.2.Before describing our new model, we first review the general formulation of optimization with gradient constraint in Section 3.1.

Background
Gradient information is vital to the human visual system.
Making use of pixel-wise gradients provides the advantage that edges do not shift from their original positions.
Therefore, we focus on optimization based methods with gradient constraints in this paper.The general form of a gradient-based optimization model for image smoothing is arg min

Motivation
The key to the model defined in Eq. ( 3) is to make reasonable and practical assumptions about the gradient map of the ideal smoothed image.An ideal smoothed image should be composed of sharp structural boundaries between different objects and consistently flat components within homogeneous regions.Correspondingly, the gradient map of the ideal smoothed image should have the following two properties.Property 1.The gradients of pixels on structural edges should be non-zero, and gradients of pixels belonging to the same structural edge should be consistent.In Fig. 4(a), the left sub-image contains a structural edge segment for the squirrel.As shown in Fig. 4(b), along the structural edge, gradient amplitudes of most pixels are large, but some have globally lower gradient amplitudes.
Property 2. The gradients of pixels in homogeneous regions should be zero.In Fig. 4(a), the right sub-image is a representative texture-detail region of a stone.As shown in Fig. 4(b), amplitudes of the gradients in this sub-image are oscillatory and can be seen as stochastic perturbations.Some pixels have even higher contrast compared to structural edges elsewhere in the image, while some have low gradient amplitudes.
In order to make use of the above observed properties to guide the smoothing procedure, we need to quantify them by calculable measures.Consider the spatial distribution of non-zero gradients described in Property 1.One quantitative measure of the gradient map can be provided by a discrete counting scheme.It involves the L 0 norm, as suggested by Ref. [7].
Use of this first measure can avoid edge-blurring and can give impressive results with strong-contrast edges.However, Property 1 relies on the pixelwise gradient amplitudes of the input image to implicitly determine structural edges.Any textures and details will affect the edges in the input image, so it is unreliable.As shown in Fig. 4, for natural images, the gradient amplitudes of the input image cannot completely distinguish pixels on structural edges from pixels in homogeneous regions.Firstly, some parts of structural edges have comparatively low gradient amplitudes in the input image.Secondly, many fine-scale texture-like details have high contrast.Therefore, the model should not completely rely on pixel-wise gradients of the input image.To achieve better smoothing performance, the gradient amplitudes of the ideal smoothed image should be estimated carefully to better guide separating structural pixels from detail pixels.Based on this observation, we propose a new prior as the second measure to help estimate the ideal gradient map.

Definition
We now give a definition of our new model which subtly combines the above quantitative measures.
New model.Assuming the ideal gradient map can be estimated well and is denoted by g = (g x , g y ), a smoothed image of high quality can be estimated by the following model: arg min where the first regularization term is the sparsity constraint on ∇u, and the second regularization term constraints ∇u to follow g.λ and β are regularization parameters.The new model involves a novel V norm, which is explained in detail next.
As elements in ∇u = (∇ x u, ∇ y u) are binary tuples, it cannot be measured by traditional norms.We concisely denote the regularization terms by using a superscript V .The second term in Eq. ( 4) denotes the following formulation: where # represents the cardinality of a set: this term is the number of pixels with non-zero gradient amplitudes.The third term in Eq. ( 4) expands to The L 2 norm has been adopted in various image processing fields, and its ability to preserving fidelity has already been demonstrated.This definition ensures that the resulting gradient map ∇u tends to follow the accurate values in g.
Nonlocal estimation of g.In practice, it is not possible to get the exact gradient map g of the ideal smoothed image as described in Eq. ( 4).However, the redundancy of natural images provides a natural way to estimate g.Because information is redundant in natural images, there are typically many repetitive or similar patches within a single image.In the ideal image, similar features should be removed or preserved in the same way, so gradients of similar patches should be consistent.Based on this observation, we propose a nonlocal gradient estimation method for g.Using this method, the gradient of a pixel can be constrained by nonlocally similar patches.
This leads to the following model: arg min (5) where N L(•) represents a nonlocal estimation operation.Specifically, each pixel should have gradient consistent with the gradients of pixels with similar patterns.A patch centered at pixel u i is denoted by N I p .In a search window S around f i , a group of similar patches {N f i } can be collected.The nonlocal gradient estimate of u i can be expressed as where the weights are calculated in the Y channel of YCbCr color space as follows: and h acts as a parameter controlling the decay rate of the exponential function.By combining nonlocal estimation and L 0 gradient minimization in a single optimization model, our new model can achieve consistent detail removal and edge preserving effects.

Explanation
As the new model exploits information about nonlocally similar patches to constrain each pixel, it can preserve structure while removing details, to give consistent smoothed results.We now demonstrate the smoothing ability by analyzing in turn four representative patterns in natural images.Figure 5 illustrates four kinds of central patches and their corresponding similar groups.
(a) Strong structural edges (patch A in Fig. 5).Both the gradient ∇f i and the gradients ∇f G are strong.During globally optimization, the gradient is kept and resulting edges are sharp.
(b) Details in homogeneous regions (patch B in Fig. 5).In this situation, both the gradient ∇f i and the gradients ∇f G in the related patches are globally weak.In this case, the gradient of the current center pixel is modified to 0 by optimization, removing details.(c) Weak and slender edges (patch C in Fig. 5).The gradient ∇f i conflicts with the gradients ∇f G in the related patches.In the similar pacthes, the gradients are large, indicating that there is a structural edge.However, the gradient of the current patch is weak as it is a weak piece of a structural edge.In this case, the patch group promotes and enhances the weak edge, preserving it in the resulting image.
(d) Fine-scale and high-contrast details (patch D in Fig. 5).As in (c), the gradients are conflicting.The gradients in the group have globally lowamplitudes, while the gradient of the current patch is large and is an outlier for a texture-like regions.After optimization, this outlier is removed.
In summary, the new method can deal with both detail removal and structure preservation.The nonlocal constraint can effectively and automatically select important structures to preserve.Figure 6 illustrates a synthetic image with various kinds of noise.Figure 6(e) shows that our method can handle texture-like details (simulated by random salt and pepper noise) while at the same time keeping structural edges sharp.

Numerical solution
In this section we show how to numerically solve the nonlocal model defined by Eq. ( 5), give the whole algorithm, and discuss the parameters involved.

Solver
Energy optimization.As solving the original model is NP-hard, a splitting method which iteratively optimizes subproblems alternately [19] can be used as an effective technique.We give closedform solutions to the two subproblems respectively.
Replacing the unknown gradient (∇ x u, ∇ y u) by auxiliary variables (b, d), the model in Eq. ( 5) can be made more flexible.The new model can be represented as min where This can be split into two subproblems which can be optimized iteratively.In each iteration, the following two steps are used.
Step 1. Fix (b, d) and solve for u.The following subproblem can be extracted from Eq. ( 7): min 8) which has the optimality condition: where I denotes the identity matrix, and the symbol = − ∇ T x ∇ x + ∇ T y ∇ y denotes the Laplace operator.
The solution to Eq. ( 9) is unique, but it is computationally complex to solve directly, requiring a large matrix inversion.Instead, Gauss-Seidel iteration can be used to solve it approximately; Fourier transforms are taken for further speed.Under periodic boundary conditions for the variable u, using a 2D discrete Fourier transform, we can diagonalize the Hessian matrix on the left hand side of Eq. ( 9), giving an explicit solution which only requires componentwise operations.
Step 2. Fix u and solve for (b, d).The corresponding subproblem can be expressed as follows:  The nonlocal estimation N L(b i ) is intractable when b i is unknown, and it is difficult perform optimization if N L(b i ) is treated as an unknown variable.An effective and practical way is to split this into two stages.Firstly, estimate the nonlocal variable N L(b i ) using the estimated image u from the previous iteration, i.e., replace N L(b i ) by N L(∇u i ).Secondly, estimate b i supposing N L(b i ) known.
Furthermore, the above functional has the useful property that it can be split into |{f i }| individual subproblems.For any i, the corresponding subproblem can be formulated as optimizing the following energy functional with (b i , d i ) unknown.The objective functional for the i-th pixel can be written alternatively as where H(•) represents the Heaviside function, i.e., H(a) = 1 when a = 0 and H(a) = 0 otherwise.It is easy to see that when all subproblems {E i } are solved, the whole functional in Eq. ( 10) is optimized.Equation ( 11) involves a discrete counting function, and its solution is made tractable by using: Weight updating.In Step 2, the nonlocal estimation N L(∇u i ) involves calculation of weights w i,j as defined in Eq. ( 6).As iteration proceeds, the estimated image û will get closer to the ideal smoothed image, so the results of block matching will be more accurate.Therefore, in the k-th iteration, we update the nonlocally similar group using the latest updated image u k−1 , and recalculate the weights as In practice, block matching and weight updating can be performed every K 0 iterations for speed.Finally, the whole procedure for our method for image smoothing is summarized in Algorithm 1.

Analysis of parameters
In this section the various parameters involved are analyzed.
Regularization parameters λ, β, γ.The parameter λ controls the degree of smoothing.In order to satisfy the requirements of the variable splitting scheme, the parameters β and γ should increase as iteration proceeds.When iteration stops, the two parameters should be large enough to guarantee the gradient map of the output image is close to the estimated gradients (b, d).
From the update formula for (b, d) in Eq. ( 12), we can see that β and γ balance the impact of the nonlocal estimated gradient and the local gradient for the i-th pixel.If β > γ, the nonlocal estimation dominates the local gradient, and the result will be more consistent, and vice versa.The nonlocal method degenerates to the L 0 model if β = 0. Figure 7 shows an example using varying parameters.In each row, β/γ is fixed.From left to right, as λ increases, further details are eliminated and the images become coarser.Experiments show λ = 0.01− 0.04 is suitable in most cases.
In each column, λ is fixed and the ratio β/γ adjusts the relative importance of nonlocal and local gradients for locating details and manipulating gradients.In the extreme case of β = 0, our model degenerates to L 0 smoothing, meaning only local pixel-wise gradients are considered.Comparing the first row to the others, we can see that small scale details are retained even for large λ.If β goes beyond γ, nonlocally estimated gradients will help to handle

Comparison
We now compare our method to a series of smoothing methods in this section.A good smoothing method should satisfy the following requirements: structural edges should be well-preserved without being oversmoothed or blurred, while details should be smoothed out completely.As pointed out earlier, our method degenerates to L 0 smoothing method if we set parameter β to 0.
As shown in Fig. 8(a), the "small-house" input image is composed of explicit large-scale structures, including house and road boundaries, and highcontrast details on the roof, grass, and road.From Figs. 8(g)-8(k), we can see that structurepreserving methods can prevent edge blurring.However, some details still remain in the result for local extremal envelops as shown in Fig. 8(h).RTV can remove details, but the edges are not preserved naturally as shown in Fig. 8(i).Region covariance and tree filter methods cannot produce clear images as shown in Fig. 8(j) and Fig. 8(k), respectively.Compared to these methods, our method performs well in both edge preserving and detail removing.In Fig. 8(l), texture details are flattened naturally and structural edges are preserved without blurring.In summary, our method can consistently remove details and preserve structural edges.

Applications
As a fundamental technique, image smoothing has many applications for base and detail layer manipulation.As can be seen from the previous section, our smoothing method can preserve structural edges well while smoothing details out.In this section, we show some applications of our method including edge detection, edge manipulation, detail magnification, and content-aware resizing, to illustrate its advantage of retaining important features.

Edge detection and manipulation
Edge detection.There are rich details in natural images which can interfere with the progress of edge detection.Our method can remove trivial details, so it can help retain clean and accurate edges.As illustrated in Fig. 9, many fine edges are included in the original gradient map, while our smoothed result can produce a gradient map mainly containing meaningful edges.The edge map detected on our smoothed image by the Canny operator is much cleaner and contains fewer unimportant edges.
Image abstraction.Image abstraction is a practical application given the developing demand for image editing tools for amateur users [20].Our method can serve as the abstracting tool.First the input image is smoothed as shown in Fig. 10(b).Then edges are detected in the smooth image.Finally, an enhanced version of the edge map is computed and added back to the smoothed image.This gives a new image with enhanced edges.An example is illustrated in Fig. 10(c).Pencil sketching.Pencil sketching is a useful image editing tool for generating non-photorealistic images [21].It can be accomplished in three steps: firstly, smooth the input image via our method, secondly, detect edges with a Canny operator, and thirdly, randomly select small edge segments with a fixed-length according to the gradient amplitude and add them to the extracted edges.The gray-level is proportional to the gradient amplitude, and the direction is the tangent direction of the edge.As   shown in Fig. 10(d), the result is visually pleasing and reflects the main objects in the input image.

Detail magnification
Detail magnification aims to output an image with similar content but enhanced details relative to the input image.For a given image, first we can get two layers, a smooth layer and a detail layer, via our smoothing method.Then some enhancement operation is performed on the detail layer, e.g., using a difference of Gaussian (DoG) operator.Finally, the enhanced detail layer is composed with the smooth layer.In the composed image details are magnified.
As illustrated in Fig. 11, details in the resulting image are much clearer than in the input image.

Image resizing
Seam carving [22] is a popular content based image resizing method.It aims to keep the shape of the most important objects during the resizing procedure.But textures and details which are common in natural images may also be considered as salient in the contrast-based measurement.Figure 12 illustrates two examples scaled to 0.6 of the size of the input images.From Fig. 12(b) we can see that the grass and wave are considered important, and the stone and sailing boat are distorted mistakenly by the original seam carving method [22].While our method can suppress the features of grass and wave, as shown in Fig. 12(c), seam carving on our output can effectively keep the shapes of the stone and the sailing boat.

Conclusions
This paper provides a novel image smoothing model  based on a nonlocal consistency constraint on the gradient map.Nonlocal estimation provides a datadriven way to distinguish details from structural edges.The method works well for both detail removal in complex areas and feature preservation in less notable areas, and achieves more consistently smoothed results than previous methods.Moreover, as the new model can be solved efficiently by the algorithm described in this paper, it can be flexibly embedded into various techniques, and is able to help improve the performance of many content-aware applications.
The limitation of our method is that it may fail for some mosaic images.In future we will explore proper estimation of the ideal gradient map especially for mosaic images.

Fig. 1
Fig. 1 Image smoothing examples.Top: input images.Bottom: corresponding smoothed results using our method.

Fig. 3 A
Fig.3A scanline extracted from each image in Fig.2, as marked by arrows in Fig.2(a).The green channel is shown as it is representative of the main image content.Each blue curve represents the input signal, and the red curve represents the smoothed signal.
was proposed in 2011 by Paris et al.It manipulates multi-scale details to give halo-free smoothing results.The advantage of this filter is its simplicity.As can be seen from Figs. 2(b)-2(d), the output neither fully preserves the really important edges, nor completely removes some meaningless textures.
)where f and u are the vectorized input and output images respectively, and ∇u = (∇ x u, ∇ y u) denotes gradient.R(∇u) is a regularization term constructed from prior knowledge concerning the gradient map of the ideal smoothed image.The regularization parameter λ balances the smoothing regularization term R(∇u) with the fidelity term ||u − f || 2 2 to control the degree of smoothing.

Fig. 4
Fig. 4 Gradient map of an example natural image and two representative sub-images.(a) Input image.(b) Visualized gradient amplitude map; amplitudes are normalized to [0, 1] and colorized according to the colormap on the right.

Fig. 5
Fig. 5Four representative patches A, B, C, and D (red rectangles), and for each patch, several similar patches (yellow rectangles).

Fig. 6
Fig. 6 Synthetic image example for image smoothing.
Figures 8(b)-8(j) present results of other methods whose parameters have been tuned carefully using our best efforts.As shown in Figs.8(b)-8(e), in order to suppress details, edge-preserving methods, including the bilateral filter, WLS, guided filter, and local Laplacian filter, cause blurring on structural edges.Furthermore, details cannot be removed.Although L 0 smoothing can keep structural edges sharp, it cannot smooth out high-contrast details, as shown in Fig. 8(f).

Fig. 12
Fig. 12 Seam carving.(a) Input.(b) Seam carving results on images in (a).(c) Seam carving results on images smoothed by our method.