
1 Introduction

Image editing is an important research area in computer vision and image processing. In recent years, the local linear model has been widely applied to image editing applications such as image filtering and alpha matting. For example, closed form matting assumes that the alpha values in a local area of the image satisfy the local linear model. The guided filter assumes that the filter output is a linear function of the pixels in the guidance image. All these methods assume that the pixels in a local neighborhood of the image can be represented in a linear form.

However, when the local linear model fails under special conditions, these applications cannot achieve satisfactory results. In this paper, we extend the local linear model to a nonlocal nonlinear model. Different from the general local linear model defined on the gray values of pixels, the nonlocal nonlinear model is defined on nonlocal features of the image, and assumes that the features in each nonlocal area satisfy a nonlinear model, which can be seen as an extension of the local linear model.

In the following discussion, several applications of the nonlocal nonlinear model are introduced, such as image filtering, image upsampling, and alpha matting. With the nonlocal nonlinear model, we show that our method achieves better results. Our method can be widely applied to image denoising [1], upsampling [2], and alpha matting [3–5]. Results show that our filter is effective.

2 Related Work

Image editing is an important technique in image processing, including alpha matting, image filtering, etc. In recent years, the local linear model has been widely applied to image editing applications to obtain optimized results. It is one of the most important assumptions in image editing methods.

Levin et al. [6] proposed a colorization method based on the local linear model; they assume that the colors in each small neighborhood satisfy a linear structure. This idea was later extended to alpha matting. Levin et al. [7] assume that the alpha values satisfy the local linear model in each small neighborhood. However, when the alpha values do not satisfy the local linear model, closed form matting cannot produce effective results. In regions where the foreground and background colors are hard to propagate, the local model also fails. Different from Levin's work [7], we assume that the alpha values satisfy a nonlinear model over the K nearest neighbors (KNN). Compared with the earlier methods, our nonlocal model achieves more effective results.

Image filters are widely applied in image editing. Early image filters focus on preserving image edges, such as the bilateral filter [8]. Current image filtering techniques aim at maintaining the important structures of the image, such as edges. The geodesic filter [9] maintains the edge structure of the image by preserving geodesic distance. The adaptive manifold filter [10, 11] is a real-time high-dimensional filter based on adaptive manifolds. All these works focus on implementing highly efficient filters that preserve edge structure.

Edges can also be regarded as a kind of local structure in the image. We argue that an effective filter should preserve the local structure of the image, which reflects the relationship between neighboring pixels. He et al. [12] proposed the guided filter based on a guidance image. In this paper, we extend the guided filter by using the nonlocal nonlinear model instead of the local linear model, and preserve the nonlocal nonlinear structure in the target image.

Based on the nonlinear model, we also propose a new image upsampling method. Different from traditional upsampling methods such as bicubic interpolation, our method is based on fitting: we assume that each neighborhood in the image satisfies a nonlinear structure, so we can learn this nonlinear structure from the low resolution image and interpolate the subpixels accordingly. This method is not sensitive to noise and produces smooth results. Experimental results show that our method is effective.

3 Alpha Matting with Nonlocal and Nonlinear Learning Model

In the local linear model of alpha matting, the following linear equation is assumed to hold in the local neighborhood of each pixel.

$$\begin{aligned} \alpha _i = X^T\beta _1 +\beta _2 \end{aligned}$$
(1)

where X is the data vector of the gray values of the pixels in each local area.

Similar to learning based matting [13] and KNN matting [5], a nonlinear model is used instead of the local linear model, and a nonlocal neighborhood is used instead of a local one.

We assume that the alpha values in each KNN neighborhood satisfy a nonlinear model. Each pixel in the KNN tree is described by a five-dimensional vector (r, g, b, x, y), where (x, y) is the spatial position of the pixel in the image. In each KNN neighborhood, the alpha value of pixel i satisfies the following nonlinear equation:

$$\begin{aligned} \alpha _i = f(x_i)=\varPhi (X)^T.\beta \end{aligned}$$
(2)

where \(\varPhi (X)\) is the data vector of nonlinear functions \(\varPhi (x)\), and x is the rgb channels of the pixels in the KNN neighborhood of pixel i. Formula 2 can be seen as an extension of formula 1.
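To make the nonlocal neighborhood concrete, the sketch below gathers, for every pixel, its K nearest neighbors in the (r, g, b, x, y) feature space with a k-d tree. This is a minimal sketch under our own assumptions: the spatial weight, the value of K, and the helper name knn_neighborhoods are illustrative and not prescribed by the method.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_neighborhoods(img, K=10, spatial_weight=1.0):
    """Return, for each pixel, the flattened indices of its K nearest
    neighbors in the (r, g, b, x, y) feature space."""
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Five-dimensional feature: color channels plus (scaled) spatial position.
    feats = np.concatenate(
        [img.reshape(-1, 3),
         spatial_weight * np.stack([xs, ys], axis=-1).reshape(-1, 2)],
        axis=1).astype(np.float64)
    tree = cKDTree(feats)
    # Query K+1 neighbors because the nearest neighbor of a pixel is itself.
    _, idx = tree.query(feats, k=K + 1)
    return idx[:, 1:]  # drop the self-match
```

The scaling between the color and spatial dimensions controls how nonlocal the neighborhoods are: a small spatial weight groups pixels mainly by color, while a large one approaches an ordinary local window.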

The coefficient \(\beta \) can be solved from the vector \(\overline{\alpha _i}\) of alpha values of the pixels in the KNN neighborhood by the following formula:

$$\begin{aligned} \beta = \arg \min _{\beta }\Vert \overline{\alpha _i}-\varPhi (X_i)^T.\beta \Vert ^2 + \lambda \Vert \beta \Vert ^2 \end{aligned}$$
(3)
$$\begin{aligned} \beta =\varPhi (X_i).(\varPhi (X_i)^T\varPhi (X_i) +\lambda I)^{-1}.\overline{\alpha _i} \end{aligned}$$
(4)

Substituting \(\beta \) into formula 2, we obtain the following equation:

$$\begin{aligned} \begin{aligned} f(x_i)=\varPhi (x_i)^T.\beta \\ =\varPhi (x_i)^T.\varPhi (X_i).(\varPhi (X_i)^T.\varPhi (X_i)+\lambda I_m)^{-1}.f(X_i) \end{aligned} \end{aligned}$$
(5)

Notice that formula 5 depends only on inner products of data vectors, so a kernel function \(k(x_1,x_2)\) can be used to represent these inner products via the kernel trick.

$$\begin{aligned} K_i(X_i,X_i) = \varPhi (X_i)^T.\varPhi (X_i),\qquad k(x_i,x_j) = \varPhi (x_i)^T.\varPhi (x_j) \end{aligned}$$
(6)

\(K_i(X_i,X_i)\) is the matrix given by the following formula:

$$\begin{aligned} K_i(X_i,X_i) = \begin{bmatrix} k(x_{\tau 1}',x_{\tau 1}')&...&k(x_{\tau 1}',x_{\tau m}')\\ ...&...&...\\ k(x_{\tau m}',x_{\tau 1}')&...&k(x_{\tau m}',x_{\tau m}') \end{bmatrix} \end{aligned}$$
(7)

where \(k(x_i,x_j)\) is a kernel function as used in machine learning, and \(k_i\) denotes the vector whose jth entry is \(k(x_i,x_{\tau j}')\), i.e., the kernel values between pixel i and its neighbors.

$$\begin{aligned} \begin{aligned} \alpha _i=f(x_i)\\ =((K_i +\lambda _r I)^{-1}k_i).f(X_i)\\ = \kappa (x_i) .f(X_i) = \kappa (x_i) .\overline{\alpha _i} \end{aligned} \end{aligned}$$
(8)

Formula 8 provides a linear relation between \(\alpha _i\) and the alpha values of its neighboring pixels, which leads to a closed-form solution for alpha matting. Since the Gaussian kernel tends to produce hard edges [5] in alpha matting, a polynomial kernel is used instead, which yields smoother alpha mattes. Compared with closed form matting and KNN matting, Fig. 1 shows that our method obtains better results.
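As an illustration of formula 8, the following sketch predicts the alpha value of one pixel from the known alpha values of its KNN neighbors using a polynomial kernel. The kernel degree, the regularization weight, and the function names are our own illustrative assumptions.

```python
import numpy as np

def poly_kernel(A, B, degree=2, c=1.0):
    """Polynomial kernel k(a, b) = (a.b + c)^degree, computed pairwise."""
    return (A @ B.T + c) ** degree

def alpha_from_neighbors(x_i, X_i, alpha_neighbors, lam=1e-3, degree=2):
    """Formula 8: alpha_i = ((K_i + lam I)^{-1} k_i) . alpha_neighbors.

    x_i             : (d,)   feature of pixel i (e.g. its rgb values)
    X_i             : (m, d) features of its K nearest neighbors
    alpha_neighbors : (m,)   alpha values of those neighbors
    """
    K_i = poly_kernel(X_i, X_i, degree)                    # m x m kernel matrix
    k_i = poly_kernel(X_i, x_i[None, :], degree)[:, 0]     # kernel values k(x_tau_j, x_i)
    kappa = np.linalg.solve(K_i + lam * np.eye(len(X_i)), k_i)  # (K_i + lam I)^{-1} k_i
    return kappa @ alpha_neighbors
```

In the full matting problem the neighbor alpha values are themselves unknown, so formula 8 is used as a linear constraint over all pixels, which leads to the closed-form solution mentioned above.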

Fig. 1. Alpha matting by the nonlocal and nonlinear model.

Fig. 2. Details of the alpha mattes produced by different methods.

In Fig. 1, closed form matting with the local model cannot eliminate the characters in the background; the details are shown in Fig. 2. KNN matting [5] and CMM matting [14] eliminate the characters, but with a linear model they do not remove the noise around the hair. Our method is nonlinear and nonlocal, and it obtains a better alpha matte, as shown in Fig. 2d and f.

4 Learning Based Filter with Nonlinear Model

In this section, the learning based filter is defined by the nonlinear model. For a given image I, the filter output at pixel i is a weighted average of the local neighboring pixels in image I, which is represented by the following formula:

$$\begin{aligned} I_i'=F_{i}.\dot{I_i} \end{aligned}$$
(9)

\(\dot{I_i}\) is the data vector of the pixels around pixel i, and \(F_i\) is the weight vector of the filter, which should be designed to maintain the features of the image. In this paper, the weights \(F_{i}\) are solved by learning from a guidance image.

Let the feature of pixel i in the guidance image be represented by a d-dimensional vector \(x_i\); the feature can be, for example, the (R, G, B) channels of the pixel. Let \(\varOmega _i=[\tau _1,\tau _2,\tau _3...\tau _m]\) denote the m neighboring pixels around pixel i. \(X_i\) is the \(m*d\) matrix describing the features of the pixels in \(\varOmega _i\); we denote \(X_i=[x_{\tau _1},x_{\tau _2},x_{\tau _3}...x_{\tau _m}]\).

Let the gray value of pixel i be \(f(x_i)\). The guided filter assumes that the values \(f(x_j)\) in \(\varOmega _i\) of the guidance image satisfy the local linear model, which is represented by the following formula:

$$\begin{aligned} \begin{aligned} f(x_j) =x_j.\beta ,j \in \varOmega _i\\ \beta =[\beta _1,\beta _2...\beta _d]^T \end{aligned} \end{aligned}$$
(10)

In this paper, similar to the nonlocal and nonlinear alpha matting above, the gray value of a pixel is assumed to satisfy the following nonlinear formula:

$$\begin{aligned} f(x_j) =\varPhi (X)^T.\beta \end{aligned}$$
(11)

The coefficient \(\beta \) can be solved by the least squares method:

$$\begin{aligned} \beta = \arg \min _{\beta }\Vert f(X_i)-\varPhi (X_i)^T.\beta \Vert ^2 + \lambda \Vert \beta \Vert ^2 \end{aligned}$$
(12)
$$\begin{aligned} \beta = \varPhi (X_i).(\varPhi (X_i)^T\varPhi (X_i) +\lambda I)^{-1}.f(X_i) \end{aligned}$$
(13)

Substituting formula 13 into formula 11, we have

$$\begin{aligned} \begin{aligned} f(x_i)=\varPhi (x_i)^T.\beta \\ =\varPhi (x_i)^T.\varPhi (X_i).(\varPhi (X_i)^T.\varPhi (X_i)+\lambda _r I_m)^{-1}.f(X_i) \end{aligned} \end{aligned}$$
(14)

The gray value of pixel i in the guidance image can then be represented in the following linear form:

$$\begin{aligned} \begin{aligned} f(x_i) = ((K_i + \lambda _r I_m)^{-1}k_i).f(X_i)\\ = \kappa (x_i) .f(X_i)\\ \kappa (x_i) =(K_i + \lambda _r I_m)^{-1}k_i \end{aligned} \end{aligned}$$
(15)

where \(\kappa (x_i)\) is the vector that describes the local structure of the guidance image.
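The sketch below computes \(\kappa (x_i)\) for a single window of the guidance image following formula 15. The Gaussian kernel, the bandwidth, and the value of \(\lambda _r\) are illustrative choices of ours; any kernel discussed in this paper could be substituted.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2), computed pairwise."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def kappa_weights(x_i, X_i, lam_r=1e-2, gamma=1.0):
    """Formula 15: kappa(x_i) = (K_i + lam_r I)^{-1} k_i, learned from one guidance window.

    x_i : (d,)   guidance feature of the center pixel
    X_i : (m, d) guidance features of the m pixels in the window Omega_i
    """
    K_i = gaussian_kernel(X_i, X_i, gamma)                 # m x m kernel matrix
    k_i = gaussian_kernel(X_i, x_i[None, :], gamma)[:, 0]  # kernel values against the center
    return np.linalg.solve(K_i + lam_r * np.eye(len(X_i)), k_i)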

Assuming the local structure of the input image is similar to that of the guidance image, the output of the filter can be represented by the following formula:

$$\begin{aligned} I_i^{t'} =\kappa (x_i).\dot{I_i^t} \end{aligned}$$
(16)

where \(\dot{I_i^t}\) is the data vector of the pixels around pixel i in the input image, and \(\kappa (x_i)\) is the structure coefficient learned from the guidance image.

We apply our nonlinear model to all windows that contain pixel i in the input image. Each window gives a different output for pixel i; a simple strategy is to average all these values over the different windows \(\omega _k\).

Let \(f(x_i,x_j)\) be the \(j_{th}\) entry of the vector \(\kappa (x_i)\). The weight of the filter in formula 9 can then be represented by the following formula:

$$\begin{aligned} F_i^j=\sum _{(i,j)\in \omega _k}(f(x_i,x_j)), \forall i \in \omega _k \end{aligned}$$
(17)
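A minimal sketch of the whole filter (formulas 16 and 17) is given below under one reading of the averaging strategy: for every window of the guidance image it learns the kernel-regression weights, predicts every pixel of the corresponding input window, and averages the predictions of all overlapping windows. It reuses the gaussian_kernel helper from the previous sketch; the window radius and parameters are illustrative, and no attempt is made at the efficiency of a production filter.

```python
import numpy as np

def learning_based_filter(guide, target, radius=3, lam_r=1e-2, gamma=1.0):
    """Filter `target` with structure weights learned from `guide` (formulas 16-17).

    guide  : (h, w, d) guidance features (e.g. rgb), float
    target : (h, w)    input image to be filtered
    Reuses gaussian_kernel() from the previous sketch.
    """
    h, w = target.shape
    out = np.zeros((h, w), dtype=np.float64)
    cnt = np.zeros((h, w), dtype=np.float64)
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            win = (slice(y - radius, y + radius + 1),
                   slice(x - radius, x + radius + 1))
            G = guide[win].reshape(-1, guide.shape[2])           # guidance features in window
            K = gaussian_kernel(G, G, gamma)                     # m x m kernel matrix
            S = K @ np.linalg.inv(K + lam_r * np.eye(len(G)))    # row j is kappa(x_j) for pixel j
            pred = S @ target[win].ravel()                       # formula 16 for every pixel in window
            out[win] += pred.reshape(2 * radius + 1, 2 * radius + 1)
            cnt[win] += 1.0                                      # count overlapping windows (formula 17)
    # Border pixels are not covered by any window in this simplified sketch.
    return out / np.maximum(cnt, 1.0)
```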

4.1 Guided Filter and the Learning Based Filter

In the learning based filter, the image features can be represented in different forms. Let the feature be defined by \(x'= [I,1]\), where I is the gray value of the pixel. By assuming that the linear model holds in each small window, formula 11 can be replaced by the following formula:

$$\begin{aligned} I = x^T\beta _1 +\beta _2 = x'^T \begin{bmatrix} \beta _1 \\ \beta _2 \end{bmatrix} \end{aligned}$$
(18)

Solving the corresponding least squares problem, we obtain:

$$\begin{aligned} \begin{aligned} \begin{bmatrix}\beta _1 \\ \beta _2 \end{bmatrix} = \arg \min \Vert I_i-X_i.\begin{bmatrix}\beta _1 \\ \beta _2 \end{bmatrix}\Vert ^2 + \gamma \Vert \begin{bmatrix}\beta _1 \\ 0 \end{bmatrix}\Vert ^2\\ X_i= \begin{bmatrix}I_1&1 \\ I_2&1 \\ ...&... \\ I_m&1 \end{bmatrix} \end{aligned} \end{aligned}$$
(19)

Notice that formula 18 is the local linear model used in [7, 12]. Formula 18 is the same as that in the guided filter, so the guided filter can be regarded as a special case of the learning based filter.
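For completeness, this linear special case (formulas 18 and 19) can be sketched as a per-window ridge regression on the feature [I, 1], which mirrors the slope and offset computed per window by the guided filter. The function name and the regularization value below are illustrative assumptions.

```python
import numpy as np

def linear_window_coeffs(I_window, gamma=1e-2):
    """Solve formula 19 for one window: features [I, 1], penalizing only beta_1.

    I_window : (m,) gray values of the guidance pixels in the window
    Returns (beta_1, beta_2), the slope and offset of formula 18.
    """
    X = np.stack([I_window, np.ones_like(I_window)], axis=1)  # m x 2 design matrix
    R = np.diag([gamma, 0.0])                                 # regularize beta_1 only
    beta = np.linalg.solve(X.T @ X + R, X.T @ I_window)
    return beta[0], beta[1]
```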

However, our method is quite different from the guided filter. The guided filter is based on the local linear model, whereas the learning based filter is based on learning and can be extended to nonlinear and nonlocal models; it can be defined not only on the gray values of pixels but also on other image features.

In Fig. 4, we show the results of different filters. It is clear that our method and the guided filter obtain better results. Without a guidance image, the adaptive manifold filter cannot preserve edges under heavy noise. The domain interpolation filter also cannot eliminate heavy noise. Because our method learns the nonlinear structure from the guidance image, our filter obtains good results even with heavy noise. Compared with the polynomial kernel, the Gaussian kernel is more effective at maintaining edges; we find that the learning based filter preserves texture and edges very well (Fig. 3).

Fig. 3. The noisy image and the guidance image.

5 Image Upsampling by Learning

From the previous sections, we know that our method can extract the nonlinear structure from a nonlocal area of the image. This nonlinear structure can also be used to upsample the image.

Let the gray value of pixel i be represented by \(f(x_i)\). Different from alpha matting, here \(x_i\) contains the x and y coordinates of pixel i in the image. \(X_i\) is the \(m*2\) matrix whose rows contain the coordinates of the pixels around pixel i.

Assuming the pixels in a local area of the low resolution image satisfy \(f(x_i)=\varPhi (X_i)^T.\beta \), the function f(x) can be learned with the local nonlinear model from the low resolution image. Similar to alpha matting and the learning based filter, with the known pixels in \(X_i\), \(f(x_i)\) can be solved by the following formula:

$$\begin{aligned} f(x_i)=(K_i +\lambda _r I)^{-1}k_i.f(X_i) \end{aligned}$$
(20)
Fig. 4. Image denoising with a guidance image.

Fig. 5. Image upsampling by the learning based filter.

Fig. 6. Details of the upsampled images.

$$\begin{aligned} \begin{aligned} K_i = \begin{bmatrix} k(x_{\tau 1}',x_{\tau 1}')&...&k(x_{\tau 1}',x_{\tau m}')\\ ...&...&... \\ k(x_{\tau m}',x_{\tau 1}')&...&k(x_{\tau m}',x_{\tau m}') \end{bmatrix} \end{aligned} \end{aligned}$$
(21)

We use the Gaussian kernel function, which is defined by the following formula:

$$\begin{aligned} k(x,y) = \gamma .exp(-\Vert x-y\Vert ^2) \end{aligned}$$
(22)

where \(\Vert x-y\Vert ^2\) is the squared Euclidean distance between the two pixels x and y. In the low resolution image, \(X_i\) is the data vector of the pixels with integer coordinates in a small window. To upsample the image, we only need to calculate the gray values of the pixels at fractional coordinates. Since the pixels in \(X_i\) are known, the gray values at fractional coordinates are easy to calculate with formula 20.
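The sketch below upsamples a single low-resolution patch in this way: it learns the kernel regression of formula 20 from the integer-coordinate pixels and evaluates it at the fractional coordinates of the high-resolution grid. The patch-based interface, the bandwidth, and the \(\lambda _r\) value are illustrative assumptions, and the Gaussian kernel is written with the bandwidth inside the exponent, a common variant of formula 22.

```python
import numpy as np

def upsample_patch(patch, scale=2, lam_r=1e-3, gamma=0.5):
    """Formula 20 on one low-resolution patch: learn the nonlinear structure
    from the integer-coordinate pixels and evaluate it at fractional coordinates.

    patch : (n, n) gray values of a low-resolution window
    """
    n = patch.shape[0]
    ys, xs = np.mgrid[0:n, 0:n]
    X = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float64)  # integer coordinates
    f = patch.ravel().astype(np.float64)

    def kern(A, B):
        # Gaussian kernel with the bandwidth folded into the exponent (variant of formula 22).
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-gamma * d2)

    K = kern(X, X)
    w = np.linalg.solve(K + lam_r * np.eye(len(X)), f)     # (K_i + lam_r I)^{-1} f(X_i)

    # Fractional coordinates of the high-resolution grid.
    m = n * scale
    yq, xq = np.mgrid[0:m, 0:m] / float(scale)
    Q = np.stack([xq.ravel(), yq.ravel()], axis=1)
    return (kern(Q, X) @ w).reshape(m, m)                   # formula 20 at each subpixel
```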

In Fig. 6, we show the details of different upsampling methods. In Fig. 6b, bicubic interpolation cannot produce smooth results around the arrow; the noise around the arrow is amplified by the interpolation. Shan's method is based on deconvolution: a large kernel leads to ringing artifacts in Fig. 6c, and a small kernel produces noise around the arrow in Fig. 6d. The learning based upsampling method learns the nonlinear structure, including edges, from the low resolution image. It achieves smooth results, which are better than those of the other methods (Fig. 5).

6 Conclusion

In this paper, a new learning based image editing method is proposed. This method can be applied to image denoising and image upsampling. Results show that the method is effective. In the future, we will apply our method to new techniques such as image composition [15–18].