1 Introduction

Outdoor images of natural scenes are inevitably degraded in foggy weather, which reduces contrast and fades colors [1]. Hazy images cannot meet the requirements of consumer photography and computer vision applications (e.g., object recognition, video surveillance). To address this problem, much work has been carried out to restore image visibility, and significant progress has been achieved in recent years. He et al. [2] proposed a dark channel prior to remove haze, which achieved impressive results. Tang et al. [3] extracted a set of haze-relevant features and trained a Random Forest regression model to estimate the medium transmission map.

Despite the remarkable progress in image dehazing, the evaluation of dehazing algorithms has received very little attention. Nishino et al. [4] and Meng et al. [5] adopted the widely used subjective analysis and assessed results according to their own judgements. In such subjective evaluation, evaluators tend to favor their own results, making it hard to reach a unified evaluation result. Dai and Tarel [6] invited seven students to evaluate 1500 dehazed images through visual judgements, which reduced personal subjectivity to a certain degree.

To enable quantitative analysis, Wu et al. [7] and Mai et al. [8] employed traditional reference Image Quality Assessment (IQA) indicators to evaluate dehazing performance, such as Mean Squared Error (MSE), Peak Signal to Noise Ratio (PSNR), and Structural Similarity (SSIM) [9]. Since haze-free images were unavailable, these metrics could only be calculated against the original hazy images, leaving the evaluation results unconvincing and unreliable. To make reference assessment feasible, Zhu et al. [10] synthesized hazy images and calculated these IQA indicators between dehazed and clear images. The same assessment manner can be seen in [3, 11, 12, 13]. However, such IQA indicators are mainly used to evaluate typical image distortions, such as blurring and compression. They are not specifically designed for haze removal and therefore cannot evaluate dehazing algorithms effectively and reasonably.

Hautiere et al. [14] paid attention to the evaluation of contrast enhancement for restoration algorithms, which is closely related to dehazing evaluation. In their method, three different descriptors were developed based on the gradient of visible edges, which can be used to measure image visibility. Fang et al. [15] designed a dedicated indicator for haze removal assessment, which measured image visibility through the local band-limited contrast. However, gradient and contrast information is very sensitive to noise, so these indicators cannot give robust and accurate evaluation results in some cases.

Image visibility is an important factor in the evaluation of dehazing algorithms. In this paper, we measure image visibility from two aspects: image sharpness and haze residual degree. The image sharpness is measured with a proposed multi-scale contrast feature, and the haze residual degree is described using the dark channel feature. By fusing the two features, we derive an indicator that evaluates the visibility of the restored image. Experimental results demonstrate the effectiveness and robustness of the proposed method.

The remainder of the paper is organized as follows. In Sect. 2, we present the specially designed indicator to measure the visibility of the restored image. Section 3 gives the experimental results and analysis. Finally, we summarize this paper in Sect. 4.

2 The Proposed Approach

A clear restored image should have enhanced contrast and no haze disturbance. In this paper, two features are designed to describe the image sharpness and haze residual degree, respectively. With the combination of the two features, a visibility indicator is derived to rate the image clarity level.

2.1 Multi-scale Contrast Feature

Images with higher contrast appear sharper to human visual perception. Therefore, measuring image contrast can indicate image sharpness to a certain degree. Weber contrast and Michelson contrast are two popular contrast definitions, which reflect the global contrast of the whole image. Since restoration is usually spatially variant, global contrast cannot make use of local information and leads to inaccurate measurement. Local variance aggregates information from all pixels in a window and attenuates the disruption of extreme noise, which makes it a good contrast indicator. To avoid the expanded magnitude of the variance, its square root, the Root Mean Square (RMS) contrast, is more commonly used [16]. However, local RMS is sensitive to the selected window size and gives different results under different windows. To solve this problem, we develop a multi-scale contrast descriptor that gives stable and unified results.

For an image I, we define its contrast map as the local RMS under a non-overlapping sliding window, described as:

$$\begin{aligned} \mathrm {CM}(u,v)=\sqrt{\frac{1}{k^2}\sum _{x=1}^k\sum _{y=1}^k{\big [\mathrm {I}(k(u-1)+x,k(v-1)+y)-\mu (u,v)\big ]}^2} \end{aligned}$$
(1)

where k is the local window size, and \(\mu \) is the local average value:

$$\begin{aligned} \mu (u,v)=\frac{1}{k^2}\sum _{x=1}^k\sum _{y=1}^k\mathrm {I}\left( k(u-1)+x,k(v-1)+y\right) \end{aligned}$$
(2)
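Equations (1) and (2) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `contrast_map` is hypothetical, a single-channel floating-point image is assumed, and cropping rows and columns that do not fill a complete window is an assumption, since the text does not say how image borders are handled.

```python
import numpy as np

def contrast_map(img: np.ndarray, k: int) -> np.ndarray:
    """Local RMS contrast over non-overlapping k x k windows (Eqs. 1-2)."""
    h, w = img.shape
    rows, cols = h // k, w // k
    # Assumption: trailing rows/columns that do not fill a window are cropped.
    cropped = img[:rows * k, :cols * k].astype(np.float64)
    # Split the image into a (rows, k, cols, k) grid of k x k blocks.
    blocks = cropped.reshape(rows, k, cols, k)
    # The population standard deviation of a block equals the RMS deviation
    # of its pixels from the block mean mu(u, v).
    return blocks.std(axis=(1, 3))
```

Each entry of the returned array corresponds to one window position (u, v), so a 4 x 6 image with k = 2 yields a 2 x 3 contrast map.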

We adopt down-sampling to generate an image pyramid, denoted as \(\mathrm {I}^{(0)},\mathrm {I}^{(1)},\dots ,\mathrm {I}^{(n)}\), where \(\mathrm {I}^{(0)}\) is the initial image and \(\mathrm {I}^{(j+1)}\) is the down-sampled result of \(\mathrm {I}^{(j)}\). We call each down-sampled image a layer. To guarantee that the image is large enough for the subsequent operations, the last layer \(\mathrm {I}^{(n)}\) should satisfy:

$$\begin{aligned} \mathrm {min}\left( h^{(n)},w^{(n)}\right) \ge \xi \end{aligned}$$
(3)

where \(h^{(n)}\) and \(w^{(n)}\) represent the height and width of the image \(\mathrm {I}^{(n)}\), respectively. In this paper, \(\xi \) is fixed to 200.
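The stopping rule of Eq. (3) can be sketched as follows. The text does not specify the down-sampling factor or filter, so the sketch assumes plain decimation by a factor of 2 without smoothing; the function name `build_pyramid` is hypothetical.

```python
import numpy as np

def build_pyramid(img: np.ndarray, xi: int = 200) -> list:
    """Return [I0, I1, ..., In]; every added layer satisfies min(h, w) >= xi."""
    layers = [img]
    while True:
        # Assumption: 2x decimation, keeping every second row and column.
        nxt = layers[-1][::2, ::2]
        if min(nxt.shape[:2]) < xi:
            break  # the next layer would violate Eq. (3), so stop here
        layers.append(nxt)
    return layers
```

Under these assumptions an 800 x 1000 image gives three layers of sizes 800 x 1000, 400 x 500 and 200 x 250. The input image is always kept as the first layer, even if it is itself smaller than the threshold.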

Within one pyramid layer \(\mathrm {I}^{(j)}\), a set of contrast maps is generated with different window sizes \(k_i^{(j)}\), defined as:

$$\begin{aligned} k_i^{(j)}=\left\lfloor \frac{i}{m(j+1)}\mathrm {min}\left( \frac{h^{(j)}}{10},\frac{w^{(j)}}{10}\right) \right\rfloor \quad \quad i=1,2,\dots ,m \end{aligned}$$
(4)

where m is the number of scales in one pyramid layer, and \(\lfloor \cdot \rfloor \) indicates rounding down. In this paper, m is fixed to 3. For each image \(\mathrm {I}^{(j)}\), we produce three contrast maps, marked as \(\mathrm {CM}_1^{(j)}\), \(\mathrm {CM}_2^{(j)}\), \(\mathrm {CM}_3^{(j)}\), which constitute one octave. Note that \(\left\lfloor \frac{1}{(j+1)}\mathrm {min}\left( \frac{h^{(j)}}{10},\frac{w^{(j)}}{10}\right) \right\rfloor \) is the maximum local window size, which ensures that the smallest contrast map is at least \(10\times 10\).
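As a worked example of Eq. (4), the window sizes of one layer can be computed with integer arithmetic. The helper name `window_sizes` is hypothetical; the expression uses the identity \(\mathrm{min}(h/10, w/10) = \mathrm{min}(h, w)/10\) so that the floor is exact.

```python
def window_sizes(h: int, w: int, j: int, m: int = 3) -> list:
    """Window sizes k_1, ..., k_m for pyramid layer j (Eq. 4)."""
    # floor(i / (m (j + 1)) * min(h, w) / 10), evaluated with integer division.
    return [(i * min(h, w)) // (10 * m * (j + 1)) for i in range(1, m + 1)]
```

For a 600 x 800 layer with j = 0 this gives window sizes 20, 40 and 60, so the coarsest contrast map has 600 / 60 = 10 rows, in line with the 10 x 10 lower bound stated above.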

Since the three contrast maps in one octave have different sizes, we resize \(\mathrm {CM}_2^{(j)}\) and \(\mathrm {CM}_3^{(j)}\) by nearest-neighbor interpolation to match the size of \(\mathrm {CM}_1^{(j)}\). At each pixel position, the largest value of the three contrast maps is selected, and a new map is generated as:

$$\begin{aligned} \mathrm {CMap}^{(j)}(u,v)=\mathrm {max}\left( \mathrm {CM}_1^{(j)}(u,v),\mathrm {CM}_2^{(j)}(u,v),\mathrm {CM}_3^{(j)}(u,v) \right) \end{aligned}$$
(5)
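The resizing and pixel-wise maximum of Eq. (5) might look like the sketch below. The names `resize_nearest` and `fuse_octave` are hypothetical, and the index mapping used for nearest-neighbor interpolation (floor of the scaled coordinate) is one common convention, assumed here because the text does not fix it.

```python
import numpy as np

def resize_nearest(a: np.ndarray, shape: tuple) -> np.ndarray:
    """Nearest-neighbor resize of a 2-D array to the given (rows, cols)."""
    # Assumption: target index t maps to source index floor(t * src / dst).
    rows = (np.arange(shape[0]) * a.shape[0]) // shape[0]
    cols = (np.arange(shape[1]) * a.shape[1]) // shape[1]
    return a[np.ix_(rows, cols)]

def fuse_octave(maps: list) -> np.ndarray:
    """Pixel-wise maximum over one octave after resizing to CM_1 (Eq. 5)."""
    target = maps[0].shape
    resized = [maps[0]] + [resize_nearest(cm, target) for cm in maps[1:]]
    return np.maximum.reduce(resized)
```

The first map in the list is taken as \(\mathrm{CM}_1^{(j)}\), which has the smallest window and therefore the largest size, so the other maps are up-sampled to it.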

The maps of the other pyramid layers are generated in the same way as \(\mathrm {CMap}^{(j)}\) for the jth layer. The multi-scale contrast descriptor is then obtained by computing each map’s average value (the sum over all positions divided by the number of positions \(N_j\)) and combining the averages with the L2 norm:

$$\begin{aligned} \mathrm {C_m}={\left\| {\left[ \frac{1}{N_1}\sum _u\sum _v \mathrm {CMap}^{(1)}(u,v),\dots , \frac{1}{N_n}\sum _u\sum _v \mathrm {CMap}^{(n)}(u,v)\right] }^T \right\| }_2 \end{aligned}$$
(6)

The computation of the multi-scale contrast descriptor is illustrated in Fig. 1. This descriptor integrates multi-scale information through the image pyramid, which makes it scale-invariant and gives it a certain robustness to noise.

Fig. 1. An illustration of the computation of the multi-scale contrast descriptor.

Finally, a multi-scale contrast feature that describes the contrast enhancement of a restored image is defined as:

$$\begin{aligned} \mathrm {MC}=\frac{\mathrm {C}_m^d}{\mathrm {C}_m^h} \end{aligned}$$
(7)

where \(\mathrm {C}_m^d\) and \(\mathrm {C}_m^h\) stand for the multi-scale contrast descriptor of the dehazed image and hazy image, respectively. The multi-scale contrast feature MC can reflect the sharpness of a restored image. The larger the MC, the clearer the restored image.
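Given the fused maps of each pyramid layer, Eqs. (6) and (7) reduce to a norm and a ratio. The following is a minimal sketch with hypothetical function names; raising an error for a hazy image with zero contrast is an assumption, since the text does not cover that case.

```python
import numpy as np

def multiscale_descriptor(cmaps: list) -> float:
    """L2 norm of the per-layer mean contrast values (Eq. 6)."""
    means = np.array([np.mean(cm) for cm in cmaps], dtype=np.float64)
    return float(np.linalg.norm(means))

def multiscale_contrast_feature(cmaps_dehazed: list, cmaps_hazy: list) -> float:
    """MC = C_m^d / C_m^h (Eq. 7)."""
    c_h = multiscale_descriptor(cmaps_hazy)
    if c_h == 0.0:
        # Assumption: a perfectly flat hazy image is treated as invalid input.
        raise ValueError("hazy image has zero multi-scale contrast")
    return multiscale_descriptor(cmaps_dehazed) / c_h
```

An MC above 1 means that the dehazed image has more multi-scale contrast than the hazy input, and an MC of exactly 1 means no contrast change.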

2.2 Dark Channel Feature

The contrast feature can reflect image sharpness to some extent. However, when the restored images are under-dehazed, the differences in their contrast are small, and the contrast feature cannot distinguish between them.

A feature highly relevant to haze is the dark channel prior introduced by He et al. [2]. The prior states that, except in sky areas, some pixels in a local haze-free region have low intensity in at least one color channel. In contrast, hazy regions do not follow this principle and have high minimum intensities among the three color channels in a local patch. For a dehazed image, the more haze is removed, the more pixels meet the dark channel prior. Thus, we utilize the dark channel feature to measure the haze residual degree in under-dehazed cases.

The minimum intensity map for an image is defined as:

$$\begin{aligned} \mathrm {I}^m(\mathbf {x})=\min \limits _{c\in \{r,g,b\}}(\min \limits _{\mathbf {y}\in \mathrm {\Omega }(\mathbf {x})}(\mathrm {I}^c(\mathbf {y}))) \end{aligned}$$
(8)

where I represents the image, \(\mathrm {I}^c\) is one of the color channels of I, and \(\mathrm {\Omega (\mathbf {x})}\) is a local patch centered at pixel \(\mathbf {x}\). The patch size is \(15\times 15\).
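The minimum intensity map of Eq. (8) is a channel-wise minimum followed by a local minimum filter. The sketch below uses only NumPy; the function name `min_intensity_map` is hypothetical, and edge replication at the image border is an assumption, since the text does not describe border handling.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def min_intensity_map(img: np.ndarray, patch: int = 15) -> np.ndarray:
    """Minimum over RGB channels and a patch x patch neighborhood (Eq. 8)."""
    # Minimum over the color channels at every pixel (img is H x W x 3).
    channel_min = img.min(axis=2)
    r = patch // 2
    # Assumption: borders are padded by replicating the edge pixels.
    padded = np.pad(channel_min, r, mode="edge")
    # View of every patch x patch window, shape (H, W, patch, patch).
    windows = sliding_window_view(padded, (patch, patch))
    return windows.min(axis=(2, 3))
```

Taking the channel minimum first and the spatial minimum second gives the same result as the order written in Eq. (8), because the two minimum operations commute.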

The average of the minimum intensity map can reflect the haze residual degree of a restored image. To reduce the effect of brightness, the minimum intensity map is normalized by the sum of the RGB channels. Sky regions always have high intensities in all channels and therefore do not fit the dark channel prior, so using the minimum intensity map in sky regions would lead to wrong evaluations. Thus, we define the average of the normalized minimum intensity map in the non-sky region as the dark channel feature to describe the haze residual degree, formally described as:

$$\begin{aligned} \mathrm {DC}=\frac{1}{\Vert \mathbf {S}\Vert }\sum \limits _{\mathbf {x}\in \mathbf {S}}\frac{\mathrm {I}^m(\mathbf {x})}{\sum \limits _{c\in \{r,g,b\}}\mathrm {I}^c(\mathbf {x})+\varepsilon } \end{aligned}$$
(9)

where \(\mathbf {S}\) is the non-sky region of the image, \(\Vert \mathbf {S}\Vert \) is the number of pixels in it, and \(\varepsilon \) is a small value that prevents the denominator from being zero, set to \(10^{-6}\) in this paper. With the dark channel feature DC, under-dehazed images can be identified and evaluated. The larger the DC, the more haze remains in the restored image.
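Equation (9) can be sketched as a masked average. The function name `dark_channel_feature` and its argument layout are hypothetical: the sketch assumes that the minimum intensity map has already been computed, that intensities are scaled to [0, 1], and that the non-sky region is supplied as a boolean mask.

```python
import numpy as np

def dark_channel_feature(min_map: np.ndarray, img: np.ndarray,
                         non_sky: np.ndarray, eps: float = 1e-6) -> float:
    """Mean of the normalized minimum intensity map over the non-sky region (Eq. 9)."""
    if not non_sky.any():
        # Assumption: an image without any non-sky pixel is treated as invalid.
        raise ValueError("non-sky mask is empty")
    # Normalize by the sum of the three color channels at each pixel.
    channel_sum = img.sum(axis=2)
    ratio = min_map / (channel_sum + eps)
    # Boolean indexing keeps only the pixels in S before averaging.
    return float(ratio[non_sky].mean())
```

Pixels outside the mask do not influence the result, so bright sky areas cannot inflate the measured haze residual.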

2.3 The Proposed Visibility Indicator

The restored image should have enhanced contrast and no haze disturbance. Higher MC indicates more contrast enhancement and lower DC represents less remaining haze. Hence, the combination of the two features can reflect the visibility for a restored image. We define the visibility indicator as:

$$\begin{aligned} \mathrm {VI}=\mathrm {MC}-\alpha \mathrm {DC} \end{aligned}$$
(10)

where \(\alpha \) is a parameter used to control the relative importance of the contrast feature and the dark channel feature. A larger \(\alpha \) gives more relative importance to DC than to MC, which is appropriate when the restored image is under-dehazed. From this perspective, we make the parameter \(\alpha \) self-adaptive, as described below.

Generally, dense haze is difficult to remove, and regions with dense haze are likely to be under-dehazed. Hence, the more dense haze pixels an image contains, the more likely it is to be under-dehazed. According to statistics over the minimum intensity maps of a large number of hazy images, a dense haze pixel should satisfy the following constraint:

$$\begin{aligned} \mathrm {I}^m(\mathbf {x})>0.6 \end{aligned}$$
(11)

A hazy image with more dense haze pixels should adopt a higher \(\alpha \), and one with fewer dense haze pixels should use a lower \(\alpha \). Letting r denote the proportion of dense haze pixels in the entire image, the parameter \(\alpha \) is decided according to:

$$\begin{aligned} \alpha =\left\{ \begin{aligned} 1,&\quad r<0.4 \\ 2,&\quad r\ge 0.4 \end{aligned} \right. \end{aligned}$$
(12)
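Equations (10) to (12) can be combined into a short sketch. The function names are hypothetical, intensities are assumed to be scaled to [0, 1] so that the threshold 0.6 of Eq. (11) applies directly, and r is computed on the minimum intensity map of the hazy image, which is how the description above reads.

```python
import numpy as np

def adaptive_alpha(hazy_min_map: np.ndarray,
                   dense_thresh: float = 0.6,
                   ratio_thresh: float = 0.4) -> float:
    """Choose alpha from the proportion r of dense haze pixels (Eqs. 11-12)."""
    # r: fraction of pixels whose minimum intensity exceeds the threshold.
    r = float(np.mean(hazy_min_map > dense_thresh))
    return 2.0 if r >= ratio_thresh else 1.0

def visibility_indicator(mc: float, dc: float, alpha: float) -> float:
    """VI = MC - alpha * DC (Eq. 10)."""
    return mc - alpha * dc
```

With purely illustrative values MC = 1.5 and DC = 0.2, the indicator drops from 1.3 to 1.1 when \(\alpha \) switches from 1 to 2, so remaining haze is penalized more strongly for densely hazed inputs.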

Another key point for the designed indicator is the calculation in sky regions. In general, the sky is a challenging region for dehazing algorithms, which tend to introduce extra noise and halo artifacts there. These distortions inevitably cause high local RMS and thus a high MC. In addition, the DC feature is not applicable to the sky region. Therefore, to guarantee the robustness of the indicator, both MC and DC should be calculated within a non-sky mask. The automatic sky detection approach proposed in [6] is adopted here, and manual segmentation is also feasible.

The designed indicator can measure the visibility of a dehazed image. A larger VI indicates more contrast improvement and less remaining haze, and thus a clearer dehazing result. The effectiveness and robustness of the designed indicator are demonstrated in the following experimental section.

3 Experiments and Analysis

To verify the effectiveness of the proposed indicator, we collect dehazed image samples from http://www.cs.huji.ac.il/~raananf/projects/dehaze_cl/results/ produced by state-of-the-art algorithms, including the methods of He et al. [2], Nishino et al. [4], Meng et al. [5], Fattal [11, 17], Gibson and Nguyen [18] and Kim et al. [19]. Figure 2 presents some dehazing instances along with their corresponding hazy images. For each hazy image, four dehazing results with different clarity are given. Their clarity decreases gradually from left to right.

Fig. 2. (a) Hazy image. (b) to (e) Dehazing results obtained by the method given in the bottom-right corner. Their clarity decreases gradually from left to right.

Table 1. Results of MC, DC and VI on the four groups of images in Fig. 2

In this paper, the visibility indicator VI is proposed to assess the visibility of a dehazed image by combining the multi-scale contrast feature MC and the dark channel feature DC. A higher MC denotes more contrast enhancement and a sharper restored result, a lower DC indicates less remaining haze, and a higher VI indicates a clearer dehazed result. Table 1 gives the MC, DC and VI values for the dehazed images in Fig. 2. As can be seen, the VI values decrease progressively from (b) to (e) for all images, which is consistent with visual judgements. Note that all calculations for the Buildings image in the first row are performed under a non-sky mask, which avoids the improper increase of MC caused by noise in the sky region. For the Farmland and Red House images, the MC and VI values decrease and the DC values increase gradually from (b) to (e), which is consistent with the visual assessment. For the City image in the last row, the MC values do not decrease monotonically, giving a wrong evaluation of image sharpness for (c) and (d). DC measures the haze residual degree correctly and compensates for this error, so the final visibility indicator VI still gives an accurate assessment. Therefore, the dark channel feature DC can, to a certain degree, compensate for wrong evaluations of the contrast feature MC, which demonstrates the reasonableness and effectiveness of the designed indicator.

The approach is compared with existing evaluation metrics that can measure the visibility of a dehazed image, including the image visibility descriptors e, \(\overline{\text {r}}\), \(\sigma \) [14], and the image contrast metric \(C\_values\) [15]. Higher values of these metrics indicate clearer results. Table 2 reports the results of these metrics on the images shown in Fig. 2. According to the subjective assessment, the three descriptors e, \(\overline{\text {r}}\), \(\sigma \) and the contrast metric \(C\_values\) should decrease strictly from (b) to (e). However, most of their results are inconsistent with visual perception. Especially for the Buildings image, which has noise in the sky region, the results of these metrics deviate seriously from visual observation. In contrast, our indicator ranks all four groups consistently with visual judgement (see Table 1), demonstrating the advantage and robustness of our approach.

Table 2. Results of e, \(\overline{\text {r}}\), \(\sigma \) and \(C\_values\) on the four groups of images in Fig. 2

Fig. 3. (a) Reference image. (b) Synthetic hazy image. (c) Fattal’s result. (d) He et al.’s result. (e) Cai et al.’s result. (f) Kim et al.’s result.

Table 3. Results of LCC and SROCC on 160 images

To quantitatively assess the performance of the proposed approach, we test our indicator and several compared metrics on synthetic images. The compared metrics include the image visibility descriptors e, \(\overline{\text {r}}\), \(\sigma \) [14], the image contrast metric \(C\_values\) [15], the image enhancement metric BIQME [20], and the general IQA metric BRISQUE [21]. We synthesize 40 hazy images using the method in [3]. Dehazing them with the methods of He et al. [2], Fattal [11], Cai et al. [13] and Kim et al. [19] yields \(40\times 4=160\) dehazed images. Figure 3 gives a dehazing instance, where (a) is the clear (reference) image, (b) is the synthetic hazy image, and (c) to (f) are the dehazing results of Fattal [11], He et al. [2], Cai et al. [13] and Kim et al. [19], respectively. For a dehazed image, the MSE between it and the corresponding reference image can be used as the quality ground truth for the visibility assessment. We then use two criteria to measure the correlation between the quality ground truth and the evaluation results of the compared methods and ours: the Pearson linear correlation coefficient (LCC) and the Spearman rank-order correlation coefficient (SROCC). Both coefficients lie in \([-1, 1]\), where 1 stands for total positive correlation, 0 for no correlation, and \(-1\) for total negative correlation (linear for LCC, monotonic in rank for SROCC). Since both the compared metrics and our indicator are negatively related to the MSE scores, we flip the sign of the LCC and SROCC results so that they are reported as positive values. Table 3 gives the two coefficients between the MSE scores and the metric values on the 160 images. As can be seen, the proposed indicator VI has the highest correlation with the quality ground truth and outperforms the compared metrics by a large margin.
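The two correlation criteria can be sketched with NumPy alone. The function names are hypothetical, and the ranking helper assumes that there are no tied values (ties would need average ranks, which this sketch does not implement). The sign flip at the end of each function mirrors the reporting convention described above.

```python
import numpy as np

def pearson(x, y) -> float:
    """Pearson linear correlation coefficient (LCC)."""
    return float(np.corrcoef(x, y)[0, 1])

def ranks(values) -> np.ndarray:
    """Ranks 1..N of the values; assumption: no ties."""
    order = np.argsort(values)
    out = np.empty(len(values), dtype=np.float64)
    out[order] = np.arange(1, len(values) + 1)
    return out

def spearman(x, y) -> float:
    """Spearman rank-order correlation (SROCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def reported_lcc(mse, metric) -> float:
    """LCC with its sign flipped, since a good metric falls as MSE rises."""
    return -pearson(mse, metric)

def reported_srocc(mse, metric) -> float:
    """SROCC with its sign flipped, for the same reason."""
    return -spearman(mse, metric)
```

For instance, with made-up MSE scores [0.01, 0.02, 0.04, 0.08] and metric values [2.0, 1.5, 1.2, 1.1], the metric decreases monotonically as the MSE grows, so the reported SROCC reaches its maximum while the reported LCC stays below it because the relation is not exactly linear.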

4 Conclusion

Evaluating dehazing methods is a difficult task. In this paper, the problem is addressed with a proposed indicator that measures the visibility of the restored image. The proposed visibility indicator VI is designed from two aspects: image sharpness and haze residual degree. A multi-scale contrast feature is proposed to measure the image sharpness. Compared with traditional contrast metrics, the proposed feature integrates multi-scale information through the image pyramid, which makes it scale-invariant and gives it a certain robustness to noise. Since the contrast feature cannot give an accurate assessment when under-dehazing occurs, the dark channel feature is adopted to reflect the haze residual degree of the dehazed image. The final visibility indicator is derived by fusing the two features with a balance coefficient. Experimental results indicate that, compared with existing relevant metrics, the proposed approach evaluates the visibility of the dehazed image accurately and is highly consistent with human visual perception and objective quality scores.