1 Introduction

Dehazed image quality assessment aims to evaluate the quality of dehazed images, which indirectly reflects the performance of the dehazing algorithm. Dehazing algorithms tend to introduce contrast distortion, noise, and color cast, for example the common over-saturation effect, all of which seriously degrade how the image is perceived by the human eye. Dehazed image quality assessment has therefore attracted increasing attention.

Several dehazed image quality evaluation algorithms have been proposed recently. The majority of them, such as [1,2,3], train models on a large number of dehazed images with accurate subjective quality scores. However, acquiring subjective scores for dehazed images raises several problems. First, the subjective quality scores are not accurate. Observers are usually unsure which score describes the quality of a dehazed image accurately, so they pick a score at random from a small, rough range. As a result, the scores can hardly reflect small quality differences between images. Second, the subjective quality scores are easily influenced by the observers’ preference for the image content, which further reduces their reliability. Third, it is difficult to build a large-scale dehazed image quality evaluation database, which limits the practicality of [4, 5], or to extend an existing database, because the construction process is inconvenient: (1) the database must cover a variety of distortion types, and for each type it must include a number of images with different distortion degrees or different contents; (2) to reduce the impact of personal preference, the organizer needs to arrange for multiple observers to judge the quality of each image, which greatly increases the cost in manpower, material, and time.

In summary, subjective quality scores are inaccurate, biased, and time-consuming to collect, which limits the reliability and extensibility of these dehazed image quality evaluation algorithms.

To overcome these problems, we propose a rank learning algorithm that evaluates dehazed image quality using preference information. Rank learning is a key issue in application areas such as page ordering, text retrieval, and image search; it aims to learn a function that predicts the rank sequence of a given set of input stimuli. Subjective quality preference refers to information such as “the quality of image \(I_a\) is better than that of image \(I_b\)”. Given a pair of images, we call it a “Preference Image Pair” (PIP) if the relative quality of the two images is known, and we represent this relative quality by a Preference Label. In our algorithm, we transform the problem of dehazed image quality evaluation into a classification problem of quality preference learning, and then use a random forest followed by successive pairwise comparisons to learn a function that predicts the quality rank sequence of a given set of dehazed images. The experimental results show that our algorithm is highly consistent with human subjective perception and is superior to traditional dehazed image quality evaluation algorithms. Moreover, our algorithm is easy to extend.

2 The Acquisition of Preference Image Pairs

Researchers have already constructed several databases for dehazed image quality evaluation, so a proper method for obtaining reliable PIPs from these existing databases is valuable. From the existing databases, we select dehazed images with a large difference in quality scores, construct preference image pairs from them, and derive the preference labels from their subjective quality scores. Moreover, to obtain more PIPs, we apply different dehazing algorithms to produce dehazed images of different quality, then build preference image pairs and obtain the corresponding preference labels, which also demonstrates the extensibility of the proposed method.

If an existing dehazed image quality evaluation database contains n images, we can obtain the set of preference image pairs \(P_1\), whose size is \(N_1\):

$$\begin{aligned} {P_1} \subseteq \left\{ {\left( {{I_i},{I_j}} \right) \left| {\left| {{s_i} - {s_j}} \right| > T,i,j = 1,...,n} \right. } \right\} \ \end{aligned}$$
(1)

where T is the threshold on the difference in subjective quality scores, \(s_i\) is the quality score of image \(I_i\), and \(|s_i-s_j|\) is the absolute value of \((s_i-s_j)\). For each preference image pair \(p_k=(I_i,I_j) \in P_1\), \(k=1,2,...,N_1\), we obtain the preference label \(l_k\) from \((s_i-s_j)\):

$$\begin{aligned} {l_k} = \left\{ \begin{array}{l} sign({s_i} - {s_j})\\ - sign({s_i} - {s_j}) \end{array} \right. \ \end{aligned}$$
(2)
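As a minimal sketch of Eqs. (1) and (2), not the authors' implementation, the following code enumerates image pairs whose score gap exceeds T and assigns a label from the sign of the score difference. The function name `build_preference_pairs` and the example scores are hypothetical. The sketch assumes that a higher subjective score means better quality; for a database where a lower score is better, the label would be negated. It also keeps each unordered pair once, which is one possible subset of the set in Eq. (1).

```python
from itertools import combinations


def build_preference_pairs(scores, threshold):
    """Return index pairs (i, j) with |s_i - s_j| > threshold and their labels.

    Assumption: higher score = better quality, so label = sign(s_i - s_j).
    """
    pairs, labels = [], []
    for i, j in combinations(range(len(scores)), 2):
        diff = scores[i] - scores[j]
        if abs(diff) > threshold:
            pairs.append((i, j))
            labels.append(1 if diff > 0 else -1)
    return pairs, labels


# Hypothetical scores for four images; T = 10 discards the close pair (0, 2).
pairs, labels = build_preference_pairs([80.0, 55.0, 78.0, 30.0], threshold=10.0)
```

Raising the threshold keeps only pairs whose quality difference is unambiguous, at the cost of fewer training pairs.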

We obtain more PIPs with the following method. First, we select some original hazy images and apply the methods of Fattal13 [6], He09 [7], and Choi [4, 8, 9] to obtain dehazed images; we then form preference image pairs and assign the corresponding preference labels through our own subjective evaluation. To simplify the acquisition of preference image pairs, we specify that each sub-image pair carries the same preference label as the preference image pair it is cut from. For each pair of images, we randomly select 50 non-overlapping sub-image blocks of size \(n \times n\), where n is 64. In this way, we obtain a second set of preference image pairs, whose size we denote by \(N_2\).
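The block sampling step can be sketched as follows. This is an illustration under two assumptions that the text does not state: the non-overlapping blocks are drawn from a regular \(64 \times 64\) grid, and the same block positions are cut from both images of a pair so that each sub-image pair can inherit the label. The names `sample_block_corners` and `crop_pair` are hypothetical.

```python
import random

import numpy as np


def sample_block_corners(height, width, block=64, count=50, seed=0):
    """Pick `count` distinct cells of a block x block grid (top-left corners)."""
    cells = [(r * block, c * block)
             for r in range(height // block)
             for c in range(width // block)]
    if len(cells) < count:
        raise ValueError("image too small for the requested number of blocks")
    return random.Random(seed).sample(cells, count)


def crop_pair(img_a, img_b, corners, block=64):
    """Cut the same blocks from both images; each sub-pair inherits the label."""
    return [(img_a[r:r + block, c:c + block], img_b[r:r + block, c:c + block])
            for r, c in corners]


corners = sample_block_corners(512, 640)
sub_pairs = crop_pair(np.zeros((512, 640)), np.ones((512, 640)), corners)
```

Sampling grid cells without replacement guarantees that no two blocks overlap; a free-position sampler with an overlap check would be an alternative.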

Finally, we obtain our set of preference image pairs as follows:

$$\begin{aligned} P = \left\{ {\left( {{I_{k1}},{I_{k2}}} \right) ,k = 1,...,N} \right\} \end{aligned}$$
(3)

where \(N = {N_1} + {N_2}\), and \(I_{k1}\) and \(I_{k2}\) are the two images of the k-th preference image pair. Each preference pair has a preference label \({l_k},k =1,...,N\): \({l_k} = 1\) if the quality of dehazed image \(I_{k1}\) is better than that of \(I_{k2}\), and \({l_k} = -1\) if it is worse. The set of preference labels is then:

$$\begin{aligned} L = \{ {l_1},...,{l_N}\} \subset {\{ - 1, + 1\} ^N} \end{aligned}$$
(4)

During PIP acquisition, preference image pairs from different databases can be combined without any data correction, and PIPs from our own subjective experiment can be added to the total set. The acquisition of preference image pairs is therefore simple and convenient, and the existing set of PIPs is easy to extend. This effectively overcomes the difficulties of acquiring subjective scores for dehazed images and of building or expanding a database, and is of great significance for the popularization and application of dehazed image quality evaluation algorithms.

3 The Proposed Quality Evaluation Algorithm

This section details the proposed quality evaluation algorithm. We transform the problem of dehazed image quality evaluation into a classification problem of quality preference learning. Based on our database of preference image pairs, we then learn the mapping between preference image pairs and their preference labels with a random forest classification model. Finally, we obtain the ranking result through successive pairwise comparisons, whose outcomes are aggregated with a voting strategy (Fig. 1).

Fig. 1. The framework of dehazed image quality assessment.

3.1 Features Extraction

In this paper, features are extracted from two aspects. First, we extract features that represent the haze density. Image dehazing is a process of restoring image clarity, so the main difference between dehazed image quality assessment and conventional IQA is that the degree of haze removal must be considered. We therefore extract haze density features from the image sharpness, the richness of texture detail, and the contrast. Second, we extract features that represent the distortion of over-enhanced images. Over-enhanced images differ not only in haze density but also exhibit distortion caused by contrast distortion, noise, color cast, and so on, which seriously affects the visual comfort of the human eye. For natural image dehazing, the color tone of the images before and after dehazing should be kept as similar as possible while the haze is removed. We therefore also extract features that measure perceived comfort and the similarity of color tone between the images before and after dehazing. In the following, we detail these features and show that they represent the haze density and the over-enhancement distortion of dehazed images well.

(a) Features of haze density

Ruderman et al. [10] found that local brightness normalization simulates the contrast gain mechanism of the human visual cortex. The normalized values are called the mean subtracted contrast normalized (MSCN) coefficients [11] and are defined as

$$\begin{aligned} {I_{MSCN}}(i,j) = \frac{{{I_{gray}}(i,j) - \mu (i,j)}}{{\sigma (i,j) + 1}}\ \end{aligned}$$
(5)
$$\begin{aligned} \mu (i,j) = \sum \nolimits _{k = - K}^K {\sum \nolimits _{l = - L}^L {{\omega _{k,l}}{I_{gray}}(i + k,j + l)} } \ \end{aligned}$$
(6)
$$\begin{aligned} \sigma (i,j) = \sqrt{\sum \nolimits _{k = - K}^K {\sum \nolimits _{l = - L}^L {{\omega _{k,l}}{{\left[ {{I_{gray}}(i + k,j + l) - \mu (i,j)} \right] }^2}} } } \ \end{aligned}$$
(7)
$$\begin{aligned} {\tilde{J}_{dark}} = 1 - (\mathop {\min }\limits _{y \in \varOmega (x)} (\mathop {\min }\limits _c \frac{{{I^c}(y)}}{A}))\ \end{aligned}$$
(8)

where \(i \in \{ 1,2,...,M\}\), \(j \in \{ 1,2,...,N\}\), M and N are the image dimensions, \(\omega = \{ {\omega _{k,l}} \mid k = - K,...,K,\ l = - L,...,L\}\) is the local Gaussian symmetric convolution window centered at pixel (i, j), and K and L determine the length and width of the convolution window.
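Equations (5) to (7) can be sketched in NumPy as below. This is an illustrative sketch rather than the authors' code: the window size (K = L = 3) and the Gaussian width (7/6) are assumptions borrowed from common MSCN practice and are not given in the text, as is the reflective border padding. The helper names are hypothetical. The local variance uses the identity \(\sum \omega (I - \mu )^2 = \sum \omega I^2 - \mu ^2\), which holds because the weights sum to 1.

```python
import numpy as np


def gaussian_window(half=3, sigma=7 / 6):
    """Normalized (2*half+1) x (2*half+1) Gaussian window (weights sum to 1)."""
    ax = np.arange(-half, half + 1)
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    w = np.outer(g, g)
    return w / w.sum()


def local_weighted_sum(img, w):
    """Weighted sum over the window around every pixel (reflect padding)."""
    hr, hc = w.shape[0] // 2, w.shape[1] // 2
    padded = np.pad(img, ((hr, hr), (hc, hc)), mode="reflect")
    out = np.zeros(img.shape, dtype=float)
    rows, cols = img.shape
    for dk in range(w.shape[0]):
        for dl in range(w.shape[1]):
            out += w[dk, dl] * padded[dk:dk + rows, dl:dl + cols]
    return out


def mscn(gray, half=3, sigma=7 / 6):
    """Return the MSCN coefficients (Eq. 5) and the local sigma map (Eq. 7)."""
    gray = np.asarray(gray, dtype=float)
    w = gaussian_window(half, sigma)
    mu = local_weighted_sum(gray, w)                      # Eq. (6)
    var = local_weighted_sum(gray ** 2, w) - mu ** 2      # weighted variance
    sigma_map = np.sqrt(np.maximum(var, 0.0))             # Eq. (7)
    return (gray - mu) / (sigma_map + 1.0), sigma_map


rng = np.random.default_rng(0)
coeffs, sigma_map = mscn(rng.integers(0, 256, size=(32, 32)))
```

The variance of `coeffs` and summary statistics of `sigma_map` would then serve as haze density features.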

Fig. 2. MSCN coefficient histogram: (a) natural fog images at different fog density in the same scene. The fog density decreases sequentially from image \(\#\)1 to image \(\#\)5. (b) MSCN coefficient histogram of the images in (a). (c) histogram of the parameter sigma in the MSCN coefficients. (Color figure online)

Fig. 3. Differences in image haze density evaluation using the image dark channel features and the MSCN variance coefficient respectively: (a) natural haze images at different haze density. The haze density decreases sequentially from image \(\#\)1 to image \(\#\)3. (b) haze density rank of the images in (a) using the dark channel feature and the MSCN variance coefficient respectively. (Color figure online)

For natural haze images, the variance of the MSCN coefficients decreases as the haze density increases, so it reflects the haze density to a certain extent, as shown in Fig. 2(b). For images with low brightness values and partial dark-block distortion, the variance of the MSCN coefficients evaluates haze density more accurately than the dark channel statistical feature. Figure 3 compares haze density estimates obtained with the dark channel feature and with the MSCN variance coefficient. The dark channel statistical feature is defined in Eq. (8), where \(I^c\) is a color channel of the image, \(\varOmega (x)\) is a local patch centered at x, and A is the atmospheric light; it behaves similarly to the MSCN feature in that, for both, a larger feature value indicates a better dehazing result. When evaluating the haze density of “aerial” Level \(\#\)2 and “countryside” Level \(\#\)2 in Fig. 3(a), the dark channel feature detects a larger number of dark pixels and concludes that these two images have the lowest haze density and the best dehazing result, which is inconsistent with the actual haze density, whereas the MSCN coefficient variance gives an accurate estimate, as shown in Fig. 3(b).

The local standard deviation \(\sigma (i,j)\) in the MSCN coefficients accurately measures the sharpness of local image structure and therefore reflects the haze density. As shown in Fig. 2(c), \(\sigma (i,j)\) decreases as the fog density increases. We therefore use the local standard deviation \(\sigma (i,j)\) as a feature for evaluating image haze density.

Texture reflects the spatial distribution and structural information of an image and is the basis on which the visual system perceives images. For dehazed images, the richness of texture indirectly reflects image clarity and visibility, so we use it to evaluate haze density. The gray-level co-occurrence matrix represents texture information well, and its entropy reflects the amount of information contained in the image and the complexity of the texture: the greater the entropy, the richer the texture. To make the feature independent of image content and direction, it is computed over four directions and defined as

$$\begin{aligned} E = (EN{T^{{0^\circ }}} + EN{T^{{{45}^\circ }}} + EN{T^{{{90}^\circ }}}+ EN{T^{{{135}^\circ }}})\ \end{aligned}$$
(9)
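A hedged sketch of Eq. (9): the code below builds a gray-level co-occurrence matrix for each of the four directions at distance 1, computes its entropy, and sums the four values as the equation is printed. The quantization to 16 gray levels, the 8-bit input range, the pixel distance of 1, and the names `glcm_entropy` and `texture_entropy` are assumptions made for illustration only.

```python
import numpy as np

# (row offset, column offset) of the neighbor for each direction, distance 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm_entropy(gray, angle, levels=16):
    """Entropy (in bits) of the normalized co-occurrence matrix for one angle."""
    q = (np.asarray(gray, dtype=float) / 256.0 * levels).astype(int)
    q = np.clip(q, 0, levels - 1)
    dr, dc = OFFSETS[angle]
    h, w = q.shape
    r0, r1 = max(0, -dr), min(h, h - dr)
    c0, c1 = max(0, -dc), min(w, w - dc)
    a = q[r0:r1, c0:c1]                          # reference pixels
    b = q[r0 + dr:r1 + dr, c0 + dc:c1 + dc]      # their neighbors
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (a.ravel(), b.ravel()), 1)   # count co-occurrences
    p = glcm / glcm.sum()
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())


def texture_entropy(gray, levels=16):
    """Sum of the four directional entropies, following Eq. (9) as printed."""
    return sum(glcm_entropy(gray, ang, levels) for ang in (0, 45, 90, 135))


rng = np.random.default_rng(0)
e_noise = texture_entropy(rng.integers(0, 256, size=(64, 64)))
e_flat = texture_entropy(np.full((64, 64), 128))
```

A flat image concentrates all co-occurrences in one matrix cell and so has zero entropy, while a textured image spreads them out and scores higher.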

The contrast reflects brightness changes in the gray levels of an image and represents image clarity and detail well. High-contrast images tend to be sharper and richer in detail, and vice versa. The contrast energy CE, an approximation of the parameter \(\beta \) of the Weibull function, describes the image contrast distribution and reflects local contrast changes in the image. CE is computed by convolving the image I with Gaussian second-derivative filters and normalizing the filter response to simulate the nonlinear contrast gain control process in the human visual cortex. The image contrast is defined in three color channels (grayscale, yellow-blue: yb, red-green: rg):

$$\begin{aligned} CE({I_c}) = \frac{{\alpha \cdot Z({I_c})}}{{Z({I_c}) + \alpha \cdot \kappa }} - {\tau _c}\ \end{aligned}$$
(10)
$$\begin{aligned} Z({I_c}) = \sqrt{{{({I_c} \otimes {h_h})}^2} + {{({I_c} \otimes {h_v})}^2}} \ \end{aligned}$$
(11)

where \(\otimes \) represents the convolution operation, \(h_h\) and \(h_v\) are the Gaussian second-derivative filters in the horizontal and vertical directions respectively, and \(c \in \{ \mathrm{gray},\mathrm{yb},\mathrm{rg}\}\) with \(\mathrm{gray} = 0.299R + 0.587G + 0.114B\), \(\mathrm{yb} = 0.5(R + G) - B\), and \(\mathrm{rg} = R - G\). In addition, \(\alpha \) is the maximum value of \(Z({I_c})\), \(\kappa \) is the contrast gain, set to 0.1 in our paper, and \(\tau _c\) is the noise threshold of each color channel, with values 0.2553, 0.2287, and 0.0528 respectively.
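The following sketch illustrates Eqs. (10) and (11) under several assumptions that the text leaves open: the Gaussian second-derivative filter is a 1-D kernel with \(\sigma = 1\) and 9 taps, made zero-sum so that flat regions give no response; borders are zero padded by `np.convolve`; and a tiny constant is added to the denominator to avoid 0/0 on a completely flat channel. The function names are hypothetical, and the per-pixel maps returned here would still need to be pooled (for example averaged) into a scalar feature.

```python
import numpy as np

TAU = {"gray": 0.2553, "yb": 0.2287, "rg": 0.0528}   # noise thresholds (text)
KAPPA = 0.1                                          # contrast gain (text)


def second_derivative_kernel(sigma=1.0, half=4):
    """Sampled second derivative of a Gaussian, shifted to sum to zero."""
    x = np.arange(-half, half + 1, dtype=float)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    k = (x ** 2 / sigma ** 4 - 1 / sigma ** 2) * g
    return k - k.mean()


def filter_axis(img, kernel, axis):
    """1-D convolution along one image axis (zero-padded borders)."""
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), axis, img)


def contrast_energy(rgb):
    """Per-pixel CE maps for the gray, yb and rg channels (Eqs. 10 and 11)."""
    r, g, b = (rgb[..., i].astype(float) for i in range(3))
    channels = {"gray": 0.299 * r + 0.587 * g + 0.114 * b,
                "yb": 0.5 * (r + g) - b,
                "rg": r - g}
    k = second_derivative_kernel()
    out = {}
    for name, ch in channels.items():
        z = np.sqrt(filter_axis(ch, k, 1) ** 2 + filter_axis(ch, k, 0) ** 2)
        alpha = z.max()
        # 1e-12 is an added guard (assumption) against division by zero.
        out[name] = alpha * z / (z + alpha * KAPPA + 1e-12) - TAU[name]
    return out


rng = np.random.default_rng(0)
ce = contrast_energy(rng.integers(0, 256, size=(32, 32, 3)))
```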

(b) Over-enhanced distortion features of the image

One significant feature of over-enhanced images is the contrast distortion caused by excessively high contrast. The paper [12] indicated that the skewness and kurtosis of an image effectively reflect the comfort of human visual perception. For natural images, the skewness and kurtosis follow corresponding Gaussian distributions. High-contrast images show positive skewness, whereas darker or smoother images show negative skewness. Skewness measures the asymmetry of a distribution, while kurtosis measures how peaked or heavy-tailed it is and also reflects changes in image contrast. The higher the kurtosis of an image, the stronger its gloss and the less natural it looks. For images with contrast distortion, the skewness and absolute kurtosis values are higher. We therefore take the skewness and kurtosis of the image as features of the contrast distortion perceived by the human eye; they are defined in Eqs. (12) and (13) respectively.

$$\begin{aligned} skewness(I) = \frac{{E{{[I - E(I)]}^3}}}{{{\sigma ^3}(I)}} \end{aligned}$$
(12)
$$\begin{aligned} kurtosis(I) = \frac{{E{{[I - E(I)]}^4}}}{{{\sigma ^4}(I)}} - 3 \end{aligned}$$
(13)
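Equations (12) and (13) translate directly into a few lines of NumPy. In this sketch the expectation is taken over all pixels and \(\sigma (I)\) is the population standard deviation (`ddof=0`), which is an assumption about the normalization; a perfectly flat image has zero standard deviation and is not handled.

```python
import numpy as np


def skewness(img):
    """Third standardized moment of the pixel values, Eq. (12)."""
    x = np.asarray(img, dtype=float).ravel()
    d = x - x.mean()
    return float((d ** 3).mean() / x.std() ** 3)


def kurtosis(img):
    """Excess kurtosis of the pixel values, Eq. (13)."""
    x = np.asarray(img, dtype=float).ravel()
    d = x - x.mean()
    return float((d ** 4).mean() / x.std() ** 4 - 3.0)


# A mostly dark patch with one bright pixel has a long right tail.
s_tail = skewness([0, 0, 0, 1])
k_two_point = kurtosis([0, 1, 0, 1])
```

The subtraction of 3 in Eq. (13) makes a Gaussian distribution score zero, so positive values indicate heavier tails than Gaussian.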

If an image is too dark or too bright, its perception by the human eye is affected. For contrast-distorted images of this type, [13] used a Gaussian kernel function to define a first-order statistic that represents the visual comfort of the image, as defined in Eq. (14).

$$\begin{aligned} Comfort = \exp [ - {(\frac{{E(I) - \mu }}{\upsilon })^2}] \end{aligned}$$
(14)

where \(\mu \) and \(\upsilon \) are the fixed parameters of this model, set to 130 and 300 respectively in our paper, \(\sigma (I)\) is the standard deviation of image I, and E(I) is the expectation of I.
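A minimal sketch of Eq. (14) with the parameter values given in the text. It assumes that E(I) is the mean intensity of an 8-bit image on the 0 to 255 scale, which the text does not state explicitly; the function name `comfort` is hypothetical.

```python
import math

import numpy as np


def comfort(img, mu=130.0, upsilon=300.0):
    """Visual comfort of Eq. (14): Gaussian kernel of the mean intensity."""
    mean_intensity = float(np.mean(img))
    return math.exp(-((mean_intensity - mu) / upsilon) ** 2)


c_mid = comfort(np.full((8, 8), 130))    # mean equals mu
c_dark = comfort(np.full((8, 8), 10))
c_bright = comfort(np.full((8, 8), 250))
```

The score peaks at 1 when the mean intensity equals \(\mu \) and falls off symmetrically for darker and brighter images.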

Naturalness, as a standard of human visual perception, affects the result of subjective evaluation. In [14], a naturalness factor was proposed for image quality enhancement, and it can be used as a no-reference measure of image naturalness. For any image I, we obtain the naturalness value of each channel separately. The closer the value is to 1, the more natural the image. The image naturalness is defined as follows:

$$\begin{aligned} {N_f} = (1 - \theta )\frac{{{T_1}}}{{T_1^{pr}}} + \theta \frac{{{T_2}}}{{T_2^{pr}}} \end{aligned}$$
(15)

where \(\theta \) is a weighting factor in the range 0–1, the naturalness is computed for each color channel \(c \in \{ R,G,B\}\), and \(T_1\) and \(T_2\) are respectively the gradient distribution model and the Laplace distribution model in [14]. In addition, the values of \(T_1^{pr}\) and \(T_2^{pr}\) are 0.38 and 0.14 respectively in our paper.

A higher color intensity does not necessarily mean a better dehazed image. Image dehazing aims at image clarity, so the images before and after dehazing should remain as similar as possible in color. Enhanced images often show a higher degree of color distortion, which seriously affects the image quality and its perception by the human eye. We convert the image to the YIQ color space and define the image hue similarity feature as the fidelity of the I and Q color channels. In our paper, we use the feature value \({f_{IQ}}\) to measure the hue similarity of the images before and after dehazing, and it is defined as follows.

$$\begin{aligned} {f_I} = \frac{1}{N}\sum \limits _x {\frac{{2{I_r}(x) \cdot {I_d}(x) + {c_0}}}{{I_r^2(x) + I_d^2(x) + {c_0}}}} \end{aligned}$$
(16)
$$\begin{aligned} {f_Q} = \frac{1}{N}\sum \limits _x {\frac{{2{Q_r}(x) \cdot {Q_d}(x) + {c_0}}}{{Q_r^2(x) + Q_d^2(x) + {c_0}}}} \end{aligned}$$
(17)
$$\begin{aligned} {f_{IQ}} = \frac{1}{N}\sum \limits _x {(\frac{{2{I_r}(x) \cdot {I_d}(x) + {c_1}}}{{I_r^2(x) + I_d^2(x) + {c_1}}}} \cdot \frac{{2{Q_r}(x) \cdot {Q_d}(x) + {c_2}}}{{Q_r^2(x) + Q_d^2(x) + {c_2}}}) \end{aligned}$$
(18)

where x is the pixel coordinate, N is the number of pixels, \(I_r\) and \(Q_r\) are the I and Q color channel values of the hazy image, and \(I_d\) and \(Q_d\) are those of the dehazed image. In addition, \({c_0}\), \({c_1}\), and \({c_2}\) are constants that keep the feature well defined when the denominators approach zero.
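Equation (18) can be sketched as follows. The RGB-to-YIQ matrix uses the standard NTSC coefficients; the values of \(c_1\) and \(c_2\) are not given in the text, so the defaults below are placeholders, and the function name `hue_similarity` is hypothetical. Inputs are assumed to be RGB arrays of identical shape.

```python
import numpy as np

# Standard NTSC RGB -> YIQ conversion (rows: Y, I, Q).
RGB_TO_YIQ = np.array([[0.299, 0.587, 0.114],
                       [0.596, -0.274, -0.322],
                       [0.211, -0.523, 0.312]])


def hue_similarity(hazy_rgb, dehazed_rgb, c1=1e-3, c2=1e-3):
    """f_IQ of Eq. (18); c1 and c2 are assumed placeholder constants."""
    yiq_r = np.asarray(hazy_rgb, dtype=float) @ RGB_TO_YIQ.T
    yiq_d = np.asarray(dehazed_rgb, dtype=float) @ RGB_TO_YIQ.T
    i_r, q_r = yiq_r[..., 1], yiq_r[..., 2]
    i_d, q_d = yiq_d[..., 1], yiq_d[..., 2]
    s_i = (2 * i_r * i_d + c1) / (i_r ** 2 + i_d ** 2 + c1)
    s_q = (2 * q_r * q_d + c2) / (q_r ** 2 + q_d ** 2 + c2)
    return float(np.mean(s_i * s_q))


rng = np.random.default_rng(0)
hazy = rng.random((16, 16, 3))
f_same = hue_similarity(hazy, hazy)
f_diff = hue_similarity(hazy, rng.random((16, 16, 3)))
```

Each per-pixel ratio is at most 1 in magnitude and equals 1 only when the two chroma values agree, so an unchanged color tone gives a value of 1.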

3.2 The Training Data Set

Following the method of Sect. 2, we obtain the database of preference image pairs with preference labels. Then, using the features of Sect. 3.1 (Table 1), we calculate a feature vector for each image in the image pairs and a feature difference vector for each preference image pair.

Table 1. List of the features used in our algorithm.
  • Our database of preference image pairs:

    $$\begin{aligned} P = \left\{ {\left( {{I_{k1}},{I_{k2}}} \right) ,k = 1,...,N} \right\} \end{aligned}$$
    (19)

    where \({I_{k1}}\) and \({I_{k2}}\) are the two images of the k-th preference image pair.

  • The feature vector \({f_{{I_k}}}\) for each image \({I_k}\).

  • For each preference image pair \({P_k} = \left( {{I_{k1}},{I_{k2}}} \right) \in P\) the feature difference vector \(x_k\) and the preference label \({l_k},k = 1,...,N\):

    $$\begin{aligned} {x_k} = {f_{{I_{k1}}}} - {f_{{I_{k2}}}} \end{aligned}$$
    (20)
    $$\begin{aligned} {l_k} \in \{ - 1, + 1\} \end{aligned}$$
    (21)
  • Let X denote the set of feature difference vectors and L denote the set of preference labels:

    $$\begin{aligned} X = \{ {x_1},...,{x_k},...,{x_N}\} \end{aligned}$$
    (22)
    $$\begin{aligned} L = \{ {l_1},...,{l_k},...,{l_N}\} \end{aligned}$$
    (23)

Then, we train a two-class random forest model on \(\{ X,L\}\) to learn the mapping between the feature difference vectors and the preference labels.
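The construction of \(\{ X,L\}\) in Eqs. (20) to (23) can be sketched as below. The per-image feature vectors are assumed to be already computed and stacked into one array; the function name `build_training_set` and the toy numbers are hypothetical.

```python
import numpy as np


def build_training_set(features, pairs, labels):
    """Stack x_k = f(I_k1) - f(I_k2) into X and the labels l_k into L."""
    X = np.array([features[a] - features[b] for a, b in pairs])
    L = np.array(labels)
    return X, L


# Toy example: three images with two features each, two labeled pairs.
features = np.array([[1.0, 2.0], [0.5, 1.0], [3.0, 0.0]])
X, L = build_training_set(features, pairs=[(0, 1), (1, 2)], labels=[1, -1])
```

Because the classifier only sees differences, swapping the two images of a pair negates both \(x_k\) and \(l_k\).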

3.3 Rank and Preference Learning by Random Forest

The preference labels are \(+1\) or \(-1\), so learning the mapping between the feature difference vectors and the preference labels becomes a binary classification problem.

Because random forests offer high classification precision, resistance to over-fitting, and simplicity of implementation, we choose a random forest as the binary classification model in our algorithm. We train the random forest model on the established database as follows. First, multiple training sets are randomly generated from the established set of dehazed images. Then a decision tree \({g_i}(x,{\varTheta _i})\), \(i= 1, ..., M\), is constructed for each training set, where M is the total number of decision trees and \({\varTheta _i}\) denotes the random mechanism applied to the training data of the i-th tree. The random forest \(G = \{ {g_1}(x),{g_2}(x),...,{g_M}(x)\}\) consists of these decision trees. When the constructed random forest model classifies a pair of dehazed images, the classification result is determined by a voting mechanism: the mode of the classification results given by the individual decision trees determines the final result. For the binary classification problem, the output is \(-1\) or \(+1\), as shown in Eqs. (24) and (25).

$$\begin{aligned} C = \arg \max (p(c|x)),c \in \{ - 1,1\} \end{aligned}$$
(24)
$$\begin{aligned} p(c|x) = \frac{1}{M}\sum \limits _{i = 1}^M {{p_i}(c|x)} \end{aligned}$$
(25)
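The decision rule of Eqs. (24) and (25) can be illustrated without a full forest implementation. In this sketch the per-tree class probabilities \(p_i(c|x)\) are assumed to be given as an array with one row per tree; the function name `forest_predict` is hypothetical.

```python
import numpy as np


def forest_predict(tree_probs, classes=(-1, 1)):
    """Average per-tree probabilities (Eq. 25) and take the arg max (Eq. 24).

    tree_probs has shape (M, 2); row i holds p_i(c | x) for c in `classes`.
    """
    p = np.mean(np.asarray(tree_probs, dtype=float), axis=0)
    return classes[int(np.argmax(p))], p


# Three trees with soft outputs.
label_soft, p_soft = forest_predict([[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]])
# Three trees with hard (one-hot) outputs: two of three vote for +1.
label_hard, p_hard = forest_predict([[0, 1], [0, 1], [1, 0]])
```

When every tree outputs a hard 0/1 decision, the average in Eq. (25) is the fraction of votes, so the rule reduces to the majority vote described in the text; with soft per-tree probabilities the two can differ, as the first example shows.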

Given test data consisting of dehazed image pairs, we use the trained random forest model to predict each preference label from the pair’s feature difference vector. If a feature difference vector is 0, we consider the two images of the pair to have the same quality and set the corresponding preference label to 0. Finally, we obtain the ranking result through successive pairwise comparisons.
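One simple way to turn pairwise preference labels into a ranking is to count wins, sketched below. The text does not specify the exact voting scheme, so win counting is an assumption, and the names `rank_by_pairwise_votes` and `prefer` are hypothetical; `prefer(a, b)` stands in for the trained classifier applied to the feature difference of a and b and returns \(+1\), \(-1\), or 0.

```python
def rank_by_pairwise_votes(items, prefer):
    """Rank items from best to worst by the number of pairwise wins."""
    n = len(items)
    wins = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            label = prefer(items[i], items[j])
            if label > 0:
                wins[i] += 1
            elif label < 0:
                wins[j] += 1
            # label == 0: equal quality, no vote is cast
    order = sorted(range(n), key=lambda k: -wins[k])
    return [items[k] for k in order], wins


# Stand-in preference function: larger number = better quality.
def toy_prefer(a, b):
    return (a > b) - (a < b)


ranked, wins = rank_by_pairwise_votes([3, 1, 2], toy_prefer)
```

Comparing all \(n(n-1)/2\) pairs makes the ranking robust to a few misclassified pairs, since a single wrong label changes each win count by at most one.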

4 Experimental Results

To verify the relevance of the established PIPs and the extracted features, we conduct the following experiment. First, 80\(\%\) of the established PIPs are selected to train the random forest classification model, and the remaining 20\(\%\) are used to test it. We repeat these steps 50 times and use the average over the 50 runs as the final classification result. Table 2 compares the final results obtained with SVM and random forest classifiers.

Table 2. Classification results using the SVM and random forest models.

The classification result of the random forest model is superior to that of the SVM. This result also supports the validity of our extracted features.

To verify the accuracy of our method in evaluating image haze density, we subjectively sort images of the same scene with different haze densities; the sorting results are shown in Fig. 4. We then compare the subjective sorting with the evaluation result of our algorithm, which is Level \(\#\)1 < Level \(\#\)2 < Level \(\#\)3 < Level \(\#\)4. This is consistent with the subjective sorting and demonstrates the effectiveness of the algorithm in evaluating image haze density.

To verify the effectiveness of our algorithm in evaluating over-enhancement distortion, we select several hazy images and remove the haze with the algorithms of [6, 7, 15, 16]; the dehazing results are shown in Fig. 5. We then compare the results of our algorithm with those of the evaluation algorithms proposed in [1, 4]; the comparisons are shown in Tables 3 and 4 respectively.

Table 3. The ranking results of the images in Fig. 5 using our evaluation algorithm and the image haze density evaluation algorithm in [4]. The earlier an image appears in the sequence, the better its quality.
Table 4. The evaluation results of the images in Fig. 5 using the contrast evaluation algorithm [1].

The haze density measure proposed in [4] can evaluate changes in image haze density to a certain extent. However, it relies heavily on color brightness features as haze density criteria, so its evaluation result depends too much on the image color information and is not accurate, as shown for “y01”. It also does not take over-enhancement distortion into account, so its evaluation results do not match human visual perception. The evaluation results of the blind contrast evaluation algorithm in [1] are shown in Table 4; this algorithm cannot detect the over-enhancement phenomenon. In our paper, based on a large number of experimental analyses, we select features closely related to changes in haze density, color distortion features, and features of human visual perception that reflect over-enhancement distortion. Using the random forest classification model, we can not only estimate image haze density accurately, as shown by the sorting result of (a), (b), and (d) in Fig. 5, but also estimate over-enhancement distortion more accurately, and our evaluation results are largely consistent with human subjective perception.

Fig. 4. The hazy images at different haze density in the same scene: the haze density decreases in turn from Level \(\#\)1 to Level \(\#\)4.

Fig. 5. The dehazing results using different algorithms.

5 Conclusion

Subjective quality scores are inaccurate, biased, and time-consuming to collect, and it is difficult to build a large dehazed image database or to extend an existing one. To address these problems, we propose a rank learning algorithm that evaluates dehazed image quality using preference information. In our algorithm, we transform the problem of dehazed image quality evaluation into a classification problem of quality preference learning, solve it with feature fusion and a random forest, and finally obtain the ranking result through successive pairwise comparisons. The experimental results show that our algorithm is highly consistent with human subjective perception and is superior to traditional dehazed image quality evaluation algorithms. Moreover, our algorithm is easy to extend. Further research is needed to explore features relevant to dehazed image quality and to generate more preference dehazed image pairs with preference labels.