\({\mathrm {CB_{p}F}}\)-IQA: Using Contrast Band-Pass Filtering as Main Axis of Visual Image Quality Assessment

  • Jesús Jaime Moreno-Escobar
  • Claudia Lizbeth Martínez-González
  • Oswaldo Morales-Matamoros
  • Ricardo Tejeida-Padilla
Part of the Studies in Computational Intelligence book series (SCI, volume 730)


Our proposal is a Blind and Full-Reference Image Quality Assessment, CBPF-IQA. The main contribution of this chapter is an interface that contains not only a Full-Reference Image Quality Assessment (IQA) but also a No-Reference or Blind IQA, applying perceptual concepts by means of Contrast Band-Pass Filtering (CBPF). The proposal consists in contrasting a degraded input image with versions filtered at several distances by a CBPF, which computes some of the variables of the Human Visual System (HVS). If CBPF-IQA detects only one input, it performs a Blind Image Quality Assessment; on the contrary, if it detects two inputs, it computes a Full-Reference Image Quality Assessment. Thus, we first define a Full-Reference IQA and then a No-Reference IQA, whose correlations are significant when contrasted with psychophysical results obtained from several observers. CBPF-IQA weights the Peak Signal-to-Noise Ratio (PSNR) by using an algorithm that estimates some properties of the HVS. We then compare the \({\mathrm {CB_{p}F}}\)-IQA algorithm not only with PSNR, the mainstream estimator in IQA, but also with state-of-the-art IQA algorithms, such as Structural SIMilarity (SSIM), Mean Structural SIMilarity (MSSIM), and Visual Information Fidelity (VIF). Our experiments show that CBPF-IQA correlates strongly with PSNR, yet this proposal does not necessarily need the reference image in order to estimate the quality of the recovered image.

1 Introduction and Problem Statement

The development of sophisticated models and applications for digital image processing has produced an extensive and important literature describing several methodologies and algorithms. A significant number of these works are dedicated to methodologies for improving, in some cases, only the image appearance. Nevertheless, we cannot consider that digital image quality has reached perfection. We can assume that natural images have been distorted during some process of codification or representation. Thus, in the representation of images it is important to be able to recognize and measure the degree of degradation, or quality, of a natural digital image.

Today, the Mean Square Error (\({\mathrm {MSE}}\)) is still the most widely used objective metric, since many other algorithms that evaluate image quality are based on \({\mathrm {MSE}}\), the Peak Signal-to-Noise Ratio (\({\mathrm {PSNR}}\)), for instance. Wang and Bovik [1, 2] consider \({\mathrm {MSE}}\) a poor assessment to be employed in systems that predict image quality or fidelity. We therefore want to point out what is probably wrong with \({\mathrm {MSE}}\) estimations, so that we are in a position to analyze and propose a different algorithm that introduces some properties of the human eye into the \({\mathrm {MSE}}\) algorithm while maintaining the best properties of \({\mathrm {MSE}}\).

Thus, the main proposal of this work is a methodology implemented in an interface that contains not only a classical Full-Reference Image Quality Assessment (IQA) algorithm but also a No-Reference or Blind IQA, applying some perceptual features of the Human Visual System (HVS) by means of Contrast Band-Pass Filtering (\({\mathrm {CB_{p}F}}\)-IQA). In the particular case of Blind IQA, the goal is not to present yet another Blind IQA; our main objective is to propose a blind version of \({\mathrm {PSNR}}\), since it has historically been the most used IQA metric, and to show that its estimations can be obtained without a reference.

This proposal lies in contrasting a degraded version of an original or perfect image (an example is depicted in Fig. 1) with versions filtered at several distances by \({\mathrm {CB_{p}F}}\)-IQA, which computes some HVS characteristics, such as contrast assimilation and contrast sensitivity in terms of intensity.
Fig. 1

Example of original image: Lena image

If \({\mathrm {CB_{p}F}}\)-IQA detects only one input, it performs a No-Reference Image Quality Assessment; on the contrary, if it detects two inputs, a Full-Reference Image Quality Assessment is estimated. Thus, we first define the Full-Reference and No-Reference IQA algorithms, and then we contrast their correlation with psychophysical results obtained from several human observers. \({\mathrm {CB_{p}F}}\)-IQA modifies the well-known Peak Signal-to-Noise Ratio formula, weighting it by an estimate of the assimilation or contrast of the original source at different distances. Finally, we compare the \({\mathrm {CB_{p}F}}\)-IQA methodology not only with the mainstream estimators in IQA, \({\mathrm {PSNR}}\) and \({\mathrm {MSE}}\), but also with recent IQA algorithms, such as Structural SIMilarity (SSIM), Mean Structural SIMilarity (MSSIM), and Visual Information Fidelity (VIF). In terms of Blind IQA, our experiments demonstrate that \({\mathrm {CB_{p}F}}\)-IQA (\({\mathrm {BIQA}}\)) is highly correlated with the response of \({\mathrm {PSNR}}\), but this proposal does not mandatorily need the reference image in order to compute the distortion of the recovered image.

2 Definition of Image Quality Assessment

In this section we give a brief review of the definition of IQA. We divide IQA algorithms into two families: Full-Reference and No-Reference approaches, the latter also known as Blind IQA. Full-Reference IQA metrics can in turn be divided into Bottom-Up and Top-Down approaches.

Bottom-up approaches for assessing image quality are algorithms that try to emulate well-modeled characteristics of the HVS; by integrating them into the design of quality-evaluation algorithms, they are expected to perform similarly to the HVS when estimating image quality.

Furthermore, bottom-up approaches attempt to simulate the structural and functional characteristics of the HVS that are relevant for image quality evaluation. The main goal is to propose algorithms that work like the HVS, at least when attempting to assess image quality.

On the contrary, top-down systems simulate the HVS in another way. Top-down algorithms consider the HVS as a black box, where only the input-output relation is important. A top-down system for assessing image quality may evaluate somewhat differently, since it tries to reproduce the quality estimates of an average human observer; in some cases such a system adapts its results with linear or nonlinear regression.

An important and sometimes obvious task when proposing an algorithm of this top-down type is to consider the main challenge of supervised learning, as illustrated in Fig. 2. The HVS cannot be directly supervised in order to learn its behavior. The training features are obtained through subjective experiments, where a large number of test images are viewed and evaluated by human observers. This is why the main goal is to model the algorithm in a general way, minimizing the error between the desired output (the subjective assessment) and the model estimation. This is generally a regression or function-approximation problem.
Fig. 2

Learning human visual system

On the other hand, no-reference or blind image quality evaluation is a very difficult task in this field, although the problem statement is very simple: an objective computational model should assess the quality of any real-world image, without reference to an original image, as the HVS does. This looks like a very difficult mission, since the quality of an image must be assessed quantitatively without any model of what good or bad image quality is supposed to look like. Surprisingly, this is a fairly easy assignment for human observers. The HVS can easily recognize high-quality images when they are contrasted with low-quality images, and the human eye can also distinguish which of two images is good or bad without watching the reference or original image. In addition, human observers tend to correlate highly with the opinion of other observers. As an example of this behavior, when the human eye evaluates image quality without comparing against the original or reference source, it is very likely to judge whether the image is noisy, fuzzy, or compressed by an image coder, such as JPEG or JPEG2000, for instance. In this way, Figs. 3 and 4 show some examples of JPEG2000 compression, where the recovered images have lower quality than images whose luminance contrast was merely moved and stretched.
Fig. 3

Baboon image: patches with size \(= 256\times 256\) of recovered images compressed by JPEG2000, \({\mathrm {PSNR}}\) \(=\) 32 dB

Fig. 4

Splash image: patches with size \(= 256\times 256\) of recovered images compressed by JPEG2000, \({\mathrm {PSNR}}\) \(=\) 32 dB

In both the present and the future of Blind Image Quality Assessment (BIQA), the objective is to predict the perceptual quality of an image without any prior information about its reference image or distortion type. In this way, we can highlight some research works as follows:
  1. Wu et al. [3] propose a method characterized by a new feature fusion scheme and a k-nearest-neighbor (KNN)-based quality prediction model.

  2. Li et al. [4] proposed a No-Reference DeBlocked Image Quality (DBIQ) metric that simultaneously evaluates blocking artifacts in smooth regions and blur in textured regions. Their experimental results on the DBID database demonstrate that the proposed metric is effective in evaluating the quality of deblocked images. As an application, the metric is further used for automatic parameter selection in image deblocking algorithms.

  3. Lu et al. [5] propose an IQA framework that uses a minimum amount of structure coefficients to capture the variation of color structure and the distortion of a degraded image, applying a VPT to remove visually unperceived coefficients. The difference in the proportion of visually perceived coefficients between the distorted and reference images is measured to obtain the image quality score.


Some authors have collected surveys of existing Blind IQAs. On the one hand, Zhang et al. [6] conduct an exhaustive statistical evaluation to justify the added value of computational saliency in objective image quality assessment, using 20 state-of-the-art saliency models and 12 best-known IQMs. Their quantitative results show that the difference in predicting human fixations between saliency models is sufficient to yield a significant difference in performance gain when adding these saliency models to IQMs. On the other hand, Kamble et al. [7] describe a survey that covers the types of noise and distortion considered, the techniques and parameters used by these algorithms, the databases on which the algorithms are validated, and a benchmarking of their performance against each other and against the human visual system.

3 \({\mathrm {MSE}}\) Definition

\({\mathrm {MSE}}\) is by far the most important algorithm in the IQA field, which is why we want to define this metric. On the one hand, let us define f(i, j) and \(\hat{f}(i,j)\) as the pair of images to be compared, each with the same number of pixels: f(i, j) is the original source, considered to have the best possible quality or fidelity, and \(\hat{f}(i,j)\) is a possibly degraded estimation of f(i, j), whose quality is to be evaluated. On the other hand, let us define first the \({\mathrm {MSE}}\) and then the \({\mathrm {PSNR}}\); Eqs. 1 and 2 define these algorithms, respectively.
$$\begin{aligned} MSE={\frac{1}{l \times m}\sum _{i=1}^{l}\sum _{j=1}^{m} \left[ f(i,j)-\hat{f}(i,j) \right] ^2} \end{aligned}$$
$$\begin{aligned} PSNR=10\log _{10} \left( \frac{\alpha ^2}{MSE} \right) \end{aligned}$$
where \(\alpha \) is the maximum intensity value inside f(i, j) of size \(l \times m\). Thus, for images which contain only one channel at 8 bits per pixel (bpp), \(\alpha =2^8-1=255\). Also, \(\alpha \) corresponds to the maximal distortion, when the maximal intensity in 8 bpp, 255, is completely degraded, namely a pixel changes from 255 to the minimal intensity, 0. Thus, the peak of \({\mathrm {MSE}}\) is \(\alpha ^2=255^2 = 65025\).

For chromatic images, Eq. 2 also defines the estimation of \({\mathrm {PSNR}}\), but the \({\mathrm {MSE}}\) is computed separately for every component and the results of the three channels are averaged.
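As a concrete illustration of Eqs. 1 and 2, including the per-channel averaging just described, a minimal NumPy sketch could look as follows (the function names are ours, not part of any reference implementation):

```python
import numpy as np

def mse(f, f_hat):
    """Mean Square Error between two images (Eq. 1)."""
    f = np.asarray(f, dtype=np.float64)
    f_hat = np.asarray(f_hat, dtype=np.float64)
    return np.mean((f - f_hat) ** 2)

def psnr(f, f_hat, alpha=255.0):
    """Peak Signal-to-Noise Ratio in dB (Eq. 2); alpha is the peak
    intensity (255 for 8 bpp). For color images, the MSE is computed
    per channel and the three values are averaged, as described above."""
    f = np.asarray(f, dtype=np.float64)
    f_hat = np.asarray(f_hat, dtype=np.float64)
    if f.ndim == 3:  # color image: average the per-channel MSE values
        err = np.mean([mse(f[..., c], f_hat[..., c])
                       for c in range(f.shape[2])])
    else:
        err = mse(f, f_hat)
    return 10.0 * np.log10(alpha ** 2 / err)
```

For example, a uniform error of 10 gray levels gives \({\mathrm {MSE}} = 100\) and \({\mathrm {PSNR}} = 10\log _{10}(65025/100) \approx 28.13\) dB.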

Both \({\mathrm {MSE}}\) and \({\mathrm {PSNR}}\) are widely employed in the fields of image processing, coding, and understanding, because these algorithms have favorable characteristics:
  1. They are convenient for optimizing algorithms that need to improve quality. For instance, in JPEG2000, \({\mathrm {MSE}}\) is employed both in Optimal Rate Allocation methodologies [8, 9] and in Region of Interest algorithms [9, 10]. Also, \({\mathrm {MSE}}\) is differentiable and integrable, so it can be used to solve these kinds of optimization problems, for instance together with linear algebra.

  2. By definition, \({\mathrm {MSE}}\) compares the squared difference of two images, giving as a result a clear meaning of loss of energy.


However, in some cases \({\mathrm {MSE}}\) estimates image quality with a low correlation with the quality perceived by a human observer. A clear example is depicted in Figs. 3 and 4, where both Baboon and Splash are coded and decoded by JPEG2000 compression at \({\mathrm {PSNR}}\) \(=\) 32 dB. These images have very different visual quality. For this case, neither \({\mathrm {MSE}}\) nor \({\mathrm {PSNR}}\) correlates with the Human Visual System (HVS).

4 \({\mathrm {CB_{p}F}}\)-IQA Algorithm

4.1 Contrast Band-Pass Filtering

The Contrast Band-Pass Filtering (CBPF) approximately estimates the image seen by a human observer at a separation \(\delta \), by filtering frequencies according to whether they are important or irrelevant for the HVS. First of all, let us define f(i, j) as the mathematical representation of the reference image and \(\delta \) as the separation between the observer and the screen. Then CBPF estimates a filtered image \(\breve{f}(i,j)\), as f(i, j) is seen from \(\delta \) centimeters. CBPF is founded on three main features: pixel frequency, spatial scales, and surround filtering.
Fig. 5

Representation of the CSF (\(\beta _{s,o,i}(r,\nu )\)) for the Y channel or luminance component

The CBPF methodology decomposes the reference image f(i, j) into a set of wavelet planes \(\omega (s,o)\) of different spatial scales s (i.e., pixel frequencies \(\nu \)) and orientations o, as:
$$\begin{aligned} f(i,j) =\sum _{s=1}^{n}\omega (s,o) + c_n \; \end{aligned}$$
where n is the number of wavelet decompositions, \(c_n\) is the residual plane in the pixel domain, and o is the orientation, either vertical, horizontal, or diagonal.
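A minimal sketch can illustrate the exact-reconstruction property of Eq. 3: the detail planes plus the residual sum back to the original image. The sketch below uses simple box smoothing as a stand-in for the wavelet low-pass filter and ignores the orientations o, so it is only a structural illustration, not the CBPF transform itself:

```python
import numpy as np

def smooth(img, scale):
    """Box smoothing as a stand-in for the wavelet low-pass filter
    at a given scale; the actual CBPF uses a wavelet transform."""
    k = 2 ** scale + 1
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    out = np.zeros_like(img, dtype=np.float64)
    h, w = img.shape
    for di in range(-pad, pad + 1):
        for dj in range(-pad, pad + 1):
            out += padded[pad + di: pad + di + h, pad + dj: pad + dj + w]
    return out / (k * k)

def decompose(f, n=3):
    """Split f into n detail planes plus a residual c_n (cf. Eq. 3);
    summing all planes and the residual reconstructs f exactly."""
    f = np.asarray(f, dtype=np.float64)
    planes, c = [], f
    for s in range(1, n + 1):
        c_next = smooth(c, s)
        planes.append(c - c_next)   # detail plane omega(s)
        c = c_next
    return planes, c                # sum(planes) + c == f
```

Because each plane is a difference of successive smoothings, the sum telescopes back to f regardless of the smoothing kernel chosen.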
The filtered image \(\breve{f}(i,j)\) is recovered by scaling these \(\omega (s,o)\) wavelet coefficients with the Contrast Band-Pass Filtering function, which is at the same time an approximation of the Contrast Sensitivity Function (CSF, Fig. 5). The CSF tries to approximate some psychophysical features [11], considering surround filtering information (denoted by r) and perceptual frequency (denoted by \(\nu \)), whose gain is either positive or negative depending on \(\delta \). The filtered image \(\breve{f}(i,j)\) is defined by Eq. 4.
$$\begin{aligned} \breve{f}(i,j) = \sum _{s=1}^{n} \beta (\nu ,r)\, \omega (s,o) + c_n \; \end{aligned}$$
where \(\beta (\nu ,r)\) is the CBPF weighting function that reproduces some properties of the HVS. The term \(\beta (\nu ,r)\;\omega (s,o)\equiv \omega _{s,o;\rho ,\delta }\) denotes the filtered wavelet coefficients of image f(i, j) when it is viewed at \(\delta \) centimeters, and \(\beta \) is written as
$$\begin{aligned} \beta (\nu ,r)= z_{ctr}\cdot C_\delta (\dot{s})+ C_{min}(\dot{s}) \; \end{aligned}$$
This function has a shape similar to the e-CSF, and the three terms that describe it are defined as follows.
\(z_{ctr}\)
Nonlinear function estimating the contrast of the central feature relative to its surround contrast, oscillating from zero to one, defined by
$$\begin{aligned} z_{ctr}= \frac{\left[ \frac{\sigma _{cen}}{\sigma _{sur}}\right] ^2}{1+\left[ \frac{\sigma _{cen}}{\sigma _{sur}}\right] ^2} \end{aligned}$$
being \(\sigma _{cen}\) and \(\sigma _{sur}\) the standard deviation of the wavelet coefficients in two concentric rings, which represent a center–surround interaction around each coefficient.
\(C_\delta (\dot{s})\)
Weighting function that approximates to the perceptual e-CSF, emulates some perceptual properties and is defined as a piecewise Gaussian function [12], such as
$$\begin{aligned} C_{\delta }(\dot{s})=\left\{ \begin{array}{cc} e^{ -\frac{\dot{s}^2}{2\sigma _1^2} } , &{} \dot{s}=s-s_{thr}\le 0, \\ e^{ -\frac{\dot{s}^2}{2\sigma _2^2} }, &{} \dot{s}=s - s_{thr}> 0. \end{array} \right. \end{aligned}$$
\(C_{min}(\dot{s})\)
Term that prevents the \(\beta (\nu ,r)\) function from being zero, defined by
$$\begin{aligned} C_{min}(\dot{s})=\left\{ \begin{array}{cc} \frac{1}{2} \; e^{ -\frac{\dot{s}^2}{2\sigma _1^2} } , &{} \dot{s}=s-s_{thr}\le 0, \\ \frac{1}{2}, &{} \dot{s}=s - s_{thr}> 0. \end{array} \right. \end{aligned}$$
taking \(\sigma _1=2\) and \(\sigma _2=2\sigma _1\). Both \(C_{min}(\dot{s})\) and \(C_\delta (\dot{s})\) depend on the factor \(s_{thr}\), which is the scale associated with 4 cycles per degree (cpd) when an image is observed from the distance \(\delta \) with a pixel size \(SZ_p\) over one visual degree; its expression is defined by Eq. 9. The value \(s_{thr}\) is associated with the maximum of the e-CSF.
$$\begin{aligned} s_{thr}=\log _2\left( \frac{\delta \tan (1^\circ )}{4\;SZ_p}\right) \end{aligned}$$
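The terms in Eqs. 5 through 9 can be sketched directly in code. The following is a minimal rendering under our reading of those equations; parameter names are ours, and the published implementation may differ:

```python
import math

def s_thr(delta_cm, pixel_size_cm):
    """Scale associated with 4 cpd at viewing distance delta (Eq. 9)."""
    return math.log2(delta_cm * math.tan(math.radians(1.0))
                     / (4.0 * pixel_size_cm))

def z_ctr(sigma_cen, sigma_sur):
    """Center/surround contrast ratio squashed into [0, 1) (Eq. 6)."""
    r2 = (sigma_cen / sigma_sur) ** 2
    return r2 / (1.0 + r2)

def c_delta(s, s_threshold, sigma1=2.0):
    """Piecewise Gaussian approximating the e-CSF peak (Eq. 7)."""
    sigma2 = 2.0 * sigma1
    s_dot = s - s_threshold
    sigma = sigma1 if s_dot <= 0 else sigma2
    return math.exp(-s_dot ** 2 / (2.0 * sigma ** 2))

def c_min(s, s_threshold, sigma1=2.0):
    """Floor term keeping the weight away from zero (Eq. 8)."""
    s_dot = s - s_threshold
    if s_dot <= 0:
        return 0.5 * math.exp(-s_dot ** 2 / (2.0 * sigma1 ** 2))
    return 0.5

def beta(s, sigma_cen, sigma_sur, delta_cm, pixel_size_cm):
    """CBPF weight beta(nu, r) = z_ctr * C_delta + C_min (Eq. 5)."""
    st = s_thr(delta_cm, pixel_size_cm)
    return z_ctr(sigma_cen, sigma_sur) * c_delta(s, st) + c_min(s, st)
```

At the scale \(s = s_{thr}\) (the e-CSF maximum), \(C_\delta = 1\) and \(C_{min} = 0.5\), so with equal center and surround contrast the weight equals 1.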
Fig. 6

Perceptual images obtained by CBPF with \(\delta =30\) cm

Fig. 7

Perceptual images obtained by CBPF with \(\delta =100\) cm

Fig. 8

Perceptual images obtained by CBPF with \(\delta =200\) cm

Figures 6, 7, and 8 depict three examples of CBPF images of Lenna (Fig. 1), calculated by Eq. 4 for a 19-inch monitor with a horizontal resolution of 1280 pixels, at \(\delta =\{30,100,200\}\) centimeters. These figures show that the greater the distance, the more strongly filtered the image is.

4.2 General Methodology

Algorithm 1 shows the main methodology of this proposal. The \({\mathrm {CB_{p}F}}\)-IQA algorithm estimates the full-reference visual quality of the distorted image \(\hat{f}(i,j)\) with respect to the original reference image f(i, j), if it exists; otherwise \({\mathrm {CB_{p}F}}\)-IQA estimates a blind visual image quality. Both algorithms need the observational distance d to be given by the observer, so if d is not defined, we estimate the distance d to the actual observer by means of a 3D/stereoscopic methodology, Algorithm 3. The main algorithm is also presented in Fig. 9.
Fig. 9

General explanation of the \({\mathrm {CB_{p}F}}\)-IQA algorithm, which contains both \({\mathrm {RIQA}}\) and \({\mathrm {BIQA}}\) subprocess

Fig. 10

a Primary pattern [0,1;1,0] or \(\varUpsilon \). b Sixteenth pattern or \(\varUpsilon ^{16}\)

When a full-reference image quality metric is performed, there is a reference image f(i, j) and a recovered, presumably distorted version \(\hat{f}(i,j)=\theta [f(i,j)]\) that is contrasted against f(i, j). It is important to mention that \(\theta \) is the process that distorts the reference image; henceforth we refer to the full-reference image quality algorithm in \({\mathrm {CB_{p}F}}\)-IQA as \({\mathrm {RIQA}}\). Otherwise, in the no-reference case we refer to \({\mathrm {CB_{p}F}}\)-IQA as \({\mathrm {BIQA}}\). Furthermore, it is important to mention that \({\mathrm {BIQA}}\) only processes the degraded version \(\hat{f}(i,j)\). Thus, as shown in Fig. 10a and b, we compare \(\hat{f}(i,j)\) against a repetitive pattern \(\varUpsilon \) ([0,1;1,0]), and then we perform the same algorithm as in \({\mathrm {RIQA}}\).

Since both f(i, j) and \(\hat{f}(i,j)\) are observed at the same time at an observational distance \(\delta \), the differences between f(i, j) and \(\hat{f}(i,j)\) are best perceived as \(\delta \) tends to 0. In contrast, if the observer judges f(i, j) and \(\hat{f}(i,j)\) as \(\delta \) tends to \(\infty \), reference and distorted image would look the same. Since any algorithm needs a finite approximation of \(\delta =\infty \), namely the distance at which the similarity is so great that the observer confuses both images, we propose a nonlinear regression for approximating \(\infty \) by \(\delta =\varDelta \).

For either the Reference Assessment or the Blind Assessment, our proposal is based on Algorithm 2.
\(n\mathcal {P}\) and \(\varepsilon m\mathcal {L}\) are two features involved in the evaluation of the distance \(\varDelta \). Equation 10 shows the estimation of \(\varDelta \); besides these two parameters, it is important to know or estimate \(\delta \) as well, in order to figure out the \(n\mathcal {P}\) and \(\varepsilon m\mathcal {L}\) distances. Furthermore, Figs. 11 and 12 depict the Wavelet Energy Loss \(\varepsilon \mathcal {R}\), which shows not only the behavior of the relative energy but also the significance of \(\varDelta \), \(n\mathcal {P}\), and \(\varepsilon m\mathcal {L}\) inside an \(\varepsilon \mathcal {R}\) chart.
$$\begin{aligned} \varDelta = n\mathcal {P}+ \varepsilon m\mathcal {L}\end{aligned}$$
Fig. 11

Portrayal of distances employed by the \({\mathrm {CB_{p}F}}\)-IQA algorithm. D and \(n\mathcal {P}\) graphical representation

Fig. 12

Inside an \(\varepsilon \mathcal {R}\) chart

Furthermore, Fig. 12 also shows that the peak of the function is at \(n\mathcal {P}\), which eye specialists call the Near Point and which lies between 15 and 20 cm for an adult. Thereby, \(n\mathcal {P}\) can also be defined as the shortest distance at which the human eye can evaluate a pair of images f(i, j) and \(\hat{f}(i,j)\). Beyond \(n\mathcal {P}\), fewer differences are perceived by the observer, until these differences disappear at \(\infty \). We find \(\varDelta \) by projecting the points \(\left( n\mathcal {P},\varepsilon \mathcal {R}\left( n\mathcal {P}\right) \right) \) and \(\left( d,\varepsilon \mathcal {R}\left( d\right) \right) \) to \(\left( \varDelta ,0\right) \).
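One simple reading of this projection is a linear extrapolation of the line through those two points to the distance where \(\varepsilon \mathcal {R}\) vanishes. A sketch under that assumption (the chapter itself mentions a nonlinear regression, so this linear form is only illustrative):

```python
def delta_extrapolation(near_point, er_near, d, er_d):
    """Find Delta where the line through (nP, eR(nP)) and (d, eR(d))
    crosses eR = 0, i.e., the distance at which the observer can no
    longer perceive any difference between the two images."""
    slope = (er_d - er_near) / (d - near_point)
    return near_point - er_near / slope
```

For example, with \(\varepsilon \mathcal {R}(15) = 1.0\) and \(\varepsilon \mathcal {R}(30) = 0.5\), the line reaches zero at \(\varDelta = 45\) cm.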

4.3 Estimation of the Observational Distance \(\delta \)

The estimation of the observational distance \(\delta \) is based on Algorithm 3, which is divided into six steps, described as follows:

Step 1: Camera calibration by means of Function Stereo Calibration. Calibration Results are stored in a structure, which is defined as stereoParams.

Step 2: Once both left and right cameras are calibrated, we take two images \(I_l\) and \(I_r\).

Step 3: With the parameters stored in stereoParams, we undistort both \(I_l\) and \(I_r\) using the undistortImage function, giving as a result \(Ic_l\) and \(Ic_r\).

Step 4: In both \(Ic_l\) and \(Ic_r\) images we detect two human features: face and eyes. This detection is performed by means of the vision.CascadeObjectDetector function. If faces are detected in both \(Ic_l\) and \(Ic_r\), then we detect eyes. This procedure increases the probability of finding the head of the observer in the stereo pair.

Step 5: We estimate the center of the head located in both \(Ic_l\) and \(Ic_r\) images.

Step 6: Finally, we estimate the distance \(\delta \) in centimeters between the cameras and the observer, using the triangulate function.

It is important to mention that all functions employed in this methodology come from MATLAB R2015a toolboxes.
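The steps above rely on MATLAB toolbox functions. As a language-neutral sketch, the final triangulation step reduces, for a rectified stereo pair, to the classical depth-from-disparity relation; the function below is our own illustration of that relation, not MATLAB's triangulate:

```python
def stereo_distance(focal_px, baseline_cm, x_left, x_right):
    """Distance to a point seen in a rectified stereo pair:
    distance = focal length (in pixels) * baseline / disparity.
    x_left and x_right are the horizontal pixel coordinates of the
    same point (e.g., the head center) in the left and right images."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("point must have positive disparity")
    return focal_px * baseline_cm / disparity
```

For instance, with an 800-pixel focal length, a 10 cm baseline, and a disparity of 40 pixels between the two head centers, the observer would be at 200 cm.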

5 Experimental Results

5.1 Referenced Image Quality Assessment

\({\mathrm {MSE}}\), \({\mathrm {PSNR}}\) [13], SSIM, VIFP [14], MSSIM [15], VSNR [16], VIF [17], UQI [18], IFC [19], NQM [20], WSNR [21], and SNR are compared against the performance of \({\mathrm {CB_{p}F}}\)-IQA for JPEG2000 compression distortion. We evaluate these assessments with the implementation provided in [22], since it is based on the parameters proposed by the author of each indicator.

Table 1 shows the performance of \({\mathrm {RIQA}}\) and the other 12 image quality assessments across the set of images from the TID2008, LIVE, CSIQ, and IVC image databases, employing the Kendall Rank-Order Correlation Coefficient (KROCC) to test the distortion produced by JPEG2000 compression.
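For reference, KROCC compares how two score lists rank the same images: it counts concordant minus discordant pairs over all pairs. A minimal implementation, ignoring ties (which real evaluations handle with tie-corrected variants):

```python
def kendall_tau(x, y):
    """Kendall Rank-Order Correlation Coefficient between two score
    lists; 1.0 means the objective metric ranks the images exactly
    as the subjective scores do, -1.0 means the reverse order."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

For example, two lists in identical order give 1.0, and swapping one adjacent pair out of three items gives 1/3.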
Table 1

KROCC of \({\mathrm {RIQA}}\) and other quality assessment algorithms on multiple image databases using JPEG2000 distortion. The higher the KROCC, the more accurate the image assessment. Bold and italicized entries represent the best and the second-best performers in each database, respectively. The last column shows the KROCC average over all image databases


[Table body not recoverable from the source: KROCC values of \({\mathrm {PSNR}}\), \({\mathrm {CB_{p}F}}\)-IQA, and the other metrics for each image database.]
Thus, for JPEG2000 compression distortion, \({\mathrm {RIQA}}\) obtains the best results in all databases. \({\mathrm {RIQA}}\) reaches a correlation of 0.8837 over the 228 images of the LIVE database. On average, the \({\mathrm {RIQA}}\) algorithm correlates at 0.8555, using KROCC. Furthermore, for JPEG2000 compression distortion, MSSIM is the second-best indicator not only for the TID2008, LIVE, and IVC image databases but also on average, while VIFP occupies second place for the CSIQ image database. The correlation between the opinion of observers and the results of MSSIM is 0.0143 less than that of \({\mathrm {RIQA}}\). In general, we can conclude that \({\mathrm {PSNR}}\) improves its performance by 11.5% if it includes the four steps of filtering of \({\mathrm {RIQA}}\).

Table 2 shows the performance of \({\mathrm {RIQA}}\) and the other twelve image quality assessments across the set of images from TID2008, LIVE, CSIQ and IVC image databases employing KROCC for testing the distortion produced by a JPEG compression.
Table 2

KROCC of \({\mathrm {RIQA}}\) and other quality assessment algorithms on multiple image databases using JPEG distortion. The higher the KROCC, the more accurate the image assessment. Bold and italicized entries represent the best and the second-best performers in each database, respectively. The last column shows the KROCC average over all image databases


[Table body not recoverable from the source: KROCC values of \({\mathrm {PSNR}}\), \({\mathrm {CB_{p}F}}\)-IQA, and the other metrics for each image database.]
Table 2 also shows the average performance over the 534 images of the cited image databases. Bold and italicized entries represent the best and the second-best assessments, respectively. \({\mathrm {RIQA}}\) is the best performer both in each image database and on their average. MSSIM is the second-best-ranked metric in all databases and on average, except for the CSIQ database, where VIF takes that place. \({\mathrm {RIQA}}\) is better than MSSIM by 0.0243 and improves the performance of \({\mathrm {PSNR}}\) or \({\mathrm {MSE}}\) by 0.1402 for JPEG compression degradation.

5.2 Blind Image Quality Assessment

Some metrics estimate quality, as \({\mathrm {PSNR}}\) does, while others estimate degradation, \({\mathrm {MSE}}\), for instance. It is important to mention that \({\mathrm {BIQA}}\) estimates degradation: when this degradation tends to zero, the overall quality is getting better. We have already checked the behavior of \({\mathrm {RIQA}}\), so in this section we develop comparisons to verify the performance of \({\mathrm {BIQA}}\), comparing different compressed versions of the image Baboon. \({\mathrm {BIQA}}\) is a metric that returns decibels, as \({\mathrm {PSNR}}\) does, so instead of employing a non-parametric correlation we use a parametric one, the Pearson correlation coefficient, in order to better compare the results of \({\mathrm {BIQA}}\) and \({\mathrm {PSNR}}\).
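The Pearson coefficient used in this section can be computed directly from the two series of decibel values; a minimal sketch:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two score series,
    e.g., the BIQA and PSNR values over a set of distorted images."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Unlike KROCC, which uses only rank order, Pearson measures the linear relationship between the two decibel series, which is why it suits two metrics expressed in the same units.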

Figure 13 depicts three JPEG2000-distorted versions of the image Lenna at 0.05 (Fig. 13a), 0.50 (Fig. 13b), and 1.00 (Fig. 13c) bits per pixel. \({\mathrm {PSNR}}\) estimates 23.41, 32.74, and 34.96 dB, respectively, while \({\mathrm {CB_{p}F}}\)-IQA computes 48.42, 36.56, and 35.95 dB, respectively. Thus, both \({\mathrm {PSNR}}\) and \({\mathrm {CB_{p}F}}\)-IQA estimate that the image at 1.00 bpp has the lowest distortion. When this experiment is extended to the JPEG2000-distorted versions from 0.05 to 3.00 bpp (in increments of 0.05 bpp, depicted in Fig. 14), we find that the correlation between \({\mathrm {PSNR}}\) and \({\mathrm {CB_{p}F}}\)-IQA is 0.9932; that is, for image Lenna, for every 10,000 estimations \({\mathrm {CB_{p}F}}\)-IQA misses only 68 assessments.
Fig. 13

JPEG2000 Distorted versions of color image Lenna at different bit rates expressed in bits per pixel (bpp). a High distortion, b medium distortion and c low distortion

Fig. 14

Comparison of \({\mathrm {PSNR}}\) and \({\mathrm {CB_{p}F}}\)-IQA for the JPEG2000 distorted versions of image Lenna

Figure 15a, b, and c depict three JPEG2000 compressions of the image Baboon at 0.05, 0.50, and 1.00 bits per pixel, respectively. \({\mathrm {PSNR}}\) estimates 18.55 dB for Fig. 15a, 23.05 dB for Fig. 15b, and 25.11 dB for Fig. 15c, while \({\mathrm {BIQA}}\) computes 43.49, 30.07, and 28.71 dB, respectively. Thus, for 0.05 bpp (Fig. 15a), the highest distortion is estimated by both \({\mathrm {PSNR}}\) and \({\mathrm {BIQA}}\).

Figure 16 depicts multiple JPEG2000-decoded images from 0.05 to 3.00 bpp, in increments of 0.05 bpp. With these data we find that the correlation between \({\mathrm {PSNR}}\) and \({\mathrm {BIQA}}\) is 0.9695; namely, for image Baboon, for every 1000 tests \({\mathrm {BIQA}}\) estimates only about 30 assessments in a wrong way.
Fig. 15

JPEG2000 Distorted versions of color image Baboon at different bit rates expressed in bits per pixel (bpp). a High distortion, b medium distortion and c low distortion

Fig. 16

Comparison of \({\mathrm {PSNR}}\) and \({\mathrm {CB_{p}F}}\)-IQA for the JPEG2000 distorted versions of image Baboon

5.3 \({\mathrm {CB_{p}F}}\)-IQA Interface

Figure 17 shows the graphic interface, which allows uploading pictures and calculating their quality using methods with and without reference, selectable via a drop-down menu. The observer can also select the type of distance \(\delta \) used by the code, which is the distance from the screen to the face of the observer.
Fig. 17

Estimation of distance: static distance \({\mathrm {CB_{p}F}}\)-IQA graphic interface

Fig. 18

Estimation of distance: static distance. Referenced image quality assessment

Fig. 19

Estimation of distance: static distance. Blind image quality assessment

Figure 18 shows that, by selecting the full-reference metric in the drop-down menu, the user can load the original (uncompressed) and distorted (noisy) images using the load-image buttons, which open a window for exploring folders and selecting an image. Pressing the Calculate button applies the \({\mathrm {RIQA}}\) method, returning a numeric result associated with the image quality (Fig. 19).

Figure 18 shows that selecting the no-reference metric automatically displays a window with the caption No original image, and the button to load the original image is now disabled.
Fig. 20

Estimation of distance: dynamic distance. Selecting the distance

Fig. 21

Estimation of distance: dynamic distance. Taking stereo-pair \(I\left( r,l\right) \), i.e., \(I_r\) and \(I_l\) images

Fig. 22

Estimation of distance: dynamic distance. Dynamic estimation of distance \(\delta \)

When the no-reference metric is selected and the Calculate button is pressed, the algorithm automatically switches to the no-reference method, \({\mathrm {BIQA}}\), which returns a numeric value associated with the image quality.
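The drop-down behaviour described above amounts to a simple dispatch on the number of inputs (cf. the abstract: one input runs the blind metric, two inputs the referenced one). A minimal sketch, where `riqa` and `biqa` are placeholder stubs rather than the chapter's actual algorithms:

```python
def riqa(reference, distorted, delta):
    # Placeholder for the full-reference, CBPF-weighted PSNR metric.
    return ("RIQA", delta)

def biqa(distorted, delta):
    # Placeholder for the blind CBPF-based estimate.
    return ("BIQA", delta)

def cbpf_iqa(distorted, reference=None, delta=0.5):
    """Dispatch: two inputs -> referenced metric (RIQA),
    one input -> blind metric (BIQA)."""
    if reference is not None:
        return riqa(reference, distorted, delta)
    return biqa(distorted, delta)
```

The viewing distance `delta` is passed through to both branches, mirroring the interface's shared distance control.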

When the static distance mode is selected, a slider can be moved to change the distance value used by both the full-reference and no-reference metrics. When the dynamic distance mode is selected, the Enable preview (green), Show preview (blue), and Stop preview (red) buttons are enabled, allowing the preview to be handled, and the slider is disabled, Fig. 20.

When the Preview button is pressed, the stereo cameras transmit their images to the computer and they are displayed on the screen, Fig. 21.

Thus, when the Measure distance button is pressed, the interface captures a pair of pictures from the stereo cameras. From this pair, our algorithm automatically tries to detect the face and eyes of the observer; if \({\mathrm {CB_{p}F}}\)-IQA finds them, it estimates the observational distance \(\delta \), Fig. 22.
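A common way to convert the detected face positions in the stereo pair into a distance is plain stereo triangulation, \(\delta = fB/\mathrm{disparity}\). The chapter does not spell out its formula, so the following is an assumed sketch, with hypothetical focal length and baseline values:

```python
def estimate_distance(x_left, x_right, focal_px, baseline_m):
    """Estimate the observer distance (meters) from the horizontal
    pixel positions of the detected face centre in the left and right
    images, via triangulation: delta = focal * baseline / disparity."""
    disparity = abs(x_left - x_right)
    if disparity == 0:
        raise ValueError("zero disparity: detection failed or face too far")
    return focal_px * baseline_m / disparity

# Example: 800 px focal length, 6 cm camera baseline, 96 px disparity.
delta = estimate_distance(420, 324, focal_px=800, baseline_m=0.06)  # 0.5 m
```

Larger disparities correspond to a face closer to the cameras, which is why the formula divides by the disparity.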

6 Conclusions and Future Work

\({\mathrm {CB_{p}F}}\)-IQA is a metric divided into two algorithms, a full-reference (\({\mathrm {RIQA}}\)) and a no-reference (\({\mathrm {BIQA}}\)) image quality assessment, both based on a filtered weighting of \({\mathrm {PSNR}}\) using a model that simulates some features of the Human Visual System (the CBPF model). Both metrics proposed in \({\mathrm {CB_{p}F}}\)-IQA are based on five steps.
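The core weighting idea can be illustrated with a plain PSNR routine plus a caller-supplied perceptual factor. The scalar `weight` below is a stand-in assumption for the CBPF-derived factor, not the chapter's actual model:

```python
import math

def psnr(reference, distorted, peak=255.0):
    """Peak Signal-to-Noise Ratio (dB) between two equally sized
    images, given as flat sequences of pixel intensities."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

def weighted_psnr(reference, distorted, weight):
    """Perceptually weighted PSNR; `weight` stands in for the
    CBPF-derived factor (an assumption for illustration)."""
    return weight * psnr(reference, distorted)
```

The CBPF model would replace the constant weight with a factor derived from contrast band-pass filtering at the chosen viewing distance.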

When we compared \({\mathrm {RIQA}}\) against several state-of-the-art metrics, our experiments showed that \({\mathrm {RIQA}}\) was the best-ranked image quality algorithm on the well-known image databases TID2008, LIVE, CSIQ, and IVC, with the JPEG2000 compression algorithm used to distort the images in these databases. Thus, it is 2.5% and 1.5% better than the second-best performing method, MSSIM. On average, \({\mathrm {RIQA}}\) improves the results of PSNR by 14% and those of MSE by 11.5%.

In the blind case, the \({\mathrm {BIQA}}\) assessment correlates almost perfectly for JPEG2000 distortions, since the difference between \({\mathrm {BIQA}}\) and \({\mathrm {PSNR}}\) is, on average, only 0.0187.

Combining both \({\mathrm {RIQA}}\) and \({\mathrm {BIQA}}\) in the same interface was the main contribution of this work. Thus, an expert or a non-expert in the image quality assessment field can perform their own experiments, which may include dynamic or static quality estimations. Future work could extend the interface with a wider set of image quality estimators in addition to \({\mathrm {RIQA}}\) and \({\mathrm {BIQA}}\).



This work was supported by the National Polytechnic Institute of Mexico (Instituto Politécnico Nacional, México) through Project No. SIP-20171179, by the Academic Secretary and the Committee of Operation and Promotion of Academic Activities (COFAA), and by the National Council of Science and Technology of Mexico (CONACyT).

It is important to mention that Sects. 4 and 5 are part of the degree thesis work of Eduardo García and Yasser Sánchez.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

All four authors are with ESIME Zacatenco, Instituto Politécnico Nacional, Mexico City, Mexico.
