A novel color image compression algorithm using the human visual contrast sensitivity characteristics

To achieve a higher image compression ratio and improve the visual perception of the decompressed image, a novel color image compression scheme based on the contrast sensitivity characteristics of the human visual system (HVS) is proposed. In the proposed scheme, the image is first converted into the YCrCb color space and divided into sub-blocks. The discrete cosine transform is then carried out for each sub-block, and three quantization matrices are built from the contrast sensitivity characteristics of the HVS to quantize the frequency spectrum coefficients of the image. The Huffman algorithm is used to encode the quantized data. The inverse process involves decompression and matching to reconstruct the decompressed color image. Simulations are carried out for two color images. The results show that, at approximately the same compression ratio, the average structural similarity index measurement (SSIM) and peak signal-to-noise ratio (PSNR) are increased by 2.78% and 5.48%, respectively, compared with joint photographic experts group (JPEG) compression. The results indicate that the proposed compression algorithm is feasible and effective in achieving a higher compression ratio while preserving encoding and image quality, and can meet the needs of storage and transmission of color images in daily life.


Introduction
In daily life, more than 70% of all the information received by a person is obtained through the sense of vision. Compared with other types of information, images are more intuitive and precise and carry a large amount of information. Therefore, images make up a major portion of modern-day digital information processing, storage, and transmission. However, because digital images contain an enormous amount of data, without compression the transmission speed is severely affected and the image storage capacity is restricted. These factors have seriously limited the development of image communication; meanwhile, they have also stimulated faster development of image compression technology in recent years [1][2][3][4][5].
The purpose of image compression is to minimize the amount of data as much as possible while ensuring the quality of the reconstructed image and satisfying the requirements of the application. In recent years, a great deal of research on image compression has been carried out, and many image coding technologies and standards have been developed [4][5][6][7]. These coding techniques mainly involve image compression technology based on human visual and auditory characteristics, wavelet transform theory, and fractal theory [8][9][10][11][12][13]. However, the contrast sensitivity characteristics of the human visual system (HVS) have not been exploited much in the field of color image compression [1][2][3]. In 2010, Sreelekha and Sathidevi proposed an adaptive quantization scheme based on human visual characteristics for the compression of color images [14]. Although it achieves a high compression ratio, the algorithm has much higher complexity and requires a longer running time. In 2013, Abu et al. proposed a compression method using a psychovisual error threshold as the quantization table [15]. However, it was mainly aimed at the compression of gray images. In 2000, Marcus Nadenau proposed a compression method based on the HVS [16]; however, the HVS model used in Nadenau's method was obtained only by simple experimental measurements and data fitting. Few factors were taken into account in the model, which made it a less representative model of the HVS. Furthermore, these studies have only investigated partial applications. As color directly affects the quality of the compressed image and the color perception characteristics of the HVS are still being explored, image compression technology based on the color perception characteristics of the HVS needs further research and exploration [17][18][19][20][21].
Therefore, based on the typical contrast sensitivity characteristics of the HVS and the characteristics of the frequency spectrum coefficients of the image in the discrete cosine transform (DCT) domain, a novel color image compression scheme is proposed in this paper. Simulations are carried out for two color images. The results show that, at approximately the same compression ratio, the average structural similarity index measurement (SSIM) and peak signal-to-noise ratio (PSNR) are increased by 2.78% and 5.48%, respectively, compared with joint photographic experts group (JPEG) compression. The proposed compression method thus has higher compression efficiency and quality than JPEG. It will better meet the needs of image transmission and storage in multimedia, photoelectric image technology, mobile communications, and other applications.

Color space
Conventional displays show color images using the RGB color space. Although the RGB color model can efficiently show various colors, it is strongly associated with the underlying device. In modern communication, images are often transmitted between different devices. Therefore, the images need to be converted from the RGB color space to the device-independent YCbCr color space [14][15][16]. Human visual perception is a mechanism based on three color pairs, which respond to the strength of light (black-white), the red-green colors, and the yellow-blue colors. Therefore, using the YCbCr color space makes it easier to exploit visual characteristics in image compression. The forward and inverse color space conversions between the RGB and YCbCr spaces are given in [17,18].
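The conversion equations themselves are not legible in this copy. As an illustration only, the sketch below uses the full-range BT.601 conversion adopted by JFIF/JPEG for 8-bit samples; this is an assumption, and the coefficients used by the authors may differ. The function names `rgb_to_ycbcr` and `ycbcr_to_rgb` are hypothetical.

```python
# Full-range BT.601 (JFIF-style) conversion for 8-bit samples.
# Assumption: the paper's own forward/inverse equations are not available,
# so these coefficients are an illustrative stand-in.

def rgb_to_ycbcr(r, g, b):
    """Forward transformation: RGB in [0, 255] -> (Y, Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse transformation: (Y, Cb, Cr) -> RGB."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return r, g, b
```

A neutral gray maps to Cb = Cr = 128, which is why the two chroma planes of natural images carry comparatively little energy and can be quantized more coarsely than Y.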

Visual characteristics and its application in image
The contrast sensitivity is one of the main parameters reflecting the spatial characteristics of the HVS, which is an important theoretical basis for image display, processing, and understanding. This characteristic is generally described quantitatively using the contrast sensitivity function (CSF). The CSF reflects the relationship between the contrast sensitivity thresholds (i.e., the reciprocal of contrast detection thresholds) and the angular frequencies.
The contrast detection thresholds cannot be measured directly. According to the Weber theorem and psychophysical metrics, their values can only be described quantitatively by the contrast values of target gratings that can just be perceived by the human eyes but cannot be fully resolved [17][18].

Luminance contrast
In the measurement of human visual characteristics, the target gratings are generally rectangular and made of light and dark stripes, as shown in the example in Fig. 1. The luminance contrast (C) of the stripes is generally defined by (3), as reported by Michelson [17,18], where L_1 and L_2 represent the luminances of the dark and bright stripes, respectively, and L is their average luminance (cd/m²).
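Equation (3) is not legible in this copy. For reference, the standard Michelson contrast, written with the symbols defined above (dark-stripe luminance L_1, bright-stripe luminance L_2, mean luminance L), takes the following form; whether the authors wrote it in exactly this arrangement is an assumption.

```latex
C = \frac{L_2 - L_1}{L_2 + L_1} = \frac{L_2 - L_1}{2L},
\qquad L = \frac{L_1 + L_2}{2}
```

With this definition C ranges from 0 (uniform field) to 1 (dark stripes with zero luminance).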

Luminance visual models
Since the 1930s, a large amount of research has been performed on the luminance contrast sensitivity characteristics of vision. The visual characteristics are influenced by many external factors, such as the measurement conditions, the subjects involved in the test, and the observation distance. Therefore, researchers have obtained different results from different experiments. So far, the multi-parameter composite model proposed by Daly et al. is considered the most suitable model to describe the luminance contrast sensitivity characteristics of the HVS, and is commonly known as the Daly model [10]. It is mathematically expressed by (4), where f is the angular frequency (cycles/degree), L is the average luminance of the grating, i² describes the size of the image (assuming the image is square), and ε is the frequency scaling constant, whose value is 0.9.
In the experiments, the measured results may generally be affected by the adaptive regulation of the human eyes, the diameter of the pupil while viewing, and the selected direction of the grating stripes. Equation (4) must therefore be corrected, and the frequency f is described by (5) after correction, where d is the viewing distance, e is the eccentricity (the distance from the fovea on the retina), and α is the orientation angle of the stripes. When the CSF is applied to image compression with the DCT, the inverse DCT produces light/dark stripes in the vertical and horizontal directions, to which the human eye is sensitive. The value of θ_r is equal to 1 when the orientation angle (α) of the stripes is π/2. The tested targets are observed monocularly through a 2-mm artificial pupil, and the eccentricity from the fovea is 1 mm [10][11].
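Equations (4) and (5) are not legible in this copy. The sketch below uses the form of Daly's CSF as it is commonly reproduced in the visible-differences-predictor literature. Apart from ε = 0.9, every numeric constant in it (3.23, 0.801, 0.7, 0.3, 100, 0.15, 0.06, 0.856, 0.14, 0.24, 0.11, 0.89) is recalled from that literature rather than taken from this paper, and should be checked against [10] before use. The function names, the units (viewing distance in metres, eccentricity in degrees), and the `peak` scale factor are likewise assumptions.

```python
import math

# Hedged sketch of a Daly-style luminance CSF. Constants are assumptions
# recalled from the literature, not values confirmed by this paper.

def daly_s1(rho, lum, i2, eps=0.9):
    """Core sensitivity for frequency rho (cycles/degree, must be > 0),
    mean luminance lum (cd/m^2) and image size i2 (square degrees)."""
    a_l = 0.801 * (1.0 + 0.7 / lum) ** -0.2
    b_l = 0.3 * (1.0 + 100.0 / lum) ** 0.15
    low_freq = ((3.23 * (rho * rho * i2) ** -0.3) ** 5 + 1.0) ** -0.2
    x = b_l * eps * rho
    return (low_freq * a_l * eps * rho * math.exp(-x)
            * math.sqrt(1.0 + 0.06 * math.exp(x)))


def daly_csf(rho, lum, i2, d, e, alpha, peak=1.0):
    """Sensitivity after correcting the frequency for viewing distance d,
    eccentricity e and stripe orientation alpha (radians)."""
    r_a = 0.856 * d ** 0.14                         # accommodation factor
    r_e = 1.0 / (1.0 + 0.24 * e)                    # eccentricity factor
    r_theta = 0.11 * math.cos(4.0 * alpha) + 0.89   # orientation factor
    corrected = rho / (r_a * r_e * r_theta)
    return peak * min(daly_s1(corrected, lum, i2), daly_s1(rho, lum, i2))
```

The orientation factor evaluates to 1 at α = π/2, which agrees with the statement in the text, and the resulting curve is band-pass: sensitivity falls off at both very low and very high angular frequencies.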

Color CSF models
Color contrast lacks a uniform definition because of the diversity of colors. The visual characteristics of color contrast sensitivity and the color CSF remain relatively unexplored in the available literature, and more work is needed in future studies [10][11][12][13]. Mullen et al. presented an opponent-color CSF obtained by measurements and data fitting. It is a three-parameter model based on the sum of two Gaussian functions, which is expressed in (6) [16]. The parameters in (6) are given in Table 1.

Applying CSF in image compression
To apply human visual characteristics in image compression, the CSF must be associated with the features of the image coefficients in the transform domain. Therefore, the following problems must be solved: (1) calculating the angular frequencies of the image, and (2) modeling the use of the CSF to quantize the image information. The methods to solve these problems are described as follows.
In vision research, the angular frequency is defined as the number of cycles of light and dark stripes stimulating the eyes within each degree of the viewing angle [16][17][18]. The angular frequency of each point in the image is different when viewing the image; therefore, the contrast detection thresholds corresponding to these angular frequencies are also different. Solving the above two problems requires the angular frequency and contrast detection threshold for each point of the image.
In the measurement of human visual characteristics, the visual angle θ and its calculation method are shown in Fig. 1, where W is the width of the gratings and D is the viewing distance. The calculation method for the visual angle θ is described by (7). If W and D are constant, the visual angle does not change. However, the width of the stripes can be changed by changing the number of pixels. Different stripe widths correspond to different spatial cycles; therefore, the number of grating cycles stimulating the eye within each degree of the visual angle differs, which yields a different angular frequency. Combining the frequency spectrum characteristics of the images in the DCT domain, the calculation methods of the angular frequencies of the images are detailed as follows.
In the measurement of color visual characteristics, Mullen adopted the luminance contrast definition advised by Michelson to describe chromaticity contrast, using chroma in place of luminance [17]. When the luminance contrast definition is combined with image processing, the average luminance of the sub-block image can be taken as the average luminance L in (4). It also represents the luminance of the bright or dark stripes of the gratings.
In the proposed compression scheme, the color image is decomposed into three component luminance images, and every component luminance image is further divided into sub-blocks of 8×8 pixels. Afterwards, the DCT is carried out for each sub-block. F(u, v) represents the frequency spectrum image after the DCT, and (u, v) represents the position of a discrete series point. It is found through experiments that any two points on the frequency spectrum image produce sinusoidal periodic stripes after the inverse DCT, as shown in Fig. 2. If the difference of the coordinates between the two points is n, the number of cycles of the stripes on the inverse transformed image is n. By stipulating that one of the two points is the origin of the coordinates, the coordinates of the other point can be taken as (u, v). Thus any point on the spectrum image with coordinates (u, v) corresponds to a grating in the inverse transformation domain with a spatial frequency of (u² + v²)^(1/2). Therefore, observing a point in the frequency spectrum image corresponds to detecting a grating. Based on this correspondence, the measurement method of human visual characteristics, and the definition of the angular frequency, (8) is used to calculate the angular frequencies of the image when human eyes observe it. By combining the CSF model, the contrast detection threshold of each frequency of the spectrum image obtained by the DCT of an 8×8 pixel sub-block can be calculated. In the measurement of human visual characteristics, the contrast detection threshold is defined as the contrast value of a grating in the critical state, in which it can just be distinguished but cannot be fully resolved. The human eyes cannot distinguish gratings whose contrast values are lower than the thresholds.
If the contrast values in some locations of the frequency spectrum images are lower than the thresholds, such information may be defined as redundant information, which can be completely removed or quantized to zero. This is useful and necessary for image compression.
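Equations (7) and (8) are not legible in this copy. The following is a minimal sketch under two stated assumptions: that (7) is the usual visual-angle formula θ = 2·arctan(W/(2D)) expressed in degrees, and that (8) divides the cycle count (u² + v²)^(1/2) of the grating associated with coefficient (u, v) by that angle. The function names and the interpretation of `width` as the physical width spanned by the grating are hypothetical.

```python
import math

def visual_angle_deg(width, distance):
    """Visual angle (degrees) subtended by a target of the given width
    seen from the given distance (same length unit for both)."""
    return 2.0 * math.degrees(math.atan(width / (2.0 * distance)))


def angular_frequency(u, v, width, distance):
    """Assumed form of (8): cycles of the grating for DCT position (u, v)
    divided by the visual angle, giving cycles per degree."""
    cycles = math.hypot(u, v)  # (u^2 + v^2)^(1/2)
    return cycles / visual_angle_deg(width, distance)
```

Because every 8×8 sub-block shares the same (u, v) grid, these frequencies only need to be computed once per viewing condition and can then be reused for all blocks.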

Color image compression scheme
The purpose of image compression is to minimize the data as much as possible while keeping the original information needed for a realistic visual experience. The basic principle for reducing data is to remove redundant information. The procedure for the proposed compression algorithm is as follows. First, the image is converted into the YCbCr color space and, after isolating the three components of the color image, divided into sub-blocks. The DCT is carried out for each of the blocks in the three components. Afterwards, three quantization matrices based on the CSF model are built, and the frequency spectrum coefficients of the three component images are selectively quantized using the matrices. After the quantization, the Huffman algorithm is applied to encode the image data based on the zero-distribution characteristics. Finally, for verification, the image is decompressed by the inverse process and matched to reconstruct the color image. The specific steps are described as follows:
(1) The color image is converted from the RGB color space into the YCbCr color space and decomposed into three components: luminance, red-green color, and yellow-blue color. All three components are represented as luminance images. Each luminance image is divided into a number of sub-blocks of 8×8 pixels, and each sub-block is transformed using the DCT. Afterwards, the direct component (DC) and alternating components (AC) are extracted from the spectral coefficients in the transform domain.
(2) The contrast sensitivity threshold of each angular frequency of an arbitrary sub-block is calculated by the above method. As the corresponding points of every sub-block have the same coordinates, their angular frequencies are also the same according to the above method; therefore, they have the same contrast sensitivity thresholds. The reciprocals of the contrast sensitivity thresholds are the contrast detection thresholds. The contrast detection thresholds are multiplied by the DC value of the corresponding sub-block, and the results are defined as the visibility thresholds. A matrix J of 8×8 elements is constructed, in which the value of each element is the corresponding visibility threshold. The matrix J is called the quantization matrix. The position of every element of J is the corresponding coordinate in the sub-block. As the CSFs of luminance and color are different, there are three quantization matrices, corresponding to the luminance, red-green color, and yellow-blue color, respectively.
(3) The frequency spectrum coefficients of the image are compared with the elements of the matrix J. The human eyes cannot distinguish a point whose coefficient is less than the value of the corresponding element, because the contrast value of the grating corresponding to this point is below the threshold, according to the human visual characteristics and the above experiments. Hence, the information at this location (or point) in the frequency spectrum image is redundant. The human eyes are less sensitive to distortion of information in this range of frequencies; therefore, these coefficients can all be quantized to zero. According to the characteristics of the spectrum coefficients, some coefficients corresponding to the high frequencies in the spectrum image are zero or close to zero. Human eyes are also less sensitive to distortion associated with high frequencies; therefore, the coefficients corresponding to higher frequencies are also quantized to zero. Quantization is carried out for all three components of the color image.
(4) The Huffman algorithm is used to encode the image based on the distribution characteristics of zero coefficients after quantization.
(5) Following the above encoding process, the images are decoded by the inverse operation to obtain the decompressed versions of the three component images. Afterwards, the decompressed color image in the YCbCr color space is obtained by matching and reconstructing from the three component images. The images are then converted from the YCbCr color space to the RGB color space for display. As the quantization is not reversible in the decoding process, the frequency coefficients that are quantized to zero during the encoding process are replaced with the elements of the matrix J corresponding to each of the suppressed frequencies.
In this way, the discarded information is replaced with the human visual sensitivity thresholds, which effectively improves the quality of the decompressed image. As the human eyes are less sensitive to the information lost during compression, and this information is also unimportant, its loss will not cause noticeable distortion in the image. The flow chart of the compression scheme is shown in Fig. 3.
Fig. 3 Flow chart of the compression scheme combining the human visual characteristics.
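Steps (2), (3), and (5) above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the coefficient block is taken as already DCT-transformed, the contrast detection thresholds are supplied by the caller (for example from a CSF model), the comparison uses coefficient magnitudes, and all function names are hypothetical. The sketch works for any block size, although the scheme uses 8×8 blocks.

```python
def quantization_matrix(detect_thresholds, dc):
    """Step (2): visibility thresholds J = detection threshold * |DC|."""
    return [[t * abs(dc) for t in row] for row in detect_thresholds]


def quantize_block(coeffs, detect_thresholds):
    """Step (3): zero every AC coefficient whose magnitude is below J.
    Returns the quantized block and the matrix J that was used."""
    j = quantization_matrix(detect_thresholds, coeffs[0][0])
    quantized = [row[:] for row in coeffs]
    for u, row in enumerate(coeffs):
        for v, c in enumerate(row):
            if (u, v) != (0, 0) and abs(c) < j[u][v]:
                quantized[u][v] = 0.0
    return quantized, j


def dequantize_block(quantized, j):
    """Step (5): replace zeroed AC coefficients with the element of J.
    Assumption: the sign of a suppressed coefficient is not stored, and a
    coefficient that was exactly zero before quantization is also replaced."""
    restored = [row[:] for row in quantized]
    for u, row in enumerate(quantized):
        for v, c in enumerate(row):
            if (u, v) != (0, 0) and c == 0.0:
                restored[u][v] = j[u][v]
    return restored
```

Since J depends on the DC value of each block, a decoder following this sketch can rebuild J from the transmitted DC coefficient and the shared threshold table, so J itself would not need to be stored per block.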

Results
In order to verify the proposed compression algorithm, simulations are carried out for two 24-bit color images, "Fruit" and "Lena", with a resolution of 512×512 pixels. The experimental results are shown in Fig. 4. For comparison, the JPEG compression technique is applied to the same two images, and the results are shown in Fig. 5.
Fig. 4 Results of two images decompressed using the proposed compression algorithm: (a)-(c) decompressed Y, Cb, and Cr components of the "Fruit" image, respectively, (d) the color "Fruit" image reconstructed after decompression in the YCbCr color space, (e) the color "Fruit" image reconstructed after decompression in the RGB color space, and (f)-(j) corresponding to the results of (a)-(e) for the "Lena" image, respectively.
Fig. 5 Results of two color images decompressed using the JPEG technique: (a) the original color "Fruit" image, (b)-(d) decompressed Y, Cb, and Cr components of the "Fruit" image, respectively, (e) the color "Fruit" image reconstructed after decompression in the YCbCr color space, (f) the color "Fruit" image reconstructed after decompression in the RGB color space, and (g)-(l) corresponding to the results of (a)-(f) for the "Lena" image, respectively.

Discussion
In order to evaluate the quality of the compressed image, various subjective and objective metrics have been commonly used in the literature [19][20][21]. To illustrate the subjective effect of the compression, the data of the quantized images are compared with the original images of "Fruit" and "Lena" in the transformation domain, as shown in Fig. 6. From Fig. 6, it is found that the data quantity is obviously reduced compared with the data of the original image.
In order to objectively analyze the effect of the compression algorithm, two analyses are carried out:
(1) After the two color images ("Fruit" and "Lena") are compressed by JPEG and by the proposed compression algorithm at approximately the same compression ratio, the SSIM values of the two color images and their components are computed. The SSIM of the color image in the RGB color space is abbreviated as SSIM_RGB. Similarly, the SSIM values of the image components in the YCrCb color space are abbreviated as SSIM_Y, SSIM_Cr, and SSIM_Cb, respectively. The computed results are shown in Table 2.
(2) The PSNR is computed for the two color images and their components compressed by JPEG and by the proposed compression algorithm at four compression ratios, yielding rate-distortion curves. The results are shown in Fig. 7.
Fig. 6 Comparison of the data of the quantized images with those of the original images for the "Fruit" and "Lena" images in the transformation domain: (a) the frequency spectrum coefficients of the Y component of the original "Fruit" image, (b) the quantized results of sub-blocks of 8×8 pixels of the "Fruit" image, (c) the quantized results of the frequency spectrum coefficients of the Y component of the "Fruit" image, (d) the quantized results of the frequency spectrum coefficients of the color "Fruit" image, and (e)-(h) corresponding to the results of (a)-(d) for the "Lena" image, respectively.
Fig. 7 Rate-distortion curves after the two color images are compressed by JPEG and by the proposed compression algorithm: (a)-(d) the rate-distortion curves of the Y, Cr, and Cb components and the color image for the "Fruit" image, respectively, and (e)-(h) the rate-distortion curves of the Y, Cr, and Cb components and the color image for the "Lena" image, respectively.
It can be seen from Table 2 and Fig. 7 that, compared with JPEG, both the SSIM and the PSNR are improved by the proposed compression algorithm for the two test color images and their components at approximately the same compression ratio. The average improvements in the two quality metrics are 2.78% and 5.48%, respectively. The results show that the proposed compression algorithm has higher compression efficiency and quality than JPEG.
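For readers who want to reproduce the objective comparison, the PSNR of one 8-bit image component can be computed with the standard definition, PSNR = 10·log10(peak² / MSE). The sketch below is a generic illustration, not the authors' evaluation code; it represents an image as a list of rows, the function names are hypothetical, and SSIM is omitted because it requires a windowed computation.

```python
import math

def mean_squared_error(original, decoded):
    """Mean squared error between two equally sized images (lists of rows)."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(original, decoded):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mean_squared_error(original, decoded)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(peak * peak / err)
```

A relative improvement such as the percentages reported above would then be (PSNR_proposed − PSNR_JPEG) / PSNR_JPEG, evaluated at matched compression ratios.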

Conclusions
In order to obtain a higher image compression ratio and improve the visual perception of the decompressed images, a novel image compression scheme for color images based on the contrast sensitivity characteristics of the human visual system is proposed. The main contribution of the compression scheme includes the following two aspects. First, a method to compute the angular frequencies of the image is proposed, and the contrast detection thresholds of the frequency spectrum image are computed. Second, it is proposed that the visibility thresholds can be used as quantization step lengths to quantize the frequency spectrum coefficients of the image and thus achieve better image compression. Simulations are carried out for two color images. The results show that the average SSIM and PSNR are improved by 2.78% and 5.48%, respectively, at approximately the same compression ratio, compared with JPEG. These results indicate that the compression method combining human visual characteristics is feasible and effective in achieving a higher compression ratio while preserving encoding and image quality, and can meet the needs of color image storage and transmission in daily life.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.