A Novel Image Compression Algorithm for High Resolution 3D Reconstruction
Abstract
This research presents a novel algorithm to compress high-resolution images for accurate structured-light 3D reconstruction. Structured-light images contain a pattern of light and shadows projected onto the surface of the object, which is captured by the sensor at very high resolution. Our algorithm compresses such images to a high degree with minimal loss and without adversely affecting 3D reconstruction. The compression algorithm starts with a single-level discrete wavelet transform (DWT) that decomposes an image into four sub-bands. The LL sub-band is transformed by DCT, yielding a DC-matrix and an AC-matrix. The Minimize-Matrix-Size Algorithm is used to compress the AC-matrix, while the DWT is applied again to the DC-matrix, producing the LL2, HL2, LH2 and HH2 sub-bands. The LL2 sub-band is transformed by DCT, while the Minimize-Matrix-Size Algorithm is applied to the other sub-bands. The proposed algorithm has been tested with images of different sizes within a 3D reconstruction scenario. It is demonstrated to be more effective than JPEG2000 and JPEG, achieving higher compression rates with equivalent perceived quality and more accurate reconstruction of the 3D models.
Keywords
DWT, DCT, Minimize-Matrix-Size, LSS-Algorithm, 3D reconstruction

1 Introduction
Research in compression techniques has stemmed from the ever-increasing need for efficient data transmission, storage and utilization of hardware resources. Uncompressed image data require considerable storage capacity and transmission bandwidth. Despite rapid progress in mass storage density, processor speeds and digital communication system performance, demand for data storage capacity and data transmission bandwidth continues to outstrip the capabilities of available technologies [2]. The recent growth of data-intensive multimedia-based applications has not only sustained the need for more efficient ways to encode signals and images but has made compression of such signals central to signal storage and digital communication technology [7].
Compressing an image is significantly different from compressing raw binary data. General-purpose compression programs can certainly be used to compress images, but the result is less than optimal. This is because images have certain statistical properties that can be exploited by encoders specifically designed for them [7, 10]. Also, some of the finer details in the image can be sacrificed for the sake of saving a little more bandwidth or storage space. Lossless compression is concerned with compressing data which, when decompressed, is an exact replica of the original. This is the case when binary data such as executable files are compressed [13]; they need to be reproduced exactly when decompressed. On the other hand, images need not be reproduced exactly: an approximation of the original image is enough for most purposes, as long as the error between the original and the compressed image is tolerable [9].
The neighbouring pixels in most images are highly correlated and therefore hold redundant information. The foremost task, then, is to find a less correlated representation of the image. Image compression is in essence the reduction of this redundant data (bits) without degrading the quality of the image to an unacceptable level. There are two basic components of image compression: redundancy reduction and irrelevancy reduction [16]. Redundancy reduction aims at removing duplication from the source image, while irrelevancy reduction omits parts of the signal that are not noticed by the signal receiver, i.e., the Human Visual System (HVS), which presents some tolerance to distortion depending on the image content and viewing conditions. Consequently, pixels need not always be regenerated exactly as in the original, and the HVS will not detect the difference between the original and reproduced images [3].
The current standards for compression of still images (e.g., JPEG) use the Discrete Cosine Transform (DCT), which represents an image as a superposition of cosine functions with different discrete frequencies. The DCT can be regarded as a discrete-time version of the Fourier cosine series. It is a close relative of the Discrete Fourier Transform (DFT), a technique for converting a signal into elementary frequency components. Thus, the DCT can be computed with a Fast Fourier Transform (FFT)-like algorithm of complexity O(n log2 n) [8]. More recently, the wavelet transform has emerged as a cutting-edge technology within the field of image analysis. Wavelet transformations have a wide variety of applications in computer graphics including radiosity, multi-resolution painting, curve design, mesh optimization, volume visualization, image searching and, one of the first applications in computer graphics, image compression [15]. The Discrete Wavelet Transform (DWT) provides adaptive spatial-frequency resolution (better spatial resolution at high frequencies and better frequency resolution at low frequencies) that is well matched to the properties of the HVS [6, 7].
Here a further requirement is introduced concerning the compression of 3D data. We have demonstrated that, while the geometry and connectivity of a 3D mesh can be tackled by a number of techniques such as high-degree polynomial interpolation [11] or partial differential equations [12], the issue of efficient compression of 2D images both for 3D reconstruction and for texture mapping in structured-light 3D applications has not been addressed. Moreover, in many applications it is necessary to transmit 3D models over the Internet to share CAD/CAM models with e-commerce customers, to update content for entertainment applications, or to support collaborative design, analysis and display of engineering, medical and scientific datasets. Bandwidth imposes hard limits on the amount of data transmission and, together with storage costs, limits the complexity of the 3D models that can be transmitted over the Internet and other networked environments [12].
It is envisaged that surface patches can be compressed as a 2D image together with 3D calibration parameters, transmitted over a network and remotely reconstructed (geometry, connectivity and texture map) at the receiving end with the same resolution as the original data. The widespread integration of 3D models in different fields motivates the need to be able to store, index, classify, and retrieve 3D objects automatically and efficiently. In the following sections we describe a novel algorithm that can robustly achieve the aims of efficient compression and accurate 3D reconstruction.
2 The Proposed Compression Algorithm
Proposed image compression method flowchart
2.1 The Discrete Wavelet Transform (DWT)
The DWT exploits both the spatial and frequency correlation of data through dilations (or contractions) and translations of the mother wavelet over the input data. It supports multi-resolution analysis (i.e., it can be applied at different scales according to the detail required, which allows progressive transmission and zooming of the image without the need for extra storage) [4]. Another useful feature of the wavelet transform is its symmetric nature: the forward and inverse transforms have the same complexity, allowing fast compression and decompression routines to be built. Its characteristics well suited to image compression include the ability to take the HVS's characteristics into account, very good energy compaction, robustness under transmission and high compression ratios [5].
The implementation of the wavelet compression scheme is very similar to that of a sub-band coding scheme: the signal is decomposed using filter banks, and the output of the filter banks is down-sampled, quantized and encoded. The decoder decodes the coded representation, up-samples and recomposes the signal. The wavelet transform divides the information of an image into an approximation sub-band (LL) and detail sub-bands [1]. The approximation sub-band shows the general trend of pixel values, while the three detail sub-bands show the vertical, horizontal and diagonal details in the image. If these details are very small (below a threshold), they can be set to zero without significantly changing the image, which is why the high-frequency sub-bands compress into fewer bytes [14]. In this research the DWT is used twice, because the DWT assembles all low-frequency coefficients into one region representing a quarter of the image size. This reduction in size enables high compression ratios.
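The single-level decomposition into LL, HL, LH and HH can be sketched as follows. This is a minimal illustration using simple Haar-style averages and differences rather than the db3 wavelet used later in the paper; the function name is ours:

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D DWT with Haar-style averages and differences.
    Returns LL, HL, LH, HH sub-bands, each a quarter of the input size
    (input dimensions must be even)."""
    lo = (img[0::2, :] + img[1::2, :]) / 2.0   # row-pair averages (low-pass)
    hi = (img[0::2, :] - img[1::2, :]) / 2.0   # row-pair differences (high-pass)
    LL = (lo[:, 0::2] + lo[:, 1::2]) / 2.0     # approximation sub-band
    HL = (lo[:, 0::2] - lo[:, 1::2]) / 2.0     # detail sub-bands
    LH = (hi[:, 0::2] + hi[:, 1::2]) / 2.0
    HH = (hi[:, 0::2] - hi[:, 1::2]) / 2.0
    return LL, HL, LH, HH

img = np.arange(64, dtype=float).reshape(8, 8)
LL, HL, LH, HH = haar_dwt2(img)
print(LL.shape)  # (4, 4)
```

On this smooth ramp image the HH sub-band is entirely zero, illustrating why small detail coefficients can be thresholded away and compressed into few bytes.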
2.2 The Discrete Cosine Transform (DCT)
LL1 sub-band transformed by DCT for each 4 × 4 block set
One of the key differences between the application of the DWT and the Discrete Cosine Transform (DCT) is that the DWT is typically applied to an image as one block or a large rectangular region, while the DCT is used on small blocks. The DCT becomes increasingly complicated to calculate for larger blocks; for this reason a 4 × 4 pixel block is used in this research, whereas the DWT is more efficiently applied to the complete image, yielding good compression ratios [16].
The parameter L in Eq. (3) is computed from the maximum value in the LL1 sub-band and a "Quality" value ≥ 0.01. The quality value is a ratio of the maximum value; if this ratio is increased, a larger number of coefficients are forced to zero, leading to lower image quality. The DC value of each 4 × 4 block is stored in a separate matrix called the DC-matrix, and the other (4 × 4) − 1 AC coefficients are stored in the AC-matrix. The remaining high-frequency sub-bands (HL1, LH1 and HH1) are quantized by Eq. (3) and coded by the Minimize-Matrix-Size Algorithm.
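Since Eq. (3) is not reproduced in this excerpt, the sketch below assumes a quantization step of L = Quality × max(|LL1|); the 4 × 4 block DCT and the DC/AC split can then be illustrated as follows (function names are ours, not the paper's):

```python
import numpy as np

def dct_matrix(n=4):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    C[0, :] *= np.sqrt(1.0 / n)
    C[1:, :] *= np.sqrt(2.0 / n)
    return C

def block_dct_split(sub_band, quality=0.02):
    """Apply a 2-D DCT to each 4x4 block, quantize, and split the
    coefficients into a DC-matrix and an AC-matrix."""
    C = dct_matrix(4)
    L = quality * np.abs(sub_band).max()  # assumed form of the step in Eq. (3)
    dc, ac = [], []
    h, w = sub_band.shape
    for r in range(0, h, 4):
        for c in range(0, w, 4):
            blk = C @ sub_band[r:r + 4, c:c + 4] @ C.T  # 2-D DCT of the block
            blk = np.round(blk / L)                     # uniform quantization
            dc.append(blk[0, 0])          # one DC value per block
            ac.extend(blk.flatten()[1:])  # the other (4x4)-1 = 15 AC values
    return np.array(dc), np.array(ac)

ll1 = np.random.default_rng(0).uniform(0.0, 255.0, (8, 8))
dc, ac = block_dct_split(ll1)
print(dc.size, ac.size)  # 4 60
```

A larger quality ratio enlarges L, so more AC coefficients round to zero, exactly the quality/compression trade-off described above.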
The DC-matrix is transformed by a single-level DWT to produce further sub-bands LL2, LH2, HL2 and HH2. LL2 is quantized by dividing each value in the matrix by 2, in order to reduce the bit size. Similarly, the values of the other high-frequency sub-bands are divided by 2, which normalizes the high frequencies and increases the number of zeros.
LL2 is then transformed by a one-dimensional DCT applied to each group of 4 values (i.e., setting u = 0, v = 0 converts the two-dimensional DCT into a one-dimensional DCT), after which each value is truncated. No scalar quantization is used at this stage.
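A minimal sketch of this step, under the assumption that the LL2 values are simply halved and each group of four consecutive values passes through an orthonormal 1-D DCT before truncation (names are ours):

```python
import numpy as np

def dct1d(x):
    """Orthonormal 1-D DCT-II of a short vector."""
    n = len(x)
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.cos(np.pi * k * (2 * i + 1) / (2 * n))
    C[0] *= np.sqrt(1.0 / n)
    C[1:] *= np.sqrt(2.0 / n)
    return C @ x

def transform_ll2(ll2):
    # Divide every value by 2 to reduce bit size, then apply a 1-D DCT
    # to each group of 4 consecutive values and truncate the results
    # (no scalar quantization at this stage).
    q = ll2.flatten() / 2.0
    groups = [np.trunc(dct1d(q[i:i + 4])) for i in range(0, q.size, 4)]
    return np.concatenate(groups)

ll2 = np.arange(16, dtype=float).reshape(4, 4)
out = transform_ll2(ll2)
print(out.size)  # 16
```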
a A matrix before applying DBV, b applying DBV between two neighbours in each column
2.3 Compress Data by Minimize-Matrix-Size Algorithm
This algorithm is used to reduce the size of the AC-matrix and other high frequency sub-bands. It depends on the Random-Weight-Values and three coefficients to calculate and store values in a new array. The following List-1 describes the steps in the Minimize-Matrix-Size-Algorithm:
An n × m matrix is minimized into an array M
The limited-data for a 5 × 5 matrix is illustrated as a list of probabilities and the minimized array is subject to arithmetic coding
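Based on the description above, the coding step can be sketched as reducing each run of three coefficients to a single value via fixed Random-Weight-Values; the weight range and zero-padding policy below are illustrative assumptions, not the paper's exact choices:

```python
import numpy as np

def minimize_matrix_size(mat, weights):
    """Flatten a matrix and reduce each run of three coefficients to a
    single value: m = w0*a + w1*b + w2*c (the Minimize-Matrix-Size idea)."""
    flat = mat.flatten()
    pad = (-flat.size) % 3
    flat = np.concatenate([flat, np.zeros(pad)])  # zero-pad to a multiple of 3
    return flat.reshape(-1, 3) @ weights          # one value per triplet

rng = np.random.default_rng(1)
weights = rng.uniform(0.1, 1.0, 3)   # Random-Weight-Values (the coding "key")
ac = np.array([[1.0, 0.0, 2.0], [0.0, 0.0, 1.0]])
m = minimize_matrix_size(ac, weights)
print(m.size)  # 2
```

The matrix is reduced to a third of its size; recovering the original triplets later relies on the Limited-Data, as described in the decompression section.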
The final step in our compression algorithm is arithmetic coding, which takes a stream of data and converts it into one-dimensional floating-point values. These output values lie in the range between zero and one and, when decoded, reproduce the exact original stream of data. Arithmetic coding needs to compute the probability of each datum and assign it a range; the ranges are bounded by Low and High values.
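The range-assignment step described above might be sketched as follows (symbol ordering is our assumption; a full arithmetic coder would then narrow an interval using these ranges):

```python
from collections import Counter

def probability_ranges(symbols):
    """Assign each distinct symbol a [low, high) interval proportional
    to its frequency, as arithmetic coding requires."""
    counts = Counter(symbols)
    total = len(symbols)
    ranges, low = {}, 0.0
    for s, c in sorted(counts.items()):
        high = low + c / total
        ranges[s] = (low, high)  # Low and High bounds for this symbol
        low = high
    return ranges

r = probability_ranges([0, 0, 1, 2])
print(r)  # {0: (0.0, 0.5), 1: (0.5, 0.75), 2: (0.75, 1.0)}
```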
3 The Decompression Algorithm
A two-stage decompression algorithm is depicted in (a) and (b)
a A matrix before applying ABV, b applying ABV between two neighbours in each column
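The LSS-Algorithm at the heart of decompression (described in the conclusions) searches the Limited-Data for the triplet that reproduces each coded value under the Random-Weight-Values. A brute-force sketch, with illustrative weights and data:

```python
import itertools
import numpy as np

def lss_search(coded, weights, limited_data, tol=1e-9):
    """For each coded value, search triplets drawn from the Limited-Data
    until w0*a + w1*b + w2*c reproduces the value, recovering (a, b, c)."""
    out = []
    for m in coded:
        for a, b, c in itertools.product(limited_data, repeat=3):
            if abs(a * weights[0] + b * weights[1] + c * weights[2] - m) < tol:
                out.extend([a, b, c])
                break
    return np.array(out)

weights = np.array([0.8147, 0.9058, 0.1270])   # illustrative key values
limited = [0.0, 1.0, 2.0]                      # Limited-Data (distinct values)
coded = [1.0 * weights[0] + 0.0 * weights[1] + 2.0 * weights[2]]
print(lss_search(coded, weights, limited))     # [1. 0. 2.]
```

Because the search tests candidates drawn only from the Limited-Data, the exact original coefficients are recovered, making this stage lossless.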
4 Experimental Results in 2D and 3D
a 2D colour BMP image, b–c 2D grey scale images
Compressed image sizes using high frequencies in first level DWT
| Image name | Original size (MB) | Compressed size (KB) | Low-frequency quantization | High-frequency quantization |
|---|---|---|---|---|
| Wall | 3.75 | 74 | 0.02 | 0.02 |
| Wall | 3.75 | 47.6 | 0.04 | 0.04 |
| Wall | 3.75 | 33.7 | 0.08 | 0.08 |
| Girl | 4.14 | 78 | 0.02 | 0.02 |
| Girl | 4.14 | 48 | 0.04 | 0.04 |
| Girl | 4.14 | 29.1 | 0.08 | 0.08 |
| Woman | 4.14 | 62.1 | 0.02 | 0.02 |
| Woman | 4.14 | 38.1 | 0.04 | 0.04 |
| Woman | 4.14 | 24.5 | 0.08 | 0.08 |
Compressed image size without using high-frequencies in first level DWT
| Image name | Original size (MB) | Compressed size (KB) | Low-frequency quantization | High-frequency quantization |
|---|---|---|---|---|
| Wall | 3.75 | 62 | 0.02 | Ignored |
| Wall | 3.75 | 45 | 0.04 | Ignored |
| Wall | 3.75 | 33.5 | 0.08 | Ignored |
| Girl | 4.14 | 61.2 | 0.02 | Ignored |
| Girl | 4.14 | 42.6 | 0.04 | Ignored |
| Girl | 4.14 | 28.3 | 0.08 | Ignored |
| Woman | 4.14 | 53.4 | 0.02 | Ignored |
| Woman | 4.14 | 35.4 | 0.04 | Ignored |
| Woman | 4.14 | 24.3 | 0.08 | Ignored |
a and b 3D decompressed image of the wall with different quality values. c, d and e Differences between the original 3D wall image and the decompressed 3D wall image according to the quality parameter. Red regions show where the decompressed 3D wall matches the original 3D wall in the background, in three cases: high, median and low quality parameters. (Color figure online)
a and b 3D decompressed girl image with different quality values. c, d and e Differences between the original 3D girl image and the decompressed 3D girl image according to the quality parameters. The pink model represents the original background 3D image, while the other colours represent the 3D decompressed image at various quality parameters. (Color figure online)
a and b 3D decompressed woman image with different quality values. c, d and e Differences between the original 3D woman image and the decompressed 3D woman image according to the quality parameters. The pink model is the original 3D woman model, while the blue, green and golden models refer to high, median and low image quality respectively. (Color figure online)
RMSE and 3D RMSE between original and decompressed images using high frequencies in first level DWT
| Image name | RMSE | 3D RMSE | Low-frequency quantization | High-frequency quantization |
|---|---|---|---|---|
| Wall | 2.49 | 2.09 | 0.02 | 0.02 |
| Wall | 2.82 | 3.95 | 0.04 | 0.04 |
| Wall | 3.25 | 4.72 | 0.08 | 0.08 |
| Girl | 3.09 | 3.78 | 0.02 | 0.02 |
| Girl | 4.08 | 3.94 | 0.04 | 0.04 |
| Girl | 5.25 | 3.66 | 0.08 | 0.08 |
| Woman | 2.88 | 3.37 | 0.02 | 0.02 |
| Woman | 3.53 | 3.09 | 0.04 | 0.04 |
| Woman | 4.35 | 2.61 | 0.08 | 0.08 |
RMSE and 3D RMSE without using high-frequencies in first level DWT

| Image name | RMSE | 3D RMSE | Low-frequency quantization | High-frequency quantization |
|---|---|---|---|---|
| Wall | 2.66 | 2.09 | 0.02 | Ignored |
| Wall | 2.86 | 3.95 | 0.04 | Ignored |
| Wall | 3.24 | 4.72 | 0.08 | Ignored |
| Girl | 4.39 | 3.41 | 0.02 | Ignored |
| Girl | 4.71 | 3.83 | 0.04 | Ignored |
| Girl | 5.34 | 3.74 | 0.08 | Ignored |
| Woman | 3.38 | 3.12 | 0.02 | Ignored |
| Woman | 3.73 | 3.07 | 0.04 | Ignored |
| Woman | 4.38 | 2.71 | 0.08 | Ignored |
5 Comparison with JPEG2000 and JPEG Compression Techniques
Comparison between the proposed algorithm and JPEG2000 and JPEG techniques
| Image name | Quality | Proposed RMSE | Proposed 3D RMSE | JPEG2000 RMSE | JPEG2000 3D RMSE | JPEG RMSE | JPEG 3D RMSE | Compressed size (KB) |
|---|---|---|---|---|---|---|---|---|
| Wall | High | 2.49 | 2.09 | 1.92 | 4.28 | 3.14 | 2.8 | 74 |
| Wall | Median | 2.82 | 3.95 | 2.14 | 5.01 | 3.87 | 4.5 | 47.6 |
| Wall | Low | 3.25 | 4.72 | 2.42 | 3.52 | 5.34 | 6.9 | 33.7 |
| Girl | High | 3.09 | 3.78 | 2.14 | 3.94 | 3.28 | 3.94 | 78 |
| Girl | Median | 4.08 | 3.94 | 2.88 | 4.02 | 4.72 | 3.72 | 48 |
| Girl | Low | 5.25 | 3.66 | n/a | n/a | n/a | n/a | 29.1 |
| Woman | High | 2.88 | 3.37 | 2.14 | 3.14 | 2.6 | 2.55 | 62.1 |
| Woman | Median | 3.53 | 3.09 | 2.7 | 3.2 | 4.58 | 2.75 | 38.1 |
| Woman | Low | 4.35 | 2.61 | n/a | n/a | n/a | n/a | 24.5 |
a, b and c Decompressed 3D wall image by JPEG2000: at quality = 40 % most regions match the original image, and at quality = 26 % and quality = 10 % the result approximately matches the original image. d, e Decompressed 3D flat image by JPEG (degraded), unrecognizable against the original. The median-quality 2D image was decompressed by JPEG at quality = 51 %; at quality = 23 % JPEG is incapable of generating a 3D model
a and b Decompressed 3D girl image by JPEG2000. c, d Decompressed 3D girl image by JPEG. For low quality, JPEG cannot reach the compressed size of 29.1 Kbytes
a and b Decompressed 3D woman image by JPEG2000. c, d Decompressed 3D woman image by JPEG. For low quality, JPEG cannot reach the compressed size of 24.5 Kbytes
6 Conclusion
The main conclusions are as follows:
- 1. The use of two transformations helped our compression algorithm to increase the number of high-frequency coefficients, leading to increased compression ratios.
- 2. The Minimize-Matrix-Size Algorithm collects each three coefficients from the high-frequency matrices into a single floating-point value. This process converts a matrix into an array, leading to increased compression ratios while keeping the quality of the high-frequency coefficients.
- 3. The properties of the Daubechies DWT (db3) help the approach obtain higher compression ratios. This is because the Daubechies family has the ability to zoom into an image, and the high-frequency sub-bands of the first-level decomposition can be discarded (see Table 3).
- 4. The LSS-Algorithm is the core of our decompression algorithm. It converts a one-dimensional array back into a matrix and depends on the Random-Weight-Values. The LSS-Algorithm is lossless, owing to the ability of the Limited-Data to recover the exact original data.
- 5. The Random-Weight-Values and the Limited-Data are the keys for coding and decoding an image; without these keys, images cannot be reconstructed.
- 6. Our approach gives better visual image quality than JPEG and JPEG2000. This is because it removes most of the block artifacts caused by the 8 × 8 two-dimensional DCT of JPEG [15], as well as some of the blurring caused by the multi-level DWT in JPEG2000 [15].
- 7. The one-dimensional DCT with size n = 4 is more efficient than the n = 8 used by JPEG; this helped our decompression approach obtain better image quality than JPEG.
The main limitations of our approach are:
- 1. The overall complexity of our approach leads to increased execution time for compression and decompression; the iterative LSS-Algorithm is particularly costly.
- 2. The compressed header data contain floating-point arrays, causing some increase in compressed data size.
Notes
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
References
- 1. Acharya, T., & Tsai, P. S. (2005). JPEG2000 standard for image compression: Concepts, algorithms and VLSI architectures. New York: Wiley.
- 2. Ahmed, N., Natarajan, T., & Rao, K. R. (1974). Discrete cosine transform. IEEE Transactions on Computers, 23, 90–93.
- 3. Al-Haj, A. (2007). Combined DWT-DCT digital image watermarking. Journal of Computer Science, 3(9), 740–746.
- 4. Antonini, M., Barlaud, M., Mathieu, P., & Daubechies, I. (1992). Image coding using wavelet transform. IEEE Transactions on Image Processing, 1(2), 205–220.
- 5. Christopoulos, C., Askelof, J., & Larsson, M. (2000). Efficient methods for encoding regions of interest in the upcoming JPEG 2000 still image coding standard. IEEE Signal Processing Letters, 7(9), 247–249.
- 6. Esakkirajan, S., Veerakumar, T., Senthil Murugan, V., & Navaneethan, P. (2008). Image compression using multiwavelet and multi-stage vector quantization. WASET International Journal of Signal Processing, 4(4), 524–531.
- 7. Gonzalez, R. C., & Woods, R. E. (2001). Digital image processing. Boston, MA: Addison-Wesley.
- 8. Grigorios, D., Zervas, N. D., Sklavos, N., & Goutis, C. E. (2008). Design techniques and implementation of low power high-throughput discrete wavelet transform filters for JPEG 2000 standard. WASET International Journal of Signal Processing, 4(1), 36–43.
- 9. Rao, K. R., & Yip, P. (1990). Discrete cosine transform: Algorithms, advantages, applications. San Diego: Academic Press.
- 10. Richardson, I. E. G. (2002). Video codec design. New York: Wiley.
- 11. Rodrigues, M., Robinson, A., & Osman, A. (2010). Efficient 3D data compression through parameterization of free-form surface patches. In Signal Processing and Multimedia Applications (SIGMAP), Proceedings of the 2010 International Conference on IEEE (pp. 130–135).
- 12. Rodrigues, M., Osman, A., & Robinson, A. (2013). Partial differential equations for 3D data compression and reconstruction. Advances in Dynamical Systems and Applications, 8(2), 303–315.
- 13. Sadashivappa, G., & Ananda Babu, K. V. S. (2002). Performance analysis of image coding using wavelets. IJCSNS International Journal of Computer Science and Network Security, 8(10), 144–151.
- 14. Sana, K., Kaïs, O., & Noureddine, E. (2009). A novel compression algorithm for electrocardiogram signals based on wavelet transform and SPIHT. WASET International Journal of Signal Processing, 5(4), 11.
- 15. Sayood, K. (2000). Introduction to data compression (2nd ed.). San Francisco: Morgan Kaufmann.
- 16. Tsai, M., & Hung, H. (2005). DCT and DWT based image watermarking using sub sampling. In Proceedings of the 2005 IEEE Fourth International Conference on Machine Learning and Cybernetics, China (pp. 5308–5313).