A novel Hexa data encoding method for 2D image crypto-compression

We propose a novel method for 2D image compression-encryption whose quality is demonstrated through accurate 2D image reconstruction at high compression ratios. The method is based on the Discrete Wavelet Transform (DWT), where the high-frequency sub-bands are processed by a novel Hexadata crypto-compression algorithm at the compression stage and by a new fast matching search algorithm at the decoding stage. The crypto-compression method consists of four main steps: 1) a five-level DWT is applied to the image to shrink the low-frequency sub-band and increase the number of high-frequency sub-bands, facilitating compression; 2) the Hexa data compression algorithm is applied to each high-frequency sub-band independently, using five different keys to reduce each sub-band to 1/6 of its original size; 3) a look-up table of probability data is built to enable decoding of the original high-frequency sub-bands; and 4) arithmetic coding is applied to the outputs of steps (2) and (3). At the decompression stage, a fast matching search algorithm is used to reconstruct all high-frequency sub-bands. We have tested the technique on 2D images, including frames streamed from videos (YouTube). Results show that the proposed crypto-compression method yields compression ratios of up to 99% with high perceptual image quality.

Image compression plays a key role in communication and data streaming over the Internet. It has been shown in the literature that wavelet-based coding provides improvements in image quality at higher compression ratios.
The DWT is a multi-resolution decomposition technique for input signals. A signal is first decomposed into two subspaces: a low-frequency sub-band (obtained by low-pass filtering) and a high-frequency sub-band (obtained by high-pass filtering). In the classical DWT, a low-pass digital filter and a high-pass digital filter implement the forward decomposition of a signal; both filters are derived from the scaling function and the corresponding wavelets [3]. The system down-samples the signal through banks of half-band filters, yielding the decomposition process [6].
In this work we use a five-level DWT decomposition to reduce the low-frequency coefficients (i.e. to shrink the LL size) and to increase the number of high-frequency coefficients. Additionally, the high-frequency sub-bands at the first level (i.e. LH1, HL1 and HH1) are set to zero. This is because, at higher image resolutions, the DWT removes insignificant coefficients from the high-frequency sub-bands and can recover a close approximation of the original 2D image without the LH1, HL1 and HH1 sub-bands [18][19][20]. All other high-frequency sub-bands at the lower levels are compressed by our approach, as illustrated in Fig. 1.
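To make one analysis step concrete, the following Python sketch (illustrative helper name; the paper's implementation is in MATLAB) performs a single-level 2D Haar DWT. The five-level decomposition used in the paper would recurse on the LL output:

```python
import numpy as np

def haar_dwt2(x):
    """One level of a 2D Haar DWT: split an even-sized array into LL, LH, HL, HH."""
    # Row transform: pairwise averages (low-pass) and differences (high-pass).
    lo = (x[:, 0::2] + x[:, 1::2]) / 2.0
    hi = (x[:, 0::2] - x[:, 1::2]) / 2.0
    # Column transform on each half yields the four sub-bands.
    LL = (lo[0::2, :] + lo[1::2, :]) / 2.0
    LH = (lo[0::2, :] - lo[1::2, :]) / 2.0
    HL = (hi[0::2, :] + hi[1::2, :]) / 2.0
    HH = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return LL, LH, HL, HH

img = np.arange(16, dtype=float).reshape(4, 4)
LL, LH, HL, HH = haar_dwt2(img)
# A five-level decomposition recurses on LL; the paper additionally
# zeroes LH1, HL1 and HH1 before coding the remaining sub-bands.
```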
After the five-level DWT is applied to the 2D image, a quantization process is applied to each sub-band independently. The quantization divides a sub-band by a uniform value Q, as described by the following equation:

H_Q = round(H / Q), where 0 < Q <= Max    (1)

Here Max, the maximum admissible value of Q, is the maximum value of the high-frequency sub-band H, which means that Q is strictly data-dependent, and H_Q is the quantized high-frequency sub-band. All coefficients are divided by the single value Q and then rounded to integers. Due to the properties of the DWT, each low-frequency sub-band is quantized by Eq. (1) in the same way (i.e. L_Q = round(L/Q)) before being decomposed into further sub-bands.
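A minimal Python sketch of the uniform quantization of Eq. (1) and its approximate inverse used at decoding time (function names are illustrative; the paper's implementation is in MATLAB):

```python
import numpy as np

def quantize(sub_band, Q):
    """Uniform quantization H_Q = round(H / Q), with Q in (0, Max]."""
    return np.round(sub_band / Q).astype(int)

def dequantize(hq, Q):
    """Approximate inverse: reconstruct H ~ H_Q * Q."""
    return hq * Q

H = np.array([[12.7, -3.1],
              [ 0.4, 25.0]])
HQ = quantize(H, 5.0)   # -> [[3, -1], [0, 5]]
```

Quantization is the lossy step: the larger Q is, the more coefficients collapse to zero, raising the compression ratio at the cost of reconstruction error.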

The Hexadata crypto-compression method
The Hexa-data compression method, proposed by the GMPR (Geometric Modelling and Pattern Recognition) research group, performs data compression with high compression ratios. Its main novel step is that, given data represented as an array or matrix of integer values, the data are divided into blocks of 6 items and each block is converted to a single floating-point value through a hierarchical two-level key set. The steps in the algorithm are:
1. Generate three keys and convert each triplet of data items to a single value by multiplying each item by its respective key and summing the results. The output of this triplet encoding is an ordered list or array of coded values [18,20], as represented by the following equation:

e(i) = K1 d(3i-2) + K2 d(3i-1) + K3 d(3i)    (2)

where e(i) is the encoded output from a stream of original data d(n).
2. The output from step 1 is converted to another stream of encoded data by multiplying each pair of values by its respective second-level keys, as shown in Fig. 2 and represented by the following equation:

C(j) = K4 e(2j-1) + K5 e(2j)    (3)

where C(j) is the final encoded output computed from the previously encoded data e(i).
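The two-level encoding of a six-item block can be sketched as follows. The key values shown here are placeholders chosen only for the example; the paper derives K1–K5 with its own key-generator algorithm [18,20]:

```python
def hexa_encode(data, K1, K2, K3, K4, K5):
    """Encode a block of six integers into one coded value via two key levels.

    Level 1 maps each triplet to e = K1*d1 + K2*d2 + K3*d3;
    level 2 combines the two triplet codes as C = K4*e1 + K5*e2.
    """
    assert len(data) == 6
    e1 = K1 * data[0] + K2 * data[1] + K3 * data[2]
    e2 = K1 * data[3] + K2 * data[4] + K3 * data[5]
    return K4 * e1 + K5 * e2

# Placeholder keys for illustration only (not the paper's generated keys).
C = hexa_encode([2, 0, 1, 0, 3, 0], 1.0, 10.0, 100.0, 1.0, 1000.0)
# e1 = 2 + 0 + 100 = 102; e2 = 0 + 30 + 0 = 30; C = 102 + 30000 = 30102
```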
The key values K1, K2, K3, K4 and K5 are generated by a key generator algorithm following the steps given in [18,20]. During crypto-compression, the probabilities of the compressed data are computed, resulting in a data set that we call Limited-Data [18,20], which is used later at the decompression stage as shown in Fig. 3.

The crypto-compression step
In this section we illustrate the encoding steps, assuming the input Data shown in Fig. 4. We stress that sorting by the first data item is a very important function in our proposed method, as it is used at the decompression stage for fast search. Meanwhile, the relevant 6 data items are saved in a Limited-Data array, because these data will later be used to recover the original data. The Limited-Data array is kept ordered by insertion sort as each data item is generated; insertion sort is a popular sorting method [21] that runs in linear time O(n) when the incoming data are already in ascending order as they are added to the array.
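A minimal sketch of maintaining the Limited-Data array in sorted order, assuming each entry is a (coded value, six-item block) pair and using Python's `bisect` module for the sorted insertion (names are illustrative, not the paper's code):

```python
import bisect

limited_data = []  # entries (coded_value, six_item_block), kept sorted by code

def record(coded_value, block):
    """Insert a (code, block) pair, keeping the table in ascending order.

    Inserting into an already-sorted list costs at most O(n) element
    shifts per item, which is what makes insertion-style sorting cheap here.
    """
    entry = (coded_value, tuple(block))
    if entry not in limited_data:      # keep one entry per distinct block
        bisect.insort(limited_data, entry)

record(30102.0, [2, 0, 1, 0, 3, 0])
record(77.0, [1, 0, 0, 0, 0, 0])
# limited_data is now sorted by coded value: 77.0 first, then 30102.0.
```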
The main reason to remove the output values and replace them with "Nil" is to reduce the size of the Limited-Data array. Also, the outputs are required to decrypt the data and can be separately encrypted for higher security, for example with the AES method. The last step in the algorithm is to apply arithmetic coding to the Limited-Data and the output to produce a stream of compressed data.
Some observations follow. In the example above, L2 seems to contain many data items and might be wrongly interpreted as a handicap to compression. In fact, most of the data in 2D images are repeated, which helps to achieve a high compression ratio; in a five-level DWT transform this normally means that more than 50% of the data are zero or insignificant. The approach works well and yields compression ratios up to 99% with good perceptual image quality, as shown in Section 5.

The decoding step
The decoding algorithm is the inverse of the Hexa-data crypto-compression algorithm. It starts by computing the coded value of each set of 6 relevant data items from the Limited-Data through Eq. (3), using the same keys (K1–K5). Figure 5 illustrates the decoding steps.
Decompression begins by matching each generated set of possible data with the outputs in the Limited-Data; if the match is successful, the result contains the values we are after, namely the relevant 6 data items representing the decoded data. This is a simple search within a look-up table with time complexity O(log n), because the array is sorted and a binary search algorithm is used for fast recovery.
Note that binary search requires the Limited-Data to be sorted in ascending order by the output value. The algorithm compares each item of compressed data with the middle value of the Limited-Data array. If the values match, its relevant 6 data items are returned (i.e. the 6 relevant items are the decoded data). Otherwise, if the data value is less than the middle element, the algorithm is repeated on the sub-array to the left of the middle element or, if the value is greater, on the sub-array to the right [21]. A "Not Matched" outcome cannot occur, because the Limited-Data was built at compression time and contains one entry for each original data block. This decoding method runs much faster than our previous algorithm proposed in Patent WO 2016/135510A1 [22].
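The look-up described above can be sketched with a standard binary search, assuming the Limited-Data is a sorted list of (code, six-item block) pairs (an illustrative structure, not the paper's code):

```python
import bisect

def decode_block(coded_value, limited_data):
    """Binary-search the sorted Limited-Data table for a coded value.

    A match always exists because the table was built at compression
    time from the very blocks being decoded.
    """
    # (coded_value,) sorts just before (coded_value, block), so
    # bisect_left lands exactly on the matching entry.
    i = bisect.bisect_left(limited_data, (coded_value,))
    code, block = limited_data[i]
    assert code == coded_value          # "Not Matched" cannot occur
    return block

table = [(77.0, (1, 0, 0, 0, 0, 0)),
         (30102.0, (2, 0, 1, 0, 3, 0))]
block = decode_block(30102.0, table)    # -> (2, 0, 1, 0, 3, 0)
```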

Removing zeros from high-frequency sub-bands
After the five-level DWT is applied to the 2D image, the LL5 sub-band at the 5th level is very small and can be reduced to a few bytes by arithmetic coding. Meanwhile, all other high-frequency sub-bands are encoded by the proposed Hexa-data crypto-compression method. To achieve higher compression ratios, the output of the Hexa-data algorithm can be further compressed by removing zeros from the encoded data.
To separate zeros from nonzero data, a zero-array is computed by detecting the number of zeros between two nonzero values. For example, assume that a data set H = {29023.567, 0, 0, 0, 9457.334, 0, 0, 0, 0, 0, −7123.123} represents an encoded high-frequency sub-band. The zero-array will be Z = {0,3,0,5,0}, where the zeros are placeholders for nonzero data and the other numbers are the zero counts between two consecutive nonzero data. Techniques to further increase the compression ratio can be applied here, such as replacing the number "5" with "3" and "2" to increase the probability (i.e. the re-occurrence) of redundant data [18]. This would yield a new equivalent zero-array Z = {0,3,0,3,2,0}. Finally, each encoded high-frequency sub-band has its own zero-array and nonzero-array, and each array is compressed by arithmetic coding for higher compression ratios.
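The zero-separation step can be sketched as follows (the `split_zeros` name is illustrative); the example reproduces the zero-array from the data set H above:

```python
def split_zeros(stream):
    """Split a coded stream into (zero_array, nonzero_array).

    In the zero array, 0 is a placeholder for a nonzero value and any
    other entry is the count of zeros preceding the next nonzero value.
    """
    zero_array, nonzero_array, run = [], [], 0
    for v in stream:
        if v == 0:
            run += 1                    # extend the current zero run
        else:
            if run:
                zero_array.append(run)  # emit the finished zero run
                run = 0
            zero_array.append(0)        # placeholder for this nonzero value
            nonzero_array.append(v)
    if run:
        zero_array.append(run)          # trailing zeros, if any
    return zero_array, nonzero_array

H = [29023.567, 0, 0, 0, 9457.334, 0, 0, 0, 0, 0, -7123.123]
Z, NZ = split_zeros(H)   # Z == [0, 3, 0, 5, 0]
```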

Experimental results
The proposed image decoding starts by recovering the high-frequency sub-bands. At the 5th level, the high-frequency sub-bands (LH5, HL5 and HH5) are recomposed with LL5 (i.e. by applying the inverse DWT) to generate a new LL4, which is in turn recomposed with its relevant high frequencies. This process continues until the LL at the 1st level is recovered. In the process, all other levels of LL, HL, LH and HH are recovered, thus completing the 2D image decoding. The experimental results described here were implemented in MATLAB R2014a running on an Intel Core i7-3740QM microprocessor (4 cores).
Results are presented in two ways: first, the method is applied to images of different sizes and their visual quality is assessed with the Root Mean Square Error (RMSE) and the Peak Signal to Noise Ratio (PSNR). Second, the proposed compression technique is applied to a stream of video images. Table 1 shows the first part of the results, obtained by applying the crypto-compression method to the three selected images shown in Fig. 6.
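For reference, RMSE and PSNR for 8-bit images follow the standard definitions and can be computed as below (a generic sketch, not code from the paper):

```python
import numpy as np

def rmse(orig, recon):
    """Root Mean Square Error between two images of equal shape."""
    diff = orig.astype(float) - recon.astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))

def psnr(orig, recon, peak=255.0):
    """Peak Signal to Noise Ratio in dB (peak = 255 for 8-bit images)."""
    e = rmse(orig, recon)
    return float('inf') if e == 0 else 20.0 * np.log10(peak / e)

a = np.full((4, 4), 100, dtype=np.uint8)
b = np.full((4, 4), 105, dtype=np.uint8)
# rmse(a, b) == 5.0; psnr(a, b) == 20*log10(255/5), about 34.15 dB
```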
In Table 1, "L" and "H" refer to the quantization value "Q" (see Eq. 1) applied to the low- and high-frequency sub-bands at each level, respectively. Meanwhile, "N" indicates "no quantization" for the high-frequency sub-bands at the first DWT level; otherwise image quality would be seriously compromised. Figure 6 illustrates the perceptual quality of the decoded images: the images on the left are decoded from crypto-compression at low compression ratios with high image quality, while the decompressed images on the right correspond to high compression ratios with good image quality. The results in Fig. 6 demonstrate that our proposed crypto-compression method can compress images at low bit rates while preserving image details.
Additionally, Table 2 shows our proposed Hexa-data compression applied to 2D images of various sizes for 3D reconstruction from a single image. The 3D mesh reconstruction is based on Convolutional Neural Networks (CNNs), which depend on training with an appropriate set of 2D images and 3D facial models. The CNN technique works from a single 2D face image without requiring camera lens alignment (i.e. no per-image calibration), accepts arbitrary facial poses and expressions, and outputs the reconstructed 3D facial geometry [23]. After the Hexa-data method is applied, the decoded images are uploaded to the website [23] and, after a few seconds, a 3D model is returned as shown in Fig. 7. The main reason to test our method on 3D reconstruction is to provide another level of perceptual assessment of the quality of the compression and decoding methods: anyone without any training can visually assess the quality of a 3D reconstructed face. Furthermore, the widespread adoption of 3D applications in Virtual Reality (VR) and Augmented Reality (AR), enabled by a range of 3D tools (e.g. Autodesk, 3Dmax, MeshLab, Geomagic, and various animation studios), depends on 2D images.
In a second test, we applied our proposed crypto-compression method to a 720p video sequence. We used the Windows Movie Maker application to grab images from the video, and the compression method was applied to each image independently (i.e. we did not resort to other video compression techniques). Results are summarized in Table 3, and Fig. 8 shows the performance of the coding and decoding methods, respectively.
Concerning crypto-compression as presented here, we do not perform a comparative analysis with other techniques because the final crypto-compressed file consists of the compressed data, arithmetic coding probabilities, compression keys, generated Limited-Data and Hexa-data probabilities. All such information is included for each image as a file header, and the header itself can be as large as the actual compressed data; normally the header represents around 25% of the final crypto-compressed file. Since state-of-the-art methods carry no such headers, as they are not designed with security in mind, a comparison would not be fair. If the header information were removed and only the actual compressed data sizes compared, the comparison would be wrong, as our data cannot be decompressed without the header. If the header information were included, the comparison would still not be right, because the header size is data-dependent: a direct comparison would yield inconclusive results, with some files larger than, and some equivalent to, those of standard methods.
The purpose of the research reported here is to demonstrate that the method can be applied to image and video compression where security (or sensitivity) of data is a concern. Furthermore, it is demonstrated that the method recovers the original images with high quality. Although the results reported in Table 3 are impressive in terms of compression ratios, we are working on improving them for image and video sequences, followed by an in-depth comparative analysis with state-of-the-art methods. Research is under way and results will be reported in the near future.

Conclusion
This paper has proposed and demonstrated a novel method for crypto-compression of 2D images and illustrated the quality of compression through objective measures such as RMSE and PSNR and the perceived quality of the visualization. Like JPEG2000, our proposed method is based on the DWT, but it differs in the way the DWT is applied. It incorporates the following features:
- The method uses a five-level DWT to increase the number of high frequencies and keep a single low-frequency sub-band (i.e. the low frequencies can be compressed to a few bytes depending on image size).
- At each level, the high-frequency sub-bands are combined, and the Hexa-data method is applied to the combined sub-bands to reduce the matrix size, leading to increased compression ratios; the matrix is coded using five different keys. This is the main feature of our proposed algorithm, lending it to secure data compression using partial encryption.
- At the decompression stage, a fast matching search algorithm is used to decode the high-frequency matrix using the five symmetric keys from the compression steps. The generation and use of such keys render the proposed method suitable for security applications, as data can only be decompressed with the keys. Another feature of the algorithm is that it runs much faster than our previous work proposed in the patent [22].
- The experiments show that the technique is suitable for real-time applications such as compression of 3D data and video data streaming. Moreover, compression ratios up to 99% can be achieved without significant image degradation. Equally, the method can be applied to video sequences with good image quality.

Disadvantages:
- The information needed to decode the data (keys, Limited-Data, image and block size, in the form of a look-up table) is kept in the header file, and this information increases the compressed data size.
- The complexity of the crypto-compression steps is greater than that of existing codecs, due to the coding of each block of six data items, which also increases the execution time for large images.
Further work includes the mathematical analysis of the crypto-compression method, that is, the effort required to decode the image if the compression keys and other information in the look-up table were not available. In addition, we are investigating making AES an integral component of the crypto-compression method so that it can be widely used for secure data compression and encryption.