An efficient secure data compression technique based on chaos and adaptive Huffman coding

Data stored in physical storage or transferred over a communication channel includes substantial redundancy. Compression techniques cut down this redundancy to reduce space and communication time. Nevertheless, compression techniques lack proper security measures, e.g., secret key control, leaving the data susceptible to attack. Data encryption is therefore needed to achieve data security, keeping the data unreadable and unaltered through a secret key. This work concentrates on the problems of data compression and encryption collectively, without either negatively affecting the other. Towards this end, an efficient, secure data compression technique is introduced that provides cryptographic capabilities by combining adaptive Huffman coding with a pseudorandom keystream generator and an S-Box, bringing the confusion and diffusion properties of cryptography into the compression process and overcoming the performance issues of prior schemes. Thus, compression is carried out under a secret key such that the output is both encrypted and compressed in a single step. The proposed work demonstrated a congruent fit for real-time implementation, providing robust encryption quality and acceptable compression capability. Experimental results show that the proposed technique is efficient and produces space savings (%) similar to standard techniques. The security analysis discloses that the proposed technique is highly sensitive to the secret key and plaintext. Moreover, the ciphertexts produced by the proposed technique successfully passed all NIST tests, confirming the randomness of the ciphertext at the 99% confidence level.


Introduction
The aim of securing data during storage or transmission is to increase the resistance level against various security attacks and to protect the data from accidental modifications and illegal or unauthorized access [1]. Plenty of efforts have been made in data security to overcome these challenges. Compression systems differ in how they process input data and produce their outputs. However, all compression systems are grounded on the same principle, that is, compacting input data by eliminating redundancy.
Hence, both data compression and encryption are essential, but their implementation is difficult and complicated, requiring extensive computational power and resources to process a large amount of data. Nevertheless, these two different processes can be employed sequentially on the same dataset. The sequential implementation of compression and encryption techniques can be classified into two approaches, i.e., the encryption-first and compression-first approaches. In the encryption-first approach, compression is embedded into encryption to achieve a higher level of data security. However, these techniques have an insufficient, relatively reduced compression ratio and performance as compared to standard compression techniques. Compression works by reducing the redundancy present in the input data, and encryption likewise removes the redundancy found in it. However, encryption does not, in general, reduce the size of the input data but introduces significant randomness into it, thus rendering a subsequent compression process futile. In the compression-first approach, encryption is embedded in compression, but this suffers from severe processing and security limitations. It needs the output of the compression operation to be piped into the encryption operation, which makes this approach more complicated and time-consuming.
This work focuses on the problems of data compression and encryption collectively so that compression is carried out according to a secret key such that the output will be both encrypted and compressed in a single step. Towards this end, this work proposes an efficient, secure data compression technique by introducing cryptographic capabilities in adaptive Huffman coding using chaotic pseudorandom keystream generator and chaotic S-Box. The proposed technique intends to achieve confusion and diffusion properties of cryptography into the compression process of adaptive Huffman coding and overcome the performance issues of the prior techniques.
The paper is organized as follows. Section 2 discusses the related work. Section 3 introduces some preliminary components and work principles related to the proposed work. The proposed technique and its various parts are presented in Section 4. Section 5 analyses the compression and security of the proposed technique. Finally, we provide concluding remarks in Section 6.

Related work
Next-generation data storage and network systems are proliferating in unique manners due to the spread of diverse and ubiquitous connectivity across mobile systems, networks, and sensors. Moreover, data is growing to unprecedented scales of terabytes, petabytes, or even exabytes, while users expect uninterrupted access to it at full strength and intensity. The critical issue now is how to ensure efficient and reliable usage of resources while preserving data storage and transmission security. In such a situation, data encryption and compression techniques are often applied together to perform secure data compression. In this direction, many chaotic systems have been applied to achieve secure data compression because of their operational efficiency [9,10]. Compared with the conventional sequential implementation, simultaneous data compression and encryption aims to provide data security while also minimizing data redundancy. Notably, most of the existing research works focus on integrating encryption into conventional data compression techniques [11][12][13], since encryption and compression bear a certain resemblance and share a common objective in the sense of secrecy: compression, too, removes redundant data. Furthermore, it is easy to turn a compression algorithm into an encryption algorithm by introducing secret key control [12,14].
Secure data compression techniques based on HC can be found in the literature [11][12][13]. In [15], a unified data compression and encryption technique is introduced that swaps Huffman tree (HT) branches, left and right, by a secret key. Furthermore, multiple mutated trees were generated to introduce confusion and diffusion into the compression process. However, all the mutated trees generated by this approach were fixed in size. Thus, a new optimized multiple-HTs technique [13] was suggested to overcome the fixed-size mutated trees issue by using statistical-model-based compression to generate the mutated tables. Afterward, an enhanced technique over [15] was proposed in [11] using chaotically mutated Huffman trees. It overcame the multiple-codeword issues of [15] by increasing the keyspace. The idea of this technique was similar to Huffman tree mutation, but it controls the tree updates by a chaotic pseudorandom keystream. However, this technique suffers from a known-plaintext attack [12]. Further, an improved technique over [11] was presented in [12] by incorporating two chaotic functions to increase the resistance level against known-plaintext attacks. Still, the length of the codewords attained by the statistical model remained unchanged, and the resulting system was found vulnerable to known-plaintext attacks [14].
Nevertheless, secure data compression techniques based on HC still possess severe security vulnerabilities [16]. These techniques implement the concept of tree mutation to achieve data compression and encryption at the same time, where data encryption is achieved by incorporating a secret key mechanism to control and randomize the Huffman tree process. Therefore, the integration of encryption features into HC has been primarily limited to the control of Huffman tree branches using the secret key. A further discussion of the most recently used secure data compression techniques based on HC and chaos, with their strengths and weaknesses, is presented in Table 1.

Preliminaries
Chaos theory is well known for its extreme sensitivity to initial conditions and parametric control values, which makes the entire system unpredictable and random [17]. The nonlinear dynamical systems of chaos theory can be employed for cryptographic purposes when they are successfully implemented in finite-precision computing so as to still achieve confusion and diffusion [18]. Thus, a straightforward and common application of chaos is the design of a pseudorandom keystream generator [19]. The chaoticity of one-dimensional chaotic maps can be tested easily for the desired properties, and such maps are accessible to rigorous mathematical analysis and experiments such as the Lyapunov exponent [20] and bifurcation analysis [21]. This work employs the chaotic Logistic map to generate the pseudorandom keystream due to its complex chaotic behavior.

Chaotic logistic map
The Chaotic Logistic Map (CLM) is a popular one-dimensional chaotic map that exhibits effective sensitivity to its initial condition and parametric control value [22]. CLM shows complex chaotic behavior and satisfies various cryptographic basics such as unpredictability and randomness [18,22]. Equation 1 defines the CLM:

x_{n+1} = λ x_n (1 − x_n)    (1)

where 0 < x_n ≤ 1; n = 0, 1, 2, ...; λ is a parametric control value in the range 0 < λ ≤ 4; and x_0 is the initial condition. The rigorous bifurcation and Lyapunov exponent results are presented in Figs. 1 and 2, respectively, and confirm the chaotic behavior of the CLM beyond the parametric control value λ = 3.57, where the orbits {x_n}, n = 0, 1, 2, ..., are distributed in a uniform manner between 0 and 1. The design proposed in this work employs a value of λ beyond 3.57 to generate a pseudorandom keystream for stealth key control to shuffle or randomize the symbol codes during the Adaptive Huffman Tree (AHT) generation process.
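The CLM recurrence and its sensitivity to the initial condition can be illustrated with a short sketch; λ = 3.99 and x_0 = 0.30896 are our illustrative choices (the latter borrowed from the key sensitivity experiments later in the paper), and any λ in the chaotic region behaves similarly:

```python
def clm(x0, lam, n):
    """Iterate the chaotic logistic map x_{n+1} = lam * x_n * (1 - x_n)."""
    x = x0
    orbit = []
    for _ in range(n):
        x = lam * x * (1 - x)
        orbit.append(x)
    return orbit

# Two orbits whose initial conditions differ by only 1e-10 diverge within a
# few dozen iterations, which is the sensitivity the key control relies on.
a = clm(0.30896, 3.99, 50)
b = clm(0.30896 + 1e-10, 3.99, 50)
```

The rapid divergence of the two orbits is the practical face of the positive Lyapunov exponent discussed above.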

Pseudorandom Keystream Generator based on CLM
The CLM was carefully implemented for generating the pseudorandom keystream for key control in the proposed work. CLM requires two input parameters, an initial value x_0 and a parametric control value λ, to iterate and obtain the next value x_n. These two inputs x_0 and λ together are considered as the secret key. Each x_n value is then compared against a threshold t to produce one keystream bit. The threshold is set to t = 0.5 (according to the uniform probability model) as per (2):

k_n = 1 if x_n ≥ t; k_n = 0 if x_n < t    (2)

The pseudorandom keystream obtained from the real values x_n of the CLM by comparison with the threshold is presented in Fig. 3.

Table 1 A discussion about some recently used approaches based on HC

Technique: Design of integrated multimedia compression and encryption systems [15].
Strengths: 1) It provides a high compression ratio. 2) It swaps HT branches, left and right, using the control key.
Weaknesses: 1) Poor processing speed. 2) All the mutated trees generated by this approach were fixed in size.

Technique: Joint compression and encryption using chaotically mutated Huffman trees [11].
Strengths: 1) It overcomes the multiple-codeword issues and enlarges the key space of [15]. 2) It controls the tree updates using the PRKG.
Weaknesses: 1) The length of the codewords obtained by the statistical model remained unchanged. 2) The resulting system was vulnerable to known-plaintext attack.

Technique: A chaos-based joint compression and encryption scheme using a mutated adaptive Huffman tree [13].
Strengths: 1) Two chaotic functions were adopted to avoid known-plaintext attacks.
Weaknesses: 1) It offers very poor compression and decompression efficiency. 2) Weak security.

The process of generating the pseudorandom keystream by CLM is described as follows:

Inputs and outputs:
The PRKG based on CLM requires two inputs: the secret key K and the length L of the pseudorandom keystream, where the inputs x_0 and λ together constitute the secret key K. KS is the output pseudorandom keystream.
Step 1. Initialize the counter value i = 1 and the output pseudorandom keystream KS to empty.
Step 2. Calculate the next value by CLM as: x_1 = λx_0(1 − x_0).
Step 3. If the obtained x_1 value is greater than or equal to 0.5, then go to Step 4; else go to Step 5.
Step 4. Concatenate 1 with KS as KS = KS ∥ 1 (where the symbol "∥" represents the concatenation operation) and go to Step 6.
Step 5. Concatenate 0 with KS as KS = KS ∥ 0.
Step 6. Set x_0 = x_1 (to keep the x_1 value for the next iteration of CLM in Step 2).
Step 7. Increment the counter value as i = i + 1.
Step 8. Check: if i > L, then go to Step 9; else go to Step 2.
Step 9. End of the pseudorandom keystream generation process.
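The nine steps above reduce to thresholding successive CLM iterates at t = 0.5; a minimal Python sketch (the function name and the illustrative key values are ours):

```python
def clm_prkg(x0, lam, length):
    """Steps 1-9: generate `length` keystream bits by iterating the CLM
    and thresholding each iterate at t = 0.5."""
    bits = []
    x = x0
    for _ in range(length):
        x = lam * x * (1 - x)               # Step 2: next CLM value
        bits.append(1 if x >= 0.5 else 0)   # Steps 3-5: compare with t = 0.5
    return bits                             # Step 9: keystream KS

ks = clm_prkg(x0=0.30896, lam=3.99, length=64)
```

The same key always reproduces the same keystream, while a different key produces an unrelated one, which is exactly what the key control requires.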

Chaotic S-Box
A substitution box (S-Box) plays a central role in many conventional symmetric-key techniques in order to prevent differential and linear cryptanalyses [14]. Currently, these two are among the most effective cryptanalysis attacks [23]. Nevertheless, the major issue with conventional S-Boxes is their static behavior: they are utilized as fixed-size lookup tables without any secret key control. Thus, chaotic systems have been employed to produce dynamic, key-controlled S-Boxes instead of fixed ones [24,25]. This work uses an efficient method proposed by Usama et al. [25] for constructing S-Boxes to perform data substitution and introduce confusion and diffusion into the proposed work without compromising compression capabilities. The method requires a 106-bit secret key K as input to generate the dynamic S-Box. Table 2 presents the output S-Box constructed using the Usama et al. method [25], where the inputs x_0 = 0.85 and λ = 0.99 serve as the secret key K.
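The exact construction of [25] is not reproduced here; the sketch below only illustrates the general recipe behind key-controlled chaotic S-Boxes — iterate a chaotic map from the secret key (x_0, λ) and turn the ranking of the iterates into a byte permutation. The use of the Sine map and the ranking step are our assumptions, not the published algorithm:

```python
import math

def chaotic_sbox(x0, lam):
    """Build a bijective 8-bit S-Box: iterate the Sine map
    x_{n+1} = lam * sin(pi * x_n) and use the rank of each of the
    256 iterates in sorted order as its substitution value."""
    x = x0
    samples = []
    for _ in range(256):
        x = lam * math.sin(math.pi * x)
        samples.append(x)
    order = sorted(range(256), key=lambda i: samples[i])
    sbox = [0] * 256
    for rank, idx in enumerate(order):
        sbox[idx] = rank
    return sbox

sbox = chaotic_sbox(x0=0.85, lam=0.99)   # key values quoted for Table 2
inv_sbox = [0] * 256
for i, v in enumerate(sbox):
    inv_sbox[v] = i                      # inverse S-Box for decompression
```

Because every index receives a distinct rank, the result is always a bijection, so the inverse table needed on the decompression side is well defined.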

The proposed work
This work focuses on incorporating recent results of chaos theory, which has proven to exhibit strong cryptographic properties, into Adaptive Huffman Coding (AHC) to overcome the security and performance issues of the prior techniques. As per the literature review, the chaotic behavior of the CLM is favorable for achieving the confusion and diffusion properties of cryptography; on this basis, an efficient secure data compression (ESDC) technique is introduced, such that the output is secured and compressed simultaneously (Fig. 4). HC is a well-known data compression technique. It consists of two main parts, i.e., the statistical model, known as the Huffman tree, and the compression engine. It utilizes the Huffman tree to code the data symbols by assigning shorter codewords to more frequent symbols. However, to accomplish the compression process, it requires scanning the input data twice. First, it needs to create a statistical model from the input data symbols to construct the Huffman tree. Second, the compression engine assigns shorter codewords to input data symbols using the constructed Huffman tree. An improved version of HC that does not require scanning the data twice, known as AHC, was presented in [26].
AHC also consists of two main parts, i.e., the statistical model and the compression engine. However, it mutates the statistical model, known as the adaptive Huffman tree, and assigns the shorter codewords at the same time. Thus, it needs to scan the input data only once to complete the compression process. The AH tree (AHT) plays an essential role in the AHC coding process in achieving high compression efficiency (when the AHT matches the exact statistical characteristics of the data). More importantly, these two separate parts offer three advantages that are used as design principles in implementing the proposed work:
1. AHC allows shuffling or randomizing the probabilities of the symbols during the AHT generation process, which leads to the achievement of Shannon's suggested fundamental properties of confusion and diffusion.
2. AHC allows symbol substitution or changes to the order of the symbols before updating the AHT and encoding the data, which provides significant flexibility to introduce complexities into the compression process without compromising compression capabilities.
3. AHC allows for integrating different cryptographic processes; for example, it enables masking the coding output with a pseudorandom keystream, which can enhance encryption quality.
Thus, this work carries out the following three operations in AHC, exploiting the three advantages above, respectively, so that the output is secured and compressed simultaneously with key control:
1. The proposed work employs CLM to introduce key control into the compression and decompression processes of AHC, named secure adaptive Huffman coding (see Section 4.1).
2. The proposed work incorporates a chaotic S-Box construction method [25] to perform data substitution without compromising compression capabilities.
3. The proposed work implements a masking pseudorandom keystream using the PRKG based on CLM (presented in Section 3.2) that enhances encryption quality.
The component-level block diagram of the proposed ESDC technique, with the sequence used to perform secure data compression and decompression, is presented in Fig. 5. It consists of three main components: the chaotic S-Box, secure adaptive Huffman coding, and the PRKG. The description of each component is provided in the following sections. The process of the ESDC technique for performing secure data compression using the chaotic S-Box, secure adaptive Huffman coding, and the PRKG is described as follows: Inputs and outputs: The secure data compression is performed by introducing three secret keys K1, K2, and K3 (each key is a combination of the initial value x_0 and control parameter λ for its respective chaotic map). K1 is used by the chaotic S-Box [25] to construct the dynamic S-Box, while K2 and K3 are used to generate the pseudorandom keystreams using CLM. Here, I is the input data, L is the length of the input data, and C is the output (compressed and encrypted data).
Step 1. Generate the initial AHT.
Step 2. Generate the chaotic S-Box using secret key K1 (see: Usama et al. S-Box construction method [25]).
Step 3. Initialize the counter value i = 1 and the output data D to empty.
Step 4. Read the input symbol I_i from the input data I.
Step 5. Substitute the input symbol I_i using the chaotic S-Box; the output symbol is s_i.
Step 6. Encode s_i by the secure adaptive Huffman coding method using key K2 (see Section 4.1.1); the output code is concatenated with D.
Step 7. Increment the counter value as i = i + 1.
Step 8. Check: if i > L, then go to Step 9; else go to Step 4.
Step 9. Generate a pseudorandom keystream KS from key K3 using the PRKG based on CLM (see Section 3.2) and mask the completed output data D. A simple XOR operation is used to mask D, which enhances the overall randomness. This step produces the final ciphertext C as C = D ⊕ KS.
Step 10. End of secure data compression.
The secure data decompression is like the secure data compression technique; it just reverses the order of the secure data compression operations. Firstly, it decrypts the ciphertext using the PRKG based on CLM with secret key K3, then decodes the data using the secure adaptive Huffman decoder with secret key K2, and then performs inverse substitution using the chaotic S-Box by the Usama et al. method [25] with secret key K1. Finally, it produces the original plaintext data. The process of secure data decompression is described as follows: Inputs and outputs: The secure data decompression is performed with the same three secret keys K1, K2, and K3 (each key is a combination of the initial value x_0 and control parameter λ for its respective chaotic map) as used in secure data compression, where C is the input ciphertext data and P represents the output plaintext data.
Step 1. Generate the keystream using the PRKG based on CLM with K3 (see Section 3.2). It produces the keystream KS, which is used to unmask the ciphertext C to get the compressed data D as D = C ⊕ KS.
Step 2. Generate the initial AHT.
Step 3. Generate the chaotic S-Box using secret key K1.
(See: Usama et al. S-Box construction method [25]) Step 4. Initialize the plaintext data P to empty.
Step 5. Read the compressed code c from compressed data D.
Step 6. Decode c by the secure adaptive Huffman coding method using key K2; the output code is s. Step 7. Inverse-substitute the output code s using the chaotic S-Box; the output symbol is I. Concatenate I with P as P = P ∥ I (where the symbol "∥" represents the concatenation operation).
Step 8. Check for the end of the file; if the file has ended, then go to Step 9; else go to Step 5.
Step 9. End of secure data decompression.
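The symmetry between compression Step 9 and decompression Step 1 rests on XOR being its own inverse: masking twice with the same CLM-derived keystream restores the data. A simplified byte-wise sketch (helper names are ours, and packing the bit keystream into bytes is our simplification):

```python
def clm_keystream_bytes(x0, lam, nbytes):
    """Pack 8 threshold bits of the logistic map into each keystream byte."""
    x, out = x0, []
    for _ in range(nbytes):
        byte = 0
        for _ in range(8):
            x = lam * x * (1 - x)
            byte = (byte << 1) | (1 if x >= 0.5 else 0)
        out.append(byte)
    return bytes(out)

def mask(data, x0, lam):
    """C = D XOR KS; applying it again with the same key unmasks."""
    ks = clm_keystream_bytes(x0, lam, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

compressed = b"example compressed output D"
cipher = mask(compressed, 0.30896, 3.99)   # compression Step 9
plain = mask(cipher, 0.30896, 3.99)        # decompression Step 1
```

Because the mask is keyed by (x_0, λ), only a holder of K3 can regenerate KS and recover D.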

Secure adaptive Huffman coding
As mentioned earlier, AHC mutates the AHT and assigns the shorter codewords to produce the compressed output at the same time. The AHT is a binary tree that has the shortest weighted path length. The node structure contains a node number, a symbol, and a weight, as shown in Fig. 4. Thus, each node in the AHT must have a unique number and symbol along with its weight.
The AHT generation process starts with a "RAW" node and keeps updating the AHT after encoding or decoding each input symbol, according to the data statistics [26]. The details of the data encoding and decoding processes are provided in the following sections. The process of updating the AHT is described as follows: Inputs and outputs: The AHT updating process requires a symbol as input and outputs the updated AHT.
Step 1. Check, if the input symbol is included in AHT, then go to Step 10 else go to Step 2.
Step 2. Generate a new node for the input symbol with weight 1.
Step 3. Locate the old "RAW" node in the AHT.
Step 4. Add the new symbol node and a new "RAW" node as children of the old "RAW" node. Continue with this node.
Step 5. Increase the node weight and go to Step 10.
Step 6. Continue with the parent node.
Step 7. Check for the most weighted node; if it is the highest weighted node, then go to Step 9 else go to Step 8. Step 8. Swap the current node with the most weighted node.
Step 9. Increase the node weight.
Step 10. Check for the root node; if it is the root node, then go to Step 11 else go to Step 6.
Step 11. End of the AHT update process.
The proposed work incorporates a stealth key control to shuffle or randomize the symbol codes during the AHT generation process. As per the literature review, the chaotic behavior of the CLM favors achieving the confusion and diffusion properties of cryptography. To randomize the compression process, two rules are incorporated into the AHT update process through the pseudorandom keystream generated using CLM. CLM is an iterative map that requires two input values, x_0 and λ, to obtain the x_n values from Eq. 1.
The obtained x_n value is used to apply one of the following rules: 1. If the obtained x_n value is greater than or equal to 0.5, then add the new symbol node on the right and the "RAW" node on the left. 2. If the obtained x_n value is less than 0.5, then add the new symbol node on the left and the "RAW" node on the right.
Here, the uniform probability model is adapted to assign an equal probability to each rule. Moreover, the input parameters x 0 and λ of the CLM are used as a secret key. The process of updating AHT using CLM is described as follows: Inputs and outputs: The process of updating AHT using CLM is performed by introducing input secret key K (which is a combination of the initial value x 0 and control parameter λ for CLM). It requires a symbol as an input and outputs updated AHT.
Step 1. Check: if the input symbol is already included in the AHT, then go to Step 14; else go to Step 2.
Step 2. Generate a new node for the input symbol with weight 1.
Step 3. Locate the old "RAW" node in the AHT.
Step 4. Calculate the new value by CLM as: x_1 = λx_0(1 − x_0). As mentioned earlier, the input parameters x_0 and λ of the CLM are used as a secret key to obtain the new x_1.
Step 5. If the obtained x_1 value is greater than or equal to 0.5, then go to Step 6; else go to Step 7.
Step 6. According to rule 1, add the new symbol node on the right and a new "RAW" node on the left as children of the old "RAW" node. Continue with this node and go to Step 8.
Step 7. According to rule 2, add the new symbol node on the left and a new "RAW" node on the right as children of the old "RAW" node. Continue with this node.
Step 8. Set x 0 = x 1 (to keep x 1 value for the next iteration of CLM in Step 4).
Step 9. Increase the node weight and go to Step 14.
Step 10. Continue with the parent node.
Step 11. Check for the most weighted node; if it is the highest weighted node, then go to Step 13 else go to Step 12.
Step 12. Swap the current node with the most weighted node.
Step 13. Increase the node weight.
Step 14. Check for the root node; if it is root node, then go to Step 15 else go to Step 10.
Step 15. End of the AHT update process.
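The key-controlled part of the update (Steps 4–7) can be isolated into a small sketch: each new symbol consumes one CLM iterate, whose threshold test decides whether the symbol leaf is attached to the right (rule 1) or left (rule 2) of the fresh "RAW" node. The dictionary node model is our simplification:

```python
def place_new_symbol(symbol, x0, lam):
    """Steps 4-7: split the old "RAW" leaf into an internal node whose
    child order is decided by one iterate of the CLM."""
    x = lam * x0 * (1 - x0)                  # Step 4: next CLM value
    leaf = {"symbol": symbol, "weight": 1}
    new_raw = {"symbol": "RAW", "weight": 0}
    if x >= 0.5:                             # Step 5: threshold test
        node = {"left": new_raw, "right": leaf, "weight": 1}   # rule 1
    else:
        node = {"left": leaf, "right": new_raw, "weight": 1}   # rule 2
    return node, x                           # x becomes x0 for the next symbol

node, x1 = place_new_symbol("a", 0.30896, 3.99)   # x1 ~ 0.852, so rule 1
```

Since each placement consumes one iterate, an attacker without (x_0, λ) cannot predict the left/right structure of the tree, which is precisely where the codeword randomization comes from.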

Note:
In the above-mentioned AHT updating steps, the AHC and AHT implementation details are not included. This work focuses only on the differences and modifications in the AHC technique for secure data compression. Implementation details of AHC and AHT can be found in [26].

Encoder
The data encoding process requires a symbol as input to produce the output compressed code using the AHT [26]. At the end, it updates the AHT for the input symbol [26]. The process of data encoding is described as follows: Inputs and outputs: The encoder requires a symbol as input and outputs the compressed code.
Step 1. Check: if the input symbol is new (not yet in the AHT), then go to Step 2; else go to Step 3.
Step 2. Output the code for "RAW", followed by the fixed code of the input symbol, and go to Step 4.
Step 3. Output the code of the existing symbol from the AHT.
Step 4. Update the AHT for input symbol using the method described in the secure adaptive Huffman coding section.

Decoder
Like the data encoding process, the decoder takes the compressed code as input to decode the original symbol using the AHT [26]. The decoding process also requires updating the AHT after decoding the original symbol, according to the data statistics [26]. The process of data decoding is described as follows: Inputs and outputs: The decoder requires the compressed code as input and outputs the decoded symbol.
Step 1. Decode the code to get a symbol using the AHT.
Step 2. Check: if the decoded code is "RAW", then go to Step 3; else go to Step 4.
Step 3. Use the fixed code to decode and obtain the symbol.
Step 4. Update the AHT for the decoded symbol using the method described in the secure adaptive Huffman coding section.

Experiment analysis
The experimental analysis begins by defining the experiment setup for benchmarking the proposed technique. The required input files, software, and the description of input data generation are also described. The performance of the proposed technique was evaluated to justify its superiority over similar existing techniques. Also, the strength of the proposed technique, as well as of its various parts, is analyzed with respect to the related parameters per the definitions in this paper. The proposed technique was compared with the Chaotically Mutated Adaptive Huffman Tree (CMAHT) [12], Chaotic Huffman Tree (CHT) [11], and Simultaneous Arithmetic Coding and Encryption (SACE) [10] techniques. All techniques were implemented in the Java language and benchmarked using the Calgary Corpus files [27].

Compression efficiency analysis
The primary concern of any data compression technique is to reduce storage space. The ability of a compression technique to reduce the data size can be measured by determining its compression efficiency in terms of space-saving (%) capability [28,29]. Literature studies indicate that a compression technique that produces a higher space-saving (%) is supposed to be robust and efficient for saving disk space and reducing transmission overheads [28,29]. This section discusses space-saving (%) to evaluate and show the compression efficiency of the proposed technique in comparison to well-known compression techniques and prior techniques for performing the joint operation of compression and encryption. The space-saving (%) produced by the proposed technique and the other existing techniques was calculated using Eq. 3 [29]:

Space Saving (%) = (1 − Output data size / Input data size) × 100    (3)

The comparison results are presented in Table 3. The results demonstrate that the proposed technique produced space-saving (%) similar to the AC and AHC techniques, with slight variations. Thus, the space-saving (%) results show that the proposed technique has acceptable compression efficiency while also providing adequate security. The detailed security analysis is presented in the security analysis section.
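As a quick worked example of the space-saving measure (the file sizes are illustrative):

```python
def space_saving(input_size, output_size):
    """Space saving (%): the fraction of the input size that was eliminated."""
    return (1 - output_size / input_size) * 100

saving = space_saving(1000, 600)   # a 1000-byte input compressed to 600 bytes
```

A 1000-byte input compressed to 600 bytes yields a 40% space saving; an output equal to the input yields 0%, and an output larger than the input yields a negative value (expansion).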

Processing time analysis
This section presents the processing efficiency offered by the proposed and various existing techniques. In any data storage and communication system, processing time is an important factor, as the entire data needs to be stored or transmitted for processing within a specified time frame [1]. Thus, it is essential to measure the processing time of the proposed technique to justify its performance efficiency. Equation 4 is used to calculate the processing time:

Processing time = Process end time − Process start time    (4)

The performance of the proposed technique was compared with the corresponding current simultaneous data compression and security techniques. Tables 4 and 5 demonstrate a processing time comparison between the proposed technique and the CHT, CMAHT, and SACE techniques running on the standard Calgary Corpus input files. The results showed that the proposed technique was the fastest at performing data compression and encryption simultaneously, compared to the corresponding existing techniques.
The Calgary Corpus test files were also compressed and encrypted by the standard AC and AHC combined with AES. Tables 6 and 7 give a processing time comparison between the proposed technique and AC and AHC with AES. These results showed that both well-known compression techniques, AC and AHC, were the fastest when performing only data compression without any encryption capabilities, while the proposed technique requires less processing time than the two separate operations performed in sequence. In a nutshell, the experimental results show that AC and AHC had the fastest compression speeds, while the proposed technique is in third place. Apart from the well-known coding techniques, the proposed technique offered better time efficiency than the other existing simultaneous compression and encryption techniques. Hence, it can be claimed that the proposed technique is efficient compared to the corresponding related techniques at performing simultaneous compression and encryption while reducing space and time overheads.
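Eq. 4 corresponds to ordinary wall-clock measurement around the operation under test; a minimal sketch (the XOR loop is a stand-in workload, not the proposed technique):

```python
import time

start = time.perf_counter()              # process start time
data = bytes(range(256)) * 1000          # stand-in input buffer
work = bytes(b ^ 0xA5 for b in data)     # stand-in for compress-and-encrypt
elapsed = time.perf_counter() - start    # Eq. 4: end time minus start time
```

`time.perf_counter()` is preferred over `time.time()` for such measurements because it is monotonic and has the highest available resolution.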

Security analysis
Along with an adequately sized key, a cryptographic technique should produce an output that is indistinguishable from random for any input data, to prevent an attacker from discovering the input data statistics and to minimize any relationship within the cipher output. Furthermore, the cipher output should be highly sensitive to and dependent upon the key and plaintext, such that if a single bit is altered within the key or plaintext, the cryptographic technique produces an entirely different output. The proposed technique was designed to achieve high security while reducing data storage and transmission overheads. The security of the proposed technique and its various parts is evaluated in the following sections through several assessment parameters.

Key space analysis
Cryptographic techniques should have a sufficiently large keyspace to prevent the attacker from decoding the ciphertext or revealing the key in any reasonable time [17]. Hence, this section discusses the keyspace of the proposed technique. The proposed technique utilizes CLM for two purposes: first, to generate pseudorandom keystreams and, second, to randomize the compression process in the AHT update process. The CLM is an iterative chaotic map that starts from two input parameters, an initial value x_0 and a control parameter λ, and obtains the x_n values from Eq. 1. The obtained x_n values from CLM are then converted to generate pseudorandom keystreams, with the input parameters x_0 and λ used as a secret key. As mentioned earlier, this work employs an efficient method proposed by Usama et al. [25] for constructing S-Boxes to perform data substitution. This method is based on the mixing property of the chaotic Sine map, where the initial value x_0 and control parameter λ are used as the secret key K. Thus, the proposed technique requires six input parameters. If they are represented in a finite-precision system, the keyspace is ∼318 bits. According to [17], the keyspace of the proposed technique is sufficient to resist brute-force attacks and satisfy cryptographic requirements.
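The ∼318-bit figure is consistent with representing each of the six key parameters as an IEEE-754 double-precision value, whose significand carries 53 bits; this accounting is our reading rather than something stated explicitly above:

```python
BITS_PER_PARAM = 53    # IEEE-754 double-precision significand width
NUM_PARAMS = 6         # (x0, lambda) for each of the keys K1, K2, K3
keyspace_bits = NUM_PARAMS * BITS_PER_PARAM   # 6 * 53 = 318
```

A 2^318 keyspace is far beyond the 2^128 commonly cited as the brute-force security floor.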

Key and plaintext sensitivity
Along with a sufficient keyspace, the cryptographic technique should produce random output from any input data, to prevent an attacker from discovering the input data statistics and to minimize the relationship within the cipher output [17,30]. Moreover, the cipher output should be highly sensitive to and dependent upon the key and plaintext, such that if even a single bit is altered within the key or plaintext, the cryptographic technique produces an entirely different output every time [18]. The key sensitivity is the bit-change percentage of the ciphertexts obtained after performing the data encryption process using slightly different keys. To assess key sensitivity, experiments were performed by changing x0 = 0.30896 to x0 = 0.30897 and comparing the ciphertexts obtained from the corresponding Calgary corpus files. To resist cryptanalysis attacks, the bit-change percentage of any cryptographic algorithm should be close to 50% [31]. Table 8 provides the experimental results for the key and plaintext sensitivity trials of the proposed technique. The results show that the bit-change percentage for both analyses was very close to the ideal value of 50%. This confirms that the proposed technique is highly sensitive to the input key and plaintext.
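The bit-change percentage used in this analysis can be computed as below. This is a generic sketch of the standard metric (the function name and quantization are our own, not from the paper): XOR the two ciphertexts, count the differing bits, and divide by the total number of bits:

```python
# Generic helper for the sensitivity metric: percentage of bits that differ
# between two equal-length ciphertexts (ideal value for a strong cipher: ~50%).

def bit_change_percentage(c1: bytes, c2: bytes) -> float:
    """Return the percentage of differing bits between c1 and c2."""
    assert len(c1) == len(c2), "ciphertexts must have equal length"
    diff = sum(bin(a ^ b).count("1") for a, b in zip(c1, c2))
    return 100.0 * diff / (8 * len(c1))
```

For example, comparing a byte string with its bitwise complement yields 100%, while comparing identical strings yields 0%; a well-designed cipher driven by two keys differing in one bit should land near 50%.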

Randomness analysis of the proposed technique
The NIST SP800-22 [32] test suite is the most common tool for randomness analysis. This study applied the NIST suite to perform the randomness analysis of the proposed technique. The suite includes fifteen statistical tests, each of which yields a p-value between 0 and 1. This p-value is then used to evaluate the randomness of the ciphertext by defining a significance level α: when p < α, the ciphertext is considered non-random; otherwise, it is considered random. In this work, the significance level α is set to 0.01 to confirm a 99% confidence level for the randomness analysis of the proposed technique. Table 9 lists all computed p-values for all tests with the default input parameter settings defined in [32].
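The per-test decision rule described above is simple enough to state in code. The sketch below is only an illustration of the p-value threshold logic (the function name is ours); it does not implement the NIST statistical tests themselves, which produce the p-values:

```python
# Pass/fail decision applied to each NIST SP800-22 p-value: with significance
# level alpha = 0.01, a test is passed (sequence judged random for that test)
# when p >= alpha, corresponding to a 99% confidence level.

ALPHA = 0.01

def is_random(p_value: float, alpha: float = ALPHA) -> bool:
    """Return True if the p-value does not reject the randomness hypothesis."""
    return p_value >= alpha
```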

Conclusion
This work concentrated on the problems of data compression and encryption collectively, without negatively affecting each other. An efficient, secure data compression technique was introduced, which provides cryptographic capabilities in combination with adaptive Huffman coding, a pseudorandom keystream generator, and an S-Box. The proposed work carries out three operations in adaptive Huffman coding so that the output is secured and compressed simultaneously under key control. 1) A chaotic Logistic map was employed to introduce key control into the compression and decompression processes of adaptive Huffman coding, termed secure adaptive Huffman coding.
2) A chaotic S-Box was incorporated to perform data substitution without compromising compression capabilities. 3) Furthermore, a masking pseudorandom keystream based on the chaotic Logistic map enhanced the encryption quality. Thus, data compression is carried out according to a secret key such that the output is both encrypted and compressed in a single step. Experimental results showed that the proposed technique achieved faster processing time than performing encryption and compression as separate steps. The proposed algorithm decisively achieved secure data compression, making it useful for real-time implementation by reducing data space and transmission consumption. Security analysis also revealed that the proposed work is highly sensitive to both key and plaintext, and that the generated ciphertexts successfully passed all NIST tests, confirming the randomness of the ciphertext at the 99% confidence level. Compression efficiency analysis demonstrated that the proposed technique produced similar space-saving (%) to standard techniques with slight variations, while also providing adequate security.
Funding This publication was made possible by the NPRP award NPRP8-2158-1-423 from the Qatar National Research Fund (a member of The Qatar Foundation). The statements made herein are solely the responsibility of the authors. Open access funding provided by the Qatar National Library.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.