1 Introduction

The trend of transmitting digital information over the Internet is growing exponentially. While this increases convenience and accessibility, extra challenges also increase with every development and improvement in technology. One of the inevitable issues is providing adequate security for data transmission that utilizes insecure communication networks. The number of connected users grows every day, as does their diverse Internet activity. As a result, the numbers and types of potential cybersecurity assaults have increased as well. This creates further challenges because data is an organization's most essential asset in today's world. Protecting sensitive data from unauthorized access has become a critical priority because attackers may use open public Internet for exploitative or malicious purposes. To avoid such attacks, sensitive data requires modification into cipherable forms before being transmitted via unencrypted channels. Confidential information requires a speedy, reliable, and robust cryptosystem to prevent information leakage.

Researchers have explored many approaches to protecting transmitted data. With recent developments in communication technology, many encryption algorithms have been designed to secure real-time communication, and cryptography plays an essential role in protecting sensitive information. A wide range of algorithms has been presented to this end, including the advanced encryption standard (AES), the data encryption standard (DES), elliptic curve cryptography (ECC), and so on. Many attempts have also been made to break specific algorithms based on AES and DES, and some of these have succeeded since 1993.

Despite these exceptions, cryptography is still one of the most effective methods for preserving sensitive data. With the growth of new Internet channels and technologies, cryptanalysis has become more sophisticated, and more robust and efficient image encryption techniques have become necessary for secure data communication. Cryptography uses advanced mathematical concepts to encode and transmit data in a format that can only be read and processed by authorized parties. Encryption, or the act of encoding a communication in a format that unauthorized users cannot read or understand, is a crucial part of cryptography. Encryption in its various forms has been used since Roman times and even earlier, but increasingly complex versions are needed to keep up with new needs. Plaintext can be encrypted into ciphertext and then sent over an insecure transmission medium. Depending on the security of the algorithm, the ciphertext cannot be read by an unauthorized person.

A variety of symmetric and asymmetric image cryptographic algorithms have also been developed. In symmetric key cryptography, for instance, both users (i.e., sender and receiver) use a single key for the process of ciphering and deciphering. By contrast, asymmetric key cryptography utilizes two keys, a public key and a secret one, at each point to achieve additional security. In this approach, the private key is always kept secure because it decrypts the information. In contrast, the public key is always made publicly available to everyone because it does not help us decrypt the secret information.

In addition, most modern encryption designs are based on chaotic systems. Symmetric key cryptographic algorithms are significant because they produce a strong key for these cryptosystems and are very cheap. The keys are considerably smaller for the degree of security they provide, and running these algorithms is relatively inexpensive. Chaotic maps have also garnered a great deal of attention over the past few decades as another means of protecting cryptographic algorithms. Chaotic cryptography can secure communication further in a shorter duration of time. Quantum image processing is another method that is also becoming more popular to ensure information confidentiality. However, there are multiple proposals for data encryption in the literature. Chaotic systems, quantum encryption, and substitution boxes (S-boxes) are all often used.

Several nonlinear methods have been proposed to combat cryptographic attacks. In the past, many image encryption schemes were based mainly on chaotic dynamical systems. The behavior of such dynamical systems is pseudorandom and hence suited to multimedia encryption. The output of a chaotic map is determined by its initial conditions; for this reason, chaos-based systems are known as deterministic systems. Their random-like behavior, sensitivity to initial conditions, and ergodicity are distinctive characteristics (Stallings 2006; Chuang et al. 2011; Al-Najjar 2012; Banthia and Tiwari 2013; Rivest 1990; Matthews 1989; Wheeler and Matthews 1991; Chen and Liao 2005; Masood et al. 2020a, 2021, 2020b; Ahmad et al. 2020; Hanouti et al. 2020; Butt et al. 2020; Munir et al. 2020). These characteristics lead to a reliable cryptosystem, while chaotic maps and dynamical systems help to generate long chaotic sequences. Here, even a small change in the initial conditions significantly alters the resulting chaotic sequence. These properties make chaotic systems some of the best choices for constructing secure cryptographic algorithms. By contrast, many cryptanalysis-based techniques have also been presented that expose weaknesses in existing cryptosystems (Munir et al. 2021a, 2021b; Hanouti et al. 2021a, 2021b).

DNA computing and its intrinsic properties have been used extensively in the field of cryptography. Massive parallelism, high computational capacity, and the ability to store large amounts of data are among these inherent properties. Research in this area often utilizes publicly accessible biological data to encrypt plaintext data in DNA computing applications. Adleman (Adleman 1994, 1998; Jiao and Goutte 2008) was the first to propose cryptographic DNA computing in 1994, initiating a new era of data processing that provides DNA-based encryption algorithms with tangible advantages over conventional cryptographic techniques. However, encrypting images with DNA encoding alone is inefficient. As a result, the underlying vulnerability problems are often solved using encryption techniques that combine DNA computing and chaotic sequences (Enayatifar et al. 2014; Naskar and Chaudhuri 2016; Hanouti and Fadili 2021). For example, Clelland et al. (1999) developed an innovative approach to protect secret communications using human genomic DNA. Meanwhile, Xiuli et al. (Chai et al. 2017) created a unique encryption method by combining chaotic maps and DNA sequences. A DNA-based matrix is created initially, and then a plaintext image is stored before a row-wise and column-wise circular permutation is applied. Yueping et al. (Li et al. 2017) have also offered a secure cryptographic technique. Their proposed cryptosystem uses high-dimensional chaotic maps to achieve robust security. It can withstand various chosen-plaintext and chosen-ciphertext attacks, and the scheme works rapidly and efficiently.

Many other researchers (Mondal and Mandal 2017) have also developed effective and lightweight encryption schemes that use DNA and chaotic approaches. Here, the unencrypted image undergoes confusion using random numbers obtained from a cross-linked chaotic logistic map. In one approach, the pixels are then distorted with the computational method of DNA. For instance, Chen et al. (2018) have presented a cryptosystem based on pixel permutation and distortion, which works on a self-adaptive process and is efficient due to its randomized but reusable variables. The last stage uses the DNA encoding method. The Rijndael cipher (Daemen and Rijmen 1998), the work of the well-known Belgian cryptographers Vincent Rijmen and Joan Daemen, was selected as the advanced encryption standard (AES) in October 2000. The AES S-box is often regarded as the highest benchmark in this field. The optimal nonlinearity value is 120, and the highest value obtained by the AES S-box is 112 (Rijndael).

Following this lead, many more S-boxes have been developed to provide even stronger alternatives. For example, S-boxes with cryptographic features, such as the AES S-box, can be employed. Our work proposes a novel method for designing a robust substitution box (S-box) with better cryptography features. This S-box helps to substitute original data into plain text while maintaining its entropy level. We used one-dimensional (1D) and two-dimensional (2D) chaotic maps and DNA sequencing to construct this S-box. The sequence generated is filtered to unique random elements of a 256 count. The entropy value of 8 approximates an ideal value that satisfies the complete randomness needed from our proposed S-box.

Following its construction, our new S-box is investigated using multiple randomness and performance analysis tests, whose results show that our constructed S-box is exceptional for implementing real-time communication. Today, most cryptographers work in advanced encryption standard (AES) because of its highly robust cryptographic algorithm. In modern cryptography, block encryption algorithms play an essential role in providing security, such as international data encryption standards (IDES) and advanced encryption standards (AES). Due to their prominent chaos features, S-boxes are a superior choice for designing cryptosystems. At the same time, several security tests of S-boxes support the proposed cryptosystem's strength against both differential and linear attacks (Sani et al. 2021; Azam et al. 2021; Qayyum et al. 2020; Zahid et al. 2021; Liu et al. 2021). Thus, S-boxes form one of the fundamental nonlinear components used to provide security for cryptographic schemes.

1.1 Contribution

The following are the key contributions of our research study:

  • Presenting an efficient cryptosystem that uses the combined effect of DNA and a chaotic dynamical system for the development of the initial S-box.

  • The system uses multiple stages that help to generate highly random sequences that exhibit minimum correlation.

  • The proposed system uses both substitution and permutation for an extra layer of security. Both substitution and permutation ensure higher image security.

  • Investigation of various existing state-of-the-art methods and comparing them with the proposed scheme.

  • The proposed scheme is investigated thoroughly using various tests, i.e., nonlinearity (NL), strict avalanche criterion (SAC), bit independence criterion (BIC), bit independence criterion strict avalanche criterion (BIC-SAC), bit independence criterion nonlinearity (BIC-NL), equiprobable input/output XOR distribution, and linear approximation probability.

2 Fundamental concepts

2.1 Arnold transformation (AT)

Shuffling the pixels of an initial image is one of the essential elements used to provide image security. There are various shuffling methods; however, the Arnold transformation (AT) is one of the most extensively used. The Arnold map was introduced in the 1960s by Vladimir Arnold, who illustrated it with an image of a cat (Arnold and Avez 1968); the map is described in Eq. 1:

$$ \left[ {\begin{array}{*{20}c} {x^{\prime}} \\ {y^{\prime}} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 1 & 1 \\ 1 & 2 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} x \\ y \\ \end{array} } \right]\text{mod } 1 $$
(1)

where \(x, y \in [0,1)\). The formula above is defined on the unit square, and the same matrix can be extended to image pixels, i.e., \(x, y \in \left\{ {0,1,2,\ldots, N - 1} \right\}\) for an \(N \times N\) image. As the image size grows, the range of the coordinates grows accordingly, and Eq. 1 can be rewritten as:

$$ \left[ {\begin{array}{*{20}c} {x^{\prime}} \\ {y^{\prime}} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 1 & 1 \\ 1 & 2 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} x \\ y \\ \end{array} } \right]\text{mod } N $$
(2)

An Arnold map (AM) applies a linear transformation to the pixel coordinates in order to change the pixel positions (Ye and Wong 2012). An AM can shuffle the pixels of an image of any size and is generalizable. The generalized Arnold map (AMg) is expressed in matrix notation, as demonstrated by Eq. 3:

$$ \left[ {\begin{array}{*{20}c} {x^{\prime}} \\ {y^{\prime}} \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 1 & a \\ b & {ab + 1} \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} x \\ y \\ \end{array} } \right]\text{mod } N $$
(3)

In this equation, \(a\) and \(b\) are the two control parameters that aid in changing the position of pixels \(x\) and \(y\), making new coordinates of pixels \(x^{\prime}\) and \(y^{\prime}\) in the shuffled image. The pixels of original image \((x,y)\) will then transform into shuffled pixels of \((x^{\prime},y^{\prime})\).
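The coordinate mapping of Eq. 3 can be illustrated with a minimal sketch. The function name `arnold_shuffle` and the small 5 × 5 demonstration array are assumptions made for illustration only; they are not part of the proposed scheme.

```python
import numpy as np

def arnold_shuffle(img, a=1, b=1, rounds=1):
    """Apply the generalized Arnold map of Eq. 3 to a square N x N array."""
    n = img.shape[0]
    if img.shape[1] != n:
        raise ValueError("the map is defined on square images")
    # Coordinate grids: x is the row index, y is the column index.
    x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    # (x', y') = [[1, a], [b, a*b + 1]] (x, y) mod N
    xn = (x + a * y) % n
    yn = (b * x + (a * b + 1) * y) % n
    out = img
    for _ in range(rounds):
        shuffled = np.empty_like(out)
        shuffled[xn, yn] = out[x, y]  # move pixel (x, y) to (x', y')
        out = shuffled
    return out

demo = np.arange(25).reshape(5, 5)
once = arnold_shuffle(demo, a=1, b=1, rounds=1)
print(once)
```

Because the matrix has determinant 1, the mapping is a bijection on the pixel grid, so every pixel is moved and none is lost or duplicated.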

The largest Lyapunov characteristic value of the map, which is the dominant eigenvalue of the matrix in Eq. 3, is calculated as shown by Eq. 4:

$$ \lambda = 1 + \frac{{ab + \sqrt {a^{2} b^{2} + 4ab} }}{2} > 1 $$
(4)

The map behaves chaotically if this value is greater than 1 (Ye 2011). This implies that if \(a > 0\) and \(b > 0\), the system is in a chaotic state.

The generalized Arnold map (AMg) is a discrete system that works through two effects, namely stretching and folding. These effects act on the phase space and help to create confusion in image encryption schemes. However, to obtain a thoroughly confused image, the confusion process must be repeated several times. As a result, utilizing AMg as part of an image encryption scheme takes a long time. Furthermore, a digital image's finite gray levels may cause the original image to reemerge after several rounds of confusion (Wang et al. 2010).

2.2 Logistic–May system

A logistic–May map (LOMAS) is a discrete-time 1D chaotic system (Nkandeu and Tiedeu 2019) defined by Eq. 5:

$$ y_{m + 1} = \left( y_{m} e^{(r^{\prime} + 9)(1 - y_{m})} - (r^{\prime} + 5) y_{m} (1 - y_{m}) \right) \text{mod } 1 $$
(5)

where \(y_{m} \in [0, 1]\) and \(r^{\prime} \in [0, 5]\). This modified system exhibits chaotic, random-like behavior.
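As a minimal sketch, Eq. 5 can be iterated as follows, reading the term after \(e\) as an exponent. The function name `lomas_sequence` and the seed and parameter values are illustrative assumptions, not values taken from the proposed scheme.

```python
import math

def lomas_sequence(y0, r, n):
    """Iterate Eq. 5 n times from y0 in [0, 1] with control parameter r' = r in [0, 5]."""
    seq = []
    y = y0
    for _ in range(n):
        y = (y * math.exp((r + 9) * (1 - y)) - (r + 5) * y * (1 - y)) % 1.0
        seq.append(y)
    return seq

# Two nearby seeds (hypothetical values) give different sequences.
first = lomas_sequence(0.3, 3.2, 50)
second = lomas_sequence(0.3 + 1e-10, 3.2, 50)
print(first[:3])
print(second[:3])
```

The map is deterministic, so the same seed and parameter always reproduce the same sequence, which is what allows a receiver holding the key to regenerate it.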

3 DNA system

This section will discuss gene expression, DNA basics—i.e., the four nucleotides—and their application in image encryption.

3.1 DNA and gene expression

Gene expression is the continuous process by which the genome receives and decrypts information that the living organism can utilize and process using a DNA code (Tefferi 2006). The central dogma of molecular biology governs gene expression. A DNA molecule enters the central dogma process and is ultimately synthesized into a polypeptide chain made up of many amino acids bonded together. Molecular biology has also demonstrated that proteins are retrieved using DNA (Hollenbach 2020). Transcription and translation are the two critical stages of the central dogma process (Cooper 1981). Transcription turns DNA into RNA, while polypeptide chains are obtained by converting RNA through translation. The central dogma process is depicted in Fig. 1.

Fig. 1 The process of central dogma

3.2 DNA composition

DNA is composed of four nucleic acid bases. The human genome is enormously long and sophisticated, comprising approximately 3.2 billion base-paired nucleotides. The four nucleotide bases (Watson and Crick 1953) are adenine (A), cytosine (C), thymine (T), and guanine (G). These bases form complementary pairs, i.e., like the binary ‘0’ and ‘1,’ they complement each other. Considering pairwise combinations, ‘00’ and ‘11’ are complementary binary pairs, as are ‘01’ and ‘10.’ Thus, it is easy to encode the binary numbers ‘00,’ ‘01,’ ‘10,’ ‘11’ using the four bases ‘A,’ ‘C,’ ‘G,’ ‘T.’ There are 4! = 24 possible encoding schemes, of which eight satisfy the complementary base pair principle, as shown in Table 1 (Watson and Crick 1993). DNA sequences have better encryption properties and meet all tests for constructed S-boxes, which in turn means these qualify for real-time communication.

Table 1 The relationship of four nucleotides with \(P_{(i,j)}\)

Consider a colored plain image \(P\) of size \(L \times W \times 3\) pixels. \(P\) is divided into three layers, i.e., red (R), green (G), and blue (B). \(P_{(i,j)}\) denotes the pixel at position \((i,j)\), where \(i = 1, 2, 3, \ldots, L\) and \(j = 1, 2, 3, \ldots, W\), and each pixel value lies between 0 and 255 for an eight-bit system. Thus, the DNA cryptography-based encoding scheme can be expressed as (Zhang 2018):

$$ P_{(i,j)} = b_{3} \cdot 4^{3} + b_{2} \cdot 4^{2} + b_{1} \cdot 4^{1} + b_{0} \cdot 4^{0} $$
(6)

Each digit \(b_{i}\), where \(i = 0, 1, 2, 3\), represents one of A, C, G, and T. As a result, each pixel may be assigned to a tetrad of four bases. The relationship of A, C, G, and T with \(P_{(i,j)}\) is depicted in Table 1.
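The base-4 expansion of Eq. 6 can be sketched as below. The default digit assignment (0 to A, 1 to C, 2 to G, 3 to T) is an assumption chosen for illustration; any of the 24 coding rules can be supplied through the hypothetical `bases` argument.

```python
BASES = "ACGT"  # assumed assignment: digit 0 -> A, 1 -> C, 2 -> G, 3 -> T

def pixel_to_dna(p, bases=BASES):
    """Write an 8-bit pixel value as four bases b3 b2 b1 b0 (Eq. 6)."""
    if not 0 <= p <= 255:
        raise ValueError("pixel value must be in 0..255")
    digits = [(p >> shift) & 3 for shift in (6, 4, 2, 0)]  # b3, b2, b1, b0
    return "".join(bases[d] for d in digits)

def dna_to_pixel(tetrad, bases=BASES):
    """Invert pixel_to_dna by evaluating the base-4 polynomial of Eq. 6."""
    value = 0
    for ch in tetrad:
        value = value * 4 + bases.index(ch)
    return value

print(pixel_to_dna(27))             # 27 = 0*64 + 1*16 + 2*4 + 3
print(pixel_to_dna(27, "AGCT"))     # same pixel under a different coding rule
print(dna_to_pixel("TTTT"))
```

Passing the rule as a string keeps the encoder independent of which of the 24 rules a scheme selects.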

Zhang (2018) established 24 similar connections between 0, 1, 2, 3 and A, T, C, G, which they explained as 24 types of principles based on DNA computing and coding. The result is shown in Table 2. Of these combinations, we utilized rule number 6, (0, 3, 2, 1), for A, T, C, G in our proposed scheme. This combination can be represented by \(A0,T3,C2,G1\). Zhang (2018) also presented various kinds of combinations and operations, such as DNA join operations and complement operations, for the 24 kinds of DNA coding rules, as shown in Table 3. Different properties, such as commutativity and associativity, can be applied. The exclusive OR operation for DNA rules is depicted in Tables 4 and 5.

Table 2 DNA coding rules (4\(!\) = 24)
Table 3 The 16 possible join operations
Table 4 Exclusive OR operation for DNA rules
Table 5 XOR operation for DNA genetic sequence (Rules (1,2,3,4,5,6))
Table 6 The obtained S-box

Let us assume that there are three nucleic acids denoted by \(x\), \(y\), and \(z\). The DNA join operation between \(x\) and \(y\) is represented by \(\langle x,y \rangle\), as shown in Eqs. 7–9:

$$ \langle x,y \rangle = \langle y,x \rangle $$
(7)
$$ \langle x,y,z \rangle = \langle \langle x,y \rangle ,z \rangle = \langle x,\langle y,z \rangle \rangle $$
(8)
$$ \left[ {\begin{array}{*{20}c} A \\ T \\ C \\ G \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} A & T & C & G \\ T & A & G & C \\ C & G & A & T \\ G & C & T & A \\ \end{array} } \right]\vartriangle OP1 $$
(9)
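The matrix in Eq. 9 coincides with a bitwise XOR table when the bases are numbered A = 0, T = 1, C = 2, G = 3. The sketch below checks this reading and the commutative and associative properties of Eqs. 7 and 8; treating OP1 as XOR under this numbering is an assumption made for illustration.

```python
ORDER = "ATCG"  # assumed numbering: A = 0, T = 1, C = 2, G = 3

def dna_xor(x, y):
    """Join two bases by XOR-ing their assumed 2-bit codes."""
    return ORDER[ORDER.index(x) ^ ORDER.index(y)]

# Rows of the matrix in Eq. 9, in the order A, T, C, G.
EQ9_TABLE = ["ATCG", "TAGC", "CGAT", "GCTA"]

table = ["".join(dna_xor(row, col) for col in ORDER) for row in ORDER]
for line in table:
    print(line)
```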

3.3 Transcription process

In molecular biology, transcription is the process of generating RNA from DNA molecules (Hardy et al. 2004). In cryptography, the input sequence of DNA is changed to the output sequence of RNA using DNA sequencing. Using a version of the confusion process, the four nucleotide bases are substituted with the corresponding Watson–Crick (w–c) complements through transcription. For instance, the T is replaced with the U because the RNA strand lacks thymine (T), which is substituted by uracil (U). Likewise, T is swapped with A, A is changed with U, G is replaced with C, and C is replaced with G in the Watson–Crick complementary pairing scheme (Watson and Crick 1953, 1993).
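The substitution described above amounts to a fixed lookup table. A minimal sketch, in which the function name `transcribe` is an illustrative assumption:

```python
# Complement substitution described in the text: T -> A, A -> U, G -> C, C -> G.
TRANSCRIPTION = {"T": "A", "A": "U", "G": "C", "C": "G"}

def transcribe(dna):
    """Replace each DNA base with its RNA complement."""
    return "".join(TRANSCRIPTION[base] for base in dna.upper())

print(transcribe("ATGC"))
```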

3.4 DNA operation

3.4.1 DNA coding

The DNA sequence line vector (DSLV) is obtained by converting the pixel line vector using one of the 24 kinds of DNA encoding rules. The rule is selected as in Eq. 10:

$$ Rule = \text{mod}(\text{floor}(\text{mean}(sub\_key)),24) $$
(10)

3.4.2 DNA chosen operations

The DNA system can utilize different operations as a means of generating sequences, such as XOR, ADD, and SUB. The operation is selected as in Eq. 11:

$$ OP = \text{mod}(\text{floor}(\text{mean}(sub\_key)),3) $$
(11)

3.4.3 Number of rounds

The number of rounds (NOR) for the selected operation is calculated using Eq. 12:

$$ NOR = \text{floor}\left( \frac{\log (vectsize)}{\log (2)} \right), $$
(12)

where \(vectsize\) denotes the size of the DNA vector sequence.

3.4.4 DNA joined operation

In a DNA joined operation, the permutation sequence is transformed into a permuted DNA sequence by combining it with the corresponding information DNA sequence using one of the 16 possible DNA join operations. The join rule is selected as in Eq. 13:

$$ Rule = \text{mod}(\text{floor}(\text{avg}(sub\_key)),17) $$
(13)
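The four selection formulas (Eqs. 10–13) can be sketched together. The sketch assumes that `sub_key` is a non-empty sequence of numbers and that Eq. 12 is read as the floor of the base-2 logarithm of the vector size; the function name and dictionary keys are hypothetical.

```python
import math

def select_parameters(sub_key, vect_size):
    """Derive the coding rule, operation, join rule and round count from a sub-key."""
    mean_floor = math.floor(sum(sub_key) / len(sub_key))
    return {
        "coding_rule": mean_floor % 24,                 # Eq. 10
        "operation": mean_floor % 3,                    # Eq. 11
        "rounds": math.floor(math.log2(vect_size)),     # Eq. 12 (assumed reading)
        "join_rule": mean_floor % 17,                   # Eq. 13
    }

params = select_parameters([10, 20, 30], vect_size=1000)
print(params)
```

All four values depend only on the sub-key and the vector size, so a receiver holding the same key derives the same rules.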

4 Anticipated algorithm for the construction of S-box

  1. Let \(T\) be a plain image of dimension \(j \times k\), where \(j\) and \(k\) are the image's rows and columns, respectively. The original image (\(T\)), with an initial size of \(512 \times 512 \times 3\) pixels, is divided into three channels (red = R, green = G, and blue = B), each containing \(512 \times 512\) pixels. The three channels are then saved to \(U\), where \(U\) = \(T\)(\(T_{R}\), \(T_{G}\), and \(T_{B}\)).

  2. A two-dimensional Arnold transformation is iterated to generate random sequences \(X\), which are then resized to fit exactly the pixels of each channel \(T\)(\(T_{R}\), \(T_{G}\), and \(T_{B}\)) in \(U\). The resized sequences are saved to \(X2\). Moreover, the sequences \(X2\) are tested for several rounds, up to 256 in number (\(R_{1} ,R_{2} ,R_{3} ,\ldots, R_{256}\)), to achieve a shuffle matrix (\(M_{1} ,M_{2} ,M_{3} ,\ldots, M_{256}\)).

  3. The divided channels (\(T_{R}\), \(T_{G}\), \(T_{B}\)), containing 262,144 pixels each, are XORed with the resized sequences (\(X2\)) for three rounds (\(R_{1}\), \(R_{2}\), and \(R_{3}\)) of the shuffled matrix (\(M_{1}\), \(M_{2}\), and \(M_{3}\)).

  4. A random permutation (\(R_{P}\)) is initiated in order to generate maximally random sequences, which are then further treated with the three rounds (\(R_{1}\), \(R_{2}\), and \(R_{3}\)) of the shuffled matrix (\(M_{1}\), \(M_{2}\), and \(M_{3}\)) to achieve \(V\), which is constituted of \(R_{P} \oplus (M_{1}, M_{2}, M_{3})\).

  5. An alphabetical DNA amino sequence (\(W\)) is generated and then converted into the four nucleotides adenine (A), thymine (T), cytosine (C), and guanine (G), where \(W = (A,T,C,G)\). The values are encoded as \(W_{encoded}\) (\(A = 00\), \(C = 01\), \(G = 10\), and \(T = 11\)), as shown in Fig. 2.

  6. The encoded DNA values \(W_{encoded} = (00,01,10,11)\) are then quantized to \(Q = (0.5, 1.5, 1, 0)\). The output is stored in \(Y\).

  7. A modified logistic–May map \(LM_{RS}\): \(y_{m + 1} = \left( y_{m} e^{(r^{\prime} + 9)(1 - y_{m})} - (r^{\prime} + 5) y_{m} (1 - y_{m}) \right) \text{mod } 1\) is then set up, including a definition of its initial conditions. This map is iterated to generate as many random sequence (RS) values as there are pixels in the original image (\(U\)). Moreover, \(LM_{RS}\) is reshaped to a matrix of \(512 \times 512\) elements, equal to the plain image \(T\) = \(j \times k\), and is stored in \(RM\).

  8. The modulus and round functions are applied to \(RM\), which is converted into uint8 integers. The result is stored in \(Y1\).

  9. Steps 7 and 8 are repeated for the stored values of \(Y\) to get the DNA random sequence by applying the round and modulus functions, respectively, to achieve an output (\(Z0\)).

  10. The XOR operation is applied between the stored values of \(Y1\) and \(Z0\) to obtain \(Z1\). In other words, \(Z1\) = \(Y1\) \(\oplus\) \(Z0\).

  11. The S-box is obtained by selecting the unique elements of \(Z1\), which gives \(S1(1 \times 256)\); the random sequence array is sorted into either ascending or descending order.

  12. Finally, the array \(S2(1 \times 256)\) is reshaped to \(S_{obtained} = S2(16 \times 16)\).
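Steps 7 to 12 can be sketched in a heavily simplified form. This is not the full procedure: the Arnold shuffling and DNA steps are omitted, a second chaotic sequence stands in for the DNA-derived \(Z0\), the quantization to uint8 (scaling by 256 and rounding down) is an assumed choice, and all seeds, names, and the fallback that appends any byte values that never appeared are illustrative assumptions.

```python
import math
import numpy as np

def lomas(y0, r, n):
    """Generate n values of the logistic-May map (step 7)."""
    out = np.empty(n)
    y = y0
    for i in range(n):
        y = (y * math.exp((r + 9) * (1 - y)) - (r + 5) * y * (1 - y)) % 1.0
        out[i] = y
    return out

def to_uint8(seq):
    """Assumed quantization for step 8: scale [0, 1) to 0..255 and round down."""
    return np.floor(seq * 256).astype(np.uint8)

def build_sbox(y0=0.37, z_seed=0.61, r=3.2, n=20000):
    y1 = to_uint8(lomas(y0, r, n))       # plays the role of Y1
    z0 = to_uint8(lomas(z_seed, r, n))   # stand-in for the DNA-derived Z0
    z1 = np.bitwise_xor(y1, z0)          # step 10: Z1 = Y1 xor Z0
    # Step 11: keep each value once, in order of first appearance.
    _, first_index = np.unique(z1, return_index=True)
    s = z1[np.sort(first_index)]
    # Fallback (assumption): append byte values that never appeared.
    if s.size < 256:
        missing = np.setdiff1d(np.arange(256, dtype=np.uint8), s)
        s = np.concatenate([s, missing])
    return s.reshape(16, 16)             # step 12

sbox = build_sbox()
print(sbox)
```

Keeping the values in order of first appearance preserves the chaotic ordering, and the result is always a permutation of 0 to 255, which is what makes the S-box invertible.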

Fig. 2 Block diagram demonstrating the proposed S-box based on DNA and chaos sequencing

The output of this twelve-step process is shown in Table 6 (Matthews 1989), while the plot of the S-box thus constructed is shown in Fig. 3.

Fig. 3 Plot of the constructed S-box

5 Randomness tests for the constructed S-box

5.1 NIST test

Table 6 depicts the completed S-box. To determine its level of randomness, the National Institute of Standards and Technology (NIST-800-22) statistical tests are used; the suite comprises several closely related tests that examine any non-randomness in the generated sequence. The outcomes of these statistical tests are assessed according to the p value defined by NIST-800-22: the resulting p value must be greater than or equal to the preset significance level to signify success. Table 7 shows the results of several randomness tests (Pareschi et al. 2012).
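As an illustration of how one test in the suite produces a p value, the frequency (monobit) test can be sketched as follows. The helper names are assumptions, and a significance level of 0.01 is assumed here as the pass threshold.

```python
import math

def monobit_p_value(bits):
    """Frequency (monobit) test: p value for the balance of ones and zeros."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)  # +1 for each one, -1 for each zero
    return math.erfc(abs(s) / math.sqrt(n) / math.sqrt(2))

def sbox_bits(values):
    """Unpack 8-bit values into a flat list of bits, most significant bit first."""
    return [(v >> k) & 1 for v in values for k in range(7, -1, -1)]

ALPHA = 0.01  # assumed significance level
p_balanced = monobit_p_value(sbox_bits(range(256)))
p_biased = monobit_p_value([1] * 100)
print(p_balanced >= ALPHA, p_biased >= ALPHA)
```

Any bijective 8-bit S-box contains every byte value exactly once, so its bits are perfectly balanced and this particular test is passed trivially; the remaining tests in the suite examine the ordering of the bits.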

Table 7 NIST 800-22 test results for the obtained S-box

5.2 Histogram uniformity analysis

This analysis is used to assess the pixel distribution of each channel. The uniformity of the pixel values is determined by the randomization (i.e., random numbers) obtained using our proposed S-box. Non-uniformity of the pixel values indicates that the system is not secure and that the data is easily retrievable. As discussed earlier in Sect. 4, let \(T\) be a plain image. The original image (\(T\)) of \(512 \times 512 \times 3\) pixels is divided into three channels, each containing \(512 \times 512\) pixels. The constructed S-box is then applied to all three channels, as shown in Figs. 4–8.

Fig. 4 a Plain image of splash (512 × 512 × 3); b red channel; c green channel; d blue channel

Figure 4 shows the initial image divided into its three channels. The Arnold transformation method is utilized to obtain a shuffled image, as shown in Fig. 5. The intensity of shuffling depends upon the number of rounds applied. However, because shuffling only permutes pixel positions, it does not change the distribution of pixel values in each channel, as shown in Fig. 6. The pixel values become uniformly distributed when the substitution method is applied, as shown in Figs. 7 and 8. A high level of randomness is achieved by the addition of a modified LOMAS and our DNA-based S-box.
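The observation that shuffling alone leaves the histogram unchanged can be checked with a small sketch. The synthetic 64 × 64 gradient channel and the seeded random permutation are assumptions standing in for the splash image and the Arnold shuffle.

```python
import numpy as np

def histogram(channel):
    """Count how often each of the 256 gray levels occurs."""
    return np.bincount(channel.ravel(), minlength=256)

# Synthetic stand-in for one image channel (not the splash image).
rows, cols = np.meshgrid(np.arange(64), np.arange(64), indexing="ij")
channel = ((rows * 4 + cols) % 256).astype(np.uint8)

# Any permutation of pixel positions; here a seeded random one.
rng = np.random.default_rng(0)
shuffled = rng.permutation(channel.ravel()).reshape(channel.shape)

print(np.array_equal(histogram(channel), histogram(shuffled)))
```

This is why a permutation stage needs to be combined with a substitution stage: only substitution changes the pixel values themselves.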

Fig. 5 a Shuffled plain image of splash (512 × 512 × 3); a red channel; b green channel; c blue channel

Fig. 6 a Plain and shuffled image histogram (512 × 512 × 3); a red channel histogram; b green channel histogram; c blue channel histogram

Fig. 7 a Encrypted image of splash (512 × 512 × 3); b red channel; c green channel; d blue channel

Fig. 8 a Encrypted image histogram (512 × 512 × 3); b red channel histogram; c green channel histogram; d blue channel histogram

6 S-box fundamental characteristics and experimentation process

The robustness and performance of our proposed S-box are also assessed using five different industry standard tests, including the nonlinearity test (N-L), strict avalanche criterion (SAC), bit independence criterion (BIC), differential approximation probability (DP), and linear approximation probability (LP). As the results below demonstrate, each test yielded exceptional results, indicating that the S-box we have constructed competes well against existing S-boxes and even delivers a more remarkable ability to withstand linear assaults.

6.1 Nonlinearity test analysis

The strength of the encryption achieved for various data using the substitution process can be evaluated using the nonlinearity test. The original data is already distorted by substituting pixels of the original image with the constructed S-box. The discussed criterion can be illustrated by using a Boolean function \(g(x)\), whose nonlinearity can be defined as:

$$ N_{g} = 2^{k - 1} \left( 1 - 2^{ - k} \max_{{\varphi \in GF(2^{k} )}} |S_{(g)} (\varphi )| \right), $$
(14)
$$ S_{(g)} (\varphi ) = \sum\limits_{{x \in GF(2^{k} )}} {( - 1)^{g(x) \oplus x \cdot \varphi } ,} $$
(15)

where \(x \cdot \varphi = x_{1} \varphi_{1} \oplus x_{2} \varphi_{2} \oplus \cdots \oplus x_{k} \varphi_{k}\) is the bitwise dot product. The nonlinearity \(N_{g}\) measures the distance of the Boolean function \(g(x)\) from the set of affine functions, so the larger the value of \(N_{g}\), the harder it is to approximate \(g(x)\) by an affine function. As a result, the S-box's ability to resist linear cryptanalysis will be robust. If the amount of nonlinearity introduced by an S-box is not sufficient to protect against linear attacks, then unauthorized users can understand the behavior of the Boolean function. Thus, the strength of selected bits in the Boolean function is the fundamental cause of changes in these characteristics.

In this test, we analyzed our proposed S-box's changing values by changing bits in the corresponding Boolean process. The results and comparison of different S-boxes are shown in Table 8.
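The nonlinearity of each output bit of an S-box can be computed from its Walsh spectrum using the standard form \(N_{g} = 2^{k-1} - \tfrac{1}{2}\max |S_{(g)}(\varphi)|\). The sketch below uses a fast Walsh–Hadamard transform; the function names are assumptions, and the identity S-box is used only as a demonstration input.

```python
def walsh_max(truth_table):
    """Largest absolute Walsh coefficient, via a fast Walsh-Hadamard transform."""
    w = [1 - 2 * bit for bit in truth_table]  # (-1)^g(x)
    n = len(w)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                u, v = w[j], w[j + h]
                w[j], w[j + h] = u + v, u - v
        h *= 2
    return max(abs(c) for c in w)

def sbox_nonlinearity(sbox, k=8):
    """Nonlinearity of each of the k output-bit functions of a k-bit S-box."""
    values = []
    for bit in range(k):
        truth_table = [(s >> bit) & 1 for s in sbox]
        values.append(2 ** (k - 1) - walsh_max(truth_table) // 2)
    return values

# The identity mapping is linear, so every output bit has nonlinearity 0.
print(sbox_nonlinearity(list(range(256))))
```

Passing the 256 entries of a constructed S-box as a flat list gives the eight per-bit values that such comparisons are usually based on.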

Table 8 Nonlinearity test for S-boxes

6.2 Strict avalanche criterion (SAC) test analysis

The strict avalanche criterion (SAC) requires that a small change in the input produces a large, unpredictable change in the output. In this way, the encrypted message will change dramatically if the plain text or key is altered. The SAC is satisfied if, when a single input bit is complemented, each output bit changes with a probability of one half. This change in the input causes an avalanche effect that spreads throughout the system.

By adjusting the input bits, we were able to examine the avalanche effect of our proposed S-box. Here it is essential to have a dependence matrix in which all values are close to 0.5. This dependence matrix can be evaluated using Eqs. 16–17:

$$ S(g) = \frac{1}{{k^{2} }}\sum\limits_{1 \le r \le k} {\sum\limits_{1 \le \omega \le k} {\left| {\frac{1}{2} - Q_{r,\omega } (g)} \right|,} } $$
(16)
$$ Q_{r,\omega } (g) = 2^{ - k} \sum\nolimits_{{x \in B^{k} }} {g_{\omega } (x) \oplus } g_{\omega } (x \oplus e_{r} ). $$
(17)

where \(e_{r} = [\theta_{r,1} \theta_{r,2} \ldots \theta_{r,k} ]^{T}\) and \(\theta_{r,\omega } = 0\) when \(r \ne \omega\) and \(\theta_{r,\omega } = 1\) when \(r = \omega\).
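A minimal sketch of Eqs. 16 and 17 is given below. The function names are assumptions, bit positions are indexed from 0 rather than 1, and the identity S-box is used only as a demonstration input.

```python
def sac_matrix(sbox, k=8):
    """Dependence matrix: entry [r][w] is the fraction of inputs for which
    flipping input bit r changes output bit w (Eq. 17)."""
    size = 1 << k
    counts = [[0] * k for _ in range(k)]
    for r in range(k):
        flip = 1 << r
        for x in range(size):
            diff = sbox[x] ^ sbox[x ^ flip]
            for w in range(k):
                counts[r][w] += (diff >> w) & 1
    return [[c / size for c in row] for row in counts]

def sac_deviation(matrix):
    """Mean absolute deviation of the dependence matrix from 0.5 (Eq. 16)."""
    k = len(matrix)
    return sum(abs(0.5 - q) for row in matrix for q in row) / k ** 2

identity = list(range(256))
matrix = sac_matrix(identity)
print(sac_deviation(matrix))
```

For the identity mapping every entry is either 0 or 1, the worst possible case; a good S-box has all entries close to 0.5 and a deviation close to 0.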

Table 9 shows the dependence matrix for testing using SAC. The values range from a maximum of 0.5625 to a minimum of 0.4375 for the S-box we created. The mean value of our S-box's dependence matrix is 0.5022, close to the SAC optimum value of 0.5. SAC values for different S-boxes are reported in Table 10. When compared to current S-boxes, our proposed S-box offers better performance and throughput.

Table 9 Dependence matrix of constructed S-box
Table 10 Analysis of strict avalanche criterion (SAC) for different S-boxes

6.3 Bit independence criterion (BIC) test analysis

The bit independence criterion (BIC) test is one of the essential criteria used to measure the strength of S-boxes. With this test, the change in the bit pattern at the output end must be reasonably independent of the change at the input end. If this is the case, it becomes difficult for intruders to map changes in the input bits to changes in the output bits. If the S-box in question satisfies the BIC test properties, the avalanche variables must be pairwise independent for a given set of avalanche vectors. The avalanche criterion is based on changing input bits and analyzing the nature of the output bits on the other side. For the BIC property, the function \(g_{r} \oplus g_{w} (r \ne w,1 \le r,w \le n)\) should itself be highly nonlinear.
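The BIC-nonlinearity part of this test can be sketched by computing the nonlinearity of \(g_{r} \oplus g_{w}\) for every pair of output bits. The function names are assumptions, bit positions are indexed from 0, and the identity S-box serves only as a demonstration input.

```python
from itertools import combinations

def nonlinearity(truth_table):
    """Nonlinearity of one Boolean function via a fast Walsh-Hadamard transform."""
    w = [1 - 2 * bit for bit in truth_table]
    n = len(w)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                u, v = w[j], w[j + h]
                w[j], w[j + h] = u + v, u - v
        h *= 2
    return n // 2 - max(abs(c) for c in w) // 2

def bic_nonlinearity(sbox, k=8):
    """Nonlinearity of g_r xor g_w for every pair of distinct output bits."""
    result = {}
    for r, w in combinations(range(k), 2):
        truth_table = [((s >> r) ^ (s >> w)) & 1 for s in sbox]
        result[(r, w)] = nonlinearity(truth_table)
    return result

pairs = bic_nonlinearity(list(range(256)))
print(len(pairs), min(pairs.values()))
```

Averaging the values over all pairs gives a single BIC-nonlinearity figure for an S-box.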

The results that we achieved are shown in Tables 11 and 12. The obtained values for BIC-SAC and BIC-nonlinearity for our S-box are 0.4960 and 112.35, respectively, indicating that our S-box fulfills the requirements of both the bit independence criterion strict avalanche criterion (BIC-SAC) and the bit independence criterion nonlinearity (BIC-NL) properties. Table 13 then compares the BIC properties of various existing S-boxes, demonstrating that the values our new S-box achieved for BIC-SAC and BIC-nonlinearity are higher than those of existing S-boxes.

Table 11 Bit independence criterion (BIC)–nonlinearity (NL) for constructed S-box
Table 12 Bit independence criterion (BIC)–strict avalanche criterion (SAC) for constructed S-box
Table 13 Comparison of Bit independence criterion (BIC)–nonlinearity (NL) with different S-boxes

6.4 Equiprobable input/output XOR distribution

This test measures the variation of bits at the output in response to the variation of bits at the input. The differential approximation probability is the probability that a difference of \(\partial k\) at the input produces a difference of \(\partial l\) at the output. Ideally, every input XOR value should map to every output XOR value with the same likelihood. The S-box should therefore have differential uniformity, which can be measured using the differential probability (DP):

$$ DP_{g}(\partial k \to \partial l) = \left[ \frac{\# \left\{ k \in X \mid S(k) \oplus S(k \oplus \partial k) = \partial l \right\}}{2^{m}} \right] $$
(18)

where X is the set of all possible input values and \(2^{m}\) is the number of elements in the constructed S-box. The smaller the value of \(DP_{g}\), the more robust the S-box design and the stronger its ability to defend against differential attacks.
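A minimal sketch of how Eq. 18 could be evaluated is given below: it builds the full table of counts for every pair of input and output differences and then takes the largest count over nonzero input differences. The function names are hypothetical, `sbox` plays the role of \(S\), and the 4-bit PRESENT S-box is only a stand-in for demonstration, not the S-box constructed in this work.

```python
# Illustrative sketch only: function names and the example S-box are assumptions.

def difference_distribution_table(sbox, m):
    """ddt[dk][dl] = number of inputs k with S(k) XOR S(k XOR dk) == dl."""
    size = 1 << m
    ddt = [[0] * size for _ in range(size)]
    for dk in range(size):
        for k in range(size):
            dl = sbox[k] ^ sbox[k ^ dk]
            ddt[dk][dl] += 1
    return ddt


def max_differential_probability(sbox, m):
    """Largest count over nonzero input differences, and that count / 2^m."""
    size = 1 << m
    ddt = difference_distribution_table(sbox, m)
    best = max(ddt[dk][dl] for dk in range(1, size) for dl in range(size))
    return best, best / size


# Stand-in 4-bit S-box (PRESENT), not the S-box proposed in this paper.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

count, dp = max_differential_probability(PRESENT_SBOX, 4)
print("max count:", count, "max DP:", dp)
```

The first returned value is the raw count and the second is the count divided by \(2^{m}\), so a table of counts and a table of probabilities carry the same information.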

The differential approximation that we achieved is shown in Table 14. The maximum value of 10 indicates that the constructed S-box has a good ability to resist differential attacks.

Table 14 Differential approximation for constructed S-box

6.5 Linear approximation probability

Linear approximation probability (LP) is defined as the maximum imbalance of an event, and can be determined using the following equation:

$$ LP = \max_{\gamma_{1}, \gamma_{2} \ne 0} \left| \frac{\# \left\{ z \in Z \mid z \cdot \gamma_{1} = g(z) \cdot \gamma_{2} \right\}}{2^{n}} - \frac{1}{2} \right|. $$
(19)

Here, \(\gamma_{1}\) denotes the input mask and \(\gamma_{2}\) denotes the output mask. Z is the set of all possible input values, while \(2^{n}\) is the number of S-box elements. The event being counted is that the parity of the input bits selected by the mask \(\gamma_{1}\) equals the parity of the output bits selected by the mask \(\gamma_{2}\). A smaller LP value indicates stronger resistance to linear cryptanalysis.
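The LP computation can be sketched as a brute-force search over mask pairs, as shown below. This sketch assumes that the imbalance is taken in absolute value (the common convention) and that the output mask \(\gamma_{2}\) is nonzero while \(\gamma_{1}\) ranges over all masks; the function names and the stand-in 4-bit PRESENT S-box are likewise assumptions for illustration.

```python
# Illustrative sketch only: conventions, names, and example S-box are assumptions.

def parity(v):
    """Parity (XOR of all bits) of a non-negative integer."""
    return bin(v).count("1") & 1


def linear_approximation_probability(sbox, n):
    """Maximum over masks of |#{z : z.g1 == S(z).g2} / 2^n - 1/2| (cf. Eq. 19)."""
    size = 1 << n
    best = 0.0
    for g1 in range(size):          # input mask
        for g2 in range(1, size):   # output mask, assumed nonzero
            matches = sum(
                1 for z in range(size)
                if parity(z & g1) == parity(sbox[z] & g2)
            )
            best = max(best, abs(matches / size - 0.5))
    return best


# Stand-in 4-bit S-box (PRESENT), not the S-box proposed in this paper.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

print("LP:", linear_approximation_probability(PRESENT_SBOX, 4))
```

For an 8-bit S-box this exhaustive loop visits roughly 16 million (mask pair, input) combinations, which is slow but still feasible in pure Python; a Walsh-Hadamard transform would give the same result faster.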

The LP value that we obtained using our new S-box is shown in Table 15, where it is also compared with the LP values of existing S-boxes. As the table demonstrates, our S-box achieves a lower LP value than the existing S-boxes, giving it the best performance among those compared.

Table 15 Comparison of linear approximation probability with different S-boxes

7 Conclusion, discussion, and future prospects

In this article, we proposed a new S-box based on confusion and diffusion to protect sensitive visual information within images. We utilized the Arnold cat map together with DNA and LOMAS sequences to achieve confusion and diffusion. The pixels of the plaintext image are shuffled using the Arnold map at specific iterations to achieve pixel permutation. Diffusion is achieved using a bitwise XOR operation with the DNA and LOMAS random sequences. Moreover, the highly random 512 × 512 sequences are filtered to 256 × 1 unique elements, which are then sorted into ascending order. Finally, the sorted output is reshaped into the 16 × 16 S-box. The newly constructed S-box was subjected to various tests to assess and validate its randomness, which was also verified using NIST SP 800-22. The performance analysis we obtained from these different tests showed better results for our constructed S-box, which validates its cryptographic performance and demonstrates that it resists attacks better than most existing options. Moreover, we compared the results achieved by our new S-box with those from existing S-boxes for each of these tests and performed various security analyses. The comparison showed that our constructed S-box has dominant cryptographic features. In the future, we aim to extend this research to audio and video encryption by investigating the proposed encryption technique on both types of data. The proposed scheme will also be tested on the work presented in Driss et al. (2020a), Masood et al. (2020c) and Driss et al. (2020b).