Introduction

Cryptography is a foundational technology that plays a pivotal role in safeguarding information and ensuring trust in digital interactions. Its importance extends across various domains, from securing online transactions to protecting national security and individual privacy. As the digital landscape continues to evolve, cryptography remains an essential tool for addressing the growing threats and challenges of the digital age. Many algorithms for protecting sensitive information are known in the literature. Ciphers used for plaintext encryption and cipher text decoding are classified into two types: block ciphers and stream ciphers1,2. The block cipher encrypts and decrypts data block by block. The data block is typically made up of one or more bytes. On the other hand, a stream cipher accomplishes these modifications by utilizing one bit or one byte in a single step. A block cipher has become the most potent approach for protecting sensitive data in contemporary cryptography3. Block ciphers are simpler to develop and more often used in practical security applications than stream ciphers4.

The substitution-permutation block ciphers are one of the most common types of block ciphers used to protect data. Permutation and substitution are important processes that are utilized in these ciphers to help change data into a mysterious format. The permutation function exchanges bits or bytes from the plaintext in the form of additional bits or bytes that are also included in the plaintext. In contrast, a nonlinear substitution process swaps out one block of data for another. S-box is used to replace the data in this instance5,6. A crucial component of modern block ciphers, an S-box considerably aids in the production of garbled output data (cipher text) from the provided input data (plaintext). To make things more difficult for the attackers, the S-box should provide a non-linear relationship between the input and output of the data7. If a substitution box may cause greater uncertainty for the attackers, the block cipher is regarded as more secure. On the other hand, a weak S-box can undermine the security of a block cipher. A weak S-box has low non-linearity, meaning it does not effectively obscure the relationship between input and output bits. In other words, it doesn’t introduce sufficient complexity in the substitution operation, making it vulnerable to various cryptanalysis techniques. The avalanche effect is a desirable property in cryptography where a small change in the input should result in a significantly different output. Weak S-boxes often fail to exhibit a strong avalanche effect, meaning small input changes may lead to small or predictable output changes, which can be exploited by attackers. Weak S-boxes may exhibit linear or affine structures, making it easier for cryptanalysts to derive mathematical relationships between the input and output. This can lead to efficient attacks, such as linear and differential cryptanalysis. The security provided by a block cipher using the substitution box depends entirely upon the effectiveness of the corresponding S-box8,9.

As an outcome, researchers eventually developed novel methods for building key-dependent, dynamic S-boxes. Among these algebraic ideas, linear fractional transformation (LFT) is an essential method for constructing robust S-boxes. The LFT approach was employed by the authors in10,11,12,13 to construct efficient and reliable S-boxes. When measured against the accepted cryptographic standards that are used to assess an S-box, the created S-boxes showed respectable results. In addition to LFT approaches, researchers have also suggested effective methods for producing S-boxes. For example, authors in14 offered a method that made use of the concept of cubic fractional transformation (CFT) to create effective S-boxes. Another straightforward method has been proposed in15 for constructing S-boxes by using cubic polynomial transformation. For constructing strong S-boxes, the use of this approach is incredibly straightforward, effective, and efficient. In16, a novel method for building robust S-boxes with linear transformation was put forward. Using the permutation process, this method strengthens the resulting S-box. DNA computing is also employed by authors to construct resilient S-boxes and cryptographically robust ciphers using the techniques provided in17,18,19,20,21,22. These S-boxes and ciphers were shown to be strong in terms of cryptographic standards and resistance to various assaults through analysis. Due to the unpredictability of chaotic systems, chaos is also an important field exploited in cryptography for creating reliable S-boxes20. Chaos theory was utilized by several scholars23,24,25,26 to construct S-boxes with strong cryptographic properties. Hyperchaotic schemes were employed by the authors of27,28 to build robust and reliable S-boxes. Many other researchers have also developed S-boxes using a variety of different methods, including cellular automata29, graph theory30, elliptic curve31, and optimization approaches32, etc.

The construction of S-boxes relies on certain key assumptions to ensure their security and effectiveness. Here are some key assumptions for the construction of S-boxes:

  • S-boxes should exhibit strong non-linearity, which means that they should not be easily expressible as a linear function of their inputs. This property helps prevent attackers from exploiting algebraic relationships between input and output bits.

  • S-boxes should satisfy the avalanche effect, where a change in a single input bit should cause approximately half of the output bits to change on average. This property ensures that small changes in the plaintext result in substantial changes in the cipher text.

  • S-boxes should be designed to resist known cryptanalysis techniques such as differential and linear cryptanalysis. These techniques attempt to find patterns or correlations between plaintext, cipher text, and key bits, and strong S-boxes can help thwart these attacks.

  • S-boxes should be bijective, meaning that each input value maps to a unique output value, and vice versa. This property ensures that the S-box can be inverted, allowing for decryption.

In this manuscript, a text-theoretical approach is used for the construction of effective S-boxes. Text encryption by using the Adjacency matrix of Graphs is used in different branches of the research and is frequently utilized to address a wide range of practical issues. Mathematicians are becoming more and more aware of the importance of graph theory. It is employed in electrical engineering, solid-state physics, organic chemistry, statistical mechanics, operations research, and optimization theory. Numerous other fields, including computer science, mathematics, and engineering, are impacted by the study of cryptography33,34,35,36,37.

The significant contributions provided in this study are as follows:

  • To generate the initial S-boxes, a novel and straightforward Graph-theoretic method is proposed, and subsequently its nonlinearity is enhanced by employing a permutation of symmetric group.

  • Conventional S-box assessment criteria and the S-boxes already accessible in the existing literature are used for analyzing the proposed S-box. The significant contribution of the planned S-box is confirmed by this performance investigation.

  • The suggested method's sample S-box is used for image encryption. The S-box-based encryption has outstanding encryption effectiveness, according to simulation and effectiveness evaluations.

The subsequent sections of the paper are structured as follows. The construction of S-boxes utilizing graphical techniques is thoroughly covered in section “Algebraic structure of S-box”. Section “Algebraic analysis” compares the performance of S-box created using the suggested approach to currently in use modern S-boxes. In section “Image encryption”, the suggested S-box is tested and evaluated for use in image encryption. In contrast, section “Conclusion” brings the study's conclusion.

Algebraic structure of S-box

For the purpose of constructing substitution boxes, the suggested method essentially consists of three stages. First, LFT is generated from a phrase using the concepts of graph theory. Second, this LFT generates an S-box, and finally, a symmetric group permutation is used to enhance the non-linearity of the initially generated S-box. The suggested algorithm’s initial phase is based on graph theory because text characters are hidden in the graph according to the graph's vertices and their degrees. Every character in a text is first changed to its ASCII code, and then it is turned to binary notation. The order of the matrix is then determined by evaluating XOR for each character of the given text, which determines the order of the matrix for each character individually. The number of rows and columns in a matrix corresponds to the number of nodes in a graph, and we follow the same procedure to create a graph using a matrix as the definition of the adjacency matrix of a graph. Each step of the proposed approach to constructing S-boxes is given in algorithm 2.1 and the flow chart for the construction of the S-box is given in Fig. 1.

Figure 1
figure 1

A flow chart for the construction of S-box.

Algorithm 2.1:

  1. 1.

    Input can be any word or phrase.

  2. 2.

    Convert each character of the message into its ASCII code in binary format.

  3. 3.

    Compute XOR32 for each binary number.

  4. 4.

    For each binary number, form a one-dimensional array by counting the number of 1’s.

  5. 5.

    Construct a square matrix \(\left[{M}_{i,j}\right]\) with an order equal to the number of 1's in the array.

  6. 6.

    The diagonal elements of \(\left[{M}_{i,j}\right]\) will be equal to zero.

  7. 7.

    For every binary number, count the number of 0’s following 1’s.

  8. 8.

    If the binary stream ends with 1 and is followed by 0, then put \({m}_{i,j}={m}_{j,i}=n+1\), where n is the number of 0’s following 1.

  9. 9.

    All elements are stored in the upper diagonal of the matrix, and the matrix that is forming is a real symmetric matrix.

  10. 10.

    On finding the last element of binary stream 1, put 1 in the first entry of the last column of a matrix.

  11. 11.

    The above process is repeated until we are done with the binary stream for each character.

  12. 12.

    Create a square matrix \(xI\) for all matrices \(\left[{M}_{i,j}\right]\) with the same order as that of the matrix \(\left[{M}_{i,j}\right]\) where \(I\) is the identity matrix and \(x\) is its vector multiple.

  13. 13.

    Determine the sum of each matrix combination as, \(xI+\left[{M}_{i,j}\right]=[{a}_{i,j}]\).

  14. 14.

    Determine the grand sum \(\sum_{i,j}{a}_{i,j}\) of these matrices that will be linear functions.

  15. 15.

    Find the sum of all these linear functions and take its reciprocal to get an LFT of the type \(\frac{1}{Ax+B}\).

  16. 16.

    Utilize this LFT to construct the initial S-box.

  17. 17.

    Apply the appropriate symmetric group permutation to increase the nonlinearity of this initially created S-box.

  18. 18.

    Output is a suggested S-box with robust algebraic characteristics.

Construction of a specimen S-box

Let's say that we wanted to create an S-box from the word “UNITY.” Using the ASCII table, we were able to find the binary format of all of its alphabets, as shown in Table 1.

Table 1 Binary format for each character of the word “UNITY”.

Use the algorithm 2.1 to create a square symmetric matrix for each letter, as shown in Table 2.

Table 2 LFT computation table.

We now find the desired LFT as shown below by taking the reciprocal of the sum of all the mappings in the last column.

$$ f\left( x \right) = \frac{1}{{\left( {5x + 14} \right) + \left( {5x + 14} \right) + \left( {4x + 14} \right) + \left( {4x + 14} \right) + \left( {5x + 14} \right)}} $$
(1)
$$ f\left( x \right) = \frac{1}{23x + 70} $$
(2)

Now enter the values from 0 to 255 into the mapping provided in Eq. (2) and to keep all the outputs in the range 0 to 255 replace \(f(64)\) with 0 and \(f(131)\) with 191. After resolving all of the outputs under mod 257, we find the initial S-box as shown in Table 3.

Table 3 Initially created S-box from Eq. (2).

This S-box’s Boolean functions have an average non-linearity value of 99.75, though a resilient S-box could be created by using the Symmetric group permutation to increase this value. The 256 cells of Table 3 are subjected to the following symmetric group permutation in this instance. As a result, we find in Table 4 a sturdy S-box with an average non-linearity of 111.5.

Table 4 Proposed S-box.

Permutation of \({{\varvec{S}}}_{256}\): (1 167 140 64 121 211 28 133 163 241 180 228 254 65 93 218 171 69 40 175 209 74 80 58 96 91 111 239 223 42 87 194 166 63 177 53 229 114 110 246 162 172 11 253 224 170 173 85 10 212 150 201 23 210 72 183 240 127 131 100 76 20 7 107 25 30 31 217 124 18 200 41 196 233 101 2 252 185 3 222 60 98 251 115 148 213 153 160 219 15 146 95 205 136 184 50 126 237 86 157 203 56 230 16 188 159 97 128 83 151 236 132 36 44 182 232 206 220 73 174 34 139 189 134 227 71 81 208 47 119 190 176 142 92 156 13 82 117 221 256 234 244 158 192 79 255 155 165 168 4 179 242 118 6 161 231 52 193 90 108 45 197 143 70 51 243 24 191 116 145 199 26 75 59 19 195 39 238 29 149 178 204 66 250 48 120 84 164 32 54 33 104 245 106 17 122 202 61 181 249 152 247 130 67 35 144 215 113 22 207 12 27 198 49 55 21 89 137 216 37 5 141 43 8 125 46 88 147 103) (9 77 169 129 226 135 38 62 235 214 57 78 187 105 225 248 154 68 123 186 14 99) (94) (102) (109) (138) (112).

Algebraic analysis

For effective encryption, an S-box must have robust algebraic properties that enable nonlinear associations between the input and output of data. This section carefully analyzes the created S-box given in Table 4 using the following standards for judging the cryptographic potency of substitution boxes.

  • Bijective

  • Nonlinearity

  • Bit Independence Criterion

  • Strict Avalanche Criterion

  • Linear Probability

  • Differential Uniformity

As a comparison tool between our suggested S-box and several current S-boxes, we additionally choose recently explored S-box approaches.

Bijective

If a mapping has a unique output against a particular input, it is said to be bijective. This characteristic should be very clearly demonstrated by an S-box design. In the instance of 8 × 8 S-box, every value entered in the interval [0, 255] should result in a unique value for output in the interval [0, 255]. This condition is validated by the created S-box in Table 4 since it has all conceivable different output values in the interval [0, 255]. Additionally, the number of 1’s in each Boolean function is the same as the number of 0’s.

Non-linearity

An S-box design is thought to be better if it possesses the capacity to convert the given inputs into outputs nonlinearly. This kind of S-box is useful for thwarting intruders’ attempts at linear cryptanalysis. Using Eq. (3) as provided by38, one may determine the nonlinearity of Boolean functions:

$$ NL_{f} = 2^{n - 1} \left( {1 - 2} \right.^{ - n} max\left| {W_{f} \left( a \right)} \right| $$
(3)

Here \(W_{f} \left( a \right) = \mathop \sum \limits_{{a \in F_{2}^{n} }} \left( { - 1} \right)^{f\left( x \right) \oplus a.x)}\) is the value of Walsh Spectrum. Eight balanced Boolean functions that make up our predicted S-box have non-linearity values of 112, 112, 112, 110, 112, 112, 112, and 110. It is clear that the lowest, highest, and mean nonlinearity values for our S-box are, respectively, 110, 112, and 111.75. Figure 2 shows a graphical comparison of our S-box's nonlinearity results with those of newly developed S-box construction methods. The contrast shows that our predicted S-box's nonlinearity value works better than the nonlinearity outcomes of numerous other different existing S-boxes.

Figure 2
figure 2

A comparison between NL scores of different S-boxes.

Strict Avalanche Criterion (SAC)

This S-box feature guarantees that one input bit shift modifies 50% of the output bits39. As a result, a powerful S-box has a SAC score that is roughly equivalent to 0.5. Table 5 provides the dependence matrix of our S-box’s SAC readings. The mean SAC score of our S-box is 0.4978, which is very close to 0.5. As a result, our anticipated S-box adequately fulfills the SAC requirement. Table 8's comparison of the predicted S-box’s SAC values with those of other S-boxes shows that our S-box’s SAC score is gracefully consistent with those of other S-boxes.

Table 5 Strict Avalanche values of created S-box.

Bit Independence Criterion (BIC)

This feature of an S-box guarantees that when just a single input bit is changed, alterations in any two output bits do not rely on one another39. The BIC-SAC and BIC nonlinearity (BIC-NL) values of the created S-box are shown in Tables 6 and 7, respectively.

Table 6 BIC-SAC values.
Table 7 BIC nonlinearity values.

BIC-NL and BIC-SAC average values for the constructed S-box are 103.86 and 0.4964, respectively. These results suggest that the predicted S-box fully satisfies the BIC criterion. Moreover, Table 8 provides an investigation of BIC values for various S-boxes.

Table 8 A comparison between algebraic properties of different well-known S-boxes.

Linear probability (LP)

The bits of plaintext must be mixed up during the design of a trustworthy block cipher so that cipher text invaders cannot determine the bits' initial sequence. This confusion is made possible by a strong S-box construction that establishes a nonlinear linkage between plaintext bits and cipher text bits. The linear probability described in Eq. (4) is used to assess this algebraic feature of an S-box40.

$$ LP = max_{{t_{x,} t_{y} \ne 0}} \left| {\frac{{\# \left\{ {x \in X|x.t_{x} = f\left( x \right).t_{y} } \right\}}}{{2^{n} }} - \frac{1}{2}} \right| $$
(4)

Here, \(t_{x}\) denotes input mask and \(t_{y}\) denotes output mask. If the relationship among the input and output bits is linear, the LP value for the S-box will be greater and attackers can easily perform linear cryptanalysis. The predicted S-box has an LP score of 0.125, which is a low enough number to withstand linear cryptanalysis. Therefore, the suggested S-box has a good chance of defeating such cryptanalytic attempts. Table 8 compares the LP values of a few existing S-boxes and predicted S-box. Comparing the proposed S-box to other existing S-boxes, it is clear that it has excellent durability.

Differential uniformity (DU)

The attackers can determine the full or incomplete plaintext or secret key by differential attacks41. Analysts assess the differential uniformity (DU) of S-boxes to determine this disparity. An S-box's DU should be relatively small to withstand differential cryptanalysis. Table 9 provides the DU values of the constructed S-box, which were determined using Eq. (3). The highest possible DU score for the predicted S-box is 10. As a result, the differential probability (DP) value equates to 0.039, indicating that the predicted S-box provides respectable resistance to differential cryptanalysis. The contrast of DP values between the predicted S-box and numerous other S-boxes is shown in Table 8. Figure 3 shows a graphical comparison of our S-box's maximum DU results with those of newly developed S-box construction methods.

Table 9 Input/output XOR distribution table of created S-box.
Figure 3
figure 3

A graphical comparison between DU values of various S-boxes.

Image encryption

S-boxes are an essential component of image encryption algorithms because they provide non-linearity, confusion, and protection against various cryptographic attacks. They help ensure the confidentiality and security of encrypted images by making it extremely difficult for unauthorized parties to reverse-engineer the encryption process or recover the original image without the correct decryption key. Image encryption analysis involves evaluating the security and effectiveness of an image encryption scheme to ensure that it meets the desired security goals. This section carefully analyzes the robustness of the created S-box given in Table 4 in image encryption using several standards for judging the cryptographic potency of substitution boxes.

Majority Logic Criterion (MLC)

Using various images, the Majority logic criterion performs several statistical analyses to determine the statistical robustness of the S-box in image encryption58. The distortion that the encryption process generates in the picture determines how effective the suggested method is. To examine the statistical characteristics, different analyses are required. Correlation, entropy, contrast, homogeneity, and energy are a few of these analyses. JPG images of Cameraman, Pepper, and Baboon taken form the article65 are used for MLC analysis. The outcome of image encryption using the suggested S-box and histograms is shown in Table 10. Going forward, Tables 11, 12 and 13 contrast the outcomes of this analysis with those of other well-known S-boxes. These findings suggest that the suggested S-box is sufficient to be included in the algorithms created for the safe transfer of information and data and is appropriate for encryption applications.

Table 10 Original and encrypted images with their histograms.
Table 11 MLC comparison of Cameraman’s image using the proposed S-box with various other S-boxes.
Table 12 MLC comparison of Baboon image using proposed S-box with various other S-boxes.
Table 13 MLC comparison of Pepper image using proposed S-box with various other S-boxes.

Peak Signal-to-Noise Ratio (PSNR)

The Peak Signal-to-Noise Ratio is a metric used to quantify the quality of a reconstructed or compressed signal, typically an image or video, compared to the original, uncompressed signal. PSNR is commonly used in various fields such as image processing, video compression, and signal processing to assess the fidelity of the reconstructed data. The PSNR value is calculated as the ratio of the peak signal power to the noise power, typically expressed in decibels (dB). A higher PSNR value indicates better quality, as it means that the reconstructed signal is closer to the original signal and has less distortion or noise.

The formula for calculating PSNR is:

$$ PSNR = 10 \cdot log_{10} \left( {\frac{{R^{2} }}{MSE}} \right) $$

where R is the maximum possible pixel value of the image and MSE is the Mean Squared Error. The PSNR results of encrypted images are given in Table 14.

Table 14 PSNR, MSE, NPCR, UACI results of encrypted images.

Mean Squared Error (MSE)

Mean Squared Error is a common metric used to measure the average squared difference between the values predicted by a model or estimation technique and the actual values. The MSE test is a mathematical calculation used for evaluating the performance of models or estimators, especially in the context of regression analysis or image processing. In the context of image processing and compression, MSE is often used to quantify the quality of a reconstructed or compressed image compared to the original image. It provides a numerical value that represents the average squared difference in pixel values between the two images. A lower MSE indicates better image fidelity, as it implies that the reconstructed image is closer to the original.

The formula for calculating MSE is as follows:

$$ MSE = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {X_{i} - Y_{i} } \right)^{2} $$

where N is the total number of pixels, \(X_{i}\) represents the pixel value in the original and \(Y_{i}\) represents the corresponding pixel value in the reconstructed or compressed image. The MSE results of encrypted images are given in Table 14.

Normalized Pixel Change Rate (NPCR)

Normalized Pixel Change Rate is a metric used to evaluate the quality and security of image encryption or data hiding techniques. It is commonly used in the field of information security and cryptography, particularly when assessing the performance of encryption algorithms applied to images. The result is usually expressed as a percentage. In cryptographic applications, a high NPCR value (close to 100%) is desired because it indicates that a small change in the plaintext image results in a significant change in the encrypted image, making it harder for an attacker to deduce information about the original image from the encrypted version.

The formula for calculating NPCR is as follows:

$$ NPCR = \frac{{N_{c} }}{N} \times 100\% $$

where \(N_{c}\) is the total number of pixels for encrypted images and \(N\) is the total number of pixels in the plain image. The NPCR results of encrypted images are given in Table 14.

Unified Average Changing Intensity (UACI)

Unified Average Changing Intensity is a metric used to evaluate the quality and security of image encryption or data hiding techniques. The UACI metric measures the average change in pixel intensity values for encrypted images, typically generated by applying an encryption algorithm. A lower UACI value is generally considered better, indicating that the encryption algorithm produces smaller changes in pixel values between the two images.

The formula for calculating UACI is as follows:

$$ UACI = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \frac{{\left| {X_{i} - Y_{i} } \right|}}{R} \times 100\% $$

where \(N\) is the total number of pixels in the images, \(X_{i}\) represents the pixel value in the original image, \(Y_{i}\) represents the corresponding pixel value in the encrypted image and R is the maximum possible pixel. The UACI results of encrypted images are given in Table 14.

Complexity analysis

This section investigates the computational complexity of the suggested encryption approach, which involves partitioning the image into Most Significant Bits (MSBs) and Least Significant Bits (LSBs). Each pixel in the image is divided into two sub-blocks with a constant time complexity of \(O(1)\). The initial stage necessitates \(O(M\times N)\) bit operations where \((M\times N)\) is the dimension of the plain image. The substitution module operates in linear time as well. Because all of the algorithm's modules run in linear time, the scheme's overall computational complexity is \(O\left( {M \times N} \right)\). In comparison to previous algorithms64, the computational time complexity for creating the cross-coupled chaotic sequence for one round operation is \(O\left( {2 \times M_{x} } \right)\), where \(M_{x}\) is the maximum value of \(M_{1}\) and \(M_{2}\). The row-column permutation stage has a temporal complexity of \(O\left( {M_{1} \times N_{1} } \right)\), while the row-column diffusion operation has a computational complexity of \(8\left( {M_{1} + N_{1} } \right)\). As a result, the overall total computational time complexity of the encryption technique64 is \(O\left( {2M_{x} + 9(M_{1} + N_{1} } \right)\), which is nearly equivalent to the suggested algorithm.

Correlation

Correlation analysis in image encryption often involves examining the correlation between different pixel pairs in both the original and encrypted images. This can include analyzing vertical, horizontal, and diagonal correlations. Vertical Correlation calculates the vertical correlation by comparing pixel values vertically in the original and encrypted images, Horizontal Correlation calculates the horizontal correlation by comparing pixel values horizontally in the original and encrypted images, and Diagonal Correlation calculates the diagonal correlation by comparing pixel values diagonally in the original and encrypted images. Each of these correlations provides insights into how well the encryption algorithm disrupts spatial relationships in the image. These correlation results of encrypted images are given in Table 15.

Table 15 Correlation results of encrypted images.

Conclusion

In this work, we presented a novel and simple text-theoretical method for the creation of S-boxes. Also, we constructed a sample S-box using the word “UNITY” as an example. When compared to current S-boxes, this newly designed S-box’s algebraic features are strong enough to resist linear and differential attacks, and its performance is superior to that of many other S-boxes. Especially, the mean NL score of the proposed S-box is 111.5, which is extremely high compared to the other S-boxes. The specimen S-box is also utilized in image encryption for images of Cameraman, Pepper, and Baboon. To evaluate the performance of the specimen S-box in image encryption, various statistical analyses such as PSNR, MSE, NPCR, UACI, and MLC are used. In particular, low correlation values of -0.0012, 0.0000, and 0.0006 for vertical, horizontal, and diagonal correlation, respectively, of Pepper image demonstrate the robustness of encryption utilizing the specimen S-box. In future development, this proposed method for S-box construction can be used for dynamic S-box construction using any sentence or paragraph.