1 Introduction

Nowadays, information security has become a critical issue for governments, companies, and even individuals. Therefore, designers of cryptographic systems must constantly search for up-to-date cryptographic algorithms with higher efficiency to counter potential threats.

Usually, the security level of any cryptographic system depends on two major factors. The first is the complexity of the selected Cryptographic Algorithm (CA) that is responsible for converting plaintext to ciphertext, or vice versa. The second is the complexity of the secret key(s) generation. Both factors must be difficult enough that the information cannot be accessed or predicted easily.

There are two main types of CA. The first type is called the Asymmetric Key Algorithm (AKA), or public-key cryptography, while the other is called the Symmetric Key Algorithm (SKA).

Essentially, any security system is built on a CA together with its Cryptographic Keys (CK). These keys should be known only to the legitimate users and hidden from everyone else. Attackers often try to find, extract, or even steal the CK to recover hidden and confidential information. As long as the CK remains confidential, the information is safe.

In practice, keeping the CK secret is one of the most difficult problems. Therefore, the CK must be complex (very long) so that it cannot be recovered. Thus, CKs are generated using powerful algorithms to ensure that each CK is unpredictable [1].

There is a direct relationship between the bit length n of a CK and its security level: n-bit security means that an attacker needs up to \({2}^{n}\) attempts to break it [2].

Toward more powerful CAs and CKs, designers have combined more than one CA (often an AKA and an SKA) into a single algorithm called a hybrid cryptography technique. Such a combination of multiple encryption rules exploits their strengths simultaneously: it collects as many strengths as possible while discarding as many weaknesses as possible.

Hybrid encryption is a technique that combines a strong algorithm (such as AKA encryption) with a low execution time (as in SKA encryption). It utilizes more than one key to perform its task, each key having a large width (a larger number of bits), as in AKA. Moreover, it handles a large amount of data simultaneously at high processing speed, as in SKA.

The essential advantage of hybrid encryption is that it generates a new encryption key with each new transformation from plain to cipher data [3]. This key is typically generated randomly by strong algorithms. Because of this strength and the constant change of the generated key, attackers cannot anticipate it, so the key acts as a second line of defense. Moreover, the generated key ensures stronger security during data transmission over the communication media, because it is unpredictable.

Toward stronger hybrid cryptography, designers have applied neural networks (NNs) to cryptography, a field called Neuro-Cryptography (or neural cryptography), which is now growing rapidly. In the past two decades, many researchers have combined different NNs with various classical cryptographic paradigms [4], invented many complex neuro-cryptography rules [5,6,7], and created CAs that use different learning algorithms to generate strong keys [8,9,10,11,12].

The deeper the NN architecture (within neural cryptography), the harder the generated keys are to predict.

In recent years, machine learning (ML) techniques have become increasingly powerful in cryptography, which processes more than 3 quintillion (3 followed by 18 zeros) data bytes around the world every day [13]. ML techniques can be used to learn the relationship(s) between original data and its encrypted form within cryptographic systems. They can also be used to generate CKs. Furthermore, ML is used to compress/decompress messages before they are encrypted/decrypted.

The concept of applying different ML techniques to Neuro-Cryptography is growing rapidly, and various researchers have proposed many Neuro-Cryptography rules [14,15,16].

The literature survey reveals a lack of research on hybrid-type Neuro-Cryptography (particularly for autoencoders) using backpropagation (BP) ML. Therefore, this research proposes a novel hybrid neuro-cryptography rule with an instantly changing key based on the fast BP ML paradigm called “Instant Learning-Ratio Machine-Learning (ILRML)” [17].

Because our proposed Neuro-Cryptography design depends heavily on the ILRML algorithm, we dedicate a large part of this introduction to it.

As its name implies, ILRML is a very fast supervised BP learning rule. It can update all the weights in the NN during a single iteration, so it can be used to encrypt/decrypt the original/ciphered data during online communication. It relies on the learning ratio (∆ℓ) rather than the learning rate (η). Although ILRML runs its forward propagation (FP) like any conventional ML rule, it differs fundamentally during BP. Figure 1 shows the concept of ILRML for one neuron.

Fig. 1

The concept of ML based on the learning ratio within a neuron

For a simple explanation of ILRML, we first discuss the case of a single neuron within a NN. In each learning iteration, ILRML checks the difference between the neuron's instant output and its instant desired target (\({\varvec{d}}\)). If that difference is unacceptable (according to the required accuracy), ILRML finds the intersection(s) between the curve of this neuron's activation function and the constant function \({\varvec{d}}\), as shown in Fig. 2.

Fig. 2

Intersections between the desired target (d = 0.8) and many activation functions

The intersection may be one point (or two points, due to the curvature of the activation function). In our case, the constant function \(d=0.8\) intersects the Linear, Sigmoid, Tanh, and ReLU functions at the points 2.0, 1.3863, 1.0986, and 0.8, respectively. Such an intersection point (post-intersection) is considered the instant updated pre-activation value and is denoted by \({{\varvec{x}}}^{\mathbf{n}\mathbf{e}\mathbf{w}}\).

ILRML divides the new pre-activation value \({x}^{\mathrm{new}}\) (of the assigned function) by the old pre-activation value (the old sum of products) \({x}^{\mathrm{old}}\) to generate the instant learning ratio (Δℓ) of the neuron during that iteration.

ILRML then uses the factor Δℓ to update the old values of the neuron's inputs and their weights by multiplying them according to Algorithms 1 and 2 below.

In fact, the factor \({x}^{\mathrm{new}}\) can be determined (in the implementation of this algorithm) from the inverse of the activation function evaluated at the point \(d\):

$${\varvec{x}}^{{{\text{new}}}} = {\varvec{inverse}} \left( {{\varvec{activation}} {\varvec{function}}} \right)|_{{\varvec{d}}}$$
(1)
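
For example, the intersection points quoted above for \(d=0.8\) can be reproduced directly from the inverse activation functions. The following is a minimal Python sketch of formula (1); the slope of the linear curve in Fig. 2 is not stated in the text, so the linear case is omitted:

```python
import math

d = 0.8  # instant desired target of the neuron

# x_new = f^{-1}(d) for the activation functions of Fig. 2
x_new_sigmoid = math.log(d / (1 - d))  # ln(4)  ~ 1.3863
x_new_tanh = math.atanh(d)             # ln(3)  ~ 1.0986
x_new_relu = d                         # ReLU is the identity for positive values: 0.8

print(x_new_sigmoid, x_new_tanh, x_new_relu)
```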

As mentioned before, ILRML differs only in its BP, so we discuss two associated algorithms from the BP point of view. Algorithm 1 explains the feed-backward pass during BP from a neuron (in any layer) to all neurons in the previous layer, as demonstrated in Fig. 3 [17], while Algorithm 2 illustrates the feed-backward pass from all neurons (of any layer) to one neuron in the previous layer, as demonstrated in Fig. 4 [17].

Fig. 3

The BP from a neuron to the previous neurons

Fig. 4

The BP from all neurons (within a layer) to a previous neuron

The rest of this paper is organized as follows. Section 2 describes the basic architecture of the IHNC system, its general key formats, and its workflow. Section 3 proposes the architecture of a lite-IHNC; it contains two subsections, the first describing its Encryptor-unit and the second its Decryptor-unit. Section 4 presents a Deep-IHNC (to confirm our idea) as a case study with its results. Finally, the conclusion and future work are given.


Algorithm 1 The BP of the ILRML from a neuron to the previous layer.


Algorithm 2 The ILRML BP from all neurons of a layer to one neuron of the previous layer.
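
The full listings of Algorithms 1 and 2 appear in the figures above. The Python sketch below reproduces only the core update relations of Algorithm 2 as they reappear later in formulas (10)–(14); the function signature and layer sizes are illustrative assumptions, not the authors' listing. Each previous-layer output is scaled by the square root of the product of the absolute learning ratios, and each weight by its own ∆ℓ divided by the same square root.

```python
import math

def ilrml_backward(prev_outputs, weights, targets, inv_f):
    """ILRML BP from all neurons of a layer back to the previous layer
    (core relations of Algorithm 2, cf. formulas (10)-(14))."""
    n_prev, n_cur = len(prev_outputs), len(targets)

    # Old pre-activations (sums of products) of the current layer
    pre_old = [sum(prev_outputs[c] * weights[c][i] for c in range(n_prev))
               for i in range(n_cur)]

    # Instant learning ratios, formula (10): ratio_i = f^{-1}(d_i) / pre_old_i
    ratios = [inv_f(targets[i]) / pre_old[i] for i in range(n_cur)]

    # Common scaling factor shared by the previous outputs and the weights
    scale = math.sqrt(math.prod(abs(r) for r in ratios))

    # Updated previous-layer outputs and weights (formulas (11)-(14))
    new_prev = [p * scale for p in prev_outputs]
    new_weights = [[weights[c][i] * ratios[i] / scale for i in range(n_cur)]
                   for c in range(n_prev)]
    return ratios, new_prev, new_weights
```

With these two updates, the new sum of products of neuron \(i\) equals its old sum of products multiplied by its own ∆ℓ, i.e. exactly the desired pre-activation \(f^{-1}(d_i)\).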

2 The basic architecture of the IHNC

In this section, we introduce the general architecture of our proposed IHNC. Let us first discuss three major requirements for this design. The first requirement is to find a suitable structure of an Artificial Neural Network (ANN) that can be divided into two complementary sub-ANNs to perform two tasks (an encryption task and a decryption task).

The second requirement is that the number of outputs from the first sub-ANN must equal the number of inputs to the second sub-ANN, so that the ciphered data can be transferred between them.

The third requirement is that the selected ANN should have the same number of inputs and outputs, so that the same encrypted (original) data can be recovered.

According to the previous requirements, neural autoencoders are the best choice for our construction.

The neural autoencoder has two conjugated sub-ANNs (Encoder and Decoder) [18,19,20].

Therefore, we will utilize these two sub-ANNs (autoencoder stages) to carry out our idea as will be explained later on.

The main structure of the IHNC system contains two units (Encryptor and Decryptor), as seen in Fig. 5. The Encryptor-unit includes a full autoencoder (Encoder and Decoder stages), while the Decryptor contains only its Decoder-stage.

Fig. 5

The basic architecture of the proposed IHNC (Encryptor and Decryptor)

All initial values of the Encoder-stage are considered the public key (green key) of the IHNC, while all initial values of the Decoder-stage (Decoder1 or Decoder2) are considered the private key (yellow key) of the IHNC.

The Encryptor-unit carries out two tasks. The first task relies on FP: an instant sample of plain data (original data) is passed through the initial weights (public key) of the Encoder-stage only, to generate an instant code (i-code). This i-code represents the instant ciphered data; it can also be called the compressed code, following autoencoder terminology. The second task (of the Encryptor) relies on BP: the same sample of plain data is passed backward through the initial weights (private key) of Decoder1. This second task generates learning ratios (as an instant key) from all neurons in this decoder. Both the generated instant key (i-key) and instant code (i-code) change with each new sample of plain data fed to the Encryptor-unit.

Both the i-key and the i-code are sent to the Decryptor-unit (directly or through a suitable communication medium).

On the other hand, the Decryptor-unit includes only a decoder-stage (Decoder2), which is exactly the same as Decoder1 in the Encryptor-unit. Decoder2 is loaded with the private key (the initial values of its weights) and receives both the i-code and the i-key to recover the instant original data.

As mentioned before, we have public, private, and instant keys. The public and private keys consist of all initial weight values of the Encoder-stage and Decoder-stage, respectively, while the i-key consists of all learning ratios (∆ℓ) generated from Decoder1.

As in any computer system, all weight values are represented in the double-precision floating-point (Float64) format to obtain high numerical accuracy during the FP and BP calculations through the neural autoencoder.

During the IHNC design, the designers are free to choose any number of neurons inside their autoencoders, and hence any number of weights. Therefore, the size of the keys can vary from one design to another (according to the required security level). This gives flexibility to the designs, because the key size is not fixed as in many recent encryption systems. For example, if the public or private key contains only 10 weights, its size will be 640 bits, while if it contains 1000 weights, it will be 64,000 bits, and so on.

The instant key, in turn, depends on the number of neurons (not the weights) inside Decoder1. For example, if Decoder1 contains 100 neurons, the i-key size will be 6400 bits.
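
As a quick check of these widths (a minimal sketch; the counts are the examples quoted above):

```python
FLOAT64_BITS = 64  # every weight and learning ratio is stored as Float64

def key_size_bits(n_values):
    """Key width in bits when each value (weight or ratio) is one Float64."""
    return n_values * FLOAT64_BITS

print(key_size_bits(10))    # public/private key of 10 weights     -> 640 bits
print(key_size_bits(1000))  # public/private key of 1000 weights   -> 64000 bits
print(key_size_bits(100))   # i-key of a Decoder1 with 100 neurons -> 6400 bits
```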

Further, IHNC designers are free to arrange those weights in a concatenated form, as shown for example in Fig. 6. Here the designer has arranged all the weights of the two keys (public and private) in a regular order, starting with the first weight of the first neuron of the Encoder-stage “\({{(W}_{encr}^{init})}_{j1,k1}\)”, then the second weight of the same neuron “\({{(W}_{encr}^{init})}_{j1,k2}\)”, and so on until all the weights of the first neuron are covered; the rest of the weights are then appended in the same way.

Fig. 6

Bit-format of IHNC parameters (Public, Private, Instant, i-code and Plain data)

Likewise, all learning ratios are arranged inside the i-key, from the first neuron “\({(\Delta {\mathcal{l}}_{decr})}_{\mathrm{1,1}}\)” to the last one “\({(\Delta {\mathcal{l}}_{decr})}_{j,i}\)”.

Moreover, Fig. 6 shows the width of the instant plain data (the total number of autoencoder inputs, or equivalently outputs, × 64 bits). It also shows the width of the instant compressed data, or i-code, that is transferred from the Encryptor to the Decryptor (the total number of Decoder-stage inputs, or equivalently Encoder-stage outputs, × 64 bits).

Furthermore, the designers gain even more freedom if they rely on the Asymmetric Stacked Autoencoder (ASA) principle [21]: they can choose how many layers form the encoder-stage and assign the remaining layers of the asymmetric autoencoder to the decoder unit.

This methodology gives the designs more flexibility, because the same IHNC autoencoder can encrypt the same plain data into several i-codes and i-keys at the same time and send them to several users (those who have compatible decoders) at once.

In this case, the subscribers receive the same information but with different encrypted codes and instant keys. Therefore, they can retrieve the same original information with the different keys agreed between the sender and each recipient.

For instance, assume an IHNC application has an ASA with seven layers, as seen in Fig. 7. It can send three i-codes (i-code1, i-code2, and i-code3) for the same plain data simultaneously from three encoders (Encoder-a, Encoder-b, and Encoder-c), respectively.

Fig. 7

An ASA generates several i-codes from multiple encoders simultaneously

Now let us explain the whole operation of the IHNC system through two flowcharts. The first one carries out the encryption, as demonstrated in Fig. 8. In step 1, the user uploads the public key (as initial weights) to the Encoder in the Encryptor-unit. In step 2, the user also loads the private key (initial weights) into Decoder1 of the Encryptor. From step 3 to step 5, the user loads the instant plain data into the input of the Encryptor-unit and runs FP through all neurons of the Encoder, obtaining the instant ciphered data, or instant code (i-code), from the Encoder-stage of the autoencoder. In steps 6 and 7, the user feeds the same samples of the instant inputs backward to the Decoder-stage (as desired targets) and runs the instant-learning-ratio BP on Decoder1 only, obtaining the instant key (i-key) as the learning ratios (∆ℓ) of all neurons within Decoder1. In the last step, step 8, both the i-key and i-code are sent to the Decryptor-unit.

Fig. 8

The workflow of the encryption process of the basic IHNC

On the other side, another user receives both the i-code and the i-key into the compatible Decryptor (Decoder2), as steps 9 and 11, respectively, in Fig. 9. The user also loads the common private key in step 10. In step 12, the Decryptor-unit updates the received i-code and the initial weights of its Decoder using the received i-key. In the last two steps (13 and 14), the Decryptor runs its FP to recover the instant original data, as explained in the next section.

Fig. 9

The workflow of decryption process of the basic IHNC

3 Architecture and methodology of a lite-IHNC

In this section, we discuss a simple structure of the IHNC system called the lite-IHNC. It operates using the ML algorithms of the instant ∆ℓ discussed in the introduction.

First, before diving into the details (formulas, algorithms, applications), we provide the symbol key for the rest of this research. Assume an ANN includes two subsequent layers “\(j\)” and “\(k\)” (\(k=j+1\)), as seen in Fig. 10. Each layer has “\(i\)” neurons. The weights between them are labeled by two parameters: the first is the number of the source neuron and the other is the number of the destination neuron. For instance, “\({W}_{J1,k\mathrm{i}}\)” indicates the weight between the first neuron of layer “\(j\)” and the “\(i\)”-th neuron of the destination layer “\(k\)”. Moreover, the symbol \({\Delta \mathcal{l}}_{k8}\) indicates the learning ratio of the eighth neuron in layer “\(k\)”, and so on.

Fig. 10

A simple ANN indicating the symbols used in the rest of this research

Now, all details of the lite-IHNC will be discussed in the next sub-sections.

3.1 Encryptor-unit of the lite-IHNC

This subsection presents the main architecture of the Encryptor within the proposed lite-IHNC. It consists of three layers, as illustrated in Fig. 11. Its encoder-stage includes two layers (the input layer “\(j\)” and layer “\(c\)”). Layer “\(j\)” receives 8 inputs (\({p}_{1}:{p}_{8}\)) via 8 weights (\({w}_{p1,j1}:{w}_{p8,j8}\)), and is connected to layer “\(c\)” via another 8 weights (\({w}_{j1,c1}:{w}_{j4,c1}\) and \({w}_{j5,c2}:{w}_{j8,c2}\)).

Fig. 11

The architecture of the Encryptor of the lite-IHNC

All weights in this encoder-stage represent the public key (of size 1024 bits), as shown in the upper segment of Fig. 12 and step (1) of Algorithm 3.

Fig. 12

The bit format of the private and public keys of the lite-IHNC

The decoder-stage (in this Encryptor) contains only one layer, called “\(op\)”. It is connected to layer “\(c\)” via 16 weights (\({w}_{c1,op1}:{w}_{c1,op8}\) and \({w}_{c2,op1}:{w}_{c2,op8}\)). These weights represent the private key (of width 1024 bits), shown as the lower segment in Fig. 12 and loaded in step (2) of Algorithm 3.

An instant data sample is delivered to the encoder inputs (\({p}_{1}:{p}_{8}\)) in step (3) of Algorithm 3.

The FP runs inside the Encryptor-unit to generate the i-codes (\({C1}^{old}\) and \({C2}^{old}\)), as step (4) of Algorithm 3 and as illustrated in formulas (2)–(6). Moreover, it generates the sums of products, called “\({Pre\_activation}^{old}\)”, of the decoder-stage, as demonstrated in formula (7).

$$C1^{old} = {\varvec{f}}\left[ {W_{J1,C1} *{\varvec{f}}\left( {j1} \right) + W_{J2,C1} *{\varvec{f}}\left( {j2} \right) + W_{J3,C1} *{\varvec{f}}\left( {j3} \right) + W_{J4,C1} *{\varvec{f}}\left( {j4} \right)} \right]$$
(2)
$$C1^{old} = {\varvec{f}}\left[ {W_{J1,C1} *{\varvec{f}}(W_{{p1,{\text{j}}1}} {* }P_{1} ) + W_{J2,C1} *{\varvec{f}}\left( {W_{{p2,{\text{j}}2}} {* }P_{2} } \right) + W_{J3,C1} *{\varvec{f}}\left( {W_{{p3,{\text{j}}3}} {* }P_{3} } \right) + W_{J4,C1} *{\varvec{f}}(W_{{p4,{\text{j}}4}} {* }P_{4} )} \right]$$
(3)
$$C2^{old} = {\varvec{f}}\left[ {W_{J5,C2} *{\varvec{f}}\left( {j5} \right) + W_{J6,C2} *{\varvec{f}}\left( {j6} \right) + W_{J7,C2} *{\varvec{f}}\left( {j7} \right) + W_{J8,C2} *{\varvec{f}}\left( {j8} \right)} \right]$$
(4)
$$C2^{old} = {\varvec{f}}\left[ {W_{J5,C2} *{\varvec{f}}(W_{{p5,{\text{j}}5}} {* }P_{5} ) + W_{J6,C2} *{\varvec{f}}\left( {W_{{p6,{\text{j}}6}} {* }P_{6} } \right) + W_{J7,C2} *{\varvec{f}}\left( {W_{{p7,{\text{j}}7}} {* }P_{7} } \right) + W_{J8,C2} *{\varvec{f}}(W_{{p8,{\text{j}}8}} {* }P_{8} )} \right]$$
(5)
$$i\_code = \left[ {\begin{array}{*{20}c} {C1^{old} } \\ {C2^{old} } \\ \end{array} } \right]$$
(6)
$${\text{Pre}}\_{\text{activation}}^{{old}} = \left[ \begin{gathered} {\text{OP}}1 \hfill \\ {\text{OP}}2 \hfill \\ {\text{OP3}} \hfill \\ {\text{OP}}4 \hfill \\ {\text{OP}}5 \hfill \\ {\text{OP}}6 \hfill \\ {\text{OP}}7 \hfill \\ {\text{OP}}8 \hfill \\ \end{gathered} \right] = \left[ {\begin{array}{*{20}c} \begin{gathered} W_{{C1,op1}}^{{init}} \hfill \\ W_{{C1,op2}}^{{init}} \hfill \\ W_{{C1,op3}}^{{init}} \hfill \\ W_{{C1,op4}}^{{init}} \hfill \\ W_{{C1,op5}}^{{init}} \hfill \\ W_{{C1,op6}}^{{init}} \hfill \\ W_{{C1,op7}}^{{init}} \hfill \\ W_{{C1,op8}}^{{init}} \hfill \\ \end{gathered} & \begin{gathered} W_{{C2,op1}}^{{init}} \hfill \\ W_{{C2,op2}}^{{init}} \hfill \\ W_{{C2,op3}}^{{init}} \hfill \\ W_{{C2,op4}}^{{init}} \hfill \\ W_{{C2,op5}}^{{init}} \hfill \\ W_{{C2,op6}}^{{init}} \hfill \\ W_{{C2,op7}}^{{init}} \hfill \\ W_{{C2,op8}}^{{init}} \hfill \\ \end{gathered} \\ \end{array} } \right]*\left[ \begin{gathered} C1^{{old}} \hfill \\ C2^{{old}} \hfill \\ \end{gathered} \right] = \left[ {\begin{array}{*{20}c} \begin{gathered} W_{{C1,op1}}^{{init}} \hfill \\ W_{{C1,op2}}^{{init}} \hfill \\ W_{{C1,op3}}^{{init}} \hfill \\ W_{{C1,op4}}^{{init}} \hfill \\ W_{{C1,op5}}^{{init}} \hfill \\ W_{{C1,op6}}^{{init}} \hfill \\ W_{{C1,op7}}^{{init}} \hfill \\ W_{{C1,op8}}^{{init}} \hfill \\ \end{gathered} & \begin{gathered} *C1^{{old}} \hfill \\ *C1^{{old}} \hfill \\ *C1^{{old}} \hfill \\ *C1^{{old}} \hfill \\ *C1^{{old}} \hfill \\ *C1^{{old}} \hfill \\ *C1^{{old}} \hfill \\ *C1^{{old}} \hfill \\ \end{gathered} & \begin{gathered} W_{{C2,op1}}^{{init}} \hfill \\ W_{{C2,op2}}^{{init}} \hfill \\ W_{{C2,op3}}^{{init}} \hfill \\ W_{{C2,op4}}^{{init}} \hfill \\ W_{{C2,op5}}^{{init}} \hfill \\ W_{{C2,op6}}^{{init}} \hfill \\ W_{{C2,op7}}^{{init}} \hfill \\ W_{{C2,op8}}^{{init}} \hfill \\ \end{gathered} & \begin{gathered} *C2^{{old}} \hfill \\ *C2^{{old}} \hfill \\ *C2^{{old}} \hfill \\ *C2^{{old}} \hfill \\ *C2^{{old}} \hfill \\ *C2^{{old}} \hfill \\ *C2^{{old}} \hfill \\ *C2^{{old}} \hfill \\ \end{gathered} \\ \end{array} } \right]$$
(7)

During the BP (for the decoder-stage only), the same data sample is also delivered to the decoder as desired targets “\({p}_{i}\)”, as step (5) of the same Algorithm 3. According to the instant ML of the learning ratios (introduced in the introduction), each desired target intersects its corresponding neuron's function curve to produce a new “\({Pre\_activation}^{new}\)” value for each neuron “\(i\)” of the output layer “\(op\)”, as in formulas (8) and (9).

These intersections are equivalent to evaluating the neuron's inverse function (as mentioned earlier in formula (1) in the introduction).

$${\text{Pre}}_{{{\text{activation}}}}^{{{\text{new}}}} \left( i \right) = {\text{Pre}}\_{\text{activation}}^{{{\text{Desired}}}} \left( i \right) = f^{ - 1} \left( {p_{{\text{i}}} } \right)$$
(8)
$${\text{Pre}}\_{\text{activation}}^{{{\text{new}}}} = \left[ \begin{gathered} {\text{Pre}}\_{\text{activation}}^{{{\text{Desired}}}} \left( 1 \right) \hfill \\ {\text{Pre}}\_{\text{activation}}^{{{\text{Desired}}}} \left( 2 \right) \hfill \\ {\text{Pre}}\_{\text{activation}}^{{{\text{Desired}}}} \left( 3 \right) \hfill \\ {\text{Pre}}\_{\text{activation}}^{{{\text{Desired}}}} \left( 4 \right) \hfill \\ {\text{Pre}}\_{\text{activation}}^{{{\text{Desired}}}} \left( 5 \right) \hfill \\ {\text{Pre}}\_{\text{activation}}^{{{\text{Desired}}}} \left( 6 \right) \hfill \\ {\text{Pre}}\_{\text{activation}}^{{{\text{Desired}}}} \left( 7 \right) \hfill \\ {\text{Pre}}\_{\text{activation}}^{{{\text{Desired}}}} \left( 8 \right) \hfill \\ \end{gathered} \right] = \left[ \begin{gathered} f^{{ - 1}} \left( {p1} \right) \hfill \\ f^{{ - 1}} \left( {p2} \right) \hfill \\ f^{{ - 1}} \left( {p3} \right) \hfill \\ f^{{ - 1}} \left( {p4} \right) \hfill \\ f^{{ - 1}} \left( {p5} \right) \hfill \\ f^{{ - 1}} \left( {p6} \right) \hfill \\ f^{{ - 1}} \left( {p7} \right) \hfill \\ f^{{ - 1}} \left( {p8} \right) \hfill \\ \end{gathered} \right]$$
(9)

Each new pre-activation value “\({Pre\_activation}^{Desired}(i)\)” is divided by its corresponding old pre-activation value to produce the learning ratio ∆ℓ(i), as in formula (10) and step (6) of the same algorithm. In this case, all generated learning ratios (\({\Delta \mathcal{l}}_{1}: {\Delta \mathcal{l}}_{8}\)) are considered the generated i-key. Thus, here the i-key size is \(8\ \Delta \mathcal{l}\mathrm{s}\times 64\,\mathrm{bits}=512\,\mathrm{bits}\), as illustrated in Fig. 13.

Fig. 13

The i-key bit-format of the lite-IHNC

The generated i-key and i-code are then transferred to the Decryptor-unit, as steps (7) and (8) of the same algorithm.

$$\Delta \ell_{i} = \frac{{{\text{Pre}}\_{\text{activation}}^{{{\text{Desired}}}} \left( i \right)}}{{{\text{Pre}}\_{\text{activation}}^{{{\text{old}}}} \left( i \right)}} = \frac{{f^{ - 1} \left( {p_{{\text{i}}} } \right)}}{{{\text{Sum of products of neuron }}\left( {{\text{op}}_{{\text{i}}} } \right)}}$$
(10)

Algorithm 3 The encryption process of the lite-IHNC.
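
A minimal numeric sketch of this encryption process is given below. The activation function, the positive random weights, and the sample text are illustrative assumptions: the lite-IHNC description does not fix an activation function, so ReLU over positive values is used here (as in the Deep-IHNC case study of Sect. 4), which makes its inverse simply the identity.

```python
import math, random

relu = lambda x: max(0.0, x)
inv_relu = lambda y: y                      # inverse of ReLU for positive values

random.seed(1)
pos = lambda: random.uniform(0.1, 1.0)      # illustrative positive initial weights

# Public key: 16 encoder weights (8 input->j plus 8 j->c), 16 * 64 = 1024 bits
w_pj = [pos() for _ in range(8)]            # w_p1,j1 .. w_p8,j8
w_jc = [pos() for _ in range(8)]            # w_j1,c1..w_j4,c1 and w_j5,c2..w_j8,c2
# Private key: 16 decoder weights (c1,c2 -> op1..op8), also 1024 bits
w_c_op = [[pos() for _ in range(8)] for _ in range(2)]

p = [float(ch) for ch in b"IHNC-dem"]       # one instant plain-data sample (ASCII)

# Task 1: FP through the encoder only -> i-code (formulas (2)-(6))
j_out = [relu(w_pj[i] * p[i]) for i in range(8)]
c1 = relu(sum(w_jc[i] * j_out[i] for i in range(4)))
c2 = relu(sum(w_jc[i] * j_out[i] for i in range(4, 8)))
i_code = [c1, c2]

# Task 2: BP through Decoder1 only -> i-key (formulas (7)-(10))
pre_old = [w_c_op[0][i] * c1 + w_c_op[1][i] * c2 for i in range(8)]  # formula (7)
i_key = [inv_relu(p[i]) / pre_old[i] for i in range(8)]              # formula (10)
```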


3.2 Decryptor-unit of the lite-IHNC

This subsection presents the main architecture of the Decryptor within the lite-IHNC. It includes only a decoder-stage (Decoder2), which is exactly the same as Decoder1 in the Encryptor-unit, as demonstrated in Fig. 14. It is loaded with the initial values (private key) in step (1) of Algorithm 4, receives the previously generated i-key (\({\Delta \mathcal{l}}_{1}: {\Delta \mathcal{l}}_{8}\)) in step (2), and receives the i-code in step (3). It then carries out the following two points.

Fig. 14

The architecture of the Decryptor of the lite-IHNC

First, it updates the received i-codes (\({C1}^{old}\) and \({C2}^{old}\)), according to the formula of step (9) in Algorithm 2, into (\({C1}^{new}\) and \({C2}^{new}\)), as illustrated in formulas (11) and (12) and step (5) of Algorithm 4.

$$C1^{{{\text{new}}}} = C1^{{{\text{old}}}} *\sqrt {\mathop \prod \limits_{i = 1}^{8} \left| {\Delta \ell_{i} } \right|}$$
(11)
$$C2^{{{\text{new}}}} = C2^{{{\text{old}}}} *\sqrt {\mathop \prod \limits_{i = 1}^{8} \left| {\Delta \ell_{i} } \right|}$$
(12)

Second, all of the Decoder's weights are updated according to the formula in step (8) of Algorithm 2. Here, only two formulas are written (\({W}_{C1,op1}^{new}\) and \({W}_{C2,op1}^{new}\)), for neuron 1, as illustrated in formulas (13) and (14) and step (4) of Algorithm 4.

$$W_{C1,op1}^{new} = (W_{C1,op1}^{old} *\Delta \ell_{1} )/ \sqrt {\mathop \prod \limits_{i = 1}^{8} \left| {\Delta \ell_{i} } \right|}$$
(13)
$$W_{C2,op1}^{new} = (W_{C2,op1}^{old} *\Delta \ell_{1} )/ \sqrt {\mathop \prod \limits_{i = 1}^{8} \left| {\Delta \ell_{i} } \right|}$$
(14)

After that, the Decryptor-unit needs only FP to recover the original data from Decoder2. One sample of this recovered data, from the first neuron, is demonstrated in formula (15).

$$OP1^{{{\text{new}}}} = {\varvec{f}}(C1^{{{\text{new}}}} *W_{C1,op1}^{{{\text{new}}}} + C2^{{{\text{new}}}} *W_{C2,op1}^{{{\text{new}}}} )$$
(15)

Substituting formulas (11), (12), (13), and (14) into formula (15) gives formula (16) and hence (17).

$$OP1^{new} = {\varvec{f}}\left( {C1^{old} *\sqrt {\mathop \prod \limits_{i = 1}^{8} \left| {\Delta \ell_{i} } \right|} *W_{C1,op1}^{old} *\frac{{\Delta \ell_{1} }}{{\sqrt {\mathop \prod \nolimits_{i = 1}^{8} \left| {\Delta \ell_{i} } \right|} }} + C2^{old} \sqrt {\mathop \prod \limits_{i = 1}^{8} \left| {\Delta \ell_{i} } \right|} *W_{C2,op1}^{old} *\frac{{\Delta \ell_{1} }}{{\sqrt {\mathop \prod \nolimits_{i = 1}^{8} \left| {\Delta \ell_{i} } \right|} }}} \right)$$
(16)
$$OP1^{new} = {\varvec{f}}[\Delta \ell_{1} *\left( {C1^{old} *W_{C1,op1}^{old} + C2^{old} *W_{C2,op1}^{old} } \right)]$$
(17)

From formula (10), we obtain the learning ratio (\({\Delta \mathcal{l}}_{1}\)) of the first neuron, as seen in formula (18); the outputs of all the remaining neurons (\({OPi}^{new}\)) follow in the same way, as step (6) of Algorithm 4.

$$\Delta \ell_{1} = \frac{{f^{ - 1} \left( {p_{1} } \right)}}{{{\text{Sum of products of neuron }}\left( {{\text{op}}_{{1}} } \right)}} = \frac{{f^{ - 1} \left( {P1} \right)}}{{C1^{{{\text{old}}}} *W_{C1,op1}^{{{\text{old}}}} + C2^{{{\text{old}}}} *W_{C2,op1}^{{{\text{old}}}} }}$$
(18)

Substituting \({\Delta \mathcal{l}}_{1}\) from formula (18) into (17) gives formula (19):

$$OP1^{{{\text{new}}}} = f\left[ {f^{ - 1} \left( {P1} \right)} \right] = P1\mathop \Rightarrow \limits^{leads \; to} the \; 1st \;{\text{Original input}}\_{\text{data}}$$
(19)

Algorithm 4 The decryption process of the lite-IHNC.
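
Continuing the encryption sketch at the end of Sect. 3.1 (same illustrative values and assumptions), the decryption side recovers the original sample exactly, as formula (19) predicts:

```python
# The Decryptor holds only the private key (w_c_op) and receives i_code and i_key.
scale = math.sqrt(math.prod(abs(r) for r in i_key))

c1_new = i_code[0] * scale                                   # formula (11)
c2_new = i_code[1] * scale                                   # formula (12)
w_new = [[w_c_op[c][i] * i_key[i] / scale for i in range(8)]
         for c in range(2)]                                  # formulas (13)-(14)

recovered = [relu(c1_new * w_new[0][i] + c2_new * w_new[1][i])
             for i in range(8)]                              # formula (15)
print(bytes(round(x) for x in recovered))                    # b'IHNC-dem'
```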


4 Case study and results of a deep-IHNC

In the previous section, we introduced a lite-IHNC to simplify our idea; in this section we propose a deeper one. The deeper the proposed Deep-IHNC architecture, the harder its i-codes and i-keys are to crack, and the safer its original data is from being predicted.

In this section, we do not analyze this application mathematically, because it involves equations similar to those above, only applied to a deeper network. Instead, we show its behavior while encrypting a text and then decrypting it with 100% accuracy.

Therefore, we prepared a more complex IHNC system based on an asymmetric autoencoder. All parameters in this application are declared with the double-precision data type (64 bits) to obtain high accuracy during the computations.

The proposed Deep-IHNC supports task parallelism by handling 8 characters (via 8 inputs) simultaneously.

Its Encryptor-unit contains seven layers (\({L}_{a}, {L}_{b},{L}_{c},{L}_{d},{L}_{e},{L}_{f}\;{ and }\;{L}_{g}\)), as shown in Fig. 15. To add complexity to our design, the encoder-stage of the Encryptor-unit differs completely from its decoder-stage. The encoder-stage contains five layers (\({L}_{a}, {L}_{b},{L}_{c},{L}_{d},{L}_{e}\)), while its decoder contains two layers (\({L}_{f},{L}_{g}\)). Moreover, the encoder-stage has 8 inputs (\({p}_{1}:{p}_{8}\)) divided into two groups, (\({p}_{1}:{p}_{4}\)) and (\({p}_{5}:{p}_{8}\)). Each group is fully connected to two neurons, (\({L}_{a1},{L}_{a2}\)) and (\({L}_{a3},{L}_{a4}\)) respectively, through 8 weights per group.

Fig. 15

The architecture of the proposed Encryptor of the Deep-IHNC

A different interface for the decoder-stage output is designed to give the autoencoder more asymmetry. The 8 neurons \((o{p}_{1}:{op}_{8})\) of the output layer “\({L}_{g}\)” are fully connected with the 3 neurons of layer “\({L}_{f}\)” via 24 weights.

The input layer “\({L}_{a}\)” can accept 8 characters concurrently via its 8 inputs (\({p}_{1}:{p}_{8}\)). This number of inputs can be increased or reduced depending on the designed system requirements. Each input accepts a complete character in ASCII-code format.

We select the ReLU activation function for all neurons in this system, because all inputs are ASCII codes with positive values greater than one.
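
For instance, an 80-character input (as in Fig. 16) would be split into ten blocks of eight Float64 ASCII codes before being fed to layer \({L}_{a}\). A small sketch with a placeholder text:

```python
text = "An illustrative plain-text message for the Deep-IHNC demonstration"
text = text.ljust(80, ".")[:80]                # pad/trim to 80 characters

# Eight characters per encryption cycle -> ten blocks of eight ASCII codes
blocks = [[float(ord(ch)) for ch in text[i:i + 8]] for i in range(0, 80, 8)]
print(len(blocks), blocks[0])                  # 10 blocks; e.g. [65.0, 110.0, ...]
```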

According to the architecture of the proposed Deep-IHNC, the encoder-stage has (4 + 4 + 12 + 6 + 2 + 2 = 30) weights, so its public key is (30 × 64 = 1920) bits long. Moreover, its decoder-stage includes (6 + 24 = 30) weights, so its private key is 1920 bits long and its i-key is ((3 + 8) ∆ℓs × 64 = 704) bits long. Its ciphered-character i-code is compressed data of (2 × 64 = 128) bits. Therefore, the total number of bits that represents 8 characters simultaneously is (128 + 704 = 832) bits.

Our scenario here is to encrypt a text of 80 characters, as seen in Fig. 16. The Encryptor is fed forward with sequential sets of characters (eight characters per encryption cycle), and the FP runs to generate the relevant i-codes, as shown in Fig. 17. The figure shows the changes of the 10 i-codes during the encryption operations for the 10 sets of characters.

Fig. 16

The input text (sets of instant plain data) containing the characters to be encrypted

Fig. 17

The generated i-codes (in our scenario) from the Deep-IHNC

Further, the decoder-stage of the Encryptor is fed backward with the same character sets (as their desired targets) at the output layer “\({L}_{g}\)”. Thus, its BP ML runs to generate ten i-keys based on the previous formulas. Each i-key is represented by eleven learning ratios, (\({\Delta \mathcal{l}}_{a}\): \({\Delta \mathcal{l}}_{g})\) and \(({\Delta \mathcal{l}}_{f1} :{\Delta \mathcal{l}}_{f3}\)), as demonstrated in Fig. 18, which displays the sequence of changes of 10 samples of i-keys during the BP.

Fig. 18

The changes of the i-key (10 data samples) in the Deep-IHNC

Furthermore, after each encryption cycle, the Encryptor-unit sends its generated i-key (designated for 8 characters) with its associated i-code to the Decryptor-unit.

On the other hand, the Decryptor-unit is similar in architecture to the decoder-stage of the Encryptor. It has two layers, an input layer “\({L}_{p}\)” and an output layer “\({L}_{q}\)”, with fully connected weights, as illustrated in Fig. 19.

Fig. 19

The Decryptor architecture of the Deep-IHNC

During each decryption cycle, the Decryptor is loaded with the same agreed private key. Once it has received the “i-code” and “i-key”, it performs four main tasks. In the first task, it uses the received i-key to update the received i-code, as seen in Fig. 20, which displays the 10 received samples of old i-codes and their updated (new) i-codes within the Decryptor-unit.

Fig. 20

The updating of the received i-code in the Deep-IHNC

In the second task, the received i-key is also used to update all the initial weights (loaded from the private key) inside the Decryptor, as illustrated in Fig. 21. The figure shows the old values of all the weights (private key) of all neurons (\({q}_{1}:{q}_{8}\)) in layer “\({L}_{q}\)”, along with their updated values, during 10 sequential decryption samples.

Fig. 21

The updating of all weights of the layer “\({L}_{\mathrm{q}}\)” of the Deep-IHNC

In the third task, the Decryptor runs its FP to restore the original sets of characters. Figure 22 shows the recovered original characters across all outputs “\({Out}_{1}:{Out}_{8}\)” from the eight neurons (\({q}_{1}:{q}_{8}\)) of layer “\({L}_{q}\)”.

Fig. 22

All recovered characters from layer “\({L}_{\mathrm{q}}\)” in the Deep-IHNC

Eventually, in the fourth task, the Decryptor-unit collects all the character sets retrieved from the outputs of layer “\({L}_{q}\)” in the same sequence in which they were encrypted, as shown in Fig. 23. Comparing the input text with the output text, we find that they are exactly the same.

Fig. 23

The recovered original characters on the output-text from the Deep-IHNC

Eventually, there is a trade-off between building a deeper ANN and its cost and processing speed. Many researchers recommend implementing such systems using a hardware description language (such as VHDL) on an FPGA [22,23,24,25,26,27].

The mentioned IHNC structures can be realized using software programs. However, to obtain higher processing speed, hardware implementations are recommended. In this case, the designers will face some challenges while writing the IHNC system in VHDL code. We can summarize these challenges in the following points:

  1. A new datatype (according to the mantissa-and-exponent rule) must be added in the VHDL code to represent all parameters in Float64 (its bit layout is sketched after this list).

  2. All IHNC parameters (plain data, weights, learning ratios, outputs, etc.) must be declared with this added Float64 datatype.

  3. Three VHDL functions based on the mantissa-and-exponent rule must be written to carry out the following:

  • Function1 performs multiplication of Float64 values (the products during FP).

  • Function2 performs summation of Float64 values (the sums of products during FP).

  • Function3 performs division of Float64 values (the learning ratios ∆ℓ during BP).
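
The fragment below only sketches, in Python, the IEEE-754 Float64 bit layout (1 sign bit, 11 exponent bits, 52 mantissa bits) that such VHDL functions would have to manipulate; it is not a VHDL implementation:

```python
import struct

def float64_fields(x):
    """Split a Float64 into its IEEE-754 sign, exponent and mantissa fields."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]   # the raw 64 bits
    sign = bits >> 63                                     # 1 sign bit
    exponent = (bits >> 52) & 0x7FF                       # 11 exponent bits (bias 1023)
    mantissa = bits & ((1 << 52) - 1)                     # 52 fraction (mantissa) bits
    return sign, exponent, mantissa

print(float64_fields(0.8))   # fields a hardware multiplier/divider would operate on
```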

4.1 Conclusion and hope

In this research, we proposed a new hybrid cryptography methodology. It is realized using an asymmetric autoencoder based on the new machine-learning concept called the instant learning ratio “∆ℓ”. This hybrid cryptography, the “IHNC”, supports task parallelism.

It encrypts multiple data items concurrently using the public key and converts them into a compressed instant code (i-code). Further, it generates an instant key (i-key) with every encryption process. Moreover, it uses the instant code, the instant key, and the private key to recover the original data simultaneously.

The proposed hybrid neuro-cryptography is not tied to a specific neural autoencoder architecture, so each cryptography designer can get creative with her/his design to achieve the required complexity.

In the future, I hope to design and implement this hybrid cryptographic system using VHDL on an FPGA, so that the encryption/decryption of each data block takes only a few clock cycles.

Eventually, I hope to use this proposed hybrid cryptographic rule in an application containing several end-to-end Encryptor/Decryptor units.