Introduction

The whole world is currently struggling with the Covid-19 pandemic, and medical experts are on the front line of this battle. A Covid-19 records database is required for the smooth functioning of all strategic activities. Such a database includes each patient’s personal details, travel history, contact history, and diagnosis reports with the corresponding medical images (X-ray, MRI, PET scans, etc.). Travel history records the mode of travel, such as car, train, bus, or flight. Contact history records the friends, colleagues, family members, and others with whom the person has been in physical contact. From the travel and contact histories, the government can trace suspected cases and take strategic decisions such as quarantine, lock-down, and sanitization in the affected areas, for example offices, residential colonies, airports, railway stations, and bus stands. This helps to manage both patients and society at large and to prevent the situation from worsening. In countries like India, the government has categorized the whole country into three zones: Red, Orange, and Green. Depending on the color code assigned to a district, restrictions are relaxed or tightened. To return to normal routine life, the government needs to decide how to allow passengers to travel by air, rail, and road. At present, people need a medical report from a hospital to travel by various modes of transport. As a result, many people stand in long queues near hospitals to get a Covid-19 report. Managing the crowd and providing essentials such as drinking water and food is a further issue. There is also a chance of malpractice in issuing these medical certificates; for example, a person can pay a bribe to get one.
In this regard, the genuine Covid-19 history of a passenger plays an important role in issuing tickets. Hence, an information technology (IT)-based Covid-19 status checking system is required everywhere. Through IT, a common database of Covid-19 records can be maintained. By accessing this database, the concerned authority can easily verify a person’s Covid-19 status; e.g., railway authorities can refuse to issue a ticket to a person whose status is Covid-19 positive. Modern technology plays a key role in winning this battle. Digitization in the medical field has increased both the speed and the performance of diagnosis. Also, owing to the rapid growth of Internet technology, Covid-19 patient records can be shared among the concerned experts in one click. Quick sharing of information benefits all concerned authorities, and medical experts can perform telediagnosis from anywhere at any time.

However, sharing such records openly over the Internet is not secure. Malicious actors, such as terrorists, wait for such situations to pursue their agenda, and cyber attacks are possible while such records are being shared. From a security point of view, concerns of confidentiality, integrity, and availability are raised. Confidentiality means preserving privacy by encrypting data. Integrity means that the data are not modified by an unauthorized user during transmission. Availability means that authenticated users, such as medical experts, government officers, and other agencies, can read and modify records anytime and anywhere. So, a common shared storage server is needed to maintain the records, and the records must be stored in encrypted form only. Commonly used storage servers are centralized, so a single point of failure may occur. To avoid this, IPFS (InterPlanetary File System) is preferable. IPFS provides decentralized, tamper-proof, and immutable storage services.

The remaining sections of this paper are organized as follows: “Related Work” discusses the related work; “Preliminaries” briefly gives the preliminaries required for the proposed system. The framework and workflow of the proposed scheme are described in “Proposed System Design”. Security and performance analysis is discussed in “Security and Performance Analysis”. The conclusions are drawn in the last section.

Related Work

Digital imaging and Internet technology are useful in the medical, government, and other concerned fields during the Covid-19 pandemic. Telemedicine, telesurgery, and telediagnosis are some examples in the medical field. The government requires Covid-19 patients’ travel and contact histories to avoid further complications and to maintain law and order smoothly. Other fields, like banks and travel agencies, also need patients’ status to carry out their activities. So, to implement the proposed system, the design of a storage system, as well as access through an authentication protocol, is required. To address confidentiality, data encryption is also needed. In view of the above points, some common cryptographic approaches are reviewed here.

Cryptography provides various algorithms for designing secure systems, such as DES (Data Encryption Standard), AES (Advanced Encryption Standard), secure cryptographic hashing, digital certificates, digital signatures, El Gamal, RSA, and elliptic curve cryptography. Cryptographic hash functions are very useful in digital security systems, for example for digital fingerprinting of messages, message authentication, and key derivation [1]. A hash function maps data of any length to a fixed-length bit string. Hash functions are deterministic, one-way (i.e., not reversible), quick to compute, and exhibit the avalanche effect [2, 3]. Because of this, they are an important tool in modern security systems [4]. In modern security applications, researchers use different cryptographic hash functions such as MD-5, SHA-256, RIPEMD-160, Whirlpool, bcrypt, BLAKE3, and many more. Among these, SHA-256 is widely used and is considered secure. SHA-256 was developed by the US National Security Agency (NSA) in 2001 [5]. The SHA-256 algorithm takes a message of arbitrary length as input and produces a 256-bit message digest as output.
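The fixed-length output and the avalanche effect mentioned above can be illustrated with a minimal sketch using Python's standard hashlib module. The helper names sha256_hex and differing_bits and the sample messages are illustrative assumptions, not part of the proposed system.

```python
import hashlib

def sha256_hex(message: bytes) -> str:
    """Return the SHA-256 digest of `message` as a 64-character hex string."""
    return hashlib.sha256(message).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

# Two hypothetical messages that differ in a single character.
d1 = sha256_hex(b"Covid-19 record 1")
d2 = sha256_hex(b"Covid-19 record 2")

print(len(d1) * 4)             # digest length in bits, independent of input length
print(differing_bits(d1, d2))  # number of flipped output bits (avalanche effect)
```

For a well-behaved hash function, roughly half of the 256 output bits are expected to change for such a small change in the input.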

Apart from these conventional cryptographic algorithms, homomorphic encryption has also been designed. Homomorphic encryption makes it possible to perform computations in the encrypted domain. Computation in the encrypted domain is complex and costly, and is mainly used in high-security applications like data aggregation, secure bidding systems, and cloud computing [6]. Homomorphism is the property that preserves the original properties and associations within an information set after it is converted into another information set. In cryptography, homomorphic functions allow computation on homomorphically encrypted information without decrypting it. The result of the computation is in cipher form and, after decryption, it is the same as the result of the computation in the plain domain [7]. Homomorphic cryptosystems fall into two categories: fully and partially homomorphic. A partially homomorphic cryptosystem supports only a restricted set of computations, typically either addition or multiplication. A fully homomorphic cryptosystem allows all computations. Craig Gentry [8] designed a fully homomorphic system using lattice theory. Partially homomorphic algorithms like DGK, Paillier, and El Gamal are studied in [9]. Paillier supports homomorphic addition of two plaintexts and multiplication of a plaintext by a known constant. The Paillier method is computationally comparable to RSA and uses modular arithmetic. In practice, Paillier is useful in systems like secure distortion computation and secure transform [10, 11].

Concerning authentication, researchers have proposed various techniques based on biometrics, PINs, smart cards, challenge-response, multi-factor and multi-level schemes, etc. However, in practice, authentication using UserId and Password is commonly used because it is easy for users and cheap to implement [12, 13]. Hence, companies and institutes often prefer password-based systems [14]. However, passwords are often crackable, mostly because of users’ careless habits [15]. In practice, most users select weak passwords [16, 17] and reuse the same passwords for different services [18, 19]. The selected passwords are common words from the users’ day-to-day world that are easy to memorize [20, 21]. User credentials like UserId and Password are kept in the Authentication Data Table (ADT) on the server. If the security of the server that stores the ADT is low, an attacker can compromise the server and mount passive attacks on the ADT [22]. Cryptographic hash functions are irreversible and are commonly used in authentication techniques [23, 24]. In practice, passwords are hashed and then stored in the ADT [25]. But such direct hash values in the ADT are not safe because of precomputed table attacks such as lookup table and rainbow table attacks [26]. To avoid these attacks, salting of hashed passwords is used: a random value (the salt) is combined with the password before hashing, so that identical passwords produce different hashes. But salted passwords are still not safe against a dictionary attack, in which guessed passwords taken from a dictionary of likely words are hashed and matched against the stored value. To hinder this attack, key stretching can be applied. Key stretching transforms a weak password into a stronger key by making each guess more expensive to compute. But the key stretching technique is not well protected against the Narrow-Pipe Attack [27]. Authentication based on Encrypted Negative Password (ENP), which uses a negative database, is proposed by Luo et al. [28]. ENP offers much stronger protection against lookup table and dictionary attacks. However, generating negative passwords is more complex. In the proposed password-based secure authentication, the encryption keys depend on the input credentials; hence, the overall security of the authentication system is strengthened. Also, an attack analysis of the authentication data table with various metrics is performed, and the results indicate a robust and secure authentication system.
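As background for the salting and key stretching techniques reviewed above, the following is a minimal sketch using PBKDF2-HMAC-SHA256 from Python's standard library. It illustrates the conventional approach only, not the authentication scheme proposed in this paper; the function names, the 16-byte salt, and the iteration count are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes = None, iterations: int = 100_000):
    """Return (salt, derived_key); PBKDF2 acts as the key stretching step."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for every user (assumed size)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, key

def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 100_000) -> bool:
    """Recompute the stretched key and compare it in constant time."""
    _, key = hash_password(password, salt, iterations)
    return hmac.compare_digest(key, expected)

# Two users with the same password get different stored values
# because their salts differ.
salt_a, stored_a = hash_password("secret123")
salt_b, stored_b = hash_password("secret123")
print(stored_a != stored_b)
print(verify_password("secret123", salt_a, stored_a))
```

Because the salt differs per user, a precomputed lookup or rainbow table cannot be reused across entries, and the iteration count raises the cost of each dictionary guess.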

In the proposed system, the Covid-19 report consists of personal details, travel history, contact history, medical diagnosis reports, and medical images like X-rays, all of which are considered in image form (scanned documents) only. So, every record/report is converted to an image and used in the proposed architecture. Such records could be stored in the cloud; but, as a cloud is a centralized system, if a single cloud model fails, it creates the problem of unavailability [29]. Nowadays, the InterPlanetary File System (IPFS), a decentralized storage technology, is acting as an alternative to the cloud. Recently, the use of IPFS in combination with blockchain has been growing rapidly [29,30,31]. Chen et al. [31] proposed a P2P file system using IPFS and blockchain technology. Zheng et al. [30] proposed an IPFS-based blockchain storage system to distribute the load. Jin Sun et al. [29] used IPFS and blockchain for the storage of medical records.

As IPFS is an external entity, the doubt of trust is always there. Hence, it is mandatory to encrypt data before outsourcing the storage on IPFS. Here, different image encryption techniques for encrypting Covid-19 medical images are reviewed.

Classical cryptosystems such as the block-based symmetric key systems AES (Advanced Encryption Standard) and DES (Data Encryption Standard), and asymmetric methods like RSA (Rivest Shamir Adleman), consume more time. Moreover, applying RSA to encrypt images creates many security issues, as described in [32]. The comparison of classical symmetric ciphers and similar ciphers is given in Table 1:

Table 1 Comparison of symmetric ciphers

However, because of the bulk capacity of an image, the above and similar methods take a very long time for encryption. Image cryptosystems based on chaotic maps [33,34,35,36] are popular because of their easy implementation and effective security levels. These techniques use the Fridrich model, which has two stages: confusion and diffusion. Confusion is also known as permutation. In confusion, pixels are shuffled, and in diffusion, pixel values are modified. To increase the security level of the image cryptosystem, permutation and diffusion are executed \({\mathcal {M}}\) and \({\mathcal {N}}\) times, respectively. The keystreams required for permutation and diffusion are produced using chaos-based functions. Chaotic systems of different kinds (1-D, 2-D, and higher-dimensional) can be applied in different methodologies. Higher-dimensional systems generate more complicated chaotic sequences, which makes the chaotic behavior very hard to guess. On the other hand, the development cost of these higher-dimensional methodologies increases due to their high complexity [37]. Chaotic functions such as the Henon, Logistic, Arnold cat, and Baker maps are the most widely used in image cryptosystems. Apart from these standard maps, researchers have also used their own constructions to generate chaotic values. Cao and Mao [38] proposed a 2-dimensional infinite collapse map using the sin() function. Mansouri and Wang [39] proposed a one-dimensional sine-powered chaotic map that uses a sine map and two control values. On the other hand, to decrease the number of confusion–diffusion rounds and to achieve more security, bit-level permutation and diffusion techniques are used in [35, 36]. Rubik’s cube movement logic is used to perform confusion of image pixels in [40]; during diffusion, the key generation process involves prime factorization. Zhou and Wang [41] used a 5-D conservative hyper-chaotic system to generate pseudo-random series in the confusion–diffusion phases. But, as the dimensions increase, the complexity also increases. Parallel computing is also applied in [42] to encrypt a group of images preemptively by utilizing the processors of a modern computer system. In [43], a chaos-based method is proposed in which the generated keystream depends on the input image. A methodology-wise comparison of some existing schemes is given in Table 2:

Table 2 Comparison of existing cryptosystems

As discussed, the above approaches and many more are used by researchers for keystream generation and image encryption; each methodology has advantages as well as limitations. The proposed Covid-19 record cryptosystem is based on the Fridrich architecture and uses keystreams generated from a 1-D logistic map for pixel confusion. The advantage of a 1-D logistic map is that it is faster and less complex than high-dimensional chaotic maps. The cryptographic hash function SHA-256 is deterministic and irreversible. Using SHA-256, a stream of unique values can be generated easily through simple logic. Hence, in the proposed system, the diffusion keystream is generated through SHA-256 to increase the speed.

The proposed system focuses on checking passengers’ Covid-19 status for the travel ticket issuing authority and on secure sharing of Covid-19 records among medical practitioners. The status checking system depends on the Covid-19 database. The Covid-19 database maintains the details of each Covid-19 patient and is accessible throughout the world over the Internet. The system uses two servers: the cryptomatch server (CMS) and the IPFS server. The CMS is a trusted entity and performs the encryption and decryption of Covid-19 records. Additionally, it uploads encrypted records to IPFS and retrieves them from it. The IPFS stores a file and returns a unique hash value for it; this hash value is later used to download the uploaded file. The CMS maintains the Index Table for record-keeping. The Index Table contains four attributes: 1. Patient Id; 2. IPFS returned hash value; 3. Covid-19 records encryption key value (Seed); and 4. Status. Patient Id can be a unique identification number like a social security number or an Aadhaar number (India). To check the status of a passenger, the Status field is read from the Index Table, and the ticket is issued only if the status is either norecordfound or Recovered. Otherwise, the ticket is not issued and the case is immediately reported to the police. The proposed system is designed using password-based secure authentication, an image cryptosystem, IPFS, cryptographic hashing, and the Paillier cryptosystem. The key value (Seed) required for generating the keystream is calculated from the Patient Id. Seed is encrypted using the Paillier cryptosystem to provide higher security.
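The status check described above can be sketched as follows. This is a minimal illustration under stated assumptions: the Index Table is modeled as an in-memory dictionary, and the names IndexEntry and ticket_decision, the return strings, and all sample values are hypothetical placeholders rather than the actual implementation.

```python
from typing import Dict, NamedTuple

class IndexEntry(NamedTuple):
    """One row of the Index Table with its four attributes."""
    patient_id: str
    ipfs_hash: str       # hash returned by IPFS for the encrypted records
    encrypted_seed: int  # Paillier-encrypted Seed
    status: str          # "Active" or "Recovered"

def ticket_decision(index_table: Dict[str, IndexEntry], patient_id: str) -> str:
    """Issue a ticket only if the status is norecordfound or Recovered."""
    entry = index_table.get(patient_id)
    status = entry.status if entry is not None else "norecordfound"
    if status in ("norecordfound", "Recovered"):
        return "issue"
    return "deny-and-report"

# Hypothetical sample rows; identifiers and hashes are placeholders.
table = {
    "P1": IndexEntry("P1", "placeholder-hash-1", 111, "Active"),
    "P2": IndexEntry("P2", "placeholder-hash-2", 222, "Recovered"),
}
print(ticket_decision(table, "P1"))
print(ticket_decision(table, "P2"))
print(ticket_decision(table, "P3"))
```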

Many countries in the world are developing countries. Because of their high population and lack of sufficient IT infrastructure, it is difficult for them to manage data securely and efficiently. The Covid-19 database contains most of its data in the form of digital images (X-ray, CT scan, MRI, etc.). As the data are huge, cloud technology can be a solution for managing them. For rapid sharing of such private data among authenticated users, centralized control is required. However, if data are stored and transmitted in plain form, there will be a privacy issue. Also, researchers have observed that the cloud is honest-but-curious in nature, which means that it follows the defined protocol but may inspect the data. Hence, the proposed solution is given to preserve the privacy of individuals and their personal data. The motivation behind developing the proposed system is to provide secure and efficient sharing of data for travel agencies and medical experts. As Covid-19 reports contain personal details, travel history, contact history, medical diagnosis reports, medical images like X-rays, and other required items, the secure and efficient management of these records becomes essential. Regarding security, the focus is on confidentiality, integrity, and availability (CIA). Confidentiality is achieved through the encryption of records. Integrity assures that data are not changed by an unauthorized user during transmission. Availability means that authenticated users are able to manage the data. Considering these CIA points, the proposed system is designed through secure authentication, an image cryptosystem, and secure data storage. Further, in the future the proposed system could be used, for example, to allow a person with a vaccination history to travel, and the medical data could be used for further research in the medical field.

The highlights of the proposed work are:

  • Covid-19 status checking system for travel ticket issuing authority is proposed.

  • Rapid sharing of Covid-19 records among medical practitioners is done for telediagnosis.

  • A novel password-based authentication protocol is designed to protect passwords in ADT against offline attacks.

  • Random order keystream required for proposed image encryption system is generated using cryptographic hashing (SHA-256).

  • A novel image cryptosystem is proposed for encrypting Covid-19 records.

Preliminaries

In this section, the fundamentals of the proposed system are briefly described. Paillier cryptosystem is used for encryption of Seed. The chaotic map is used to generate keystream used in confusion round of image encryption. The novel storage mechanism IPFS is also discussed briefly.

Paillier Cryptosystem

To understand the Paillier method, assume U and V are two large prime numbers and \({R = U\times V}\). Assume Encrypt() and Decrypt() are the cryptographic functions with public key PK and secret key SK, respectively, where PK is given by (R, g) and g is a generator in \({Z^*}_{R^2}\). For two input values \(k_1, k_2 \in {Z_R}\), the Paillier method has the following properties: (1) Homomorphic add

$$\begin{aligned}&Decrypt(Encrypt (k_1 + k_2 )) \\&\quad = Decrypt(Encrypt(k_1 )* Encrypt (k_2) \bmod R^2 ) \end{aligned}$$
(1)

(2) Homomorphic multiply

$$\begin{aligned}&Decrypt(Encrypt(k_1* k_2)) \\&\quad = Decrypt(Encrypt(k_1 )^ {k_2} \bmod R^2 ) \end{aligned}$$
(2)
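Properties (1) and (2) can be checked numerically with a toy sketch of the textbook Paillier scheme. The sketch assumes the common simplification g = R + 1, uses tiny primes that are far too small for real use, and relies on pow(x, -1, n) for the modular inverse (Python 3.8 or later). The function names are illustrative.

```python
import math
import random

def keygen(U: int, V: int):
    """Return ((R, g), (lam, mu)) for primes U and V; g = R + 1 is an assumption."""
    R = U * V
    g = R + 1
    lam = (U - 1) * (V - 1) // math.gcd(U - 1, V - 1)  # lcm(U-1, V-1)
    L = (pow(g, lam, R * R) - 1) // R                  # L(x) = (x - 1) / R
    mu = pow(L, -1, R)                                 # modular inverse of L mod R
    return (R, g), (lam, mu)

def encrypt(pk, k: int) -> int:
    """Encrypt k in Z_R with fresh randomness r coprime to R."""
    R, g = pk
    while True:
        r = random.randrange(1, R)
        if math.gcd(r, R) == 1:
            break
    return (pow(g, k, R * R) * pow(r, R, R * R)) % (R * R)

def decrypt(pk, sk, c: int) -> int:
    """Recover the plaintext from ciphertext c."""
    R, _ = pk
    lam, mu = sk
    return ((pow(c, lam, R * R) - 1) // R * mu) % R

pk, sk = keygen(61, 53)          # toy primes, for illustration only
R = pk[0]
c1, c2 = encrypt(pk, 12), encrypt(pk, 30)

# Property (1): multiplying ciphertexts adds the plaintexts.
print(decrypt(pk, sk, (c1 * c2) % (R * R)))   # 12 + 30
# Property (2): raising a ciphertext to k2 multiplies the plaintext by k2.
print(decrypt(pk, sk, pow(c1, 5, R * R)))     # 12 * 5
```

Note that property (2) multiplies an encrypted value by a plaintext constant; it does not multiply two ciphertexts.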

Chaotic Function

A chaotic function or map is a mathematical function that produces chaotic order values. The input for the function is either discrete or continuous type value. In dynamical systems study, chaotic functions are frequently used. In the proposed system, a key is produced using a Logistic map.

Logistic Map

This map can generate chaotic behavior using:

$$\begin{aligned} d_{n+1} =\mu d_n (1- d_n ) \end{aligned}$$
(3)

where \(d_n\) is a number in (0,1) that denotes the ratio of the existing population to the maximum possible population. In (3), \(\mu\) is a control parameter in (0,4]; for chaotic behavior it is further restricted to approximately [3.57,4].
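A minimal sketch of iterating the logistic map (3) is given below. The function name logistic_sequence and the sample starting values are illustrative assumptions.

```python
def logistic_sequence(d0: float, mu: float, length: int):
    """Iterate d_{n+1} = mu * d_n * (1 - d_n) and return the generated values."""
    if not 0.0 < d0 < 1.0:
        raise ValueError("d0 must lie in (0, 1)")
    if not 0.0 < mu <= 4.0:
        raise ValueError("mu must lie in (0, 4]")
    values = []
    d = d0
    for _ in range(length):
        d = mu * d * (1.0 - d)
        values.append(d)
    return values

# Two nearby starting points in the chaotic range of mu.
a = logistic_sequence(0.3000000, 3.99, 60)
b = logistic_sequence(0.3000001, 3.99, 60)
print(a[-1], b[-1])
```

In the chaotic range, sequences started from nearly identical values are expected to diverge after a few dozen iterations, which is the sensitivity to initial conditions that makes the map useful for keystream generation.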

InterPlanetary File System (IPFS)

IPFS technology is used to store and retrieve data, files (text, audio, video, image) in a distributed manner. IPFS is a distributed peer-to-peer (p2p) storage network. More specifically, it is a hypermedia distribution protocol in which contents are accessed as distributed identities. IPFS aims to make the web more secure, faster, and more open. Contents are accessible through peers, and the peers can be located anywhere in the network.

IPFS does the following whenever a file is uploaded:

  1. For a given file, a unique cryptographic hash is returned.

  2. IPFS removes duplicate data throughout the network.

  3. Each node in the network stores only the data it is interested in, and required metadata is used for indexing purposes.

  4. Whenever the user wants a file, the file’s hash is given to ask the network to retrieve the content behind that file’s hash.
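The idea of content addressing behind these steps can be illustrated with a toy in-memory store. This is not the IPFS API: real IPFS uses multihash-based content identifiers, chunking, and a peer-to-peer network, whereas the ContentStore class below is a hypothetical single-process stand-in keyed by a plain SHA-256 digest.

```python
import hashlib

class ContentStore:
    """Toy content-addressed store: data are looked up by their own hash."""

    def __init__(self):
        self._blocks = {}

    def add(self, data: bytes) -> str:
        """Store data and return its hash; identical data are stored once."""
        key = hashlib.sha256(data).hexdigest()
        self._blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        """Retrieve the content behind a previously returned hash."""
        return self._blocks[key]

    def __len__(self) -> int:
        return len(self._blocks)

store = ContentStore()
h1 = store.add(b"encrypted record bytes")
h2 = store.add(b"encrypted record bytes")   # duplicate upload
print(h1 == h2, len(store))                  # same hash, stored only once
print(store.get(h1))
```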

As IPFS is a distributed storage system, it splits a single file into multiple chunks and distributes them among various nodes. So, the complete file contents are not available at a single node. Moreover, IPFS uses transport encryption but not content encryption. This means that data are protected while being sent from one IPFS node to another, but the stored content itself is not encrypted by IPFS. In the context of the proposed system, the fact that the complete file is not available at a single node is considered to be safer.

Proposed System Design

Considering the proposed system, the Covid-19 report consists of all the details like personal details, travel history, contact history, medical diagnosis reports, medical images like X-Ray, and they are considered in image form (scanned documents) only. So, every record/report is converted to an image and used in the proposed architecture.

The proposed system consists of various participants, the cryptomatch server, and IPFS, and it is developed using a secure authentication protocol and an image cryptosystem. The system architecture is shown in Fig. 1. The working details of the proposed system are described in the following subsections.

Fig. 1 System architecture

Participants of the Proposed System

Figure 2 shows the participants involved in the proposed system. There are two types of participants: primary and secondary. Primary participants are active participants handling the Covid-19 cases directly. Secondary participants are passive participants who can view the status of any citizen from the Covid-19 database. Covid-19 patients are people undergoing treatment. Medical Experts are the doctors diagnosing Covid-19 patients. The role of the Research laboratory is to perform the required tests, like PCR (polymerase chain reaction) tests, and to update records. The duty of the government is to trace travel history and contacts, quarantine people as required, etc.

Fig. 2 Participants of the proposed system

Primary participants can create, view, and update the Covid-19 patients’ records. The medical record consists of a sequence of images showing patients’ reports and concerned medical images.

Secondary participants also play an important role in stopping Covid-19 from spreading. These participants include all ticket issuing authorities, such as railway, bus transport, and flight service agencies, as well as private vehicle transport service agencies. These participants have the right to view only the Covid-19 status of a person. A representative of these participants accesses the Covid-19 record and checks, using a unique id like AADHAAR, whether the citizen is infected or not. If not infected, the person is allowed to continue activities like traveling or attending school or the office. If the status is active, the representative has to inform the government immediately to avoid further spreading.

Cryptomatch Server

The cryptomatch server (CMS) is the trusted entity. It permits users to access the system after authentication. It also encrypts and decrypts medical records, uploads the encrypted medical records to IPFS, and retrieves them as required. In addition, the CMS maintains the Index table and the Authentication Data Table (ADT). The Index table is used to keep track of all Covid-19 information. The ADT maintains the user information; its details are described in the next subsection. The attributes of the Index table are: 1. Patient Id; 2. Hash returned from IPFS for the encrypted medical records; 3. Paillier encrypted key value (Seed); and 4. Patient status (Active or Recovered). The Seed value is calculated using

$$\begin{aligned} Seed = \sqrt{Hash(patientId)*c} \end{aligned}$$
(4)

where c is the floating point constant derived from hash of the first image of the corresponding patient’s Covid-19 record.

Figure 3 shows sample index table.

Fig. 3 Sample index table

Proposed Authentication

Authentication is the first step to access any service through which the identity of authenticated users is verified. Here, for simplicity purposes, conventional user registration and authentication procedure are not considered. The focus is given on the contents of the Authentication Data Table (ADT). The attributes of ADT are UserId and Password. Concerning the security point of view, the contents of ADT should not be in plain format. In the proposed system, the ADT contains SHA-256 hashed UserId and AES-256 Encrypted Password.

Encrypted Password Generation

Figure 4 shows the encrypted password generation method and is described through the following steps.

  1. Initially, during the registration process, the given UserId and Password are hashed using SHA-256.

  2. Hashed UserId and Hashed Password are concatenated to get the 512-bit string.

  3. Using logistic map (3), ShufflingIndex1 of size 512 is generated.

  4. The obtained 512-bit hash string is shuffled according to ShufflingIndex1, and the result is called MixedHashPassword.

  5. Once again, ShufflingIndex2 of size 512 is obtained by executing logistic map (3).

  6. The first 256 locations of ShufflingIndex2 are considered as KeyIndex.

  7. The 256 bits of MixedHashPassword at the locations given by KeyIndex are grouped. This selected 256-bit substring acts as the AES-256 encryption key.

  8. Using this encryption key, MixedHashPassword is encrypted and stored as the Encrypted Password in ADT.
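Steps 1 to 7 above can be sketched with the standard library alone. Several details are assumptions made only for illustration, since they are not fixed in the description: a shuffling index is derived by ranking the logistic map outputs, ShufflingIndex2 continues the map from the state reached after ShufflingIndex1, and the starting value d0 and control parameter mu are arbitrary constants. Step 8 (AES-256 encryption) is omitted because it requires a third-party cryptography library.

```python
import hashlib

def logistic_index(d: float, mu: float, size: int):
    """Return (permutation of range(size), final map state) from logistic map (3)."""
    values = []
    for _ in range(size):
        d = mu * d * (1.0 - d)
        values.append(d)
    # Assumed construction: rank positions by their chaotic values.
    order = sorted(range(size), key=values.__getitem__)
    return order, d

def to_bits(data: bytes):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> (7 - k)) & 1 for byte in data for k in range(8)]

def from_bits(bits):
    """Pack a list of bits (length divisible by 8) back into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

def derive(user_id: str, password: str, d0: float = 0.3141, mu: float = 3.99):
    """Return (MixedHashPassword, AES-256 key) as bytes; d0 and mu are assumed."""
    h_user = hashlib.sha256(user_id.encode("utf-8")).digest()   # step 1
    h_pass = hashlib.sha256(password.encode("utf-8")).digest()  # step 1
    bits = to_bits(h_user + h_pass)                             # step 2: 512 bits
    index1, state = logistic_index(d0, mu, 512)                 # step 3
    mixed = [bits[j] for j in index1]                           # step 4
    index2, _ = logistic_index(state, mu, 512)                  # step 5
    key_index = index2[:256]                                    # step 6
    key_bits = [mixed[j] for j in key_index]                    # step 7
    return from_bits(mixed), from_bits(key_bits)

mixed_hash_password, aes_key = derive("doctor01", "secret123")
print(len(mixed_hash_password) * 8, len(aes_key) * 8)  # bit lengths
```

In a full implementation, MixedHashPassword would then be encrypted under the derived key with an AES-256 routine and the result stored in the ADT.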

Table 3 illustrates the flow of encrypted password generation. For the sake of simplicity, the intermediate values are represented in hex only.

Fig. 4 Encrypted password generation

Table 3 Illustration of encrypted password generation

Authentication

The authentication procedure is described in Fig. 5. The user is the participant (primary and secondary) of the system, and steps are given below:

Fig. 5 Authentication

  1. Initially, the user inputs the parameters UserId and Password. Then, these parameters are sent to the Cryptomatch server.

  2. The Cryptomatch server calculates the hash value of UserId as H(UserId) using the SHA-256 hashing technique.

  3. The Cryptomatch server searches the ADT for H(UserId). If H(UserId) is found, it reads the corresponding encrypted password from the ADT as eAP. Otherwise, go to step 7.

  4. The input password is encrypted as eIP using the methodology described previously.

  5. The Cryptomatch server compares eIP and eAP.

  6. If eIP and eAP are the same, it returns a positive acknowledgment to the client.

  7. Otherwise, it returns a negative acknowledgment to the client.
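The control flow of these steps can be sketched as below. The ADT is modeled as a dictionary keyed by H(UserId). The function protect is only a stand-in for the encrypted password generation procedure (here a plain SHA-256 over the two hashes), and the names register and authenticate are hypothetical.

```python
import hashlib
import hmac

def protect(user_id: str, password: str) -> bytes:
    """Stand-in for encrypted password generation; NOT the proposed method."""
    h_user = hashlib.sha256(user_id.encode("utf-8")).digest()
    h_pass = hashlib.sha256(password.encode("utf-8")).digest()
    return hashlib.sha256(h_user + h_pass).digest()

def register(adt: dict, user_id: str, password: str) -> None:
    """Store H(UserId) and the protected password in the ADT."""
    adt[hashlib.sha256(user_id.encode("utf-8")).hexdigest()] = protect(user_id, password)

def authenticate(adt: dict, user_id: str, password: str) -> bool:
    """Return True for a positive acknowledgment, False for a negative one."""
    h_user = hashlib.sha256(user_id.encode("utf-8")).hexdigest()  # step 2
    e_ap = adt.get(h_user)                                        # step 3
    if e_ap is None:
        return False                                              # step 7
    e_ip = protect(user_id, password)                             # step 4
    return hmac.compare_digest(e_ip, e_ap)                        # steps 5 to 7

adt = {}
register(adt, "doctor01", "secret123")
print(authenticate(adt, "doctor01", "secret123"))
print(authenticate(adt, "doctor01", "wrong"))
print(authenticate(adt, "unknown", "secret123"))
```

The comparison uses hmac.compare_digest so that the time taken does not reveal how many leading bytes of the two values match.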

Encryption of Medical Record

Covid-19 Patient’s medical record contains personal information, travel history, contact history, and medical reports, along with corresponding medical images. All these records are converted to images. Next, these images are encrypted using a novel image encryption system. The proposed methodology involves Random Keystream Generation using SHA-256. NIST Randomness Tests are performed to check the randomness of generated keystream. It also involves image encryption along with confusion and diffusion procedures.

Secure Hash Algorithm (SHA) 256 Based Random Keystream Generation

An n-bit cryptographic hash function maps messages of arbitrary size into n-bit hash values. It is one-way (i.e., not reversible) and collision-resistant. These hash functions are used in different security systems like digital signatures, blockchain technology, password security, message authentication, etc. In SHA-256, the message hash is generated in the following way:

  (1) The message is padded and its size is appended so that the resulting message length is a multiple of 512 bits.

  (2) The padded message is parsed into 512-bit message groups \(M_1\), \(M_2\),...,\(M_N\). The message groups are processed one at a time: starting with a fixed default hash value Hash(0), the following is computed serially:

    $$\begin{aligned} Hash(i) = Hash(i-1) + C_{M_i} (Hash(i-1)) \end{aligned}$$
    (5)

    In (5), C is the SHA-256 compression function and + is word-wise addition modulo \(2^{32}\). Hash(N) is the hash of message M.

The above hashing technique is used to produce the keystream in the following way:

$$\begin{aligned} H_{i+1} = Hash(H_{i}) \end{aligned}$$
(6)

given that \(H_{0}=Hash(Seed)\).

The function is tested by generating a random keystream of length 256. As an example, \(H_0=35423421\) is taken; each generated key value is reduced modulo 256 and appended to the keystream. Figure 6 shows the distribution of the random keys.
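The hash chain in (6) can be sketched with Python's hashlib as follows. The encoding of Seed as a decimal string, the choice to start the keystream at H_1, and the function name hash_keystream are assumptions made for illustration.

```python
import hashlib

def hash_keystream(seed, length: int):
    """Generate `length` key values in 0..255 from the chain H_{i+1} = Hash(H_i)."""
    # H_0 = Hash(Seed); the seed is encoded as a decimal string (assumption).
    h = hashlib.sha256(str(seed).encode("utf-8")).digest()
    stream = []
    for _ in range(length):
        h = hashlib.sha256(h).digest()                  # H_{i+1} = Hash(H_i)
        stream.append(int.from_bytes(h, "big") % 256)   # reduce modulo 256
    return stream

keystream = hash_keystream(35423421, 256)
print(len(keystream), min(keystream), max(keystream))
```

Since the digest is interpreted as a big-endian integer, reducing it modulo 256 simply selects its last byte.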

Fig. 6 Distribution of random values

Image Cryptosystem

Fig. 7 Medical image encryption

Figure 7 shows the architecture of the proposed Covid-19 image cryptosystem. The proposed methodology uses Fridrich’s confusion–diffusion architecture. Confusion means shuffling pixels, and diffusion means modifying pixel values. Chaotic maps are commonly used to generate keystreams of chaotic values for permutation and diffusion. In the proposed system, the 1-D Logistic map (3) is used to produce the confusion keystream. To generate the keystream required for diffusion, a novel method using the cryptographic hash function SHA-256 is implemented, as described in (6).

Fig. 8 Confusion process

Fig. 9 Diffusion process

Initially, the plain image is converted into a 1-D pixelArray. Using the Logistic map, confusionIndex is produced, and the pixels are shuffled accordingly into confusedArray. The confusionIndex is based on the input values of \(\mu\) and \(d_n\), which are calculated from Seed. The pixelArray is confused to obtain confusedArray as follows:

$$\begin{aligned} confusedArray\leftarrow confusion(pixelArray,confusionIndex) \end{aligned}$$
(7)

Figure 8 illustrates the confusion process for given 4 \(\times\) 4 input image.
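The confusion step in (7) can be sketched as a permutation of the pixel array. How confusionIndex is derived from the logistic map is not spelled out in the text; the sketch assumes it is obtained by ranking the chaotic values, and the sample parameters d0 and mu are arbitrary rather than computed from Seed.

```python
def confusion_index(d0: float, mu: float, size: int):
    """Build a permutation of range(size) by ranking logistic map outputs."""
    values = []
    d = d0
    for _ in range(size):
        d = mu * d * (1.0 - d)
        values.append(d)
    return sorted(range(size), key=values.__getitem__)

def confusion(pixel_array, index):
    """Shuffle pixels: position pos of the output takes pixel index[pos]."""
    return [pixel_array[j] for j in index]

def inverse_confusion(confused_array, index):
    """Undo the shuffle, restoring the original pixel order."""
    original = [0] * len(confused_array)
    for pos, j in enumerate(index):
        original[j] = confused_array[pos]
    return original

# A 4 x 4 image flattened to a 1-D array of 16 pixel values.
pixel_array = list(range(10, 170, 10))
index = confusion_index(0.37, 3.99, len(pixel_array))
confused_array = confusion(pixel_array, index)
print(confused_array)
print(inverse_confusion(confused_array, index) == pixel_array)
```

Confusion only moves pixels, so the histogram of the image is unchanged; this is why a diffusion step is needed afterwards.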

Algorithm 1 ImageEncrypt

Algorithm 2 ImageDecrypt

Next, hashKeyStream is generated using (6) and diffusedArray is computed as follows:

$$\begin{aligned}&diffusedArray[i]\leftarrow {(confusedArray[i]\oplus confusedArray[i-1]) \oplus hashKeyStream[i]} \end{aligned}$$
(8)

Algorithm-1 ImageEncrypt describes the logic of encryption. The functions used for image encryption are reversible. Algorithm-2 ImageDecrypt describes the logic of decryption. Figure 9 illustrates the diffusion process for given confusedArray and hashKeyStream.
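The diffusion rule in (8) and its inverse can be sketched as follows. Equation (8) does not define confusedArray[i-1] for the first pixel; the sketch assumes an initial value iv = 0 for that position. The function names and sample values are illustrative.

```python
def diffuse(confused_array, hash_key_stream, iv: int = 0):
    """diffused[i] = (confused[i] XOR confused[i-1]) XOR key[i]; iv is assumed."""
    diffused = []
    prev = iv
    for c, k in zip(confused_array, hash_key_stream):
        diffused.append(c ^ prev ^ k)
        prev = c  # the next pixel is chained to the current confused pixel
    return diffused

def undiffuse(diffused_array, hash_key_stream, iv: int = 0):
    """Invert the diffusion sequentially, recovering confused[i] left to right."""
    confused = []
    prev = iv
    for d, k in zip(diffused_array, hash_key_stream):
        c = d ^ k ^ prev
        confused.append(c)
        prev = c
    return confused

confused_array = [10, 200, 33, 7]
hash_key_stream = [1, 2, 3, 4]
diffused_array = diffuse(confused_array, hash_key_stream)
print(diffused_array)                                         # [11, 192, 234, 34]
print(undiffuse(diffused_array, hash_key_stream) == confused_array)
```

Because XOR is its own inverse, decryption only needs the same keystream and the same initial value, which is consistent with the statement that the functions used for encryption are reversible.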

Finally, diffusedArray is reshaped into 2-D to get the encrypted image eImage. The permutation and diffusion rounds are run \({\mathcal {M}}\) and \({\mathcal {N}}\) times, respectively, to increase the security level. The sample input images and the corresponding encrypted images are shown in Fig. 10.

Fig. 10 Sample input images and corresponding encrypted images

Uploading Files to IPFS

First, the command \(ipfs\,\, daemon\) is used to connect the user’s PC to the network. Next, once \(ipfs \,\,\,daemon\) has been initialized, a Connection object is created. Files are uploaded and retrieved using the add(.) and get(.) methods of the Connection object. When the user uploads a file to IPFS, it returns the hash of the files and objects in the Multihash format, Base58-encoded, with a digest length of 32 bytes. In the proposed system, all encrypted medical record images are stored in a folder, and this folder is then uploaded to IPFS. In response, IPFS returns a single hash value, which is stored in the Index table of the corresponding patient. Figure 11 shows the uploading of a file to IPFS.

Fig. 11 Uploading file to IPFS
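The shape of a Base58-encoded multihash with a 32-byte digest can be illustrated without a running daemon. The sketch below is an assumption-laden illustration, not the IPFS upload itself: it prefixes a SHA-256 digest with the multihash bytes 0x12 (sha2-256) and 0x20 (length 32) and Base58-encodes the result. The payload and function names are hypothetical, and the string produced is not the identifier IPFS would return for the same bytes, because IPFS hashes its own internal object representation rather than the raw payload.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data):
    """Encode bytes in Base58 (Bitcoin alphabet)."""
    number = int.from_bytes(data, "big")
    chars = []
    while number > 0:
        number, remainder = divmod(number, 58)
        chars.append(B58_ALPHABET[remainder])
    # Each leading zero byte is represented by the character '1'.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + "".join(reversed(chars))


def sha256_multihash_b58(payload):
    """Base58 string of <0x12><0x20><32-byte SHA-256 digest>."""
    digest = hashlib.sha256(payload).digest()
    multihash = bytes([0x12, 0x20]) + digest
    return base58_encode(multihash)


# Hypothetical payload standing in for an encrypted record.
example_hash = sha256_multihash_b58(b"encrypted medical record")
```

With this two-byte prefix the encoded string is always 46 characters long and begins with "Qm", which matches the familiar appearance of hashes returned for uploaded files and folders.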

Security and Performance Analysis

For the simulation of the proposed system, two personal computers, the cryptomatch server and the IPFS server, are used. The cryptomatch server has an Intel Core i5-4570 3.20 GHz \(\times\) 4 processor, 8 GB RAM, and Ubuntu 16.04 LTS OS. The IPFS server is a computer with an Intel(R) Core(TM) i7-8700 CPU @ 3.20 GHz processor, 32 GB RAM, and Ubuntu 18.04.3 LTS. For the implementation of the proposed system, XAMPP with Apache HTTP Server, MySQL ver 14.4 Distrib 5.7.29, and PHP are used. The encryption algorithms are coded in Python 3.6.3. Standard grayscale test images from the USC SIPI database and medical images from NIH (National Institutes of Health) are taken to analyze the performance. Apart from these databases, two additional Covid-19 databases are used for performance analysis, namely (1) the COVID-19 British Society of Thoracic Imaging database and (2) Eurorad COVID-19 cases. These databases are provided by The European Institute for Biomedical Imaging Research (EIBIR).

Here, the performance of the image cryptosystem is analyzed in the following subsections.

NIST (National Institute of Standards and Technology) Randomness Test

The NIST [48] has proposed a statistical test suite that tests the randomness of a binary sequence using 15 distinct subtests. NIST uses the significance level \(\alpha = 0.01\) for the randomness test. The \({\mathcal {P}}\)-value is calculated for each binary sequence in each subtest individually. If the obtained \({\mathcal {P}}\)-value \(\ge \alpha\), the binary sequence passes the randomness subtest. NIST gives two methods for interpreting test results [48]:

(i) Calculation of the pass rate \(({\mathcal {P}}_{r})\) of the sequences passing the subtest. If \(({\mathcal {P}}_{r})\) falls outside the acceptable range given by

$$\begin{aligned} \overline{{\mathcal {P}}}\pm 3 \sqrt{\frac{\overline{{\mathcal {P}}}(1-\overline{{\mathcal {P}}})}{\hat{m}}} \end{aligned}$$
(9)

where \(\overline{{\mathcal {P}}} = 1-\alpha\) and \(\hat{m}\) is the number of tested sequences, then the test sequences are not sufficiently random.

(ii) Calculation of the distribution of the \({\mathcal {P}}\)-values, which lie in the range [0, 1], to verify whether it is uniform.

Table 4 NIST randomness test

The range [0, 1] is partitioned into 10 equal subranges, and \({\mathcal {F}}_{kq}\) denotes the frequency of \({\mathcal {P}}\)-values falling within the \(kq^{th}\) subrange. Let \(\hat{{\mathcal {M}}}\) be the sample size; then \({\mathcal {P}}\)-value\(_{u}\) is calculated using

$$\begin{aligned} \chi ^{2} = \sum ^{10}_{kq=1} \frac{({\mathcal {F}}_{kq}-\hat{{\mathcal {M}}}/10)^{2}}{\hat{{\mathcal {M}}}/10} \end{aligned}$$
(10)

and

$$\begin{aligned} {\mathcal {P}}\text{-}value_{u} = igamc\left(\frac{9}{2}, \frac{\chi ^{2}}{2}\right) \end{aligned}$$
(11)

If \({\mathcal {P}}\)-value\(_{u}\) \(\ge 0.0001\), the \({\mathcal {P}}\)-values can be regarded as uniformly distributed.

For the experiment, a random number sequence of length \(512 \times 512\) is generated using (6). Table 4 shows the results obtained from the randomness test. As seen from Table 4, the proposed hash keystream generation passes all the tests, so it can be used for image encryption.
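The two interpretation rules in (9) to (11) can be sketched with the standard library alone. The helper names are hypothetical. For the half-integer argument 9/2, the complemented incomplete gamma function igamc has a closed form built from erfc, which the sketch uses instead of a numerical library; this choice is an implementation convenience, not something stated in the source.

```python
import math


def pass_rate_interval(alpha, m):
    """Acceptable pass-rate range of (9) for m tested sequences."""
    p = 1.0 - alpha
    half_width = 3.0 * math.sqrt(p * (1.0 - p) / m)
    return p - half_width, p + half_width


def igamc_9_2(x):
    """Closed form of igamc(9/2, x) for x >= 0.

    Uses igamc(1/2, x) = erfc(sqrt(x)) and the recurrence
    igamc(a + 1, x) = igamc(a, x) + x**a * exp(-x) / gamma(a + 1).
    """
    if x <= 0.0:
        return 1.0
    total = math.erfc(math.sqrt(x))
    for j in range(4):
        total += math.exp(-x) * x ** (j + 0.5) / math.gamma(j + 1.5)
    return total


def uniformity_p_value(p_values):
    """P-value_u of (10)-(11) from a list of subtest P-values."""
    m = len(p_values)
    freq = [0] * 10
    for p in p_values:
        freq[min(int(p * 10), 9)] += 1  # p = 1.0 goes to the last bin
    expected = m / 10.0
    chi2 = sum((f - expected) ** 2 / expected for f in freq)
    return igamc_9_2(chi2 / 2.0)


low, high = pass_rate_interval(0.01, 1000)
evenly_spread = [(i + 0.5) / 100 for i in range(100)]
p_uniform = uniformity_p_value(evenly_spread)
p_clustered = uniformity_p_value([0.5] * 100)
```

For 1000 sequences at \(\alpha = 0.01\), the acceptable pass rate is roughly 0.9806 to 0.9994; evenly spread \({\mathcal {P}}\)-values give \({\mathcal {P}}\)-value\(_{u}\) = 1, while \({\mathcal {P}}\)-values clustered in a single subrange fall far below the 0.0001 threshold.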

Correlation Analysis

The correlation among adjacent pixels is calculated by randomly selecting 3000 pairs of adjacent pixels from the input and cipher images separately. If the correlation in the vertical, horizontal, and diagonal directions is close to zero, the level of security of the cryptosystem is considered good. Table 5 compares the correlation of the input images and the corresponding cipher images for the proposed methodology.

Table 5 Correlation analysis

The distribution of connected pixels for the sample lena and \(chest-1\) and the corresponding encrypted images for the horizontal, vertical and diagonal directions are shown in Fig. 12. It shows that the pixel values of the ciphered image are spread uniformly in all directions.

Fig. 12 Correlation analysis
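The sampling procedure for the adjacent-pixel correlation can be sketched as follows. The sketch assumes the usual Pearson correlation coefficient and represents an image as a list of rows; the function name, the fixed random seeds, and the two synthetic test images are hypothetical and stand in for real plain and cipher images.

```python
import math
import random


def adjacent_correlation(image, direction, samples=3000, rng=None):
    """Pearson correlation of randomly sampled adjacent pixel pairs."""
    if rng is None:
        rng = random.Random(0)  # fixed seed for repeatability
    offsets = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}
    dy, dx = offsets[direction]
    height, width = len(image), len(image[0])
    xs, ys = [], []
    for _ in range(samples):
        r = rng.randrange(height - dy)
        c = rng.randrange(width - dx)
        xs.append(image[r][c])
        ys.append(image[r + dy][c + dx])
    mean_x = sum(xs) / samples
    mean_y = sum(ys) / samples
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# A smooth gradient behaves like a natural image: neighbours are similar.
smooth = [[r + c for c in range(64)] for r in range(64)]
# Uniform noise behaves like an ideal cipher image.
noise_rng = random.Random(42)
noise = [[noise_rng.randrange(256) for _ in range(64)] for _ in range(64)]

corr_smooth = adjacent_correlation(smooth, "horizontal")
corr_noise = adjacent_correlation(noise, "horizontal")
```

The gradient image yields a coefficient close to 1 and the noise image a coefficient close to 0, which is the contrast expected between plain and cipher images in a correlation table.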

Histogram Analysis

The distribution of pixel values is described by a histogram, which gives statistical information about the image. The histograms of the input and encrypted lena and \(chest-1\) images are shown in Fig. 13. Additionally, the histograms of samples from the COVID-19 British Society of Thoracic Imaging database and Eurorad COVID-19 cases are shown in Fig. 14 and Fig. 15, respectively.

Fig. 13 Histogram analysis

Fig. 14 Histogram analysis-2 (COVID-19 British Society of Thoracic Imaging database)

Fig. 15 Histogram analysis-3 (Eurorad COVID-19 cases)

It can be noticed that the histograms of the input and encrypted images are entirely different. So, it is hard to extract information from the pixel statistics of the encrypted image.

Differential Analysis

Number of Pixels Change Rate (NPCR) and Unified Average Changing Intensity (UACI) are test metrics that check the sensitivity of the cipher to a small modification in the plain image; they are given by (12) and (14). Assume \(I_1\) is an input image and \(E_1\) is the corresponding encrypted image, and \(I_2\) is the input image with a small modification and \(E_2\) is its encrypted image.

$$\begin{aligned} NPCR= & {} \frac{\sum _{\alpha \beta }\psi (\alpha ,\beta )}{width \times height} \end{aligned}$$
(12)
$$\begin{aligned} \psi (\alpha ,\beta )= & {} {\left\{ \begin{array}{ll} 0 &{} \hbox { if}\ E_1(\alpha ,\beta )=E_2(\alpha ,\beta ) \\ 1 &{} \hbox { if}\ E_1(\alpha ,\beta )\ne E_2(\alpha ,\beta ) \\ \end{array}\right. } \end{aligned}$$
(13)
$$\begin{aligned} UACI= & {} \frac{1}{width\times height}{\sum _{\alpha =1}^{width}\sum _{\beta =1}^{height}}\frac{|E_1 (\alpha ,\beta ) - E_2 (\alpha ,\beta )|}{255} \end{aligned}$$
(14)

The obtained results are shown in Table 6. From the results, it is concluded that a random one-bit change in the input image produces a major change in the encrypted image.
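Equations (12) to (14) translate directly into code. The following minimal sketch represents 8-bit images as lists of rows and returns both metrics as fractions in [0, 1]; multiplying by 100 gives percentages. The function names and the tiny example images are illustrative only.

```python
def npcr(e1, e2):
    """Fraction of pixel positions at which two cipher images differ."""
    height, width = len(e1), len(e1[0])
    changed = sum(
        1
        for r in range(height)
        for c in range(width)
        if e1[r][c] != e2[r][c]
    )
    return changed / (width * height)


def uaci(e1, e2):
    """Mean absolute intensity difference, normalised by 255."""
    height, width = len(e1), len(e1[0])
    total = sum(
        abs(e1[r][c] - e2[r][c]) / 255
        for r in range(height)
        for c in range(width)
    )
    return total / (width * height)


black = [[0, 0], [0, 0]]
white = [[255, 255], [255, 255]]
one_pixel_changed = [[0, 51], [0, 0]]

npcr_example = npcr(black, one_pixel_changed)  # one of four pixels differs
uaci_example = uaci(black, one_pixel_changed)  # (51 / 255) / 4
```

For two independent, uniformly random 8-bit images the commonly cited expected values are about 99.61% for NPCR and 33.46% for UACI, so results near these figures indicate strong sensitivity to plaintext changes.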

Table 6 Differential analysis

Analysis of Information Entropy

Information entropy measures randomness. For 8-bit gray images, if the pixel values are uniformly distributed, the entropy reaches its highest value of eight. Assume \({\mathcal {J}}\) is the number of gray levels and \({\mathcal {P}}r({\mathcal {J}}_i)\) is the probability of the \(i^{th}\) gray level; then the entropy is calculated using (15). Table 7 shows the entropy of the encrypted images, and the values are close to this ideal entropy value.

$$\begin{aligned} {\mathcal {E}} = \sum _i{\mathcal {P}}r({\mathcal {J}}_i)\log _2\bigg (\frac{1}{{\mathcal {P}}r({\mathcal {J}}_i)}\bigg ) \end{aligned}$$
(15)

Table 7 Information entropy analysis
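The entropy formula (15) can be checked with a short sketch. It estimates each gray-level probability from its relative frequency in the image; the function name and the two synthetic images are illustrative only.

```python
import math
from collections import Counter


def entropy(image):
    """Shannon entropy in bits of the gray-level distribution."""
    pixels = [p for row in image for p in row]
    counts = Counter(pixels)
    n = len(pixels)
    # Gray levels that never occur contribute nothing to the sum.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# 16 x 16 image in which every gray level 0..255 occurs exactly once.
uniform_image = [[16 * r + c for c in range(16)] for r in range(16)]
# Image with a single gray level.
constant_image = [[7] * 16 for _ in range(16)]

entropy_uniform = entropy(uniform_image)
entropy_constant = entropy(constant_image)
```

The uniform image attains the maximum of 8 bits and the constant image gives 0, which brackets the values one would expect for cipher and degenerate images, respectively.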

Analysis of Key Space

Keyspace is defined as the total number of distinct keys that can be used in the system. In the proposed system, a Logistic map with control values \(\{\mu ,d_n\}\) is used to generate the confusion index. The diffusion keystream is generated using (6), which depends on the value of Seed. Seed is derived from a hash of the Patient Id and the constant c using (4). \({\mathcal {M}}\) and \({\mathcal {N}}\) are the numbers of permutation and diffusion rounds, written m and n in the key. Hence, the key consists of \(\{\mu ,d_n,Seed,c,m,n\}\). As per the IEEE floating-point number standard, the computational precision of the 64-bit double datatype is nearly \(10^{ -15}\) [49]. Hence, the obtained key space is \((10^{15})^6 = 10^{90} \approx 2^{299}\). If the size of the keyspace of a cryptosystem is more than \(2^{100}\), then brute-force attacks are not feasible [33]. Hence, the keyspace of the proposed system is large enough to withstand attacks such as brute force.
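The conversion of the decimal key-space estimate into a power of two can be checked with a few lines of arithmetic. The sketch takes the precision of \(10^{15}\) values per parameter and the six key parameters from the text; the variable names are illustrative.

```python
import math

PRECISION_DIGITS = 15      # about 10**15 distinguishable values per parameter
NUM_KEY_PARAMETERS = 6     # mu, d_n, Seed, c, m, n

# Decimal exponent of the key space: (10**15)**6 = 10**90.
key_space_exponent = PRECISION_DIGITS * NUM_KEY_PARAMETERS

# Equivalent number of bits: log2(10**90) = 90 * log2(10).
key_space_bits = key_space_exponent * math.log2(10)

# Compare with the 2**100 feasibility threshold for brute force.
exceeds_threshold = key_space_bits > 100
```

Since \(\log_2 10 \approx 3.3219\), a key space of \(10^{90}\) corresponds to roughly 299 bits, well above the \(2^{100}\) threshold.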

Analysis of Key Sensitivity

Here, the sensitivity of the system to small modifications in the key parameters is analyzed. In the proposed encryption, \(\{ \mu ,d_n,Seed,m,n\}\) are used as control key values to produce the confusion index and the diffusion keystream. Suppose \(I_1\) is an input image, \({\mathcal {K}}1\) is the set of control key values used for encryption, and \(E_1\) is the corresponding encrypted image. Assume a small modification is made to one of the key parameters of \({\mathcal {K}}1\), producing a new key \({\mathcal {K}}2\). Now, \(I_1\) is encrypted again with \({\mathcal {K}}2\) to obtain \(E_2\). Next, the NPCR of \(E_1\) and \(E_2\) is obtained using (12). If the obtained NPCR is close to the ideal NPCR, then the cryptosystem is also sensitive to a small modification in the key.

Table 8 Key sensitivity analysis

In the proposed scheme, the sample ‘chest-1’ image is taken as the input image. At first, for the given key values, the input image is encrypted to obtain the encrypted image \(E_1\). Next, a small modification is made to the key, and new encrypted images \(E_2, E_3, E_4, E_5\) are generated. The NPCR of \(E_1\) and \(E_i\) (\(i \in \{2,3,4,5\}\)) is obtained. From Table 8, it is observed that the NPCR values are close to the ideal NPCR, which implies that a small change in any key value produces a large change in the encrypted image. Figure 16 shows the encrypted images corresponding to the changes in the key parameter values.

Fig. 16 Key sensitivity test

Comparative Analysis of Performance

Here, the proposed system is compared with existing schemes with respect to various metrics. In [33], the control values of the chaotic maps are derived from the original plain image. However, the chaotic map used in [33] is five-dimensional, and as the number of dimensions increases, the complexity and cost also increase. In [36, 37], bit-level operations are performed to increase the speed. However, the required keystreams are derived neither from the original image nor from a random source, which leaves these schemes vulnerable to known/chosen-plaintext attacks. In [50, 51], Deoxyribonucleic Acid (DNA) sequences are used to derive the logic of pixel encoding. However, the logic is based on rules, and the DNA rule book is required during decryption, so it must be transmitted to the receiver end. Herbadji et al. [52] have derived logic from the classical quadratic map to generate three chaotic sequences. Compared to the existing schemes, the proposed scheme utilizes two chaotic sequences to generate random keystreams that depend on plain images to withstand chosen/known-plaintext attacks without compromising speed.

Apart from the above comparison points, the differential and statistical analysis results of different schemes are compared in Table 9. The security metrics used are NPCR, UACI, entropy, and the correlation of adjacent pixels in the horizontal, vertical, and diagonal directions. From Table 9, it is found that the proposed encryption scheme achieves a better level of security and privacy with respect to these metrics than the other existing schemes.

Table 9 Comparison of differential and statistical analysis of various systems

Security of ADT

User credentials are stored in ADT. Assume that a hacker has already got access to the ADT. To reveal the passwords, the following attacks are analyzed.

(1) Bruteforce attack

Here, the attacker has to check all possible combinations of symbols to reveal the password. However, mounting such an attack is computationally very hard.

(2) Dictionary attack

The attacker creates the dictionary consisting of familiar words or words of daily routine. Next, matching of words from the dictionary to ADT is done.

(3) Advanced Dictionary attack

In this attack, the attacker creates a dictionary containing words that are predicted from users’ habits in designing passwords.

(4) Lookup table attack

In this attack, the attacker first prepares a list of passwords, which normally consists of regularly used passwords. Next, the attacker constructs a lookup table containing tuples of each password in the prepared list, in plain form, and its hash. The attacker then searches the lookup table for the entries found in the ADT.

(5) Rainbow table attack

Rainbow tables consist of hash chains and thus differ from plain hash tables. By alternately applying a hash function and a reduction function, a chain of alternating passwords and hash values is produced. Only the first and the last plain values of each chain are stored in the rainbow table.

In all the above attacks except brute force, the attacker has to match a value in the ADT against an already calculated hash value. Consider a precomputed table with plain values and their hashes as attributes, as shown in Table 10. The attacker searches this table for a given hash value. If a match is found, the password is cracked.
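The table search described above can be sketched as follows. The sketch assumes an unprotected table that stores plain, unsalted SHA-256 hashes of passwords, which is the situation the attacks target and not the layout of the proposed ADT; the candidate passwords and function names are hypothetical.

```python
import hashlib


def sha256_hex(text):
    """Hex digest of the UTF-8 encoding of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_lookup_table(candidates):
    """Precompute hash -> plain password for a list of guesses."""
    return {sha256_hex(password): password for password in candidates}


def crack(stored_hashes, table):
    """Return every stored hash that appears in the lookup table."""
    return {h: table[h] for h in stored_hashes if h in table}


# Attacker's list of regularly used passwords (hypothetical).
lookup_table = build_lookup_table(["123456", "password", "qwerty"])

# Leaked table containing one weak and one uncommon password.
leaked = [sha256_hex("qwerty"), sha256_hex("x9$Lq!27vTz")]
cracked = crack(leaked, lookup_table)
```

Only the weak password is recovered, because its hash was precomputed. The search fails as soon as the stored value is not a plain hash of the password, for example when the hash is additionally encrypted with a key the attacker cannot derive.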

Table 10 Hashed password table

In the proposed scheme, the ADT contains a hash of the UserId and AES-256 encrypted hashed passwords. The encryption key is the group of symbols chosen from particular locations of the \(mixed\, hash\, string\). The mixed hash string is constructed from the hashed UserId and the hashed Password. The length of the mixed hash string is 512, and the length of the encryption key is 256. So, there are a total of \(^{512} C_{256}\) possible keys, and to derive the key the attacker must know the original UserId and Password. As it is very hard to guess the original UserId and the original Password, the privacy of the ADT is well preserved.

Analysis of Time Complexity

The complexity of the proposed system is based on the complexity of the Paillier cryptosystem, Secure Hash Algorithm (SHA), logistic map, the complexity of AES-256, and the complexity of the proposed image cryptosystem. For n digit number space, the time complexities are given in Table 11.

Table 11 Time complexity analysis

Considering the above-defined complexities, the complexity of different phases of the proposed system is given below:

(1) Seed encryption/decryption

For encryption and decryption, Paillier cryptosystem is used and the complexity is \(O(n^3)\).

(2) Encrypted password generation

During new user registration, for the given UserId and Password, SHA is calculated, the result is shuffled using a logistic map, an encryption key is generated using a logistic map, and finally the hashed password is encrypted using AES-256. Hence, the time complexity for this phase is given by: \(2 \times O(n) + 2 \times O(n) + O(2^{256})\).

(3) User Authentication

For the given UserId, input encrypted password is generated as given in step 2, i.e., \(2 \times O(n) + 2 \times O(n) + O(2^{256})\). Next, at CMS, an encrypted password is extracted from the ADT, and complexity is given by O(logN), where N is the number of records in the ADT. Then, the complexity of matching two encrypted passwords is given by O(1). Hence, the total complexity of this phase is: \(2 \times O(n) + 2 \times O(n) + O(2^{256}) +O(logN)+O(1)\).

(4) Image Encryption and decryption

The complexities of image encryption and decryption are equal. Let P be the plain image of width w and height h. Also, let m be the number of confusion rounds and let n be the number of diffusion rounds. Then, the complexity of the proposed image cryptosystem is given by: \(T(encryption) = T(decryption) = O(mnwh)\).

Analysis of Known/Chosen Plain Text Attack

In chaotic encryption, the attacker mainly focuses on finding the intermediate values instead of the chaos function arguments used as the key [49]. To analyze how these values could be obtained, the proposed encryption is broken into four sections. Suppose the input image iImage is converted into the 1-D pixelArray, and confusionIndex is obtained from the logistic map using (3). Then the \(1^{st}\) section, confusion, is given by

$$\begin{aligned} cImage=confusion(pixelArray,confusionIndex) \end{aligned}$$
(16)

In diffusion, the \(2^{nd}\) section of the cipher, cImage (the confusedArray of (7)) is XORed with the hashKeyStream, which is generated using SHA-256 (6). Diffusion is represented as follows:

$$\begin{aligned} diffusedArray=diffusion(confusedArray,hashKeyStream) \end{aligned}$$
(17)

In the \(3^{rd}\) section, the diffusion is repeated \({\mathcal {N}}\) times as follows:

$$\begin{aligned} diffusedArray'=diffusion(confusedArray,hashKeyStream)^{\mathcal {N}} \end{aligned}$$
(18)

Lastly, \(diffusedArray'\) is confused again, and this confusion is repeated \({\mathcal {M}}\) times:

$$\begin{aligned} cImage'=confusion(diffusedArray',confusionIndex) \end{aligned}$$
(19)

Finally, the encrypted image is obtained as

$$\begin{aligned} eImage=confusion(E^n,confusionIndex)^{\mathcal {M}} \end{aligned}$$
(20)

where E is given by,

$$\begin{aligned} E=diffusion(diffusedArray,hashKeyStream) \end{aligned}$$
(21)

In (17), the hashKeyStream used is generated from (6). The control parameters of (6) depend on Seed, which is calculated from patientId and the constant key value c. Hence, hashKeyStream depends on patientId and c. In addition, c is derived from the hash of the first image of the corresponding patient’s Covid-19 record. Hence, it is very hard for a hacker to obtain the value of Seed without knowing the input values patientId and the patient’s record.

Differential Cryptanalysis

Differential cryptanalysis [53] is a cryptanalysis technique that attempts to exploit the difference between the encryptions of two plain images, which typically differ by a single bit. By analyzing a plain image and its encrypted image, the analysis aims to show that all confusion matrices can be recovered from an encrypted image. Because the proposed algorithm is a one-round encryption process, r = 1 is considered when analyzing the proposed system using differential cryptanalysis. Consider the plain images \(P_1\) and \(P_2\) and the corresponding encrypted images \(E_1^1\) and \(E_2^1\). The differential image is given by:

$$\begin{aligned} \delta E^r=E_1^1 \oplus E_2^1 \end{aligned}$$
(22)

Let \(F_C^r()\) and \(F_D()\) be linear functions representing the confusion and diffusion processes. Then, (22) is expanded to:

$$\begin{aligned} \begin{aligned} \delta E^r&=(F_C^r(P_1,eK_1) \oplus F_D(eK_1)) \oplus (F_C^r(P_2,eK_2) \oplus F_D(eK_2)) \\&=F_C^r(P_1 \oplus P_2) \oplus F_C^r(eK_1 \oplus eK_2) \oplus F_D(eK_1 \oplus eK_2) \\&=F_C^r( \delta P) \oplus F_C^r( \delta eK) \oplus F_D( \delta eK) \\ \end{aligned} \end{aligned}$$
(23)

Equation (23) shows the relationship between the differential plain and encrypted images. The differential encrypted image \(\delta E\) depends entirely on the confusion key streams \(F_C^r(eK_1)\), \(F_C^r(eK_2)\) as well as the diffusion key streams \(F_D(eK_1)\), \(F_D(eK_2)\) for every encryption.

In some other schemes, where there is no relationship between the plain image and the key stream in the substitution and permutation processes, the differential encrypted image \(\delta E^r\) is completely independent of the key streams:

$$\begin{aligned} \begin{aligned} \delta E^r&=(F_C^r(P_1,eK) \oplus F_D(eK)) \oplus (F_C^r(P_2,eK) \oplus F_D(eK)) \\&=F_C^r(P_1 \oplus P_2) \oplus F_C^r(eK \oplus eK) \oplus F_D(eK \oplus eK) \\&=F_C^r( \delta P) \\ \end{aligned} \end{aligned}$$
(24)

From (24), the following observations are made:

(i) The encrypted image difference is the result of applying the sequence of confusion processes \(F_C^r\) to the plain image difference \(\delta P\). For instance, if \(P_2\) is a blank image, i.e., all pixels are zero, then the differential encrypted image depends only on \(P_1\) and the confusion function.

(ii) If the attacker selects a special value for \(P_1\), the attacker may be able to determine the confusion function. Once the attacker has the encrypted image \(E_2^r\), the confusion function, and the plain image \(P_2\), any plain image \(P_1\) can be determined from its encrypted image \(E_1^r\).
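The observations drawn from (24) can be reproduced on a toy cipher whose permutation and keystream do not depend on the plain image. The sketch below is an illustration of that weaker class of schemes, not of the proposed cryptosystem; the cipher, its size, and the random seed are hypothetical.

```python
import random


def encrypt(plain, perm, keystream):
    """Toy cipher: permute the pixels, then XOR with a fixed keystream."""
    return [plain[j] ^ k for j, k in zip(perm, keystream)]


rng = random.Random(1)
n = 16

# Key material that is independent of the plain image.
perm = list(range(n))
rng.shuffle(perm)
keystream = [rng.randrange(256) for _ in range(n)]

p1 = [rng.randrange(256) for _ in range(n)]
p2 = [0] * n  # blank image chosen by the attacker

e1 = encrypt(p1, perm, keystream)
e2 = encrypt(p2, perm, keystream)

# Differential cipher image: the keystream cancels out.
delta_e = [a ^ b for a, b in zip(e1, e2)]
# Permuted plain-image difference, i.e. F_C applied to delta P.
delta_p_permuted = [p1[j] ^ p2[j] for j in perm]
```

Here `delta_e` equals `delta_p_permuted`, so the difference reveals the permuted plain image without any knowledge of the keystream, and encrypting the blank image returns the keystream itself. Deriving the key streams from the plain image is intended to prevent exactly this cancellation.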

However, in the proposed cryptosystem, there is a correlation between different \(E^r\). Here, confusion and diffusion keys for each encryption round are different, reducing the possibility of breaking the proposed algorithm. Furthermore, because the confusion method is related to plain image, it will complicate the cryptanalysis process.

Conclusion

In this paper, Covid-19 status checking for issuing travel tickets using privacy-preserving storage and sharing of Covid-19 records through secure authentication is proposed. The proposed system is developed using cryptographic hash function SHA-256, Chaotic map, Paillier cryptosystem, and InterPlanetary File System (IPFS). In secure authentication, the contents of the Authentication Data Table (ADT) are well protected against a brute-force attack, dictionary attack, advanced dictionary attack, lookup table attack, and rainbow table attack because of hashed UserIds and AES-256 encrypted password. The encryption key is unique for each user and is derived from hashed input UserId and Password. Novel image encryption is also developed as a part of the proposed system. To withstand known plain text/chosen plain text attacks and to increase the level of security, the control values of the keystreams required during confusion–diffusion of the proposed cryptosystem are derived from input Patient Id and Covid-19 record. From comparison analysis, it is found that the proposed image encryption scheme has achieved significant security as well as privacy.

In the future, the focus will be given to increasing the speed of the cryptosystem by applying parallel computing in confusion and diffusion. Also, nowadays, the biometric recognition system is becoming popular in all sectors, including banking, identity verification in the airport, and authentication systems. Biometric information of the user such as face template, fingerprint template, iris template can be encrypted using the proposed system to provide a higher level of security.