Introduction

In cloud computing, Infrastructure as a Service (IaaS) offers reliable and scalable storage services for storing massive volumes of data via the internet. IaaS reduces infrastructure costs and provides efficient management [1]. Nowadays, user data transactions are performed over the internet, and these processes handle wide-ranging data of different types and sensitivity levels. Traditionally, user information was maintained on personal storage devices on the user's own premises. This on-premise storage provided more protection than cloud storage, where data reside at locations controlled by third-party cloud service providers (CSP). Hence, cloud-stored data are exposed to serious security issues. Ensuring the security of cloud-based data is critical in financial and healthcare organizations, because these organizations handle a large amount of sensitive information such as personal identification numbers (PIN), salaries, and disease details. Additionally, as a large quantity of data is transferred online, storing sensitive attributes in third-party locations increases the probability of unauthorized access to the cloud storage system [2]. User information is also retrieved by a variety of people to refine their business, to carry out research, and to provide improved services to users. Thus, providing data usability to inter-organization members and authorized outside users without compromising the privacy of the data owner is also an essential task. To meet this requirement, this work proposes segregating sensitive attributes (SA) from non-sensitive attributes (NSA) with the knowledge of the data owner (DO).

Generally, data classification techniques are used for data classification, and symmetric or asymmetric key encryption algorithms are used for data protection. These techniques are not suitable for a cloud-based storage system because the accuracy of a classification technique depends on the training set, and the number of attributes identified is the same for all users. Similarly, conventional symmetric/asymmetric encryption techniques are suited to on-premises storage, not to cloud storage. Hence, attribute-based encryption (ABE) techniques are preferred for secure cloud-based data storage. To provide efficient access control and protection for user information, different kinds of ABE techniques are applied to cloud-based data, such as key-policy attribute-based encryption, ciphertext-policy attribute-based encryption, and role-based attribute encryption. However, the security strength of ABE depends on the number of attributes, key values, or roles involved in the encryption process [3]. Hence, vertical and horizontal partition-based selective attribute encryption techniques are used to provide better security to selected attributes instead of all attributes. Similarly, DOs are willing to share their data with authorized users in order to obtain better services from them, and this willingness differs from user to user. Hence, an alternative classification and security technique, rather than machine learning and traditional security techniques, is required to apply DO-based security preferences to their data [4].

The main intention of an encryption technique is to provide efficient data access control, confidentiality, and integrity for SA. However, current security techniques have several issues, such as higher encryption time, an identical level of security for all attributes, single key-dependent encryption/decryption, and non-involvement of the DO and inter-organization members. These issues lead to critical problems and leak the entire information if the key is compromised. Among current security techniques, Rivest–Shamir–Adleman (RSA) takes exponential encryption time even for a small volume of data [5]. Message-Digest 5 (MD5), processed together with the advanced encryption standard (AES) encryption technique, increases the encryption time. The security strength of AES depends on key secrecy. In the two-factor data security mechanism, the sender encrypts data with knowledge of the user identity, and a secret key is used for accessing the data. The foremost issues of a cloud-based data storage system are data security, management, and monitoring, because user information is maintained by cloud service providers, who therefore have complete control over user data. Thus, the data are not under DO control and may be accessed by others without the DO's knowledge [6]. Hence, DO-controlled data storage and security are proposed in this manuscript.

The drawbacks of existing classification and security techniques are:

  • The ABE technique provides less data confidentiality, monitoring, and access control for SA in cloud storage.

  • Higher computational cost and processing overhead, because all attributes are encrypted and decrypted.

  • Non-involvement of the DO and inter-organization members.

  • SA can be accessed by inside adversaries.

Nowadays, partition-based security is an emerging technique for protecting attributes according to their level of sensitivity and secrecy. Some attributes do not require protection, and applying security techniques to all attributes creates processing overhead for the authorized user. Hence, user attributes need to be partitioned into SA and NSA, with security techniques applied only to SA. Data classification techniques are usually applied to user data for this purpose, but the accuracy of existing classification techniques depends on the training set. Thus, a DO preference-based SA segregation technique was proposed in the author's previous research work [7]. The proposed MRFC algorithm is used to overcome this problem.

The major contributions and novelties of this paper are:

  • Generating a group key using the MRFC-ECC technique to provide higher security to sensitive attribute groups.

  • Attribute encryption based on the preferences of the DO and Group Admin (GA).

  • Reducing encryption time, key management effort, and the harm caused by inside adversaries.

The remainder of this manuscript is organized as follows. Existing works related to SA classification and protection techniques for cloud-based data are discussed in “Related works”. The proposed MRFC key generation, data encryption, and decryption processes, with their algorithms, are described in “Proposed MRFC-based secure sensitive attribute storage system”. “Result and discussions” presents the experimental results and compares the proposed technique with previous methods. “Security analysis” discusses the mathematical and security analysis of MRFC. Finally, “Conclusion” concludes the manuscript and discusses future work.

Related works

Sensitive attribute classification and protection

This section analyses the existing classification and protection techniques related to SA. Identifying SA through machine learning techniques is not suitable for all domains. Fast distributed mining (FDM) is used to identify private subsets from the entire data. The K-nearest neighbour (KNN) classification technique is used for separating SA from NSA, and the SA are encrypted with the RSA algorithm. The RSA algorithm takes exponential encryption time, and its security strength depends on the prime factor values. Additionally, the accuracy of SA identification depends on the training set of the KNN algorithm [8]. Furthermore, a fuzzy logic classifier is used for classifying organization data into top-secret, secret, confidential, and public data, with the level of encryption determined as high, medium, or low. The data security and classification accuracy depend on organization and time specifications [9]. In these existing approaches, the SA classification accuracy depends on the type of classifier and the training set, not on the DO. Another approach protects SA by partitioning attributes into several chunks with semantic meaning and placing each encrypted chunk in a separate cloud. This reduces data usability for the authorized user and increases the computational complexity of merging the chunks back into the original information [10]. In another scheme, SA are separated from NSA by applying user-defined classification rules, and the SA are encrypted with the AES algorithm. The SA security depends on the key transformation, the key size, the number of rounds in the encryption process, and the classification rules [3]. This analysis shows that classification accuracy depends on the training set and the number of rules involved in the classification process, and that the DO is not involved in the classification process. Hence, a DO preference-based SA classification technique is required.

Attribute-based encryption

This subsection outlines the existing ABE techniques. Ciphertext-policy attribute-based encryption (CP-ABE) has been used with a cooperative key management protocol to share data in the cloud. Private key storage and distributed key generation were added for immediate attribute revocation, and fine-grained access was utilized to construct the private key update algorithm. This system provided more security, but at the cost of a large ciphertext size, an inefficient access structure, and high encryption/decryption cost [11, 12]. The weighted attribute-based encryption (WABE) method provides fine-grained access control and performed better than other schemes; however, the attribute weight was assigned by the admin, not by the DO [13]. The dynamic search method for secure and efficient data access provides enhanced efficiency compared to a linear search, with reduced access time and searching cost. The secure KNN algorithm was used to protect against two threat models. Here, the DO was in charge of creating the updated data, and the data were stored in a cloud storage location. Its disadvantages were the security challenges that arose in the multi-user scheme and in user revocation [14].

The decentralized access control technique is used for maintaining data securely in the cloud. The CSP verified the legitimacy of a server without knowledge of the user identity before storing data. This method prevented replay attacks and supported the establishment, adjustment, and interpretation of data kept in a cloud storage location. Authentication and access control were decentralized. The limitations were that the access policy of each datum was kept in the cloud, and the attributes and access plan of the user were not concealed. ABE effectively permitted users who held access rights to use the secured data in the cloud. The DO assisted in the key generation and management process. This method took less computational time and reduced the traffic burden with improved scalability, but its requirement for sufficient client infrastructure was uneconomical [15,16,17].

The sensitive, revocable, and proficient access control method for a multi-authority cloud storage technique achieved both forward and backward secrecy. However, the key generation and encryption processes are completed by organization members without the DO's knowledge. The multi-authority storage technique was used in remote storage systems, online networks, etc. [18, 19]. Dual-server public key encryption with keyword search (DS-PEKS) avoids the inside keyword guessing attack, which was an intrinsic weakness of the conventional PEKS framework. A smooth projective hash function (SPHF) variant, denoted as linear and homomorphic SPHF (LH-SPHF), was defined, and stronger security was achieved by a decisional Diffie–Hellman-based LH-SPHF [20]. Security protection should preserve the privacy of users of the deployed data and provide integrity, access control, and confidentiality for cloud data. The cloud security scheme improved efficiency and reduced computational complexity, bandwidth cost, and storage overhead. However, it had drawbacks, such as difficulty in maintaining accountability, privacy protection, data integrity, and availability at low cost [21]. Role-based encryption (RBE) was integrated into the security technique through role-based access control (RBAC). A single key was used for the decryption process, and it operated efficiently irrespective of the role hierarchy and user membership complexity in the system [22]. The fully homomorphic encryption (FHE) technique was used for protecting data and computation analysis. The data and computational analysis are divided into a number of subsets, which are encrypted by the FHE technique and maintained in separate clouds to improve the security of user data. Hard clustering and fuzzy clustering algorithms were used to form the data groups, and each subgroup was encrypted separately by the FHE technique [23].

Fibonacci cryptography

A quantum key distribution (QKD) protocol has been used with Fibonacci-valued OAM entangled states. The Fibonacci matrix representation was defined to enhance the original protocol; it enhanced not only the efficiency of encoding but also verifiability. The QKD protocol is used to attain verifiability and can be implemented using recent technology [24]. In another scheme, multiple variable factors made recovery of the original data very difficult. The size of the circular queue had a tenable factor. The keyword letters and the numbers are denoted in the Fibonacci format. Using shift and logical operations, the security algorithm converts all letters into ASCII binary format. This is mainly used for text messages. The results showed that the algorithm had 50% lower complexity than the Multiple Circular Queue Algorithm (MCQA) [25].

Secure communication is established through group key-based encryption. QKD was combined with Lucas, Fibonacci, and Fibonacci–Lucas sequences to provide quantum signature verification. This technique improved signature verification and verified the information received from the participants for authentication. It provided protection with minimal delay compared to a normal QKD technique [26, 27].

Proposed MRFC-based secure sensitive attribute storage system

Group key-based sensitive attribute protection using Modified Random Fibonacci Cryptography (MRFC) in a cloud storage system provides better security to DO-preferred SA in a private cloud. The partition into SA and NSA depends on the privacy score values of individual attributes, where the security preferences are given by the DO. Table 1 lists the roles of each participant involved in the proposed system.

Table 1 Roles of participants

The DO assigns a Likert scale value (LSV) to every attribute, and the general admin (GNA) converts the LSV to a dichotomous scale value (DSV) to construct the response matrix RD(i, j). Sensitivity (β) and visibility (V(i, j)) values are calculated from RD(i, j). The threshold (T) is the average of the privacy score values (PSV). If the PSV of an attribute is less than the threshold, the attribute is partitioned as SA; otherwise, it is partitioned as NSA [7].
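The thresholding step can be illustrated with a minimal sketch. It assumes the PSV of each attribute has already been computed as described in [7]; the function name, attribute names, and score values below are hypothetical and serve only to show the rule "PSV below the average means SA".

```python
from statistics import mean

def partition_attributes(psv):
    """Split attributes into SA and NSA, using the mean PSV as the threshold T."""
    threshold = mean(psv.values())
    # Rule stated in the text: a PSV below the threshold marks the attribute as SA.
    sa = sorted(name for name, score in psv.items() if score < threshold)
    nsa = sorted(name for name, score in psv.items() if score >= threshold)
    return threshold, sa, nsa

# Hypothetical privacy scores for one DO (illustrative values only).
example_psv = {"pin": 0.5, "salary": 1.0, "disease": 1.5, "city": 3.0, "gender": 4.0}
threshold, sa, nsa = partition_attributes(example_psv)
print(threshold, sa, nsa)
```

Because the threshold is derived from each DO's own scores, two DOs with different preferences obtain different SA sets from the same attribute list.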

The SA are then encrypted by the proposed technique, and the NSA are stored in plain-text form in the cloud. Table 2 lists the symbols used in the proposed MRFC technique. The proposed security technique depends on DO preferences. Hence, data security fully depends on the DO, not on data handling organizations.

Table 2 Symbols used in proposed method

Nowadays, DOs perform their tasks online, and data transactions are performed between multiple organizations. When data move between organizations, high security is required to protect the SA. The proposed method provides one such technique, involving the DO, GNA, Group Admin (GA) [e.g. Insurance Admin (GA_1), Marketing Admin (GA_2), Loan Admin (GA_3)], and Cloud Service Provider (CSP). The DO encrypts each group of SA (G(SA)) with a Group Key (GK) before uploading it into cloud storage. The GA sends a request message to the CSP to acquire the DO information needed for its process. The CSP verifies the request for authorization and sends the key request to the DO. Then, the DO analyses the request and sends the encrypted G(SA) to the requesting GA.

Here, the GK generated by the DO is used for encrypting their G(SA). The major benefit of this technique is that none of the DO's data are shared with others without DO approval, and the DO receives the log records of the requester for every transaction from the CSP. Hence, the DO has complete access control and monitoring over their data. The GA can access the DO data depending on the decision provided by the GNA. Figure 1 shows the proposed system architecture.

Fig. 1 System architecture for proposed MRFC

Modified Random Fibonacci Cryptographic Technique

The proposed MRFC technique covers the GK generation process, SA encryption/decryption, GK sharing, and the transfer of the encrypted G(SA). In this flow, the DO uploads files together with their sensitivity preferences, and the GNA receives those preferences. The GNA analyses the security preferences and splits (π) the data into SA and NSA, such that attribute ∈ π(SA, NSA). The identified SA are grouped into ‘n + 1’ groups by the GNA. The DO generates the GK using the MRFC technique. The generated GK is used for encrypting the SA, which are then merged with the NSA for uploading into the cloud. The GA sends a data request to the CSP, and the CSP verifies the request. If the GA request is valid, the requested data are transferred to the GA by the DO.

Group key-based sensitive attribute protection

MRFC is built on the Elliptic Curve Cryptography technique, which generates keys from the properties of the elliptic curve equation “\(y^{2}=\left(x^{3}+ax+b\right) \bmod p\)”. In MRFC, key generation is combined with the Diffie–Hellman key exchange technique for transferring a key between the parties.

Grouping of sensitive attributes

The groups are divided into three different categories, listed as follows:

  • SA: attributes that are unknown to others, i.e. attributes that require privacy.

  • \(G_{A_i}\): attributes that are accessed by the ith GA.

  • C(G(SA)): the G(SA) common to more than one GA.

Algorithm 1 shows the SA grouping process, which depends on the number of inter-organizations that will access user information. The inter-organization requirement is gathered from the GA as Req(\(G_{A_i}\)) ∈ {A1, …, An}. Each attribute in Req(\(G_{A_i}\)) belongs to either SA or NSA: if A1, …, Am ∈ SA, then A1, …, Am ∈ \(G_{A_i}\)(G(SA)) and Am+1, …, An ∈ NSA; else A1, …, Am ∈ \(G_{A_{i+1}}\)(G(SA)). A similar process is repeated for SA grouping based on the remaining GA requirements. Some attributes are common to all GAs. The common attributes C(G(SA)) are identified as the intersection of all GA requirements. The attributes that remain after subtracting C(G(SA)) are identified as \(G_{A_i}\)(G(SA)) or \(G_{A_{i+1}}\)(G(SA)). Based on this process, two different groups of attributes are generated: the organization-required attributes and the DO private attribute group. The identified \(G_{A_i}\)(G(SA)) is then passed to the encryption process.

figure a
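The grouping described above can be sketched with plain set operations. This is an illustrative reading rather than the published Algorithm 1: it assumes that C(G(SA)) is the intersection of the sensitive parts of all GA requests and that any SA requested by no GA forms the DO private group. The function name and all attribute names are hypothetical.

```python
def group_sensitive_attributes(sa, requests):
    """Return (common, per_ga, private) groups of sensitive attributes.

    sa: set of SA names; requests: dict mapping a GA name to its requested attributes.
    """
    # Only the sensitive part of each request needs a protected group.
    sensitive_reqs = {ga: req & sa for ga, req in requests.items()}
    # C(G(SA)): attributes requested by every GA (assumed reading of the text).
    common = set.intersection(*sensitive_reqs.values()) if sensitive_reqs else set()
    # Each GA-specific group is its sensitive request minus the common group.
    per_ga = {ga: req - common for ga, req in sensitive_reqs.items()}
    # SA that no GA asked for stay in the DO private group.
    private = sa - set().union(*sensitive_reqs.values())
    return common, per_ga, private

# Hypothetical attributes and GA requirements.
sa = {"pin", "salary", "disease", "account_no", "loan_amount"}
requests = {
    "GA_1": {"disease", "salary", "city"},
    "GA_2": {"salary", "account_no"},
    "GA_3": {"salary", "loan_amount", "gender"},
}
common, per_ga, private = group_sensitive_attributes(sa, requests)
print(common, per_ga, private)
```

Non-sensitive attributes such as "city" and "gender" drop out of every group, which matches the statement that NSA are not passed to the encryption process.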

Group key generation

Key generation is the process of producing GKs for the purpose of encryption. It combines Fibonacci values with Elliptic Curve Cryptography. Algorithm 2 shows the GK generation using MRFC, which takes the basic elliptic curve equation \(y^{2}=\left(x^{3}+ax+b\right) \bmod p\) to obtain the initial and final values for random number generation. Based on these values, the initial parameters ‘P’ (initial value), ‘Q’ (final value), and ‘n’ (number of values) are declared in the GK generation task.

Cryptographic preliminaries

Bilinear Map Consider a pair of cyclic groups (G, +) and (G1, ·) of prime order P, where P0 is a generator of G. A map \(e: G \times G\to {G}_{1}\) is a bilinear map if it fulfils the following properties:

  1. Non-degeneracy: \(e\left({P}_{0}, {P}_{0}\right)\ne 1\) is satisfied.

  2. Bilinear: \(e\left({P}^{x}, {P}^{y}\right)=e\left({P}^{y}, {P}^{x}\right)=e{(P,P)}^{xy}\) is true, for any x, y ∈ FP and P ∈ G.

  3. Computability: The algorithm computes e(P1, P2), for any P1, P2 ∈ G.
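The three properties can be checked numerically on a toy map. The sketch below is not a cryptographic pairing and is not part of the proposed scheme: it assumes G = (Z_q, +) with generator 1 and takes G1 to be the order-q subgroup of Z_p* generated by g, with the hypothetical parameters p = 23, q = 11, and g = 2. Discrete logarithms in this G are trivial, so it only illustrates the algebra.

```python
# Toy parameters (insecure, illustration only): g = 2 has prime order q = 11 modulo p = 23.
P_MOD, Q_ORDER, G1_GEN = 23, 11, 2

def pairing(a, b):
    """Toy bilinear map e(a, b) = g^(a*b) from Z_q x Z_q into the subgroup <g> of Z_p*."""
    return pow(G1_GEN, (a * b) % Q_ORDER, P_MOD)

P0 = 1        # generator of the additive group (Z_q, +)
x, y = 4, 7   # arbitrary scalars; "P^x" in the text is x * P0 in additive notation

# Non-degeneracy: e(P0, P0) is not the identity of G1.
print(pairing(P0, P0) != 1)
# Bilinearity and symmetry: e(xP0, yP0) = e(yP0, xP0) = e(P0, P0)^(xy).
lhs = pairing(x * P0 % Q_ORDER, y * P0 % Q_ORDER)
print(lhs == pairing(y * P0 % Q_ORDER, x * P0 % Q_ORDER) == pow(pairing(P0, P0), x * y, P_MOD))
```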

Hardness assumption of the Decision Bilinear Diffie–Hellman Problem: Let Ǣ be a polynomial-time algorithm that, given a security parameter ‘n’, outputs (q, G, G1, e), where ‘q’ is a prime number chosen based on ‘n’. The decision bilinear Diffie–Hellman problem is hard (Ҥ) if, for any security parameter ‘n’, any probabilistic polynomial-time (PPT) distinguisher D, and any (q, G, G1, e) generated by Ǣ(1^n), there is a negligible function neg such that

$$\left|\Pr\left(D\left(G, {G}_{1}, q, e, {P}^{x}, {P}^{y}, {P}^{z}, e{\left({P}_{0},{P}_{0}\right)}^{xyz}\right)=1\right)-\Pr\left(D\left(G, {G}_{1}, q, e, {P}^{x}, {P}^{y}, {P}^{z}, e{\left({P}_{0},{P}_{0}\right)}^{w}\right)=1\right)\right|\le \mathrm{neg}\left(n\right),$$

where P0 is a random generator of G, and x, y, z, and w are four random elements of FP. The proposed system consists of the following functions: Setup, Group KeyGen, Encryption, Decryption, and Adduser()/Revokeuser().

  1. Setup: The GNA generates a list of global parameters for generating the key pair (Pu, Pr) of each DO and GA. The elliptic curve base point (G), rand(), and the EC points (P, Q) are taken as input for GK generation. Algorithm 3 describes this setup(). The user setup process lists the GAs and DOs involved in the process. The GA IDs and DO IDs are assigned in this user setup process based on FP. These IDs are stored in the cloud and used by the cloud to verify DOs and GAs.

  2. Group KeyGen(): Each DO computes their (Pu, Pr) using the MRFC technique. The Fibonacci values ∈ {P, Q} are used for generating the Pr. Each DO generates ‘n + 1’ GKs for encryption, and each \({D}_{{O}_{i}}\) registered in the setup phase maintains its corresponding \({G}_{{K}_{i}}\).

  3. Encrypt: E(\({G}_{{A}_{i}}\)(G(SA))) ← Enc(G(SA)): The encryption algorithm is run by the DO to encrypt the group of SA.

  4. Decrypt: D(\({G}_{{A}_{i}}\)(G(SA))): The decryption algorithm is executed by a GA. E(\({G}_{{A}_{i}}\)(G(SA))) and GK are taken as input, and D(\({G}_{{A}_{i}}\)(G(SA))) is produced as output.

  5. Adduser()/Revokeuser(): The adduser() and revokeuser() functions are executed by a DO.

The security strength of a cryptographic algorithm depends on its random numbers. In the proposed technique, modified Fibonacci cryptography is used for the random number selection process. The basic elliptic curve equation is solved to select the initial and final positions of the Fibonacci series, and the generated Fibonacci values are taken as input for random number generation. Using Rand(), random values are chosen from the Fibonacci values and used as the private key of each user, instead of the conventional elliptic curve-based private key selection process. This private key selection process makes it harder for an adversary to identify the private key, because the adversary is unable to guess which Fibonacci value was chosen as the private key for encrypting a particular group of sensitive attributes. Using this technique, each DO generates ‘n + 1’ private keys for their ‘n + 1’ group keys. Thus, the proposed system provides higher randomness than conventional elliptic curve-based private key generation. In the proposed technique, each user chooses a number of values for generating the group keys. Algorithm 2 shows the GK generation process.

figure b

In the proposed MRFC, the Fibonacci series F[n] ∈ EC[P … Q]. From F[n], the DO uses rand() to pick any one of the values and stores it in A[]. Using this A[] value, the Pr is generated for each DO. Equation 1 is used for finding the Pr of a DO:

$${P}_{{r}_{i}}\leftarrow \left(Q*A\left[i\right]+P+\left(\frac{F\left[n\right]}{A\left[n\right]}\right)\right). \tag{1}$$

The generated ‘Pr’ values are stored in an array K[n]. The ‘Pu’ is calculated by multiplying ‘Pr’ by the base point ‘G’. Then, \({G}_{{K}_{i}}\leftarrow {P}_{{u}_{A}}* {P}_{{r}_{B}}\). Similarly, the required number of \({G}_{{K}_{i}}\) is generated and used for encrypting G(SA). The key generation algorithm checks whether GA ∈ Oi. If not, it returns GA ≠ Oi; otherwise, it returns the GK, which consists of the tuple GK ∈ (P, Q, A[i], F[i], Pr, Pu, B).
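The key generation flow can be sketched on a small textbook curve. Everything concrete here is an assumption made for illustration: the toy curve y² = x³ + 2x + 2 mod 17 with base point (5, 1) of order 19, the treatment of ‘P’ and ‘Q’ as integer bounds `lo` and `hi`, the reading of F[n]/A[n] in Eq. 1 as the last Fibonacci value integer-divided by the chosen value, and the final reduction of Pr into the range 1..18. None of these choices is taken from the published algorithm, and the curve is far too small for real use.

```python
import random

# Toy curve y^2 = x^3 + 2x + 2 (mod 17); base point G has prime order 19 (assumed parameters).
A_COEF, P_MOD = 2, 17
G, ORDER = (5, 1), 19

def ec_add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        s = (3 * x1 * x1 + A_COEF) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, point):
    """Double-and-add scalar multiplication k * point."""
    result = None
    while k > 0:
        if k & 1:
            result = ec_add(result, point)
        point = ec_add(point, point)
        k >>= 1
    return result

def fibonacci_between(lo, hi):
    """Fibonacci values F with lo <= F <= hi."""
    values, a, b = [], 1, 2
    while a <= hi:
        if a >= lo:
            values.append(a)
        a, b = b, a + b
    return values

def key_pair(lo, hi, rng):
    """Pick a random Fibonacci value A[i], apply Eq. 1, and derive Pu = Pr * G."""
    fib = fibonacci_between(lo, hi)
    chosen = rng.choice(fib)                      # A[i]
    raw = hi * chosen + lo + fib[-1] // chosen    # Eq. 1, integer division assumed
    pr = 1 + raw % (ORDER - 1)                    # assumed reduction to a valid scalar
    return pr, ec_mul(pr, G)

rng = random.Random(7)                   # fixed seed only to make the sketch repeatable
pr_do, pu_do = key_pair(5, 100, rng)     # DO key pair
pr_ga, pu_ga = key_pair(5, 100, rng)     # GA key pair
gk_do = ec_mul(pr_do, pu_ga)             # GK as computed by the DO
gk_ga = ec_mul(pr_ga, pu_do)             # GK as computed by the GA
print(gk_do == gk_ga)
```

Both parties arrive at the same curve point without exchanging their private values, which is the Diffie–Hellman step the text combines with the Fibonacci-based private key selection.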

Sensitive attribute encryption algorithm

Each G(SA) is encrypted with its appropriate GK, and a unique encryption scheme is adopted to encrypt a G(SA). Algorithm 3 shows the G(SA) encryption process. The number of ciphertexts (C) generated depends on the number of organizations involved in the process.

figure c

In the encryption process, if \(G\left({S}_{A}\right)\in {G}_{{A}_{i}}(G\left({S}_{A}\right))\), then \({G}_{{K}_{i}}\) is used for encryption. A similar process applies to the other G(SA) with different GKs. The encrypted G(SA) is then merged with the NSA and uploaded into cloud storage.
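The selective, per-group encryption and the merge with plain-text NSA can be sketched as follows. The cipher used here is a deliberately simple stand-in (XOR with a SHA-256 counter keystream) and is not the MRFC encryption of Algorithm 3; it is not secure and only keeps the example self-contained. The function names, record fields, and key bytes are hypothetical.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (toy stand-in cipher, NOT secure)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

def encrypt_record(record, groups, group_keys):
    """Encrypt each SA group with its own key; all other attributes stay in plain text."""
    protected = dict(record)
    for group_name, attrs in groups.items():
        for attr in attrs:
            cipher = keystream_xor(group_keys[group_name], record[attr].encode())
            protected[attr] = cipher.hex()
    return protected

def decrypt_group(protected, attrs, key):
    """Recover only the attributes of one group, as a GA holding that group's key would."""
    return {a: keystream_xor(key, bytes.fromhex(protected[a])).decode() for a in attrs}

# Hypothetical record, groups, and keys.
record = {"name": "Asha", "salary": "52000", "disease": "asthma", "city": "Pune"}
groups = {"GA_1": ["disease"], "GA_2": ["salary"]}
group_keys = {"GA_1": b"group-key-1", "GA_2": b"group-key-2"}

protected = encrypt_record(record, groups, group_keys)
print(protected["city"], decrypt_group(protected, groups["GA_2"], group_keys["GA_2"]))
```

A GA holding only its own group key recovers its own attributes and nothing else, while the NSA remain readable without any key.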

Sharing of group key

When a specific G(SA) is required, the Pu of the DO is used for finding the \({G}_{{K}_{i}}\). The ‘Pu’ is shared between the DO and the GA. Algorithm 4 shows the GK identification process.

figure d

When a GA requires access to a specific G(SA), its Pr is multiplied by the DO's Pu to obtain the GK. The calculated GK is used for decrypting the required G(SA).

Sensitive and non-sensitive attribute merging and transfer

The GA sends a request to the CSP, and the CSP then authenticates the GA. If the verification is successful, the CSP sends the encrypted G(SA) to the GA. The GA decrypts the G(SA) using the corresponding GK. The G(SA) and NSA merging and transfer procedure is described in Algorithm 5.

figure e

The encrypted attribute E(G(SA)) is completely secured in the cloud-based sharing process. Afterwards, only the approved GAs of the process are able to access E(\({G}_{{K}_{i}}\)). In the proposed system, only the requested customer's details are sent to the GA instead of the details of all customers. This reduces transfer time, decryption time, and unnecessary data transmission cost, with higher security.

Sensitive attribute decryption

The SA decryption process is the inverse of the SA encryption process. If a GA wants to decrypt and access the G(SA), its User ID is checked to verify that the GA has not been revoked. If the GA is non-revoked, the CSP forwards the request to the DO, and the DO verifies its authenticity and sends the requested E(G(SA)). Algorithm 6 specifies the decryption process of the subgroup based on the GA request.

figure f

Only the attributes needed for the specific requirement or application are decrypted, instead of the complete data. The decryption task requires the particular GKs from the specified GA and DO. If a GA is revoked from an organization, it is unable to find the GK, since the GK related to that GA is deleted from the list during the revocation process. That is, revoked GAs are unable to access G(SA) through the decryption process. In the decryption process, the GA needs to calculate the \({G}_{{K}_{i}}\) from its \({P}_{{r}_{i}}\):

$$ P_{{u_{i} }} \leftarrow P_{{r_{i} }} *G, $$

If \({P}_{{r}_{i}}\in {F}_{P}\), then \({P}_{{u}_{i}}\in {F}_{P}\).

Now, \({G}_{{K}_{i}} \leftarrow {P}_{{r}_{A}}* {P}_{{u}_{B}}\).

If a requester is a revoked user, then \({P}_{{r}_{i}}\) and \({P}_{{u}_{i}}\) do not belong to FP. Thus, revoked GAs are unable to access the G(SA).
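For a non-revoked GA, the key computed at decryption coincides with the key formed during key generation. As a short check, assuming the subscripts A and B denote the two parties (DO and GA) and using only the relation \({P}_{u}\leftarrow {P}_{r}*G\) together with the commutativity of scalar multiplication on the curve:

$${P}_{{r}_{A}}*{P}_{{u}_{B}}={P}_{{r}_{A}}*\left({P}_{{r}_{B}}*G\right)={P}_{{r}_{B}}*\left({P}_{{r}_{A}}*G\right)={P}_{{r}_{B}}*{P}_{{u}_{A}}.$$

Both sides therefore yield the same \({G}_{{K}_{i}}\), even though neither party reveals its private key to the other.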

User revocation

When a GA is revoked from or added to the process, a user update, a ciphertext update, and a GK update are required in cloud storage to maintain the security of the SA. Whenever a GA is revoked, a new GK is generated and the specific G(SA) is re-encrypted with the new GK. The revocation system consists of the following steps:

  1. Delete the User ID of the GA from the CSP and DO in a dynamic manner.

  2. Include the revoked GA user ID in the revoked user list.

  3. Choose a new random number (R1) ∈ Fibonacci Series.

  4. Compute a new (Pr, Pu) key pair for the generation of GK.

  5. Generate a new ciphertext for the G(SA) ∈ revoked user and upload it into the cloud.
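The bookkeeping behind these steps can be sketched as a small state holder. The class name, method names, and the `version` counter are hypothetical, and the sketch assumes that the fresh random value is drawn from the Fibonacci values other than the current one; deriving the new (Pr, Pu) pair and re-encrypting the G(SA) are left to the caller.

```python
import random

class GroupAccess:
    """Tracks active and revoked GAs and the current secret for one G(SA)."""

    def __init__(self, fibonacci_values, rng=None):
        self.fibonacci_values = list(fibonacci_values)
        self.rng = rng or random.Random()
        self.active = set()
        self.revoked = []
        self.secret = self.rng.choice(self.fibonacci_values)
        self.version = 1

    def add_user(self, ga_id):
        self.active.add(ga_id)

    def revoke_user(self, ga_id):
        self.active.discard(ga_id)                        # step 1: delete the user ID
        self.revoked.append(ga_id)                        # step 2: record it as revoked
        candidates = [v for v in self.fibonacci_values if v != self.secret]
        self.secret = self.rng.choice(candidates)         # step 3: new random Fibonacci value
        self.version += 1                                 # steps 4-5: caller re-keys and re-encrypts

    def may_decrypt(self, ga_id):
        return ga_id in self.active

access = GroupAccess([5, 8, 13, 21], random.Random(1))
access.add_user("GA_1")
access.add_user("GA_2")
old_secret = access.secret
access.revoke_user("GA_2")
print(access.may_decrypt("GA_1"), access.may_decrypt("GA_2"), access.secret != old_secret)
```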

Result and discussions

The results of the proposed MRFC algorithm are discussed in this section. A synthetic structured banking data set, which contains sensitive information about the DO, is used to evaluate the proposed method. The data set consists of 1000 records with 25 attributes, of which 15 are SA and 10 are NSA. The SA and NSA counts vary according to user preference, e.g. user_1 SA = 15, user_2 SA = 12, user_3 SA = 8, etc. Only the SA are encrypted, instead of all attributes, to reduce encryption time and computational complexity while retaining high security. The proposed system is implemented and validated using JDK 1.7 in NetBeans 7.1. CloudMe is used as the private cloud for data storage.

Sensitive attribute identification analysis

In the proposed system, the SA are identified according to the preferences of the DO. In the existing KNN classification-based technique, the number of SA identified depends on the training set. The attribute count given to the training set is 15, and when the algorithm is executed multiple times, the identified attribute count stays the same. In the proposed system, however, the identified attribute count varies with DO preferences [7]. Figure 2 compares the SA identification of the proposed technique and the existing KNN classification technique.

Fig. 2 Sensitive attribute identification comparison

Execution time analysis

In the proposed system, the execution speed depends on the number of records to be encrypted and decrypted. Only the SA are encrypted instead of all attributes, and only a specific group of attributes is decrypted instead of all attributes. This process takes less execution time than encrypting and decrypting all attributes. Table 3 shows the execution speed of the proposed system.

Table 3 Execution speed

Encryption time analysis

The encryption time consumed by the proposed system for various numbers of attribute groups is shown in Fig. 3. The encryption time varies with the number of attributes: when the attribute count is small, the encryption and decryption times are short, and when the attribute count increases, the encryption time also increases. Encrypting all attributes therefore takes more encryption and decryption time than a minimal-sized attribute group. The MRFC encryption technique is more reliable than the conventional schemes and works well for larger data volumes, because partitioned attribute groups of varying size are encrypted instead of the entire data. Hence, the proposed technique takes less encryption time. The encryption time is measured in milliseconds (ms). The time complexity increases linearly as the number of groups increases.

Fig. 3 Encryption time analysis

In the proposed G(SA) storage technique, each GA requires very few attributes; for example, each GA requires fewer than five SA. Each G(SA) is encrypted with a separate GK. Hence, the encryption time depends on the number of organizations involved in the process. If ‘n’ organizations are involved in the encryption, then the encryption time for the entire G(SA) is defined as follows:

$$ {\text{Encryption}}\;{\text{time}}\;G\left( {S_{{\text{A}}} } \right) = n \times {\text{number}}\;{\text{of}}\;{\text{groups}}{.} $$

Similarly, the NSA are separated from the SA before encryption and are not involved in the encryption process. Hence, the encryption time of the proposed system is less than the time needed to encrypt all attributes.

Decryption time analysis

Owing to ECC-based key generation, the proposed MRFC technique works faster than an RSA-based encryption and decryption process. In the proposed technique, both encryption and decryption depend on the GK, which is shared between the DO and the GA. Instead of decrypting all attributes, a GA decrypts only its specific G(SA). This requires less time than decrypting the complete data. Figure 4 provides the decryption time analysis of the proposed scheme.

Fig. 4 Decryption time analysis

Fig. 4 shows that the proposed technique takes less decryption time than decrypting all attributes. This reduction in encryption and decryption time reduces the processing overhead of authorized users.

Memory space utilization analysis

Table 4 and Fig. 5 show the memory space consumption of the proposed work in the encryption process. The memory space utilization is represented on the y-axis, and the file sizes used in the experiments are represented on the x-axis. The amount of storage space required to execute the algorithm on a given amount of input data is known as the encryption storage space. Here, the proposed method occupies less storage space than the existing techniques. The memory space consumption is computed using the following formula:

Table 4 Memory space consumption

Fig. 5 Memory space consumption

$$ {\text{Consumed}}\;{\text{memory}}\;{\text{space}} = {\text{total}}\;{\text{memory}}\;{\text{space}} - {\text{amount}}\;{\text{of}}\;{\text{free}}\;{\text{space}} $$

Lower memory space utilization means lower storage cost in the cloud. Because cloud computing follows a pay-per-use model, the memory cost of SA encryption is lower than that of encrypting the entire data.
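The consumed-space formula above can be sketched in a few lines. This is an illustration under assumptions: it interprets "memory space" as storage on a file system, the helper name `consumed_space` is hypothetical, and the path `"."` is an arbitrary choice. It is not the measurement procedure used for Table 4.

```python
import shutil


def consumed_space(total_bytes, free_bytes):
    """Consumed memory space = total memory space - amount of free space."""
    if free_bytes > total_bytes:
        raise ValueError("free space cannot exceed total space")
    return total_bytes - free_bytes


# Example on the local file system (the path "." is an arbitrary choice).
usage = shutil.disk_usage(".")
print(consumed_space(usage.total, usage.free))
```

Taking the difference of this value before and after storing the encrypted G(SA) would give the space attributable to the encrypted data.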

Key generation time analysis

The key generation analysis reports the time taken for different numbers of keys and compares it with the work of Rui Ruo [28]. The proposed system takes less time. The key generation time for the proposed system is shown in Table 5. Key generation is a one-time process; hence, the proposed system incurs no key-updating overhead.

Table 5 Key generation time analysis

The proposed MRFC system takes less time than the RUIGUO technique for different numbers of generated keys. The number of keys generated depends on the number of organizations involved in the process. For example, if four organizations are involved, four GKs are generated for encryption. The same applies to the other cases.

These experimental results show that the proposed DO preference-based SA identification technique satisfies the DO requirement. Similarly, they show that the proposed MRFC-based SA protection technique requires less encryption/decryption time, memory space, and key generation time than processing all attributes. The major role of a cryptographic technique is to provide secure data storage and communication. Hence, the security strength of the proposed system must also be established. The following section discusses the security strength of the proposed system.

Security analysis

The components used to assess a secure and efficient storage representation are the keyspace, security of data against attacks, computational speed, information entropy, and correlation coefficient [29].

Keyspace analysis

The set of all keys that can be used in a cryptographic technique is known as the keyspace. The strength of the technique depends on the length of the key: the larger the keyspace, the more resistant the algorithm is to a brute force attack. The key length is given as a number of bits, and an N-bit key has a keyspace of \(2^{N}\) possibilities. From the cryptographic point of view, the keyspace must be greater than \(2^{100}\) to give high-level security [30]. The GK length of the proposed encryption algorithm is 256 bits; hence, the keyspace of a single GK is \(2^{256}\). In the proposed system, ‘n’ organizations are involved in the process. Thus, a keyspace of \(2^{256 \times (n+1)}\) is used. This space is sufficient for reliable, practical usage and resists brute force attacks.
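The keyspace arithmetic can be checked with a short sketch. The guessing rate of \(10^{12}\) keys per second and the helper names are assumptions made only for illustration; the 256-bit key length and the \(2^{100}\) threshold are the values stated in the text.

```python
GK_BITS = 256          # group key length stated in the text
MIN_SECURE_BITS = 100  # threshold exponent (2^100) cited in the text


def combined_keyspace_bits(n_organizations):
    """Exponent of the combined keyspace 2^(256 * (n + 1))."""
    return GK_BITS * (n_organizations + 1)


def years_to_exhaust(bits, guesses_per_second=10**12):
    """Worst-case exhaustive-search time in years at an assumed guessing rate."""
    seconds_per_year = 365 * 24 * 3600
    return 2**bits // (guesses_per_second * seconds_per_year)


print(combined_keyspace_bits(3))           # exponent for n = 3 organizations
print(GK_BITS > MIN_SECURE_BITS)           # a single GK already exceeds the threshold
print(years_to_exhaust(GK_BITS) > 10**50)  # exhaustive search of one GK is impractical
```

Even a single 256-bit GK lies far beyond the \(2^{100}\) threshold, so the combined exponent mainly reflects that the keys must be attacked independently.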

Attack analysis

Well-known attacks are examined with respect to the number of analysis steps and the time required for a successful attack. Owing to the discrete logarithm approach, adversaries are unable to obtain the key or the data in polynomial time. This is shown in the following points.

  • Inside attack In the proposed system, the SA are divided into ‘n + 1’ groups, and the DO encrypts each group with its own GK. After encryption, the encrypted G(SA) is uploaded to the CSP. Because each group is encrypted with a separate GK, no one can predict the GK used to encrypt a specific group. Thus, inside attacks are avoided in the proposed MRFC technique.

  • Outside attack In a GK-based access system, the SA accessible to a GA could also be accessed by other members of the same organization. To overcome this drawback, the GA(Pr) is used for GK generation. Hence, the Pr of the GA is required to decrypt a specific G(SA), and no one else can access the group information.

  • Brute force attack In a brute force attack, an adversary (Ã) tries to identify the GKs and plaintext messages in two ways: guessing a GK or forging a GK. In both cases, the à tries all possibilities in an attempt to identify the key within polynomial time. In the proposed system, however, the keyspace is \(2^{256 \times (n+1)}\), and searching such a large keyspace within a bounded time is infeasible. Even if an adversary identifies one GK, the remaining GKs cannot be predicted, because each GK is independent of the others and depends on the DO and the GA. For example, if three GAs are involved in the proposed system, the combined keyspace is \(2^{256+256+256+256} = 2^{1024}\). Generally, a 256-bit key in ECC provides a higher level of security than a 1024-bit key in RSA. Since the proposed system uses a combined keyspace of \(2^{1024}\), it cannot be broken by brute force in polynomial time.

  • Known group key (resilience)

    Theorem

    If any GK is identified by an Ã, the remaining GKs cannot be derived from it, because each GK is based on independent MRFC-ECC random numbers.

    Proof:

    With MRFC-ECC-based GKs, the other GKs must not be disclosed by a compromised GK. Assume that an à has compromised some \({G}_{{K}_{i}}= {P}_{{u}_{A}}* {P}_{{r}_{B}}\), where \({P}_{{u}_{A}}=({P}_{{r}_{A}}*G)\) and \({P}_{{r}_{i}}\leftarrow (Q*A\left[i\right]+P+\left(\frac{F\left[n\right]}{A\left[n\right]}\right))\). The security of the GK depends on the randomness of Pr. Suppose the à uses the compromised GK to find another \({P}_{r}^{1}\leftarrow ({Q}^{1}*A\left[i+1\right]+{P}^{1}+\left(\frac{F\left[{n}^{1}\right]}{A\left[{n}^{1}\right]}\right))\). The à knows only the information of GK ∈ G(SA). Using \({G}_{K}^{1}\) to access another G(SA) is not possible: owing to the ECDLP and the Fibonacci series-based random function, the hardness of the proposed system is high. That is,

    $$ P_{{r_{i} }} \leftarrow \left( {Q*A\left[ i \right] + P + \left( {\frac{F\left[ n \right]}{{A\left[ n \right]}}} \right)} \right) \ne P_{r}^{1} \leftarrow \left( {Q^{1} *A\left[ {i + 1} \right] + P^{1} + \left( {\frac{{F\left[ {n^{1} } \right]}}{{A\left[ {n^{1} } \right]}}} \right)} \right) $$

    That is, owing to the random function R, the A[i] values differ for each Pr, and \({G}_{K}\ne {G}_{K}^{1}\). The same holds for the values R ∈ {P, Q}. If the ECC prime is large, identifying Pr is hard. Thus, the proposed MRFC-ECC resists the known GK attack.

  • Key compromise impersonation

Theorem

If an à reveals the DO(Pr), only the corresponding G(SA) can be accessed by the Ã. It is impossible to compute the remaining Pr values of the same DO.

Proof

Suppose an à knows the DO(Pr) and tries to obtain another Pr of the same DO. The proposed MRFC-ECC algorithm resists this attack. Suppose the à knows the GA’s Pu and sends \(\left({G}_{ID}, {P}_{{u}_{B}}\right)\) to the DO. The DO then computes and returns \(\left({D}_{ID}, {P}_{{u}_{A}}\right)\) to the Ã. However, to derive another GK, the à must obtain the corresponding Pr for that GK. Owing to the difficulty of the ECDLP and the MRFC-ECC-based Pr, the à is unable to derive a new Pr. Thus, the proposed MRFC-ECC resists the key-compromise impersonation attack.
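The relation \({G}_{K} = {P}_{u_A} * {P}_{r_B}\) used in the two proofs above is an elliptic-curve Diffie–Hellman style derivation. The following toy sketch illustrates it under stated assumptions: the curve \(y^2 = x^3 + 2x + 2\) over \(\mathbb{F}_{17}\) with base point (5, 1) is a textbook-sized example chosen only for readability, the private keys are fixed small integers rather than MRFC-generated values, and the helper names are hypothetical. A practical system would use a standard 256-bit curve.

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17 with base point G = (5, 1) of order 19.
# These parameters are illustrative assumptions, far too small for real use.
P_MOD, A = 17, 2
G = (5, 1)


def ec_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, point):
    """Double-and-add computation of k * point."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result


# Private keys (Pr) are fixed here; in the paper they come from the MRFC generator.
pr_do, pr_ga1, pr_ga2 = 7, 3, 11
pu_do = scalar_mult(pr_do, G)

# GK = Pu_A * Pr_B: each side combines its own Pr with the other side's Pu.
gk1_at_ga = scalar_mult(pr_ga1, pu_do)
gk1_at_do = scalar_mult(pr_do, scalar_mult(pr_ga1, G))
gk2_at_ga = scalar_mult(pr_ga2, pu_do)

print(gk1_at_ga == gk1_at_do)  # both sides derive the same GK
print(gk1_at_ga != gk2_at_ga)  # a different GA private key gives a different GK
```

Recovering `pr_ga2` from the public point `scalar_mult(pr_ga2, G)` is the ECDLP; on a curve this small it is trivial, which is why the parameters here serve only to show the algebra.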

  • Chosen plaintext attack The proposed GK technique is secure against the chosen-plaintext attack; this is discussed as a security game between an à and the challenger (Ĉ). In a chosen-plaintext attack, the ‘Ã’ obtains the ciphertext of an arbitrary plaintext and tries to reveal all or part of the message from the ciphertext.

Theorem 2

Within polynomial time, the ‘Ã’ is unable to break a specific G(SA) encrypted under the GK with a challenge access structure in the security game, provided that the Elliptic Curve Diffie–Hellman (ECDH) assumption holds. The game is as follows:

Proof: Game Initialization and Query Phase 1 The ‘Ã’ chooses the challenge access rights (Ŕ) and sends them to the Challenger (Ĉ). In the setup phase, the ‘Ĉ’ executes the algorithm for generating a GK and sends the GK to the ‘Ã’.


Challenge The ‘Ã’ selects two attribute groups, G1(SA) and G2(SA), and sends them to the ‘Ĉ’. The two groups have the same number of attributes and the same size. The ‘Ĉ’ receives these groups and generates a random bit ∂ ∈ {0,1}, which it uses for the encryption of the groups. The ‘Ĉ’ returns (∂ = Enc(G(SA), Ŝ, Pr)) to the ‘Ã’.


Query Phase 2 The ‘Ã’ sends another request message to the ‘Ĉ’ to obtain a further GK. In response, the ‘Ĉ’ proceeds as in Phase 1.


Guess The ‘Ã’ submits a guess ∂1 ∈ {0,1} for ∂ and wins the game when ∂1 = ∂. The advantage of the ‘Ã’ in the game is defined as (Pr[∂1 = ∂] – 1/2).

The proposed GK scheme is said to be secure against the chosen-plaintext attack if no probabilistic polynomial-time adversary has a non-negligible advantage in the above game.
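The structure of the game and the advantage measure can be simulated in a few lines. The sketch below rests on loud assumptions: a fresh one-time pad stands in for the MRFC-ECC encryption, the two attribute groups are made-up byte strings of equal size, the adversary simply guesses, and the random generator is seeded only to make the run reproducible. It illustrates how the advantage is estimated; it is not a security proof of the proposed scheme.

```python
import random


def cpa_advantage(adversary, trials=20000, seed=1):
    """Estimate |Pr[guess == bit] - 1/2| for the game described above."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    g1, g2 = b"pin=4821;sal=52000", b"pin=9377;sal=18000"  # equal-size groups
    wins = 0
    for _ in range(trials):
        bit = rng.randrange(2)                       # challenger's hidden bit
        pad = bytes(rng.randrange(256) for _ in g1)  # fresh pad (stand-in cipher)
        challenge = bytes(m ^ k for m, k in zip((g1, g2)[bit], pad))
        wins += adversary(challenge, g1, g2, rng) == bit
    return abs(wins / trials - 0.5)


def guessing_adversary(challenge, g1, g2, rng):
    """An adversary that learns nothing from the ciphertext and guesses at random."""
    return rng.randrange(2)


print(cpa_advantage(guessing_adversary))
```

An adversary with a non-negligible advantage would make this estimate stay bounded away from zero as the number of trials grows.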


  • Forward and backward revocation When a new GA is added to a group, a new GK is generated for that GA and used for the encryption process. Similarly, if a GA is revoked from its role, the GK based on that GA is also revoked. In this process, only the specific group of attributes is re-encrypted instead of all groups. Thus, forward and backward revocation is less complex than in the existing revocation processes.

    Forward secrecy If a new GA joins the process and tries to access the E(G(SA)), the proposed MRFC-ECC provides forward secrecy with respect to the new GA.

Proof

Forward secrecy of the MRFC-ECC-based GK means that when a new GA joins the process and tries to access DO information, a new GK is generated without modifying the GKs of the existing GAs. To generate a new GK for a new GA, the DO checks whether GA ∈ Oi; if no existing GA belongs to Oi, a new GK is generated for the GA.

Backward Secrecy In a cloud-based storage system, user revocation and addition are regular processes. If a G(SA) is encrypted with a specific GK and the corresponding GA is revoked, that GK needs to be updated.

Proof

In a revocation process, a new random number R1 is chosen for Pr generation:

$$ \begin{aligned} & G_{K}^{1} \in \left( {D_{O} , G\left( {S_{A} } \right), G, P_{r}^{1} , P_{u}^{1} , R^{1} } \right) \\ & G_{K} \ne G_{K}^{1} \\ & E\left( {G\left( {S_{A} } \right)} \right)^{1} \in G_{K}^{1} \\ & E\left( {G\left( {S_{A} } \right)} \right) \in G_{K} ,\quad {\text{i.e.}}\quad E\left( {G\left( {S_{A} } \right)} \right)^{1} \ne E\left( {G\left( {S_{A} } \right)} \right). \\ \end{aligned} $$

In this analysis, the ciphertexts differ because \({G}_{K}\ne {G}_{K}^{1}\). Hence, the revoked GA is unable to access the E(G(SA)) encrypted with the new GK.
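The per-group re-keying described in the revocation discussion above can be sketched as follows. The organization names and the helpers `new_group_key` and `revoke` are hypothetical, and a random 256-bit value from the standard `secrets` module stands in for the MRFC-ECC key generation.

```python
import secrets


def new_group_key():
    """Stand-in for MRFC-ECC key generation: a random 256-bit value."""
    return secrets.token_bytes(32)


def revoke(group_keys, group_id):
    """Return a copy of the key table in which only group_id has a fresh GK."""
    updated = dict(group_keys)
    updated[group_id] = new_group_key()
    return updated


# Hypothetical organizations, one GK each.
group_keys = {org: new_group_key() for org in ("bank", "hospital", "insurer")}
after = revoke(group_keys, "hospital")

changed = [org for org in group_keys if group_keys[org] != after[org]]
print(changed)  # only this group's G(SA) has to be re-encrypted
```

Because the other entries of the key table are untouched, the groups belonging to the remaining organizations need no re-encryption.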

Mathematical proof

The comparative analyses in terms of security and storage overhead are given in Tables 7 and 8, respectively. Table 6 lists the mathematical descriptions used for the analysis.

Table 6 Mathematical description
Table 7 Comparison of security analysis

Because different GKs are used, identifying each key is difficult. Hence, the proposed system is collusion resistant (Co-Res), supports both backward and forward revocation (B-F), and provides confidentiality against the CSP (Ag-Cloud) and against users (Ag-User). Similarly, the proposed system provides provable security, integrity, and access control.

Table 7 shows the comparative security analysis of existing techniques, namely the distributed access control scheme in cloud (DACC), the data access control multi-authority cloud storage system (DAC-MACS), and the extensive data access control multi-authority cloud storage system (EDAC-MACS), together with the proposed MRFC technique. These techniques are compared in terms of collusion resistance, revocation security, data confidentiality, provable security, integrity, and access control against the static corruption of authorities. Compared with the existing techniques, the proposed technique additionally provides integrity and access control. For example, each group of SA is accessed by an individual organization through a separate GK. This GK is generated by the DO, who retains complete control over the data. Through this process, the access control and integrity of the proposed system are maintained. Hence, the proposed technique provides better security.

Table 8 shows the comparative analysis of the storage overhead for the existing DACC, DAC-MACS, and NEDAC-MACS schemes and the proposed approach. The existing techniques may process multiple attributes, which require more storage. In the proposed MRFC technique, by contrast, only the minimal-sized SA is processed and stored, which reduces the storage overhead. As a result, the proposed technique improves performance with reduced storage overhead.

Table 8 Storage overhead analysis

Conclusion

This paper proposed sensitive attribute-based encryption for a secure cloud storage system using MRFC. The MRFC provides enhanced security to sensitive data at nominal processing cost, and security is provided with the knowledge of the data owner. The sensitive data are divided into ‘n + 1’ groups, and each group is encrypted with a different GK. Hence, recovering the entire data from a single key breach is infeasible. The encryption and decryption processes are performed with the knowledge of the group admin; hence, insider attacks are prevented. Similarly, collusion, chosen-ciphertext attacks, and known-plaintext attacks are avoided, and forward and backward revocation is supported, through the group key-based encryption process. The key management problem is overcome through the group key, which is managed by the individual data owners and group admins. The novelty of the proposed work lies in the group key-based encryption technique. The hardness of group key identification by an adversary is improved through the MRFC-based elliptic curve technique. Elliptic curve cryptography usually provides higher security with a minimal key size. In addition, random Fibonacci cryptography is used to select the random numbers that serve as private keys, which makes key identification by an adversary harder. Hence, the proposed technique resists the brute force attack, the known group key attack, the key-compromise impersonation attack, and the chosen-plaintext attack, and handles forward and backward user revocation efficiently. In future work, the same technique will be applied to unstructured complex documents containing images, such as medical documents.