1 Introduction

The widely popular Software as a Service (SaaS) cloud computing model employs a multi-tenant architecture to achieve cost efficiency along with scalability and flexibility. SaaS applications are commonly based on multi-tenant data storage, where resources as well as the datastore schema are shared among different tenants. In contrast, a single-tenant application creates a separate instance for each user or tenant, thereby increasing the service cost for tenants and the maintenance cost for providers. Consequently, sharing of resources in multi-tenancy lowers the cost for all [1, 2]. Despite this advantage, a multi-tenant architecture faces a challenge related to data security. As multiple tenants share the same storage, there is a higher possibility of data infringement if tenants’ data is not isolated from each other. A malicious tenant can attack other tenants’ data by sending malicious information, initiating illegitimate transactions or denying transactions among different tenants. Therefore, a secure system is required to avoid data tampering in such a multi-tenant system. Flexibility and data isolation are the two most important features required in a multi-tenant application.

A recent technology that can be used for implementing tamper-free data storage for multi-tenant applications is blockchain. Apart from its frequent use in cryptocurrency, it is emerging as a prominent technology for building secure and immutable storage systems [3]. Blockchains, or distributed ledgers, eliminate the risk of alteration of data once stored through the use of decentralization and cryptographic hashing. Thus, blockchain storage is designed to save any data or transactions permanently and in an immutable manner, i.e., without any modification. Moreover, private or permissioned blockchains only allow identifiable participants to be admitted to the blockchain network [4]. Such blockchains provide the permissioned network members with an immediate and transparent view of the data stored on an immutable ledger.

The use of blockchain by a multi-tenant SaaS application to prevent tampering and to allow auditing of transactions between different tenants is bound to build trust among tenants. Therefore, this paper proposes a single blockchain-based scalable system to handle shared data among multiple tenants while preventing data tampering between tenants. Specifically, tenant data is stored off-chain while its record or metadata is stored on the blockchain using the MultiChain platform [5]. This allows efficient data storage, efficient querying as well as complete validation of a data storage or retrieval operation. Further, the MultiChain-based blockchain is a permissioned network where any data can be stored or accessed only by authorized users or tenants. This makes the system apt for data storage by multiple tenants of a SaaS application without the risk of any unauthorized data access or tampering.

The system proposed in the paper endeavours to improve upon other related blockchain based storage systems in the following aspects:

  1. (a)

    Support for multi-tenant applications: The proposed system’s main objective is to provide a multi-tenant storage system where different tenants can store and share data in the same blockchain.

  2. (b)

    Scalability: The proposed system is a scalable storage system due to the use of blockchain along with an off-chain storage for tenants’ data.

  3. (c)

    Tamper-proof: To build trust among the tenants, a permissioned blockchain framework, MultiChain, is used for the implementation, which provides security as well as tenant data isolation.

  4. (d)

    Query processing time: The proposed system uses off-chain storage for tenant data and employs the blockchain to store only the metadata. This is unlike most existing systems, which store the entire data on the blockchain, resulting in high query processing time.

The rest of the paper is structured as follows: An overview of blockchain technology, the related work and the threat model are discussed in Sect. 2. The design of the proposed system is presented in Sect. 3. Data validation and the security analysis are covered in Sect. 4, and the results of implementation are described in Sect. 5. Finally, we conclude the article.

2 Background and related work

2.1 Overview of blockchain technology

Blockchain technology is being used in varied applications ranging from cryptocurrencies and transaction management to record keeping for digital assets. A blockchain is a distributed ledger that consists of a chain of blocks in which transaction details or records are stored. The first block in the blockchain is termed the genesis block. Each block contains the hash of the previous block along with a timestamp and transaction data or records [3, 6]. Figure 1 illustrates the structure of a blockchain.

Fig. 1 Blockchain structure

In a blockchain, every block has its own unique nonce and hash. Each block also references the hash of the previous block in the chain. This makes mining of a block difficult, especially on large chains. Miners use special software to solve the computationally hard problem of finding a nonce that generates an accepted hash. Since the nonce is only 32 bits long while the hash is 256 bits long, there are approximately four billion possible nonce values, and a miner may have to try a large share of them before finding one that yields an accepted hash. Only after that is the miner’s block added to the chain [7]. Once a block is successfully mined, the change is accepted by all of the nodes on the network. Making a change to any block previously added to the chain requires re-mining not just the changed block, but all of the blocks that follow it. Therefore, it is extremely difficult to manipulate data or transactions added to the blockchain.
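The nonce search and the re-mining argument above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the block format of Bitcoin or MultiChain: the block fields, the JSON serialization and the difficulty rule (a hash beginning with the hypothetical prefix "00") are chosen only for illustration.

```python
import hashlib
import json


def block_hash(index, prev_hash, data, nonce):
    """SHA-256 over the block's fields, serialized deterministically."""
    payload = json.dumps([index, prev_hash, data, nonce]).encode()
    return hashlib.sha256(payload).hexdigest()


def mine_block(index, prev_hash, data, prefix="00"):
    """Try nonces 0, 1, 2, ... until the hash starts with `prefix`.

    The search is capped at 32 bits, mirroring the nonce size in the text.
    """
    for nonce in range(2**32):
        digest = block_hash(index, prev_hash, data, nonce)
        if digest.startswith(prefix):
            return {"index": index, "prev_hash": prev_hash,
                    "data": data, "nonce": nonce, "hash": digest}
    raise RuntimeError("no valid nonce found in the 32-bit range")


def build_chain(records, prefix="00"):
    """Mine one block per record, starting from the genesis block."""
    chain, prev = [], "0" * 64
    for i, record in enumerate(records):
        block = mine_block(i, prev, record, prefix)
        chain.append(block)
        prev = block["hash"]
    return chain


def is_valid(chain, prefix="00"):
    """Recompute every hash and check the links between blocks."""
    prev = "0" * 64
    for block in chain:
        expected = block_hash(block["index"], block["prev_hash"],
                              block["data"], block["nonce"])
        if (block["hash"] != expected or block["prev_hash"] != prev
                or not expected.startswith(prefix)):
            return False
        prev = block["hash"]
    return True


chain = build_chain(["genesis", "record A", "record B"])
print(is_valid(chain))        # the untouched chain validates
chain[1]["data"] = "forged"   # alter an earlier block without re-mining it
print(is_valid(chain))        # the stored hash no longer matches the data
```

To make the forged chain valid again, block 1 would have to be re-mined, which changes its hash and in turn forces block 2 to be re-mined as well.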

Apart from its property to store data practically in an immutable manner, the growing number and variety of blockchain applications is also due to its decentralised nature. Blockchain allows building of transparent, secure and scalable systems where users can view and check any action performed on the chain.

There are different variations of blockchains available, such as Consortium and Hybrid blockchains, but Public and Private blockchains are the two most popular types. Table 1 depicts how these two types differ from each other.

Table 1 Public v/s Private blockchain

2.2 MultiChain: open source blockchain platform

A substantial number of open-source and commercial platforms, such as Ethereum, Ripple, MultiChain, Sidechain, IBM Blockchain, Hyperledger Fabric, Hyperledger Sawtooth and R3 Corda, are available for the creation of blockchains. The proposed work uses the MultiChain platform to create a blockchain-based data storage system for SaaS applications.

MultiChain is an open-source platform for the creation and deployment of enterprise blockchain, referred to as a private blockchain [5]. It can be used to develop a blockchain to be used either within one or between multiple organizations. MultiChain supports all the prevalent operating systems such as Windows, Linux and Mac and can be used to store any kind of digital asset. The primary advantage of MultiChain platform over other related platforms is the implementation freedom that comes with the use of MultiChain. Blockchain users are given the privilege to define blockchain parameters and an ability to store multiple assets on their blockchain [5, 8]. While designing new blockchains, developers can configure the following parameters of a block: release timing, transaction rate, proof-of-work requirement, mining diversity, active permission types, level of consensus required for creating/removing administrators and miners, maximum block size and maximum metadata per transaction.

Unlike some other platforms like Bitcoin Core, MultiChain makes it easy to configure several blockchains at the same time. Further, MultiChain supports a number of programming languages such as Python, C#, PHP, JavaScript and Ruby. Some significant features of MultiChain are summarized below [9]:

  • Permissioned: To access this blockchain, permission is required since it is an invite-based network. Table 3 illustrates some of the access-level permissions granted by MultiChain.

  • MultiChain Stream: Streams provide data retrieval, timestamping and archiving of data that can be used to implement shared immutable key-value, time-series and identity-driven databases.

  • Assets: A MultiChain asset is the term referring to an item/token having some value, that can be transferred between blockchain users via a transaction.

  • MultiChain handshake protocol: Since blockchains are decentralized, a peer-to-peer connection is established between the nodes in a blockchain via handshaking by MultiChain.

  • Scalability: MultiChain provides flexibility in storing information on a stream: data can be kept on-chain or moved off-chain at any time, as decided by the administrator. This dual (on-chain and off-chain) storage method further contributes to the scalability of the system.

  • Mining: Mining of data blocks in MultiChain is performed only by a set of pre-assigned, identifiable members of the blockchain. Instead of multiple nodes competing to solve a mathematical puzzle or hash, these pre-assigned block validators add new blocks to the blockchain.

The above features and the flexibility provided by MultiChain motivate its use as the blockchain platform for the implementation of a private blockchain in this work.
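As an illustration of how a stream could serve as a shared key-value store, the following Python sketch builds JSON-RPC request bodies for the MultiChain API commands `create`, `publish` and `liststreamkeyitems`. It is a sketch under stated assumptions: the stream name, the key and the item content are hypothetical, and nothing is sent over the network, so the node URL and credentials that a real deployment needs are left out.

```python
import json


def rpc_payload(method, params, request_id=1):
    """Build a JSON-RPC request body of the kind accepted by a MultiChain node."""
    return json.dumps({"jsonrpc": "1.0", "id": request_id,
                       "method": method, "params": params})


def to_hex(text):
    """Stream item data is passed here as a hexadecimal string."""
    return text.encode("utf-8").hex()


# Hypothetical stream name, key and item, for illustration only.
STREAM = "records"
KEY = "patient-x"

# Create a stream that is closed to writers without explicit permission.
create_req = rpc_payload("create", ["stream", STREAM, False])
# Publish one item under the key; the payload stands in for a document hash.
publish_req = rpc_payload("publish", [STREAM, KEY, to_hex("doc-hash-goes-here")])
# Retrieve all items previously published under that key.
query_req = rpc_payload("liststreamkeyitems", [STREAM, KEY])

for req in (create_req, publish_req, query_req):
    print(req)
```

Each request body would be posted to the node's RPC port with HTTP basic authentication; those details depend on the chain's configuration and are not shown here.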

2.3 Literature review

Significant research is being performed in multidisciplinary domains utilizing blockchain technology. The work in [10] addresses the problem of online piracy in the movie industry by developing a blockchain-based anti-piracy system, “Vanguard”. This system substitutes the conventional intellectual property (IP) registration system and keeps track of the owners’ IP rights to ensure that there is no illicit distribution of their data. Blockchain technology and certificateless cryptography have been used in [11] to implement a data storage system for managing and protecting huge amounts of IoT data. The authors of [12] have utilized blockchain in the design of approaches for data sharing in smart cities. A Blockchain Tree to store information from smart ID cards has been proposed in [13]; it provides multilevel security by linking lower-level blockchains to higher-level ones. The work in [4] focuses on the implementation of blockchain for various applications in the food industry, such as food tracing, land registration, customer awareness programs and agriculture insurance. Its authors have used the open-source platform MultiChain for the implementation of their system. One of the main advantages of using blockchain is the prevention of forgery or fraud, as it is immutable and transparent. One such scenario has been discussed in [14], where a blockchain-based system is used for preventing property frauds such as those involving bank loans.

Ping End-to-end Reporting (PingER) is a framework developed by the SLAC National Accelerator Laboratory, USA, for worldwide end-to-end internet performance measurement [15]. The cited work proposes a decentralized blockchain-based data storage system for PingER. Here, instead of centralized storage of data, all the data files are distributed among multiple locations using Distributed Hash Tables (DHT) and only the metadata of these files is stored on the blockchain. A newer term, Blockchain-as-a-Service (BaaS), analogous to SaaS, has been discussed in [16]. It is a cloud-based service that eases blockchain set-up and provides a platform to run applications, security and some other core features of blockchain. The authors of [17] have proposed a blockchain-based system for multi-tenant architecture. Each tenant has an individual permissioned blockchain, which in turn is connected to a main chain. This work was carried out with Laava ID Pty Ltd (Laava) and implemented using Ethereum.

The present work considers an important application area for blockchain, i.e., the healthcare sector. Significant existing work on the application of blockchain in the healthcare sector is summarized in Table 2. For instance, the authors in [18] have included blockchain technology in digital services like online consultations. They have used a decentralized solution to provide security in the healthcare sector. The use of blockchain has added transparency to user-client communication, here between doctor and patient. The authors have also presented three case studies in this field, namely Telemedicine, patientory and medblock. Paper [19] is a survey that highlights the work done in the field of electronic health record (EHR) systems using blockchain technology. Its authors have discussed various consensus algorithms to be used in public blockchain, such as the Practical Byzantine Fault Tolerance replication algorithm (PBFT), RAFT, Proof of Authority (PoA), Proof of Capacity (PoC) and Proof of Elapsed Time (PoET). A privacy-preserving system, MediBchain, has been proposed in [20] for healthcare data using blockchain technology. For encrypting private data, it applies Elliptic Curve Cryptography (ECC).

Table 2 Comparison of proposed system with related work in healthcare domain

A mobile application has been developed using blockchain technology for storing data of cognitive behavioral therapy for insomnia patients [28]. Data is stored in Hyperledger Fabric blockchain network in JSON format. This blockchain based system provides data transparency and accessibility without risk of data tampering. Blockchain has been integrated with artificial intelligence systems in [29] to design a predictive system for COVID-19 infection for a better clinical risk management. The improvements made by the proposed system are listed in Table 2.

2.4 Threat model

We analyse the security of our proposed system using the threat model described in this section. Threat modelling is an organized procedure for identifying probable threats and potential vulnerabilities and for listing a corresponding mitigation plan. In a cloud environment, data is distributed among several servers across different locations and is therefore exposed to a range of security threats [30]. Accordingly, the main idea of the threat model utilized in our work is to provide a systematic analysis of the different possible attacks, the vulnerabilities and their possible defence mechanisms in a cloud storage environment.

This work uses the STRIDE framework [31], developed by Microsoft, to create an effective threat model for the blockchain-based multi-tenant system. The STRIDE model covers the following threats:

  1. (i)

    Spoofing: This is a threat to authentication where the attacker pretends to be an authorised user and uses that user’s identity to access his/her clinical data stored in the cloud or blockchain network.

  2. (ii)

    Tampering: This is a threat to integrity where the attacker performs some unauthorised modification of the data, thus violating the integrity of the data stored on the system.

  3. (iii)

    Repudiation: Repudiation is a breach of contract in which a user denies having performed a certain transaction.

  4. (iv)

    Information Disclosure: This is a threat to confidentiality where the attacker tries to access the information from the storage system without any authorization.

  5. (v)

    Denial of Service: This is a threat to availability where the attacker floods the cloud environment’s resources with a heavy volume of fake data packets, so that the system is unavailable to handle real data traffic.

  6. (vi)

    Elevation of privilege: This is a threat to authorization where an already authorised user tries to access the data of another user without sufficient permission.

Further discussion and security threat analysis are covered under Sect. 4.2.

3 Multi-tenant tamper-proof storage

This section presents the design and implementation of tamper-proof storage for a multi-tenant SaaS application related to healthcare. Management of healthcare data is a critical task as it involves confidential and sensitive information related to patients. Along with the secure storage of the patient’s data, its efficient retrieval is also imperative. For instance, timely fetching of data is required to speed up the treatment process after diagnosis by the doctor or to simplify insurance claims.

3.1 Motivation

Blockchain technology has lately emerged as one of the most promising solutions for secure and efficient data storage. Data can be directly stored in the form of transactions on the blockchain network. These transaction data and records are stored in the form of hash digests in a Merkle tree, which makes it difficult to recover the content from the hashed data. The authenticity of the data can, however, be checked by recomputing the hash of a block and comparing it with the stored hash; any mismatch reveals a modification, which is what makes the blockchain effectively immutable.

Despite the advantages provided by blockchain, it faces challenges related to data privacy and query processing time. Firstly, the transaction data stored in a block is visible to every node associated with that particular blockchain network. However, some data may need to be hidden from a certain subset of nodes. Existing solutions to provide data privacy on the blockchain rely on asymmetric encryption of stored data. Transaction data is encrypted with an encryption key and stored on the blockchain. For retrieving the data, the decryption key is provided only to the subset of nodes authorized to access this data; thus, the data remains hidden from the remaining nodes. The proposed work builds a multi-tenant storage system that provides privacy to tenant nodes by isolating their data from each other and giving access to only authorized tenants.

Another challenge is the degradation in query response time as blockchains grow in size. The proposed work is based on horizontal scaling of blockchains, which implies that, as the network is decentralized, new nodes can be added to it depending on the system requirement. However, this growth has a negative impact on the time spent in retrieving data stored on the blockchain, i.e., the query processing time. Although a blockchain stores transaction data in blocks, it does not support a query language, unlike relational databases. Applications that have small data can easily store their entire data on-chain. However, as the data grows in size, retrieving it from the chain takes more time, resulting in a high cost. Using a blockchain to save the entire data of a large-scale application is therefore time-consuming and not scalable. For this reason, the proposed system employs off-chain storage (databases or dedicated file systems) for the actual application data and stores only the metadata on the blockchain. A similar approach has been implemented by systems like Storj [32] and Filecoin [33], though these are not focused on multi-tenant data storage. The proposed storage system is based on a private blockchain developed using the MultiChain platform. Instead of storing complete documents on the blockchain, it stores only metadata such as the transaction time, a summary of all transactions in a block and a reference to the previous block hash, along with the hash of a document. Off-chain data storage is used for storing complete documents in order to reduce the blockchain’s (a) space overhead in storing large documents and (b) time overhead in storing and retrieving data. Storing only the hash value of the document on the blockchain thus keeps the system secure while saving storage space and time.
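The split between off-chain documents and on-chain metadata can be sketched in Python as follows. This is a minimal illustration, assuming an in-memory dictionary in place of the off-chain database and an append-only list in place of the blockchain; the class names, field names and document identifier are hypothetical and do not come from the implemented system.

```python
import hashlib
import json
import time


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class OffChainStore:
    """Stand-in for a database or file system holding full documents."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, content: bytes):
        self._docs[doc_id] = content

    def get(self, doc_id) -> bytes:
        return self._docs[doc_id]


class MetadataLedger:
    """Stand-in for the blockchain: an append-only list of metadata records."""

    def __init__(self):
        self.records = []

    def append(self, doc_id, doc_hash, timestamp):
        # Each record references the hash of the record before it.
        prev_hash = self.records[-1]["record_hash"] if self.records else "0" * 64
        record = {"doc_id": doc_id, "doc_hash": doc_hash,
                  "timestamp": timestamp, "prev_hash": prev_hash}
        record["record_hash"] = sha256_hex(
            json.dumps(record, sort_keys=True).encode())
        self.records.append(record)
        return record


def store_document(store, ledger, doc_id, content, timestamp=None):
    """Keep the document off-chain and only its hash and metadata on-chain."""
    if timestamp is None:
        timestamp = time.time()
    store.put(doc_id, content)
    return ledger.append(doc_id, sha256_hex(content), timestamp)


store, ledger = OffChainStore(), MetadataLedger()
report = b"blood test results " * 5000   # a large document
record = store_document(store, ledger, "test-report-1", report)
# Document size in bytes versus the size of its on-chain record.
print(len(report), len(json.dumps(record)))
```

The on-chain record has a fixed, small size regardless of how large the document is, which is the space saving the paragraph above relies on.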

3.2 A multi-tenant SaaS healthcare application

A multi-tenant SaaS application is utilized by a set of users or tenants. The considered SaaS application is based on the healthcare sector, where a patient’s data is stored on the blockchain and can be accessed by the different nodes involved in the blockchain network. The health insurer, healthcare provider, research institutes, supply chain and the government are the multiple tenants associated with this system. The tenants in the proposed application can be classified according to their role in the system as follows [34]:

  1. (a)

    Data Contributor (DC)

    A Data contributor is any individual or a group of individuals who intend to collect and share their own data or data collected by them among different nodes.

  2. (b)

    Data Readers (DR)

    Data readers are individuals or a group of individuals who plan to use the shared data provided by data-contributors through the blockchain to fulfill their information requirements. A permissioned blockchain allows the grant of different access level permissions to its users, as depicted in Table 3. By default, all the privileges are granted to DC, which can subsequently grant permissions to other tenants such as DRs.

  3. (c)

    Miners

    Miners in a blockchain are those nodes which have permission to add transaction blocks in the network. Thus, in addition to DC, there can be multiple other miner nodes chosen by DC. This implies that a DR can act as a miner if DC grants ‘Mine’ permission to it. In the proposed system if a node is a miner, then its miner_status is 1 else 0.
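The three roles and the miner_status flag can be modelled with a short Python sketch. It is an illustration under stated assumptions rather than the implemented system: the `Tenant` class and the `grant` helper are hypothetical, and the permission names are taken to be MultiChain's global permissions (connect, send, receive, issue, create, mine, activate, admin).

```python
from dataclasses import dataclass, field

ALL_PERMISSIONS = {"connect", "send", "receive", "issue",
                   "create", "mine", "activate", "admin"}


@dataclass
class Tenant:
    name: str
    role: str                                   # "DC" or "DR"
    permissions: set = field(default_factory=set)

    @property
    def miner_status(self) -> int:
        """1 if the tenant holds the 'mine' permission, else 0."""
        return 1 if "mine" in self.permissions else 0


def grant(granter: Tenant, grantee: Tenant, *perms: str) -> None:
    """In this sketch, only a data contributor may grant permissions."""
    if granter.role != "DC":
        raise PermissionError(f"{granter.name} is not a data contributor")
    unknown = set(perms) - ALL_PERMISSIONS
    if unknown:
        raise ValueError(f"unknown permissions: {sorted(unknown)}")
    grantee.permissions.update(perms)


# The DC starts with every permission; DRs start with 'connect' only.
patient = Tenant("patient X", "DC", set(ALL_PERMISSIONS))
provider = Tenant("healthcare provider", "DR", {"connect"})
insurer = Tenant("health insurer", "DR", {"connect"})

grant(patient, provider, "send", "receive", "mine")
print(patient.miner_status, provider.miner_status, insurer.miner_status)  # 1 1 0
```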

Table 3 Access-level permissions by MultiChain

3.3 Creation of blockchain

The work employs a private blockchain for developing a tenant-based storage system. Each patient stores the metadata of his or her medical records in the form of transactions on a single blockchain, and there is a separate blockchain for each patient. Suppose a patient X undergoes some tests for an unknown disease. To store information about this patient (such as personal details), a new blockchain is created with a genesis block, i.e., the first block of this blockchain. Next, the first node, i.e., the patient data node, adds the metadata of X’s test results as transaction data on a block (see Fig. 1). Similarly, all of X’s medical records, such as doctors’ prescriptions and vaccination certificates, can be added as subsequent blocks on this blockchain. The actual data of the patient is stored off-chain in a database [35].

The MultiChain platform has been used for the implementation of a private or permissioned blockchain. Unlike a public blockchain network, which requires tedious mathematical computation for proof-of-work mining to create a new block, here some pre-authorized members, i.e., “Miners”, are authorized to add a new block to the blockchain [9]. The health insurer, healthcare provider, research institutes, supply chain and the government are the multiple tenants, also known as tenant nodes, who can see X’s data in the blockchain. Here, the first node, i.e., the patient who is adding his/her data to the blockchain, acts as the DC (and the default miner node), and the remaining tenant nodes are DRs.

Figure 2 depicts the role of multiple tenants sharing a patient’s data in the proposed system. MultiChain uses the SHA-256 cryptographic hash function to calculate header hashes. Additionally, MultiChain uses smart filters for validating the transaction data. Also, to prove that a certificate issued to a patient exists, Proof of Existence (POE) is provided to the user. In order to prove to tenants that a particular file or patient’s record existed on the stated date and time, the document’s hash and a timestamp are linked with it [36].

Fig. 2 Multiple Tenants sharing Patient’s record in the Proposed System

4 Data validation

Algorithms 1–5 have been proposed to allow retrieval of a patient’s medical record by other tenants of the application. To access the record of the patient, who is the primary data contributor (DC) for our application, tenants who are data readers need the access-level permissions provided by the MultiChain platform (see Table 3). Initially, all nodes have the connect permission, with which each node can connect to the other nodes and read the data stored in the blockchain. All the remaining permissions, such as send, issue, activate, mine and create, are granted to a tenant node on request by the admin node, which is the first node in our case.

The Validate(txn-id) function used in Algorithms 1–5 relies on the transaction filters supported by the MultiChain platform. A transaction filter validates a transaction by considering its inputs, outputs and metadata. The transaction needs to pass this smart filter validation test; a transaction that fails the test is invalid and is therefore rejected. This ensures timely and tamper-proof access to X’s data whenever required.

For instance, consider a text file with a patient’s test results. In our application, this text file is stored on the cloud and only the hash of its contents is stored on the blockchain. Suppose an intruder manages to access and modify this file. Using the SHA-256 hash function, the hash of the file content can be recalculated and compared to the hash stored on the blockchain. If these two hash values do not match, it is evident that the file has been tampered with. Thus, the proposed system is able to detect any tampering of the data stored on the cloud using the hash value of the data stored on the blockchain.
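The check described in this example can be written in a few lines of Python using the standard `hashlib` module. The file content below is hypothetical, and the variable `on_chain_hash` simply stands in for the value that would be read back from the blockchain.

```python
import hashlib


def file_digest(content: bytes) -> str:
    """SHA-256 digest of the file content, as a hexadecimal string."""
    return hashlib.sha256(content).hexdigest()


def is_untampered(current_content: bytes, on_chain_hash: str) -> bool:
    """Recompute the hash of the cloud copy and compare it to the stored one."""
    return file_digest(current_content) == on_chain_hash


original = b"Patient X: haemoglobin 13.5 g/dL\n"
on_chain_hash = file_digest(original)          # recorded when the file was stored

tampered = original.replace(b"13.5", b"15.5")  # an intruder edits the cloud copy

print(is_untampered(original, on_chain_hash))  # True
print(is_untampered(tampered, on_chain_hash))  # False
```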

4.1 Data validation by different tenants

  1. (A)

    Healthcare Provider

    Each individual patient has control over his or her blockchain. If patient X needs to visit a new hospital or healthcare provider, X can give it access to this blockchain. Storing medical data on a blockchain not only provides security in comparison to paper-based record keeping but also reduces redundant clinical tests. Algorithm 1 corresponds to record fetching by a healthcare provider, who gets access to a patient’s tamper-proof record and can also add a new block (e.g., containing prescription data) to the patient’s blockchain. For adding a prescription as a new block in the blockchain, the healthcare provider is granted the send, receive and mine permissions. Thus, this tenant now acts as a DC in the blockchain.

Algorithm 1 Record fetching by healthcare provider

  2. (B)

    Health Insurance Companies

    Using blockchain technology, it will be easier for health insurance companies to verify the medical claims and avoid forgery of documents. Algorithm 2 depicts the record fetching by a health insurance company. This tenant is not mining or adding data in the blockchain, thus it remains as a DR.

Algorithm 2 Record fetching by health insurance company

  3. (C)

    Research institutes and Supply chain or pharmacy

    For research on new diseases and the development of new drugs, a large corpus of patients’ data is required. After prior approval from patients, their tamper-proof medical data stored on the blockchain can be supplied to research institutes and to the supply chain or pharmacy. Algorithms 3 and 4 illustrate the patient’s record fetching by research institutes and the supply chain, respectively. Both of these tenants utilize the blockchain data for research and analysis and do not add any data to the blockchain; thus, they remain DRs.

Algorithm 3 Record fetching by research institutes

Algorithm 4 Record fetching by supply chain

  4. (D)

    Government

    Governments of most countries have made a transition from paper-based health record keeping to electronic health records. “MyHealthRecord” is the National Health Portal hosted by the Centre for Health Informatics (CHI), set up at the National Institute of Health and Family Welfare (NIHFW) by the Ministry of Health and Family Welfare (MoHFW), Government of India [37, 38]. This has helped health experts to study various disease trends and propose policies for their eradication. A recent menace to world health is the outbreak of the novel Coronavirus Disease (COVID-19). The principal measure currently seen as capable of eliminating COVID-19 is effective vaccination of the population. The Indian government has also started its vaccination drive against this pandemic. Using the proposed work, the government can make person X’s vaccination certificate available on the blockchain, thus preventing forgery or tampering of the certificate. For adding a vaccination certificate as a new block in the blockchain, the government is granted the send, receive and mine permissions. Thus, this tenant also acts as a data contributor in the blockchain. Algorithm 5 depicts the record fetching by the government.

Algorithm 5 Record fetching by government
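The record-fetching flow described for the tenants above can be sketched in Python as follows. This is a hypothetical reconstruction from the prose and not Algorithms 1–5 themselves: the `validate` function only imitates the kind of check a transaction filter performs, the field names are assumptions, and a dictionary and a list stand in for the off-chain store and the ledger. Reading is tied to the connect permission and adding a block to the send, receive and mine permissions, as stated in the text.

```python
import hashlib

REQUIRED_FIELDS = ("txn_id", "doc_id", "doc_hash", "timestamp")
HEX_DIGITS = set("0123456789abcdef")


def validate(txn):
    """Filter-style check: metadata fields present and hash well formed."""
    if any(name not in txn for name in REQUIRED_FIELDS):
        return False
    doc_hash = txn["doc_hash"]
    return (isinstance(doc_hash, str) and len(doc_hash) == 64
            and set(doc_hash) <= HEX_DIGITS)


def fetch_record(permissions, txn, off_chain):
    """Return the document only if access, validation and hash checks pass."""
    if "connect" not in permissions:
        raise PermissionError("tenant is not connected to this chain")
    if not validate(txn):
        raise ValueError("transaction failed validation")
    content = off_chain[txn["doc_id"]]
    if hashlib.sha256(content).hexdigest() != txn["doc_hash"]:
        raise ValueError("off-chain document does not match the on-chain hash")
    return content


def add_record(permissions, ledger, off_chain, doc_id, content, timestamp):
    """Store the document off-chain and append its metadata to the ledger."""
    if not {"send", "receive", "mine"} <= set(permissions):
        raise PermissionError("tenant may not add blocks to this chain")
    txn = {"txn_id": len(ledger), "doc_id": doc_id,
           "doc_hash": hashlib.sha256(content).hexdigest(),
           "timestamp": timestamp}
    off_chain[doc_id] = content
    ledger.append(txn)
    return txn


ledger, off_chain = [], {}
provider = {"connect", "send", "receive", "mine"}   # acts as a DC
insurer = {"connect"}                               # remains a DR

txn = add_record(provider, ledger, off_chain, "prescription-1",
                 b"Rx: hypothetical prescription text", 1700000000)
print(fetch_record(insurer, txn, off_chain))
```

A data reader such as the insurer can fetch and verify the record but cannot call `add_record`, which mirrors the distinction between DR and DC tenants.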

4.2 Security analysis

This section examines how the proposed system mitigates the potential threats associated with a cloud storage system, as listed by the STRIDE model and discussed in Sect. 2.4 [30].

The proposed blockchain based multitenant system tackles the threats as follows:

  1. (a)

    Authentication and access control: The proposed system uses a permissioned blockchain for its implementation. Thus, all users are added to the network only after an authentication process, and only authorised users are allowed to access the data. Depending on the requirement, each user or tenant is a DC or a DR and is accordingly granted different access permissions to control, read, send, mine etc. (see Table 3).

  2. (b)

    Integrity: The integrity of the data stored in the blockchain is verified using a hash tree, popularly known as a Merkle tree, which is a tree-based data structure in which each non-leaf node stores the hash of its child nodes. Using the Merkle tree, a user can confirm whether a transaction is legitimate by verifying the hashes.

  3. (c)

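    Repudiation: Non-repudiation is an important property for secure communication and is made available by blockchain networks: every transaction is digitally signed by its sender and recorded on the immutable ledger, so the sender cannot later deny having issued it.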

  4. (d)

    Confidentiality: The proposed system allows only authorised users to access the stored data, thus ensuring data confidentiality. Instead of the complete data, only the hashed metadata is stored in the blockchain network, which restricts unauthorised access to the data to an extent.

  5. (e)

    Key Security: The proposed system relies on the Elliptic Curve Digital Signature Algorithm (ECDSA), a public-key digital signature algorithm, for the key pairs and signatures used in the MultiChain blockchain network [39].
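The Merkle tree check mentioned under Integrity can be illustrated with the following Python sketch. It is a simplified model, assuming a single SHA-256 per node and duplication of the last node on odd-sized levels; the exact hashing rules of a given blockchain platform may differ, and the transaction contents are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Root of a binary hash tree; an unpaired node is paired with itself."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes, with a left/right flag, from leaf `index` to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1                      # neighbour within the pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Rebuild the path to the root from one leaf and its sibling hashes."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


txs = [b"tx-0", b"tx-1", b"tx-2", b"tx-3", b"tx-4"]
root = merkle_root(txs)
proof = merkle_proof(txs, 2)
print(verify(b"tx-2", proof, root))      # True
print(verify(b"tx-2-forged", proof, root))  # False
```

Only the sibling hashes along one path are needed to verify a transaction, so a user can check membership without holding every transaction in the block.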

5 Results

This section describes the results of implementation of the proposed model.

5.1 Simulation setup

The initial implementation of the proposed system was performed on a Windows server with an Intel Core i5 CPU (1.60 GHz) and 8 GB of DDR3 memory. For the deployment of the private blockchain, MultiChain, an open-source platform, is used along with Amazon Elastic Compute Cloud (Amazon EC2) [40]. EC2 is one of the most popular web services by Amazon and offers computational capacity to run applications on the cloud. It provides easy scaling and flexibility in configuring features like memory size and processors. EC2 also has an elastic load balancer that can automatically distribute incoming load across multiple available instances as required, where instances are virtual computing environments. We have used Ubuntu Server 16.04 LTS (HVM), SSD Volume Type, for implementing multiple tenant nodes in the proposed MultiChain system.

Initially, the blockchain has five nodes in the network. One of them is data contributor (DC), in our case it is an individual whose medical record will be shared on the blockchain. The other nodes are the data readers (DR).

The procedure of implementation is listed as follows:

  1. (A)

    Launching and connecting to EC2 Server using AWS account to set up DC

    In the first step, an EC2 server is launched and connected to using an AWS account to set up the DC. An Ubuntu Server 16.04 LTS (HVM), SSD Volume Type, instance has been used, and the connection to this server has been made using PuTTY through its .pem file (named Multichain-key.pem) with the command:

    ssh -i "Multichain-key.pem" ec2-3-144-147-56.us-east-2.compute.amazonaws.com

  2. (B)

    Creating blockchain and Genesis block

    Once MultiChain is successfully installed, a blockchain, referred to as Chain1, is created for the first patient using the DC node set up in step 1. Fig. 3 shows the MultiChain core daemon, which starts the server and mines the first block in the network, referred to as the genesis block. Fig. 4 shows the details of the created blockchain. Similarly, all the other tenant nodes are added to the network.

  3. (C)

    Connecting tenants in a Blockchain and defining permissions to each node

    Multiple EC2 servers were used to create multiple tenant nodes, and all these tenants were connected to the blockchain Chain1. After a successful connection to Chain1, the admin node (here, node1) granted each tenant permissions at different access levels, as explained in Table 3. Fig. 5 shows information about the peers connected to a tenant node, and Fig. 6 shows the permissions granted to the tenant node.
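Steps (B) and (C) above can be sketched as a small helper that assembles the MultiChain command lines involved. This is only an illustration under stated assumptions: `multichain-util`, `multichaind` and `multichain-cli` are the standard MultiChain command-line tools, but the host, port, node address and permission set below are hypothetical placeholders (Table 3 is not reproduced here), and the script only builds the command strings without executing them.

```python
import shlex
from typing import List

CHAIN = "Chain1"  # chain name used in the text


def create_chain_cmds(chain: str) -> List[List[str]]:
    """Commands run on the first node to create the chain and start it."""
    return [
        ["multichain-util", "create", chain],  # generate the chain parameters
        ["multichaind", chain, "-daemon"],     # start the node, which mines the genesis block
    ]


def connect_cmd(chain: str, host: str, port: int) -> List[str]:
    """Command run on a new tenant node to connect to the existing chain."""
    return ["multichaind", f"{chain}@{host}:{port}", "-daemon"]


def grant_cmd(chain: str, address: str, permissions: List[str]) -> List[str]:
    """Command run on an admin node to grant permissions to a tenant address."""
    return ["multichain-cli", chain, "grant", address, ",".join(permissions)]


# Hypothetical values: the real host, port and address come from the deployment.
commands = create_chain_cmds(CHAIN) + [
    connect_cmd(CHAIN, "10.0.0.1", 8571),
    grant_cmd(CHAIN, "1ExampleTenantAddress", ["connect", "send", "receive"]),
]

for cmd in commands:
    print(shlex.join(cmd))
```

On a tenant node, `multichain-cli Chain1 getpeerinfo` and `multichain-cli Chain1 listpermissions` can then be used to inspect the connected peers and the granted permissions.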

Fig. 3 MultiChain core daemon

Fig. 4 Information about the created MultiChain blockchain Chain1

Fig. 5 Information about the peers connected to Chain1

Fig. 6 Permissions assigned to a node

Table 4 lists the details of the implemented nodes, including the private IP and the public DNS name used to establish a connection with an SSH client. The node address column shows the address of each tenant node, through which nodes can connect to each other, along with its best computed block hash.

Table 4 Details of implemented nodes

Further, we assessed the scalability of the proposed system by adding more nodes to the blockchain. The subsequent experiments were carried out on an NVIDIA DGX workstation with a 20-core Intel Xeon E5-2698 v4 2.2 GHz processor and 256 GB of DDR4 memory. We used containerization to implement 100 nodes in the MultiChain network, with each node in a separate container. Containerization yields a significantly more lightweight implementation than the virtualization used in the earlier experiments. We used Docker [41], an open-source containerization platform that allows users to package application code together with its libraries and dependencies into isolated containers. Figure 7 shows a sample list of MultiChain nodes installed in Docker containers.

Fig. 7 Information about containers acting as peers in the blockchain

To create multiple nodes, we started multiple Docker instances. The blockchain is created on the first node, called the seed node, and all the remaining nodes are connected to this blockchain as peers. Each node has its own private key generated using ECDSA. We assigned the appropriate permissions (see Table 3) for the nodes to send, connect, create or mine blocks in the network. Figure 8 depicts how different peer nodes are connected to a tenant node.

Fig. 8 Information about the peers connected to Chain1 using containers

We instantiated the network with 100 nodes, and the system can be scaled up further. Since only metadata is stored, hundreds of transactions and data stream items can be processed per second. A single MultiChain node can efficiently handle a data load in the millions, but the same node cannot store millions of addresses or users in its own MultiChain wallet [5].

5.2 Evaluation metrics

The performance of the proposed storage system was evaluated using three tests, namely the admin elasticity test, the tenant isolation test and the scalability test [32].

  1. 1.

    Admin elasticity test

    A blockchain network can have a combination of different nodes with admin permissions. In this test, we shut down some of the servers hosting admin nodes and then tested whether new nodes could connect to the network. The test was performed with the number of running admin nodes ranging from a few down to none. Table 5 illustrates the significance of admin nodes in the network: no new nodes can be added if there is no running admin node. However, a blockchain can have multiple admin nodes.

  2. 2.

    Tenant isolation test

    Multiple nodes in the blockchain network are tenants. In this test, we shut down some of the servers hosting tenant nodes. The test showed that all tenants are isolated from each other. Table 6 illustrates that even when some nodes were taken offline, the system continued to work properly: the remaining tenant nodes were still able to process the transactions required to communicate in the network.

  3. 3.

    Scalability test

    All the above tests were carried out on networks of different sizes to check the scalability of the system. We started with a 7-node network and scaled up to 25 nodes using AWS.

    Table 7 shows that the proposed system is scalable: nodes can be added to or removed from the blockchain network without hindering the operation of the rest of the network.

    To assess scalability further, the system performance was checked for 100 nodes by creating Docker containers and performing the admin elasticity test and the tenant isolation test on these nodes. We initially divided the nodes in a ratio of 40:60, where 40 nodes were given admin permissions and 60 nodes were tenants. Then, to perform the admin elasticity test, we shut down admin nodes until no admin node was left in the network. It was observed that at least one admin node is required for the proper functioning of the permissioned blockchain network.

    Similarly, we performed the tenant isolation test on the same nodes by shutting down tenant nodes, ranging from a few nodes to all tenant nodes in the network. As already stated, the blockchain is created on the first node and all the remaining nodes are added to this blockchain as its peers. Thus, each node does not need to be connected to every other node; all peers are independent. As expected, the test showed that all tenants are isolated from each other: even after the breakdown of some nodes, the remaining nodes completed their transactions in the network without affecting the operation of the proposed system.
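The logic of the admin elasticity and tenant isolation tests can be summarised in a toy model. The sketch below is not the experimental setup: it assumes, as a simplification, that a new node can join only while at least one admin node is online and that two tenants can transact whenever both are online. The `Node` and `Network` classes and all node names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    name: str
    is_admin: bool
    online: bool = True


@dataclass
class Network:
    nodes: Dict[str, Node] = field(default_factory=dict)

    def add(self, name: str, is_admin: bool) -> None:
        self.nodes[name] = Node(name, is_admin)

    def shut_down(self, name: str) -> None:
        self.nodes[name].online = False

    def online_admins(self) -> int:
        return sum(1 for n in self.nodes.values() if n.is_admin and n.online)

    def try_join(self, name: str) -> bool:
        """Admit a new tenant only if an online admin can grant it access."""
        if self.online_admins() == 0:
            return False
        self.add(name, is_admin=False)
        return True

    def can_transact(self, sender: str, receiver: str) -> bool:
        """Two nodes can transact when both are online, regardless of other tenants."""
        return self.nodes[sender].online and self.nodes[receiver].online


def build_network(admins: int, tenants: int) -> Network:
    net = Network()
    for i in range(admins):
        net.add(f"admin-{i}", is_admin=True)
    for i in range(tenants):
        net.add(f"tenant-{i}", is_admin=False)
    return net


# Admin elasticity: shut down admins until none remain (40:60 split as in the text).
net = build_network(admins=40, tenants=60)
for i in range(39):
    net.shut_down(f"admin-{i}")
joined_with_one_admin = net.try_join("new-1")  # one admin still online
net.shut_down("admin-39")
joined_with_no_admin = net.try_join("new-2")   # no admin left
print(joined_with_one_admin, joined_with_no_admin)  # True False

# Tenant isolation: shut down half of the tenants in a fresh network.
net = build_network(admins=40, tenants=60)
for i in range(30):
    net.shut_down(f"tenant-{i}")
print(net.can_transact("tenant-30", "tenant-59"))  # True
print(net.can_transact("tenant-0", "tenant-30"))   # False
```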

Table 5 Admin elasticity test
Table 6 Tenant Isolation test
Table 7 Scalability test

6 Conclusion

The paper has presented the design and implementation of a blockchain-based data storage system that prevents data tampering in a multi-tenant environment. The blockchain is used for validating tenants' access to data, while the actual tenant data is stored off-chain for efficient query processing. The efficacy of the system has been investigated using data from a case study in the healthcare sector. Further, a threat model has been used to show that the system can effectively handle the possible security threats. The proposed system leverages the benefits of blockchain technology by providing scalability along with data isolation among the different tenants in the system. The proposed storage system was successfully implemented and evaluated using three different tests: the admin elasticity test, the tenant isolation test and the scalability test. It was observed that the data of each tenant was independent of the other tenants and accessible only to authorized users. Even when some tenants were shut down, the system continued to function properly, since the remaining working nodes could still communicate with each other effectively.