System model
In this paper, we propose a system for fog servers specifically for Blockchain-IoT applications such that a side-chain would be run on the fog server, and the nodes of the side-chain would be individual containers. Each of these containers would serve an IoT device and offload all Blockchain-related activities from the device. The fog node would itself serve as a full node on the main Blockchain.
In our proposed IoT-Fog node architecture in Fig. 2, the fog computing node would not just be a node connected to the Blockchain. Instead, it would run a side-chain for the devices that are connected to it, and completed transactions on the side-chain would be sent to the main chain. The Blockchain mining process and interactions would still be abstracted from the IoT peer devices. The fog would not employ a traditional service-oriented monolithic architecture [39], and the nodes of the Blockchain would not be the IoT peer devices. Instead, the side-chain would operate using microservices [6]: multiple microservices would run as containers on the fog and act as nodes on the side-chain. These containers would take over the mining activities from the IoT devices.
Each IoT device would be randomly assigned to a Node in the side-chain (i.e., a container). Separate containers would not be created for each IoT device, but rather a pool of IoT devices would be assigned to a node at a time. The depiction of this proposed update is shown in Fig. 2.
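As an illustration only, the random assignment of devices to container nodes can be sketched in Python as follows. The function name `assign_devices`, the identifiers, and the uniform random choice are assumptions made for this sketch; the paper does not specify how the randomness is drawn.

```python
import random
from collections import defaultdict


def assign_devices(device_ids, container_ids, seed=None):
    """Randomly assign each IoT device to one side-chain container node.

    Several devices may end up sharing the same container, which mirrors
    the idea of a pool of devices per node. Uniform choice is an assumption.
    """
    if not container_ids:
        raise ValueError("at least one container node is required")
    rng = random.Random(seed)
    pools = defaultdict(list)
    for device in device_ids:
        pools[rng.choice(container_ids)].append(device)
    return dict(pools)


# Hypothetical example: ten devices shared among three container nodes.
devices = [f"dev-{i}" for i in range(10)]
pools = assign_devices(devices, ["c0", "c1", "c2"], seed=7)
```

Each device appears in exactly one pool, and a container may serve zero or more devices.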
Because the side-chain has only a small number of container nodes, message propagation through it introduces only a limited delay, which supports a high throughput of transactions. This architecture would be best suited for high-transaction, high-performance IoT implementations.
As mentioned earlier, the Blockchain's storage requirements on the fog node are quite intensive: even a low-throughput Blockchain such as Bitcoin has a ledger size of about 321.21 GB, while a high-throughput Blockchain such as Ripple has a ledger size of up to 9 TB [32]. Our proposed system addresses this challenge by moving blocks from fog nodes to the cloud for storage, as shown in Fig. 2.
Problem statement
Based on our proposed model, the question that arises is which blocks should be moved to the cloud, and how many of them, to ensure the smooth and efficient running of the IoT devices connected to each fog node. This section formulates the problem and presents the corresponding mathematical models.
Suppose we have several fog servers acting as fog nodes or peers for the IoT devices connected to them, denoted by \(S=\left\{{s}_{1}, {s}_{2}, \dots , {s}_{m}\right\}\), where \({s}_{i}\in S\) denotes a single server and \(m\) the total number of fog nodes being considered. For each fog node/peer, we need to select some blocks to be sent to the cloud to alleviate that peer's storage pressure. The blocks in each fog node can be represented by \(b_{1} , b_{2} , \ldots , b_{N}\), where \(N\) is the number of blocks mined by the peer at any given time.
The number of blocks that would be taken to the cloud can be represented by \(M_{{\text{w}}} \left( {1 \le M_{{\text{w}}} \le N} \right)\), where \(w = s_{i} \in S\) as shown in Fig. 3.
Once \({M}_{w}\) blocks are sent to the cloud, the blocks remaining in the fog peer are renumbered such that \(b_{{M_{{\text{w}}} + 1}}\) becomes \(b_{1}\), and so on.
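The offloading and renumbering step can be sketched in a few lines of Python. This is a minimal sketch, assuming the blocks of one fog node are held in a list in the order \(b_{1}, \ldots, b_{N}\); the function name `offload_first` is hypothetical.

```python
def offload_first(blocks, m_w):
    """Split a fog node's blocks b_1..b_N into (sent to cloud, kept locally).

    The first m_w blocks go to the cloud. The kept list starts again at
    index 0, so the former b_{m_w + 1} takes the role of b_1.
    """
    n = len(blocks)
    if not 1 <= m_w <= n:
        raise ValueError("m_w must satisfy 1 <= m_w <= N")
    return blocks[:m_w], blocks[m_w:]


# Hypothetical example with N = 5 and M_w = 2.
to_cloud, kept = offload_first(["b1", "b2", "b3", "b4", "b5"], 2)
```

Slicing performs the renumbering implicitly, since the kept blocks are indexed afresh from the start of the new list.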
Based on the type of application that the Blockchain-IoT implementation is being used to serve, we can have three different query conditions for the blocks on the fog node: a fixed case, a linear decay, or an exponential decay scenario. The fixed case scenario, denoted in (2), applies when the Blockchain is used in a traceability application such as [33]. The linear decay scenario, denoted by (3), describes cases where the blocks are not queried as often. In the exponential decay scenario, denoted by (4), the Blockchain is used as storage for transactions, such as in cryptocurrencies.
$$F\left( t \right) = F_{0}$$
(2)
$$F\left( t \right) = F_{0} - \alpha_{2} t.$$
(3)
$$F\left( t \right) = F_{0} e^{{ - \alpha_{1} t}}$$
(4)
where \(\alpha_{1}\) represents the attenuation coefficient in the exponential decay scenario, and \(\alpha_{2}\) represents that for the linear decay scenario. These can be determined based on the Blockchain use case scenario.
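For illustration, the three query frequency scenarios in (2)–(4) can be written as a single Python function. This is a sketch only: the function name `query_frequency` and the scenario labels are assumptions, and the linear case is left unclamped because (3) does not specify a lower bound.

```python
import math


def query_frequency(t, f0, scenario="fixed", alpha=0.0):
    """Query frequency F(t) for the three scenarios in (2)-(4).

    alpha plays the role of alpha_2 (linear) or alpha_1 (exponential).
    """
    if scenario == "fixed":
        return f0                         # (2): F(t) = F0
    if scenario == "linear":
        return f0 - alpha * t             # (3): F(t) = F0 - alpha_2 * t
    if scenario == "exponential":
        return f0 * math.exp(-alpha * t)  # (4): F(t) = F0 * exp(-alpha_1 * t)
    raise ValueError(f"unknown scenario: {scenario}")


# Hypothetical values: F0 = 10, attenuation coefficient 0.5, evaluated at t = 2.
values = {s: query_frequency(2, 10.0, s, 0.5)
          for s in ("fixed", "linear", "exponential")}
```

All three scenarios return \(F_{0}\) at \(t = 0\), consistent with (5).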
Xu et al. [21] proposed that the number of blocks taken to the cloud for each peer on a Blockchain can be formulated as a multi-objective optimization problem based on three main objective functions: the query probability of the various blocks on the fog node; the cloud storage cost incurred when some blocks are moved to the cloud; and the local space occupancy, which represents the amount of local space on the fog node that will be freed by sending some of the blocks to the cloud. The mathematical models for these objective functions are outlined as follows:
Query probability
The query probability for the blocks in a fog node is based on the query frequency \(F\left( t \right)\) for the type of Blockchain-IoT application being implemented. The value of \(t\) for every block is tightly coupled to when that block was generated: with the addition of every new block, \(t\) is increased by 1. This means that the first generated block in the set has a \(t\) value of 0, as shown in (5). Eventually, the first generated block would be the last block in the arrangement, given by \(b_{N}\), as shown in Fig. 3.
$$\Rightarrow t = 0, F\left( t \right) = F_{0} .$$
(5)
The query probabilities for the blocks in a fog node can be represented by \(P_{{b_{1} }} , P_{{b_{2} }} , \ldots , P_{{b_{{M_{{\text{w}}} }} }} , \ldots , P_{{b_{{\text{N}}} }}\), where \(P_{{b_{j} }}\) is the query probability of block \(b_{j}\). The query probability for each block can be found by (6), where the first case covers the blocks \(b_{j}\) with \(1 \le j \le N - 1\). It must be noted that the block \(b_{{\text{N}}}\) would have both the query frequency and query probability of \(F_{0}\) since it was the first block created.
$$P_{{b_{j} }} = \left\{ {\begin{array}{*{20}l} {\left[ {\frac{1}{{\mathop \smallint \nolimits_{0}^{N - j} F\left( t \right){\text{d}}t}}} \right]} \hfill & {\quad 1 \le j \le N - 1} \hfill \\ {F_{0} } \hfill & {\quad j = N} \hfill \\ \end{array} } \right..$$
(6)
Based on (6), we can calculate the sum of the query probabilities of all the blocks for a fog node, denoted by \(\Lambda\), as shown in (7). The sum can be used to normalize the query probabilities, giving \(P_{{b_{1} }}^{\prime } , P_{{b_{2} }}^{\prime } , \ldots , P_{{b_{{M_{w} }} }}^{\prime } , \ldots , P_{{b_{N} }}^{\prime }\), as shown in (8).
$$\Lambda = \mathop \sum \limits_{j = 1}^{N - 1} P_{{b_{j} }} + F_{0} .$$
(7)
$$P_{{b_{j} }}^{\prime } = \left\{ {\begin{array}{*{20}l} {\left[ {\frac{1}{{{\Lambda }\mathop \smallint \nolimits_{0}^{N - j} F\left( t \right){\text{d}}t}}} \right]} \hfill & {\quad 1 \le j \le N - 1} \hfill \\ {\frac{{F_{0} }}{{\Lambda }} } \hfill & {\quad j = N} \hfill \\ \end{array} } \right..$$
(8)
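As a hedged illustration of (6)–(8), the sketch below computes the normalized query probabilities for one fog node. The integral of \(F(t)\) is evaluated with its standard closed form for each scenario; the function names and scenario labels are assumptions introduced for this example, and the sum in (7) is taken over \(j = 1, \ldots, N - 1\) plus the \(F_{0}\) term, matching the two cases of (6).

```python
import math


def integral_F(T, f0, scenario, alpha):
    """Closed-form integral of F(t) from 0 to T for the three scenarios."""
    if scenario == "fixed" or alpha == 0.0:
        return f0 * T
    if scenario == "linear":
        return f0 * T - alpha * T * T / 2.0
    if scenario == "exponential":
        return f0 * (1.0 - math.exp(-alpha * T)) / alpha
    raise ValueError(f"unknown scenario: {scenario}")


def query_probabilities(n, f0, scenario="fixed", alpha=0.0):
    """Normalized probabilities P'_{b_1}..P'_{b_N} following (6)-(8)."""
    raw = []
    for j in range(1, n):                 # first case of (6): 1 <= j <= N - 1
        area = integral_F(n - j, f0, scenario, alpha)
        if area <= 0.0:
            raise ValueError("integral of F(t) must be positive")
        raw.append(1.0 / area)
    raw.append(f0)                        # second case of (6): j = N
    lam = sum(raw)                        # Lambda in (7)
    return [p / lam for p in raw]         # normalization in (8)


# Hypothetical example: N = 4 blocks, fixed scenario with F0 = 1.
probs = query_probabilities(4, 1.0, "fixed")
```

By construction the normalized values sum to 1, so they can be accumulated directly when evaluating a candidate number of blocks to offload.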
Thus, after all the query probabilities of the blocks have been found, the overall query probability for the fog node, which depends on the number of blocks \(M_{{\text{w}}}\) to be sent to the cloud, can be found. This is achieved by summing the normalized query probabilities up to the \(M_{{\text{w}}} {\text{th}}\) block. For fog node \(s_{i}\), the overall query probability is denoted by \(P_{{s_{i} }}\) as shown in (9).
$$P_{{s_{i} }} = \mathop \sum \limits_{j = 1}^{{M_{w} }} P_{{b_{j} }}^{\prime } \quad 1 \le i \le m.$$
(9)
where \(w = s_{i} \,\,\,{\text{and}}\,\,m = \left| S \right|\).
Based on the value of \(m\), there will be \(m\) objective functions, one for each fog node \(s_{i}\), with \(P_{{s_{i} }}\) being minimized.
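The per-node objective in (9) is a partial sum, which can be sketched as below. The function name `node_query_probability` is hypothetical, and the input is assumed to be a list of already normalized probabilities ordered as \(P_{b_{1}}^{\prime}, \ldots, P_{b_{N}}^{\prime}\).

```python
def node_query_probability(normalized_probs, m_w):
    """Overall query probability P_{s_i} in (9): sum of the first m_w values."""
    if not 1 <= m_w <= len(normalized_probs):
        raise ValueError("m_w must satisfy 1 <= m_w <= N")
    return sum(normalized_probs[:m_w])


# Hypothetical normalized probabilities for a node with N = 4 blocks.
example_probs = [0.1, 0.2, 0.3, 0.4]
p_node = node_query_probability(example_probs, 2)
```

Sending more blocks to the cloud can only increase this sum, which is why it conflicts with the goal of freeing local space.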
Storage cost
The storage cost deals with the cost of storing the blocks in the cloud. The storage cost for cloud storage is assumed to be the same for all the fog nodes for the sake of simplicity. The size of one block for each fog node is represented by \(C\). Different Blockchains have different block sizes. Thus, the Blockchain being used must be considered, and the size of an individual block must be known. For example, the Bitcoin Blockchain is known to have blocks with a size of 1 MB [40], whereas the size of blocks on the Hyperledger Fabric Blockchain is configurable and can be made as large as needed [41]. The size of a block on the Ethereum Blockchain varies based on the gas limit [11].
The storage cost is considered a linear function of the total size of all blocks moving from the fog nodes to the cloud, as shown in (10). This linear function is governed by a factor \(k\) representing the ratio of the cost of cloud storage to that of local storage. Thus, a small value of \(k\) means that cloud storage is cheaper than the local storage options for the fog node, and vice versa. This value is based on the cloud service provider used by a fog node and the type of local storage available on the fog node, such as Hard Disk Drives (HDD), which are cheaper but slower, or Solid-State Drives (SSD), which are relatively more expensive and faster.
$$cost = kC \times \mathop \sum \limits_{w = 1}^{m} M_{{\text{w}}} .$$
(10)
where \(m = \left| S \right|\).
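The cost function in (10) can be illustrated with a short Python sketch. The function name `storage_cost` and the numeric values are assumptions chosen only for the example.

```python
def storage_cost(blocks_per_node, block_size, k):
    """Cloud storage cost in (10): k * C * sum of M_w over all fog nodes."""
    if any(m_w < 1 for m_w in blocks_per_node):
        raise ValueError("each M_w must be at least 1")
    return k * block_size * sum(blocks_per_node)


# Hypothetical example: three fog nodes offloading 10, 20 and 5 blocks,
# block size C = 1.0 (e.g. in MB) and cost ratio k = 0.5.
cost = storage_cost([10, 20, 5], block_size=1.0, k=0.5)
```

Because the cost is linear in the total number of offloaded blocks, it depends only on the sum of the \(M_{{\text{w}}}\) values and not on how they are distributed among the fog nodes.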
Local space occupancy/storage availability of IoT Fog node
The local space occupancy/storage availability of IoT fog nodes is directly related to the number of blocks stored in the cloud. The more blocks are stored in the cloud, the more space is made available on the fog node, thereby easing the given node's storage pressure. The local space occupancy varies inversely with the storage availability of a node: more blocks sent to the cloud for storage means a smaller local space occupancy and thus more storage available on the IoT fog node. It should be noted that the storage size on each fog node is determined by the number of IoT devices connected to it, the type of application that it is being used for, and the operator of that fog node. Thus, some nodes may prioritize local space occupancy/storage availability more than others. In this sense, weights can be assigned to each fog node such that nodes that prioritize storage availability and have low storage capacity are given bigger weights. The weights can be represented as \(\beta_{{s_{i} }}\), such that \(\beta_{{s_{1} }} , \beta_{{s_{2} }} , \beta_{{s_{3} }} , \ldots , \beta_{{s_{m} }}\) represent the weights for the individual fog nodes, where \(m = \left| S \right|\). The weights are used in a weighted sum to find the overall local space occupancy for the fog nodes; thus, they are given decimal values that sum to 1, as shown in (11).
$$\mathop \sum \limits_{i = 1}^{m} \beta_{{s_{i} }} { } = 1 \left( {1 \le i \le m} \right).$$
(11)
The individual local space occupancy for each fog node can be denoted by \(Q_{{s_{i} }}\), expressed in (12). This value is based on the number of \(M_{{\text{w}}}\) blocks that are sent to the cloud. The overall local space occupancy, \(Q\), of all fog nodes can be expressed as a weighted sum based on their assigned weights \(\beta_{{s_{i} }}\) as shown in (13).
$$Q_{{s_{i} }} = \frac{{e^{{\left( {\frac{{N - M_{w} }}{N}} \right)}} }}{e - 1}.$$
(12)
$$Q = \mathop \sum \limits_{i = 1}^{m} Q_{{s_{i} }} \times \beta_{{s_{i} }} .$$
(13)
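Equations (11)–(13) can be sketched in Python as follows. This is an illustrative sketch: the function names are hypothetical, and allowing a separate block count \(N\) per fog node is an assumption, since the text uses a single symbol \(N\).

```python
import math


def node_occupancy(n, m_w):
    """Local space occupancy Q_{s_i} in (12) for N = n and M_w = m_w."""
    if not 1 <= m_w <= n:
        raise ValueError("m_w must satisfy 1 <= m_w <= N")
    return math.exp((n - m_w) / n) / (math.e - 1.0)


def overall_occupancy(ns, m_ws, weights):
    """Weighted sum Q in (13); the weights must sum to 1 as required by (11)."""
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(w * node_occupancy(n, m_w)
               for n, m_w, w in zip(ns, m_ws, weights))


# Hypothetical example: two fog nodes with 10 blocks each; the second node
# prioritizes storage availability and therefore receives the larger weight.
q_total = overall_occupancy(ns=[10, 10], m_ws=[2, 8], weights=[0.3, 0.7])
```

As \(M_{{\text{w}}}\) grows, the exponent in (12) shrinks, so each node's occupancy decreases when more blocks are offloaded.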
Multi-objective formulation
Based on the objective functions expressed earlier in the query probability, the storage cost, and the local space occupancy, the block selection problem can be formulated as a minimization multi-objective problem. This work's primary goal is to minimize the objective functions while taking as many blocks as possible to be stored in the cloud. Thus, all objective functions would be minimized, as shown in (14)–(16).
$$\begin{aligned} \min \left( {query\,probs.} \right) \Rightarrow & \min \mathop \sum \limits_{j = 1}^{{M_{{s_{1} }} }} P_{{b_{j} }}^{\prime } \\ & \min \mathop \sum \limits_{j = 1}^{{M_{{s_{i} }} }} P_{{b_{j} }}^{\prime } \\ & \min \mathop \sum \limits_{j = 1}^{{M_{{s_{m} }} }} P_{{b_{j} }}^{\prime } . \\ \end{aligned}$$
(14)
From (14), it can be seen that there will be \(m\) fog nodes and thus \(m\) objective functions for the query probabilities, i.e., one for each fog node. The objective functions for the storage cost and the local space occupancy are also minimized, as shown in (15) and (16), subject to the constraint in (17).
$$\min kC \times \mathop \sum \limits_{w = 1}^{m} M_{{\text{w}}} .$$
(15)
$$\min \mathop \sum \limits_{i = 1}^{m} Q_{{s_{i} }} \times \beta_{{s_{i} }} .$$
(16)
$$1 \le M_{w} \le N,\,\,\, M_{w } \in {\mathbb{N}}.$$
(17)
Thus, for every set of \(m\) fog nodes, there would always be \(m + 2\) objective functions to adhere to at all times. Users and operators of the Blockchain-IoT applications can place constraints on these objective functions and on the individual variables used in them. The constraints can be represented with \(\gamma_{1} , \gamma_{2} , \ldots , \gamma_{m + 2}\) as shown in (18). The constraints are solely the decision of the operator or the user.
$$\begin{aligned} & \max \left( {\min \mathop \sum \limits_{j = 1}^{{M_{{s_{1} }} }} P_{{b_{j} }}^{\prime } } \right) \le \gamma_{1} \\ & \vdots \\ & \max \left( {\min \mathop \sum \limits_{j = 1}^{{M_{{s_{m} }} }} P_{{b_{j} }}^{\prime } } \right) \le \gamma_{m} \\ & \max \left( {\min kC \times \mathop \sum \limits_{w = 1}^{m} M_{w} } \right) \le \gamma_{m + 1} \\ & \max \left( {\min \mathop \sum \limits_{i = 1}^{m} Q_{{s_{i} }} \times \beta_{{s_{i} }} } \right) \le \gamma_{m + 2} \\ & {\text{s.t}}{. }m. \\ \end{aligned}$$
(18)
It must also be noted that the number of blocks that can be taken to the cloud must always be an integer and not a decimal number, and that \({M}_{\mathrm{w}}\) cannot be less than 1, as shown in (17).
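To make the \(m + 2\) objectives concrete, the following sketch evaluates one candidate choice of \(M_{{\text{w}}}\) values and checks it against user-defined bounds \(\gamma_{1}, \ldots, \gamma_{m+2}\). It is a minimal illustration and not the optimization procedure itself: the function names, the input layout, and the per-node block counts are assumptions.

```python
import math


def evaluate(m_ws, probs_per_node, ns, weights, block_size, k):
    """Return the m + 2 objective values for one candidate (M_{s_1}..M_{s_m}).

    probs_per_node[i] holds the normalized probabilities of node i;
    ns[i] is that node's block count N (a per-node N is an assumption).
    """
    for m_w, n in zip(m_ws, ns):
        # Constraint (17): M_w is an integer with 1 <= M_w <= N.
        if not (isinstance(m_w, int) and 1 <= m_w <= n):
            raise ValueError("each M_w must be an integer in [1, N]")
    query = [sum(p[:m_w]) for p, m_w in zip(probs_per_node, m_ws)]  # (14)
    cost = k * block_size * sum(m_ws)                               # (15)
    occupancy = sum(w * math.exp((n - m_w) / n) / (math.e - 1.0)    # (16)
                    for w, n, m_w in zip(weights, ns, m_ws))
    return query + [cost, occupancy]


def within_bounds(objectives, gammas):
    """Check the upper bounds in (18): objective r must not exceed gamma_r."""
    return (len(objectives) == len(gammas)
            and all(o <= g for o, g in zip(objectives, gammas)))


# Hypothetical example with m = 2 fog nodes, hence 4 objective values.
objs = evaluate(m_ws=[2, 1],
                probs_per_node=[[0.1, 0.2, 0.3, 0.4], [0.25] * 4],
                ns=[4, 4], weights=[0.5, 0.5], block_size=1.0, k=0.5)
feasible = within_bounds(objs, [0.5, 0.5, 2.0, 2.0])
```

An optimizer would call such an evaluation repeatedly for different candidates and keep the non-dominated ones that satisfy the bounds.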
The objective functions and the multi-objective problem formulation outlined in this section are solved using the proposed advanced multi-objective particle swarm optimization approach, described in Sect. 4.