1 Introduction

In the last ten years, many new technological paradigms have been described, developed, and (finally) applied to a large catalog of different scenarios, from Cyber-Physical Systems [1] and Ambient Intelligence [2] to Industry 4.0 [3] and the Industrial Internet of Things [4]. But together with this technological revolution, a new family of digital risks and vulnerabilities has emerged. Among all these innovative attacking strategies, Cyber-Physical attacks are probably the most dangerous and worrying.

In Cyber-Physical attacks [5], intruders take advantage of feedback control loops and other similar algorithms deployed within technological platforms to amplify and extend the impact of their attack strategy across the entire system architecture. Only a very specific (and typically small) manipulation or malicious action on the key (vulnerable) element or component is necessary to affect and modify the behavior of the whole system. Although that key vulnerable element is different for each attack, deployment, technology, and implementation, or may even not exist, some technological paradigms and architectures allow identifying those components which are most likely to be vulnerable.

In fact, among all innovative technological paradigms, Ambient Intelligence (AmI) is the one where the potential attack vectors for a Cyber-Physical attack can be identified most clearly. Every AmI deployment is supported by a large and dense network of sensor nodes, including thousands of resource-constrained devices [6]. To make the deployment and management of such a large network feasible, devices are randomly distributed, connected, or replaced [7]. It is not a planned network, and there is no exhaustive description of its structure or the participating devices. In this context, sensor nodes must be able to self-configure and start data capture and transmission automatically; and any device operating with the proper configuration and sending information to the correct endpoint is accepted as part of the AmI deployment [8].

For Cyber-Physical attackers, therefore, it is easy to inject false information into the AmI hardware platform, through malicious intruder devices or by infecting the communications among legitimate sensor nodes (among other possibilities) [9]. To protect AmI deployments against these attacks, complex data analysis algorithms are usually employed. Using stochastic models [10], time series [11], and artificial intelligence [12], false information can be detected, corrected, and/or removed. But these algorithms are too computationally heavy to be supported by sensor nodes and are usually deployed in the cloud. However, this approach presents two main problems.

First, it requires all Ambient Intelligence systems to be networked and connected to the cloud. And, although some AmI applications are actually networked, in most scenarios AmI deployments are isolated [7], either because the system is deployed in a remote location without communication infrastructure (such as in natural environment monitoring applications or digital agriculture solutions), or because we are dealing with a critical system where Internet connections are not allowed (such as in many Industry 4.0 applications and military missions). Many AmI systems process data locally, with no option to execute complex algorithms for false information mitigation.

And second, this protection strategy requires centralized data management. Stochastic models or artificial intelligence algorithms need a full vision of the data captured and accepted within the AmI deployment to be precise. Thus, all information must be transmitted and accumulated at the same point. But this centralized scheme is very slow (because of transmission delays) and computationally heavy. Moreover, this approach is not compatible with the most innovative decentralized architectures, such as edge computing [13]. In these architectures, data processing tends to be decomposed into atomic tasks that can be delegated and solved only one step away from the sensor networks (a layer known as the “edge”). As a result, AmI deployments would reduce their reaction capacity, performance, and the catalog of applications they can cover.

In conclusion, new distributed solutions are needed to protect isolated AmI deployments against false information injections caused by Cyber-Physical attacks. These solutions should not be computationally heavy, so they can be supported by sensor nodes and edge devices.

Therefore, in this paper we propose a new decentralized security solution based on a Blockchain ledger. In this ledger, new sensing data are considered transactions that must be validated by edge managers, which operate a Blockchain network. This validation is based on two different information sources. On the one hand, sensor nodes are validated as legitimate information sources through reputation metrics. These metrics combine implicit reputation indicators computed by sensor nodes using historical network data and explicit reputation indicators based on identity parameters. On the other hand, new data are evaluated to verify whether they are potentially legitimate or not. Information theory metrics and the coherence of new transactions with the historical behavior of the deployment are analyzed to evaluate their legitimacy. Only validated transactions are considered, stored, and processed.

Edge managers maintain the Blockchain network and execute the validation algorithm. As they do not have a full vision of the AmI deployment and different edge managers collect information from different groups of sensor nodes, validation must be achieved through consensus. The consensus protocol that we propose requires a weighted majority of edge managers to validate a block for it to be finally accepted. In this “voting”, the relevance of edge managers in the Blockchain network is weighted considering the knowledge they have about the deployment and the historical series of previously accepted (and rejected) blocks.

The remainder of the paper is organized as follows. Section 2 discusses the state of the art in security and protection mechanisms for AmI deployments. Section 3 presents the proposed security solution, including the validation algorithm and the consensus protocol. Section 4 describes the experimental methodology and analyzes the obtained results. Section 5 concludes the paper.

2 State of the art on AmI security solutions

Currently, the most discussed and popular topic with respect to AmI security is governance [37]. In general, the social and administrative dimensions of AmI security are deeply analyzed, from the acceptance and perception of privacy in AmI systems [38] to techniques to make AmI technologies accountable and responsible (in legal terms) [39].

However, traditional and still open security risks and vulnerabilities in AmI systems have also been exhaustively studied. Most reported security solutions for Ambient Intelligence systems are designed for deployments with an Internet connection that communicate with the cloud. In this context, risks associated with public networks are dominant, and well-known security technologies are studied, such as the HTTPS (HyperText Transfer Protocol Secure) protocol and certificates [14] or cryptography based on elliptic curves [15]. Innovative security schemes for these networked AmI systems may also be found, although they are sparse. In this category, Intrusion Detection Systems (IDS) for Ambient Intelligence are the most popular approach. Some authors propose honeypots to capture information about attackers and feed a severity analyzer supported by reinforcement learning algorithms [16]. Other works employ advanced access control policies to identify intruding devices, for example, using optimization problems, convex functions, and dual decompositions [17] to model and handle the wireless network. On the other hand, for those AmI deployments where web protocols are implemented, trustworthiness is also studied and enhanced, for example, through trusted ontology frameworks, which can be adapted and personalized to the specific services provided by each different AmI system [18]. Finally, some authors propose centralized data repositories, so that general stochastic models [10] can be applied to detect outliers, incoherent information, and, eventually, Cyber-Physical attacks. Although these algorithms aim to correct and clean stored data from malicious samples, intrusion detection is just a secondary, very limited application.

However, none of these approaches is adequate for isolated AmI deployments. In fact, in isolated AmI deployments, security mechanisms must be supported by sensor nodes and edge managers. And in this context, low-level network parameters are typically employed to monitor and control intruders. In standard mesh networks, parameters such as reliability are periodically considered and updated to make decisions about which devices remain connected and which ones are blacklisted and removed [20]. However, in layered networks, such as publication/subscription networks, edge devices must implement analysis algorithms [19] (using, for example, artificial intelligence), as they only get indirect observations about the sensor nodes and their behavior. The main problem with all these solutions is their low precision. Both the false negative and false positive rates usually grow linearly with the number of sensor nodes, as errors are cumulative and network parameters are calculated by aggregating information from all devices within the network. Although errors are below 2% for small networks (fewer than fifty nodes), they increase above 20% for large deployments (more than a thousand devices). That is not acceptable for most AmI applications.

To mitigate this situation and improve the performance of low-level security mechanisms, some works propose improved indicators representing “trust” in AmI networks, but with much lower associated errors. To do that, they combine network information with social information [21]. However, the results only show that these indicators are more stable and present a lower calculation error than previous proposals; there is no information on how they would behave when integrated into a real security solution and AmI deployment. In conclusion, AmI security remains an open challenge, and very recent work [22] confirms this conclusion by describing all the pending issues within this research topic.

In this context, the most recent proposals work in two different directions. On the one hand, as attack detection is a complex task, some authors propose frameworks to detect the most vulnerable components within an AmI deployment [9]. The final objective is to correct or mitigate all these vulnerabilities, but (sometimes) the state of the art does not allow for it, as happens with the very novel Cyber-Physical attacks. On the other hand, security mechanisms based on Blockchain networks have been reported.

Several authors have confirmed the benefits of Blockchain technologies when applied to AmI deployments [23]. And although several works combining AmI systems and Blockchain have been reported [24], most of them require sensor nodes to communicate with the global Internet and the cloud. In the most common approach, Blockchain networks are independent of the AmI deployment and operate in the cloud. Examples include new secure access control protocols in which Blockchain networks are supported by edge-cloud collaboration [35], or authentication services for smart homes, where a Metropolitan Area Network (MAN) connecting different regions of the city is required to communicate with the Blockchain provider of each region [36]. Some works use Blockchain as a public, transparent, and reliable registration system for sensor nodes [25]. In this solution, other peer nodes and manager devices use this transparent information to determine whether nodes are legitimate or malicious. Furthermore, some authors describe mechanisms in which AmI data are collected through public general Blockchain networks and instruments, such as the InterPlanetary File System (IPFS) [26].

Public Blockchain networks (such as Ethereum) are also used to support automatic alert and incidence management in AmI systems [27], although in this case Blockchain is more related to automation than to security. Similarly, Blockchain for AmI networks has been used as a communication system [28] or a payment platform [29].

However, again, all of these solutions are designed for AmI deployments connected to the cloud. It is difficult to find Blockchain-based technologies specifically designed for isolated AmI deployments. Works describing new consensus protocols (based on network indicators), so that Blockchain can be executed by sensor nodes, have been reported [30]. But results show that sensor nodes are not powerful enough to maintain a Blockchain network in the long term [31] (most of the computing time is consumed by the Blockchain protocols), and no performance analysis of these new protocols when integrated into AmI systems has been reported. In fact, existing results assume AmI deployments have an Internet connection [32], so isolated deployments are not studied.

Our paper aims to fill this gap. In this paper we describe a new Blockchain ledger, designed to be supported by edge managers so that it is computationally sustainable in the long term. It includes a new consensus protocol adapted to isolated AmI deployments, where no connection to the cloud is needed. We use network parameters, together with other information sources and analysis algorithms, to improve the precision and success rate reported in the state of the art, correctly detecting up to 93% of Cyber-Physical attacks.

3 A new Blockchain ledger for AmI securization

Isolated AmI deployments are supported by a three-layer architecture (see Fig. 1). The first layer is composed of randomly distributed sensor nodes that capture data on a random basis. In this paper we are assuming \(N\) sensor nodes \({n}_{i}\) (1) are part of this AmI deployment. In the second layer, \(M\) edge managers \({m}_{i}\) (2) communicate with the \({K}_{i}\) sensor nodes \({n}_{j}^{{\mathcal{C}}_{i}}\) within their coverage area \({\mathcal{C}}_{i}\) (3). Coverage areas are not homogeneous and are unknown a priori, as they become self-configured when the AmI deployment starts operating. The third layer is composed of a local data processing server (usually distributed), whose internal structure and behavior are transparent for the purpose of this paper.

Fig. 1

Proposed architecture for an isolated AmI deployment

$$\mathcal{N}= \left\{{n}_{i} \mid i=1,\dots ,N\right\}$$
(1)
$$\mathcal{M}= \left\{{m}_{i} \mid i=1,\dots ,M\right\}$$
(2)
$${\mathcal{C}}_{i}= \left\{{n}_{j}^{{\mathcal{C}}_{i}} \mid j=1,\dots ,{K}_{i}\right\}$$
(3)

In this architecture, edge managers \(\mathcal{M}\) maintain a data structure in a collaborative way. It is a Blockchain \(C\) (4) (hereinafter also referred to as “chain”), that is, a sequence of connected blocks \({b}_{i}\) where each new block \(C[i+1]\) contains an explicit reference to the previous one \(C[i]\) through its hash. Blocks \({b}_{i}\) contain a random number \({T}_{i}\) of accepted transactions \({t}_{j}^{i}\) (or operations), invoked by sensor nodes \(\mathcal{N}\) (5). In our proposal, these transactions \({t}_{j}^{i}\) represent the transmission of new data that are accepted by edge managers as legitimate.

$$C= \left\{{b}_{i}\right\}=\left\{C[i]\right\}, \quad \text{being } {b}_{i} \equiv C[i]$$
(4)
$${b}_{i}= \left\{{t}_{j}^{i} \mid j=1,\dots ,{T}_{i}\right\}$$
(5)
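As an illustrative sketch only (the field names and the use of SHA-256 hashing are assumptions made for this example, not part of the proposal), the chain and block structures implied by Eqs. (4) and (5) could be represented as follows:

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import List

@dataclass
class Transaction:
    """t_j^i: a new data sample reported by sensor node n_i."""
    node_id: int
    value: float
    timestamp: float

@dataclass
class Block:
    """b_i: a set of accepted transactions, linked to the previous block by its hash."""
    previous_hash: str
    transactions: List[Transaction] = field(default_factory=list)

    def hash(self) -> str:
        # The block identity covers its transactions and the reference to the previous block
        payload = json.dumps([vars(t) for t in self.transactions], sort_keys=True)
        return hashlib.sha256((self.previous_hash + payload).encode()).hexdigest()
```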

A ledger is a set of mechanisms, shared and common to all edge managers \(\mathcal{M}\), employed to keep the chain and the list of accepted transactions updated and coherent, according to the common acceptance criteria and guaranteeing consensus among all the edge managers. In this paper, we propose a Blockchain ledger where three basic mechanisms are considered. First, a reputation model and calculation framework, including explicit and implicit reputation indicators. This mechanism is used to identify legitimate data sources whose transactions may eventually be included in the ledger (see Sect. 3.1). Second, a stochastic framework to calculate how likely new data are to be legitimate. In this framework, probabilities are obtained using information theory indicators and considering the historical behavior of the AmI deployment (see Sect. 3.2). This instrument is used to identify valid transactions. And third, and finally, a new consensus protocol together with block generation and transaction validation algorithms. Considering reputation indicators and data validity probabilities, these instruments identify fully valid transactions through weighted consensus among all edge managers (see Sect. 3.3).

3.1 Sensor node validation: reputation model

Reputation, as a technological parameter, can be defined using several different approaches [33]: cognitive, computational, neurological, or even game-theoretical. But in security applications, indicators must be precise and stable to avoid false positive and false negative detections. Therefore, in this paper, we propose a hybrid definition of reputation to improve the stability and precision of classic approaches [34].

In our framework, the global reputation \(R[{n}_{i}]\) of sensor node \({n}_{i}\) is obtained as the geometric average of two different reputation measures (6). On the one hand, the explicit reputation \({R}_{e}[{n}_{i}]\) obtained from direct recommendations generated by other nodes within the AmI deployment. On the other hand, the implicit reputation \({R}_{im}[{n}_{i}]\) calculated by the surrounding sensor nodes using traffic statistics and other network indicators. Both the implicit and the explicit reputations vary in the interval \([\mathrm{0,1}]\).

$$R\left[{n}_{i}\right]= \sqrt{{R}_{e}[{n}_{i}] \bullet {R}_{im}[{n}_{i}]}$$
(6)

Explicit reputation \({R}_{e}[{n}_{i}]\) is calculated by edge managers \(\mathcal{M}\) from direct recommendations produced by sensor nodes \({n}_{j}\) (being \(j\ne i\)). Figure 2 shows the block diagram of the proposed calculation algorithm.

Fig. 2

Explicit reputation calculation algorithm

Periodically, every \({\tau }_{i}^{en}\) seconds, sensor node \({n}_{i}\) creates a map with all nodes \({n}_{j}\) within its coverage region. For each node \({n}_{j}\), it evaluates three identity and configuration parameters: the link address (it may be a MAC -Media Access Control- address or a UUID -Universally Unique Identifier-, for example); the communication protocols being employed; and the provided data formats and/or services. Considering the received information and responses, node \({n}_{i}\) may generate one or several positive or negative recommendations about node \({n}_{j}\). Recommendations are generated according to the following criteria:

  • A positive recommendation is generated if the link address belongs to a device that was part of the AmI network in the past. A negative recommendation is generated if the link address belongs to a blacklisted device.

  • A positive recommendation is generated if communication protocols are standard protocols already present in the AmI deployment. And a negative recommendation is generated if protocols or configurations usually used in cyberattacks are detected.

  • A positive recommendation is generated if data formats and/or services are similar to the ones managed by other sensor nodes. A negative recommendation is produced when data formats or services typically associated with cyberattacks are detected.

All recommendations are sent to edge managers. When received, recommendations are collected in two different buckets: the first one for positive recommendations and the second one for negative recommendations. Every \({\tau }_{i}^{em}\) seconds, one recommendation about node \({n}_{i}\) is extracted from each bucket. If both a positive and a negative recommendation are extracted, the explicit reputation remains unchanged. But if the negative bucket is empty, and only a positive recommendation is extracted, the explicit reputation \({R}_{e}[{n}_{i}]\) is incremented using the multiplier \(\left(1+\frac{1}{Q}\right)\), where \(Q\) is a real number greater than or equal to one (7). On the contrary, if the positive bucket is empty and only a negative recommendation is extracted, the explicit reputation \({R}_{e}[{n}_{i}]\) is decremented using the same multiplier. This calculation procedure is described in Algorithm 1.

$$Q \in [1,\infty )$$
(7)

In addition, the positive recommendation bucket and the negative recommendation bucket have a maximum capacity of \({\Sigma }_{max}^{pos}\) and \({\Sigma }_{max}^{neg}\) recommendations, respectively. When either bucket is full, recommendations of the corresponding type are rejected. With this design, we protect the reputation calculation framework from recommendation bursts, which are unnatural and typically associated with attacks. Besides, a correct balance between the time constants \({\tau }_{i}^{em}\) and \({\tau }_{i}^{en}\) allows the algorithm to automatically compensate malicious attacks thanks to genuine recommendations coming from legitimate sensor nodes.

Algorithm 1 Explicit reputation calculation procedure
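A minimal sketch of this bucket-based update (in the spirit of Algorithm 1) is shown below; the bucket capacities, the value of \(Q\), the decrement rule (division by the same multiplier) and the clamping of the reputation to \([0,1]\) are illustrative assumptions:

```python
class ExplicitReputation:
    """Sketch of the bucket-based explicit reputation update (Algorithm 1).
    Capacities, Q, the decrement rule and the clamping to [0, 1] are assumptions."""

    def __init__(self, q_factor=10.0, cap_pos=50, cap_neg=50, initial=0.5):
        self.q = q_factor          # Q >= 1, controls the update step (Eq. 7)
        self.cap_pos = cap_pos     # Sigma_max^pos
        self.cap_neg = cap_neg     # Sigma_max^neg
        self.pos = 0               # positive recommendation bucket
        self.neg = 0               # negative recommendation bucket
        self.reputation = initial  # R_e[n_i], kept within [0, 1]

    def add_recommendation(self, positive: bool) -> None:
        """Full buckets reject new recommendations (protection against bursts)."""
        if positive and self.pos < self.cap_pos:
            self.pos += 1
        elif not positive and self.neg < self.cap_neg:
            self.neg += 1

    def periodic_update(self) -> float:
        """Executed every tau_i^em seconds by the edge manager."""
        multiplier = 1.0 + 1.0 / self.q
        if self.pos and self.neg:              # opposite recommendations cancel out
            self.pos -= 1
            self.neg -= 1
        elif self.pos:
            self.pos -= 1
            self.reputation = min(1.0, self.reputation * multiplier)
        elif self.neg:
            self.neg -= 1
            self.reputation = max(0.0, self.reputation / multiplier)
        return self.reputation
```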

Implicit reputation \({R}_{im}[{n}_{i}]\) is deduced from nodes’ behavior. But every sensor node \({n}_{j}\) has a different vision of the other nodes’ behavior. Therefore, the global implicit reputation \({R}_{im}[{n}_{i}]\) is calculated by aggregating the partial estimations \({R}_{im}({n}_{j})[{n}_{i}]\) generated by nodes \({n}_{j}\) (8). However, different estimations may have different significance. Weights \({\lambda }_{j}\) (9) represent these differences in the final reputation calculation.

$${R}_{im}\left[{n}_{i}\right]= \sum_{\begin{array}{c}\forall {n}_{j} \in N\\ j \ne i\end{array}}{\lambda }_{j}\bullet {R}_{im}({n}_{j})[{n}_{i}]$$
(8)
$${\lambda }_{j} \in (0, 1]$$
(9)

In our model, implicit reputation \({R}_{im}({n}_{j})[{n}_{i}]\) is calculated by combining three different network parameters (10):

  • Reliability \(\left(\rho \right)\). It measures the availability of a sensor node to communicate when requested by other nodes within the AmI deployment.

  • Goodness \(\left(\zeta \right)\). It represents the a posteriori probability of a sensor node becoming malicious and attacking other nodes in its surroundings.

  • Importance \(\left(\nu \right)\). This parameter refers to the relevance of a node and how essential it is within the AmI deployment. It depends, basically, on how many other nodes could assume its functions if it were removed from the system.

Each one of these network indicators is weighted (using the real parameters \({\mu }_{1}\), \({\mu }_{2}\) and \({\mu }_{3}\)), so it is possible to select the relative significance of each contribution to the final implicit reputation estimation (11).

$${R}_{im}\left({n}_{j}\right)\left[{n}_{i}\right]=\left[\begin{array}{ccc}{\mu }_{1}& {\mu }_{2}& {\mu }_{3}\end{array}\right]\bullet \left[\begin{array}{c}\rho \\ \zeta \\ \nu \end{array}\right]$$
(10)
$${\mu }_{i} \in \left(0, 1\right], \quad i=1,2,3$$
(11)

Node \({n}_{i}\) divides time into measurement slots with a duration of \({\tau }_{i}^{mes}\) seconds. For each slot, node \({n}_{i}\) counts the total number of communication attempts \({p}_{total}\) with every node \({n}_{j}\) within its coverage region, together with the number of attempts that were actually successful, \({p}_{success}\). Using the Laplace definition of probability and these two measures, we can calculate the instant availability \({a}_{inst}^{j}\) for node \({n}_{j}\) (12). All these instant measurements are collected in a common time series \(w[k]\) (13), where \(k\) is the discrete time variable (\(k\)-th slot). Hereinafter, \(k=0\) is the current (present) time instant. However, reliability depends not only on events happening in the last \({\tau }_{i}^{mes}\) seconds, but on the node's entire historical behavior, although past measurements are slightly less relevant than recent ones. Thus, we can calculate the historical availability \({a}_{j}\) of node \({n}_{j}\) through a weighted average (14), where \(A\) is an integer parameter and the weights slowly reduce their value thanks to the logarithmic function.

$${a}_{inst}^{j}=\frac{{p}_{success}}{{p}_{total}}$$
(12)
$$w\left[k\right]= \left({a}_{inst}^{j}\right)$$
(13)
$${a}_{j}= \sum_{k=1}^{A}\frac{w\left[1-k\right]}{1+ln\left(k\right)}$$
(14)

In network engineering, traffic is usually modeled as a Poisson process. Then, service time, congestion, and reliability follow an exponential distribution. In our model, reliability follows the same law (15), \({K}_{av}\) being a constant controlling the equivalence between the historical availability \({a}_{j}\) and the reliability \(\rho\) (16). In general, full reliability is achieved when the historical availability is equal to \(5\bullet {K}_{av}\) or higher.

$$\rho =1-exp\left\{-\frac{{a}_{j}}{{K}_{av}}\right\}$$
(15)
$${K}_{av} \in \left(0,1\right] : \quad {a}_{j}\ge 5\bullet {K}_{av} \Rightarrow \rho \approx 1$$
(16)
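The following minimal sketch illustrates Eqs. (12)-(16); the concrete values of \(A\) and \({K}_{av}\), and the most-recent-first ordering of the availability history, are assumptions made only for the example:

```python
import math

def reliability(availability_history, A=20, k_av=0.2):
    """Sketch of Eqs. (12)-(16): logarithmically weighted historical availability
    mapped to reliability. availability_history holds the instant availabilities
    a_inst^j (Eq. 12), most recent slot first; A and K_av are illustrative."""
    a_j = sum(w / (1.0 + math.log(k + 1))                 # Eq. (14)
              for k, w in enumerate(availability_history[:A]))
    return 1.0 - math.exp(-a_j / k_av)                    # Eq. (15)
```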

On the other hand, for each measurement time slot, node \({n}_{i}\) also monitors the number of attacks it receives from node \({n}_{j}\). For each time slot, \({z}_{attack}^{j}\) represents this indicator. All these instant measurements are collected in a common time series \(x[k]\) (17), where \(k\) is the discrete time variable (\(k\)-th slot).

$$x\left[k\right]= \left({z}_{attack}^{j}\right)$$
(17)

As before, this is an instant value; however, reputation does not depend only on the events that occurred during the last time slot, but also on the entire historical behavior. Nevertheless, past behaviors are not as relevant as recent events and, regarding attacks, fast adaptation to malicious behaviors allows quick detection and mitigation. Thus, a global historical attack counter \({z}_{j}\) is obtained through a weighted average (18), where the weights follow an exponential law which reduces the significance of past behaviors much faster than logarithmic laws, \(Z\) and \(r\) being integer parameters greater than one.

$${z}_{j}= \sum_{k=0}^{Z}x[-k]\bullet {\left(\frac{1}{r}\right)}^{k+1}$$
(18)

Using this global indicator, and through a sigmoid function (19), we calculate the goodness \(\zeta\) of node \({n}_{j}\).

$$\zeta = \frac{1}{{z}_{j}\bullet \sqrt{1+{\left(\frac{1}{{z}_{j}}\right)}^{2}} }$$
(19)
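Analogously, a short sketch of Eqs. (17)-(19) could be the following; the values of \(Z\) and \(r\), and the most-recent-first ordering of the attack history, are illustrative assumptions:

```python
import math

def goodness(attack_history, Z=20, r=2):
    """Sketch of Eqs. (17)-(19): exponentially weighted attack counter z_j and
    the resulting goodness. attack_history holds the per-slot attack counts
    z_attack^j, most recent slot first; Z and r are illustrative."""
    z_j = sum(x * (1.0 / r) ** (k + 1)                    # Eq. (18)
              for k, x in enumerate(attack_history[:Z + 1]))
    # algebraically equal to Eq. (19), and well defined even when z_j = 0
    return 1.0 / math.sqrt(1.0 + z_j ** 2)
```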

Finally, node \({n}_{i}\) may calculate the importance of \({n}_{j}\) using two parameters. First, parameter \(cl\) indicates how critical the services or data provided by node \({n}_{j}\) are in the context of the AmI deployment. Parameter \(cl\) takes values in the interval \((0,\infty )\), where lower values indicate that the component's criticality is limited. This parameter must be defined by AmI system managers and cannot be self-selected by sensor nodes. Secondly, parameter \(red\) represents the network redundancy. It measures how many other nodes within the coverage region of node \({n}_{i}\) provide the same services or data as node \({n}_{j}\). It can be estimated using the cardinality operator \(card\left\{\bullet \right\}\) (20).

$$red=\frac{card\left\{\text{similar nodes } {n}_{j}\right\}}{card\left\{\text{total nodes } {n}_{j}\right\}}$$
(20)

Using these two parameters, importance can be calculated (21). As can be seen, through the proposed exponential law, importance \(\nu\) decreases as \(red\) parameter increases, but the decreasing rate is slower as the parameter \(cl\) is higher.

$$\nu =exp\left\{-red\bullet cl\right\}$$
(21)

In this case, only the most recent (instant) calculation is relevant, and historical values of the importance \(\nu\) are not considered: past deployment configurations do not affect the performance of the current AmI deployment.

Finally, for validation purposes, a sensor node \({n}_{i}\) is considered to have a good reputation, and is then validated as a legitimate information source, when its reputation \(R\left[{n}_{i}\right]\) goes above a threshold \({R}_{th}\) (22).

$$\text{if } R\left[{n}_{i}\right] \ge {R}_{th} \Rightarrow {n}_{i} \text{ validated}$$
(22)
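Putting the pieces of this section together, a minimal sketch of Eqs. (6), (10) and (20)-(22) could look as follows; the weights \(\mu\), the criticality \(cl\) and the threshold \({R}_{th}\) used here are illustrative assumptions:

```python
import math

def importance(similar_nodes, total_nodes, cl=1.0):
    """Sketch of Eqs. (20)-(21): redundancy ratio and exponential importance."""
    red = similar_nodes / total_nodes                     # Eq. (20)
    return math.exp(-red * cl)                            # Eq. (21)

def implicit_reputation(rho, zeta, nu, mu=(0.4, 0.4, 0.2)):
    """Sketch of Eq. (10); the mu weights are illustrative assumptions."""
    return mu[0] * rho + mu[1] * zeta + mu[2] * nu

def node_validated(r_explicit, r_implicit, r_th=0.6):
    """Eqs. (6) and (22): geometric mean of both reputations compared against
    the threshold R_th (whose value here is an illustrative assumption)."""
    return math.sqrt(r_explicit * r_implicit) >= r_th
```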

3.2 Data validation

Even valid and legitimate nodes may be infected or affected by malicious actions, causing them to generate false information. Therefore, in addition to the reputation calculation framework, a mechanism for data validation is essential.

In our model, each transaction or data sample is characterized by a Bernoulli distribution \(\mathcal{B}(q)\) where transaction \({t}_{j}^{i}\) is false with probability \(q\), and it is legitimate with probability \(\left(1-q\right)\) (23).

$$\mathcal{B}\left(q\right) \sim P\left({t}_{j}^{i}\ \text{is valid}\right) = \begin{cases} 1-q & \text{if true} \\ q & \text{if false} \end{cases}$$
(23)

Every transaction \({t}_{j}^{i}\), then, is fully characterized by the probability \(q\). This probability is obtained by combining three stochastic indicators. First, information entropy, \(H(\bullet )\). In general, legitimate data series have a low entropy because they follow a deterministic pattern. High entropy is evidence of a malicious attack trying to confuse the AmI system. Second, mutual information, \(I\left(\bullet ; \bullet \right)\). Close and similar nodes are expected to generate data series sharing a large amount of information. A mid-term or long-term sequence of outliers is evidence of the injection of false information. And, finally, the Probability Density Function (PDF), \(f(\bullet )\). Physical processes are typically stationary, and the probability distribution of data samples does not change with time. Radical, fast, or unexpected changes in the probability distribution of data samples are evidence of false information too.

However, all these stochastic indicators cannot be applied to individual transactions \({t}_{j}^{i}\), but to a sequence or collection \({T}_{L}\) including \(L\) transactions. Then, edge managers must collect \(L\) transactions and, later, validate them all together through the same data validation process. Parameter \(L\) does not have to be fixed and may change with time according to the AmI deployment's needs. Hereinafter, \({y}_{i}[k]\) is the series with the whole history of data generated by node \({n}_{i}\) (as transactions \({t}_{j}^{i}\)), and \({y}_{i}^{L}[k]\) is the series with the last \(L\) samples generated by node \({n}_{i}\) (contained in collection \({T}_{L}\)).

Information entropy, \(H(\bullet )\), is directly applied to the series \({y}_{i}^{L}[k]\) (24), where \({p}_{\mathcal{y}}\) is the probability of symbol \(\mathcal{y}\) (or data sample) within the series \({y}_{i}^{L}[k]\). In order to calculate the probabilities \({p}_{\mathcal{y}}\), the Laplace definition of probability is employed (25). But because of noise, fluctuations, numerical errors, etc., sensor nodes rarely generate two identical samples. Then, in our model, all samples within the range \([\mathcal{y}-\varepsilon , \mathcal{y}+\varepsilon ]\) are considered to be the same, where \(\varepsilon\) is a real parameter representing the precision (absolute value) of the AmI system. This operation is performed by the Heaviside step function, \(u\left[\bullet \right]\).

$$H\left({y}_{i}^{L}\right)= - \sum_{\forall\, \text{different}\; \mathcal{y} \in {y}_{i}^{L}} {p}_{\mathcal{y}}\bullet {\log}_{2}\left({p}_{\mathcal{y}}\right)$$
(24)
$${p}_{\mathcal{y}}= \frac{1}{L}\bullet \sum_{\forall\, \tilde{\mathcal{y}} \in {y}_{i}^{L}} u\left[\varepsilon - \left|\mathcal{y}-\tilde{\mathcal{y}}\right|\right]$$
(25)

Besides, the information entropy varies in the range \([0, {s}_{i}]\) (in bits), where \({s}_{i}\) is the exponent satisfying the exponential equality (26) and \({S}_{i}^{t}\) is the number of different symbols (data samples) in the sequence \({y}_{i}^{L}[k]\). But, for coherence with the other indicators, it is convenient if the entropy \(H\) also varies in the interval \([0, 1]\) (as probability functions do). Then, a mapping function is applied to get the final value for the entropy, \({H}_{map}\) (27).

$${S}_{i}^{t}={2}^{{s}_{i}}$$
(26)
$${H}_{map}\left({y}_{i}^{L}\right)=H\left({y}_{i}^{L}\right)\bullet \frac{1}{{s}_{i}}$$
(27)
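A minimal sketch of Eqs. (24)-(27) is shown below; the tolerance \(\varepsilon\) and the greedy grouping of samples into \(\varepsilon\)-distinct symbols are illustrative assumptions:

```python
import math

def distinct_symbols(samples, eps):
    """Group samples into eps-distinct representative symbols (an assumption on
    how 'different symbols' are identified)."""
    symbols = []
    for s in samples:
        if all(abs(s - y) > eps for y in symbols):
            symbols.append(s)
    return symbols

def symbol_probability(value, samples, eps):
    """Eq. (25): fraction of samples within +/- eps of `value` (Heaviside tolerance)."""
    return sum(1 for s in samples if abs(value - s) <= eps) / len(samples)

def mapped_entropy(samples, eps=0.1):
    """Sketch of Eqs. (24), (26) and (27): tolerance-based entropy mapped to [0, 1]."""
    symbols = distinct_symbols(samples, eps)
    if len(symbols) < 2:
        return 0.0                                        # a single symbol carries no entropy
    h = -sum(p * math.log2(p)
             for p in (symbol_probability(y, samples, eps) for y in symbols))
    s_i = math.log2(len(symbols))                         # Eq. (26): S_i^t = 2^{s_i}
    return h / s_i                                        # Eq. (27)
```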

On the other hand, mutual information \(I(\bullet ; \bullet )\) is applied to the series \({y}_{i}^{L}[k]\) and \({y}_{j}^{L}[k]\) (28), \({n}_{i}\) and \({n}_{j}\) being equivalent or similar nodes according to the explicit reputation parameters previously described in Sect. 3.1. \({p}_{{\mathcal{y}}_{1}}\) is the probability of symbol \({\mathcal{y}}_{1}\) (or data sample) within the series \({y}_{i}^{L}[k]\), \({p}_{{\mathcal{y}}_{2}}\) is the probability of symbol \({\mathcal{y}}_{2}\) (or data sample) within the series \({y}_{j}^{L}[k]\), and \({p}_{{\mathcal{y}}_{1},{\mathcal{y}}_{2}}\) is the joint probability of symbols (or data samples) \({\mathcal{y}}_{1}\) and \({\mathcal{y}}_{2}\) being generated at the same time instant by nodes \({n}_{i}\) and \({n}_{j}\), respectively. As before, in order to calculate the probabilities \({p}_{{\mathcal{y}}_{1}}\) (29), \({p}_{{\mathcal{y}}_{2}}\) (30) and \({p}_{{\mathcal{y}}_{1},{\mathcal{y}}_{2}}\) (31), the Laplace definition of probability is employed, \(u\left[\bullet \right]\) being the Heaviside step function. Additionally, for the mutual information calculation we consider a tolerance range \(\varepsilon\), so two samples are assumed to be the same if their difference is lower than this tolerance. Likewise, sensor nodes in AmI deployments are not synchronized; thus, all samples within a given time tolerance \(\pi\) (discrete time units) are considered to be generated at the same instant.

$$I\left({y}_{i}^{L} ; {y}_{j}^{L}\right)= \sum_{\forall\, \text{different}\; {\mathcal{y}}_{1} \in {y}_{i}^{L}}\ \sum_{\forall\, \text{different}\; {\mathcal{y}}_{2} \in {y}_{j}^{L}}{p}_{{\mathcal{y}}_{1},{\mathcal{y}}_{2}}\bullet {\log}_{2}\left(\frac{{p}_{{\mathcal{y}}_{1},{\mathcal{y}}_{2}}}{{p}_{{\mathcal{y}}_{1}}\bullet {p}_{{\mathcal{y}}_{2}}}\right)$$
(28)
$${p}_{{\mathcal{y}}_{1}}= \frac{1}{L}\bullet \sum_{\forall\, \tilde{\mathcal{y}} \in {y}_{i}^{L}} u\left[\varepsilon - \left|{\mathcal{y}}_{1}-\tilde{\mathcal{y}}\right|\right]$$
(29)
$${p}_{{\mathcal{y}}_{2}}= \frac{1}{L}\bullet \sum_{\forall\, \tilde{\mathcal{y}} \in {y}_{j}^{L}} u\left[\varepsilon - \left|{\mathcal{y}}_{2}-\tilde{\mathcal{y}}\right|\right]$$
(30)
$${p}_{{\mathcal{y}}_{1},{\mathcal{y}}_{2} }=\frac{1}{L\bullet \left(2\pi +1\right)}\bullet \sum_{k=-L+1}^{0}\sum_{{k}_{0}=-\pi }^{\pi }\left(u\left[\varepsilon -\left|{\mathcal{y}}_{2}-{y}_{j}^{L}\left[k-{k}_{0}\right]\right|\right]\bullet u\left[\varepsilon -\left|{\mathcal{y}}_{1}-{y}_{i}^{L}[k]\right|\right]\right)$$
(31)

In this case, the mutual information varies in the interval \([0, {s}_{i,j}]\), where \({s}_{i,j}\) is the solution to the exponential equation (32) and \({S}_{i,j}^{t}\) is the total number of different samples (symbols) in the series \({y}_{i}^{L}[k]\) and \({y}_{j}^{L}\left[k\right]\). Again, we employ a mapping function (33) to move the target interval to the range \([0, 1]\) (where probability functions usually take values) and obtain the final mutual information \({I}_{map}\).

$${S}_{i,j}^{t}={2}^{{s}_{i,j}}$$
(32)
$${I}_{map}\left({y}_{i}^{L} ; {y}_{j}^{L}\right)=I\left({y}_{i}^{L} ; {y}_{j}^{L}\right)\bullet \frac{1}{{s}_{i,j}}$$
(33)
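A simplified sketch of Eqs. (28)-(33) could be the following; the tolerances \(\varepsilon\) and \(\pi\), the boundary handling of the time-shifted index, and the identification of distinct symbols are illustrative assumptions:

```python
import math

def mapped_mutual_information(y_i, y_j, eps=0.1, pi=1):
    """Sketch of Eqs. (28)-(33) for two lists of length L generated by similar
    nodes n_i and n_j. eps, pi and the edge handling of k - k0 are assumptions."""
    L = len(y_i)

    def step(x):                                          # Heaviside u[.]
        return 1.0 if x >= 0 else 0.0

    def distinct(ys):                                     # eps-distinct symbols (assumption)
        return [s for n, s in enumerate(ys) if all(abs(s - t) > eps for t in ys[:n])]

    info = 0.0
    for y1 in distinct(y_i):
        p1 = sum(step(eps - abs(y1 - y)) for y in y_i) / L            # Eq. (29)
        for y2 in distinct(y_j):
            p2 = sum(step(eps - abs(y2 - y)) for y in y_j) / L        # Eq. (30)
            # Eq. (31): joint probability with +/- pi slots of time tolerance
            p12 = sum(step(eps - abs(y2 - y_j[min(max(k - k0, 0), L - 1)])) *
                      step(eps - abs(y1 - y_i[k]))
                      for k in range(L)
                      for k0 in range(-pi, pi + 1)) / (L * (2 * pi + 1))
            if p12 > 0:
                info += p12 * math.log2(p12 / (p1 * p2))              # Eq. (28)
    s_ij = math.log2(max(2, len(distinct(list(y_i) + list(y_j)))))    # Eq. (32)
    return info / s_ij                                                # Eq. (33)
```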

And third, and finally, the Probability Density Function (PDF) allows us to analyze the occurrence probability of every individual sample \(\mathcal{y}\) or transaction \({t}_{j}^{i}\). To calculate the PDF \(f(\bullet )\), we use the entire series \({y}_{i}\left[k\right]\), the Heaviside step function \(u\left[\bullet \right]\), the cardinality operator \(card\left\{\bullet \right\}\), and probability theory (34), so that histograms approach the PDF when the number of realizations is high. As before, all samples \(\mathcal{y}\) within the range \([\mathcal{y}-\varepsilon , \mathcal{y}+\varepsilon ]\) are considered to be equal, where \(\varepsilon\) is a real parameter representing the precision of the AmI system.

$$f\left(\mathcal{y}\right) = \frac{1}{card\left\{{y}_{i}\right\}}\bullet \sum_{\forall\, \tilde{\mathcal{y}} \in {y}_{i}} u\left[\varepsilon - \left|\mathcal{y}-\tilde{\mathcal{y}}\right|\right]$$
(34)

In this case, the PDF already takes values in the range \([0, 1]\), so no additional transformation is required. But, to be consistent with the previous stochastic indicators, function \(f(\bullet )\) should also refer to the entire series \({y}_{i}^{L}[k]\) and not only to individual samples \(\mathcal{y}\). To obtain this aggregated value \({f}_{av}\left(\bullet \right)\), we consider the average probability (35) of all samples in the series \({y}_{i}^{L}[k]\).

$${f}_{av}\left({y}_{i}^{L}\right) = \frac{1}{L}\sum_{\forall\, \mathcal{y} \in {y}_{i}^{L}} f\left(\mathcal{y}\right) = \frac{1}{card\left\{{y}_{i}\right\}\bullet L}\bullet \sum_{\forall\, \mathcal{y} \in {y}_{i}^{L}}\ \sum_{\forall\, \tilde{\mathcal{y}} \in {y}_{i}} u\left[\varepsilon - \left|\mathcal{y}-\tilde{\mathcal{y}}\right|\right]$$
(35)
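A short sketch of Eqs. (34)-(35), under the same tolerance assumption, could be:

```python
def average_density(last_L, full_history, eps=0.1):
    """Sketch of Eqs. (34)-(35): tolerance-based empirical density over the full
    history y_i[k], averaged over the last L samples y_i^L[k]. eps is illustrative."""
    def f(y):                                             # Eq. (34)
        return sum(1 for s in full_history if abs(y - s) <= eps) / len(full_history)
    return sum(f(y) for y in last_L) / len(last_L)        # Eq. (35)
```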

With these three stochastic indicators, the probability \(q\) may be obtained through a polynomial function (36), where \({h}_{max }^{1}\), \({h}_{max }^{2}\) and \({h}_{max }^{3}\) are the maximum exponents of the polynomial and the coefficients \({\sigma }_{1}\), \({\sigma }_{2}\) and \({\sigma }_{3}\) are weights controlling the relevance of each stochastic indicator in the calculation of the probability \(q\). To be consistent with the definition of probability, the sum of these three weights must be equal to one (37). The exponents \({h}_{max }^{1}\), \({h}_{max }^{2}\) and \({h}_{max }^{3}\) control how fast the probability \(q\) changes with the three previously described indicators. As the exponents get higher, changes in the entropy \({H}_{map}\), the mutual information \({I}_{map}\) or the PDF \({f}_{av}\) cause a more significant change in the probability \(q\).

$$q=\frac{{\sigma }_{1}}{{h}_{max }^{1}}\bullet \sum_{{h}_{1}=1}^{{h}_{max }^{1}}{\left(1-{H}_{map}\right)}^{{h}_{1}}+\frac{{\sigma }_{2}}{{h}_{max }^{2}}\bullet \sum_{{h}_{2}=1}^{{h}_{max }^{2}}{\left({I}_{map}\right)}^{{h}_{2}}+\frac{{\sigma }_{3}}{{h}_{max }^{3}}\bullet \sum_{{h}_{3}=1}^{{h}_{max }^{3}}{\left({f}_{av}\right)}^{{h}_{3}}$$
(36)
$${\sigma }_{1}+{\sigma }_{2}+{\sigma }_{3}=1$$
(37)
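A minimal sketch of Eqs. (36)-(37) is shown below; the \(\sigma\) weights and the maximum exponents used here are illustrative assumptions. The resulting value would then be compared against the threshold \({q}_{th}\), as described next:

```python
def false_probability(h_map, i_map, f_av, sigmas=(0.4, 0.3, 0.3), h_max=(2, 2, 2)):
    """Sketch of Eqs. (36)-(37). The sigma weights (which must sum to one) and the
    maximum exponents h_max are illustrative assumptions."""
    assert abs(sum(sigmas) - 1.0) < 1e-9                  # Eq. (37)
    q = 0.0
    for sigma, h_m, base in zip(sigmas, h_max, (1.0 - h_map, i_map, f_av)):
        q += (sigma / h_m) * sum(base ** h for h in range(1, h_m + 1))   # Eq. (36)
    return q
```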

Finally, transactions are validated if probability \(q\) goes below a given threshold \({q}_{th}\) (38). When that happens, the entire series \({y}_{i}^{L}[k]\) is validated.

$$\text{if } q \le {q}_{th} \Rightarrow {y}_{i}^{L}[k] \text{ validated}$$
(38)

3.3 Transaction validation, block generation and consensus protocol

In the proposed security solution, edge managers \(\mathcal{M}\) maintain a Blockchain ledger. Edge managers \(\mathcal{M}\) receive, accumulate, and validate transactions \({t}_{j}^{i}\) describing the generation of new AmI data by sensor nodes \({n}_{i}\). Valid transactions \({t}_{j}^{i}\) are written down in a new block \({b}_{i}\) for the chain by edge manager \({m}_{j}\). Block \({b}_{i}\) is linked to the last valid block through its hash, and will be validated by the other edge managers \({m}_{k}\).

In order to validate transactions, and according to the proposed data validation mechanism (see Sect. 3.2), a minimum of \(L\) different transactions from node \({n}_{i}\) must be accumulated by edge manager \({m}_{j}\). But, since transactions associated with different nodes can be accumulated at the same time, the final blocks \({b}_{i}\) contain \({T}_{i}\) valid transactions \({t}_{j}^{i}\) which may refer to different nodes \({n}_{j}\).

In our Blockchain ledger, we name \(\mathcal{T}({n}_{i})\) the collection of all transactions that meet the conditions to be valid only considering the source node (i.e., according to the proposed reputation model, see Sect. 3.1). On the other hand, we name \(\mathcal{T}(\mathcal{L})\) the set of all transactions that meet the conditions to be part of the historical record \(\mathcal{L}\) of the ledger (i.e., according to the proposed data validation framework, see Sect. 3.2). Thus, valid transactions \({t}_{j}^{i}\) are those contained in the intersection of both sets (39).

$${t}_{j}^{i} \in \mathcal{T}\left({n}_{i}\right)\cap \mathcal{T}(\mathcal{L})$$
(39)

In this way, a block \({b}_{i}\) is validated by consensus if a majority of edge managers \({m}_{j}\) validates all the individual transactions \({t}_{j}^{i}\) it contains. But in the general case, sensor nodes and edge managers are randomly distributed. Some sensor nodes can be connected to several edge managers, and the edge managers' coverage areas \({\mathcal{C}}_{i}\) may contain different numbers \({K}_{i}\) of sensor nodes \({n}_{j}^{{\mathcal{C}}_{i}}\). Then, the knowledge that each edge manager has about the AmI deployment is different. To represent this asymmetry, voting is weighted by the \({K}_{i}\) parameters, and a block \({b}_{i}\) is validated if the aggregate knowledge \({K}_{ag}^{+}\) of the managers supporting the block validation is higher than the aggregate knowledge \({K}_{ag}^{-}\) of the managers that do not support it (40).

$${K}_{ag}^{+}= \sum_{\forall {m}_{j}\ \text{validates}\ {b}_{i}}{K}_{j} \; > \; {K}_{ag}^{-}= \sum_{\forall {m}_{j}\ \text{does not validate}\ {b}_{i}}{K}_{j}$$
(40)
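A minimal sketch of this weighted voting rule (Eq. 40) could be the following; the dictionary-based representation of votes and knowledge is an assumption made for illustration:

```python
def block_accepted(votes, knowledge):
    """Sketch of Eq. (40): `votes` maps each edge manager to its (public) boolean
    vote, and `knowledge` maps each manager to K_j, the number of sensor nodes in
    its coverage area."""
    k_pos = sum(knowledge[m] for m, v in votes.items() if v)       # K_ag^+
    k_neg = sum(knowledge[m] for m, v in votes.items() if not v)   # K_ag^-
    return k_pos > k_neg

# Example: the single manager covering 40 nodes outweighs two managers covering 10 each
print(block_accepted({"m1": True, "m2": False, "m3": False},
                     {"m1": 40, "m2": 10, "m3": 10}))              # True
```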

Figure 3 describes in detail the proposed block generation, transaction validation, and consensus algorithm (executed by edge managers).

Fig. 3

Block generation, transaction validation and consensus algorithm

As can be seen, transactions \({t}_{j}^{i}\) and blocks \({b}_{i}\) for validation, as well as voting results \({v}_{i}\), arrive randomly at the edge manager. Transactions \({t}_{j}^{i}\) are stored together with all transactions \({y}_{i}[k]\) coming from the same sensor node \({n}_{i}\). Edge manager \({m}_{j}\) tries to create a new block at the discrete time instants \({\psi }_{r}\) (41). These time instants may be homogeneously distributed (periodic) or triggered by events.

$$\left\{{\psi }_{r} \mid r\in {\mathbb{N}}\right\}$$
(41)

At each time instant \({\psi }_{r}\), edge manager \({m}_{j}\) evaluates whether it stores at least \(L\) data samples (or transactions) coming from any of the nodes \({n}_{j}^{{\mathcal{C}}_{i}}\) within its coverage area \({\mathcal{C}}_{i}\). For all nodes \({n}_{i}\) for which at least \(L\) transactions are available, the edge manager \({m}_{j}\) validates the source node \({n}_{i}\) through the proposed reputation framework. If node \({n}_{i}\) is validated, the algorithm continues. If not, all associated transactions are discarded and deleted from the series \({y}_{i}[k]\). For all nodes \({n}_{i}\) whose reputation is high enough, the algorithm validates the collection of \(L\) transactions \({y}_{i}^{L}[k]\) through the data validation framework. If the transactions are validated, the algorithm continues. If not, all transactions are discarded and deleted.

After this process, all validated transactions \({t}_{j}^{i}\) are written down in a block \({b}_{i}\) by manager \({m}_{j}\); this block is then published and distributed among all other edge managers \({m}_{k}\).

When an edge manager \({m}_{k}\) receives a new block \({b}_{i}\) for validation, it executes the algorithm described above: first, it validates the source node \({n}_{i}\) using the reputation model and, later, the transaction series \({y}_{i}^{L}[k]\) through the data validation framework. If all transactions \({t}_{j}^{i}\) within the block \({b}_{i}\) are validated, manager \({m}_{k}\) votes positively (and publicly) and its knowledge \({K}_{k}\) is added to the aggregated knowledge \({K}_{ag}^{+}\). On the contrary, its knowledge \({K}_{k}\) is added to the negative aggregated knowledge \({K}_{ag}^{-}\). Voting results \({v}_{i}\) are public, and all managers receive the updates. When all edge managers \(\mathcal{M}\) have voted, the block \({b}_{i}\) is definitively validated only if the consensus condition (40) is met. This operation is distributed, as all edge managers track the voting results.

When block \({b}_{i}\) is definitively validated, all edge managers \({m}_{j}\) remove from their collections of transactions pending validation \({y}_{i}^{L}[k]\) all the transactions \({t}_{j}^{i}\) included in block \({b}_{i}\), as the Blockchain ledger cannot contain duplicated transactions.
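As a summary, a simplified sketch of the block-generation step executed at each instant \({\psi }_{r}\) (Fig. 3) could look as follows; the function names and signatures are assumptions, and the removal of validated transactions after consensus is omitted for brevity:

```python
def try_create_block(pending, L, validate_node, validate_series):
    """Sketch of the per-instant psi_r block generation step of Fig. 3. `pending`
    maps each node id to the list of its accumulated transactions; `validate_node`
    and `validate_series` stand for the reputation (Sect. 3.1) and data validation
    (Sect. 3.2) frameworks, respectively."""
    block = []
    for node, series in pending.items():
        if len(series) < L:
            continue                          # not enough accumulated transactions yet
        chunk = series[:L]
        if not validate_node(node):
            series[:L] = []                   # discard transactions from invalid sources
        elif validate_series(chunk):
            block.extend(chunk)               # candidate transactions for the new block
        else:
            series[:L] = []                   # discard series failing data validation
    return block or None                      # publish the block for consensus voting
```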

4 Experimental validation

In order to evaluate the performance of the proposed security solution in the context of isolated AmI deployments, we designed and carried out an experimental validation. All experiments were supported by simulation tools and scenarios, so that we could easily control variables such as the number of sensor nodes and/or edge managers within the deployment and, thus, analyze the behavior of the proposed technology in more depth. Section 4.1 describes the experimental methodology, while Sect. 4.2 presents and discusses the obtained results.

4.1 Experimental methodology, material and methods

The proposed experimental validation includes two different phases. In the first phase, the Cyber-Physical attack detection and mitigation capabilities of the proposed security solution are analyzed. Basically, the percentage of attacks successfully identified and blocked is studied. In the second phase, we focus on the performance level, and variables such as the required processing time or detection delay (critical in Blockchain-based systems) are analyzed. The objective is to identify not only whether the proposed mechanisms behave as expected, but also whether their performance is compatible with AmI deployment operations.

Specifically, five experiments were carried out. The first two experiments took place in the first experimental phase. The first experiment analyzed the percentage of Cyber-Physical attacks (false information injection) that were correctly detected and stopped, together with the number of false positive and false negative detections. The experiment was repeated for different numbers \(N\) of sensor nodes and different numbers \(M\) of edge managers in the AmI deployment. Later, the second experiment also analyzed the percentage of Cyber-Physical attacks that were correctly detected and stopped (as well as the number of false positive and false negative detections), but in this case the experiment was repeated for two different types of attacks (fast and slow attacks). In fast attacks, false information was injected as a flood, trying to collapse the AmI deployment as soon as possible. In slow attacks, false information was injected at the same speed as legitimate information, looking for a long-term impact. Different values of the \(L\) parameter (employed in the data validation framework) were also considered in this second experiment.

On the other hand, the last three experiments were part of the second experimental phase. The third experiment analyzed the time (delay) required to detect and mitigate a Cyber-Physical attack when using the proposed security solution, for different numbers \(N\) of sensor nodes and different numbers \(M\) of edge managers in the AmI deployment. Additionally, the fourth experiment also studied the attack detection delay of the proposed security mechanism, but in this case for different values of the \(L\) parameter (employed in the data validation framework) and two different types of attacks (fast and slow attacks, as described above). Finally, the fifth experiment studied the processing time and computational resources (memory) required to execute the proposed security solution. Results were divided into two different groups, depending on the type of device (sensor node or edge manager).

All of these experiments were supported by a simulation scenario, built and operated using MATLAB 2022a and Simulink software. This software was executed on a Linux-based machine (Ubuntu 22.04 LTS) with the following hardware characteristics: Dell R540 Rack 2U, 96 GB RAM, two Intel Xeon Silver 4114 2.2 GHz processors, and a 2 TB SATA HD at 7.2 K rpm.

In the proposed simulation model, the sensor nodes generated new data samples (or transactions) on a random basis. Edge managers were also randomly associated with sensor nodes. Sensor nodes could have three different measurement capabilities: temperature, humidity, and carbon dioxide. The proportion of each sensor type was randomly configured for every simulation. Attacks were represented by turning a random percentage of sensor nodes malicious, always below 25% of the total number available in each simulation. Communication protocols of legitimate nodes were Bluetooth or WiFi (randomly selected), while malicious nodes could also employ LoRa protocols. Table 1 shows the proposed configuration for all parameters that are not experimental variables.

Table 1 System configuration

Each simulation represented seventy-two (72) hours of AmI deployment operations. All simulations were repeated twelve times to reduce the impact of exogenous effects, such as numerical errors or interferences caused by the operating system. Final results were obtained as the average of all twelve realizations.

4.2 Results

Figure 4 shows the results of the first experiment. As can be seen in Fig. 4(a), the proposed security solution is able to detect and mitigate up to 93% of Cyber-Physical attacks. Besides, even in the worst case, the detection capability is above 78%. As the Blockchain ledger and the associated validation algorithms are distributed, the performance is significantly worse when few devices (sensor nodes and/or edge managers) are deployed within the AmI system. The best performance is obtained for deployments including between one thousand (1000) and two thousand (2000) sensor nodes. After this point, performance slightly decreases again (the success detection rate is reduced by around 2%) because of noise and numerical fluctuations caused by very large AmI deployments. But this decrease is not as relevant as the observed increase between AmI deployments with twenty-five (25) and one thousand (1000) nodes (the success detection rate increases by around 19%). In conclusion, the proposed solution is especially useful for large isolated AmI deployments (more than one thousand nodes).

Fig. 4

Results for the first experiment. a True positive detections. b False negative detections. c False positive detections

The impact of the number of edge managers is also observable and relevant, but it is only significant for small AmI deployments (below one thousand nodes). For this kind of deployment, systems with only five (5) edge managers reach a detection rate of just 78%, while systems including one hundred (100) managers achieve an 89% rate. That is because of consensus. When few edge managers are considered, consensus is weak and block validations are not consistent (they are mainly affected by numerical errors in the data validation framework). But a higher number of edge managers can mitigate this impact, thanks to a much more robust consensus. Anyway, as the AmI deployment increases in size (more sensor nodes are included), this effect disappears. Deficiencies in the consensus protocol are compensated by much more abundant reputation information from the AmI deployment, as well as by more samples and longer historical series to be employed in the data validation framework, which reduces errors and makes all configurations behave the same regardless of the number of edge managers. In conclusion, for small AmI deployments, a higher number of edge managers can increase performance.

Regarding false negative detections, Fig. 4(b), they are much more common than false positive detections, Fig. 4(c). Actually, the proposed security solution is specifically designed to avoid false positive detections, which would perturb AmI deployment operations, since false information (in small amounts) is usually risk-free. The false positive detection rate decreases monotonically as the success detection rate increases and, for large AmI deployments, stays under 0.3% in all cases. The false negative detection rate behaves similarly, but in this case the minimum value is around 3%. In addition, this rate also increases slightly for very large AmI deployments, which is consistent with the reduction in the success detection rate.

Figure 5 shows the results of the second experiment. This experiment was carried out for an AmI deployment with one thousand (1000) sensor nodes and twenty-five (25) edge managers. As can be seen, the evolution for all rates is qualitatively similar to the results in Fig. 4. Thus, both experiments are coherent.

Fig. 5

Results for the second experiment. a True positive detections. b False negative detections. c False positive detections

For low values of the \(L\) parameter, fast attacks are detected more efficiently. In fast attacks, the impact is more relevant, even in short time periods, so the data validation framework may detect malicious behavior even with a limited number of samples (or transactions). But if the value of the \(L\) parameter gets even smaller, performance gets worse in any situation, with a success detection rate around 60% for fast attacks and around 40% for slow attacks. As can be seen in Fig. 5(a), the optimal point for fast attacks is \(L \approx 75\), where the success detection rate is around 93%. When the \(L\) parameter goes beyond this point, too many transactions must be accumulated, and fast attacks may finish and complete their objective before they are detected. On the contrary, slow attacks present a very bad behavior for low and medium values of the \(L\) parameter, but this behavior improves monotonically as the \(L\) parameter increases. For values \(L >100\), the success detection rate goes above 90%, with a maximum rate of 91%. Taking into account this analysis, a good balanced configuration could be \(L=100\). For this value, both types of attack are successfully detected with rates greater than 90%.

Regarding false positive and false negative detection rates, we can distinguish two regions. For low values of the \(L\) parameter, the false positive and false negative detection rates present similar values, mainly because for such low values the data validation framework is unstable and numerical errors make false positive and false negative detections occur randomly. But, for high values of the \(L\) parameter, the false positive detection rate is low for both kinds of attacks (fast and slow), although for fast attacks the rate increases slightly (the minimum value is around 2.5%, growing up to approximately 8%) and for slow attacks the rate decreases monotonically (minimum rate, 2.5%). On the contrary, the false negative detection rate for fast attacks greatly increases, from 3% (minimum value, for \(L=75\)) to 19%. Meanwhile, for slow attacks, this rate also decreases monotonically and stays around 5%.

On the other hand, Fig. 6 shows the results of the third experiment (second experimental phase). In this experiment, only attacks successfully detected were considered. As can be seen, attack detection delays evolve linearly with the number of sensor nodes within the AmI deployment, as well as with the number of edge managers. This is consistent with the proposed validation and consensus algorithm, as the voting process requires all edge managers to vote (so it is a longer process as more managers participate). Furthermore, as more nodes are included in the AmI deployment, the blocks tend to include more transactions, which also increases the detection delay linearly (because of the data validation framework and the reputation model, see Fig. 3). If we analyze the situation where the highest success detection rate was reported (one thousand nodes), the delay is between eleven (11) and thirty-seven (37) seconds. These values are acceptable for most AmI deployments and Cyber-Physical attacks, as deployments are resilient enough to withstand an attack for such a short time.

Fig. 6

Results for the third experiment. Detection delay

Figure 7 shows the results of the fourth experiment. As in the second experiment, these results correspond to an AmI deployment with one thousand (1000) devices, and only successful attack detections were considered. In this case, the differences between slow and fast attacks are very small. Delays are only 7% higher for slow attacks because, on some occasions, attacks are so slow that more than one block is needed to detect them, contrary to fast attacks. Time also evolves linearly with parameter \(L\), as the stochastic indicators in the data validation framework are obtained using sequential loops over all accumulated samples. If we consider a balanced value for the \(L\) parameter, for example \(L=75\), the detection delay is around 37.5 s for slow attacks and 29 s for fast attacks. As said above, these values are acceptable for most AmI deployments, so we can conclude that the proposed solution is adequate for isolated AmI deployments.

Fig. 7

Results for the fourth experiment. Detection delay

Finally, Table 2 shows the results of the fifth experiment. This analysis was carried out for the configuration with the best behavior (one thousand sensor nodes, twenty-five edge managers and \(L=100\)). As can be seen, around 10% of the resources in the sensor nodes are consumed by the proposed solution. For these calculations, the ESP32 microcontroller was taken as a reference. The processing delay, in addition, is acceptable for most applications, where sensor nodes generate data every few seconds. On the other hand, resource consumption in edge managers is slightly higher. Around 15% of the memory is required by the proposed Blockchain-based solution, and most of the attack detection delay is attributable to edge managers (on average, more than forty seconds). To obtain these results, an Artik 530 architecture was considered. In conclusion, the resource consumption of the proposed security solution is compatible with the characteristics of AmI deployments.

Table 2 Results for the fifth experiment. Resource consumption

5 Conclusions

In this paper, we propose a new decentralized security solution, based on a Blockchain ledger, to protect isolated Ambient Intelligence deployments. In this ledger, new sensing data are considered transactions that must be validated by edge managers, which operate a Blockchain network. This validation is based on reputation metrics evaluated by sensor nodes using historical network data and identity parameters. Through information theory, the coherence of all transactions with the behavior of the historical deployment is also analyzed and considered in the validation algorithm. The relevance of edge managers in the Blockchain network is also weighted considering the knowledge they have about the deployment.

Five different experiments are provided, showing that the attack detection rate can achieve values of up to 93%, with a detection delay of 37 s in the worst case. Additionally, resource consumption in sensor nodes and edge managers in AmI deployments is acceptable for most current hardware platforms. As a result, we can conclude that the proposed security solution is adequate for isolated AmI deployments.

For future work, the proposed solution will be validated in a real scenario, considering hardware sensor nodes and edge managers within a functional isolated AmI deployment.