1 Introduction

A cyber-physical system (CPS) is a system composed of a physical component and a computational component (Rawat et al. 2015). The physical component is made up of objects such as actuators, sensors and control units. The computational component processes the information provided by the physical components in order to make decisions and to control or protect some physical asset. CPSs are ubiquitous in today's world: they are found in everyday devices such as appliances, cell phones and vehicles, and are used heavily in industries such as medicine and manufacturing. CPSs have been studied extensively over the last few decades, and many reliability models have been analyzed using various techniques. In Kumar et al. (2021), a Markov model is studied in which the value of system availability is calculated using the fourth-order Runge-Kutta method.

In many cases, a CPS will be the target of hostile attacks aimed at making it fail, and will operate in an environment where energy cannot be renewed for long periods of time. Under these circumstances, the CPS needs a way of protecting itself to ensure it remains in working order for as long as possible. Intrusion detection systems (IDS) or intrusion detection and response systems (IDRS) are often utilized to detect and respond to these malicious attacks. Of particular interest is the quantification of the reliability of such a system. Several models for a CPS with an IDS have been analyzed previously. Al-Hamadi and Chen (2013) proposed a redundancy management scheme for heterogeneous wireless sensor networks. In Cho et al. (2010), the use of a linear-time attacker function for modeling attacker behavior is considered. Mitchell and Chen (2011) developed a generic hierarchical model for performance analysis of intrusion detection techniques. The survivability of a mobile CPS comprising sensor-carried agents is addressed in Mitchell and Chen (2012b). In Mitchell and Chen (2016), an analytical model based on stochastic Petri nets was developed to capture the dynamics between adversary behavior and defense for CPSs. Orojiloo and Azgomi (2019) proposed a model for evaluating the security of CPSs. In Kholidy (2021), a new Autonomous Response Controller is introduced to respond to attacks on a CPS. Wang et al. (2018) developed a modelling and simulation framework for generating cyber attack scenarios using Monte Carlo sampling. In Fang et al. (2017), a general framework is proposed for cyber-physical system reliability models in which cyber-intrusion is modeled by a semi-Markov process. Tripathi et al. (2021) proposed a design-time methodology to map and analyze system security using stochastic Petri nets. For a thorough review of the research advances regarding CPSs with intrusion detection systems, we refer the reader to Mitchell and Chen (2014).

The objective of this paper is to present a mathematical analysis of a generalized version of the model first introduced in Mitchell and Chen (2013), which is based on stochastic Petri nets (Chen and Wang 1996a; Chen and Wang 1996b; Robin et al. 1996) and is representative of a CPS, in order to characterize the reliability function and mean-time to failure of such a system. The stochastic Petri net for the model we study here is shown in Fig. 1.

Fig. 1
figure 1

Stochastic Petri net model depicting the transition of good nodes to bad or evicted nodes and bad nodes to evicted nodes. Also shown is the possibility for failure due to an attack from a bad node or energy exhaustion

The CPS which will be considered in this paper has the following characteristics:

  • The CPS is designed to function for extended time periods without the renewal of energy; however, the system has only a finite reserve of energy. At the time of energy exhaustion, the system fails.

  • The components of the CPS, which from now on we will refer to as nodes, are vulnerable to attacks that will corrupt the target node. Once a node is corrupted, it will act against the security of the system through attacks of its own. A successful attack by a corrupted node will cause impairment failure.

  • Nodes that are functioning properly have the job of detecting nodes in the system that have been corrupted. The detection of a node that has been corrupted will lead to the eviction of that node from the system. In the search for corrupted nodes, it is possible that a properly functioning node is incorrectly identified as corrupted, resulting in the eviction of this node from the system.

In addition to energy exhaustion and impairment-failure due to attacks from a corrupted node, there are other conditions for failure of the system that can be considered. For example:

  • Byzantine failure: When the number of corrupted nodes exceeds 1/3 of the total number of nodes left in the system, the system fails due to a Byzantine failure (Lamport et al. 1982).

  • Intrusion-detection failure: In many cases, the detection of bad nodes requires a vote from the remaining nodes in the system. If there are too few nodes left in the system such that a vote of m nodes cannot be taken, the system fails.
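A configuration of the system can be summarized by the number of bad nodes m and evicted nodes k among N total nodes. The sketch below, a minimal illustration with hypothetical values (N = 8 and a required vote of m_vote = 3 nodes, neither taken from the paper), encodes the two optional failure conditions as a predicate that filters the working configurations.

```python
# Hypothetical sketch: classify a configuration (m bad, k evicted) of an
# N-node system as working or failed under the two optional conditions.

def is_working(m: int, k: int, N: int, m_vote: int) -> bool:
    """Return True if configuration (m, k) is a working state.

    Byzantine condition: fail when bad nodes exceed 1/3 of the
    nodes still in the system (the N - k non-evicted nodes).
    Intrusion-detection condition: fail when fewer than m_vote
    nodes remain to take a vote.
    """
    remaining = N - k            # good + bad nodes still in the system
    if 3 * m > remaining:        # Byzantine failure (m > remaining / 3)
        return False
    if remaining < m_vote:       # too few nodes left to vote
        return False
    return True

# The working set is then the subset of configurations passing both checks.
R = {(m, k) for m in range(9) for k in range(9)
     if m + k <= 8 and is_working(m, k, N=8, m_vote=3)}
```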

This CPS can be represented by an acyclic continuous-time Markov chain model, rigorously defined in the next section, that is representative of the general CPS described above and is an extension of the model introduced in Mitchell and Chen (2013). The model includes an absorbing state covering both failure due to energy exhaustion and failure due to impairment. Other than those specific cases of failure, any other failure-inducing conditions imposed on the system can be applied independently of the analysis of the transient-state transition probabilities.

A real-world example of a CPS representative of this model is discussed by Hawkinson et al. (2012). Furthermore, Mitchell and Chen (2013) exemplify several real-world applications including disaster and emergency response, military patrol and combat, healthcare (Mitchell and Chen 2012a) and unmanned aircraft (Mitchell and Chen 2012c).

A restricted version of the model analyzed in this paper has been studied by several researchers previously, mostly from a numerical perspective. In Masetti and Robol (2020) and Mitchell and Chen (2013), the model is studied numerically: reliability was analyzed under a range of attacker behaviors and performability measures were computed, respectively. In Masetti and Robol (2020), the instantaneous reward measure at time t was calculated numerically. In Mitchell and Chen (2013), the mean-time to failure was calculated utilizing the stochastic Petri net package presented in Ciardo et al. (1989). In Martinez et al. (2017), an algorithm for calculating the mean-time to failure of a restricted version of this model was presented under the Byzantine and intrusion-detection failure conditions.

In addition to previous studies of this model, there has been work on acyclic continuous-time Markov chains that could be applied to it. In Lindemann et al. (1995), the reliability of such a system is calculated numerically. In Nabli (1998), an algorithm for computing the performability is given. An algorithm for symbolically finding the transient-state transition probabilities of a general acyclic continuous-time Markov chain is presented in Marie et al. (1987); however, a main component of that algorithm requires consideration of the structure of the Markov chain in an iterative first step.

The contribution of this paper is twofold. First, we present an explicit expression for the transient-state transition probabilities, each of which is a finite sum of exponential functions and depends on a set of recursively defined, doubly indexed coefficients. The second contribution is the derivation of an explicit expression for the mean-time to failure utilizing the transient-state transition probabilities. This expression for the mean-time to failure is general in that it can be utilized for failure conditions other than the Byzantine and intrusion-detection conditions. As this model is a generalized version of the model presented in Mitchell and Chen (2013), the analytical results presented in this paper hold for the more restricted model as well. The key to deriving expressions for the transient-state transition probabilities and mean-time to failure is exploiting the structure of the model, which is an acyclic continuous-time Markov chain. As a result, the expressions presented in this paper have this structure built in, as opposed to the result in Marie et al. (1987).

In the next three sections, we give a rigorous definition of an extended version of the model introduced in Mitchell and Chen (2013), and state our main results about the transient-state transition probabilities and mean-time to failure. The appendix is allocated to proofs of supporting lemmas.

2 Model Description

2.1 Definition of State Space

We fix N to be the total number of nodes in the system, each of which at any given time is rated as good, bad or evicted. Assuming that at time t the process has not yet entered an absorbing state, the state at time t can be fully characterized by the number of good nodes, bad nodes and evicted nodes in the system. This leads us to the definition of the following set of configurations corresponding to the transient states

$$\begin{aligned} \mathcal {T}:=\{(m,\,k)\in \mathbb {N}_0\times \mathbb {N}_0: 0\le m+k\le N\} \end{aligned}$$
(1)

with the understanding that \(\mathbb {N}_0 = \mathbb {N}\cup \{0\}.\) Moving forward, we will think of the element \((m,k)\in \mathcal {T}\) as the configuration with m bad nodes, k evicted nodes and \(N-m-k\) good nodes. Note that for all \((m,k)\in \mathcal {T}\), we have

$$m+k+(N-m-k)=N,$$

implying that the total number of nodes is conserved: no node enters or exits the system. Furthermore, because m, k and \(N-m-k\) represent the quantities of bad, evicted and good nodes, respectively, for a configuration \((m,k)\in \mathcal {T}\), we also have

$$m,\ k,\ N-m-k\ge 0.$$

We will denote by \(\mathcal {A}\), the absorbing state representing failure due to energy exhaustion or impairment-failure. Keeping in mind the possibility for other types of failure such as Byzantine or intrusion-detection failure, we will denote \(\mathcal {R}\subset \mathcal {T}\) as the set of configurations in \(\mathcal {T}\) that are considered working (as opposed to failed) under optional additional conditions being applied to the system.

2.2 Definition of \(\widetilde{\mathcal{T}}\), \(\varphi\) and \(\preceq\)

The results in Sections 3 and 4 and the proofs in Appendix 2 lend themselves to the enumeration of the elements of \(\mathcal {T}\). This ordering of the set \(\mathcal {T}\) is intuitive and corresponds to the flow through the state space starting from N good nodes. In particular, the proofs in Appendix 2 are made simpler by letting the state space of configurations be a set of integers. Thus, for ease of discussion, we will define the following set \(\widetilde{\mathcal {T}}\) which corresponds directly to \(\mathcal {T}\).

$$\begin{aligned} \widetilde{\mathcal {T}}:=\{1,2,3,\ldots ,{{\,\textrm{card}\,}}(\mathcal {T}\ )\}. \end{aligned}$$
(2)

In order to match each element of \(\mathcal {T}\) with an integer in \(\widetilde{\mathcal {T}}\), we define a mapping \(\varphi\) from \(\mathcal {T}\) to \(\widetilde{\mathcal {T}}\) which gives a straightforward way of enumerating all elements of \(\mathcal {T}\) and will later be shown to be a bijection.

Definition 1

Let the function

$$\varphi :\mathcal {T}\rightarrow \widetilde{\mathcal {T}}$$

be given by the mapping

$$\begin{aligned} \varphi (m,k):=\frac{(m+k)(m+k+1)}{2}+k+1. \end{aligned}$$
(3)

Lemma 2.1

\(\varphi\) is a bijection.

The proof of Lemma 2.1 is in Appendix 1.
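Beyond the formal proof, the bijectivity of \(\varphi\) can be sanity-checked empirically for a small N. The snippet below (N = 6 is an arbitrary choice, not from the paper) enumerates \(\mathcal {T}\) and confirms that \(\varphi\) maps it onto \(\{1,\ldots ,{{\,\textrm{card}\,}}(\mathcal {T})\}\).

```python
# Empirical check (not a proof) that phi enumerates T bijectively
# onto {1, ..., card(T)} for a small N.

def phi(m: int, k: int) -> int:
    """The mapping of Definition 1: phi(m,k) = (m+k)(m+k+1)/2 + k + 1."""
    return (m + k) * (m + k + 1) // 2 + k + 1

N = 6
# T = {(m, k) : 0 <= m + k <= N}, enumerated by total s = m + k.
T = [(s - k, k) for s in range(N + 1) for k in range(s + 1)]
card_T = (N + 1) * (N + 2) // 2

images = sorted(phi(m, k) for (m, k) in T)
assert images == list(range(1, card_T + 1))  # onto {1, ..., card(T)}, no repeats
```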

Considering the set of working configurations \(\mathcal {R}\subset \mathcal {T}\), we define the subset \(\widetilde{\mathcal {R}}\subset \widetilde{\mathcal {T}}\) to correlate with \(\mathcal {R}\) and be given by

$$\begin{aligned} \widetilde{\mathcal {R}} := \varphi (\mathcal {R}). \end{aligned}$$
(4)

The enumeration of the configurations in \(\mathcal {T}\) through the use of the mapping \(\varphi\) onto \(\widetilde{\mathcal {T}}\) can be well described with a relation \(\preceq\) on the set \(\mathcal {T}\) defined below.

Definition 2

Define the relation \((\preceq )\) on the set \(\mathcal {T}\) such that \((m,k)\preceq (\widetilde{m},\widetilde{k})\) if at least one of the following conditions holds:

  1. (i)

    \(m+k<\widetilde{m}+\widetilde{k}\)

  2. (ii)

    \(m+k=\widetilde{m}+\widetilde{k}\) and \(k\le \widetilde{k}\).

The relation \(\preceq\) gives a natural ordering to \(\mathcal {T}\) that corresponds directly with the ordering of the integers in \(\widetilde{\mathcal {T}}\) and can be understood easily by referencing Fig. 2. It will be shown that for \((m,k),(\widetilde{m},\widetilde{k})\,\in\, \mathcal {T}\), \((m,k)\preceq (\widetilde{m},\widetilde{k})\) if and only if \(\varphi (m,k)\le \varphi (\widetilde{m},\widetilde{k})\).

Lemma 2.2

Let \((m,k),(\widetilde{m},\widetilde{k})\,\in\, \mathcal {T}\). We have that \((m,k)\preceq (\widetilde{m},\widetilde{k})\) if and only if \(\varphi (m,k)\le \varphi (\widetilde{m},\widetilde{k})\).

The proof of Lemma 2.2 is in Appendix 1.
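Lemma 2.2 can likewise be checked empirically on a small instance. The snippet below (N = 5, an arbitrary choice) compares the relation \(\preceq\) with the integer ordering of the \(\varphi\)-images over all pairs of configurations.

```python
# Empirical check of Lemma 2.2 for a small N:
# (m, k) preceq (mt, kt)  iff  phi(m, k) <= phi(mt, kt).

def phi(m, k):
    return (m + k) * (m + k + 1) // 2 + k + 1

def preceq(a, b):
    """The relation of Definition 2."""
    (m, k), (mt, kt) = a, b
    return m + k < mt + kt or (m + k == mt + kt and k <= kt)

N = 5
T = [(s - k, k) for s in range(N + 1) for k in range(s + 1)]
assert all(preceq(a, b) == (phi(*a) <= phi(*b)) for a in T for b in T)
```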

2.3 Definition of \(X\,(t)\)  

We are now equipped to rigorously define our Markov chain as well as the dynamics that govern it. The state of our continuous-time Markov chain is

$$X:[0,+\infty )\rightarrow \widetilde{\mathcal {T}}\,\cup\, \{\mathcal {A}\}.$$

For a given time \(t\ge 0\), we understand

$$X(t) = \varphi (m,\,k)$$

to indicate that the state of our process is \(\varphi (m,\,k)\) and equates to the system being in the configuration where there are \(N-m-k\) good nodes, m bad nodes and k evicted nodes at time t. Furthermore, we understand

$$X(t) = \mathcal {A}$$

to indicate that the state of our process is \(\mathcal {A}\) meaning the process has entered the absorbing state representative of energy exhaustion and impairment failure.

As the model is representative of a real-world system, we will assume that at time \(t=0\), all of the nodes are good nodes giving us that \(X(0)=\varphi (0,0)=1\). We will define the rate of transition from state \(i=\varphi (m,\,k)\in \widetilde{\mathcal {T}}\) to state \(j\in \widetilde{\mathcal {T}}\cup \{\mathcal {A}\}\) where \(i\ne j\) by

$$\begin{aligned} \lambda ^i_j := {\left\{ \begin{array}{ll} \lambda ^{\varphi {(m,\kern 0.12em k)}}_{\varphi (m+1,\kern 0.12em k)}\cdot \mathbbm {1}\{m+k < N\}, &\quad j=\varphi (m+1,\kern 0.12em k) \\ \lambda ^{\varphi {(m,\kern 0.12em k)}}_{\varphi (m,\kern 0.12em k+1)}\cdot \mathbbm {1}\{m+k < N\}, &\quad j=\varphi (m,k+1) \\ \lambda ^{\varphi {(m,\kern 0.12emk)}}_{\varphi (m-1,\kern 0.12em k+1)}\cdot \mathbbm {1}\{m > 0\}, &\quad j=\varphi (m-1,\kern 0.12em k+1)\\ \lambda ^{\varphi {(m,\kern 0.12em k)}}_{\mathcal {A}}, &\quad j=\mathcal {A}\\ 0, &\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$
(5)

We set

$$\begin{aligned} \Lambda _i := -\sum _{\begin{array}{c} k\, \ne \, i \\ k\, \in\, \widetilde{\mathcal {T}}\cup \{\mathcal {A}\} \end{array}}\lambda ^i_k. \end{aligned}$$
(6)

and refer to \(\Lambda _i\) as the negative of the total exit rate from state i.
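To make the rate structure concrete, the following sketch assembles the nonzero transition rates allowed by Eq. (5) and the quantities \(\Lambda _i\) of Eq. (6) for a small chain. The rates here are randomly generated placeholders, as in the figures of this paper; no particular model is implied, and the helper names are ours.

```python
import random

# Sketch: assemble the transition-rate table of a Delta_N-Markov chain
# following Eqs. (5)-(6), with arbitrary random rates.

def phi(m, k):
    return (m + k) * (m + k + 1) // 2 + k + 1

def build_rates(N, rng):
    """Return (rates, Lam): rates[(i, j)] for j a state index or 'A'."""
    rates, Lam = {}, {}
    T = [(s - k, k) for s in range(N + 1) for k in range(s + 1)]
    for (m, k) in T:
        i = phi(m, k)
        if m + k < N:                       # a good node turns bad / is evicted
            rates[(i, phi(m + 1, k))] = rng.random()
            rates[(i, phi(m, k + 1))] = rng.random()
        if m > 0:                           # a bad node is detected and evicted
            rates[(i, phi(m - 1, k + 1))] = rng.random()
        rates[(i, 'A')] = rng.random()      # energy exhaustion / impairment
        # Lambda_i is the negative of the total exit rate from state i.
        Lam[i] = -sum(r for (src, _), r in rates.items() if src == i)
    return rates, Lam

rates, Lam = build_rates(3, random.Random(0))
assert all(v <= 0 for v in Lam.values())
```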

Definition 3

A continuous-time Markov chain is called \(\Delta _N\)-Markov provided its state-transition diagram may be given by the directed graph shown in Fig. 2, for some \(N\in \mathbb {N}_0\) and nonnegative transition rates \(\lambda ^{i}_{j}\)’s, where \(\lambda ^{i}_{j}\) denotes the transition rate from state i to state j.

In the context of Definition 3 above, when checking whether a continuous-time Markov chain is \(\Delta _N\)-Markov, we shall make the convention that if state i (as in Fig. 2) does not exist, we take  \(\lambda ^{(\varvec{\cdot })}_{i}\)  and  \(\lambda ^{i}_{(\varvec{\cdot })}\)  to be all zero while assuming that state i is a constituent of the state-transition diagram for the continuous-time Markov chain in question.

Fig. 2
figure 2

The state-transition diagram for a \(\Delta _N\)-Markov chain and the mapping \(\varphi\) from \(\mathcal {T}\) to \(\widetilde{\mathcal {T}}\) with \(K={{\,\textrm{card}\,}}(\mathcal {T}\ )\). The state at time \(t=0\) is at the top of the triangle in state \(\varphi (0,0)=1\). The black arrows reflect the possible transitions through the state space. The red arrow on each state reflects that from any state \(i\in \widetilde{\mathcal {T}}\), the process can enter the absorbing state \(\mathcal {A}\). Furthermore, the natural ordering given by \(\preceq\) is shown where \(\varphi (0,0) = 1\), \(\varphi (1,0) = 2\), \(\varphi (0,1) = 3\), etc

3 Transient-State Transition Probabilities

By virtue of the applications we have in mind, we record in the present paper a body of results regarding \(\Delta _N\)-Markov chains which satisfy the following property:

Property 1

For all \(i,j\in \widetilde{\mathcal {T}}\) such that \(i\not =j\), if \(\max \{\Lambda _i,\Lambda _j\}<0\) then \(\Lambda _i\not =\Lambda _j\).

As it turns out, the expression found for the mean-time to failure in Martinez et al. (2017) places restrictions on the model dynamics in addition to the criteria for failure. The model studied in Martinez et al. (2017) and Mitchell and Chen (2013) imposes the Byzantine failure and intrusion-detection failure conditions. Furthermore, with \(\varphi (m,k)\) denoting the state of the process at time t, the transition rates for that model follow a specific structure:

$$\begin{aligned} \begin{array}{lll} \varphi (m,k) \longmapsto \varphi (m+1,k) & \,\,\,\,\text {with rate} & \,\,\,\,\lambda _c\times (N-m-k)\times \mathbbm {1}\{m+k < N\}\\ \varphi (m,k) \longmapsto \varphi (m,k+1) & \,\,\,\,\text {with rate} & \,\,\,\,(N-m-k)\times \mathcal {P}_{fp}\,/\,T_{IDS}\times \mathbbm {1}\{m+k < N\}\\ \varphi (m,k) \longmapsto \varphi (m-1,k+1) & \,\,\,\,\text {with rate} & \,\,\,\,m\times (1-\mathcal {P}_{fn})\,/\,T_{IDS}\times \mathbbm {1}\{m > 0\}\\ \varphi (m,k) \longmapsto \mathcal {A} & \,\,\,\,\text {with rate} & \,\,\,\,1\,/\,(N_{IDS}\times T_{IDS})+p_a\times m\times \lambda _{if}\\ \end{array} \end{aligned}$$
(7)

where \(\lambda _c\), \(\lambda _{if}\), \(\mathcal {P}_{fp}\), \(\mathcal {P}_{fn}\), \(p_a\), \(T_{IDS}\), \(N_{IDS}\) are fixed constants. For details on the interpretation of these constants, we refer the reader to Mitchell and Chen (2013).
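As an illustration of the rate structure (7), the sketch below evaluates the four exit rates of a state \(\varphi (m,k)\). The constant values are illustrative placeholders chosen by us, not those used in Mitchell and Chen (2013), and the function name is hypothetical.

```python
# Sketch of the rate structure (7); constant values are placeholders.

def mc_rates(m, k, N, lam_c, lam_if, P_fp, P_fn, p_a, T_IDS, N_IDS):
    """Rates out of state phi(m, k) for the restricted model of Eq. (7)."""
    active = 1 if m + k < N else 0
    return {
        'compromise': lam_c * (N - m - k) * active,                  # -> phi(m+1, k)
        'false_pos':  (N - m - k) * P_fp / T_IDS * active,           # -> phi(m, k+1)
        'detection':  m * (1 - P_fn) / T_IDS * (1 if m > 0 else 0),  # -> phi(m-1, k+1)
        'absorb':     1 / (N_IDS * T_IDS) + p_a * m * lam_if,        # -> A
    }

r = mc_rates(m=1, k=0, N=5, lam_c=0.01, lam_if=0.05,
             P_fp=0.02, P_fn=0.02, p_a=0.5, T_IDS=10.0, N_IDS=100)
```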

The adoption of (7) guarantees that Property 1 is satisfied; thus the model studied in Martinez et al. (2017) and Mitchell and Chen (2013) is a particular case of the more general model studied in this paper. As a consequence, all results found in this paper are valid for the restricted versions of the model studied previously. The following recursively defined constants are integral to the main results of this paper and give a concise way of writing the transient-state transition probabilities.

Definition 4

Suppose that \(i=\varphi (m,k), j=\varphi (\widetilde{m},\widetilde{k})\in \widetilde{\mathcal {T}}\) are two distinct states such that \(i<j\) and define \(\mathscr {C}^{\;i}_j\) as

$$\begin{aligned} \mathscr {C}^{\;1}_{2}:= {\left\{ \begin{array}{ll} 0, &\ \text {if } \Lambda _1=0 \\ \Bigl (\Lambda _1-\Lambda _2\Bigr )^{-1}\,\lambda ^{1}_{2}, &\ \text {otherwise,} \end{array}\right. } \end{aligned}$$
(8)
$$\begin{aligned} \mathscr {C}^{\;1}_{3}:= {\left\{ \begin{array}{ll} 0, &\ \text {if } \Lambda _1=0 \\ \Bigl (\Lambda _1-\Lambda _3\Bigr )^{-1}\, \Bigl (\lambda ^{1}_{3}+\lambda ^{2}_{3}\,\, \mathscr {C}^{\;1}_{2}\Bigr ), &\ \text {otherwise,} \end{array}\right. } \end{aligned}$$
(9)

and, for \((i,j)\notin \{1\}\times \{2,3\}\), define

$$\begin{aligned} \mathscr {C}^{\;i}_{j}:= {\left\{ \begin{array}{ll} 0, & \text{ if } \,\Lambda _i=0\, \text{ or } \,\widetilde{k} < k\\ \Bigl (\Lambda _i-\Lambda _j\Bigr )^{-1}\, \Bigl (-\lambda ^{i}_{j}\sum \limits _{p=1}^{i-1}\mathscr {C}^{\;p}_{i}\Bigr ), & \text{ if } \,m=\widetilde{m}-1 \, \text{ and } \,k=\widetilde{k}\\ \Bigl (\Lambda _i-\Lambda _j\Bigr )^{-1}\, \Bigl (\lambda ^{\varphi (\widetilde{m}-1,\,\widetilde{k})}_{j}\,\, \mathscr {C}^{\;i}_{\varphi (\widetilde{m}-1,\,\widetilde{k})}\Bigr ), & \text{ if } \,m\ne \widetilde{m}-1 \, \text{ and } \,k=\widetilde{k}\\ \Bigl (\Lambda _i-\Lambda _j\Bigr )^{-1}\, \Bigl (-\lambda ^{i}_{j}\sum \limits _{p=1}^{i-1}\mathscr {C}^{\;p}_{i}+ \lambda ^{j-1}_{j}\,\,\mathscr {C}^{\;i}_{j-1}\Bigr ), & \text{ if } \,m=\widetilde{m}=0 \, \text{ and } \,k=\widetilde{k}-1\\ \Bigl (\Lambda _i-\Lambda _j\Bigr )^{-1}\, \Bigl (-\lambda ^{i}_{j}\sum \limits _{p=1}^{i-1}\mathscr {C}^{\;p}_{i}+ \lambda ^{i+1}_{j}\,\,\mathscr {C}^{\;i}_{i+1} + \lambda ^{j-1}_{j}\,\,\mathscr {C}^{\;i}_{j-1}\Bigr ), & \text{ if } \,m=\widetilde{m}\ne 0 \, \text{ and } \,k=\widetilde{k}-1\\ \Bigl (\Lambda _i-\Lambda _j\Bigr )^{-1}\, \Bigl (-\lambda ^{i}_{j}\sum \limits _{p=1}^{i-1}\mathscr {C}^{\;p}_{i}\Bigr ), & \text{ if } \,m=\widetilde{m}+1 \, \text{ and } \,k=\widetilde{k}-1\\ \Bigl (\Lambda _i-\Lambda _j\Bigr )^{-1}\, \Bigl (\lambda ^{j-1}_{j}\,\,\mathscr {C}^{\;i}_{j-1}\Bigr ), & \text{ if } \,m+k=\widetilde{m}+\widetilde{k} \, \text{ and } \,k < \widetilde{k}-1\\ \Bigl (\Lambda _i-\Lambda _j\Bigr )^{-1}\, \Bigl (\lambda ^{\varphi (\widetilde{m},\,\widetilde{k}-1)}_{j}\,\, \mathscr {C}^{\;i}_{\varphi (\widetilde{m},\,\widetilde{k}-1)} +\lambda ^{j-1}_{j}\,\,\mathscr {C}^{\;i}_{j-1}\Bigr ), & \text{ if } \,\widetilde{m}=0,\,\, m+k < \widetilde{k}, \, \text{ and } \,k < \widetilde{k}-1\\ \Bigl (\Lambda _i-\Lambda _j\Bigr )^{-1}\, \Bigl (\lambda ^{\varphi (\widetilde{m},\,\widetilde{k}-1)}_{j}\,\, \mathscr {C}^{\;i}_{\varphi (\widetilde{m},\,\widetilde{k}-1)} +\lambda ^{\varphi (\widetilde{m}-1,\,\widetilde{k})}_{j}\,\, \mathscr {C}^{\;i}_{\varphi (\widetilde{m}-1,\,\widetilde{k})}+\lambda ^{j-1}_{j}\,\,\mathscr {C}^{\;i}_{j-1}\Bigr ), & \text{ otherwise. } \end{array}\right. } \end{aligned}$$
(10)

In aiming to explicitly define the transient-state transition probabilities, we define the following family of functions indexed by the set \(\widetilde{\mathcal {T}}\), which will later be shown to describe the distribution of the process at time t.

Definition 5

Let \(\pi _i:[0,+\infty )\rightarrow \mathbb {R}\) be given by

$$\begin{aligned} \pi _{1}(t):=\exp (\Lambda _{1}t), \end{aligned}$$
(11)

and

$$\begin{aligned} \pi _{j}(t):=\sum _{i=1}^{j-1}\mathscr {C}^{\;i}_{j}\, \Bigl (\exp (\Lambda _{i}t)-\exp (\Lambda _{j}t)\Bigr ), \,\,\text {for all}\,\,j\in \{2,3,\dots ,{{\,\textrm{card}\,}}(\widetilde{\mathcal {T}})\}. \end{aligned}$$
(12)
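As a sanity check, the functions of Definition 5 can be verified to solve the Kolmogorov forward equations on a tiny instance. The sketch below is a minimal illustration assuming N = 1 (states \(1=\varphi (0,0)\), \(2=\varphi (1,0)\), \(3=\varphi (0,1)\), plus \(\mathcal {A}\)) and arbitrary made-up rates; the coefficients C12, C13, C23 are our hand evaluations of Eqs. (8)-(10) for this chain, compared against a crude Euler integration of the forward equations.

```python
import math

# Worked check of Definition 5 for N = 1 with arbitrary rates.
a, b, c = 0.30, 0.20, 0.05   # state 1 -> 2, 1 -> 3, 1 -> A
d, e = 0.40, 0.10            # state 2 -> 3, 2 -> A
f = 0.15                     # state 3 -> A
L1, L2, L3 = -(a + b + c), -(d + e), -f   # distinct, so Property 1 holds

C12 = a / (L1 - L2)                       # Eq. (8)
C13 = (b + d * C12) / (L1 - L3)           # Eq. (9)
C23 = -d * C12 / (L2 - L3)                # Eq. (10), case m = m~+1, k = k~-1

def pi(t):
    """pi_1, pi_2, pi_3 of Definition 5 for this tiny chain."""
    p1 = math.exp(L1 * t)
    p2 = C12 * (math.exp(L1 * t) - math.exp(L2 * t))
    p3 = C13 * (math.exp(L1 * t) - math.exp(L3 * t)) \
       + C23 * (math.exp(L2 * t) - math.exp(L3 * t))
    return p1, p2, p3

# Crude Euler integration of the Kolmogorov forward equations.
p = [1.0, 0.0, 0.0]
h, horizon = 1e-4, 5.0
for _ in range(int(horizon / h)):
    dp1 = L1 * p[0]
    dp2 = a * p[0] + L2 * p[1]
    dp3 = b * p[0] + d * p[1] + L3 * p[2]
    p = [p[0] + h * dp1, p[1] + h * dp2, p[2] + h * dp3]

assert all(abs(x - y) < 1e-3 for x, y in zip(pi(horizon), p))
```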

We are now in a position to formulate the main result of this section. This result is of fundamental importance for the subsequent considerations in this paper. The theorem that follows gives an explicit expression for the transient-state transition probabilities starting from state \(\varphi (0,0)=1\). At the time of writing, the authors are not aware of any such expressions for the model studied here or for the more restricted model in Martinez et al. (2017) and Mitchell and Chen (2013).

Theorem 3.1

Assume that \(\{X(t)\,:\,t\in [0,+\infty )\}\) is a \(\Delta _N\)-Markov chain with transition rates \(\{\lambda ^i_j\}\) satisfying Property 1. Then the transient-state transition probabilities starting from state \(1=\varphi (0,0)\) at time \(t\in [0,+\infty )\) are given by

$$\begin{aligned} P(X(t)=\varphi (m,k)\vert X_0=\varphi (0,0))=\pi _{\varphi (m,\kern 0.12em k)}(t) \end{aligned}$$
(13)

for all non-negative integers m and k such that \(m+k\le N\).

We are now equipped to prove Theorem 3.1 by induction in the following steps:

  1. 1.

    In Lemma 3.2, we will show that Eq. (13) holds for all states \(\varphi (m,0)\in \widetilde{\mathcal {T}}\).

  2. 2.

    In Lemma 3.3, using the result of Lemma 3.2 as our base case, we will show that if Eq. (13) holds for some fixed \(k\in \{0,1,\ldots ,N-1\}\), then Eq. (13) holds for \(k+1\).

  3. 3.

    The combination of Lemmas 3.2 and 3.3 allows us to set up an induction which covers all combinations of \((m,k)\in \mathcal {T}\).

The proofs of the following lemmas are in Appendix 2.

Lemma 3.2

Given the above Markov chain \(\left\{ X(t):t\in [0,+\infty )\right\}\),

$$P(X(t)=\varphi (m,0)\vert X_0=1)=\pi _{\varphi (m,0)}(t) \text { for all } m\in \{0,1,\ldots ,N\}.$$

Lemma 3.3

Given \(k\in \{1,2,\ldots ,N\}\), and the above Markov chain \(\left\{ X(t):t\in [0,+\infty )\right\}\), if

$$P(X(t)=\varphi (m,k-1)\vert X_0=1)=\pi _{\varphi (m,k-1)}(t) \text { for all } m\in \{0,1,\ldots ,N-k+1\},$$

then

$$P(X(t)=\varphi (m,k)\vert X_0=1)=\pi _{\varphi (m,k)}(t) \text { for all } m\in \{0,1,\ldots ,N-k\}.$$

Proof of Theorem 3.1

Assume that \(\{X(t)\,:\,t\in [0,+\infty )\}\) is a \(\Delta _N\)-Markov chain with transition rates \(\{\lambda ^i_j\}\) that satisfies Property 1.

We aim to show that the transient-state transition probabilities starting from state \(1=\varphi (0,0)\) at time \(t\in [0,+\infty )\) are given by

$$\begin{aligned} P(X(t)=\varphi (m,k)\vert X_0=1)=\pi _{\varphi (m,k)}(t) \end{aligned}$$
(14)

for all non-negative m and k such that \(0\le m+k\le N\).

We proceed with a proof by induction on k.

Base Case: (\(k=0\)) By Lemma 3.2 we have that

$$P(X(t)=\varphi (m,0)\vert X_0=1)=\pi _{\varphi (m,0)}(t) \text { for all } m\in \{0,1,\ldots ,N\}.$$

Inductive Step: Assume that for \(k=r-1\) where \(0\le r-1\le N-1\) the following holds

$$P(X(t)=\varphi (m,r-1)\vert X_0=1)=\pi _{\varphi (m,r-1)}(t) \text { for all } m\in \{0,1,\ldots ,N-r+1\}.$$

Then by Lemma 3.3 we have that

$$P(X(t)=\varphi (m,r)\vert X_0=1)=\pi _{\varphi (m,r)}(t) \text { for all } m\in \{0,1,\ldots ,N-r\}.$$

This completes our proof. \(\square\)

Numerical simulations of the \(\Delta _N\)-Markov process agree with the theoretical curves \(\pi _i(t)\) for \(i\in \widetilde{\mathcal {T}}\), as illustrated in Fig. 3.

Fig. 3
figure 3

Theoretical transition probabilities at time t starting from state 1 with \(N=2\), alongside results for these quantities from \(10^5\) simulations using the same transition rates. All \(\lambda ^i_j\)’s were randomly generated. The solid curves represent the theoretical transition probabilities \(\pi _i(t)\) and \(1-\sum _{i=1}^6\pi _i(t)\), and the markers represent simulated results for states \(i\in \{1,\,2,\,3,\,4,\,5,\,6,\,\mathcal {A}\}\)
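The comparison behind Fig. 3 can be reproduced in miniature. The sketch below, under stated assumptions (N = 1, arbitrary made-up rates, and a basic Gillespie-style jump simulation of our own design), estimates \(P(X(t)=2)\) by simulation and compares it with the closed form \(\pi _2(t)\), whose coefficient is the instance of Eq. (8) for this chain.

```python
import math, random

# Monte Carlo check of the closed form for a tiny N = 1 chain.
a, b, c = 0.20, 0.10, 0.10   # 1 -> 2, 1 -> 3, 1 -> A
d, e = 0.30, 0.20            # 2 -> 3, 2 -> A
f = 0.25                     # 3 -> A
L1, L2 = -(a + b + c), -(d + e)
OUT = {1: [(2, a), (3, b), ('A', c)], 2: [(3, d), ('A', e)], 3: [('A', f)]}

def state_at(t, rng):
    """Simulate the jump chain and return the state occupied at time t."""
    s, clock = 1, 0.0
    while s != 'A':
        total = sum(r for _, r in OUT[s])
        clock += rng.expovariate(total)     # exponential holding time
        if clock > t:
            return s
        u, acc = rng.random() * total, 0.0  # pick the next state by rate
        for nxt, r in OUT[s]:
            acc += r
            if u <= acc:
                s = nxt
                break
    return s

rng, t, n = random.Random(42), 3.0, 20000
est = sum(state_at(t, rng) == 2 for _ in range(n)) / n
C12 = a / (L1 - L2)                                   # Eq. (8)
pi2 = C12 * (math.exp(L1 * t) - math.exp(L2 * t))     # Definition 5
assert abs(est - pi2) < 0.02   # agreement within Monte Carlo sampling error
```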

4 Mean-Time to Failure

This section is concerned with establishing the mean-time to failure for a system which has a \(\Delta _N\)-Markov chain as its underlying structure. Although the mean-time to failure was calculated analytically by Martinez et al. (2017), their result is based on the assumption that (7) holds and that intrusion-detection failure and Byzantine failure conditions are imposed. Our approach does not require that  (7) is satisfied and leaves open the possibility that some of the states in \(\widetilde{\mathcal {T}}\) are failed states that are absorbing. In such a case, the explicit expression presented in Theorem 3.1 is still valid. With this formulation, one can easily calculate the mean-time to failure under many combinations of conditions for failure of the system. For a more thorough review of reliability theory, we refer the reader to Trivedi (2002).

Consider a reliability model \(\{Y(t)\vert t\ge 0\}\) with a set of failed states \(\mathcal {F}\), a set of working states \(\mathcal {R}\), and state space \(\mathcal {S} = \mathcal {R}\cup \mathcal {F}\), where \(r^{+}\in \mathcal {R}\) is the optimal working state in which every component of the system is working. The mean-time to failure is the expected amount of time, starting from state \(r^{+}\), until the process reaches the set \(\mathcal {F}\). Letting T represent the time it takes for the process to reach a failed state, so that

$$T = \inf \{t\ge 0:Y(t)\in \mathcal {F}\},$$

we can define the mean-time to failure formally as

$$MTTF = E(\,T\,\vert \,Y(0) = r^{+}).$$

A more useful expression for MTTF is

$$\begin{aligned} MTTF = \int _0^\infty R(t)\, dt, \end{aligned}$$
(15)

where

$$R(t) = P(T \ge t\,\vert \,Y(0)=r^{+}).$$

R is called the reliability function, and R(t) is the probability that the system has not failed by time t. For an acyclic process, as is the case for a \(\Delta _N\)-Markov chain, this is simply the probability that the process is in a working state at time t. This is because we assume that all failed states are absorbing and because once the process leaves a state i, it can never return to state i. This allows us to express the reliability function for our \(\Delta _N\)-Markov chain \(\{X(t)\,:\,t\ge 0\}\) with working states \(\tilde{\mathcal {R}}\) and failed states \((\widetilde{\mathcal {T}}\setminus \tilde{\mathcal {R}})\cup \{\mathcal {A}\}\) as

$$\begin{aligned} R(t) = \sum _{j\,\in \,\tilde{\mathcal {R}}}\pi _j(t). \end{aligned}$$
(16)

Theorem 4.1

Assume that \(\{X(t)\,:\,t\in [0,+\infty )\}\) is a \(\Delta _N\)-Markov chain with working states \(\widetilde{\mathcal {R}}\subseteq \widetilde{\mathcal {T}}\), failed states \((\widetilde{\mathcal {T}}\setminus \widetilde{\mathcal {R}})\cup \{\mathcal {A}\}\), and transition rates \(\{\lambda ^i_j\}\) satisfying Property 1. Then the mean-time to failure is given by

$$\begin{aligned} MTTF = \mathop{\sum} \limits_{j\,\in \,\tilde{\mathcal {R}}}\sum _{\begin{array}{c} i\mathop {=}1 \\ i\,\in \,\widetilde{\mathcal {R}} \end{array}}^{j\mathop {-}1}\mathscr {C}^{\;i}_j\left( \frac{1}{\Lambda _j}-\frac{1}{\Lambda _i}\right) . \end{aligned}$$
(17)

Proof of Theorem 4.1

Notice that if \(i\notin \tilde{\mathcal {R}}\), then \(\mathscr {C}^{\;i}_{(\cdot )}=0\), and so, combining the results from Theorem 3.1 with Eqs. (15) and (16), we observe that

$$\begin{aligned} \begin{aligned} MTTF&= \int _0^\infty \sum _{j\,\in \,\tilde{\mathcal {R}}}\pi _j(t)\,dt \\&= \int _0^\infty \sum _{j\,\in \,\tilde{\mathcal {R}}}\sum _{i\mathop {=}1}^{j\mathop {-}1}\mathscr {C}^{\;i}_j\left( \exp (\Lambda _it)-\exp (\Lambda _jt)\right) \,dt \\&= \int _0^\infty \sum _{j\,\in \,\tilde{\mathcal {R}}}\sum _{\begin{array}{c} i\mathop {=}1 \\ i\,\in \,\tilde{\mathcal {R}} \end{array}}^{j\mathop {-}1}\mathscr {C}^{\;i}_j\left( \exp (\Lambda _it)-\exp (\Lambda _jt)\right) \,dt. \end{aligned} \end{aligned}$$
(18)

Because both sums in Eq. (18) are finite, we can rewrite this equation as

$$\begin{aligned} \begin{aligned} MTTF&= \sum _{j\,\in \,\tilde{\mathcal {R}}}\sum _{\begin{array}{c} i\mathop {=}1 \\ i\,\in \,\tilde{\mathcal {R}} \end{array}}^{j\mathop {-}1}\mathscr {C}^{\;i}_j\int _0^\infty \left( \exp (\Lambda _it)-\exp (\Lambda _jt)\right) \,dt. \end{aligned} \end{aligned}$$
(19)

Keeping in mind that if \(k\in \tilde{\mathcal {R}}\), then \(\Lambda _k<0\), we can evaluate the improper integral associated with indices \(i,\,j\,\in \widetilde{\mathcal {R}}\) as

$$\begin{aligned} \begin{aligned} \int _0^\infty \left( \exp (\Lambda _it)-\exp (\Lambda _jt)\right) \,dt&= \lim _{b\mathop {\rightarrow }\infty }\int _0^b\left( \exp (\Lambda _it)-\exp (\Lambda _jt)\right) \,dt \\&= \lim _{b\mathop {\rightarrow }\infty }\left. \left( \frac{\exp (\Lambda _it)}{\Lambda _i}-\frac{\exp (\Lambda _jt)}{\Lambda _j}\right) \right| _{0}^{b} \\&= \lim _{b\mathop {\rightarrow }\infty }\left[ \left( \frac{\exp (\Lambda _ib)}{\Lambda _i}-\frac{\exp (\Lambda _jb)}{\Lambda _j}\right) -\left( \frac{1}{\Lambda _i}-\frac{1}{\Lambda _j}\right) \right] \\&= \frac{1}{\Lambda _j}-\frac{1}{\Lambda _i}. \end{aligned} \end{aligned}$$
(20)

Finally, we achieve our desired result by invoking Eqs. (19) and (20) resulting in

$$\begin{aligned} \begin{aligned} MTTF&= \sum _{j\,\in \,\tilde{\mathcal {R}}}\sum _{\begin{array}{c} i\mathop {=}1 \\ i\,\in \,\tilde{\mathcal {R}} \end{array}}^{j\mathop {-}1}\mathscr {C}^{\;i}_j\left( \frac{1}{\Lambda _j}-\frac{1}{\Lambda _i}\right) . \end{aligned} \end{aligned}$$

\(\square\)
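The term-by-term integration in (19)-(20) can be checked numerically on a small instance. The sketch below, assuming N = 1 and arbitrary made-up rates chosen by us, compares each per-state term \(\sum _{i<j}\mathscr {C}^{\;i}_j(1/\Lambda _j-1/\Lambda _i)\) with the expected time spent in state j computed independently from the embedded jump chain (probability of ever reaching j times the mean holding time \(1/(-\Lambda _j)\)).

```python
# Check of the integration step (19)-(20) for an N = 1 chain.
a, b, c = 0.30, 0.20, 0.05   # 1 -> 2, 1 -> 3, 1 -> A
d, e = 0.40, 0.10            # 2 -> 3, 2 -> A
f = 0.15                     # 3 -> A
L1, L2, L3 = -(a + b + c), -(d + e), -f

C12 = a / (L1 - L2)                # Eq. (8)
C13 = (b + d * C12) / (L1 - L3)    # Eq. (9)
C23 = -d * C12 / (L2 - L3)         # Eq. (10), case m = m~+1, k = k~-1

# Closed-form per-state contributions via Eq. (20).
t2 = C12 * (1 / L2 - 1 / L1)
t3 = C13 * (1 / L3 - 1 / L1) + C23 * (1 / L3 - 1 / L2)

# The same quantities from the embedded jump chain: expected time in j
# equals P(reach j) times the mean holding time 1 / (-Lambda_j).
p2 = a / (a + b + c)
p3 = b / (a + b + c) + p2 * d / (d + e)
assert abs(t2 - p2 / (d + e)) < 1e-9
assert abs(t3 - p3 / f) < 1e-9
```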

5 Conclusion

In this paper, we studied a model for a CPS with an IDS consisting of cooperative nodes, subject to failure due to energy exhaustion and impairment. The model allows optional additional failure conditions, such as Byzantine failure and intrusion-detection failure, to be imposed. We found explicit expressions for the transient-state transition probabilities and the mean-time to failure for a generalized model representing a CPS. All previous studies of this model assumed the Byzantine and intrusion-detection failure conditions and apply only to that scenario. These results extend previous analytical results for the model introduced in Mitchell and Chen (2013) and provide closed-form expressions where previously only numerical methods were available.