Architecture-based resilience evaluation for self-adaptive systems

Cámara, Javier; de Lemos, Rogério; Vieira, Marco; Almeida, Raquel; Ventura, Rafael

doi:10.1007/s00607-013-0311-7

Published: 14 March 2013

Computing, volume 95, pages 689–722 (2013)


Abstract

One of the major challenges related to self-adaptive software systems is the provision of assurances that the system is resilient against changes that may occur either in the system or its environment. These assurances should be based on complementary sources of evidence that collectively justify that the system is able to attain the specified levels of resilience. The contribution of this paper is the definition and development of an architecture-based approach that evaluates by comparison the adaptation mechanisms of a self-adaptive software system. The proposed approach relies on the identification of representative environmental and system changeloads (i.e., sequences of changes) used in the run-time stimulation of the system. The system response obtained from this stimulation is collected and aggregated into a probabilistic model that is employed in the evaluation of system resilience. Our approach is intended to be used before deployment, since the process often involves putting the system through adverse conditions which are not adequate when the system is in production. The feasibility and effectiveness of the proposed approach are demonstrated in the context of Rainbow, an architecture-based platform for self-adaptation, and Znn.com, a case study that reproduces the typical infrastructure for a news website.

Notes

Steps decorated with a gear in the figure are fully automatic, whereas steps decorated with the designer are entirely manual. The remaining steps are partially automated, requiring input from the user.
In the remainder of this paper, we make a distinction between the general term adaptation or adaptation alternative, and adaptation strategy in the concrete context of Rainbow.
Architectural styles are patterns of system organization that enable the exploitation of commonalities across systems [24].
A summary of PCTL is provided in Appendix A.
Actually, changes can also occur in system goals. However, in the scope of this work, we assume fixed goals, therefore change sources are always associated with (system or environment) properties of an architectural type.
Although the distance to a non-conventional operational profile $N_\alpha $ from state $s$ requires the existence of a trajectory $\pi $ from $s$ to a state in $N_\alpha $, we pessimistically assume that non-conventional operational profiles are always reachable from any state in the conventional operational profile of the system. Therefore in practice, this distance is always estimated as $dst(s,N_\alpha )=\displaystyle \min _{s_N \in N_\alpha } \delta (s,s_N)$.
The code of these strategies can be downloaded from the SEAMS portal (http://seams.self-adapt.org).

References

Abowd G, Allen R, Garlan D (1993) Using style to understand descriptions of software architecture. ACM Trans Softw Eng Methodol 4:319–364
Almeida R, Vieira M (2012) Changeloads for resilience benchmarking of self-adaptive systems: a risk-based approach. In: Proceedings of EDCC
Andova S, Hermanns H, Katoen J-P (2003) Discrete-time rewards model-checked. In: FORMATS of Lecture Notes in Computer Science, vol 2791, Springer, Berlin, pp 88–104
Baier C, Katoen J-P (2008) Principles of Model Checking. MIT Press, Cambridge
Calinescu R, Grunske L, Kwiatkowska MZ, Mirandola R, Tamburrelli G (2011) Dynamic QoS management and optimization in service-based systems. IEEE Trans Softw Eng 37(3):387–409
Calinescu R, Kwiatkowska MZ (2009) Using Quantitative Analysis to Implement Autonomic IT Systems. In: ICSE. Institute of Electrical and Electronics Engineers, MN, pp 100–110
Cámara J, de Lemos R (2012) Evaluation of Resilience in Self-Adaptive Systems Using Probabilistic Model-Checking. In: Proceedings of the 7th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS 2012), IEEE, pp 53–62
Cheng BHC, et al. (2009) Software Engineering for Self-Adaptive Systems: a Research Roadmap. In: SEfSAS of LNCS, vol 5525, Springer, Berlin, pp 1–26
Cheng S-W (2008) Rainbow: Cost-Effective Software Architecture-Based Self-Adaptation. PhD thesis, Carnegie Mellon University, Pittsburgh
Cheng S-W, Garlan D, Schmerl BR (2009) Evaluating the Effectiveness of the Rainbow Self-Adaptive System. In: SEAMS, IEEE, Pittsburgh, pp 132–141
de Lemos R et al (2011) Software Engineering for Self-Adaptive Systems: a second Research Roadmap. In: de Lemos R, Giese H, Müller H, Shaw M (eds) Software Engineering for Self-Adaptive Systems, number 10431 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, Germany
Dwyer MB, Avrunin GS, Corbett JC (1999) Patterns in Property Specifications for Finite-State Verification. In: ICSE, Cobleigh, pp 411–420
Epifani I, Ghezzi C, Mirandola R, Tamburrelli G (2009) Model Evolution by Run-Time Parameter Adaptation. In: ICSE, IEEE CS, Cobleigh, pp 111–121
Garlan D, Cheng S-W, Huang A-C, Schmerl BR, Steenkiste P (2004) Rainbow: architecture-based self-adaptation with reusable infrastructure. IEEE Comput 37(10):46–54
Garlan D, Monroe RT, Wile D (2000) Acme: architectural description of component-based systems. In: Leavens GT, Sitaraman M (eds) Foundations of Component-Based Systems, chapter 3, Cambridge University Press, Cambridge, pp 47–67
Gray J (1992) Benchmark Handbook: For Database and Transaction Processing Systems. Morgan Kaufmann Publishers Inc., San Francisco
Grunske L (2008) Specification Patterns for Probabilistic Quality Properties. In: ICSE, ACM, Hawthorn, pp 31–40
Kanoun K, Spainhower L (2008) Dependability Benchmarking for Computer Systems. Wiley-IEEE Computer Society Pr, Wiley
Kephart JO, Chess DM (2003) The vision of autonomic computing. Computer 36:41–50
Kwiatkowska M, Norman G, Parker D (2007) Stochastic model checking. Lecture notes in computer science. Springer, Berlin
Laprie J-C (2008) From Dependability to Resilience. In: DSN Fast Abstracts, IEEE CS, New York
Madeira H (2005) Towards a security benchmark for database management systems. In: Proceedings of the 2005 International Conference on Dependable Systems and Networks, DSN ’05, IEEE Computer Society, Washington, pp 592–601
Oreizy P, Gorlick MM, Taylor RN, Heimbigner D, Johnson G, Medvidovic N, Quilici A, Rosenblum DS, Wolf AL (1999) An architecture-based approach to self-adaptive software. IEEE Intell Syst 14:54–62
Shaw M, Garlan D (1996) Software Architecture: Perspectives on an Emerging Discipline. Prentice-Hall, Indiana
Williams R, Pandelios G, Behrens S (1999) Software Risk Evaluation (SRE) Method Description: Version 2.0. Technical report. Carnegie Mellon University, Software Engineering Institute, Pittsburgh

Author information

Authors and Affiliations

University of Coimbra, Coimbra, Portugal
Javier Cámara, Marco Vieira, Raquel Almeida & Rafael Ventura
University of Kent, Canterbury, UK
Rogério de Lemos

Corresponding author

Correspondence to Javier Cámara.

Appendices

Appendix A: Probabilistic computation-tree logic

To express resilience properties about the system, we use PCTL [4], which is a logic language inspired by CTL [4]. Instead of the universal and existential quantification of CTL, PCTL provides the probabilistic operator $\mathcal{P }_{\bowtie p}(.)$, where $p \in [0, 1]$ is a probability bound and $\bowtie \; \in \{ \le , <, \ge , > \}$. Given a time bound $t \in \mathbb{N }$, PCTL is defined by the following syntax:
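As a reference point, the standard PCTL grammar from [4] can be written as follows. It is assumed here to be the intended one, since it matches the operators used in the satisfaction relation and in the abbreviations that follow; $a$ stands for an atomic proposition:

$$\begin{aligned} \varPhi&::= true \; | \; a \; | \; \varPhi \wedge \varPhi \; | \; \lnot \varPhi \; | \; \mathcal{P }_{\bowtie p}(\psi ) \\ \psi&::= X \varPhi \; | \; \varPhi \; U \; \varPhi \; | \; \varPhi \; U^{\le t} \; \varPhi \end{aligned}$$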

Formulae $\varPhi $ are named state formulae and can be evaluated over a Boolean domain $(true, false)$ in each state. Formulae $\psi $ are named path formulae and describe a pattern over the set of all possible paths originating in the state where they are evaluated. In state formulae, other Boolean operators, such as disjunction $(\vee )$, implication $(\Rightarrow )$, etc. can be specified based on the primary Boolean operators. Moreover, we employ the abbreviations: (bounded) finally $(F^{(\le t)} \varPhi =true~U^{(\le t)} \varPhi )$ and globally $(G \varPhi = \lnot F \lnot \varPhi )$. The satisfaction relation for PCTL is defined for a state $s$ as:

$$\begin{aligned} \begin{array}{lll} s \models true &{} &{} s \models a \; \text{ iff } \; a \in L(s)\\ s \models \lnot \varPhi \; \text{ iff } \; s \nvDash \varPhi &{} &{} s \models \varPhi _1 \wedge \varPhi _2 \; \text{ iff } \; s \models \varPhi _1 \; and \; s \models \varPhi _2\\ s \models \mathcal{P }_{\bowtie p}(\psi ) \; \text{ iff } \; Pr(s \models \psi ) \bowtie p \end{array} \end{aligned}$$

A formal definition of how to compute probability $Pr(s\models \psi )$ is presented in [4]. The intuition is that its value corresponds to the fraction of paths originating in $s$ and satisfying $\psi $ over the entire set of paths originating in $s$. The satisfaction relation of a path formula with respect to a path $\pi $ that originates in state $s (\pi [0] = s)$ is:
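Assuming the standard next and (bounded) until operators of [4], with $\pi [j]$ denoting the $j$-th state of the path, this relation is usually stated as:

$$\begin{aligned} \begin{array}{l} \pi \models X \varPhi \; \text{ iff } \; \pi [1] \models \varPhi \\ \pi \models \varPhi _1 \, U \, \varPhi _2 \; \text{ iff } \; \exists j \ge 0 : \; \pi [j] \models \varPhi _2 \; \wedge \; \forall \, 0 \le k < j : \; \pi [k] \models \varPhi _1 \\ \pi \models \varPhi _1 \, U^{\le t} \, \varPhi _2 \; \text{ iff } \; \exists \, 0 \le j \le t : \; \pi [j] \models \varPhi _2 \; \wedge \; \forall \, 0 \le k < j : \; \pi [k] \models \varPhi _1 \end{array} \end{aligned}$$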

Appendix B: Modelling probabilistic system response

This appendix introduces the relevant concepts and procedure followed to build the two kinds of probabilistic models used in our approach, namely: (i) Operational profile model, used to quantify the probability of satisfaction of a property specified as a PCTL formula in a given operational profile, and (ii) Impact model, employed to quantify the impact of a particular change in the system or its environment on the different quality dimensions considered in the system.

For the sake of clarity, we first define a set of concepts that will help to follow the discussion throughout the rest of this section.

Definition 16

(Boundary) The boundary of an operational profile $P$ is defined as:

$$\begin{aligned} boundary(P)=\{ s_P \in P \; | \; \exists s \longrightarrow s_P : \; s \in S \backslash P \} \end{aligned}$$

Definition 17

(Trajectory) A trajectory is a finite sequence of transitions:

$$\begin{aligned} \pi =s_1 \longrightarrow \dots \longrightarrow s_m. \end{aligned}$$

We assume that for all transitions $s\longrightarrow s^{\prime }$, the time needed by the system to move from state $s$ to state $s^{\prime }$ is given by $\tau \in \mathbb{R }^+$. Then, for any trajectory of the system $\pi $, we define its duration as $duration(\pi )=m \tau $.

A state $s^{\prime }\in S$ is reachable from a state $s$ in time $t$ (denoted as $s \longrightarrow ^{t} s^{\prime }$) if there exists a trajectory $\pi =s \longrightarrow \dots \longrightarrow s^{\prime }$, such that $duration(\pi )\le t$.
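The boundary and time-bounded reachability definitions can be illustrated with a minimal Python sketch. It assumes that the observed transitions are available as a list of `(source, target)` pairs and that the time bound has already been converted into a maximum number of transitions `max_steps` (roughly $t/\tau $). The function names and this representation are illustrative assumptions, not part of the original approach.

```python
from collections import deque


def boundary(profile, transitions):
    """States of `profile` that are entered from a state outside the profile."""
    return {dst for src, dst in transitions
            if dst in profile and src not in profile}


def reachable_within(start, transitions, max_steps):
    """States reachable from `start` using at most `max_steps` transitions.

    `max_steps` is an assumed stand-in for the time bound t expressed in
    units of tau; the start state itself is always included.
    """
    successors = {}
    for src, dst in transitions:
        successors.setdefault(src, set()).add(dst)

    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        state, depth = frontier.popleft()
        if depth == max_steps:
            continue  # the time budget is exhausted along this trajectory
        for nxt in successors.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen
```

Breadth-first search visits each state first at its minimum depth, so a state is reported as reachable exactly when some trajectory reaches it within the step budget.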

B.1 Operational profile model

In operational profile models, initial states correspond to those where the system enters the associated operational profile, and the state space of the model includes all the states reachable from initial states within the specified time frame determined by a time bound $t$.

Definition 18

(Operational profile model) A model for an operational profile $P$ and a collection of variables $X=\{x_1, \ldots , x_n\}$ during the time frame $[0,t]$ is a DTMC $\mathcal{M }^P_t$ built over $[\mathbb{R }^n]_{X}$, such that:

Set of initial states $S^\mathcal{M }_0= boundary(P)$;
Set of states $S^\mathcal{M }=\{ s \in S \; | \; \exists s_0 \in S^\mathcal{M }_0 : s_0 \longrightarrow ^{t} s \}$;

Algorithm 1 synthesizes a DTMC from a set of traces $T$ of the form $\langle s_1, \ldots , s_m \rangle $, where each state $s_{j \in \{1,\ldots ,m \} } \in [\mathbb{R }^n]_{X}$, and the trace length is given by $m \ge t / \tau $. Input also includes the set of atomic propositions of interest $AP$ to label the DTMC. The algorithm starts by incrementally building the sets of states (initial and global, lines 10, 14), and their corresponding labelling (line 15). Moreover, for the second part of the algorithm where the transition probability matrix is built, we employ an auxiliary matrix $O$ that is used to keep information about the number of times that a transition between any two states $prev$ and $current$ is observed (line 18). Hence, values in the transition probability matrix are updated according to this information by assigning to each of the successors of state $prev$ (i.e., all states $s^{\prime } \in S$ s.t. $O(prev,s^{\prime })\ne 0$) a value proportional to the number of times that the transition $prev \longrightarrow current$ has been observed w.r.t. the total number of transitions observed from source state $prev$ (line 21). Function $extract$ (line 8) returns and removes the first element of a vector

$$\begin{aligned} V=\langle e_1, e_2, \ldots , e_n \rangle : \; extract(V)=e_1, \; \mathrm{with}~V=\langle e_2,\ldots ,e_n \rangle \; \mathrm{afterwards}. \end{aligned}$$
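The synthesis procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than a transcription of Algorithm 1: states are hashable tuples of quantized values, atomic propositions are passed as a dictionary of predicates (a hypothetical representation), and transition probabilities are normalized once at the end instead of being updated incrementally.

```python
from collections import defaultdict


def synthesize_dtmc(traces, atomic_props):
    """Estimate a DTMC from observed traces.

    traces:       list of traces, each a list of hashable (quantized) states.
    atomic_props: dict mapping a proposition name to a predicate over states
                  (an assumed encoding of the set AP).
    Returns (initial_states, states, probabilities, labels).
    """
    initial, states, labels = set(), set(), {}
    observed = defaultdict(lambda: defaultdict(int))  # plays the role of matrix O

    def register(state):
        states.add(state)
        labels[state] = {name for name, holds in atomic_props.items() if holds(state)}

    for trace in traces:
        pending = list(trace)
        if not pending:
            continue
        prev = pending.pop(0)  # corresponds to extract(V)
        initial.add(prev)
        register(prev)
        while pending:
            current = pending.pop(0)
            register(current)
            observed[prev][current] += 1  # count the transition prev -> current
            prev = current

    # Relative frequency of each observed successor of a source state.
    probabilities = {}
    for src, row in observed.items():
        total = sum(row.values())
        probabilities[src] = {dst: count / total for dst, count in row.items()}
    return initial, states, probabilities, labels
```

States that only appear at the end of traces have no outgoing row in this sketch; a model checker would typically require them to be completed, for instance with a self-loop.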

A model of a specific operational profile $P$ can be synthesized from a set of traces obtained by observing the running system, where initial states are in $boundary(P)$ and the trace length corresponds to the time frame chosen for the model [7]. The probability of satisfaction of properties expressed as PCTL formulas, instantiated using the probabilistic response patterns described in Table 1, can be quantified directly against operational profile models, provided that the time bound associated with the property is smaller than or equal to the time frame associated with the model.

B.2 Impact model

Impact models describe the evolution of the different system variables considered for adaptation over a particular time frame, starting from an initial condition $\varPsi $ (e.g., associated with an event). Concretely, every state $s$ is labeled with a reward for each variable that consists of the difference between the value of the variable in $s$, and the value for the same variable in the closest initial state of the model. Hence, impact models describe the response of the system, i.e., evolution over time of the deviation of system variables (impact), caused by the occurrence of an event (e.g., a change) in the system or its environment. System variables comprised in impact models are those relevant to the satisfaction of the system’s goals and can be traced back to properties in the architectural model.

Definition 19

(Impact model) An impact model from an initial condition $\varPsi $ (expressed as a PCTL formula) and a collection of variables $X=\{x_1, \ldots , x_n\}$, during the time frame $[0,t]$ is a DMRM $\mathcal{I }^\varPsi _t=(D, \langle \rho _1, \ldots , \rho _n \rangle )$ such that:

$D=(S^\mathcal{I }_0, S^\mathcal{I }, P^\mathcal{I }, L^\mathcal{I })$ is a DTMC built over the metric space $([\mathbb{R }^n]_{X}, \delta )$:
- Set of initial states $S^\mathcal{I }_0= \{ s \in S\; | \; s \models \varPsi \}$;
- Set of states $S^\mathcal{I }=\{ s \in S \; | \; \exists s_0 \in S^\mathcal{I }_0 : s_0 \longrightarrow ^{t} s \}$;
Reward assignment function for each variable $x_{i\in \{1,\dots ,n\}}$ is $\rho _i(s)=s[i] - s_0[i], \; $ with $s \in S^\mathcal{I }$ and $s_0 \in S^\mathcal{I }_0 $, s.t. $\forall s^{\prime }_0 \in S^\mathcal{I }_0, \; \pi =s_0 \longrightarrow \dots \longrightarrow s, \; \pi ^{\prime } =s^{\prime }_0 \longrightarrow \dots \longrightarrow s : \; duration(\pi ) \le duration(\pi ^{\prime }) $.

Let us remark that the reward associated with any initial state $s_0 \in S^\mathcal{I }_0$ is by definition $0$ (i.e., $\rho _i(s_0)=0, \; i \in \{1,\dots , n\}$).

An impact model can be synthesized from a set of traces $TR$ associated to an initial condition (e.g., a change). In this set, each trace is of the form $\langle s_1, \ldots , s_m \rangle $, has a length that corresponds to the time frame of the resulting impact model, the initial state satisfies the initial condition $(s_1 \models \varPsi )$, and each state $s_{j \in \{1,\ldots ,m \} } \in [\mathbb{R }^n]_{X}$ contains reward assignments that correspond to the difference between the values of the variables in the first state of the trace and the (quantized) values of the variables measured at instant $j$ (i.e., the impact of initial condition on variables at time instant $j$). In particular, let $TR_{s_j}=\{ tr \in TR \; | \; s_j \in tr \}$ be the set of all traces in which the same arbitrary state $s_j$ is observed. We define the impact in a trace $k$ for variable $i$ as:

$$\begin{aligned} impact^k_i(s_j)=quant(x^k_i(\tau j))-s^k_1[i]; \; \; \mathrm{with}~i \in \{1,\ldots ,n\}, \; j \in \{1,\ldots ,m\} \end{aligned}$$

where $x^k_i(\tau j)$ indicates the value of variable $x_i$ at time instant $j$ in trace $k$ (always the same for all traces in $TR_{s_j}$), and $s^k_1[i]$ is the quantized value of variable $x_i$ in the first state of the trace. The reward assignment for the variable $x_i$ in a state $s_j$ is computed as the average of all impacts for all observed occurrences of state $s_j$ in different traces:

$$\begin{aligned} \rho _i(s_j)=\frac{\displaystyle \sum \nolimits ^{|TR_{s_j}|}_{k=1} impact^k_i(s_j)}{|TR_{s_j}|} \end{aligned}$$
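The two formulas above can be illustrated with a short Python sketch. It assumes a uniform quantizer with one step size per variable (the actual definition of $quant$ is not given here, so this is a hypothetical choice) and traces given as lists of raw measurement tuples. Each trace contributes once per distinct state, mirroring the sum over $TR_{s_j}$.

```python
def quant(value, step):
    """Hypothetical uniform quantizer: snap `value` to the nearest multiple of `step`."""
    return round(value / step) * step


def impact_rewards(raw_traces, steps):
    """Average impact per state and variable over the traces containing that state.

    raw_traces: list of traces; each trace is a list of tuples of raw measurements.
    steps:      one quantization step per variable (an illustrative assumption).
    Returns a dict mapping each quantized state to its tuple of rewards.
    """
    sums, hits = {}, {}
    for raw in raw_traces:
        trace = [tuple(quant(v, st) for v, st in zip(sample, steps)) for sample in raw]
        first = trace[0]  # quantized first state of the trace, s_1^k
        for state in set(trace):  # one contribution per trace in TR_{s_j}
            acc = sums.setdefault(state, [0] * len(steps))
            for i, (x, x0) in enumerate(zip(state, first)):
                acc[i] += x - x0  # impact_i^k(s_j)
            hits[state] = hits.get(state, 0) + 1
    return {state: tuple(total / hits[state] for total in acc)
            for state, acc in sums.items()}
```

With this construction the first state of every trace contributes an impact of zero for itself, which is consistent with the remark that initial states carry reward $0$ as long as they are not revisited from a different starting point.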

Expected impact over a set of system or environment variables, in a time instant that lies within the time frame associated with the model, can be directly checked through PRCTL formulas making use of the instantaneous reward operator $\mathcal{Y }^t_{\bowtie r}(\varPhi )$ introduced previously in this section. Concretely, the instantaneous $i$-th reward for a time instant $t_x$ can be quantified as $Ir^i(true,t_x)$ directly on an impact model.

Example 9

Figure 8 depicts a simple DMRM impact model for Znn.com that describes the evolution of variables $\mathtt{expRspTime}$ and $\mathtt{totalCost}$ for the time frame $[0,2]$ (we assume that $\tau =1$ for the sake of clarity). Every state displays the reward values assigned in that state by $\rho _1$ and $\rho _2$, which correspond to $\mathtt{expRspTime}$ and $\mathtt{totalCost}$, respectively. Transitions are labeled with their associated probability. Focusing on response time, we can compute the expected impact in time instants $1$ and $2$ as:

Based on these values, we can check properties expressed in PRCTL such as: $\mathcal{Y }^{2,1}_{\le -100}(true)$. This property reads as: “the expected reduction in response time after 2 seconds will be at least 100 milliseconds”.
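A check of this kind can be sketched in Python by propagating the state distribution of the model step by step and weighting the rewards by the resulting probabilities. The transition probabilities and rewards used in the usage note are made up for illustration and are not the values of Figure 8; reading the superscript of $\mathcal{Y }^{2,1}$ as "time instant 2, first reward" is inferred from the accompanying text, and $\tau =1$ is assumed so that steps and seconds coincide.

```python
def expected_instantaneous_reward(initial_dist, probabilities, reward, steps):
    """Expected reward after `steps` transitions of a DTMC with state rewards.

    initial_dist:  dict state -> probability of starting in that state.
    probabilities: dict state -> dict successor -> transition probability.
    reward:        dict state -> reward value (e.g. impact on response time).
    """
    dist = dict(initial_dist)
    for _ in range(steps):
        nxt = {}
        for state, mass in dist.items():
            for succ, p in probabilities[state].items():
                nxt[succ] = nxt.get(succ, 0.0) + mass * p
        dist = nxt
    return sum(mass * reward[state] for state, mass in dist.items())


def satisfies_upper_bound(initial_dist, probabilities, reward, steps, bound):
    """True if the expected instantaneous reward at `steps` is at most `bound`."""
    return expected_instantaneous_reward(initial_dist, probabilities, reward, steps) <= bound
```

For a hypothetical model in which `s0` moves to `s1` or `s2` with probability 0.5 each, `s1` moves to `s3`, and `s2` moves to `s3` or `s4` with probability 0.5 each, with response-time rewards of 0, -100, -50, -150 and -50 for `s0` to `s4`, the expected impact is -75 at instant 1 and -125 at instant 2, so a bound of -100 at instant 2 would hold.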

About this article

Cite this article

Cámara, J., de Lemos, R., Vieira, M. et al. Architecture-based resilience evaluation for self-adaptive systems. Computing 95, 689–722 (2013). https://doi.org/10.1007/s00607-013-0311-7

Received: 22 May 2012
Accepted: 10 February 2013
Published: 14 March 2013
Issue Date: August 2013
DOI: https://doi.org/10.1007/s00607-013-0311-7
