1 Introduction

System administrators, network engineers, and IT managers can learn much about the vulnerabilities of a cyber system by investigating and analysing analytical attack graphs (AAGs) [35]. An AAG provides a graph-based representation that describes ways by which an attacker can achieve progress towards a desired goal, a.k.a. a crown jewel, in a digital environment given an entry point, e.g., using social engineering. An AAG consists of logical rule nodes, fact nodes, and derived fact nodes. Given an AAG, different types of analyses can be performed to identify paths to reach a target goal, measure the vulnerability of the network, and gain insights on how to optimize the efforts to secure it. Several tools allow users to extract AAGs from their network, analyze the AAGs, and provide relevant reports for system administrators and IT managers, e.g., [11, 22, 30, 34, 44].

Yet, as the size of the AAGs representing real-world organizational networks may be very large, existing analyses are incomplete or very slow, to the extent that they become impractical and hinder the widespread adoption of the technology. One such analysis is the detection of the AAG's rules that one should change in order to protect the crown jewels [11].

In this paper, we introduce and show how to compute an AAG's defense core: a locally minimal subset of the AAG's rules whose removal will prevent an attacker from reaching a crown jewel. Most importantly, in order to scale up the detection of a defense core, we introduce a novel application of the well-known notion of bisimulation to AAGs.

Bisimulation is a binary relation between the nodes of two graphs that applies when the nodes have similar topological properties. Specifically, computing a maximum bisimulation between a graph and itself allows one to find a smaller representation of the graph that preserves important topological properties. Bisimulation is well-studied in theoretical computer science and has important applications in formal verification [3, 7]. Multiple algorithms exist to compute a bisimulation relation [8, 9, 36]. To the best of our knowledge, we are the first to apply bisimulation to AAGs. The result of computing a maximum bisimulation of an AAG is a compact representation we call an AAG-fold.

Given an AAG, a defense core is a minimal subset of the AAG's rules whose removal will make the system safe, i.e., prevent an attacker from reaching a crown jewel. However, computing a defense core is computationally expensive and (as we show in our experiments) can become very slow or practically impossible on large AAGs. Thus, rather than computing it on the original AAG directly, we compute it on its AAG-fold. The correctness of our work relies on the fact that AAG-folds preserve attacks, which we formalize and prove in Sect. 5. The scalability of our work relies on (1) the ability to compute the AAG-fold efficiently and (2) that in practice, as we show in our experiments, the AAG-fold is typically much smaller than the original AAG.

Finally, note that defense minimality is important because system changes corresponding to logical rule change or removal may be expensive or technically difficult to apply. Typically, many different defense cores may exist and computing the minimal one is too expensive. As a pragmatic solution, we compute a locally minimal subset, one which may be larger than other possible cores but does not itself include any redundant rules. To compute it, we use a variant of the well-known minimization algorithm QuickXplain [18].

We implemented our ideas as an extension to AgiSC, developed by Accenture Labs and used by Accenture Security as part of its IT and consulting services to clients. The tool uses Datalog to represent facts and derivation rules about the system. Our experiments show that a direct approach to computing a defense core does not scale. They also show that the use of bisimulation results in significantly smaller graphs and in faster and scalable defense-core computations.

2 Illustrative Example

We use an example to semi-formally illustrate and motivate the use of AAGs  and the problem of computing AAG defense cores. See Sect. 3.3 for a formal definition of an AAG based on Datalog specifications, and for an example of Datalog facts, rules, and a fact derivation. We consider a network with n domain users, each with a personal workstation and a server. All users belong to a special domain group with remote desktop protocol (RDP) access to all servers. All servers can use a network file sharing protocol (SMB) to share files and request services from the domain controller. One of the servers, namely server1, contains a local privilege escalation vulnerability, and the domain admin admin@example.domain is logged into the server.

Consider the following attack scenario. An attacker compromises a personal workstation using a social engineering technique that steals the user credentials. The attacker then connects to server1 by logging into it via RDP with the stolen user credentials. The attacker escalates her privileges on server1 using a local privilege escalation vulnerability. Then, the attacker uses a hacking tool to find the credentials of the logged-in domain admin. Finally, the attacker uses a Windows OS procedure with the stolen domain admin credentials to log on to the domain controller via the SMB protocol with administrative privileges. From the domain controller the attacker effectively gains complete control over the network, and so the domain is compromised.

Fig. 1. An AAG (L), an excerpt of it (TR), and a folding of “hasAccount” (BR).

Figure 1 shows an example of an AAG of a small network. Graph nodes shaped as circles, triangles, and rectangles represent facts, implication rules, and derived facts respectively. Rules are numbered and facts have only predicate names to avoid clutter. The blue dashed rectangles depict two nodes with the same label, namely hasAccount. The red rectangle highlights the excerpt that appears on the top right figure. The red circles highlight a fact and a rule that are relevant to the discussion below.

Figure 1 (left) shows a visualization of the AAG that captures the scenario described above when the network has one domain user, i.e., \(n=1\). This AAG is automatically generated by a solver that takes a set of facts capturing the initial configuration of the system; a Datalog specification that specifies predicates and logical implication rules; and a target goal. The solver produces an AAG that shows every possible derivation of the goal, i.e., every possible attack.

Consider the fact isDC(‘domain_controller’,‘example.domain’) in Fig. 1 (left) at the bottom circle, represented by its predicate name isDC, which is included in the initial configuration. It encodes that domain_controller is the domain controller for example.domain. Consider the following implication rule that encodes that if an attacker gains elevated code execution privileges on a domain controller, then the domain is compromised: domainCompromised(Domain) :- execCodeElevated(_, DC), isDC(DC, Domain)

This rule is applied as the last step to derive the attack goal. The rule and the goal are represented in Fig. 1 (left) by triangle 56 and the blue rectangle respectively.

One common use of AAGs is to mitigate potential cyber threats. Different mitigation strategies exist: some focus on rectifying specific facts [1, 10, 15, 41], while others focus on blocking potential lateral moves, represented by Datalog rules, by installing security controls [11, 12]. In this work, we focus on the latter and search for a set of rules whose removal prevents all attacks towards the goal.

Consider the rule labeled 45 in our example. It encodes that if two hosts are connected via an SMB protocol and an attacker has elevated privileges on one, then the attacker can gain elevated privileges on the other. Removing this Datalog rule means that we prevent its applications, i.e., all rule nodes labeled 45 in the graph. As a result, the goal is no longer reachable, and thus the graph ceases to show possible attacks.

A major obstacle when searching for a defense core relates to the size of the AAG, which dramatically increases with the size of the network. In our example, scaling up the network by increasing the number of users \(n=1, 5, 50\) yields AAGs with 135, 2565, and \(\approx 1.3M\) nodes and edges respectively. Thus, as the network grows, finding a defense becomes computationally expensive and slow, if not impractical. We observe that while the number of possible attacks increases with the network size, many of the attacks share the same structure. Thus, we use bisimulation, and fold graph nodes with similar labels and graph topology. The motivation to use bisimulations relies on the fact that an AAG typically depicts many different attacks that differ only in agents and machines that share similar properties.

Figure 1 (bottom right) shows a visualization of a portion of the folded attack graph. Although the folded graph is smaller, the original graph and the folded one exhibit the same set of labeled paths. For example, the two nodes labelled hasAccount in the AAG that appear in Fig. 1 (top right) have the same label and graph topology, e.g., the same outgoing edges to the equally labeled nodes. Both are folded into one node in the bottom right figure. The 135 nodes and edges of the AAG are represented by 112 nodes and edges in the folded representation.

By folding the graphs of our example for \(n=1, 5, 50\), the original graphs of 135, 2565, and \(\approx 1.3M\) nodes and edges are reduced to graphs with 112, 252, and 252 nodes and edges resp. Interestingly, the folded representation does not change when increasing n from 5 to 50, as in this case the newly added nodes are folded into existing folded nodes. The difference between a graph of size 1.3M and a graph of size 252 makes our work scalable. In this paper we leverage folding to achieve faster defense analysis time for large AAGs.

3 Preliminaries

3.1 Monotonic Criteria and Cores

Given a set T, and a monotonic criterion on subsets of T, a core is a local minimum that satisfies the criterion. Formally:

Definition 1

(Monotonic criterion). A Boolean criterion over subsets of T is monotonic iff for any two sets A, B such that \(A\subseteq B\subseteq T\), if A satisfies the criterion then B satisfies the criterion.

Definition 2

(Core). Given a set T and a monotonic criterion over its subsets, a set \(C\subseteq T\) is a core of T iff C satisfies the criterion, and all its proper subsets \(C'\subset C\) do not satisfy the criterion.

Note that multiple cores may exist, not all of them minimal in size. There are several well-known domain-agnostic algorithms that compute a core, given a method that computes a monotonic criterion. We chose QuickXplain as our core computation algorithm. It has a worst-case complexity of \(O(k+k\log (|T|/k))\), where T is the minimized set, and k is the size of the largest core. See Sect. 7 for a discussion of core computation algorithms.
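To make Definitions 1 and 2 concrete, the following is a minimal Python sketch of a QuickXplain-style divide-and-conquer core computation. It is an illustration under stated assumptions rather than the variant used in our implementation: the names `find_core`, `satisfies`, and `demo_criterion` are hypothetical, and the criterion is assumed to be monotonic and to hold on the full set.

```python
from typing import Callable, List, Sequence


def find_core(items: Sequence, satisfies: Callable[[frozenset], bool]) -> List:
    """Return a core of `items` for a monotonic criterion (QuickXplain-style)."""
    if not satisfies(frozenset(items)):
        raise ValueError("the full set does not satisfy the criterion")
    if satisfies(frozenset()):
        return []  # the empty set already satisfies the criterion

    def qx(background: List, has_delta: bool, candidates: List) -> List:
        # If the items added to the background since the previous call already
        # satisfy the criterion, nothing from `candidates` is needed.
        if has_delta and satisfies(frozenset(background)):
            return []
        if len(candidates) == 1:
            return list(candidates)
        mid = len(candidates) // 2
        left, right = candidates[:mid], candidates[mid:]
        # Elements needed from the right half, assuming the whole left half.
        need_right = qx(background + left, True, right)
        # Elements needed from the left half, given only what the right needs.
        need_left = qx(background + need_right, bool(need_right), left)
        return need_left + need_right

    return qx([], False, list(items))


def demo_criterion(subset: frozenset) -> bool:
    # Hypothetical monotonic criterion with two cores: {2, 5} and {1, 3, 4}.
    return {2, 5} <= subset or {1, 3, 4} <= subset


core = find_core([1, 2, 3, 4, 5, 6], demo_criterion)
```

The returned list is one core; which of the existing cores is found depends on the order of the input items.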

3.2 Bisimulation Relations

Bisimulation relations are equivalence relations on nodes that share topological properties. They can be extended by labels on nodes and edges that distinguish between different types of nodes and edges. Here we use strong bisimulation with labeled nodes [28]. This allows more succinct representations of graphs while keeping certain properties.

Two nodes \(v_1\) and \(v_2\) of a directed graph may be equivalent in terms of the paths starting from them. The equivalence of the paths means that for each path starting from \(v_1\) there is a path starting from \(v_2\) composed of equivalent nodes by the same relation, and vice versa. For example, every two leaves (i.e., nodes with no outgoing edges), may be considered equivalent.

Such an equivalence relation is called a bisimulation. The most fine grained bisimulation is the identity relation, and the most coarse is called a maximum bisimulation. We borrow the formal definitions and propositions from [8].

Definition 3

(Bisimulation relation). Given graphs \(G_1=\langle V_1,E_1 \rangle \) and \(G_2=\langle V_2,E_2 \rangle \), a bisimulation between \(G_1\) and \(G_2\) is a relation \(b\subseteq V_1\times V_2\) such that

  1. \((u_1~b~u_2 \wedge \langle u_1, v_1 \rangle \in E_1) \Rightarrow \exists v_2\in V_2 (v_1~b~v_2 \wedge \langle u_2, v_2 \rangle \in E_2)\)

  2. \((u_1~b~u_2 \wedge \langle u_2, v_2 \rangle \in E_2) \Rightarrow \exists v_1\in V_1 (v_1~b~v_2 \wedge \langle u_1, v_1 \rangle \in E_1)\).

Definition 4

(Maximum bisimulation and minimum representation). Given a graph \(G=\langle V,E \rangle \), a maximum bisimulation \(\equiv \) on G is the union of all bisimulation relations between G and itself. The minimum representation of G has nodes \(V/\equiv \) and edges \(\langle [u], [v]\rangle \), s.t. \(\exists u_1\in [u], v_1\in [v] (\langle u_1, v_1 \rangle \in E)\).

Proposition 1

(Uniqueness and bisim. of minimum representations). A maximum bisimulation \(\equiv \) on G always exists. It is a unique equivalence relation over the nodes of G. A graph G and its minimum representation are bisimilar, i.e., there is a bisimulation relation between them.

The definition of a bisimulation relation may be refined to consider labels on nodes. In this case, for example, nodes without outgoing edges are equivalent iff they have the same label.

Definition 5

(Labeled graph bisimulation). Let L be a finite set of labels. Given a labeled graph \(G=\langle V, E, l\rangle \) with \(l: V \rightarrow L\), a labeled bisimulation on G is a bisimulation relation \(b\subseteq V\times V\) on G such that u b v implies \(l(u) = l(v)\).

Definition 4 and Proposition 1 apply as they are to maximum labeled bisimulations and to minimum representations of labeled graphs respectively.

Several algorithms for computing a maximum bisimulation exist. PT is an efficient algorithm suggested by Paige and Tarjan [36], with a complexity of \(O(|E|\log |V|)\) for a graph \(G=\langle V,E \rangle \). We discuss other algorithms in Sect. 7. We used BisPy [2], an open-source project that includes an efficient implementation of PT for maximum bisimulation computation.
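The definitions above can be illustrated with a short Python sketch. Note that it is a naive partition-refinement loop, not the PT algorithm and not BisPy's API: it starts from the partition induced by the labels and repeatedly splits blocks whose nodes reach different sets of blocks, until the partition is stable. The function names and the small example graph are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def maximum_bisimulation(nodes: Iterable[Node], edges: Iterable[Edge],
                         label: Dict[Node, Hashable]) -> Dict[Node, Hashable]:
    """Map each node to the id of its block in the maximum labeled bisimulation."""
    nodes = list(nodes)
    succ: Dict[Node, Set[Node]] = {v: set() for v in nodes}
    for u, v in edges:
        succ[u].add(v)
    block: Dict[Node, Hashable] = {v: label[v] for v in nodes}  # split by label
    while True:
        ids: Dict[tuple, int] = {}
        # Two nodes stay together only if they are in the same block and
        # their successors cover the same set of blocks.
        refined = {
            v: ids.setdefault(
                (block[v], frozenset(block[s] for s in succ[v])), len(ids))
            for v in nodes
        }
        if len(ids) == len(set(block.values())):
            return refined  # no block was split: the partition is stable
        block = refined


def minimum_representation(edges: Iterable[Edge], block: Dict[Node, Hashable]):
    """Quotient graph of Definition 4: one node per block, edges lifted."""
    return set(block.values()), {(block[u], block[v]) for u, v in edges}


# Hypothetical example: a1 and a2 behave alike; a3 has no outgoing edge.
example_edges = [("a1", "r1"), ("a2", "r2"), ("r1", "d"), ("r2", "d")]
example_label = {"a1": "hasAccount", "a2": "hasAccount", "a3": "hasAccount",
                 "r1": "rule", "r2": "rule", "d": "execCode"}
blocks = maximum_bisimulation(example_label, example_edges, example_label)
q_nodes, q_edges = minimum_representation(example_edges, blocks)
```

In this example a1 and a2 fall into one block, whereas a3 shares their label but not their topology and therefore stays separate.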

3.3 Analytical Attack Graphs

An analytical attack graph (AAG) provides a graph-based representation that describes ways by which an attacker can achieve progress towards a specified goal. An influential work by Ou et al. [34] presented MulVal, a framework for generating AAGs based on facts and vulnerabilities that are collected from the organizational network. Facts describe logical and physical entities of the network. They are formally modeled by Datalog predicates. Each predicate is an n-ary relation between such entities. The Datalog statement \(P(arg_1, arg_2, \ldots ).\) states that the literals \(arg_1, arg_2, \ldots \) satisfy the predicate P. Literals are constant strings of characters.

For example, Listing 1 contains two facts. The predicate names of the facts are entryPoint and hasSession.

Derivation rules allow the deduction of new facts from given facts. Essentially, derivation rules derive a fact when a conjunction of facts is detected. The derivation may be a general one, as some of the arguments may be represented by variables. Thus, a single rule is usually applied to many different sets of facts.

The syntax of a Datalog rule is

$$\begin{aligned} P(a_1, a_2, \ldots ) :- P_1(a_{11}, \ldots ), P_2(a_{21}, \ldots ),\ldots , P_n(a_{n1},\ldots ) \end{aligned}$$

where P is the derived predicate name. If predicates \(P_1, P_2,\ldots P_n\) hold, it is possible that several instances of P with some literals are derived, depending on the parameters of \(P_1, P_2,\ldots P_n\). The parameters of predicates in rules are variables (that begin with a capital letter or underscore for variables not essential for the derivation) and literals.

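Listing 1. Two primitive facts, with predicate names entryPoint and hasSession.
Listing 2. A derivation rule for the predicate execCode.
Listing 3. The derived execCode fact.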

For example, Listing 2 shows a rule from one of our Datalog files stating that if the literal of entryPoint and the first literal of hasSession are the same, the predicate execCode with the same parameters as hasSession in reverse order is derivable. In our case, Listing 1 and Listing 2 together indicate the derived fact that appears in Listing 3. Note that Listing 1 and Listing 2 show real Datalog code while Listing 3 is a textual representation of the derived fact.
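As a rough illustration of how a single rule with variables is instantiated over a set of facts, the following Python sketch applies a rule of the shape described above, i.e., execCode with the arguments of hasSession in reverse order whenever the first argument of hasSession also appears in entryPoint. The constants hostA, userB, hostC, and userD and the function name are hypothetical, and the sketch does not reproduce the listings.

```python
from typing import Set, Tuple

Fact = Tuple[str, Tuple[str, ...]]  # (predicate name, tuple of literals)

# Hypothetical primitive facts; the constants are placeholders.
facts: Set[Fact] = {
    ("entryPoint", ("hostA",)),
    ("hasSession", ("hostA", "userB")),
    ("hasSession", ("hostC", "userD")),  # no matching entryPoint fact
}


def derive_exec_code(known: Set[Fact]) -> Set[Fact]:
    """Apply execCode(Y, X) :- hasSession(X, Y), entryPoint(X) once."""
    derived: Set[Fact] = set()
    for predicate, args in known:
        if predicate != "hasSession":
            continue
        first, second = args
        # The shared variable X must match in both preconditions.
        if ("entryPoint", (first,)) in known:
            derived.add(("execCode", (second, first)))
    return derived


derived = derive_exec_code(facts)
```

Only the hasSession fact whose first literal also appears in an entryPoint fact contributes a derived fact.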

Given a Datalog representation of the network, one can use a reasoning engine, e.g., XSB [37], to check whether there exists an attack from the input facts to the target goal. Given primitive facts and rules, which describe the system, the reasoning engine deduces derived facts. Derived facts have the same syntax as primitive facts, i.e., predicates over literals. Given a goal, the reasoning provides two outputs, namely, deciding whether the goal is achievable, and if so, producing information about all possible attacks in the form of an AAG.

An AAG exists iff the goal is achievable. We borrow the definition of an AAG and its semantics from [10].

Definition 6

(Analytical attack graph (AAG)). An analytical attack graph is a structure \(A = \langle N_r, N_f, N_d, E, L, g \rangle \) where \(N_r, N_f, N_d\) are mutually exclusive sets of nodes denoting derivation rules, facts, and derived facts respectively. E is a set of edges that connects facts, either primitive or derived, to derivation rules, and derivation rules to derived facts. Formally, \(E\subseteq ((N_f\cup N_d)\times N_r) \cup (N_r\times N_d)\). L is a mapping from a node to its label, i.e., fact nodes \(N_f\cup N_d\), and rule nodes \(N_r\) are mapped to the facts and rules they represent respectively. Finally, \(g\in N_d\) is the target node. We denote by \(V = N_r\cup N_f\cup N_d\) the set of all the nodes.

For example, an AAG that represents the facts and rules in Listings 1–3 has nodes \(v_1,v_2,v_3,v_4\). The two facts in Listing 1 are the labels \(L(v_1), L(v_2)\) of two nodes \(v_1, v_2 \in N_f\). Lines 2–4 of Listing 2 are the label \(L(v_3)\) of a rule node \(v_3 \in N_r\). Listing 3 is the label \(L(v_4)\) of a derived node \(v_4 \in N_d\). The AAG includes three edges: \((v_1,v_3), (v_2,v_3) \in N_f \times N_r\) and \((v_3,v_4) \in N_r \times N_d\).
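The structure of Definition 6 can be written down directly as a small Python data class. This is only a sketch: the class name `AAG`, the method `is_well_formed`, and the label strings (including the constants hostA and userB) are hypothetical placeholders, while the node names and edges follow the four-node example above.

```python
from dataclasses import dataclass, replace
from typing import Dict, FrozenSet, Tuple


@dataclass
class AAG:
    rule_nodes: FrozenSet[str]           # N_r
    fact_nodes: FrozenSet[str]           # N_f
    derived_nodes: FrozenSet[str]        # N_d
    edges: FrozenSet[Tuple[str, str]]    # E
    labels: Dict[str, str]               # L
    goal: str                            # g

    def is_well_formed(self) -> bool:
        nr, nf, nd = self.rule_nodes, self.fact_nodes, self.derived_nodes
        disjoint = not (nr & nf or nr & nd or nf & nd)
        # E is a subset of ((N_f u N_d) x N_r) u (N_r x N_d).
        edges_ok = all(
            (u in nf | nd and v in nr) or (u in nr and v in nd)
            for u, v in self.edges
        )
        labeled = set(self.labels) == nr | nf | nd
        return disjoint and edges_ok and labeled and self.goal in nd


example = AAG(
    rule_nodes=frozenset({"v3"}),
    fact_nodes=frozenset({"v1", "v2"}),
    derived_nodes=frozenset({"v4"}),
    edges=frozenset({("v1", "v3"), ("v2", "v3"), ("v3", "v4")}),
    labels={
        "v1": "entryPoint('hostA')",            # placeholder literals
        "v2": "hasSession('hostA','userB')",
        "v3": "execCode(Y, X) :- hasSession(X, Y), entryPoint(X)",
        "v4": "execCode('userB','hostA')",
    },
    goal="v4",
)
# A fact-to-derived-fact edge violates the constraint on E.
broken = replace(example, edges=example.edges | {("v1", "v4")})
```

Here `example.is_well_formed()` holds, whereas `broken` is rejected because it contains an edge that bypasses the rule node.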

We defined an AAG to have a single goal. It is always possible to reduce an AAG that represents a set of goals, where either all or at least one of them must be achieved, to the AAG as defined in Definition 6, with additional rules.

An AAG is a special case of an And/Or graph [27], where each rule node instantiates only one (derived) fact. The semantics of an AAG is that derived facts are supported by a rule and facts that imply the derived fact in accordance with the rule. Formally:

Definition 7

(AAG semantics). For every \(v_r\in N_r\) and \(v_d\in N_d\) s.t. \(\langle v_r,v_d\rangle \in E\), it holds that \(\wedge _{\langle v,v_r \rangle \in E}L(v)\rightarrow {L(v_d)}\) is an instance of the rule \(L(v_r)\).

An AAG indicates that the goal can be achieved. Since the reasoning engine deduces all possible derivable facts, an AAG represents all possible attacks, possibly including circular ones, which occur when two facts contribute to the deduction of each other. An attack or attack plan is intuitively a single attack scenario. Explicitly, it is a subgraph of the AAG that contains the goal node, in which each derived fact node has an incoming degree 1, and each rule node is satisfied by its preconditions. For a formal definition of attack plans see [10].

4 The Defense Problem and a Naive Defense Algorithm

We now describe the defense problem, i.e., finding a subset of the rules whose removal prevents all possible attacks on the goal. We call the remaining set of rules safe, and define safe sets first. We then show a naive defense algorithm, i.e., one that uses an AAG directly.

Definition 8

(Safe sets of rules). Given an AAG \(A = \langle N_r, N_f, N_d, E, L, g \rangle \), we denote its set of rules by \(R = \{L(v_r) \mid v_r\in N_r\}\). A subset \(R'\subseteq R\) is safe if no subgraph of A with a set of rules restricted to \(R'\) and the same goal g is an AAG. A subset \(R'\subseteq R\) is maximally safe if it is safe and every \(R''\) such that \(R'\subset R''\subseteq R\) is not safe.

Note that there may exist more than one maximally safe set of rules.

A defense-set is the complement of a safe set of rules. Formally,

Definition 9

(Defense-sets). Given the notation of Definition 8, a subset \(R'\subseteq R\) is a defense-set iff \(R\setminus R'\) is safe.

We define a defense problem: Given an AAG as input, output a defense-set.

The duality of locally maximal sets satisfying a property, and their complements being locally minimal sets the removal of which satisfies the property is trivial and well-known [21].

The direct way to compute a defense-set is to apply a domain-agnostic core computation algorithm to a set of rules of the AAG. This requires a method that computes the defense-set criterion, i.e., given a set of rules, decides if it is a defense-set. We implemented a naive defense-set check algorithm. We call the application of QuickXplain to a set of rules with check as the monotonic criterion computation AAG-Defense. Since the criterion is evidently monotonic, the correctness of the algorithm follows. Moreover, since QuickXplain ensures a core, the complement of the obtained defense-set is maximally safe.
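One plausible way to realize such a check, sketched below in Python, is a fixpoint computation over the AAG: primitive facts are known, a rule node fires when its rule has not been removed and all of its incoming facts are known, and a derived fact becomes known once any rule node derives it. This is an assumption about how a naive check could look, not a description of our implementation; the function names and the toy graph (rules A, B, C) are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def goal_reachable(primitive_facts: Set[str], rule_of: Dict[str, str],
                   edges: Iterable[Tuple[str, str]], goal: str,
                   removed_rules: Set[str]) -> bool:
    """Is `goal` still derivable when the rules in `removed_rules` are disabled?"""
    inputs: Dict[str, Set[str]] = {n: set() for n in rule_of}
    outputs: Dict[str, Set[str]] = {n: set() for n in rule_of}
    for u, v in edges:
        if v in rule_of:
            inputs[v].add(u)    # fact -> rule node
        else:
            outputs[u].add(v)   # rule node -> derived fact
    known = set(primitive_facts)
    changed = True
    while changed:
        changed = False
        for node, rule in rule_of.items():
            # A rule node needs its rule to be present and all inputs known.
            if rule in removed_rules or not inputs[node] <= known:
                continue
            new_facts = outputs[node] - known
            if new_facts:
                known |= new_facts
                changed = True
    return goal in known


def is_defense_set(primitive_facts, rule_of, edges, goal, removed_rules) -> bool:
    return not goal_reachable(primitive_facts, rule_of, edges, goal, removed_rules)


# Toy AAG with two attacks on g: f1 -A-> d1 -B-> g and f2 -C-> g.
toy_rules = {"n1": "A", "n2": "B", "n3": "C"}
toy_edges = [("f1", "n1"), ("n1", "d1"), ("d1", "n2"), ("n2", "g"),
             ("f2", "n3"), ("n3", "g")]
toy_facts = {"f1", "f2"}
```

In the toy graph, removing only B leaves the attack through C intact, while {B, C} and {A, C} both block the goal. The check is monotonic in the removed set, so it can serve as the criterion of a core computation.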

5 Applying Bisimulation to Attack Graphs, and a Fast Defense Algorithm

An AAG may be folded using a bisimulation, which generates a succinct representation of it. The succinct representation may be helpful for various purposes, such as comprehension and computational efficiency of analyses.

We now present our contribution. In Sect. 5.1 we introduce the notion and the semantics of an AAG-fold which represents an AAG that has been folded using a bisimulation. We then prove that the AAG-fold semantics must hold. In Sect. 5.2 we introduce a faster defense algorithm, namely, AF-Defense.

5.1 Folding an AAG

To fold an AAG, we define a labeled bisimulation based on the predicate names, and ignore the arguments of primitive and derived facts. We first define an abstraction function for predicates and rules.

Definition 10

(Abstraction function). The function abs ignores arguments in fact and rule labels. For a fact label \(l_1:=``P(a_1, a_2, \ldots )"\) let \(abs(l_1) = ``P"\). For a rule label \(l_2:=``P(a_1, a_2, \ldots ) :- P_1(a_{11}, \ldots ), P_2(a_{21}, \ldots ),\ldots , P_n(a_{n1},\ldots )"\) we define \(abs(l_2) = ``P :- P_1, P_2,\ldots , P_n"\).

For example, the abstractions of the two facts in Listing 1 are the names of the predicates, namely entryPoint and hasSession. The abstraction of the rule in Listing 2 lines 2–4 is execCode :- hasSession, entryPoint.
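A minimal Python sketch of the abstraction function is given below. It assumes that labels are plain strings in the Datalog syntax used above and that argument lists contain no nested parentheses; the name `abs_label` is hypothetical (chosen to avoid shadowing Python's built-in `abs`).

```python
import re


def abs_label(label: str) -> str:
    """Drop every parenthesized argument list from a fact or rule label."""
    # Assumes arguments contain no nested parentheses.
    without_args = re.sub(r"\([^()]*\)", "", label)
    # Normalize whitespace and drop a trailing Datalog period, if any.
    return " ".join(without_args.split()).rstrip(".")


fact_abstraction = abs_label("isDC('domain_controller','example.domain')")
rule_abstraction = abs_label(
    "domainCompromised(Domain) :- execCodeElevated(_, DC), isDC(DC, Domain)"
)
```

Applied to the fact and the rule of Sect. 2, the sketch yields the predicate name isDC and the abstracted rule domainCompromised :- execCodeElevated, isDC.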

Next, we define an AAG-fold to be the minimum representation of an AAG under a maximum labeled bisimulation that abstracts the labels of fact nodes but not those of rule nodes. That is, apart from agreeing on the topology of the graph, two rule nodes can become equivalent only if they carry exactly the same rule, whereas two fact nodes can become equivalent if they have the same predicate name (regardless of the arguments). Formally:

Definition 11

(AAG-fold). Let \(A = \langle N_r, N_f, N_d, E, L, g \rangle \) be an AAG. We define a label function l over nodes \(N_r\cup N_f \cup N_d\) as follows. \(\forall v\in N_r~ l(v)=L(v)\), and \(\forall v\in N_f \cup N_d~ l(v)=abs(L(v))\). We apply the unique maximum labeled bisimulation relation \(\equiv \) to obtain an AAG-fold \(AF = \langle N_r/\equiv , N_f/\equiv , N_d/\equiv , E_\equiv , L_\equiv , [g] \rangle \) where \(E_\equiv \) is defined in accordance with the edges of the minimum representation in Definition 4, and \(L_\equiv \) abstracts both rule nodes and fact nodes, namely, \(\forall v\in N_r\cup N_f \cup N_d~ L_\equiv ([v])=abs(L(v))\).

The maximum labeled bisimulation relation \(\equiv \) exists and is unique according to Proposition 1. Note that this equivalence relation ranges over the whole set of nodes V. It is possible that two nodes, one from \(N_f\) and one from \(N_d\), are equivalent. The quotient set \(N_d/\equiv \) uses the restriction of the relation to the set \(N_d\) (and similarly for \(N_f\)). Note that the label function \(L_\equiv \) is well defined. First, rule nodes use the syntax of the original rule as their label for the purpose of bisimulation. This means that only rules with the exact syntax may become equivalent. Thus, they must have the same abstraction. Second, fact nodes that become equivalent must have the same predicate name, which is also their abstraction. Thus the definition of \(L_\equiv \) for equivalent fact nodes must agree.
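The folding of Definition 11 can be sketched in Python as follows. The sketch uses a naive refinement loop rather than PT, and everything in the example is hypothetical: the function names, the node names, and the predicates canLogIn and compromised. Rule nodes keep their full label for the purpose of bisimulation, and fact nodes (primitive and derived) are labeled by their predicate name only.

```python
from typing import Dict, Iterable, Set, Tuple


def abs_fact(label: str) -> str:
    """Predicate name of a fact label, e.g. "hasAccount('u1','s')" -> "hasAccount"."""
    return label.split("(", 1)[0].strip()


def fold(rule_nodes: Iterable[str], fact_nodes: Iterable[str],
         edges: Iterable[Tuple[str, str]], L: Dict[str, str]):
    """Return (node -> block id, fold nodes, fold edges)."""
    rule_nodes, fact_nodes, edges = list(rule_nodes), list(fact_nodes), list(edges)
    nodes = rule_nodes + fact_nodes
    # Label function l of Definition 11.
    block = {v: L[v] for v in rule_nodes}
    block.update({v: abs_fact(L[v]) for v in fact_nodes})
    succ: Dict[str, Set[str]] = {v: set() for v in nodes}
    for u, v in edges:
        succ[u].add(v)
    while True:
        ids: Dict[tuple, int] = {}
        refined = {
            v: ids.setdefault(
                (block[v], frozenset(block[s] for s in succ[v])), len(ids))
            for v in nodes
        }
        if len(ids) == len(set(block.values())):
            break  # stable partition: the maximum labeled bisimulation
        block = refined
    fold_edges = {(refined[u], refined[v]) for u, v in edges}
    return refined, set(refined.values()), fold_edges


# Two users with the same attack structure against one server.
L = {
    "f1": "hasAccount('u1','server1')", "f2": "hasAccount('u2','server1')",
    "r1": "canLogIn(U, H) :- hasAccount(U, H)",
    "r2": "canLogIn(U, H) :- hasAccount(U, H)",
    "d1": "canLogIn('u1','server1')", "d2": "canLogIn('u2','server1')",
    "r3": "compromised(H) :- canLogIn(U, H)",
    "r4": "compromised(H) :- canLogIn(U, H)",
    "g": "compromised('server1')",
}
aag_edges = [("f1", "r1"), ("r1", "d1"), ("d1", "r3"), ("r3", "g"),
             ("f2", "r2"), ("r2", "d2"), ("d2", "r4"), ("r4", "g")]
block_of, fold_nodes, fold_edges = fold(
    ["r1", "r2", "r3", "r4"], ["f1", "f2", "d1", "d2", "g"], aag_edges, L)
```

The nine nodes and eight edges of this toy AAG fold into five nodes and four edges, since the two users induce structurally identical attack paths.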

The semantics of the AAG-fold is slightly different than that of an AAG:

Definition 12

(AAG-fold semantics). Let \(AF = \langle N_r, N_f, N_d, E, L, g \rangle \) be an AAG-fold. For every \(v_r\in N_r\) and \(v_d\in N_d\) s.t. \(\langle v_r,v_d\rangle \in E\), then \(\wedge _{\langle v,v_r \rangle \in E}L(v)\) implies \(L(v_d)\) according to rule \(L(v_r)\).

Note that the implication involves only predicate names rather than predicates with literals, which is more relaxed. For example, \(P_1\wedge P_1\wedge P_2\) and \(P_1\wedge P_2\wedge P_2\) are equivalent, although they are not the same formula. The difference is that the number of instances of the same predicate name may vary in the AAG-fold, and we only require that one of each predicate name of the rule appears as a support. We discuss the effect of this in detail below.

We now prove our main claim, namely, that the AAG-fold of any AAG must adhere to AAG-fold semantics.

Theorem 1

(An AAG-fold adheres to AAG-fold semantics). Given an AAG \(A = \langle N_r, N_f, N_d, E, L, g \rangle \), AF as defined in Definition 11 has AAG-fold semantics as defined in Definition 12.

Proof

Let \(v'_r \in N_r/\equiv \) and \(v'_d \in N_d/\equiv \) be nodes satisfying \(\langle v'_r, v'_d \rangle \in E_\equiv \). By Definition 4 there are \(v_r\in v'_r\) and \(v_d\in v'_d\) s.t. \(\langle v_r, v_d \rangle \in E\). By Definition 5, all elements of \(v'_r\) have the same bisimulation label \(r=l(v_r)\). Bisimulation labels of rule nodes are not abstracted (Definition 11), thus \(r=L(v_r)\) is an AAG rule. According to AAG semantics, \(\wedge _{\langle v,v_r \rangle \in E}L(v)\rightarrow {L(v_d)}\) is an instance of r (see Definition 7). By Proposition 1 A and AF are bisimilar, and according to Definition 3, for each v s.t. \(\langle v,v_r \rangle \in E\) there is a (not necessarily unique) \(v'\) s.t. \(\langle v',v'_r \rangle \in E_\equiv \), and s.t. v and \(v'\) satisfy the bisimilarity between A and AF. By Definition 11, the \(L_\equiv (v')\) are abstractions of the facts of their corresponding L(v), which, in turn, satisfy r. Thus, \(\wedge _{\langle v',v'_r \rangle \in E_\equiv }L_\equiv (v')\) implies \(L_\equiv (v'_d)\) according to the abstracted rule \(L_\equiv (v'_r)\). This satisfies the semantics of \(AAG_\equiv \) according to Definition 12.

The rationale of Theorem 1 implies that every attack plan of the AAG has a corresponding attack scenario in the AAG-fold, obtained by translating nodes and edges of the attack plan to their counterparts in the AAG-fold. Essentially this means that an AAG-fold maintains all possible attacks on the goal node.

The preconditions supporting a derived fact in the AAG match the list of predicates appearing in the rule, including multiple appearances of the same predicate. In the AAG-fold, however, we only need one node of each predicate name (in the rule’s preconditions) to deduce the node. This may have the following consequences. First, a rule node may have more incoming edges than the preconditions in its declaration. This can occur if not all incoming predicate nodes that share a label were folded, due to topological differences. Second, if an AAG rule for deriving predicate \(P_1\) requires two different instances of \(P_2\) (i.e., \(P_1 (\ldots ) :- P_2 (\ldots ),P_2 (\ldots ), \ldots \)), two different instances of \(P_2\) must appear in the AAG. However, as nodes may get folded in the AAG-fold, the two incoming instances may merge in the AAG-fold. This still complies with the semantics of the folded rule. The idempotency of the conjunction ensures that one instance of \(P_2\) is enough for both instances of \(P_2\) as they are indistinguishable without their arguments.

Note that the converse of Theorem 1 does not hold. For example, if we need predicates \(P_1\) and \(P_2\) in order to derive P, the AAG-fold may contain many instances of \(P_1\) and \(P_2\) nodes leading to the rule node. However, not all pairs of \(P_1\) and \(P_2\) represent AAG nodes that match the rule, so not every such pair necessarily supports the derived node. Thus, an AAG-fold which depicts an attack on the goal node does not necessarily indicate that the AAG it was produced from has a corresponding attack plan.

5.2 The AF-Defense Algorithm

Algorithm AF-Defense improves the naive approach of AAG-Defense. It first computes an AAG-fold of the AAG by applying the PT Algorithm (see Sect. 3.2 and Definition 11). Next, it applies QuickXplain on the AAG-fold to find a core, which it returns as a defense-set. The same check operation (described in the appendix available in the extended version of the paper), which is required for the QuickXplain algorithm, is applied to the AAG-fold, and uses the semantics of the AAG-fold instead of the semantics of AAGs.
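The difference between the two semantics shows up in the check. The Python sketch below is one possible reading of a check under AAG-fold semantics, not the check operation of the extended version: a rule node fires when, for every predicate name in the body of its abstracted rule, at least one known incoming fact node carries that name. The function name, the node names, and the hand-built fold-like graph are hypothetical; rule labels are assumed to have the form "P :- P1, P2".

```python
from typing import Dict, Iterable, Set, Tuple


def fold_goal_reachable(primitive: Set[str], name_of: Dict[str, str],
                        rule_of: Dict[str, str],
                        edges: Iterable[Tuple[str, str]], goal: str,
                        removed_rules: Set[str]) -> bool:
    """Goal derivability in an AAG-fold under the relaxed (per-name) semantics."""
    inputs: Dict[str, Set[str]] = {n: set() for n in rule_of}
    outputs: Dict[str, Set[str]] = {n: set() for n in rule_of}
    for u, v in edges:
        if v in rule_of:
            inputs[v].add(u)
        else:
            outputs[u].add(v)
    known = set(primitive)
    changed = True
    while changed:
        changed = False
        for node, rule in rule_of.items():
            if rule in removed_rules:
                continue
            body = {p.strip() for p in rule.split(":-")[1].split(",")}
            # One known supporter per predicate name is enough.
            supported = {name_of[f] for f in inputs[node] & known}
            if body <= supported and not outputs[node] <= known:
                known |= outputs[node]
                changed = True
    return goal in known


# Hand-built graph: h1 (primitive) and h2 (derived) share the name hasAccount
# but differ in topology, so both feed the rule node rB.
name_of = {"e": "entryPoint", "h1": "hasAccount", "h2": "hasAccount",
           "g": "execCode", "o": "other"}
rule_of = {"rA": "hasAccount :- entryPoint",
           "rB": "execCode :- hasAccount, entryPoint",
           "rC": "other :- hasAccount"}
fold_edges = [("e", "rA"), ("rA", "h2"), ("h1", "rB"), ("h2", "rB"),
              ("e", "rB"), ("rB", "g"), ("h2", "rC"), ("rC", "o")]
primitive = {"e", "h1"}
```

Removing the rule hasAccount :- entryPoint makes h2 underivable, yet the goal remains reachable because h1 still supports the predicate name hasAccount. This relaxation is what can make the computed defense-set larger than strictly necessary for the original AAG.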

From Theorem 1 follows the correctness of AF-Defense.

Theorem 2

(Correctness of AF-Defense). Algorithm AF-Defense computes a defense-set.

Proof

Assume by contradiction that the computed set of rules \(R'\) is not a defense-set, thus the complementary set w.r.t. all the rules R is an unsafe set of rules \(R\setminus R'\). According to Definition 8 there is a subgraph of the AAG which is an AAG with rules \(R\setminus R'\) and the same goal node. According to Theorem 1 its induced AAG-fold maintains AAG-fold semantics, which implies that the goal of the AAG-fold is achievable with rules \(R\setminus R'\) in the AAG-fold.

However, the criterion check directly checks that the goal of the AAG-fold is not achievable for a removed set of rules. The computed criterion is monotonic also for AAG-folds, similarly to the check for AAGs. By correctness of QuickXplain the produced set \(R'\) is a core, which satisfies the checked criterion (Definition 2). Thus \(R'\) is a set of rules, the removal of which makes the goal of the AAG-fold unachievable, a contradiction. Thus, the computed set of rules \(R'\) is a defense-set.

In Sect. 5.1 we explained why the converse of Theorem 1 does not hold. Thus, in theory, AAG-Defense has the advantage of ensuring that the complement of the defense-set is a maximally safe set of rules, while AF-Defense does not ensure maximality. That said, in the appendix in the extended version of the paper we show that in practice, the actual difference in defense-set size is small, if any.

6 Evaluation

We provide an overview of our evaluation. Details appear in the appendix available in the extended version of the paper.

We implemented AAG folding and defense-set algorithms in Python. We used BisPy [2], an open-source project that includes an efficient implementation of PT for computing bisimulation over directed graphs. For minimization we implemented a variant of QuickXplain [18]. The end-to-end implementation allows the user to choose an AAG, a set of rules that can be removed (all the rules by default), and a flag to control whether the defense-set should be computed directly or using the AAG-fold. The tool runs our algorithm and outputs a defense-set, i.e., a set of rules to be removed such that the remaining rules are safe (do not allow an attack).

In our experiments, we compared the performance of algorithms AAG-Defense and AF-Defense. Note that we were unable to compare to previous works directly as none of them computed defense-sets.

We considered the following research questions:

RQA Can we compute an AAG-fold efficiently, and how do the sizes of the original and folded graphs compare?

RQB How do defense-set computation times compare between the original and the folded graphs?

RQC How do the sizes of the defense-sets computed by AAG-Defense and AF-Defense compare?

We ran experiments over several datasets. These include real-world examples of AAGs that were generated to detect potential vulnerabilities in different systems of two large manufacturing facilities in the automotive and retail industries, versions of an IT system created for the purpose of assessing segments of a managed organization network, examples taken from Hadar et al. [12], and synthetic examples that simulate a network with the vulnerability described in Sect. 2.

Tables and graphs summarizing the characteristics of these datasets and the experiment results appear in the extended version of the paper. We summarize the answers to the research questions as follows:

figure d
figure e
figure f

7 Related Work

Analytical Attack Graphs Analysis. Inference of analytical attack graphs (AAGs) [34, 38] over real-world systems often produces large models that are hard to comprehend and analyze using existing techniques [13, 16, 20, 23, 26, 33, 47].

Yousefi et al. [47] present an algorithm that refines the attack graph and generates a simplified transition graph. The algorithm produces a smaller graph but provides no soundness guarantees. Noel and Jajodia [31] describe a framework for managing attack graph complexity through interactive visualization, which includes hierarchical aggregation of graph elements. The aggregation collapses non-overlapping subgraphs to single vertices but is applied to a different model of attack graph and therefore cannot be directly compared to our work. Homer et al. [14] present two simplification methods for AAGs. The first is a data filtering approach, which identifies portions of an attack graph that do not help users understand the security problems and trims them. The second is an abstraction approach, which groups similar attack steps as virtual nodes in a model of the network topology. These two methods can be viewed as complementary to our approach. Others [17, 32] have suggested methods to simplify the attack graph by grouping similar hosts together and representing grouped hosts by single nodes, and by using hierarchical displays. These approaches still result in complex attack graphs that are difficult for system administrators to relate to the underlying analysed network [32]. Williams et al. [46] present an interactive tool with a cascade display that produces a compact representation, highlights critical attack steps that lead into new network areas, and displays both attack graph and reachability information over a multiple-prerequisite (MP) graph. They use treemaps to present hosts in subnets in close proximity. Hosts in each treemap are automatically grouped based on level of compromise, how the hosts are treated by firewalls, trust relationships the hosts participate in, and prerequisites required to compromise hosts. These groupings provide visual indications of the network security and greatly simplify the display. Recently, Sabur et al. [40] suggested a divide-and-conquer approach that divides a large attack graph into smaller segments based on similarity between services. A distributed firewall prevents the attacker from compromising separated segments. They optimize their approach by removing cycles from the graph and by computing the optimal number of segments, based on the implementation cost of the segmentation. Mjihil et al. [29] use well-known efficient algorithms for decomposing graphs into strongly connected components, which in turn allows the use of parallel computation for faster analysis of the subgraphs. They acknowledge that their approach works better on sparse graphs.

In contrast to the aforementioned works, our work is unique in that it uses the well-known bisimulation relation, a topology-preserving equivalence relation, for graph abstraction. This allows us to create a sound abstraction of the attack graph that respects its topology and labeling; it eliminates redundancies while preserving all possible attacks. As opposed to works that attempt to decompose the attack graph, our approach is resilient to topological aspects that impede decomposition. To the best of our knowledge, no earlier work provides such guarantees. As we show, the abstraction can be computed efficiently and results in smaller graphs that allow faster analyses. Finally, several ways to speed up attack graph computation based on parallel and/or distributed computing have been proposed [5, 19]. These methods do not reduce the size of the attack graph.

Symbolic Attack Detection. Several authors employed symbolic approaches such as model checking to detect attacks and compute attack graphs, e.g., [39]. More recently, AWS IAM attacks were modeled using Boolean formulas [43]. Solving these formulas with SAT solvers allows proving that no attacks are possible, as well as detecting attacks, with the possibility of grouping similar attacks. Contrary to this approach, we exploit a representation of all possible attacks in a structure that tries to avoid repetitions of similar attacks. A recent work [6] presented a formal verification approach to handle attack graphs. The work models attack graphs as Kripke structures and proposes to use model checking in order to verify whether an attacker can gain access to certain resources.

Bisimulation. Bisimulation is well-studied in theoretical computer science and has important applications in formal verification [3]. Multiple algorithms exist to compute a bisimulation relation [8, 9, 36]. To the best of our knowledge, we are the first to apply bisimulation to AAGs. For the bisimulation computation required to obtain an AAG-fold, we use the PT algorithm [36], which has \(O(|E| \log |V|)\) complexity for a graph \(G(V, E)\). Dovier et al. [8] propose algorithms for acyclic graphs and for labeled graphs. They suggest some further improvements, such as computing sets of ranks instead of ranks, which is finer. They also suggest symbolic computation using BDDs.
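For intuition, the following sketch computes the coarsest label-respecting bisimulation partition of a small directed graph by naive signature refinement, and then builds the quotient graph. It is deliberately not the PT algorithm and is asymptotically slower than it. The function names and the toy graph are hypothetical, and the quotient is only a rough stand-in for the AAG-fold defined earlier in the paper.

```python
from typing import Dict, Hashable, List, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def bisimulation_blocks(nodes: List[Node], edges: List[Edge],
                        label: Dict[Node, str]) -> Dict[Node, int]:
    """Map each node to the id of its block in the coarsest bisimulation."""
    successors: Dict[Node, Set[Node]] = {n: set() for n in nodes}
    for source, target in edges:
        successors[source].add(target)

    # Initial partition: nodes with equal labels share a block.
    ids: Dict[object, int] = {}
    block = {n: ids.setdefault(label[n], len(ids)) for n in nodes}

    while True:
        # Split blocks by the set of blocks their successors belong to.
        ids = {}
        refined = {}
        for n in nodes:
            signature = (block[n], frozenset(block[m] for m in successors[n]))
            refined[n] = ids.setdefault(signature, len(ids))
        # Refinement never merges blocks, so an equal count means stability.
        if len(set(refined.values())) == len(set(block.values())):
            return refined
        block = refined


def quotient(nodes: List[Node], edges: List[Edge], label: Dict[Node, str]):
    """Collapse every block to a single node and keep the induced edges."""
    block = bisimulation_blocks(nodes, edges, label)
    return set(block.values()), {(block[u], block[v]) for u, v in edges}


# Toy graph (hypothetical): two hosts with the same vulnerability.
NODES = ["vuln_h1", "vuln_h2", "exploit_h1", "exploit_h2", "exec_code"]
EDGES = [("vuln_h1", "exploit_h1"), ("vuln_h2", "exploit_h2"),
         ("exploit_h1", "exec_code"), ("exploit_h2", "exec_code")]
LABEL = {"vuln_h1": "vulnExists", "vuln_h2": "vulnExists",
         "exploit_h1": "remoteExploit", "exploit_h2": "remoteExploit",
         "exec_code": "execCode"}

folded_nodes, folded_edges = quotient(NODES, EDGES, LABEL)
print(len(NODES), "->", len(folded_nodes), "nodes")
```

In the toy graph the two vulnerability facts and the two exploit rule nodes are pairwise bisimilar, so five nodes collapse to three.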

Cores. Core computations are applied in many domains, usually for fault-localization. For example, cores of unsatisfiable CNF formulas, a.k.a. MUS, minimal unsatisfiable subsets of clauses [21], are computed for Alloy [45] and for component and connector specifications [24]. Many different core computation algorithms exist: single-core domain-agnostic algorithms, e.g., DDMin [48] and QuickXplain [18], domain-specific algorithms [25], and all-cores computations [4, 21, 25]. Cores have also been applied to the removal of redundant elements in valid specifications [42]. We chose QuickXplain for core computations thanks to its complexity and prioritization parameter (see below). We are the first to apply cores to attack graph defense.

Reducing Risk Based on Analytical Attack Graphs. Some works suggested means to select nodes whose removal from the AAG will reduce the risk of attack, based on different criteria such as centrality measures [1, 10, 15, 41]. Hadar et al. [11, 12] enumerate risk-reducing security requirements and suggest means to prioritize security controls to reduce risk. They do not aim to prevent attacks but focus on prioritizing between given security controls.

In contrast, given an AAG, we automatically compute a safe subset of the AAG's rules for which no attack is possible. In the future, it may be interesting to consider prioritization in our work too. The QuickXplain algorithm allows prioritization as a parameter. See the last paragraph in Sect. 8.

8 Conclusion and Future Work

We presented a fast means to compute an attack graph defense core, identifying a locally minimal set of changes to a cyber system that will prevent an attacker from reaching a crown jewel. To scale up attack graph defense performance, we introduced a novel application of the well-known notion of bisimulation to attack graphs and showed how to compute a defense-set over the resulting graphs. Our experiments showed that the use of bisimulation results in significantly smaller graphs and in defense-set computations that are significantly faster than a direct solution, making them practical.

We consider the following future work. First, it is possible to improve the computation of the bisimulation with ideas from [8]. One example is the replacement of the notion of a rank of a node as a number by the set of ranks the node points to. This may improve the running times of the computation. Another direction is the use of symbolic representations of sets of nodes, for example using BDDs. Symbolic computation of bisimulations was considered, e.g., in [9].

Second, we consider additional applications for AAG-fold, beyond defense-set computations. For example, faster detection of possible attacks, faster risk assessments of the vulnerability of the network, and possibly more user-friendly and scalable UIs for viewing and exploring AAGs.

Third, it may be possible to accelerate checks of the safety of subsets of rules. A simple case is when the goal node is disconnected from the primitive facts, which can be detected easily by finding connected components of an AAG limited to a set of rules. This is equally useful for an AAG-fold. Another possible approach is to find all locally minimal subsets of rules required for the validation of each derived node, using dynamic programming. By doing this once over an AAG or an AAG-fold, checking validation for a given set of rules may become very efficient.
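The simple disconnection case can be sketched as a backward search from the goal that ignores rule nodes whose rule has been removed. The encoding of rule nodes as (rule label, preconditions, derived fact) tuples and all names below are hypothetical. The check is only a sufficient condition for safety: if no primitive fact is reached the remaining rules are safe, but reaching one does not by itself imply that an attack exists.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical encoding: (rule label, precondition facts, derived fact).
RuleNode = Tuple[str, Tuple[str, ...], str]


def goal_disconnected(primitive_facts: Iterable[str],
                      rule_nodes: List[RuleNode],
                      goal: str,
                      removed_rules: Iterable[str]) -> bool:
    """True if no primitive fact can reach the goal via the kept rules."""
    removed: Set[str] = set(removed_rules)
    primitive: Set[str] = set(primitive_facts)

    # Index the kept rule nodes by the fact they derive.
    producers: Dict[str, List[Tuple[str, ...]]] = {}
    for rule, preconditions, derived in rule_nodes:
        if rule not in removed:
            producers.setdefault(derived, []).append(preconditions)

    seen: Set[str] = {goal}
    queue = deque([goal])
    while queue:
        fact = queue.popleft()
        if fact in primitive:
            return False  # Still connected; a full check is needed.
        for preconditions in producers.get(fact, []):
            for p in preconditions:
                if p not in seen:
                    seen.add(p)
                    queue.append(p)
    return True


# Toy AAG (hypothetical).
FACTS = ["phishing_mail", "vulnerable_web_server"]
RULES: List[RuleNode] = [
    ("social_engineering", ("phishing_mail",), "user_access"),
    ("remote_exploit", ("vulnerable_web_server",), "user_access"),
    ("privilege_escalation", ("user_access",), "admin_access"),
]

print(goal_disconnected(FACTS, RULES, "admin_access", {"privilege_escalation"}))
```

Such a pre-check could be run before the full achievability check, which is only needed when the goal is still connected to some primitive fact.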

Finally, QuickXplain allows different ways to order the importance of rules. In our present work we ranked rules by their frequency in the graph. Other ways to rank rules exist, e.g., by employing centrality measures [11]. Moreover, not all rules are equally difficult or expensive to remove, and so users may be interested in using domain knowledge for rule ranking. Different rule rankings will induce different notions of defense-set minimality, e.g., rather than computing a defense-set that includes a minimal number of rules, one may compute a defense-set whose rules are the least expensive to change. This calls for investigating the quality of defense-sets under different notions of quality. We leave all these for future work.
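A frequency-based ranking and a cost-based alternative can be sketched as follows. The tuple encoding of rule nodes, the function names, the cost table, and the choice to list the most frequent (or cheapest) rules first are all assumptions made for illustration; they are not taken from our implementation.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical encoding: (rule label, precondition facts, derived fact).
RuleNode = Tuple[str, Tuple[str, ...], str]


def rank_by_frequency(rule_nodes: List[RuleNode]) -> List[str]:
    """Order rule labels by how many rule nodes carry them, most frequent first."""
    counts = Counter(rule for rule, _, _ in rule_nodes)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(counts, key=lambda rule: (-counts[rule], rule))


def rank_by_cost(removal_cost: Dict[str, float]) -> List[str]:
    """Order rule labels by an externally supplied removal cost, cheapest first."""
    return sorted(removal_cost, key=lambda rule: (removal_cost[rule], rule))


# Toy data (hypothetical).
RULES: List[RuleNode] = [
    ("remote_exploit", ("vuln_h1",), "exec_h1"),
    ("remote_exploit", ("vuln_h2",), "exec_h2"),
    ("privilege_escalation", ("exec_h1",), "admin_h1"),
]
COSTS = {"remote_exploit": 5.0, "privilege_escalation": 1.5}

print(rank_by_frequency(RULES))
print(rank_by_cost(COSTS))
```

Either ordering could then be passed as the candidate order of the core computation, so that the returned defense-set reflects the chosen notion of preference.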