1 Introduction

Network disruption has important applications to infrastructure design (Brown et al., 2006; McMasters & Mustin, 1970; Church et al., 2004), energy transmission (Callaway et al., 2000; Holmgren, 2006), robust network design (Crucitti et al., 2004b; Dodds et al., 2003; Estrada, 2006), biological systems (Rasti & Vogiatzis, 2022), illicit trade networks (Anzoom et al., 2021), and counterterrorism (Basu, 2005; Sageman, 2004). Much of this work focuses on two primary problem types: network flow interdiction and network diversion. In network interdiction, an attacker interdicts vertices or edges to maximize the minimum cost of routing flow through a network (Alderson et al., 2015). This is commonly examined in the literature in the form of two special cases: selecting an interdiction set that minimizes the maximum flow between a source and sink (e.g., Altner et al. (2010), Balcioglu and Wood (2003), Bertsimas et al. (2016), Lei et al. (2018), Lim and Smith (2007), Royset and Wood (2007), Wood (1993), Enayaty-Ahangar et al. (2019)), or selecting one that maximizes the length of the shortest path between a source and sink (e.g., Israeli and Wood (2002), Pay et al. (2019), Zhang et al. (2018)). In network diversion, a minimum-cost, minimal cutset of edges is identified such that, when it is removed, any source-sink path in the network is forced to travel through a particular set of critical edges (e.g., Cintron-Arias et al. (2001), Cullenbine et al. (2013), Curet (2001)).

Of interest in this paper is the concept of vertex (equivalently, edge) vitality, which measures the reduction in the maximum flow between the source and sink when that vertex (or edge) is removed from the graph (Ausiello et al., 2019; Koschützki et al., 2005). A vertex having high vitality is needed to achieve a high volume of flow from source to sink, and as such, this vertex will have a high volume of flow passing through it when the maximum flow is achieved. We define the all-pairs vitality of a vertex v to be the summed reduction in the maximum flow between all pairs of nodes (themselves excluding vertex v), when vertex v is removed from the graph.

We present the following combinatorial optimization problem, the all-pairs vitality maximization problem (VIMAX): Given a connected, directed, general-capacity graph \(G=(V,E)\) with vertex set V, edge set E, and a key vertex of interest, k, identify a subset of vertices S whose removal from the graph G maximizes the all-pairs vitality of k. This problem was first introduced in the second author’s unpublished manuscript for the specific context of undirected, unit-capacity graphs, for which the maximum flow between a pair of vertices represents the number of edge-disjoint paths between that pair (Martonosi et al., 2011).

VIMAX can be considered a network disruption problem that is distinct from the three forms outlined above. Covert organizations, such as terrorist groups or drug cartels, tend to communicate along longer paths that are difficult to trace, suggesting a trade-off between efficiency and secrecy that could render path-length-based attacks ineffective (Anzoom et al., 2021; Freeman et al., 1991; Morselli et al., 2007). Moreover, we leverage the possibility that critical vertices in certain types of networks can become vulnerable if they are forced to become more active. (As an example, Osama bin Laden and, subsequently, Ayman al-Zawahiri were known to be leaders of the al-Qaeda terrorist network, yet they remained in hiding for many years before U.S. intelligence could pinpoint their geographic locations.) If we assume the volume of communication, money, or illicit substances passing through a vertex is a proxy for that corresponding member’s visibility to intelligence officers, and communication between pairs of members in the organization is proportional to path capacity, then VIMAX can identify members of the organization whose removal will maximize communication through an important but clandestine leader. Unlike in network diversion problems, we do not require all flow in the remaining graph to be routed through this vertex (indeed in a network diversion problem, the volume of flow passing through the critical vertex might be quite small after vertex or edge removal); instead we seek to maximize the total flow routed through this vertex.

In this paper, we examine VIMAX from both computational and theoretical perspectives. In Sect. 2, we frame this work in the context of the existing literature. In Sect. 3, we define VIMAX, present it as a mixed integer linear program, and demonstrate that it is NP-Hard. Section 4 presents a simulated annealing heuristic for solving VIMAX. The computational performance of these two methods is compared in Sect. 5. Section 6 presents mathematical properties of VIMAX that can be leveraged to streamline computations. Section 7 provides future extensions of this work and concludes.

2 Literature review

We first contrast the network interdiction and diversion problems commonly seen in the literature with the VIMAX problem we will present in this paper. We then discuss the relationship between vitality and other graph centrality metrics. Finally, we present research on optimization approaches that could be useful to the problem of vitality maximization.

2.1 Network interdiction and diversion

Network interdiction models address the logistical problem of removing edges or vertices from a graph to inhibit the flow of resources through a network. This has applications to military operations and combating drug or human trafficking (Konrad et al., 2017; Tezcan and Maass, 2023; Zhang et al., 2018). Analysis of complex network interdiction typically focuses on disconnecting the network, increasing the lengths of shortest paths, cutting overall flow capacity, or reducing the desirability of paths in the network (Albert et al., 2000; Flaxman et al., 2007; Cavallaro et al., 2004; Gallos et al., 2004, 2005; Gallos et al., 2006; Gierszewski et al., 2006; Grubesic et al., 2008; Holme et al., 2002; Holzmann & Smith, 2021; Memon et al., 2008; Paul et al., 2005; Pay et al., 2019; Sun et al., 2007; Tezcan & Maass, 2023; Wu et al., 2007; Zhang et al., 2018). The most well-known model involves maximum flow network interdiction and its variants (Altner et al., 2010; Bertsimas et al., 2016; Cormican et al., 1998; Lei et al., 2018; McMasters & Mustin, 1970; Phillips, 1993; Ratliff et al., 1975; Royset & Wood, 2007; Wood, 1993). Of note, Wood (1993) introduces the “dualize-and-combine” method that is commonly used in network interdiction literature, as well as in this paper. Smith and Song thoroughly survey the network interdiction literature, and demonstrate that the assumptions widely held across the papers they survey make interdiction problems a special case of Stackelberg games (Smith & Song, 2020).

A related problem to network interdiction is the network diversion problem in which an attacker seeks to interdict, at minimum cost, a set of edges (equivalently, vertices) such that all source-sink flow must be routed through at least one member of a pre-specified set of “diversion” edges or vertices. This problem was first posed by Curet (2001). Applications include military operations, in which it might be beneficial to force a foe to divert its resources through a target edge that is heavily armed; and information networks, in which communications are routed through a single edge that can more easily be monitored (Lee et al., 2019).

Cullenbine et al. also study the network diversion problem (Cullenbine et al., 2013). They present an NP-completeness proof for directed graphs, a polynomial-time solution algorithm for \(s-t\) planar graphs, a mixed integer linear programming formulation that improves upon that given in Curet (2001), and valid inequalities to strengthen the formulation.

Lee et al. examine an extension of the network diversion problem known as the multiple flows network diversion problem, in which many source-sink pairs are considered simultaneously (Lee et al., 2019). They define a set S of possible source nodes and a set T of possible sink nodes, and seek a minimum-cost set of edges to interdict such that all remaining flow in the network passes through the diversion edge. They formulate the problem as a mixed integer linear program, and compare its performance to standard combinatorial Benders decomposition and a branch-and-cut combinatorial Benders decomposition. Without loss of generality, vertex interdiction can be formulated as arc interdiction: each vertex v in the original graph is represented by two vertices \(v_i\) and \(v_o\) in a modified graph, joined by a single arc \((v_i, v_o)\), and each arc \((u, v)\) in the original graph is transformed to a corresponding arc \((u_o, v_i)\) in the modified graph. Interdicting the arc \((v_i, v_o)\) in the modified graph is equivalent to interdicting vertex v in the original graph. An undirected graph is first converted into a directed one before this transformation is applied.
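The vertex-splitting transformation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the construction used by Lee et al.: the function names, the `"in"`/`"out"` labels, and the choice to give each internal arc infinite capacity (that is, to treat vertices as uncapacitated) are assumptions made here for concreteness.

```python
def to_directed(edges):
    """Replace each undirected edge {u, v} by the two arcs (u, v) and (v, u)."""
    arcs = {}
    for (u, v), cap in edges.items():
        arcs[(u, v)] = cap
        arcs[(v, u)] = cap
    return arcs


def split_vertices(arcs, vertices):
    """Rewrite vertex interdiction as arc interdiction.

    arcs     : dict mapping an arc (u, v) of the original graph to its capacity
    vertices : iterable of the original vertices
    Returns (new_arcs, internal), where new_arcs holds the arcs of the
    modified graph and internal maps each vertex v to its arc (v_i, v_o).
    """
    new_arcs = {}
    internal = {}
    for v in vertices:
        arc = ((v, "in"), (v, "out"))
        # Assumption: vertices themselves impose no capacity limit.
        new_arcs[arc] = float("inf")
        internal[v] = arc
    for (u, v), cap in arcs.items():
        # Original arc (u, v) becomes (u_o, v_i).
        new_arcs[((u, "out"), (v, "in"))] = cap
    return new_arcs, internal


# Small usage example on a three-vertex path a -> b -> c.
example_arcs = {("a", "b"): 3, ("b", "c"): 2}
modified, internal_arc = split_vertices(example_arcs, ["a", "b", "c"])
```

Interdicting vertex `b` in the original graph then corresponds to deleting `internal_arc["b"]` from `modified`.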

There are several aspects of Lee et al. (2019) worth noting as they connect to our work. First, after the interdiction set is removed from the graph, there is no guarantee that the total flow passing through the diversion edge is particularly large. In the vitality maximization problem that we present here, we identify an interdiction set of vertices such that the flow through the target vertex is maximized, thus ensuring that the target being surveilled has ample flow. Although our formulation does not associate a cost with each vertex that is interdicted, it is disadvantageous for the removal subset to be very large, as that would inherently cause the flow through the target vertex to drop. Second, we adopt their testing scheme of examining the performance of the algorithms we develop on (planar) grid networks, and we additionally test on random \(G_{n,m}\) graphs (Knuth, 2014) and a drug trafficking network (Natarajan, 2000).

A question conversely related to network interdiction and diversion is that of network resilience and detection of attacks. Sharkey et al. survey literature on four types of resilience: robustness, rebound, extensibility, and adaptability, with a primary focus on research addressing network robustness and the ability of a network to rebound following an attack (Sharkey et al., 2021). Dahan et al. study how to strategically locate sensors on a network to detect network attacks (Dahan et al., 2022).

2.2 Vitality and other graph centrality measures

Vitality is one of several types of graph centrality metrics. Centrality metrics quantify the importance of a given vertex in a network. The book of Wasserman and Faust provides a detailed examination of social network analysis stemming from the field of sociology and includes discussion of many commonly known centrality metrics, including degree, betweenness, and closeness (Wasserman & Faust, 1994). The survey of Rasti and Vogiatzis presents centrality metrics commonly used in computational biology (Rasti & Vogiatzis, 2019).

The degree of a vertex is the number of neighbors it has. The betweenness of a vertex is the number of shortest paths between all pairs of vertices on which the vertex lies. Closeness measures the average shortest path length between the vertex and all other vertices in the graph. Vogiatzis et al. present mixed integer programming formulations for identifying groups of vertices having the largest degree, betweenness, or closeness centrality in a graph (Vogiatzis et al., 2015).

Stephenson and Zelen first proposed information centrality and applied it to a network of men infected with AIDS in the 1980s (Stephenson & Zelen, 1989). They are among the first to develop a centrality metric that does not require an assumption that information must flow along shortest paths. They use the theory of statistical estimation to define the information of a signal along the path to be the reciprocal of the variance in the signal. Assuming the noise induced along successive edges of a path is independent, the variance along each path is additive, and the total variance in the signal grows with the path length. They then use this assumption to evaluate the total information sent between any pair of vertices \((s, t)\). From here, they define the centrality of a vertex i to be the harmonic average of the sum of the inverses of the information sent from vertex i to every other vertex. They point out that “information \(\ldots \) may be intentionally channeled through many intermediaries in order to ‘hide’ or ‘shield’ information in a way not captured by geodesic paths.” This appears to be the case in terrorist and other covert networks as well (Carpenter et al., 2002).

Centrality metrics can be used to guide network disruption approaches. Cavallaro et al. show that targeting high betweenness vertices efficiently reduces the size of the largest connected component in a graph based on a Sicilian mafia network (Cavallaro et al., 2004). Grassi et al. find that betweenness and its variants can be used to identify leaders in criminal networks (Grassi et al., 2019).

There also exist centrality measures related to network flows, as surveyed in Koschützki et al. (2005). In particular, for any real-valued function on a graph, Koschützki et al. define the vitality of a vertex (or edge) to be the difference in that function with or without the vertex (or edge). When the function represents the maximum flow between a pair of vertices, the vitality of a vertex k in a graph (equivalently, an edge u) with respect to an \(s-t\) pair of vertices is defined to be the reduction in the maximum flow between s and t when vertex k (equivalently, edge u) is removed from the graph. Moreover, when one examines the same reduction in maximum flow in the network over all possible \(s-t\) pairs with respect to a given vertex, we have what Freeman et al. define as network flow centrality (Freeman et al., 1991), or what we refer to as all-pairs vitality in this paper.

The most-vital edge or component is the one whose removal decreases the maximum flow through the network by the greatest amount. Identifying the most-vital edge in a network is a long-studied problem dating back to the work of Corley and Chang (1974), Wollmer (1963), and Ratliff et al. (1975). More recent examination includes the work of Alderson et al. (2013), who formulate a mathematical program to maximize resilience, using a defender-attacker-defender model. They additionally cite several applications for the most-vital edge problem including electric power systems, supply chain networks, telecommunication systems, and transportation. Ausiello et al. provide a method for calculating the vitality of all edges (with respect to a given s and t) with only \(2(n-1)\) maximum flow computations, rather than the m computations expected if one were to calculate the vitality of each edge individually (Ausiello et al., 2019). None of the found literature pertaining to vitality focuses on the problem presented here: that of identifying a set of removal vertices to maximize the vitality of a key vertex (VIMAX).

3 Optimization framework

We will show that VIMAX can be formulated as a mixed integer linear program. We start by presenting terminology that will be used throughout the paper.

3.1 Definitions

We consider a connected, directed graph \(G=(V,E)\) with vertex set V and edge set E. Each edge \((i, j)\) has a capacity \(u_{i,j}\) reflecting the maximum amount of flow that can be pushed along that edge. The graph has a key vertex of interest, k, which could represent, for example, an important but elusive participant in an organization. The vitality maximization problem (VIMAX) seeks to identify a subset of vertices whose removal from the graph G maximizes the all-pairs vitality of k. Thus, the objective is to identify a set of vertices to remove from the graph to make the key vertex k as “active” as possible by forcing flow to pass through that vertex.

For any source-sink s-t pair, let \(z_{st}(G)\) be the value of the maximum s-t flow in graph G. We call \(Z_k(G)\) the flow capacity of graph G with respect to vertex k, which is the all-pairs maximum flow in G that does not originate or end at k. Thus,

$$\begin{aligned} Z_k(G) = \sum _{\begin{array}{c} s,t \in V \setminus \{k \} \\ s \ne t \end{array}} z_{st}(G). \end{aligned}$$
(1)

The all-pairs vitality of k, \({\mathcal {L}}_k(G)\), equals the flow capacity of the graph with respect to k minus the flow capacity with respect to k of the subgraph \(G {\setminus } \{k\}\) obtained when vertex k is deleted:

$$\begin{aligned} {\mathcal {L}}_k(G) = Z_k(G)-Z_k\left( G \setminus \{ k \}\right) . \end{aligned}$$
(2)

It is worth noting that maximizing the all-pairs vitality of k does not imply an assumption that all \(s-t\) pairs will communicate simultaneously. Rather, maximizing the sum over all pairs is equivalent to maximizing the average vitality of k when an \(s-t\) pair is chosen uniformly at random from all pairs. Thus, we are maximizing the expected communication through the key vertex for a randomly chosen pair of vertices. The framework that follows can be easily extended to weighted pairs according to their likelihood of engaging in communication; we outline this in Sect. 3.3.

To measure how the removal of a subset of vertices impacts the vitality of the key vertex, we define the vitality effect of subset S on key vertex k to be the change in the key vertex k’s vitality caused by removing subset S: \({\mathcal {L}}_k(G\setminus S) - {\mathcal {L}}_k(G )\). If the vitality effect of S on k is positive, then removing subset S from the graph has diverted more flow through k, a desired effect.
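To make definitions (1) and (2) and the vitality effect concrete, the following is a minimal sketch that evaluates them directly with one maximum flow computation per ordered pair. It is an illustration only and not the implementation used in our experiments: the Edmonds-Karp routine, the function names, and the five-vertex example graph are all assumptions introduced here.

```python
from collections import deque
from itertools import permutations


def max_flow(cap, s, t):
    """Edmonds-Karp maximum s-t flow; cap maps arc (i, j) to its capacity."""
    res, adj = {}, {}
    for (i, j), u in cap.items():
        res[(i, j)] = res.get((i, j), 0) + u
        res.setdefault((j, i), 0)
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            i = queue.popleft()
            for j in adj.get(i, ()):
                if j not in parent and res[(i, j)] > 0:
                    parent[j] = i
                    queue.append(j)
        if t not in parent:
            return flow
        path, j = [], t
        while parent[j] is not None:
            path.append((parent[j], j))
            j = parent[j]
        bottleneck = min(res[arc] for arc in path)
        for (i, j) in path:
            res[(i, j)] -= bottleneck
            res[(j, i)] += bottleneck
        flow += bottleneck


def remove_vertices(cap, removed):
    """Capacities of the subgraph obtained by deleting the given vertices."""
    return {(i, j): u for (i, j), u in cap.items()
            if i not in removed and j not in removed}


def flow_capacity(cap, vertices, k):
    """Z_k: sum of maximum s-t flows over ordered pairs s != t, excluding k."""
    others = [v for v in vertices if v != k]
    return sum(max_flow(cap, s, t) for s, t in permutations(others, 2))


def vitality(cap, vertices, k):
    """All-pairs vitality L_k(G) = Z_k(G) - Z_k(G without k)."""
    return (flow_capacity(cap, vertices, k)
            - flow_capacity(remove_vertices(cap, {k}), vertices, k))


def vitality_effect(cap, vertices, k, S):
    """Change in the vitality of k caused by removing the subset S."""
    remaining = [v for v in vertices if v not in S]
    return (vitality(remove_vertices(cap, S), remaining, k)
            - vitality(cap, vertices, k))


# Hypothetical example: vertex m can reach c through either k or b.
V = ["a", "m", "k", "b", "c"]
u = {("a", "m"): 1, ("m", "k"): 1, ("m", "b"): 1, ("k", "c"): 1, ("b", "c"): 1}
print(vitality(u, V, "k"))                # 1
print(vitality_effect(u, V, "k", {"b"}))  # 1
```

In this example, removing b leaves the path through k as the only route from a or m to c, so the vitality of k rises from 1 to 2 even though the total flow capacity of the graph falls.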

The goal of this research is to identify the subset of vertices S that maximizes the vitality effect, which is equivalent to maximizing the value of \({\mathcal {L}}_k(G \setminus S)\). We formally define the all-pairs vitality maximization problem (VIMAX) as

$$\begin{aligned} \max _{S \subseteq V} \ {\mathcal {L}}_k(G \setminus S). \end{aligned}$$
(3)

From expressions (1) and (2), we see that there is no guarantee that the vitality effect on k of removing any subset S need ever be positive. When subset S is removed from the graph, the overall flow capacity \(Z_k(G{\setminus } S)\) generally decreases, and never increases, because S’s contribution to the flow is removed. In order for subset S’s removal to have a positive vitality effect on key vertex k, the remaining flow must be rerouted through k in sufficiently large quantities to overcome the overall decrease in flow through the network. However, as we will show in Sect. 5.3, identification of an optimal or near-optimal removal subset often dramatically increases the vitality of the key vertex.

3.2 Mixed integer linear programming formulation

To formulate VIMAX as an optimization problem, we first formulate a linear program to solve for the vitality of k in any graph G. Then we expand that formulation into a mixed integer programming formulation that seeks the optimal subset S of vertices to remove from the graph to maximize the vitality of k in the resulting graph.

3.2.1 Vitality max-flow subproblems

Following the approach of Israeli and Wood (2002), we take the dual of the maximum flow problems defining \(Z_{k}(G\setminus \{k\})\) to convert them into minimum cut problems having the same optimal objective function values, and embed them in the formulation of \({\mathcal {L}}_k(G)\). Because the dual is a minimization problem whose value is subtracted in the objective, maximizing the combined objective yields the vitality. Letting \(V' = V {\setminus } \{k\}\), and letting \(E'\) be the set of edges that remain after removing vertex k and its incident edges, we obtain the following linear program for finding \({\mathcal {L}}_k(G)\):

$$\begin{aligned} \begin{array}{ll} {\text {Maximize}} &{} \displaystyle \sum \limits _{\begin{array}{c} s,t \in V'\\ s \ne t \end{array}} v_{s,t} - \displaystyle \sum \limits _{\begin{array}{c} s,t \in V' \\ s \ne t \end{array}} \displaystyle \sum \limits _{(i,j) \in E'} u_{i,j}\alpha _{i,j,s,t} \\ {\text {subject to}} &{} \\ &{} \displaystyle \sum \limits _{j:(i,j) \in E} x_{i,j,s,t} - \displaystyle \sum \limits _{j':(j',i) \in E} x_{j',i,s,t} = {\left\{ \begin{array}{ll} v_{s,t} &{}\text{ if } i = s \\ -v_{s,t} &{}\text{ if } i = t \\ 0 &{}\text{ otherwise } \end{array}\right. } \\ &{} \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \forall i \in V, \forall s,t \in V'\\ &{} \\ &{} x_{i,j,s,t} \le u_{i,j}, \quad \forall (i,j) \in E, \forall s,t \in V'\\ &{} \beta _{i,s,t} - \beta _{j,s,t} + \alpha _{i,j,s,t} \ge 0, \quad \forall (i,j) \in E', \forall s,t \in V'\\ &{} -\beta _{s,s,t} + \beta _{t,s,t} \ge 1, \quad \forall s,t \in V' \\ &{} \\ &{} v_{s,t} \ge 0, \quad \forall s,t \in V'\\ &{} x_{i,j,s,t} \ge 0, \quad \forall (i,j) \in E, \forall s,t \in V'\\ &{} \alpha _{i,j,s,t} \ge 0, \quad \forall (i,j) \in E', \forall s,t \in V' \\ &{} \beta _{i,s,t} \text { unrestricted}, \quad \forall i,s,t \in V'. \\ \end{array} \end{aligned}$$
(4)

Variables \(x_{i,j,s,t}\) and \(v_{s,t}\) are the primal variables from the maximum flow formulation of problem \(Z_k(G)\). Each \(x_{i,j,s,t}\) represents the optimal \(s-t\) flow pushed along edge \((i, j)\), and each \(v_{s,t}\) represents the optimal \(s-t\) flow value. Variables \(\alpha _{i,j,s,t}\) and \(\beta _{i,s,t}\) are the dual variables from the minimum cut formulation of problem \(Z_k(G\setminus \{k\})\). We can interpret \(\beta _{i,s,t}\) as vertex potentials: For every edge \((i, j)\), if \(\beta _{i,s,t} < \beta _{j,s,t}\), meaning vertex i has lower potential than vertex j when computing the minimum \(s-t\) cut, then edge \((i, j)\) must cross the cut. In such a case, dual variable \(\alpha _{i,j,s,t}=1\), and edge capacity \(u_{i,j}\) is counted in the objective function.

3.2.2 VIMAX: choosing an optimal removal subset

Now that we have expressed the vitality of k in G as a linear program, we can return to VIMAX, which finds a subset S of vertices whose removal maximizes the vitality of k. Given a set S, the linear program in Eq. 4 applied to graph \(G {\setminus } S\) solves for \({\mathcal {L}}_{k}(G {\setminus } S)\). We must modify the LP above to choose a subset S that maximizes the objective function \({\mathcal {L}}_k(G\setminus S).\)

We can formalize this by creating binary variables \(q_i\) for each vertex i such that \(q_i=1\) if vertex i remains in the graph, and \(q_i =0\) if vertex i is removed from the graph (that is, i is included in subset S). We also define variables \(w_{i,j}\) for each edge that indicate whether or not edge \((i, j)\) remains in the graph following the removal of S and/or k. We define linking constraints so that whenever both vertices i and j remain in the graph (that is, \(q_i=q_j=1\)), then \(w_{i,j}\) must equal 1, and whenever either vertex i or j is selected for deletion (that is, \(q_i = 0\) or \(q_j =0\) or both) then \(w_{i,j}\) must equal 0. (Due to this relationship between \(w_{i,j}\) and the binary \(q_i\), the \(w_{i,j}\) are effectively constrained to be binary variables without explicitly declaring them as such.)
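The claim that the \(w_{i,j}\) are forced to be binary can be checked by enumeration. The linking constraints used in formulation (5) are \(w_{i,j} \le q_i\), \(w_{i,j} \le q_j\), \(w_{i,j} \ge q_i + q_j - 1\), and \(w_{i,j} \ge 0\). The short sketch below (the function name is ours, introduced only for illustration) computes the interval of feasible values of \(w_{i,j}\) for each binary choice of \(q_i\) and \(q_j\) and confirms that it collapses to the single point \(q_i q_j\).

```python
from itertools import product


def w_bounds(q_i, q_j):
    """Feasible interval [lower, upper] for w_ij under the linking constraints."""
    lower = max(0, q_i + q_j - 1)   # w >= 0 and w >= q_i + q_j - 1
    upper = min(q_i, q_j)           # w <= q_i and w <= q_j
    return lower, upper


# Map each binary (q_i, q_j) to its feasible interval for w_ij.
linking_table = {
    (q_i, q_j): w_bounds(q_i, q_j) for q_i, q_j in product((0, 1), repeat=2)
}

for (q_i, q_j), (lower, upper) in linking_table.items():
    # The interval is a single point equal to the product q_i * q_j.
    assert lower == upper == q_i * q_j
```

This is the standard linearization of a product of two binary variables, which is why the continuous declaration \(w_{i,j} \ge 0\) suffices.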

We make the following adjustments to the primal and dual constraints of Eq. 4. We constrain the primal flow variables so that \(x_{i,j,s,t} \le u_{i,j}w_{i,j}\), reflecting whether or not edge \((i, j)\) remains in the graph. We also modify the dual potential constraints so that \(\alpha _{i,j,s,t}=0\) whenever vertices i and j are at the same potential (as before) or edge \((i, j)\) no longer exists in the graph.

Introducing the variables \(q_i\) and \(w_{i,j}\) and the modifications on our vitality constraints, we can now write the full mixed-integer linear program. Given a graph \(G = (V,E)\), a key vertex k, and a maximum size, m, of the removal set, the following mixed-integer linear program solves VIMAX.

$$\begin{aligned} \begin{array}{lll} \text {Maximize} &{} \displaystyle \sum \limits _{\begin{array}{c} s,t \in V'\\ s \ne t \end{array}} v_{s,t} - \displaystyle \sum \limits _{\begin{array}{c} s,t \in V' \\ s \ne t \end{array}} \displaystyle \sum \limits _{(i,j) \in E'} u_{i,j}\alpha _{i,j,s,t} \\ \text {subject to} &{} \\ &{} \\ &{} \displaystyle \sum \limits _{\begin{array}{c} i \in V \end{array}} q_i \ge n-m \\ &{} q_k = 1 \\ &{} w_{i,j} \le q_i, \quad \forall (i,j) \in E \\ &{} w_{i,j} \le q_j, \quad \forall (i,j) \in E\\ &{} w_{i,j} \ge q_i + q_j -1, \quad \forall (i,j) \in E \\ &{} \\ &{} \displaystyle \sum \limits _{j:(i,j) \in E} x_{i,j,s,t} - \displaystyle \sum \limits _{j':(j',i) \in E} x_{j',i,s,t} = {\left\{ \begin{array}{ll} v_{s,t} &{}\text{ if } i = s \\ -v_{s,t} &{}\text{ if } i = t \\ 0 &{}\text{ otherwise } \end{array}\right. } \\ &{} \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \quad \forall i \in V, \forall s,t \in V'\\ &{} \\ &{} x_{i,j,s,t} \le u_{i,j}w_{i,j}, \quad \forall (i,j) \in E, \forall s,t \in V' \\ &{} \beta _{i,s,t} - \beta _{j,s,t} + \alpha _{i,j,s,t} \ge -(1-w_{i,j}), \quad \forall (i,j) \in E', \forall s,t \in V'\\ &{} -\beta _{s,s,t} + \beta _{t,s,t} \ge 1, \quad \forall s,t \in V' \\ &{} \\ &{} q_i \text { binary}, \quad \forall i \in V \\ &{} w_{i,j} \ge 0, \quad \forall (i,j) \in E\\ &{} v_{s,t} \ge 0, \quad \forall s,t \in V'\\ &{} x_{i,j,s,t} \ge 0, \quad \forall (i,j) \in E, \forall s,t \in V'\\ &{} \alpha _{i,j,s,t} \ge 0, \quad \forall (i,j) \in E', \forall s,t \in V' \\ &{} \beta _{i,s,t} \text { unrestricted}, \quad \forall i,s,t \in V' \\ \end{array} \end{aligned}$$
(5)

Extending the approach of Ovadia (2010) to general-capacity, directed graphs, we can show that VIMAX is NP-Hard. In the case that \(m=1\), so that at most one vertex can be removed, the problem can be solved by brute force: for each \(i \in V'\), we solve the above MIP with \(q_i = 0\) and \(q_j = 1\) for all \(j \ne i\), and keep the best solution found.

Theorem 1

The all-pairs vitality maximization problem is NP-Hard.

Proof

The proof of this can be found in Appendix A. \(\square \)

3.3 Extension to pairwise weights

As noted in Sect. 3.1, maximizing the sum over all pairs is equivalent to maximizing the expected communication through the key vertex when an \(s-t\) pair is chosen uniformly at random from all pairs. However, not all pairs may be equally likely to communicate. The objective function above can be easily extended to weight each pair according to its likelihood of engaging in communication. Let \(\text {P}_{s,t}\) be a weight corresponding to the frequency with which pair \(s-t\) communicates. For example, traffic matrix estimation can be used to estimate the pairwise demands for an internet network (Zhang et al., 2003; Medina et al., 2002). The objective function in Eq. 5 can be modified to include these weights to maximize the expected communication through the key vertex:

$$\begin{aligned} \text {Maximize} \displaystyle \sum \limits _{\begin{array}{c} s,t \in V'\\ s \ne t \end{array}} \text {P}_{s,t} v_{s,t} - \displaystyle \sum \limits _{\begin{array}{c} s,t \in V' \\ s \ne t \end{array}} \displaystyle \sum \limits _{(i,j) \in E'} \text {P}_{s,t} u_{i,j}\alpha _{i,j,s,t} \end{aligned}$$

4 Simulated annealing heuristic

As an alternative to solving VIMAX exactly with a MIP, we develop a simulated annealing heuristic. Each iteration of simulated annealing begins with a candidate removal subset. In the first iteration, this is the empty set, and in subsequent iterations the initial solution is the best solution found at the conclusion of the previous iteration. The objective function value of each solution is computed as the vitality of the key vertex when this subset is removed from the graph. Each call to the algorithm consists of an annealing phase and a local search phase.

During the annealing phase, neighboring solutions of the current solution are obtained by toggling a single vertex’s, or a pair of vertices’, inclusion or exclusion from the candidate removal subset, subject to the constraint that \(\vert S \vert \le m\). If the neighboring solution improves the objective function value, it is automatically accepted for consideration. If the neighboring solution has a worse objective function value, it will be accepted to replace the current solution with an acceptance probability governed by a temperature function, T. When the temperature is high (in early iterations), there is a high probability of accepting a neighboring solution even if its objective function value is worse than that of the incumbent solution. This permits wide exploration of the solution space. In later iterations, the temperature function cools, reducing the likelihood that lower objective function value solutions will be considered. This permits exploitation of promising regions of the solution space.

Given temperature T, the probability of accepting a solution having objective function value \(e_0\) when the best objective function value found so far is \(e_{max} > e_0\) is given by \(P = e^{-\left( e_{max}-e_0\right) /T}\). The initial temperature, T, is chosen so that the acceptance probability of a solution having at least 90% of the initial objective function value is at least 95%. In subsequent iterations, T is cooled by a multiplicative factor of 0.95.
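The initial temperature rule can be made explicit. A solution with exactly 90% of the initial objective value \(e_{init}\) trails it by \(0.1\,e_{init}\), so requiring \(e^{-0.1 e_{init}/T} \ge 0.95\) gives \(T \ge 0.1\,e_{init} / (-\ln 0.95)\). The sketch below encodes this together with the acceptance probability and the geometric cooling schedule. The function names are ours, and the derivation assumes \(e_{init} > 0\); an instance whose initial vitality is zero would need a different starting rule.

```python
import math


def acceptance_probability(e_max, e_0, T):
    """Probability of accepting value e_0 when the best value so far is e_max."""
    if e_0 >= e_max:
        return 1.0
    return math.exp(-(e_max - e_0) / T)


def initial_temperature(e_init, fraction=0.90, target=0.95):
    """Smallest T at which a solution worth fraction * e_init is accepted
    with probability at least target (assumes e_init > 0)."""
    gap = (1.0 - fraction) * e_init
    return gap / -math.log(target)


def cooling_schedule(T0, iterations, factor=0.95):
    """Temperatures T0, T0 * factor, T0 * factor**2, ..."""
    return [T0 * factor ** i for i in range(iterations)]


T0 = initial_temperature(100.0)
p_start = acceptance_probability(100.0, 90.0, T0)   # approximately 0.95
temperatures = cooling_schedule(T0, 5)
```

For \(e_{init} = 100\) this gives an initial temperature of roughly 195, or about 1.95 times the initial objective value.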

After a set number of annealing iterations, a single iteration of local search is conducted on the best solution found so far by toggling each vertex sequentially to determine if its inclusion or exclusion improves the objective function value. The best solution found is returned.
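The annealing and local search phases described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than our implementation: the objective is passed in as an arbitrary callable (in VIMAX it would be the vitality of the key vertex on \(G \setminus S\)), the temperature is cooled once per iteration and floored at a small positive value to avoid division by zero, and the toy objective at the end is purely hypothetical.

```python
import math
import random


def simulated_annealing(candidates, objective, m, T0,
                        iterations=1000, cooling=0.95, seed=0):
    """Search for a removal subset S with |S| <= m that maximizes objective(S)."""
    rng = random.Random(seed)
    candidates = list(candidates)
    current = frozenset()                 # start from the empty removal subset
    current_val = objective(current)
    best, best_val = current, current_val
    T = T0
    for _ in range(iterations):
        # Neighbor: toggle the membership of one vertex or of a pair.
        flips = rng.sample(candidates, rng.choice((1, 2)))
        neighbor = current.symmetric_difference(flips)
        if len(neighbor) <= m:
            val = objective(neighbor)
            # Accept non-worsening moves outright; accept worsening moves
            # with probability exp(-(e_max - e_0) / T).
            if (val >= current_val
                    or rng.random() < math.exp(-(best_val - val) / T)):
                current, current_val = neighbor, val
            if val > best_val:
                best, best_val = neighbor, val
        T = max(T * cooling, 1e-12)       # assumed floor on the temperature
    # Single pass of local search on the best solution found.
    for v in candidates:
        trial = best.symmetric_difference([v])
        if len(trial) <= m:
            val = objective(trial)
            if val > best_val:
                best, best_val = trial, val
    return best, best_val


# Hypothetical toy objective: reward removing vertices 2 and 5, penalize others.
def toy_objective(S):
    target = {2, 5}
    return 10 * len(S & target) - 3 * len(S - target)


best_S, best_value = simulated_annealing(range(8), toy_objective, m=3, T0=20.0)
```

Fixing the seed makes a run reproducible, which is convenient when comparing the heuristic against exact MIP solutions.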

We use a Gomory-Hu tree implementation of the all-pairs maximum flow problem to rapidly calculate the vitality of the key vertex on each modified graph encountered by the heuristic (Gomory & Hu, 1961; Gusfield, 1990). For mathematical reasons that are discussed in Sect. 6, we can exclude leaves from consideration in any removal subset. These two enhancements permit the simulated annealing heuristic to run quickly even on large instances, as we discuss in Sect. 5.3.

5 Computational analysis

We now compare the performance of the MIP formulation and the simulated annealing heuristic on a variety of data sets. Following the approach of Lee et al. (2019), we generate grid networks, which are planar. We also test the performance of the methods on random networks (Knuth, 2014) and on a real drug trafficking network (Natarajan, 2000). We first describe these data sets and the computational platform used, and then we present the results. Code and data files are available at our GitHub repository: https://github.com/alicepaul/network_interdiction.

5.1 Data

5.1.1 Grid networks

We generate grid networks in a similar fashion to Lee et al. (2019). We generate square \(M \times M\) grids with M varying from five to eight. Such graphs have an edge density of \(\frac{4}{M(M+1)}\), which ranges from \(13.3\%\) for \(M=5\) to \(5.6\%\) for \(M=8\). On each grid, we generate edge capacities independently and uniformly at random from the integers from 1 to M. For each case, we consider two scenarios, testing a maximum removal subset size of \(m=1\) or \(m=M\) (that is, \(\sqrt{\vert V \vert }\)). For each grid size and removal subset size combination, we generate three trial graphs. For each trial, a key vertex is selected uniformly at random over the vertices.
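The grid instances and the stated edge density can be reproduced with a short sketch. An \(M \times M\) grid has \(2M(M-1)\) edges on \(M^2\) vertices, so its density is \(2M(M-1) / \binom{M^2}{2} = \frac{4}{M(M+1)}\). The code below is an illustration under assumptions of our own: each edge is stored once as an unordered pair, and the function names and seeding scheme are not taken from our repository.

```python
import random


def grid_network(M, seed=None):
    """Return (vertices, capacity, key) for an M x M grid with random capacities."""
    rng = random.Random(seed)
    vertices = [(r, c) for r in range(M) for c in range(M)]
    capacity = {}
    for r, c in vertices:
        # Connect each vertex to its neighbors below and to the right.
        for nbr in ((r + 1, c), (r, c + 1)):
            if nbr[0] < M and nbr[1] < M:
                # Integer capacity drawn uniformly from 1..M.
                capacity[((r, c), nbr)] = rng.randint(1, M)
    key = rng.choice(vertices)   # key vertex chosen uniformly at random
    return vertices, capacity, key


def edge_density(n_vertices, n_edges):
    """Fraction of all unordered vertex pairs that are joined by an edge."""
    return n_edges / (n_vertices * (n_vertices - 1) / 2)


for M in range(5, 9):
    vertices, capacity, key = grid_network(M, seed=M)
    density = edge_density(len(vertices), len(capacity))
    print(M, len(capacity), round(100 * density, 1))
```

The edge counts printed for \(M = 5, \ldots, 8\) are 40, 60, 84, and 112, with densities falling from 13.3% to 5.6%.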

5.1.2 Random \(G_{n,m}\) networks

Random \(G_{n,m}\) graphs are parametrized by a number of vertices, \(n=\vert V \vert \), and a number of edges, \(m=\vert E \vert \) (Knuth, 2014). Each graph is sampled uniformly at random from the set of all connected graphs with n nodes and m edges. We test our methods on graphs having the same number of vertices and same number of edges as the grid networks above: \(\vert V \vert \in \{25, 36, 49, 64\}\) vertices, with \(\vert E \vert \in \{40, 60, 84, 112\}\), respectively. On each graph, we generate edge capacities independently and uniformly at random from the integers from 1 to \(\sqrt{\vert V \vert }\). For each case, we likewise consider two scenarios, testing a maximum removal subset size of 1 or \(\sqrt{\vert V \vert }\) (the removal subset size is the parameter m of Sect. 3.2, not the edge count m of \(G_{n,m}\)). For each graph size and removal subset size combination, we generate three trial graphs. For each trial, the vertex having the highest betweenness centrality is selected as the key vertex.
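One simple way to draw a graph uniformly from the connected graphs with n vertices and m edges is rejection sampling: draw m distinct edges uniformly at random and retry until the result is connected. The sketch below illustrates this idea; it is an assumption about how such sampling could be done, not a description of the generator used in our experiments, and the function names and retry limit are hypothetical.

```python
import random
from itertools import combinations


def is_connected(n, edges):
    """Check connectivity of an undirected graph on vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    seen, stack = {0}, [0]
    while stack:
        a = stack.pop()
        for b in adj[a]:
            if b not in seen:
                seen.add(b)
                stack.append(b)
    return len(seen) == n


def random_connected_gnm(n, m, seed=None, max_tries=10000):
    """Sample m distinct edges uniformly; reject and retry until connected."""
    rng = random.Random(seed)
    all_pairs = list(combinations(range(n), 2))
    for _ in range(max_tries):
        edges = rng.sample(all_pairs, m)
        if is_connected(n, edges):
            return edges
    raise RuntimeError("no connected graph found within max_tries")


def random_capacities(edges, n, seed=None):
    """Integer capacities drawn uniformly from 1..sqrt(n), as in the text."""
    rng = random.Random(seed)
    top = round(n ** 0.5)
    return {e: rng.randint(1, top) for e in edges}


edges = random_connected_gnm(25, 40, seed=1)
capacity = random_capacities(edges, 25, seed=1)
```

Because every edge set of size m is equally likely before rejection, the accepted graphs are uniform over the connected ones. At the sparsities used here a sizable fraction of draws is disconnected, so several retries per graph should be expected.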

5.1.3 Drug trafficking network

Lastly, we test our models on a real-world covert cocaine trafficking group, prosecuted in New York City in 1996 (Natarajan, 2000). This network consists of 28 people between whom 151 phone conversations were intercepted by wiretap over a period of two months. An edge exists between persons i and j if at least one conversation between them appears in the data set. There are 40 edges in this graph, corresponding to an edge density of \(10.6\%\). We consider a unit capacity version of the network, as well as a general capacity version in which the capacity on edge \(i-j\) is equal to the number of conversations between i and j appearing in the data. The weighted network is shown in Fig. 1, where line width is proportional to the number of wiretapped calls occurring between two operatives. According to Natarajan et al., some individuals in the network are known to have the roles described in Table 1. We test a maximum removal subset size of \(m=1\) or \(m =5 \approx \sqrt{\vert V \vert }\). Because the Colombian bosses (vertices 1, 2, and 3) are high-level leaders important to the functioning of the organization, we treat these vertices as the key vertices on which we attempt to maximize vitality.
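To make the two capacity conventions concrete, the following sketch builds both versions of a network from a list of intercepted conversations. The call log shown is a hypothetical toy example, not the Natarajan data, and the function name `build_networks` is ours.

```python
from collections import Counter

def build_networks(calls):
    """Build unit and general capacity edge dictionaries.

    calls: iterable of (i, j) pairs, one per intercepted conversation.
    Returns (unit, general), each mapping an undirected edge (i, j), i < j,
    to its capacity.
    """
    counts = Counter(tuple(sorted(pair)) for pair in calls)
    general = dict(counts)               # capacity = number of conversations
    unit = {edge: 1 for edge in counts}  # capacity = 1 on every edge
    return unit, general

# Hypothetical toy call log (NOT the wiretap data used in the paper).
calls = [(1, 2), (2, 1), (1, 3), (2, 4), (2, 4), (4, 2)]
unit, general = build_networks(calls)
print(unit)
print(general)
```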

Fig. 1 Cocaine trafficking network of Natarajan et al. Line width is proportional to number of wiretapped calls made between pairs of operatives (Natarajan, 2000)

Table 1 Roles of notable vertices in the cocaine trafficking network of Natarajan et al. (Natarajan, 2000)

5.2 Computational framework

The performance of the MIP and the simulated annealing heuristic was tested on a computer with a 3 GHz 6-core Intel Core i5 processor and 16 GB of memory. In addition to the general VIMAX MIP, a single vertex removal MIP (Single VIMAX) was also tested. The Single VIMAX and VIMAX MIP instances were run in Python 3.9.6, calling the CPLEX solver through the CPLEX Python API, and were each limited to two hours of computation time. The simulated annealing heuristic was also coded in Python and limited to 10,000 iterations on each trial instance. Single vertex removal simulated annealing results are not reported, as they are effectively equivalent to brute force search. Initial results were collected using the Extreme Science and Engineering Discovery Environment (XSEDE) supercomputers (Towns et al., 2014) with up to five hours of computation time, but these did not differ substantially from the results reported here.

5.3 Results

Table 2 presents the results of all completed trials. The first five columns explain the graph type, number of vertices \((\vert V \vert )\), number of edges \((\vert E \vert )\) and for the general VIMAX problem allowing multiple removals, the maximum allowed size, m, of the removal subset. Column six gives the initial vitality of the key vertex in the original graph with no vertices removed. Columns seven through ten provide results on the performance of the single vertex removal MIP (Single VIMAX); columns eleven through fifteen provide results from the multi-removal MIP (VIMAX); and columns sixteen through nineteen provide results from the multi-removal simulated annealing heuristic. (There is no need to use simulated annealing for Single VIMAX because it can be solved by sequentially testing the removal of each vertex.) For the three methods, the best vitality found within the time or iteration limit, the MIP gap if available, the percentage increase of the best vitality found by the method over the original vitality of the key vertex in the full graph, and the running time in seconds are given. For the multi-removal methods, the size of the best found removal subset \((\vert S \vert )\) is also given. MIP instances that terminated due to time limit have Time reported as \('-'\). Figures 2, 3, and 4 plot, for each graph type, vitality averaged over the three trials by removal method (original vitality, single-removal MIP, multi-removal MIP, and multi-removal simulated annealing).
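The remark that Single VIMAX can be solved by sequentially testing each vertex can be illustrated with a brute-force sketch. It assumes an undirected graph, takes the all-pairs vitality of k as a sum over unordered pairs of the remaining vertices, and uses a plain Edmonds-Karp maximum flow routine; these are our assumptions for the example, and the function names are hypothetical rather than taken from the repository.

```python
from collections import deque
from itertools import combinations

def max_flow(cap, s, t, removed=frozenset()):
    """Edmonds-Karp max flow on an undirected graph; cap maps (u, v) -> capacity."""
    res = {}
    for (u, v), c in cap.items():
        if u in removed or v in removed:
            continue
        res.setdefault(u, {})[v] = c  # an undirected edge gives capacity c
        res.setdefault(v, {})[u] = c  # in both directions
    if s not in res or t not in res:
        return 0
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for w, c in res[u].items():
                if c > 0 and w not in parent:
                    parent[w] = u
                    queue.append(w)
        if t not in parent:
            return flow
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:  # push flow along the path
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck

def vitality(cap, vertices, k, removed=frozenset()):
    """All-pairs vitality of k in the graph with `removed` deleted."""
    removed = frozenset(removed)
    others = [v for v in vertices if v != k and v not in removed]
    return sum(max_flow(cap, s, t, removed) - max_flow(cap, s, t, removed | {k})
               for s, t in combinations(others, 2))

def single_vimax(cap, vertices, k):
    """Try each single removal; return (best vertex or None, best vitality)."""
    best_v, best_val = None, vitality(cap, vertices, k)
    for v in vertices:
        if v != k:
            val = vitality(cap, vertices, k, {v})
            if val > best_val:
                best_v, best_val = v, val
    return best_v, best_val

# Toy unit capacity graph: s - a - {k, x} - b - t, where x is a bypass around k.
V = ["s", "a", "k", "x", "b", "t"]
cap = {("s", "a"): 1, ("a", "k"): 1, ("a", "x"): 1,
       ("k", "b"): 1, ("x", "b"): 1, ("b", "t"): 1}
print(vitality(cap, V, "k"), single_vimax(cap, V, "k"))
```

In this toy graph, removing the bypass vertex x forces all flow between the two ends through k, so the vitality of k rises even though fewer vertex pairs remain.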

Table 2 Computational results of solving VIMAX via mixed integer program (single VIMAX and multi-removal VIMAX) and simulated annealing (multi-removal VIMAX). For the grid and random networks, each row represents a randomly generated instance with randomly selected key vertex. MIP trials that reached the two-hour time limit show ‘Time’ reported as ‘-’

Fig. 2 Mean vitality across three trials for the unit capacity and general capacity instances of the drug network, by removal type (original graph, single-removal MIP, multi-removal MIP, and multi-removal simulated annealing)

Fig. 3 Mean vitality across three trials for each size of the random network, by removal type (original graph, single-removal MIP, multi-removal MIP, and multi-removal simulated annealing)

Fig. 4 Mean vitality across three trials for each size of the grid network, by removal type (original graph, single-removal MIP, multi-removal MIP, and multi-removal simulated annealing)

First, we note that Table 2 and Figs. 2, 3, and 4 provide a proof of concept demonstrating that it is possible to increase (sometimes dramatically) the vitality of the key vertex through subset removal. Removing a single vertex increased the vitality by 42% to 200% in all grid network instances for which the MIP solved to optimality within the time limit, and by up to 82% in the random graph instances; single vertex removal was not able to increase the vitality of the key vertex in the drug network. When allowing multiple removals, simulated annealing was able to identify removal subsets that increased the vitality of the key vertex by as much as 1,373%.

Unsurprisingly, the full VIMAX MIP allowing multiple removals is substantially harder to solve than the single removal MIP. On grid and random networks, the MIP failed to terminate within the two-hour time limit on all instances with at least \(n=36\) nodes. On the \(n=36\) random and the \(7 \times 7\) and \(8 \times 8\) grid network instances, the single removal MIP also did not terminate within the time limit, but an improving solution was returned in more cases. The large MIP gaps on the multi-removal MIP indicate a failure to find improving integer solutions.

For multiple vertex removal, the simulated annealing heuristic yielded excellent solutions in a fraction of the time required by even the single removal MIP. On the large instances for which the multiple removal MIP reached the time limit, the simulated annealing heuristic found substantially better solutions than the MIP incumbents. For those instances in which the multiple removal MIP solved to optimality, the solutions found by simulated annealing are often optimal and always near-optimal.

The effectiveness of vertex removal to maximize vitality appears to depend on the network structure and choice of key vertices. While the drug network has approximately the same number of vertices and edges as the 25-node instances of the random and grid networks, the key vertices (corresponding to vertices Boss 1, Boss 2, and Boss 3 in Fig. 1) chosen in these trials are less amenable to vitality maximization. The drug network has a large number of leaves, whereas the grid networks do not. As we will see in Sect. 6, vertices, such as leaves, that do not have at least two vertex-disjoint paths to the key vertex will never appear in an optimal removal subset.

Lastly, in these trials, we chose to restrict the removal subset size to at most m vertices. The reason to restrict the removal subset size is to reduce the solution space, and thus the complexity, of the problem. This decision is justifiable because we know removing too many vertices will cause overall flow in the network to drop such that the vitality on the key vertex cannot increase. Thus, an important question is what should be an appropriate value of m to effectively reduce the solution space without compromising the quality of solutions found? We do not have a definitive answer to this question. However, we see that in many of the trials, the best removal subset identified by any method has a size strictly less than \(m \approx \sqrt{\vert V \vert }\), suggesting that this choice of m is reasonable for the sizes and types of graphs considered here.

6 Leveraging structural properties of vitality

Thus far, we have established that subset removal can dramatically increase the vitality of a key vertex. However, solving this problem exactly as a MIP is computationally intractable for even modestly sized graphs. Fortunately, simulated annealing is an appealing alternative that yields very good solutions in dramatically less time than the MIP. In this section, we explore mathematical properties that characterize vertices that can be ignored by subset removal optimization approaches. We demonstrate how these properties can be leveraged to simplify the graph on which VIMAX is run.

6.1 Identifying vitality-reducing vertices

To reduce the complexity of the optimization formulation, we turn to identifying conditions that cause a vertex to have a vitality-reducing effect on the key vertex. This allows us to ignore such vertices in any candidate removal subset and reduce the solution space of the VIMAX problem.

Our first observation is that the presence of a cycle is necessary for the removal of a vertex to increase the vitality of a key vertex. The vitality of a leaf is always equal to 0, so the removal of any subset that results in k becoming a leaf also cannot increase the vitality of k. As a corollary, if k has neighbor set N(k) and more than \(\vert N(k)\vert -2\) of k’s neighbors are removed, the vitality effect on k will be nonpositive.

We can generalize this further. When there are not at least two vertex-disjoint paths from i to k, any removal subset including i will have a vitality effect on k no greater than the same subset excluding i, as stated by the following theorem.

Theorem 2

Let G be a graph with key vertex k, and let i be a vertex such that there do not exist at least two vertex-disjoint paths starting at i and ending at k. Let S be any vertex subset containing i, and let \(T = S \setminus \{i\}\). Then, \({\mathcal {L}}_k(G {\setminus } S) \le {\mathcal {L}}_k(G {\setminus } T)\). Therefore, T will have at least as large a vitality effect on k as S.

Proof

The proof of this can be found in Appendix B. \(\square \)

Put simply, when there are fewer than two vertex-disjoint paths between i and k, the vertices i and k do not lie on a common cycle. Therefore, when i is removed, any \(s-t\) paths that previously passed through i cannot be rerouted along an alternate path passing through k.

Note that identifying vertices that do not have at least two vertex-disjoint paths to k is computationally straightforward. We can solve an all \(u-k\) pairs maximum flow problem on a related graph \({\hat{G}}\) in which every vertex u is replaced with a pair of vertices connected by a unit capacity edge: \((u,u')\). For every directed edge \(i-j\) in the original graph, we include directed edge \((i',j)\) in the modified graph. Through the use of a Gomory-Hu tree, we can solve this in \(O(\vert V \vert ^3\sqrt{\vert E \vert })\) time (Gomory & Hu, 1961; Gusfield, 1990). Any vertex u whose corresponding vertex \(u^{'}\) in \({\hat{G}}\) has a maximum \(u^{'}-k\) flow of one does not have at least two vertex-disjoint paths to k in the original graph and can be excluded from every candidate removal subset. We call the set of such vertices \({\mathcal {Q}}\). Every vertex in \({\mathcal {Q}}\) should be kept in the graph and not be considered for removal.
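As an illustration of the vertex-splitting idea, the sketch below computes \({\mathcal {Q}}\) on a small undirected graph by running a unit capacity augmenting-path search from each vertex and stopping once two vertex-disjoint paths are found. It is a simple per-vertex check rather than the Gomory-Hu tree approach cited above, and the function names and the adjacency-list input format are assumptions made for the example.

```python
from collections import deque

def count_disjoint_paths(adj, u, k, limit=2):
    """Number of internally vertex-disjoint u-k paths, capped at `limit`."""
    res = {}

    def add_arc(a, b):
        res.setdefault(a, {})[b] = 1           # forward arc, unit capacity
        res.setdefault(b, {}).setdefault(a, 0)  # residual (reverse) arc

    # Split each vertex x into (x, "in") -> (x, "out") with unit capacity.
    for x in adj:
        add_arc((x, "in"), (x, "out"))
        for y in adj[x]:
            add_arc((x, "out"), (y, "in"))

    src, dst = (u, "out"), (k, "in")
    flow = 0
    while flow < limit:
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:  # BFS for an augmenting path
            a = queue.popleft()
            for b, c in res[a].items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if dst not in parent:
            break
        b = dst
        while parent[b] is not None:  # push one unit of flow along the path
            a = parent[b]
            res[a][b] -= 1
            res[b][a] += 1
            b = a
        flow += 1
    return flow

def vertices_without_two_paths(adj, k):
    """The set Q: vertices with fewer than two vertex-disjoint paths to k."""
    return {u for u in adj if u != k and count_disjoint_paths(adj, u, k) < 2}

# Toy graph: triangle k-1-2 with a pendant path 2-3-4.
adj = {"k": [1, 2], 1: ["k", 2], 2: [1, "k", 3], 3: [2, 4], 4: [3]}
print(vertices_without_two_paths(adj, "k"))
```

In the toy graph, vertices 3 and 4 can only reach k through vertex 2, so they form \({\mathcal {Q}}\), while vertices 1 and 2 share a cycle with k.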

These properties show that when seeking a vitality-maximizing subset for removal, we can ignore all subsets that include:

  • vertices in \({\mathcal {Q}}\) (i.e., vertices that do not share a cycle with k);

  • more than \(\vert N(k)\vert -2\) of k’s neighbors.

After performing preprocessing on the graph to identify N(k) and \({\mathcal {Q}}\), we can add the following constraints to the MIP formulation:

$$\begin{aligned} q_i&= 1, \quad \forall i \in {\mathcal {Q}}, \\ \sum _{i \in N(k)} q_i&\ge 2. \end{aligned} \tag{6}$$
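The same two conditions can also be used to filter candidate removal subsets outside the MIP, for example inside a heuristic search. The sketch below assumes, consistently with constraints (6), that \(q_i = 1\) means vertex i is retained in the graph; the function name `satisfies_constraints_6` is hypothetical.

```python
def satisfies_constraints_6(S, Q, neighbors_of_k):
    """Check a candidate removal subset S against constraints (6).

    Assumes q_i = 1 means vertex i is kept, so the constraints read:
    no vertex of Q is removed, and at least two neighbors of k are kept.
    """
    S = set(S)
    keeps_all_of_Q = S.isdisjoint(Q)                  # q_i = 1 for all i in Q
    kept_neighbors = len(set(neighbors_of_k) - S)     # sum of q_i over N(k)
    return keeps_all_of_Q and kept_neighbors >= 2

# Toy data: Q = {3, 4}, N(k) = {1, 2, 5}.
print(satisfies_constraints_6({5}, {3, 4}, {1, 2, 5}))
print(satisfies_constraints_6({1, 2}, {3, 4}, {1, 2, 5}))
```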

Although the above constraints provide a tighter formulation for VIMAX, the anticipated benefits of these constraints are likely to be modest. Table 3 shows \(\vert {\mathcal {Q}} \vert \) (the number of vertices that do not have at least two vertex-disjoint paths to k) for each graph used for testing in Sect. 5.

Table 3 Improvement in key VIMAX instance size parameters by identifying vitality-reducing vertices and using graph-simplification. \(\vert {\mathcal {Q}} \vert \) is the number of vertices that do not have at least two vertex-disjoint paths to k; vertices in \({\mathcal {Q}}\) can be ignored by VIMAX (see Sect. 6.1). \(\vert {\hat{V}}\vert \) and \(\vert {\hat{E}}\vert \) are the numbers of vertices and edges, respectively, in the reduced graph after applying the graph simplification method of Sect. 6.2. The last two columns report the percentage decrease in time and percentage increase in best objective function value of the graph simplification method compared to the Multi-Removal MIP results reported in Table 2. Entries denoted by ’-’ indicate instances in which the MIP did not terminate within two hours

Unsurprisingly given their structure, all the vertices in the grid networks have at least two vertex-disjoint paths to k; thus none of these vertices can be eliminated from consideration, and the grid networks are omitted from Table 3. By contrast, in the sparse drug trafficking network, nearly half of the vertices do not have at least two vertex-disjoint paths to the key vertex. This is a significant reduction in the number of candidate vertices for removal, but VIMAX was readily tractable on this already-small network. Thus, this criterion alone is unlikely to render previously intractable MIP instances tractable.

6.2 Simplifying the graph

Because the size of the VIMAX formulation grows rapidly with the number of vertices, we can improve its computational tractability by simplifying the original graph into a vitality-preserving graph having fewer vertices. We rely heavily on Theorem 2 to do this.

Suppose that the removal of a vertex v disconnects the graph into two components \(T_1\) and \(T_2\) such that \(k\in T_1\). Then, by Theorem 2, an optimal solution will not contain any vertex in \(T_2\). Further, the maximum flows between pairs of vertices within \(T_2\) do not contribute to the vitality effect on k. Therefore, all that is needed to preserve the vitality effect on k in the simplified graph is to preserve information about the maximum flow between all pairs of vertices s and t such that \(s \in T_1\) and \(t \in T_2\).

For all vertices \(t \in T_2\) we create a single edge between t and v with capacity equal to the maximum flow between t and v. This replaces all previous edges between vertices in \(T_2\). This affects the value of the all-pairs maximum flow problem but does not affect the vitality effect on k for any subset \(S \subset T_1\). Further, if any subset of vertices \(T' \subseteq T_2\) all have the same new capacity value, we combine \(T'\) into a single vertex with weight \(\vert T'\vert \). When calculating the maximum flow between any pair of vertices s and t in the graph, we multiply the flow by the product of the weights of the vertices to account for this simplification.

Using the process described in the previous section, we can identify the subset of vertices \({\mathcal {Q}} \subseteq V \setminus \{k\}\) that do not have at least two vertex-disjoint paths to k. Given a vertex \(i \in {\mathcal {Q}}\), we find a path from i to k and find the first vertex v along i’s path to k such that v has at least two vertex-disjoint paths to k. Removing the vertex v disconnects the graph. Therefore, we follow the simplification process above and mark all vertices in the corresponding \(T_2\), including i, as processed. We then repeatedly identify any unprocessed vertex in \({\mathcal {Q}}\) to further simplify the graph. After all vertices in \({\mathcal {Q}}\) have been processed, all these vertices will be weighted leaves in the new simplified graph where the weight depends on how many vertices have been combined. All other vertices will retain a weight of one.

Figure 5 shows an example of this simplification process in which there are two components that have been simplified. Note that vertices 4, 6, and 7 have been combined together into a vertex with weight three. Further, vertices 5 and 8 have been combined together into a vertex with weight two.

Fig. 5 An example of a graph (left) and its simplified version (right) with vertex weights. Vertices 4, 6, and 7 have been combined together into a vertex with weight three. Further, vertices 5 and 8 have been combined together into a vertex with weight two

As argued above, the maximum flows between pairs of vertices that were in the same simplified component never contribute to the vitality effect on k. Therefore, we ignore these pairs in the optimization problem by removing the corresponding variables and constraints. It remains only to check that we have preserved the maximum flow between all pairs of vertices that were not in the same component. This holds because each flow is multiplied by the weights of its endpoints. For example, in Fig. 5, we multiply the maximum flow between vertex 4 and vertex 1 by 3 (the product of their weights, three and one), accounting for all the paths between vertices 4, 6, and 7 and vertex 1. Thus, an optimal removal subset found on the simplified graph is also optimal in the original graph. The number of pairs of vertices decreases from 45 to 19, since the number of vertices excluding k decreases from 10 to 7 and we can ignore the flow between vertices 9 and 10 and between vertices 4 and 8 in the simplified graph.
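The weighting rule can be written compactly. The notation here is introduced only for illustration and is an assumption on our part: \(w_v\) is the weight of vertex v in the simplified graph \({\hat{G}}\), \(z_{st}(H)\) is the maximum \(s-t\) flow in a graph H, and \({\mathcal {P}}\) is the set of unordered pairs of remaining vertices (other than k) that did not originate from the same simplified component. Under these assumptions, the vitality of k after removing a subset S would be evaluated on the simplified graph as

$$\begin{aligned} \sum _{\{s,t\} \in {\mathcal {P}}} w_s \, w_t \left[ z_{st}({\hat{G}} \setminus S) - z_{st}\big ({\hat{G}} \setminus (S \cup \{k\})\big ) \right] , \end{aligned}$$

and the pair count in the example of Fig. 5 follows from

$$\begin{aligned} \binom{10}{2} = 45, \qquad \binom{7}{2} - 2 = 21 - 2 = 19. \end{aligned}$$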

Table 3 shows the number of vertices \((\vert {\hat{V}} \vert ) \) and edges \((\vert {\hat{E}} \vert )\) in each test graph after applying the graph simplification algorithm. The only graph types experiencing an appreciable reduction in size after simplification are the drug trafficking network and the smaller random graphs. We posit that highly connected graphs such as the grid networks are less amenable to the simplification method than sparser networks. In Table 3 we also include the percentage decrease in time and percentage increase in the best objective function value found via graph simplification, compared to the Multi-Removal MIP results reported in Table 2. The time includes the time to perform the graph simplification, which is very efficient. For graphs with a significant reduction in the number of nodes and edges, we see a corresponding decrease in the runtime for the MIP. For the larger networks that did not terminate within the time limit, we see the best vitality found improve in only one instance.

7 Future work and conclusions

In this paper, we have presented the VIMAX optimization problem, which identifies a subset of vertices whose removal maximizes the volume of flow passing through a key vertex in the network. VIMAX is NP-hard. We have used the dualize-and-combine method of Wood (1993) to formulate VIMAX as a mixed integer linear program, and we compared its performance to that of a simulated annealing heuristic. We also demonstrated how identifying vertices not having at least two vertex-disjoint paths to the key vertex can be used to simplify the graph and reduce computation time on certain graph types. The key limitations of the work presented are its computational bottlenecks. Future work could focus on two areas highlighted in this paper: graph simplification and Benders decomposition.

  • Graph Simplification: Additional properties of vitality-reducing vertices, such as those outlined in Paul (2012) for the unit capacity case, could be derived for the general capacity case and used to preprocess or simplify the graph to reduce the solution space of VIMAX. In particular, it would be beneficial to identify small cuts in the graph such that all vertices on the opposite side of the cut from k can be excluded from consideration.

  • Benders Decomposition: Because the number of constraints in the VIMAX MIP grows on the order of \(O(\vert E \vert \vert V \vert ^2)\), Benders decomposition is a natural candidate for solving the problem on large graphs. The decomposition is presented in Appendix C, but preliminary testing did not improve the MIP performance. The survey of Smith and Song illustrates a variety of approaches that could be applied to improve the performance of the Benders decomposition of VIMAX (Smith and Song 2020).

Additionally, this paper opens up a rich area of future research on extensions of this problem.

  • Optimization: In this paper, we have focused on identifying vertices having high vitality effect on the key vertex without considering the cost or difficulty of removing them from the graph. An enhancement to VIMAX could include a budget constraint restricting the choice of subsets based on the difficulty of their removal.

  • Dynamic response: The disruption technique described in this paper focuses on the network at one snapshot in time and assumes that any subset removal occurs simultaneously and that the network remains static. This might be a reasonable assumption for networks that evolve slowly over time, such as transportation supply chains, or for disruption interventions that occur over a short time scale, such as military maneuvers. On the other hand, social networks might be able to reconfigure more rapidly following a disruption, counter to our assumption. Extensions to VIMAX might explore cascading effects of sequential vertex removal, similar to the literature on multi-period interdiction (Enayaty-Ahangar et al., 2019), cascading failures (Crucitti et al., 2004a; Motter & Lai, 2002; Zhao et al., 2005), and agent-based models for counter-interdiction responses (Magliocca et al., 2019).

  • Imperfect information: The VIMAX formulation presented here assumes complete and perfect knowledge of the network’s structure. However, the complete structure of a covert network is typically not known to enforcement agencies, and can evolve rapidly (Konrad et al., 2017). Future work could address applying VIMAX to networks with uncertain or unknown structure and capacities. For example, one could examine the robustness of the vitality measure and vitality-maximizing subset to graph perturbations over an uncertainty set.

  • Robust network design: We can use the results of this research to design networks, such as telecommunication and other infrastructure networks, to be robust to vitality-diverting attacks (Crucitti et al., 2004b).

  • Multiple key vertices: In the case that we want to maximize the flow through a subset S of key vertices, we can extend the definition of vitality maximization to maximize the all-pairs vitality of S. The MIP and simulated annealing algorithm can be updated accordingly.

VIMAX has broad applicability to problems including disrupting organized crime rings, such as those used in terrorism, drug smuggling and human trafficking; disrupting telecommunications networks and power networks; as well as robust network design.