The All-Pairs Vitality-Maximization (VIMAX) Problem

Traditional network interdiction problems focus on removing vertices or edges from a network so as to disconnect or lengthen paths in the network; network diversion problems seek to remove vertices or edges to reroute flow through a designated critical vertex or edge. We introduce the all-pairs vitality maximization problem (VIMAX), in which vertex deletion attempts to maximize the amount of flow passing through a critical vertex, measured as the all-pairs vitality of the vertex. The assumption in this problem is that in a network for which the structure is known but the physical locations of vertices may not be known (e.g. a social network), locating a person or asset of interest might require the ability to detect a sufficient amount of flow (e.g., communications or financial transactions) passing through the corresponding vertex in the network. We formulate VIMAX as a mixed integer program, and show that it is NP-Hard. We compare the performance of the MIP and a simulated annealing heuristic on both real and simulated data sets and highlight the potential increase in vitality of key vertices that can be attained by subset removal. We also present graph theoretic results that can be used to narrow the set of vertices to consider for removal.


Introduction
Network disruption has important applications to infrastructure design [9,48,13], energy transmission [10,37], robust network design [18,22,24], biological systems [59], illicit trade networks [4], and counterterrorism [7,62]. Much of this work focuses on three primary problem types: 1) network flow interdiction, in which an attacker tries to decrease the flow capacity of the network by interdicting vertices or edges such that the maximum flow between a source and sink is minimized (e.g., [3,6,8,44,45,61,73,23]); 2) shortest path interdiction, in which an attacker interdicts vertices or edges such that the shortest path between a source and sink is maximized (e.g., [39,56,75]); and 3) network diversion, in which a minimum cost, minimal cutset of edges is identified such that when removed, any source-sink path in the network is forced to travel through a particular set of critical edges (e.g., [14,19,20]).
Of interest in this paper is the concept of vertex (equivalently, edge) vitality, which measures the reduction in the maximum flow between the source and sink when that vertex (or edge) is removed from the graph [5,42]. A vertex having high vitality is needed to achieve a high volume of flow from source to sink, and as such, this vertex will have a high volume of flow passing through it operations and combating drug or human trafficking [41,67,75]. Analysis of complex network interdiction typically focuses on disconnecting the network, increasing the lengths of shortest paths, cutting overall flow capacity, or reducing the desirability of paths in the network [1,25,12,27,28,29,30,33,36,38,49,55,56,66,67,74,75]. The most well-known model involves maximum flow network interdiction and its variants [3,8,16,44,48,57,60,61,73]. Of note, [73] introduces the "dualize-and-combine" method that is commonly used in network interdiction literature, as well as in this paper. Smith and Song thoroughly survey the network interdiction literature, and demonstrate that the assumptions widely held across the papers they survey make interdiction problems a special case of Stackelberg games [64].
A problem related to network interdiction is the network diversion problem, in which an attacker seeks to interdict, at minimum cost, a set of edges (equivalently, vertices) such that all source-sink flow must be routed through at least one member of a pre-specified set of "diversion" edges or vertices. This problem was first posed by [20]. Applications include military operations, in which it might be beneficial to force a foe to divert its resources through a target edge that is heavily armed; and information networks, in which communications are routed through a single edge that can more easily be monitored [43].
Cullenbine et al. also study the network diversion problem [19]. They present an NP-completeness proof for directed graphs, a polynomial-time solution algorithm for s − t planar graphs, a mixed integer linear programming formulation that improves upon that given in [20], and valid inequalities to strengthen the formulation.
Lee et al. examine an extension of the network diversion problem known as the multiple flows network diversion problem, in which many source-sink pairs are considered simultaneously [43]. They define a set S of possible source nodes and a set T of possible sink nodes, and they interdict a minimum cost set of edges such that all remaining flow in the network passes through the diversion edge. They formulate the problem as a mixed integer linear program, and compare its performance to standard combinatorial Benders decomposition and a branch-and-cut combinatorial Benders decomposition. Without loss of generality, vertex interdiction can be formulated as arc interdiction in which each vertex v in the original graph is represented by two vertices $v_i$ and $v_o$ in a modified graph having a single arc between them, $(v_i, v_o)$. Each arc (u, v) in the original graph is then transformed to a corresponding arc $(u_o, v_i)$ in the modified graph. Interdicting the arc $(v_i, v_o)$ in the modified graph is equivalent to interdicting the vertex v in the original graph. An undirected graph is first converted into a directed one before this transformation is applied.
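The vertex-splitting transformation described above can be sketched in a few lines of Python. This is a minimal illustration, not code from [43] or from this paper: the function names `to_directed` and `split_vertices`, the dictionary representation of arcs, and the choice of infinite capacity on the internal arcs are all assumptions made for the example.

```python
def to_directed(edges):
    """Replace each undirected edge {u, v} by the two arcs (u, v) and (v, u).

    edges: dict mapping an edge (u, v) to its capacity.
    """
    arcs = {}
    for (u, v), cap in edges.items():
        arcs[(u, v)] = cap
        arcs[(v, u)] = cap
    return arcs


def split_vertices(arcs, internal_capacity=float("inf")):
    """Split every vertex v into (v, "in") and (v, "out").

    The internal arc ((v, "in"), (v, "out")) stands in for vertex v, so
    interdicting that arc corresponds to interdicting v itself.
    internal_capacity is a hypothetical parameter: infinite capacity leaves
    the flow unrestricted unless the internal arc is interdicted.
    """
    vertices = {x for arc in arcs for x in arc}
    split = {((v, "in"), (v, "out")): internal_capacity for v in vertices}
    for (u, v), cap in arcs.items():
        # Original arc (u, v) now leaves u's out-copy and enters v's in-copy.
        split[((u, "out"), (v, "in"))] = cap
    return split


if __name__ == "__main__":
    arcs = {("a", "b"): 3, ("b", "c"): 2}
    for arc, cap in sorted(split_vertices(arcs).items()):
        print(arc, cap)
```

A graph with n vertices and m arcs becomes a graph with 2n vertices and n + m arcs, so the transformation increases the problem size only linearly.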
There are several aspects of [43] worth noting as they connect to our work. First, after the interdiction set is removed from the graph, there is no guarantee that the total flow passing through the diversion edge is particularly large. In the vitality maximization problem that we present here, we identify an interdiction set of vertices such that the flow through the target vertex is maximized, thus ensuring that the target being surveilled has ample flow. Although our formulation does not associate a cost with each vertex that is interdicted, it is disadvantageous for the removal subset to be very large, as that would inherently cause the flow through the target vertex to drop. Second, we adopt their testing scheme of examining the performance of the algorithms we develop on grid networks (planar), as well as random $G_{n,m}$ graphs [40], and a drug trafficking network [52].
A question conversely related to network interdiction and diversion is that of network resilience and detection of attacks. Sharkey et al. survey literature on four types of resilience: robustness, rebound, extensibility, and adaptability, with a primary focus on research addressing network robustness and the ability of a network to rebound following an attack [63]. Dahan et al. study how to strategically locate sensors on a network to detect network attacks [21].

Vitality and Other Graph Centrality Measures
Vitality is one of several types of graph centrality metrics. Centrality metrics quantify the importance of a given vertex in a network. The book of Wasserman and Faust provides a detailed examination of social network analysis stemming from the field of sociology and includes discussion of many commonly known centrality metrics, including degree, betweenness, and closeness [70]. The survey of Rasti and Vogiatzis presents centrality metrics commonly used in computational biology [58].
The degree of a vertex is the number of neighbors it has. The betweenness of a vertex is the number of shortest paths between all pairs of vertices on which the vertex lies. Closeness measures the average shortest path length between the vertex and all other vertices in the graph. Vogiatzis et al. present mixed integer programming formulations for identifying groups of vertices having the largest degree, betweenness, or closeness centrality in a graph [69].
Stephenson and Zelen first proposed information centrality and applied it to a network of men infected with AIDS in the 1980s [65]. They are among the first to develop a centrality metric that does not require an assumption that information must flow along shortest paths. They use the theory of statistical estimation to define the information of a signal along a path to be the reciprocal of the variance in the signal. Assuming the noise induced along successive edges of a path is independent, the variance along each path is additive, and the total variance in the signal grows with the path length. They then use this assumption to evaluate the total information sent between any pair of vertices (s, t). From here, they define the centrality of a vertex i to be the harmonic average of the sum of the inverses of the information sent from vertex i to every other vertex. They point out that "information ... may be intentionally channeled through many intermediaries in order to 'hide' or 'shield' information in a way not captured by geodesic paths." This appears to be the case in terrorist and other covert networks as well [11].
Centrality metrics can be used to guide network disruption approaches. Cavallaro et al. show that targeting high betweenness vertices efficiently reduces the size of the largest connected component in a graph based on a Sicilian mafia network [12]. Grassi et al. find that betweenness and its variants can be used to identify leaders in criminal networks [32].
There also exist centrality measures related to network flows, as surveyed in [42]. In particular, for any real-valued function on a graph, Koschützki et al. define the vitality of a vertex (or edge) to be the difference in that function with or without the vertex (or edge). When the function represents the maximum flow between a pair of vertices, the vitality of a vertex k in a graph (equivalently, an edge u) with respect to an s − t pair of vertices is defined to be the reduction in the maximum flow between s and t when vertex k (equivalently, edge u) is removed from the graph. Moreover, when one examines the same reduction in maximum flow in the network over all possible s − t pairs with respect to a given vertex, we have what Freeman et al. define as network flow centrality [26], or what we refer to as all-pairs vitality in this paper.
The most-vital edge or component is the one whose removal decreases the maximum flow through the network by the greatest amount. Identifying the most-vital edge in a network is a long-studied problem dating back to the work of [15], [72], and [60]. More recent examination includes the work of [2], who formulate a mathematical program to maximize resilience, using a defender-attacker-defender model. They additionally cite several applications for the most-vital edge problem including electric power systems, supply chain networks, telecommunication systems, and transportation.
Ausiello et al. provide a method for calculating the vitality of all edges (with respect to a given s and t) with only 2(n − 1) maximum flow computations, rather than the m computations expected if one were to calculate the vitality of each edge individually [5]. None of the literature we found pertaining to vitality focuses on the problem presented here: that of identifying a set of removal vertices to maximize the vitality of a key vertex (VIMAX).

Optimization framework
We will show that VIMAX can be formulated as an integer linear program. We start by presenting terminology that will be used in the paper.

Definitions
We consider a connected, directed graph G = (V, E) with vertex set V and edge set E. Each edge (i, j) has a capacity $u_{ij}$ reflecting the maximum amount of flow that can be pushed along that edge. The graph has a key vertex, k, which could represent, for example, an important but elusive participant in an organization. The vitality maximization problem (VIMAX) seeks to identify a subset of vertices whose removal from the graph G maximizes the all-pairs vitality of k. Thus, the objective is to identify a set of vertices to remove from the graph to make the key vertex k as "active" as possible by forcing flow to pass through that vertex.
For any source-sink pair s-t, let $z_{st}(G)$ be the value of the maximum s-t flow in graph G. We call $Z_k(G)$ the flow capacity of graph G with respect to vertex k, which is the all-pairs maximum flow in G that does not originate or end at k. Thus,
$$Z_k(G) = \sum_{s,t \in V \setminus \{k\},\ s \neq t} z_{st}(G). \tag{1}$$
The all-pairs vitality of k, $L_k(G)$, equals the flow capacity of the graph with respect to k minus the flow capacity with respect to k of the subgraph $G \setminus \{k\}$ obtained when vertex k is deleted:
$$L_k(G) = Z_k(G) - Z_k(G \setminus \{k\}). \tag{2}$$
To measure how the removal of a subset of vertices impacts the vitality of the key vertex, we define the vitality effect of subset S on key vertex k to be the change in the key vertex k's vitality caused by removing subset S: $L_k(G \setminus S) - L_k(G)$. If the vitality effect of S on k is positive, then removing subset S from the graph has diverted more flow through k, a desired effect.
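To make the definitions of flow capacity and all-pairs vitality concrete, the following is a minimal Python sketch that computes $L_k(G \setminus S)$ by brute force, running one maximum flow per ordered pair. It is an illustration only, not the implementation used in this paper: the Edmonds-Karp routine, the function names, and the dictionary representation of capacities are assumptions, and an undirected edge is assumed to be entered as two opposite arcs.

```python
from collections import deque
from itertools import permutations


def max_flow(cap, s, t):
    """Edmonds-Karp maximum s-t flow; cap maps a directed arc (u, v) to its capacity."""
    residual, adj = {}, {}
    for (u, v), c in cap.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if s not in adj or t not in adj:
        return 0
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[arc] for arc in path)
        for (u, v) in path:
            residual[(u, v)] -= push
            residual[(v, u)] += push
        flow += push


def remove_vertices(cap, removed):
    """Capacities of the graph with the vertices in `removed` deleted."""
    return {(u, v): c for (u, v), c in cap.items()
            if u not in removed and v not in removed}


def flow_capacity(cap, vertices, k):
    """Z_k: sum of maximum s-t flows over ordered pairs with s and t different from k."""
    others = [v for v in vertices if v != k]
    return sum(max_flow(cap, s, t) for s, t in permutations(others, 2))


def vitality(cap, vertices, k, removed=()):
    """All-pairs vitality of k after deleting the vertices in `removed`."""
    removed = set(removed)
    remaining = [v for v in vertices if v not in removed]
    with_k = remove_vertices(cap, removed)
    without_k = remove_vertices(cap, removed | {k})
    return flow_capacity(with_k, remaining, k) - flow_capacity(without_k, remaining, k)


if __name__ == "__main__":
    # Unit-capacity 4-cycle a - k - b - c - a, entered as arcs in both directions.
    undirected = [("a", "k"), ("k", "b"), ("b", "c"), ("c", "a")]
    cap = {}
    for u, v in undirected:
        cap[(u, v)] = 1
        cap[(v, u)] = 1
    vertices = ["a", "b", "c", "k"]
    print(vitality(cap, vertices, "k"))                 # vitality in the full graph
    print(vitality(cap, vertices, "k", removed={"c"}))  # vitality after removing c
```

On this 4-cycle the vitality of k is 6 in the full graph and 2 after c is removed, so the vitality effect of {c} is negative: deleting c destroys the only cycle through k. The brute-force enumeration requires on the order of $|V|^2$ maximum flow computations per evaluation and is only practical for small graphs.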
The goal of this research is to identify the subset of vertices S that maximizes the vitality effect, which is equivalent to maximizing the value of $L_k(G \setminus S)$. We formally define the all-pairs vitality maximization problem (VIMAX) as
$$\max_{S \subseteq V \setminus \{k\}} L_k(G \setminus S). \tag{3}$$
From expressions (1) and (2), we see that there is no guarantee that the vitality effect on k of removing any subset S need ever be positive. When subset S is removed from the graph, the overall flow capacity $Z_k(G \setminus S)$ generally decreases, and never increases, because S's contribution to the flow is removed. In order for subset S's removal to have a positive vitality effect on key vertex k, the remaining flow must be rerouted through k in sufficiently large quantities to overcome the overall decrease in flow through the network. However, as we will show in Section 5.3, identification of an optimal or near-optimal removal subset often dramatically increases the vitality of the key vertex.

Mixed Integer Linear Programming Formulation
To formulate VIMAX as an optimization problem, we first formulate a linear program to solve for the vitality of k in any graph G. Then we expand that formulation into a mixed integer programming formulation that seeks the optimal subset S of vertices to remove from the graph to maximize the vitality of k in the resulting graph.

Vitality Max-Flow Subproblems.
Following the approach of [39], we take the dual of problem $Z_k(G \setminus \{k\})$ to convert it into a minimum cut problem having the same optimal objective function value, and embed it in the formulation of $L_k(G)$. Since the dual problem is a minimization problem, the objective function will correctly correspond to the vitality. Letting $V' = V \setminus \{k\}$, and letting $E'$ be the set of edges that remain after removing vertex k and its incident edges, we obtain a linear program for finding $L_k(G)$ (Equation 4). Variables $x_{i,j,s,t}$ and $v_{s,t}$ are the primal variables from the maximum flow formulation of problem $Z_k(G)$: $x_{i,j,s,t}$ represents the optimal s − t flow pushed along edge (i, j), and $v_{s,t}$ represents the optimal s − t flow value. Variables $y_{i,s,t}$ and $y_{i,j,s,t}$ are the dual variables from the minimum cut formulation of problem $Z_k(G \setminus \{k\})$. We can interpret $y_{i,s,t}$ as vertex potentials: for every edge (i, j), if $y_{i,s,t} < y_{j,s,t}$, meaning vertex i has lower potential than vertex j when computing the minimum s − t cut, then edge (i, j) must cross the cut. In such a case, dual variable $y_{i,j,s,t} = 1$, and edge capacity $u_{i,j}$ is counted in the objective function.
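One linear program consistent with the variable descriptions above can be written as follows. It is a sketch assembled from the standard maximum flow primal and minimum cut dual, not a transcription of the paper's own display: the index sets, the normalization $y_{t,s,t} - y_{s,s,t} \ge 1$, and the layout are assumptions.

```latex
\begin{align*}
L_k(G) = \max \quad
  & \sum_{s,t \in V',\, s \neq t} v_{s,t}
    \;-\; \sum_{s,t \in V',\, s \neq t}\ \sum_{(i,j) \in E'} u_{i,j}\, y_{i,j,s,t} \\
\text{s.t.} \quad
  & \sum_{j:(i,j) \in E} x_{i,j,s,t} - \sum_{j:(j,i) \in E} x_{j,i,s,t} =
    \begin{cases} v_{s,t} & i = s \\ -v_{s,t} & i = t \\ 0 & \text{otherwise} \end{cases}
    && \forall i \in V,\ s,t \in V' \\
  & 0 \le x_{i,j,s,t} \le u_{i,j} && \forall (i,j) \in E,\ s,t \in V' \\
  & y_{i,j,s,t} \ge y_{j,s,t} - y_{i,s,t} && \forall (i,j) \in E',\ s,t \in V' \\
  & y_{t,s,t} - y_{s,s,t} \ge 1 && \forall s,t \in V' \\
  & y_{i,j,s,t} \ge 0 && \forall (i,j) \in E',\ s,t \in V'
\end{align*}
```

The two groups of variables do not interact, so the maximization separates: the first sum attains $Z_k(G)$, and the subtracted term attains the minimum cut value $Z_k(G \setminus \{k\})$, giving $L_k(G)$ as in expression (2).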

VIMAX: Choosing an Optimal Removal Subset.
Now that we have expressed the vitality of k in G as a linear program, we can return to VIMAX, which finds a subset S of vertices whose removal maximizes the vitality of k. Given a set S, the linear program in Equation 4 applied to graph $G \setminus S$ solves for $L_k(G \setminus S)$. We must modify the LP above to choose a subset S that maximizes the objective function $L_k(G \setminus S)$.
We can formalize this by creating binary variables $z_i$ for each vertex i such that $z_i = 1$ if vertex i remains in the graph, and $z_i = 0$ if vertex i is removed from the graph (that is, i is included in subset S). We also define variables $w_{i,j}$ for each edge that indicate whether or not edge (i, j) remains in the graph following the removal of S and/or k. We define linking constraints so that whenever both vertices i and j remain in the graph (that is, $z_i = z_j = 1$), then $w_{i,j}$ must equal 1, and whenever either vertex i or j is selected for deletion (that is, $z_i = 0$ or $z_j = 0$ or both) then $w_{i,j}$ must equal 0. (Due to this relationship between $w_{i,j}$ and the binary $z_i$, the $w_{i,j}$ are effectively constrained to be binary variables without explicitly declaring them as such.) To Equation 4, we make the following adjustments to the original primal and dual constraints. We constrain the primal flow variables $x_{i,j,s,t} \le u_{i,j} w_{i,j}$, reflecting whether or not edge (i, j) remains in the graph. We also modify the dual potential constraints so that $y_{i,j,s,t} = 0$ whenever vertices i and j are at the same potential (as before) or edge (i, j) no longer exists in the graph.
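The linking constraints and the adjusted primal and dual constraints described above can be sketched as follows. This is one standard way to linearize the stated logic, offered as an assumption rather than as the paper's exact formulation: in particular, the bounds $0 \le y_{i,s,t} \le 1$ (which make a relaxation constant of 1 sufficient), the cardinality constraint with parameter m, and the condition $z_k = 1$ are assumptions.

```latex
\begin{align*}
& w_{i,j} \le z_i, \qquad w_{i,j} \le z_j, \qquad w_{i,j} \ge z_i + z_j - 1
  && \forall (i,j) \in E \\
& 0 \le x_{i,j,s,t} \le u_{i,j}\, w_{i,j}
  && \forall (i,j) \in E,\ s,t \in V' \\
& y_{i,j,s,t} \ge y_{j,s,t} - y_{i,s,t} - (1 - w_{i,j})
  && \forall (i,j) \in E',\ s,t \in V' \\
& 0 \le y_{i,s,t} \le 1
  && \forall i \in V',\ s,t \in V' \\
& \sum_{i \in V \setminus \{k\}} (1 - z_i) \le m, \qquad z_k = 1, \qquad z_i \in \{0,1\}
  && \forall i \in V
\end{align*}
```

With binary $z_i$, the three linking inequalities force $w_{i,j} = 1$ exactly when both endpoints remain and $w_{i,j} = 0$ otherwise, which is why $w_{i,j}$ need not be declared binary. When $w_{i,j} = 0$, the relaxed potential constraint is satisfied with $y_{i,j,s,t} = 0$, so a deleted edge contributes nothing to the cut.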
Introducing the variables $z_i$ and $w_{i,j}$ and the modifications on our vitality constraints, we can now write the full mixed-integer linear program. Given a graph G = (V, E), a key vertex k, and a maximum size, m, of the removal set, this mixed-integer linear program solves VIMAX.
Extending the approach of [53] to general capacity, directed graphs, we can show that VIMAX is NP-Hard. In the case that m = 1, so that at most one vertex can be removed, we can solve the problem by brute force: for each candidate vertex i ∈ V, we solve the above MIP with $z_i = 0$ and $z_j = 1$ for all j ≠ i, and we keep the best of the resulting solutions.

Theorem 1. The all-pairs vitality maximization problem is NP-Hard.
Proof. The proof can be found in Appendix A.

Simulated Annealing Heuristic
As an alternative to solving VIMAX exactly with a MIP, we develop a simulated annealing heuristic. Each iteration of simulated annealing begins with a candidate removal subset. In the first iteration, this is the empty set, and in subsequent iterations the initial solution is the best solution found at the conclusion of the previous iteration. The objective function value of each solution is computed as the vitality of the key vertex when this subset is removed from the graph. Each call to the algorithm consists of an annealing phase and a local search phase.
During the annealing phase, neighboring solutions of the current solution are obtained by toggling a single vertex's, or a pair of vertices', inclusion or exclusion from the candidate removal subset, subject to the constraint that |S| ≤ m. If the neighboring solution improves the objective function value, it is automatically accepted for consideration. If the neighboring solution has a worse objective function value, it will be accepted to replace the current solution with an acceptance probability governed by a temperature function, T. When the temperature is high (in early iterations), there is a high probability of accepting a neighboring solution even if its objective function value is worse than that of the incumbent solution. This permits wide exploration of the solution space. In later iterations, the temperature function cools, reducing the likelihood that lower objective function value solutions will be considered. This permits exploitation of promising regions of the solution space.
Given temperature T, the probability of accepting a solution having objective function value $e_0$ when the best objective function value found so far is $e_{max} > e_0$ is given by $P = e^{-(e_{max} - e_0)/T}$. The initial temperature, T, is chosen so that the acceptance probability of a solution having at least 90% of the initial objective function value is at least 95%. In subsequent iterations, T is cooled by a multiplicative factor of 0.95.
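The acceptance rule and the choice of initial temperature can be worked through numerically. The sketch below assumes the 90%/95% rule is met with equality, which gives $T_0 = 0.1\, e_{init} / (-\ln 0.95) \approx 1.95\, e_{init}$; the function names and default arguments are illustrative and not taken from the paper's code.

```python
import math


def acceptance_probability(e_new, e_max, temperature):
    """P = exp(-(e_max - e_new) / T) for a worse solution; 1 for a non-worse one."""
    if e_new >= e_max:
        return 1.0
    if temperature <= 0:
        return 0.0
    return math.exp(-(e_max - e_new) / temperature)


def initial_temperature(e_init, fraction=0.90, target_probability=0.95):
    """Smallest T at which a solution worth `fraction` of e_init is accepted
    with probability `target_probability` (assumes e_init > 0)."""
    gap = (1.0 - fraction) * e_init
    return gap / -math.log(target_probability)


def cool(temperature, factor=0.95):
    """Multiplicative cooling step."""
    return temperature * factor


if __name__ == "__main__":
    t0 = initial_temperature(100.0)
    print(t0, acceptance_probability(90.0, 100.0, t0))
    print(cool(t0))
```

Because the acceptance probability increases with T, any initial temperature at or above this value also satisfies the 95% requirement.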
After a set number of annealing iterations, a single iteration of local search is conducted on the best solution found so far by toggling each vertex sequentially to determine if its inclusion or exclusion improves the objective function value. The best solution found is returned.
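The annealing and local search phases described above can be sketched as a single call in Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the objective is passed in as an arbitrary function of the removal subset (in VIMAX it would be the vitality of the key vertex), the temperature is cooled after every annealing step, a small floor keeps the temperature positive, and the parameter names and defaults are hypothetical.

```python
import math
import random


def simulated_annealing(candidates, objective, m, steps=1000,
                        t0=1.0, cooling=0.95, seed=0):
    """One call: an annealing phase followed by one pass of local search.

    candidates: vertices eligible for removal.
    objective:  function mapping a frozenset of removed vertices to a value
                to be maximized.
    m:          maximum size of the removal subset.
    """
    rng = random.Random(seed)
    candidates = list(candidates)
    current = frozenset()
    current_val = objective(current)
    best, best_val = current, current_val
    temperature = t0

    for _ in range(steps):
        # Neighbor: toggle one vertex, or a pair of vertices, in or out of the subset.
        k = min(rng.choice([1, 2]), len(candidates))
        neighbor = current.symmetric_difference(rng.sample(candidates, k))
        if len(neighbor) <= m:
            val = objective(neighbor)
            # Improving moves are always accepted; worse moves are accepted with
            # probability exp(-(best_val - val) / T). The `or` short-circuits, so
            # the exponential is only evaluated for worse moves.
            if val >= current_val or \
                    rng.random() < math.exp(-(best_val - val) / temperature):
                current, current_val = neighbor, val
            if current_val > best_val:
                best, best_val = current, current_val
        temperature = max(temperature * cooling, 1e-12)  # floor avoids division by zero

    # Local search: toggle each vertex in turn and keep any strict improvement.
    for v in candidates:
        trial = best.symmetric_difference([v])
        if len(trial) <= m:
            val = objective(trial)
            if val > best_val:
                best, best_val = trial, val
    return best, best_val


if __name__ == "__main__":
    # Toy objective: reward removing vertices 1 and 3, penalize any other removal.
    target = {1, 3}
    toy = lambda s: len(s & target) - len(s - target)
    print(simulated_annealing(range(6), toy, m=3))
```

Caching objective values for subsets that have already been evaluated is a natural refinement, since each evaluation of vitality involves an all-pairs maximum flow computation.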
We use a Gomory-Hu tree implementation of the all-pairs maximum flow problem to rapidly calculate the vitality of the key vertex on each modified graph encountered by the heuristic [31,34]. For mathematical reasons that are discussed in Section 6, we can exclude leaves from consideration in any removal subset. These two enhancements permit the simulated annealing heuristic to run very fast on even large instances, as we discuss in Section 5.3.

Computational Analysis
We now present performance comparisons of the MIP formulation and the simulated annealing heuristic on a variety of datasets. Following the approach of [43], we generate grid networks, which are planar. We also test the performance of the methods on random networks [40] and on a real drug trafficking network [52]. We first describe these data sets and the computational platform used, and then we present the results. Code and data files are available at our GitHub repository: https://github.com/alicepaul/network_interdiction.

Grid Networks.
We generate grid networks in a similar fashion as [43]. We generate square M × M grids with M varying from five to eight. Such graphs have an edge density of $\frac{4}{M(M+1)}$, which ranges from 13.3% for M = 5 to 5.6% for M = 8. On each grid, we generate edge capacities independently and uniformly at random from the integers from 1 to M. For each case, we likewise consider two scenarios, testing a maximum removal subset size of m = 1 or m = M (that is, $\sqrt{|V|}$). For each grid size and removal subset size combination, we generate three trial graphs. For each trial, a key vertex is selected uniformly at random over the vertices. For each graph size and removal subset size combination, we generate three trial graphs. For each trial, the key vertex is selected to have the highest betweenness centrality.
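The edge density formula follows from counting: an M × M grid has $M^2$ vertices and $2M(M-1)$ edges, so the density is $2M(M-1) / \binom{M^2}{2} = 4 / (M(M+1))$. The short Python sketch below generates such a grid with random integer capacities and checks the formula; the function names and the use of a seeded generator are illustrative assumptions, not the generator used for the experiments.

```python
import random
from fractions import Fraction


def grid_edges(M, seed=0):
    """Edges of an M x M grid with capacities drawn uniformly from {1, ..., M}."""
    rng = random.Random(seed)
    edges = {}
    for r in range(M):
        for c in range(M):
            if c + 1 < M:  # horizontal neighbor
                edges[((r, c), (r, c + 1))] = rng.randint(1, M)
            if r + 1 < M:  # vertical neighbor
                edges[((r, c), (r + 1, c))] = rng.randint(1, M)
    return edges


def edge_density(num_vertices, num_edges):
    """Fraction of the possible undirected edges that are present."""
    return Fraction(num_edges, num_vertices * (num_vertices - 1) // 2)


if __name__ == "__main__":
    for M in range(5, 9):
        density = edge_density(M * M, len(grid_edges(M)))
        print(M, density, f"{float(density):.1%}")
```

Running the loop reproduces the endpoints quoted above, 13.3% for M = 5 and 5.6% for M = 8.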

Drug Trafficking Network.
Lastly, we test our models on a real-world covert cocaine trafficking group, prosecuted in New York City in 1996 [52]. This network consists of 28 people between whom 151 phone conversations were intercepted over wiretap over a period of two months. An edge exists between persons i and j if at least one conversation between them appears in the data set. There are 40 edges in this graph, corresponding to an edge density of 10.6%. We can consider a unit capacity version of the network, as well as a general capacity version in which the capacity on edge i − j is equal to the number of conversations between them appearing in the data. The weighted network is shown in Figure 1, where line width is proportional to the number of wiretapped calls occurring between two operatives. According to Natarajan et al., some individuals in the network are known to have the roles described in Table 1. We test a maximum removal subset size of m = 1 or m = 5 ≈ $\sqrt{|V|}$. Because the Colombian bosses (vertices 1, 2, and 3) are high-level leaders important to the functioning of the organization, we treat these vertices as the key vertices on which we attempt to maximize vitality.

Computational Framework
The performance of the MIP and the simulated annealing heuristic was tested on a computer with a 3 GHz 6-Core Intel Core i5 processor and 16 GB of memory. The Single-VIMAX and VIMAX MIP instances were run in Python 3.9.6 calling the CPLEX solver through the CPLEX Python API, and were each limited to two hours of computation time. The simulated annealing heuristic was also coded in Python and limited to 10,000 iterations on each trial instance. Initial results were collected using the Extreme Science and Engineering Discovery Environment (XSEDE) supercomputers [68] and up to five hours of computation time but did not show significantly different results. In addition to the general VIMAX MIP, a single vertex removal MIP (Single VIMAX) was also tested. Single vertex removal simulated annealing results are not reported, as they are effectively equivalent to brute force search.

Results
Table 2 presents the results of all completed trials. The first five columns explain the graph type, number of vertices (|V|), number of edges (|E|) and for the general VIMAX problem allowing multiple removals, the maximum allowed size, m, of the removal subset. Column six gives the initial vitality of the key vertex in the original graph with no vertices removed. Columns seven through ten provide results on the performance of the single vertex removal MIP (Single VIMAX); columns eleven through fifteen provide results from the multi-removal MIP (VIMAX); and columns sixteen through nineteen provide results from the multi-removal simulated annealing heuristic. (There is no need to use simulated annealing for Single VIMAX because it can be solved by sequentially testing the removal of each vertex.) For the three methods, the best vitality found within the time or iteration limit, the MIP gap if available, the percentage increase of the best vitality found by the method over the original vitality of the key vertex in the full graph, and the running time in seconds are given. For the multi-removal methods, the size of the best found removal subset (|S|) is also given. MIP instances that terminated due to time limit have Time reported as −.
[Table 2: results for the MIP and simulated annealing (multi-removal VIMAX). For the grid and random networks, each row represents a randomly generated instance with randomly selected key vertex. MIP trials that reached the two-hour time limit show 'Time' reported as '-'.]
First we note that Table 2 provides a proof-of-concept demonstrating that it is possible to increase (sometimes dramatically) the vitality of the key vertex through subset removal. Removing a single vertex increased the vitality by 42%-200% in all grid network instances for which the MIP solved to optimality within the time limit, and by up to 82% in the random graph instances; single vertex removal was not able to increase the vitality of the key vertex in the drug network. When allowing multiple removals, simulated annealing was able to identify removal subsets that increased the vitality on the key vertex by as much as 1,373%.
Unsurprisingly, the full VIMAX MIP allowing multiple removals is substantially harder to solve than the single removal MIP. On grid and random networks, the MIP failed to terminate within the two-hour time limit on all instances with at least n = 36 nodes. On the n = 36 random and the 7 × 7 and 8 × 8 grid network instances, the single removal MIP also did not terminate within the time limit, but an improving solution was returned in more cases. The large MIP gaps on the MIP allowing multiple removals indicate a failure to find improving integer solutions.
For multiple vertex removal, the simulated annealing heuristic yielded excellent solutions in a fraction of the time required by even the single removal MIP. On the large instances for which the multiple removal MIP reached the time limit, the simulated annealing heuristic found substantially better solutions than the MIP incumbents. For those instances in which the multiple removal MIP solved to optimality, the solutions found by simulated annealing are often optimal and always near-optimal.
The effectiveness of vertex removal to maximize vitality appears to depend on the network structure and choice of key vertices. While the drug network has approximately the same number of vertices and edges as the 25-node instances of the random and grid networks, the key vertices (corresponding to vertices Boss 1, Boss 2, and Boss 3 in Figure 1) chosen in these trials are less amenable to vitality maximization. The drug network has a large number of leaves, whereas the grid networks do not. As we will see in Section 6, vertices, such as leaves, that do not have at least two vertex-disjoint paths to the key vertex will never appear in an optimal removal subset.
Lastly, in these trials, we chose to restrict the removal subset size to at most m vertices. The reason to restrict the removal subset size is to reduce the solution space, and thus the complexity, of the problem. This decision is justifiable because we know removing too many vertices will cause overall flow in the network to drop such that the vitality on the key vertex cannot increase. Thus, an important question is what value of m effectively reduces the solution space without compromising the quality of solutions found. We do not have a definitive answer to this question. However, we see that in many of the trials, the best removal subset identified by any method has a size strictly less than m ≈ $\sqrt{|V|}$, suggesting that this choice of m is reasonable for the sizes and types of graphs considered here.

Leveraging Structural Properties of Vitality
Thus far, we have established that subset removal can dramatically increase the vitality of a key vertex. However, solving this problem exactly as a MIP is computationally intractable for even modestly sized graphs. Fortunately, simulated annealing is an appealing alternative that yields very good solutions in dramatically less time than the MIP. In this section, we explore mathematical properties that characterize vertices that can be ignored by subset removal optimization approaches. We demonstrate how these properties can be leveraged to simplify the graph on which VIMAX is run.

Identifying Vitality-Reducing Vertices
To reduce the complexity of the optimization formulation, we turn to identifying conditions that cause a vertex to have a vitality-reducing effect on the key vertex. This allows us to ignore such vertices in any candidate removal subset and reduce the solution space of the VIMAX problem.
Our first observation is that the presence of a cycle is necessary for the removal of a vertex to increase the vitality of a key vertex. The vitality of a leaf is always equal to 0, so the removal of any subset that results in k becoming a leaf also cannot increase the vitality of k. As a corollary, if k has neighbor set N(k) and more than |N(k)| − 2 of k's neighbors are removed, the vitality effect on k will be nonpositive.
We can generalize this further. When there are not at least two vertex-disjoint paths from i to k, any removal subset including i will have a vitality effect on k no greater than the same subset excluding i, as stated by the following theorem (see footnote 1).
Theorem 2. Let G be a graph with key vertex k, and let i be a vertex such that there do not exist at least two vertex-disjoint paths starting at i and ending at k. Let S be any vertex subset containing i, and let $T = S \setminus \{i\}$. Then, $L_k(G \setminus S) \le L_k(G \setminus T)$. Therefore, T will have at least as large a vitality effect on k as S.
Proof. The proof can be found in Appendix B.
Put simply, the absence of two vertex-disjoint paths between i and k means that i and k do not lie on a common cycle. Therefore when i is removed, any s − t paths that previously passed through i cannot be rerouted through any alternate path passing through k.
Note that identifying vertices that do not have at least two vertex-disjoint paths to k is computationally straightforward. We can solve an all u − k pairs maximum flow problem on a related graph Ĝ in which every vertex u is replaced with a pair of vertices connected by a unit capacity edge: (u, u'). For every directed edge i − j in the original graph, we include directed edge (i', j) in the modified graph. Through the use of a Gomory-Hu tree, we can solve this in $O(|V|^3 |E|)$ time [31,34]. Any vertex u whose copy u' in Ĝ has a maximum u' − k flow of one in Ĝ does not have at least two vertex-disjoint paths to k in the original graph and can be ignored by any removal subset. We call the set of such vertices Q. Every vertex in Q should be maintained in the graph and not be considered for removal.
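The identification of Q can be sketched directly with the vertex-splitting construction, without a Gomory-Hu tree. The Python sketch below is an illustration under stated assumptions: it runs one augmenting-path maximum flow per vertex, stops after two augmentations (since only "fewer than two" matters), gives every arc of the split graph unit capacity, and also places in Q any vertex that cannot reach k at all. The function names and the (vertex, "in") / (vertex, "out") labels, which play the roles of u and u', are hypothetical.

```python
from collections import deque


def _unit_max_flow(cap, s, t, limit):
    """Augmenting-path maximum flow on unit capacities, stopped at `limit`."""
    residual = dict(cap)
    adj = {}
    for (u, v) in cap:
        residual.setdefault((v, u), 0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    flow = 0
    while flow < limit:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        v = t
        while parent[v] is not None:  # push one unit along the path found
            u = parent[v]
            residual[(u, v)] -= 1
            residual[(v, u)] += 1
            v = u
        flow += 1
    return flow


def vertices_without_two_disjoint_paths(edges, k, directed=False):
    """Return Q: vertices with fewer than two vertex-disjoint paths to k."""
    arcs = set(edges)
    if not directed:
        arcs |= {(v, u) for (u, v) in edges}
    vertices = {x for arc in arcs for x in arc}
    # Each vertex becomes an in-copy and an out-copy joined by a unit arc,
    # so at most one path can pass through any intermediate vertex.
    cap = {((v, "in"), (v, "out")): 1 for v in vertices}
    for (i, j) in arcs:
        cap[((i, "out"), (j, "in"))] = 1
    return {u for u in vertices
            if u != k and _unit_max_flow(cap, (u, "out"), (k, "in"), 2) < 2}


if __name__ == "__main__":
    # Triangle k - a - b - k with a pendant path a - c - d.
    edges = [("k", "a"), ("a", "b"), ("b", "k"), ("a", "c"), ("c", "d")]
    print(vertices_without_two_disjoint_paths(edges, "k"))
```

In the example, a and b lie on a cycle with k and are kept as removal candidates, while c and d hang off the cycle and are placed in Q. This per-vertex approach needs at most two breadth-first searches per vertex, which is sufficient for the small graphs considered here.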
These properties show that when seeking a vitality-maximizing subset for removal, we can ignore all subsets that include:
• vertices in Q (i.e., they do not share a cycle with k);
• more than |N(k)| − 2 of the neighbors of k.
After performing preprocessing on the graph to identify N(k) and Q, we can add the following constraints to the MIP formulation:
$$z_i = 1 \quad \forall i \in Q, \qquad \sum_{i \in N(k)} z_i \ge 2.$$
Footnote 1: A more general cut theorem holds for the specific case of an undirected graph in which all edges in the graph have unit capacity [47]. In such a graph, the value of the maximum s − t flow equals the number of edge-disjoint paths between s and t in the graph. In this case, the relationship between the size of the cut between the key vertex k and a candidate for removal, i, and the connectivity between vertices along the boundaries of that cut conveys information about the vitality effect on k of removing i. The reader is also referred to [54] for an overview of how this theorem might be implemented in practice for unit capacity, undirected graphs.
Although the above constraints provide a tighter formulation for VIMAX, the anticipated benefits of these constraints are likely to be modest. Table 3 shows |Q| (the number of vertices that do not have at least two vertex-disjoint paths to k) for each graph used for testing in Section 5.
Unsurprisingly given their structure, all the vertices in the grid networks have at least two vertex-disjoint paths to k; thus none of these vertices can be eliminated from consideration, and they are omitted from Table 3. By contrast, nearly half of the vertices in the sparse drug trafficking network do not have at least two vertex-disjoint paths to the key vertex; this is a significant reduction in the number of candidate vertices for removal, but VIMAX was readily tractable on this already-small network. Thus, this criterion alone is unlikely to render previously intractable MIP instances tractable.

Simplifying the Graph
Because VIMAX grows rapidly in the number of vertices, we can improve the computational tractability of VIMAX by simplifying our original graph into a vitality-preserving graph having fewer vertices. We rely heavily on Theorem 2 to do this.
Suppose that a vertex v disconnects the graph into two components T_1 and T_2 such that k ∈ T_1. Then, by Theorem 2, an optimal solution will not contain any vertex in T_2. Further, the maximum flows between pairs of vertices within T_2 do not contribute to the vitality effect on k. Therefore, all that is needed to preserve the vitality effect on k in the simplified graph is to preserve information about the maximum flow between all pairs of vertices s, t such that s ∈ T_1 and t ∈ T_2.
For all vertices t ∈ T_2 we create a single edge between t and v with capacity equal to the maximum flow between t and v. This replaces all previous edges between vertices in T_2. This affects the value of the all-pairs maximum flow problem but does not affect the vitality effect on k for any subset S ⊂ T_1. Further, if any subset of vertices T ⊆ T_2 all have the same new capacity value, we combine T into a single vertex with weight |T|. When calculating the maximum flow between any pair of vertices s and t in the graph, we multiply the flow by the product of the weights of the vertices to account for this simplification.
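The collapsing step just described can be sketched as follows: for each vertex t in T_2, compute the maximum flow from t to the cut vertex v, then merge vertices with equal flow values into one weighted leaf. This is a minimal illustration under stated assumptions: the graph is undirected and stored as a dict of dicts, the maximum flow routine is a plain Edmonds-Karp implementation, and the names `max_flow` and `collapse_component` as well as the example component are hypothetical.

```python
from collections import defaultdict, deque

def max_flow(capacity, s, t):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity map."""
    res = defaultdict(dict)
    for a, nbrs in capacity.items():
        for b, c in nbrs.items():
            res[a][b] = res[a].get(b, 0) + c
            res[b].setdefault(a, 0)  # make sure the reverse arc exists
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            a = queue.popleft()
            for b, c in res[a].items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if t not in parent:
            return total
        path, b = [], t
        while parent[b] is not None:
            path.append((parent[b], b))
            b = parent[b]
        push = min(res[a][b] for a, b in path)  # bottleneck capacity
        for a, b in path:
            res[a][b] -= push
            res[b][a] += push
        total += push

def collapse_component(capacity, v, T2):
    """Group the vertices of T2 by their maximum flow to the cut vertex v.

    Returns {flow_value: weight}: one weighted leaf attached to v per distinct
    flow value, with weight equal to the number of vertices merged into it.
    """
    leaves = defaultdict(int)
    for t in T2:
        leaves[max_flow(capacity, t, v)] += 1
    return dict(leaves)

# Hypothetical component T2 = {a, b, c} hanging off the cut vertex v.
undirected = {("v", "a"): 2, ("a", "b"): 1, ("a", "c"): 1}
capacity = defaultdict(dict)
for (i, j), c in undirected.items():
    capacity[i][j] = c
    capacity[j][i] = c

leaves = collapse_component(capacity, "v", {"a", "b", "c"})
print(leaves)
```

Here vertex a keeps its own leaf (flow 2, weight 1), while b and c both have flow 1 to v and are merged into a single leaf of weight 2. Any flow computed to or from that merged leaf would then be multiplied by its weight.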
Using the process described in the previous section, we can identify the subset of vertices Q ⊆ V \ {k} that do not have at least two vertex-disjoint paths to k. Given a vertex i ∈ Q, we find a path from i to k and find the first vertex v along i's path to k such that v has at least two vertex-disjoint paths to k. Removing the vertex v disconnects the graph. Therefore, we follow the simplification process above and mark all vertices in the corresponding T_2, including i, as processed. We then repeatedly identify any unprocessed vertex in Q to further simplify the graph. After all vertices in Q have been processed, all these vertices will be weighted leaves in the new simplified graph, where the weight depends on how many vertices have been combined. All other vertices will retain a weight of one. Figure 2 shows an example of this simplification process in which there are two components that have been simplified. Note that vertices 4, 6, and 7 have been combined together into a vertex with weight three. Further, vertices 5 and 8 have been combined together into a vertex with weight two.
As argued above, the maximum flow between any pair of vertices that were in the same simplified component never contributes to the vitality effect on k. Therefore, we ignore these pairs in the optimization problem by removing the appropriate variables and constraints. We therefore just need to check that we have preserved the maximum flow between all pairs of vertices that were not in the same component. This holds because the flows are multiplied by the vertex weights. For example, in Figure 2, we multiply the maximum flow between vertex 4 and vertex 1 by the weight of vertex 4, accounting for all the paths between vertices 4, 6, and 7 and vertex 1. Thus, an optimal subset found by our optimization problem on the simplified graph is also optimal in the original graph. The number of pairs of vertices decreases from 45 to 19, since the number of vertices excluding k decreases from 10 to 7 and we can ignore the flow between vertices 9 and 10 and between vertices 4 and 8 in the simplified graph.
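The pair counts quoted for Figure 2 can be checked directly, assuming pairs are counted as unordered pairs of non-key vertices:

```latex
\[
\binom{10}{2} = 45, \qquad \binom{7}{2} - 2 = 21 - 2 = 19,
\]
```

where the two pairs subtracted are {9, 10} and {4, 8}, whose flows are ignored in the simplified graph.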
Table 3 shows the number of vertices (|V̂|) and edges (|Ê|) in each test graph after applying the graph simplification algorithm. The only graph types experiencing an appreciable reduction in size after simplification are the drug trafficking network and the smaller random graphs. We posit that highly connected graphs such as the grid networks are less amenable to the simplification method than sparser networks. In Table 3 we also include the percentage decrease in time and the percentage increase in the best objective function value found via graph simplification, relative to the Multi-Removal MIP results reported in Table 2. The time includes the time to perform the graph simplification, which is very efficient. For graphs with a significant reduction in the number of nodes and edges, we see a corresponding decrease in the runtime for the MIP. For the larger networks that did not terminate within the time limit, the best vitality found improves in only one instance.

Future Work and Conclusions
In this paper we have presented the VIMAX optimization problem, which identifies a subset of vertices whose removal maximizes the volume of flow passing through a key vertex in the network. VIMAX is NP-Hard. We have used the dualize-and-combine method of [73] to formulate VIMAX as a mixed integer linear program, and we compared its performance to that of a simulated annealing heuristic. We also demonstrated how identifying vertices not having at least two vertex-disjoint paths to the key vertex can be used to simplify the graph and reduce computation time on certain graph types.
Additionally, this paper opens up a rich area of future research.
• Computational improvements - Graph Simplification: Additional properties of vitality-reducing vertices, such as those outlined in [54] for the unit capacity case, could be derived for the general capacity case and used to preprocess or simplify the graph to reduce the solution space of VIMAX. In particular, it would be beneficial to identify small cuts in the graph such that all vertices on the opposite side of the cut from k can be excluded from consideration.
• Computational improvements - Benders Decomposition: Because the number of constraints in the VIMAX MIP grows on the order of O(|E||V|^2), we can use the Benders decomposition algorithm to solve our problem for large graphs. The decomposition is presented in Appendix C, but preliminary testing did not improve the MIP performance. The survey of Smith and Song illustrates a variety of approaches that could be applied to improve the performance of the Benders decomposition of VIMAX [64].
• Optimization: In this paper, we have focused on identifying vertices having high vitality effect on the key vertex without considering the cost or difficulty of removing them from the graph. An enhancement to VIMAX could include a budget constraint restricting the choice of subsets based on the difficulty of their removal.
• Game theory and dynamic response: The disruption technique described in this paper focuses on the network at one snapshot in time and assumes that any subset removal occurs simultaneously and that the network remains static. Extensions to VIMAX might explore cascading effects of sequential vertex removal, similar to the literature on multi-period interdiction [23], cascading failures [17,51,76], agent-based models for counter-interdiction responses [46], and game theoretic responses of the network to disruptions, such as adding new edges.
• Imperfect information: The VIMAX formulation presented here assumes complete and perfect knowledge of the network's structure. However, the complete structure of a covert network is typically not known to enforcement agencies, and can evolve rapidly [41]. Future work could address applying VIMAX to networks with uncertain or unknown structure.
• Robust network design: We can use the results of this research to design networks, such as telecommunication and other infrastructure networks, to be robust to vitality-diverting attacks [18].
• Multiple key vertices: In the case that we want to maximize the flow through a subset S of key vertices, we can extend the definition of vitality maximization to maximize the all-pairs vitality of S. The MIP and simulated annealing algorithm can be updated accordingly.
VIMAX has broad applicability to problems including disrupting organized crime rings, such as those used in terrorism, drug smuggling and human trafficking; disrupting telecommunications networks and power networks; as well as robust network design.

A Proof of Theorem 1
In this section we prove Theorem 1, which states that the all-pairs vitality maximization problem is NP-Hard. Our proof extends the proof of [53] for the special case of undirected, unit-capacity edges. We first restate VIMAX as a decision problem: For a fixed value C, does there exist a subset S such that L_k(G \ S) ≥ C?
Theorem 3. The all-pairs vitality maximization problem is NP-Hard.
Proof. We use a reduction from the 3-Satisfiability problem (3SAT). Given an instance of 3SAT with n boolean variables x_1, x_2, . . ., x_n and m clauses in 3-conjunctive normal form c_1, c_2, . . ., c_m, the 3SAT decision problem is whether there is an assignment of variables to true/false values such that all clauses are satisfied. As an example with three variables, any assignment with x_3 set to false would satisfy the two clauses (x_1 or x_2 or x_3) and (x_1 or x_2 or x_3).
Given an instance of 3SAT, we construct a corresponding instance of VIMAX. We start building our directed graph G with three vertices d_1, k (the key vertex), and d_2, with an edge from k to d_2 with capacity n + m. Further, for each variable x_i we create four vertices {a_i, b_i, t_i, f_i} and add edges (d_1, a_i), (a_i, t_i), and (a_i, f_i), each with capacity two, and edges (t_i, b_i), (f_i, b_i), (t_i, d_2), (f_i, d_2), and (b_i, k), each with capacity one.
Then, for each clause c_j, we create two vertices u_j and v_j and add unit capacity edges (d_1, u_j) and (v_j, k). To encode this clause, for each positive literal x_i in clause c_j we add unit edges (u_j, t_i) and (t_i, v_j); for each negated literal ¬x_i in clause c_j we add unit edges (u_j, f_i) and (f_i, v_j). Last, we create M = 8(m + n + nm) leaves with unit edges to d_1 and M leaves with unit edges from d_2, and set C = (M + 1)^2 (n + m). An example graph of a single-clause, three-variable 3SAT problem having clause (x_1 or x_2 or x_3) is given in Figure 3.
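To make the construction concrete, the following sketch builds the core gadget for a small instance and evaluates the d_1-d_2 vitality of k for two removal subsets. It is an illustration under stated assumptions rather than the authors' code: the M leaves are omitted, literals are encoded as signed integers (+i for x_i, −i for its negation), the unit-capacity edges (t_i, b_i), (f_i, b_i), (t_i, d_2), and (f_i, d_2) are those implied by the flow paths used in the proof, and all function names and the example clause are hypothetical.

```python
from collections import defaultdict, deque

def max_flow(arcs, s, t, removed=frozenset()):
    """Edmonds-Karp maximum flow on directed arcs {(a, b): capacity}."""
    res = defaultdict(dict)
    for (a, b), c in arcs.items():
        if a in removed or b in removed:
            continue  # deleting a vertex deletes its incident arcs
        res[a][b] = res[a].get(b, 0) + c
        res[b].setdefault(a, 0)
    total = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            a = queue.popleft()
            for b, c in res[a].items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if t not in parent:
            return total
        path, b = [], t
        while parent[b] is not None:
            path.append((parent[b], b))
            b = parent[b]
        push = min(res[a][b] for a, b in path)
        for a, b in path:
            res[a][b] -= push
            res[b][a] += push
        total += push

def build_gadget(n, clauses):
    """Core reduction graph (without the M leaves) for n variables."""
    arcs = {("k", "d2"): n + len(clauses)}
    for i in range(1, n + 1):
        a, b, t, f = f"a{i}", f"b{i}", f"t{i}", f"f{i}"
        arcs[("d1", a)] = arcs[(a, t)] = arcs[(a, f)] = 2
        # Assumed unit edges, inferred from the flow paths in the proof.
        for arc in [(t, b), (f, b), (t, "d2"), (f, "d2"), (b, "k")]:
            arcs[arc] = 1
    for j, clause in enumerate(clauses, start=1):
        u, v = f"u{j}", f"v{j}"
        arcs[("d1", u)] = arcs[(v, "k")] = 1
        for lit in clause:
            x = f"t{lit}" if lit > 0 else f"f{-lit}"
            arcs[(u, x)] = arcs[(x, v)] = 1
    return arcs

def pair_vitality(arcs, S):
    """d1-d2 vitality of k after removing S: flow with k minus flow without k."""
    S = frozenset(S)
    return max_flow(arcs, "d1", "d2", S) - max_flow(arcs, "d1", "d2", S | {"k"})

def removal_set(assignment):
    """Remove t_i when x_i is false and f_i when x_i is true."""
    return {f"f{i}" if value else f"t{i}" for i, value in assignment.items()}

# Hypothetical instance: n = 3 variables, one clause (x1 or x2 or not x3).
arcs = build_gadget(3, [(1, 2, -3)])
satisfying = removal_set({1: True, 2: False, 3: False})
violating = removal_set({1: False, 2: False, 3: True})
print(pair_vitality(arcs, satisfying), pair_vitality(arcs, violating))
```

For this instance the satisfying assignment yields a d_1-d_2 vitality of n + m = 4, while the violating assignment yields only n = 3, which mirrors the threshold used in the decision problem.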
Note that the leaves adjacent to d_1 and d_2 essentially increase the weight of the flow between d_1 and d_2. In particular, if we define and let V be all vertices excluding these leaves as well as d_1, d_2, and k, then we can rewrite the all-pairs vitality as The last line holds since paths from d_1 to s ∈ V \ S or between s and t ∈ V \ S cannot travel through k. Further, we can bound the second half of the sum above by bounding the vitality by the capacity out of the starting node for each maximum flow.
(M + 1) This shows that the maximum flow from pairs that are not {d_1, d_2} contributes a trivial amount to the overall vitality. Therefore, finding a subset S such that L_k(G \ S) ≥ C = (M + 1)^2 (n + m) is equivalent to finding a subset S such that L_k^{d_1,d_2}(G \ S) ≥ n + m.
We now show that given an assignment of variables to boolean values that satisfies all clauses, we can find an equivalent subset S such that L_k^{d_1,d_2}(G \ S) ≥ n + m. Let S contain t_i for all i such that x_i is set to false and f_i for all i such that x_i is set to true.
Consider the maximum flow between d_1 and d_2 in G \ S. For each variable x_i such that t_i ∈ S, we send two units of flow: one along the path (d_1-a_i-f_i-b_i-k-d_2) and one along the path (d_1-a_i-f_i-d_2). If, instead, f_i ∈ S, then the paths change to use t_i instead of f_i. Further, for each clause j, since this clause is satisfied, there exists at least one vertex t_i or f_i adjacent to u_j that is not in S. Without loss of generality, let this vertex be t_i. We send one unit of flow along the path (d_1-u_j-t_i-v_j-k-d_2). The overall flow has value 2n + m. Since all edges adjacent to d_1 are saturated, this is a maximum flow.
Now consider the maximum flow between d_1 and d_2 in G \ (S ∪ {k}). For each variable x_i such that t_i ∈ S, we send one unit of flow along the path (d_1-a_i-f_i-d_2). If, instead, f_i ∈ S, then the path changes to use t_i instead of f_i. The overall flow has value n. Since all edges adjacent to d_2 are saturated in G \ (S ∪ {k}), this is a maximum flow. This shows that L_k^{d_1,d_2}(G \ S) = (2n + m) − n = n + m.
We must now show the reverse direction to complete the proof. Suppose that we have found a subset S such that L_k(G \ S) ≥ C. Then, given that all pairs except d_1 and d_2 contribute at most (1/2)(M + 1)^2 to the vitality, it must be the case that L_k^{d_1,d_2}(G \ S) ≥ n + m. We decompose the flow into unit flow paths from d_1 to d_2. Let f(s, t) be the number of these paths that go from s to t in the maximum flow from d_1 to d_2 in G \ S, and let f'(s, t) be the number of paths from s to t in the maximum flow between d_1 and d_2 in G \ (S ∪ {k}). Then,

\[
L_k^{d_1,d_2}(G \setminus S) = \sum_{i=1}^{n} \left[ f(a_i, d_2) - f'(a_i, d_2) \right] + \sum_{j=1}^{m} \left[ f(u_j, d_2) - f'(u_j, d_2) \right]. \tag{7}
\]

For the first term in Equation 7, we can verify that [f(a_i, d_2) − f'(a_i, d_2)] ≤ 1 if exactly one of t_i and f_i is in S and {a_i, b_i} ∩ S = ∅, and at most zero otherwise. In particular, if t_i and f_i are both in S then f(a_i, d_2) = f'(a_i, d_2) = 0. If both t_i and f_i are not in S, then at most two units of flow can go from a_i to d_2 in both graphs, and the paths through both t_i and f_i can avoid using vertex k. Only when exactly one of t_i or f_i has been chosen will at least one path be forced to go through vertex k. For the second term, each term is also at most one given the unit capacity of the edge from d_1 into u_j. Therefore, L_k^{d_1,d_2}(G \ S) ≤ n + m. Since L_k^{d_1,d_2}(G \ S) ≥ n + m, this implies equality throughout and that |{t_i, f_i} ∩ S| = 1 for all i = 1, 2, . . ., n. For each variable for which t_i is in S, we set that variable to false. Otherwise, we set the variable to true. Last, in order for every clause to contribute at least one to the overall vitality, u_j must be adjacent to some t_i or f_i not in S. Given the design of our network, this indicates that the assignment satisfies that clause.
Overall, this shows that every 3SAT decision problem can be reduced to a VIMAX decision problem and that VIMAX is NP-Hard.

B Proof of Theorem 2
Here we prove Theorem 2, which states that the removal of any vertex not having at least two vertex-disjoint paths to the key vertex k can never increase the vitality of k.
Theorem 4. Let G be a graph with key vertex k, and let i be a vertex such that there do not exist at least two vertex-disjoint paths starting at i and ending at k. Let S be any vertex subset containing i, and let T = S \ {i}. Then, L_k(G \ S) ≤ L_k(G \ T). Therefore, T will have at least as large a vitality effect on k as S.
Proof. Let G be a graph with key vertex k and let i be a vertex such that there do not exist at least two vertex-disjoint paths starting at i and ending at k. Then there exists a cut vertex v whose removal would disconnect the graph into at least two components. We consider two cases, v ≠ i and v = i.
When v ≠ i, v separates a component G_k that includes k from a component G_i that includes i. Consider the maximum flow between an s − t pair (s, t ≠ k).
• If both s and t are in G_k, their contribution to the vitality of k is unaffected by the removal of i, by the same logic that applies when both s and t lie in G_i: any optimal flow path that passes through vertex i must go into and out of vertex v, creating a flow cycle, and thus is equivalent to a flow path that avoids G_i entirely.
• If, without loss of generality, s ∈ G_i and t ∈ G_k, then the removal of vertex i may reduce the flow along s − . . . − v, but the remainder of the path v − . . . − t is unaffected. Thus no more flow can be routed through k when i is removed than when i is present.
When v = i, i separates a component G_k that includes k from the remainder of the graph, G_i. In this case, the removal of i will eliminate all s − t flow between s ∈ G_i and t ∈ G_k, regardless of whether or not k is in the graph. Thus, no more flow can be routed through k when i is removed from the graph than when i is present.

C Benders Decomposition
Because the number of constraints in the VIMAX MIP grows on the order of O(|E||V|^2), we can use the Benders decomposition algorithm to solve our problem for large graphs. In our case, the integer master problem chooses the subset of vertices to remove; this problem has relatively few variables and constraints. Given a fixed removal subset, we are left with a large linear network flow subproblem that is guaranteed to have an integer optimal solution.
We see in Equation 5 constraints that couple w_{i,j}, x_{i,j,s,t}, y_{i,j,s,t} and y_{i,s,t}. We let the z_i's and w_{i,j}'s be the variables in our master problem. Our initial master problem contains only the constraints related to the w_{i,j}'s and z_i's, representing the choice of subset to remove. Thus the master problem is

\[
\begin{aligned}
\text{Maximize} \quad & L_k \\
\text{subject to} \quad & \sum_{i \in V} z_i \ge n - m \\
& z_k = 1 \\
& w_{i,j} \le z_i, && \forall (i,j) \in E \\
& w_{i,j} \le z_j, && \forall (i,j) \in E \\
& w_{i,j} \ge z_i + z_j - 1, && \forall (i,j) \in E \\
& z_i \ \text{binary}, && \forall i \in V \\
& w_{i,j} \ge 0, && \forall (i,j) \in E \\
& L_k \ge 0.
\end{aligned}
\tag{9}
\]

Here, L_k represents the optimal vitality of k. It currently has no restrictions on its value. Solving Equation 9 determines a feasible z and w, which we can use to compute the vitality of k in the dual of the linear subproblem. When taking the dual we let x_{i,s,t} be the dual variables corresponding to the flow balance constraints of the x_{i,j,s,t}'s and x_{i,j,s,t} be the dual variables corresponding to the capacity constraints on the x_{i,j,s,t}'s. Similarly, we let y_{i,j,s,t} be the dual variables corresponding to the edge constraints on y_{i,j,s,t}, and we let y_{s,t} be the dual variables corresponding to the constraints on the relationship between y_{s,s,t} and y_{t,s,t}
x_{i,j,s,t} ≥ 0, ∀(i, j) ∈ E, ∀s, t ∈ V; x_{i,s,t} unrestricted, ∀i, s, t ∈ V; y_{s,t} ≤ 0, ∀s, t ∈ V; y_{i,j,s,t} ≤ 0, ∀(i, j) ∈ E, ∀s, t ∈ V. (10)
At the beginning of each iteration c, the master is solved and we obtain the optimal values for z_i and w_{i,j}. Initially, we start with an infinite objective function and all z_i = 1. The dual of the linear subproblem, shown in Equation 10, is then solved with the optimal w_{i,j}'s substituted in.
If the subproblem is unbounded, simplex returns the extreme ray, defining x_c and y_c, and we add the constraint (1 − w_{i,j}) y_{i,j,s,t,c} ≥ L_k.
Otherwise, the algorithm terminates. Preliminary testing of the Benders decomposition of VIMAX reveals the same problem that plagues large instances of the MIP formulation: the objective function values of the linear subproblems encountered are quite large compared to the objective function value of any feasible integer solution. Thus, the cuts added do not adequately constrain the master problem. Future work is needed to develop improved Benders decompositions.

Figure 1: Cocaine trafficking network of Natarajan et al. Line width is proportional to the number of wiretapped calls made between pairs of operatives [52].

Figure 2: An example of a graph (left) and its simplified version (right) with vertex weights. Vertices 4, 6, and 7 have been combined together into a vertex with weight three. Further, vertices 5 and 8 have been combined together into a vertex with weight two.

Figure 3: Graph representation of a single-clause 3SAT problem with three variables and the clause (x_1 or x_2 or x_3). All edge capacities equal one except where indicated otherwise.

• If both s and t are in G_i, the flow between them is unaffected by the removal of vertex k, whether or not vertex i is removed from the graph. This is because any optimal flow path that passes through vertex k must first go into and out of vertex v, creating a flow cycle, s − . . . − v − . . . − k − . . . − v − . . . − t, and thus is equivalent to a flow path that avoids G_k entirely, s − . . . − v − . . . − t.

Table 1: [40]s of notable vertices in the cocaine trafficking network of Natarajan et al. [52].

Random G_{n,m} graphs are parametrized by a number of vertices, n = |V|, and a number of edges, m = |E| [40]. Each graph is sampled by finding a random graph from the set of all connected graphs with n nodes and m edges. We test our methods on graphs having the same number of vertices and same number of edges as the grid networks above: |V| = {25, 36, 49, 64} vertices, with |E| = {40, 60, 84, 112}, respectively. On each graph, we generate edge capacities independently and uniformly at random from the integers from 1 to |V|. For each case, we likewise consider two scenarios, testing a maximum removal subset size of m = 1 or m = |V|.
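One way to generate such test graphs is sketched below. The text does not state how the connected graphs were sampled, so the rejection-sampling approach here (draw m distinct edges uniformly at random and retry until the graph is connected) is an assumption, as are the function names and the fixed seed. Capacities are drawn independently and uniformly from the integers 1 to |V|.

```python
import itertools
import random

def is_connected(n, edges):
    """Depth-first search connectivity check on vertices 0..n-1."""
    adj = {u: [] for u in range(n)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def random_connected_gnm(n, m, rng):
    """Sample a connected graph with n vertices and m edges by rejection.

    Returns {(i, j): capacity} with integer capacities uniform on 1..n.
    """
    all_pairs = list(itertools.combinations(range(n), 2))
    if not n - 1 <= m <= len(all_pairs):
        raise ValueError("need n - 1 <= m <= n(n - 1)/2 for a connected simple graph")
    while True:
        edges = rng.sample(all_pairs, m)  # m distinct edges, uniformly at random
        if is_connected(n, edges):
            return {e: rng.randint(1, n) for e in edges}

rng = random.Random(0)  # fixed seed, chosen here only for reproducibility
graph = random_connected_gnm(25, 40, rng)
print(len(graph), min(graph.values()), max(graph.values()))
```

Because every m-edge graph is equally likely before the connectivity check, the accepted graph is uniform over connected graphs with n vertices and m edges. Rejection can be slow for very sparse graphs, where few random edge sets are connected.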

Table 2: Computational results of solving VIMAX via mixed integer program (single VIMAX and multi-removal VIMAX

Table 3: Improvement in key VIMAX instance size parameters by identifying vitality-reducing vertices and using graph simplification. |Q| is the number of vertices that do not have at least two vertex-disjoint paths to k; vertices in Q can be ignored by VIMAX (see Section 6.1). |V̂| and |Ê| are the numbers of vertices and edges, respectively, in the reduced graph after applying the graph simplification method of Section 6.2. The last two columns report the percentage decrease in time and percentage increase in best objective function value of the graph simplification method compared to the Multi-Removal MIP results reported in Table 2. Entries denoted by '-' indicate instances in which the MIP did not terminate within two hours.
The linear subproblem becomes i,j w_{i,j} x_{i,j,s,t} + − w_{i,j}) y_{i,j,s,t} subject to x_{i,s,t} − x_{j,s,t} + x_{i,j,s,t} ≥ 0, ∀(i, j) ∈ E, ∀s, t ∈ V; −x_{s,s,t} + x_{t,s,t} ≥ 1, ∀s, t ∈ V; ≥ −u_{i,j}, ∀(i, j) ∈ E, ∀s, t ∈ V. If the subproblem has an objective function value less than or equal to the incumbent value of L_k, then we add in the constraint u_{i,j} w_{i,j} x_{i,j,s,t,c} + (1 − w_{i,j}) y_{i,j,s,t,c} ≥ 0. u_{i,j} w_{i,j} x_{i,j,s,t,c} +