Exact methods for discrete Γ-robust interdiction problems with an application to the bilevel knapsack problem

Developing solution methods for discrete bilevel problems is known to be a challenging task, even if all parameters of the problem are exactly known. Many real-world applications of bilevel optimization, however, involve data uncertainty. We study discrete min-max problems with a follower who faces uncertainties regarding the parameters of the lower-level problem. Adopting a Γ-robust approach, we present an extended formulation and a multi-follower formulation to model this type of problem. For both settings, we provide a generic branch-and-cut framework. Specifically, we investigate interdiction problems with a monotone Γ-robust follower and we derive problem-tailored cuts, which extend existing techniques that have been proposed for the deterministic case. For the Γ-robust knapsack interdiction problem, we computationally evaluate and compare the performance of the proposed algorithms for both modeling approaches.


Introduction
In recent years and decades, bilevel optimization problems have gained increasing attention due to their ability to model hierarchical decision-making processes that occur in various applications such as transportation [7,55], energy markets [1,45], or pricing [30,52]. In bilevel problems, the decision maker on the upper level (the leader) makes a decision anticipating the reaction of the lower-level player (the follower). In this paper, we consider discrete linear bilevel problems of the form
\begin{align}
\min_{x} \quad & c^\top x + d^\top y \tag{1a} \\
\text{s.t.} \quad & Ax \geq a, \tag{1b} \\
& x \in X \subseteq \mathbb{Z}^{n_x}, \tag{1c} \\
& y \in \operatorname*{arg\,max}_{y'} \{ d^\top y' : y' \in Y(x) \}, \tag{1d}
\end{align}
where $Y(x)$ denotes the lower-level feasible set that is parameterized by the leader's variables $x$. The set $X$ is used to denote integrality constraints. Moreover, we have $c \in \mathbb{R}^{n_x}$, $d \in \mathbb{R}^{n_y}$, $A \in \mathbb{R}^{k \times n_x}$, and $a \in \mathbb{R}^{k}$. We refer to (1a)-(1c) as the upper-level and to (1d) as the lower-level problem. Note that Problem (1) is a min-max problem. Hence, the follower's response yields the worst-possible outcome for the leader, which is why there is no need to distinguish between the optimistic and the pessimistic approach; see, e.g., [28]. Let us further point out that we do not consider coupling constraints in the upper level, which is crucial for the validity of the branch-and-cut methods we propose in the following sections. In particular, this type of problem covers the important classes of interdiction [17,25,31,36,40,49,69] and blocking problems [2,39,44,59-61,74] that arise in various real-world applications such as in critical infrastructure defense, network disruption, or marketing. A recent survey on network interdiction models and algorithms can be found in [64]. Due to their nested structure, even linear bilevel problems are strongly NP-hard in general; see, e.g., [47]. Moreover, merely checking feasibility for mixed-integer bilevel problems is an NP-hard problem. Thus, it is a difficult task to develop solution methods, especially for bilevel problems that involve discrete variables.
In the seminal work by [58], the first branch-and-bound method for solving mixed-integer linear bilevel problems is discussed. The idea is extended by [32], who provide a branch-and-cut approach that is based on techniques of standard integer linear programming. In particular, this work can be considered a turning point in computational mixed-integer bilevel optimization that is followed by many influential works on solution methods for bilevel problems; see, e.g., [34,35,67,70]. For a detailed discussion on further techniques for mixed-integer bilevel optimization, we refer to the recent survey in [51].
Throughout this paper, we say that an upper-level decision x is feasible if x ∈ X and Ax ≥ a are satisfied. Moreover, we make the following assumptions.

Assumption 1
For every feasible decision x of the leader, the lower-level feasible set Y(x) is non-empty.

Assumption 2
The shared constraint set {(x, y) : Ax ≥ a, x ∈ X, y ∈ Y(x)} is non-empty and compact.

Assumption 3
All linking variables, i.e., all variables of the leader that appear in the lower-level constraints, are bounded integers.
Assumptions 1-3 are necessary to ensure that Problem (1) has an optimal solution. For a feasible upper-level decision $x$, we further define the lower-level optimal-value function
\[
\Phi(x) := \max_{y} \{ d^\top y : y \in Y(x) \}
\]
to re-write Problem (1) as the single-level problem
\begin{align}
\min_{x \in X,\, \eta \in \mathbb{R}} \quad & c^\top x + \eta \tag{2a} \\
\text{s.t.} \quad & Ax \geq a, \tag{2b} \\
& \eta \geq \Phi(x). \tag{2c}
\end{align}
Up to this point, we implicitly made the assumption that the input data of both players is certain. However, in many practical situations, this is not the case and the players are forced to make decisions under uncertainty. In mathematical optimization, there are two main approaches to deal with data uncertainty: stochastic optimization [15] and robust optimization [8,10,65]. In stochastic optimization, it is assumed that the uncertainties can be described by probability distributions that are known in advance. In this setting, the decision maker hedges against uncertainties in a probabilistic sense, e.g., by optimizing over expected values, by considering chance constraints, or by using risk-averse models. In between stochastic and robust optimization, there is the further approach of distributional robustness; see, e.g., [43]. In this paper, however, we focus on a robust approach. In robust optimization, the decision maker is interested in a solution that is feasible for all possible realizations of the uncertain data, which are assumed to take values in a given uncertainty set. Thus, one pursues a worst-case-oriented philosophy. However, a major point of criticism regarding this approach is the possible over-conservatism of solutions in the sense that ensuring robustness can be very expensive. Addressing this matter, [11,12,63] propose a more flexible robust approach, the so-called Γ-robust approach, which allows one to control the level of conservatism of the solution. In this setting, it is assumed that the decision maker hedges against the cases in which only a subset of the uncertain parameters changes so as to adversely affect the solution of the problem at hand.
In the context of bilevel optimization, problems involving data uncertainties have been investigated using both stochastic as well as robust approaches. [25,48] address stochastic network interdiction problems with uncertainties regarding the interdiction success and uncertain arc capacities, respectively. A stochastic approach for interdiction problems under uncertainty is also considered in the survey by [64]. Further works that pursue a stochastic approach for more general bilevel problems under uncertainty can be found, e.g., in [20,21,29,50,71]. To the best of our knowledge, robust approaches to address data uncertainty in bilevel optimization have been much less investigated. In the context of power markets, a Γ-robust approach to deal with uncertain lower-level data is considered in [46]. [24] consider problems with uncertain upper- as well as lower-level constraints and solve the robust counterpart via a sequence of semidefinite programming relaxations. In [73], a worst-case-oriented approach for bilevel problems with lower-level data uncertainty is addressed. [18,19] present complexity results for robust bilevel problems with uncertainties regarding the lower-level objective function coefficients. For a brief introduction to robust bilevel optimization, we refer to [3] and for a general discussion on bilevel optimization under uncertainty, we refer to the recent survey by [5].
Let us further mention that, in bilevel optimization, the sources for uncertainty are much richer compared to classic, i.e., single-level, optimization. In bilevel optimization, uncertainties may not only arise in the problem data but there may also be uncertainty regarding the (observation of the) decisions of the two players. Although this line of research is still in its infancy, there are a few works that consider robust approaches to deal with decision uncertainty. [13] propose a robust approach to hedge against near-optimal lower-level decisions. In this context, complexity results are discussed in [14]. A similar setting in which the leader anticipates sub-optimal follower's decisions due to lower-level algorithmic uncertainty is considered in [72]. The authors consider the setting in which the leader hedges against the Γth least damaging choices of the solution algorithm for the lower-level problem, which is to some extent related to the notion of Γ-robustness proposed in [11]. Lastly, robust optimization techniques are used in [6] to model follower's response uncertainty due to limited observability regarding the leader's decision.
The contributions of this paper are the following. We study discrete linear bilevel problems involving a follower who faces uncertainties regarding the parameters of the lower-level problem. In this context, we pursue the same idea as in [11,12,63] so that the follower aims to only hedge against a subset of deviations of uncertain parameters. In contrast to the aforementioned literature, we consider bilevel problems that involve discrete variables on the lower level. Therefore, standard reformulation techniques like replacing the lower-level problem by its Karush-Kuhn-Tucker conditions (see, e.g., [38]) cannot be applied anymore.
With regard to the uncertainties, we distinguish the following two cases. On the one hand, we assume that the lower-level's objective function coefficients are uncertain. Instead of $d_i$, we consider the uncertain coefficients $\tilde{d}_i$, where $\tilde{d}_i \in [d_i - \Delta d_i, d_i]$ for all $i \in [n_y] := \{1, \dots, n_y\}$. We denote $d_i$ as the nominal value of the $i$th coefficient of the lower-level's objective function and $\Delta d_i$ as its maximum deviation from the nominal value. For a feasible upper-level decision $x$, the robust counterpart of the lower-level problem (1d) in which the follower hedges against at most $\Gamma_d \in \{0, \dots, n_y\}$ deviations in the objective function coefficients is referred to as (3). The overall bilevel problem with a $\Gamma_d$-robust follower facing uncertain objective function coefficients is referred to as (4). On the other hand, we deal with uncertainties regarding the lower-level constraints. Here, we focus on the specific case of a single packing-type constraint whose coefficients have nominal values $w$ and maximum deviations $\Delta w$ with $w, \Delta w \in \mathbb{R}^{n_y}$. For a feasible upper-level decision $x$, the robust counterpart in which the follower hedges against at most $\Gamma_w \in \{0, \dots, n_y\}$ deviations in the constraint coefficients is referred to as (5), and the $\Gamma_w$-robust counterpart of Problem (2) is referred to as (6). Let us mention that we focus on this specific case to ensure that both of the two approaches we present in this paper to model discrete bilevel problems with a robust follower can deal with uncertain lower-level constraints. Nevertheless, possible extensions are sketched in Sect. 5. Without loss of generality, we further impose the following.

Assumption 4
The deviations are non-negative, i.e., Δd i , Δw i ≥ 0 for all i ∈ n y .
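To illustrate the Γ-robust objective described above, the following minimal Python sketch evaluates the worst-case value of a fixed follower decision. It assumes the standard form from [11], i.e., the nominal value minus the sum of the Γ largest deviation terms; the function name `robust_objective` and the data are purely illustrative and not taken from the paper.

```python
def robust_objective(d, delta, y, gamma):
    """Worst-case objective of a fixed decision y (assumed y >= 0, delta >= 0).

    Assumption: at most `gamma` coefficients drop from d_i to d_i - delta_i,
    and the adversary picks those that hurt the follower most.
    """
    nominal = sum(di * yi for di, yi in zip(d, y))
    # Loss caused by each item if its coefficient deviates.
    losses = sorted((dl * yi for dl, yi in zip(delta, y)), reverse=True)
    return nominal - sum(losses[:gamma])


d = [10, 8, 6, 4]        # nominal coefficients (illustrative)
delta = [5, 1, 3, 2]     # maximum deviations (illustrative)
y = [1, 1, 1, 0]         # fixed follower decision

print(robust_objective(d, delta, y, 0))  # 24: no protection, nominal value
print(robust_objective(d, delta, y, 2))  # 16: deviations 5 and 3 are subtracted
```

Setting `gamma = 0` recovers the nominal problem, whereas `gamma = len(d)` corresponds to the fully conservative worst case.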
To model the two types of situations, we consider an approach using an extended formulation. Additionally, we present an approach using a multi-follower formulation for the special case in which all lower-level variables are binary, i.e., Y(x) ⊆ {0, 1}^{n_y}. However, the multi-follower-based approach can be extended naturally to allow for additional non-binary lower-level variables as long as the coefficients of the objective function or the constraints corresponding to the non-binary variables are not subject to uncertainty. We present a generic branch-and-cut framework to solve the value-function reformulation of the robustified bilevel problem. Moreover, we derive problem-tailored cuts for interdiction problems that can be used in the proposed branch-and-cut procedure. These cuts assume that Γ-robust follower sub-problems satisfy a downward monotonicity property, which arises in many packing-type applications. In this context, it is our aim to provide a natural extension of the results that have been proposed in [36] for the deterministic case. The main results of this paper are stated in Theorems 2-5 and form the core for the implementation of the proposed solution methods.
The remainder of the paper is organized as follows. In Sect. 2, we provide an extended formulation and a multi-follower formulation to model discrete linear bilevel problems with a Γ-robust lower-level problem. We present a generic branch-and-cut method to solve these problems. In Sect. 3, we focus on interdiction problems with a follower problem that satisfies a downward monotonicity property. In Sect. 4, we evaluate the effectiveness of the proposed approaches in a numerical study using the bilevel knapsack interdiction problem, which is a prominent example of an interdiction problem that satisfies the monotonicity property. Finally, we conclude in Sect. 5.

Generic Branch-and-Cut Frameworks
The aim of this section is to present generic branch-and-cut frameworks that can be used to solve the Γ-robustification of Problem (2) as stated in (4) and (6). The methods are similar to a procedure proposed by [69], which resembles (generalized) Benders decomposition [9,42]. To initialize the methods, we start by solving the problem in which the integrality constraints on the variables $x$ as well as $\eta \geq \Phi(x)$ are omitted. This means that we first consider the linear problem (P_0), in which $c^\top x + \eta$ is minimized subject to $Ax \geq a$, $x \in \bar{X}$, and $\eta \geq \eta^-$. Here, $\bar{X}$ is a continuous relaxation of $X$, i.e., the integer points contained in $\bar{X}$ coincide with $X$. Furthermore, we use an a priori lower bound $\eta^- \in \mathbb{R}$ on $\Phi(x)$ for all feasible leader's decisions $x$. In the following sections, we elaborate on how to obtain a valid lower bound in our setting. After considering Problem (P_0), we iteratively add valid inequalities or branch to cut off integer-infeasible points and also add valid inequalities to cut off bilevel-infeasible points. Let (P_j) be the problem of node $j$ of the branch-and-cut search tree, i.e., Problem (P_0) restricted by the constraints collected in $\Omega_j$. Here, the set $\Omega_j$ contains all valid inequalities that have been added previously to cut off integer-infeasible and bilevel-infeasible points as well as all branching decisions. If either Problem (P_j) is infeasible or the objective function value corresponding to an optimal solution $(x^j, \eta^j)$ exceeds the current upper bound $U$, we can fathom node $j$. Otherwise, we do the following. First, we check if the upper-level variables $x^j$ satisfy the integrality constraints, i.e., we check if $x^j \in X$ holds. If this is not the case, we separate the fractional solution by either exploiting standard cutting planes from mixed-integer linear optimization as elaborated in, e.g., [26], or by branching. Otherwise, we proceed by checking for bilevel feasibility, i.e., we check if $\eta^j \geq \Phi(x^j)$ is satisfied. For this purpose, we solve a reformulation of the robust counterpart of the lower-level problem that is parameterized by the current leader's decision $x^j \in X$.
In the following sections, we will elaborate on how to obtain these reformulations. In particular, we present two approaches, an extended formulation and a multi-follower formulation, that are derived from Theorems 1 and 3, respectively, in [11]. Based on these two types of formulations, valid cuts to separate bilevel-infeasible points can be obtained. Nevertheless, the development of such cuts strongly depends on the specific problem considered at the lower level. Hence, the branch-and-cut frameworks presented in the remainder of this section remain fairly general and need to be adapted accordingly to capture the application problem at hand. We will show such adaptations for the bilevel knapsack interdiction problem in the following sections.

Extended Formulation
One possibility to reformulate the robust counterparts (3) and (5) is to allow for an extended variable space of the follower that involves additional continuous variables.

Lemma 1 For a feasible upper-level decision x, the robust counterpart of the lower-level problem (3) can be solved as the mixed-integer linear problem (7).
The extended formulation for the case of uncertainties in the lower-level constraint is stated in the following lemma.

Lemma 2 For a feasible upper-level decision x, the robust counterpart of the lower-level problem (5) can be solved as the mixed-integer linear problem (8).
Both lemmas can be shown in analogy to the proof of Theorem 1 in [11]. Before we comment on how the previous reformulations can be embedded in a branch-and-cut framework to separate bilevel-infeasible points, let us now provide valid lower bounds that can be used in Problem (P 0 ) to initialize the method.
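The idea behind such an extended formulation can be checked numerically for a fixed follower decision. The sketch below assumes the dualization used in Theorem 1 of [11]: the sum of the Γ largest deviation terms equals the minimum of Γθ + Σ_i z_i over θ ≥ 0 and z_i ≥ Δd_i y_i − θ, z_i ≥ 0. The function names are hypothetical, and the deviation terms `dev[i]` stand for Δd_i y_i.

```python
def protection_subset(dev, gamma):
    """Sum of the `gamma` largest deviation terms (combinatorial definition)."""
    return sum(sorted(dev, reverse=True)[:gamma])


def protection_dual(dev, gamma):
    """Dual form: min over theta >= 0 of gamma*theta + sum_i max(dev_i - theta, 0).

    The objective is piecewise linear and convex in theta, so it suffices to
    evaluate it at the breakpoints 0 and dev_i (all assumed non-negative).
    """
    candidates = {0, *dev}
    return min(
        gamma * theta + sum(max(v - theta, 0) for v in dev)  # z_i = max(dev_i - theta, 0)
        for theta in candidates
    )


dev = [5, 1, 3, 0]  # illustrative values of delta_i * y_i
for gamma in range(len(dev) + 1):
    assert protection_subset(dev, gamma) == protection_dual(dev, gamma)
```

Because the inner minimization over (z, θ) is a linear program, it can be merged with the follower's maximization over y, which is what makes a single mixed-integer linear reformulation possible.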
$\Delta d_i u_i$ is a valid lower bound for the optimal objective function value of the x-parameterized problem (7).
All the proofs that we omit here can be found in Appendix A. In the following, we provide a lower bound for the variant of the problem with uncertainties regarding the follower's packing-type constraint.
Proposition 2 Let $x$ be a feasible upper-level decision. Then, by Assumption 2, there is a vector of finite upper bounds $u \in \mathbb{R}^{n_y}_{+}$ for the follower's variables such that $Y(x) \subseteq [0, u_1] \times \cdots \times [0, u_{n_y}]$ holds, from which a valid lower bound for the optimal objective function value of the x-parameterized problem (8) is obtained.
The method to process node $j$ of the branch-and-cut search tree that exploits an extended formulation (as stated in the last two lemmas) is formally stated in Algorithm 1. To determine $\Phi(x^j)$ for the current leader's decision $x^j$ in Step 9, we need to solve the $x^j$-parameterized robust lower-level problem. Depending on the considered uncertainty model, this can either be Problem (7) or Problem (8). In Step 11, we generate a valid cut to exclude the bilevel-infeasible point $(x^j, \eta^j)$. To this end, one can use generic cuts like (generalized) no-good cuts; see, e.g., [66].

Algorithm 1
Processing node $j$ using the extended formulation
1: Solve Problem (P_j).
2: if Problem (P_j) is infeasible then
3:     Fathom the current node.
4: Let $(x^j, \eta^j)$ denote the solution of Problem (P_j).
5: if $c^\top x^j + \eta^j \geq U$ then
6:     Fathom the current node.
7: if $x^j \notin X$ then
8:     Either generate cuts valid for $\Omega_j \cap (X \times \mathbb{R})$, augment $\Omega_j$, and go to Step 1 or branch.
9: Determine $\Phi(x^j)$ and set $U \leftarrow \min\{U, c^\top x^j + \Phi(x^j)\}$.
10: if $\eta^j < \Phi(x^j)$ then
11:     Generate a valid cut that excludes $(x^j, \eta^j)$ from $\Omega_j$, augment $\Omega_j$, and go to Step 1.

Multi-Follower Approach
An alternative reformulation of the robust counterparts (3) and (5) can be obtained under the following additional assumptions.

Assumption 6
The indices are ordered such that the deviations are given in non-increasing order, i.e., $\Delta d_i \geq \Delta d_{i+1}$ or $\Delta w_i \geq \Delta w_{i+1}$ for all $i \in [n_y]$ with $\Delta d_{n_y+1} = 0$ and $\Delta w_{n_y+1} = 0$.
Note that Assumption 6 is w.l.o.g. as long as we do not have uncertainties in both the objective function coefficients and the constraint coefficients on the lower level. Assumptions 5 and 6 are necessary to exploit Theorem 3 in [11], which is what we do in the following.
holds, where the $\ell$th value function (10) is defined for all $\ell \in \{1, \dots, n_y + 1\}$ in terms of modified objective function coefficients $\bar{d}(\ell)$. Note that the lower-level optimal-value function (9) is defined as the maximum of $n_y + 1$ value functions. Thus, the robustification of Problem (1) can be interpreted as a single-leader-multi-follower problem with $n_y + 1$ many followers, which is why we refer to (9) as the multi-follower formulation. Moreover, we refer to (10) as the $\ell$th follower sub-problem throughout this paper.
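The decomposition into n_y + 1 nominal problems can be illustrated on a tiny binary knapsack follower. The sketch below assumes the structure of Theorem 3 in [11], adapted to a maximization problem: for each candidate level Δd_ℓ (including Δd_{n_y+1} = 0), a nominal problem with coefficients d_i − max{Δd_i − Δd_ℓ, 0} is solved and the constant Γ Δd_ℓ is subtracted. All function names and the instance are hypothetical, and the sub-problems are solved by enumeration purely for illustration.

```python
from itertools import product


def robust_value(d, delta, y, gamma):
    """Nominal value of y minus the gamma largest deviation terms."""
    nominal = sum(di * yi for di, yi in zip(d, y))
    losses = sorted((dl * yi for dl, yi in zip(delta, y)), reverse=True)
    return nominal - sum(losses[:gamma])


def feasible_set(weights, capacity):
    """All binary vectors satisfying a single knapsack constraint."""
    return [y for y in product((0, 1), repeat=len(weights))
            if sum(w * v for w, v in zip(weights, y)) <= capacity]


def phi_direct(d, delta, gamma, Y):
    """Robust optimal value by enumerating all follower decisions."""
    return max(robust_value(d, delta, y, gamma) for y in Y)


def phi_multi_follower(d, delta, gamma, Y):
    """Maximum over one nominal sub-problem per deviation level."""
    best = None
    for level in list(delta) + [0]:  # levels delta_1, ..., delta_n and 0
        modified = [di - max(dl - level, 0) for di, dl in zip(d, delta)]
        value = -gamma * level + max(
            sum(m * v for m, v in zip(modified, y)) for y in Y)
        best = value if best is None else max(best, value)
    return best


d, delta = [10, 8, 6, 4], [5, 3, 2, 1]   # deviations in non-increasing order
Y = feasible_set([4, 3, 2, 1], 6)
for gamma in range(len(d) + 1):
    assert phi_direct(d, delta, gamma, Y) == phi_multi_follower(d, delta, gamma, Y)
```

The sub-problems for different levels share the feasible set and differ only in their objective coefficients, which is why they can be treated as independent followers.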
In the following, we provide a valid lower bound for the optimal objective function value of (9) that can be used in Problem (P 0 ).

Proposition 3 Let $x$ be a feasible upper-level decision and let $\ell \in \{1, \dots, n_y + 1\}$ be arbitrary but fixed. Under Assumption 5, a valid lower bound for the optimal objective function value of the x-parameterized problem (9) can be derived from the $\ell$th follower sub-problem.

The multi-follower reformulation for the case of uncertainties in the lower-level constraint is stated in the following lemma.

Lemma 4
Lemmas 3 and 4 can be shown in analogy to the proof of Theorem 3 in [11]. Note that, in the case of uncertainties regarding the follower's inequality constraint, we have $\Delta d = 0$ such that $\bar{d}(\ell) = d$ holds for all $\ell \in \{1, \dots, n_y + 1\}$. Using the latter, Proposition 3 also provides a valid lower bound for the setting considered in Lemma 4. Let us further note that we consider the same deterministic lower-level objective function in the extended formulation as well as in the multi-follower formulation. Moreover, we would like to point out that the fact that there are only binary variables corresponding to uncertain coefficients on the lower level is a crucial point for the validity of the multi-follower formulation.
In what follows, we omit the subscripts $d$ and $w$ that are used to denote the considered uncertainty modeling for notational convenience. Further, we use an improvement of the previous results that has been established in [57] by reducing the number of nominal problems to be considered to $n_y - \Gamma + 2$, i.e.,
\[
\Phi(x) = \max \{ \Phi_\ell(x) : \ell \in \{\Gamma, \dots, n_y + 1\} \}. \tag{12}
\]
Note that a further reduction of the number of nominal problems to be solved has been published by [53]. Since this is, however, not the core focus of this paper, we stay with the version as published by [57]. The method to process node $j$ that exploits the multi-follower formulation for the robust counterpart of the lower-level problem is formally stated in Algorithm 2. In contrast to the approach using the extended formulation, in which a single cut is added at each node of the branch-and-cut search tree in case of bilevel infeasibility, a cut for each follower sub-problem $\ell \in \{\Gamma, \dots, n_y + 1\}$ that satisfies $\eta^j < \Phi_\ell(x^j)$ is added in Step 12 of Algorithm 2. This means that up to $n_y - \Gamma + 2$ cuts could be added at each node. However, it would also be valid to consider, e.g., adding only the most violated cut for the given leader's decision $x^j$. We will address this aspect in detail when we discuss various cut separation strategies in Sect. 4.

Theorem 1
If we embed either Algorithm 1 or 2 into a usual branch-and-bound framework, we obtain a correct method that terminates with an optimal solution $(x^*, \eta^*)$ after a finite number of nodes and after adding an overall finite number of cuts.

Algorithm 2
Processing node $j$ using the multi-follower approach
1: Solve Problem (P_j).
2: if Problem (P_j) is infeasible then
3:     Fathom the current node.
4: Let $(x^j, \eta^j)$ denote the solution of Problem (P_j).
5: if $c^\top x^j + \eta^j \geq U$ then
6:     Fathom the current node.
7: if $x^j \notin X$ then
8:     Either generate cuts valid for $\Omega_j \cap (X \times \mathbb{R})$, augment $\Omega_j$, and go to Step 1 or branch.
9: for all $\ell \in \{\Gamma, \dots, n_y + 1\}$ do
10:     Solve the $\ell$th lower-level sub-problem to obtain $\Phi_\ell(x^j)$.
11:     if $\eta^j < \Phi_\ell(x^j)$ then
12:         Generate a valid cut that excludes $(x^j, \eta^j)$ from $\Omega_j$ and augment $\Omega_j$.
14: If at least one cut was added in Step 12, then go to Step 1.

Proof
We first recall that all linking variables are bounded integers. We then observe that, w.l.o.g., non-linking variables can be moved to the lower level; see also [16]. Hence, the finite termination follows from the finiteness of the number of feasible upper-level decisions, the finiteness of branch-and-cut methods to solve the lower-level sub-problems, and the fact that a leader's decision cannot be selected twice. If $(x, \eta)$ is a non-optimal leader's decision, i.e., $\eta < \Phi(x)$, adding a globally valid inequality excludes $(x, \eta)$ from being feasible for all subsequent considerations. If an upper-level decision were ever to occur again, i.e., if the same $x$ appeared in the solution of a later node, $\eta \geq \Phi(x)$ would have to hold and the termination criterion would be satisfied. Thus, an optimal solution cannot be overlooked. In particular, the number of cuts possibly added to the problem formulation is finite and in $O(2^{n_x})$.
Note that the sub-problems that are solved in Step 10 of Algorithm 2 are independent. This means that the objective function and the constraints of a sub-problem only include the upper-level decision $x$ and the lower-level variables corresponding to the $\ell$th sub-problem with $\ell \in \{\Gamma, \dots, n_y + 1\}$. Thus, the sub-problems can be solved in parallel. Further note that we have not specified how the cuts that are added in Algorithms 1 and 2 are generated. For an overview of various cutting planes that can be used for general classes of mixed-integer linear bilevel problems, we refer to [66]. Nevertheless, stronger formulations can be obtained for certain problems; see, e.g., [36,40]. Thus, it is often essential to exploit specific properties of the application problem at hand to derive valid cuts, which is what we do in the remainder of the paper.
follower's decision by prohibiting the usage of certain items by the follower. This is established by either setting the leader's variable x i = 1 to interdict item i ∈ [n] for the follower or x i = 0, otherwise. For the ease of presentation, we assume the following.

Assumption 7
The number of variables on the upper and the lower level is the same, i.e., n x = n y = n holds.
In particular, this means that all variables of the leader need to be binary, i.e., x ∈ X = {0, 1}^n. However, the following results can also be adapted to account for non-interdicting (and thus possibly non-binary) variables of the leader. The case in which the lower-level problem also includes variables that are not subject to interdiction can be handled by partitioning the follower's variable set into interdicted and non-interdicted variables as is done in [36].

and a vector of finite upper bounds
Assumption 8 ensures that the nominal lower-level problem (1d) satisfies a downward monotonicity property, which we formally define in the following. Further note that the leader's variables x are linked to the lower-level problem only via the interdiction constraints y i ≤ u i (1 − x i ). Both aspects are essential for the validity of the cuts we propose in the remainder of this section.

Proposition 4 (Monotonicity Property) Let x be a feasible decision of the leader.
Further, let y ∈ Y(x) and let y′ ∈ Y be such that y′ ≤ y holds. Then, y′ is a feasible follower's decision for the given leader's decision x, i.e., y′ ∈ Y(x).
It is noteworthy that, under Assumption 8, all items with non-positive objective function coefficients are not chosen by the follower. Consequently, the leader does not need to spend interdiction resources on these items and we could thus omit all items with non-positive objective function coefficients in the problem formulation. This leads us to, w.l.o.g., making the following assumption. Finally, we will hold on to the following, which is reasonable in the interdiction setting.

Assumption 10
There are no terms depending on the leader's decision in the upper-level objective function, i.e., c = 0.
In the following sections, we show that the downward monotonicity property remains satisfied when Γ-robust followers are considered, which we exploit to derive valid cuts. In Sect. 3.1, we focus on the case of uncertainties in the follower's objective function coefficients. We derive two variants of interdiction cuts based on the two approaches discussed in Sect. 2. We devote Sect. 3.2 to strengthened formulations for the proposed interdiction cuts. Finally, the case of uncertainties in the lower-level constraint is addressed in Sect. 3.3.

Problems with Uncertain Objective Function Coefficients
According to the notation considered in the previous sections, we assume that all lower-level objective function coefficients may be subject to uncertainty and that the follower hedges against at most $\Gamma_d$ deviations in the objective function coefficients. The robust counterpart of the lower-level problem and of the overall bilevel problem is given in (3) and (4), respectively. The corresponding extended formulation and the multi-follower formulation have already been stated in Sect. 2. For the extended formulation, we only need to replace (7d) with (13); we refer to the resulting problem as (14). For the multi-follower approach, we need to replace the feasible set in (10) with (13). Due to Assumption 5, $u_i = 1$ is a trivially valid upper bound for all $i \in [n]$. For $\ell \in \{\Gamma_d, \dots, n_y + 1\}$, we refer to the resulting sub-problem as (15).

Proposition 5 Let $x$ be a feasible decision of the leader. Then, the x-parameterized problem (15) satisfies the downward monotonicity property. Moreover, let $(y, z, \theta)$ be feasible for the x-parameterized problem (14), and let $y' \in Y$ be such that $y' \leq y$ holds. Then, $(y', z, \theta)$ is feasible for the x-parameterized problem (14) as well.
Due to Proposition 5 and for the ease of presentation, we also say that the extended formulation (14) satisfies the downward monotonicity property. In what follows, we exploit the previous result to introduce penalized formulations for Problems (14) and (15) that are used to derive valid interdiction cuts. To this end, we omit the interdiction constraints y i ≤ u i (1 − x i ) in the problem formulation and instead add the penalty terms −d i y i x i to the objective function for all i ∈ [n].

Proposition 6 Let x be a feasible upper-level decision. Then, Problem (14) and the mixed-integer linear problem
admit the same optimal value.
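The equivalence between the interdicted and the penalized follower problem can be verified by enumeration on a small instance. The sketch below is only an illustration under assumptions: the follower is a binary knapsack (which is downward monotone), the robust objective is written in its subset form (sum of the Γ largest deviations) instead of with the variables (z, θ), and all names and data are hypothetical.

```python
from itertools import product


def protection(delta, y, gamma):
    """Sum of the gamma largest deviation terms delta_i * y_i."""
    return sum(sorted((dl * yi for dl, yi in zip(delta, y)), reverse=True)[:gamma])


def interdicted_value(d, delta, gamma, Y, x):
    """Follower optimum with explicit interdiction constraints y_i <= 1 - x_i."""
    return max(
        sum(di * yi for di, yi in zip(d, y)) - protection(delta, y, gamma)
        for y in Y if all(yi <= 1 - xi for yi, xi in zip(y, x)))


def penalized_value(d, delta, gamma, Y, x):
    """Follower optimum without interdiction constraints but with the
    penalty terms -d_i * y_i * x_i in the objective."""
    return max(
        sum(di * (1 - xi) * yi for di, xi, yi in zip(d, x, y))
        - protection(delta, y, gamma)
        for y in Y)


d, delta, weights, capacity = [10, 8, 6, 4], [5, 3, 2, 1], [4, 3, 2, 1], 6
Y = [y for y in product((0, 1), repeat=4)
     if sum(w * v for w, v in zip(weights, y)) <= capacity]

for gamma in range(5):
    for x in product((0, 1), repeat=4):
        assert interdicted_value(d, delta, gamma, Y, x) == penalized_value(d, delta, gamma, Y, x)
```

The check relies on downward monotonicity: removing interdicted items from a penalized solution keeps it feasible and cannot increase the protection term.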
For the multi-follower-based approach, we obtain the following similar result.

Proposition 7 Let $x$ be a feasible upper-level decision and let $\ell \in \{\Gamma_d, \dots, n + 1\}$ be arbitrary but fixed. Then, Problem (15) and the problem (17) admit the same optimal value.
The feasible set of Problem (16) and of each sub-problem (17) is independent of the leader's decision. Moreover, the objective functions are linear for fixed $x$. Thus, an optimal solution is attained at a vertex of the convex hull of the respective feasible set. We denote the feasible sets of Problem (16) and of the sub-problems (17) by $\Psi$ and $Y$, respectively. In what follows, we use $\hat{\Psi}$ and $\hat{Y}$ to denote the sets containing the finite number of vertices of the convex hull of $\Psi$ and $Y$, respectively. Further, let $(x, \eta)$ be feasible for Problem (2) with the lower-level optimal-value function (3). Due to Proposition 6, $\eta$ is bounded from below by the objective function value of Problem (16) at every vertex in $\hat{\Psi}$, i.e., the cuts (18) are valid for Problem (2). To derive valid cuts for the multi-follower-based approach, we do the following. Under Assumptions 5 and 6, an analogous chain of relations holds for arbitrary but fixed $\hat{y} \in \hat{Y}$ and for all $\ell \in \{\Gamma_d, \dots, n + 1\}$. The first equality follows from (12) and the second one holds due to Proposition 7. As a result, the cuts (19) are valid for Problem (2). In particular, we may obtain different valid cuts for each $\ell \in \{\Gamma_d, \dots, n + 1\}$. Finally, we exploit the previous results to equivalently reformulate the interdiction problem with a $\Gamma_d$-robust follower facing uncertain objective function coefficients.

Theorem 2
Problem (2) with the lower-level optimal-value function (3) can be equivalently reformulated by replacing Constraint (2c) with (18). Under Assumptions 5 and 6, an equivalent reformulation can be obtained by replacing Constraint (2c) with (19).
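How such interdiction cuts drive a cutting-plane scheme can be sketched on a toy knapsack interdiction instance. The following Python sketch is a simplified stand-in, not the branch-and-cut method of the paper: the master problem is solved by enumerating all leader decisions, the follower is solved by enumeration, and the cut is assumed to take the form η ≥ Σ_i d_i ŷ_i (1 − x_i) − β(ŷ), where β(ŷ) denotes the sum of the Γ largest deviations of ŷ. All names, the cardinality budget of the leader, and the data are hypothetical.

```python
from itertools import product


def protection(delta, y, gamma):
    """Sum of the gamma largest deviation terms delta_i * y_i."""
    return sum(sorted((dl * yi for dl, yi in zip(delta, y)), reverse=True)[:gamma])


def cut_rhs(d, delta, gamma, y_hat, x):
    """Right-hand side of the assumed interdiction cut for packing y_hat at x."""
    return (sum(di * yi * (1 - xi) for di, yi, xi in zip(d, y_hat, x))
            - protection(delta, y_hat, gamma))


def follower(d, delta, gamma, Y, x):
    """Best response via the penalized objective; returns (value, y_hat)."""
    return max((cut_rhs(d, delta, gamma, y, x), y) for y in Y)


def cut_loop(d, delta, gamma, Y, X, eta_lb=0):
    """Alternate between an enumerated master problem and cut separation."""
    cuts = []
    while True:
        # Master: minimize eta subject to the lower bound and all cuts so far.
        eta, x = min(
            (max([eta_lb] + [cut_rhs(d, delta, gamma, yh, x) for yh in cuts]), x)
            for x in X)
        phi, y_hat = follower(d, delta, gamma, Y, x)
        if eta >= phi:           # (x, eta) is bilevel feasible, hence optimal
            return x, phi, len(cuts)
        cuts.append(y_hat)       # separate the bilevel-infeasible point


d, delta, weights, capacity = [10, 8, 6, 4], [5, 3, 2, 1], [4, 3, 2, 1], 6
gamma, budget = 1, 1
Y = [y for y in product((0, 1), repeat=4)
     if sum(w * v for w, v in zip(weights, y)) <= capacity]
X = [x for x in product((0, 1), repeat=4) if sum(x) <= budget]

x_opt, value, n_cuts = cut_loop(d, delta, gamma, Y, X)
# Sanity check against full enumeration of the bilevel problem.
assert value == min(follower(d, delta, gamma, Y, x)[0] for x in X)
```

The lower bound `eta_lb = 0` is valid here because the empty packing is always feasible for the follower. The loop terminates since every separated packing is new: a packing that is already in the cut pool cannot be violated at the current master solution.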

Cut Strengthening and Enhanced Formulations
In this section, we provide enhancements and techniques to strengthen the cuts proposed in the previous section. First, we introduce the notion of maximal packings.

Proposition 8 Let $(\hat{y}, \hat{z}, \hat{\theta}) \in \hat{\Psi}$ be a non-maximal packing for Problem
Here and in what follows, domination between two cuts is understood in the sense that the feasible set induced by the dominating cut is contained in the feasible set induced by the dominated one. Due to the previous result, it is sufficient to only consider the interdiction cuts that correspond to maximal packings of the follower.
Note that there is no need to specify ∈ {Γ d , . . . , n + 1} in the previous definition since, in each sub-problem, we consider the same setŶ , which is independent from . To obtain a dominance result for interdiction cuts associated with maximal packings in the multi-follower setting, we need to further study the properties of the modified objective function coefficientsd( ) for each ∈ {Γ d , . . . , n+1}. Note that the modified objective function coefficients can be non-positive for certain items in some follower sub-problems. Ifd( ) i ≤ 0 holds for all ∈ {Γ d , . . . , n + 1}, the ith item will not be chosen in any of the follower sub-problems. Thus, the leader does not need to spend interdiction resources on the ith item and x i = y i = 0 can be fixed. This is equivalent to completely omitting the ith item in the problem formulation. However, if there is an item i ∈ [n] with non-positive modified objective function coefficients only for some follower sub-problems, i.e.,d( . . , n + 1} \ S, the ith item might be part of an optimal solution. Therefore, we introduce the following notation. For each ∈ {Γ d , . . . , n + 1}, we define the set

Proposition 9
The interdiction cuts (19) can be replaced with In particular, the cuts (20) dominate the basic interdiction cuts (19).
With the previous considerations, we can finally state a dominance result for interdiction cuts associated with maximal packings in the multi-follower setting.

Proposition 10
Letŷ ∈Ŷ be a non-maximal packing for Problem (17) and let y ∈Ŷ \ {ŷ} be such thatŷ ≤ y holds. Then, the interdiction cuts (20) associated withŷ are dominated by the interdiction cuts associated with y .
Note that the previous result is not valid for the basic interdiction cuts as stated in (19). Moreover, we also considered maximal packings for the leader. However, preliminary computational tests revealed that this does not improve the performance of the overall solution method. Thus, we refrain from using this ingredient in our computational study in Sect. 4. Nevertheless, we exploit dominance properties among items to obtain further enhancements. First, we introduce additional inequalities regarding the leader's decision x. In what follows, A ·i denotes the ith column of A.
is satisfied in at least one optimal solution of Problem (2) with the lower-level optimal-value function (3).
Proof Let (x * , η * ) be an optimal solution of Problem (2) with the lower-level optimalvalue function (3). Without loss of generality, suppose that there are exactly two distinct items s, t ∈ [n] that satisfy the requirements of the theorem but for which the dominance inequality (21) does not hold, i.e., x * s = 0 and x * t = 1. Otherwise, we repeat the following procedure as long as there are still items left that satisfy the requirements but violate the corresponding dominance inequality. The idea is to construct an optimal leader's decision that satisfies the dominance inequality. To this end, we set By construction, x is feasible for Problem (2) with the lower-level optimal-value function (3) and satisfies the dominance inequality associated with s and t. Without loss of generality, we show that x is also an optimal solution of Problem (2) using the extended formulation. Let (y , z , θ ) be an optimal solution of Problem (14) for x , i.e., we have y s = 0. Moreover, z i = max{Δd i y i − θ , 0} holds for all i ∈ [n] due to Constraints (14b), (14e), and the objective function. If y t = 0 holds, (y , z , θ ) is also a feasible follower's decision for x * and we obtain If y t ≥ 1 holds, we consider the alternative follower's decision (ŷ,ẑ, θ ) witĥ By construction, (ŷ,ẑ, θ ) is feasible for Problem (14) given the leader's decision x * and we obtain Hence, the alternative leader's decision x is optimal for Problem (2) with the lowerlevel optimal-value function (3) and satisfies the dominance inequality associated with s and t. This concludes the proof.
Note that, in the case of only binary follower's variables, the requirement u s ≥ u t in the previous theorem is trivially satisfied since u i = 1 for all i ∈ [n] is a valid upper bound. Further note that the requirement A ·s > A ·t can be relaxed to A ·s ≥ A ·t if the matrix A has only non-negative entries.
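The relation z i = max{Δd i y i − θ, 0} used in the proof above reflects the standard dual view of Γ-robustness: for non-negative deviations and an integer Γ, the sum of the Γ largest deviations equals the minimum over θ of Γθ plus the sum of the positive parts of the deviations minus θ. The following minimal sketch checks this identity numerically for a fixed follower's decision. The function names and the data are hypothetical and only serve as an illustration; they are not part of our implementation.

```python
def worst_case_deviation_direct(deviations, gamma):
    """Sum of the gamma largest deviations (adversary picks at most gamma items)."""
    return sum(sorted(deviations, reverse=True)[:gamma])


def worst_case_deviation_dual(deviations, gamma):
    """Minimum over theta of gamma * theta + sum_i max(dev_i - theta, 0).

    The minimum is attained at theta = 0 or at one of the deviation values,
    so it suffices to enumerate these finitely many candidates.
    Assumes non-negative deviations.
    """
    candidates = {0.0, *deviations}
    return min(
        gamma * theta + sum(max(dev - theta, 0.0) for dev in deviations)
        for theta in candidates
    )


# Hypothetical values of Delta d_i * y_i for a fixed follower's decision y.
deviations = [4.0, 0.0, 7.0, 2.0, 5.0]
for gamma in range(len(deviations) + 1):
    direct = worst_case_deviation_direct(deviations, gamma)
    dual = worst_case_deviation_dual(deviations, gamma)
    assert abs(direct - dual) < 1e-9
```

Enumerating finitely many candidate values of θ is also the idea that makes a decomposition into independent sub-problems possible, one for each candidate.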
In the remainder of this section, we provide lifted cuts that dominate their respective basic counterparts stated in (18) and (20). We start by lifting the basic interdiction cuts corresponding to the extended formulation.
. Then, the following lifted interdiction cut is valid for Problem (2) with the lower-level optimal-value function (3): Proof Let (x, η) be a feasible leader's decision for Problem (2) with the lowerlevel optimal-value function (3). If x b k = 1 holds for all k ∈ [K ], we obtain the basic interdiction cut as stated in (18), which is satisfied by x. Otherwise, we define K := {k ∈ [K ] : x b k = 0} and consider an alternative follower's decision (y , z ,θ), which is obtained as follows. For all k ∈ K, we set y a k = 0, y b k =ŷ a k as well as z a k = 0 and z b k = max{Δd b kŷ a k −θ, 0}. We further defineK := [n] \ {a k , b k : k ∈ K} and set y i =ŷ i as well as z i =ẑ i for all i ∈K. By construction, we have (y , z ,θ) ∈ Ψ . Since the interdiction cuts (18) are not only valid for vertices contained in the setΨ but also for any lower-level feasible point, the leader's decision x also satisfies the basic interdiction cut associated with (y , z ,θ), i.e., In particular, we haveẑ a k = max{Δd a kŷ a k −θ, 0} andẑ b k = 0 for all k ∈ [K ]. Hence, we can re-write Inequality (22) Note that the right-hand side of (24) corresponds to the right-hand side of (23) with the additional term k∈K d a kŷ a k x a k + (Δd b k − Δd a k )ŷ a k + max{Δd a kŷ a k −θ, 0} − max{Δd b kŷ a k −θ, 0} (25) subtracted. We now show that the latter term is non-negative. For all k ∈ K, we have and we consequently obtain The latter is greater or equal to d a kŷ a k x a k for all k ∈ K. In particular, we have d a kŷ a k x a k = 0 if x a k = 0 and d a kŷ a k > 0, otherwise. To sum up, the term in (25) is non-negative. This means that feasibility w.r.t. Inequality (23) implies feasibility w.r.t. Inequality (24). Since (23) is a valid inequality, this concludes the proof.
In the next theorem, we consider lifted versions for the enhanced interdiction cuts stated in (20).
is valid for Problem (2) with the lower-level optimal-value function (3).
Proof Let (x, η) be a feasible leader's decision for Problem (2) with the lower-level optimal-value function (3). If x b k = 1 holds for all k ∈ [K ], we obtain the enhanced interdiction cut as stated in (20), which is satisfied by x. Otherwise, let us define K := {k ∈ [K ] : x b k = 0} as well asK := D + \ {a k , b k : k ∈ K} and consider the alternative decision of the follower given by By construction, we have y ∈ Y . Since the interdiction cuts (20) are not only valid for vertices contained in the setŶ but also for any lower-level feasible point, the leader's decision x satisfies the basic interdiction cut associated with y and , i.e., Further, we can re-write Inequality (26) as Note that the right-hand side of (28) corresponds to the right-hand side of (27) with the additional term k∈Kd ( ) a k x a k ≥ 0 subtracted. Here, we haved( ) a k x a k = 0 if x a k = 0 andd( ) a k x a k > 0, otherwise. Hence, feasibility w.r.t. Inequality (27) implies feasibility w.r.t. Inequality (28).
Since (27) is a valid inequality, this concludes the proof.
Let us point out that the items in the sets S a and S b (or S a and S b in the multi-follower case) can be paired in different ways. This might yield different lifted cuts. To this end, we consider the following separation procedure. For an item a that is a candidate to enter the set S a or S a , i.e.,ŷ a ≥ 1 holds, we select its counterpart b among all items that satisfy the requirements with maximum value of (( for the extended formulation and the multi-follower-based approach, respectively. If such a pair (a, b) exists, items a and b are inserted into the sets S a and S b (or S a and S b in the multi-follower case) and then removed from any further consideration.

Problems with an Uncertain Lower-Level Constraint
To conclude this section, we briefly address uncertainties that only arise in a single packing-type constraint of the follower as stated around (5) and (6). For the validity of the proposed cuts, we need to impose the following.

Assumption 11
The uncertain lower-level constraint does not contain terms depending on the leader's decision, i.e., v = 0.

Assumption 12
All constraint coefficients of the uncertain lower-level constraint are non-negative, i.e., w i ≥ 0 holds for all i ∈ [n].
Assumption 11 is an analog of Assumption 8 stating that the leader's variables x are linked to the lower-level problem only via the interdiction constraints y i ≤ u i (1 − x i ). Assumption 12 ensures that the Γ w -robust lower-level problems (8) and (11) satisfy the downward monotonicity property. Both assumptions are necessary to exploit a penalized formulation of the Γ w -robust follower's problem to derive valid cuts as it is done in the case of uncertain objective function coefficients. Again, we remove interdiction constraints and add penalty terms −d i y i x i to the objective function for all i ∈ [n]. Hence, we consider the same deterministic objective function in the extended formulation as well as in the multi-follower formulation. In particular, the objective function is linear for a fixed leader's decision x. However, the description of the resulting feasible set differs for both approaches. For the extended formulation, we maximize over the feasible set of the penalized follower's problem projected onto the y-space, which is given by Θ = {y ∈ Y : ∃ z, θ ≥ 0 such that (8b) and (8c) are satisfied} .
The feasible set Θ is independent from the leader's decision. Hence, an optimal solution of the Γ w -robust follower's problem is attained at a vertex of the convex hull of Θ. We denoteΘ as the set containing the finite number of vertices of the convex hull of Θ. Then, the interdiction cuts are valid for Problem (2) and can equivalently replace Constraint (2c). Under Assumptions 5 and 6, when considering the multi-follower approach, we maximize over the -dependent feasible sets which may differ in each follower sub-problem. However, it is easy to see that an optimal solution needs to be contained in the union of all -dependent sets. Let Y be the set containing the finite number of vertices of the convex hull of the union of all -dependent sets. Then, we obtain the interdiction cuts for the multi-follower case by replacingŷ ∈Θ withŷ ∈ Y in (29), which can equivalently replace Constraint (2c) in Problem (2).

Computational Results
We now provide detailed numerical results for the proposed methods to solve interdiction problems with a monotone Γ-robust follower. Our solution approaches are implemented in Python 3.6.9, and Gurobi 9.1.2 is used to solve all arising optimization problems. To add the interdiction cuts described in the previous sections, we use Gurobi's lazy constraint callbacks, which requires setting the parameter LazyConstraints to 1. All other parameters are left at their default settings. The tests were run on an Intel XEON SP 6126 at 2.6 GHz (6 cores) with 32 GB RAM, which is part of the high performance cluster "Elwetritsch" at TU Kaiserslautern within the "Alliance of High Performance Computing Rheinland-Pfalz" (AHRP). For each test run, we set the time limit to 1 h. We refer to the branch-and-cut method that exploits the multi-follower approach as MF and to the one that is based on the extended formulation as Ext. In Sect. 3.2, we have discussed several enhancements to improve the performance of MF and Ext, which we assess in the following. The aim is to determine a "winner setting" for MF and Ext and to compare both approaches. For ease of presentation, we focus on the bilevel knapsack interdiction problem as a typical example of an interdiction problem with a monotone follower. Moreover, we only discuss the results for problems with uncertainties regarding the objective function coefficients.
Before we start, let us point out that there is no other tailored method in the literature that solves bilevel knapsack interdiction problems using a Γ-robust treatment of uncertainties. Hence, there are no alternative methods that we could compare with. However, using the extended formulation, one could transform the Γ-robust interdiction problem into a standard mixed-integer linear bilevel problem. The latter can, in general, be solved with the MibS solver [67] and the general branch-and-cut solver presented in [34]. We tested both solvers on 40 robustified knapsack instances with 35 items; see below for more details on our test set. MibS is not able to solve any of these instances, although instances with 35 items belong to the smallest class in our test set. The smallest optimality gap we get is 105%. The other solver solved 2 out of 40 instances. The runtimes are 1797.27 s and 3188.13 s, which is longer than the runtimes of our tailored methods by factors of more than 360 and 635, respectively. Consequently, we omit a more detailed comparison with these general-purpose bilevel solvers.

Generation of Knapsack Test Instances
To test our solution approaches, we consider the bilevel knapsack interdiction problem that has been considered in [23] and which is formally stated as

$$\min_{x \in \{0,1\}^n} \; p^\top y \quad \text{s.t.} \quad v^\top x \le B, \quad y \in \arg\max_{y' \in \{0,1\}^n} \left\{ p^\top y' : w^\top y' \le C, \; y'_i \le 1 - x_i \text{ for all } i \in [n] \right\}$$

with $B, C \in \mathbb{Z}_+$ and $p, v, w \in \mathbb{Z}^n_+$. In particular, the bilevel knapsack interdiction problem is a prominent example for an interdiction problem with a monotone follower. The motivation for us to focus on this type of problem is the following. Classic knapsack problems belong to the most intensively studied discrete optimization problems, which is due to their relevance in many real-world applications, e.g., in the field of economics. In particular, the bilevel knapsack interdiction problem naturally extends the classic knapsack problem such as to capture competitive situations; see, e.g., [31] for a specific application in corporate strategy. Moreover, the knapsack interdiction problem is commonly used as a benchmark for testing bilevel optimization solvers; see, e.g., [32,68]. In its deterministic variant, the knapsack interdiction problem has been studied, e.g., in [22,23,27,31,36,37,62,68]. For our computational study, we generate deterministic knapsack interdiction instances according to [54], which we adapt to account for a Γ-robust follower. Before we comment on the uncertainty parameterizations that we consider, let us briefly describe the generation of the knapsack instances. The profits $p_i$ and the follower's weights $w_i$ take uncorrelated integer values from the interval [0, 100]. For each instance size n ∈ {35, 40, 45, 50, 55, . . . , 100}, 10 instances have been generated. The follower's knapsack capacity C is set to $(N/11) \sum_{i=1}^{n} w_i$, where N ∈ {1, . . . , 10} is used to identify the instance number. The leader's weights $v_i$ and the interdiction budget B are uniformly random integers from the intervals [0, 100] and [C − 10, C + 10], respectively. Let us point out that the deterministic knapsack instances with size n ∈ {35, 40, . . . , 55} are taken from [23].
All other instances have not yet been studied in the literature and are newly generated. To study the effects of a Γ -robust follower, we consider four different uncertainty parameterizations. We assume that the deviations take either 10% or 25% of the nominal value. The parameter Γ is set to either 10% or 50% of the instance size n. In the case of a fractional value for Γ , we then consider the closest integer. Hence, our test set contains 560 robustified knapsack instances. Finally, let us mention that we do not consider any instances with more than 100 items because even the most advanced variants of the presented approaches cannot solve all of the 560 robustified knapsack instances described above within a reasonable amount of time.
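The generation scheme described above can be sketched as follows. This is a minimal illustration rather than the generator used for our study: the function names, the seeding, the rounding of the capacity to an integer, the clipping of the budget at zero, and the rounding of ties for Γ are assumptions made for this sketch.

```python
import random


def generate_instance(n, instance_number, deviation_pct, gamma_pct, seed=0):
    """Generate one robustified knapsack interdiction instance (illustrative)."""
    rng = random.Random(seed)
    profits = [rng.randint(0, 100) for _ in range(n)]
    follower_weights = [rng.randint(0, 100) for _ in range(n)]
    leader_weights = [rng.randint(0, 100) for _ in range(n)]
    # Capacity C = (N / 11) * sum_i w_i; rounding up is an assumption.
    capacity = (instance_number * sum(follower_weights) + 10) // 11
    # Budget B is drawn from [C - 10, C + 10]; clipping at 0 is an assumption.
    budget = rng.randint(max(0, capacity - 10), capacity + 10)
    # Deviations are a fixed percentage of the nominal profits.
    deviations = [deviation_pct * p / 100 for p in profits]
    # Gamma is a percentage of n, rounded to the closest integer
    # (rounding ties upwards is an assumption).
    gamma = (gamma_pct * n + 50) // 100
    return {
        "profits": profits,
        "follower_weights": follower_weights,
        "leader_weights": leader_weights,
        "capacity": capacity,
        "budget": budget,
        "deviations": deviations,
        "gamma": gamma,
    }


def generate_test_set():
    """All sizes, instance numbers, and uncertainty parameterizations."""
    return [
        # The seed ignores the parameterization, so that the four robustified
        # variants share the same nominal data.
        generate_instance(n, number, dev, gam, seed=1000 * n + number)
        for n in range(35, 101, 5)
        for number in range(1, 11)
        for dev in (10, 25)
        for gam in (10, 50)
    ]
```

With 14 instance sizes, 10 instances per size, and four uncertainty parameterizations, this yields the 560 robustified instances mentioned above.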

Lifted Cuts and Dominance Inequalities
We now assess the influence of lifted cuts and dominance inequalities on the overall performance of the solution method. To this end, we consider the following four settings.

MF/Ext: The basic setting in which only basic interdiction cuts are added without any further enhancements.
MF-D/Ext-D: The basic setting with the addition of dominance inequalities (21) regarding the leader's decision.
MF-L/Ext-L: The basic setting but instead of considering basic interdiction cuts, we add lifted interdiction cuts.
MF-LD/Ext-LD: Like MF-L or Ext-L but with the addition of dominance inequalities (21).
As a default for MF, we consider the cut separation strategy in which all violated (lifted) interdiction cuts are added to the problem formulation. Figure 1 shows the empirical cumulative distribution functions (ECDFs) w.r.t. the runtimes and the number of branch-and-bound nodes of the four settings. The ECDFs can be interpreted as the percentage of instances (y-axis) that can be solved within a certain amount of time and/or after investigating a certain number of branch-and-bound nodes (log-scaled x-axis). Note that, to have a fair comparison, we exclude 29 instances that none of the considered variants can solve within the time limit. Moreover, we exclude 19 instances that every variant can solve within 5 s to avoid drawing erroneous conclusions because of low runtimes. Hence, we consider a total of 512 instances at this point. While lifted interdiction cuts only slightly improve the performance of MF and Ext, the use of dominance inequalities significantly enhances the performance of both approaches. The combination of lifted cuts and dominance inequalities yields only minor further improvement compared to MF-D and Ext-D. Nevertheless, MF-LD and Ext-LD dominate all other settings of the respective solution approach. This observation is also underlined by the results in Table 1. In what follows, we thus retain the variants with additional dominance inequalities and lifted interdiction cuts as our "winner setting" for both approaches.
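The instance filtering and the ECDF values described above can be computed along the following lines. This is a minimal sketch under stated assumptions: the function names and the runtime data are hypothetical, a run that hits the time limit is encoded as None, and "within 5 s" is read as a runtime of at most 5 s.

```python
def comparable_instances(runtimes, trivial=5.0):
    """Indices of instances kept for the comparison.

    runtimes maps a variant name to a list of per-instance runtimes in
    seconds, with None marking a run that hit the time limit.
    """
    variants = list(runtimes)
    kept = []
    for i in range(len(runtimes[variants[0]])):
        times = [runtimes[v][i] for v in variants]
        solved = [t for t in times if t is not None]
        if not solved:
            continue  # no variant solved the instance within the time limit
        if len(solved) == len(times) and max(solved) <= trivial:
            continue  # every variant solved the instance within 5 s
        kept.append(i)
    return kept


def ecdf(times, grid):
    """Fraction of instances solved within each time budget in grid."""
    return [
        sum(1 for t in times if t is not None and t <= budget) / len(times)
        for budget in grid
    ]


# Hypothetical runtimes of two variants on four instances.
runtimes = {
    "MF": [2.0, 100.0, None, None],
    "MF-D": [1.0, 40.0, 900.0, None],
}
kept = comparable_instances(runtimes)
curves = {
    name: ecdf([times[i] for i in kept], grid=[10.0, 100.0, 1000.0])
    for name, times in runtimes.items()
}
```

In this toy example, the first instance is dropped because both variants solve it within 5 s, and the last one is dropped because no variant solves it.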

The Benefits of Maximal Packings
We consider maximal packings of the follower to avoid the generation of unnecessary interdiction cuts. This is achieved in the following manner. We determine a feasible follower's decision y for the current x by either solving the follower's extended formulation (14) or the sub-problems (15). In the next step, we complete the follower's decision to a maximal packing in a greedy-like fashion. To this end, we order the indices of the follower's variables according to non-increasing profit-to-weight ratio and then gradually add items that still fit into the follower's knapsack. When considering the extended formulation, however, we need to further check if the follower's decision (y, z, θ) satisfies θ ≥ Δp i . This requirement is necessary to ensure the feasibility of the maximal packing w.r.t. the constraints z i + θ ≥ Δp i y i for all i ∈ [n]. Finally, we generate and add only the interdiction cut that corresponds to the follower's maximal packing. To assess the influence of maximal packings, we adopt the parameterizations of our previous "winner settings" MF-LD and Ext-LD with the difference that we now only add cuts corresponding to maximal packings of the follower. We refer to these settings as MF-LD-Max and Ext-LD-Max. Again, for a fair comparison, we exclude 21 instances that none of the variants can solve within the time limit and 90 instances that every variant can solve within 5 s so that we consider a total of 449 instances here. Based on Fig. 2, it can be seen that adding only the interdiction cuts corresponding to maximal packings of the follower improves the performance of the overall solution method for both approaches. In particular, this holds true for the easier instances, which is also underlined by the number of solved problems presented in Table 2. Table 2 further shows considerably smaller mean and median runtimes for MF-LD-Max and Ext-LD-Max compared to their counterparts without maximal packings. 
Also the mean and median number of branch-and-bound nodes visited is significantly smaller in both settings when maximal packings of the follower are considered. To sum up, the observations drawn from Fig. 2 and Table 2 clearly suggest that MF-LD-Max and Ext-LD-Max are the "winner settings" among the considered variants.
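The greedy completion to a maximal packing described above can be sketched as follows for binary follower's variables and a single knapsack constraint. The function name and the data are hypothetical. The additional check on θ that is required for the extended formulation is not covered by this sketch.

```python
def complete_to_maximal_packing(y, profits, weights, capacity):
    """Greedily extend a feasible binary packing y to a maximal one."""
    y = list(y)
    residual = capacity - sum(w * yi for w, yi in zip(weights, y))
    if residual < 0:
        raise ValueError("the given packing exceeds the knapsack capacity")

    def ratio(i):
        # Items of weight zero always fit and are considered first.
        return float("inf") if weights[i] == 0 else profits[i] / weights[i]

    # Non-increasing profit-to-weight ratio; add every item that still fits.
    for i in sorted(range(len(y)), key=ratio, reverse=True):
        if y[i] == 0 and weights[i] <= residual:
            y[i] = 1
            residual -= weights[i]
    return y


# Hypothetical data: item 0 is packed, the remaining capacity is 4.
packing = complete_to_maximal_packing(
    y=[1, 0, 0, 0], profits=[10, 6, 8, 3], weights=[5, 2, 4, 3], capacity=9
)
```

Here, item 1 is added first because it has the largest ratio. Afterward, neither item 2 nor item 3 fits into the residual capacity of 2, so the resulting packing is maximal.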
As mentioned in Sect. 3.2, we also considered leader's maximal packings. However, preliminary computational results revealed that maximal packings for the leader interfere with Gurobi's integrated branching rules and node selection, which is why we decided to refrain from this ingredient.

The Impact of Warmstarting
In our computational study, we also investigated how warmstarting the proposed methods may affect their performance. To this end, we considered two options-a heuristic similar to the one presented in [36] and solving the nominal knapsack interdiction problem-to determine an initial feasible (and potentially good) decision of the leader for Problem (P 0 ). The pair (x, η) that we obtain using any of the two options is then provided to Gurobi as MIP start.
The computational results obtained for both settings revealed that warmstarting the methods has no significant impact on their performance, neither for MF nor for Ext. This is why we omit the details on the runtimes and the number of branch-and-bound nodes here. As per default, both methods are warmstarted using the heuristic in all subsequent considerations.

Comparison of Different Cut Separation Strategies and the Potential of Parallelization
We further evaluate enhancement techniques that exploit the special structure of the multi-follower formulation. Due to the fact that the overall problem can be considered as a single-leader-multi-follower problem with independent followers, we can make use of parallelization as briefly mentioned in Sect. 2. In this context, we can further consider different cut separation strategies instead of adding all violated interdiction cuts in each iteration of the algorithm. This way, the number of cuts added to the problem formulation can be reduced, which might speed up the overall solution method. We adopt the parameterizations of the previous "winner setting", i.e., we consider the multi-follower formulation with lifted cuts corresponding to maximal packings of the follower, dominance inequalities regarding the leader's decision, and heuristic warmstarts. For notational convenience, we omit MF-LD-Max as a prefix for the considered variants when we focus on the following cut separation strategies.

All-In: The default setting in which all interdiction cuts that are violated by the current leader's decision are added to the problem formulation.
Most-Violated: A single cut is added corresponding to the interdiction inequality (20) that is maximally violated by the current leader's decision.
Sorting: In [56], the authors propose a learning mechanism to identify the sub-problems that produce potentially good cuts. This is done by taking the information of previous iterations into account. We adapt the proposed strategy to our setting such that a single potentially good cut is added in each iteration.
First-In: We iterate over the follower sub-problems indexed by {Γ , . . . , n + 1}, add the first interdiction cut that is violated by the current leader's decision, and then break the loop.
Random: Among the violated interdiction cuts, we randomly choose a single cut and add it to the problem formulation.
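Four of the strategies above can be sketched as a single selection routine. This is an illustration under stated assumptions, not our implementation: the callable violation_of, which stands for solving one follower sub-problem and returning the violation of its cut, the tolerance, and the strategy labels are hypothetical. Random is realized here by scanning the sub-problems in random order and stopping at the first violated cut, which selects uniformly among the violated cuts. Sorting is omitted since it relies on the mechanism of [56].

```python
import random


def select_cuts(indices, violation_of, strategy, rng=None, tol=1e-6):
    """Return the sub-problem indices whose cuts are added in this round."""
    if strategy in ("first-in", "random"):
        order = list(indices)
        if strategy == "random":
            (rng or random).shuffle(order)
        for index in order:
            # Stop at the first violated cut; later sub-problems are skipped.
            if violation_of(index) > tol:
                return [index]
        return []
    # The remaining strategies need the violation of every sub-problem.
    violations = {index: violation_of(index) for index in indices}
    violated = [index for index in indices if violations[index] > tol]
    if strategy == "all-in":
        return violated
    if strategy == "most-violated":
        return [max(violated, key=violations.get)] if violated else []
    raise ValueError("unknown strategy: " + strategy)


# Hypothetical violations of the cuts of four sub-problems.
toy_violations = {3: 0.0, 4: 2.0, 5: 5.0, 6: 1.0}
indices = sorted(toy_violations)
all_in = select_cuts(indices, toy_violations.get, "all-in")
most_violated = select_cuts(indices, toy_violations.get, "most-violated")
first_in = select_cuts(indices, toy_violations.get, "first-in")
randomized = select_cuts(indices, toy_violations.get, "random", random.Random(0))
```

The sketch also makes the difference in computational effort visible: First-In and Random stop as soon as a violated cut is found, whereas All-In and Most-Violated evaluate every sub-problem.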
To assess the potential of parallelization, we consider so-called idealized runtimes, which reflect the overall runtime of the solution method provided that there are sufficient capacities available to solve all sub-problems in parallel. For each instance, the idealized runtimes are computed after solving the problems sequentially by taking the maximum over the runtime of each sub-problem.
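One way to obtain both runtime measures from a sequential run is sketched below. We assume here that the time spent on the leader's problem and the runtimes of the sub-problems are logged separately for each separation round, and that the maximum is taken per round. Both the logging format and the per-round aggregation are assumptions of this sketch.

```python
def sequential_and_idealized(leader_time, rounds):
    """Return (sequential, idealized) runtimes in seconds.

    rounds is a list with one entry per separation round; each entry is the
    list of runtimes of the sub-problems solved in that round.
    """
    sequential = leader_time + sum(sum(times) for times in rounds)
    # With sufficient capacity, a round takes as long as its slowest sub-problem.
    idealized = leader_time + sum(max(times) for times in rounds if times)
    return sequential, idealized


# Hypothetical log: 10 s for the leader's problem and two separation rounds.
sequential, idealized = sequential_and_idealized(
    leader_time=10.0, rounds=[[1.0, 3.0, 2.0], [0.5, 0.5]]
)
```

In this toy log, the sequential runtime is 17 s, whereas the idealized runtime is 13.5 s.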
In Fig. 3, we compare the aforementioned cut separation strategies w.r.t. sequential and idealized runtimes. Note that we exclude 21 instances that none of the considered variants can solve within the time limit and 43 instances that every variant can solve within 5 s. Hence, we consider a total of 496 instances here. It can be seen that the first insertion strategy First-In harms the performance of the solution method for both sequential and idealized runtimes as a benchmark. The last observation is also reflected by the results on the number of branch-and-bound nodes presented in Fig. 4 (left). A possible reason for this might be the following. By adding the first violated interdiction cut, later follower sub-problems that might produce stronger cuts are neglected. In particular, it is possible that the cut corresponding to the same follower sub-problem is added in each iteration of the algorithm. To overcome this situation, we consider Random as a modification of First-In. Based on Fig. 3, it can be observed that this variant performs significantly better. In particular, Random outperforms all other cut separation strategies w.r.t. sequential runtimes. The potential reasons for the effectiveness of the randomized first insertion strategy are twofold. On the one hand, it seems beneficial to add a single cut in each iteration of the algorithm; otherwise, as for the strategy All-In, the leader's problem can get extremely large w.r.t. the number of constraints. On the other hand, Random has comparatively low computational costs. For Sorting, it can be seen that having information from previous iterations does not lead to a better choice than randomly adding a violated cut. When considering Most-Violated, we need to solve all sub-problems to determine the most violated interdiction cut, which seems to be rather expensive. 
The previous observations are underlined by the results in Table 3, since the sequential mean and median runtimes for Random are considerably smaller compared to all other cut separation strategies. Focusing on idealized runtimes, however, it is noteworthy that Random is no longer the "winning" cut separation strategy. Based on Fig. 3 and Table 3, Most-Violated dominates the randomized first insertion strategy w.r.t. idealized runtimes. This is to be expected since we can benefit the most from parallelization for Most-Violated, where we indeed have to solve all of the sub-problems. In particular, the previous observations suggest that the higher computational costs for solving all sub-problems-especially if this is done in parallel-can be compensated to some extent by the strength of the added cuts. This strength is further visualized in Fig. 4 (right) and Table 4. Here, to assess the quality of the added cuts, we consider the runtimes independently of the times spent for the cut generation. However, it is noteworthy that, apart from the strength of Most-Violated, the results also suggest that Sorting and Random yield cuts of good quality for the easier instances. Finally, we further highlight the benefit of adding the most-violated cut by considering the mean and median number of branch-and-bound nodes presented in Table 4. It can be observed that the mean and median number of nodes for Most-Violated are significantly smaller than the ones for almost all of the other cut separation strategies. Only All-In has a considerably smaller mean number of investigated nodes compared to Most-Violated. However, as stated earlier, it seems beneficial to add a single cut in each iteration of the algorithm instead of adding all violated cuts, which is underlined by both the runtime results as well as the number of solved instances.
To sum up, Most-Violated yields the strongest cuts among the considered variants. Moreover, provided that the necessary capacities are available to solve all sub-problems in parallel, Most-Violated particularly outperforms all other variants w.r.t. idealized runtimes. Hence, we consider Most-Violated as our "winning" strategy in the idealized setting. When considering sequential runtimes, however, the cost of solving all sub-problems cannot be completely compensated by the strength of the added cuts. Since the overall runtime is our decisive criterion, we prefer Random in the sequential setting.

Comparison of the Solution Approaches
We now compare the "winning" parameterizations of the extended formulation and the multi-follower formulation. For the multi-follower-based approach, we particularly distinguish between the sequential and the idealized setting. Hence, we consider MF-LD-Max-Random, MF-LD-Max-Most-Violated, and Ext-LD-Max, which we abbreviate with MF-seq, MF-ideal, and Ext in the following. Again, we exclude 16 instances that none of the considered variants can solve within the time limit and 137 instances that every variant can solve within 5 s. In total, we thus consider 407 instances. Figure 5 (left) shows the ECDF plots w.r.t. the runtimes of the three considered variants. Note that we consider sequential and idealized runtimes for MF-seq and MF-ideal, respectively. We observe that MF-ideal clearly outperforms the remaining two approaches. This particularly affirms that the strength of the multi-follower-based approach lies in the possibility to parallelize the solution of the follower sub-problems. The previous observation is also underlined by the mean and median runtimes in Table 5. It can be seen that MF-ideal has significantly smaller mean and median runtimes compared to MF-seq and Ext. Despite being not as significant as when considering the runtimes, the same qualitative behavior can also be observed for the results on the number of nodes in Fig. 5 (right) and Table 5. If, however, the capacity is not available to have an idealized parallelization, the multi-follower-based approach MF-seq still performs slightly better than Ext. Based on Fig. 5, MF-seq seems to have an advantage over Ext on the easier instances. The last observation is also supported by the results on the mean and median runtimes presented in Table 5. Yet the number of solved instances suggests that Ext performs slightly better than MF-seq on the harder instances. 
Let us emphasize, however, that the number of instances that can be solved by Ext but not by MF-seq is comparatively small, which is why we consider MF-seq as the overall better method in the sequential setting.

The Computational Cost of Robustness
To conclude our computational study, we address the computational cost of robustness. This expression captures the effect of robustification, e.g., on the overall runtimes of the solution methods. To evaluate the computational cost of robustness, we compare the three "winning" approaches MF-seq, MF-ideal, and Ext for the considered uncertainty parameterizations, which are referred to as (Δ, Γ). Here, Δ ∈ {10, 25} is used to specify the considered percentage deviations in the objective function coefficients and Γ ∈ {10, 50} denotes the percentage that the parameter Γ takes of the instance size. Based on Table 6, it can be seen that the mean and median runtimes to solve the Γ-robust knapsack interdiction problem increase with increasing values of Δ and Γ for Ext. For MF-seq and MF-ideal, however, this does not seem to be the case in principle. Detailed runtime results for each knapsack instance can further be found in Table 8. For both the multi-follower approach and the extended formulation, we compare the nominal runtimes to the mean runtimes obtained from all considered uncertainty parameterizations. Note that we label the sequential and idealized mean runtimes with the superscripts seq and ideal, respectively. To further assess the cost of robustness w.r.t. the runtimes, we measure the relative performance of the method in the Γ-robust and in the nominal case. Note that we restrict ourselves to the instances that have been considered in Sect. 4.6 for a fair comparison, i.e., we exclude 16 instances that none of the "winner" settings can solve within the time limit and 137 instances that every variant can solve within 5 s. To measure the relative performance, we determine the coefficient of runtimes q_i = t_{i,rob} / t_{i,nom} for each knapsack instance i. Here, t_{i,rob} and t_{i,nom} denote the runtimes of the considered solution method for instance i in the Γ-robust and in the nominal case, respectively.

Fig. 6 Box-plots of the coefficients of runtimes q_i = t_{i,rob} / t_{i,nom} for the "winner" settings of Ext and MF

In Fig. 6, we show box-plots for the coefficients of runtimes corresponding to MF-seq, MF-ideal, and Ext. Each box in Fig. 6 represents the distribution of the determined coefficients q over the 407 considered instances for the three "winning" solution methods. It can be seen that the box for Ext is considerably larger compared to the other two approaches. This shows that the coefficients for MF-seq and MF-ideal are less dispersed, which suggests that the performance of these methods is more stable compared to the one of Ext. In particular, the boxes for MF-seq and MF-ideal are of similar size, which reflects the similarity between the multi-follower-based solution approaches. Note that MF-seq and MF-ideal rely on the same solution procedure. The only difference between these approaches is that MF-seq is a sequential method and MF-ideal exploits parallelization. Let us point out that, in Fig. 6, we use a logarithmic scaling of the y-axis for runtime coefficients greater than 40 such as to capture the spread of the outliers in a detailed and comprehensive way. For Ext, the outliers are widely scattered, which shows that the performance of the method is rather volatile. In contrast to that, smaller ranges of the outliers can be observed for MF-seq and MF-ideal. All previous observations are also affirmed by the results in Table 7. Based on Fig. 6 and Table 7, it thus seems reasonable to prefer the multi-follower approach over the extended formulation since it seems to be the more stable method. Interestingly, we also observe coefficients of runtimes strictly smaller than 1 for some of the instances. This means that the robust counterpart can be solved faster than the nominal problem, i.e., robustification does not necessarily always lead to increased computational costs.
However, the mean and median coefficients of runtimes shown in Table 7 emphasize that the robustification is not "for free". Still, the computational cost of robustness for the multi-follower-based approaches, MF-ideal in particular, is comparatively small. To sum up, Γ-robust solutions are obtained at the expense of increased computational difficulty of the problem but, provided that sufficient capacities are available to solve all of the follower sub-problems in parallel, the price of robustness w.r.t. runtimes is comparatively small. Thus, we prefer MF-ideal over MF-seq and Ext. In the sequential setting, the results in Table 7 clearly suggest that MF-seq outperforms Ext w.r.t. the stability of the method. Based on the results in Table 5, however, the advantage of MF-seq over Ext is not as significant as when the relative performance measure is considered. In particular, as elaborated in Sect. 4.6, Ext seems to have an advantage on harder instances, which also justifies the use of the extended formulation.
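The relative performance measure used above can be sketched in a few lines of Python. This is only a minimal illustration of the coefficient of runtimes q_i = t_{i,rob} / t_{i,nom} and of the box-plot statistics discussed in the text. The runtime values below are hypothetical and are not taken from the study, and the function names are our own.

```python
from statistics import mean, median, quantiles


def runtime_coefficients(t_rob, t_nom):
    """Return q_i = t_rob[i] / t_nom[i] for every instance i."""
    if len(t_rob) != len(t_nom):
        raise ValueError("need one robust and one nominal runtime per instance")
    return [rob / nom for rob, nom in zip(t_rob, t_nom)]


def summarize(q):
    """Summary statistics of the kind a box-plot displays."""
    q1, _, q3 = quantiles(q, n=4, method="inclusive")
    return {
        "mean": mean(q),
        "median": median(q),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,  # height of the box, i.e., dispersion of the coefficients
        # share of instances whose robust counterpart is solved faster
        "share_below_1": sum(v < 1 for v in q) / len(q),
    }


# Hypothetical runtimes in seconds (illustration only, not data from the paper).
t_nom = [10.0, 20.0, 40.0, 8.0, 50.0]
t_rob = [15.0, 18.0, 120.0, 8.0, 75.0]

q = runtime_coefficients(t_rob, t_nom)
stats = summarize(q)
print(q)
print(stats)
```

A small interquartile range corresponds to a small box in the plot, and a positive value of `share_below_1` corresponds to instances for which the robust counterpart is solved faster than the nominal problem.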

Conclusion
In this paper, we consider discrete min-max problems with a follower facing uncertain lower-level data. We adopt a Γ-robust approach, i.e., the follower hedges only against those scenarios in which a subset of the uncertain parameters deviates so as to adversely affect the solution of the problem. We present two approaches, an extended formulation and a multi-follower formulation, to model this type of situation. For both frameworks, we present a fairly generic branch-and-cut method. Nevertheless, we can obtain stronger formulations for certain types of problems. As an example, we consider interdiction problems with a monotone Γ-robust follower to derive problem-tailored cuts that generalize existing interdiction cuts from the literature. Finally, we conduct a computational study to assess the performance of the two proposed solution approaches. To this end, we focus on the bilevel knapsack interdiction problem, which is one of the most prominent examples of monotone interdiction problems.
The computational results suggest that the extended formulation (Ext) performs slightly better on harder knapsack instances. However, the multi-follower formulation (MF) shows smaller overall mean and median runtimes and a more stable performance than Ext. In particular, we can exploit parallelization for the multi-follower formulation, which is a major strength of this solution approach. Nevertheless, the study justifies the use of both the extended formulation and the multi-follower approach.
Despite the contribution of this paper, there are still several interesting research questions that require further investigation. We briefly sketch three of them.
1. Throughout this paper, we assume that there are no coupling constraints, i.e., there are no upper-level constraints explicitly depending on the variables of the follower. This is a crucial assumption for the validity of the proposed methods. Otherwise, we would not be able to project the follower's variables out of the problem using the optimal-value function as is done in Sect. 1. Nevertheless, developing solution methods for Γ-robust bilevel problems with coupling constraints is a reasonable aspect of future work.
2. In this paper, we focus on interdiction problems with a monotone Γ-robust follower to obtain problem-tailored cuts. An interesting direction for future research would be to investigate whether generic cuts such as, e.g., intersection cuts [33,35] can be adapted to the setting described in Sect. 2.
3. Finally, we would like to emphasize that we only consider uncertainties in a single packing-type constraint in the lower level. The extended formulation can easily be adapted to allow for deviations in multiple lower-level constraints. For the multi-follower formulation, however, this situation significantly increases the difficulty of the problem. This is due to the assumption regarding the ordering of the indices, which is required to exploit the results in [11]. In general, it is not possible to order the indices such that the deviations in the constraint coefficients are nonincreasing if there are multiple uncertain constraints. Thus, this aspect is left for future research.
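The ordering issue raised in the third item can be made concrete with a small Python sketch. It checks whether a single index order exists that sorts the deviations of several uncertain constraints nonincreasingly at the same time. The function name and the deviation values are hypothetical and serve only as an illustration. The sketch says nothing about how the results in [11] are used beyond the ordering assumption stated above.

```python
def common_nonincreasing_order(deviation_rows):
    """Return an index order under which every row is nonincreasing, or None.

    deviation_rows[r][i] is the (hypothetical) deviation of coefficient i
    in the r-th uncertain constraint.
    """
    n = len(deviation_rows[0])
    # Candidate order: sort indices by their tuple of deviations, largest first.
    # If any common order exists, this lexicographic order is one of them.
    order = sorted(
        range(n),
        key=lambda i: tuple(row[i] for row in deviation_rows),
        reverse=True,
    )
    for row in deviation_rows:
        # Checking neighbouring positions suffices for a nonincreasing sequence.
        if any(row[a] < row[b] for a, b in zip(order, order[1:])):
            return None
    return order


# One uncertain constraint: an order always exists.
print(common_nonincreasing_order([[3, 1, 2]]))
# Two constraints whose deviations rank the indices identically.
print(common_nonincreasing_order([[3, 1, 2], [6, 2, 5]]))
# Two constraints that rank indices 0 and 1 in opposite ways.
print(common_nonincreasing_order([[3, 1, 2], [1, 3, 2]]))
```

With a single uncertain constraint the required order always exists. With two or more constraints it exists only if no pair of indices is ranked in opposite ways by different constraints, which is a restrictive condition.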

Proof of Proposition 3
Let $x$ be a feasible upper-level decision and let $\ell \in \{1, \dots, n_y + 1\}$ be arbitrary but fixed. Further, let $y$ be an optimal solution of the $\ell$th sub-problem (10) that is parameterized in $x$. By (9) and since $y \in \{0, 1\}^{n_y}$, we obtain

which concludes the proof.

Proof of Proposition 4
Let $x$ be a feasible upper-level decision. Further, let $y \in Y(x)$ and let $y' \in Y$ be such that $y' \le y$ holds. Due to $B \in \mathbb{R}^{m \times n}_{+}$, we obtain

Thus, $y'$ is a feasible follower's decision for the given $x$, i.e., $y' \in Y(x)$.

Proof of Proposition 5
Let $x$ be a feasible upper-level decision. First, we show that the extended formulation (14) satisfies the monotonicity property. To this end, let $(y, z, \theta)$ be a feasible follower's decision for Problem (14) for the given $x$. Further, let $y' \in Y$ be such that $y' \le y$ holds. Due to $B \in \mathbb{R}^{m \times n}_{+}$ and $\Delta d_i \ge 0$ for all $i \in [n]$, we obtain

$$z_i + \theta \ge \Delta d_i y_i \ge \Delta d_i y'_i, \quad i \in [n], \qquad By' \le By \le b,$$

i.e., the follower's decision $(y', z, \theta)$ is feasible for Problem (14) for the given $x$. Second, we show that each sub-problem (15) satisfies the monotonicity property. Note that there is no need to specify $\ell \in \{\Gamma_d, \dots, n + 1\}$ since the feasible set of (15) does not depend on $\ell$. For the given $x$, let $y$ be a feasible follower's decision for sub-problem (15). Further, let $y' \in \{0, 1\}^n$ be such that $y' \le y$. Since we restrict ourselves to binary follower's variables in the multi-follower case, we have valid upper bounds $u_i = 1$ for all $i \in [n]$. Applying the same arguments as before, $y'$ is feasible for sub-problem (15) for the given $x$. Consequently, the $\Gamma_d$-robust lower-level problems (14) and (15) satisfy the monotonicity property.
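The monotonicity argument can be checked by brute force on a toy instance. The following Python sketch assumes a knapsack-type follower with a single packing constraint and interdiction constraints $y_i \le 1 - x_i$. The function names, the weights, and the capacities are hypothetical, and the sketch is not the formulation (14) or (15) itself. It only illustrates why nonnegative constraint coefficients are needed.

```python
from itertools import product


def is_feasible(y, x, weights, capacity):
    """Toy follower: one packing constraint plus interdiction y_i <= 1 - x_i."""
    fits = sum(w * yi for w, yi in zip(weights, y)) <= capacity
    not_interdicted = all(yi <= 1 - xi for yi, xi in zip(y, x))
    return fits and not_interdicted


def is_monotone(x, weights, capacity):
    """Check that every binary y' <= y of a feasible y is feasible as well."""
    n = len(weights)
    for y in product((0, 1), repeat=n):
        if not is_feasible(y, x, weights, capacity):
            continue
        for y_prime in product((0, 1), repeat=n):
            dominated = all(a <= b for a, b in zip(y_prime, y))
            if dominated and not is_feasible(y_prime, x, weights, capacity):
                return False
    return True


# Nonnegative weights: the feasible set is closed under rounding entries down.
print(is_monotone(x=(0, 1, 0), weights=[2, 1, 3], capacity=4))
# A negative weight breaks the property: dropping item 0 makes y infeasible.
print(is_monotone(x=(0, 0), weights=[-2, 3], capacity=1))
```

In the second call, the point (1, 1) is feasible but (0, 1) is not, which is exactly the situation that the nonnegativity assumption on $B$ rules out.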

Proof of Proposition 6
Let $x$ be a feasible upper-level decision. For notational convenience, let $\Psi(x)$ and $\Psi$ denote the feasible sets of Problems (14) and (16), respectively. Further, let $(y^*, z^*, \theta^*)$ be an optimal solution of Problem (14) for the given leader's decision $x$. Then, $(y^*, z^*, \theta^*)$ is also feasible for Problem (16), i.e., $(y^*, z^*, \theta^*) \in \Psi$. In particular, $y^*_i x_i = 0$ holds for all $i \in [n]$, i.e., both problems have the same objective function value for $(y^*, z^*, \theta^*)$. Thus, we obtain max y,θ,z

Let $(\hat{y}, \hat{z}, \theta)$ be an optimal solution of Problem (16) for the given leader's decision $x$. Without loss of generality, we assume that there is exactly one item $k \in [n]$ for which the interdiction constraint $\hat{y}_k \le u_k (1 - x_k)$ is not satisfied, i.e., $\hat{y}_k \ge 1$ and $x_k = 1$. Otherwise, we repeat the following until no items are left that violate the interdiction constraint. We consider the alternative follower's decision $(y', z', \theta)$ with

By construction, $(y', z', \theta)$ is feasible for Problem (16) and satisfies all interdiction constraints. Moreover, we obtain max y,θ,z

i.e., the alternative follower's decision is optimal for Problem (16). In particular, it is also feasible for Problem (14), i.e., $(y', z', \theta) \in \Psi(x)$, and we have $y'_i x_i = 0$ for all $i \in [n]$. Hence, we obtain max y,θ,z

Due to (30) and (31), Problems (14) and (16) admit the same optimal value.

Proof of Proposition 7
Let $x$ be a feasible upper-level decision and let $\ell \in \{\Gamma_d, \dots, n + 1\}$ be arbitrary but fixed. Further, let $y^*$ be an optimal solution of the $\ell$th sub-problem (15) for the given leader's decision $x$. Then, $y^*$ is also feasible for the $\ell$th sub-problem (17), i.e., $y^* \in Y$. In particular, $y^*_i x_i = 0$ holds for all $i \in [n]$, i.e., both sub-problems have the same objective function value for $y^*$. Thus, we obtain

Let $\hat{y}$ be an optimal solution of the $\ell$th sub-problem (17) for the given leader's decision $x$. Without loss of generality, suppose there is exactly one item $k \in [n]$ for which the interdiction constraint $\hat{y}_k \le 1 - x_k$ is not satisfied, i.e., $\hat{y}_k = 1 = x_k$. Then, we consider the alternative follower's decision

By construction, $y'$ is feasible for the $\ell$th sub-problem (17) and satisfies all interdiction constraints. Moreover, we obtain

i.e., the alternative follower's decision is optimal for Problem (17). In particular, it is also feasible for Problem (15)

Due to (32) and (33), Problems (15) and (17) admit the same optimal value.
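The idea behind the proofs of Propositions 6 and 7 can be illustrated numerically. The following Python sketch assumes a toy knapsack follower and compares two ways of modeling interdiction: explicit constraints $y_i \le 1 - x_i$, and an objective in which each profit is multiplied by $(1 - x_i)$ while the interdiction constraints are dropped. These two model forms, the function names, and all data are assumptions made for illustration. They are not the exact Problems (15) and (17). One profit is chosen negative on purpose, since modified profits in the sub-problems may be nonpositive.

```python
from itertools import product


def knapsack_points(weights, capacity):
    """All binary vectors satisfying the single packing constraint."""
    n = len(weights)
    return [
        y
        for y in product((0, 1), repeat=n)
        if sum(w * yi for w, yi in zip(weights, y)) <= capacity
    ]


def value_with_interdiction_constraints(x, profits, weights, capacity):
    """Follower's optimum when interdiction is enforced by y_i <= 1 - x_i."""
    return max(
        sum(p * yi for p, yi in zip(profits, y))
        for y in knapsack_points(weights, capacity)
        if all(yi <= 1 - xi for yi, xi in zip(y, x))
    )


def value_with_interdiction_in_objective(x, profits, weights, capacity):
    """Follower's optimum when interdicted items merely yield zero profit."""
    return max(
        sum(p * (1 - xi) * yi for p, xi, yi in zip(profits, x, y))
        for y in knapsack_points(weights, capacity)
    )


profits = [5, -1, 4, 3]   # hypothetical data
weights = [2, 1, 3, 2]
capacity = 4

for x in product((0, 1), repeat=len(profits)):
    a = value_with_interdiction_constraints(x, profits, weights, capacity)
    b = value_with_interdiction_in_objective(x, profits, weights, capacity)
    print(x, a, b)
```

For every leader's decision $x$, both values coincide on this instance. The reason is the one used in the proofs: setting the interdicted entries of any feasible $y$ to zero keeps it feasible by monotonicity and does not change the objective with the factors $(1 - x_i)$.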

Proof of Theorem 2
Let $(x, \eta) \in X \times \mathbb{R}$ be a given leader's decision. Due to the validity of the proposed cuts, it suffices to show that the feasibility of $(x, \eta)$ w.r.t. either the interdiction cuts (18) or (19) implies $\eta \ge \Phi(x)$. To this end, suppose that $(x, \eta)$ satisfies $Ax \ge a$ and either the interdiction cuts i for all (ŷ, ẑ, θ) ∈ Ψ

which concludes the proof.

Proof of Proposition 8
This follows immediately from $d_i > 0$ and from the fact that $x_i \in \{0, 1\}$ implies

Proof of Proposition 9
Let $(x, \eta)$ be feasible for Problem (2) with the lower-level optimal-value function (3). Further, let $\ell \in \{\Gamma_d, \dots, n + 1\}$ be arbitrary but fixed. Due to $x_i \in \{0, 1\}$ for all $i \in [n]$, we have $d(\ell)_i (1 - x_i) \le 0$ for all $i \notin D^+$. Hence, all follower's variables $y_i$ with $i \notin D^+$ can be omitted in this sub-problem, i.e., we obtain

The validity of the new interdiction cuts (20) can be shown in a similar way as in Sect. 3. In particular,

holds for all $\hat{y} \in \hat{Y}$ and $\ell \in \{\Gamma_d, \dots, n + 1\}$. Thus, the cuts (20) dominate the basic interdiction cuts (19).

Proof of Proposition 10
This can be shown in analogy to the proof of Proposition 8.