Tighter reformulations using classical Dawson and Sankoff bounds for approximating two-stage chance-constrained programs

We extend and improve recent results given by Singh and Watson on using classical bounds on the union of sets in a chance-constrained optimization problem. Specifically, we revisit the so-called Dawson and Sankoff bound that provided one of the best approximations of a chance constraint in the previous analysis. Next, we show that our work is a generalization of the previous work, and in fact the inequality employed previously is a very relaxed approximation with assumptions that do not generally hold. Computational results demonstrate on average over a 43% improvement in the bounds. As a byproduct, we provide an exact reformulation of the floor function in optimization models.


Background
In a recent work, Singh and Watson [13] approximate a two-stage joint chanceconstrained optimization model using approximations based on the union of sets.First, they rewrite a joint chance constraint (JCC) as a union of sets of "failed" scenarios.Then, using classical Bonferroni-styled approximations of the union, they bound the chance constraint, thereby bounding the optimization model itself.With a maxi-mization objective, a lower (upper) bound for a union (or, the JCC) provides an upper (lower) bound for the optimization model.They consider two upper bounds and four lower bounds for the union; and, find that in a tradeoff between computational effort and the quality of solution, the so-called Bonferroni union bound and the linearized version of the Dawson and Sankoff [4] union bound nearly always provide the best optimization bounds.
Bounding the probability of the union of n events by the use of joint probabilities of k < n events has been studied since Boole and Bonferroni.For an extensive historical survey of such inequalities, see, e.g.[9].The sharpness of linear bounds can be determined using linear programming [9].Relatively recent methods to approximate this probability include cherry trees [3], chordal graphs [5], and aggregation and disaggregation [10].The hardness of computing these bounds is also studied [12].For a survey of bounds, including linear and non-closed form expressions, see [2].
We use the same notation as [13] and reiterate it here for conciseness.Let, x = (x t ) t∈T be the first stage decision; w = (w t ) ω t∈T ∈ Ω be a set of equally likely discrete scenario realizations with N = |Ω|, ω = ω 1 , ω 2 , . . ., ω N ; y = (y t ) ω t∈T be the second stage decision; A t = {ω : x t > y ω t + w ω t } be the set of scenarios we fail to satisfy at stage t; and, ε be a small positive number strictly less than 1.Let u ω t = 1 and v ω tt = 1, t > t denote we fail in scenario ω at time t, and in scenario ω at both times t and t , respectively; i.e., v ω tt = 1 if and only if Further, we use the following two mathematical conventions.First, • rounds its argument down to the nearest integer; i.e., for any x ∈ R, x = max n∈Z {n ≤ x}.Second, frac{x} denotes the fractional part of its argument; i.e., for any x ∈ R, frac{x} = x − x .
The following is the generic chance-constrained optimization model of [13]: The JCC in equation (1b) can be rewritten exactly as: P( t∈T A t ) ≤ ε.The contribution of Singh and Watson is to utilize classical (upper and lower) approximations of this union within optimization model (1).

Our contributions
In this work, we investigate these classical bounds further.We seek to tighten the approximating optimization model by using tightened approximations of the probability of the union.We also present suitable reformulations of some bounds that were not investigated in the previous work due to their non-linear structure.In this sense, our work improves on the work initiated by Singh and Watson [13].The following are the key contributions of this article: 1. We observe that the Dawson and Sankoff bound, whose linearized version is claimed by Singh and Watson [13] as the tightest, is in fact a very relaxed bound with stringent necessary assumptions that are nearly always not satisfied.2. We utilize a tighter bound than the lower bound of Sathe et al. [11] used in Singh and Watson [13].3. We reformulate a resulting mixed-integer non-linear optimization exactly as a mixed-integer optimization model, including an exact reformulation of the floor function.4. We demonstrate a significant improvement in our computational experiments, with the new inequality dramatically improving the bounds by over 40%.

Bound by Sathe et al. [11]
The following two bounds on the union of sets, available from Result 1 and Result 3 by Sathe et al. [11] respectively, are used in [13].
Sathe et al. also provide a tightening of the bounds in equation (2); Result 2 and Result 4 of [11] say that the bounds in (2a) and (2b) can be tightened if 2S 2 < (|T | − 1)S 1 and S 2 < 0 respectively hold.Clearly the latter condition cannot hold.Indeed, the upper bound in (2b) is the tightest bound by a linear combination of S 1 and S 2 ; this is proved in [10].Result 2 of [11] provides the following tightened approximation of equation (2a): where z = 2 + 2S 2 S 1 .By the use of linear programming, Prékopa et al. [10] prove that the bounds in equation (3) and equation (2b) are the tightest lower and upper bounds using a linear combination of S 1 and S 2 .Further, both Prékopa [10] and Dawson and Sankoff [4] prove the bound in (3) without requiring the condition mentioned by Sathe et al [11]; this condition is sufficient to guarantee an improvement but not necessary.
A subtle and important distinction stems from what the previously cited authors call as "linear" combinations of S 1 and S 2 .Such linear bounds follow a structure of aS 1 + bS 2 where a and b are coefficients that can be determined.In previous works [4,10,11], it is assumed that S 1 and S 2 can be calculated apriori; thus, z or 2S 2 S 1 is a known quantity.Then, equation ( 3) contains an exponent of one for both S 1 and S 2 ; hence, the bound is referred to as a linear bound.In this sense, the bounds can be said to be the tightest linear bounds given that z is known; for nonlinear bounds, see, e.g.[8].We revisit this distinction in Sect.2.3.

Bound by Dawson and Sankoff [4]
Theorem 1.1 of Dawson and Sankoff [4] provides the following lower bound, using the notation presented above.
where θ = frac{ 2S 2 S 1 }.Substituting θ in terms of z, and using straightforward algebra, the right hand sides of equation ( 3) and equation ( 4) are exactly equivalent.The minimum of the right hand side of equation ( 4) occurs at θ = 0; then, the following bound results: This bound is available from Corollary 1 of [4].In the computational results of [13], a piecewise linearization of the bound in equation ( 5) nearly always achieves the best upper bounds for the optimization model; this is given by equation ( 6) of [13] and is called as the linearized Dawson and Sankoff bound.However, equation ( 5) is a special case (i.e., a relaxation) of the general bound in equation ( 4).Thus, an approximation of model (1) with equation (5) instead of equation ( 4) always results in a worse (i.e., larger) bound.As we are interested in smaller upper bounds, this reasoning suggests we can do better with equation (4).Further, for this special case to hold we require the strong apriori assumption: θ = 0; in other words, 2S 2  S 1 ∈ Z.This clearly does not hold true in general.
An analogy for this observation is that of solving a mixed-integer program (MIP) using its linear programming relaxation (LP).The objective function value of the LP always provides a bound for that of the MIP, but for the LP solution to be valid to the MIP the restrictive assumption of all its optimal variables being integer is required.

Summary of observations
In the proceeding sections of this manuscript, we are interested in investigating the tighter version of the bound in equation (2a)-specifically that given by equation (3) or equation ( 4)-as an approximation of the JCC in model (1).We note that in an optimization model such as model (1), S 1 and S 2 are decision variables which are obviously unknown before solving the model.Hence, z is a decision variable as well, and equation (3) presents a challenging mixed-integer non-linear constraint.In Sect.3.1 we provide results that assist us in modeling the quantity 2S 2 S 1 , while in Sect.3.2 we linearize the constraint (3).
We conclude this section with a summary of the above-mentioned observations, all of which follow directly from the three cited works [4,10,11].
Remark 2 Constraint (5) offers a special case of constraint (4) as a relaxation.Thus, upper bounds using the former are expected to be larger (worse) than the upper bounds using the latter.The upper bound using constraint ( 5) is derived by a stringent assumption on S 1 and S 2 .
Remark 3 Constraint (2a) is weaker than constraint (3) or constraint (4).Thus, upper bounds using the former are expected to be larger (worse) than the upper bounds using the latter.However, unlike constraint (5), constraint (2a) does not require any assumptions on S 1 and S 2 .Proof See equation (6.2.13) of [9].

Remark 4
In any optimal solution to model (1) with positive coefficients R and B we always have S 1 > 0. To see this, first we note that if S 1 = 0 then all of u, v, S 2 are zero, and also P( t∈T A t ) = 0; see, e.g., [9].Thus, the objective function value always improves when at least one of the u variables is one, or S 1 > 0. Hence, the ratio S 2 S 1 is well-defined.
This background brings us to our main result of this section.

Theorem 1 For model (1), we have
Proof From Lemma 1, equation (6a) directly holds.Next, from the of S 1 , S 2 we observe that Lemma 2 is applicable.
In general, to model a floor function exactly we require a strict inequality because the feasible region is an open set; see, e.g., [6].However, as optimization solvers can only model non-strict inequalities, strict inequalities are approximated with a finite tolerance; see, e.g., [7].Having an upper bound on the fractional part of a decision variable, such as that provided by Theorem 1, allows an exact reformulation.That being said, if the quantity N |T | is very large the above result can have little practical relevance as this could potentially cause the model to be (i) poorly scaled, or (ii) the quantity 1  N |T | could be lower than the feasibility tolerance of the optimization solver.The following corollary to Theorem 1 summarizes this discussion.

Linearizing equation (3): inequality by Dawson and Sankoff [4]
In this section, we linearize the constraint . Unlike the piecewise linear approximation used in [13], or a finite tolerance approximation of the floor function, here we provide an exact linearization.Consider the following equations: Proposition 1 Equation ( 7) is an exact reformulation of equation ( 3) for all α ∈ (1 − We observe from equation (7d) and (7g) that exactly one of the κ variables is one; i.e., κ is a so-called SOS1 variable, see, e.g.[1].Then, z and z 2 − z can be represented by the quantities  7g)-(7h) we observe that if κ t * = 1 then δ t * = S 1 and δ t = 0, t = t * .Thus, the quantity ).This is ensured by constraints (7e)-(7f).Then constraint (7a) correctly captures In the Electronic Supplementary Material, we present the complete reformulation of model ( 1) approximated by equation ( 3).

Computational experiments
In this section, we compare the reduction of the upper bounds for model (1), with the new inequalities proposed in Sect. 3 as opposed to inequalities (2a) and ( 5) used by Singh and Watson [13].To validate the new inequality, we use the same system as [13], with two exceptions.First, we re-run all computations on a newer version of GAMS-30.1 as opposed to the originally used 24.8.5 since the newer version contains Gurobi 9.0.0 with significantly improved features [7].Second, to strictly compare the improvement of the new inequality with the original ones, we do not use the so-called mixing set of Remark 2 of Singh and Watson [13], and neither do we use a piecewise linearization.Thus, we re-compute the optimal objective function values for both of these inequalities, as well as the true optimal objective function (i.e., using an exact representation of the JCC).We do this for all the eight tables reported in Singh and Watson [13], with a maximum time limit of 2100 seconds for each instance.Further details on the data, representation and modeling are available in [13].
In Tables 1a and 1b, we summarize our two sets of computational results using the ARMA and Gaussian sampling methods of [13].The bottom halves of both the tables have samples containing half the mean and variances as the top halves; see, [13] for details.Here, the constraints "Original 1" and "Original 2" refer to inequalities (2a) and ( 5) used by Singh and Watson, while constraint "New" refers to inequality (3) with the exact reformulation of Sect.3.2.The "MIP gap" column denotes the percentage optimality gap reported by GAMS/Gurobi while solving the optimization model with the respective constraint.The "Gap" column denotes the optimality gap between the true objective function value and the objective function value of the approximating model.In the same spirit as Singh and Watson, we use a conservative estimate of the Gap 1 ; thus, our results are worst-case estimates.The "Improvement" column denotes the percentage reduction in the (conservative) upper bounds of the New inequality from the Original inequalities.We recall two facts presented above: (i) inequalities (3) 1 The Gap is defined as follows: Bound , where "Bound" is the maximum possible value of the objective function value of the relevant approximation and z * L B is the lower bound on the true optimal value reported by Gurobi.For the numbers above the dashed lines in the two tables the true model is solved to optimality; but for the numbers below the dashed lines in the two tables the true model could not be solved to optimality.See [13] for details. and ( 4) are exactly the same, and (ii) the linearization in Sect.3.2 is exact.Thus, the empirical quantification of the improvement in bounds results strictly from two sources: (i) a tightened approximation of the JCC, and (ii) a better reformulation of the mathematical model of the approximating constraint.
In the 32 instances of Table 1, the New inequality results in an average MIP gap of only 2.9% as opposed to 20.6% of the Original 1 inequality.This significantly lower MIP gap suggests the new formulation is much more efficient to solve for a MIP solver than the original formulation.The Gap from the optimal in the New inequality is on average 4.4% -a dramatic reduction from the 44.7% average Gap in the Original 1 inequality.Since our Gap estimate is the worst-case estimate, in reality these gaps could be even smaller.The upper bounds of the approximating model with the New inequality improve on average by 43.2% from the Original 1 inequality.
In contrast, the Original 2 inequality could always be solved in under 30 seconds to a MIP gap of 0%.This shows the effectiveness of Gurobi 9.0.0 for quadratic optimization models.However, the bounds obtained from it are even farther off than the Original 1 inequality; the average Gap from the optimal in the Original 2 inequality is 68.2%.The upper bounds of the approximating model with the New inequality improve on average by 66.8% from the Original 2 inequality.
We end this section with a discussion on the size of the approximating problems.Representing S 1 and S 2 requires N |T | and N |T ||T −1| 2 variables, respectively.The constraint matrix given by equation (7) requires O(|T |) number of constraints and variables.Further, we note that the optimization models corresponding to the New, Original 1 and Original 2 inequalities change only in the representation of the JCC.Thus, if N |T |, as in [13], the ballpark size of problem remains the same for all three inequalities.For N = 250, our optimization models have an order of 81K continuous variables, 6K binary variables, and 213K constraints; these values approximately double when N doubles to 500.how the previously used version of this bound is a restrictive special case whose apriori assumptions might not hold true.We provided linear reformulations of non-linear combinations of S 1 and S 2 , as well as an exact reformulation of the floor function.
Our computational results show significant promise; thus, there seems scope in this direction of work.Future work could examine developing a better upper bound on 2S 2 S 1 , as well as on frac{ 2S 2 S 1 }.The former would reduce the number of binary variables in the resulting mathematical model.Future research could also examine quantifying the analytical gap between the approximations.