Probability estimation via policy restrictions, convexification, and approximate sampling

  • Full Length Paper
  • Series B

Published in Mathematical Programming

Abstract

This paper develops various optimization techniques to estimate the probability of events in which the optimal value of a convex program, satisfying certain structural assumptions, exceeds a given threshold. First, we relate the search for affine/polynomial policies for the robust counterpart to existing relaxation hierarchies in MINLP (Lasserre in Proceedings of the international congress of mathematicians (ICM 2018), 2019; Sherali and Adams in A reformulation–linearization technique for solving discrete and continuous nonconvex problems, Springer, Berlin). Second, we leverage recent advances in Dworkin et al. (in: Kaski, Corander (eds) Proceedings of the seventeenth international conference on artificial intelligence and statistics, Proceedings of machine learning research, PMLR, Reykjavik, 2014), Gawrychowski et al. (in: ICALP, LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018) and Rizzi and Tomescu (Inf Comput 267:135–144, 2019) to develop techniques to approximately compute the probability that binary random variables drawn from Bernoulli distributions belong to a specially structured union of sets. Third, we use convexification, robust counterpart, and chance-constrained optimization techniques to cover the event set of interest with such set unions. Fourth, we apply our techniques to the network reliability problem, which quantifies the probability of failure scenarios that cause network utilization to exceed one. Finally, we provide a preliminary computational evaluation of our techniques on test instances for network reliability.


References

  1. MOSEK ApS: MOSEK modeling cookbook (2020)

  2. Bangla, A.K., Ghaffarkhah, A., Preskill, B., Koley, B., Albrecht, C., Danna, E., Jiang, J., Zhao, X.: Capacity planning for the Google backbone network. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45385.pdf (2015)

  3. Bao, X., Sahinidis, N.V., Tawarmalani, M.: Multiterm polyhedral relaxations for nonconvex, quadratically constrained quadratic programs. Optim. Methods Softw. 24(4–5), 485–504 (2009)

  4. Ben-Tal, A., El Ghaoui, L., Nemirovski, A.: Robust Optimization, vol. 28. Princeton University Press, Princeton (2009)

  5. Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, vol. 2. SIAM, Philadelphia (2001)

  6. Ben-Tal, A., Nemirovski, A.: On safe tractable approximations of chance-constrained linear matrix inequalities. Math. Oper. Res. 34(1), 1–25 (2009)

  7. Bertsimas, D., Goyal, V.: On the power and limitations of affine policies in two-stage adaptive optimization. Math. Program. 134(2), 491–531 (2012)

  8. Bertsimas, D., Popescu, I.: Optimal inequalities in probability theory: a convex optimization approach. SIAM J. Optim. 15(3), 780–804 (2005)

  9. Chang, Y., Jiang, C., Chandra, A., Rao, S., Tawarmalani, M.: Lancet: better network resilience by designing for pruned failure sets. Proc. ACM Meas. Anal. Comput. Syst. 3(3), 1–26 (2019)

  10. Chang, Y., Rao, S., Tawarmalani, M.: Robust validation of network designs under uncertain demands and failures. In: Proceedings of the 14th USENIX Conference on Networked Systems Design and Implementation, NSDI’17, pp. 347–362. USENIX Association, USA (2017)

  11. Doerr, B.: Probabilistic tools for the analysis of randomized optimization heuristics. In: Theory of Evolutionary Computation, pp. 1–87. Springer (2020)

  12. Dworkin, L., Kearns, M., Xia, L.: Efficient inference for complex queries on complex distributions. In: Kaski, S., Corander, J. (eds.) Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, vol. 33, pp. 211–219. PMLR, Reykjavik (2014)

  13. Gawrychowski, P., Markin, L., Weimann, O.: A faster FPTAS for #knapsack. In: ICALP, LIPIcs, vol. 107, pp. 64:1–64:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)

  14. Ghanem, R., Higdon, D., Owhadi, H.: Handbook of Uncertainty Quantification, vol. 6. Springer, Berlin (2017)

  15. Gill, P., Jain, N., Nagappan, N.: Understanding network failures in data centers: measurement, analysis, and implications. In: Proceedings of the ACM SIGCOMM 2011 Conference, pp. 350–361 (2011)

  16. Gopalan, P., Klivans, A., Meka, R., Stefankovic, D., Vempala, S., Vigoda, E.: An FPTAS for #knapsack and related counting problems. In: Proceedings of the 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS’11, pp. 817–826. IEEE Computer Society, USA (2011)

  17. Gurobi Optimization Incorporated: Gurobi optimizer reference manual. http://www.gurobi.com (2019)

  18. Han, S., Tao, M., Topcu, U., Owhadi, H., Murray, R.M.: Convex optimal uncertainty quantification. SIAM J. Optim. 25(3), 1368–1387 (2015)

  19. Hanasusanto, G.A., Kuhn, D.: Conic programming reformulations of two-stage distributionally robust linear programs over Wasserstein balls. Oper. Res. 66(3), 849–869 (2018)

  20. Karp, R.M., Luby, M., Madras, N.: Monte-Carlo approximation algorithms for enumeration problems. J. Algorithms 10(3), 429–448 (1989)

  21. Knight, S., Nguyen, H., Falkner, N., Bowden, R., Roughan, M.: The internet topology zoo. IEEE J. Select. Areas Commun. 29(9), 1765–1775 (2011)

  22. Lasserre, J.B.: A semidefinite programming approach to the generalized problem of moments. Math. Program. 112(1), 65–92 (2008)

  23. Lasserre, J.B.: The moment-SOS hierarchy. In: Proceedings of the International Congress of Mathematicians (ICM 2018) (2019)

  24. Laurent, M.: Sums of squares, moment matrices and optimization over polynomials, The IMA Volumes in Mathematics and its Applications Series, vol. 149, pp. 155–270. Springer, Germany (2009)

  25. Liu, H.H., Kandula, S., Mahajan, R., Zhang, M., Gelernter, D.: Traffic engineering with forward fault correction. In: Proceedings of the 2014 ACM Conference on SIGCOMM, pp. 527–538 (2014)

  26. Luedtke, J., Ahmed, S.: A sample approximation approach for optimization with probabilistic constraints. SIAM J. Optim. 19(2), 674–699 (2008)

  27. Markopoulou, A., Iannaccone, G., Bhattacharyya, S., Chuah, C.N., Ganjali, Y., Diot, C.: Characterization of failures in an operational IP backbone network. IEEE/ACM Trans. Netw. 16(4), 749–762 (2008)

  28. Matthews, L.R., Gounaris, C.E., Kevrekidis, I.G.: Designing networks with resiliency to edge failures using two-stage robust optimization. Eur. J. Oper. Res. 279(3), 704–720 (2019)

  29. Mihalák, M., Šrámek, R., Widmayer, P.: Counting approximately-shortest paths in directed acyclic graphs. In: Kaklamanis, C., Pruhs, K. (eds.) Approximation and Online Algorithms, pp. 156–167. Springer, Cham (2014)

  30. Mühlpfordt, T., Roald, L., Hagenmeyer, V., Faulwasser, T., Misra, S.: Chance-constrained ac optimal power flow: a polynomial chaos approach. IEEE Trans. Power Syst. 34(6), 4806–4816 (2019)

  31. Nemirovski, A., Shapiro, A.: Convex approximations of chance constrained programs. SIAM J. Optim. 17(4), 969–996 (2007)

  32. Owhadi, H., Scovel, C., Sullivan, T.J., McKerns, M., Ortiz, M.: Optimal uncertainty quantification. SIAM Rev. 55(2), 271–345 (2013)

  33. Rizzi, R., Tomescu, A.I.: Faster FPTASes for counting and random generation of knapsack solutions. Inf. Comput. 267, 135–144 (2019)

  34. Rockafellar, R.T.: Convex Analysis, vol. 28. Princeton University Press, Princeton (1970)

  35. Rockafellar, R.T., Uryasev, S.: Optimization of conditional value-at-risk. J. Risk 2, 21–42 (2000)

  36. Rockafellar, R.T., Wets, R.J.B.: Variational Analysis, vol. 317. Springer, Berlin (2009)

  37. Sherali, H.D., Adams, W.P.: A Reformulation–Linearization Technique for Solving Discrete and Continuous Nonconvex Problems, vol. 31. Springer, Berlin (2013)

  38. Tawarmalani, M., Richard, J.P.P., Xiong, C.: Explicit convex and concave envelopes through polyhedral subdivisions. Math. Program. 138(1–2), 531–577 (2013)

  39. Wang, Y., Wang, H., Mahimkar, A., Alimi, R., Zhang, Y., Qiu, L., Yang, Y.R.: R3: resilient routing reconfiguration. In: Proceedings of the ACM SIGCOMM 2010 Conference, pp. 291–302 (2010)

  40. Wets, R.J.B.: Stochastic programs with fixed recourse: the equivalent deterministic program. SIAM Rev. 16(3), 309–339 (1974)

  41. Wood, R.K.: Deterministic network interdiction. Math. Comput. Model. 17(2), 1–18 (1993)

  42. Xu, G., Burer, S.: A copositive approach for two-stage adjustable robust optimization with uncertain right-hand sides. Comput. Optim. Appl. 70(1), 33–59 (2018)

  43. Zhang, Y., Ge, Z., Greenberg, A., Roughan, M.: Network anomography. In: Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement, pp. 30–30 (2005)

Acknowledgements

We acknowledge Shabbir Ahmed for his insightful comments at Dagstuhl on the use of Markov's inequality for OUQ and Sanjay G. Rao for extensive discussions on NR. The second author would like to acknowledge the funding provided by NSF CMMI-1727989 and AFOSR 21RT0453.

Author information

Corresponding author

Correspondence to Mohit Tawarmalani.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

List of assumptions

Here, we will briefly describe the assumptions we make in different parts of the paper.

  1.

    In Sect. 2, for relating RLT to affine and polynomial policies, we assume:

    • There is no duality gap between CP\((\cdot )\) and CD\((\cdot )\) (A1).

  2.

    For deriving the column generation algorithm and to show convergence of RLT at the \(m^{\text {th}}\) level in Theorem 1, we assume in Sect. 2.1 that:

    • The distribution of \({\mathbb {X}}\) is supported on a finite set of points \({\mathscr {T}}\) in \({\mathscr {P}}\) (A2).

    • \({\mathbb {K}}={\mathbb {R}}^p_+\) (A3).

    • \({\mathscr {T}}\subseteq \{0,1\}^m\) and an inequality description of \(\mathop {\text {conv}}({\mathscr {T}})\) is available (A4).

    Additionally, when \({\mathscr {T}}\) consists of the vertices of a simplex, we show in Proposition 2 that the concave envelope of the indicator function can be used to compute \(\mathop {\text {Pr}}\nolimits _{*}({\mathcal {F}})\). The column generation algorithm also assumes that expectations of a set of functions of the random variable \({\mathbb {X}}\), denoted as \(\{{\mathfrak {f}}_{\alpha }({\mathbb {X}}), \alpha \in {\bar{\varGamma }}\subseteq {\mathbb {N}}^{m}\}\), are known.

  3.

    In Sects. 3 and 4 we devise counting and sampling algorithms by assuming that:

    • \({\mathscr {P}}=\mathop {\text {conv}}({\mathscr {T}})=[0,1]^m\) and \({\mathbb {X}}\in \{0,1\}^m\), with distribution \(\varTheta = \bigotimes _{i=1}^{m}\text {Bernoulli}(p_i)\) (tensor product of m independent Bernoulli distributions). Moreover, we assume that \(p_i = \frac{a_i}{n_i}\), where \(a_i, n_i\in {\mathbb {N}}\), and GCF\((a_i, n_i) = 1\) (A5).

    • Without loss of generality, the weights of the general inequality defining each sliced low weight polytope (SLWP) are non-negative, i.e., \(w_{i} \in {\mathbb {Z}}_{\ge 0}\) for all \(i \in [m]\) (A6).

  4.

    We derive the Bernstein approximation by assuming in Sect. 4.2 that:

    • \(S' = \bigl \{\sum _{i=1}^{m} w_{i}x_{i}\ge C, \sum _{i=1}^{m} x_{i} = {\mathfrak {b}}, \text { where } w_{i} \ge 0\ \forall i \in [m]\bigr \}\) and \(\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}_i=1)=p\) for all \(i\in [m]\) (A7).
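Under (A5), the probability of any binary outcome is a rational with denominator \(\prod _{i} n_i\), so probability estimation reduces to weighted counting. The following sketch is purely illustrative (it is not the paper's counting algorithm): it computes \(\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}\in S)\) exactly by enumeration for tiny m, on a hypothetical event set.

```python
from fractions import Fraction
from itertools import product

def prob_in_set(a, n, member):
    """Exact Pr_Theta(X in S) for X with independent Bernoulli(a_i/n_i)
    components, by full enumeration (illustration only: exponential in m)."""
    total = Fraction(0)
    for x in product((0, 1), repeat=len(a)):
        if member(x):
            p = Fraction(1)
            for xi, ai, ni in zip(x, a, n):
                p *= Fraction(ai, ni) if xi else Fraction(ni - ai, ni)
            total += p
    return total

# Hypothetical event: S = {x in {0,1}^3 : x_1 + 2 x_2 + 3 x_3 >= 3},
# with p = (1/2, 1/3, 1/4).
p = prob_in_set([1, 1, 1], [2, 3, 4],
                lambda x: x[0] + 2 * x[1] + 3 * x[2] >= 3)
assert p == Fraction(3, 8)
```

Exact rational arithmetic mirrors the role of (A5): all probabilities live on the grid \(\bigl (\prod _i n_i\bigr )^{-1}{\mathbb {Z}}\).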

Proof of Proposition 1

Let \(y = P^\intercal x + q\). For y to be feasible, \((AP^\intercal + B)x + Aq \le _{\mathbb {K}}c\) for all x satisfying \({\mathfrak {C}}x \le {\mathfrak {d}}\). Then, \((AP^\intercal + B)x + Aq - c \le _{{\mathbb {K}}} U^\intercal {\mathfrak {C}}x - U^\intercal {\mathfrak {d}}= U^\intercal ({\mathfrak {C}}x - {\mathfrak {d}}) \le _{{\mathbb {K}}} 0\), where the first inequality follows from (6b) and (6c), and the last inequality holds because, by (6e), \(U^\intercal ({\mathfrak {C}}x - {\mathfrak {d}})\) is a non-positive conic combination of vectors in \({\mathbb {K}}\). Moreover, for the objective, \(e^\intercal (P^\intercal x + q) = {\underline{\varTheta }}^\intercal {\mathfrak {C}}x + e^\intercal q \le {\underline{\varTheta }}^\intercal {\mathfrak {d}}+ e^\intercal q\), where the equality follows from (6d) and the inequality holds because \({\underline{\varTheta }}\ge 0\) and \({\mathfrak {C}}x \le {\mathfrak {d}}\). This shows that the feasible solutions of (6) describe an affine policy and that the objective function value overestimates that of the corresponding affine policy. We now show that the relaxation is exact when \({\mathbb {K}} = {\mathbb {R}}_{+}^{p}\) and \({\mathscr {P}}\ne \emptyset \). For an affine policy to be feasible, \((A_{k}P^{\intercal } + B_{k})x + A_{k}q - c_{k} \le 0\) for all x satisfying \({\mathfrak {C}}x \le {\mathfrak {d}}\) and for all \(k\in [p]\), where \(A_{k}^\intercal \in {\mathbb {R}}^{n}\) and \(B_{k}^\intercal \in {\mathbb {R}}^{m}\) denote the \(k^{\text {th}}\) rows of A and B, respectively, and \(c_{k}\in {\mathbb {R}}\) denotes the \(k^{\text {th}}\) entry of c. In other words, for \(a^\intercal = -(A_{k}P^\intercal + B_{k})\) and \(b = A_{k} q - c_{k}\), it follows that \(\{x\mid a^\intercal x < b, {\mathfrak {C}}x \le {\mathfrak {d}}\} =\emptyset \). By Farkas' Lemma, one of \({\mathscr {S}}_{1}\) and \({\mathscr {S}}_{2}\) is therefore feasible, where

$$\begin{aligned} {\mathscr {S}}_{1}&\,{:}{=}\, \bigl \{(\uplambda , \mu )\in {\mathbb {R}}_{++}\times {\mathbb {R}}^{l}_+\bigm |\uplambda a^\intercal + \mu ^\intercal {\mathfrak {C}}= 0, \uplambda b + \mu ^\intercal {\mathfrak {d}}\le 0\bigr \}\\ {\mathscr {S}}_{2}&\,{:}{=}\, \bigl \{(\uplambda ,\mu )\in {\mathbb {R}}_{+}\times {\mathbb {R}}^{l}_+ \bigm | \uplambda a^\intercal + \mu ^\intercal {\mathfrak {C}}= 0, \uplambda b + \mu ^\intercal {\mathfrak {d}}< 0\bigr \}. \end{aligned}$$

In \({\mathscr {S}}_{1}\), to see the equivalence, scale \(\uplambda \) to 1 and set \(\mu = {\widetilde{U}}_{k}\), the \(k^{\text {th}}\) column of U. In \({\mathscr {S}}_{2}\), we may assume \(\uplambda =0\), since otherwise we obtain a solution to \({\mathscr {S}}_1\). Therefore, there is a non-negative \(\mu \) such that \(\mu ^\intercal {\mathfrak {C}}= 0\) and \(\mu ^\intercal {\mathfrak {d}}< 0\), which, by Farkas' Lemma, contradicts \({\mathscr {P}}\ne \emptyset \).\(\square \)
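The dualization at the heart of this argument can be checked numerically on a small hypothetical instance: an inequality is valid over a polyhedron exactly when a non-negative multiplier vector certifies it, which is the role the columns of U play in (6). The sketch below (illustrative data only) verifies both sides for a box uncertainty set.

```python
# Hypothetical instance: uncertainty set {x : Cx <= d} = [0,1]^2, and one
# valid inequality a^T x <= b playing the role of a row of the robust
# constraint (AP^T + B)x <= c - Aq.
C = [[1, 0], [0, 1], [-1, 0], [0, -1]]
d = [1, 1, 0, 0]
a = [2, 3]; b = 5

# Over the box, max a^T x is attained at a vertex; enumerate the vertices.
verts = [(i, j) for i in (0, 1) for j in (0, 1)]
assert max(2 * x1 + 3 * x2 for x1, x2 in verts) <= b   # robust feasibility

# Farkas certificate: u >= 0 with u^T C = a^T and u^T d <= b witnesses
# validity of the inequality over the whole set, without enumerating x.
u = [2, 3, 0, 0]
assert all(ui >= 0 for ui in u)
assert [sum(u[r] * C[r][j] for r in range(4)) for j in range(2)] == a
assert sum(ur * dr for ur, dr in zip(u, d)) <= b
```

The certificate replaces the "for all x" quantifier with finitely many linear conditions, which is exactly why the robust counterpart (6) is a single conic program.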

Extension of Proposition 1 to consider polynomial policies

The design of such polynomial policies relates to the use of polynomial chaos expansion for structured representation of uncertainty in chance-constrained optimization; see [30] for its use in optimal power flow. Suppose \(y_{j} = \sum _{\alpha \in \gamma _j} g_{j}^{\alpha } x^{\alpha }\) for \(j \in [n]\), where \(\alpha = (\alpha _{1},\dots , \alpha _{m})\in \gamma _j\subseteq {\mathbb {N}}^m\), \(g_{j}^{\alpha } \in {\mathbb {R}}\), and \(x^{\alpha }\) represents the monomial \(x_{1}^{\alpha _{1}}\cdots x_{m}^{\alpha _{m}}\). Let \(Y(x) = \{y \in {\mathbb {R}}^n : A(x)y + {\mathscr {X}}(x) \le _{{\mathbb {K}}} c\}\), where A(x) is a \(p\times n\) matrix of polynomial functions and \({\mathscr {X}}(x)\) is a \(p\)-vector of polynomials such that \(A(x)_{kj} = \sum _{\beta \in S_{kj}} a^{\beta }_{kj} x^{\beta }\) and \({\mathscr {X}}(x)_{k} = \sum _{\beta \in S_{k0}}a^{\beta }_{k0}x^{\beta }\) for some sets \(S_{kj}\) and \(S_{k0}\). Let \(S'{:}{=}\{\alpha ': \exists (j,k) \text { such that } \alpha ' = \alpha + \beta , \alpha \in \gamma _j, \beta \in S_{kj}\}\). Assume that \({\mathscr {P}}= \{x': {\mathfrak {C}}'x' \le {\mathfrak {d}}'\}\) is a linear relaxation of \(\{x': x'_{\alpha '} = x^{\alpha '}\forall \alpha '\in S', {\mathfrak {C}}x\le {\mathfrak {d}}\}\). Let \(\varsigma = \max _{x\in {\mathscr {P}}}\min _{y\in Y(x)} e^\intercal y\) and restrict y to a polynomial policy to define:

$$\begin{aligned} \varPsi _{1}^* {:}{=}&\min _{\xi ,g}&\quad&\xi&\quad&\\&\xi \ge e^\intercal \sum _{\alpha \in \gamma _j} g_j^\alpha x'_{\alpha }&\forall x'\in \{ {\mathfrak {C}}'x \le {\mathfrak {d}}' \}\\{}&\sum _{j\in [n]}\sum _{\beta \in S_{kj}}\sum _{\alpha \in \gamma _j} a^\beta _{kj}g^\alpha _{j} x'_{\alpha + \beta } + \sum _{\beta \in S_{k0}}a^\beta _{k0} x'_{\beta } \le _{{\mathbb {K}}} c_k&\begin{aligned} \forall k\in [p]\\ \forall x'\in \{{\mathfrak {C}}'x \le {\mathfrak {d}}'\}. \end{aligned} \end{aligned}$$

Then, assuming \(\{x': {\mathfrak {C}}' x'\le {\mathfrak {d}}'\}\) is not empty and \({\mathbb {K}}={\mathbb {R}}^p_+\), dualization allows us to succinctly express the constraints for all \(x'\) so that \(\varPsi _D^* = \varPsi _1^*\), where:

$$\begin{aligned} \varPsi _{D}^* =&\min _{g, \varTheta ,U}&\quad&\sum _{r\in [l]} {\mathfrak {d}}'_{r}\varTheta _{r} + \sum _{j:0\in \gamma _{j}} e_{j}g_{j}^0&\quad&\end{aligned}$$
(C.2a)
$$\begin{aligned}&\sum _{r\in [l]}U_{rk}{\mathfrak {C}}'_{r\alpha } = \sum _{j\in [n]}\sum _{\alpha -\alpha '\in S_{kj}}\sum _{\alpha '\in \gamma _{j}} a^{\alpha -\alpha '}_{kj} g_{j}^{\alpha } + a^{\alpha }_{k0}&\forall k\in [p],\alpha \in \gamma _j\end{aligned}$$
(C.2b)
$$\begin{aligned}&\sum _{r\in [l]}U_{r k}{\mathfrak {d}}'_{r} + \sum _{j:0\in S_{k,j}} a^{0}_{kj}g^{0}_{j} + \sum _{0\in S_{k0}} a^{0}_{k0} \le c_k&\forall k\in [p]\end{aligned}$$
(C.2c)
$$\begin{aligned}&\sum _{r\in [l]}\varTheta _{r}{\mathfrak {C}}'_{r \alpha } = \sum _{j:\alpha \in \gamma _{j}} g_{j}^{\alpha }e_{j}&\forall \alpha \in \gamma _j\end{aligned}$$
(C.2d)
$$\begin{aligned}&\varTheta \ge 0, \ U_{\cdot k}\ge 0&\forall k\in [p]. \end{aligned}$$
(C.2e)

Proposition 6

Assume that \({\bar{x}}\in {\mathscr {P}}\), and there exists a \({\bar{w}}\in -{\mathbb {K}}^*\) such that \({\bar{w}}^\intercal A({\bar{x}}) = e^\intercal \), and that strong duality holds for the inner problem, i.e.,

$$\begin{aligned} \min _{y\in Y(x)} e^\intercal y = \text {CD}(x) {:}{=} \max _{w}\bigl \{w^\intercal \bigl (c-{\mathscr {X}}(x)\bigr ) \bigm | w^\intercal A(x) = e^\intercal , w \le _{\mathbb {K}}^* 0 \bigr \}, \end{aligned}$$

where \(Y(x) = \{y \in {\mathbb {R}}^n : A(x)y + {\mathscr {X}}(x) \le _{{\mathbb {K}}} c\}\), such that \(A(x)_{kj}\) and \({\mathscr {X}}(x)_{k}\) for \(k\in [p]\) and \(j\in [n]\), are as discussed above. Then, if \({\mathbb {K}}={\mathbb {R}}^p_+\), there is an RLT relaxation of \(\varsigma = \max _{x\in {\mathscr {P}}}\min _{y\in Y(x)} e^\intercal y\) which dualizes (C.2) and has the same optimal value.

Proof

By strong duality, \(\varsigma = \max _{x\in {\mathscr {P}}} \text {CD}(x)\). We obtain the following constraints by taking products of equality constraints in \(\text {CD}(x)\) with \(x^\alpha \) and inequalities with \({\mathfrak {C}}' x'\le {\mathfrak {d}}'\) that relax the monomial definitions:

$$\begin{aligned} \sum _{k\in [p]}\sum _{\beta \in S_{kj}} w_{k}a^{\beta }_{kj} x^{\alpha + \beta } = x^{\alpha } e_{j} \ \forall \alpha \in \gamma _{j}, \forall j\in [n]; \;\text { and }\; ({\mathfrak {d}}'_{r} - {\mathfrak {C}}'_{r}x' )w^\intercal \le _{{\mathbb {K}}^*} 0. \end{aligned}$$

Upon linearization, we obtain:

$$\begin{aligned} {\underline{\varDelta }}{:}{=}&\max _{w,x,w',w''}&\quad&w^\intercal ( c- B'x')&\quad&\end{aligned}$$
(C.3a)
$$\begin{aligned}&w' A' = M'\end{aligned}$$
(C.3b)
$$\begin{aligned}&{\mathfrak {d}}'_{r}w^\intercal - {\mathfrak {C}}'_{r} w'' \le _{{\mathbb {K}}^*} 0&\forall r \in [l]\end{aligned}$$
(C.3c)
$$\begin{aligned}&{\mathfrak {C}}'x' \le {\mathfrak {d}}'\end{aligned}$$
(C.3d)
$$\begin{aligned}&w\le _{{\mathbb {K}}^*} 0, \end{aligned}$$
(C.3e)

where: (i) \(w''_{(\alpha ,k)}\) linearizes \(x^\alpha w^\intercal \) and \(x'_\alpha w^\intercal \), (ii) whenever \(\alpha ' + \beta ' = \alpha \), \(j\in [n]\), \(\alpha ' \in \gamma _{j}\), and \(\beta ' \in S_{kj}\), \(w'_{(j,\alpha '),(\beta ',k)} = w''_{(\alpha ,k)}\), (iii) for all \(k\in [p]\), \(j\in [n]\), \(A'_{(\beta ,k),j} = a^{\beta }_{kj}\) if \(\beta \in S_{kj}\) and 0 otherwise, (iv) for all \(\alpha \in \gamma _{j}\) and \(j \in [n]\), \(M'_{(j,\alpha )} = x^{\alpha }e_{j}\), and (v) for all \(k\in [p]\), \(B'_{(k, \beta )} = a^{\beta }_{k0}\) if \(\beta \in S_{k0}\) and 0 otherwise. Let \(P'\), \(\{U'_{r}\}_{r\in [l]}\), and \(\varTheta '\) be the dual variables to the equations (C.3b), (C.3c) and (C.3d) respectively. Given that \(({\bar{w}},{\bar{x}})\) is feasible for \(\max _{x\in {\mathscr {P}}} \text {CD}(x)\), its relaxation (C.3) used to compute \({\underline{\varDelta }}\) is also feasible. When \({\mathbb {K}}={\mathbb {R}}^p_+\), (C.3) is a linear program and so has no duality gap. In general, its dual is:

$$\begin{aligned}&\min _{\varTheta ',U',P'}&\quad&\sum _{j:0\in \gamma _{j}} P'_{j0}e_{j} + \sum _{r\in [l]}\varTheta '_{r} {\mathfrak {d}}'_{r}&\quad&\end{aligned}$$
(C.4a)
$$\begin{aligned}&-U'^\intercal {\mathfrak {C}}' + F' + B' = 0\end{aligned}$$
(C.4b)
$$\begin{aligned}&U'^\intercal {\mathfrak {d}}' + L \le _{{\mathbb {K}}} c \end{aligned}$$
(C.4c)
$$\begin{aligned}&h + {\mathfrak {C}}'^\intercal \varTheta ' = 0 \end{aligned}$$
(C.4d)
$$\begin{aligned}&\varTheta ' \ge 0, U'_{r} \ge _{{\mathbb {K}}} 0&\forall r\in [l], \end{aligned}$$
(C.4e)

where, for \(k\in [p]\) and \(\alpha \in \gamma _{j}\), \(F'_{\alpha ,k} = \sum _{j} \sum _{\alpha '\in \gamma _{j}} \sum _{\beta '=\alpha -\alpha '\in S_{kj}} a^{\beta '}_{kj} P'_{\alpha 'j}\), and, for \(k\in [p]\), \(L_{k} = \sum _{j\in [n]}\sum _{0\in \gamma _{j}}\sum _{k:0\in S_{kj}} a^{0}_{kj} P'_{0j}\). Finally, for \(\alpha \in \gamma _j\), \(h_{\alpha } = -\sum _{j:\alpha \in \gamma _{j}} P_{\alpha j}e_{j}\). When \({\mathbb {K}}={\mathbb {R}}^p_+\), we obtain (C.2) by replacing \((\varTheta ',U',P')\) in (C.4) with \((\varTheta , U, g)\). \(\square \)

The equivalence in Propositions 1 and 6 holds when \({\mathbb {K}}\) has a tractable linear inequality representation. To reduce the more general case to that for \({\mathbb {R}}^p_+\), we write \(U\in {\mathbb {K}}\) as \({\mathcal {G}}U \ge 0\) for some \({\mathcal {G}}\) and replace \(Ay + Bx \le _{{\mathbb {K}}} c\) with \({\mathcal {G}}Ay + {\mathcal {G}}Bx \le {\mathcal {G}}c\).

Proof of Proposition 2

By definition, \({\hat{\mathbbm {1}}}_{E}({\mathbb {E}}_{*}[{\mathbb {X}}]) = \max _{\uplambda }\bigl \{\sum _{i\in {\mathcal {I}}}\uplambda _{i} \mathbbm {1}_{{\mathcal {F}}}(x^{i}) \bigm | \sum _{i\in {\mathcal {I}}}\uplambda _{i}x^{i} = {\mathbb {E}}_{*}[{\mathbb {X}}], \uplambda \ge 0, \sum _{i\in {\mathcal {I}}}\uplambda _{i} = 1 \bigr \}, \) where \(\uplambda = \{\uplambda _{i}\}_{i\in {\mathcal {I}}}\) and \(\{x^{i}\}_{i\in {\mathcal {I}}}\) are the extreme points in \({\mathscr {T}}\). There is a unique feasible solution with \(\uplambda _{i} = \mathop {\text {Pr}}_{*}({\mathbb {X}}= x^{i})\). So, \(\displaystyle {\hat{\mathbbm {1}}}_{E}({\mathbb {E}}_{*}[{\mathbb {X}}]) = \sum _{i\in {\mathcal {I}}} \text {Pr}_{*}({\mathbb {X}} = x^{i}) \mathbbm {1}_{{\mathcal {F}}}(x^{i}) = {\mathbb {E}}_{*}[\mathbbm {1}_{{\mathcal {F}}}({\mathbb {X}})]\). \(\square \)
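The uniqueness step is just the uniqueness of barycentric coordinates on a simplex. A small numerical sketch (with a hypothetical three-point \({\mathscr {T}}\), event, and distribution) recovers the probabilities as the unique feasible \(\uplambda \) and evaluates the envelope at \({\mathbb {E}}_{*}[{\mathbb {X}}]\):

```python
# Hypothetical T: vertices of the standard simplex in R^2, an event F
# containing only the vertex x^1 = (1, 0), and a distribution on T.
verts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
indic = [0.0, 1.0, 0.0]                       # 1_F at each vertex
probs = [0.5, 0.3, 0.2]                       # Pr(X = x^i)
mean = tuple(sum(p * v[k] for p, v in zip(probs, verts)) for k in (0, 1))

# Barycentric coordinates of mean w.r.t. the standard simplex are unique:
# lam_1 = mean_1, lam_2 = mean_2, lam_0 = 1 - mean_1 - mean_2.
lam = [1.0 - mean[0] - mean[1], mean[0], mean[1]]
assert all(abs(l - p) < 1e-12 for l, p in zip(lam, probs))

# Hence the concave envelope at E[X] evaluates to E[1_F(X)] = Pr(X = x^1).
assert abs(sum(l * f for l, f in zip(lam, indic)) - 0.3) < 1e-12
```

For a general simplex the coordinates are obtained by solving one linear system, but the conclusion is the same: the only feasible \(\uplambda \) is the distribution itself.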

Proof of Proposition 3

We first write \({\hat{\mathbbm {1}}}(x)\) as \(\max _{(w,v)\in {\mathcal {S}}}h(x,w,v)\). Then, for any \({\bar{x}}\in {\mathscr {T}}\), \(\max _{(w,v)\in {\mathcal {S}}} r({\bar{x}},w,v) = \mathbbm {1}_{{\mathcal {F}}\cap {\mathscr {T}}}({\bar{x}})\ge 0\). Since \(\max _{(w,v)\in {\mathcal {S}}}h(x,w,v)\) is concave and, for \(x\in \mathop {\text {conv}}({\mathscr {T}})\), is larger than \(\mathbbm {1}_{{\mathcal {F}}\cap {\mathscr {T}}}(x)\), it follows that \({\hat{\mathbbm {1}}}(x)\ge {\hat{\mathbbm {1}}}_E(x)\). For the converse, observe that, for all \(({\bar{x}},w,v)\in \mathop {\text {conv}}({\mathscr {T}})\times {\mathcal {S}}\), \(r({\bar{x}},w,v)\le \mathbbm {1}_{{\mathcal {F}}\cap {\mathscr {T}}}({\bar{x}})\le {\hat{\mathbbm {1}}}_E(x)\). Since \({\hat{\mathbbm {1}}}_E(x)\) is concave, it follows that \(h({\bar{x}},w,v)\le {\hat{\mathbbm {1}}}_E(x)\) and, so, \({\hat{\mathbbm {1}}}(x) = \max _{(w,v)\in {\mathcal {S}}}h(x,w,v) \le {\hat{\mathbbm {1}}}_E(x)\). \(\square \)

Proof of Theorem 1

We prove the result by showing that \(\varphi ^R(b_\alpha )\) computes the optimal value in (12). Let \({\mathscr {M}}_{J}(x){:}{=}\prod _{j\in J}x_{j}\). Let \(\alpha \) be defined so \(\alpha _j=1\) if \(j\in J\) and 0 otherwise. Then, \({\mathscr {M}}_J(x) = x^\alpha \). Clearly, for any variable z, the functions \(z{\mathscr {M}}_{J}(x)\) and \(z{\mathfrak {M}}_{J}(x)\) for \(J\subseteq [m]\) form bases of the same vector space of functions. Indeed, \(z{\mathfrak {M}}_{J}(x) = \sum _{J':J\subseteq J'\subseteq [m]}(-1)^{|J'\backslash J|}z{\mathscr {M}}_{J'}(x)\). Conversely, we have \(z{\mathscr {M}}_{J}(x) = \sum _{J':J\subseteq J'\subseteq [m]} z{\mathfrak {M}}_{J'}(x)\). Therefore, we write the RLT relaxation obtained from (13) equivalently without expanding the multilinear terms, instead linearizing \(\varphi {\mathfrak {M}}_{J}(x)\), \(w{\mathfrak {M}}_{J}(x)\), \(v{\mathfrak {M}}_J(x)\), and \({\mathfrak {M}}_{J}(x)\) directly using \(\varphi ^J\), \(w^J\), \(v^J\), and \({\mathfrak {p}}_J\) respectively. Since the former basis includes 1, we must also require that \(\sum _{J':J'\subseteq [m]} z{\mathfrak {M}}_{J'}(x) = z\) for each \(z\in \{\varphi ,w,v,1\}\). When z is \(\varphi \), this shows that the objective (12a) matches that in (13a). The substitution \(x_i^2=x_i\) replaces \(x_i{\mathfrak {M}}_J(x)\) with \({\mathfrak {X}}^{J}_i{\mathfrak {M}}_J(x)\). This is linearized as \({\mathfrak {X}}^{J}_i{\mathfrak {p}}_J\) in (13e) while \({\mathfrak {M}}_J(x) w^\intercal Bx\) in (13b) is replaced with \((w^J)^\intercal B{\mathfrak {X}}^{J}\). The constraints (12b), (12c), and (12d) now follow easily from the linearizations of (13b), (13c) and (13d).

We show that the set defined by the linearization of (13e), denoted as \(X'\), is \(X = \bigl \{({\mathfrak {p}}_{J})_{J\subseteq [m]}\,:\, {\mathfrak {p}}_J\ge 0,\,J\subseteq [m];\, \sum _{J\subseteq [m]}{\mathfrak {p}}_J = 1;\, {\mathfrak {p}}_J = 0 \text { if } {\mathfrak {X}}^J\not \in {\mathscr {T}}\bigr \}\). Note that X models the probability distributions with support on \({\mathscr {T}}\). Because \(x_i{\mathfrak {M}}_J(x)\) linearizes to \({\mathfrak {X}}^J_i{\mathfrak {p}}_J\), \(X'\) has the same variables as X. We first show that \(X'\subseteq X\). Observe that \(\sum _{J':J'\subseteq [m]} {\mathfrak {M}}_{J'}(x) = 1\) linearizes to \(\sum _{J':J'\subseteq [m]} {\mathfrak {p}}_{J'} = 1\). Moreover, for any \(j\in J\) (resp. \(j\in J^C\)), \(x_j\ge 0\) (resp. \(1-x_j\ge 0\)) is implied by \(\mathop {\text {conv}}({\mathscr {T}})\). Thus, the linearization of \(x_j{\mathfrak {M}}_J(x)\ge 0\) (resp. \((1-x_j){\mathfrak {M}}_J(x)\ge 0\)) is implied by (13e) and yields \({\mathfrak {p}}_J\ge 0\). Now, consider any \({\mathfrak {X}}^J\not \in \mathop {\text {conv}}({\mathscr {T}})\). Then, if \({\mathfrak {p}}_J > 0\), we obtain a contradiction since (13e) requires that \({\mathfrak {p}}_J {\mathfrak {X}}^J \in {\mathfrak {p}}_J \mathop {\text {conv}}({\mathscr {T}})\). Therefore, \({\mathfrak {p}}_J = 0\) whenever \({\mathfrak {X}}^J\not \in {\mathscr {T}}\). Next, we show that \(X'\supseteq X\). Since \(X'\) is convex, it suffices to show that the extreme points of X are contained in \(X'\). It can be verified that if \({\mathfrak {X}}^J\in {\mathscr {T}}\), then the solution with \({\mathfrak {p}}_J=1\) and \({\mathfrak {p}}_{J'}=0\) for \(J'\ne J\) is feasible to \(X'\).

Finally, we show that \(x_\alpha = b_\alpha \) is feasible to the linearization of (13e). Let \({\bar{{\mathfrak {p}}}}_J = \sum _{J'\subseteq J^{C}} (-1)^{|J'|} b_{\alpha (J\cup J')}\) for all \(J\subseteq [m]\). Then, since \(b_\alpha \) is the moment of \(x^\alpha \) with support on \({\mathscr {T}}\), it follows that \({\bar{{\mathfrak {p}}}}_J\in X\). Let \(x_\alpha \) linearize \({\mathscr {M}}_J(x)\), where \(\alpha _j=1\) if \(j\in J\) and 0 otherwise. Observe that, with this linearization, (13e) yields an affine transform of X, say T(X), in the space of \(x_\alpha \) variables. Then, \(x_\alpha = \sum _{J': J\subseteq J'\subseteq [m]}{\bar{{\mathfrak {p}}}}_{J'}\) is feasible to T(X). However, it can be easily verified that \(\sum _{J': J\subseteq J'\subseteq [m]}{\bar{{\mathfrak {p}}}}_{J'} = \sum _{J':J'\subseteq [m]}{\bar{{\mathfrak {p}}}}_{J'}({\mathfrak {X}}^{J'})^\alpha = b_\alpha \). The first equality is because \(({\mathfrak {X}}^J)^\alpha = 1\) if \(J\subseteq J'\) and 0 otherwise, while the second equality follows since \({\bar{{\mathfrak {p}}}}_J\) is the probability distribution corresponding to the moments \(b_\alpha \). Thus, \(x_\alpha = b_\alpha \) is feasible to T(X). Then, it follows that \(\varphi ^R(b_\alpha )\) computes \(\mathop {\text {Pr}}\nolimits _\varTheta ({\mathcal {F}})\) as in (12). \(\square \)
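The basis change between the monomials \({\mathscr {M}}_J\) and the functions \({\mathfrak {M}}_J\) invoked at the start of the proof can be verified directly. The sketch below checks both inversion formulas, \({\mathscr {M}}_{J}(x) = \sum _{J'\supseteq J} {\mathfrak {M}}_{J'}(x)\) and \({\mathfrak {M}}_{J}(x) = \sum _{J'\supseteq J}(-1)^{|J'\backslash J|}{\mathscr {M}}_{J'}(x)\), for \(m=3\) at a random point:

```python
from itertools import chain, combinations
import random

m = 3
subsets = [frozenset(s) for s in chain.from_iterable(
    combinations(range(m), k) for k in range(m + 1))]

def M(J, x):
    """Monomial M_J(x) = prod_{j in J} x_j."""
    p = 1.0
    for j in J:
        p *= x[j]
    return p

def Mfrak(J, x):
    """Indicator-basis function prod_{j in J} x_j * prod_{j not in J} (1 - x_j)."""
    p = 1.0
    for j in range(m):
        p *= x[j] if j in J else 1.0 - x[j]
    return p

random.seed(0)
x = [random.random() for _ in range(m)]
for J in subsets:
    # M_J(x) = sum over supersets J' of Mfrak_{J'}(x)
    assert abs(sum(Mfrak(Jp, x) for Jp in subsets if J <= Jp) - M(J, x)) < 1e-12
    # Mfrak_J(x) = sum over supersets J' of (-1)^{|J'\J|} M_{J'}(x)
    rhs = sum((-1) ** len(Jp - J) * M(Jp, x) for Jp in subsets if J <= Jp)
    assert abs(rhs - Mfrak(J, x)) < 1e-12
```

In particular, the \(J=\emptyset \) case of the first identity is the normalization \(\sum _{J'\subseteq [m]}{\mathfrak {M}}_{J'}(x)=1\) used when linearizing to \(\sum _{J'}{\mathfrak {p}}_{J'}=1\).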

Proof of Theorem 2

We denote the maximum value of \(s^{\ge }((i,j),*)\) by \(M_i\), where \(s^{\ge }((i,j),c) = \sum _{{\tilde{c}}\ge c}s((i,j),{\tilde{c}})\). For any i, range\((j)_i\) is the range of possible values of j. For \(i\le K_2\), \(M_i\le \left( {\begin{array}{c}i\\ j\end{array}}\right) {\mathcal {T}}^{i}\), \(j \in \bigl [\max \{0,i+ {\mathfrak {b}}-K_{2}\}, \min \{ K_{2}, K_{1}+{\mathfrak {b}},i \}\bigr ]\), and range\((j)_{i} = \min \{K_{2}, i + {\mathfrak {b}}- 2K_{2}, K_{1} + {\mathfrak {b}}, m- i \}\). For \(i > K_2\), there is an \(l\in \bigl [0, \min \{K_2-j,i-K_2\}\bigr ]\) so that we select \(j+l\) (resp. \(i-K_2-l\)) variables from \(\{1,\ldots ,K_2\}\) (resp. \(\{K_2+1,\ldots ,i\}\)) to set to 1 (resp. 0). It follows that \(M_i\le \left( {\begin{array}{c}i\\ j+i-K_2\end{array}}\right) {\mathcal {T}}^{i}\), \(j\in \bigl [{\mathfrak {b}}, \min \{K_{2}, m- i + {\mathfrak {b}}\}\bigr ]\), and range\((j)_{i} = \min \{ K_{2} - {\mathfrak {b}}, m-i \}\). We choose a sparsification parameter \(\delta _{s}\) to perform \((1+\delta _{s})\) sparsification of each \(s((i,j),{\tilde{c}})\). Observe that \(\log _{(1+\delta _{s})}\left( {\begin{array}{c}i\\ j\end{array}}\right) \le \min \{\frac{i}{2},j\}\log _{1+\delta _{s}}\frac{i\exp (1)}{\min \{\frac{i}{2},j\}}\). Since the time-complexity of summing, shifting, and querying function lists is bounded by their size, the time complexity is \(O\bigl (m{\underline{\varTheta }}\bigl (\xi \log _{1+\delta _{s}}(m/\xi ) + m\log _{1+\delta _{s}}{\mathcal {T}}\bigr )\bigr )\). The time complexity in Theorem 2 follows by choosing \(\delta _{s} = (1+\epsilon _{s})^{1/m} - 1\) and using \(\ln (1+\epsilon _{s}) \ge \epsilon _{s}/ 2\) for \(\epsilon _{s}\in (0,1)\). \(\square \)

Proof of Theorem 3

Consider a \(t+1\) dimensional DAG, where the \((l+1)^{\text {st}}\) dimension corresponds to the \(l^{\text {th}}\) low weight constraint. Let \(s((i,j_{1},\dots ,j_{t}),*)\) be the list of all pairs \((c,s((i,j_{1},\dots ,j_{t}),c))\). For \({\mathcal {T}} \ge \max _{i} n_{i}\), if M is the maximum value of \(s^{\ge }((i,j_{1},\dots ,j_{t}),*)\), then \(M\le (2{\mathcal {T}})^{i}\) as there are \(2^i\) solutions, each of which occurs at most \({\mathcal {T}}^i\) times. Moreover, let \(\gamma \) be such that \(\gamma \ge \max _{k} w_{k}^{l} - \min _{k} w_{k}^{l}\) for all \(l\in [t]\). Thus, the \(l^{\text {th}}\) low weight constraint at the \(i^{\text {th}}\) slice has at most \(i\gamma \) values. Since there are \(t\) low weight constraints, the number of nodes with first coordinate i is bounded by \((m\gamma )^{t}\). Then, after a \(1+\delta _{s}\) sparsification the cumulative length of lists is bounded by \(m(m\gamma )^{t}\log _{1+\delta _{s}}(2{\mathcal {T}})^{m}\). For a \(1+\epsilon _{s}\) approximation, with \(1+\delta _{s}= (1+\epsilon _{s})^{1/m}\), the time complexity is \(O[\epsilon _{s}^{-1}m^{t+3}\gamma ^{t} \ln {\mathcal {T}}]\). \(\square \)
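The slice-by-slice dynamic program over this DAG can be illustrated with a minimal sketch. Below (our code, with \(t=1\) low weight constraint and hypothetical data) each layer maps a state \((j,c)\) — the low-weight total and the threshold-weight total — to an exact count, without sparsification; the final count is checked against brute-force enumeration:

```python
import itertools
from collections import Counter

def slice_counts(w, omega):
    """layers[i] maps (j, c) -> #{x in {0,1}^i : omega[:i].x = j, w[:i].x = c}."""
    layer = Counter({(0, 0): 1})
    layers = [layer]
    for wi, oi in zip(w, omega):
        nxt = Counter()
        for (j, c), n in layer.items():
            nxt[(j, c)] += n              # set x_i = 0
            nxt[(j + oi, c + wi)] += n    # set x_i = 1
        layers.append(nxt)
        layer = nxt
    return layers

w, omega, C, J = [3, 1, 4, 2], [1, 0, 1, 1], 5, 2     # toy instance (ours)
layers = slice_counts(w, omega)
dp = sum(n for (j, c), n in layers[-1].items() if j == J and c >= C)
brute = sum(1 for x in itertools.product([0, 1], repeat=4)
            if sum(o * xi for o, xi in zip(omega, x)) == J
            and sum(wi * xi for wi, xi in zip(w, x)) >= C)
assert dp == brute
```

In the theorem, the \(c\)-coordinate of each state is kept as a \((1+\delta _{s})\)-sparsified list rather than exactly, which is what yields the stated complexity.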

A lower estimate for probability of 0–1 solutions to a SLWP

Given a function \(f:{\mathbb {Z}}^+ \rightarrow {\mathbb {Z}}^+\) and an approximation parameter \(\epsilon _{s}>0\), we say \(F:{\mathbb {Z}}^+\rightarrow {\mathbb {Z}}^+\) (resp. \({\underline{F}}:{\mathbb {Z}}^+ \rightarrow {\mathbb {Z}}^+\)) is a \((1 - \epsilon _{s})\) function approximation (resp. sum-approximation) of f if, for all x, \((1-\epsilon _{s})f(x) \le F(x) \le f(x)\) (resp. \((1-\epsilon _{s})f^{\ge }(x) \le {\underline{F}}^{\ge }(x)\le f^{\ge }(x)\)). The properties in Lemma 2 follow easily for \((1-\epsilon _{s})\) sum-approximations of functions. The sparsifier takes as input a function f and a parameter \(\delta _{s}>0\). We partition values of \(f^{\ge }\) into \([r_{i+1},r_{i})\), where \(r_{0}=\max _c{f^{\ge }(c)}\) and if \(r_i > 0\), then \(r_{i+1} = \min \{r_{i}-1, \lceil (1-\delta _{s})r_{i}\rceil \}\). Let \(c_{i} = \min _{c}\{c\mid f^{\ge }(c) \le r_{i}\}\). For any c, we define \(l(c) = \min _{i}\{c_{i}\mid c_{i} > c \}\). If l(c) is finite, \({\underline{F}}^{\ge }(c) = f^{\ge }(l(c) - 1)\). Otherwise, \({\underline{F}}^{\ge }(c) = \lim _{x\rightarrow \infty }f^{\ge }(x)\). Then, \({\underline{F}}^{\ge }(x)\) is a \((1-\delta _{s})\) approximation of \(f^{\ge }(x)\). As a consequence of Theorem 2, we obtain the time complexity to compute \(|\underline{S_s}|_{\varTheta }\) such that for any given \(\epsilon _{s}\in (0,1)\), \((1-\epsilon _{s})|S_s|_{\varTheta } \le |\underline{S_s}|_{\varTheta } \le |S_s|_{\varTheta }\).
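The sparsifier just described can be prototyped directly. The sketch below (our code; names are ours) builds the thresholds \(r_i\) and breakpoints \(c_i\) for an integer-valued f with finite support, returns \({\underline{F}}^{\ge }\), and checks the sandwich \((1-\delta _{s})f^{\ge }(c)\le {\underline{F}}^{\ge }(c)\le f^{\ge }(c)\) on an example:

```python
import math

def sparsify_tail(f, delta):
    """(1 - delta) sum-approximation of f^>= via geometric thresholds.

    f: dict mapping nonnegative int c -> nonnegative int f(c), finite support.
    Returns a callable F_ge with (1-delta)*f_ge(c) <= F_ge(c) <= f_ge(c).
    """
    cmax = max(f) if f else 0
    # exact tail sums f^>=(c) for c = 0..cmax+1 (0 beyond the support)
    f_ge = [0] * (cmax + 2)
    for c in range(cmax, -1, -1):
        f_ge[c] = f_ge[c + 1] + f.get(c, 0)
    # thresholds r_0 > r_1 > ... down to 0
    r = f_ge[0]
    thresholds = [r]
    while r > 0:
        r = min(r - 1, math.ceil((1 - delta) * r))
        thresholds.append(r)
    # breakpoints c_i = min{ c : f^>=(c) <= r_i }
    cs = sorted({next(c for c in range(cmax + 2) if f_ge[c] <= ri)
                 for ri in thresholds})

    def F_ge(c):
        # l(c) = smallest breakpoint strictly greater than c
        l = next((ci for ci in cs if ci > c), None)
        return f_ge[l - 1] if l is not None else 0

    return F_ge

f = {c: c % 3 + 1 for c in range(20)}     # toy function (ours)
F = sparsify_tail(f, 0.25)
for c in range(22):
    exact = sum(v for k, v in f.items() if k >= c)
    assert (1 - 0.25) * exact <= F(c) <= exact
```

Only the breakpoints need to be stored, so the list length is logarithmic in \(\max _c f^{\ge }(c)\), which is what the complexity counts above exploit.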

Corollary 1

Given \(S_s\), \(\varTheta \) as in (14), (A5) respectively, and an error parameter \(\epsilon _{s}\in (0,1)\), we can deterministically compute a \(1-\epsilon _{s}\) relative error approximation of \(|S_s|_{\varTheta }\) in time given as in Theorem 2.

Proof

When we use a \((1-\delta _{s})\) sparsifier, the time to compute \(|S_s|_\varTheta \) is \(O\bigl (m{\underline{\varTheta }}\bigl [\xi \log _{\frac{1}{(1-\delta _{s})}}\bigl ( \frac{m}{\xi } \bigr ) + m\log _{\frac{1}{1-\delta _{s}}}{\mathcal {T}}\bigr ]\bigr )\). To control the approximation error, we set \((1-\delta _{s})^{m} = (1-\epsilon _{s})\). Then, we obtain the same time-complexity as in Theorem 2 using \(\ln (1-\epsilon _{s})^{-1}\ge \epsilon _{s}\). \(\square \)

Proof of Theorem 4

We write the solution set of \(S_\varOmega \) as \(\bigcup _{J} S(J)\), where each \(S(J) = \{x\in \{0,1\}^{m} : \sum _{i=1}^{m} w_{i}x_{i} \ge C, \varOmega x = J\}\). For a given \({\widetilde{x}}\in \{0,1\}^m\), we first compute \(\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}= {\widetilde{x}}|{\mathbb {X}}\in S_\varOmega )\). To do so, we will compute \(\mathop {\text {Pr}}\nolimits _{\varTheta }({\mathbb {X}}= {\widetilde{x}}\mid {\mathbb {X}}\in S(J))\). Let \(s_{i}(j,c) = \bigl \{x : \sum _{k=1}^{i}w_{k}x_{k} \ge c, \varOmega _{\cdot ,1:i} x_{1:i} = j\bigr \}\) and \(s'_{i}(j,c) = s_{i}(j,c)\cap \{x : x_{k} = {\widetilde{x}}_{k} \forall k > i\}\). Define \(c(i) = C- \sum _{k=i+1}^{m} w_{k}{\widetilde{x}}_{k}\) and \( j^{J}(i) = J - \varOmega _{\cdot ,i+1:m} {\widetilde{x}}_{i+1:m}\). Clearly, if \({\widetilde{x}}\in S(J)\), \(j^J(0) = 0\), \(c(0) \le 0\), and \(s'_0\bigl (0,c(0)) = \{{\widetilde{x}}\}\). Observe that \(s'_{r}\bigl (j^J(r),c(r)\bigr )\subseteq s'_{r+1}\bigl (j^J(r+1),c(r+1)\bigr )\) because if \(x\in s'_{r}\bigl (j^J(r),c(r)\bigr )\), we have \(x_k = {\widetilde{x}}_k\) for \(k > r\), \(\sum _{k=1}^rw_kx_k \ge c(r) = c(r+1) - w_{r+1}{\widetilde{x}}_{r+1}\), and \(\varOmega _{\cdot ,1:r}x_{1:r} = j^J(r) = j^J(r+1) -\varOmega _{\cdot ,r+1}{\widetilde{x}}_{r+1}\), showing that \(x\in s'_{r+1}\bigl (j^J(r+1),c(r+1)\bigr )\). Then, \(s'_{0}\bigl (j^J(0),c(0)\bigr ) \subseteq s'_{1}\bigl (j^J(1),c(1)\bigr )\subseteq \dots \subseteq s'_{m}\bigl (j^J(m),c(m)\bigr ) = S(J) \), and we have

$$\begin{aligned}&\mathop {\text {Pr}}\nolimits _\varTheta \Big ({\mathbb {X}}= {\widetilde{x}}\big | {\mathbb {X}}\in S_\varOmega \Big ) \nonumber \\&=\sum _J\left( \mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in S(J)\bigm | {\mathbb {X}}\in S_\varOmega \bigr )\prod _{i=1}^{m}\mathop {\text {Pr}}\nolimits _\varTheta \Bigl ({\mathbb {X}}\in s'_{i-1}\bigl (j^J(i-1),c(i-1)\bigr )\Big |{\mathbb {X}}\in s'_{i}(j^J(i),c(i))\Bigr )\right) .\nonumber \\ \end{aligned}$$
(J.1)

Further, for all \(J\) and \(i\in [m]\), \(\mathop {\text {Pr}}\nolimits _\varTheta \Bigl ({\mathbb {X}}\in s'_{i-1}\bigl (j^J(i-1),c(i-1)\bigr )\Big |{\mathbb {X}}\in s'_{i}(j^J(i),c(i))\Bigr )\) is:

$$\begin{aligned}&\frac{\mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s'_{i-1}(j^J(i-1),c(i-1))\bigr )}{\mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s'_{i}(j^J(i),c(i))\bigr )}\\&\quad = \frac{\mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s_{i-1}(j^J(i-1),c(i-1))\bigr )}{\mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s_{i}(j^J(i),c(i))\bigr )}\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}_{i}={\widetilde{x}}_{i})\\&\quad = \frac{|s_{i-1}(j^J(i-1),c(i-1))|_{\varTheta }}{|s_{i}(j^J(i),c(i))|_{\varTheta }}\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}_{i} = {\widetilde{x}}_{i})n_{i} = p^J_{i}\delta _{{\widetilde{x}}_i = 0} + (1-p^J_{i})\delta _{{\widetilde{x}}_i = 1}, \end{aligned}$$

where \(p^J_{i} = \frac{|s_{i-1}(j^J(i),c(i))|_{\varTheta }}{|s_{i}(j^J(i),c(i))|_{\varTheta }}(n_i-a_i)\) and \(\delta _{{\widetilde{x}}={\mathfrak {a}}}\) is 1 if \({\widetilde{x}}={\mathfrak {a}}\) and 0 otherwise. The first equality is because the event \({\mathbb {X}}\in s_i(j,c)\) is independent of \(\{{\mathbb {X}}_{i'}\}_{i'=i+1}^m\) and, therefore, \(\mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s'_{i}(j^J(i),c(i))\bigr ) = \mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s_{i}(j^J(i),c(i))\bigr )\prod _{i'= i+1}^{m} \mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}_{i'}={\widetilde{x}}_{i'})\). The second equality follows since \(|s_i(j^J(i),c(i))|_\varTheta = \mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s_{i}(j^J(i),c(i))\bigr )\prod _{i'=1}^i n_{i'}\) and the last equality is because \(|s_{i-1}(j^J(i),c(i))|_{\varTheta }(n_i-a_i) + |s_{i-1}(j^J(i)-\varOmega _{.,i},c(i)-w_i)|_{\varTheta }a_i = |s_{i}(j^J(i),c(i))|_{\varTheta }\). Now, we compute \(\mathop {\text {Pr}}\nolimits ({\widetilde{{\mathbb {X}}}}={\widetilde{x}})\), where \({\widetilde{{\mathbb {X}}}}\) is the generated random variable. We write \(\mathop {\text {Pr}}\nolimits ({\widetilde{{\mathbb {X}}}}={\widetilde{x}}) = \sum _J \mathop {\text {Pr}}\nolimits \bigl ({\widetilde{{\mathbb {X}}}}\in S(J)\bigr )\prod _{i=1}^m\mathop {\text {Pr}}\nolimits ({\widetilde{{\mathbb {X}}}}_i={\widetilde{x}}_i\mid {\widetilde{{\mathbb {X}}}}_k = {\widetilde{x}}_k \forall k > i \text { and } {\widetilde{{\mathbb {X}}}}\in S(J))\). Let \({\tilde{p}}^J_i = \mathop {\text {Pr}}\nolimits ({\widetilde{{\mathbb {X}}}}_i=0| {\widetilde{{\mathbb {X}}}}_k = {\widetilde{x}}_k \forall k > i \text { and } {\widetilde{{\mathbb {X}}}}\in S(J))\). At the \({(m+1-i)}^{\text {th}}\) iteration, the algorithm chooses the value for \({\widetilde{{\mathbb {X}}}}_i\).
Assume that \({\widetilde{{\mathbb {X}}}}_k\) was chosen to be \({\widetilde{x}}_k\) for \(k > i\). Then,

$$\begin{aligned} {\widetilde{p}}^J_{i} = \frac{{\tilde{s}}^{\ge }((i-1,j^J(i)),c(i))(n_{i}-a_{i})}{{\tilde{s}}^{\ge }((i-1,j^J(i)),c(i))(n_{i}-a_{i}) + {\tilde{s}}^{\ge }((i-1,j^J(i)-\varOmega _{.,i}),c(i)-{w}_{i})a_{i}}. \end{aligned}$$

Since \({\tilde{s}}^{\ge }((i,j^J(i)),c(i))\) is a \((1+\delta _{s})^{i-1}\) approximation of \(|s_{i}(j^J(i),c(i))|_{\varTheta }\):

$$\begin{aligned} \frac{p^J_{i}}{(1+\delta _{s})^{i-2}} \le {\widetilde{p}}^J_{i} \le (1+\delta _{s})^{i-2}p^J_{i} \end{aligned}$$
(J.2a)
$$\begin{aligned} \frac{1-p^J_{i}}{(1+\delta _{s})^{i-2}} \le 1-{\widetilde{p}}^J_{i} \le (1+\delta _{s})^{i-2}(1-p^J_{i}), \end{aligned}$$
(J.2b)

where the left hand side inequality in (J.2a) (respectively (J.2b)) is obtained by realizing that \({\widetilde{s}}^{\ge }((i-1,j^J(i)),c(i)) \ge |s_{i-1}(j^J(i),c(i))|_{\varTheta }\) and \({\widetilde{s}}^{\ge }((i-1,j^J(i) - \varOmega _{.,i}),c(i) - w_{i}) \le (1+\delta _{s})^{i-2}|s_{i-1}(j^J(i) - \varOmega _{.,i}, c(i) - w_{i})|_{\varTheta }\) (respectively \({\widetilde{s}}^{\ge }((i-1,j^J(i) - \varOmega _{.,i}),c(i)-w_{i}) \ge |s_{i-1}(j^J(i) - \varOmega _{.,i},c(i)-w_{i})|_{\varTheta }\) and \({\widetilde{s}}^{\ge }((i-1,j^J(i)),c(i)) \le (1+\delta _{s})^{i-2}|s_{i-1}(j^J(i),c(i))|_{\varTheta }\)). The right hand side of (J.2a), (J.2b) can be obtained in a similar way. For \(\delta _{s}\in (0,1)\), we have \(1/(1+\delta _{s})^{i} > (1-\delta _{s})^{i}\). Thus, \((1-\delta _{s})^{i-2}p^J_{i} \le {\widetilde{p}}^J_{i}\le (1+\delta _{s})^{i-2}p^J_{i}\). We let \(\mathop {\text {Pr}}\nolimits \bigl ({\widetilde{{\mathbb {X}}}}\in S(J)\bigr ) = \frac{{\tilde{s}}^{\ge }((m,J),C)}{\sum _{J'} {\tilde{s}}^{\ge }((m,J'),C)}\), and observe that:

$$\begin{aligned} \frac{\mathop {\text {Pr}}\nolimits ({\mathbb {X}}\in S(J)\mid {\mathbb {X}}\in S_\varOmega )}{(1+\delta _{s})^{m-1}} \le \mathop {\text {Pr}}\nolimits \bigl ({\widetilde{{\mathbb {X}}}}\in S(J)\bigr ) \le (1+\delta _{s})^{m-1}\mathop {\text {Pr}}\nolimits ({\mathbb {X}}\in S(J)\mid {\mathbb {X}}\in S_\varOmega ). \end{aligned}$$

Therefore, each term in the summation on the right hand side of (J.1) is approximated within a relative error of \((1+\delta _s)^{\eta }\) where \(\eta = m(m-1)/2\). It follows that

$$\begin{aligned} (1-\delta _{s})^{\eta }\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}={\widetilde{x}}|{\mathbb {X}}\in S_\varOmega ) \le \mathop {\text {Pr}}\nolimits ({\widetilde{{\mathbb {X}}}}={\widetilde{x}}) \le (1+\delta _{s})^{\eta }\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}={\widetilde{x}}|{\mathbb {X}}\in S_\varOmega ).\nonumber \\ \end{aligned}$$
(J.3)

Now, we obtain a \((1\pm \epsilon _{s})\) approximation if \(\delta _{s}\le 1 - (1-\epsilon _{s})^{1/m^{2}}\) and \(\delta _{s}\le (1+\epsilon _{s})^{1/m^{2}}-1\). Since \((\cdot )^{\frac{1}{m^2}}\) is concave, it follows that \(\frac{1}{2}(1+\epsilon _{s})^{1/m^2} + \frac{1}{2}(1-\epsilon _{s})^{1/m^{2}} \le 1\), i.e., \((1+\epsilon _{s})^{1/m^{2}}-1 \le 1 - (1-\epsilon _{s})^{1/m^{2}}\). Therefore, it suffices to choose \(\delta _{s}= (1+\epsilon _{s})^{1/m^{2}} - 1\) in (J.3). The desired complexity follows from Theorem 3.\(\square \)
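The sequential-conditioning sampler analyzed in this proof can be sketched in a few lines. The version below (our code; names are ours) is specialized to the unweighted case — each coordinate uniform on \(\{0,1\}\), i.e., \(n_i=2\), \(a_i=1\) — and drops the \(\varOmega \)-constraints, so exact DP counts replace the sparsified lists. The telescoping of the conditional probabilities then makes every feasible point equally likely, which the script verifies by brute force:

```python
import functools
import itertools

def make_counter(w):
    """s(i, c) = #{x in {0,1}^i : w_1 x_1 + ... + w_i x_i >= c} (exact DP)."""
    @functools.lru_cache(maxsize=None)
    def s(i, c):
        c = max(c, 0)                       # any c <= 0 behaves like c = 0
        if i == 0:
            return 1 if c == 0 else 0
        return s(i - 1, c) + s(i - 1, c - w[i - 1])
    return s

def generation_prob(w, C, x):
    """Probability that backward sequential sampling emits the feasible point x."""
    s, prob, resid = make_counter(w), 1.0, C
    for i in range(len(w), 0, -1):
        p0 = s(i - 1, resid) / s(i, resid)  # P(X_i = 0 | chosen suffix)
        prob *= p0 if x[i - 1] == 0 else 1.0 - p0
        resid -= w[i - 1] * x[i - 1]        # update residual threshold c(i-1)
    return prob

w, C = [3, 1, 4, 2], 5                      # toy instance (ours)
feasible = [x for x in itertools.product([0, 1], repeat=len(w))
            if sum(wi * xi for wi, xi in zip(w, x)) >= C]
probs = [generation_prob(w, C, x) for x in feasible]
assert make_counter(w)(len(w), C) == len(feasible)
assert all(abs(p - 1 / len(feasible)) < 1e-12 for p in probs)
```

Each factor equals \(s(i-1,c(i-1))/s(i,c(i))\), so the product telescopes to \(1/s(m,C)\); replacing the exact counts by \((1+\delta _{s})\)-approximate lists introduces exactly the per-factor distortion bounded in (J.2a)–(J.2b).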

Proof of Proposition 4

We assume wlog that \(x_{ij}=0\) for all \(\langle i,j \rangle \in E\) and \(U=1\). Given a solution to \(\mathop {\text {Slack-MLU}}{(x)}\), we construct a feasible solution to \(\mathop {\text {MLU}}(x)\). If \({\underline{y}}\) is a solution to \(\mathop {\text {MLU}}(x)\) with demand \({\underline{d}}\) and \({\underline{d}}= {\underline{d}}' + {\underline{d}}''\) where \({\underline{d}}',{\underline{d}}''\ge 0\), then, using augmenting paths, \({\underline{y}}\) can be decomposed into \({\underline{y}}'\), \({\underline{y}}''\ge 0\), where \({\underline{y}}'\) services \({\underline{d}}'\), \({\underline{y}}''\) services \({\underline{d}}''\), and \({\underline{y}}''\) does not contain cycles. Now, let \((y^{a},a)\) be the given solution to \(\mathop {\text {Slack-MLU}}(x)\), where \(y^{a}\) is a routing of \(d'\). Then, we decompose \(y^{a}\) into \(y^{1}\) and \(y^{2}\), where \(y^{1}\) routes d, \(y^{2}\) routes a, and \(y^{2}\) does not contain cycles. Assume wlog that the support of a is a single pair \(\langle i,j \rangle \), so that:

$$\begin{aligned} \sum _{t\in V} y^{1}_{klt} + y^{2}_{klj} \le c_{kl} + a_{ij}\delta _{(\langle i,j\rangle = \langle k,l \rangle )} \end{aligned}$$
(K.1)

where \(\delta _{(\langle i,j\rangle = \langle k,l \rangle )} = 1\) if \(\langle i,j\rangle = \langle k,l \rangle \) and 0 otherwise. Clearly, \(a_{ij} \ge y^{2}_{ijj}\) because \(y^{2}\) does not contain cycles. We define

$$\begin{aligned} Z_{t} = {\left\{ \begin{array}{ll} \frac{y^{1}_{ijt}}{c_{ij} + a_{ij} - y^{2}_{ijj}} &{}\text { if } c_{ij}+a_{ij} - y^{2}_{ijj} > 0\\ 0 &{}\text { otherwise. } \end{array}\right. } \end{aligned}$$
(K.2)

Since \( 0\le \sum _{t\in V} y^{1}_{ijt} \le c_{ij} + a_{ij} - y^{2}_{ijj} \), we get \(0\le \frac{\sum _{t\in V} y^{1}_{ijt}}{c_{ij} + a_{ij} - y^{2}_{ijj}} = \sum _{t\in V} Z_{t} \le 1\). We argue that the flow \(y''\), defined as

$$\begin{aligned} y''_{klt} = y^{1}_{klt} + Z_{t}y^{2}_{klj} - Z_{t}a_{ij}\delta _{(\langle i,j \rangle = \langle k,l \rangle )} \end{aligned}$$
(K.3)

is feasible to \(\mathop {\text {MLU}}{(x)}\). First, we show feasibility to the capacity constraint.

  • C.1 Consider \(\langle i,j \rangle = \langle k,l \rangle \) and observe that: \(\sum _{t\in V}Z_{t} c_{ij} - \sum _{t\in V} y''_{ijt} = \sum _{t\in V} \bigl (Z_{t} c_{ij} + Z_{t}a_{ij} - Z_{t}y^{2}_{ijj} - y^{1}_{ijt}\bigr ) = 0\), where the first equality is by (K.3). If \(c_{ij} + a_{ij} - y^{2}_{ijj} > 0\), the second equality is from (K.2). Otherwise, it follows from \(0\le \sum _{t\in V} y^{1}_{ijt} \le \sum _{t\in V} Z_{t}(c_{ij} + a_{ij} - y^{2}_{ijj}) = 0\). Then, \(\sum _{t\in V} y''_{ijt} \le c_{ij}\) because \(0\le \sum _{t\in V} Z_{t}c_{ij} \le c_{ij}\), where the second inequality holds because \(\sum _{t\in V}Z_t \le 1\).

  • C.2 Now, consider \(\langle k,l \rangle \ne \langle i,j \rangle \). We have \(0\le \sum _{t\in V}y''_{klt} = \sum _{t\in V} (y^{1}_{klt} + Z_{t} y^{2}_{klj}) \le \sum _{t\in V}y^{1}_{klt} + y^{2}_{klj} \le c_{kl}\), where, the first equality is from (K.3), the first inequality is because \(Z_t\), \(y^1\), and \(y^2\) are non-negative, the second inequality is because \(y^{2}_{klj} \ge 0\) and \(\sum _{t\in V}Z_{t} \le 1\), and the last inequality follows from (K.1).

Finally, \(y''\) satisfies flow balance equations in \(\mathop {\text {MLU}}(x)\) because it is defined in (K.3) by adding a circulation to \(y^1\) which services d. \(\square \)
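The algebra in C.1 and C.2 (together with the componentwise nonnegativity of \(y''\), which follows from \(y''_{ijt} = y^1_{ijt}\,c_{ij}/(c_{ij}+a_{ij}-y^2_{ijj})\)) can be stress-tested numerically. The sketch below (our code; arcs, nodes, and all data are hypothetical and randomly generated so that the premises (K.1) and \(a_{ij}\ge y^2_{ijj}\) hold) builds \(Z_t\) and \(y''\) as in (K.2)–(K.3) and checks the capacity bounds:

```python
import random

random.seed(0)
V = range(4)                       # hypothetical commodity/node set
arcs = [(0, 1), (1, 2), (0, 2)]    # hypothetical arcs; slack a lives on (i, j)
i, j = 0, 1
checks = 0

for _ in range(100):
    c = {e: random.uniform(1.0, 5.0) for e in arcs}
    a_ij = random.uniform(0.0, 2.0)
    y2 = {e: random.uniform(0.0, 0.5) for e in arcs}   # y^2_{kl j}: routes a
    y2[(i, j)] = random.uniform(0.0, a_ij)             # ensures a_ij >= y^2_{ijj}
    y1 = {}                                            # y^1: routes d, obeys (K.1)
    for e in arcs:
        cap = c[e] + (a_ij if e == (i, j) else 0.0) - y2[e]
        raw = [random.uniform(0.0, 1.0) for _ in V]
        scale = cap * random.uniform(0.0, 1.0) / sum(raw)
        for t in V:
            y1[e, t] = raw[t] * scale
    denom = c[(i, j)] + a_ij - y2[(i, j)]
    Z = {t: y1[(i, j), t] / denom if denom > 0 else 0.0 for t in V}  # (K.2)
    assert sum(Z.values()) <= 1.0 + 1e-9
    for e in arcs:
        # (K.3): y'' = y^1 + Z_t * y^2_{klj} - Z_t * a_ij on the slack arc
        ypp = {t: y1[e, t] + Z[t] * y2[e] - Z[t] * a_ij * (e == (i, j))
               for t in V}
        assert all(v >= -1e-9 for v in ypp.values())   # y'' stays nonnegative
        assert sum(ypp.values()) <= c[e] + 1e-9        # capacity holds (C.1/C.2)
        checks += 1
```

This is only a numerical illustration of the inequalities, not a substitute for the flow-balance argument, which relies on \(y^2\) being a circulation-free routing of a.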

Formulation of Gen-R3

For a directed arc e from i to j, we write \(\text {tail}(e)\) to represent i and \(\text {head}(e)\) to represent j. For a node j and commodity t, we write \(ex'(r,j,t)\) to represent \(\sum _{e\in E:\text {tail}(e)=j} r_{et} - \sum _{e\in E:\text {head}(e)=j} r_{et}\). Then Gen-R3 is [9]:

$$\begin{aligned} \text {Gen-R3:}&\min _{r,p,a}&\quad&U&\quad&\end{aligned}$$
(L.1a)
$$\begin{aligned}&\sum _{t\in V}r_{et} + \sum _{l\in E}p_{el}x_{l} \le U c_e(1-x_e) + a_ex_e&\forall e \in E, \forall x\in {\mathscr {X}}_{\mathfrak {b}}\end{aligned}$$
(L.1b)
$$\begin{aligned}&ex'(r,j,t) = d_{jt} - \sum _{i\in V}d_{it}\delta _{j=t}&\forall j,t\in V\end{aligned}$$
(L.1c)
$$\begin{aligned}&ex'(p,j,l) = a_l\delta _{\text {tail}(l)=j} - a_l\delta _{\text {head}(l)=j}&j\in V, l\in E\end{aligned}$$
(L.1d)
$$\begin{aligned}&r_{et},p_{el}\ge 0&e,l\in E, t\in V. \end{aligned}$$
(L.1e)

Here, \(r_{et}\) is the traffic on link e destined to t, \(p_{el}\) is the amount of traffic on link l that is bypassed on e when l fails, and \(a_e\) is the reservation to bypass traffic on link e.

We solve Gen-R3 with \({\mathfrak {b}}=1\) in \({\mathscr {X}}_{\mathfrak {b}}\), i.e., for \({\mathscr {X}}_1\) in (L.1b). Then, using the obtained \((r^*,p^*,a^*)\), the G-cuts are the negation of constraint (L.1b) with U fixed to one, i.e.,

$$\begin{aligned} \sum _{t\in V}r^{*}_{et} + \sum _{l\in E}p^{*}_{el}x_{l} > c_e(1-x_e) + a^*_e x_e \text { for } e \in E \text { and } x \in {\mathscr {X}}_{\mathfrak {b}}. \end{aligned}$$
(L.2)

Constraint (L.2) can be used to outer-approximate the set of scenarios in \({\mathscr {X}}_{\mathfrak {b}}\) where MLU exceeds 1.
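As a usage sketch, checking whether a given failure scenario x violates (L.2) for some link is a direct evaluation. The helper below (our naming; the toy data is hypothetical, not from the paper's experiments) returns the links whose load exceeds the right hand side:

```python
def violated_g_cuts(x, r, p, a, c):
    """Links e whose load under scenario x exceeds the RHS of (L.2) with U = 1.

    r[e][t]: traffic on link e destined to t;  p[e][l]: traffic of failed
    link l bypassed on e;  a[e]: bypass reservation;  c[e]: capacity.
    """
    cuts = []
    for e in c:
        load = (sum(r.get(e, {}).values())
                + sum(p.get(e, {}).get(l, 0.0) * x[l] for l in x))
        if load > c[e] * (1 - x[e]) + a[e] * x[e]:
            cuts.append(e)
    return cuts

# Toy instance: link 'e1' overloads when 'e2' fails and bypass traffic arrives.
r = {'e1': {'t1': 2.0}, 'e2': {'t1': 1.0}}
p = {'e1': {'e2': 3.0}}
a = {'e1': 0.5, 'e2': 2.0}
c = {'e1': 4.0, 'e2': 4.0}
assert violated_g_cuts({'e1': 0, 'e2': 1}, r, p, a, c) == ['e1']
assert violated_g_cuts({'e1': 0, 'e2': 0}, r, p, a, c) == []
```

Enumerating the scenarios in \({\mathscr {X}}_{\mathfrak {b}}\) flagged by such a check yields the outer approximation described above.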

Cite this article

Chandra, A., Tawarmalani, M. Probability estimation via policy restrictions, convexification, and approximate sampling. Math. Program. 196, 309–345 (2022). https://doi.org/10.1007/s10107-022-01823-6
