Probability estimation via policy restrictions, convexification, and approximate sampling

  • Full Length Paper
  • Series B

Published in Mathematical Programming

Abstract

This paper develops various optimization techniques to estimate the probability of events in which the optimal value of a convex program, satisfying certain structural assumptions, exceeds a given threshold. First, we relate the search for affine/polynomial policies for the robust counterpart to existing relaxation hierarchies in MINLP (Lasserre in Proceedings of the international congress of mathematicians (ICM 2018), 2019; Sherali and Adams in A reformulation–linearization technique for solving discrete and continuous nonconvex problems, Springer, Berlin). Second, we leverage recent advances in Dworkin et al. (in: Kaski, Corander (eds) Proceedings of the seventeenth international conference on artificial intelligence and statistics, Proceedings of machine learning research, PMLR, Reykjavik, 2014), Gawrychowski et al. (in: ICALP, LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018) and Rizzi and Tomescu (Inf Comput 267:135–144, 2019) to develop techniques to approximately compute the probability that binary random variables drawn from Bernoulli distributions belong to a specially structured union of sets. Third, we use convexification, robust counterpart, and chance-constrained optimization techniques to cover the event set of interest with such set unions. Fourth, we apply our techniques to the network reliability problem, which quantifies the probability of failure scenarios that cause network utilization to exceed one. Finally, we provide a preliminary computational evaluation of our techniques on test instances for network reliability.


References

  1. MOSEK ApS: MOSEK modeling cookbook (2020)

  2. Bangla, A.K., Ghaffarkhah, A., Preskill, B., Koley, B., Albrecht, C., Danna, E., Jiang, J., Zhao, X.: Capacity planning for the Google backbone network. https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45385.pdf (2015)

  3. Bao, X., Sahinidis, N.V., Tawarmalani, M.: Multiterm polyhedral relaxations for nonconvex, quadratically constrained quadratic programs. Optim. Methods Softw. 24(4–5), 485–504 (2009)

  4. Ben-Tal, A., El Ghaoui, L., Nemirovski, A.: Robust Optimization, vol. 28. Princeton University Press, Princeton (2009)

  5. Ben-Tal, A., Nemirovski, A.: Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, vol. 2. SIAM, Philadelphia (2001)

  6. Ben-Tal, A., Nemirovski, A.: On safe tractable approximations of chance-constrained linear matrix inequalities. Math. Oper. Res. 34(1), 1–25 (2009)

  7. Bertsimas, D., Goyal, V.: On the power and limitations of affine policies in two-stage adaptive optimization. Math. Program. 134(2), 491–531 (2012)

  8. Bertsimas, D., Popescu, I.: Optimal inequalities in probability theory: a convex optimization approach. SIAM J. Optim. 15(3), 780–804 (2005)

  9. Chang, Y., Jiang, C., Chandra, A., Rao, S., Tawarmalani, M.: Lancet: better network resilience by designing for pruned failure sets. Proc. ACM Meas. Anal. Comput. Syst. 3(3), 1–26 (2019)

  10. Chang, Y., Rao, S., Tawarmalani, M.: Robust validation of network designs under uncertain demands and failures. In: Proceedings of the 14th USENIX Conference on Networked Systems Design and Implementation, NSDI’17, pp. 347–362. USENIX Association, USA (2017)

  11. Doerr, B.: Probabilistic tools for the analysis of randomized optimization heuristics. In: Theory of Evolutionary Computation, pp. 1–87. Springer (2020)

  12. Dworkin, L., Kearns, M., Xia, L.: Efficient inference for complex queries on complex distributions. In: Kaski, S., Corander, J. (eds.) Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research, vol. 33, pp. 211–219. PMLR, Reykjavik (2014)

  13. Gawrychowski, P., Markin, L., Weimann, O.: A faster FPTAS for #knapsack. In: ICALP, LIPIcs, vol. 107, pp. 64:1–64:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)

  14. Ghanem, R., Higdon, D., Owhadi, H.: Handbook of Uncertainty Quantification, vol. 6. Springer, Berlin (2017)

  15. Gill, P., Jain, N., Nagappan, N.: Understanding network failures in data centers: measurement, analysis, and implications. In: Proceedings of the ACM SIGCOMM 2011 Conference, pp. 350–361 (2011)

  16. Gopalan, P., Klivans, A., Meka, R., Stefankovic, D., Vempala, S., Vigoda, E.: An FPTAS for #knapsack and related counting problems. In: Proceedings of the 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS’11, pp. 817–826. IEEE Computer Society, USA (2011)

  17. Gurobi Optimization Incorporated: Gurobi optimizer reference manual. http://www.gurobi.com (2019)

  18. Han, S., Tao, M., Topcu, U., Owhadi, H., Murray, R.M.: Convex optimal uncertainty quantification. SIAM J. Optim. 25(3), 1368–1387 (2015)

  19. Hanasusanto, G.A., Kuhn, D.: Conic programming reformulations of two-stage distributionally robust linear programs over Wasserstein balls. Oper. Res. 66(3), 849–869 (2018)

  20. Karp, R.M., Luby, M., Madras, N.: Monte-Carlo approximation algorithms for enumeration problems. J. Algorithms 10(3), 429–448 (1989)

  21. Knight, S., Nguyen, H., Falkner, N., Bowden, R., Roughan, M.: The internet topology zoo. IEEE J. Select. Areas Commun. 29(9), 1765–1775 (2011)

  22. Lasserre, J.B.: A semidefinite programming approach to the generalized problem of moments. Math. Program. 112(1), 65–92 (2008)

  23. Lasserre, J.B.: The moment-SOS hierarchy. In: Proceedings of the International Congress of Mathematicians (ICM 2018) (2019)

  24. Laurent, M.: Sums of squares, moment matrices and optimization over polynomials, The IMA Volumes in Mathematics and its Applications Series, vol. 149, pp. 155–270. Springer, Germany (2009)

  25. Liu, H.H., Kandula, S., Mahajan, R., Zhang, M., Gelernter, D.: Traffic engineering with forward fault correction. In: Proceedings of the 2014 ACM Conference on SIGCOMM, pp. 527–538 (2014)

  26. Luedtke, J., Ahmed, S.: A sample approximation approach for optimization with probabilistic constraints. SIAM J. Optim. 19(2), 674–699 (2008)

  27. Markopoulou, A., Iannaccone, G., Bhattacharyya, S., Chuah, C.N., Ganjali, Y., Diot, C.: Characterization of failures in an operational IP backbone network. IEEE/ACM Trans. Netw. 16(4), 749–762 (2008)

  28. Matthews, L.R., Gounaris, C.E., Kevrekidis, I.G.: Designing networks with resiliency to edge failures using two-stage robust optimization. Eur. J. Oper. Res. 279(3), 704–720 (2019)

  29. Mihalák, M., Šrámek, R., Widmayer, P.: Counting approximately-shortest paths in directed acyclic graphs. In: Kaklamanis, C., Pruhs, K. (eds.) Approximation and Online Algorithms, pp. 156–167. Springer, Cham (2014)

  30. Mühlpfordt, T., Roald, L., Hagenmeyer, V., Faulwasser, T., Misra, S.: Chance-constrained ac optimal power flow: a polynomial chaos approach. IEEE Trans. Power Syst. 34(6), 4806–4816 (2019)

  31. Nemirovski, A., Shapiro, A.: Convex approximations of chance constrained programs. SIAM J. Optim. 17(4), 969–996 (2007)

  32. Owhadi, H., Scovel, C., Sullivan, T.J., McKerns, M., Ortiz, M.: Optimal uncertainty quantification. SIAM Rev. 55(2), 271–345 (2013)

  33. Rizzi, R., Tomescu, A.I.: Faster FPTASes for counting and random generation of knapsack solutions. Inf. Comput. 267, 135–144 (2019)

  34. Rockafellar, R.T.: Convex Analysis, vol. 28. Princeton University Press, Princeton (1970)

  35. Rockafellar, R.T., Uryasev, S.: Optimization of conditional value-at-risk. J. Risk 2, 21–42 (2000)

  36. Rockafellar, R.T., Wets, R.J.B.: Variational Analysis, vol. 317. Springer, Berlin (2009)

  37. Sherali, H.D., Adams, W.P.: A Reformulation–Linearization Technique for Solving Discrete and Continuous Nonconvex Problems, vol. 31. Springer, Berlin (2013)

  38. Tawarmalani, M., Richard, J.P.P., Xiong, C.: Explicit convex and concave envelopes through polyhedral subdivisions. Math. Program. 138(1–2), 531–577 (2013)

  39. Wang, Y., Wang, H., Mahimkar, A., Alimi, R., Zhang, Y., Qiu, L., Yang, Y.R.: R3: resilient routing reconfiguration. In: Proceedings of the ACM SIGCOMM 2010 Conference, pp. 291–302 (2010)

  40. Wets, R.J.B.: Stochastic programs with fixed recourse: the equivalent deterministic program. SIAM Rev. 16(3), 309–339 (1974)

  41. Wood, R.K.: Deterministic network interdiction. Math. Comput. Model. 17(2), 1–18 (1993)

  42. Xu, G., Burer, S.: A copositive approach for two-stage adjustable robust optimization with uncertain right-hand sides. Comput. Optim. Appl. 70(1), 33–59 (2018)

  43. Zhang, Y., Ge, Z., Greenberg, A., Roughan, M.: Network anomography. In: Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement, pp. 30–30 (2005)

Acknowledgements

We acknowledge Shabbir Ahmed for his insightful comments at Dagstuhl on the use of Markov's inequality for OUQ and Sanjay G. Rao for extensive discussions on NR. The second author would like to acknowledge the funding provided by NSF CMMI-1727989 and AFOSR 21RT0453.

Author information

Corresponding author

Correspondence to Mohit Tawarmalani.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

List of assumptions

Here, we will briefly describe the assumptions we make in different parts of the paper.

  1.

    In Sect. 2, for relating RLT to affine and polynomial policies, we assume:

    • There is no duality gap between CP\((\cdot )\) and CD\((\cdot )\) (A1).

  2.

    For deriving the column generation algorithm and to show convergence of RLT at the \(m^{\text {th}}\) level in Theorem 1, we assume in Sect. 2.1 that:

    • The distribution of \({\mathbb {X}}\) is supported on a finite set of points \({\mathscr {T}}\) in \({\mathscr {P}}\) (A2).

    • \({\mathbb {K}}={\mathbb {R}}^p_+\) (A3).

    • \({\mathscr {T}}\subseteq \{0,1\}^m\) and an inequality description of \(\mathop {\text {conv}}({\mathscr {T}})\) is available (A4).

    Additionally, when \({\mathscr {T}}\) consists of the vertices of a simplex, we show in Proposition 2 that the concave envelope of the indicator function can be used to compute \(\mathop {\text {Pr}}\nolimits _{*}({\mathcal {F}})\). The column generation algorithm also assumes that expectations of a set of functions of the random variable \({\mathbb {X}}\), denoted as \(\{{\mathfrak {f}}_{\alpha }({\mathbb {X}}), \alpha \in {\bar{\varGamma }}\subseteq {\mathbb {N}}^{m}\}\), are known.

  3.

    In Sects. 3 and 4 we devise counting and sampling algorithms by assuming that:

    • \({\mathscr {P}}=\mathop {\text {conv}}({\mathscr {T}})=[0,1]^m\) and \({\mathbb {X}}\in \{0,1\}^m\), with distribution \(\varTheta = \bigotimes _{i=1}^{m}\text {Bernoulli}(p_i)\) (tensor product of m independent Bernoulli distributions). Moreover, we assume that \(p_i = \frac{a_i}{n_i}\), where \(a_i, n_i\in {\mathbb {N}}\), and GCF\((a_i, n_i) = 1\) (A5).

    • Without loss of generality, the weights of the general inequality defining each sliced low weight polytope (SLWP) are non-negative, i.e., \(w_{i} \in {\mathbb {Z}}_{\ge 0}\) for all \(i \in [m]\) (A6).

  4.

    We derive the Bernstein approximation by assuming in Sect. 4.2 that:

    • \(S' = \bigl \{\sum _{i=1}^{m} w_{i}x_{i}\ge C, \sum _{i=1}^{m} x_{i} = {\mathfrak {b}}, \text { where } w_{i} \ge 0\ \forall i \in [m]\bigr \}\) and \(\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}_i=1)=p\) for all \(i\in [m]\) (A7).
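Under (A5), the probability of any binary outcome is a rational with denominator \(\prod _{i} n_i\), so probability estimation reduces to weighted counting. The following sketch is purely illustrative (it is not the paper's counting algorithm): it computes \(\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}\in S)\) exactly by enumeration for tiny m, on a hypothetical event set.

```python
from fractions import Fraction
from itertools import product

def prob_in_set(a, n, member):
    """Exact Pr_Theta(X in S) for X with independent Bernoulli(a_i/n_i)
    components, by full enumeration (illustration only: exponential in m)."""
    total = Fraction(0)
    for x in product((0, 1), repeat=len(a)):
        if member(x):
            p = Fraction(1)
            for xi, ai, ni in zip(x, a, n):
                p *= Fraction(ai, ni) if xi else Fraction(ni - ai, ni)
            total += p
    return total

# Hypothetical event: S = {x in {0,1}^3 : x_1 + 2 x_2 + 3 x_3 >= 3},
# with p = (1/2, 1/3, 1/4).
p = prob_in_set([1, 1, 1], [2, 3, 4],
                lambda x: x[0] + 2 * x[1] + 3 * x[2] >= 3)
assert p == Fraction(3, 8)
```

Exact rational arithmetic mirrors the role of (A5): all probabilities live on the grid \(\bigl (\prod _i n_i\bigr )^{-1}{\mathbb {Z}}\).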

Proof of Proposition 1

Let \(y = P^\intercal x + q\). For y to be feasible, \((AP^\intercal + B)x + Aq \le _{\mathbb {K}}c\) for all x satisfying \({\mathfrak {C}}x \le {\mathfrak {d}}\). Then, \((AP^\intercal + B)x + Aq - c \le _{{\mathbb {K}}} U^\intercal {\mathfrak {C}}x - U^\intercal {\mathfrak {d}}= U^\intercal ({\mathfrak {C}}x - {\mathfrak {d}}) \le _{{\mathbb {K}}} 0\), where the first inequality follows from (6b) and (6c), and the last inequality holds because, by (6e), \(U^\intercal ({\mathfrak {C}}x - {\mathfrak {d}})\) is a non-positive conic combination of vectors in \({\mathbb {K}}\). Moreover, for the objective, \(e^\intercal (P^\intercal x + q) = {\underline{\varTheta }}^\intercal {\mathfrak {C}}x + e^\intercal q \le {\underline{\varTheta }}^\intercal {\mathfrak {d}}+ e^\intercal q\), where the equality follows from (6d) and the inequality holds because \({\underline{\varTheta }}\ge 0\) and \({\mathfrak {C}}x \le {\mathfrak {d}}\). This shows that the feasible solutions of (6) describe an affine policy and that the objective function value overestimates that of the corresponding affine policy. We now show that the relaxation is exact when \({\mathbb {K}} = {\mathbb {R}}_{+}^{p}\) and \({\mathscr {P}}\ne \emptyset \). For an affine policy to be feasible, \((A_{k}P^{\intercal } + B_{k})x + A_{k}q - c_{k} \le 0\) for all x satisfying \({\mathfrak {C}}x \le {\mathfrak {d}}\) and for all \(k\in [p]\), where \(A_{k}^\intercal \in {\mathbb {R}}^{n}\) and \(B_{k}^\intercal \in {\mathbb {R}}^{m}\) denote the \(k^{\text {th}}\) rows of A and B, respectively, and \(c_{k}\in {\mathbb {R}}\) denotes the \(k^{\text {th}}\) entry of c. In other words, for \(a^\intercal = -(A_{k}P^\intercal + B_{k})\) and \(b = A_{k} q - c_{k}\), it follows that \(\{x\mid a^\intercal x < b, {\mathfrak {C}}x \le {\mathfrak {d}}\} =\emptyset \). By Farkas' Lemma, one of \({\mathscr {S}}_{1}\) and \({\mathscr {S}}_{2}\) is therefore feasible, where

$$\begin{aligned} {\mathscr {S}}_{1}&\,{:}{=}\, \bigl \{(\uplambda , \mu )\in {\mathbb {R}}_{++}\times {\mathbb {R}}^{l}_+\bigm |\uplambda a^\intercal + \mu ^\intercal {\mathfrak {C}}= 0, \uplambda b + \mu ^\intercal {\mathfrak {d}}\le 0\bigr \}\\ {\mathscr {S}}_{2}&\,{:}{=}\, \bigl \{(\uplambda ,\mu )\in {\mathbb {R}}_{+}\times {\mathbb {R}}^{l}_+ \bigm | \uplambda a^\intercal + \mu ^\intercal {\mathfrak {C}}= 0, \uplambda b + \mu ^\intercal {\mathfrak {d}}< 0\bigr \}. \end{aligned}$$

In \({\mathscr {S}}_{1}\), to see the equivalence, scale \(\uplambda \) to 1 and set \(\mu = {\widetilde{U}}_{k}\), the \(k^{\text {th}}\) column of U. In \({\mathscr {S}}_{2}\), we may assume \(\uplambda =0\), since otherwise we obtain a solution to \({\mathscr {S}}_1\). Therefore, there is a non-negative \(\mu \) such that \(\mu ^\intercal {\mathfrak {C}}= 0\) and \(\mu ^\intercal {\mathfrak {d}}< 0\), which, by Farkas' Lemma, contradicts \({\mathscr {P}}\ne \emptyset \).\(\square \)
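The dualization at the heart of this argument can be checked numerically on a small hypothetical instance: an inequality is valid over a polyhedron exactly when a non-negative multiplier vector certifies it, which is the role the columns of U play in (6). The sketch below (illustrative data only) verifies both sides for a box uncertainty set.

```python
# Hypothetical instance: uncertainty set {x : Cx <= d} = [0,1]^2, and one
# valid inequality a^T x <= b playing the role of a row of the robust
# constraint (AP^T + B)x <= c - Aq.
C = [[1, 0], [0, 1], [-1, 0], [0, -1]]
d = [1, 1, 0, 0]
a = [2, 3]; b = 5

# Over the box, max a^T x is attained at a vertex; enumerate the vertices.
verts = [(i, j) for i in (0, 1) for j in (0, 1)]
assert max(2 * x1 + 3 * x2 for x1, x2 in verts) <= b   # robust feasibility

# Farkas certificate: u >= 0 with u^T C = a^T and u^T d <= b witnesses
# validity of the inequality over the whole set, without enumerating x.
u = [2, 3, 0, 0]
assert all(ui >= 0 for ui in u)
assert [sum(u[r] * C[r][j] for r in range(4)) for j in range(2)] == a
assert sum(ur * dr for ur, dr in zip(u, d)) <= b
```

The certificate replaces the "for all x" quantifier with finitely many linear conditions, which is exactly why the robust counterpart (6) is a single conic program.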

Extension of Proposition 1 to consider polynomial policies

The design of such polynomial policies relates to the use of polynomial chaos expansion for structured representation of uncertainty in chance-constrained optimization; see [30] for its use in optimal power flow. Suppose \(y_{j} = \sum _{\alpha \in \gamma _j} g_{j}^{\alpha } x^{\alpha }\) for \(j \in [n]\), where \(\alpha = (\alpha _{1},\dots , \alpha _{m})\in \gamma _j\subseteq {\mathbb {N}}^m\), \(g_{j}^{\alpha } \in {\mathbb {R}}\), and \(x^{\alpha }\) represents the monomial \(x_{1}^{\alpha _{1}}\cdots x_{m}^{\alpha _{m}}\). Let \(Y(x) = \{y \in {\mathbb {R}}^n : A(x)y + {\mathscr {X}}(x) \le _{{\mathbb {K}}} c\}\), where A(x) is a \(p\times n\) matrix of polynomial functions and \({\mathscr {X}}(x)\) is a \(p\)-vector of polynomials such that \(A(x)_{kj} = \sum _{\beta \in S_{kj}} a^{\beta }_{kj} x^{\beta }\) and \({\mathscr {X}}(x)_{k} = \sum _{\beta \in S_{k0}}a^{\beta }_{k0}x^{\beta }\) for some sets \(S_{kj}\) and \(S_{k0}\). Let \(S'{:}{=}\{\alpha ': \exists (j,k) \text { such that } \alpha ' = \alpha + \beta , \alpha \in \gamma _j, \beta \in S_{kj}\}\). Assume that \({\mathscr {P}}= \{x': {\mathfrak {C}}'x' \le {\mathfrak {d}}'\}\) is a linear relaxation of \(\{x': x'_{\alpha '} = x^{\alpha '}\forall \alpha '\in S', {\mathfrak {C}}x\le {\mathfrak {d}}\}\). Let \(\varsigma = \max _{x\in {\mathscr {P}}}\min _{y\in Y(x)} e^\intercal y\) and restrict y to a polynomial policy to define:

$$\begin{aligned} \varPsi _{1}^* {:}{=}&\min _{\xi ,g}&\quad&\xi&\quad&\\&\xi \ge e^\intercal \sum _{\alpha \in \gamma _j} g_j^\alpha x'_{\alpha }&\forall x'\in \{ {\mathfrak {C}}'x \le {\mathfrak {d}}' \}\\{}&\sum _{j\in [n]}\sum _{\beta \in S_{kj}}\sum _{\alpha \in \gamma _j} a^\beta _{kj}g^\alpha _{j} x'_{\alpha + \beta } + \sum _{\beta \in S_{k0}}a^\beta _{k0} x'_{\beta } \le _{{\mathbb {K}}} c_k&\begin{aligned} \forall k\in [p]\\ \forall x'\in \{{\mathfrak {C}}'x \le {\mathfrak {d}}'\}. \end{aligned} \end{aligned}$$

Then, assuming \(\{x': {\mathfrak {C}}' x'\le {\mathfrak {d}}'\}\) is not empty and \({\mathbb {K}}={\mathbb {R}}^p_+\), dualization allows us to succinctly express the constraints for all \(x'\) so that \(\varPsi _D^* = \varPsi _1^*\), where:

$$\begin{aligned} \varPsi _{D}^* =&\min _{g, \varTheta ,U}&\quad&\sum _{r\in [l]} {\mathfrak {d}}'_{r}\varTheta _{r} + \sum _{j:0\in \gamma _{j}} e_{j}g_{j}^0&\quad&\end{aligned}$$
(C.2a)
$$\begin{aligned}&\sum _{r\in [l]}U_{rk}{\mathfrak {C}}'_{r\alpha } = \sum _{j\in [n]}\sum _{\alpha -\alpha '\in S_{kj}}\sum _{\alpha '\in \gamma _{j}} a^{\alpha -\alpha '}_{kj} g_{j}^{\alpha } + a^{\alpha }_{k0}&\forall k\in [p],\alpha \in \gamma _j\end{aligned}$$
(C.2b)
$$\begin{aligned}&\sum _{r\in [l]}U_{r k}{\mathfrak {d}}'_{r} + \sum _{j:0\in S_{k,j}} a^{0}_{kj}g^{0}_{j} + \sum _{0\in S_{k0}} a^{0}_{k0} \le c_k&\forall k\in [p]\end{aligned}$$
(C.2c)
$$\begin{aligned}&\sum _{r\in [l]}\varTheta _{r}{\mathfrak {C}}'_{r \alpha } = \sum _{j:\alpha \in \gamma _{j}} g_{j}^{\alpha }e_{j}&\forall \alpha \in \gamma _j\end{aligned}$$
(C.2d)
$$\begin{aligned}&\varTheta \ge 0, \ U_{\cdot k}\ge 0&\forall k\in [p]. \end{aligned}$$
(C.2e)

Proposition 6

Assume that \({\bar{x}}\in {\mathscr {P}}\), and there exists a \({\bar{w}}\in -{\mathbb {K}}^*\) such that \({\bar{w}}^\intercal A({\bar{x}}) = e^\intercal \), and that strong duality holds for the inner problem, i.e.,

$$\begin{aligned} \min _{y\in Y(x)} e^\intercal y = \text {CD}(x) {:}{=} \max _{w}\bigl \{w^\intercal \bigl (c-{\mathscr {X}}(x)\bigr ) \bigm | w^\intercal A(x) = e^\intercal , w \le _{\mathbb {K}}^* 0 \bigr \}, \end{aligned}$$

where \(Y(x) = \{y \in {\mathbb {R}}^n : A(x)y + {\mathscr {X}}(x) \le _{{\mathbb {K}}} c\}\), such that \(A(x)_{kj}\) and \({\mathscr {X}}(x)_{k}\) for \(k\in [p]\) and \(j\in [n]\), are as discussed above. Then, if \({\mathbb {K}}={\mathbb {R}}^p_+\), there is an RLT relaxation of \(\varsigma = \max _{x\in {\mathscr {P}}}\min _{y\in Y(x)} e^\intercal y\) which dualizes (C.2) and has the same optimal value.

Proof

By strong duality, \(\varsigma = \max _{x\in {\mathscr {P}}} \text {CD}(x)\). We obtain the following constraints by taking products of equality constraints in \(\text {CD}(x)\) with \(x^\alpha \) and inequalities with \({\mathfrak {C}}' x'\le {\mathfrak {d}}'\) that relax the monomial definitions:

$$\begin{aligned} \sum _{k\in [p]}\sum _{\beta \in S_{kj}} w_{k}a^{\beta }_{kj} x^{\alpha + \beta } = x^{\alpha } e_{j} \ \forall \alpha \in \gamma _{j}, \forall j\in [n]; \;\text { and }\; ({\mathfrak {d}}'_{r} - {\mathfrak {C}}'_{r}x' )w^\intercal \le _{{\mathbb {K}}^*} 0. \end{aligned}$$

Upon linearization, we obtain:

$$\begin{aligned} {\underline{\varDelta }}{:}{=}&\max _{w,x,w',w''}&\quad&w^\intercal ( c- B'x')&\quad&\end{aligned}$$
(C.3a)
$$\begin{aligned}&w' A' = M'\end{aligned}$$
(C.3b)
$$\begin{aligned}&{\mathfrak {d}}'_{r}w^\intercal - {\mathfrak {C}}'_{r} w'' \le _{{\mathbb {K}}^*} 0&\forall r \in [l]\end{aligned}$$
(C.3c)
$$\begin{aligned}&{\mathfrak {C}}'x' \le {\mathfrak {d}}'\end{aligned}$$
(C.3d)
$$\begin{aligned}&w\le _{{\mathbb {K}}^*} 0, \end{aligned}$$
(C.3e)

where: (i) \(w''_{(\alpha ,k)}\) linearizes \(x^\alpha w^\intercal \) and \(x'_\alpha w^\intercal \), (ii) whenever \(\alpha ' + \beta ' = \alpha \), \(j\in [n]\), \(\alpha ' \in \gamma _{j}\), and \(\beta ' \in S_{kj}\), \(w'_{(j,\alpha '),(\beta ',k)} = w''_{(\alpha ,k)}\), (iii) for all \(k\in [p]\), \(j\in [n]\), \(A'_{(\beta ,k),j} = a^{\beta }_{kj}\) if \(\beta \in S_{kj}\) and 0 otherwise, (iv) for all \(\alpha \in \gamma _{j}\) and \(j \in [n]\), \(M'_{(j,\alpha )} = x^{\alpha }e_{j}\), and (v) for all \(k\in [p]\), \(B'_{(k, \beta )} = a^{\beta }_{k0}\) if \(\beta \in S_{k0}\) and 0 otherwise. Let \(P'\), \(\{U'_{r}\}_{r\in [l]}\), and \(\varTheta '\) be the dual variables to the equations (C.3b), (C.3c) and (C.3d) respectively. Given that \(({\bar{w}},{\bar{x}})\) is feasible for \(\max _{x\in {\mathscr {P}}} \text {CD}(x)\), its relaxation (C.3) used to compute \({\underline{\varDelta }}\) is also feasible. When \({\mathbb {K}}={\mathbb {R}}^p_+\), (C.3) is a linear program and so has no duality gap. In general, its dual is:

$$\begin{aligned}&\min _{\varTheta ',U',P'}&\quad&\sum _{j:0\in \gamma _{j}} P'_{j0}e_{j} + \sum _{r\in [l]}\varTheta '_{r} {\mathfrak {d}}'_{r}&\quad&\end{aligned}$$
(C.4a)
$$\begin{aligned}&-U'^\intercal {\mathfrak {C}}' + F' + B' = 0\end{aligned}$$
(C.4b)
$$\begin{aligned}&U'^\intercal {\mathfrak {d}}' + L \le _{{\mathbb {K}}} c \end{aligned}$$
(C.4c)
$$\begin{aligned}&h + {\mathfrak {C}}'^\intercal \varTheta ' = 0 \end{aligned}$$
(C.4d)
$$\begin{aligned}&\varTheta ' \ge 0, U'_{r} \ge _{{\mathbb {K}}} 0&\forall r\in [l], \end{aligned}$$
(C.4e)

where, for \(k\in [p]\) and \(\alpha \in \gamma _{j}\), \(F'_{\alpha ,k} = \sum _{j} \sum _{\alpha '\in \gamma _{j}} \sum _{\beta '=\alpha -\alpha '\in S_{kj}} a^{\beta '}_{kj} P'_{\alpha 'j}\), and, for \(k\in [p]\), \(L_{k} = \sum _{j\in [n]}\sum _{0\in \gamma _{j}}\sum _{k:0\in S_{kj}} a^{0}_{kj} P'_{0j}\). Finally, for \(\alpha \in \gamma _j\), \(h_{\alpha } = -\sum _{j:\alpha \in \gamma _{j}} P_{\alpha j}e_{j}\). When \({\mathbb {K}}={\mathbb {R}}^p_+\), we obtain (C.2) by replacing \((\varTheta ',U',P')\) in (C.4) with \((\varTheta , U, g)\). \(\square \)

The equivalence in Propositions 1 and 6 holds when \({\mathbb {K}}\) has a tractable linear inequality representation. To reduce the more general case to that for \({\mathbb {R}}^p_+\), we write \(U\in {\mathbb {K}}\) as \({\mathcal {G}}U \ge 0\) for some \({\mathcal {G}}\) and replace \(Ay + Bx \le _{{\mathbb {K}}} c\) with \({\mathcal {G}}Ay + {\mathcal {G}}Bx \le {\mathcal {G}}c\).

Proof of Proposition 2

By definition, \({\hat{\mathbbm {1}}}_{E}({\mathbb {E}}_{*}[{\mathbb {X}}]) = \max _{\uplambda }\bigl \{\sum _{i\in {\mathcal {I}}}\uplambda _{i} \mathbbm {1}_{{\mathcal {F}}}(x^{i}) \bigm | \sum _{i\in {\mathcal {I}}}\uplambda _{i}x^{i} = {\mathbb {E}}_{*}[{\mathbb {X}}], \uplambda \ge 0, \sum _{i\in {\mathcal {I}}}\uplambda _{i} = 1 \bigr \}, \) where \(\uplambda = \{\uplambda _{i}\}_{i\in {\mathcal {I}}}\) and \(\{x^{i}\}_{i\in {\mathcal {I}}}\) are the extreme points in \({\mathscr {T}}\). There is a unique feasible solution with \(\uplambda _{i} = \mathop {\text {Pr}}_{*}({\mathbb {X}}= x^{i})\). So, \(\displaystyle {\hat{\mathbbm {1}}}_{E}({\mathbb {E}}_{*}[{\mathbb {X}}]) = \sum _{i\in {\mathcal {I}}} \text {Pr}_{*}({\mathbb {X}} = x^{i}) \mathbbm {1}_{{\mathcal {F}}}(x^{i}) = {\mathbb {E}}_{*}[\mathbbm {1}_{{\mathcal {F}}}({\mathbb {X}})]\). \(\square \)
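The uniqueness step is just the uniqueness of barycentric coordinates on a simplex. A small numerical sketch (with a hypothetical three-point \({\mathscr {T}}\), event, and distribution) recovers the probabilities as the unique feasible \(\uplambda \) and evaluates the envelope at \({\mathbb {E}}_{*}[{\mathbb {X}}]\):

```python
# Hypothetical T: vertices of the standard simplex in R^2, an event F
# containing only the vertex x^1 = (1, 0), and a distribution on T.
verts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
indic = [0.0, 1.0, 0.0]                       # 1_F at each vertex
probs = [0.5, 0.3, 0.2]                       # Pr(X = x^i)
mean = tuple(sum(p * v[k] for p, v in zip(probs, verts)) for k in (0, 1))

# Barycentric coordinates of mean w.r.t. the standard simplex are unique:
# lam_1 = mean_1, lam_2 = mean_2, lam_0 = 1 - mean_1 - mean_2.
lam = [1.0 - mean[0] - mean[1], mean[0], mean[1]]
assert all(abs(l - p) < 1e-12 for l, p in zip(lam, probs))

# Hence the concave envelope at E[X] evaluates to E[1_F(X)] = Pr(X = x^1).
assert abs(sum(l * f for l, f in zip(lam, indic)) - 0.3) < 1e-12
```

For a general simplex the coordinates are obtained by solving one linear system, but the conclusion is the same: the only feasible \(\uplambda \) is the distribution itself.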

Proof of Proposition 3

We first write \({\hat{\mathbbm {1}}}(x)\) as \(\max _{(w,v)\in {\mathcal {S}}}h(x,w,v)\). Then, for any \({\bar{x}}\in {\mathscr {T}}\), \(\max _{(w,v)\in {\mathcal {S}}} r({\bar{x}},w,v) = \mathbbm {1}_{{\mathcal {F}}\cap {\mathscr {T}}}({\bar{x}})\ge 0\). Since \(\max _{(w,v)\in {\mathcal {S}}}h(x,w,v)\) is concave and, for \(x\in \mathop {\text {conv}}({\mathscr {T}})\), is larger than \(\mathbbm {1}_{{\mathcal {F}}\cap {\mathscr {T}}}(x)\), it follows that \({\hat{\mathbbm {1}}}(x)\ge {\hat{\mathbbm {1}}}_E(x)\). For the converse, observe that, for all \(({\bar{x}},w,v)\in \mathop {\text {conv}}({\mathscr {T}})\times {\mathcal {S}}\), \(r({\bar{x}},w,v)\le \mathbbm {1}_{{\mathcal {F}}\cap {\mathscr {T}}}({\bar{x}})\le {\hat{\mathbbm {1}}}_E(x)\). Since \({\hat{\mathbbm {1}}}_E(x)\) is concave, it follows that \(h({\bar{x}},w,v)\le {\hat{\mathbbm {1}}}_E(x)\) and, so, \({\hat{\mathbbm {1}}}(x) = \max _{(w,v)\in {\mathcal {S}}}h(x,w,v) \le {\hat{\mathbbm {1}}}_E(x)\). \(\square \)

Proof of Theorem 1

We prove the result by showing that \(\varphi ^R(b_\alpha )\) computes the optimal value in (12). Let \({\mathscr {M}}_{J}(x){:}{=}\prod _{j\in J}x_{j}\). Let \(\alpha \) be defined so \(\alpha _j=1\) if \(j\in J\) and 0 otherwise. Then, \({\mathscr {M}}_J(x) = x^\alpha \). Clearly, for any variable z, the functions \(z{\mathscr {M}}_{J}(x)\) and \(z{\mathfrak {M}}_{J}(x)\) for \(J\subseteq [m]\) form bases of the same vector space of functions. Indeed, \(z{\mathfrak {M}}_{J}(x) = \sum _{J':J\subseteq J'\subseteq [m]}(-1)^{|J'\backslash J|}z{\mathscr {M}}_{J'}(x)\). Conversely, we have \(z{\mathscr {M}}_{J}(x) = \sum _{J':J\subseteq J'\subseteq [m]} z{\mathfrak {M}}_{J'}(x)\). Therefore, we write the RLT relaxation obtained from (13) equivalently without expanding the multilinear terms, instead linearizing \(\varphi {\mathfrak {M}}_{J}(x)\), \(w{\mathfrak {M}}_{J}(x)\), \(v{\mathfrak {M}}_J(x)\), and \({\mathfrak {M}}_{J}(x)\) directly using \(\varphi ^J\), \(w^J\), \(v^J\), and \({\mathfrak {p}}_J\) respectively. Since the former basis includes 1, we must also require that \(\sum _{J':J'\subseteq [m]} z{\mathfrak {M}}_{J'}(x) = z\) for each \(z\in \{\varphi ,w,v,1\}\). When z is \(\varphi \), this shows that the objective (12a) matches that in (13a). The substitution \(x_i^2=x_i\) replaces \(x_i{\mathfrak {M}}_J(x)\) with \({\mathfrak {X}}^{J}_i{\mathfrak {M}}_J(x)\). This is linearized as \({\mathfrak {X}}^{J}_i{\mathfrak {p}}_J\) in (13e) while \({\mathfrak {M}}_J(x) w^\intercal Bx\) in (13b) is replaced with \((w^J)^\intercal B{\mathfrak {X}}^{J}\). The constraints (12b), (12c), and (12d) now follow easily from the linearizations of (13b), (13c) and (13d).

We show that the set defined by the linearization of (13e), denoted as \(X'\), is \(X = \bigl \{({\mathfrak {p}}_{J})_{J\subseteq [m]}\,:\, {\mathfrak {p}}_J\ge 0,\,J\subseteq [m];\, \sum _{J\subseteq [m]}{\mathfrak {p}}_J = 1;\, {\mathfrak {p}}_J = 0 \text { if } {\mathfrak {X}}^J\not \in {\mathscr {T}}\bigr \}\). Note that X models the probability distributions with support on \({\mathscr {T}}\). Because \(x_i{\mathfrak {M}}_J(x)\) linearizes to \({\mathfrak {X}}^J_i{\mathfrak {p}}_J\), \(X'\) has the same variables as X. We first show that \(X'\subseteq X\). Observe that \(\sum _{J':J'\subseteq [m]} {\mathfrak {M}}_{J'}(x) = 1\) linearizes to \(\sum _{J':J'\subseteq [m]} {\mathfrak {p}}_{J'} = 1\). Moreover, for any \(j\in J\) (resp. \(j\in J^C\)), \(x_j\ge 0\) (resp. \(1-x_j\ge 0\)) is implied by \(\mathop {\text {conv}}({\mathscr {T}})\). Thus, the linearization of \(x_j{\mathfrak {M}}_J(x)\ge 0\) (resp. \((1-x_j){\mathfrak {M}}_J(x)\ge 0\)) is implied by (13e) and yields \({\mathfrak {p}}_J\ge 0\). Now, consider any \({\mathfrak {X}}^J\not \in \mathop {\text {conv}}({\mathscr {T}})\). Then, if \({\mathfrak {p}}_J > 0\), we obtain a contradiction since (13e) requires that \({\mathfrak {p}}_J {\mathfrak {X}}^J \in {\mathfrak {p}}_J \mathop {\text {conv}}({\mathscr {T}})\). Therefore, \({\mathfrak {p}}_J = 0\) whenever \({\mathfrak {X}}^J\not \in {\mathscr {T}}\). Next, we show that \(X'\supseteq X\). Since \(X'\) is convex, it suffices to show that the extreme points of X are contained in \(X'\). It can be verified that if \({\mathfrak {X}}^J\in {\mathscr {T}}\), then the solution with \({\mathfrak {p}}_J=1\) and \({\mathfrak {p}}_{J'}=0\) for \(J'\ne J\) is feasible to \(X'\).

Finally, we show that \(x_\alpha = b_\alpha \) is feasible to the linearization of (13e). Let \({\bar{{\mathfrak {p}}}}_J = \sum _{J'\subseteq J^{C}} (-1)^{|J'|} b_{\alpha (J\cup J')}\) for all \(J\subseteq [m]\). Then, since \(b_\alpha \) is the moment of \(x^\alpha \) with support on \({\mathscr {T}}\), it follows that \({\bar{{\mathfrak {p}}}}_J\in X\). Let \(x_\alpha \) linearize \({\mathscr {M}}_J(x)\), where \(\alpha _j=1\) if \(j\in J\) and 0 otherwise. Observe that, with this linearization, (13e) yields an affine transform of X, say T(X), in the space of \(x_\alpha \) variables. Then, \(x_\alpha = \sum _{J': J\subseteq J'\subseteq [m]}{\bar{{\mathfrak {p}}}}_{J'}\) is feasible to T(X). However, it can be easily verified that \(\sum _{J': J\subseteq J'\subseteq [m]}{\bar{{\mathfrak {p}}}}_{J'} = \sum _{J':J'\subseteq [m]}{\bar{{\mathfrak {p}}}}_{J'}({\mathfrak {X}}^{J'})^\alpha = b_\alpha \). The first equality is because \(({\mathfrak {X}}^J)^\alpha = 1\) if \(J\subseteq J'\) and 0 otherwise, while the second equality follows since \({\bar{{\mathfrak {p}}}}_J\) is the probability distribution corresponding to the moments \(b_\alpha \). Thus, \(x_\alpha = b_\alpha \) is feasible to T(X). Then, it follows that \(\varphi ^R(b_\alpha )\) computes \(\mathop {\text {Pr}}\nolimits _\varTheta ({\mathcal {F}})\) as in (12). \(\square \)
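The basis change between the monomials \({\mathscr {M}}_J\) and the functions \({\mathfrak {M}}_J\) invoked at the start of the proof can be verified directly. The sketch below checks both inversion formulas, \({\mathscr {M}}_{J}(x) = \sum _{J'\supseteq J} {\mathfrak {M}}_{J'}(x)\) and \({\mathfrak {M}}_{J}(x) = \sum _{J'\supseteq J}(-1)^{|J'\backslash J|}{\mathscr {M}}_{J'}(x)\), for \(m=3\) at a random point:

```python
from itertools import chain, combinations
import random

m = 3
subsets = [frozenset(s) for s in chain.from_iterable(
    combinations(range(m), k) for k in range(m + 1))]

def M(J, x):
    """Monomial M_J(x) = prod_{j in J} x_j."""
    p = 1.0
    for j in J:
        p *= x[j]
    return p

def Mfrak(J, x):
    """Indicator-basis function prod_{j in J} x_j * prod_{j not in J} (1 - x_j)."""
    p = 1.0
    for j in range(m):
        p *= x[j] if j in J else 1.0 - x[j]
    return p

random.seed(0)
x = [random.random() for _ in range(m)]
for J in subsets:
    # M_J(x) = sum over supersets J' of Mfrak_{J'}(x)
    assert abs(sum(Mfrak(Jp, x) for Jp in subsets if J <= Jp) - M(J, x)) < 1e-12
    # Mfrak_J(x) = sum over supersets J' of (-1)^{|J'\J|} M_{J'}(x)
    rhs = sum((-1) ** len(Jp - J) * M(Jp, x) for Jp in subsets if J <= Jp)
    assert abs(rhs - Mfrak(J, x)) < 1e-12
```

In particular, the \(J=\emptyset \) case of the first identity is the normalization \(\sum _{J'\subseteq [m]}{\mathfrak {M}}_{J'}(x)=1\) used when linearizing to \(\sum _{J'}{\mathfrak {p}}_{J'}=1\).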

Proof of Theorem 2

We denote the maximum value of \(s^{\ge }((i,j),*)\) by \(M_i\), where \(s^{\ge }((i,j),c) = \sum _{{\tilde{c}}\ge c}s((i,j),{\tilde{c}})\). For any i, range\((j)_i\) is the range of possible values of j. For \(i\le K_2\), \(M_i\le \left( {\begin{array}{c}i\\ j\end{array}}\right) {\mathcal {T}}^{i}\), \(j \in \bigl [\max \{0,i+ {\mathfrak {b}}-K_{2}\}, \min \{ K_{2}, K_{1}+{\mathfrak {b}},i \}\bigr ]\), and range\((j)_{i} = \min \{K_{2}, i + {\mathfrak {b}}- 2K_{2}, K_{1} + {\mathfrak {b}}, m- i \}\). For \(i > K_2\), there is an \(l\in \bigl [0, \min \{K_2-j,i-K_2\}\bigr ]\) so that we select \(j+l\) (resp. \(i-K_2-l\)) variables from \(\{1,\ldots ,K_2\}\) (resp. \(\{K_2+1,\ldots ,i\}\)) to set to 1 (resp. 0). It follows that \(M_i\le \left( {\begin{array}{c}i\\ j+i-K_2\end{array}}\right) {\mathcal {T}}^{i}\), \(j\in \bigl [{\mathfrak {b}}, \min \{K_{2}, m- i + {\mathfrak {b}}\}\bigr ]\), and range\((j)_{i} = \min \{ K_{2} - {\mathfrak {b}}, m-i \}\). We choose a sparsification parameter \(\delta _{s}\) to perform \((1+\delta _{s})\) sparsification of each \(s((i,j),{\tilde{c}})\). Observe that \(\log _{(1+\delta _{s})}\left( {\begin{array}{c}i\\ j\end{array}}\right) \le \min \{\frac{i}{2},j\}\log _{1+\delta _{s}}\frac{i\exp (1)}{\min \{\frac{i}{2},j\}}\). Since the time-complexity of summing, shifting, and querying function lists is bounded by their size, the time complexity is \(O\bigl (m{\underline{\varTheta }}\bigl (\xi \log _{1+\delta _{s}}(m/\xi ) + m\log _{1+\delta _{s}}{\mathcal {T}}\bigr )\bigr )\). The time complexity in Theorem 2 follows by choosing \(\delta _{s} = (1+\epsilon _{s})^{1/m} - 1\) and using \(\ln (1+\epsilon _{s}) \ge \epsilon _{s}/ 2\) for \(\epsilon _{s}\in (0,1)\). \(\square \)

Proof of Theorem 3

Consider a \(t+1\) dimensional DAG, where the \((l+1)^{\text {st}}\) dimension corresponds to the \(l^{\text {th}}\) low weight constraint. Let \(s((i,j_{1},\dots ,j_{t}),*)\) be the list of all pairs \((c,s((i,j_{1},\dots ,j_{t}),c))\). For \({\mathcal {T}} \ge \max _{i} n_{i}\), if M is the maximum value of \(s^{\ge }((i,j_{1},\dots ,j_{t}),*)\), then \(M\le (2{\mathcal {T}})^{i}\) as there are \(2^i\) solutions, each of which occurs at most \({\mathcal {T}}^i\) times. Moreover, let \(\gamma \) be such that \(\gamma \ge \max _{k} w_{k}^{l} - \min _{k} w_{k}^{l}\) for all \(l\in [t]\). Thus, the \(l^{\text {th}}\) low weight constraint at the \(i^{\text {th}}\) slice has at most \(i\gamma \) values. Since there are \(t\) low weight constraints, the number of nodes with first coordinate i is bounded by \((m\gamma )^{t}\). Then, after a \(1+\delta _{s}\) sparsification the cumulative length of lists is bounded by \(m(m\gamma )^{t}\log _{1+\delta _{s}}(2{\mathcal {T}})^{m}\). For a \(1+\epsilon _{s}\) approximation, with \(1+\delta _{s}= (1+\epsilon _{s})^{1/m}\), the time complexity is \(O[\epsilon _{s}^{-1}m^{t+3}\gamma ^{t} \ln {\mathcal {T}}]\). \(\square \)
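The slice-by-slice dynamic program over this DAG can be illustrated with a minimal sketch. Below (our code, with \(t=1\) low weight constraint and hypothetical data) each layer maps a state \((j,c)\) — the low-weight total and the threshold-weight total — to an exact count, without sparsification; the final count is checked against brute-force enumeration:

```python
import itertools
from collections import Counter

def slice_counts(w, omega):
    """layers[i] maps (j, c) -> #{x in {0,1}^i : omega[:i].x = j, w[:i].x = c}."""
    layer = Counter({(0, 0): 1})
    layers = [layer]
    for wi, oi in zip(w, omega):
        nxt = Counter()
        for (j, c), n in layer.items():
            nxt[(j, c)] += n              # set x_i = 0
            nxt[(j + oi, c + wi)] += n    # set x_i = 1
        layers.append(nxt)
        layer = nxt
    return layers

w, omega, C, J = [3, 1, 4, 2], [1, 0, 1, 1], 5, 2     # toy instance (ours)
layers = slice_counts(w, omega)
dp = sum(n for (j, c), n in layers[-1].items() if j == J and c >= C)
brute = sum(1 for x in itertools.product([0, 1], repeat=4)
            if sum(o * xi for o, xi in zip(omega, x)) == J
            and sum(wi * xi for wi, xi in zip(w, x)) >= C)
assert dp == brute
```

In the theorem, the \(c\)-coordinate of each state is kept as a \((1+\delta _{s})\)-sparsified list rather than exactly, which is what yields the stated complexity.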

A lower estimate for probability of 0–1 solutions to a SLWP

Given a function \(f:{\mathbb {Z}}^+ \rightarrow {\mathbb {Z}}^+\) and an approximation parameter \(\epsilon _{s}>0\), we say \(F:{\mathbb {Z}}^+\rightarrow {\mathbb {Z}}^+\) (resp. \({\underline{F}}:{\mathbb {Z}}^+ \rightarrow {\mathbb {Z}}^+\)) is a \((1 - \epsilon _{s})\) function approximation (resp. sum-approximation) of f if, for all x, \((1-\epsilon _{s})f(x) \le F(x) \le f(x)\) (resp. \((1-\epsilon _{s})f^{\ge }(x) \le {\underline{F}}^{\ge }(x)\le f^{\ge }(x)\)). The properties in Lemma 2 follow easily for \((1-\epsilon _{s})\) sum-approximations of functions. The sparsifier takes as input a function f and a parameter \(\delta _{s}>0\). We partition values of \(f^{\ge }\) into \([r_{i+1},r_{i})\), where \(r_{0}=\max _c{f^{\ge }(c)}\) and if \(r_i > 0\), then \(r_{i+1} = \min \{r_{i}-1, \lceil (1-\delta _{s})r_{i}\rceil \}\). Let \(c_{i} = \min _{c}\{c\mid f^{\ge }(c) \le r_{i}\}\). For any c, we define \(l(c) = \min _{i}\{c_{i}\mid c_{i} > c \}\). If l(c) is finite, \({\underline{F}}^{\ge }(c) = f^{\ge }(l(c) - 1)\). Otherwise, \({\underline{F}}^{\ge }(c) = \lim _{x\rightarrow \infty }f^{\ge }(x)\). Then, \({\underline{F}}^{\ge }(x)\) is a \((1-\delta _{s})\) approximation of \(f^{\ge }(x)\). As a consequence of Theorem 2, we obtain the time complexity to compute \(|\underline{S_s}|_{\varTheta }\) such that for any given \(\epsilon _{s}\in (0,1)\), \((1-\epsilon _{s})|S_s|_{\varTheta } \le |\underline{S_s}|_{\varTheta } \le |S_s|_{\varTheta }\).
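The sparsifier just described can be prototyped directly. The sketch below (our code; names are ours) builds the thresholds \(r_i\) and breakpoints \(c_i\) for an integer-valued f with finite support, returns \({\underline{F}}^{\ge }\), and checks the sandwich \((1-\delta _{s})f^{\ge }(c)\le {\underline{F}}^{\ge }(c)\le f^{\ge }(c)\) on an example:

```python
import math

def sparsify_tail(f, delta):
    """(1 - delta) sum-approximation of f^>= via geometric thresholds.

    f: dict mapping nonnegative int c -> nonnegative int f(c), finite support.
    Returns a callable F_ge with (1-delta)*f_ge(c) <= F_ge(c) <= f_ge(c).
    """
    cmax = max(f) if f else 0
    # exact tail sums f^>=(c) for c = 0..cmax+1 (0 beyond the support)
    f_ge = [0] * (cmax + 2)
    for c in range(cmax, -1, -1):
        f_ge[c] = f_ge[c + 1] + f.get(c, 0)
    # thresholds r_0 > r_1 > ... down to 0
    r = f_ge[0]
    thresholds = [r]
    while r > 0:
        r = min(r - 1, math.ceil((1 - delta) * r))
        thresholds.append(r)
    # breakpoints c_i = min{ c : f^>=(c) <= r_i }
    cs = sorted({next(c for c in range(cmax + 2) if f_ge[c] <= ri)
                 for ri in thresholds})

    def F_ge(c):
        # l(c) = smallest breakpoint strictly greater than c
        l = next((ci for ci in cs if ci > c), None)
        return f_ge[l - 1] if l is not None else 0

    return F_ge

f = {c: c % 3 + 1 for c in range(20)}     # toy function (ours)
F = sparsify_tail(f, 0.25)
for c in range(22):
    exact = sum(v for k, v in f.items() if k >= c)
    assert (1 - 0.25) * exact <= F(c) <= exact
```

Only the breakpoints need to be stored, so the list length is logarithmic in \(\max _c f^{\ge }(c)\), which is what the complexity counts above exploit.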

Corollary 1

Given \(S_s\), \(\varTheta \) as in (14), (A5) respectively, and an error parameter \(\epsilon _{s}\in (0,1)\), we can deterministically compute a \(1-\epsilon _{s}\) relative error approximation of \(|S_s|_{\varTheta }\) in time given as in Theorem 2.

Proof

When we use a \((1-\delta _{s})\) sparsifier, the time to compute \(|S_s|_\varTheta \) is \(O\bigl (m{\underline{\varTheta }}\bigl [\xi \log _{\frac{1}{(1-\delta _{s})}}\bigl ( \frac{m}{\xi } \bigr ) + m\log _{\frac{1}{1-\delta _{s}}}{\mathcal {T}}\bigr ]\bigr )\). To control the approximation error, we set \((1-\delta _{s})^{m} = (1-\epsilon _{s})\). Then, we obtain the same time-complexity as in Theorem 2 using \(\ln (1-\epsilon _{s})^{-1}\ge \epsilon _{s}\). \(\square \)

Proof of Theorem 4

We write the solution set of \(S_\varOmega \) as \(\bigcup _{J} S(J)\), where each \(S(J) = \{x\in \{0,1\}^{m} : \sum _{i=1}^{m} w_{i}x_{i} \ge C, \varOmega x = J\}\). For a given \({\widetilde{x}}\in \{0,1\}^m\), we first compute \(\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}= {\widetilde{x}}|{\mathbb {X}}\in S_\varOmega )\). To do so, we will compute \(\mathop {\text {Pr}}\nolimits _{\varTheta }({\mathbb {X}}= {\widetilde{x}}\mid {\mathbb {X}}\in S(J))\). Let \(s_{i}(j,c) = \bigl \{x : \sum _{k=1}^{i}w_{k}x_{k} \ge c, \varOmega _{\cdot ,1:i} x_{1:i} = j\bigr \}\) and \(s'_{i}(j,c) = s_{i}(j,c)\cap \{x : x_{k} = {\widetilde{x}}_{k} \forall k > i\}\). Define \(c(i) = C- \sum _{k=i+1}^{m} w_{k}{\widetilde{x}}_{k}\) and \( j^{J}(i) = J - \varOmega _{\cdot ,i+1:m} {\widetilde{x}}_{i+1:m}\). Clearly, if \({\widetilde{x}}\in S(J)\), \(j^J(0) = 0\), \(c(0) \le 0\), and \(s'_0\bigl (0,c(0)) = \{{\widetilde{x}}\}\). Observe that \(s'_{r}\bigl (j^J(r),c(r)\bigr )\subseteq s'_{r+1}\bigl (j^J(r+1),c(r+1)\bigr )\) because if \(x\in s'_{r}\bigl (j^J(r),c(r)\bigr )\), we have \(x_k = {\widetilde{x}}_k\) for \(k > r\), \(\sum _{k=1}^rw_kx_k \ge c(r) = c(r+1) - w_{r+1}{\widetilde{x}}_{r+1}\), and \(\varOmega _{\cdot ,1:r}x_{1:r} = j^J(r) = j^J(r+1) -\varOmega _{\cdot ,r+1}{\widetilde{x}}_{r+1}\), showing that \(x\in s'_{r+1}\bigl (j^J(r+1),c(r+1)\bigr )\). Then, \(s'_{0}\bigl (j^J(0),c(0)\bigr ) \subseteq s'_{1}\bigl (j^J(1),c(1)\bigr )\subseteq \dots \subseteq s'_{m}\bigl (j^J(m),c(m)\bigr ) = S(J) \), and we have

$$\begin{aligned}&\mathop {\text {Pr}}\nolimits _\varTheta \Big ({\mathbb {X}}= {\widetilde{x}}\big | {\mathbb {X}}\in S_\varOmega \Big ) \nonumber \\&=\sum _J\left( \mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in S(J)\bigm | {\mathbb {X}}\in S_\varOmega \bigr )\prod _{i=1}^{m}\mathop {\text {Pr}}\nolimits _\varTheta \Bigl ({\mathbb {X}}\in s'_{i-1}\bigl (j^J(i-1),c(i-1)\bigr )\Big |{\mathbb {X}}\in s'_{i}(j^J(i),c(i))\Bigr )\right) .\nonumber \\ \end{aligned}$$
(J.1)

Further, for all \(J\) and \(i\in [m]\), \(\mathop {\text {Pr}}\nolimits _\varTheta \Bigl ({\mathbb {X}}\in s'_{i-1}\bigl (j^J(i-1),c(i-1)\bigr )\Big |{\mathbb {X}}\in s'_{i}(j^J(i),c(i))\Bigr )\) is:

$$\begin{aligned}&\frac{\mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s'_{i-1}(j^J(i-1),c(i-1))\bigr )}{\mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s'_{i}(j^J(i),c(i))\bigr )}\\&\quad = \frac{\mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s_{i-1}(j^J(i-1),c(i-1))\bigr )}{\mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s_{i}(j^J(i),c(i))\bigr )}\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}_{i}={\widetilde{x}}_{i})\\&\quad = \frac{|s_{i-1}(j^J(i-1),c(i-1))|_{\varTheta }}{|s_{i}(j^J(i),c(i))|_{\varTheta }}\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}_{i} = {\widetilde{x}}_{i})n_{i} = p^J_{i}\delta _{{\widetilde{x}}_i = 0} + (1-p^J_{i})\delta _{{\widetilde{x}}_i = 1}, \end{aligned}$$

where \(p^J_{i} = \frac{|s_{i-1}(j^J(i),c(i))|_{\varTheta }}{|s_{i}(j^J(i),c(i))|_{\varTheta }}(n_i-a_i)\) and \(\delta _{{\widetilde{x}}={\mathfrak {a}}}\) is 1 if \({\widetilde{x}}={\mathfrak {a}}\) and 0 otherwise. The first equality is because the event \({\mathbb {X}}\in s_i(j,c)\) is independent of \(\{{\mathbb {X}}_{i'}\}_{i'=i+1}^m\) and, therefore, \(\mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s'_{i}(j^J(i),c(i))\bigr ) = \mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s_{i}(j^J(i),c(i))\bigr )\prod _{i'= i+1}^{m} \mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}_{i'}={\widetilde{x}}_{i'})\). The second equality follows since \(|s_i(j^J(i),c(i))|_\varTheta = \mathop {\text {Pr}}\nolimits _\varTheta \bigl ({\mathbb {X}}\in s_{i}(j^J(i),c(i))\bigr )\prod _{i'=1}^i n_{i'}\) and the last equality is because \(|s_{i-1}(j^J(i),c(i))|_{\varTheta }(n_i-a_i) + |s_{i-1}(j^J(i)-\varOmega _{.,i},c(i)-w_i)|_{\varTheta }a_i = |s_{i}(j^J(i),c(i))|_{\varTheta }\). Now, we compute \(\mathop {\text {Pr}}\nolimits ({\widetilde{{\mathbb {X}}}}={\widetilde{x}})\), where \({\widetilde{{\mathbb {X}}}}\) is the generated random variable. We write \(\mathop {\text {Pr}}\nolimits ({\widetilde{{\mathbb {X}}}}={\widetilde{x}}) = \sum _J \mathop {\text {Pr}}\nolimits \bigl ({\widetilde{{\mathbb {X}}}}\in S(J)\bigr )\prod _{i=1}^m\mathop {\text {Pr}}\nolimits ({\widetilde{{\mathbb {X}}}}_i={\widetilde{x}}_i\mid {\widetilde{{\mathbb {X}}}}_k = {\widetilde{x}}_k \forall k > i \text { and } {\widetilde{{\mathbb {X}}}}\in S(J))\). Let \({\tilde{p}}^J_i = \mathop {\text {Pr}}\nolimits ({\widetilde{{\mathbb {X}}}}_i=0| {\widetilde{{\mathbb {X}}}}_k = {\widetilde{x}}_k \forall k > i \text { and } {\widetilde{{\mathbb {X}}}}\in S(J))\). At the \({(m+1-i)}^{\text {th}}\) iteration, the algorithm chooses the value for \({\widetilde{{\mathbb {X}}}}_i\).
Assume that \({\widetilde{{\mathbb {X}}}}_k\) was chosen to be \({\widetilde{x}}_k\) for \(k > i\). Then,

$$\begin{aligned} {\widetilde{p}}^J_{i} = \frac{{\tilde{s}}^{\ge }((i-1,j^J(i)),c(i))(n_{i}-a_{i})}{{\tilde{s}}^{\ge }((i-1,j^J(i)),c(i))(n_{i}-a_{i}) + {\tilde{s}}^{\ge }((i-1,j^J(i)-\varOmega _{.,i}),c(i)-{w}_{i})a_{i}}. \end{aligned}$$

Since \({\tilde{s}}^{\ge }((i,j^J(i)),c(i))\) is a \((1+\delta _{s})^{i-1}\) approximation of \(|s_{i}(j^J(i),c(i))|_{\varTheta }\):

$$\begin{aligned} \frac{p^J_{i}}{(1+\delta _{s})^{i-2}} \le {\widetilde{p}}^J_{i} \le (1+\delta _{s})^{i-2}p^J_{i} \end{aligned}$$
(J.2a)
$$\begin{aligned} \frac{1-p^J_{i}}{(1+\delta _{s})^{i-2}} \le 1-{\widetilde{p}}^J_{i} \le (1+\delta _{s})^{i-2}(1-p^J_{i}), \end{aligned}$$
(J.2b)

where the left hand side inequality in (J.2a) (respectively (J.2b)) is obtained by realizing that \({\widetilde{s}}^{\ge }((i-1,j^J(i)),c(i)) \ge |s_{i-1}(j^J(i),c(i))|_{\varTheta }\) and \({\widetilde{s}}^{\ge }((i-1,j^J(i) - \varOmega _{.,i}),c(i) - w_{i}) \le (1+\delta _{s})^{i-2}|s_{i-1}(j^J(i) - \varOmega _{.,i}, c(i) - w_{i})|_{\varTheta }\) (respectively \({\widetilde{s}}^{\ge }((i-1,j^J(i) - \varOmega _{.,i}),c(i)-w_{i}) \ge |s_{i-1}(j^J(i) - \varOmega _{.,i},c(i)-w_{i})|_{\varTheta }\) and \({\widetilde{s}}^{\ge }((i-1,j^J(i)),c(i)) \le (1+\delta _{s})^{i-2}|s_{i-1}(j^J(i),c(i))|_{\varTheta }\)). The right hand side of (J.2a), (J.2b) can be obtained in a similar way. For \(\delta _{s}\in (0,1)\), we have \(1/(1+\delta _{s})^{i} > (1-\delta _{s})^{i}\). Thus, \((1-\delta _{s})^{i-2}p^J_{i} \le {\widetilde{p}}^J_{i}\le (1+\delta _{s})^{i-2}p^J_{i}\). We let \(\mathop {\text {Pr}}\nolimits \bigl ({\widetilde{{\mathbb {X}}}}\in S(J)\bigr ) = \frac{{\tilde{s}}^{\ge }((m,J),C)}{\sum _{J'} {\tilde{s}}^{\ge }((m,J'),C)}\), and observe that:

$$\begin{aligned} \frac{\mathop {\text {Pr}}\nolimits ({\mathbb {X}}\in S(J)\mid {\mathbb {X}}\in S_\varOmega )}{(1+\delta _{s})^{m-1}} \le \mathop {\text {Pr}}\nolimits \bigl ({\widetilde{{\mathbb {X}}}}\in S(J)\bigr ) \le (1+\delta _{s})^{m-1}\mathop {\text {Pr}}\nolimits ({\mathbb {X}}\in S(J)\mid {\mathbb {X}}\in S_\varOmega ). \end{aligned}$$

Therefore, each term in the summation on the right hand side of (J.1) is approximated within a relative error of \((1+\delta _s)^{\eta }\) where \(\eta = m(m-1)/2\). It follows that

$$\begin{aligned} (1-\delta _{s})^{\eta }\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}={\widetilde{x}}|{\mathbb {X}}\in S_\varOmega ) \le \mathop {\text {Pr}}\nolimits ({\widetilde{{\mathbb {X}}}}={\widetilde{x}}) \le (1+\delta _{s})^{\eta }\mathop {\text {Pr}}\nolimits _\varTheta ({\mathbb {X}}={\widetilde{x}}|{\mathbb {X}}\in S_\varOmega ).\nonumber \\ \end{aligned}$$
(J.3)

Now, we obtain a \((1\pm \epsilon _{s})\) approximation if \(\delta _{s}\le 1 - (1-\epsilon _{s})^{1/m^{2}}\) and \(\delta _{s}\le (1+\epsilon _{s})^{1/m^{2}}-1\). Since \((\cdot )^{\frac{1}{m^2}}\) is concave, it follows that \(\frac{1}{2}(1+\epsilon _{s})^{1/m^2} + \frac{1}{2}(1-\epsilon _{s})^{1/m^{2}} \le 1\), i.e., \((1+\epsilon _{s})^{1/m^{2}}-1 \le 1 - (1-\epsilon _{s})^{1/m^{2}}\). Therefore, it suffices to choose \(\delta _{s}= (1+\epsilon _{s})^{1/m^{2}} - 1\) in (J.3). The desired complexity follows from Theorem 3.\(\square \)
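The sequential-conditioning sampler analyzed in this proof can be sketched in a few lines. The version below (our code; names are ours) is specialized to the unweighted case — each coordinate uniform on \(\{0,1\}\), i.e., \(n_i=2\), \(a_i=1\) — and drops the \(\varOmega \)-constraints, so exact DP counts replace the sparsified lists. The telescoping of the conditional probabilities then makes every feasible point equally likely, which the script verifies by brute force:

```python
import functools
import itertools

def make_counter(w):
    """s(i, c) = #{x in {0,1}^i : w_1 x_1 + ... + w_i x_i >= c} (exact DP)."""
    @functools.lru_cache(maxsize=None)
    def s(i, c):
        c = max(c, 0)                       # any c <= 0 behaves like c = 0
        if i == 0:
            return 1 if c == 0 else 0
        return s(i - 1, c) + s(i - 1, c - w[i - 1])
    return s

def generation_prob(w, C, x):
    """Probability that backward sequential sampling emits the feasible point x."""
    s, prob, resid = make_counter(w), 1.0, C
    for i in range(len(w), 0, -1):
        p0 = s(i - 1, resid) / s(i, resid)  # P(X_i = 0 | chosen suffix)
        prob *= p0 if x[i - 1] == 0 else 1.0 - p0
        resid -= w[i - 1] * x[i - 1]        # update residual threshold c(i-1)
    return prob

w, C = [3, 1, 4, 2], 5                      # toy instance (ours)
feasible = [x for x in itertools.product([0, 1], repeat=len(w))
            if sum(wi * xi for wi, xi in zip(w, x)) >= C]
probs = [generation_prob(w, C, x) for x in feasible]
assert make_counter(w)(len(w), C) == len(feasible)
assert all(abs(p - 1 / len(feasible)) < 1e-12 for p in probs)
```

Each factor equals \(s(i-1,c(i-1))/s(i,c(i))\), so the product telescopes to \(1/s(m,C)\); replacing the exact counts by \((1+\delta _{s})\)-approximate lists introduces exactly the per-factor distortion bounded in (J.2a)–(J.2b).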

Proof of Proposition 4

We assume wlog that \(x_{ij}=0\) for all \(\langle i,j \rangle \in E\) and \(U=1\). Given a solution to \(\mathop {\text {Slack-MLU}}{(x)}\), we construct a feasible solution to \(\mathop {\text {MLU}}(x)\). If \({\underline{y}}\) is a solution to \(\mathop {\text {MLU}}(x)\) with demand \({\underline{d}}\) and \({\underline{d}}= {\underline{d}}' + {\underline{d}}''\) where \({\underline{d}}',{\underline{d}}''\ge 0\), then, using augmenting paths, \({\underline{y}}\) can be decomposed into \({\underline{y}}'\), \({\underline{y}}''\ge 0\), where \({\underline{y}}'\) services \({\underline{d}}'\), \({\underline{y}}''\) services \({\underline{d}}''\), and \({\underline{y}}''\) does not contain cycles. Now, let \((y^{a},a)\) be the given solution to \(\mathop {\text {Slack-MLU}}(x)\), where \(y^{a}\) is a routing of \(d'\). Then, we decompose \(y^{a}\) into \(y^{1}\) and \(y^{2}\), where \(y^{1}\) routes d, \(y^{2}\) routes a, and \(y^{2}\) does not contain cycles. Assume wlog that the support of a is a single pair \(\langle i,j \rangle \), so that:

$$\begin{aligned} \sum _{t\in V} y^{1}_{klt} + y^{2}_{klj} \le c_{kl} + a_{ij}\delta _{(\langle i,j\rangle = \langle k,l \rangle )} \end{aligned}$$
(K.1)

where \(\delta _{(\langle i,j\rangle = \langle k,l \rangle )} = 1\) if \(\langle i,j\rangle = \langle k,l \rangle \) and 0 otherwise. Clearly, \(a_{ij} \ge y^{2}_{ijj}\) because \(y^{2}\) does not contain cycles. We define

$$\begin{aligned} Z_{t} = {\left\{ \begin{array}{ll} \frac{y^{1}_{ijt}}{c_{ij} + a_{ij} - y^{2}_{ijj}} &{}\text { if } c_{ij}+a_{ij} - y^{2}_{ijj} > 0\\ 0 &{}\text { otherwise. } \end{array}\right. } \end{aligned}$$
(K.2)

Since \( 0\le \sum _{t\in V} y^{1}_{ijt} \le c_{ij} + a_{ij} - y^{2}_{ijj} \), we get \(0\le \frac{\sum _{t\in V} y^{1}_{ijt}}{c_{ij} + a_{ij} - y^{2}_{ijj}} = \sum _{t\in V} Z_{t} \le 1\). We argue that the flow \(y''\), defined as

$$\begin{aligned} y''_{klt} = y^{1}_{klt} + Z_{t}y^{2}_{klj} - Z_{t}a_{ij}\delta _{(\langle i,j \rangle = \langle k,l \rangle )} \end{aligned}$$
(K.3)

is feasible to \(\mathop {\text {MLU}}{(x)}\). First, we show feasibility to the capacity constraint.

  • C.1 Consider \(\langle i,j \rangle = \langle k,l \rangle \) and observe that: \(\sum _{t\in V}Z_{t} c_{ij} - \sum _{t\in V} y''_{ijt} = \sum _{t\in V} \bigl (Z_{t} c_{ij} + Z_{t}a_{ij} - Z_{t}y^{2}_{ijj} - y^{1}_{ijt}\bigr ) = 0\), where the first equality is by (K.3). If \(c_{ij} + a_{ij} - y^{2}_{ijj} > 0\), the second equality is from (K.2). Otherwise, it follows from \(0\le \sum _{t\in V} y^{1}_{ijt} \le \sum _{t\in V} Z_{t}(c_{ij} + a_{ij} - y^{2}_{ijj}) = 0\). Then, \(\sum _{t\in V} y''_{ijt} \le c_{ij}\) because \(0\le \sum _{t\in V} Z_{t}c_{ij} \le c_{ij}\), where the second inequality holds because \(\sum _{t\in V}Z_t \le 1\).

  • C.2 Now, consider \(\langle k,l \rangle \ne \langle i,j \rangle \). We have \(0\le \sum _{t\in V}y''_{klt} = \sum _{t\in V} (y^{1}_{klt} + Z_{t} y^{2}_{klj}) \le \sum _{t\in V}y^{1}_{klt} + y^{2}_{klj} \le c_{kl}\), where, the first equality is from (K.3), the first inequality is because \(Z_t\), \(y^1\), and \(y^2\) are non-negative, the second inequality is because \(y^{2}_{klj} \ge 0\) and \(\sum _{t\in V}Z_{t} \le 1\), and the last inequality follows from (K.1).

Finally, \(y''\) satisfies flow balance equations in \(\mathop {\text {MLU}}(x)\) because it is defined in (K.3) by adding a circulation to \(y^1\) which services d. \(\square \)
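The algebra in C.1 and C.2 (together with the componentwise nonnegativity of \(y''\), which follows from \(y''_{ijt} = y^1_{ijt}\,c_{ij}/(c_{ij}+a_{ij}-y^2_{ijj})\)) can be stress-tested numerically. The sketch below (our code; arcs, nodes, and all data are hypothetical and randomly generated so that the premises (K.1) and \(a_{ij}\ge y^2_{ijj}\) hold) builds \(Z_t\) and \(y''\) as in (K.2)–(K.3) and checks the capacity bounds:

```python
import random

random.seed(0)
V = range(4)                       # hypothetical commodity/node set
arcs = [(0, 1), (1, 2), (0, 2)]    # hypothetical arcs; slack a lives on (i, j)
i, j = 0, 1
checks = 0

for _ in range(100):
    c = {e: random.uniform(1.0, 5.0) for e in arcs}
    a_ij = random.uniform(0.0, 2.0)
    y2 = {e: random.uniform(0.0, 0.5) for e in arcs}   # y^2_{kl j}: routes a
    y2[(i, j)] = random.uniform(0.0, a_ij)             # ensures a_ij >= y^2_{ijj}
    y1 = {}                                            # y^1: routes d, obeys (K.1)
    for e in arcs:
        cap = c[e] + (a_ij if e == (i, j) else 0.0) - y2[e]
        raw = [random.uniform(0.0, 1.0) for _ in V]
        scale = cap * random.uniform(0.0, 1.0) / sum(raw)
        for t in V:
            y1[e, t] = raw[t] * scale
    denom = c[(i, j)] + a_ij - y2[(i, j)]
    Z = {t: y1[(i, j), t] / denom if denom > 0 else 0.0 for t in V}  # (K.2)
    assert sum(Z.values()) <= 1.0 + 1e-9
    for e in arcs:
        # (K.3): y'' = y^1 + Z_t * y^2_{klj} - Z_t * a_ij on the slack arc
        ypp = {t: y1[e, t] + Z[t] * y2[e] - Z[t] * a_ij * (e == (i, j))
               for t in V}
        assert all(v >= -1e-9 for v in ypp.values())   # y'' stays nonnegative
        assert sum(ypp.values()) <= c[e] + 1e-9        # capacity holds (C.1/C.2)
        checks += 1
```

This is only a numerical illustration of the inequalities, not a substitute for the flow-balance argument, which relies on \(y^2\) being a circulation-free routing of a.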

Formulation of Gen-R3

For a directed arc e from i to j, we write \(\text {tail}(e)\) to represent i and \(\text {head}(e)\) to represent j. For a node j and commodity t, we write \(ex'(r,j,t)\) to represent \(\sum _{e\in E:\text {tail}(e)=j} r_{et} - \sum _{e\in E:\text {head}(e)=j} r_{et}\). Then Gen-R3 is [9]:

$$\begin{aligned} \text {Gen-R3:}&\min _{r,p,a}&\quad&U&\quad&\end{aligned}$$
(L.1a)
$$\begin{aligned}&\sum _{t\in V}r_{et} + \sum _{l\in E}p_{el}x_{l} \le U c_e(1-x_e) + a_ex_e&\forall e \in E, \forall x\in {\mathscr {X}}_{\mathfrak {b}}\end{aligned}$$
(L.1b)
$$\begin{aligned}&ex'(r,j,t) = d_{jt} - \sum _{i\in V}d_{it}\delta _{j=t}&\forall j,t\in V\end{aligned}$$
(L.1c)
$$\begin{aligned}&ex'(p,j,l) = a_l\delta _{\text {tail}(l)=j} - a_l\delta _{\text {head}(l)=j}&j\in V, l\in E\end{aligned}$$
(L.1d)
$$\begin{aligned}&r_{et},p_{el}\ge 0&e,l\in E, t\in V. \end{aligned}$$
(L.1e)

Here, \(r_{et}\) is the traffic on link e destined to t, \(p_{el}\) is the amount of traffic on link l that is bypassed on e when l fails, and \(a_e\) is the reservation to bypass traffic on link e.

We solve Gen-R3 with \({\mathfrak {b}}=1\) in \({\mathscr {X}}_{\mathfrak {b}}\), i.e., for \({\mathscr {X}}_1\) in (L.1b). Then, using the obtained \((r^*,p^*,a^*)\), the G-cuts are the negation of constraint (L.1b) with U fixed to one, i.e.,

$$\begin{aligned} \sum _{t\in V}r^{*}_{et} + \sum _{l\in E}p^{*}_{el}x_{l} > c_e(1-x_e) + a^*_e x_e \text { for } e \in E \text { and } x \in {\mathscr {X}}_{\mathfrak {b}}. \end{aligned}$$
(L.2)

Constraint (L.2) can be used to outer-approximate the set of scenarios in \({\mathscr {X}}_{\mathfrak {b}}\) where MLU exceeds 1.
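As a usage sketch, checking whether a given failure scenario x violates (L.2) for some link is a direct evaluation. The helper below (our naming; the toy data is hypothetical, not from the paper's experiments) returns the links whose load exceeds the right hand side:

```python
def violated_g_cuts(x, r, p, a, c):
    """Links e whose load under scenario x exceeds the RHS of (L.2) with U = 1.

    r[e][t]: traffic on link e destined to t;  p[e][l]: traffic of failed
    link l bypassed on e;  a[e]: bypass reservation;  c[e]: capacity.
    """
    cuts = []
    for e in c:
        load = (sum(r.get(e, {}).values())
                + sum(p.get(e, {}).get(l, 0.0) * x[l] for l in x))
        if load > c[e] * (1 - x[e]) + a[e] * x[e]:
            cuts.append(e)
    return cuts

# Toy instance: link 'e1' overloads when 'e2' fails and bypass traffic arrives.
r = {'e1': {'t1': 2.0}, 'e2': {'t1': 1.0}}
p = {'e1': {'e2': 3.0}}
a = {'e1': 0.5, 'e2': 2.0}
c = {'e1': 4.0, 'e2': 4.0}
assert violated_g_cuts({'e1': 0, 'e2': 1}, r, p, a, c) == ['e1']
assert violated_g_cuts({'e1': 0, 'e2': 0}, r, p, a, c) == []
```

Enumerating the scenarios in \({\mathscr {X}}_{\mathfrak {b}}\) flagged by such a check yields the outer approximation described above.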

Cite this article

Chandra, A., Tawarmalani, M. Probability estimation via policy restrictions, convexification, and approximate sampling. Math. Program. 196, 309–345 (2022). https://doi.org/10.1007/s10107-022-01823-6
