Abstract
“Localization” has proven to be a valuable tool in the statistical learning literature, as it allows for sharp risk bounds in terms of the problem geometry. Localized bounds seem to be much less exploited in the stochastic optimization literature. In addition, both communities have a clear interest in risk bounds that require only weak moment assumptions, i.e., that accommodate “heavier tails”. In this work we use a localization toolbox to derive risk bounds in two specific applications. The first is portfolio risk minimization with conditional value-at-risk constraints. We consider a setting where, among all assets with high returns, there is a portion of dimension g, unknown to the investor, that carries significantly less risk than the remaining portion. Our rates for the SAA problem show that the “risk inflation” caused by a multiplicative factor affects the statistical rate only through a term proportional to g. As the “normalized risk” increases, the contribution to the rate from the extrinsic dimension diminishes, while the dependence on g remains fixed. Localization is the key tool in establishing this property. As a second application of our localization toolbox, we obtain sharp oracle inequalities for least-squares estimators with a Lasso-type constraint under weak moment assumptions. One main consequence of these inequalities is persistence, as posed by Greenshtein and Ritov, with covariates having heavier tails. This improves on prior work of Bartlett, Mendelson and Neeman.
Notes
In fact, Corollary 2.5 in [29] covers only the case where \(\varvec{\Sigma }\) is the identity matrix, but its arguments are based on VC dimension theory and are readily extendable to our setting. We omit such details.
References
Artstein, Z., Wets, R.J.-B.: Consistency of minimizers and the SLLN for stochastic programs. J. Convex Anal. 2, 1–17 (1995)
Bartlett, P., Bousquet, O., Mendelson, S.: Local Rademacher complexities. Ann. Stat. 33, 1497–1537 (2005)
Bartlett, P., Mendelson, S.: Empirical minimization. Probab. Theory Rel. Fields 135(3), 311–334 (2006)
Bartlett, P.L., Mendelson, S., Neeman, J.: \(\ell _1\)-regularized linear regression: persistence and oracle inequalities. Probab. Theory Relat. Fields 154, 193–224 (2012)
Bickel, P.J., Ritov, Y., Tsybakov, A.B.: Simultaneous analysis of the Lasso and Dantzig selector. Ann. Stat. 37(4), 1705–1732 (2009)
Bellec, P.C., Lecué, G., Tsybakov, A.B.: Slope meets lasso: improved oracle bounds and optimality. Ann. Stat. 46(6B), 3603–3642 (2018)
Bunea, F., Tsybakov, A.B., Wegkamp, M.H.: Sparsity oracle inequalities for the Lasso. Electron. J. Stat. 1, 169–194 (2007)
Bunea, F., Tsybakov, A.B., Wegkamp, M.H.: Aggregation for Gaussian regression. Ann. Stat. 35(4), 1674–1697 (2007)
Bunea, F., Tsybakov, A.B., Wegkamp, M.H.: Sparse density estimation with \(\ell _1\) penalties. In: Bshouty, N.H., Gentile, C. (Eds.) Learning Theory. COLT 2007. Lecture Notes in Computer Science, vol. 4539. Springer, Berlin (2007)
Bunea, F., Tsybakov, A.B., Wegkamp, M.H.: Aggregation and sparsity via \(\ell _1\)-penalized least squares. In: Lugosi, G., Simon, H.U. (Eds.) Learning Theory. COLT 2006. Lecture Notes in Computer Science, vol. 4005. Springer, Berlin (2006)
Bunea, F., Tsybakov, A.B., Wegkamp, M.H.: Aggregation for regression learning (2004). Preprint at arXiv:math/0410214
Candes, E., Tao, T.: The Dantzig selector: statistical estimation when p is much larger than n. Ann. Stat. 35(6), 2313–2351 (2007)
Dupačová, J., Wets, R.J.-B.: Asymptotic behavior of statistical estimators and of optimal solutions of stochastic optimization problems. Ann. Stat. 16(4), 1517–1549 (1988)
Guigues, V., Juditsky, A., Nemirovski, A.: Non-asymptotic confidence bounds for the optimal value of a stochastic program. Optim. Methods Softw. 32(5), 1033–1058 (2017)
Greenshtein, E.: Best subset selection, persistence in high-dimensional statistical learning and optimization under \(\ell _1\) constraint. Ann. Stat. 34(5), 2367–2386 (2006)
Greenshtein, E., Ritov, Y.: Persistence in high-dimensional linear predictor selection and the virtue of overparametrization. Bernoulli 10(6), 971–988 (2004)
Homem-de-Mello, T., Bayraksan, G.: Monte Carlo sampling-based methods for stochastic optimization. Surv. Oper. Res. Manag. Sci. 19, 56–85 (2014)
Iusem, A.N., Jofré, A., Thompson, P.: Incremental constraint projection methods for monotone stochastic variational inequalities. Math. Oper. Res. 44(1), 236–263 (2019)
Kim, S., Pasupathy, R., Henderson, S.G.: A guide to sample average approximation. In: Michael, Fu. (ed.) Handbook of Simulation Optimization, International Series in Operations Research & Management Science, vol. 216, pp. 207–243. Springer, New York (2015)
King, A.J., Rockafellar, R.T.: Asymptotic theory for solutions in statistical estimation and stochastic programming. Math. Oper. Res. 18, 148–162 (1993)
King, A.J., Wets, R.J.-B.: Epi-consistency of convex stochastic programs. Stoch. Stoch. Rep. 34, 83–92 (1991)
Koltchinskii, V.: Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems. Lecture Notes in Mathematics, vol. 2033. École d'Été de Probabilités de Saint-Flour. Springer, Berlin (2011)
Koltchinskii, V., Lounici, K., Tsybakov, A.B.: Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Ann. Stat. 39(5), 2302–2329 (2011)
Koltchinskii, V.: The Dantzig selector and sparsity oracle inequalities. Bernoulli 15(3), 799–828 (2009)
Koltchinskii, V.: Sparsity in penalized empirical risk minimization. Ann. Inst. H. Poincaré Probab. Stat. 45(1), 7–57 (2009)
Koltchinskii, V.: Sparse recovery in convex hulls via entropy penalization. Ann. Stat. 37(3), 1332–1359 (2009)
Koltchinskii, V.: Local Rademacher complexities and oracle inequalities in risk minimization. Ann. Stat. 34(6), 2593–2656 (2006)
Lecué, G., Mendelson, S.: General nonexact oracle inequalities for classes with subexponential envelope. Ann. Stat. 40(2), 832–860 (2012)
Lecué, G., Mendelson, S.: Sparse recovery under weak moment assumptions. J. Eur. Math. Soc. 19, 881–904 (2017)
Leng, C., Lin, Y., Wahba, G.: A note on the lasso and related procedures in model selection. Stat. Sin. 16, 1273–1284 (2006)
Lounici, K.: Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electron. J. Stat. 2, 90–102 (2008)
Meinshausen, N., Yu, B.: Lasso-type recovery of sparse representations for high-dimensional data. Ann. Stat. 37(1), 246–270 (2009)
Meinshausen, N.: Relaxed lasso. Comput. Stat. Data Anal. 52(1), 374–393 (2007)
Meinshausen, N., Bühlmann, P.: High-dimensional graphs and variable selection with the Lasso. Ann. Stat. 34(3), 1436–1462 (2006)
Oliveira, R.I.: The lower tail of random quadratic forms, with applications to ordinary least squares and restricted eigenvalue properties (2013), preprint at arXiv:1312.2903
Oliveira, R.I.: The lower tail of random quadratic forms with applications to ordinary least squares. Probab. Theory Relat. Fields 166, 1175–1194 (2016)
Oliveira, R.I., Thompson, P.: Sample average approximation with heavier tails I: non-asymptotic bounds with weak assumptions and stochastic constraints. Math. Program. (2022). https://doi.org/10.1007/s10107-022-01810-x
Pflug, G.C.: Asymptotic stochastic programs. Math. Oper. Res. 20, 769–789 (1995)
Panchenko, D.: Symmetrization approach to concentration inequalities for empirical processes. Ann. Probab. 31, 2068–2081 (2003)
Pang, J.-S.: Error bounds in mathematical programming. Math. Program. Ser. B 79(1), 299–332 (1997)
Pflug, G.C.: Stochastic programs and statistical data. Ann. Oper. Res. 85, 59–78 (1999)
Pflug, G.C.: Stochastic optimization and statistical inference. In: Ruszczyński, A., Shapiro, A. (eds.) Handbooks in OR & MS, vol. 10, pp. 427–482. Elsevier (2003)
Rockafellar, R.T., Uryasev, S.: Optimization of conditional value-at-risk. J. Risk 2(3), 21–41 (2000)
Römisch, W.: Stability of stochastic programming problems. In: Ruszczyński, A., Shapiro, A. (eds.) Handbooks in OR & MS, vol. 10, pp. 483–554. Elsevier (2003)
Shapiro, A.: Asymptotic properties of statistical estimators in stochastic programming. Ann. Stat. 17, 841–858 (1989)
Shapiro, A.: Asymptotic analysis of stochastic programs. Ann. Oper. Res. 30, 169–186 (1991)
Shapiro, A.: Monte Carlo sampling methods. In: Ruszczyński, A., Shapiro, A. (eds.) Handbooks in OR & MS, vol. 10, pp. 353–425. Elsevier (2003)
Shapiro, A., Dentcheva, D., Ruszczyński, A.: Lectures on Stochastic Programming: Modeling and Theory. MOS-SIAM Series on Optimization. SIAM, Philadelphia (2009)
Talagrand, M.: Sharper bounds for Gaussian and empirical processes. Ann. Probab. 22, 28–76 (1994)
Talagrand, M.: Upper and Lower Bounds for Stochastic Processes. Springer (2014)
Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1996)
van de Geer, S.A.: High-dimensional generalized linear models and the Lasso. Ann. Stat. 36(2), 614–645 (2008)
Zhang, C.-H., Huang, J.: The sparsity and the bias of the lasso selection in high-dimensional linear regression. Ann. Stat. 36(4), 1567–1594 (2008)
Zhao, P., Yu, B.: On model selection consistency of Lasso. J. Mach. Learn. Res. 7, 2541–2563 (2006)
Zhang, T.: Some sharp performance bounds for least squares regression with L1 regularization. Ann. Stat. 37(5A), 2109–2144 (2009)
Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101(476), 1418–1429 (2006)
Acknowledgements
Roberto I. Oliveira has been funded by FAPESP. Philip Thompson was funded by grant STAR-F.10005389.06.001 from the Krannert School of Management, Purdue University.
Appendix
Proof of Proposition 1
Given admissible sequences \(\{{\mathcal {A}}_{1,j}\}_{j\ge 0}\) and \(\{{\mathcal {A}}_{2,j}\}_{j\ge 0}\) for \({\mathcal {M}}_1\) and \({\mathcal {M}}_2\), one may define an admissible sequence \(\{{\mathcal {C}}_j\}_{j\ge 0}\) for \({\mathcal {M}}\) via:
It is easy to see that this is indeed admissible and moreover
Therefore,
or equivalently
The proof finishes when we note that \(\text {\textsf{diam}}({\mathcal {M}})^{\alpha }\le \gamma ^{(\alpha )}_2({\mathcal {M}})\) and take the infimum over admissible sequences. \(\square \)
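For the reader's convenience, the product construction used above can be sketched as follows. This is the standard construction from generic chaining, written under the usual convention that an admissible sequence \(\{{\mathcal {A}}_j\}_{j\ge 0}\) satisfies \(|{\mathcal {A}}_0|=1\) and \(|{\mathcal {A}}_j|\le 2^{2^j}\):

```latex
% Standard product construction (sketch): given admissible sequences
% for $\mathcal{M}_1$ and $\mathcal{M}_2$, set
\mathcal{C}_0 := \{\mathcal{M}\},\qquad
\mathcal{C}_j := \bigl\{A\times B \;:\; A\in\mathcal{A}_{1,j-1},\;
                 B\in\mathcal{A}_{2,j-1}\bigr\},\quad j\ge 1,
% which is admissible since
% $|\mathcal{C}_j| \le 2^{2^{j-1}}\cdot 2^{2^{j-1}} = 2^{2^j}$.
```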
We recall the following fundamental result due to Panchenko. It establishes a sub-Gaussian tail for the deviation of a heavy-tailed empirical process around its mean after a proper self-normalization by a random quantity \({{\widehat{V}}}\).
Theorem 4
(Panchenko’s inequality [39]) Let \({\mathcal {F}}\) be a finite family of measurable functions \(g:\Xi \rightarrow {\mathbb {R}}\) such that \({\textbf{P}}g^2(\cdot )<\infty \). Let also \(\{\xi _j\}_{j=1}^N\) and \(\{\eta _j\}_{j=1}^N\) be both i.i.d. samples drawn from a distribution \({\textbf{P}}\) over \(\Xi \) which are independent of each other. Define
Then, for all \(t>0\),
The following result is a direct consequence of Theorem 4 applied to the singleton class \({\mathcal {F}}:=\{g\}\). It provides a sub-Gaussian tail for any random variable with finite second moment in terms of its variance and empirical variance.
Lemma 8
(Sub-Gaussian tail for self-normalized sums) Suppose \(\{\xi _j\}_{j=1}^N\) is an i.i.d. sample of a distribution \({\textbf{P}}\) over \(\Xi \) and denote by \({{\widehat{{\textbf{P}}}}}\) the corresponding empirical distribution. Then for any measurable function \(g:\Xi \rightarrow {\mathbb {R}}\) satisfying \({\textbf{P}}g(\cdot )^2<\infty \) and for any \(t>0\),
Finally, we present a sub-Gaussian lower tail for nonnegative random variables.
Lemma 9
(Sub-Gaussian lower tail for nonnegative random variables) Let \(\{Z_j\}_{j=1}^N\) be i.i.d. nonnegative random variables. Assume \(a\in (1,2]\) and \(0<{\mathbb {E}}[Z_1^a]<\infty \). Then, for all \(\epsilon >0\),
Proof
Let \(\theta ,\epsilon >0\). By the usual “Bernstein trick”, we get
It is a simple calculus exercise to show that \( \forall x\ge 0,e^{-x}\le 1-x+\frac{x^a}{a}. \) Applying this with \(x:=\theta Z_1\), we obtain
where the second inequality follows from the relation \(1+x\le e^x\) for all \(x\in {\mathbb {R}}\). We plug this back into (57) and get, for all \(\theta >0\),
Since \(a\in (1,2]\), we may actually minimize the above bound over \(\theta >0\). The minimum is attained at \( \theta _*:=\left( \frac{\epsilon {\mathbb {E}}[Z_1]}{{\mathbb {E}}[Z_1^a]}\right) ^{\frac{1}{a-1}}. \) To finish the proof, we plug this into (58) and notice that
using that \(1+\frac{1}{a-1}=\frac{a}{a-1}\). \(\square \)
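As a quick sanity check of the two elementary facts used in the proof, the following sketch (with illustrative, assumed values for \(a\), \(\epsilon \) and the moments \({\mathbb {E}}[Z_1]\), \({\mathbb {E}}[Z_1^a]\)) verifies numerically that \(e^{-x}\le 1-x+\frac{x^a}{a}\) for \(x\ge 0\), and that \(\theta _*\) attains the closed-form minimum of the per-sample exponent:

```python
import math

# Illustrative, assumed values: a in (1, 2], epsilon > 0, and the
# moments mu = E[Z_1], mu_a = E[Z_1^a] of a nonnegative variable Z_1.
a, eps = 1.5, 0.3
mu, mu_a = 1.0, 2.0

# Elementary inequality used in the proof: exp(-x) <= 1 - x + x^a / a, x >= 0.
for x in [0.0, 0.1, 0.5, 1.0, 3.0, 10.0]:
    assert math.exp(-x) <= 1 - x + x**a / a + 1e-12

# Per-sample exponent after the "Bernstein trick":
#   f(theta) = -theta * eps * mu + theta^a * mu_a / a,
# which is convex in theta > 0 since a > 1, so the stationary point
#   theta_* = (eps * mu / mu_a)^(1/(a-1))
# is the global minimizer.
f = lambda t: -t * eps * mu + t**a * mu_a / a
theta_star = (eps * mu / mu_a) ** (1.0 / (a - 1.0))
for t in [0.5 * theta_star, 0.9 * theta_star, 1.1 * theta_star, 2.0 * theta_star]:
    assert f(theta_star) <= f(t)

# Closed form of the minimum, using 1 + 1/(a-1) = a/(a-1):
#   f(theta_*) = -(1 - 1/a) * (eps * mu)^(a/(a-1)) / mu_a^(1/(a-1)).
closed = -(1.0 - 1.0 / a) * (eps * mu) ** (a / (a - 1.0)) / mu_a ** (1.0 / (a - 1.0))
assert abs(f(theta_star) - closed) < 1e-12
```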
Oliveira, R.I., Thompson, P. Sample average approximation with heavier tails II: localization in stochastic convex optimization and persistence results for the Lasso. Math. Program. 199, 49–86 (2023). https://doi.org/10.1007/s10107-023-01940-w