Annals of the Institute of Statistical Mathematics, Volume 72, Issue 1, pp 297–328

# Asymptotic theory of the adaptive Sparse Group Lasso

## Abstract

We study the asymptotic properties of a new version of the Sparse Group Lasso estimator (SGL), called adaptive SGL. This new version includes two distinct regularization parameters, one for the Lasso penalty and one for the Group Lasso penalty, and we consider the adaptive version of this regularization, where both penalties are weighted by preliminary random coefficients. The asymptotic properties are established in a general framework, where the data are dependent and the loss function is convex. We prove that this estimator satisfies the oracle property: the sparsity-based estimator recovers the true underlying sparse model and is asymptotically normally distributed. We also study its asymptotic properties in a double-asymptotic framework, where the number of parameters diverges with the sample size. We show by simulations and on real data that the adaptive SGL outperforms other oracle-like methods in terms of estimation precision and variable selection.
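The penalty described in the abstract can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the function names, the argument layout, the inverse-power form of the weights, and the small `eps` safeguard are all hypothetical choices made here. The abstract only says that the Lasso and Group Lasso penalties have separate regularization parameters and are weighted by preliminary random coefficients.

```python
import math


def adaptive_sgl_penalty(theta, groups, lam, gamma, lasso_weights, group_weights):
    """Illustrative adaptive Sparse Group Lasso penalty (assumed form).

    theta         : list of coefficients
    groups        : list of index lists that partition theta
    lam, gamma    : regularization parameters for the Lasso and Group Lasso terms
    lasso_weights : one non-negative weight per coefficient
    group_weights : one non-negative weight per group
    """
    # Weighted l1 term: acts on individual coefficients.
    l1_term = sum(w * abs(t) for w, t in zip(lasso_weights, theta))
    # Weighted sum of group-wise Euclidean norms: acts on whole groups.
    group_term = sum(
        w * math.sqrt(sum(theta[i] ** 2 for i in g))
        for w, g in zip(group_weights, groups)
    )
    return lam * l1_term + gamma * group_term


def adaptive_weights(theta_init, groups, power=1.0, eps=1e-8):
    """Weights built from a preliminary estimate (assumed inverse-power form).

    eps is a hypothetical numerical safeguard against division by zero.
    """
    lasso_w = [1.0 / (abs(t) + eps) ** power for t in theta_init]
    group_w = [
        1.0 / (math.sqrt(sum(theta_init[i] ** 2 for i in g)) + eps) ** power
        for g in groups
    ]
    return lasso_w, group_w
```

With weights of this kind, coefficients and groups that a preliminary estimate places near zero receive a large penalty, while large ones are penalized lightly. This is the usual intuition behind adaptive penalties and oracle-type results.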

## Keywords

Asymptotic normality, Consistency, Oracle property

## Supplementary material

Supplementary material 1: 10463_2018_692_MOESM1_ESM.pdf (PDF, 279 KB)
