Consistent Model Combination of Lasso via Regularization Path

  • Mei Wang
  • Yingqi Sun
  • Erlong Yang
  • Kaoping Song
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 662)


It is well known that model combination can improve the prediction performance of regression models. In this paper we investigate model combination for the Lasso along its regularization path. We first define the prediction risk of the Lasso estimator and prove that the Lasso regularization path contains at least one prediction-consistent estimator. We then establish prediction consistency for convex combinations of Lasso estimators, which provides the mathematical justification for combining Lasso models on the regularization path. Exploiting the inherent piecewise linearity of the Lasso regularization path, we construct an initial candidate model set and then select the models to combine using the Occam’s Window method. Finally, we combine the selected models by Bayesian model averaging. Theoretical analysis and experimental results suggest that the proposed method is feasible.
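
The pipeline summarized above (candidate models from the regularization path, selection with Occam’s Window, combination by Bayesian model averaging) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the exact piecewise-linear path is replaced by a geometric grid of regularization values solved by coordinate descent, posterior model probabilities are approximated with BIC using the number of nonzero coefficients as degrees of freedom, and the window ratio of 20 is a hypothetical default. The names `lasso_cd` and `combine_on_path` are likewise hypothetical.

```python
import math
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso_cd(X, y, lam, beta=None, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = list(beta) if beta is not None else [0.0] * p
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def predict(X, beta):
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


def combine_on_path(X, y, n_lambdas=20, window=20.0):
    """Return BMA-averaged coefficients and the weights of the retained models."""
    n, p = len(X), len(X[0])
    # Smallest lam at which all coefficients are zero (start of the path).
    lam_max = max(abs(sum(X[i][j] * y[i] for i in range(n))) / n for j in range(p))
    # Assumption: a geometric grid stands in for the exact kinks of the path.
    lams = [lam_max * 0.01 ** (k / (n_lambdas - 1)) for k in range(n_lambdas)]
    models, bics, beta = [], [], None
    for lam in lams:
        beta = lasso_cd(X, y, lam, beta)  # warm start from the previous solution
        rss = sum((yi - fi) ** 2 for yi, fi in zip(y, predict(X, beta)))
        df = sum(1 for b in beta if b != 0.0)  # assumed degrees of freedom
        bics.append(n * math.log(max(rss, 1e-12) / n) + df * math.log(n))
        models.append(beta)
    # Assumption: posterior model probability is proportional to exp(-BIC/2).
    best = min(bics)
    rel = [math.exp(-0.5 * (b - best)) for b in bics]
    # Occam's Window: drop models far less probable than the best one.
    keep = [k for k, r in enumerate(rel) if r >= 1.0 / window]
    total = sum(rel[k] for k in keep)
    weights = {k: rel[k] / total for k in keep}
    # Convex combination; for linear models this equals averaging predictions.
    beta_avg = [sum(w * models[k][j] for k, w in weights.items()) for j in range(p)]
    return beta_avg, weights


# Synthetic demonstration data (not taken from the paper).
random.seed(0)
true_beta = [2.0, 0.0, -1.0, 0.0, 0.0]
X = [[random.gauss(0.0, 1.0) for _ in true_beta] for _ in range(60)]
y = [sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0.0, 0.1) for row in X]
beta_avg, weights = combine_on_path(X, y)
print("retained models:", sorted(weights))
print("averaged coefficients:", [round(b, 3) for b in beta_avg])
```

Because the retained weights are nonnegative and sum to one, the result is a convex combination of Lasso estimators, the kind of combination whose prediction consistency the abstract refers to. An implementation that follows the exact path would take its candidate models from the kinks of the piecewise-linear path instead of a grid.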


Model combination, Lasso, Prediction consistency, Regularization path, Occam’s Window



Project supported by the Natural Science Foundation of Heilongjiang Province (No. F2015020, E2016008), the Beijing Postdoctoral Research Foundation (No. 2015ZZ-120), the Chaoyang District of Beijing Postdoctoral Foundation (No. 2014ZZ-14), the Natural Science Foundation of China (No. 51104030, 61170019), and the Northeast Petroleum University Cultivation Foundation (No. XN2014102).



Copyright information

© Springer Nature Singapore Pte Ltd. 2016

Authors and Affiliations

  • Mei Wang (1, 2)
  • Yingqi Sun (1)
  • Erlong Yang (3)
  • Kaoping Song (2, 3)
  1. School of Computer and Information Technology, Northeast Petroleum University, Daqing, China
  2. Center for Post-Doctoral Studies of Beijing Deweijiaye Science and Technology Ltd., Beijing, China
  3. Key Laboratory of Enhanced Oil Recovery in Ministry of Education, Northeast Petroleum University, Daqing, China
