Abstract
A new data science tool named wavelet-based gradient boosting is proposed and tested. The approach is a special case of componentwise linear least squares gradient boosting and involves wavelet functions of the original predictors. Wavelet-based gradient boosting takes advantage of the approximate \(\ell _1\) penalization induced by gradient boosting to give appropriately penalized additive fits. The method is readily implemented in R and produces parsimonious and interpretable regression fits and classifiers.
References
Binder, H., Tutz, G.: A comparison of methods for the fitting of generalized additive models. Stat. Comput. 18, 87–99 (2008)
Bühlmann, P.: Boosting for high-dimensional linear models. Ann. Stat. 34, 559–583 (2006)
Bühlmann, P., Yu, B.: Sparse boosting. J. Mach. Learn. Res. 7, 1001–1024 (2006)
Bühlmann, P., Hothorn, T.: Boosting algorithms: regularization, prediction and model fitting (with discussion). Stat. Sci. 22, 477–522 (2007)
Donoho, D.L.: De-noising by soft-thresholding. IEEE Trans. Inf. Theor. 41, 613–627 (1995)
Donoho, D.L., Johnstone, I.M.: Ideal spatial adaptation by wavelet shrinkage. Biometrika 81, 425–456 (1994)
Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32, 407–451 (2004)
Friedman, J.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001)
Hansen, M.H., Yu, B.: Model selection and the principle of minimum description length. J. Am. Stat. Assoc. 96, 746–774 (2001)
Hastie, T.: Comment on paper by Bühlmann & Hothorn. Stat. Sci. 22, 513–515 (2007)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2nd edn. Springer, New York (2009)
Hothorn, T., Bühlmann, P., Kneib, T., Schmid, M., Hofner, B.: mboost 2.2. Model-based boosting. R package (2011). http://cran.r-project.org
Hurvich, C.M., Simonoff, J.S., Tsai, C.: Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. J. R. Stat. Soc. B 60, 271–293 (1998)
Hyndman, R.J.: hdrcde 2.15. Highest density regions and conditional density estimation. R package (2010). http://cran.r-project.org
Leitenstorfer, F., Tutz, G.: Knot selection by boosting techniques. Comput. Stat. Data Anal. 51, 4605–4621 (2007)
Nason, G.P.: Wavelet Methods in Statistics with R. Springer, New York (2008)
Nason, G.P.: wavethresh 4.5. Wavelets statistics and transforms. R package (2010). http://cran.r-project.org
R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0 (2012). http://www.R-project.org
Ridgeway, G.: gbm 1.6. Generalized boosted regression models. R package (2012). http://cran.r-project.org
Samworth, R.J., Wand, M.P.: Asymptotics and optimal bandwidth selection for highest density region estimation. Ann. Stat. 38, 1767–1792 (2010)
Vidakovic, B.: Statistical Modeling by Wavelets. Wiley, New York (1999)
Wand, M.P., Jones, M.C.: Kernel Smoothing. Chapman and Hall, London (1995)
Wand, M.P., Ormerod, J.T.: Penalized wavelets: embedding wavelets into semiparametric regression. Electron. J. Stat. 5, 1654–1717 (2011)
Zou, H., Hastie, T., Tibshirani, R.: On the “degrees of freedom” of the lasso. Ann. Stat. 5, 2173–2192 (2007)
Acknowledgments
We are grateful to Andrew Chernih for his provision of the Sydney residential property price data and to Peter Green for his comments on aspects of this research. Partial support was provided by Australian Research Council Discovery Project DP0877055. Assistance from the University of Technology, Sydney’s Distinguished Visitor programme is gratefully acknowledged.
Appendix: Highest-density region grids
We now provide details of the highest density region (HDR) grids used in Figures 3 and 5.
Let \(\varvec{x}=(x_1,\ldots ,x_n)\) be a generic univariate sample and \({\widehat{p}}\) be a probability density estimate based on \(\varvec{x}\). Then a \(100(1-\tau )\%\) highest-density region estimate is
$${\widehat{R}}_{\tau }=\{x:{\widehat{p}}(x)\ge {\widehat{p}}_{\tau }\},$$
where \({\widehat{p}}_{\tau }\) is chosen so that the probability mass of \({\widehat{p}}\) over the set \({\widehat{R}}_{\tau }\) does not exceed \(1-\tau \). See, for example, Samworth and Wand (2010) for a precise mathematical definition of \({\widehat{p}}_{\tau }\).
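A common empirical device for the threshold (not necessarily the exact rule used here; see Samworth and Wand 2010 for the precise definition) is to take \({\widehat{p}}_{\tau }\) as the \(\tau \)-quantile of the density values at the sample points, so that about \(100(1-\tau )\%\) of the sample falls where the density is at least \({\widehat{p}}_{\tau }\). A minimal Python sketch of this device, with illustrative function names (the paper's computations are in R):

```python
import numpy as np

def hdr_threshold(density_at_sample, tau):
    """Empirical HDR threshold: the tau-quantile of the density values
    evaluated at the sample points, so that roughly 100(1-tau)% of the
    sample lies where the density is at least this threshold."""
    return np.quantile(density_at_sample, tau)

def hdr_indicator(density_at_x, p_tau):
    """Membership of points in the estimated HDR {x : p_hat(x) >= p_tau}."""
    return density_at_x >= p_tau
```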
The most commonly used estimator \({\widehat{p}}\) for HDR estimation is the kernel density estimator
$${\widehat{p}}(x)=\frac{1}{nh}\sum _{i=1}^{n}K\!\left( \frac{x-x_{i}}{h}\right) ,$$
where \(K\) is a kernel function and \(h>0\) is a bandwidth (see e.g. Wand and Jones 1995). Recently, Samworth and Wand (2010) devised an automatic rule for selecting \(h\) in the HDR estimation context. The R package hdrcde (Hyndman 2010) implements both HDR estimation and the Samworth-Wand bandwidth selector. Figure 8 shows the 80% HDR estimate for the distance to coastline variable in the Sydney residential property price data. The corresponding HDR grid of size 50 is shown at the base of the plot.
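The pieces above can be combined into a short Python sketch that produces an equally spaced HDR grid of the kind shown at the base of the plot. This is illustrative only: the paper uses R and the hdrcde package, Silverman's rule-of-thumb bandwidth stands in for the Samworth-Wand selector, and the grid simply spans the HDR's extent, which assumes the estimated region is a single interval:

```python
import numpy as np

def kde(x_grid, sample, h):
    """Gaussian kernel density estimate p_hat(x) = (1/(n h)) sum_i K((x - x_i)/h)."""
    u = (x_grid[:, None] - sample[None, :]) / h
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return K.mean(axis=1) / h

def hdr_grid(sample, tau=0.2, grid_size=50, h=None):
    """Grid of `grid_size` equally spaced points spanning the estimated
    100(1-tau)% HDR. Silverman's rule-of-thumb bandwidth is a stand-in
    for the Samworth-Wand selector; a disconnected HDR would need a
    separate grid per component."""
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    if h is None:
        h = 1.06 * np.std(sample) * n ** (-1 / 5)
    # Empirical threshold: tau-quantile of the density at the sample points.
    p_tau = np.quantile(kde(sample, sample, h), tau)
    x = np.linspace(sample.min(), sample.max(), 512)
    inside = x[kde(x, sample, h) >= p_tau]
    return np.linspace(inside.min(), inside.max(), grid_size)
```

With the default `tau=0.2`, the returned grid covers (approximately) the 80% HDR, matching the setting of Figure 8.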
Cite this article
Dubossarsky, E., Friedman, J.H., Ormerod, J.T. et al. Wavelet-based gradient boosting. Stat Comput 26, 93–105 (2016). https://doi.org/10.1007/s11222-014-9474-0