Bagging of density estimators

  • Mathias Bourel
  • Jairo Cugliari
Original paper


In this work we propose new density estimators obtained by averaging classical density estimators, such as the histogram, the frequency polygon and the kernel density estimator, computed over different bootstrap samples of the original data. Using existing results, we prove the \(L^2\)-consistency of these new estimators and compare them with several similar approaches by simulation. Based on these estimators, we also give a way to construct non-parametric pointwise variability bands for the target density.
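The averaging idea described above can be illustrated with a minimal sketch. It is not the authors' exact algorithm. The helper names (`histogram_at`, `bagged_density`), the choice of equal-width bins spanning each bootstrap sample's own range, the number of bins, the number of bootstrap replicates and the crude empirical-quantile rule used for the band are all assumptions made for illustration.

```python
import random


def histogram_at(sample, n_bins, grid):
    """Evaluate an equal-width histogram density of `sample` at each grid point.

    Assumption: the bins span the range of `sample` itself.
    """
    lo, hi = min(sample), max(sample)
    if hi == lo:
        raise ValueError("sample has zero range")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in sample:
        # The maximum falls in the last bin rather than past it.
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    n = len(sample)
    values = []
    for t in grid:
        if t < lo or t > hi:
            values.append(0.0)
        else:
            k = min(int((t - lo) / width), n_bins - 1)
            values.append(counts[k] / (n * width))
    return values


def bagged_density(data, n_bins, grid, n_boot=200, level=0.90, seed=0):
    """Average histogram estimates over bootstrap samples of `data`.

    Returns the pointwise mean and a pointwise band built from empirical
    quantiles of the bootstrap estimates (a hypothetical band construction).
    """
    rng = random.Random(seed)
    n = len(data)
    curves = []
    for _ in range(n_boot):
        boot = [rng.choice(data) for _ in range(n)]  # resample with replacement
        curves.append(histogram_at(boot, n_bins, grid))
    alpha = (1.0 - level) / 2.0
    mean, lower, upper = [], [], []
    for column in zip(*curves):  # one column per grid point
        ordered = sorted(column)
        mean.append(sum(ordered) / n_boot)
        lower.append(ordered[int(alpha * (n_boot - 1))])
        upper.append(ordered[int((1.0 - alpha) * (n_boot - 1))])
    return mean, lower, upper


# Usage: standard normal data, estimates evaluated on a grid over [-4, 4].
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(300)]
grid = [-4.0 + 0.02 * i for i in range(401)]
mean, lower, upper = bagged_density(data, n_bins=15, grid=grid)
```

Because each bootstrap histogram places its bins over a slightly different range, the averaged curve is smoother than any single histogram, and the spread of the bootstrap curves at each grid point gives a rough picture of pointwise variability.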


Aggregation, Bagging, Density estimation, Histogram, Kernel density estimator, Polygon frequency



We would like to thank project ECOS-2014 Aprendizaje Automático para la Modelización y el Análisis de Recursos Naturales, no. U14E02, the LIA-IFUM and the ANII-Uruguay for their financial support.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. IMERL, Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay
  2. DMC, Facultad de Ciencias Económicas y Administración, Universidad de la República, Montevideo, Uruguay
  3. Laboratoire ERIC EA 3083, Université Lumière Lyon 2, Lyon, France
