Volume 26, Issue 3, pp 527–545

Bandwidth selection in kernel density estimation for interval-grouped data

  • Miguel Reyes
  • Mario Francisco-Fernández (corresponding author)
  • Ricardo Cao
Original Paper


Abstract

When interval-grouped data are available, the classical Parzen–Rosenblatt kernel density estimator has to be modified to obtain a computable and useful estimator in this context. The resulting nonparametric grouped-data estimator requires the choice of a smoothing parameter. In this paper, two different bandwidth selectors for this estimator are analyzed. A plug-in bandwidth selector is proposed and its relative rate of convergence is obtained. Additionally, a bootstrap algorithm to select the bandwidth in this framework is designed. This method is easy to implement and does not require Monte Carlo approximation. Both proposals are compared through simulations in different scenarios. When the sample size is medium or large and grouping is not heavy, both bandwidth selection methods perform similarly well. However, when the sample size is large and grouping is heavy, the bootstrap bandwidth selector leads to better results.
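To make the role of the smoothing parameter concrete, the following minimal sketch evaluates a kernel density estimate from interval counts alone. It assumes one common construction, in which each interval is represented by its midpoint and weighted by the proportion of the sample it contains; this is an illustrative assumption and may differ from the exact estimator studied in the paper. The function names, the Gaussian kernel, the example data, and the two bandwidth values are all hypothetical, and no bandwidth selector (plug-in or bootstrap) is implemented here.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def grouped_kde(x, edges, counts, h):
    """Kernel density estimate at x from interval-grouped data.

    edges  : k+1 increasing interval limits
    counts : k observed counts, one per interval
    h      : bandwidth (smoothing parameter), h > 0
    Assumption: each interval is represented by its midpoint, weighted
    by the proportion of the sample that fell into it.
    """
    if len(edges) != len(counts) + 1:
        raise ValueError("need exactly one more edge than counts")
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = sum(counts)
    total = 0.0
    for left, right, c in zip(edges[:-1], edges[1:], counts):
        midpoint = 0.5 * (left + right)
        total += (c / n) * gaussian_kernel((x - midpoint) / h)
    return total / h


# Hypothetical grouped sample: 5 intervals of width 1 on [0, 5].
edges = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
counts = [4, 11, 20, 10, 5]

# The same data evaluated with two hand-picked bandwidths.
for h in (0.3, 0.8):
    print(h, round(grouped_kde(2.5, edges, counts, h), 4))
```

Since only the interval limits and counts enter the estimate, the bandwidth has to smooth over the within-interval information that grouping has discarded, which is one reason its choice matters more as grouping becomes heavier.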


Keywords

Smoothing parameter selection, Plug-in bandwidth, Bootstrap bandwidth selector, Interval data

Mathematics Subject Classification

62G07 62N99 62G09 



This research has been partially supported by the Spanish Ministry of Science and Innovation, Grants MTM2011-22392 and MTM2014-52876-R, and Xunta de Galicia Grant CN2012/130. The authors thank two anonymous referees for numerous useful comments that significantly improved this article.

Supplementary material

Supplementary material 1: 11749_2017_523_MOESM1_ESM.pdf (PDF, 281 KB)



Copyright information

© Sociedad de Estadística e Investigación Operativa 2017

Authors and Affiliations

  1. Centro de Investigación en Matemáticas, De Jalisco S-N, Guanajuato, Mexico
  2. Research Group MODES, Departamento de Matemáticas, Facultad de Informática, Universidade da Coruña, Coruña, Spain
