Abstract
The histogram estimator of a discrete probability mass function often exhibits undesirable properties related to zero probability estimation both within the observed range of counts and outside into the tails of the distribution. To circumvent this, we formulate a novel second-order discrete kernel smoother based on the recently developed mean-parametrized Conway–Maxwell–Poisson distribution which allows for both over- and under-dispersion. Two automated bandwidth selection approaches, one based on a simple minimization of the Kullback–Leibler divergence and another based on a more computationally demanding cross-validation criterion, are introduced. Both methods exhibit excellent small and large sample performance. Computational results on simulated datasets from a range of target distributions illustrate the flexibility and accuracy of the proposed method compared to existing smoothed and unsmoothed estimators. The method is applied to the modelling of somite counts in earthworms, and the number of development days of insect pests on the Hura tree.
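To make the contrast between the histogram estimator and a kernel-smoothed estimator concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes one plausible form of the smoother, in which each observation contributes a Conway–Maxwell–Poisson kernel whose mean equals that observation, and a single dispersion parameter `nu` acts as the bandwidth. The function names, the truncation of the support at `support_max`, and the treatment of a zero count as a point mass at zero are all assumptions made for illustration. The bandwidth selection criteria mentioned in the abstract are not reproduced.

```python
import math


def cmp_pmf(mu, nu, support_max):
    """Mean-parametrized CMP pmf on 0..support_max (truncated support).

    P(y) is proportional to lam**y / (y!)**nu, with lam solved numerically
    so that the mean on the truncated support equals mu.
    """
    if mu <= 0:
        # Assumption: a mean of zero is treated as a point mass at zero.
        return [1.0] + [0.0] * support_max
    if mu >= support_max:
        raise ValueError("mu must be smaller than support_max")

    ys = range(support_max + 1)
    log_fact = [math.lgamma(y + 1) for y in ys]

    def pmf_for(log_lam):
        # Work on the log scale and subtract the maximum for stability.
        logw = [y * log_lam - nu * lf for y, lf in zip(ys, log_fact)]
        m = max(logw)
        w = [math.exp(v - m) for v in logw]
        total = sum(w)
        return [v / total for v in w]

    def mean_for(log_lam):
        return sum(y * p for y, p in zip(ys, pmf_for(log_lam)))

    # The mean is increasing in log(lam), so bracket the root and bisect.
    lo, hi = -1.0, 1.0
    while mean_for(lo) > mu:
        lo *= 2
    while mean_for(hi) < mu:
        hi *= 2
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mean_for(mid) < mu:
            lo = mid
        else:
            hi = mid
    return pmf_for(0.5 * (lo + hi))


def histogram_pmf(data, support_max):
    """Unsmoothed relative-frequency estimate on 0..support_max."""
    est = [0.0] * (support_max + 1)
    for x in data:
        est[x] += 1.0 / len(data)
    return est


def cmp_kernel_smoother(data, nu, support_max):
    """Average of CMP kernels, one centred (in mean) at each observation."""
    est = [0.0] * (support_max + 1)
    cache = {}
    for x in data:
        if x not in cache:
            cache[x] = cmp_pmf(x, nu, support_max)
        for y, p in enumerate(cache[x]):
            est[y] += p / len(data)
    return est


# Hypothetical toy data: the count 4 is never observed.
data = [2, 3, 3, 5, 8]
hist = histogram_pmf(data, support_max=30)
smoothed = cmp_kernel_smoother(data, nu=2.0, support_max=30)
```

In this sketch the histogram assigns probability zero to the unobserved count 4, while the smoothed estimate assigns it positive mass and still sums to one. Values of `nu` above 1 give kernels that are underdispersed relative to the Poisson, and values below 1 give overdispersed kernels.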
Acknowledgements
We thank Prof. Dirk Kroese (UQ) and an anonymous reviewer for helpful comments that improved this paper.
Cite this article
Huang, A., Sippel, L. & Fung, T. Consistent second-order discrete kernel smoothing using dispersed Conway–Maxwell–Poisson kernels. Comput Stat 37, 551–563 (2022). https://doi.org/10.1007/s00180-021-01144-w