Abstract
In this paper, we prove that two multiplicative bias correction (MBC) techniques can be applied to discrete kernels in the context of probability mass function estimation. First, some properties of the MBC discrete kernel estimators (bias, variance and mean integrated squared error) are investigated. Second, the popular cross-validation technique is adapted for bandwidth selection. Finally, a simulation study and a real-data application illustrate the performance of the MBC estimators based on the Dirac discrete uniform and discrete triangular kernels.
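As a concrete illustration of the approach summarised above, the following is a minimal Python sketch of a Terrell–Scott-type MBC estimator built on the discrete triangular kernel of Kokonendji et al. (2007), with the exponent construction of Hirukawa (2010). All function names, the kernel arm `a = 2`, the constant `c = 0.5` and the fixed bandwidth are illustrative assumptions, not the authors' code; in the paper the bandwidth is selected by cross-validation.

```python
import numpy as np

def dt_kernel(x, y, a=2, h=0.1):
    # Discrete triangular kernel DT(a; x, h) of Kokonendji et al. (2007):
    # weight proportional to (a+1)^h - |y - x|^h on the support {x-a, ..., x+a},
    # normalised so that it sums to 1 over that support.
    norm = (2*a + 1)*(a + 1)**h - 2*sum(k**h for k in range(1, a + 1))
    w = np.where(np.abs(y - x) <= a, (a + 1)**h - np.abs(y - x)**h, 0.0)
    return w / norm

def pmf_hat(x, sample, h, a=2):
    # Standard (uncorrected) discrete-kernel pmf estimator f_hat_h(x).
    return np.mean(dt_kernel(x, np.asarray(sample), a, h))

def pmf_ts(x, sample, h, c=0.5, a=2):
    # Terrell-Scott-type multiplicative bias correction (Hirukawa 2010):
    # f_TS(x) = f_hat_h(x)^{1/(1-c)} * f_hat_{h/c}(x)^{-c/(1-c)},  0 < c < 1,
    # combining a small-bandwidth and a large-bandwidth estimate so that
    # their leading bias terms cancel.
    fh = pmf_hat(x, sample, h, a)
    fhc = pmf_hat(x, sample, h / c, a)
    return fh**(1.0/(1.0 - c)) * fhc**(-c/(1.0 - c))
```

Because the discrete triangular kernel is a genuine pmf in its argument, the uncorrected estimator sums to one over any integer grid covering the sample support, which gives a quick sanity check on an implementation.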
References
Aitchison J, Aitken CGG (1976) Multivariate binary discrimination by the kernel method. Biometrika 63:413–420
Belaid N, Adjabi S, Zougab N, Kokonendji CC (2016) Bayesian bandwidth selection in discrete multivariate associated kernel estimators for probability mass functions. J Korean Stat Soc 45:557–567
Chu CY, Henderson DJ, Parmeter CF (2015) Plug-in bandwidth selection for kernel density estimation with discrete data. Econometrics 3:199–214
Funke B, Kawka R (2015) Nonparametric density estimation for multivariate bounded data using two non-negative multiplicative bias correction methods. Comput Stat Data Anal 92:148–162
Greene W (2011) Econometric analysis. Pearson, Cambridge
Hirukawa M (2010) Nonparametric multiplicative bias correction for kernel-type density estimation on the unit interval. Comput Stat Data Anal 54:473–495
Hirukawa M, Sakudo M (2014) Nonnegative bias reduction methods for density estimation using asymmetric kernels. Comput Stat Data Anal 75:112–123
Hirukawa M, Sakudo M (2015) Family of the generalised gamma kernels: a generator of asymmetric kernels for nonnegative data. J Nonparametric Stat 27:41–63
Jones MC, Foster PJ (1993) Generalized jackknifing and higher order kernels. J Nonparametric Stat 3:81–94
Jones MC, Linton O, Nielsen JP (1995) A simple bias reduction method for density estimation. Biometrika 82:327–338
Kokonendji CC, Senga Kiessé T (2011) Discrete associated kernels method and extensions. Stat Methodol 8:497–516
Kokonendji CC, Senga Kiessé T, Zocchi SS (2007) Discrete triangular distributions and non-parametric estimation for probability mass function. J Nonparametric Stat 19:241–254
Kokonendji CC, Somé SM (2015) On multivariate associated kernels for smoothing general density function. arXiv:1502.01173
Racine JS, Li Q (2004) Nonparametric estimation of regression functions with both categorical and continuous data. J Econom 119:99–130
Senga Kiessé T, Mizère D (2012) Weighted Poisson and semiparametric kernel models applied for parasite growth. Aust N Z J Stat 55:1–13
Terrell GR, Scott DW (1980) On improving convergence rates for nonnegative kernel density estimators. Ann Stat 8:1160–1163
Wang MC, Van Ryzin J (1981) A class of smooth estimators for discrete distributions. Biometrika 68:301–309
Zougab N, Adjabi S (2015) Multiplicative bias correction for generalized Birnbaum–Saunders kernel density estimators and application to nonnegative heavy tailed data. J Korean Stat Soc 45:51–63
Acknowledgements
This research has been supported by the Unit of Research LAMOS of University of Bejaia. The authors thank the editor, an associate editor and anonymous referees for their valuable comments that allowed us to improve this article.
Appendix
We present a sketch of the proofs of Theorems 1 and 2 for the case where the discrete triangular kernel is used; the proofs for the other kernels proceed similarly.
1.1 Sketch of the proof of Theorem 1
1.1.1 Bias
First, note that \(\mathbb {E}\left( \widehat{f}_{DT,h}(x)\right) =\mathbb {E}(f(\mathcal {T}))\), where the random variable \(\mathcal {T}\sim DT(a;x,h)\). By using a fourth-order discrete Taylor expansion of \(f(\mathcal {T})\) around \(\mathcal {T}=x\), we have
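In the standard moment form used for discrete associated kernels (Kokonendji and Senga Kiessé 2011), this expansion reads, up to a remainder that is negligible under the assumptions on \(f\),

```latex
\mathbb {E}\left( \widehat{f}_{DT,h}(x)\right)
  \approx \sum _{k=0}^{4}\frac{f^{(k)}(x)}{k!}\,
  \mathbb {E}\left\{ (\mathcal {T}-x)^{k}\right\} ,
```

where \(f^{(k)}\) denotes the \(k\)th finite difference of \(f\). Since the discrete triangular kernel is symmetric, \(\mathbb {E}(\mathcal {T})=x\) and the odd central moments vanish, so the leading bias term involves \(\mathrm{Var}(\mathcal {T})\). This display is a hedged sketch of the standard form, not the paper's exact equation.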
By using the property of the discrete triangular random variable and a Taylor expansion around \(h=0\),
where
The Taylor expansion of \(I_{h}(x)\) around \(h=0\) is then given by
where
and
Similarly, \(I_{h/c}(x)=\mathbb {E}\left( \widehat{f}_{DT,h/c}(x)\right) \) can be approximated by
Now, we define
and
The estimator \(\tilde{f}_{TS,DT}\) can be written as follows:
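Explicitly, following the Terrell and Scott (1980) construction as adapted by Hirukawa (2010), this estimator takes the form

```latex
\tilde{f}_{TS,DT}(x)
  = \left\{ \widehat{f}_{DT,h}(x)\right\} ^{1/(1-c)}
    \left\{ \widehat{f}_{DT,h/c}(x)\right\} ^{-c/(1-c)},
  \qquad c\in (0,1),
```

the exponents being chosen so that the leading bias terms of the two estimators cancel; this is a hedged reconstruction of the standard form from the cited references.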
Using the expansion \((1+t)^{\alpha }=1+\alpha t+O(t^{2})\), we then have
Based on Assumption 2 and using the same calculations as in Hirukawa (2010) and Terrell and Scott (1980), we can show easily that
1.1.2 Variance
For the variance, from Eq. (4) we have
First, note that the terms \(\mathrm{Var}\left( \widehat{f}_{DT,h}(x)\right) \) and \(\mathrm{Var}\left( \widehat{f}_{DT,h/c}(x)\right) \) are given by [see Kokonendji et al. (2007)]:
and
Now,
Therefore, the variance of \(\tilde{f}_{TS,DT}(x)\) is given by
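To first order, linearizing the two factors (with exponents \(1/(1-c)\) and \(-c/(1-c)\), as in Hirukawa 2010) yields the approximation

```latex
\mathrm{Var}\left\{ \tilde{f}_{TS,DT}(x)\right\}
  \approx \frac{1}{(1-c)^{2}}
  \Big[ \mathrm{Var}\left\{ \widehat{f}_{DT,h}(x)\right\}
    + c^{2}\,\mathrm{Var}\left\{ \widehat{f}_{DT,h/c}(x)\right\}
    - 2c\,\mathrm{Cov}\left\{ \widehat{f}_{DT,h}(x),\widehat{f}_{DT,h/c}(x)\right\} \Big] ,
```

a hedged sketch of the standard delta-method expansion rather than the paper's exact display.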
which corresponds to the results in Theorem 1.
1.2 Sketch of the proof of Theorem 2
1.2.1 Bias
At first, the estimator \(\tilde{f}_{JLN,DT}\) can be written as [see Hirukawa (2010)]
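In the notation of Jones et al. (1995) and Hirukawa (2010), the corrected estimator presumably takes the multiplicative form

```latex
\tilde{f}_{JLN,DT}(x)=\widehat{f}_{DT}(x)\,\psi (x),
```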
where \(\psi (x)=n^{-1}\sum _{i=1}^{n}K_{DT,h}(x,X_{i})/\widehat{f}_{DT}(X_{i})\). Then, we have
By using Assumption 2 and the properties of DT random variables, the terms \(\mathbb {E}\left\{ \frac{\widehat{f}_{DT}(x)-f(x)}{f(x)}\right\} \), \(\mathbb {E}\left\{ \psi (x)-1\right\} \) and \(\mathbb {E}\left\{ \left( \frac{\widehat{f}_{DT}(x)-f(x)}{f(x)}\right) \left( \psi (x)-1\right) \right\} \) can be approximated following the same procedures as in Hirukawa (2010). Thus, \(\mathbb {E}\left\{ \tilde{f}_{JLN,DT}(x)\right\} \) is approximated by
where \(q(x)=l_{1}(x,f)/f(x)\) with \(l_{1}(x,f)\) given in the proof of Theorem 1.
1.2.2 Variance
Note that following Hirukawa (2010) and Jones et al. (1995), we can show that \(\tilde{f}_{JLN,DT}(x)\) is equivalent to
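Following the argument of Jones et al. (1995), the asymptotically equivalent form is the smoothed ratio

```latex
\tilde{f}_{JLN,DT}(x)\;\simeq \;
  \frac{1}{n}\sum _{i=1}^{n}\frac{f(x)}{f(X_{i})}\,K_{DT,h}(x,X_{i}),
```

whose leading variance coincides with that of the uncorrected estimator \(\widehat{f}_{DT,h}(x)\). The kernel-argument notation here is an assumption about the paper's convention, sketched from the cited references.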
It follows that
Therefore, we obtain the approximation for the variance given in Theorem 2. \(\square \)
Harfouche, L., Adjabi, S., Zougab, N. et al. Multiplicative bias correction for discrete kernels. Stat Methods Appl 27, 253–276 (2018). https://doi.org/10.1007/s10260-017-0395-x