Abstract
In this paper, we prove that two multiplicative bias correction (MBC) techniques can be applied to discrete kernels in the context of probability mass function estimation. First, some properties of the MBC discrete kernel estimators (bias, variance and mean integrated squared error) are investigated. Second, the popular cross-validation technique is adapted for bandwidth selection. Finally, a simulation study and a real-data application illustrate the performance of the MBC estimators based on the Dirac discrete uniform and discrete triangular kernels.
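As a concrete illustration of the approach summarised above, the following is a minimal Python sketch of a Terrell–Scott-type MBC estimator built on the discrete triangular kernel of Kokonendji et al. (2007), with the exponent construction of Hirukawa (2010). All function names, the kernel arm `a = 2`, the constant `c = 0.5` and the fixed bandwidth are illustrative assumptions, not the authors' code; in the paper the bandwidth is selected by cross-validation.

```python
import numpy as np

def dt_kernel(x, y, a=2, h=0.1):
    # Discrete triangular kernel DT(a; x, h) of Kokonendji et al. (2007):
    # weight proportional to (a+1)^h - |y - x|^h on the support {x-a, ..., x+a},
    # normalised so that it sums to 1 over that support.
    norm = (2*a + 1)*(a + 1)**h - 2*sum(k**h for k in range(1, a + 1))
    w = np.where(np.abs(y - x) <= a, (a + 1)**h - np.abs(y - x)**h, 0.0)
    return w / norm

def pmf_hat(x, sample, h, a=2):
    # Standard (uncorrected) discrete-kernel pmf estimator f_hat_h(x).
    return np.mean(dt_kernel(x, np.asarray(sample), a, h))

def pmf_ts(x, sample, h, c=0.5, a=2):
    # Terrell-Scott-type multiplicative bias correction (Hirukawa 2010):
    # f_TS(x) = f_hat_h(x)^{1/(1-c)} * f_hat_{h/c}(x)^{-c/(1-c)},  0 < c < 1,
    # combining a small-bandwidth and a large-bandwidth estimate so that
    # their leading bias terms cancel.
    fh = pmf_hat(x, sample, h, a)
    fhc = pmf_hat(x, sample, h / c, a)
    return fh**(1.0/(1.0 - c)) * fhc**(-c/(1.0 - c))
```

Because the discrete triangular kernel is a genuine pmf in its argument, the uncorrected estimator sums to one over any integer grid covering the sample support, which gives a quick sanity check on an implementation.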
References
Aitchison J, Aitken CGG (1976) Multivariate binary discrimination by the kernel method. Biometrika 63:413–420
Belaid N, Adjabi S, Zougab N, Kokonendji CC (2016) Bayesian bandwidth selection in discrete multivariate associated kernel estimators for probability mass functions. J Korean Stat Soc 45:557–567
Chu CY, Henderson DJ, Parmeter CF (2015) Plug-in bandwidth selection for kernel density estimation with discrete data. Econometrics 3:199–214
Funke B, Kawka R (2015) Nonparametric density estimation for multivariate bounded data using two non-negative multiplicative bias correction methods. Comput Stat Data Anal 92:148–162
Greene W (2011) Econometric analysis. Pearson, Cambridge
Hirukawa M (2010) Nonparametric multiplicative bias correction for kernel-type density estimation on the unit interval. Comput Stat Data Anal 54:473–495
Hirukawa M, Sakudo M (2014) Nonnegative bias reduction methods for density estimation using asymmetric kernels. Comput Stat Data Anal 75:112–123
Hirukawa M, Sakudo M (2015) Family of the generalised gamma kernels: a generator of asymmetric kernels for nonnegative data. J Nonparametric Stat 27:41–63
Jones MC, Foster PJ (1993) Generalized jackknifing and higher order kernels. J Nonparametric Stat 3:81–94
Jones MC, Linton O, Nielsen JP (1995) A simple bias reduction method for density estimation. Biometrika 82:327–338
Kokonendji CC, Senga Kiessé T (2011) Discrete associated kernels method and extensions. Stat Methodol 8:497–516
Kokonendji CC, Senga Kiessé T, Zocchi SS (2007) Discrete triangular distributions and non-parametric estimation for probability mass function. J Nonparametric Stat 19:241–254
Kokonendji CC, Somé SM (2015) On multivariate associated kernels for smoothing general density function. arXiv:1502.01173
Racine JS, Li Q (2004) Nonparametric estimation of regression functions with both categorical and continuous data. J Econom 119:99–130
Senga Kiessé T, Mizère D (2012) Weighted Poisson and semiparametric kernel models applied for parasite growth. Aust N Z J Stat 55:1–13
Terrell GR, Scott DW (1980) On improving convergence rates for nonnegative kernel density estimators. Ann Stat 8:1160–1163
Wang MC, Van Ryzin J (1981) A class of smooth estimators for discrete distributions. Biometrika 68:301–309
Zougab N, Adjabi S (2015) Multiplicative bias correction for generalized Birnbaum–Saunders kernel density estimators and application to nonnegative heavy tailed data. J Korean Stat Soc 45:51–63
Acknowledgements
This research has been supported by the Unit of Research LAMOS of University of Bejaia. The authors thank the editor, an associate editor and anonymous referees for their valuable comments that allowed us to improve this article.
Appendix
We present a sketch of the proofs of Theorems 1 and 2 for the case where the discrete triangular kernel is used; the proofs for the other kernels proceed similarly.
1.1 Sketch of the proof of Theorem 1
1.1.1 Bias
First, note that \(\mathbb {E}\left( \widehat{f}_{DT,h}(x)\right) =\mathbb {E}(f(\mathcal {T}))\), where the random variable \(\mathcal {T}\sim DT(a;x,h)\). By using a fourth-order discrete Taylor expansion of \(f(\mathcal {T})\) around \(\mathcal {T}=x\), we have
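In the standard moment form used for discrete associated kernels (Kokonendji and Senga Kiessé 2011), this expansion reads, up to a remainder that is negligible under the assumptions on \(f\),

```latex
\mathbb {E}\left( \widehat{f}_{DT,h}(x)\right)
  \approx \sum _{k=0}^{4}\frac{f^{(k)}(x)}{k!}\,
  \mathbb {E}\left\{ (\mathcal {T}-x)^{k}\right\} ,
```

where \(f^{(k)}\) denotes the \(k\)th finite difference of \(f\). Since the discrete triangular kernel is symmetric, \(\mathbb {E}(\mathcal {T})=x\) and the odd central moments vanish, so the leading bias term involves \(\mathrm{Var}(\mathcal {T})\). This display is a hedged sketch of the standard form, not the paper's exact equation.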
By using the property of the discrete triangular random variable and a Taylor expansion around \(h=0\),
where
The Taylor expansion of \(I_{h}(x)\) around \(h=0\) is then given by
where
and
Similarly, \(I_{h/c}(x)=\mathbb {E}\left( \widehat{f}_{DT,h/c}(x)\right) \) can be approximated by
Now, we define
and
The estimator \(\tilde{f}_{TS,DT}\) can be written as follows:
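Explicitly, following the Terrell and Scott (1980) construction as adapted by Hirukawa (2010), this estimator takes the form

```latex
\tilde{f}_{TS,DT}(x)
  = \left\{ \widehat{f}_{DT,h}(x)\right\} ^{1/(1-c)}
    \left\{ \widehat{f}_{DT,h/c}(x)\right\} ^{-c/(1-c)},
  \qquad c\in (0,1),
```

the exponents being chosen so that the leading bias terms of the two estimators cancel; this is a hedged reconstruction of the standard form from the cited references.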
Using the expansion \((1+t)^{\alpha }=1+\alpha t+O(t^{2})\), we then have
Based on Assumption 2 and using the same calculations as in Hirukawa (2010) and Terrell and Scott (1980), we can show easily that
1.1.2 Variance
For the variance, from Eq. (4) we have
First, note that the terms \(\mathrm{Var}\left( \widehat{f}_{DT,h}(x)\right) \) and \(\mathrm{Var}\left( \widehat{f}_{DT,h/c}(x)\right) \) are given by [see Kokonendji et al. (2007)]:
and
Now,
Therefore, the variance of \(\tilde{f}_{TS,DT}(x)\) is given by
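To first order, linearizing the two factors (with exponents \(1/(1-c)\) and \(-c/(1-c)\), as in Hirukawa 2010) yields the approximation

```latex
\mathrm{Var}\left\{ \tilde{f}_{TS,DT}(x)\right\}
  \approx \frac{1}{(1-c)^{2}}
  \Big[ \mathrm{Var}\left\{ \widehat{f}_{DT,h}(x)\right\}
    + c^{2}\,\mathrm{Var}\left\{ \widehat{f}_{DT,h/c}(x)\right\}
    - 2c\,\mathrm{Cov}\left\{ \widehat{f}_{DT,h}(x),\widehat{f}_{DT,h/c}(x)\right\} \Big] ,
```

a hedged sketch of the standard delta-method expansion rather than the paper's exact display.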
which corresponds to the results in Theorem 1.
1.2 Sketch of the proof of Theorem 2
1.2.1 Bias
At first, the estimator \(\tilde{f}_{JLN,DT}\) can be written as [see Hirukawa (2010)]
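In the notation of Jones et al. (1995) and Hirukawa (2010), the corrected estimator presumably takes the multiplicative form

```latex
\tilde{f}_{JLN,DT}(x)=\widehat{f}_{DT}(x)\,\psi (x),
```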
where \(\psi (x)=n^{-1}\sum _{i=1}^{n}K_{DT,h}(x,X_{i})/\widehat{f}_{DT}(X_{i})\). Then, we have
By using Assumption 2 and the properties of DT random variables, the terms \(\mathbb {E}\left\{ \frac{\widehat{f}_{DT}(x)-f(x)}{f(x)}\right\} \), \(\mathbb {E}\left\{ \psi (x)-1\right\} \) and \(\mathbb {E}\left\{ \left( \frac{\widehat{f}_{DT}(x)-f(x)}{f(x)}\right) \left( \psi (x)-1\right) \right\} \) can be approximated following the same procedures as in Hirukawa (2010). Thus, \(\mathbb {E}\left\{ \tilde{f}_{JLN,DT}(x)\right\} \) is approximated by
where \(q(x)=l_{1}(x,f)/f(x)\) with \(l_{1}(x,f)\) given in the proof of Theorem 1.
1.2.2 Variance
Note that following Hirukawa (2010) and Jones et al. (1995), we can show that \(\tilde{f}_{JLN,DT}(x)\) is equivalent to
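Following the argument of Jones et al. (1995), the asymptotically equivalent form is the smoothed ratio

```latex
\tilde{f}_{JLN,DT}(x)\;\simeq \;
  \frac{1}{n}\sum _{i=1}^{n}\frac{f(x)}{f(X_{i})}\,K_{DT,h}(x,X_{i}),
```

whose leading variance coincides with that of the uncorrected estimator \(\widehat{f}_{DT,h}(x)\). The kernel-argument notation here is an assumption about the paper's convention, sketched from the cited references.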
It follows that
Therefore, we obtain the approximation for the variance given in Theorem 2. \(\square \)
Harfouche, L., Adjabi, S., Zougab, N. et al. Multiplicative bias correction for discrete kernels. Stat Methods Appl 27, 253–276 (2018). https://doi.org/10.1007/s10260-017-0395-x