A Comparative Study of Bandwidth Choice in Kernel Density Estimation for Naive Bayesian Classification

Liu, Bin; Yang, Ying; Webb, Geoffrey I.; Boughton, Janice

doi:10.1007/978-3-642-01307-2_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5476)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Kernel density estimation (KDE) is an important method in nonparametric learning. While KDE has been studied extensively with respect to the accuracy of distribution estimation, it has received far less attention in the context of classification. This paper studies nine bandwidth selection schemes for kernel density estimation in the context of naive Bayesian classification, using 52 machine learning benchmark datasets. The contributions of this paper are threefold. First, it shows that some commonly used and very sophisticated bandwidth selection schemes do not give good performance in naive Bayes. Surprisingly, some very simple bandwidth selection schemes give statistically significantly better performance. Second, it shows that kernel density estimation can achieve statistically significantly better classification performance than a commonly used discretization method in naive Bayes, but only when appropriate bandwidth selection schemes are applied. Third, this study gives bandwidth distribution patterns for the investigated bandwidth selection schemes.
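
To make the setting concrete, the following is a minimal sketch of a naive Bayes classifier that models each numeric attribute with a per-class Gaussian kernel density estimate. It is not the authors' implementation. The bandwidth rule shown (the normal-reference rule of thumb, h = 1.06 · sd · n^(-1/5)) is an assumption chosen for illustration, since the abstract does not name the nine schemes compared. The names `rule_of_thumb_bandwidth`, `gaussian_kde`, and `KdeNaiveBayes` are hypothetical.

```python
import math
from collections import defaultdict
from statistics import stdev


def rule_of_thumb_bandwidth(values):
    """Normal-reference bandwidth h = 1.06 * sd * n^(-1/5) (illustrative choice)."""
    n = len(values)
    sd = stdev(values) if n > 1 else 0.0
    # Fall back to a small positive width when the sample has no spread.
    return 1.06 * sd * n ** (-0.2) if sd > 0 else 1e-3


def gaussian_kde(x, values, h):
    """Kernel density estimate at x: average of Gaussian kernels of width h."""
    norm = 1.0 / (len(values) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in values)


class KdeNaiveBayes:
    """Naive Bayes with one univariate KDE per (class, attribute) pair."""

    def __init__(self, bandwidth=rule_of_thumb_bandwidth):
        # Any function mapping a list of values to a positive bandwidth.
        self.bandwidth = bandwidth

    def fit(self, X, y):
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)
        self.priors = {c: len(rows) / len(X) for c, rows in by_class.items()}
        # For each class, keep (training values, bandwidth) per attribute.
        self.models = {}
        for c, rows in by_class.items():
            columns = [list(col) for col in zip(*rows)]
            self.models[c] = [(col, self.bandwidth(col)) for col in columns]
        return self

    def predict(self, row):
        def log_score(c):
            total = math.log(self.priors[c])
            for x, (values, h) in zip(row, self.models[c]):
                # A tiny floor avoids log(0) far from all training points.
                total += math.log(max(gaussian_kde(x, values, h), 1e-300))
            return total

        return max(self.priors, key=log_score)


# Toy data invented for this example: two well-separated classes.
X = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 4.1], [3.2, 3.9], [2.9, 4.0]]
y = ["a", "a", "a", "b", "b", "b"]
model = KdeNaiveBayes().fit(X, y)
print(model.predict([1.1, 2.0]), model.predict([3.1, 4.0]))
```

Because the bandwidth selector is passed in as a function, a different scheme can be substituted without touching the classifier, which is the kind of comparison the abstract describes.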

Author information

Authors and Affiliations

Clayton School of Information Technology, Monash University, Australia
Bin Liu, Ying Yang, Geoffrey I. Webb & Janice Boughton

Editor information

Editors and Affiliations

Sirindhorn International Institute of Technology, Thammasat University, 131 Moo 5 Tiwanont Road, 12000, Bangkadi, Muang, Pathumthani, Thailand
Thanaruk Theeramunkong
Dept. of Computer Engineering, Faculty of Engineering, Chulalongkorn University, 10330, Bangkok, Thailand
Boonserm Kijsirikul
Faculty of Science & Engineering, York University, 355 Lumbers Building, 4700 Keele Street, M3J 1P3, Toronto, Ontario, Canada
Nick Cercone
School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, 923-1292, Ishikawa, Japan
Tu-Bao Ho

About this paper

Cite this paper

Liu, B., Yang, Y., Webb, G.I., Boughton, J. (2009). A Comparative Study of Bandwidth Choice in Kernel Density Estimation for Naive Bayesian Classification. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, TB. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2009. Lecture Notes in Computer Science, vol 5476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01307-2_29

DOI: https://doi.org/10.1007/978-3-642-01307-2_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01306-5
Online ISBN: 978-3-642-01307-2
eBook Packages: Computer Science, Computer Science (R0)
