A Fundamental Issue of Naive Bayes

Zhang, Harry; Ling, Charles X.

doi:10.1007/3-540-44886-1_55


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2671)

Included in the following conference series:

Conference of the Canadian Society for Computational Studies of Intelligence


Abstract

Naive Bayes is one of the most efficient and effective inductive learning algorithms for machine learning and data mining. But the conditional independence assumption on which it is based is rarely true in real-world applications. Researchers have extended naive Bayes to represent dependence explicitly and have proposed related learning algorithms based on dependence. In this paper, we argue that, from the classification point of view, the dependence distribution, rather than dependence itself, plays the crucial role. We propose a novel explanation for the superb classification performance of naive Bayes. To verify our idea, we design and conduct experiments by extending the Chow-Liu algorithm to use the dependence distribution to construct TAN (tree-augmented naive Bayes), instead of using mutual information, which only reflects the dependencies among attributes. The empirical results provide evidence to support our new explanation.
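
The abstract does not say how the dependence distribution is computed, so the sketch below illustrates only the baseline it is contrasted with: scoring each pair of attributes by conditional mutual information given the class and keeping a maximum-weight spanning tree, as in the Chow-Liu-based TAN construction of Friedman et al. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def conditional_mutual_information(xs, ys, cs):
    """Empirical I(X; Y | C) in nats for equal-length discrete sequences."""
    n = len(cs)
    n_xyc = Counter(zip(xs, ys, cs))
    n_xc = Counter(zip(xs, cs))
    n_yc = Counter(zip(ys, cs))
    n_c = Counter(cs)
    total = 0.0
    for (x, y, c), count in n_xyc.items():
        # P(x,y,c) * log( P(x,y|c) / (P(x|c) * P(y|c)) ), written with raw counts
        ratio = count * n_c[c] / (n_xc[(x, c)] * n_yc[(y, c)])
        total += (count / n) * math.log(ratio)
    return total


def max_spanning_tree(columns, cs):
    """Prim's algorithm over attribute indices, weighted by I(Xi; Xj | C)."""
    k = len(columns)
    weight = {
        (i, j): conditional_mutual_information(columns[i], columns[j], cs)
        for i in range(k)
        for j in range(i + 1, k)
    }
    in_tree = {0}
    edges = []
    while len(in_tree) < k:
        # Heaviest edge with exactly one endpoint already in the tree
        crossing = [e for e in weight if (e[0] in in_tree) != (e[1] in in_tree)]
        i, j = max(crossing, key=weight.get)
        edges.append((i, j))
        in_tree.update((i, j))
    return edges


# Toy data (assumed): x1 duplicates x0, x2 is independent of both given the class.
cs = [0, 0, 0, 0, 1, 1, 1, 1]
x0 = [0, 0, 1, 1, 0, 0, 1, 1]
x1 = list(x0)
x2 = [0, 1, 0, 1, 0, 1, 0, 1]

tree_edges = max_spanning_tree([x0, x1, x2], cs)
```

In this toy data the strongly dependent pair (x0, x1) is linked first, while the edge to x2 has zero weight and is added only to complete the tree. The paper's argument is that such pairwise dependence scores alone do not determine the effect on classification.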


References

Chow, C.K., Liu, C.N.: Approximating Discrete Probability Distributions with Dependence Trees. IEEE Trans. on Information Theory, Vol. 14 (1968), 462–467.
Domingos, P., Pazzani, M.: Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier. Machine Learning, Vol. 29 (1997), 103–130.
Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian Network Classifiers. Machine Learning, Vol. 29 (1997), 131–163.
Merz, C., Murphy, P., Aha, D.: UCI Repository of Machine Learning Databases. In: Dept of ICS, University of California, Irvine (1997). http://www.www.ics.uci.edu/mlearn/MLRepository.html


Author information

Authors and Affiliations

Faculty of Computer Science, University of New Brunswick, Fredericton, New Brunswick, Canada, E3B 5A3
Harry Zhang
Department of Computer Science, The University of Western Ontario, London, Ontario, Canada, N6A 5B7
Charles X. Ling


Editor information

Editors and Affiliations

Department of Computing and Information Science, College of Physical and Engineering Science, University of Guelph, Guelph, Ontario, Canada, N1G 2W1
Yang Xiang
Dépt. Informatique-Génie Logiciel, Université Laval, Pavillon Pouliot, Ste-Foy, PQ, Canada, G1K 7P4
Brahim Chaib-draa


About this paper

Cite this paper

Zhang, H., Ling, C.X. (2003). A Fundamental Issue of Naive Bayes. In: Xiang, Y., Chaib-draa, B. (eds) Advances in Artificial Intelligence. Canadian AI 2003. Lecture Notes in Computer Science, vol 2671. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44886-1_55


DOI: https://doi.org/10.1007/3-540-44886-1_55
Published: 27 May 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40300-5
Online ISBN: 978-3-540-44886-0
eBook Packages: Springer Book Archive
