Abstract
Naive Bayes is one of the most efficient and effective inductive learning algorithms for machine learning and data mining. But the conditional independence assumption on which it is based is rarely true in real-world applications. Researchers have extended naive Bayes to represent dependence explicitly, and have proposed related learning algorithms based on dependence. In this paper, we argue that, from the classification point of view, it is the dependence distribution, rather than dependence itself, that plays the crucial role. We propose a novel explanation of the superb classification performance of naive Bayes. To verify our idea, we design and conduct experiments in which we extend the Chow-Liu algorithm to construct TAN using the dependence distribution, instead of using mutual information, which reflects only the dependencies among attributes. The empirical results provide evidence to support our new explanation.
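For context, the mutual-information baseline that the abstract contrasts with is, in the TAN construction of Friedman et al., the conditional mutual information I(X; Y | C) between two attributes given the class, used as the edge weight of a maximum weighted spanning tree. The following is a minimal sketch of that quantity only, assuming discrete attributes and plain frequency estimates without smoothing. The function name and argument layout are illustrative and do not come from the paper, and the sketch does not implement the dependence-distribution criterion proposed here.

```python
from collections import Counter
from math import log


def conditional_mutual_information(xs, ys, cs):
    """Estimate I(X; Y | C) in nats from three equal-length sequences
    of discrete values, using raw frequency counts (no smoothing)."""
    n = len(xs)
    if n == 0 or len(ys) != n or len(cs) != n:
        raise ValueError("inputs must be non-empty and of equal length")

    # Joint and marginal counts needed for the estimate.
    n_xyc = Counter(zip(xs, ys, cs))
    n_xc = Counter(zip(xs, cs))
    n_yc = Counter(zip(ys, cs))
    n_c = Counter(cs)

    total = 0.0
    for (x, y, c), count in n_xyc.items():
        # P(x, y, c) * log( P(x, y | c) / (P(x | c) * P(y | c)) ),
        # where the ratio simplifies to count * n_c / (n_xc * n_yc).
        ratio = count * n_c[c] / (n_xc[(x, c)] * n_yc[(y, c)])
        total += (count / n) * log(ratio)
    return total
```

Two attributes that are identical within every class give a value equal to their conditional entropy, while attributes that are independent within every class give zero. A TAN learner in the style of Friedman et al. would compute this weight for every attribute pair before building the tree.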
References
Chow, C.K., Liu, C.N.: Approximating Discrete Probability Distributions with Dependence Trees. IEEE Trans. on Information Theory, Vol. 14 (1968), 462–467.
Domingos, P., Pazzani, M.: Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier. Machine Learning, Vol. 29 (1997), 103–130.
Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian Network Classifiers. Machine Learning, Vol. 29 (1997), 131–163.
Merz, C., Murphy, P., Aha, D.: UCI Repository of Machine Learning Databases. In: Dept of ICS, University of California, Irvine (1997). http://www.www.ics.uci.edu/mlearn/MLRepository.html
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, H., Ling, C.X. (2003). A Fundamental Issue of Naive Bayes. In: Xiang, Y., Chaib-draa, B. (eds) Advances in Artificial Intelligence. Canadian AI 2003. Lecture Notes in Computer Science, vol 2671. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44886-1_55
DOI: https://doi.org/10.1007/3-540-44886-1_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40300-5
Online ISBN: 978-3-540-44886-0
eBook Packages: Springer Book Archive