Abstract
Large Bayes (LB) is a recently introduced classifier built from frequent and interesting itemsets. LB uses itemsets to create context-specific probabilistic models of the data and to estimate the conditional probability P(c_i | A) of each class c_i given a case A. In this paper we use chi-square tests to address several drawbacks of the originally proposed interestingness metric, namely: (i) the inability to capture certain really interesting patterns, (ii) the need for a user-defined and data-dependent interestingness threshold, and (iii) the need to set a minimum support threshold. We also introduce some pruning criteria which allow for a trade-off between complexity and speed on one side and classification accuracy on the other. Our experimental results show that the modified LB outperforms the original LB, Naïve Bayes, C4.5 and TAN.
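As a rough illustration of how a chi-square test can replace a hand-tuned interestingness threshold, the sketch below computes the Pearson chi-square statistic for a contingency table of itemset occurrence against class label and compares it with a standard critical value. This is only an assumption about the general idea: the abstract does not give the exact test the modified LB applies, and the names `chi_square_statistic`, `is_interesting`, and `CRITICAL_95_DF1` are hypothetical.

```python
# Hypothetical sketch: a chi-square test of independence between
# "itemset present / absent" (rows) and class label (columns).
# It is NOT the paper's exact interestingness test.

CRITICAL_95_DF1 = 3.841  # chi-square critical value for 1 degree of freedom, alpha = 0.05


def chi_square_statistic(table):
    """Return the Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def is_interesting(table, critical_value=CRITICAL_95_DF1):
    """Flag the itemset when the statistic exceeds the critical value.

    The default critical value is only valid for a 2 x 2 table
    (one degree of freedom); pass a different value for more classes.
    """
    return chi_square_statistic(table) > critical_value


# Rows: itemset present, itemset absent. Columns: class c_1, class c_2.
dependent = [[30, 10], [20, 40]]
independent = [[20, 20], [30, 30]]
print(is_interesting(dependent), is_interesting(independent))
```

The significance level takes the place of a data-dependent threshold: the same critical value can be reused across data sets, whereas a raw interestingness cut-off would need tuning for each one.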
Keywords
- Product Approximation
- Conditional Entropy
- Minimum Support Threshold
- Pruning Criterion
- Bayesian Network Classifier
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meretakis, D., Lu, H., Wüthrich, B. (2000). A Study on the Performance of Large Bayes Classifier. In: López de Mántaras, R., Plaza, E. (eds) Machine Learning: ECML 2000. ECML 2000. Lecture Notes in Computer Science, vol 1810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45164-1_29
DOI: https://doi.org/10.1007/3-540-45164-1_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67602-7
Online ISBN: 978-3-540-45164-8
eBook Packages: Springer Book Archive