A Combined Classification Algorithm Based on C4.5 and NB

  • Liangxiao Jiang
  • Chaoqun Li
  • Jia Wu
  • Jian Zhu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5370)

Abstract

When the learning task is to build an accurate classification model, C4.5 and naive Bayes (NB) are two important algorithms because of their simplicity and high performance. In this paper, we present a combined classification algorithm based on C4.5 and NB, called C4.5-NB. In C4.5-NB, the class probability estimates of C4.5 and NB are weighted according to their classification accuracy on the training data. We experimentally tested C4.5-NB in the Weka system on all 36 UCI data sets selected by Weka and compared it with C4.5 and NB. The experimental results show that C4.5-NB significantly outperforms C4.5 and NB in terms of classification accuracy. We also examined the ranking performance of C4.5-NB in terms of AUC (the area under the Receiver Operating Characteristic curve); here, too, C4.5-NB significantly outperforms C4.5 and NB.
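
The weighting idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact combination formula, so the normalized accuracy-weighted average below is an assumption, and the function names (`combine_estimates`, `predict_class`) and the example numbers are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def combine_estimates(
    p_tree: Sequence[float],
    p_nb: Sequence[float],
    acc_tree: float,
    acc_nb: float,
) -> List[float]:
    """Accuracy-weighted average of two class probability vectors.

    Assumption: each model's estimate is weighted by its training
    accuracy and the result is divided by the sum of the weights,
    so the output is again a probability distribution.
    """
    if len(p_tree) != len(p_nb):
        raise ValueError("both models must score the same set of classes")
    total = acc_tree + acc_nb
    if total <= 0:
        raise ValueError("at least one accuracy weight must be positive")
    return [(acc_tree * a + acc_nb * b) / total for a, b in zip(p_tree, p_nb)]


def predict_class(probs: Sequence[float]) -> int:
    """Return the index of the class with the highest combined probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


# Hypothetical example: a tree with training accuracy 0.8 and a naive
# Bayes model with training accuracy 0.6 disagree on a two-class instance.
combined = combine_estimates([0.9, 0.1], [0.4, 0.6], acc_tree=0.8, acc_nb=0.6)
label = predict_class(combined)
```

In this example the more accurate model's estimate carries more weight, so the combined prediction follows the tree (class index 0) even though naive Bayes prefers the other class.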

Keywords

decision trees; naive Bayes; combined algorithms; weights; classification; ranking; data mining

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Liangxiao Jiang (1)
  • Chaoqun Li (2)
  • Jia Wu (1)
  • Jian Zhu (1)
  1. Faculty of Computer Science, China University of Geosciences, Wuhan, P.R. China
  2. Faculty of Mathematics, China University of Geosciences, Wuhan, P.R. China
