Combining Rules for Text Categorization Using Dempster’s Rule of Combination

  • Yaxin Bi
  • Terry Anderson
  • Sally McClean
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3177)

Abstract

In this paper, we investigate the combination of rules for text categorization using Dempster’s rule of combination. We first propose a boosting-like technique, based on rough set theory, for generating multiple sets of rules, and then describe how Dempster’s rule of combination can be used to combine the classification decisions produced by those rule sets. We evaluate the rule sets, individually and in combination, on 10 of the 20 groups in the 20-newsgroups benchmark data collection. Our experimental results show that the best combination of multiple sets of rules achieves 80.47% classification accuracy on these 10 groups, which is 3.24% better than the best single set of rules.
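As an illustration of the combination step mentioned in the abstract, the following is a minimal sketch of the textbook form of Dempster’s rule of combination for two mass functions. It is not the authors’ implementation: the function name `combine`, the category labels, and the mass values are all hypothetical, and the way the paper derives masses from rule sets is not shown here.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function is a dict mapping a frozenset of category labels
    (a subset of the frame of discernment) to a mass; masses sum to 1.
    """
    combined = {}
    conflict = 0.0  # total mass assigned to pairs with empty intersection
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c
    if conflict >= 1.0 - 1e-12:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    # Normalise by the non-conflicting mass so the result sums to 1.
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


# Hypothetical example: two rule sets voting over three categories.
frame = frozenset({"hockey", "baseball", "autos"})
m1 = {frozenset({"hockey"}): 0.6, frame: 0.4}    # assumed output of rule set 1
m2 = {frozenset({"baseball"}): 0.5, frame: 0.5}  # assumed output of rule set 2

result = combine(m1, m2)
for subset, mass in sorted(result.items(), key=lambda item: -item[1]):
    print(sorted(subset), round(mass, 4))
```

In this sketch, mass placed on the whole frame represents a rule set’s uncertainty, and the conflicting mass (here 0.6 × 0.5 = 0.3) is discarded before renormalising, so the combined support for "hockey" becomes 0.3 / 0.7.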

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Yaxin Bi (1, 2)
  • Terry Anderson (3)
  • Sally McClean (3)

  1. School of Computer Science, Queen’s University of Belfast, Belfast, UK
  2. School of Biomedical Science, University of Ulster, Coleraine, Londonderry, UK
  3. Faculty of Engineering, University of Ulster, Newtownabbey, Co. Antrim, UK