Integrating Global and Local Application of Discriminative Multinomial Bayesian Classifier for Text Classification

  • Emmanuel Pappas
  • Sotiris Kotsiantis
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 182)

Abstract

The Discriminative Multinomial Naive Bayes classifier has received considerable attention in the field of text classification. In this study, we attempt to increase its prediction accuracy by integrating the global and local application of the classifier. We performed a large-scale comparison on benchmark datasets against other state-of-the-art algorithms, and the proposed methodology gave better accuracy in most cases.
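
The abstract does not spell out how the global and local models are combined, so the following is only a minimal sketch of one plausible reading, not the authors' method. It makes several assumptions: a plain Laplace-smoothed multinomial Naive Bayes stands in for the discriminative variant, the local model is trained on the k training documents most similar to the query by cosine similarity, and the two probability estimates are simply averaged. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def train_mnb(docs, labels, vocab, classes):
    """Fit a Laplace-smoothed multinomial Naive Bayes model (stand-in for DMNB)."""
    word_counts = {c: Counter() for c in classes}
    doc_counts = Counter(labels)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    model = {}
    for c in classes:
        total = sum(word_counts[c].values())
        log_prior = math.log((doc_counts[c] + 1) / (len(docs) + len(classes)))
        log_like = {w: math.log((word_counts[c][w] + 1) / (total + len(vocab)))
                    for w in vocab}
        model[c] = (log_prior, log_like)
    return model


def predict_proba(model, tokens):
    """Return normalized class probabilities; out-of-vocabulary tokens are skipped."""
    scores = {c: prior + sum(ll[w] for w in tokens if w in ll)
              for c, (prior, ll) in model.items()}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def cosine(a, b):
    """Cosine similarity between two token lists using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def global_local_predict(docs, labels, query, k=3):
    """Average a global model with a local one trained on the k nearest documents."""
    classes = sorted(set(labels))
    vocab = sorted({w for d in docs for w in d})
    global_model = train_mnb(docs, labels, vocab, classes)
    nearest = sorted(range(len(docs)),
                     key=lambda i: cosine(docs[i], query), reverse=True)[:k]
    local_model = train_mnb([docs[i] for i in nearest],
                            [labels[i] for i in nearest], vocab, classes)
    p_global = predict_proba(global_model, query)
    p_local = predict_proba(local_model, query)
    # Assumption: equal-weight averaging of the two estimates.
    combined = {c: (p_global[c] + p_local[c]) / 2 for c in classes}
    return max(combined, key=combined.get), combined


# Toy, made-up corpus used only to exercise the sketch.
docs = [["goal", "match", "team"], ["team", "win", "score"],
        ["match", "score", "goal"], ["stock", "market", "price"],
        ["price", "bank", "market"], ["bank", "stock", "trade"]]
labels = ["sport", "sport", "sport", "finance", "finance", "finance"]
label, probs = global_local_predict(docs, labels, ["team", "goal", "score"], k=3)
print(label, probs)
```

The equal-weight average is the simplest combination rule; a weighted vote or a fallback to the global model when the neighbourhood is too small would be equally reasonable choices under the same assumptions.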

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Hellenic Open University, Patras, Greece
  2. Department of Mathematics, University of Patras, Patras, Greece
