
Journal of Computer Science and Technology, Volume 29, Issue 3, pp. 376–391

Higher-Order Smoothing: A Novel Semantic Smoothing Method for Text Classification

  • Mitat Poyraz
  • Zeynep Hilal Kilimci
  • Murat Can Ganiz
Regular Paper

Abstract

It is known that latent semantic indexing (LSI) takes advantage of implicit higher-order (or latent) structure in the association of terms and documents, and it is these higher-order relations that allow LSI to capture "latent semantics". These findings inspired Higher-Order Naive Bayes (HONB), a previously introduced Bayesian classification framework that makes explicit use of such higher-order relations. In this paper, we present a novel semantic smoothing method for the Naive Bayes algorithm, named Higher-Order Smoothing (HOS). HOS is built on a graph-based data representation similar to that of HONB, which allows the semantics carried by higher-order paths to be exploited. HOS takes the concept one step further by also exploiting relationships between instances of different classes; as a result, it moves beyond not only instance boundaries but also class boundaries to exploit the latent information in higher-order paths. This improves parameter estimation when labeled data are insufficient. Results of our extensive experiments on several benchmark datasets demonstrate the value of HOS.
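To make the idea of higher-order smoothing more concrete, the following is a minimal Python sketch of how second-order term co-occurrence counts (paths of the form term-document-term) might be folded into the Laplace-smoothed estimates of a multinomial Naive Bayes classifier. The function names, the weight beta, and the restriction to within-class paths are illustrative assumptions made for this sketch only; the paper's HOS formulation also exploits paths that cross class boundaries and is not reproduced here.

```python
import numpy as np

def higher_order_smoothed_nb(X, y, alpha=1.0, beta=0.1):
    """Multinomial Naive Bayes with a simple higher-order pseudo-count term.

    A minimal sketch only: second-order term-term co-occurrence counts
    (paths term -> document -> term within a class) are folded into the
    usual Laplace-smoothed class-conditional estimates. The weight `beta`
    and the use of Xc.T @ Xc as the path count are illustrative
    assumptions, not the exact HOS formulation from the paper.

    X : (n_docs, n_terms) raw term-count matrix
    y : (n_docs,) class labels
    Returns log P(class) and log P(term | class).
    """
    classes = np.unique(y)
    n_terms = X.shape[1]
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))
    log_cond = np.zeros((len(classes), n_terms))
    for i, c in enumerate(classes):
        Xc = X[y == c]                    # documents of class c
        first_order = Xc.sum(axis=0)      # ordinary per-term counts
        paths = Xc.T @ Xc                 # term-doc-term path counts
        np.fill_diagonal(paths, 0)        # ignore trivial self-paths
        second_order = paths.sum(axis=1)  # higher-order evidence per term
        counts = first_order + beta * second_order + alpha
        log_cond[i] = np.log(counts / counts.sum())
    return log_prior, log_cond

def predict(X, log_prior, log_cond):
    # Standard multinomial NB decision rule applied to raw counts.
    return np.argmax(X @ log_cond.T + log_prior, axis=1)
```

Under this sketch, terms that never occur in a class but frequently co-occur (through shared documents) with terms that do occur there still receive non-zero probability mass, which is the intuition behind smoothing with higher-order paths.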

Keywords

Naive Bayes; semantic smoothing; higher-order Naive Bayes; higher-order smoothing; text classification


Supplementary material

ESM 1: 11390_2014_1437_MOESM1_ESM.pdf (PDF, 75 kB)


Copyright information

© Springer Science+Business Media New York & Science Press, China 2014

Authors and Affiliations

  • Mitat Poyraz
  • Zeynep Hilal Kilimci
  • Murat Can Ganiz

  Department of Computer Engineering, Dogus University, Istanbul, Turkey
