Text Classification for a Large-Scale Taxonomy Using Dynamically Mixed Local and Global Models for a Node

  • Conference paper
Advances in Information Retrieval (ECIR 2011)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6611))

Abstract

Hierarchical text classification for a large-scale Web taxonomy is challenging because the number of hierarchically organized categories is large and the training data for deep categories are usually sparse. It has been shown that a narrow-down approach involving a search of the taxonomical tree is an effective method for the problem. A recent study showed that both local and global information for a node is useful for further improvement. This paper introduces two methods for mixing local and global models dynamically for individual nodes and shows that they improve classification effectiveness by 5% and 30%, respectively, over the state-of-the-art method.
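
The abstract does not say how the local and global models are mixed, so the following is only an illustrative sketch of what a per-node (dynamic) mixture could look like, not the paper's method. It assumes each candidate category already has a local score and a global score, and it uses a hypothetical heuristic that trusts the local model more when the node has more training documents. All names (`CandidateNode`, `mixing_weight`, the smoothing constant `k`) are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateNode:
    """A candidate category reached by the narrow-down search (hypothetical structure)."""
    name: str
    local_score: float   # assumed: score from a model trained only among the candidates
    global_score: float  # assumed: score reflecting evidence along the path from the root
    n_train: int         # number of training documents available for this node


def mixing_weight(n_train: int, k: float = 50.0) -> float:
    """Hypothetical heuristic: weight the local model by how much data the node has.

    Returns a value in [0, 1); sparse (deep) nodes lean on the global score.
    """
    return n_train / (n_train + k)


def mixed_score(node: CandidateNode, k: float = 50.0) -> float:
    """Linear interpolation of local and global scores with a node-specific weight."""
    lam = mixing_weight(node.n_train, k)
    return lam * node.local_score + (1.0 - lam) * node.global_score


def rank_candidates(nodes: List[CandidateNode], k: float = 50.0) -> List[CandidateNode]:
    """Order candidates by their mixed score, best first."""
    return sorted(nodes, key=lambda n: mixed_score(n, k), reverse=True)


if __name__ == "__main__":
    candidates = [
        CandidateNode("Sparse/Deep/Category", local_score=0.9, global_score=0.2, n_train=5),
        CandidateNode("Well/Populated/Category", local_score=0.6, global_score=0.5, n_train=200),
    ]
    for node in rank_candidates(candidates):
        print(f"{node.name}: {mixed_score(node):.3f}")
```

In this toy setup the sparse node's high local score is discounted because it rests on only five training documents, so the well-populated category ranks first. A fixed, global mixing weight would treat both nodes identically, which is the contrast the word "dynamically" in the abstract suggests.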

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oh, HS., Choi, Y., Myaeng, SH. (2011). Text Classification for a Large-Scale Taxonomy Using Dynamically Mixed Local and Global Models for a Node. In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_4

  • DOI: https://doi.org/10.1007/978-3-642-20161-5_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20160-8

  • Online ISBN: 978-3-642-20161-5

  • eBook Packages: Computer Science, Computer Science (R0)
