Chinese Weblog Pages Classification Based on Folksonomy and Support Vector Machines

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4476)

Abstract

For centuries, classification has been used to provide context and direction in every area of human knowledge. Standard machine learning techniques such as support vector machines and related large-margin methods have been applied successfully to this task. Unfortunately, automatic classifiers often misclassify items. Folksonomy, a new manual classification scheme in which users tag items with freely chosen keywords, can effectively address this problem. In a folksonomy, users attach tags to an item for their own classification purposes, so the tags reflect many people's viewpoints. Because tags are drawn from users' own vocabulary and reflect many viewpoints, the classification results are easy for ordinary users to understand. Although folksonomy scales much better than other manual classification schemes, it cannot cope with a very large number of items, such as all weblog articles on the Internet. To solve this problem, we propose a new classification method, FSVMC (folksonomy and support vector machine classifier). FSVMC uses support vector machines as Tag-agents, programs that determine whether a particular tag should be attached to a weblog page, while folksonomy is used to categorize the weblog articles. In addition, we propose a method for creating a candidate tag database, a list of tags that may be attached to weblog pages. Experimental results indicate that our method is more flexible and effective than traditional methods.
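The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: one binary classifier per candidate tag decides whether that tag should be attached to a page. Everything concrete here is an assumption made for illustration, not the authors' method: the names `featurize`, `build_candidate_tags`, `TagAgent`, `train_agents` and `suggest_tags`, the rule that a candidate tag must appear on at least `min_count` pages, the bias-free linear SVM trained by plain subgradient descent on the hinge loss, and the toy English data. Chinese text would first need word segmentation, which is assumed to have been done already.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words counts; assumes text is already segmented into space-separated terms."""
    return Counter(text.lower().split())


def build_candidate_tags(pages, min_count=2):
    """Hypothetical candidate tag database: user tags seen on at least min_count pages."""
    counts = Counter(tag for _, tags in pages for tag in tags)
    return sorted(tag for tag, n in counts.items() if n >= min_count)


class TagAgent:
    """Binary linear SVM (no bias term) for a single tag, trained on the hinge loss."""

    def __init__(self, tag, learning_rate=0.1, reg=0.01, epochs=50):
        self.tag = tag
        self.learning_rate = learning_rate
        self.reg = reg
        self.epochs = epochs
        self.weights = {}

    def score(self, features):
        return sum(self.weights.get(term, 0.0) * count
                   for term, count in features.items())

    def fit(self, pages):
        # Label a page +1 if users attached this agent's tag to it, else -1.
        data = [(featurize(text), 1 if self.tag in tags else -1)
                for text, tags in pages]
        for _ in range(self.epochs):
            for features, label in data:
                violated = label * self.score(features) < 1
                # L2 regularisation: shrink every weight slightly.
                for term in self.weights:
                    self.weights[term] *= 1 - self.learning_rate * self.reg
                # Hinge-loss subgradient step only when the margin is violated.
                if violated:
                    for term, count in features.items():
                        self.weights[term] = (self.weights.get(term, 0.0)
                                              + self.learning_rate * label * count)
        return self

    def should_attach(self, text):
        return self.score(featurize(text)) > 0


def train_agents(pages):
    """Train one agent per candidate tag."""
    return [TagAgent(tag).fit(pages) for tag in build_candidate_tags(pages)]


def suggest_tags(agents, text):
    """Ask every agent whether its tag should be attached to the page."""
    return [agent.tag for agent in agents if agent.should_attach(text)]


# Toy user-tagged pages (invented for illustration only).
PAGES = [
    ("python code tutorial function", {"programming", "python"}),
    ("java code compiler function", {"programming"}),
    ("pasta recipe tomato sauce", {"cooking"}),
    ("soup recipe onion stock", {"cooking"}),
]

if __name__ == "__main__":
    agents = train_agents(PAGES)
    print(build_candidate_tags(PAGES))
    print(suggest_tags(agents, "code function debugging"))
    print(suggest_tags(agents, "recipe tomato soup"))
```

Training one independent agent per tag lets a page receive several tags or none, which matches the multi-label nature of tagging. A practical system would more likely rely on an established SVM library and a proper Chinese tokenizer than on this hand-rolled trainer.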



Editor information

Vladimir Gorodetsky, Chengqi Zhang, Victor A. Skormin, Longbing Cao

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Wang, X., Bai, R., Liao, J. (2007). Chinese Weblog Pages Classification Based on Folksonomy and Support Vector Machines. In: Gorodetsky, V., Zhang, C., Skormin, V.A., Cao, L. (eds) Autonomous Intelligent Systems: Multi-Agents and Data Mining. AIS-ADM 2007. Lecture Notes in Computer Science, vol 4476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72839-9_27

  • DOI: https://doi.org/10.1007/978-3-540-72839-9_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72838-2

  • Online ISBN: 978-3-540-72839-9

  • eBook Packages: Computer Science, Computer Science (R0)
