Abstract
A key problem in text mining is the extraction of relations between terms. Hand-crafted lexical resources such as WordNet have limitations when applied to specialized text corpora. Distributional approaches to the automatic construction of thesauri from large corpora have been proposed, but they rely on sophisticated Natural Language Processing techniques, which makes them language-specific and computationally intensive. We conjecture that in a number of applications it is not necessary to determine the exact nature of term relations; it is sufficient to capture and exploit the frequent co-occurrence of terms. Tag recommendation is one such application.
Collaborative tagging systems are social data repositories in which users manage web resources by assigning descriptive keywords (tags) to them. An important element of a collaborative tagging system is the tag recommender, which proposes a set of tags to a user who is posting a resource. In this talk we explore the potential of three tag sources: resource content (including metadata fields, such as the title), resource profile (the set of tags assigned to the resource by all users who tagged it) and user profile (the set of tags the user assigned to all the resources she tagged). The content-based tag set is enriched with related tags from the tag-to-tag and title-word-to-tag graphs, which capture co-occurrences of words as tags and/or title words. The resulting tag set is further enriched with tags previously used to describe the same resource (resource profile). The resource-based tag set is then checked against the user profile tags, a rich but imprecise source of information about user interests. The result is a set of tags related to both the resource and the user. The system participated in the ECML/PKDD Discovery Challenge 2009 for the “content-based”, “graph-based”, and “online” recommendation tasks, in which it took first, third and first place respectively.
Joint work with Marek Lipczak, Yeming Hu, and Yael Kollet.
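The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the function names, the conditional-frequency edge weights, the normalisation of the resource profile, and the `user_boost` multiplier are all assumptions made for illustration. Only the title-word-to-tag graph is built; the tag-to-tag graph is omitted for brevity.

```python
from collections import Counter, defaultdict


def build_title_word_to_tag_graph(posts):
    """Build a co-occurrence graph from (title_words, tags) pairs.

    The edge weight word -> tag is the fraction of posts containing the
    word in the title that also carry the tag (an assumed weighting).
    """
    word_count = Counter()
    pair_count = defaultdict(Counter)
    for title_words, tags in posts:
        for word in set(title_words):
            word_count[word] += 1
            for tag in set(tags):
                pair_count[word][tag] += 1
    return {
        word: {tag: n / word_count[word] for tag, n in neighbours.items()}
        for word, neighbours in pair_count.items()
    }


def recommend(title_words, graph, resource_profile, user_profile,
              k=5, user_boost=2.0):
    """Combine the three tag sources into a ranked list of at most k tags.

    resource_profile: dict tag -> number of times it was assigned to the resource
    user_profile: set of tags the user has assigned before
    user_boost: hypothetical multiplier for tags found in the user profile
    """
    scores = Counter()
    # 1. Content-based tags: follow graph edges from each title word.
    for word in set(title_words):
        for tag, weight in graph.get(word, {}).items():
            scores[tag] += weight
    # 2. Enrich with the resource profile, as relative frequencies.
    total = sum(resource_profile.values())
    if total:
        for tag, n in resource_profile.items():
            scores[tag] += n / total
    # 3. Check against the user profile: boost tags the user already uses.
    for tag in list(scores):
        if tag in user_profile:
            scores[tag] *= user_boost
    return [tag for tag, _ in scores.most_common(k)]


posts = [
    (["python", "tutorial"], ["python", "programming"]),
    (["python", "tips"], ["python"]),
    (["cooking", "tips"], ["food"]),
]
graph = build_title_word_to_tag_graph(posts)
top_tags = recommend(["python", "tips"], graph,
                     resource_profile={"programming": 1},
                     user_profile={"programming"}, k=2)
```

In this toy example the tag "programming" ranks first because it appears in both the resource profile and the user profile, which mirrors the idea of recommending tags related to both the resource and the user. No language-specific processing is required, only co-occurrence counts.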
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Milios, E. (2010). Corpus-Based Term Relatedness Graphs in Tag Recommendation. In: Farzindar, A., Kešelj, V. (eds) Advances in Artificial Intelligence. Canadian AI 2010. Lecture Notes in Computer Science, vol 6085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13059-5_3
DOI: https://doi.org/10.1007/978-3-642-13059-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13058-8
Online ISBN: 978-3-642-13059-5
eBook Packages: Computer Science, Computer Science (R0)