Predicting User Tags Using Semantic Expansion
Manually annotating content such as Internet videos is an intellectually expensive and time-consuming process. Furthermore, keywords and community-provided tags lack consistency and present numerous irregularities. To simplify and improve the tagging of online videos, a task potentially not bound to any particular domain, this paper presents an algorithm for predicting user tags from the associated textual metadata. Our approach is centred on extracting named entities by exploiting complementary textual resources such as Wikipedia and WordNet. More specifically, to facilitate the extraction of semantically meaningful tags from a largely unstructured textual corpus, we developed a natural language processing framework based on the GATE architecture. Extending the built-in named entity functionality of GATE, the framework integrates a bag-of-articles algorithm that searches Wikipedia for relevant articles. The proposed framework has been evaluated on the MediaEval 2010 Wild Wild Web dataset, which consists of a large collection of Internet videos.
Keywords: tag prediction, video indexing, user-contributed metadata, speech recognition, evaluation
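The abstract does not spell out how the bag-of-articles search works, so the following is only a minimal sketch of one plausible reading: each candidate tag is represented by the text of an associated article, and tags are ranked by the cosine similarity between that text and the video metadata. The function names, the `ARTICLES` dictionary, and the tiny in-memory "articles" are all hypothetical stand-ins; the sketch does not use GATE, Wikipedia, or WordNet and is not the authors' implementation.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lower-case the text and count word tokens (a plain bag of words)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_tags(metadata, articles, top_k=2):
    """Rank candidate tags by how similar their article text is to the metadata."""
    query = term_vector(metadata)
    scored = [(cosine(query, term_vector(text)), tag) for tag, text in articles.items()]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Keep only tags that share at least one term with the metadata.
    return [tag for score, tag in scored[:top_k] if score > 0]


# Hypothetical stand-ins for article text, keyed by candidate tag (assumption).
ARTICLES = {
    "cooking": "cooking is the preparation of food using heat recipes and ingredients",
    "politics": "politics concerns government elections parties and public policy",
    "music": "music is the art of arranging sound melody rhythm and songs",
}

if __name__ == "__main__":
    metadata = "A quick recipe video: preparing food with fresh ingredients"
    print(rank_tags(metadata, ARTICLES))
```

A real system would replace the raw term counts with weighted vectors and the toy dictionary with retrieved articles; the sketch only illustrates the ranking step.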