Utilizing Microblogs for Web Page Relevant Term Acquisition

Uherčík, Tomáš; Šimko, Marián; Bieliková, Mária

doi:10.1007/978-3-642-35843-2_39

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7741)

Abstract

Advanced processing of information on the Web requires machine-processable semantic descriptions (metadata) of web content. Creating metadata manually, even in a lightweight form such as terms relevant to a web page, is demanding for humans and practically impossible in an open information space such as the Web. New approaches to automating the process are therefore devised continuously. In the age of the Social Web, an important new source of data to mine has emerged: social annotations of web content. In this paper we focus on microblogs. We present a method for extracting relevant domain terms for web resources by processing posts from Twitter, the largest microblogging service to date. The method leverages social characteristics of the Twitter network to account for the differing relevance of the Twitter posts assigned to a web resource. We evaluated the method in a user experiment, observing its performance for different types of web content.
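
The abstract does not say how post relevance is computed, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes each post linking to a page comes with its author's follower count, and it weights a post by a hypothetical function of that count before summing weights per term. The names `tokenize`, `post_weight`, `relevant_terms`, the stop-word list, and the sample posts are all made up for illustration.

```python
import math
import re
from collections import defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "for", "is", "this", "with"}


def tokenize(text):
    """Lowercase a post, drop URLs, and return alphabetic tokens that are not stop words."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return [t for t in re.findall(r"[a-z]+", text) if t not in STOP_WORDS]


def post_weight(followers):
    """Assumed relevance weight of a post: grows slowly with the author's follower count."""
    return 1.0 + math.log1p(followers)


def relevant_terms(posts, top_k=5):
    """Rank terms by the summed weights of the posts that mention them.

    `posts` is a list of (text, author_followers) pairs, all assumed to
    link to the same web page.
    """
    scores = defaultdict(float)
    for text, followers in posts:
        weight = post_weight(followers)
        for term in set(tokenize(text)):  # count each term once per post
            scores[term] += weight
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Made-up sample posts that all link to the same (fictional) page.
posts = [
    ("Great intro to semantic web metadata http://example.org/page", 1200),
    ("Semantic metadata explained http://example.org/page", 15),
    ("Reading this on the train http://example.org/page", 3),
]
for term, score in relevant_terms(posts, top_k=3):
    print(f"{term}: {score:.2f}")
```

In this sketch, terms mentioned by several posts, or by posts from widely followed authors, rise to the top. The logarithmic weight is one arbitrary choice among many; a graph-based author ranking could be substituted in `post_weight` without changing the rest.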

Author information

Authors and Affiliations

Faculty of Informatics and Information Technologies, Slovak University of Technology in Bratislava, Ilkovičova 3, 842 16, Bratislava, Slovakia
Tomáš Uherčík, Marián Šimko & Mária Bieliková

Editor information

Editors and Affiliations

Department of Mathematics and Computer Science, University of Amsterdam, Plantage Muidergracht 24, 1018 TV, Amsterdam, The Netherlands
Peter van Emde Boas
Informatics Institute, Intelligent Systems Lab Amsterdam, University of Amsterdam, Science Park 904, 1098 XH, Amsterdam, The Netherlands
Frans C. A. Groen
Department of Civil Engineering and Computer Science, University of Rome Tor Vergata, Via del Politecnico 1, 00133, Rome, Italy
Giuseppe F. Italiano
Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 2, 60-965, Poznan, Poland
Jerzy Nawrocki
Hasso-Plattner-Institute for Software Systems Engineering, Prof.-Dr.-Helmert-Str. 2-3, 14482, Potsdam, Germany
Harald Sack

Cite this paper

Uherčík, T., Šimko, M., Bieliková, M. (2013). Utilizing Microblogs for Web Page Relevant Term Acquisition. In: van Emde Boas, P., Groen, F.C.A., Italiano, G.F., Nawrocki, J., Sack, H. (eds) SOFSEM 2013: Theory and Practice of Computer Science. SOFSEM 2013. Lecture Notes in Computer Science, vol 7741. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35843-2_39

DOI: https://doi.org/10.1007/978-3-642-35843-2_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35842-5
Online ISBN: 978-3-642-35843-2
eBook Packages: Computer Science, Computer Science (R0)