Utilizing Microblogs for Web Page Relevant Term Acquisition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7741)

Abstract

Advanced processing of information available on the Web requires machine-processable semantic descriptions (metadata) of web content. Creating metadata manually, even in a lightweight form such as relevant terms for a web page, is demanding and practically impossible for humans, especially in an open information space such as the Web. New approaches to automating the process are continuously being devised. In the age of the Social Web, an important new source of data to mine has emerged: social annotations of web content. In this paper we utilize microblogs in particular. We present a method for extracting relevant domain terms for web resources, based on processing posts from Twitter, the biggest microblogging service to date. The method leverages social characteristics of the Twitter network to account for the differing relevance of Twitter posts assigned to the web resources. We evaluated the method in a user experiment and observed its performance for different types of web content.
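
The abstract does not state the scoring formula, so the following Python sketch only illustrates the general idea of weighting the terms of each post by a social signal of its author. The follower-count weight, the stopword list, and the names `tokenize`, `post_weight`, and `relevant_terms` are all assumptions made for illustration and are not taken from the paper.

```python
import math
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "on", "is", "with"}


def tokenize(text):
    """Lowercase the text, drop URLs, keep alphabetic tokens, remove stopwords."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return [t for t in re.findall(r"[a-z]+", text) if t not in STOPWORDS]


def post_weight(followers):
    """Hypothetical social weight that grows slowly with the follower count."""
    return 1.0 + math.log1p(followers)


def relevant_terms(posts, top_k=3):
    """Rank terms for one web page.

    posts: iterable of (text, author_followers) pairs for the microblog
    posts that link to the page.
    """
    scores = defaultdict(float)
    for text, followers in posts:
        weight = post_weight(followers)
        # Count each term once per post so long posts do not dominate.
        for term in set(tokenize(text)):
            scores[term] += weight
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


if __name__ == "__main__":
    posts = [
        ("Great intro to semantic web metadata http://t.co/x", 1000),
        ("semantic metadata rocks", 10),
        ("lol cats", 0),
    ]
    print(relevant_terms(posts, top_k=2))
```

In this sketch a term shared by several posts, or used by a widely followed author, ranks above terms that appear only in a post from an account with no followers. Any other social characteristic could be substituted in `post_weight`.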

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Uherčík, T., Šimko, M., Bieliková, M. (2013). Utilizing Microblogs for Web Page Relevant Term Acquisition. In: van Emde Boas, P., Groen, F.C.A., Italiano, G.F., Nawrocki, J., Sack, H. (eds) SOFSEM 2013: Theory and Practice of Computer Science. SOFSEM 2013. Lecture Notes in Computer Science, vol 7741. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35843-2_39

  • DOI: https://doi.org/10.1007/978-3-642-35843-2_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35842-5

  • Online ISBN: 978-3-642-35843-2

  • eBook Packages: Computer Science, Computer Science (R0)
