Abstract
Finding semantically similar documents is a common task in recommender systems. Explicit Semantic Analysis (ESA) calculates the semantic relatedness of terms or documents from their similarity to the documents of a reference corpus, for which Wikipedia is usually used. We propose enhancements to ESA, called Extended Explicit Semantic Analysis, that exploit further semantic properties of Wikipedia, such as its article link structure and categorization. We show how we apply this approach to recommending web resource fragments in a resource-based learning scenario for self-directed, on-task learning with web resources.
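The basic ESA idea described above can be illustrated with a minimal sketch. It covers plain ESA only, not the extended variant proposed in the paper, and everything in it is an assumption made for illustration: the three-article toy corpus standing in for Wikipedia, the whitespace tokenizer, the TF-IDF weighting, and the helper names `build_index`, `concept_vector`, and `relatedness`.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for Wikipedia: concept title -> article text.
CORPUS = {
    "Computer": "computer software hardware program code",
    "Music": "music song melody guitar piano",
    "Programming": "program code software algorithm compiler",
}


def tokenize(text):
    """Naive whitespace tokenizer (an assumption; real systems do far more)."""
    return text.lower().split()


def build_index(corpus):
    """Build an inverted index {term: {concept: tf-idf weight}}."""
    n_docs = len(corpus)
    term_counts = {concept: Counter(tokenize(text)) for concept, text in corpus.items()}
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    index = {}
    for concept, counts in term_counts.items():
        for term, tf in counts.items():
            idf = math.log(n_docs / doc_freq[term])
            index.setdefault(term, {})[concept] = tf * idf
    return index


def concept_vector(text, index):
    """Represent a text as a weighted vector over the corpus concepts."""
    vector = Counter()
    for term in tokenize(text):
        for concept, weight in index.get(term, {}).items():
            vector[concept] += weight
    return vector


def relatedness(text_a, text_b, index):
    """Cosine similarity of the two texts' concept vectors."""
    va = concept_vector(text_a, index)
    vb = concept_vector(text_b, index)
    dot = sum(weight * vb[concept] for concept, weight in va.items())
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


index = build_index(CORPUS)
related = relatedness("software code", "algorithm compiler", index)
unrelated = relatedness("software code", "guitar piano", index)
```

In this toy setup, "software code" and "algorithm compiler" share no words, yet `related` is greater than zero because both texts map onto the "Programming" concept, while `unrelated` is zero. Comparing texts in the space of corpus concepts rather than in the space of their own words is the property that ESA relies on.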
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Scholl, P., Böhnstedt, D., Domínguez García, R., Rensing, C., Steinmetz, R. (2010). Extended Explicit Semantic Analysis for Calculating Semantic Relatedness of Web Resources. In: Wolpers, M., Kirschner, P.A., Scheffel, M., Lindstaedt, S., Dimitrova, V. (eds) Sustaining TEL: From Innovation to Learning and Practice. EC-TEL 2010. Lecture Notes in Computer Science, vol 6383. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16020-2_22
DOI: https://doi.org/10.1007/978-3-642-16020-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16019-6
Online ISBN: 978-3-642-16020-2
eBook Packages: Computer Science (R0)