Path-Based Semantic Relatedness on Linked Data and Its Use to Word and Entity Disambiguation

  • Ioana Hulpuş
  • Narumol Prangnawarat
  • Conor Hayes
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9366)

Abstract

Semantic relatedness and disambiguation are fundamental problems for linking text documents to the Web of Data. There are many approaches dealing with both problems, but most of them rely on word or concept distribution over Wikipedia. They are therefore not applicable to concepts that do not have a rich textual description. In this paper, we show that semantic relatedness can also be accurately computed by analysing only the graph structure of the knowledge base. In addition, we propose a joint approach to entity and word-sense disambiguation that makes use of graph-based relatedness. As opposed to the majority of state-of-the-art systems, which mainly target named entities, we use our approach to disambiguate both entities and common nouns. In our experiments, we first validate our relatedness measure on multiple knowledge bases and ground truth datasets and show that it performs better than related state-of-the-art graph-based measures. Afterwards, we evaluate the disambiguation algorithm and show that it also achieves superior disambiguation accuracy with respect to alternative state-of-the-art graph-based algorithms.

Keywords

Noun Phrase; Link Data; Word Sense Disambiguation; Common Noun; Name Entity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Ioana Hulpuş (1)
  • Narumol Prangnawarat (1)
  • Conor Hayes (1)
  1. Insight Centre for Data Analytics, National University of Ireland, Galway (NUIG), Galway, Ireland
