Applying Linked Data Principles to Linking Multilingual Wordnets

Linguistic Linked Data

Abstract

Wordnets are the most widely used lexical resources in natural language processing (NLP). Wordnets now exist for more than 40 languages, and all of them are connected to the original Princeton WordNet. The origins of linguistic linked data (LD) can thus, in some sense, be traced to the WordNet project. The linking, however, has not been implemented with stable identifiers, which causes technical problems of reference whenever a new version of a wordnet is released. This chapter describes how linked data principles have been applied in the development of the Global WordNet Grid (GWG), an attempt to form a catalogue of interlingual contexts that extends beyond the Anglo-Saxon roots of the Princeton WordNet. In particular, we describe how LD technologies have been used to realize a Collaborative Interlingual Index (CILI) that builds on standard LD vocabularies and the Resource Description Framework (RDF) data model. Finally, we describe a method for linking wordnets to external resources such as DBpedia/Wikipedia.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Cimiano, P., Chiarcos, C., McCrae, J.P., Gracia, J. (2020). Applying Linked Data Principles to Linking Multilingual Wordnets. In: Linguistic Linked Data. Springer, Cham. https://doi.org/10.1007/978-3-030-30225-2_12

  • DOI: https://doi.org/10.1007/978-3-030-30225-2_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30224-5

  • Online ISBN: 978-3-030-30225-2

  • eBook Packages: Computer Science, Computer Science (R0)
