Abstract
Wordnets are the most widely used lexical resources in natural language processing (NLP). Wordnets now exist for more than 40 languages, and all of them are connected to the original Princeton WordNet. The origins of linguistic linked data (LD) can thus, in some sense, be traced to the WordNet project. The linking, however, has not been implemented with stable identifiers, which causes technical problems of reference whenever a new version of a wordnet is released. This chapter describes how linked data principles have been applied in the development of the Global WordNet Grid (GWG), an attempt to form a catalogue of interlingual contexts that extends beyond the Anglo-Saxon roots of the Princeton WordNet. In particular, we describe how LD technologies have been used to realize a Collaborative Interlingual Index (CILI) that builds on standard LD vocabularies and the Resource Description Framework (RDF) data model. Finally, we describe a method for linking wordnets to external resources such as DBpedia/Wikipedia.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Cimiano, P., Chiarcos, C., McCrae, J.P., Gracia, J. (2020). Applying Linked Data Principles to Linking Multilingual Wordnets. In: Linguistic Linked Data. Springer, Cham. https://doi.org/10.1007/978-3-030-30225-2_12
DOI: https://doi.org/10.1007/978-3-030-30225-2_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30224-5
Online ISBN: 978-3-030-30225-2
eBook Packages: Computer Science, Computer Science (R0)