Abstract
Nowadays, there is a significant quantity of linguistic data available on the Web. However, linguistic resources are often published in proprietary formats and, as such, can be difficult to interface with one another, so they end up confined in “data silos”. The creation of web standards for publishing data on the Web and projects to create Linked Data have led to interest in creating resources that can be published using Web principles. One of the most important aspects of “Lexical Linked Data” is the sharing of lexica and machine-readable dictionaries. It is for this reason that the lemon format has been proposed, which we briefly describe. We then consider two resources that seem ideal candidates for the Linked Data cloud, namely WordNet 3.0 and Wiktionary, a large document-based dictionary. We discuss the challenges of converting both resources to lemon and, in particular for Wiktionary, the challenge of processing the mark-up and handling inconsistencies and underspecification in the source material. Finally, we turn to the task of creating links between the two resources and present a novel algorithm for linking lexica as lexical Linked Data.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
McCrae, J., Montiel-Ponsoda, E., Cimiano, P. (2012). Integrating WordNet and Wiktionary with lemon. In: Chiarcos, C., Nordhoff, S., Hellmann, S. (eds) Linked Data in Linguistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28249-2_3
DOI: https://doi.org/10.1007/978-3-642-28249-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28248-5
Online ISBN: 978-3-642-28249-2
eBook Packages: Computer Science (R0)