LLODifying Linguistic Glosses

  • Christian Chiarcos
  • Maxim Ionov
  • Monika Rind-Pawlowski
  • Christian Fäth
  • Jesse Wichers Schreur
  • Irina Nevskaya
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10318)

Abstract

Interlinear glossed text (IGT) is a notation used in various fields of linguistics to help readers understand linguistic phenomena. We describe the representation of IGT data in RDF, their conversion from two popular tools, and their automated linking with resources from the Linguistic Linked Open Data (LLOD) cloud. We argue that such an LLOD edition of IGT data facilitates their reusability, their infrastructural support, and their integration with external data sources.

Our converters are available under an open source license; two data sets will be published along with the final version of this paper. To the best of our knowledge, this is the first attempt to publish IGT data sets as Linguistic Linked Open Data.
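To make the idea of an RDF representation of IGT concrete, the following is a minimal sketch, not the data model or converter described in this paper. It serializes one toy glossed sentence as N-Triples using only the Python standard library. The namespace `http://example.org/igt#`, the class and property names (`Sentence`, `hasMorph`, `form`, `gloss`, `position`, `translation`), and the function `igt_to_ntriples` are all hypothetical and chosen for illustration only.

```python
"""Toy serialization of one interlinear glossed sentence as N-Triples."""

# Hypothetical namespace and vocabulary, assumed for this sketch only;
# it is not the vocabulary used by the converters described in the paper.
EX = "http://example.org/igt#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def iri(local_name):
    """Wrap a local name in the hypothetical namespace as an N-Triples IRI."""
    return f"<{EX}{local_name}>"


def literal(text):
    """Render a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def igt_to_ntriples(sentence_id, morphs, glosses, translation):
    """Return N-Triples lines for one sentence with aligned morphs and glosses."""
    if len(morphs) != len(glosses):
        raise ValueError("each morph needs exactly one gloss")
    sentence = iri(sentence_id)
    triples = [
        f"{sentence} <{RDF_TYPE}> {iri('Sentence')} .",
        f"{sentence} {iri('translation')} {literal(translation)} .",
    ]
    # One node per morph, linked to the sentence and carrying form, gloss,
    # and a 1-based position so the original order can be recovered.
    for position, (form, gloss) in enumerate(zip(morphs, glosses), start=1):
        morph = iri(f"{sentence_id}_m{position}")
        triples.append(f"{sentence} {iri('hasMorph')} {morph} .")
        triples.append(f"{morph} {iri('form')} {literal(form)} .")
        triples.append(f"{morph} {iri('gloss')} {literal(gloss)} .")
        triples.append(
            f'{morph} {iri("position")} "{position}"^^<{XSD_INTEGER}> .'
        )
    return triples


if __name__ == "__main__":
    # English toy example: "dog-s bark", glossed "dog-PL bark".
    for line in igt_to_ntriples(
        "s1", ["dog", "-s", "bark"], ["dog", "PL", "bark"], "Dogs bark."
    ):
        print(line)
```

In a sketch like this, linking to LLOD resources would amount to replacing the plain-string gloss label with an IRI from an external vocabulary; which vocabulary and which IRIs to use is left open here.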

Keywords

Linguistic Linked Open Data (LLOD); Interlinear Glossed Text (IGT); Empirical linguistics; Data modeling


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Goethe-Universität Frankfurt am Main, Frankfurt, Germany
