
A Model for Language Annotations on the Web

  • Conference paper
Knowledge Graphs and Semantic Web (KGSWC 2019)

Abstract

Several annotation models have been proposed to enable a multilingual Semantic Web. Such models home in on the word and its morphology and assume that the language tag and URI come from external resources. These resources, such as ISO 639 and Glottolog, have limited coverage of the world’s languages and at best a very limited thesaurus-like structure, which hampers language annotation and hence constrains research in Digital Humanities and other fields. To resolve this ‘outsourced’ task of the current models, we developed a model for representing information about languages, the Model for Language Annotation (MoLA), such that basic language information can be recorded consistently and thereby queried and analyzed as well. This includes the various types of languages, families, and the relations among them. MoLA is formalized in OWL so that it can integrate with Linguistic Linked Data resources. Sufficient coverage of MoLA is demonstrated with the use case of French.


Notes

  1. For the sake of brevity, namespaces are assumed defined the usual way.

  2. ISO 639 is the International Standard for language codes [1].

  3. https://glottolog.org, www.ethnologue.com, http://multitree.org [05-03-2019].

  4. https://glottolog.org/resource/languoid/id/sout3200 [22-02-2019].

  5. http://multitree.org/codes/xho [03-03-2019].

  6. http://id.loc.gov/authorities/subjects/sh85148822.rdf [03-03-2019].

  7. Language is constantly evolving. Influences by other languages due to cultural contact can result in lexical, phonetic and morphologic changes. The question when to characterize a language ‘a’ as ‘being influenced’ by a language ‘b’ depends on the granularity level of the analysis and is subject to discussion of linguists.

  8. As sub-languoids of ‘Oïl’ (varieties that use an adaptation of the Vulgar Latin term hoc ille “this (is) it” as ‘Yes’); note that Francoprovençalic is a non-Oïl language.

  9. Independent from diachronic issues, the hierarchy of modern French varieties also needs a revision (in line with “Most of the information on dialects in Glottolog [...] contains numerous errors and inconsistencies which we are aware of” [17]).

  10. http://aims.fao.org/vest-registry/vocabularies/agrovoc [22-02-2019].

  11. The term for the written representation of the spoken dialects of Old French [4, 206].

  12. I.e., Moselle and Rhine Franconian for which a thorough revision on Glottolog is advised as well, see https://glottolog.org/resource/languoid/id/fran1268 [24-02-2019].

References

  1. Language codes - ISO 639. https://www.iso.org/iso-639-language-codes.html

  2. Berners-Lee, T.: Linked Data (2009). https://www.w3.org/DesignIssues/LinkedData.html

  3. Berschin, H., Fernández-Sevilla, J., Felixberger, J.: Die spanische Sprache. Georg Olms Verlag, Hildesheim (2012)


  4. Berschin, H., Goebl, H.: Französische Sprachgeschichte. Georg Olms Verlag, Hildesheim (2008)


  5. Cardillo, E., Folino, A., Trunfio, R., Guarasci, R.: Towards the reuse of standardized thesauri into ontologies. In: Proceedings of WOP 2014. CEUR-WS, vol. 1302, pp. 26–37 (2014)


  6. Chavula, C., Keet, C.M.: An orchestration framework for linguistic task ontologies. In: Garoufallou, E., Hartley, R.J., Gaitanou, P. (eds.) MTSR 2015. CCIS, vol. 544, pp. 3–14. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24129-6_1


  7. Chiarcos, C., Sukhareva, M.: OLiA - ontologies of linguistic annotation. Semant. Web J. 6(4), 379–386 (2015)


  8. Cimiano, P., McCrae, J.P., Buitelaar, P.: Lexicon model for ontologies: community report. Final community group report, 10 May 2016, W3C (2016). https://www.w3.org/2016/05/ontolex/

  9. Coseriu, E.: ‘Historische Sprache’ und ‘Dialekt’. In: Göschel, J. (ed.) Dialekt und Dialektologie. Ergebnisse des Internationalen Symposions “Zur Theorie des Dialekts”. Marburg/Lahn, 5–10 September 1977, pp. 106–122. Franz Steiner Verlag (1980)


  10. Crystal, D.: The Cambridge Encyclopedia of Language. Cambridge University Press, Cambridge (2010)


  11. Dimitrova, V., Fäth, C., Chiarcos, C., Renner-Westermann, H., Abromeit, F.: Interoperability of language-related information: mapping the BLL Thesaurus to Lexvo and Glottolog. In: Proceedings of LREC 2018, pp. 4555–4561. ELRA, Miyazaki, 7–12 May 2018


  12. Farrar, S., Langendoen, D.T.: A linguistic ontology for the semantic web. In: GLOT International, vol. 7, no. 3, pp. 97–100 (2003)


  13. Gillis-Webber, F., Keet, C.M., Tittel, S.: A model for language annotations on the web: supplementary material (2019). https://ontology.londisizwe.org/mola/article/2019-kgswc-supplementary-material

  14. Gillis-Webber, F., Tittel, S.: The shortcomings of language tags for linked data when modelling lesser-known languages. In: Proceedings of LDK 2019, Leipzig, 20–23 May 2019 (2019)


  15. Guarino, N., Oberle, D., Staab, S.: What is an ontology? In: Staab, S., Studer, R. (eds.) Handbook on Ontologies, pp. 1–17. Springer, Heidelberg (2009)


  16. Halpin, T., Morgan, T.: Information Modeling and Relational Databases, 2nd edn. Morgan Kaufmann, Burlington (2008)


  17. Hammarström, H., Haspelmath, M., Forkel, R.: Glottolog 3.3. About languoids (2018). https://glottolog.org/glottolog/glottologinformation. Accessed 17 Feb 2019

  18. Hellmann, S., Stadler, C., Lehmann, J.: Linked data for linguistic diversity research: Glottolog/Langdoc and ASJP online. In: Chiarcos, C., Nordhoff, S., Hellmann, S. (eds.) Linked Data in Linguistics, pp. 191–200. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28249-2_18


  19. Keet, C.M.: An introduction to ontology engineering, Computing, vol. 20, p. 334. College Publications (2018)


  20. Kibbee, D.: For to Speke Frenche Trewely. John Benjamins Publishing Company, Amsterdam (1991)


  21. Klare, J.: Französische Sprachgeschichte. Klett, Stuttgart (1998)


  22. Kless, D., Jansen, L., Lindenthal, J., Wiebensohn, J.: A method for re-engineering a thesaurus into an ontology. In: Proceedings of FOIS 2012, pp. 133–146. IOS Press (2012)


  23. Masolo, C., Borgo, S., Gangemi, A., Guarino, N., Oltramari, A.: Ontology library. WonderWeb Deliverable D18 (version 1.0, 31-12-2003) (2003). http://wonderweb.semanticweb.org

  24. de Melo, G.: Lexvo.org: language-related information for the linguistic linked data cloud. Semant. Web 6(4), 393–400 (2015)


  25. Miles, A., Bechhofer, S.: SKOS simple knowledge organization system reference: W3C recommendation, 18 August 2009 (2009). https://www.w3.org/TR/2009/REC-skos-reference-20090818/. Accessed 17 Feb 2019

  26. Rickard, P.: A History of the French language. Hutchinson University Library, London (1974)


  27. Simperl, E., Mochol, M., Bürger, T.: Achieving maturity: the state of practice in ontology engineering in 2009. Int. J. Comput. Sci. Appl. 7(1), 45–65 (2010)


  28. Soergel, D., Lauser, B., Liang, A., Fisseha, F., Keizer, J., Katz, S.: Reengineering thesauri for new applications: the AGROVOC example. J. Digit. Inf. 4(4) (2004). http://journals.tdl.org/jodi/article/view/jodi-126/111

  29. Tittel, S., Chiarcos, C.: Historical lexicography of Old French and linked open data: transforming the resources of the Dictionnaire étymologique de l’ancien français with OntoLex-Lemon. In: Proceedings of LREC 2018. GLOBALEX Workshop 2018, Miyazaki, Japan, pp. 58–66. ELRA, Paris (2018)


  30. Vannini, L., Le Crosnier, H.: Net.lang. Towards the multilingual cyberspace. C & F Éditions (2012)



Author information


Corresponding authors

Correspondence to Frances Gillis-Webber, Sabine Tittel or C. Maria Keet.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Gillis-Webber, F., Tittel, S., Keet, C.M. (2019). A Model for Language Annotations on the Web. In: Villazón-Terrazas, B., Hidalgo-Delgado, Y. (eds) Knowledge Graphs and Semantic Web. KGSWC 2019. Communications in Computer and Information Science, vol 1029. Springer, Cham. https://doi.org/10.1007/978-3-030-21395-4_1


  • DOI: https://doi.org/10.1007/978-3-030-21395-4_1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-21394-7

  • Online ISBN: 978-3-030-21395-4

  • eBook Packages: Computer Science, Computer Science (R0)
