Abstract
Several annotation models have been proposed to enable a multilingual Semantic Web. Such models home in on the word and its morphology and assume that the language tag and URI come from external resources. These resources, such as ISO 639 and Glottolog, have limited coverage of the world’s languages and at best a very limited thesaurus-like structure, which hampers language annotation and hence constrains research in Digital Humanities and other fields. To resolve this ‘outsourced’ task of the current models, we developed a model for representing information about languages, the Model for Language Annotation (MoLA), such that basic language information can be recorded consistently and thereby queried and analyzed as well. This includes the various types of languages, families, and the relations among them. MoLA is formalized in OWL so that it can integrate with Linguistic Linked Data resources. Sufficient coverage of MoLA is demonstrated with the use case of French.
Notes
1. For the sake of brevity, namespaces are assumed to be defined in the usual way.
2. ISO 639 is the International Standard for language codes [1].
3. https://glottolog.org, www.ethnologue.com, http://multitree.org [05-03-2019].
4. https://glottolog.org/resource/languoid/id/sout3200 [22-02-2019].
5. http://multitree.org/codes/xho [03-03-2019].
6. http://id.loc.gov/authorities/subjects/sh85148822.rdf [03-03-2019].
7. Language is constantly evolving. Influence from other languages due to cultural contact can result in lexical, phonetic, and morphological changes. The question of when to characterize a language ‘a’ as ‘being influenced’ by a language ‘b’ depends on the granularity of the analysis and is subject to discussion among linguists.
8. As sub-languoids of ‘Oïl’ (varieties that use an adaptation of the Vulgar Latin term hoc ille “this (is) it” as ‘Yes’); note that Francoprovençalic is a non-Oïl language.
9. Independent of diachronic issues, the hierarchy of modern French varieties also needs revision (in line with “Most of the information on dialects in Glottolog [...] contains numerous errors and inconsistencies which we are aware of” [17]).
10. http://aims.fao.org/vest-registry/vocabularies/agrovoc [22-02-2019].
11. The term for the written representation of the spoken dialects of Old French [4, 206].
12. I.e., Moselle and Rhine Franconian, for which a thorough revision on Glottolog is advised as well; see https://glottolog.org/resource/languoid/id/fran1268 [24-02-2019].
References
1. Language codes - ISO 639. https://www.iso.org/iso-639-language-codes.html
2. Berners-Lee, T.: Linked Data (2009). https://www.w3.org/DesignIssues/LinkedData.html
3. Berschin, H., Fernández-Sevilla, J., Felixberger, J.: Die spanische Sprache. Georg Olms Verlag, Hildesheim (2012)
4. Berschin, H., Goebl, H.: Französische Sprachgeschichte. Georg Olms Verlag, Hildesheim (2008)
5. Cardillo, E., Folino, A., Trunfio, R., Guarasci, R.: Towards the reuse of standardized thesauri into ontologies. In: Proceedings of WOP 2014. CEUR-WS, vol. 1302, pp. 26–37 (2014)
6. Chavula, C., Keet, C.M.: An orchestration framework for linguistic task ontologies. In: Garoufallou, E., Hartley, R.J., Gaitanou, P. (eds.) MTSR 2015. CCIS, vol. 544, pp. 3–14. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24129-6_1
7. Chiarcos, C., Sukhareva, M.: OLiA - ontologies of linguistic annotation. Semant. Web J. 6(4), 379–386 (2015)
8. Cimiano, P., McCrae, J.P., Buitelaar, P.: Lexicon model for ontologies: community report. Final community group report, 10 May 2016, W3C (2016). https://www.w3.org/2016/05/ontolex/
9. Coseriu, E.: ‘Historische Sprache’ und ‘Dialekt’. In: Göschel, J. (ed.) Dialekt und Dialektologie. Ergebnisse des Internationalen Symposions “Zur Theorie des Dialekts”. Marburg/Lahn, 5–10 September 1977, pp. 106–122. Franz Steiner Verlag (1980)
10. Crystal, D.: The Cambridge Encyclopedia of Language. Cambridge University Press, Cambridge (2010)
11. Dimitrova, V., Fäth, C., Chiarcos, C., Renner-Westermann, H., Abromeit, F.: Interoperability of language-related information: mapping the BLL Thesaurus to Lexvo and Glottolog. In: Proceedings of LREC 2018, pp. 4555–4561. ELRA, Miyazaki, 7–12 May 2018
12. Farrar, S., Langendoen, D.T.: A linguistic ontology for the semantic web. In: GLOT International, vol. 7, no. 3, pp. 97–100 (2003)
13. Gillis-Webber, F., Keet, C.M., Tittel, S.: A model for language annotations on the web: supplementary material (2019). https://ontology.londisizwe.org/mola/article/2019-kgswc-supplementary-material
14. Gillis-Webber, F., Tittel, S.: The shortcomings of language tags for linked data when modelling lesser-known languages. In: Proceedings of LDK 2019, Leipzig, 20–23 May 2019 (2019)
15. Guarino, N., Oberle, D., Staab, S.: What is an ontology? In: Staab, S., Studer, R. (eds.) Handbook on Ontologies, pp. 1–17. Springer, Heidelberg (2009)
16. Halpin, T., Morgan, T.: Information Modeling and Relational Databases, 2nd edn. Morgan Kaufmann, Burlington (2008)
17. Hammarström, H., Haspelmath, M., Forkel, R.: Glottolog 3.3. About languoids (2018). https://glottolog.org/glottolog/glottologinformation. Accessed 17 Feb 2019
18. Hellmann, S., Stadler, C., Lehmann, J.: Linked data for linguistic diversity research: Glottolog/Langdoc and ASJP online. In: Chiarcos, C., Nordhoff, S., Hellmann, S. (eds.) Linked Data in Linguistics, pp. 191–200. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-28249-2_18
19. Keet, C.M.: An introduction to ontology engineering, Computing, vol. 20, p. 334. College Publications (2018)
20. Kibbee, D.: For to Speke Frenche Trewely. John Benjamins Publishing Company, Amsterdam (1991)
21. Klare, J.: Französische Sprachgeschichte. Klett, Stuttgart (1998)
22. Kless, D., Jansen, L., Lindenthal, J., Wiebensohn, J.: A method for re-engineering a thesaurus into an ontology. In: Proceedings of FOIS 2012, pp. 133–146. IOS Press (2012)
23. Masolo, C., Borgo, S., Gangemi, A., Guarino, N., Oltramari, A.: Ontology library. WonderWeb Deliverable D18 (version 1.0, 31-12-2003) (2003). http://wonderweb.semanticweb.org
24. de Melo, G.: Lexvo.org: language-related information for the linguistic linked data cloud. Semant. Web 6(4), 393–400 (2015)
25. Miles, A., Bechhofer, S.: SKOS simple knowledge organization system reference: W3C recommendation, 18 August 2009 (2009). https://www.w3.org/TR/2009/REC-skos-reference-20090818/. Accessed 17 Feb 2019
26. Rickard, P.: A History of the French Language. Hutchinson University Library, London (1974)
27. Simperl, E., Mochol, M., Bürger, T.: Achieving maturity: the state of practice in ontology engineering in 2009. Int. J. Comput. Sci. Appl. 7(1), 45–65 (2010)
28. Soergel, D., Lauser, B., Liang, A., Fisseha, F., Keizer, J., Katz, S.: Reengineering thesauri for new applications: the AGROVOC example. J. Digit. Inf. 4(4) (2004). http://journals.tdl.org/jodi/article/view/jodi-126/111
29. Tittel, S., Chiarcos, C.: Historical lexicography of Old French and linked open data: transforming the resources of the Dictionnaire étymologique de l’ancien français with OntoLex-Lemon. In: Proceedings of LREC 2018. GLOBALEX Workshop 2018, Miyazaki, Japan, pp. 58–66. ELRA, Paris (2018)
30. Vannini, L., Le Crosnier, H.: Net.lang. Towards the multilingual cyberspace. C & F Éditions (2012)
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Gillis-Webber, F., Tittel, S., Keet, C.M. (2019). A Model for Language Annotations on the Web. In: Villazón-Terrazas, B., Hidalgo-Delgado, Y. (eds) Knowledge Graphs and Semantic Web. KGSWC 2019. Communications in Computer and Information Science, vol 1029. Springer, Cham. https://doi.org/10.1007/978-3-030-21395-4_1
DOI: https://doi.org/10.1007/978-3-030-21395-4_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21394-7
Online ISBN: 978-3-030-21395-4
eBook Packages: Computer Science, Computer Science (R0)