Linked Data in Linguistics

pp 15-23

Treating Dictionaries as a Linked-Data Corpus

  • Peter Bouda, Research Unit “Quantitative Language Comparison”, Ludwig Maximilians University
  • Michael Cysouw


In this paper we describe a practical approach to the challenge of linguistic retrodigitization. We propose to distinguish strictly between a base digitization and a separate interpretation of the sources. The base digitization includes only a literal electronic transcript of the source. All sources are thus treated simply as strings of characters, i.e. as unstructured corpora. The often complex structure found in many dictionaries and grammars is added subsequently (and possibly much later) as Linked Data in the form of standoff annotation. A further advantage of this approach is that the complete digitization and interpretation can be performed collaboratively without a complex organizational superstructure.
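
The separation between base digitization and interpretation can be illustrated with a minimal sketch. It is not the authors' implementation: the entry text, the `Annotation` class, the label names, and the `annotated_spans` helper are all hypothetical, chosen only to show how standoff annotation refers to character offsets in an unmodified transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A standoff annotation: it points into the base text, never alters it."""
    start: int  # inclusive character offset into the base text
    end: int    # exclusive character offset
    label: str  # hypothetical interpretation label, e.g. "headword"


def annotated_spans(base_text, annotations):
    """Resolve each annotation to the substring of the base text it covers."""
    return [(a.label, base_text[a.start:a.end]) for a in annotations]


# Base digitization: a literal transcript, treated as a plain string.
# (Invented dictionary entry, for illustration only.)
base_text = "casa n. house"

# Interpretation: added separately, possibly much later and by someone else.
annotations = [
    Annotation(0, 4, "headword"),
    Annotation(5, 7, "pos"),
    Annotation(8, 13, "translation"),
]

result = annotated_spans(base_text, annotations)
for label, text in result:
    print(f"{label}: {text}")
```

Because the annotations live outside the transcript, several competing interpretations of the same source can coexist without changing the base text. A Linked Data serialization would additionally give each source and annotation its own identifier; that step is omitted from this sketch.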