Integrating Lexical Resources Through an Aligned Lemma List

  • Axel Herold
  • Lothar Lemnitzer
  • Alexander Geyken


This paper presents the modelling of a common meta-index for the large modern and historical lexical resources of the DWDS project. Four dictionaries of the German language are part of DWDS: (1) eWDG2, a digital version of the Wörterbuch der deutschen Gegenwartssprache (WDG, 1962–1977); (2) DWDSWB, a new and extended edition of the WDG (started in 2010); (3) EtymWB, a digital version of Wolfgang Pfeifer’s Etymologisches Wörterbuch des Deutschen (1989); (4) 1DWB, a digital version of the first edition of Grimm’s Deutsches Wörterbuch (1854–1961). Because these resources follow different lexicographical principles and traditions and cover different historical periods, such a meta-index cannot be modelled as a simple list of 1:1 correspondences between entries across the dictionaries. Modelling phenomena such as graphematic headword variance, homography, semantic change, and differences in semantic entry structure requires a more complex, typed link structure.
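As an illustration only, the idea of a typed link structure can be sketched as a small data model. Everything in the sketch is an assumption made for the example and is not taken from the paper: the link type names are simply derived from the phenomena listed above, and the classes `EntryRef`, `Link`, and `MetaIndex`, as well as the sample headwords, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class LinkType(Enum):
    # Hypothetical link types, named after the phenomena in the abstract.
    SAME_LEMMA = "same_lemma"
    GRAPHEMATIC_VARIANT = "graphematic_variant"
    HOMOGRAPH = "homograph"
    SEMANTIC_CHANGE = "semantic_change"
    STRUCTURAL_DIFFERENCE = "structural_difference"


@dataclass(frozen=True)
class EntryRef:
    """Points at one entry in one dictionary."""
    dictionary: str        # e.g. "DWDSWB", "EtymWB", "1DWB"
    headword: str
    homograph_no: int = 0  # 0 means the headword carries no homograph number


@dataclass(frozen=True)
class Link:
    source: EntryRef
    target: EntryRef
    type: LinkType


class MetaIndex:
    """Stores typed links and lets either endpoint look them up."""

    def __init__(self):
        self._links = defaultdict(list)

    def add(self, source, target, link_type):
        link = Link(source, target, link_type)
        # Register the link under both endpoints.
        self._links[source].append(link)
        self._links[target].append(link)
        return link

    def links_for(self, entry, link_type=None):
        """Return all links touching `entry`, optionally filtered by type."""
        return [
            link for link in self._links.get(entry, [])
            if link_type is None or link.type is link_type
        ]


# Invented sample data, not actual index content.
index = MetaIndex()
modern = EntryRef("DWDSWB", "Tür")
historical = EntryRef("1DWB", "Thür")
bank_1 = EntryRef("DWDSWB", "Bank", homograph_no=1)
bank_2 = EntryRef("DWDSWB", "Bank", homograph_no=2)
bank_etym = EntryRef("EtymWB", "Bank")

index.add(modern, historical, LinkType.GRAPHEMATIC_VARIANT)
index.add(bank_etym, bank_1, LinkType.HOMOGRAPH)
index.add(bank_etym, bank_2, LinkType.HOMOGRAPH)
```

In this sketch a single entry can carry several links of different types, which is what a flat list of 1:1 correspondences cannot express: the hypothetical `bank_etym` entry points to two homograph entries at once.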


Keywords (machine-generated, not supplied by the authors): Resource Description Framework; Lexical Entry; German Language; Semantic Description; Lexical Resource




References

  1. Behrens L (2002) Structuring of word meaning II: Aspects of polysemy. In: Cruse DA, Hundsnurscher F, Job M, Lutzeier PR (eds) Lexikologie – Lexicology. Ein internationales Handbuch zur Natur und Struktur von Wörtern und Wortschätzen, vol 1, de Gruyter, Berlin, pp 319–337
  2. DWB (1854–1961) Deutsches Wörterbuch. Hirzel, Leipzig
  3. Dückert J (ed) (1987) Das Grimmsche Wörterbuch. Untersuchungen zur lexikographischen Methodologie. Hirzel, Leipzig
  4. Geyken A (2007) The DWDS corpus: A reference corpus for the German language of the twentieth century. In: Fellbaum C (ed) Idioms and collocations: Corpus-based linguistic and lexicographic studies, Research in corpus and discourse, Continuum, London, pp 23–40
  5. Geyken A, Didakowski J, Siebert A (2009) Generation of word profiles for large German corpora. In: Kawaguchi Y, Minegishi M, Durand J (eds) Corpus analysis and variation in linguistics, Studies in Linguistics, vol 1, John Benjamins, pp 141–157
  6. Herold A (2011) Retrodigitalisierung und Modellierung des Wörterbuchs der deutschen Gegenwartssprache. In: Krafft A, Spiegel C (eds) Sprachliche Förderung und Weiterbildung – transdisziplinär, no. 51 in Forum Angewandte Linguistik, Peter Lang, Frankfurt (M.), Berlin
  7. Jurish B (2010) More than words. Using token context to improve canonicalization of historical German. JLCL 25(1):23–40
  8. Kempcke G (2001) Polysemie oder Homonymie? Zur Praxis der Bedeutungsgliederung in den Wörterbuchartikeln synchronischer einsprachiger Wörterbücher der Deutschen Sprache. Lexicographica 17:61–68
  9. Klein W (2004) Vom Wörterbuch zum Digitalen Lexikalischen System. Zeitschrift für Literaturwissenschaft und Linguistik 136:10–55
  10. Klein W, Geyken A (2010) Das digitale Wörterbuch der deutschen Sprache (DWDS). Lexicographica 26:79–93
  11. Kunze C, Lemnitzer L (2010) Lexical-semantic and conceptual relations in GermaNet. In: Storjohann P (ed) Lexical-semantic relations: Theoretical and practical perspectives, no. 28 in Lingvisticæ Investigationes Supplementa, John Benjamins, Amsterdam, pp 163–183
  12. Pfeifer W (1989) Etymologisches Wörterbuch des Deutschen. Akademie-Verlag, Berlin
  13. Schmidt H (2004) Das Deutsche Wörterbuch. Gebrauchsanweisung. In: Bartz HW, Burch T, Christmann R, Gärtner K, Hildenbrandt V, Schares T, Wegge K (eds) Deutsches Wörterbuch. Elektronische Ausgabe der Erstbearbeitung von Jacob Grimm und Wilhelm Grimm, Zweitausendeins, Frankfurt (M.), pp 25–64
  14. Sokirko A (2003) DDC – a search engine for linguistically annotated corpora. In: Proceedings of Dialog 2003, Protvino (Russia)
  15. WDG (1962–1977) Wörterbuch der deutschen Gegenwartssprache. Akademie-Verlag, Berlin

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Berlin-Brandenburgische Akademie der Wissenschaften, Berlin, Germany
