Towards a Unified Exploitation of Electronic Dialectal Corpora: Problems and Perspectives

  • Nikitas N. Karanikolas
  • Eleni Galiotou
  • Angela Ralli
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8655)


In this paper, we address the problem of storing and retrieving dialectal data in a unified framework. In particular, we discuss issues concerning the design and implementation of a multimedia database that will contain written and oral data from three Greek dialects in Asia Minor. First, we describe the overall architecture of a system that lets the user store audio recordings, text transcripts, and other annotations. We then discuss the possibilities and limitations of a retrieval module that combines different linguistic levels for a unified exploitation of oral and written corpora.


Keywords: Computational Dialectology, Electronic Corpora, Modern Greek Dialects, Multimedia Databases





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Nikitas N. Karanikolas (1)
  • Eleni Galiotou (1)
  • Angela Ralli (2)

  1. Department of Informatics, Technological Educational Institute of Athens, Aigaleo, Greece
  2. Department of Philology, University of Patras, Rio, Patras, Greece
