Development of a Large Spontaneous Speech Database of Agglutinative Hungarian Language

  • Tilda Neuberger
  • Dorottya Gyarmathy
  • Tekla Etelka Gráczi
  • Viktória Horváth
  • Mária Gósy
  • András Beke
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8655)


This paper introduces BEA, a large Hungarian spoken language database. This phonetically based, multi-purpose database contains various types of spontaneous and read speech from 333 monolingual speakers (about 50 minutes of speech per speaker). The study presents the background and motivation for developing the BEA Hungarian database, describes its recording protocol and transcription procedure, and reviews existing and proposed research that uses it. Owing to its recording protocol and transcription, the database provides challenging material for comparing the segmental structure of speech, including across languages.


Keywords: database, spontaneous speech, multi-level annotation





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Tilda Neuberger (1)
  • Dorottya Gyarmathy (1)
  • Tekla Etelka Gráczi (1)
  • Viktória Horváth (1)
  • Mária Gósy (1)
  • András Beke (1)
  1. Department of Phonetics, Research Institute for Linguistics of the Hungarian Academy of Sciences, Budapest, Hungary
