On the Quantitative and Qualitative Speech Changes of the Czech Radio Broadcasts News within Years 1969–2005

  • Michaela Kuchařová
  • Svatava Škodová
  • Ladislav Šeps
  • Václav Lábus
  • Jan Nouza
  • Marek Boháč
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8082)

Abstract

In this paper we present quantitative and qualitative characteristics of Czech Radio broadcast news during a period of significant political and social change in the Czech Republic (1969–2005). The research focuses mainly on quantitative features of speech that can be determined from the output of an automatic speech recognition system. We describe the archive transcription system used and selected characteristics of the macro- and micro-structure of the news broadcasts, namely: changes in the ratio of studio to out-of-studio speech; the distribution of speakers by gender and by role (moderators vs. guest speakers); changes in the use of signature tunes (including jingles); the approximate use of introductory and closing phrases specific to particular periods; changes in speech rate; average silence length; the ratio of coordinating to subordinating conjunctions; and the most frequent semantic words. The data sample consists of 6,580 hours of news broadcasting and 48,721,952 lexical words.
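
The abstract does not show how these measures are computed. As a rough illustration only, the following minimal sketch shows how three of them (speech rate, average silence length, and the conjunction ratio) could be derived from a recogniser's output, assuming that output is a list of words with start and end times in seconds. The `Word` type, the 0.2 s pause threshold, and the short conjunction lists are hypothetical choices made for this example, not the authors' method or data.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # start time in seconds
    end: float    # end time in seconds


# Illustrative, deliberately incomplete Czech conjunction lists (assumption).
COORDINATING = {"a", "i", "ale", "nebo", "ani"}
SUBORDINATING = {"že", "aby", "když", "protože", "pokud"}


def speech_rate_wpm(words):
    """Words per minute over the span from the first word's start to the last word's end."""
    if not words:
        return 0.0
    duration = words[-1].end - words[0].start
    return len(words) / duration * 60.0 if duration > 0 else 0.0


def mean_silence(words, min_gap=0.2):
    """Mean length of gaps between consecutive words that last at least min_gap seconds."""
    gaps = [nxt.start - cur.end for cur, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= min_gap]
    return sum(pauses) / len(pauses) if pauses else 0.0


def conjunction_ratio(words):
    """Coordinating-to-subordinating conjunction ratio, or None if no subordinating one occurs."""
    coord = sum(w.text.lower() in COORDINATING for w in words)
    subord = sum(w.text.lower() in SUBORDINATING for w in words)
    return coord / subord if subord else None


if __name__ == "__main__":
    sample = [
        Word("Zprávy", 0.0, 0.5),
        Word("a", 1.0, 1.1),
        Word("že", 1.1, 1.5),
        Word("ale", 2.0, 3.0),
    ]
    print(speech_rate_wpm(sample), mean_silence(sample), conjunction_ratio(sample))
```

In practice the pause threshold and the word lists would need to be chosen with care, since both directly affect the resulting figures.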

Keywords

audio archive processing; spoken formal speech; radio broadcast news; non-speech events; automatic speech recognition

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Michaela Kuchařová (1)
  • Svatava Škodová (2)
  • Ladislav Šeps (1)
  • Václav Lábus (2)
  • Jan Nouza (1)
  • Marek Boháč (1)

  1. Institute of Information Technology and Electronics, Technical University of Liberec, Liberec, Czech Republic
  2. Department of the Czech Language and Literature, Technical University of Liberec, Liberec, Czech Republic
