Time Durations of Phonemes in Polish Language for Speech and Speaker Recognition

  • Bartosz Ziółko
  • Mariusz Ziółko
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6562)


Statistical phonetic data on phoneme durations in Polish were collected from a corpus of spoken Polish. Phonemes differ in length, ranging from 30 ms to 200 ms; average phoneme durations are presented. The statistics reflect real language use and were evaluated for application in automatic speech recognition and speaker identification systems. Such duration patterns could be used in phoneme parametrisation and modelling, and they provide an additional source of information for speech segmentation. The paper presents the collected data (average values over all available male speakers and for some selected speakers), along with comments on the corpus and the method used. The data obtained were compared with the values expected from the phonetic literature.
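
As a rough illustration of the kind of statistic described above, the following minimal sketch computes average phoneme durations from time-aligned segments. It is not the authors' method. The function name `average_durations_ms`, the `(phoneme, start_ms, end_ms)` input format, and the toy segment values are all assumptions made for illustration and are not taken from the paper or its corpus.

```python
from collections import defaultdict


def average_durations_ms(segments):
    """Return the mean duration in milliseconds for each phoneme label.

    `segments` is an iterable of (phoneme, start_ms, end_ms) tuples.
    This input format is a hypothetical one chosen for the example.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for phoneme, start_ms, end_ms in segments:
        if end_ms <= start_ms:
            raise ValueError(f"non-positive duration for {phoneme!r}")
        totals[phoneme] += end_ms - start_ms
        counts[phoneme] += 1
    return {phoneme: totals[phoneme] / counts[phoneme] for phoneme in totals}


# Invented toy segments, NOT data from the paper.
toy_segments = [
    ("a", 0, 80),
    ("t", 80, 120),
    ("a", 120, 220),
]

averages = average_durations_ms(toy_segments)
for phoneme, mean_ms in sorted(averages.items()):
    print(f"{phoneme}: {mean_ms:.1f} ms")
```

Per-speaker averages could be obtained the same way by grouping the segments by speaker before calling the function.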


Speech Recognition; Discrete Wavelet Transform; Automatic Speech Recognition; Speaker Recognition; Speech Recognition System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Bartosz Ziółko (1)
  • Mariusz Ziółko (1)
  1. Department of Electronics, AGH University of Science and Technology, Kraków, Poland
