Evaluation of the Slovenian HMM-Based Speech Synthesis System

  • Boštjan Vesnicer
  • France Mihelič
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3206)


A new HMM-based speech synthesis system for the Slovenian language is presented. The quality of the synthesized speech was assessed with subjective and objective tests. The results show that the new system outperforms our previously developed diphone-based waveform concatenation synthesizer in terms of naturalness and general impression.


Keywords: Speech Signal, Automatic Speech Recognition, Speech Synthesis, Pitch Contour, Natural Speech





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Boštjan Vesnicer (1)
  • France Mihelič (1)
  1. Faculty of Electrical Engineering, Laboratory of Artificial Perception, Systems and Cybernetics, University of Ljubljana, Ljubljana, Slovenia
