Corpus-Based Unit Selection TTS for Hungarian

  • Conference paper
Text, Speech and Dialogue (TSD 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4188)

Abstract

This paper gives an overview of the design and development of an experimental restricted-domain, corpus-based unit selection text-to-speech (TTS) system for Hungarian. The experimental system generates weather forecasts in Hungarian. A total of 5260 sentences were recorded, yielding a speech corpus of 11 hours of continuous speech. A Hungarian speech recognizer was applied to label speech sound boundaries, and word boundaries were also marked automatically. Unit selection follows a top-down hierarchical scheme that uses words and speech sounds as units. A simple prosody model is used, based on the relative position of words within a prosodic phrase. The quality of the system was compared to that of two earlier Hungarian TTS systems in a subjective listening test with 221 listeners. The experimental system scored 3.92 on a five-point mean opinion score (MOS) scale. The earlier unit concatenation TTS system scored 2.63, the formant synthesizer scored 1.24, and natural speech scored 4.86.
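The top-down scheme mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no cost functions or data structures, so the `Unit` class, the inventories, the `to_sounds` callback, and the position-distance cost below are all hypothetical. The sketch assumes that a whole-word unit is preferred when the corpus contains the word, that speech-sound units are the fallback, and that the only prosodic criterion is how closely a candidate's relative position in its original phrase matches the target position. Concatenation costs are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A recorded corpus unit (hypothetical structure, not from the paper)."""
    label: str        # word or speech-sound label
    position: float   # relative position in its source prosodic phrase, 0.0 to 1.0
    corpus_id: int    # illustrative index into the recorded corpus


def relative_position(index, phrase_len):
    """Map a word index to a relative position in [0.0, 1.0]."""
    if phrase_len <= 1:
        return 0.0
    return index / (phrase_len - 1)


def closest(candidates, target):
    """Pick the candidate whose recorded position is nearest the target."""
    return min(candidates, key=lambda u: abs(u.position - target))


def select_units(phrase, word_inventory, sound_inventory, to_sounds):
    """Top-down selection: try word units first, then fall back to sounds.

    phrase          -- list of words forming one prosodic phrase
    word_inventory  -- dict: word -> list of Unit
    sound_inventory -- dict: speech-sound label -> list of Unit
    to_sounds       -- assumed callback that splits a word into sound labels
    """
    selected = []
    for i, word in enumerate(phrase):
        target = relative_position(i, len(phrase))
        word_candidates = word_inventory.get(word, [])
        if word_candidates:
            # Top level: the whole word exists in the corpus.
            selected.append(closest(word_candidates, target))
            continue
        # Lower level: build the word from speech-sound units.
        for sound in to_sounds(word):
            sound_candidates = sound_inventory.get(sound, [])
            if not sound_candidates:
                raise KeyError(f"no unit for speech sound {sound!r}")
            selected.append(closest(sound_candidates, target))
    return selected
```

In this sketch, a phrase whose words are all present in the corpus is assembled from word units alone, and only out-of-corpus words are built from speech sounds. A real system would also need a join cost between neighbouring units, which is left out here.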

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fék, M., Pesti, P., Németh, G., Zainkó, C., Olaszy, G. (2006). Corpus-Based Unit Selection TTS for Hungarian. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science, vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_46

  • DOI: https://doi.org/10.1007/11846406_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-39090-9

  • Online ISBN: 978-3-540-39091-6

  • eBook Packages: Computer Science, Computer Science (R0)
