Abstract
We propose using statistical methods to predict the positions and durations of prosodic breaks in a Russian TTS system, with the aim of improving on a baseline rule-based system. The paper reports experiments with CART and Random Forest (RF) classifiers. We used CART to predict break durations inside and between sentences, and compared CART and RF for predicting break positions inside sentences. Both classifiers improve on the baseline system in predicting break positions, with RF showing the best results. We also observe good results in experiments on predicting break durations. To increase the naturalness of synthesized speech, we incorporated probability-based break durations into a working Russian TTS system. We also built an experimental system with probability-based break placement in sentence parts without punctuation marks, which was rated higher than the baseline system in a pilot listening experiment.
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Chistikov, P., Khomitsevich, O. (2013). Improving Prosodic Break Detection in a Russian TTS System. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_24
DOI: https://doi.org/10.1007/978-3-319-01931-4_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01930-7
Online ISBN: 978-3-319-01931-4
eBook Packages: Computer Science, Computer Science (R0)