Synthesising Expressive Speech – Which Synthesiser for VOCAs?
For people with complex communication needs who depend on Voice Output Communication Aids (VOCAs), speech synthesisers that convey not only the words of a sentence but also emotions would be a great enrichment. Emotional expression is an essential and natural part of interpersonal spoken communication. We are therefore interested in how expressive speech synthesisers are and how listeners perceive that expressiveness. We present the results of a study in which 82 participants listened to synthesised sentences with different emotional contours, produced by three synthesisers. Participants' ratings of expressiveness and naturalness indicate that the synthesiser CereVoice performs better than the other two synthesisers.
Keywords: Complex Communication Needs · Voice Output Communication Aid · Expressive Speech Synthesis · Online survey
The work presented here is partially supported by ‘PROMI - Promotion inklusive’ and the employment centre. We thank the students, Lena Tikovsky and Ewald Heinz, for their contribution to this work.