The Bonn Open Synthesis System 3

International Journal of Speech Technology

Abstract

The Bonn Open Synthesis System (BOSS) is an open-source software distribution for unit selection speech synthesis that aims to be easily extensible to new target languages and different applications. To achieve this flexibility, many aspects of the software have been changed in recent years, including the addition of a refined interface to synthesis modules and a stricter separation of language-specific and language-independent code. This article gives an overview of the architecture from a technical perspective and explains how it can be adapted to a particular purpose and voice. The overview is preceded by a short introduction to the unit selection paradigm in general and a section on the specifics of the approach taken by BOSS. A particular focus is placed on the extensions made for the integration of Polish, during which some of these changes were made. Further information on the application to Polish, with an emphasis on the linguistic, phonetic and acoustic aspects as well as the speech corpus used, can be found in the second part of this two-part article, “Polish unit selection speech synthesis with BOSS”, also published in this issue of the journal.

Author information

Corresponding author

Correspondence to Stefan Breuer.

Additional information

S. Breuer now with Phonetics Arts Ltd., Cambridge, UK.

About this article

Cite this article

Breuer, S., Hess, W. The Bonn Open Synthesis System 3. Int J Speech Technol 13, 75–84 (2010). https://doi.org/10.1007/s10772-010-9072-2

  • DOI: https://doi.org/10.1007/s10772-010-9072-2
