Free Tools and Resources for HMM-Based Brazilian Portuguese Speech Synthesis

  • Conference paper
Advances in Artificial Intelligence - IBERAMIA 2018 (IBERAMIA 2018)

Abstract

Text-to-speech (TTS) is currently a mature technology used in many areas such as education and accessibility. Some modules of a TTS system are language dependent and, while many public resources exist for some languages (e.g., English and Japanese), those for Brazilian Portuguese (BP) are still limited. This work describes the development of a complete hidden Markov model (HMM) based TTS system for BP that can be used in desktop environments. It also releases a set of natural language processing tools for BP, which expands the publicly available resources and supports new research for academic or industrial purposes. Subjective and objective performance tests are presented, comparing the proposed TTS system with other software currently available for BP.

Notes

  1. The syllable is a unit that is relatively easy to identify and segment if the splitting rules stipulated by the language orthography are followed. However, as a phonological unit, there is no consensus about its basic structure, as discussed in [9]. For most authors, a syllable is defined so that its nucleus, canonically a vowel, constitutes a peak in the audibility curve that is preceded (onset) and/or followed (coda) by a sequence of segments (zero or more consonants) with progressively decreasing sonority values. The nucleus and coda are sometimes grouped together to form what is called the rhyme. Under these principles, the syllable is a speech unit of rhythmic organization, although other authors disagree, stating that the syllable should be seen not in parts but as a whole.
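The onset/nucleus/coda/rhyme decomposition described in the note can be illustrated with a minimal Python sketch. It is not the syllabification tool released with the paper: the function name `split_syllable` is hypothetical, the input is assumed to be a single syllable that has already been separated, and the split is done on orthography (the first run of vowel letters is taken as the nucleus), so glides and digraphs such as "qu" are not handled.

```python
import unicodedata

# Assumption: vowel letters are identified after removing accents.
VOWELS = set("aeiou")


def strip_accents(ch: str) -> str:
    """Return the character without combining marks (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def split_syllable(syllable: str) -> dict:
    """Split one orthographic syllable into onset, nucleus, coda and rhyme.

    Simplification: the nucleus is the first run of vowel letters,
    the onset is everything before it and the coda everything after it.
    """
    syllable = unicodedata.normalize("NFC", syllable)
    is_vowel = [strip_accents(ch).lower() in VOWELS for ch in syllable]
    if True not in is_vowel:
        raise ValueError(f"no vowel nucleus found in {syllable!r}")

    start = is_vowel.index(True)
    end = start
    while end < len(syllable) and is_vowel[end]:
        end += 1

    onset, nucleus, coda = syllable[:start], syllable[start:end], syllable[end:]
    # The rhyme groups the nucleus and the coda.
    return {"onset": onset, "nucleus": nucleus, "coda": coda, "rhyme": nucleus + coda}


if __name__ == "__main__":
    # "trans" is the first syllable of "trans-por-te".
    print(split_syllable("trans"))
```

For the syllable "trans", this sketch yields the onset "tr", the nucleus "a", the coda "ns" and the rhyme "ans". A phonologically faithful version would operate on phones rather than letters.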

References

  1. Dicionário Online de Português. (2018). http://www.dicio.com.br/

  2. Grupo falabrasil (2018). https://goo.gl/EWcfdg

  3. HTS (2018). http://hts.sp.nitech.ac.jp/

  4. HTS Engine (2018). http://hts-engine.sourceforge.net/

  5. Alcaim, A., Solewicz, J.A., de Morais, J.A.: Frequência de ocorrência dos fones e listas de frases foneticamente balanceadas para o português falado no Rio de Janeiro. Revista da Sociedade Brasileira de Telecomunicações 7(1), 23–41 (1992)

  6. Braga, D., Coelho, L., Resende Jr., F.G.V.: A rule-based grapheme-to-phone converter for TTS systems in European Portuguese, pp. 141–156 (2007)

  7. Braga, D., Silva, P., Ribeiro, M., Dias, M.S., Campillo, F., García-Mateo, C.: Hélia, Heloisa and Helena: new HTS systems in European Portuguese, Brazilian Portuguese and Galician. In: International Conference on Computational Processing of the Portuguese Language, PROPOR 2010 (2010)

  8. Cirigliano, R.J.R., Monteiro, C., Barbosa, F.L., Resende Jr., F.G.V.R., Couto, L.R., de Morais, J.A.: Um conjunto de 1000 frases foneticamente balanceadas para o português brasileiro obtido utilizando a abordagem de algoritmos genéticos. Anais do Simpósio Brasileiro de Telecomunicações (SBrT) (2005)

  9. Collischonn, G.: Introdução a Estudos de Fonologia do Português Brasileiro. Porto Alegre: EDIPUCRS, pp. 95–126 (2005)

  10. Costa, E., Monte, A., Neto, N., Klautau, A.: Um Framework para Desenvolvimento de Sistemas TTS Personalizados no Português do Brasil. In: XXX Simpósio Brasileiro de Telecomunicações (2012)

  11. Couto, I., Neto, N., Tadaiesky, V., Klautau, A., Maia, R.: An open source HMM-based text-to-speech system for Brazilian Portuguese. In: 7th International Telecommunications Symposium (2010)

  12. Dutoit, T., Pagel, V., Pierret, N., Bataille, F., Vrecken, O.V.D.: The MBROLA project: towards a set of high-quality speech synthesizers free of use for non-commercial purposes. In: Proceedings of ICSLP 1996, Philadelphia, vol. 3, pp. 1393–1396 (1996)

  13. Faria, A.: Applied Phonetics: Portuguese Text-to-Speech. Technical report, University of California (2003)

  14. Maciel, A., Carvalho, E.: Integration and evaluation of an HMM-based text-to-speech system to FIVE. In: 19th International Conference on Systems, Signals and Image Processing, IWSSIP 2012 (2012)

  15. Maia, R., Zen, H., Tokuda, K., Kitamura, T., Resende, F.: An HMM-based Brazilian Portuguese speech synthetiser and its characteristics. J. Commun. Inf. Syst. 21, 58–71 (2006)

  16. Monte, A., Ribeiro, D., Neto, N., Cruz, R., Klautau, A.: A rule-based syllabification algorithm with stress determination for Brazilian Portuguese natural language processing. In: 17th International Congress of Phonetic Sciences, pp. 1418–1421 (2011)

  17. Barbosa, P., et al.: Aiuruete: a high-quality concatenative text-to-speech system for Brazilian Portuguese with demisyllabic analysis-based units and hierarchical model of rhythm production. In: Proceedings of the Eurospeech 1999, pp. 2059–2062 (1999)

  18. Schröder, M., Trouvain, J.: The German text-to-speech synthesis system MARY: a tool for research, development and teaching. Int. J. Speech Technol. 6, 365–377 (2001)

  19. Silva, D., de Lima, A., Maia, R., Braga, D., de Moraes, J.F., de Moraes, J.A., Resende Jr., F.G.: A rule-based grapheme-phone converter and stress determination for Brazilian Portuguese natural language processing. In: VI International Telecommunications Symposium (2006)

  20. Silva, D.C., Braga, D., Resende Jr., F.G.V.: Separação das Sílabas e Determinação da Tonicidade no Português Brasileiro. In: XXVI Simpósio Brasileiro de Telecomunicações, SBrT 2008 (2008)

  21. Siravenha, A., Neto, N., Macedo, V., Klautau, A.: Uso de Regras Fonológicas com Determinação de Vogal Tônica para Conversão Grafema-Fone em Português Brasileiro. In: 7th International Information and Telecommunication Technologies Symposium (2008)

  22. Souza, D., Saturnino, L., Maciel, A.: A portability evaluation of Brazilian Portuguese voice produced with MARY TTS. In: 2014 International Conference on Systems, Signals and Image Processing (IWSSIP) (2014)

  23. Taylor, P.: Text-To-Speech Synthesis. Cambridge University Press, Cambridge (2009)

  24. Turunen, M.: Speech application design and development. Technical report (2004)

  25. Yoshimura, T., Tokuda, K., Masuko, T., Kobayashi, T., Kitamura, T.: Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis. In: Proceedings of EUROSPEECH, vol. 5, no. 98, pp. 2347–2350 (1999)

  26. Zen, H., Tokuda, K., Black, A.W.: Statistical parametric speech synthesis. Speech Commun. 51(11), 1039–1064 (2009)


Author information

Correspondence to Ericson Costa or Nelson Neto.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Costa, E., Neto, N. (2018). Free Tools and Resources for HMM-Based Brazilian Portuguese Speech Synthesis. In: Simari, G., Fermé, E., Gutiérrez Segura, F., Rodríguez Melquiades, J. (eds) Advances in Artificial Intelligence - IBERAMIA 2018. IBERAMIA 2018. Lecture Notes in Computer Science, vol 11238. Springer, Cham. https://doi.org/10.1007/978-3-030-03928-8_30

  • DOI: https://doi.org/10.1007/978-3-030-03928-8_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03927-1

  • Online ISBN: 978-3-030-03928-8

  • eBook Packages: Computer Science, Computer Science (R0)
