Towards an unrestricted domain TTS system for African tone languages

  • Moses E. Ekpenyong (Email author)
  • Eno-Abasi Urua
  • Dafydd Gibbon
Article

Abstract

In this paper we discuss the procedural problems, issues and challenges involved in developing a generic speech synthesizer for African tone languages. Our development methodology is based on the “MultiSyn” unit-selection approach supported by the Festival Text-To-Speech (TTS) toolkit, applied to Ibibio, a language of the Lower Cross subgroup of the (New) Benue-Congo family that is widely spoken in the southeastern region of Nigeria. We present, in chronological order, the infrastructural and linguistic problems and challenges identified within the Local Language Speech Technology Initiative (LLSTI) during the development process, from corpus preparation and refinement through to integration and synthesis. We provide solutions to most of these challenges and outline directions for further refinement. Evaluation of the initial prototype indicates that the synthesis system will be useful to non-literate communities and in a wide range of applications.

Keywords

TTS, HLT, Multi-unit selection, Concatenative synthesis, Terraced tone modeling

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Moses E. Ekpenyong (1), Email author
  • Eno-Abasi Urua (1)
  • Dafydd Gibbon (2)
  1. University of Uyo, Uyo, Nigeria
  2. Universität Bielefeld, Bielefeld, Germany
