Augmented Auditory Representation of e-Texts for Text-to-Speech Systems

  • Gerasimos Xydas
  • Georgios Kouroupetroglou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2166)


Emerging electronic text formats include hierarchical structure and visualization-related information that current Text-to-Speech (TtS) systems ignore. In this paper we present a novel approach for composing a detailed auditory representation of e-texts using speech and audio. Furthermore, we provide a scripting language (CAD scripts) for defining specific customizations of the operation of a TtS system. CAD scripts can also be assigned to specific text meta-data to enable their discrete auditory representation. This approach can form a means for a detailed exchange of functionality across different TtS implementations. Moreover, it can be hosted by current TtS systems with minor (or major) modifications. Finally, we briefly present the implementation of DEMOSTHeNES Composer for augmented auditory generation of meta-text using the above methodology.


Audio Signal; Speech Synthesis; Script Language; Auditory Representation; Speak Dialogue System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Gerasimos Xydas
    • 1
  • Georgios Kouroupetroglou
    • 1
  1. Department of Informatics and Telecommunications, Division of Communication and Signal Processing, University of Athens, Athens, Greece
