A Framework for Language-Independent Analysis and Prosodic Feature Annotation of Text Corpora

  • Dimitris Spiliotopoulos
  • Georgios Petasis
  • Georgios Kouroupetroglou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5246)

Abstract

Concept-to-Speech systems include Natural Language Generators that produce linguistically enriched text descriptions, which can significantly improve the quality of speech synthesis. There are cases, however, where the generator modules produce pieces of non-analyzed, non-annotated plain text, or where such modules are not available at all. Moreover, language analysis is restricted by the usually limited domain coverage of the generator, a consequence of its embedded grammar. This work reports on a language-independent framework, together with the linguistic resources and language analysis procedures (word/sentence identification, part-of-speech tagging, prosodic feature annotation), for annotating and processing plain or enriched text corpora. It aims to produce automated, XML-annotated, enriched prosodic markup for English and Greek texts, for improved synthetic speech. The markup includes information both for training the synthesizer and for use as actual synthesis input. Depending on the domain and target, different methods may be used for automatic classification of entities (words, phrases, sentences) into one or more preset categories such as “emphatic event”, “new/old information”, “second argument to verb”, “proper noun phrase”, etc. The prosodic features are classified according to an analysis of the speech-specific characteristics relevant to their role in prosody modelling, and are passed to the synthesizer via an extended SOLE-ML description. Evaluation results show that selectable hybrid methods for part-of-speech tagging achieve high accuracy. Annotation of a large generated text corpus containing 50% enriched text and 50% canned plain text produces a fully annotated, uniform SOLE-ML output containing all prosodic features found in the initial enriched source. Furthermore, additional automatically derived prosodic feature annotations and speech-synthesis-related values are assigned, such as word placement in sentences and phrases, previous and next word entity relations, emphatic phrases containing proper nouns, and more.
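
To make the automatically derived features mentioned in the abstract more concrete, the following is a minimal Python sketch of how word placement and previous/next word relations might be attached to plain text as XML. It is an illustration only, not the authors' implementation: the element and attribute names (document, sentence, word, placement, prev, next) and the functions placement_label and annotate are hypothetical and do not reproduce the actual SOLE-ML schema, and the sentence splitter is a naive rule that assumes sentences end in ".", "!" or "?".

```python
import re
import xml.etree.ElementTree as ET


def placement_label(index, count):
    """Describe where a word sits in its sentence (hypothetical labels)."""
    if count == 1:
        return "only"
    if index == 0:
        return "initial"
    if index == count - 1:
        return "final"
    return "medial"


def annotate(text):
    """Build an XML tree with per-word placement and neighbour attributes.

    The tag and attribute names are assumptions made for this sketch;
    they are not taken from SOLE-ML.
    """
    root = ET.Element("document")
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    for s_idx, sentence in enumerate(sentences, start=1):
        s_el = ET.SubElement(root, "sentence", id=str(s_idx))
        # \w+ matches Unicode word characters, so Greek text is handled too.
        words = re.findall(r"\w+", sentence)
        for w_idx, word in enumerate(words):
            attrs = {
                "position": str(w_idx + 1),
                "placement": placement_label(w_idx, len(words)),
                "prev": words[w_idx - 1] if w_idx > 0 else "",
                "next": words[w_idx + 1] if w_idx < len(words) - 1 else "",
            }
            w_el = ET.SubElement(s_el, "word", attrs)
            w_el.text = word
    return root


if __name__ == "__main__":
    tree = annotate("The museum opens. Visitors arrive early!")
    print(ET.tostring(tree, encoding="unicode"))
```

Because the features are derived from token order alone, the same function can be applied to English and Greek input without language-specific resources; part-of-speech tags and the preset categories listed in the abstract would require the additional analysis steps the paper describes.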

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Dimitris Spiliotopoulos (1)
  • Georgios Petasis (1)
  • Georgios Kouroupetroglou (1)
  1. Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens, Greece
