Training Accent and Phrasing Assignment on Large Corpora
This chapter describes techniques for automatically acquiring intonational phrasing rules and prominence assignment rules for text-to-speech synthesis from labeled corpora or from annotated text, and reports an evaluation of these procedures for Standard American English. The procedures employ decision trees generated automatically using classification and regression tree (CART) machine learning techniques, trained either on audio corpora labeled for pitch accent and phrase boundary location or on text corpora labeled by native speakers with the likely locations of intonational features. Both types of corpus serve as training material, together with information about the text obtained through simple text analysis, to produce decision trees, which in turn predict accent and phrasing decisions for text-to-speech. Rules generated by these methods achieve more than 95% accuracy for phrasing decisions and 85% for prominence assignment.
Keywords: Word Class, Speech Corpus, Pitch Accent, Phrase Boundary, Global Focus
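To make the decision-tree idea in the abstract concrete, here is a minimal sketch of a CART-style classifier that predicts whether a phrase boundary follows a word. It is an illustration under stated assumptions, not the chapter's implementation: the feature names (`punct`, `left_pos`, `right_pos`), the eight training rows, and the labels are all hypothetical, and the tree grower is a simplified variant (Gini impurity, equality tests on categorical features, no pruning).

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, value) equality test that most reduces the
    weighted Gini impurity, or None if no test improves on the parent."""
    best, best_score = None, gini(labels)
    for feat in rows[0]:
        for val in sorted({r[feat] for r in rows}):
            yes = [l for r, l in zip(rows, labels) if r[feat] == val]
            no = [l for r, l in zip(rows, labels) if r[feat] != val]
            if not yes or not no:
                continue  # the test does not separate anything
            score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (feat, val), score
    return best

def grow(rows, labels):
    """Grow a tree: a leaf is a label, an inner node is
    (feature, value, subtree_if_equal, subtree_if_not_equal)."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # majority-class leaf
    feat, val = split
    yes = [(r, l) for r, l in zip(rows, labels) if r[feat] == val]
    no = [(r, l) for r, l in zip(rows, labels) if r[feat] != val]
    return (feat, val,
            grow([r for r, _ in yes], [l for _, l in yes]),
            grow([r for r, _ in no], [l for _, l in no]))

def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        feat, val, if_equal, if_not = tree
        tree = if_equal if row[feat] == val else if_not
    return tree

# Hypothetical toy data: one row per word juncture, described by the
# punctuation at the juncture and the word classes on either side.
FIELDS = ("punct", "left_pos", "right_pos")
RAW = [
    ("comma",  "noun", "conj", "boundary"),
    ("none",   "det",  "noun", "no_boundary"),
    ("none",   "adj",  "noun", "no_boundary"),
    ("period", "noun", "det",  "boundary"),
    ("none",   "noun", "verb", "no_boundary"),
    ("none",   "verb", "prep", "boundary"),
    ("none",   "verb", "det",  "no_boundary"),
    ("comma",  "adv",  "det",  "boundary"),
]
ROWS = [dict(zip(FIELDS, r[:3])) for r in RAW]
LABELS = [r[3] for r in RAW]

TREE = grow(ROWS, LABELS)
print(TREE)
print(predict(TREE, {"punct": "none", "left_pos": "noun", "right_pos": "prep"}))
```

On this toy data the grower first tests punctuation and then, for unpunctuated junctures, whether the following word is a preposition. A system of the kind the abstract describes would draw on far more training data and a richer feature set from text analysis.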