Prosodic Word Prediction Using a Maximum Entropy Approach
As the basic prosodic unit, the prosodic word greatly influences naturalness and intelligibility. Although research shows that lexicon words differ greatly from prosodic words, lexicon words still provide important cues for prosodic word formation. Rhythm constraints are another important factor in prosodic word prediction: some lexicon word length patterns tend to be combined. Based on the mapping relationship and the differences between lexicon words and prosodic words, the prosodic word prediction process is divided into two parts: grouping lexicon words into prosodic words, and splitting lexicon words into prosodic words. This paper proposes a maximum entropy method to model each of these two parts. The experimental results show that the maximum entropy model is competent for the prosodic word prediction task. In the word grouping model, a feature selection algorithm is used to induce more efficient features, which not only greatly decreases the number of features but also improves model performance. The splitting model can correctly detect prosodic word boundaries inside lexicon words. The f-score of prosodic word boundary prediction reaches 95.55%.
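The word grouping idea summarized above can be illustrated with a minimal sketch of a binary maximum entropy classifier that decides whether a prosodic word boundary falls between two adjacent lexicon words. Everything specific here is an assumption made for illustration: the feature templates (indicators on lexicon word lengths in syllables), the toy training pairs, the function names, and the use of plain stochastic gradient ascent for training. The abstract does not state the actual feature set, data, or estimation procedure, so this should not be read as the authors' implementation.

```python
import math

def features(left_len, right_len):
    # Hypothetical indicator features on the lengths (in syllables)
    # of the lexicon words to the left and right of a candidate boundary.
    return {
        f"left={left_len}": 1.0,
        f"right={right_len}": 1.0,
        f"pair={left_len}+{right_len}": 1.0,
        "bias": 1.0,
    }

def prob_boundary(weights, feats):
    # Binary maximum entropy model: p(boundary | x) = 1 / (1 + exp(-w . f(x))).
    score = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-score))

def train(samples, epochs=200, lr=0.5):
    # Stochastic gradient ascent on the conditional log-likelihood.
    # (An assumption: the paper may use a different estimation method.)
    weights = {}
    for _ in range(epochs):
        for feats, label in samples:
            error = label - prob_boundary(weights, feats)
            for name, value in feats.items():
                weights[name] = weights.get(name, 0.0) + lr * error * value
    return weights

def f_score(gold, predicted):
    # Harmonic mean of precision and recall over boundary labels (1 = boundary).
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Invented toy data: ((left_len, right_len), label), where label 1 means a
# prosodic word boundary separates the two lexicon words and label 0 means
# they are grouped into one prosodic word.
toy_data = [
    ((1, 1), 0), ((1, 2), 0), ((2, 1), 0),
    ((2, 2), 1), ((3, 1), 1), ((1, 3), 1),
]
samples = [(features(l, r), label) for (l, r), label in toy_data]
weights = train(samples)

# Probability of a boundary between two monosyllabic lexicon words,
# and between two disyllabic lexicon words.
p_short = prob_boundary(weights, features(1, 1))
p_long = prob_boundary(weights, features(2, 2))
```

In this toy setup the model learns a low boundary probability for short word pairs, which mirrors the observation that some lexicon word length patterns tend to be combined. The `f_score` helper shows how a boundary-level f-score of the kind reported above could be computed from gold and predicted labels.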
Keywords: Maximum Entropy, Statistic Machine Translation, Lexical Information, Word Grouping, Maximum Entropy Model