Abstract
This chapter presents a statistical decision procedure for lexical ambiguity resolution in text-to-speech synthesis. Based on decision lists, the algorithm incorporates both local syntactic patterns and more distant collocational evidence, combining the strengths of decision trees, N-gram taggers and Bayesian classifiers. The algorithm is applied to seven major types of ambiguity in which context can be used to choose the pronunciation of a word.
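The abstract does not spell out the procedure, so the following is only a minimal sketch of a generic decision-list classifier for a two-way homograph. The feature templates (`prev`, `next`, `near`), the smoothing constant `alpha`, ranking by absolute log-likelihood ratio, the pronunciation labels, and the toy training sentences are all illustrative assumptions, not details taken from the chapter.

```python
import math
from collections import defaultdict

def extract_features(tokens, i, window=3):
    """Hypothetical features: adjacent words (local patterns) plus
    any word within +/-window positions (wider collocations)."""
    feats = set()
    if i > 0:
        feats.add(("prev", tokens[i - 1]))
    if i + 1 < len(tokens):
        feats.add(("next", tokens[i + 1]))
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    for j in range(lo, hi):
        if j != i:
            feats.add(("near", tokens[j]))
    return feats

def train_decision_list(examples, labels, alpha=0.1):
    """Build a list of (strength, feature, label) rules, strongest first.
    examples: iterable of (tokens, target_index, label)."""
    a, b = labels
    counts = defaultdict(lambda: {a: 0, b: 0})
    for tokens, i, label in examples:
        for f in extract_features(tokens, i):
            counts[f][label] += 1
    rules = []
    for f, c in counts.items():
        # Smoothed log-likelihood ratio; alpha is an assumed constant.
        score = math.log((c[a] + alpha) / (c[b] + alpha))
        if score != 0.0:  # skip features that favour neither reading
            rules.append((abs(score), f, a if score > 0 else b))
    # Strongest evidence first; ties broken by feature for determinism.
    rules.sort(key=lambda r: (-r[0], r[1]))
    return rules

def classify(rules, tokens, i, default):
    """Return the label of the first (strongest) rule whose feature matches."""
    feats = extract_features(tokens, i)
    for _, f, label in rules:
        if f in feats:
            return label
    return default

# Toy data for the homograph "lead": "led" (the metal) vs "leed" (the verb).
train = [
    ("the lead pipe was old".split(), 1, "led"),
    ("a lead weight sank".split(), 1, "led"),
    ("she will lead the team".split(), 2, "leed"),
    ("they lead the race".split(), 1, "leed"),
]
rules = train_decision_list(train, labels=("led", "leed"))
```

Calling `classify(rules, "the lead pipe burst".split(), 1, default="leed")` walks the sorted rules and answers with the single strongest matching piece of evidence. In this sketch the evidence is not summed across features, which is the property that separates a decision list from a Bayesian classifier.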
Copyright information
© 1997 Springer Science+Business Media New York
About this chapter
Cite this chapter
Yarowsky, D. (1997). Homograph Disambiguation in Text-to-Speech Synthesis. In: van Santen, J.P.H., Olive, J.P., Sproat, R.W., Hirschberg, J. (eds) Progress in Speech Synthesis. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1894-4_12
DOI: https://doi.org/10.1007/978-1-4612-1894-4_12
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-7328-8
Online ISBN: 978-1-4612-1894-4
eBook Packages: Springer Book Archive