Abstract
Over the past few years, we have seen a significant increase in the number and sophistication of computational studies of large bodies of text and speech. Such studies have a wide variety of topics and motives, from lexicography and studies of language change, to methods for automated indexing and information retrieval, tagging and parsing algorithms, techniques for generating idiomatic text, cognitive models of language acquisition, and statistical models for application in speech recognizers, text or speech compression schemes, optical character readers, machine translation systems, and spelling correctors.
Copyright information
© 1991 ECSC - EEC - EAEC, Brussels - Luxembourg
About this paper
Cite this paper
Liberman, M.Y. (1991). The Trend towards Statistical Models in Natural Language Processing. In: Klein, E., Veltman, F. (eds) Natural Language and Speech. ESPRIT Basic Research Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-77189-7_1
DOI: https://doi.org/10.1007/978-3-642-77189-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-77191-0
Online ISBN: 978-3-642-77189-7
eBook Packages: Springer Book Archive