A Review of Statistical Language Processing Techniques

McMahon, John; Smith, F. Jack

doi:10.1023/A:1006517723917

Published: October 1998

Artificial Intelligence Review, Volume 12, pages 347–391 (1998)

Abstract

We present a review of some recently developed techniques in the field of natural language processing. This area has witnessed a confluence of approaches which are inspired by theories from linguistics and those which are inspired by theories from information theory: statistical language models are becoming more linguistically sophisticated and the models of language used by linguists are incorporating stochastic techniques to help resolve ambiguities. We include a discussion about the underlying similarities between some of these systems and mention two approaches to the evaluation of statistical language processing systems.

Author information

Authors and Affiliations

Department of Computer Science, The Queen's University of Belfast, Belfast, BT7 1NN, N. Ireland
John McMahon & F. Jack Smith

About this article

Cite this article

McMahon, J., Smith, F.J. A Review of Statistical Language Processing Techniques. Artificial Intelligence Review 12, 347–391 (1998). https://doi.org/10.1023/A:1006517723917

Issue Date: October 1998
DOI: https://doi.org/10.1023/A:1006517723917
