A Review of Statistical Language Processing Techniques

Artificial Intelligence Review

Abstract

We present a review of some recently developed techniques in the field of natural language processing. This area has witnessed a confluence of approaches which are inspired by theories from linguistics and those which are inspired by theories from information theory: statistical language models are becoming more linguistically sophisticated and the models of language used by linguists are incorporating stochastic techniques to help resolve ambiguities. We include a discussion about the underlying similarities between some of these systems and mention two approaches to the evaluation of statistical language processing systems.

About this article

Cite this article

McMahon, J., Smith, F.J. A Review of Statistical Language Processing Techniques. Artificial Intelligence Review 12, 347–391 (1998). https://doi.org/10.1023/A:1006517723917
