Natural Language Processing


Natural language processing (NLP), also known as computational linguistics, is a broad subject that encompasses technologies for automated processing of natural (human) language. Such processing includes parsing natural-language text, generating natural language as a form of program output, and extracting semantics. I have been working in this field since the 1980s using a wide variety of programming techniques, and I believe that quantitative methods provide better results with less effort for most applications. Therefore, in this chapter, I will cover statistical NLP. Statistical NLP uses statistical or probabilistic methods for segmenting text, determining each word’s likely part of speech, classifying text, and automatically summarizing text. I will show you what I consider to be some of the simplest yet most useful techniques for developing Web 3.0 applications that require some “understanding” of text. (Natural-language generation is also a useful topic, but I won’t cover it here.)





Copyright information

© Mark Watson 2009
