Humanities Data in R

Part of the series Quantitative Methods in the Humanities and Social Sciences, pp. 131-155

Natural Language Processing

  • Taylor Arnold, Yale University
  • Lauren Tilton, Yale University

This chapter introduces the application of low-level natural language processing. Techniques such as tokenization, lemmatization, part-of-speech tagging, and coreference detection are described in relation to text analysis. The methods are applied to a corpus of short stories by Sir Arthur Conan Doyle featuring his famous detective, Sherlock Holmes.