Text Mining

  • Max Bramer
Part of the Undergraduate Topics in Computer Science book series (UTICS)


This chapter looks at a particular type of classification task, in which the objects are text documents. It describes a method, based on a bag-of-words representation, of processing the documents for use by the classification algorithms given earlier in this book.
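
As a rough illustration of the bag-of-words idea, the sketch below turns each document into a vector of word counts over a shared vocabulary, so that it could be passed to a standard classification algorithm. It is a minimal sketch and not the chapter's own procedure: the function names, the tiny stop-word list and the example documents are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, very small stop-word list used only for this illustration.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text, extract alphabetic tokens and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def bag_of_words(documents):
    """Return a sorted vocabulary and one term-count vector per document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    # Each document becomes a vector with one count per vocabulary term.
    vectors = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, vectors


# Assumed example documents, not taken from the chapter.
docs = ["The cat sat in the hat", "A cat and a dog"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
print(vectors)
```

Word order is discarded in this representation; only the count of each term in each document is kept.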

An important special case of text classification arises when the documents are web pages. The automatic classification of web pages is known as hypertext categorisation. The differences between standard text classification and hypertext categorisation are illustrated and issues relating to the latter are discussed.


Text Classification, Vector Space Model, Stop Word, Inverse Document Frequency, Human Classifier
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag London Ltd. 2016

Authors and Affiliations

  • Max Bramer
    • 1
  1. School of Computing, University of Portsmouth, Portsmouth, UK
