Survey of Text Mining II

Clustering, Classification, and Retrieval

  • Michael W. Berry
  • Malu Castellanos

Table of contents

  1. Front Matter
    Pages i-xv
  2. Clustering

    1. Pierre Senellart, Vincent D. Blondel
      Pages 25-44
    2. Dimitrios Zeimpekis, Efstratios Gallopoulos
      Pages 45-64
    3. Jacob Kogan, Charles Nicholas, Mike Wiacek
      Pages 65-85
    4. Loulwah AlSumait, Carlotta Domeniconi
      Pages 87-105
  3. Document Retrieval and Representation

    1. Mei Kobayashi, Masaki Aono
      Pages 109-127
    2. Zhonghang Xia, Guangming Xing, Houduo Qi, Qi Li
      Pages 129-144
  4. Email Surveillance and Filtering

    1. Brett W. Bader, Michael W. Berry, Murray Browne
      Pages 147-163
    2. Wilfried N. Gansterer, Andreas G. K. Janecek, Robert Neumayer
      Pages 165-183
  5. Anomaly Detection

    1. Edward G. Allan, Michael R. Horvath, Christopher V. Kopek, Brian T. Lamb, Thomas S. Whaples, Michael W. Berry
      Pages 203-217
    2. Mostafa Keikha, Narjes Sharif Razavian, Farhad Oroumchian, Hassan Seyed Razi
      Pages 219-232
  6. Back Matter
    Pages 233-240

About this book


The proliferation of digital computing devices and their use in communication has resulted in an increased demand for systems and algorithms capable of mining textual data. Thus, the development of techniques for mining unstructured, semi-structured, and fully structured textual data has become increasingly important in both academia and industry.

This second volume continues to survey the evolving field of text mining: the application of machine learning techniques, in conjunction with natural language processing, information extraction, and algebraic/mathematical approaches, to computational information retrieval. The issues addressed range from the development of new learning approaches to novel document clustering algorithms, and together span several major topic areas in text mining.


• Acts as an important benchmark in the development of current and future approaches to mining textual information

• Serves as an excellent companion text for courses in text and data mining, information retrieval and computational statistics

• Brings together experts from academia and industry, who share their experiences in solving large-scale retrieval and classification problems

• Presents an overview of current methods and software for text mining

• Highlights open research questions in document categorization and clustering, and trend detection

• Describes new application problems in areas such as email surveillance and anomaly detection

Survey of Text Mining II offers a broad selection of state-of-the-art algorithms and software for text mining from both academic and industrial perspectives, to generate interest in and insight into the state of the field. This book will be an indispensable resource for researchers, practitioners, and professionals involved in information retrieval, computational statistics, and data mining.

Michael W. Berry is a professor in the Department of Electrical Engineering and Computer Science at the University of Tennessee, Knoxville.

Malu Castellanos is a senior researcher at Hewlett-Packard Laboratories in Palo Alto, California.


Anomaly Detection, Augmented Reality, Classification, Document Clustering, Extensible Markup Language (XML), Information Extraction, Information Retrieval, Text Mining, Topic Evolution, algorithms

Editors and affiliations

  • Michael W. Berry (1)
  • Malu Castellanos (2)
  1. Department of Computer Science, University of Tennessee, USA
  2. Hewlett-Packard Laboratories, Palo Alto, USA
