  • Book
  • © 2014

Text Mining

From Ontology Learning to Automated Text Processing Applications

  • A unique contribution to the analysis of big textual data

  • The first book that presents the overarching notion of text mining, ranging from lexical acquisition to NLP applications

  • The book balances the overall vision and general picture of the field against current topics in text mining research

  • Includes supplementary material: sn.pub/extras

Buying options

eBook USD 84.99
Price excludes VAT (USA)
  • ISBN: 978-3-319-12655-5
  • Instant PDF download
  • Readable on all devices
  • Own it forever
  • Exclusive offer for individuals only
  • Tax calculation will be finalised during checkout
Softcover Book USD 109.00
Price excludes VAT (USA)
Hardcover Book USD 119.99
Price excludes VAT (USA)


Table of contents (11 chapters)

  1. Front Matter

    Pages i-x
  2. Text Mining Techniques and Methodologies

    1. Front Matter

      Pages 1-1
    2. Building Large Resources for Text Mining: The Leipzig Corpora Collection

      • Uwe Quasthoff, Dirk Goldhahn, Thomas Eckart
      Pages 3-24
    3. Simple, Fast and Accurate Taxonomy Learning

      • Zornitsa Kozareva
      Pages 41-62
    4. A Topology-Based Approach to Visualize the Thematic Composition of Document Collections

      • Patrick Oesterling, Christian Heine, Gunther H. Weber, Gerik Scheuermann
      Pages 63-85
    5. Towards a Network Model of the Coreness of Texts: An Experiment in Classifying Latin Texts Using the TTLab Latin Tagger

      • Alexander Mehler, Tim vor der Brück, Rüdiger Gleim, T. Geelhaar
      Pages 87-112
  3. Text Mining Applications

    1. Front Matter

      Pages 113-113
    2. A Structuralist Approach for Personal Knowledge Exploration Systems on Mobile Devices

      • Stefan Bordag, Christian Hänig, Christian Beutenmüller
      Pages 115-136
    3. Deception Detection Within and Across Cultures

      • Veronica Perez-Rosas, Cristian Bologa, Mihai Burzo, Rada Mihalcea
      Pages 157-175
    4. Sentiment Analysis: What’s Your Opinion?

      • Jonathan Sonntag, Manfred Stede
      Pages 177-199
    5. Towards a Historical Text Re-use Detection

      • Marco Büchler, Philip R. Burns, Martin Müller, Emily Franzini, Greta Franzini
      Pages 221-238

About this book

This book comprises a set of articles that specify the methodology of text mining, describe the creation of lexical resources in the framework of text mining, and use text mining for various tasks in natural language processing (NLP). The analysis of large amounts of textual data is a prerequisite for building lexical resources such as dictionaries and ontologies, and it also has direct applications in automated text processing in fields such as history, healthcare and mobile applications, to name just a few. This volume reports on recent advances in text mining methods and reflects the most recent achievements in the automatic construction of large lexical resources. It addresses researchers who already perform text mining as well as those who want to enrich their battery of methods. Selected articles can be used to support graduate-level teaching.

The book is suitable for all readers who have completed undergraduate studies in computational linguistics, quantitative linguistics, computer science or computational humanities. It assumes basic knowledge of computer science, corpus processing and statistics.

Keywords

  • Big Data
  • Corpus processing
  • Dictionary acquisition
  • Natural Language Processing
  • Text mining

Editors and Affiliations

  • Computer Science Department, Technische Universität Darmstadt FG Language Technology, Darmstadt, Germany

    Chris Biemann

  • Computer Science Department, Goethe University WG Text Technology, Frankfurt am Main, Germany

    Alexander Mehler

About the editors

After completing his doctoral dissertation with Gerhard Heyer at the University of Leipzig (Germany), Chris Biemann joined the semantic search startup Powerset (San Francisco) in 2008; the company was acquired in the same year and became part of Microsoft's Bing. In 2011, he joined TU Darmstadt (Germany) as an assistant professor (W1) for Language Technology. His interests lie in statistical semantics, in unsupervised and knowledge-free natural language processing, and in leveraging the wisdom of the crowds for language data acquisition.

Alexander Mehler is professor (W3) for Computational Humanities / Text Technology at the Goethe University Frankfurt am Main, where he heads the Text Technology Lab as part of the Institute of Informatics. His research interests focus on the empirical analysis and simulative synthesis of discourse units in spoken and written communication. He aims to develop a quantitative theory of networking in linguistic systems that enables multi-agent simulations of their life cycle. To capture the complexity of linguistic information systems, he integrates models of semantic spaces with simulation models of language evolution and topological models of network theory. Currently, he is heading several research projects on the analysis of linguistic networks in historical semantics. Most recently, he started a research project on kinetic text-technologies that integrates the paradigm of games with a purpose with the wiki way of collaborative writing and kinetic HCI.
