Graphics of Large Datasets

Visualizing a Million

  • Antony Unwin
  • Martin Theus
  • Heike Hofmann

Part of the Statistics and Computing book series (SCO)

Table of contents

  1. Front Matter
    Pages i-xiii
  2. Introduction

    1. Antony Unwin
      Pages 1-27
  3. Basics

    1. Front Matter
      Pages 29-29
    2. Martin Theus
      Pages 31-54
    3. Martin Theus
      Pages 55-72
    4. Antony Unwin
      Pages 73-101
  4. Applications

    1. Front Matter
      Pages 103-103
    2. Dianne Cook, Leslie Miller
      Pages 125-141
    3. Rida Moustafa, Ed Wegman
      Pages 143-155
    4. Graham Wills
      Pages 157-175
    5. Simon Urbanek
      Pages 177-202
    6. Bárbara González-Arévalo, Félix Hernández-Campos, Steve Marron, Cheolwoo Park
      Pages 203-226
    7. Antony Unwin, Martin Theus
      Pages 227-249
  5. Back Matter
    Pages 251-271

About this book


Graphics are great for exploring data, but how can they be used for looking at the large datasets that are commonplace today? This book discusses ways of visualizing large datasets, whether large in numbers of cases, large in numbers of variables, or large in both. Data visualization is useful for data cleaning, exploring data, identifying trends and clusters, spotting local patterns, evaluating modeling output, and presenting results. It is essential for exploratory data analysis and data mining. Data analysts, statisticians, computer scientists, indeed anyone who has to explore a large dataset of their own, should benefit from reading this book.

New approaches to graphics are needed to visualize the information in large datasets, and most of the innovations described in this book are developments of standard graphics. There are considerable advantages in extending displays which are well-known and well-tried, both in understanding how best to make use of them in your work and in presenting results to others. This should also make the book readily accessible for readers who already have a little experience of drawing statistical graphics. All ideas are illustrated with displays from analyses of real datasets, and the authors emphasize the importance of interpreting displays effectively. Graphics should be drawn to convey information, and the book includes many insightful examples.

Antony Unwin holds the Chair of Computer Oriented Statistics and Data Analysis at the University of Augsburg. He has been involved in developing visualization software for twenty years. Martin Theus is a Senior Researcher at the University of Augsburg, has worked in industry and research in both Germany and the USA, and is the author of the visualization software Mondrian. Heike Hofmann is Assistant Professor of Statistics at Iowa State University. She wrote the software MANET and has also cooperated in the development of the GGobi software.


Excel, STATISTICA, computer, data analysis, modeling, statistical software, visualization

Authors and affiliations

  • Antony Unwin (1)
  • Martin Theus (1)
  • Heike Hofmann (2)

  1. Department of Computer Oriented Statistics and Data Analysis, University of Augsburg, Augsburg, Germany
  2. Department of Statistics, Iowa State University, Ames

Bibliographic information