Core Data Analysis: Summarization, Correlation, and Visualization

  • Boris Mirkin
Textbook

Part of the Undergraduate Topics in Computer Science book series (UTICS)

Table of contents

  1. Front Matter
    Pages i-xv
  2. Boris Mirkin
    Pages 1-75
  3. Boris Mirkin
    Pages 77-161
  4. Boris Mirkin
    Pages 163-292
  5. Boris Mirkin
    Pages 405-475
  6. Back Matter
    Pages 477-524

About this book

Introduction

This text examines the goals of data analysis with respect to enhancing knowledge, and identifies data summarization and correlation analysis as the core issues. Data summarization, both quantitative and categorical, is treated within the encoder-decoder paradigm, bringing forward a number of mathematically supported insights into the methods and the relations between them. Two chapters describe methods for categorical summarization (partitioning, divisive clustering, and separate cluster finding), and another explains the methods for quantitative summarization: Principal Component Analysis and PageRank.

Features:

  • An in-depth presentation of K-means partitioning, including a corresponding Pythagorean decomposition of the data scatter.

  • Advice regarding such issues as clustering of categorical and mixed scale data, similarity and network data, interpretation aids, anomalous clusters, the number of clusters, etc.

  • Thorough attention to data-driven modelling, with a number of mathematically stated relations between statistical and geometrical concepts, such as those between goodness-of-fit criteria for decision trees and data standardization, similarity and consensus clustering, and modularity clustering and uniform partitioning.
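To make the first feature concrete, here is a minimal NumPy sketch of the Pythagorean decomposition of the data scatter. It is not taken from the book: the function name `scatter_decomposition`, the random data, and the random partition are all assumptions made for illustration. It assumes the data matrix has been centered and that each centroid is the mean of its cluster, in which case the data scatter splits into a part explained by the centroids and the within-cluster K-means criterion.

```python
import numpy as np

def scatter_decomposition(Y, labels):
    """Return (total, explained, unexplained) parts of the data scatter.

    Y      : (N, V) array, assumed already centered
    labels : length-N integer array assigning each row to a cluster
    (hypothetical helper, written for illustration only)
    """
    Y = np.asarray(Y, dtype=float)
    labels = np.asarray(labels)
    total = float(np.sum(Y ** 2))           # data scatter: sum of squared entries
    explained = 0.0                         # sum over clusters of N_k * ||c_k||^2
    unexplained = 0.0                       # within-cluster squared distances
    for k in np.unique(labels):
        members = Y[labels == k]
        centroid = members.mean(axis=0)     # within-cluster mean
        explained += len(members) * float(centroid @ centroid)
        unexplained += float(np.sum((members - centroid) ** 2))
    return total, explained, unexplained

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))                # illustrative random data
Y = X - X.mean(axis=0)                      # center each feature
labels = rng.integers(0, 3, size=60)        # an arbitrary partition into up to 3 clusters
T, B, W = scatter_decomposition(Y, labels)
assert np.isclose(T, B + W)                 # total = explained + unexplained
```

The identity holds for any partition as long as the centroids are cluster means, so minimizing the within-cluster part is equivalent to maximizing the part explained by the centroids.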

New edition highlights:

  • Inclusion of ranking issues such as Google PageRank, linear stratification, and tied rankings median, as well as consensus clustering, semi-average clustering, and one-cluster clustering.

  • Restructured to make the logic more straightforward and the sections self-contained.

Core Data Analysis: Summarization, Correlation, and Visualization is aimed at those who are eager to participate in developing the field, as well as at novices and practitioners.

 

Keywords

Clustering, Data Analysis, K-means, Principal component analysis, Visualization

Authors and affiliations

  • Boris Mirkin
  1. Department of Data Analysis and Artificial Intelligence, Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia

Bibliographic information

  • DOI https://doi.org/10.1007/978-3-030-00271-8
  • Copyright Information Springer Nature Switzerland AG 2019
  • Publisher Name Springer, Cham
  • eBook Packages Computer Science
  • Print ISBN 978-3-030-00270-1
  • Online ISBN 978-3-030-00271-8
  • Series Print ISSN 1863-7310
  • Series Online ISSN 2197-1781