Machine Learning Models and Algorithms for Big Data Classification

Thinking with Examples for Effective Learning

  • Shan Suthaharan

Part of the Integrated Series in Information Systems book series (ISIS, volume 36)

Table of contents

  1. Front Matter
    Pages i-xix
  2. Shan Suthaharan
    Pages 1-13
  3. Understanding Big Data

    1. Front Matter
      Pages 15-15
    2. Shan Suthaharan
      Pages 17-29
    3. Shan Suthaharan
      Pages 31-75
  4. Understanding Big Data Systems

    1. Front Matter
      Pages 77-77
    2. Shan Suthaharan
      Pages 79-97
    3. Shan Suthaharan
      Pages 99-119
  5. Understanding Machine Learning

    1. Front Matter
      Pages 121-121
    2. Shan Suthaharan
      Pages 123-143
    3. Shan Suthaharan
      Pages 145-181
    4. Shan Suthaharan
      Pages 183-206
    5. Shan Suthaharan
      Pages 207-235
    6. Shan Suthaharan
      Pages 237-269
  6. Understanding Scaling-Up Machine Learning

    1. Front Matter
      Pages 271-271
    2. Shan Suthaharan
      Pages 273-288
    3. Shan Suthaharan
      Pages 289-307
    4. Shan Suthaharan
      Pages 309-328
    5. Shan Suthaharan
      Pages 329-355
  7. Back Matter
    Pages 357-359

About this book


This book presents machine learning models and algorithms to address big data classification problems. Existing machine learning techniques like the decision tree (a hierarchical approach), random forest (an ensemble hierarchical approach), and deep learning (a layered approach) are highly suitable for systems that can handle such problems. This book helps readers, especially students and newcomers to the field of big data and machine learning, to gain a quick understanding of the techniques and technologies; to that end, the theory, examples, and programs (Matlab and R) presented in this book have been simplified, hardcoded, repeated, or spaced for improvements. They provide vehicles to test and understand the complicated concepts of various topics in the field. Readers are expected to adopt these programs to experiment with the examples, and then to modify them or write their own programs to advance their knowledge and solve more complex and challenging problems.

The presentation format of this book focuses on simplicity, readability, and dependability so that undergraduate and graduate students, as well as new researchers, developers, and practitioners in this field, can easily trust and grasp the concepts and learn them effectively. It has been written to reduce the mathematical complexity and to help the vast majority of readers understand the topics and become interested in the field. This book consists of four parts, with a total of 14 chapters. The first part mainly focuses on the topics needed to analyze and understand data and big data. The second part covers the topics that explain the systems required for processing big data. The third part presents the topics required to understand and select machine learning techniques to classify big data. Finally, the fourth part concentrates on the topics that explain scaling up machine learning, an important solution for modern big data problems.


Keywords

  • Big Data Classification
  • Data Visualization
  • Machine Learning
  • Supervised Learning
  • Unit Circle Machine

Authors and affiliations

  • Shan Suthaharan
    • 1
  1. Department of Computer Science, UNC Greensboro, Greensboro, USA

Bibliographic information

  • DOI
  • Copyright Information: Springer Science+Business Media New York 2016
  • Publisher Name: Springer, Boston, MA
  • eBook Packages: Business and Management
  • Print ISBN: 978-1-4899-7640-6
  • Online ISBN: 978-1-4899-7641-3
  • Series Print ISSN: 1571-0270
  • Series Online ISSN: 2197-7968