© 2020

Next-Generation Machine Learning with Spark

Covers XGBoost, LightGBM, Spark NLP, Distributed Deep Learning with Keras, and More


Table of contents

  1. Front Matter
    Pages i-xix
  2. Butch Quinto
    Pages 1-27
  3. Butch Quinto
    Pages 29-96
  4. Butch Quinto
    Pages 97-187
  5. Butch Quinto
    Pages 189-244
  6. Butch Quinto
    Pages 245-268
  7. Butch Quinto
    Pages 269-287
  8. Butch Quinto
    Pages 289-348
  9. Back Matter
    Pages 349-355

About this book


Access real-world documentation and examples for building large-scale, enterprise-grade machine learning applications on the Spark platform.

The past decade has seen an astonishing series of advances in machine learning. These breakthroughs are disrupting our everyday life and making an impact across every industry.

Next-Generation Machine Learning with Spark provides a gentle introduction to Spark and Spark MLlib and advances to more powerful, third-party machine learning algorithms and libraries beyond what is available in the standard Spark MLlib library. By the end of this book, you will be able to apply your knowledge to real-world use cases through dozens of practical examples and insightful explanations.

You will:

  • Be introduced to machine learning, Spark, and Spark MLlib 2.4.x
  • Achieve lightning-fast gradient boosting on Spark with the XGBoost4J-Spark and LightGBM libraries
  • Detect anomalies with the Isolation Forest algorithm for Spark
  • Use the Spark NLP and Stanford CoreNLP libraries that support multiple languages
  • Optimize your ML workload with the Alluxio in-memory data accelerator for Spark
  • Use GraphX and GraphFrames for graph analysis
  • Perform image recognition using convolutional neural networks
  • Utilize the Keras framework and distributed deep learning libraries with Spark 


Spark, Big Data, Machine Learning, Spark ML, Spark MLlib, Spark Machine Learning, XGBoost, LightGBM, NLP, Natural Language Processing, Stanford CoreNLP, Spark NLP, Random Forest, Logistic Regression, Linear Regression, Distributed Computing

Authors and affiliations

  1. Carson, USA

About the authors

Butch Quinto is founder and Chief AI Officer at Intelvi AI, an artificial intelligence company that develops cutting-edge solutions for the defense, industrial, and transportation industries. As Chief AI Officer, Butch heads strategy, innovation, research, and development. Previously, he was the Director of Artificial Intelligence at a leading technology firm and Chief Data Officer at an AI startup. As Director of Analytics at Deloitte, Butch led the development of several enterprise-grade AI and IoT solutions as well as strategy, business development, and venture capital due diligence. He has more than 20 years of experience in various technology and leadership roles in several industries including banking and finance, telecommunications, government, utilities, transportation, e-commerce, retail, manufacturing, and bioinformatics. Butch is the author of Next-Generation Big Data (Apress) and a member of the Association for the Advancement of Artificial Intelligence and the American Association for the Advancement of Science. 

Bibliographic information

  • Book Title Next-Generation Machine Learning with Spark
  • Book Subtitle Covers XGBoost, LightGBM, Spark NLP, Distributed Deep Learning with Keras, and More
  • Authors Butch Quinto
  • Copyright Information Butch Quinto 2020
  • Publisher Name Apress, Berkeley, CA
  • eBook Packages Professional and Applied Computing (R0)
  • Softcover ISBN 978-1-4842-5668-8
  • eBook ISBN 978-1-4842-5669-5
  • Edition Number 1
  • Number of Pages XIX, 355
  • Number of Illustrations 67 b/w illustrations, 0 illustrations in colour
  • Topics Big Data