Learn PySpark

Build Python-based Machine Learning and Deep Learning Models

  • Pramod Singh

Table of contents

  1. Front Matter
    Pages i-xviii
  2. Pramod Singh
    Pages 1-16
  3. Pramod Singh
    Pages 17-48
  4. Pramod Singh
    Pages 49-65
  5. Pramod Singh
    Pages 67-84
  6. Pramod Singh
    Pages 85-115
  7. Pramod Singh
    Pages 117-159
  8. Pramod Singh
    Pages 161-181
  9. Pramod Singh
    Pages 183-203
  10. Back Matter
    Pages 205-210

About this book


Leverage machine and deep learning models to build applications on real-time data using PySpark. This book is for readers who want to learn to use PySpark to perform exploratory data analysis and solve a range of business challenges.

You'll start by reviewing PySpark fundamentals, such as Spark's core architecture, and see how to use PySpark for big data processing tasks such as data ingestion, cleaning, and transformation. This is followed by building workflows for analyzing streaming data using PySpark and a comparison of various streaming platforms.

You'll then see how to schedule different Spark jobs using Airflow with PySpark and examine tuning machine and deep learning models for real-time predictions. This book concludes with a discussion on graph frames and performing network analysis using graph algorithms in PySpark. All the code presented in the book will be available in Python scripts on GitHub.


PySpark, Python, Machine Learning, Deep Learning, Big Data, Spark, Data Processing, Airflow, Supervised Machine Learning, Unsupervised Machine Learning, Graph Frames

Authors and affiliations

  • Pramod Singh (1)
  1. Bangalore, India

Bibliographic information