Spark Architecture and the Resilient Distributed Dataset

  • Raju Kumar Mishra


You learned Python in the preceding chapter. Now it is time to learn PySpark and use the power of a distributed system to solve big data problems. Typically, a large dataset is distributed across the machines of a cluster, and processing is then performed on that distributed data.

Copyright information

© Raju Kumar Mishra 2018

Authors and Affiliations

  • Raju Kumar Mishra (1)

  1. Bangalore, India