Introduction to Spark

  • Zubair Nabi


The first version of Spark was open sourced in 2010, and it entered the Apache Incubator in 2013. By early 2014, it had been promoted to a top-level Apache project. It has already replaced Hadoop as the Big Data processing engine of choice in most organizations, a testament to its maturity and the richness of its design. Batch processing, iterative and interactive computation, stream processing, graph analytics, ETL, machine learning, and data warehousing: Spark can already handle all of them. This chapter is a hands-on primer on Spark that sets the stage for the rest of the book.


Keywords

Work Node, Manager Execution, Execution Location, Driver Program, Persistence Level

Copyright information

© Zubair Nabi 2016

Authors and Affiliations

  • Zubair Nabi, Lahore, Pakistan
