Programming Internals of Scalding and Spark

  • K.G. Srinivasa
  • Anil Kumar Muppalla
Chapter
Part of the Computer Communications and Networks book series (CCN)

Abstract

Scalding is a Scala-based library built on top of Cascading, a Java library that provides an abstraction over the low-level Hadoop API. It is comparable to Pig, but brings the advantages of Scala to building MapReduce jobs [1].

Keywords

Cloud Computing; Program Internal; Storage Level; Hadoop Cluster; Parallelize Method
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Scala, "The Scala Programming Language," 2002. [Online]. Available: http://www.scalalang.org/.
  2. Twitter, Scalding, 2011. [Online]. Available: https://github.com/twitter/scalding.
  3. Wensel, C. K. "Cascading: Defining and executing complex and fault tolerant data processing workflows on a Hadoop cluster" (2008).
  4. Cascading, "Cascading: Application Platform for Enterprise Big Data." [Online]. Available: http://www.cascading.org/
  5. Zaharia, Matei, et al. "Spark: cluster computing with working sets." Proceedings of the 2nd USENIX conference on Hot topics in cloud computing. 2010.
  6. B. Hindman, A. Konwinski, M. Zaharia, and I. Stoica. A common substrate for cluster computing. In Workshop on Hot Topics in Cloud Computing (HotCloud) 2009, 2009.
  7. Spark, Apache. [Online]. Available: http://spark.incubator.apache.org/docs/latest/

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. M.S. Ramaiah Institute of Technology, Bangalore, India
