Complete Guide to Open Source Big Data Stack

  • Book
  • © 2018


  • Describes the step-by-step construction of a real-world big data stack from open source software
  • Explains popular Apache-based systems such as Hadoop, HBase, Cassandra, Riak, Brooklyn, Spark, Kafka, and more
  • Author builds a data stack for this book and then recounts the process, including successes, failures, and lessons learned

Table of contents (10 chapters)


About this book

See how a Mesos-based big data stack is built and which components it uses. You will work with currently available Apache systems, both full projects and incubating ones. The components are introduced by example, and you learn how they work together.

In the Complete Guide to Open Source Big Data Stack, the author begins by creating a private cloud and then installs and examines Apache Brooklyn. After that, he uses each chapter to introduce one piece of the big data stack, explaining how to source the software and how to install it. You learn by simple example, step by step and chapter by chapter, as a real big data stack is created. The book concentrates on Apache-based systems and shares detailed examples of cloud storage, release management, resource management, processing, queuing, frameworks, data visualization, and more.

What You’ll Learn

  • Install a private cloud onto the local cluster using Apache CloudStack
  • Source, install, and configure Apache Brooklyn, Mesos, Kafka, and Zeppelin
  • See how Brooklyn can be used to install Mule ESB on a cluster and Cassandra in the cloud
  • Install and use DC/OS for big data processing
  • Use Apache Spark for big data stack data processing

Who This Book Is For

Developers, architects, IT project managers, database administrators, and others charged with developing or supporting a big data system. It is also for anyone interested in Hadoop or big data, and those experiencing problems with data size.

Authors and Affiliations

  • Michael Frampton, Paraparaumu, New Zealand

About the author

Mike Frampton has been in the IT industry since 1990, working in many roles (tester, developer, support, QA), and in many sectors (telecoms, banking, energy, insurance). He has also worked for major corporations and banks as a contractor and a permanent member of staff, including Agilent, BT, IBM, HP, Reuters, and JP Morgan Chase. The owner of Semtech Solutions, an IT/Big Data consultancy, Mike currently lives by the beach in Paraparaumu, New Zealand, with his wife and son. Mike has a keen interest in new IT-based technologies and the way that technologies integrate. Being married to a Thai national, Mike divides his time between Paraparaumu or Wellington in New Zealand and their house in Roi Et, Thailand.
