Integrate a full-stack, open-source fast data pipeline architecture and choose the correct technology for every layer: Spark, Mesos, Akka, Cassandra, and Kafka (SMACK). Fast data is becoming a requirement for many enterprises. So far, however, the focus has largely been on collecting, aggregating, and crunching large data sets in a timely manner. In many cases, organizations need more than one paradigm to perform efficient analyses.
Big Data SMACK explains each technology and, more importantly, how to integrate them. It provides detailed coverage of the practical benefits of these technologies and incorporates real-world examples. The book focuses on the problems and scenarios solved by the architecture, as well as the solutions provided by each technology. This book covers the five main concepts of data pipeline architecture and how to integrate, replace, and reinforce every layer:
- The engine: Apache Spark
- The container: Apache Mesos
- The model: Akka
- The storage: Apache Cassandra
- The broker: Apache Kafka