Apache Apex

Gundabattula, Ananth; Weise, Thomas

doi:10.1007/978-3-319-63962-8_316-1

Introduction

Apache Apex (2018; Weise et al. 2017) is a large-scale stream-first big data processing framework that can be used for low-latency, high-throughput, and fault-tolerant processing of unbounded (or bounded) datasets on clusters. Apex development started in 2012, and it became a project at the Apache Software Foundation in 2015. Apex can be used for real-time and batch processing, based on a unified stateful streaming architecture, with support for event-time windowing and exactly-once processing semantics (Fig. 1).
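As a rough illustration of what event-time windowing means, the sketch below groups events into fixed-size (tumbling) windows by the time each event occurred rather than the time it arrived. It is plain Python and does not use the Apex API; the window width, the function names, and the sample events are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

WINDOW_SIZE_MS = 1000  # hypothetical tumbling-window width, in milliseconds


def window_start(event_time_ms, size_ms=WINDOW_SIZE_MS):
    """Return the start of the tumbling window that contains the event time."""
    return event_time_ms - (event_time_ms % size_ms)


def count_by_event_time(events, size_ms=WINDOW_SIZE_MS):
    """Count events per window, keyed by event time instead of arrival order."""
    counts = defaultdict(int)
    for event_time_ms, _payload in events:
        counts[window_start(event_time_ms, size_ms)] += 1
    return dict(counts)


# Events in arrival order; the third one arrives late but still
# lands in the first window because its event time is 900 ms.
arrivals = [(100, "a"), (1200, "b"), (900, "c"), (2500, "d")]
result = count_by_event_time(arrivals)
print(result)
```

A real engine would also have to decide when a window is complete and how to handle state on failure; this sketch leaves both out.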

References

Akidau T et al (2013) MillWheel: fault-tolerant stream processing at internet scale. PVLDB 6:1033–1044
Akidau T et al (2015) The dataflow model: a practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing. PVLDB 8:1792–1803
Apache Apex (2018) https://apex.apache.org/
Apache Calcite (2018) https://calcite.apache.org/
Bertolucci M et al (2015) Static and dynamic big data partitioning on Apache Spark. PARCO
Carbone P et al (2015a) Apache Flink™: stream and batch processing in a single engine. IEEE Data Eng Bull 38:28–38
Carbone P et al (2015b) Lightweight asynchronous snapshots for distributed dataflows. CoRR abs/1506.08603
Carbone P et al (2017) State management in Apache Flink®: consistent stateful distributed stream processing. PVLDB 10:1718–1729
Confluent blog (2018) https://www.confluent.io/blog/ksql-open-source-streaming-sql-for-apache-kafka/
Del Monte B (2017) Efficient migration of very large distributed state for scalable stream processing. PhD@VLDB
Fernandez RC et al (2013) Integrating scale out and fault tolerance in stream processing using operator state management. SIGMOD conference
Floratou A et al (2017) Dhalion: self-regulating stream processing in Heron. PVLDB 10:1825–1836
Hummer W et al (2013) Elastic stream processing in the cloud. Wiley Interdiscip Rev: Data Min Knowl Discov 3:333–345
Jacques-Silva G et al (2016) Consistent regions: guaranteed tuple processing in IBM streams. PVLDB 9:1341–1352
Kulkarni S et al (2015) Twitter Heron: stream processing at scale. SIGMOD conference
Lin W et al (2016) StreamScope: continuous reliable distributed processing of big data streams. NSDI
Nasir MAU (2016) Fault tolerance for stream processing engines. CoRR abs/1605.00928
Noghabi SA et al (2017) Stateful scalable stream processing at LinkedIn. PVLDB 10:1634–1645
Sattler K-U, Beier F (2013) Towards elastic stream processing: patterns and infrastructure. BD3@VLDB
Sebepou Z, Magoutis K (2011) CEC: continuous eventual checkpointing for data stream processing operators. In: 2011 IEEE/IFIP 41st International Conference on Dependable Systems & Networks (DSN). pp 145–156
To Q-C et al (2017) A survey of state management in big data processing systems. CoRR abs/1702.01596
Weise T et al (2017) Learning Apache Apex. Packt Publishing
Zaharia M et al (2012) Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. NSDI
Zaharia M et al (2013) Discretized streams: fault-tolerant streaming computation at scale. SOSP

Author information

Authors and Affiliations

Commonwealth Bank of Australia, Sydney, NSW, Australia
Ananth Gundabattula
Atrato Inc., San Francisco, CA, USA
Thomas Weise

Corresponding authors

Correspondence to Ananth Gundabattula or Thomas Weise.

Editor information

Editors and Affiliations

Institute of Computer Science, University of Tartu, Tartu, Estonia
Sherif Sakr
School of Information Technologies, Building J12, University of Sydney, Sydney, Australia
Albert Zomaya

Section Editor information

Politecnico di Milano http://home.deib.polimi.it/margara/
Alessandro Margara
Database Systems and Information Management Group, Technische Universität Berlin, Berlin, Germany
Tilmann Rabl

About this entry

Cite this entry

Gundabattula, A., Weise, T. (2018). Apache Apex. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_316-1

DOI: https://doi.org/10.1007/978-3-319-63962-8_316-1
Received: 01 March 2018
Accepted: 08 March 2018
Published: 10 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63962-8
Online ISBN: 978-3-319-63962-8
eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering
