
A survey of state management in big data processing systems

  • Regular Paper
  • The VLDB Journal

Abstract

The concept of state and its applications vary widely across big data processing systems. This is evident in both the research literature and existing systems, such as Apache Flink, Apache Heron, Apache Samza, Apache Spark, and Apache Storm. Given the pivotal role that state management plays, particularly in iterative batch and stream processing, this survey presents examples of state as an enabler, discusses the alternative approaches used to handle and implement state, captures the many facets of state management, and highlights new research directions. Our aim is to provide insight into disparate state management techniques, motivate others to pursue research in this area, and draw attention to open problems.



Acknowledgements

This work was funded by the H2020 STREAMLINE Project under Grant Agreement No. 688191 and by the Berlin Big Data Center (BBDC), which is funded by the German Federal Ministry for Education and Research (BMBF) under funding mark 01IS14013A.

Author information

Corresponding author

Correspondence to Quoc-Cuong To.


Cite this article

To, QC., Soto, J. & Markl, V. A survey of state management in big data processing systems. The VLDB Journal 27, 847–872 (2018). https://doi.org/10.1007/s00778-018-0514-9


  • DOI: https://doi.org/10.1007/s00778-018-0514-9
