Encyclopedia of Big Data Technologies

Living Edition
Editors: Sherif Sakr, Albert Zomaya

Definition of Data Streams

Living reference work entry
DOI: https://doi.org/10.1007/978-3-319-63962-8_188-1

Synonyms

Definitions

A data stream is a countably infinite sequence of elements. Models of data streams differ in how they treat the mutability of the stream and the structure of stream elements. Stream processing refers to analyzing data streams on the fly, producing new results as new input data becomes available. Time is a central concept in stream processing: in almost all models of streams, each stream element is associated with one or more timestamps from a given time domain. A timestamp might indicate, for instance, when the element was generated, the validity of its content, or when it became available for processing.
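
The definition above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a model taken from the entry: the `Element` class, the `sensor_stream` and `running_mean` functions, the integer time domain, and the fixed two-tick delay between generation and availability are all hypothetical. The stream is a generator that never terminates, each element carries two timestamps, and the consumer emits an updated result after every new element.

```python
from dataclasses import dataclass
from itertools import count, islice
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Element:
    """One stream element with two timestamps (hypothetical integer time domain)."""
    value: float
    event_time: int      # when the element was generated
    ingestion_time: int  # when it became available for processing


def sensor_stream(delay: int = 2) -> Iterator[Element]:
    """Unbounded stream of synthetic readings; the loop never ends on its own."""
    for t in count():
        yield Element(value=20.0 + (t % 5), event_time=t, ingestion_time=t + delay)


def running_mean(stream: Iterator[Element]) -> Iterator[Tuple[int, float]]:
    """Process the stream on the fly: emit an updated mean after each element."""
    total = 0.0
    for n, element in enumerate(stream, start=1):
        total += element.value
        yield element.event_time, total / n


# The stream is infinite, so only a finite prefix of the results can be inspected.
first_five = list(islice(running_mean(sensor_stream()), 5))
print(first_five)
```

Because the generator is lazy, `islice` reads exactly five elements and the rest of the stream is never materialized, which mirrors how a stream processor sees only the prefix that has arrived so far.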

Overview

A data stream is a countably infinite sequence of elements and is used to represent data elements that are made available over time. Examples are readings from sensors in an environmental monitoring application, stock quotes in financial applications, or network data in computer monitoring...


Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Politecnico di Milano, Milano, Italy
  2. Database Systems and Information Management Group, Technische Universität Berlin, Berlin, Germany

Section editors and affiliations

  • Alessandro Margara (1)
  • Tilmann Rabl (2)
  1. Politecnico di Milano
  2. Database Systems and Information Management Group, Technische Universität Berlin, Berlin, Germany