Semantics of Data Streams and Operators

  • David Maier
  • Jin Li
  • Peter Tucker
  • Kristin Tufte
  • Vassilis Papadimos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3363)

Abstract

What does a data stream mean? Much of the extensive work on query operators and query processing for data streams has proceeded without the benefit of an answer to this question. While such imprecision may be tolerable when dealing with simple cases, such as flat data, guaranteed physical order and element-wise operations, it can lead to ambiguities when dealing with nested data, disordered streams and windowed operators. We propose reconstitution functions to make the denotation and representation of data streams more precise, and use these functions to investigate the connection between monotonicity and non-blocking behavior of stream operators. We also touch on a reconstitution function for XML data. Other aspects of data stream semantics we consider are the use of punctuation to delineate finite subsets of a stream, adequacy of descriptions of stream disorder, and the formal specification of windowed operators.
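One way to picture the link between monotonicity and non-blocking behavior is sketched below. It rests on assumptions not stated in the abstract: the reconstitution function is taken to map a finite stream prefix to a bag (multiset), and an operator is called monotone on a stream if its reconstituted output never loses elements as the input prefix grows. The names `reconstitute_bag`, `select_even`, `max_of_input`, and `is_monotone_on` are hypothetical and are not the authors' definitions.

```python
from collections import Counter

def reconstitute_bag(prefix):
    """Hypothetical reconstitution function: read a stream prefix as a bag."""
    return Counter(prefix)

def select_even(prefix):
    """Element-wise filter: each input item is handled independently."""
    return [x for x in prefix if x % 2 == 0]

def max_of_input(prefix):
    """Emits the maximum of everything seen so far as its only result."""
    return [max(prefix)] if prefix else []

def is_sub_bag(a, b):
    """True if every element of bag a occurs at least as often in bag b."""
    return all(b[k] >= n for k, n in a.items())

def is_monotone_on(op, stream):
    """Check, for this one stream only, that the reconstituted output on
    each longer prefix contains the output on the previous prefix."""
    prev = Counter()
    for i in range(len(stream) + 1):
        cur = reconstitute_bag(op(stream[:i]))
        if not is_sub_bag(prev, cur):
            return False
        prev = cur
    return True

stream = [3, 4, 1, 8, 6]
print(is_monotone_on(select_even, stream))   # True
print(is_monotone_on(max_of_input, stream))  # False
```

Under these assumptions, the filter can emit each result as soon as its input item arrives, because nothing it has emitted ever needs to be retracted. The maximum cannot be emitted early without risking a retraction, so an implementation has to wait for the end of the input or for some other signal that a subset of the stream is complete.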

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • David Maier (1)
  • Jin Li (1)
  • Peter Tucker (2)
  • Kristin Tufte (1)
  • Vassilis Papadimos (1)
  1. Computer Science Department, Portland State University, Portland
  2. Whitworth College, Spokane
