The VLDB Journal, Volume 15, Issue 2, pp 121–142

The CQL continuous query language: semantic foundations and query execution

Regular Paper

Abstract

CQL, a continuous query language, is supported by the STREAM prototype data stream management system (DSMS) at Stanford. CQL is an expressive SQL-based declarative language for registering continuous queries against streams and stored relations. We begin by presenting an abstract semantics that relies only on “black-box” mappings among streams and relations. From these mappings we define a precise and general interpretation for continuous queries. CQL is an instantiation of our abstract semantics using SQL to map from relations to relations, window specifications derived from SQL-99 to map from streams to relations, and three new operators to map from relations to streams. Most of the CQL language is operational in the STREAM system. We present the structure of CQL's query execution plans as well as details of the most important components: operators, interoperator queues, synopses, and sharing of components among multiple operators and queries. Examples throughout the paper are drawn from the Linear Road benchmark recently proposed for DSMSs. We also curate a public repository of data stream applications that includes a wide variety of queries expressed in CQL. The relative ease of capturing these applications in CQL is one indicator that the language contains an appropriate set of constructs for data stream processing.
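The three classes of mappings described in the abstract can be illustrated with a small Python sketch. It is only a toy model under stated assumptions, not the STREAM implementation: streams are lists of (tuple, timestamp) pairs with integer timestamps, relations are functions from a timestamp to a bag of tuples, and the sample position reports are invented. The operator names Istream, Dstream, and Rstream are taken from the body of the paper rather than from the abstract; the Python function names and the data layout are assumptions made here.

```python
from collections import Counter

# Assumed model: a stream is a list of (tuple, timestamp) pairs;
# a relation is a function from a timestamp to a bag (Counter) of tuples.

def range_window(stream, width):
    """Stream-to-relation: time-based sliding window of `width` time units."""
    def relation(t):
        lo = max(t - width, 0)
        return Counter(s for s, ts in stream if lo <= ts <= t)
    return relation

def select(relation, predicate):
    """Relation-to-relation: a filter applied at each time instant."""
    def out(t):
        return Counter({s: n for s, n in relation(t).items() if predicate(s)})
    return out

def _previous(relation, t):
    # The relation is taken to be empty before time 0.
    return relation(t - 1) if t > 0 else Counter()

def istream(relation, t):
    """Relation-to-stream: tuples inserted at time t (bag difference)."""
    return relation(t) - _previous(relation, t)

def dstream(relation, t):
    """Relation-to-stream: tuples deleted at time t (bag difference)."""
    return _previous(relation, t) - relation(t)

def rstream(relation, t):
    """Relation-to-stream: every tuple in the relation at time t."""
    return relation(t)

# Hypothetical position reports: ((vehicle id, speed), timestamp).
reports = [(("car1", 50), 0), (("car2", 30), 1), (("car3", 70), 3)]

# Vehicles faster than 40 within the last 2 time units.
fast = select(range_window(reports, 2), lambda s: s[1] > 40)

for t in range(7):
    print(t, dict(istream(fast, t)), dict(dstream(fast, t)))
```

Composing the window, the filter, and one relation-to-stream operator mirrors the stream-to-relation, relation-to-relation, relation-to-stream pipeline that gives a continuous query its meaning at every time instant.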

Data streams, Continuous queries, Query language, Query processing

Copyright information

© Springer-Verlag 2005

Authors and Affiliations

  1. Computer Science Department, Stanford University, Stanford, USA