The CQL continuous query language: semantic foundations and query execution

Abstract

CQL, a continuous query language, is supported by the STREAM prototype data stream management system (DSMS) at Stanford. CQL is an expressive SQL-based declarative language for registering continuous queries against streams and stored relations. We begin by presenting an abstract semantics that relies only on “black-box” mappings among streams and relations. From these mappings we define a precise and general interpretation for continuous queries. CQL is an instantiation of our abstract semantics using SQL to map from relations to relations, window specifications derived from SQL-99 to map from streams to relations, and three new operators to map from relations to streams. Most of the CQL language is operational in the STREAM system. We present the structure of CQL's query execution plans as well as details of the most important components: operators, interoperator queues, synopses, and sharing of components among multiple operators and queries. Examples throughout the paper are drawn from the Linear Road benchmark recently proposed for DSMSs. We also curate a public repository of data stream applications that includes a wide variety of queries expressed in CQL. The relative ease of capturing these applications in CQL is one indicator that the language contains an appropriate set of constructs for data stream processing.
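The three classes of mappings described above can be illustrated with a small sketch. It is not the STREAM implementation. It assumes integer timestamps, models a stream as a list of (element, timestamp) pairs and a relation at one time instant as a bag, and uses a time-based sliding window with inclusive bounds. The operator names Istream, Dstream, and Rstream come from the CQL papers; the function names `range_window`, `select`, and `relation_to_stream` are hypothetical and chosen only for this example.

```python
from collections import Counter

# Assumption: a stream is a list of (element, timestamp) pairs with integer
# timestamps; a relation at one instant is a bag, modelled as a Counter.

def range_window(stream, tau, width):
    """Stream-to-relation: elements with timestamp in [tau - width, tau]."""
    return Counter(e for e, ts in stream if max(tau - width, 0) <= ts <= tau)

def select(relation, predicate):
    """Relation-to-relation: keep the elements that satisfy the predicate."""
    return Counter({e: n for e, n in relation.items() if predicate(e)})

def relation_to_stream(rel_at, times, kind):
    """Relation-to-stream over a time-varying relation rel_at(tau).

    kind == "istream": elements in R(tau) but not in R(tau - 1)
    kind == "dstream": elements in R(tau - 1) but not in R(tau)
    kind == "rstream": every element of R(tau)
    """
    out, prev = [], Counter()
    for tau in times:
        cur = rel_at(tau)
        if kind == "istream":
            delta = cur - prev   # bag difference: newly inserted elements
        elif kind == "dstream":
            delta = prev - cur   # bag difference: newly deleted elements
        elif kind == "rstream":
            delta = cur
        else:
            raise ValueError(f"unknown operator: {kind}")
        out.extend((e, tau) for e in sorted(delta.elements()))
        prev = cur
    return out

# Hypothetical input stream and a window of width 1 evaluated at times 0..5.
stream = [("a", 1), ("b", 2), ("c", 4)]
times = range(6)

def windowed(tau):
    return range_window(stream, tau, 1)

inserted = relation_to_stream(windowed, times, "istream")
deleted = relation_to_stream(windowed, times, "dstream")
snapshot = relation_to_stream(windowed, times, "rstream")
```

Composing the three steps (window, then a relational operation such as `select`, then one relation-to-stream operator) mirrors the overall shape of a continuous query in this model: the result is recomputed, at least conceptually, at every time instant.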

Corresponding author

Correspondence to Arvind Arasu.

Additional information

Edited by M. Franklin

About this article

Cite this article

Arasu, A., Babu, S. & Widom, J. The CQL continuous query language: semantic foundations and query execution. The VLDB Journal 15, 121–142 (2006). https://doi.org/10.1007/s00778-004-0147-z

  • DOI: https://doi.org/10.1007/s00778-004-0147-z

Keywords

  • Data streams
  • Continuous queries
  • Query language
  • Query processing