Efficient Support for Time Series Queries in Data Stream Management Systems

  • Yijian Bai
  • Chang R. Luo
  • Hetal Thakkar
  • Carlo Zaniolo
Part of the Advances in Database Systems book series (ADBS, volume 30)

Abstract

There is much current interest in supporting continuous queries on data streams using generalizations of database query languages, such as SQL. The research challenges faced by this approach include (i) overcoming the expressive power limitations of database languages on data stream applications, and (ii) providing query processing and optimization techniques for the data stream execution environment, which differs markedly from that of traditional databases. In particular, SQL must be extended to support sequence queries on time series, and to overcome the loss of expressive power due to the exclusion of blocking query operators. Furthermore, the query processing techniques of relational databases must be replaced with techniques that optimize the execution of time-series queries and the utilization of main memory. The Expressive Stream Language for Time Series (ESL-TS) and its query optimization techniques solve these problems efficiently and are part of the data stream management system prototype developed at UCLA.

Copyright information

© Springer Science+Business Media, Inc. 2005

Authors and Affiliations

  • Yijian Bai (1)
  • Chang R. Luo (1)
  • Hetal Thakkar (1)
  • Carlo Zaniolo (1)

  1. Computer Science Department, UCLA, USA
