Journal of Central South University, Volume 18, Issue 3, pp 782–790

Continuous query scheduler based on operators clustering

  • M. Sami Soliman
  • Guan-zheng Tan (谭冠政)

Abstract

Data stream management systems (DSMSs) provide convenient solutions to the problem of processing continuous queries on data streams. Previous approaches for scheduling these queries and their operators assume either that each operator runs in a separate thread or that all operators are combined in one query plan and run in a single thread. Both approaches suffer from severe drawbacks concerning thread overhead and stalls due to expensive operators. To overcome these drawbacks, a novel approach called clustered operators scheduling (COS) is proposed, which adaptively clusters the operators of the query plan into a number of groups based on their selectivity and computing cost using S-means clustering. An experimental evaluation is provided to demonstrate the potential benefits of COS over the other scheduling strategies. COS can provide an adaptive, flexible, reliable, scalable and robust design for a continuous query processor.
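The grouping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' COS implementation: the `Operator` class, the `similarity` function, the threshold value, and the assumption that cost has been normalized to the same [0, 1] range as selectivity are all hypothetical choices made for illustration. The loop follows the general shape of similarity-driven clustering, where the number of groups is not fixed in advance: an operator joins its most similar group, or starts a new group if no existing group is similar enough.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """Hypothetical operator descriptor (not from the paper)."""
    name: str
    selectivity: float  # fraction of input tuples passed downstream
    cost: float         # per-tuple cost, assumed normalized to [0, 1]


def similarity(a, b):
    """Assumed similarity in (0, 1]: 1 for identical points, lower when far apart."""
    return 1.0 / (1.0 + math.dist(a, b))


def cluster_operators(ops, threshold=0.8, max_iters=20):
    """Group operators by (selectivity, cost) without fixing the group count."""
    points = [(op.selectivity, op.cost) for op in ops]
    if not points:
        return []
    centroids = [points[0]]  # seed with the first operator
    labels = None
    for _ in range(max_iters):
        new_labels = []
        for p in points:
            best = max(range(len(centroids)),
                       key=lambda i: similarity(p, centroids[i]))
            if similarity(p, centroids[best]) < threshold:
                # No group is similar enough: open a new one at this point.
                centroids.append(p)
                best = len(centroids) - 1
            new_labels.append(best)
        # Drop empty groups and renumber the remaining labels from 0.
        used = sorted(set(new_labels))
        remap = {old: new for new, old in enumerate(used)}
        new_labels = [remap[label] for label in new_labels]
        # Recompute each centroid as the mean of its members.
        centroids = []
        for c in range(len(used)):
            members = [p for p, label in zip(points, new_labels) if label == c]
            centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
        converged = new_labels == labels
        labels = new_labels
        if converged:
            break
    groups = [[] for _ in centroids]
    for op, label in zip(ops, labels):
        groups[label].append(op)
    return groups


if __name__ == "__main__":
    plan = [
        Operator("filter_a", 0.10, 0.10),
        Operator("filter_b", 0.15, 0.12),
        Operator("join_x", 0.90, 0.95),
        Operator("join_y", 0.85, 0.90),
    ]
    for group in cluster_operators(plan):
        print([op.name for op in group])
```

In a scheduler built along these lines, each resulting group could be assigned to its own thread, which sits between the one-thread-per-operator and single-thread extremes described above. How COS actually maps groups to threads and re-clusters over time is described in the full paper, not here.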

Key words

data stream management systems; operators scheduling; continuous query; clustering

Copyright information

© Central South University Press and Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. School of Information Science and Engineering, Central South University, Changsha, China
