Load Distribution for Distributed Stream Processing

  • Conference paper
Current Trends in Database Technology - EDBT 2004 Workshops (EDBT 2004)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 3268))

Abstract

Distributed stream processing is necessary for a large class of stream-based applications. To exploit the full power of distributed computation, effective load distribution techniques must be developed to optimize system performance and cope with time-varying loads. When traditional load balancing or load sharing strategies are applied to such systems, we find that they either fall short of achieving good load distribution or fail to maintain a good task partition in the long run.

In this paper, we study two important issues of dynamic load distribution in the context of data-intensive stream processing. The first one is how to allocate processing resources for push-based tasks such that the average end-to-end data processing latency can be minimized. The second issue is how to maintain a good load distribution dynamically for long running continuous queries. We propose a new hybrid load distribution strategy that addresses the above concerns by load clustering. To achieve scalability, our algorithm is completely decentralized and asynchronous.
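
The tension between balancing load and keeping a good task partition can be made concrete with a small illustrative sketch. This is not the algorithm proposed in the paper; it assumes a hypothetical operator graph in which every operator has a processing load and every edge has a data rate, and it scores a placement by its load imbalance and by the data rate that crosses node boundaries. All names (`node_loads`, `imbalance`, `cut_rate`) and all numbers are assumptions chosen for illustration.

```python
from collections import defaultdict


def node_loads(op_load, placement):
    """Sum the operator loads assigned to each node."""
    loads = defaultdict(float)
    for op, node in placement.items():
        loads[node] += op_load[op]
    return dict(loads)


def imbalance(op_load, placement):
    """Gap between the most and least loaded nodes that host an operator."""
    loads = node_loads(op_load, placement)
    return max(loads.values()) - min(loads.values())


def cut_rate(edge_rate, placement):
    """Total data rate on edges whose endpoints sit on different nodes."""
    return sum(
        rate
        for (src, dst), rate in edge_rate.items()
        if placement[src] != placement[dst]
    )


# Hypothetical workload: two independent two-operator pipelines.
op_load = {"a": 3.0, "b": 1.0, "c": 3.0, "d": 1.0}
edge_rate = {("a", "b"): 10.0, ("c", "d"): 10.0}

# Placement chosen only to equalize load: both pipelines are split.
balanced_only = {"a": "n1", "d": "n1", "b": "n2", "c": "n2"}

# Placement that keeps connected operators together on one node.
clustered = {"a": "n1", "b": "n1", "c": "n2", "d": "n2"}

for name, placement in [("balanced_only", balanced_only), ("clustered", clustered)]:
    print(name, imbalance(op_load, placement), cut_rate(edge_rate, placement))
```

Both placements are equally balanced in this toy case, but only the clustered one avoids sending data between nodes. A strategy that looks at load alone cannot tell them apart, which is the kind of gap a clustering-aware strategy is meant to close.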

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xing, Y. (2004). Load Distribution for Distributed Stream Processing. In: Lindner, W., Mesiti, M., Türker, C., Tzitzikas, Y., Vakali, A.I. (eds) Current Trends in Database Technology - EDBT 2004 Workshops. EDBT 2004. Lecture Notes in Computer Science, vol 3268. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30192-9_11

  • DOI: https://doi.org/10.1007/978-3-540-30192-9_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23305-3

  • Online ISBN: 978-3-540-30192-9

  • eBook Packages: Computer Science, Computer Science (R0)
