Distributed Resource Allocation in Stream Processing Systems

  • Cathy H. Xia
  • James A. Broberg
  • Zhen Liu
  • Li Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4167)


Distributed stream processing architectures have emerged as an appealing solution for analyzing large amounts of data from dispersed sources. A fundamental problem in such stream processing systems is how to best utilize the available resources so that overall system performance is optimized. We consider a distributed stream processing system that consists of a network of cooperating servers, collectively providing processing services for multiple data streams. Each stream is required to complete a series of operations on various servers. We assume that all servers have finite computing resources and all communication links have finite available bandwidth. The problem is to find distributed schemes that allocate the limited computing resources and communication bandwidth in the system so as to achieve a maximum concurrent throughput for all output streams. We present a generalized multicommodity flow model for this problem. We develop a distributed resource allocation algorithm that guarantees optimality. We also provide a detailed analysis of the complexity of the algorithm and demonstrate its performance using numerical experiments.
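To make the "maximum concurrent throughput" objective concrete, the following is a minimal sketch under a strong simplifying assumption: each stream follows a fixed route, so only the common scaling factor has to be computed. This is not the paper's generalized multicommodity flow model or its distributed algorithm, which also decide how resources are allocated. The function name `max_concurrent_throughput`, the resource names, and all numbers are hypothetical.

```python
from collections import defaultdict


def max_concurrent_throughput(capacity, streams):
    """Return the largest factor lam such that every stream can run at
    lam * demand without exceeding any server or link capacity.

    capacity: dict mapping resource name -> available capacity
              (CPU units for servers, bandwidth for links).
    streams:  list of (demand, usage) pairs, where usage maps a
              resource name to the units it consumes per unit of
              stream rate along an assumed fixed route.
    """
    load = defaultdict(float)
    for demand, usage in streams:
        for resource, per_unit in usage.items():
            load[resource] += demand * per_unit
    # Each loaded resource r imposes lam <= capacity[r] / load[r];
    # the tightest of these bounds is the concurrent throughput.
    return min(capacity[r] / load[r] for r in load if load[r] > 0)


# Hypothetical system: two servers and one link shared by two streams.
capacity = {"server1": 10.0, "server2": 8.0, "link": 6.0}
streams = [
    (2.0, {"server1": 1.0, "link": 1.0, "server2": 2.0}),
    (1.0, {"server1": 2.0, "link": 0.5}),
]
lam = max_concurrent_throughput(capacity, streams)
print(lam)  # the bottleneck here is server2
```

In this toy instance the bottleneck is `server2`, so both streams can be scaled up only until that server saturates. The problem described in the abstract is harder because the allocation itself is a decision variable.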


Data Stream · Sink Node · Stream Processing · Resource Allocation Problem · Output Stream





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Cathy H. Xia (1)
  • James A. Broberg (2)
  • Zhen Liu (1)
  • Li Zhang (1)
  1. IBM T.J. Watson Research Center, Yorktown Heights, USA
  2. School of Computer Science & Information Technology, RMIT-University, Melbourne, Australia
