Load control in scalable distributed file structures

Distributed and Parallel Databases

Abstract

The paper presents a family of distributed file structures, called DiFS, for record-structured, disk-resident files with key-based exact-match or interval-match access. The file is organized into buckets that are spread across multiple servers; a server may hold several buckets. Client requests are serviced by mapping keys onto buckets and looking up the corresponding server in an address table. Dynamic growth, in both file size and access load, is supported by bucket splits and by bucket migrations onto existing or newly created servers.
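As a rough illustration of the lookup path described above (key to bucket, bucket to server via an address table), the following minimal sketch may help. It is not the DiFS algorithm from the paper: the class `AddressTable`, its methods, and the choice of an order-preserving interval partition are all assumptions made for the example.

```python
from bisect import bisect_right


class AddressTable:
    """Hypothetical address table: maps keys to buckets and buckets to servers.

    Assumption: buckets partition the key space into contiguous intervals,
    so that interval-match queries touch a contiguous run of buckets.
    """

    def __init__(self, first_server):
        self.bounds = []               # sorted split keys; bucket i holds keys in [bounds[i-1], bounds[i])
        self.servers = [first_server]  # servers[i] is the server holding bucket i

    def bucket_for(self, key):
        # Number of split keys <= key gives the bucket index.
        return bisect_right(self.bounds, key)

    def server_for(self, key):
        # Exact-match access: key -> bucket -> server.
        return self.servers[self.bucket_for(key)]

    def servers_for_range(self, low, high):
        # Interval-match access: every server holding a bucket that overlaps [low, high].
        first, last = self.bucket_for(low), self.bucket_for(high)
        seen = []
        for s in self.servers[first:last + 1]:
            if s not in seen:
                seen.append(s)
        return seen

    def split(self, bucket, split_key, new_server=None):
        # Split one bucket at split_key; the upper half stays on the same
        # server unless a target server is given.
        lo = self.bounds[bucket - 1] if bucket > 0 else None
        hi = self.bounds[bucket] if bucket < len(self.bounds) else None
        if (lo is not None and split_key <= lo) or (hi is not None and split_key >= hi):
            raise ValueError("split key lies outside the bucket's key range")
        target = self.servers[bucket] if new_server is None else new_server
        self.bounds.insert(bucket, split_key)
        self.servers.insert(bucket + 1, target)

    def migrate(self, bucket, target_server):
        # Bucket migration: only the table entry changes in this sketch.
        self.servers[bucket] = target_server


table = AddressTable("s0")
table.split(0, 50)             # two buckets, both still on s0
table.migrate(1, "s1")         # move the upper bucket to another server
table.split(1, 80, "s2")       # split again, placing the new bucket on s2
```

In this sketch a split and a migration are separate steps, mirroring the abstract's distinction between the two; how clients keep their copy of the table up to date is not covered here.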

The major problem we address is scalability, in the sense that both the file size and the client throughput can be scaled up by linearly increasing the number of servers and dynamically redistributing the data. Unlike previous work with similar objectives, our data redistribution explicitly considers the cost/performance ratio of the system by aiming to minimize the number of servers used to provide the required performance. A new server is added only if, as a result, the overall server load in the system does not drop below a pre-specified threshold. Simulation results demonstrate scalability with controlled cost/performance and the importance of global load control. The impact of various tuning parameters on the effectiveness of the load control is studied in detail. Finally, we compare our approach with other approaches known to date and demonstrate that each of the previous approaches can be recast as a special case of our model.
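One possible reading of the server-addition rule can be sketched as follows. This is an interpretation for illustration only, not the paper's load-control policy: it assumes load is a per-server utilization figure, that "overall server load" means the average over all servers, and that migration to an existing server is tried before growing the system. The names `should_add_server`, `choose_action`, `capacity`, and `threshold` are hypothetical.

```python
def should_add_server(server_loads, threshold):
    """Return True if the average load would stay at or above `threshold`
    after spreading the same total load over one additional server."""
    projected = sum(server_loads) / (len(server_loads) + 1)
    return projected >= threshold


def choose_action(server_loads, source, bucket_load, capacity, threshold):
    """Decide how to relieve the server at index `source` of one bucket.

    Returns ("migrate", target_index), ("add", None), or ("stay", None).
    """
    # Prefer an existing server that can absorb the bucket within capacity.
    candidates = [
        i for i, load in enumerate(server_loads)
        if i != source and load + bucket_load <= capacity
    ]
    if candidates:
        # Pick the least-loaded candidate.
        return ("migrate", min(candidates, key=lambda i: server_loads[i]))
    # Otherwise add a server only if overall load stays above the threshold.
    if should_add_server(server_loads, threshold):
        return ("add", None)
    return ("stay", None)
```

For example, with loads `[0.9, 0.3]`, a bucket of load `0.2`, and a capacity of `0.8`, the sketch migrates to the second server rather than adding a third, which reflects the stated goal of minimizing the number of servers in use.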

Additional information

Recommended by: Mei Hsu

This material is based in part upon work supported by a grant from Hewlett-Packard Corporation and by NSF under grant IRI-9221947.

About this article

Cite this article

Breitbart, Y., Vingralek, R. & Weikum, G. Load control in scalable distributed file structures. Distrib Parallel Databases 4, 319–354 (1996). https://doi.org/10.1007/BF00119338

  • DOI: https://doi.org/10.1007/BF00119338
