Abstract
This paper presents the design and implementation of DPH, a storage layer for cluster environments. DPH is a Distributed Data Structure (DDS) based on the distribution of a paged hash table. It combines main memory with file system resources across the cluster to implement a distributed dictionary that can store very large data sets using key-based addressing. The DPH storage layer is supported by a collection of cluster-aware utilities and services, and access to it is provided through a user-level API. A preliminary performance evaluation shows promising results.
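The abstract gives no implementation details, so the following is only a toy, single-process sketch of the general idea of a paged hash table: a key is hashed to a node and a page, a bounded number of pages stay in main memory, and the remaining pages are written to files. Everything in it is an assumption made for illustration and not the DPH API: the class name `PagedHashTable`, its parameters, the SHA-1 based placement, the least-recently-used eviction policy, and the pickle page files. The "nodes" are plain labels here, not real cluster machines.

```python
import hashlib
import os
import pickle
import tempfile
from collections import OrderedDict


class PagedHashTable:
    """Toy model of a paged hash table (hypothetical, not the DPH code)."""

    def __init__(self, nodes, pages_per_node=4, max_resident=2, spill_dir=None):
        if max_resident < 1:
            raise ValueError("max_resident must be at least 1")
        self.nodes = list(nodes)              # node labels (assumed names)
        self.pages_per_node = pages_per_node  # pages owned by each node
        self.max_resident = max_resident      # pages kept in main memory
        self.spill_dir = spill_dir or tempfile.mkdtemp(prefix="paged_ht_")
        self.resident = OrderedDict()         # (node, page) -> dict, LRU order

    def _locate(self, key):
        # Key-based addressing: the hash alone decides the node and the page.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        node = self.nodes[h % len(self.nodes)]
        page = (h // len(self.nodes)) % self.pages_per_node
        return node, page

    def _path(self, page_id):
        node, page = page_id
        return os.path.join(self.spill_dir, f"{node}_{page}.page")

    def _load(self, page_id):
        # Return the page from memory, or read it back from its file.
        if page_id in self.resident:
            self.resident.move_to_end(page_id)
            return self.resident[page_id]
        path = self._path(page_id)
        if os.path.exists(path):
            with open(path, "rb") as f:
                page = pickle.load(f)
        else:
            page = {}
        self.resident[page_id] = page
        # Evict the least recently used pages to files when over the limit.
        while len(self.resident) > self.max_resident:
            victim_id, victim = self.resident.popitem(last=False)
            with open(self._path(victim_id), "wb") as f:
                pickle.dump(victim, f)
        return page

    def put(self, key, value):
        self._load(self._locate(key))[key] = value

    def get(self, key, default=None):
        return self._load(self._locate(key)).get(key, default)
```

A caller would create a table with, for example, `PagedHashTable(["n0", "n1"])` and then use `put` and `get` as with an ordinary dictionary. In a real cluster setting each node would hold its own pages and requests would travel over the network, which this sketch does not model.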
Supported by PRODEP III, through the grant 5.3/N/199.006/00, and SAPIENS, through the grant 41739/CHS/2001.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rufino, J., Pina, A., Alves, A., Exposto, J. (2003). Distributed Paged Hash Tables. In: Palma, J.M.L.M., Sousa, A.A., Dongarra, J., Hernández, V. (eds) High Performance Computing for Computational Science — VECPAR 2002. VECPAR 2002. Lecture Notes in Computer Science, vol 2565. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36569-9_46
DOI: https://doi.org/10.1007/3-540-36569-9_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00852-1
Online ISBN: 978-3-540-36569-3
eBook Packages: Springer Book Archive