Abstract
Data explosion is an increasingly prevalent problem in every field of science. Traditional out-of-core models that assume a single disk have been found inadequate to handle voluminous data. As a result, models that employ multiple disks have been proposed in the literature. For example, the Parallel Disk Systems (PDS) model assumes D disks and a single computer. It is also assumed that a block of data from each of the D disks can be fetched into the main memory in one parallel I/O operation.
In this article, we survey sorting and selection algorithms that have been devised for out-of-core models assuming multiple disks. We also consider practical implementations of parallel disk models.
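The PDS model described above can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: the block size, the round-robin striped layout, and the function names `stripe` and `read_all` are hypothetical and not taken from the chapter. It counts how many parallel I/O operations one pass over the data costs when each operation fetches at most one block from each of the D disks.

```python
def stripe(items, block_size, num_disks):
    """Cut items into blocks and deal the blocks round-robin onto the disks.

    Assumption: a striped layout; the PDS model itself does not fix a layout.
    """
    if block_size <= 0 or num_disks <= 0:
        raise ValueError("block_size and num_disks must be positive")
    blocks = [items[i:i + block_size] for i in range(0, len(items), block_size)]
    disks = [[] for _ in range(num_disks)]
    for j, block in enumerate(blocks):
        disks[j % num_disks].append(block)
    return disks


def read_all(disks):
    """Read every block, fetching at most one block per disk per parallel I/O.

    Returns the data in the order read and the number of parallel I/Os used.
    """
    memory = []
    ios = 0
    depth = max((len(d) for d in disks), default=0)
    for level in range(depth):
        # One parallel I/O: each disk contributes its next block, if it has one.
        for d in disks:
            if level < len(d):
                memory.extend(d[level])
        ios += 1
    return memory, ios


if __name__ == "__main__":
    data = list(range(1000))
    _, ios_single = read_all(stripe(data, block_size=10, num_disks=1))
    _, ios_parallel = read_all(stripe(data, block_size=10, num_disks=4))
    print(ios_single, ios_parallel)
```

With 1000 items in blocks of 10, a single disk needs 100 I/O operations for one pass, while four disks need 25. This factor-of-D saving on a full scan is what the multiple-disk models aim to carry over to sorting and selection.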
Copyright information
© 2002 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Rajasekaran, S. (2002). Sorting and Selection on Parallel Disk Models. In: Abello, J., Pardalos, P.M., Resende, M.G.C. (eds) Handbook of Massive Data Sets. Massive Computing, vol 4. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0005-6_24
DOI: https://doi.org/10.1007/978-1-4615-0005-6_24
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-4882-5
Online ISBN: 978-1-4615-0005-6
eBook Packages: Springer Book Archive