
Sorting and Selection on Parallel Disk Models

  • Chapter
Handbook of Massive Data Sets

Part of the book series: Massive Computing (MACO, volume 4)


Abstract

Data explosion is an increasingly prevalent problem in every field of science. Traditional out-of-core models that assume a single disk have proved inadequate for handling voluminous data. As a result, models that employ multiple disks have been proposed in the literature. For example, the Parallel Disk Systems (PDS) model assumes D disks and a single computer. It further assumes that one block of data from each of the D disks can be fetched into main memory in a single parallel I/O operation.
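As a rough illustration of the cost measure this model implies, the sketch below counts the parallel I/O operations needed to read a data set once. It is a minimal sketch under stated assumptions, not code from the chapter: the round-robin striping of blocks across disks, the block size parameter, and the function names `striped_layout` and `parallel_scan_ios` are all hypothetical choices made here for illustration.

```python
from math import ceil


def striped_layout(num_blocks, num_disks):
    """Assign block i to disk (i mod D): an assumed round-robin striping."""
    disks = [[] for _ in range(num_disks)]
    for block_id in range(num_blocks):
        disks[block_id % num_disks].append(block_id)
    return disks


def parallel_scan_ios(num_records, block_size, num_disks):
    """Count the parallel I/O operations needed to read every record once.

    One parallel I/O fetches at most one block from each of the D disks,
    so the disk holding the most blocks determines the total.
    """
    num_blocks = ceil(num_records / block_size)
    disks = striped_layout(num_blocks, num_disks)
    return max(len(blocks) for blocks in disks)


if __name__ == "__main__":
    # Hypothetical sizes: 1000 records, 10 records per block.
    for d in (1, 2, 4):
        print(d, "disk(s):", parallel_scan_ios(1000, 10, d), "parallel I/Os")
```

With the data spread evenly, a full scan takes about 1/D as many operations as on a single disk, which is the speedup that algorithms designed for this model try to preserve across every pass over the data.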

In this article, we survey sorting and selection algorithms that have been devised for out-of-core models assuming multiple disks. We also consider practical implementations of parallel disk models.




Copyright information

© 2002 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Rajasekaran, S. (2002). Sorting and Selection on Parallel Disk Models. In: Abello, J., Pardalos, P.M., Resende, M.G.C. (eds) Handbook of Massive Data Sets. Massive Computing, vol 4. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0005-6_24


  • DOI: https://doi.org/10.1007/978-1-4615-0005-6_24

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-4882-5

  • Online ISBN: 978-1-4615-0005-6

  • eBook Packages: Springer Book Archive
