External Sorting and Permuting

  • Living reference work entry
Encyclopedia of Algorithms

Synonyms

Out-of-core sorting

Years and Authors of Summarized Original Work

1988; Aggarwal, Vitter

Problem Definition

Notations. The main properties of magnetic disks and multiple disk systems can be captured by the commonly used parallel disk model (PDM), which is summarized below in its current form as developed by Vitter and Shriver [22]:

  • N = problem size (in units of data items);

  • M = internal memory size (in units of data items);

  • B = block transfer size (in units of data items);

  • D = number of independent disk drives;

  • P = number of CPUs,

where M < N, and 1 ≤ DB ≤ M∕2. The data items are assumed to be of fixed length. In a single I/O, each of the D disks can simultaneously transfer a block of B contiguous data items. (In the original 1988 article [2], the D blocks per I/O were allowed to come from the same disk, which is not realistic.) If P ≤ D, each of the P processors can drive about D∕P disks; if D < P, each disk is shared by about P∕D processors. The internal memory size is M∕P...
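The parameters and constraints above can be sketched in a few lines of Python. This is only an illustration of the model as stated here, not code from the original work: the class name `PDMParams` and its method names are hypothetical, and `min_ios_to_touch_all` simply divides N by the DB items that one I/O can move, assuming every I/O uses all D disks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PDMParams:
    """Hypothetical container for the PDM parameters (sizes in data items)."""
    N: int  # problem size
    M: int  # internal memory size
    B: int  # block transfer size
    D: int  # number of independent disk drives
    P: int  # number of CPUs

    def __post_init__(self):
        # All quantities are positive integers.
        if min(self.N, self.M, self.B, self.D, self.P) < 1:
            raise ValueError("all parameters must be positive")
        # Constraints stated in the text: M < N and 1 <= DB <= M/2.
        if not self.M < self.N:
            raise ValueError("model assumes M < N")
        if not 2 * self.D * self.B <= self.M:
            raise ValueError("model assumes DB <= M/2")

    def items_per_io(self) -> int:
        """One I/O moves a block of B items on each of the D disks."""
        return self.D * self.B

    def disks_per_processor(self) -> float:
        """About D/P disks per processor (the case P <= D)."""
        return self.D / self.P

    def processors_per_disk(self) -> float:
        """About P/D processors per disk (the case D < P)."""
        return self.P / self.D

    def min_ios_to_touch_all(self) -> int:
        """Ceiling of N/(DB): fewest I/Os that can transfer all N items,
        assuming every I/O is fully parallel across the D disks."""
        return -(-self.N // self.items_per_io())


params = PDMParams(N=1_000_000, M=10_000, B=100, D=4, P=2)
print(params.items_per_io(), params.min_ios_to_touch_all())
```

The constructor rejects parameter combinations that violate the stated constraints, so a configuration such as M ≥ N, where the whole problem fits in internal memory, raises an error instead of being treated as an external-memory instance.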

Recommended Reading

  1. Aggarwal A, Plaxton CG (1994) Optimal parallel sorting in multi-level storage. In: Proceedings of the 5th ACM-SIAM symposium on discrete algorithms, Arlington, vol 5, pp 659–668

  2. Aggarwal A, Vitter JS (1988) The input/output complexity of sorting and related problems. Commun ACM 31:1116–1127

  3. Arge L, Goodrich MT, Nelson M, Sitchinava N (2008) Fundamental parallel algorithms for private-cache chip multiprocessors. In: Proceedings of the 20th symposium on parallelism in algorithms and architectures, Munich, pp 197–206

  4. Arge L, Knudsen M, Larsen K (1993) A general lower bound on the I/O-complexity of comparison-based algorithms. In: Proceedings of the workshop on algorithms and data structures, Montréal. Lecture notes in computer science, vol 709, pp 83–94

  5. Arge L, Thorup M (2013) RAM-efficient external memory sorting. In: Proceedings of the 24th international symposium on algorithms and computation, Hong Kong. Lecture notes in computer science, vol 8283, pp 491–501

  6. Barve RD, Kallahalla M, Varman PJ, Vitter JS (2000) Competitive analysis of buffer management algorithms. J Algorithms 36:152–181

  7. Barve RD, Vitter JS (2002) A simple and efficient parallel disk mergesort. ACM Trans Comput Syst 35:189–215

  8. Cormen TH, Sundquist T, Wisniewski LF (1999) Asymptotically tight bounds for performing BMMC permutations on parallel disk systems. SIAM J Comput 28:105–136

  9. Dementiev R, Sanders P (2003) Asynchronous parallel disk sorting. In: Proceedings of the 15th ACM symposium on parallelism in algorithms and architectures, San Diego, pp 138–148

  10. Hutchinson DA, Sanders P, Vitter JS (2005) Duality between prefetching and queued writing with parallel disks. SIAM J Comput 34:1443–1463

  11. Kallahalla M, Varman PJ (2005) Optimal read-once parallel disk scheduling. Algorithmica 43:309–343

  12. Knuth DE (1998) Sorting and searching. The art of computer programming, vol 3, 2nd edn. Addison-Wesley, Reading

  13. Matias Y, Segal E, Vitter JS (2006) Efficient bundle sorting. SIAM J Comput 36(2):394–410

  14. Nodine MH, Vitter JS (1993) Deterministic distribution sort in shared and distributed memory multiprocessors. In: Proceedings of the 5th ACM symposium on parallel algorithms and architectures, Velen, vol 5. ACM, pp 120–129

  15. Nodine MH, Vitter JS (1995) Greed sort: an optimal sorting algorithm for multiple disks. J ACM 42:919–933

  16. Rahn M, Sanders P, Singler J (2010) Scalable distributed-memory external sorting. In: Proceedings of the 26th IEEE international conference on data engineering, Long Beach, pp 685–688

  17. Shah R, Varman PJ, Vitter JS (2004) Online algorithms for prefetching and caching on parallel disks. In: Proceedings of the 16th ACM symposium on parallel algorithms and architectures, Barcelona, pp 255–264

  18. Thonangi R, Yang J (2013) Permuting data on random-access block storage. Proc VLDB Endow 6(9):721–732

  19. Vitter JS (2001) External memory algorithms and data structures: dealing with massive data. ACM Comput Surv 33(2):209–271

  20. Vitter JS (2008) Algorithms and data structures for external memory. Series on foundations and trends in theoretical computer science. Now Publishers, Hanover. (Also referenced as Volume 2, Issue 4 of Foundations and trends in theoretical computer science, Now Publishers)

  21. Vitter JS, Hutchinson DA (2006) Distribution sort with randomized cycling. J ACM 53:656–680

  22. Vitter JS, Shriver EAM (1994) Algorithms for parallel memory I: two-level memories. Algorithmica 12:110–147

Author information

Corresponding author

Correspondence to Jeffrey Scott Vitter.

Copyright information

© 2015 Springer Science+Business Media New York

About this entry

Cite this entry

Vitter, J. (2015). External Sorting and Permuting. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27848-8_137-2

  • DOI: https://doi.org/10.1007/978-3-642-27848-8_137-2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Online ISBN: 978-3-642-27848-8

  • eBook Packages: Springer Reference Computer Sciences, Reference Module Computer Science and Engineering