Parallel Sorting of Large Data Volumes on Distributed Memory Multiprocessors

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 732)

Abstract

The use of multiprocessor architectures requires the parallelization of sorting algorithms. A parallel sorting algorithm based on horizontal parallelization is presented. The algorithm is suited to large data volumes (external sorting) and does not suffer from processing skew in the presence of data skew. Its core is a new adaptive partitioning method: the effect of data skew is remedied by taking samples that represent the distribution of the input data. The parallel algorithm has been implemented on top of a shared disk multiprocessor architecture. The performance evaluation shows that the algorithm achieves linear speedup. Furthermore, the optimal degree of CPU parallelism is derived for the case in which I/O limitations are taken into account.
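The abstract does not spell out the adaptive partitioning method, so the following is only a minimal sketch of the general idea of sample-based range partitioning, not the authors' algorithm. The function names (`choose_splitters`, `partition`, `sample_sort`) and the parameters `num_partitions`, `sample_size`, and `seed` are hypothetical, and the per-partition sorts run sequentially here where a real system would assign each partition to its own processor.

```python
import random
from bisect import bisect_right


def choose_splitters(sample, num_partitions):
    """Pick num_partitions - 1 splitter keys at evenly spaced ranks of the
    sorted sample. Assumes the sample is non-empty when num_partitions > 1."""
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // num_partitions] for i in range(1, num_partitions)]


def partition(records, splitters):
    """Route each key to the bucket whose key range contains it."""
    buckets = [[] for _ in range(len(splitters) + 1)]
    for key in records:
        buckets[bisect_right(splitters, key)].append(key)
    return buckets


def sample_sort(records, num_partitions, sample_size, seed=0):
    """Sort by sampling, range partitioning, and sorting each partition.

    Because the splitters follow the sampled distribution, skewed input
    still tends to yield partitions of similar size (illustrative only).
    """
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    splitters = choose_splitters(sample, num_partitions)
    buckets = partition(records, splitters)
    # Each bucket could be sorted by a separate processor; the buckets are
    # already ordered relative to one another, so no final merge is needed.
    return [sorted(bucket) for bucket in buckets]
```

Concatenating the returned partitions in order gives the fully sorted output. A larger `sample_size` generally gives a closer estimate of the key distribution and therefore more even partition sizes, at the cost of more sampling work.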


Copyright information

© 1993 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Pawlowski, M., Bayer, R. (1993). Parallel Sorting of Large Data Volumes on Distributed Memory Multiprocessors. In: Bode, A., Dal Cin, M. (eds) Parallel Computer Architectures. Lecture Notes in Computer Science, vol 732. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-21577-7_18

  • DOI: https://doi.org/10.1007/978-3-662-21577-7_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-57307-4

  • Online ISBN: 978-3-662-21577-7

  • eBook Packages: Springer Book Archive
