Abstract
Exploiting multiprocessor architectures requires parallel sorting algorithms. This paper presents a parallel sorting algorithm based on horizontal parallelization. The algorithm is suited to large data volumes (external sorting) and does not suffer from processing skew in the presence of data skew. Its core is a new adaptive partitioning method, which counters data skew by taking samples that represent the distribution of the input data. The algorithm has been implemented on top of a shared-disk multiprocessor architecture. The performance evaluation shows that it achieves linear speedup. Furthermore, the optimal degree of CPU parallelism is derived for the case where I/O limitations are taken into account.
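The idea of sample-based partitioning mentioned above can be illustrated with a small sketch. This is a generic, in-memory sample sort and not the authors' adaptive partitioning method, whose details are not given in the abstract. The function names (choose_splitters, partition, sample_sort) and the parameters sample_size and seed are assumptions made for illustration. The partitions are sorted sequentially here; in a parallel setting each one would be handed to a separate processor.

```python
import bisect
import random


def choose_splitters(sample, num_partitions):
    """Pick up to num_partitions - 1 splitter keys at evenly spaced
    quantiles of the sorted sample (hypothetical helper)."""
    ordered = sorted(sample)
    if not ordered:
        return []
    return [ordered[(i * len(ordered)) // num_partitions]
            for i in range(1, num_partitions)]


def partition(records, splitters):
    """Route each key to the partition whose key range contains it."""
    parts = [[] for _ in range(len(splitters) + 1)]
    for key in records:
        parts[bisect.bisect_right(splitters, key)].append(key)
    return parts


def sample_sort(records, num_partitions, sample_size, seed=0):
    """Return a list of sorted runs whose concatenation is sorted.

    Assumption: a uniform random sample of the input approximates its
    key distribution, so quantile splitters give partitions of similar
    size even when the keys are skewed.
    """
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    splitters = choose_splitters(sample, num_partitions)
    parts = partition(records, splitters)
    # Each partition is independent of the others, so this loop is the
    # part that could run on separate processors.
    return [sorted(p) for p in parts]
```

Because the splitters come from the sample rather than from fixed key ranges, a heavily skewed input still tends to be split into partitions of comparable size. How close the sizes get depends on the sample size, which this sketch leaves to the caller.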
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Pawlowski, M., Bayer, R. (1993). Parallel Sorting of Large Data Volumes on Distributed Memory Multiprocessors. In: Bode, A., Dal Cin, M. (eds) Parallel Computer Architectures. Lecture Notes in Computer Science, vol 732. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-21577-7_18
DOI: https://doi.org/10.1007/978-3-662-21577-7_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57307-4
Online ISBN: 978-3-662-21577-7
eBook Packages: Springer Book Archive