Optimized Pipelined Parallel Merge Sort on the Cell BE

Keller, Jörg; Kessler, Christoph W.

doi:10.1007/978-3-642-00955-6_18


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5415)

Included in the following conference series:

European Conference on Parallel Processing


Abstract

Chip multiprocessors designed for streaming applications, such as the Cell BE, offer impressive peak performance but suffer from limited bandwidth to off-chip main memory. As the number of cores is expected to rise further, this bottleneck will become more critical in the coming years. Hence, memory-efficient algorithms are required. As a case study, we investigate parallel sorting on the Cell BE, a problem of great importance and a challenge because its ratio of computation to memory transfer is very low. Our previous work led to a parallel mergesort that reduces memory bandwidth requirements by pipelining between SPEs, but the allocation of SPEs was rather ad hoc. In our present work, we investigate mappings of merger nodes to SPEs. The mappings are designed to provide optimal trade-offs between load balancing, buffer memory consumption, and communication load on the on-chip bus. We solve this multi-objective optimization problem by deriving an integer linear programming formulation and computing Pareto-optimal solutions for the mapping of merge trees with up to 127 merger nodes. For mapping larger trees, we give a fast divide-and-conquer based approximation algorithm. We evaluate the sorting algorithm resulting from our mappings by a discrete event simulation.
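To make the mapping problem in the abstract concrete, the following Python sketch assigns the merger nodes of a merge tree to SPEs and reports two of the three objectives named above: the maximum load per SPE and the number of tree edges that cross SPE boundaries (a rough proxy for on-chip bus traffic). Everything here is an illustrative assumption rather than the authors' method: the tree is taken to be a complete binary tree in heap numbering, a merger at depth d is assumed to carry a 2^-d share of the root's throughput, and the placement uses a plain greedy heuristic instead of the paper's integer linear program or its divide-and-conquer approximation. Buffer memory consumption is not modeled, and the function names are hypothetical.

```python
import heapq


def node_load(node):
    """Assumed relative load of a merger node (heap numbering, root = 1).

    A node at depth d is assumed to process a 2**-d share of the root's stream.
    """
    depth = node.bit_length() - 1
    return 1.0 / (1 << depth)


def greedy_map(levels, num_spes):
    """Place each merger node on the currently least-loaded SPE.

    Nodes are visited from heaviest to lightest. This is a simple greedy
    heuristic chosen for illustration, not an optimal mapping.
    """
    heap = [(0.0, spe) for spe in range(num_spes)]  # (accumulated load, SPE id)
    mapping = {}
    for node in sorted(range(1, 1 << levels), key=node_load, reverse=True):
        load, spe = heapq.heappop(heap)
        mapping[node] = spe
        heapq.heappush(heap, (load + node_load(node), spe))
    return mapping


def max_load(mapping, num_spes):
    """Largest accumulated load on any single SPE."""
    loads = [0.0] * num_spes
    for node, spe in mapping.items():
        loads[spe] += node_load(node)
    return max(loads)


def cut_edges(mapping):
    """Number of parent-child tree edges whose endpoints sit on different SPEs."""
    return sum(
        1
        for node, spe in mapping.items()
        if node > 1 and mapping[node // 2] != spe
    )


if __name__ == "__main__":
    # A complete binary tree with 7 levels has 2**7 - 1 = 127 merger nodes.
    m = greedy_map(levels=7, num_spes=8)
    print("nodes mapped:", len(m))
    print("max SPE load:", max_load(m, 8))
    print("edges crossing SPEs:", cut_edges(m))
```

A greedy placement like this balances load reasonably well but ignores communication, so it tends to cut many tree edges. That tension between the objectives is what motivates treating the mapping as a multi-objective optimization problem.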



Author information

Authors and Affiliations

Dept. of Math. and Computer Science, FernUniversität in Hagen, 58084, Hagen, Germany
Jörg Keller
Dept. of Computer and Inf. Science, Linköpings Universitet, 58183, Linköping, Sweden
Christoph W. Kessler


Editor information

Editors and Affiliations

Departament Arquitectura de Computadors i Sistemes Operatius, Universitat Autònoma de Barcelona, 08193, Bellaterra, Spain
Eduardo César
Wirtschaftsuniversität Wien, 1090, Wien, Austria
Michael Alexander
Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich GmbH, 52425, Jülich, Germany
Achim Streit
NEC Laboratories Europe, NEC Europe Ltd., Rathausallee 10, 53757, Sankt Augustin, Germany
Jesper Larsson Träff
Université de Paris Nord, LIPN, CNRS UMR 7030, 99 avenue J.B. Clément, 93430, Villetaneuse, France
Christophe Cérin
Technische Universität Dresden, 01069, Dresden, Germany
Andreas Knüpfer
LMU München, Institut für Informatik, 80538, München, Germany
Dieter Kranzlmüller
Center for Computation and Technology (CCT), Louisiana State University, Baton Rouge, LA 70803, USA
Shantenu Jha


About this paper

Cite this paper

Keller, J., Kessler, C.W. (2009). Optimized Pipelined Parallel Merge Sort on the Cell BE. In: César, E., et al. Euro-Par 2008 Workshops - Parallel Processing. Euro-Par 2008. Lecture Notes in Computer Science, vol 5415. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00955-6_18


DOI: https://doi.org/10.1007/978-3-642-00955-6_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00954-9
Online ISBN: 978-3-642-00955-6
eBook Packages: Computer Science, Computer Science (R0)
