Abstract
Existing approaches to programming parallel architectures range from parallelizing compilers to distributed concurrent programming. Intermediate approaches propose a more structured form of parallelism: algorithmic skeletons are higher-order functions that capture common patterns of parallel algorithms. The user of a skeleton library only has to compose skeletons to write her parallel application. Performance matters when designing a parallel program, so it is valuable for the programmer to rely on a simple yet realistic parallel performance model such as the Bulk Synchronous Parallel (BSP) model. We present OSL, the Orléans Skeleton Library, a library of BSP algorithmic skeletons in C++. It offers data-parallel skeletons on arrays as well as communication-oriented skeletons. The performance of OSL is demonstrated with two applications: the heat equation and the FFT.
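To make the idea of skeletons as higher-order functions concrete, here is a minimal sequential sketch in C++. It is not OSL's API: the names `DistArray`, `distribute`, `skel_map`, `skel_reduce`, `superstep_cost`, and `demo_sum_of_squares` are hypothetical, and the "processors" are simulated as blocks inside a single process. The cost function uses the standard BSP superstep cost w + h·g + L, where w is the maximum local work, h the maximum number of words sent or received by a processor, g the communication gap, and L the barrier latency.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a distributed array: one local block per
// simulated processor. A real BSP library would place each block on a
// different process; here everything lives in one address space.
template <typename T>
struct DistArray {
    std::vector<std::vector<T>> blocks;  // blocks[i] = data held by processor i
};

// Split `data` into `p` contiguous blocks of nearly equal size.
template <typename T>
DistArray<T> distribute(const std::vector<T>& data, std::size_t p) {
    assert(p > 0);
    DistArray<T> a;
    a.blocks.resize(p);
    const std::size_t n = data.size();
    for (std::size_t i = 0; i < p; ++i) {
        const auto lo = static_cast<std::ptrdiff_t>(i * n / p);
        const auto hi = static_cast<std::ptrdiff_t>((i + 1) * n / p);
        a.blocks[i].assign(data.begin() + lo, data.begin() + hi);
    }
    return a;
}

// Map skeleton (assumed shape): apply f to every element. Each block is
// processed independently, so no communication would be required.
template <typename T, typename F>
DistArray<T> skel_map(F f, const DistArray<T>& a) {
    DistArray<T> out;
    out.blocks.reserve(a.blocks.size());
    for (const auto& block : a.blocks) {
        std::vector<T> local;
        local.reserve(block.size());
        for (const T& x : block) local.push_back(f(x));
        out.blocks.push_back(std::move(local));
    }
    return out;
}

// Reduce skeleton (assumed shape): a local reduction per block, then a
// combination of the partial results. In a real BSP setting the partial
// results would be exchanged and followed by a barrier; here they are
// simply combined in the same process.
template <typename T, typename Op>
T skel_reduce(Op op, T identity, const DistArray<T>& a) {
    std::vector<T> partial;  // one partial result per simulated processor
    partial.reserve(a.blocks.size());
    for (const auto& block : a.blocks)
        partial.push_back(std::accumulate(block.begin(), block.end(), identity, op));
    return std::accumulate(partial.begin(), partial.end(), identity, op);
}

// Standard BSP cost of one superstep: w + h * g + L.
double superstep_cost(double w, double h, double g, double L) {
    return w + h * g + L;
}

// Composition of two skeletons: sum of squares of 1..4 over 3 simulated processors.
int demo_sum_of_squares() {
    const std::vector<int> data{1, 2, 3, 4};
    const DistArray<int> a = distribute(data, 3);
    const DistArray<int> sq = skel_map([](int x) { return x * x; }, a);
    return skel_reduce([](int x, int y) { return x + y; }, 0, sq);
}
```

The application itself is just the composition in `demo_sum_of_squares`; how data is split and how partial results are combined stays inside the skeletons, which is what lets a cost model such as BSP be attached to each skeleton rather than to user code.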
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Javed, N., Loulergue, F. (2009). OSL: Optimized Bulk Synchronous Parallel Skeletons on Distributed Arrays. In: Dou, Y., Gruber, R., Joller, J.M. (eds) Advanced Parallel Processing Technologies. APPT 2009. Lecture Notes in Computer Science, vol 5737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03644-6_34
DOI: https://doi.org/10.1007/978-3-642-03644-6_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03643-9
Online ISBN: 978-3-642-03644-6
eBook Packages: Computer Science, Computer Science (R0)