Summary
Shared-memory multicore computing platforms are becoming commonplace, and loop parallelization with OpenMP offers users an easy way to harness their power. Tools for automatic differentiation (AD) should therefore handle such codes in a way that keeps the derivative evaluation parallel as well. In this paper, we explore this issue using a plasma simulation code. Its structure, in essence a time-stepping loop with several parallelizable inner loops, is representative of many other computations. Using this code as an example, we develop a strategy for efficiently implementing the reverse mode of AD with trace-based AD tools and implement it with ADOL-C. The strategy combines checkpointing at the outer level with parallel trace generation and evaluation at the inner level. We discuss the extensions ADOL-C needs to work in a multithreaded environment and the setup required in the user code, and we present performance results on a shared-memory multiprocessor.
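The combination of outer-level checkpointing and inner-level parallelism can be illustrated with a minimal Python sketch. It is not the ADOL-C implementation and uses no ADOL-C calls. The model (an element-wise update `u[i] + DT*sin(u[i])`), the objective, the hand-written adjoint, the equidistant checkpoint stride, and all names (`step`, `step_adjoint`, `gradient_with_checkpoints`, `DT`, `N_THREADS`) are assumptions chosen for illustration. A thread pool stands in for an OpenMP parallel loop.

```python
import math
from concurrent.futures import ThreadPoolExecutor

DT = 0.1        # hypothetical time step size
N_THREADS = 4   # hypothetical number of workers

def chunks(n, parts):
    """Split range(n) into at most `parts` contiguous index ranges."""
    size = max(1, -(-n // parts))  # ceiling division, at least 1
    return [range(lo, min(lo + size, n)) for lo in range(0, n, size)]

def step(u, pool):
    """One time step. Components are independent, so chunks run in parallel."""
    def work(idx):
        return [u[i] + DT * math.sin(u[i]) for i in idx]
    parts = pool.map(work, chunks(len(u), N_THREADS))
    return [v for part in parts for v in part]

def step_adjoint(u, ubar_next, pool):
    """Adjoint of `step` at input state u: ubar[i] = (1 + DT*cos(u[i])) * ubar_next[i]."""
    def work(idx):
        return [(1.0 + DT * math.cos(u[i])) * ubar_next[i] for i in idx]
    parts = pool.map(work, chunks(len(u), N_THREADS))
    return [v for part in parts for v in part]

def objective(u):
    """Toy objective J = 0.5 * ||u||^2 evaluated at the final state."""
    return 0.5 * sum(x * x for x in u)

def gradient_with_checkpoints(u0, n_steps, stride):
    """Return (J, dJ/du0) while storing only every `stride`-th state."""
    with ThreadPoolExecutor(N_THREADS) as pool:
        # Forward sweep over the outer time loop: keep checkpoints only.
        checkpoints = {}
        u = list(u0)
        for t in range(n_steps):
            if t % stride == 0:
                checkpoints[t] = u
            u = step(u, pool)
        value = objective(u)
        ubar = list(u)  # dJ/du_T for J = 0.5 * ||u_T||^2
        # Reverse sweep: recompute each segment from its checkpoint,
        # then apply the step adjoints in reverse order.
        for start in sorted(checkpoints, reverse=True):
            end = min(start + stride, n_steps)
            states = [checkpoints[start]]
            for _ in range(start, end - 1):
                states.append(step(states[-1], pool))
            for s in reversed(states):
                ubar = step_adjoint(s, ubar, pool)
        return value, ubar

if __name__ == "__main__":
    J, grad = gradient_with_checkpoints([0.3, -1.2, 2.0, 0.7, 1.5], 10, 3)
    print(J, grad)
```

In this sketch the stride trades memory for recomputation: a stride of 1 stores every state, while a larger stride stores fewer states and repeats more forward steps during the reverse sweep. The gradient does not depend on the stride.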
Keywords
- Parallelism
- OpenMP
- Reverse mode
- Checkpointing
- ADOL-C
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bischof, C., Guertler, N., Kowarz, A., Walther, A. (2008). Parallel Reverse Mode Automatic Differentiation for OpenMP Programs with ADOL-C. In: Bischof, C.H., Bücker, H.M., Hovland, P., Naumann, U., Utke, J. (eds) Advances in Automatic Differentiation. Lecture Notes in Computational Science and Engineering, vol 64. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68942-3_15
DOI: https://doi.org/10.1007/978-3-540-68942-3_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68935-5
Online ISBN: 978-3-540-68942-3
eBook Packages: Mathematics and Statistics (R0)
