Advances in the Parallelisation of Software for Quantum Chemistry Applications

  • Martin Roderus
  • Alexei Matveev
  • Hans-Joachim Bungartz
  • Notker Rösch
Conference paper
Part of the Lecture Notes in Computational Science and Engineering book series (LNCSE, volume 93)


Density functional theory (DFT) provides some of the most important methods used in computational theory today. These methods allow one to determine the electronic structure of finite chemical systems, be they molecules or clusters, using a quantum-mechanical model, and thus expose the great majority of the systems’ properties relevant to chemical applications. However, the numerical treatment of large chemical systems proves to be expensive, requiring elaborate parallelisation strategies.

This paper presents two recent developments which aim at improving the parallel scalability of the quantum chemistry code ParaGauss. First, we introduce a new Fortran interface to parallel matrix algebra and its library implementation. This interface specifies a set of distributed data objects, combined with a set of linear algebra operators. Thus, complicated algebraic expressions can be written concisely in pseudo-mathematical notation, while the numerical computations are carried out by back-end parallel routines. This technique is evaluated on relativistic transformations, as implemented in ParaGauss.

The second development addresses the solution of the generalized matrix eigenvalue problem, an inherent step in electronic structure calculations. When the symmetry of a molecule is exploited, the pertinent matrices exhibit a block-diagonal structure, which makes the efficient use of existing parallel eigenvalue solvers difficult. We discuss a technique that uses a malleable parallel task scheduling (MPTS) algorithm for scheduling instances of parallel ScaLAPACK routines on the available processor resources. This technique significantly improves the parallel performance of this numerical step, reducing the corresponding execution time to below 1 s in most applications considered.
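The scheduling idea behind the second development can be illustrated with a small sketch. It is not the MPTS algorithm of the paper: it is a simple greedy heuristic, and it assumes that there are no more diagonal blocks than processors, so that all blocks can be solved concurrently. The cost model `block_time`, its `comm` parameter, and the function `allot_processors` are hypothetical names introduced here for illustration only. Each block starts with one processor, and spare processors are handed one at a time to the block that currently takes longest, as long as this shortens its run time.

```python
import heapq


def block_time(n, p, comm=50.0):
    """Hypothetical cost model for solving one n x n block on p processors.

    The n**3 term stands for the arithmetic work shared by p processors,
    the second term for communication overhead that grows with p.
    Both the form and the constant are assumptions, not measured data.
    """
    return n**3 / p + comm * n * (p - 1)


def allot_processors(block_sizes, total_procs, cost=block_time):
    """Greedily distribute processors over blocks that all run concurrently.

    Returns (procs, makespan): the number of processors given to each
    block and the resulting run time of the slowest block.
    """
    if len(block_sizes) > total_procs:
        raise ValueError("this sketch needs at least one processor per block")

    procs = [1] * len(block_sizes)
    # Max-heap on the current run time (heapq is a min-heap, so negate).
    heap = [(-cost(n, 1), i) for i, n in enumerate(block_sizes)]
    heapq.heapify(heap)

    spare = total_procs - len(block_sizes)
    while spare > 0:
        neg_time, i = heap[0]
        faster = cost(block_sizes[i], procs[i] + 1)
        if faster >= -neg_time:
            # The slowest block does not benefit from another processor,
            # so the makespan cannot be reduced any further.
            break
        procs[i] += 1
        spare -= 1
        heapq.heapreplace(heap, (-faster, i))

    makespan = -heap[0][0]
    return procs, makespan


if __name__ == "__main__":
    # Three symmetry blocks of very different size on 8 processors.
    procs, makespan = allot_processors([100, 50, 10], 8)
    print(procs, makespan)
```

Under this assumed cost model the large block receives most of the processors, while small blocks stay on a single processor where communication overhead would otherwise dominate. A full malleable task scheduler additionally has to handle the case of more blocks than processors, for example by running groups of blocks one after another.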


High performance computing; Parallel numerical algebra; Density functional theory; Relativistic quantum chemistry; Scheduling algorithms



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Martin Roderus (1)
  • Alexei Matveev (2)
  • Hans-Joachim Bungartz (1)
  • Notker Rösch (3, 4)

  1. Department of Informatics, Technische Universität München, Garching, Germany
  2. Department Chemie, Technische Universität München, Garching, Germany
  3. Department Chemie & Catalysis Research Center, Technische Universität München, Garching, Germany
  4. Institute of High Performance Computing, Agency of Science, Technology, and Research, Singapore, Singapore
