Array-Based Reduction Operations for a Parallel Adaptive FEM

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7686)

Abstract

In many scientific computing applications, reduction operations can become a performance bottleneck. This article investigates the performance of different coarse- and fine-grained methods for implementing reductions. Fine-grained reductions using atomic operations or fine-grained explicit locks are compared with the coarse-grained reduction operations supplied by OpenMP and MPI.

The reduction operations investigated are used in an adaptive FEM. The performance results show that applications can gain a speedup from fine-grained reduction, since this implementation makes it possible to hide the reduction between calculations while minimising the time spent waiting for synchronisation.
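As a rough illustration of the two styles contrasted in the abstract, the following minimal sketch sums scattered contributions into a shared array in two ways. It is not the authors' implementation: Python threads stand in for OpenMP threads, a per-entry lock stands in for an atomic update or fine-grained explicit lock, and the function names (`reduce_fine_grained`, `reduce_coarse_grained`, `run_threads`) are hypothetical.

```python
import threading


def run_threads(worker, chunks):
    """Run worker(tid, chunk) in one thread per chunk and wait for all of them."""
    threads = [threading.Thread(target=worker, args=(tid, chunk))
               for tid, chunk in enumerate(chunks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def split(contributions, num_threads):
    """Distribute (index, value) pairs round-robin over the threads."""
    return [contributions[t::num_threads] for t in range(num_threads)]


def reduce_fine_grained(contributions, n, num_threads=4):
    """Fine-grained: every thread adds directly into the shared array.

    One lock per array entry (an assumed stand-in for an atomic add), so a
    thread only waits when another thread updates the same entry.
    """
    result = [0.0] * n
    locks = [threading.Lock() for _ in range(n)]

    def worker(tid, chunk):
        for index, value in chunk:
            with locks[index]:
                result[index] += value

    run_threads(worker, split(contributions, num_threads))
    return result


def reduce_coarse_grained(contributions, n, num_threads=4):
    """Coarse-grained: every thread fills a private copy of the array.

    The private copies are combined in a single step after all threads have
    finished, similar in spirit to a reduction over an array.
    """
    private = [[0.0] * n for _ in range(num_threads)]

    def worker(tid, chunk):
        for index, value in chunk:
            private[tid][index] += value

    run_threads(worker, split(contributions, num_threads))
    return [sum(column) for column in zip(*private)]
```

Both functions take a list of `(index, value)` pairs and the array length `n`, and return the same sums. The fine-grained variant interleaves synchronisation with the computation, while the coarse-grained variant defers all combining to one step at the end and needs one private array per thread.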

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Balg, M., Lang, J., Meyer, A., Rünger, G. (2013). Array-Based Reduction Operations for a Parallel Adaptive FEM. In: Keller, R., Kramer, D., Weiss, JP. (eds) Facing the Multicore-Challenge III. Lecture Notes in Computer Science, vol 7686. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35893-7_3

  • DOI: https://doi.org/10.1007/978-3-642-35893-7_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35892-0

  • Online ISBN: 978-3-642-35893-7

  • eBook Packages: Computer Science, Computer Science (R0)
