Parallelization of an Edge- and Coherence-Enhancing Anisotropic Diffusion Filter with a Distributed Memory Approach Based on GPI

  • Martin Kühn
Conference paper


Numerical algorithms in the seismic industry are among the most challenging applications of High Performance Computing and require an ever-growing amount of compute power and main memory. The Global Address Space Programming Interface (GPI) provides a model for programming distributed-memory clusters based on RDMA transfers in a Partitioned Global Address Space (PGAS). Based on GPI, a generic, straightforward parallelization of an Anisotropic Diffusion Filter (ADF) is implemented as an example of an explicit finite difference scheme. Key features of the implementation are a complete overlap of computation with network data transfers, a dynamic load distribution scheme, and the use of one-sided communication patterns throughout the algorithm to orchestrate read and write accesses to the image data. Synchronization points between the compute nodes, such as barriers, are avoided entirely. Benchmarks on a cluster with 260 nodes and 1040 cores reveal a constant communication overhead of less than 6% of the total computation time. This bound still holds when the compute nodes in the cluster differ significantly in performance.
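The domain-decomposition idea behind such an explicit finite difference scheme can be illustrated with a minimal sketch. It is not the implementation described in the paper: it assumes a 1-D linear (isotropic) diffusion step instead of the anisotropic filter, uses plain Python lists instead of GPI or RDMA transfers, and the names `diffusion_step`, `diffusion_step_partitioned`, `parts`, and `tau` are hypothetical. It only shows that each slab of the data can be updated independently once it has read a one-cell halo from its neighbours, which is the kind of read access that one-sided communication would serve in a distributed setting.

```python
def diffusion_step(u, tau=0.2):
    """One explicit step of 1-D linear diffusion with reflecting boundaries."""
    n = len(u)
    out = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]
        right = u[i + 1] if i < n - 1 else u[i]
        out[i] = u[i] + tau * (left - 2.0 * u[i] + right)
    return out


def diffusion_step_partitioned(u, parts, tau=0.2):
    """Same step, computed slab by slab; each slab reads a one-cell halo."""
    n = len(u)
    out = [0.0] * n
    # Slab boundaries: slab p owns the cells bounds[p] .. bounds[p + 1] - 1.
    bounds = [n * p // parts for p in range(parts + 1)]
    for p in range(parts):
        lo, hi = bounds[p], bounds[p + 1]
        if lo == hi:
            continue  # empty slab, nothing to update
        # Halo cells: in a distributed run these would be read from the
        # neighbouring node's memory (assumption for illustration only).
        halo_left = u[lo - 1] if lo > 0 else u[lo]
        halo_right = u[hi] if hi < n else u[hi - 1]
        local = [halo_left] + u[lo:hi] + [halo_right]
        for j in range(1, len(local) - 1):
            out[lo + j - 1] = local[j] + tau * (
                local[j - 1] - 2.0 * local[j] + local[j + 1]
            )
    return out
```

Because every slab writes only its own cells of the next time step and reads only the current time step, the slabs can in principle be processed in any order or concurrently, and the partitioned result matches the unpartitioned one.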


Keywords (machine-generated, not supplied by the authors): Message Passing Interface, Current Time Step, Dependency Range, Synchronization Point, Remote Direct Memory Access



We would like to thank Joachim Weickert, University of Saarbrücken, for providing a single-threaded reference implementation of the PSPro Edge- and Coherence-Enhancing Anisotropic Diffusion Filter.



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Martin Kühn (1)
  1. Fraunhofer ITWM, Kaiserslautern, Germany
