
Improving Memory Affinity of Geophysics Applications on NUMA Platforms Using Minas

  • Christiane Pousa Ribeiro
  • Márcio Castro
  • Jean-François Méhaut
  • Alexandre Carissimi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6449)

Abstract

In numerical scientific High Performance Computing (HPC), Non-Uniform Memory Access (NUMA) platforms are now commonplace. On such platforms, memory affinity management remains an important concern for overcoming the memory wall problem. Prior solutions present drawbacks such as machine dependency and a limited set of memory policies. This paper introduces Minas, a framework that provides explicit and automatic memory affinity management with architecture abstraction for cache-coherent NUMA (ccNUMA) platforms. We evaluate our solution on two ccNUMA platforms using two parallel geophysics applications. The results show performance improvements in comparison with other solutions available for Linux.
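
For readers unfamiliar with explicit memory affinity management, the sketch below shows, in minimal C, how an application places data on a specific memory bank using the standard Linux NUMA API (libnuma), one of the existing Linux solutions Minas is compared against. It is an illustration only, not Minas's own interface: the array size and the target node are arbitrary example values.

    /* Explicit data placement with libnuma; compile with -lnuma. */
    #include <numa.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "NUMA is not available on this system\n");
            return EXIT_FAILURE;
        }

        size_t n = 1 << 20;  /* illustrative: one million doubles */
        int node = 0;        /* illustrative: target memory bank  */

        /* Allocate the array on the memory bank of NUMA node 0, so
         * threads scheduled on that node access it locally. */
        double *a = numa_alloc_onnode(n * sizeof(double), node);
        if (a == NULL)
            return EXIT_FAILURE;

        for (size_t i = 0; i < n; i++)
            a[i] = 0.0;

        numa_free(a, n * sizeof(double));
        return EXIT_SUCCESS;
    }

Such calls tie the code to one machine's node numbering; frameworks like Minas aim to abstract this architecture-specific detail away.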

Keywords

Memory Access · High Performance Computing · Memory Bank · Seismic Wave Propagation · Memory Page



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Christiane Pousa Ribeiro (1)
  • Márcio Castro (1)
  • Jean-François Méhaut (1)
  • Alexandre Carissimi (2)
  1. LIG Laboratory - INRIA, University of Grenoble, Grenoble, France
  2. Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
