The Journal of Supercomputing, Volume 65, Issue 2, pp 930–948

A flexible and dynamic page migration infrastructure based on hardware counters

  • Juan A. Lorenzo-Castillo
  • Juan C. Pichel
  • Francisco F. Rivera
  • Tomás F. Pena
  • José C. Cabaleiro


Performance counters, also known as hardware counters, are a powerful monitoring mechanism included in the Performance Monitoring Unit (PMU) of most modern microprocessors. Their use is gaining popularity as an analysis and validation tool for profiling, since their overhead is virtually imperceptible and their precision has increased noticeably thanks to the new Precise Event-Based Sampling (PEBS) features.

In this paper, we present and evaluate a novel user-level tool, based on hardware counters, for monitoring and migrating pages dynamically. The tool supports different migration strategies and can attach to and monitor a target application without modifying it in any way. Page migration is performed in a timely manner, and its overhead is outweighed by the benefit of the improved data locality.

As a case study, an access-based migration algorithm was implemented and integrated into our tool. Performance results on a NUMA system show a noticeable reduction in remote accesses and execution time, achieving speedups of up to ∼21% in a multiprogrammed environment.
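To make the idea of an access-based migration policy concrete, the following is a minimal sketch and not the authors' algorithm, whose details are not given here. It assumes that sampled memory accesses arrive as (address, accessing node) pairs, tallies them per page, and proposes moving a page to the node that issues most of its accesses. The function name `plan_migrations`, the 4 KiB page size, and the `min_samples` and `dominance` thresholds are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical)


def plan_migrations(samples, home_node, min_samples=4, dominance=0.5):
    """Return {page: target_node} for pages accessed mostly from a remote node.

    samples   -- iterable of (address, accessing_node) pairs, e.g. from
                 hardware-counter sampling (the format is an assumption)
    home_node -- dict mapping page number -> node currently holding the page
    """
    # Tally how many sampled accesses each node made to each page.
    per_page = defaultdict(Counter)
    for address, node in samples:
        per_page[address // PAGE_SIZE][node] += 1

    plan = {}
    for page, counts in per_page.items():
        if page not in home_node:
            continue  # current placement unknown, so make no decision
        total = sum(counts.values())
        if total < min_samples:
            continue  # too few samples to justify a migration
        node, hits = counts.most_common(1)[0]
        # Migrate only if a remote node clearly dominates the accesses.
        if node != home_node[page] and hits / total > dominance:
            plan[page] = node
    return plan


samples = (
    [(0x1000, 1)] * 5 + [(0x1000, 0)]  # page 1: mostly accessed from node 1
    + [(0x2000, 0)] * 6                # page 2: accessed only from its home node
    + [(0x3000, 1)] * 2                # page 3: too few samples
)
plan = plan_migrations(samples, home_node={1: 0, 2: 0, 3: 0})
print(plan)
```

In this sketch only page 1 is selected, because it is the only page with enough samples whose dominant accessor differs from its home node. On Linux, a user-level tool could then carry out such a plan with the `move_pages(2)` system call; whether the tool described here does so is not stated in the abstract.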


Keywords: Hardware counters, Page migration, NUMA



This work has been partially supported by Hewlett-Packard under contract 2008/CE377, by the Ministry of Education and Science of Spain and FEDER funds under contract TIN 2010-17541, and by the Xunta de Galicia (Spain) under contract 2010/28 and project 09TIC002CT. This work was carried out within the framework of the Spanish network CAPAP-H. The authors also wish to thank CESGA for providing its supercomputing facilities.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Juan A. Lorenzo-Castillo
    • 1
  • Juan C. Pichel
    • 1
  • Francisco F. Rivera
    • 1
  • Tomás F. Pena
    • 1
  • José C. Cabaleiro
    • 1
  1. Centro de Investigación en Tecnoloxías da Información (CITIUS), University of Santiago de Compostela, Santiago de Compostela, Spain
