Evaluating System-Level Monitors and Knobs on Real Hardware


Abstract

This chapter defines and evaluates a methodology for the oracle selection of the monitors and knobs used to configure an HPC system running a scientific application, such that the application’s requirements are satisfied and no system constraint is violated. The methodology relies on a heuristic correlation analysis between requirements, monitors, and knobs to determine the minimum subset of monitors to observe and knobs to explore in order to find the optimal system configuration for the HPC application. The setup under examination is the Floreon+ application running on IT4I’s cluster. The analysis reduced the monitor space from eleven dimensions to three and the knob space from six dimensions to three. This reduction shows the potential of such an approach and highlights the need for a realistic methodology that helps identify a minimum set of monitors and knobs. Additionally, the chapter characterizes the application by showing that Floreon+ performance requirements are satisfied with a CPU frequency lower than the server’s nominal CPU frequency.
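The idea of pruning monitors through correlation analysis can be illustrated with a minimal sketch. It is not the chapter's actual procedure: it assumes Pearson correlation as the measure, a greedy filter that keeps monitors strongly correlated with a requirement metric and drops those that are near-duplicates of an already kept monitor, and hypothetical threshold values (`relevance`, `redundancy`). The monitor names and all sample values are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def select_monitors(monitors, requirement, relevance=0.7, redundancy=0.9):
    """Greedy pruning (assumed heuristic, hypothetical thresholds).

    Keeps a monitor if |corr(monitor, requirement)| >= relevance and it is
    not correlated at |corr| >= redundancy with a monitor already kept.
    """
    # Rank by relevance, strongest first; ties are broken by name.
    ranked = sorted(
        monitors,
        key=lambda name: (-abs(pearson(monitors[name], requirement)), name),
    )
    selected = []
    for name in ranked:
        if abs(pearson(monitors[name], requirement)) < relevance:
            continue  # too weakly related to the requirement
        if any(
            abs(pearson(monitors[name], monitors[kept])) >= redundancy
            for kept in selected
        ):
            continue  # redundant with a monitor that is already kept
        selected.append(name)
    return selected


# Invented sample data: one requirement metric and four candidate monitors.
execution_time = [10, 12, 14, 16, 18]
candidate_monitors = {
    "cpu_power": [50, 55, 60, 65, 70],
    "package_energy": [100, 110, 120, 130, 140],
    "mem_bandwidth": [1, 3, 2, 5, 4],
    "fan_speed": [3, 1, 4, 1, 5],
}

print(select_monitors(candidate_monitors, execution_time))
```

With this invented data, `package_energy` is dropped because it duplicates `cpu_power`, and `fan_speed` is dropped because it is only weakly correlated with the requirement. The same filter could be applied to knobs by correlating each knob setting with the requirement metric.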


Notes

  1. The specific thresholds were chosen based on an empirical analysis not shown here.


Author information

Corresponding author

Correspondence to Panagiota Nikolaou.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Nikolaou, P. et al. (2019). Evaluating System-Level Monitors and Knobs on Real Hardware. In: Fornaciari, W., Soudris, D. (eds) Harnessing Performance Variability in Embedded and High-performance Many/Multi-core Platforms. Springer, Cham. https://doi.org/10.1007/978-3-319-91962-1_8

  • DOI: https://doi.org/10.1007/978-3-319-91962-1_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-91961-4

  • Online ISBN: 978-3-319-91962-1

  • eBook Packages: Engineering, Engineering (R0)
