Abstract
This chapter defines and evaluates a methodology for the oracle selection of the monitors and knobs used to configure an HPC system running a scientific application, while satisfying the application’s requirements and without violating any system constraints. The methodology relies on a heuristic correlation analysis between requirements, monitors, and knobs to determine the minimum subset of monitors to observe and knobs to explore in order to find the optimal system configuration for the HPC application. The setup under examination is the Floreon+ application together with IT4I’s cluster. At the end of this analysis, the eleven-dimensional space for monitors was reduced to a three-dimensional space, and the six-dimensional space for knobs was reduced to a three-dimensional space. This reduction shows the potential of, and highlights the need for, a realistic methodology to help identify such a minimum set of monitors and knobs. Additionally, a characterization of the application is provided by showing that Floreon+ performance requirements are satisfied with a CPU frequency lower than the server’s nominal CPU frequency.
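As a rough illustration of how a correlation analysis can shrink a set of monitors, the sketch below greedily keeps a monitor only if it is not strongly correlated with one already kept. It is a minimal sketch under stated assumptions, not the chapter's actual procedure: the monitor names, the readings, the greedy ordering, and the 0.9 threshold are all hypothetical, since the chapter's heuristic and thresholds are not given here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)


def select_monitors(samples, threshold=0.9):
    """Greedily keep monitors not strongly correlated with one already kept.

    samples: dict mapping monitor name -> list of readings (hypothetical layout).
    threshold: hypothetical cut-off on |r|; not taken from the chapter.
    """
    kept = []
    for name, series in samples.items():
        if all(abs(pearson(series, samples[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical readings, made up purely for illustration.
samples = {
    "cpu_power_w": [50.0, 60.0, 70.0, 80.0],
    "cpu_temp_c": [40.0, 45.0, 50.0, 55.0],  # linear in cpu_power_w, so redundant
    "dram_power_w": [5.0, 9.0, 4.0, 8.0],
}
selected = select_monitors(samples)
print(selected)
```

With these made-up readings, the temperature monitor is dropped because it tracks CPU power exactly, while the DRAM power monitor is kept. The same idea could be applied to knobs by correlating each knob setting with the requirement metrics, again as an assumption rather than the method the chapter reports.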
Notes
1. The specific thresholds were picked based on empirical analysis not shown here.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Nikolaou, P. et al. (2019). Evaluating System-Level Monitors and Knobs on Real Hardware. In: Fornaciari, W., Soudris, D. (eds) Harnessing Performance Variability in Embedded and High-performance Many/Multi-core Platforms. Springer, Cham. https://doi.org/10.1007/978-3-319-91962-1_8
DOI: https://doi.org/10.1007/978-3-319-91962-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91961-4
Online ISBN: 978-3-319-91962-1
eBook Packages: Engineering, Engineering (R0)