Interference-aware co-scheduling method based on classification of application characteristics from hardware performance counter using data mining

Choi, Jieun; Park, Geunchul; Nam, Dukyun

doi:10.1007/s10586-019-02949-7

Published: 12 June 2019

Cluster Computing, volume 23, pages 57–69 (2020)

Abstract

Computational scientists and engineers who want the best performance from scientific applications need efficient application characterization methods to exploit high-performance hardware resources. However, modern processors now come with high-bandwidth on-chip memory or a large number of cores, and application characterization research that accounts for these newly introduced hardware features in next-generation high-performance computing environments remains insufficient and complex. In this paper, we propose a simple and fast method to classify application characteristics on systems with state-of-the-art processors using hardware performance counters. The proposed method uses hardware performance counters to monitor hardware events related to system performance. A clustering approach is adopted that requires only limited understanding of the correlation between hardware events and application characteristics. The application characterization technique is applied to the NAS Parallel Benchmarks on two systems, based on Intel Knights Landing and Skylake Xeon processors. We demonstrate that the proposed technique captures system and application characteristics and provides users with useful insights into application execution.
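The general idea of clustering applications by their hardware counter readings can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the clustering algorithm or the events used, so plain k-means is swapped in here, and the application names, the chosen counters (instructions, last-level cache misses, floating-point operations), and all numbers are hypothetical.

```python
from math import dist

# Hypothetical per-application counter totals:
# (instructions, last-level cache misses, floating-point operations).
# Illustrative values only, not measurements from the paper.
SAMPLES = {
    "app_a": (1_000_000, 50_000, 100_000),
    "app_b": (1_000_000, 48_000, 120_000),
    "app_c": (1_000_000, 2_000, 600_000),
    "app_d": (1_000_000, 1_500, 650_000),
}

def to_features(counters):
    """Turn raw counts into per-instruction rates so run length does not dominate."""
    instructions, llc_misses, fp_ops = counters
    return (llc_misses / instructions, fp_ops / instructions)

def min_max_scale(rows):
    """Rescale each feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans)) for row in rows]

def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(iterations):
        # Assign each point to its nearest centroid, then recompute centroids.
        labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

def classify(samples, k=2):
    """Map each application name to a cluster index."""
    names = list(samples)
    points = min_max_scale([to_features(samples[n]) for n in names])
    return dict(zip(names, kmeans(points, k)))
```

Calling `classify(SAMPLES)` groups the two applications with high cache-miss rates separately from the two with high floating-point rates. The cluster indices themselves carry no meaning; interpreting a cluster as, say, memory-bound or compute-bound still requires inspecting the counter rates of its members.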

Acknowledgements

This work was partly supported by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. R0190-18-2012) and a Korea Institute of Science and Technology Information (KISTI) grant (No. K-19-L02-C06-S01).

Author information

Authors and Affiliations

National Institute of Supercomputing and Networking, KISTI, Daejeon, Republic of Korea
Jieun Choi, Geunchul Park & Dukyun Nam

Corresponding author

Correspondence to Dukyun Nam.

About this article

Cite this article

Choi, J., Park, G. & Nam, D. Interference-aware co-scheduling method based on classification of application characteristics from hardware performance counter using data mining. Cluster Comput 23, 57–69 (2020). https://doi.org/10.1007/s10586-019-02949-7

Received: 10 January 2019
Revised: 15 May 2019
Accepted: 06 June 2019
Published: 12 June 2019
Issue Date: March 2020
DOI: https://doi.org/10.1007/s10586-019-02949-7
