Benchmarking Modern Databases for Storing and Profiling Very Large Scale HPC Communication Data

  • Conference paper
Benchmarking, Measuring, and Optimizing (Bench 2023)

Abstract

Capturing cross-stack profiling of communication on HPC systems at fine granularity is critical for gaining insight into the detailed performance trade-offs and interplay among the various components of the HPC ecosystem. To enable this, one needs to be able to collect, store, and retrieve system-wide data at high fidelity. As modern HPC systems expand, ensuring high-fidelity, real-time communication profiling becomes more challenging, especially with the growing number of users employing profiling tools to monitor their workloads. We take on this challenge in this paper and identify the key performance metrics that make a database amenable to these needs. We then design benchmarks to measure and understand the performance of multiple popular open-source databases. Through rigorous experimental analysis, we demonstrate the performance and scalability trends of the selected databases when performing different types of fundamental storage and retrieval operations under various conditions. Through this work, we achieve sub-second complex data querying while serving up to 64 users and demonstrate a 9× improvement in insertion latency through parallel data insertion, achieving a latency of 55 ms and using 50% less disk space when inserting 200,000 rows of profiling data collected from a potential system that is 4× the size of the state-of-the-art 19th-ranked Frontera supercomputing system at TACC with 8,368 nodes.
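To make the idea of parallel data insertion concrete, the following is a minimal Python sketch of how insertion latency could be timed when a batch of rows is split across several concurrent workers. It is not the benchmark used in the paper. The row schema, the `FakeSink` class, and the helper names `make_rows`, `chunk`, and `timed_insert` are all hypothetical; `FakeSink.insert_batch` is a stand-in that would be replaced by the bulk-insert call of whichever database client is under test.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def make_rows(n, nodes=8368):
    """Generate n synthetic profiling rows: (node_id, sequence, value).
    The schema is an assumption made for illustration only."""
    return [(i % nodes, i, i * 0.5) for i in range(n)]


def chunk(rows, parts):
    """Split rows into `parts` contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(rows), parts)
    chunks, start = [], 0
    for p in range(parts):
        end = start + size + (1 if p < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


class FakeSink:
    """Hypothetical stand-in for a database client (not a real driver)."""

    def __init__(self, delay_per_row_s=0.0):
        self.delay_per_row_s = delay_per_row_s
        self.rows_stored = 0
        self._lock = threading.Lock()

    def insert_batch(self, batch):
        # Simulated I/O wait; a real client would send one bulk insert here.
        time.sleep(self.delay_per_row_s * len(batch))
        with self._lock:
            self.rows_stored += len(batch)


def timed_insert(sink, rows, workers):
    """Insert rows using `workers` concurrent batches; return elapsed seconds."""
    batches = [b for b in chunk(rows, workers) if b]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so any worker exception is raised here.
        list(pool.map(sink.insert_batch, batches))
    return time.perf_counter() - start


if __name__ == "__main__":
    rows = make_rows(200_000)
    for workers in (1, 8):
        sink = FakeSink(delay_per_row_s=1e-7)
        elapsed = timed_insert(sink, rows, workers)
        print(f"{workers} worker(s): {elapsed * 1000:.1f} ms, {sink.rows_stored} rows")
```

The timings printed by this sketch reflect only the simulated delay and say nothing about any real database. With an actual client, the gain from parallel insertion would depend on the storage engine, batch size, and network path.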

This research is supported in part by NSF grants #1818253, #1854828, #1931537, #2018627, #2311830, #2312927, and XRAC grant #NCR-130002.



Author information

Corresponding author

Correspondence to Pouya Kousha.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kousha, P., Zhou, Q., Subramoni, H., Panda, D.K. (2024). Benchmarking Modern Databases for Storing and Profiling Very Large Scale HPC Communication Data. In: Hunold, S., Xie, B., Shu, K. (eds) Benchmarking, Measuring, and Optimizing. Bench 2023. Lecture Notes in Computer Science, vol 14521. Springer, Singapore. https://doi.org/10.1007/978-981-97-0316-6_7

  • DOI: https://doi.org/10.1007/978-981-97-0316-6_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-0315-9

  • Online ISBN: 978-981-97-0316-6

  • eBook Packages: Computer Science, Computer Science (R0)
