Towards an Architecture for Management of Very Large Computing Systems

  • Conference paper
High Performance Computing on Vector Systems 2010

Abstract

Managing very large computing systems with up to 100,000 nodes has become a very complex task. Existing tools reach their limits, especially for High Performance Computing (HPC) resources, because these differ slightly from other compute resources. First, we introduce the obstacles specific to HPC and what we expect to be the challenges that future resources pose for system management. We then propose the framework designed within the scope of the TIMaCS project (http://www.timacs.de). Assuming that a corresponding solution has been implemented, we show how it can change administration far beyond the current situation. This discussion is split into a more technical part, describing how administration can be simplified and where new capabilities in resource provisioning can be added, and a business part, where we outline the need for business-policy-based management and scheduling and show a possible approach to investigating these relationships. Finally, we show what might be possible far beyond the scope of the project.

Author information

Corresponding author

Correspondence to Jochen Buchholz.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Buchholz, J., Volk, E. (2010). Towards an Architecture for Management of Very Large Computing Systems. In: Resch, M., et al. High Performance Computing on Vector Systems 2010. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11851-7_2
