Enterprise HPC on the Clouds

Part of the Computer Communications and Networks book series (CCN)


Over the past few decades, high-performance computing (HPC) has become increasingly relevant to the enterprise. From aeronautics to the car industry, and from large computer manufacturers to Internet start-ups, companies need to process enormous amounts of data in order to reduce costs and keep pace with the speed at which technology evolves. They know that an HPC solution is essential to their success and to the future viability of their business. While large enterprises have the funds for an in-house HPC system, many smaller companies lack the budget to deploy one, although their data-processing needs may be equally high. Through the commoditization of hardware, the need for dedicated supercomputers in HPC has diminished; clusters of servers can nowadays provide the same functionality and performance at a much lower cost. This development has led to the advent of “cloud computing,” which constitutes a major paradigm shift in how we, as users, gain access to large-scale computing infrastructure. “Clouds” offer virtually limitless resources, on demand, at a relatively low cost. In the future, this could lead to the complete outsourcing of enterprise HPC and eliminate the need for in-house solutions. In this chapter, we discuss the major issues that must be addressed to make clouds viable for enterprise HPC, and we review research, based on existing or simulated cloud systems, that hints at how these problems can be solved.





Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  1. Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
