A case study of a shared/buy-in computing ecosystem

  • Christopher Liao
  • Yonatan Klausner
  • David Starobinski
  • Eran Simhon
  • Azer Bestavros

Abstract

Many research institutions are deploying computing clusters based on a shared/buy-in paradigm. Such clusters combine shared computers, which are free for all users, with buy-in computers, which users purchase for semi-exclusive use. This paper characterizes the typical behavior and performance of a shared/buy-in computing cluster, using data traces from the Shared Computing Cluster (SCC) at Boston University, which runs under this paradigm, as a case study. Among our main findings, we show that the semi-exclusive policy, which allows any SCC user to use idle buy-in resources for a limited time, increases the utilization of buy-in resources by 17.4%, thus significantly improving the performance of the system as a whole. We find that jobs allowed to run on idle buy-in resources arrive more frequently and run for a shorter time than other jobs. Finally, we identify two factors that significantly affect the difference in performance experienced by shared and buy-in jobs: the run time limit (i.e., the maximum time during which a job is allowed to use resources) and the type of parallel environment.
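
The semi-exclusive policy described above can be illustrated with a minimal sketch. It assumes a simplified eligibility rule: shared nodes accept any job, buy-in nodes accept their owner's jobs, and other users may borrow an idle buy-in node only if the job's requested run time is within a cap. The names `Job`, `Node`, `may_run`, and the value of `BORROW_LIMIT_HOURS` are hypothetical and are not taken from the SCC configuration; the sketch also ignores any other limits the real scheduler may enforce.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical run time cap (hours) for non-owner jobs on idle buy-in nodes.
# The abstract does not state the actual SCC value.
BORROW_LIMIT_HOURS = 12.0


@dataclass
class Job:
    user_group: str          # group the submitting user belongs to
    requested_hours: float   # run time limit requested at submission


@dataclass
class Node:
    owner_group: Optional[str]  # None marks a shared node; otherwise the buy-in owner
    busy: bool = False


def may_run(job: Job, node: Node) -> bool:
    """Return True if the job is eligible to start on the node."""
    if node.busy:
        return False
    if node.owner_group is None:
        # Shared node: open to all users.
        return True
    if node.owner_group == job.user_group:
        # Owner's job on its own buy-in node.
        return True
    # Non-owner job: only sufficiently short jobs may borrow an idle buy-in node.
    return job.requested_hours <= BORROW_LIMIT_HOURS
```

Under this rule, a long job from a non-owner is never placed on a buy-in node, which is one plausible reason the jobs that do run on idle buy-in resources are shorter than other jobs.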

Keywords

  • Grid computing
  • Cluster computing
  • Shared/buy-in architecture
  • Workload characterization

Notes

Acknowledgements

This research was supported in part by the NSF under Grants 1717858, 1012798, 1117160, 1414119, and 1430145, and by the Hariri Institute for Computing at BU. The authors would also like to acknowledge the Research Computing Services group at Boston University, including Glenn Bresnahan, Mike Dugan, and Katia Oleinik, for their guidance and technical support.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Christopher Liao (1)
  • Yonatan Klausner (1)
  • David Starobinski (1)
  • Eran Simhon (1)
  • Azer Bestavros (1)

  1. Boston University, Boston, USA