Journal of Grid Computing, Volume 15, Issue 4, pp 415–434

Adaptive Resource Allocation with Job Runtime Uncertainty

  • Raul Ramírez-Velarde
  • Andrei Tchernykh
  • Carlos Barba-Jimenez
  • Adán Hirales-Carbajal
  • Juan Nolazco-Flores
Article

Abstract

In this paper, we address the problem of dynamic resource allocation in the presence of job runtime uncertainty. We develop an execution delay model for runtime prediction and design an adaptive stochastic allocation strategy named Pareto Fractal Flow Predictor (PFFP). We conduct a comprehensive performance evaluation of the PFFP strategy on real production traces and compare it with other well-known non-clairvoyant strategies over two metrics. To choose the best strategy, we perform a bi-objective analysis based on a degradation methodology. To check for possible bias in the results, and to avoid letting a small portion of problem instances with large deviations dominate the conclusions, we present performance profiles of the strategies. We show that PFFP performs well in different scenarios with a variety of workloads and distributed resources.
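
The role of performance profiles mentioned above can be illustrated with a minimal sketch. It assumes the common ratio-to-best definition: for each strategy, the fraction of problem instances whose metric is within a factor tau of the best value on that instance. The function name, the strategy labels "A" and "B", and all numbers are hypothetical and are not taken from the paper; metric values are assumed positive with lower being better.

```python
from typing import Dict, List


def performance_profile(
    costs: Dict[str, List[float]], taus: List[float]
) -> Dict[str, List[float]]:
    """For each strategy, return the fraction of instances whose cost is
    within a factor tau of the best cost observed on that instance."""
    names = list(costs)
    n = len(costs[names[0]])
    # Best (smallest) cost per problem instance across all strategies.
    best = [min(costs[s][i] for s in names) for i in range(n)]
    profile = {}
    for s in names:
        # Ratio of this strategy's cost to the best cost, per instance.
        ratios = [costs[s][i] / best[i] for i in range(n)]
        profile[s] = [sum(r <= tau for r in ratios) / n for tau in taus]
    return profile


# Made-up metric values (lower is better) for two hypothetical strategies
# on four problem instances; these are not results from the paper.
example = {
    "A": [10.0, 12.0, 9.0, 50.0],
    "B": [11.0, 12.0, 10.0, 20.0],
}
result = performance_profile(example, taus=[1.0, 1.15, 3.0])
print(result)
```

In this toy data, strategy A is best on three of four instances, yet its mean cost is worse than B's because of a single instance with a large deviation. The profile shows both facts separately, which is the kind of distortion the abstract describes.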

Keywords

Runtime uncertainty, Distributed system, Resource allocation, Self-similarity, Heavy-tails

Copyright information

© Springer Science+Business Media B.V. 2017

Authors and Affiliations

  1. Computer Science Department, Tecnológico de Monterrey, Monterrey, México
  2. Computer Science Department, CICESE Research Center, Ensenada, México
  3. CETYS University, Tijuana, México