Modeling Multiclass Task-Based Applications on Heterogeneous Distributed Environments

  • Riccardo Pinciroli (Email author)
  • Marco Gribaudo
  • Giuseppe Serazzi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10378)


The volume of data, one of the five “V” characteristics of Big Data, grows much faster than the ability of existing systems to manage it within an acceptable time. Several technologies have been developed to address this scalability issue. For instance, MapReduce was introduced to cope with the problem of processing huge amounts of data by splitting the computation into a set of tasks that are executed concurrently. Saving even a marginal amount of time in the processing of all the tasks of a set can bring valuable benefits to the execution of the whole application and to the management costs of the entire data center. To this end, we propose a technique to minimize the global processing time of a set of tasks with different service requirements that are executed concurrently on two or more heterogeneous systems. The validity of the proposed technique is demonstrated using a multiformalism model that combines Queueing Networks and Petri Nets. Applying this technique to an Apache Hive case study shows that the described allocation policy can lead to gains in both total execution time and energy consumption.
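To make the allocation problem concrete, the following is a minimal sketch under strong simplifying assumptions, not the technique or the multiformalism model of the paper. It assumes exactly two systems, deterministic per-task service times, and sequential processing of the tasks assigned to each system. The function name `best_split`, the class names, and all numeric values are hypothetical and chosen only for illustration.

```python
from itertools import product


def best_split(pool, service_time):
    """Brute-force the task split that minimizes the global processing time.

    pool: {task_class: number_of_tasks} (hypothetical workload)
    service_time: {system: {task_class: seconds_per_task}} (hypothetical values)
    Assumption: each system processes its assigned tasks one after another,
    so its completion time is the sum of the service times assigned to it.
    """
    classes = sorted(pool)
    first, second = sorted(service_time)  # exactly two systems are assumed
    best = None
    # Enumerate how many tasks of each class are sent to the first system;
    # the remaining tasks of that class go to the second system.
    for counts in product(*(range(pool[c] + 1) for c in classes)):
        t_first = sum(n * service_time[first][c] for c, n in zip(classes, counts))
        t_second = sum(
            (pool[c] - n) * service_time[second][c] for c, n in zip(classes, counts)
        )
        makespan = max(t_first, t_second)  # the slower system sets the global time
        if best is None or makespan < best[0]:
            best = (makespan, dict(zip(classes, counts)))
    return best


# Hypothetical example: system A is fast on "cpu" tasks, system B on "io" tasks.
pool = {"cpu": 4, "io": 4}
service_time = {
    "A": {"cpu": 1.0, "io": 3.0},
    "B": {"cpu": 3.0, "io": 1.0},
}
makespan, sent_to_A = best_split(pool, service_time)
print(makespan, sent_to_A)
```

In this toy setting, sending each class to the system that serves it faster makes both systems finish at the same time, whereas an even split of both classes would leave each system busy for twice as long. The exhaustive search is only practical for small pools and is used here for clarity rather than efficiency.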


Pool depletion systems, MapReduce, Schedulers, Energy efficiency, Performance evaluation, Queueing networks, Petri nets, Multiformalism models



This research was supported in part by the European Commission under the grant ANTAREX H2020 FET-HPC-671623.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Riccardo Pinciroli (1, Email author)
  • Marco Gribaudo (1)
  • Giuseppe Serazzi (1)
  1. Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, Italy
