Abstract
The volume of data, one of the five “V” characteristics of Big Data, grows much faster than the ability of existing systems to manage it within an acceptable time. Several technologies have been developed to address this scalability issue. For instance, MapReduce was introduced to cope with the problem of processing huge amounts of data by splitting the computation into a set of tasks that are executed concurrently. Saving even a marginal amount of time in the processing of all the tasks of a set can bring valuable benefits to the execution of the whole application and to the management costs of the entire data center. To this end, we propose a technique to minimize the global processing time of a set of tasks with different service requirements that are concurrently executed on two or more heterogeneous systems. The validity of the proposed technique is demonstrated using a multiformalism model that combines Queueing Networks and Petri Nets. Applying the technique to an Apache Hive case study shows that the described allocation policy can lead to performance gains in both total execution time and energy consumption.
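To make the allocation problem concrete, the following minimal sketch searches for the split of a multiclass task set across two heterogeneous systems that minimizes the overall completion time. It is only an illustration under strong assumptions and not the technique or the multiformalism model of the paper: service times are deterministic, each system processes its tasks one after another, and the function name `best_allocation`, its parameters, and the numbers in the usage example are all hypothetical.

```python
from itertools import product


def best_allocation(task_counts, service_times):
    """Brute-force search over allocations for exactly two systems.

    task_counts[c]      -- number of tasks of class c (hypothetical input)
    service_times[s][c] -- time system s needs for one task of class c
                           (assumed deterministic, tasks served sequentially)

    Returns (makespan, to_first), where to_first[c] is the number of
    class-c tasks sent to system 0; the remaining tasks go to system 1.
    """
    best = None
    # Enumerate every possible number of tasks of each class sent to system 0.
    ranges = [range(n + 1) for n in task_counts]
    for to_first in product(*ranges):
        # Completion time of each system under this allocation.
        t0 = sum(k * service_times[0][c] for c, k in enumerate(to_first))
        t1 = sum((task_counts[c] - k) * service_times[1][c]
                 for c, k in enumerate(to_first))
        # The whole set finishes when the slower system finishes.
        makespan = max(t0, t1)
        if best is None or makespan < best[0]:
            best = (makespan, to_first)
    return best


if __name__ == "__main__":
    # Two classes with four tasks each; system 0 is fast on class 0,
    # system 1 is fast on class 1 (illustrative values only).
    counts = [4, 4]
    times = [[1.0, 4.0], [4.0, 1.0]]
    print(best_allocation(counts, times))
```

With these illustrative values the search sends every task to the system that serves its class fastest. The exhaustive enumeration is chosen for clarity; it grows quickly with the number of tasks and classes and ignores the contention and stochastic effects that a queueing model captures.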
Notes
- 1. Available at http://ftp.pdl.cmu.edu/pub/datasets/hla/. Please include "http" at the beginning of the URL to make it work.
Acknowledgement
This research was supported in part by the European Commission under the grant ANTAREX H2020 FET-HPC-671623.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Pinciroli, R., Gribaudo, M., Serazzi, G. (2017). Modeling Multiclass Task-Based Applications on Heterogeneous Distributed Environments. In: Thomas, N., Forshaw, M. (eds) Analytical and Stochastic Modelling Techniques and Applications. ASMTA 2017. Lecture Notes in Computer Science, vol 10378. Springer, Cham. https://doi.org/10.1007/978-3-319-61428-1_12
DOI: https://doi.org/10.1007/978-3-319-61428-1_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61427-4
Online ISBN: 978-3-319-61428-1
eBook Packages: Computer Science, Computer Science (R0)