Abstract
Apache Spark is probably the most widely adopted framework for developing big-data batch applications and for executing them on a cluster of (virtual) machines. In general, the more resources (machines) one uses, the faster applications execute, but there is currently no adequate means to determine the proper size of a Spark cluster given time constraints, or to foresee execution times given the number of employed machines. One can only run these applications and use her/his experience to size the cluster and predict expected execution times. Wrong estimation of execution times can lead to costly overruns and overly long executions, thus calling for analytic sizing/prediction techniques that provide precise time guarantees. This paper addresses this problem by proposing a solution based on model-checking. The approach exploits a directed acyclic graph (DAG) to abstract the structure of the execution flows of Spark programs, annotates each node (Spark stage) with execution-related data, and formulates the identification of the global execution time as a reachability problem. To avoid the well-known state space explosion problem, the paper also proposes a technique to reduce the size of generated abstract models. This results in a significant decrease in used memory and/or verification time making our approach feasible for predicting the execution time of Spark applications given the resources available. The benefits of the proposed reduction technique are evaluated by using both timed automata and constraint LTL over clocks logic to formally encode and analyze generated models. The approach is also successfully validated on some realistic case studies. Since the optimization is not Spark-specific, we claim that it can be applied to a wide range of applications whose underlying model can be abstracted as a DAG.
Similar content being viewed by others
References
Alur, R., Courcoubetis, C., Dill, D.: Model-checking in dense real-time. Inf Comput 104(1), 2–34 (1993)
Alur, R., Dill, D.: A theory of timed automata. Theor Comput Sci 126(2), 183–235 (1994)
Brito A, Ardagna D, Blanquer I, Evangelinou A, Barbierato E, Gribaudo M, Almeida J, Couto AP, Braga T (2017) D3.4 EUBra-BIGSEA QoS infrastructure services intermediate version. Technical report, EUBra-BIGSEA consortium
Biere, A., Cimatti, A., Clarke, E.C., Strichman, O., Zhu, Y.: Bounded model checking. Adv Comput 58, 118–149 (2003)
Behrmann G, David A, Larsen KG, Hakansson J, Petterson P, Yi W, Hendriks M (2006) Uppaal 4.0. In: Proceedings of the 3rd international conference on the quantitative evaluation of systems, QEST '06, Washington, DC, USA. IEEE Computer Society, pp 125–126
Bozga, M., Daws, C., Maler, O., Olivero, A., Tripakis, S., Yovine, S.: Kronos: a model-checking tool for real-time systems. In: Ravn, A.P., Rischel, H. (eds.) Formal techniques in real-time and fault-tolerant systems, pp. 298–302. Springer, Berlin (1998)
Baresi, L., Ghezzi, C., Mottola, L.: Loupe: verifying publish-subscribe architectures with a magnifying lens. IEEE Trans Softw Eng 37(2), 228–246 (2011)
Bradley S, Henderson W, Kendall D (1999) Using timed automata for response time analysis of distributed real-time systems. In: 24th IFAC/IFIP workshop on real-time programming, pp 143–148
Behrmann, G., Larsen, K.G., Rasmussen, J.I.: Optimal scheduling using priced timed automata. SIGMETRICS Perform Eval Rev 32(4), 34–40 (2005)
Bouyer P (2009) Model-checking timed temporal logics. Electron Notes Theor Comput Sci 231:323–341. Proceedings of the 5th workshop on methods for modalities (M4M5 2007)
Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. In: Proceedings of the international world-wide web conference (WWW), pp 107–117
Baresi L, Pourhashem Kallehbasti MM, Rossi M (2016) How bit-vector logic can help improve the verification of LTL specifications over infinite domains. In: Proceedings of the 31st annual ACM symposium on applied computing, pp 1666–1673
Baresi L, Quattrocchi G (2018) Towards vertically scalable spark applications. In: Euro-Par 2018: parallel processing workshops. Springer
Bersani MM, Rossi M, San Pietro P (2017) A logical characterization of timed regular languages. Theor Comput Sci 658:46--59
Bersani, M.M., Rossi, M., San Pietro, P.: A tool for deciding the satisfiability of continuous-time metric temporal logic. Acta Informatica 53(2), 171–206 (2016)
Burges, C.J.C.: A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2(2), 121–167 (1998)
Cimatti A, Clarke EM, Giunchiglia E, Giunchiglia F, Pistore M, Roveri M, Sebastiani R, Tacchella A (2002) Nusmv 2: an opensource tool for symbolic model checking. In: Computer aided verification, 14th international conference, CAV 2002, Copenhagen, Denmark, 27–31 July 2002, Proceedings, pp 359–364
Clarke EC, Emerson EA, Jha S, Prasad Sistla A (1998) Symmetry reductions in model checking. In: Hu AJ, Vardi MY (eds) Computer aided verification, 10th international conference, CAV '98, Vancouver, BC, Canada, 28 June–2 July 1998, Proceedings. Lecture notes in computer science, vol 1427. Springer, pp 147–158
Clarke, E.M., Grumberg, O., Peled, D.A.: Model checking. MIT Press, Cambridge (1999)
Corbett JC (Jul 1996) Timing analysis of ada tasking programs. IEEE Trans Softw Eng 22(7):461–483
DAG-ver Project repository. github.com/deib-polimi/DAG-ver, 2019
Demri, S., D'Souza, D.: An automata-theoretic approach to constraint LTL. Inf Comput 205(3), 380–415 (2007)
Dilworth, R.P.: A decomposition theorem for partially ordered sets. Ann Math 51(1), 161–166 (1950)
Donaldson AF, Miller A, Parker D (2009) Language-level symmetry reduction for probabilistic model checking. In: QEST 2009, sixth international conference on the quantitative evaluation of systems, Budapest, Hungary, 13–16 Sept 2009. IEEE Computer Society, pp 289–298
Finkel, A., Schnoebelen, P.: Well-structured transition systems everywhere!. Theor Comput Sci 256(1), 63–92 (2001)
Godefroid P (1996) Partial-order methods for the verification of concurrent systems: an approach to the state-explosion problem. Lecture notes in computer science, vol 1032. Springer, Berlin
Gianniti, E., Rizzi, A.M., Barbierato, E., Gribaudo, M., Ardagna, D.: Fluid petri nets for the performance evaluation of mapreduce and Spark applications. SIGMETRICS Perform Eval Rev 44, 23–36 (2017)
Hazewinkel M (1987) Encyclopaedia of mathematics (1). Encyclopaedia of mathematics: an updated and annotated translation of the soviet ``Mathematical Encyclopaedia''. Springer, Berlin
Henzinger, T.A.: The theory of hybrid automata, pp. 265–292. Springer, Berlin (2000)
Holzmann, G.J.: The model checker SPIN. IEEE Trans Softw Eng 23(5), 279–295 (1997)
Ikiz S, Garg VK (2004) Online algorithms for Dilworth's chain partition. Technical report, Parallel and Distributed Systems Laboratory, Department of Electrical and Computer Engineering, University of Texas at Austin
Jang K, Sherry J, Ballani H, Moncaster T (2015) Silo: predictable message latency in the cloud. In: Proceedings of the 2015 ACM conference on special interest group on data communication, SIGCOMM 2015, London, UK, 17–21 Aug 2015, pp 435–448
Kc K, Anyanwu K (2010) Scheduling hadoop jobs to meet deadlines. In: Proceedings of the IEEE 2nd international conference on cloud computing technology and science. IEEE
Krakora, J., Waszniowski, L., Pisa, P., Hanzalek, Z.: Timed automata approach to real time distributed system verification. Proceedings of the IEEE international workshop on factory communication systems 2004, 407–410 (2004)
Li S, Hu S, Wang S, Su L, Abdelzaher T, Gupta I, Pace R (2014) WOHA: deadline-aware map-reduce workflow scheduling framework over hadoop clusters. In: Proceedings of the IEEE 34th international conference on distributed computing systems. IEEE
Bersani, M.M., Frigeri, A., Morzenti, A., Pradella, M., Rossi, M., San Pietro, P.: Constraint LTL satisfiability checking without automata. J Appl Log 12(4), 522–557 (2014)
MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Le Cam LM, Neyman J (eds) Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1. University of California Press, pp 281–297
Miller, A., Donaldson, A.F., Calder, M.: Symmetry in temporal logic model checking. ACM Comput Surv 38(3), 8 (2006)
Marconi F, Quattrocchi G, Baresi L, Bersani MM, Rossi M (2018) On the timed analysis of big-data applications. In: Dutle A, Muñoz CA, Narkawicz A (eds) NASA formal methods—10th international symposium, NFM 2018, Newport News, VA, USA, 17–19 Apr 2018, Proceedings. Lecture notes in computer science, vol 10811. Springer, pp 315–332
Ousterhout K, Rasti R, Ratnasamy S, Shenker S, Chun B (2015) Making sense of performance in data analytics frameworks. In: Proceedings of the 12th USENIX conference on networked systems design and implementation. USENIX
Perez, D., Bernardi, S., Merseguer, J., JoJ, Requeno, Casale, G., Zhu, L.: DICE simulation tools: final version. Deliverable, DICE consortium (2017)
Palencia JC, Gonzalez Harbour M (1998) Schedulability analysis for tasks with static and dynamic offsets. In: Proceedings of the IEEE real-time systems symposium, pp 26–37
Politecnico di Milano (2019) The Zot Bounded Model/Satisfiability Checker. github.com/fm-polimi/zot
Wang, F.: Efficient verification of timed automata with bdd-like data structures. Int J Softw Tools Technol Transf 6(1), 77–97 (2004)
Waez, M.T.B., Dingel, J., Rudie, K.: A survey of timed automata for the development of real-time systems. Comput Sci Rev 9, 1–26 (2013)
Yasmina Abdeddaïm, Y., Asarin, E., Maler, O.: Scheduling with timed automata. Theor Comput Sci 354(2), 272–300 (2006)
Yu J, Chen H, Hu F (2015) SASM: improving Spark performance with adaptive skew mitigation. In: 2015 IEEE international conference on progress in informatics and computing (PIC)
Acknowledgements
This work has been partially supported by the DICE project (Horizon 2020 project no. 644869) and by the GAUSS national research project (MIUR, PRIN 2015, Contract 2015KWREMX).
Author information
Authors and Affiliations
Corresponding author
Additional information
Jim Woodcock
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Baresi, L., Bersani, M.M., Marconi, F. et al. Using formal verification to evaluate the execution time of Spark applications. Form Asp Comp 32, 33–70 (2020). https://doi.org/10.1007/s00165-020-00505-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00165-020-00505-4