Abstract
Cloud computing is a new paradigm for using ICT services: only when needed, for as long as needed, and paying only for the service actually consumed. Benchmarking the growing number of cloud services is crucial for market growth and perceived fairness, and for service design and tuning. In this work, we propose a generic architecture for benchmarking cloud services. Motivated by recent demand for data-intensive ICT services, and in particular by the processing of large graphs, we adapt the generic architecture to Graphalytics, a benchmark for distributed and GPU-based graph analytics platforms. Graphalytics focuses on the dependence of performance on the input dataset, on the analytics algorithm, and on the provisioned infrastructure. The benchmark provides components for platform configuration, deployment, and monitoring, and has been tested on a variety of platforms. We also propose a new challenge for the process of benchmarking data-intensive services, namely the inclusion of the data-processing algorithm in the system under test; this significantly increases the relevance of benchmarking results, albeit at the cost of increased benchmarking duration.
Notes
- 1. In inverse chronological order: Lecture at the Fifth Workshop on Big Data Benchmarking (WBDB), Potsdam, Germany, August 2014. Lecture at the Linked Data Benchmark Council’s Fourth TUC Meeting 2014, Amsterdam, May 2014. Lecture at Intel, Haifa, Israel, June 2013. Lecture at IBM Research Labs, Haifa, Israel, May 2013. Lecture at IBM T.J. Watson, Yorktown Heights, NY, USA, May 2013. Lecture at Technion, Haifa, Israel, May 2013. Online lecture for the SPEC Research Group, 2012.
Acknowledgments
This work is supported by the Dutch STW/NWO Veni personal grants @large (#11881) and Graphitti (#12480), by the EU FP7 project PEDCA, by the Dutch national program COMMIT and its funded project COMMissioner, and by the Dutch KIEM project KIESA. The authors wish to thank Hassan Chafi and the Oracle Research Labs, Peter Boncz and the LDBC project, and Josep Larriba-Pey and Arnau Prat Perez, whose support has made the Graphalytics benchmark possible; and Tilmann Rabl, for facilitating this material.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Iosup, A. et al. (2015). Towards Benchmarking IaaS and PaaS Clouds for Graph Analytics. In: Rabl, T., Sachs, K., Poess, M., Baru, C., Jacobson, HA. (eds) Big Data Benchmarking. WBDB 2014. Lecture Notes in Computer Science, vol 8991. Springer, Cham. https://doi.org/10.1007/978-3-319-20233-4_11
DOI: https://doi.org/10.1007/978-3-319-20233-4_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20232-7
Online ISBN: 978-3-319-20233-4
eBook Packages: Computer Science, Computer Science (R0)