Towards Benchmarking IaaS and PaaS Clouds for Graph Analytics

Iosup, Alexandru; Capotă, Mihai; Hegeman, Tim; Guo, Yong; Ngai, Wing Lung; Varbanescu, Ana Lucia; Verstraaten, Merijn

doi:10.1007/978-3-319-20233-4_11

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8991)

Included in the following conference series:

Workshop on Big Data Benchmarks

Abstract

Cloud computing is a new paradigm for using ICT services: only when needed and for as long as needed, and paying only for the services actually consumed. Benchmarking the increasingly many cloud services is crucial for market growth and perceived fairness, and for service design and tuning. In this work, we propose a generic architecture for benchmarking cloud services. Motivated by recent demand for data-intensive ICT services, and in particular by the processing of large graphs, we adapt the generic architecture to Graphalytics, a benchmark for distributed and GPU-based graph analytics platforms. Graphalytics focuses on the dependence of performance on the input dataset, on the analytics algorithm, and on the provisioned infrastructure. The benchmark provides components for platform configuration, deployment, and monitoring, and has been tested on a variety of platforms. We also propose a new challenge for the process of benchmarking data-intensive services, namely the inclusion of the data-processing algorithm in the system under test; this significantly increases the relevance of benchmarking results, albeit at the cost of increased benchmarking duration.

Notes

1. In inverse chronological order: Lecture at the Fifth Workshop on Big Data Benchmarking (WBDB), Potsdam, Germany, August 2014. Lecture at the Linked Data Benchmark Council’s Fourth TUC Meeting 2014, Amsterdam, May 2014. Lecture at Intel, Haifa, Israel, June 2013. Lecture at IBM Research Labs, Haifa, Israel, May 2013. Lecture at IBM T.J. Watson, Yorktown Heights, NY, USA, May 2013. Lecture at Technion, Haifa, Israel, May 2013. Online lecture for the SPEC Research Group, 2012.
2. http://www.graph500.org.
3. http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu.

Acknowledgments

This work is supported by the Dutch STW/NWO Veni personal grants @large (#11881) and Graphitti (#12480), by the EU FP7 project PEDCA, by the Dutch national program COMMIT and its funded project COMMissioner, and by the Dutch KIEM project KIESA. The authors wish to thank Hassan Chafi and the Oracle Research Labs, Peter Boncz and the LDBC project, and Josep Larriba-Pey and Arnau Prat Perez, whose support has made the Graphalytics benchmark possible; and Tilmann Rabl, for facilitating this material.

Author information

Authors and Affiliations

Delft University of Technology, Delft, The Netherlands
Alexandru Iosup, Mihai Capotă, Tim Hegeman, Yong Guo & Wing Lung Ngai
University of Amsterdam, Amsterdam, The Netherlands
Ana Lucia Varbanescu & Merijn Verstraaten

Corresponding author

Correspondence to Alexandru Iosup.

Editor information

Editors and Affiliations

University of Toronto, Toronto, Ontario, Canada
Tilmann Rabl
SAP SE, Köln, Germany
Kai Sachs
Server Technologies, Oracle Corporation, Redwood Shores, California, USA
Meikel Poess
University of California at San Diego, La Jolla, CA, USA
Chaitanya Baru
Middleware Systems Research Group, Toronto, Ontario, Canada
Hans-Arno Jacobsen

About this paper

Cite this paper

Iosup, A. et al. (2015). Towards Benchmarking IaaS and PaaS Clouds for Graph Analytics. In: Rabl, T., Sachs, K., Poess, M., Baru, C., Jacobsen, HA. (eds) Big Data Benchmarking. WBDB 2014. Lecture Notes in Computer Science, vol 8991. Springer, Cham. https://doi.org/10.1007/978-3-319-20233-4_11

DOI: https://doi.org/10.1007/978-3-319-20233-4_11
Published: 14 June 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20232-7
Online ISBN: 978-3-319-20233-4
eBook Packages: Computer Science, Computer Science (R0)