Abstract
In this position paper, we outline the specifications for a novel benchmark for comparing parallel processing frameworks in the context of big data applications hosted in the cloud. We aim to fill several gaps in existing cloud data processing benchmarks, which lack a real-life context for their processes and thus lose relevance when assessing performance for real applications. Hence, we propose a fictitious news site hosted in the cloud that is to be managed by the framework under analysis, together with several objective use case scenarios and measures for evaluating system performance. The main strengths of our benchmark definition are parallelization capabilities supporting cloud features and big data properties.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ferrarons, J., Adhana, M., Colmenares, C., Pietrowska, S., Bentayeb, F., Darmont, J. (2014). PRIMEBALL: A Parallel Processing Framework Benchmark for Big Data Applications in the Cloud. In: Nambiar, R., Poess, M. (eds) Performance Characterization and Benchmarking. TPCTC 2013. Lecture Notes in Computer Science, vol 8391. Springer, Cham. https://doi.org/10.1007/978-3-319-04936-6_8
DOI: https://doi.org/10.1007/978-3-319-04936-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04935-9
Online ISBN: 978-3-319-04936-6
eBook Packages: Computer Science, Computer Science (R0)