Abstract
A big data server is a computer system designed to store and process many types of unstructured data arriving at a rapid pace. Such data, captured from the Internet and social networks, are crucial for both developed and developing countries to make informed decisions in time. However, the sustainability of big data infrastructures and electronic waste are major concerns because of rapid changes in technology. In this paper we evaluate the performance of big data servers built on reused computers in order to assess the scalability and feasibility of constructing big data servers from discarded computers that can be procured for as little as $40. In particular, we compare virtualized clusters and bare-metal clusters of low-cost recycled computing nodes. Virtualized environments are often considered for big data infrastructures because they allow more efficient cluster management, despite their performance overheads. Our study shows that a virtualized environment is not scalable on low-cost recycled computing nodes: in our performance evaluation, the virtualized cluster is 66% slower than the non-virtualized cluster for read operations and 88% slower for write operations.
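The abstract does not state whether "slower" refers to elapsed time or to throughput, and the two readings differ considerably. The following minimal sketch shows both; the function names and all input numbers are hypothetical placeholders chosen for illustration, not measurements from the paper.

```python
def percent_longer_time(baseline_s: float, candidate_s: float) -> float:
    """Percent by which the candidate's elapsed time exceeds the baseline's."""
    if baseline_s <= 0:
        raise ValueError("baseline time must be positive")
    return (candidate_s - baseline_s) * 100.0 / baseline_s


def percent_lower_throughput(baseline_tp: float, candidate_tp: float) -> float:
    """Percent by which the candidate's throughput falls below the baseline's."""
    if baseline_tp <= 0:
        raise ValueError("baseline throughput must be positive")
    return (baseline_tp - candidate_tp) * 100.0 / baseline_tp


# Hypothetical placeholder values, NOT results from the paper.
bare_metal_time_s = 100.0    # assumed elapsed time on the bare-metal cluster
virtualized_time_s = 166.0   # assumed elapsed time on the virtualized cluster
time_based = percent_longer_time(bare_metal_time_s, virtualized_time_s)

bare_metal_mb_s = 100.0      # assumed throughput on the bare-metal cluster
virtualized_mb_s = 34.0      # assumed throughput on the virtualized cluster
throughput_based = percent_lower_throughput(bare_metal_mb_s, virtualized_mb_s)

print(f"time-based slowdown: {time_based:.1f}%")
print(f"throughput-based slowdown: {throughput_based:.1f}%")
```

Under the time-based reading, a 66% slowdown means the virtualized run takes 1.66 times as long. Under the throughput-based reading, it means the virtualized cluster sustains only 34% of the baseline rate, which corresponds to roughly 2.9 times the elapsed time for the same workload.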
Copyright information
© 2016 Springer Science+Business Media Singapore
About this paper
Cite this paper
Chandrasekaran, S., Song, I. (2016). Sustainability of Big Data Servers Under Rapid Changes of Technology. In: Kim, K., Joukov, N. (eds) Information Science and Applications (ICISA) 2016. Lecture Notes in Electrical Engineering, vol 376. Springer, Singapore. https://doi.org/10.1007/978-981-10-0557-2_15
DOI: https://doi.org/10.1007/978-981-10-0557-2_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0556-5
Online ISBN: 978-981-10-0557-2
eBook Packages: Engineering, Engineering (R0)