Computing systems have undergone a fundamental transformation, from the single-processor devices of the turn of the century to today's ubiquitous networked devices and warehouse-scale computing in the cloud, and parallelism has become pervasive at many levels. At the micro level, parallelism is being explored from the underlying circuits, through pipelining and instruction-level parallelism, to multi-core and many-core processors on a chip and within a machine. At the macro level, parallelism extends from multiple machines in a rack, to many racks in a data center, to the globally shared infrastructure of the Internet. With the push of big data, we are entering a new era of parallel computing driven by novel and groundbreaking research on elastic parallelism and scalability. In this paper, we give an overview of computing infrastructure for big data processing, focusing on the architectural, storage, and networking challenges of supporting it. We briefly discuss emerging computing infrastructure and technologies that are promising for improving data parallelism and task parallelism, and for encouraging vertical and horizontal computation parallelism.
Ling LIU is a full professor in the School of Computer Science at Georgia Institute of Technology. She directs the research programs in the Distributed Data Intensive Systems Lab (DiSL). Prof. Liu's research interests are in the areas of cloud computing, big data and big data analytics, distributed computing, and Internet services, with a focus on performance, availability, fault tolerance, security, and privacy. She has published over 300 international journal and conference papers. She is a recipient of the 2012 IEEE Computer Society Technical Achievement Award and an Outstanding Doctoral Thesis Advisor Award in 2012 from Georgia Institute of Technology. She is a co-Editor-in-Chief of the five-volume Encyclopedia of Database Systems (Springer 2010), the Editor-in-Chief of IEEE Transactions on Services Computing, and is on the editorial board of over a dozen international journals. Her research is primarily supported by grants from the U.S. National Science Foundation (NSF) and industrial companies such as IBM and Intel.
Cite this article
Liu, L. Computing infrastructure for big data processing. Front. Comput. Sci. 7, 165–170 (2013). https://doi.org/10.1007/s11704-013-3900-x
- big data
- cloud computing
- data analytics
- elastic scalability
- heterogeneous computing
- big data processing