Data Intensive Computing is characterized by problems where data is the primary challenge, whether because of its complexity, its size, or the rate at which it is acquired. The hardware platform required for a data intensive computing environment consists of tens, sometimes even hundreds, of thousands of compute nodes with their corresponding networking and storage subsystems, power distribution and conditioning equipment, and extensive cooling systems. An essential requirement for processing exploding volumes of data is to move processing and analysis to the data, where possible, rather than moving the data to processing and analysis [1]. It is also critical to maximize parallelism over the data and the efficiency of data movement between discrete devices in a network.
References
C. Tanasescu and T. Reed, “Data Intensive Computing: How SGI Altix ICE and Intel Xeon Processor 5500 Series Help Sustain HPC Efficiency Amid Explosive Data Growth,” http://www.sgi.com/pdfs/4154.pdf, 2009
J. McKendrick, “Size of the data universe: 1.2 zettabytes and growing fast,” ZDNet, May 2010
Y. Chen, D. Pavlov and J. F. Canny, “Large-Scale Behavioral Targeting,” ACM KDD’09, Paris, France, July 2009
D. Newman, A. Asuncion, P. Smyth and M. Welling, “Distributed Algorithms for Topic Models,” Journal of Machine Learning Research, Aug. 2009
R. Shankar and G. Narendra, “MapReduce Programming with Apache Hadoop,” JavaWorld.com, Sept. 2008
LexisNexis Risk Solutions, “LexisNexis HPCC: ECL Programmers Guide,” http://www.lexisnexis.com/risk/about/guides/programmers-guide.pdf, 2010
LexisNexis Risk Solutions, “High-Performance Cluster Computing,” http://www.lexisnexis.com/government/solutions/literature/hpcc-das.pdf, 2010
G. Harrison, “10 Things You Should Know About NoSQL Databases,” http://www.techrepublic.com/downloads/10-things-you-should-know-about-nosql-databases, Aug 2010
C. Strauch, “NoSQL Databases,” Stuttgart Media University, Feb 2011
HP, “HP Superdome 2: the Ultimate Mission-critical Platform,” http://www.compaq.com/cpq-storage/pdfs/4AA1--7762ENW.pdf, June 2010
J. Baker, C. Bond, J. C. Corbett et al., “Megastore: Providing Scalable, Highly Available Storage for Interactive Services,” 5th Biennial Conference on Innovative Data Systems Research (CIDR’11), January 2011
M. Diehl, “Database Replication with MySQL,” Linux Journal, May 2010
L. Barroso and U. Hölzle, “The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines,” 2009
J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” OSDI’04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, Dec. 2004
Cisco Systems, “Data Center Design – IP Network Infrastructure,” http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/DC_3_0/DC-3_0_IPInfra.pdf, Oct. 2009
M. Arregoces and M. Portolani, “Data Center Fundamentals,” Cisco Press, 2004
B. Hedlund, “Top of Rack vs End of Row Data Center Designs,” http://bradhedlund.com/2009/04/05/top-of-rack-vs-end-of-row-data-center-designs/, April 2009
A. Greenberg, J. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel and S. Sengupta, “VL2: A Scalable and Flexible Data Center Network,” ACM SIGCOMM, Barcelona, Spain, Aug. 2009
A. Greenberg, P. Lahiri, D. A. Maltz, P. Patel and S. Sengupta, “Towards a Next Generation Data Center Architecture: Scalability and Commoditization,” PRESTO Workshop at SIGCOMM, 2008
S. M. Rumble, D. Ongaro, R. Stutsman, M. Rosenblum and J. K. Ousterhout, “It’s Time for Low Latency,” Proceedings of the 13th Workshop on Hot Topics in Operating Systems (HotOS 2011), 2011
J. Dean, “Designs, lessons and advice from building large distributed systems,” Keynote talk: The 3rd ACM SIGOPS International Workshop on Large Scale Distributed Systems and Middleware (October 2009)
S. Amershi, J. Fogarty, A. Kapoor and D. Tan, “Effective End-User Interaction with Machine Learning,” AAAI-11, Nectar Track, 2011
R. Perlman, Interconnections, Second Edition, Addison-Wesley, 2000
M. Al-Fares, A. Loukissas, and A. Vahdat, “A Scalable, Commodity Data Center Network Architecture,” in Proceedings of SIGCOMM, 2008
S. Mahapatra and X. Yuan, “Load Balancing Mechanisms in Data Center Networks,” the 7th Int. Conf. & Expo on Emerging Technologies for a Smarter World (CEWIT), Sept. 2010 (invited)
W. J. Dally and B. Towles, Principles and Practices of Interconnection Networks, Morgan Kaufmann Publishers, 2004
G. Lin and N. Pippenger, “Parallel Algorithms for Routing in Non-Blocking Networks,” Math. System Theory, Vol. 27 pp. 29–40, 1994
L. G. Valiant, “A scheme for fast parallel communication,” SIAM Journal on Computing, Vol. 11, No. 2, pp. 350–361, 1982
J. Touch and R. Perlman, “Transparent Interconnection of Lots of Links (TRILL): Problem and Applicability Statement,” RFC 5556, IETF, May 2009
R. Perlman, D. Eastlake, S. Switches, D. G. Dutt, S. Gai and A. Ghanwani, “RBridges: Base Protocol Specification,” Internet-Draft, IETF, Mar. 2010 (http://tools.ietf.org/html/draft-ietf-trill-rbridge-protocol-16).
P. Ashwood-Smith, “Shortest Path Bridging IEEE 802.1aq Overview & Applications,” in UK Network Operators Forum, Sept. 2010 (http://www.uknof.org.uk/uknof17/Ashwood_Smith-SPB.pdf).
D. Eastlake, P. Ashwood-Smith, S. Keesara, and P. Unbehagen, “The Great Debate: TRILL Versus 802.1aq (SPB),” in North American Network Operators’ Group (NANOG) Meeting 50, Oct. 2010 (http://www.nanog.org/meetings/nanog50/presentations/Monday/NANOG50.Talk63.NANOG50_TRILL-SPB-Debate-Roisman.pdf).
Cisco Systems, “Data Center Interconnect: Layer 2 Extension between Remote Data,” http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/white_paper_c11_493718.pdf, July 2009
Cisco Systems, “Cisco Overlay Transport Virtualization Technology Introduction and Deployment Considerations,” Jan. 2011 (http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/DCI/whitepaper/DCI3_OTV_Intro_WP.pdf)
R. Moskowitz and P. Nikander, “Host Identity Protocol (HIP) Architecture,” RFC 4423, IETF, May 2006 (http://www.ietf.org/rfc/rfc4423.txt).
D. Farinacci, V. Fuller, D. Meyer and D. Lewis, “Locator/ID Separation Protocol (LISP),” draft-farinacci-lisp-12, IETF, Mar. 2009 (http://tools.ietf.org/html/draft-farinacci-lisp-12).
D. Ferrucci, E. Brown, J. Chu-Carroll et al., “Building Watson: An Overview of the DeepQA Project,” Association for the Advancement of Artificial Intelligence, Fall 2010
J. Qiu, J. Ekanayake, T. Gunarathne et al., “Data Intensive Computing for Bioinformatics,” Bloomington, IN, Indiana University, December 2009
J. Shafer, S. Rixner and A.L. Cox, “Datacenter Storage Architecture for MapReduce Applications,” Workshop on Architectural Concerns in Large Datacenters (ACLD 2009), Austin, TX, June 2009.
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Lin, G., Liu, E. (2011). High Performance Network Architectures for Data Intensive Computing. In: Furht, B., Escalante, A. (eds) Handbook of Data Intensive Computing. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1415-5_1
DOI: https://doi.org/10.1007/978-1-4614-1415-5_1
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-1414-8
Online ISBN: 978-1-4614-1415-5
eBook Packages: Computer Science (R0)