High Performance Network Architectures for Data Intensive Computing

Lin, Geng; Liu, Eileen

doi:10.1007/978-1-4614-1415-5_1

Data Intensive Computing is characterized by problems where data is the primary challenge, whether because of its complexity, its size, or the rate at which it is acquired. The hardware platform required for a data intensive computing environment consists of tens, sometimes even hundreds, of thousands of compute nodes with their corresponding networking and storage subsystems, power distribution and conditioning equipment, and extensive cooling systems. An essential requirement for processing exploding volumes of data is to move processing and analysis to the data, where possible, rather than moving the data to processing and analysis [1]. It is also critical to maximize parallelism over the data and the efficiency of data movement between discrete devices in a network.

References

C. Tanasescu and T. Reed, “Data Intensive Computing: How SGI Altix ICE and Intel Xeon Processor 5500 Series Help Sustain HPC Efficiency Amid Explosive Data Growth,” http://www.sgi.com/pdfs/4154.pdf, 2009
J. McKendrick, “Size of the data universe: 1.2 zettabytes and growing fast,” ZDNet, May 2010
Y. Chen, D. Pavlov and J. F. Canny, “Large-Scale Behavioral Targeting,” ACM KDD’09, Paris, France, July 2009
D. Newman, A. Asuncion, P. Smyth and M. Welling, “Distributed Algorithms for Topic Models,” Journal of Machine Learning Research, Aug. 2009
R. Shankar and G. Narendra, “MapReduce Programming with Apache Hadoop,” JavaWorld.com, Sept. 2008
LexisNexis Risk Solutions, “LexisNexis HPCC: ECL Programmers Guide,” http://www.lexisnexis.com/risk/about/guides/programmers-guide.pdf, 2010
LexisNexis Risk Solutions, “High-Performance Cluster Computing,” http://www.lexisnexis.com/government/solutions/literature/hpcc-das.pdf, 2010
G. Harrison, “10 Things You Should Know About NoSQL Databases,” http://www.techrepublic.com/downloads/10-things-you-should-know-about-nosql-databases, Aug 2010
C. Strauch, “NoSQL Databases,” Stuttgart Media University, Feb 2011
HP, “HP Superdome 2: the Ultimate Mission-critical Platform,” http://www.compaq.com/cpq-storage/pdfs/4AA1--7762ENW.pdf, June 2010
J. Baker, C. Bond, J. C. Corbett et al., “Megastore: Providing Scalable, Highly Available Storage for Interactive Services,” 5th Biennial Conference on Innovative Data Systems Research (CIDR’11), January 2011
M. Diehl, “Database Replication with MySQL,” Linux Journal, May 2010
L. Barroso and U. Hölzle, “The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines,” 2009
J. Dean and S. Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters,” OSDI’04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, Dec. 2004
Cisco Systems, “Data Center Design – IP Network Infrastructure,” http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/DC_3_0/DC-3_0_IPInfra.pdf, Oct. 2009
M. Arregoces and M. Portolani, “Data Center Fundamentals,” Cisco Press, 2004
B. Hedlund, “Top of Rack vs End of Row Data Center Designs,” http://bradhedlund.com/2009/04/05/top-of-rack-vs-end-of-row-data-center-designs/, April 2009
A. Greenberg, J. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel and S. Sengupta, “VL2: A Scalable and Flexible Data Center Network,” ACM SIGCOMM, Barcelona, Spain, Aug. 2009
A. Greenberg, P. Lahiri, D. A. Maltz, P. Patel and S. Sengupta, “Towards a Next Generation Data Center Architecture: Scalability and Commoditization,” PRESTO Workshop at SIGCOMM, 2008
S. M. Rumble, D. Ongaro, R. Stutsman, M. Rosenblum and J. K. Ousterhout, “It’s Time for Low Latency,” Proceedings of the 13th Workshop on Hot Topics in Operating Systems (HotOS 2011), 2011
J. Dean, “Designs, lessons and advice from building large distributed systems,” Keynote talk: The 3rd ACM SIGOPS International Workshop on Large Scale Distributed Systems and Middleware (October 2009)
S. Amershi, J. Fogarty, A. Kapoor and D. Tan, “Effective End-User Interaction with Machine Learning,” AAAI-11, Nectar, 2011
R. Perlman, Interconnections, Second Edition, Addison-Wesley, 2000
M. Al-Fares, A. Loukissas and A. Vahdat, “A Scalable, Commodity Data Center Network Architecture,” in Proceedings of SIGCOMM, 2008
S. Mahapatra and X. Yuan, “Load Balancing Mechanisms in Data Center Networks,” the 7th Int. Conf. & Expo on Emerging Technologies for a Smarter World (CEWIT), Sept. 2010 (invited)
W. J. Dally and B. Towles, Principles and Practices of Interconnection Networks, Morgan Kaufmann Publishers, 2004
G. Lin and N. Pippenger, “Parallel Algorithms for Routing in Non-Blocking Networks,” Math. System Theory, Vol. 27, pp. 29–40, 1994
L. G. Valiant, “A scheme for fast parallel communication,” SIAM Journal on Computing, Vol. 11, No. 2, pp. 350–361, 1982
J. Touch and R. Perlman, “Transparent Interconnection of Lots of Links (TRILL): Problem and Applicability Statement,” RFC 5556, IETF, May 2009
R. Perlman, D. Eastlake, S. Switches, D. G. Dutt, S. Gai and A. Ghanwani, “RBridges: Base Protocol Specification,” Internet-Draft, IETF, Mar. 2010 (http://tools.ietf.org/html/draft-ietf-trill-rbridge-protocol-16).
P. Ashwood-Smith, “Shortest Path Bridging IEEE 802.1aq Overview & Applications,” in UK Network Operators Forum, Sept. 2010 (http://www.uknof.org.uk/uknof17/Ashwood_Smith-SPB.pdf).
D. Eastlake, P. Ashwood-Smith, S. Keesara, and P. Unbehagen, “The Great Debate: TRILL Versus 802.1aq (SPB),” in North American Network Operators’ Group (NANOG) Meeting 50, Oct. 2010 (http://www.nanog.org/meetings/nanog50/presentations/Monday/NANOG50.Talk63.NANOG50_TRILL-SPB-Debate-Roisman.pdf).
Cisco Systems, “Data Center Interconnect: Layer 2 Extension between Remote Data,” http://www.cisco.com/en/US/prod/collateral/switches/ps5718/ps708/white_paper_c11_493718.pdf, July 2009
Cisco Systems, “Cisco Overlay Transport Virtualization Technology Introduction and Deployment Considerations,” http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/DCI/whitepaper/DCI3_OTV_Intro_WP.pdf, Jan. 2011
R. Moskowitz and P. Nikander, “Host Identity Protocol (HIP) Architecture,” RFC 4423, IETF, May 2006 (http://www.ietf.org/rfc/rfc4423.txt).
D. Farinacci, V. Fuller, D. Meyer and D. Lewis, “Locator/ID Separation Protocol (LISP),” draft-farinacci-lisp-12, IETF, Mar. 2009 (http://tools.ietf.org/html/draft-farinacci-lisp-12).
D. Ferrucci, E. Brown, J. Chu-Carroll et al., “Building Watson: An Overview of the DeepQA Project,” Association for the Advancement of Artificial Intelligence, Fall 2010
J. Qiu, J. Ekanayake, T. Guanarathene et al., “Data Intensive Computing for Bioinformatics,” Bloomington, IN, Indiana University, December 2009
J. Shafer, S. Rixner and A.L. Cox, “Datacenter Storage Architecture for MapReduce Applications,” Workshop on Architectural Concerns in Large Datacenters (ACLD 2009), Austin, TX, June 2009.

Author information

Authors and Affiliations

Dell, IBM Alliance Cisco Systems, San Francisco, CA, USA
Geng Lin
Nominum, Inc., Redwood City, CA, USA
Eileen Liu

Corresponding author

Correspondence to Geng Lin.

Editor information

Editors and Affiliations

Dept. of Computer Science & Engineering, Florida Atlantic University, Boca Raton, 33431, Florida, USA
Borko Furht
LexisNexis, Boca Raton, 33487, Florida, USA
Armando Escalante

About this chapter

Cite this chapter

Lin, G., Liu, E. (2011). High Performance Network Architectures for Data Intensive Computing. In: Furht, B., Escalante, A. (eds) Handbook of Data Intensive Computing. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1415-5_1

DOI: https://doi.org/10.1007/978-1-4614-1415-5_1
Published: 11 November 2011
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-1414-8
Online ISBN: 978-1-4614-1415-5
eBook Packages: Computer Science, Computer Science (R0)