Abstract
Enterprises such as Facebook and Google generate more and more data every day. These data must be collected and analyzed promptly, so data must be transmitted at very high speed and with very low latency. These enterprises apply Hadoop and use several data centers to store and process the data. However, if the amount of data grows quickly, or if only one data center is used, the bandwidth of the Ethernet used by the Hadoop Distributed File System (HDFS) cannot meet the demand, and Ethernet bandwidth becomes the performance bottleneck of HDFS. To solve this problem, this paper introduces InfiniBand, a relatively new switched-fabric communication link. Based on InfiniBand, we have designed a new communication mechanism for HDFS and implemented it by modifying the HDFS code. We use remote direct memory access (RDMA) rather than sockets to send and receive data. The new HDFS does not use the original stream mode to transmit data; instead, it dynamically expands its buffer and uses an adjustable threshold. In this way the new HDFS frees up the CPU and improves performance. Unlike IPoIB, which uses only the InfiniBand hardware, our optimized HDFS both runs on InfiniBand hardware and changes the HDFS code to use RDMA. It transmits control messages over sockets and data over RDMA in order to make full use of the InfiniBand bandwidth. With InfiniBand and RDMA, network bandwidth is therefore no longer the performance bottleneck of HDFS. Our experimental results show that the network bandwidth of HDFS over InfiniBand is 60 percent higher than over Ethernet, and that our optimized HDFS performs much better than HDFS over Ethernet. Its performance is also higher than that of HDFS using IPoIB alone.
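The abstract does not say how the expanding buffer and adjustable threshold are implemented. As a rough illustration only, the following Java sketch shows one way such a buffer could behave: it doubles its capacity when a write does not fit, and hands the accumulated bytes to a transfer callback once the fill level reaches a threshold that can be changed at run time. The class name `ThresholdBuffer`, the doubling policy, and the `sink` callback (a stand-in for an RDMA send) are all assumptions made for this sketch and are not taken from the paper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of an expanding buffer with an adjustable flush threshold. */
public class ThresholdBuffer {
    private byte[] buf;
    private int len = 0;                 // number of bytes currently pending
    private int threshold;               // flush once this many bytes are pending
    private final Consumer<byte[]> sink; // assumed stand-in for an RDMA send

    public ThresholdBuffer(int initialCapacity, int threshold, Consumer<byte[]> sink) {
        if (initialCapacity <= 0 || threshold <= 0) {
            throw new IllegalArgumentException("capacity and threshold must be positive");
        }
        this.buf = new byte[initialCapacity];
        this.threshold = threshold;
        this.sink = sink;
    }

    /** Changes the threshold; flushes at once if the new value is already reached. */
    public void setThreshold(int newThreshold) {
        if (newThreshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        threshold = newThreshold;
        if (len >= threshold) {
            flush();
        }
    }

    /** Appends data, growing the buffer if needed, and flushes at the threshold. */
    public void write(byte[] data) {
        ensureCapacity(len + data.length);
        System.arraycopy(data, 0, buf, len, data.length);
        len += data.length;
        if (len >= threshold) {
            flush();
        }
    }

    /** Doubles the capacity until the requested number of bytes fits. */
    private void ensureCapacity(int needed) {
        if (needed <= buf.length) {
            return;
        }
        int newCapacity = buf.length;
        while (newCapacity < needed) {
            newCapacity *= 2;
        }
        buf = Arrays.copyOf(buf, newCapacity);
    }

    /** Hands a copy of the pending bytes to the sink and empties the buffer. */
    public void flush() {
        if (len == 0) {
            return;
        }
        sink.accept(Arrays.copyOf(buf, len));
        len = 0;
    }

    public int capacity() { return buf.length; }

    public int pending() { return len; }

    public static void main(String[] args) {
        List<byte[]> sent = new ArrayList<>();
        ThresholdBuffer b = new ThresholdBuffer(4, 8, sent::add);
        b.write(new byte[5]); // grows the buffer, stays below the threshold
        b.write(new byte[3]); // reaches the threshold and triggers one transfer
        System.out.println("transfers: " + sent.size() + ", capacity: " + b.capacity());
    }
}
```

Batching writes until a threshold is reached means fewer, larger transfers than a byte stream would produce, which is the general motivation for this kind of design; the actual policy used in the paper may differ.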
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Buyun, D., Pei, F., Xiao, F., Bin, L., Zhihong, Z. (2013). Design and Implementation of HDFS over Infiniband with RDMA. In: Tsaoussidis, V., Kassler, A.J., Koucheryavy, Y., Mellouk, A. (eds) Wired/Wireless Internet Communication. WWIC 2013. Lecture Notes in Computer Science, vol 7889. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38401-1_8
DOI: https://doi.org/10.1007/978-3-642-38401-1_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38400-4
Online ISBN: 978-3-642-38401-1
eBook Packages: Computer Science, Computer Science (R0)