A Fine-Grained Performance Bottleneck Analysis Method for HDFS
The performance of HDFS has long been a major concern because of its wide adoption in both production and research environments. However, there is no fine-grained performance analysis tool that can effectively identify bottlenecks and provide useful guidance for performance optimization. In this paper, we propose a fine-grained performance bottleneck analysis tool that extends HTrace with fine-grained instrumentation points missing from the official Hadoop distribution. In addition, we propose an effective trace merging method that makes the analysis results easier to understand. We analyze the performance of HDFS under different kinds of workloads and obtain previously unreported insights.
Keywords: HDFS, Instrumentation, Bottleneck analysis, Performance optimization
The authors would like to thank all anonymous reviewers for their insightful comments and suggestions. This work is supported by the National Key Research and Development Program of China (Grant No. 2016YFB1000304) and the National Natural Science Foundation of China (Grant No. 61502019).