Abstract
Ceph is an object-based scale-out storage system that is widely used in cloud computing environments due to its scalability and reliability. Although many factors affect the performance of scale-out storage systems, the design of the communication subsystem plays an important role in determining overall performance. In this paper, we first conduct an extensive analysis of the communication subsystem in Ceph, which uses an asynchronous messenger framework, called the async messenger, for inter-component communication in the storage cluster. We then propose three optimization techniques to improve the performance of the Ceph messenger: (i) deploying a load-balancing algorithm among worker threads based on the amount of work each carries, (ii) assigning multiple worker threads (which we call dual workers) to a single connection to maximize overlap among threads, and (iii) using multiple connections between storage servers to maximize bandwidth usage. The experimental results show that the optimized Ceph messenger outperforms the original messenger implementation by up to 40% on random writes with 4 KB messages. Moreover, Ceph with the optimized communication subsystem shows up to 13% performance improvement compared to the original Ceph.
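To make technique (i) concrete, the following is a minimal sketch of load-aware connection assignment, assuming pending message bytes as the load metric. The `Worker` and `Messenger` names are illustrative only and do not reflect Ceph's actual async messenger API; the original implementation assigns connections to worker threads round-robin, while the optimized version picks the least-loaded worker.

```python
# Hypothetical sketch of load balancing among messenger worker threads.
# Load is approximated by outstanding message bytes per worker.

class Worker:
    def __init__(self, wid):
        self.wid = wid
        self.pending_bytes = 0  # bytes queued but not yet sent

    def enqueue(self, msg_size):
        self.pending_bytes += msg_size


class Messenger:
    def __init__(self, num_workers):
        self.workers = [Worker(i) for i in range(num_workers)]
        self._rr = 0  # round-robin cursor (original strategy)

    def assign_round_robin(self):
        # Original strategy: ignore load, rotate through workers.
        w = self.workers[self._rr % len(self.workers)]
        self._rr += 1
        return w

    def assign_least_loaded(self):
        # Optimized strategy: pick the worker with the fewest pending bytes.
        return min(self.workers, key=lambda w: w.pending_bytes)


messenger = Messenger(3)
messenger.workers[0].enqueue(4096)
messenger.workers[1].enqueue(1024)
chosen = messenger.assign_least_loaded()  # worker 2: currently idle
```

A real implementation would track load under a lock and update it as messages drain, but the selection policy is the same: new connections land on the worker with the least outstanding work rather than the next one in rotation.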
Acknowledgements
This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (2017M3C4A7080245).
Cite this article
Song, U., Jeong, B., Park, S. et al. Optimizing communication performance in scale-out storage system. Cluster Comput 22, 335–346 (2019). https://doi.org/10.1007/s10586-018-2831-6