
Optimizing communication performance in scale-out storage system

Abstract

Ceph is an object-based scale-out storage system that is widely used in cloud computing environments due to its scalability and reliability. Although many factors affect the performance of scale-out storage systems, the design of the communication subsystem plays an important role in determining the overall performance of these systems. In this paper, we first conduct an extensive analysis of the communication subsystem in Ceph, which uses an asynchronous messenger framework, called async messenger, for inter-component communication in the storage cluster. We then propose three optimization techniques to improve the performance of the Ceph messenger: (i) deploying a load-balancing algorithm that distributes work among worker threads based on the amount of workload each carries, (ii) assigning multiple worker threads (which we call dual worker) to a single connection to maximize the overlap of activity among threads, and (iii) using multiple connections between storage servers to maximize bandwidth usage. The experimental results show that the optimized Ceph messenger outperforms the original messenger implementation by up to 40% for random writes with 4 KB messages. Moreover, Ceph with the optimized communication subsystem shows up to 13% performance improvement compared to the original Ceph.
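
As an illustration of the first technique, the sketch below shows one way load-aware assignment of connections to messenger worker threads could look in C++. It is a minimal sketch under simplified assumptions, not Ceph's actual async messenger code: the Worker, WorkerPool, and pending_bytes names are hypothetical, and the load metric (bytes queued for sending) stands in for whatever workload measure a real scheduler would track.

```cpp
// Hypothetical sketch of load-aware worker selection (not Ceph's actual code).
// Each worker tracks an approximate backlog; a new connection is assigned to
// the least-loaded worker instead of being handed out round-robin.
#include <atomic>
#include <cstdint>
#include <memory>
#include <vector>

struct Worker {
  // Approximate load metric, e.g. bytes still queued for sending on this
  // worker's event loop; updated by the I/O path as data is queued/drained.
  std::atomic<uint64_t> pending_bytes{0};
};

class WorkerPool {
 public:
  explicit WorkerPool(size_t n) {
    for (size_t i = 0; i < n; ++i)
      workers_.push_back(std::make_unique<Worker>());
  }

  // Load-aware selection: pick the worker with the smallest backlog.
  Worker* pick_least_loaded() {
    Worker* best = workers_.front().get();
    for (auto& w : workers_) {
      if (w->pending_bytes.load(std::memory_order_relaxed) <
          best->pending_bytes.load(std::memory_order_relaxed))
        best = w.get();
    }
    return best;
  }

 private:
  std::vector<std::unique_ptr<Worker>> workers_;
};
```

A new connection would then be registered with the worker returned by pick_least_loaded(), whereas round-robin assignment ignores how much data each worker still has queued.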

Acknowledgements

This research was supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (2017M3C4A7080245).

Author information

Corresponding author

Correspondence to Sungyong Park.

Cite this article

Song, U., Jeong, B., Park, S. et al. Optimizing communication performance in scale-out storage system. Cluster Comput 22, 335–346 (2019). https://doi.org/10.1007/s10586-018-2831-6
