RDMA-Based Apache Storm for High-Performance Stream Data Processing

International Journal of Parallel Programming

Abstract

Apache Storm is a scalable, fault-tolerant, distributed real-time stream-processing framework widely used in big data applications. For distributed data-sensitive applications, low-latency, high-throughput communication modules have a critical impact on overall system performance. Apache Storm currently uses Netty, an asynchronous server/client framework built on the TCP/IP protocol stack, as its communication component. The TCP/IP protocol stack has inherent performance flaws caused by frequent memory copying and context switching. As a result, the Netty component not only limits Storm's performance but also increases the CPU load in the IPoIB (IP over InfiniBand) communication mode. In this paper, we introduce two new implementations of the Apache Storm communication component based on RDMA technology. A performance evaluation on Mellanox QDR cards (40 Gbps) shows that our implementations achieve speedups of up to 5× over IPoIB and up to 10× over Gigabit Ethernet. Our implementations also significantly reduce the CPU load and increase the throughput of the system.

Acknowledgements

We are thankful to the reviewers for evaluating this study and providing valuable feedback.

Author information

Corresponding authors

Correspondence to Junshi Chen or Hong An.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The work is supported by the National Key Research and Development Program of China (Grant No. 2017YFB0202002).

About this article

Cite this article

Zhang, Z., Liu, Z., Jiang, Q. et al. RDMA-Based Apache Storm for High-Performance Stream Data Processing. Int J Parallel Prog 49, 671–684 (2021). https://doi.org/10.1007/s10766-021-00696-0

  • DOI: https://doi.org/10.1007/s10766-021-00696-0
