Abstract
Apache Kafka is a mainstream messaging middleware that provides topic-based data distribution with high throughput. Some existing works have explored building large-scale content-based publish/subscribe systems on Kafka. However, when the number of subscribers is large, matching each message against all subscriptions and forwarding it to the matched subscribers takes considerable time, which greatly increases the latency of message distribution. In this paper, we propose a new type of Kafka topic, called the fat topic, to reduce the latency of content-based data distribution. In addition, we modify Kafka’s code to provide Consumer and Provider APIs for accessing fat topics. We conducted extensive experiments to evaluate the performance of fat topics. The experimental results show that the fat topic improves the latency of content-based event distribution by about 3.7 times compared with the original Kafka topic.
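To make the matching cost mentioned above concrete, the following is a minimal sketch of naive content-based matching, in which every event is checked against every subscription. It is not the fat topic design and not code from Kafka: the names `Predicate`, `Subscription`, and `match_event`, and the assumption that subscriptions are conjunctions of numeric interval predicates, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Predicate:
    """Hypothetical interval constraint: low <= event[attribute] <= high."""
    attribute: str
    low: float
    high: float

    def matches(self, event: Dict[str, float]) -> bool:
        value = event.get(self.attribute)
        # An event that lacks the attribute cannot satisfy the predicate.
        return value is not None and self.low <= value <= self.high


@dataclass(frozen=True)
class Subscription:
    """Hypothetical subscription: a conjunction of interval predicates."""
    subscriber_id: str
    predicates: Tuple[Predicate, ...]

    def matches(self, event: Dict[str, float]) -> bool:
        return all(p.matches(event) for p in self.predicates)


def match_event(event: Dict[str, float],
                subscriptions: List[Subscription]) -> List[str]:
    """Linear scan: the work grows with the number of subscriptions."""
    return [s.subscriber_id for s in subscriptions if s.matches(event)]


if __name__ == "__main__":
    subs = [
        Subscription("alice", (Predicate("price", 10.0, 20.0),)),
        Subscription("bob", (Predicate("price", 0.0, 5.0),
                             Predicate("volume", 100.0, 1000.0))),
    ]
    print(match_event({"price": 12.5, "volume": 300.0}, subs))
```

In this sketch the scan visits every subscription for every event, and each matched subscriber then needs its own forwarding step, so both costs grow with the subscriber count. That growth is the latency problem the abstract describes.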
Notes
- 1. In Kafka, events are termed messages. In this paper, the terms "event" and "message" are used interchangeably.
- 2.
Acknowledgments
This work was supported by the National Key Research and Development Program of China (2019YFB1704400), the National Natural Science Foundation of China (61772334, 61702151), and the Special Fund for Scientific Instruments of the National Natural Science Foundation of China (61827810).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Qian, S., Xu, J., Cao, J., Xue, G., Li, J., Zhang, W. (2021). Fat Topic: Improving Latency in Content-Based Publish/Subscribe Systems on Apache Kafka. In: Liu, Z., Wu, F., Das, S.K. (eds) Wireless Algorithms, Systems, and Applications. WASA 2021. Lecture Notes in Computer Science, vol 12937. Springer, Cham. https://doi.org/10.1007/978-3-030-85928-2_43
DOI: https://doi.org/10.1007/978-3-030-85928-2_43
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85927-5
Online ISBN: 978-3-030-85928-2
eBook Packages: Computer Science, Computer Science (R0)