Abstract
High-cardinality flow detection over big network data streams plays an important role in many practical applications. To process large, fast data streams in real time, most existing work uses compact data structures such as sketches that fit in high-speed but small on-chip memory. However, this design suffers from expensive computation and thus supports only periodic high-cardinality flow detection. Although NDS can provide online flow cardinality estimation, it is designed to estimate the cardinality of every flow accurately. In contrast, high-cardinality flow detection is concerned only with whether a flow’s cardinality exceeds a certain threshold. This paper complements prior work by proposing an online high-cardinality flow detection method with high resource efficiency. Based on an on-chip/off-chip design, the proposed method reduces the resources consumed by large flows by constructing a virtual bitmap sharing module over the physical bitmap. We evaluate the proposed method using real-world Internet traces downloaded from CAIDA. The experimental results show that our method can save up to 65.8% of on-chip memory under the same constraints on false-positive and false-negative rates.
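To make the idea of virtual bitmaps shared over one physical bitmap concrete, the following is a minimal sketch of the generic technique, not the method proposed in this paper. It assumes a single shared bit array, hash-selected virtual bits per flow, and the standard linear-counting style estimate with noise correction known from the bitmap-sharing literature. The class name, method names, parameters, and the simple threshold test are all hypothetical, and the on-chip/off-chip split is not modeled.

```python
import hashlib
import math


class VirtualBitmapSketch:
    """Illustrative only: one shared physical bitmap, one virtual bitmap per flow."""

    def __init__(self, physical_bits: int, virtual_bits: int):
        self.m = physical_bits                 # size of the shared physical bitmap
        self.s = virtual_bits                  # size of each flow's virtual bitmap
        self.bits = bytearray(physical_bits)   # one byte per bit, for readability

    @staticmethod
    def _hash(*parts) -> int:
        # Deterministic 64-bit hash of the given parts.
        data = "|".join(str(p) for p in parts).encode()
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

    def _position(self, flow, index: int) -> int:
        # Physical position of the index-th virtual bit of `flow`.
        return self._hash("pos", flow, index) % self.m

    def record(self, flow, element) -> None:
        # A repeated element maps to the same virtual bit, so duplicates
        # leave the bitmap unchanged.
        index = self._hash("elem", element) % self.s
        self.bits[self._position(flow, index)] = 1

    def estimate(self, flow) -> float:
        # Fraction of zero bits in the flow's virtual bitmap (v_s) and in the
        # whole physical bitmap (v_m); the v_m term removes the noise that
        # other flows contribute through shared bits.
        zeros_v = sum(1 for i in range(self.s)
                      if not self.bits[self._position(flow, i)])
        zeros_p = self.m - sum(self.bits)
        v_s = max(zeros_v, 1) / self.s
        v_m = max(zeros_p, 1) / self.m
        return self.s * (math.log(v_m) - math.log(v_s))

    def is_high_cardinality(self, flow, threshold: float) -> bool:
        # Hypothetical detection rule: compare the estimate with the threshold.
        return self.estimate(flow) >= threshold
```

In this sketch every flow costs no dedicated memory: its virtual bits are drawn from the shared array, which is the general reason bit sharing saves space when most flows are small.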
References
Yang, T., Zhou, Y., Jin, H., Chen, S., Li, X.: Pyramid sketch: a sketch framework for frequency estimation of data streams. Proc. VLDB Endow. 10(11), 1442–1453 (2017)
Wu, G., et al.: Accelerating real-time tracking applications over big data stream with constrained space. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds.) DASFAA 2019. LNCS, vol. 11446, pp. 3–18. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-18576-3_1
Huang, H., et al.: An efficient k-persistent spread estimator for traffic measurement in high-speed networks. IEEE ACM Trans. Netw. 28(4), 1463–1476 (2020)
Yang, Z., Zheng, B., Li, G., Zhao, X., Zhou, X., Jensen, C.S.: Adaptive top-k overlap set similarity joins. In: 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 1081–1092. IEEE (2020)
Zheng, B., et al.: Answering why-not group spatial keyword queries. IEEE Trans. Knowl. Data Eng. 32(1), 26–39 (2020)
Estan, C., Varghese, G.: New directions in traffic measurement and accounting: focusing on the elephants, ignoring the mice. ACM Trans. Comput. Syst. (TOCS) 21(3), 270–313 (2003)
Lieven, P., Scheuermann, B.: High-speed per-flow traffic measurement with probabilistic multiplicity counting. In: Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2010), pp. 1–9 (2010)
Yoon, M., Li, T., Chen, S., Peir, J.K.: Fit a spread estimator in small memory. In: Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2009), pp. 504–512 (2009)
Zhou, Y., Zhou, Y., Chen, M., Xiao, Q., Chen, S.: Highly compact virtual counters for per-flow traffic measurement through register sharing. In: Proceedings of the IEEE GLOBECOM 2016, pp. 1–6 (2016)
Ting, D.: Approximate distinct counts for billions of datasets. In: Proceedings of the International Conference on Management of Data (SIGMOD), pp. 69–86. Association for Computing Machinery, New York (2019)
Zheng, J., Xu, H., Chen, G., Dai, H.: Minimizing transient congestion during network update in data centers. In: Proceedings of IEEE International Conference on Network Protocols (ICNP 2015), pp. 1–10 (2015)
Xu, H., Yu, Z., Qian, C., Li, X., Liu, Z., Huang, L.: Minimizing flow statistics collection cost using wildcard-based requests in SDNs. IEEE ACM Trans. Netw. 25(6), 3587–3601 (2017)
Li, T., Chen, S., Luo, W., Zhang, M.: Scan detection in high-speed networks based on optimal dynamic bit sharing. In: Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2011), pp. 3200–3208 (2011)
Hu, C., Liu, B., Wang, S., Tian, J., Cheng, Y., Chen, Y.: ANLS: adaptive non-linear sampling method for accurate flow size measurement. IEEE Trans. Commun. 60(3), 789–798 (2012)
Hao, F., Kodialam, M., Lakshman, T.: ACCEL-RATE: a faster mechanism for memory efficient per-flow traffic estimation. ACM SIGMETRICS Perform. Eval. Rev. 32, 155–166 (2004)
Sun, Y., Huang, H., Ma, C., Chen, S., Du, Y., Xiao, Q.: Online spread estimation with non-duplicate sampling. In: Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2020), pp. 2440–2448 (2020)
Heule, S., Nunkesser, M., Hall, A.: HyperLogLog in practice: algorithmic engineering of a state of the art cardinality estimation algorithm. In: Proceedings of the 16th International Conference on Extending Database Technology (EDBT 2013), pp. 683–692 (2013)
Yang, T., et al.: A generic technique for sketches to adapt to different counting ranges. In: Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2019), pp. 2017–2025 (2019)
Yoon, M., Li, T., Chen, S., Peir, J.K.: Fit a compact spread estimator in small high-speed memory. IEEE ACM Trans. Netw. 19(5), 1253–1264 (2011)
Huang, H., et al.: You can drop but you can’t hide: \(k\)-persistent spread estimation in high-speed networks. In: Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2018), pp. 1889–1897 (2018)
Zhou, Y., Zhou, Y., Chen, S., Zhang, Y.: Highly compact virtual active counters for per-flow traffic measurement. In: Proceedings of the IEEE Conference on Computer Communications (INFOCOM 2018), pp. 1–9 (2018)
Zhang, Y.: An adaptive flow counting method for anomaly detection in SDN. In: Proceedings of the Ninth ACM Conference on Emerging Networking Experiments and Technologies, pp. 25–30. Association for Computing Machinery, New York (2013)
Cheng, G., Yu, J.: Adaptive sampling for OpenFlow network measurement methods. In: Proceedings of the 12th International Conference on Future Internet Technologies, pp. 1–7. Association for Computing Machinery, New York (2017)
CAIDA: The CAIDA UCSD anonymized internet traces (2016). http://www.caida.org/data/passive/passive_2016_dataset.xml. Accessed 28 July 2019
Wang, P., Jia, P., Zhang, X., Tao, J., Guan, X., Towsley, D.: Utilizing dynamic properties of sharing bits and registers to estimate user cardinalities over time. In: Proceedings of the IEEE International Conference on Data Engineering (ICDE), pp. 1094–1105 (2019)
Acknowledgements
This research was supported by the National Natural Science Foundation of China (Grant Nos. 62072322, 61873177, and U20A20182) and the Natural Science Research Project of Jiangsu Higher Education Institution (Grant No. 18KJA520010).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Du, Y., Huang, H., Sun, YE., Liu, A., Gao, G., Zhang, B. (2021). Online High-Cardinality Flow Detection over Big Network Data Stream. In: Jensen, C.S., et al. Database Systems for Advanced Applications. DASFAA 2021. Lecture Notes in Computer Science, vol 12681. Springer, Cham. https://doi.org/10.1007/978-3-030-73194-6_28
DOI: https://doi.org/10.1007/978-3-030-73194-6_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73193-9
Online ISBN: 978-3-030-73194-6
eBook Packages: Computer Science, Computer Science (R0)