Online High-Cardinality Flow Detection over Big Network Data Stream

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12681)

Abstract

High-cardinality flow detection over big network data streams plays an important role in many practical applications. To process large, fast data streams in real time, most existing work uses compact data structures such as sketches that fit in the fast but small on-chip memory. However, this design incurs expensive computation and therefore supports only periodic high-cardinality flow detection. Although NDS can provide online flow cardinality estimation, it is designed to estimate the cardinality of every flow accurately. In contrast, high-cardinality flow detection is only concerned with whether a flow’s cardinality exceeds a given threshold. This paper complements prior work by proposing an online high-cardinality flow detection method with high resource efficiency. Based on an on-chip/off-chip design, the proposed method reduces the resources consumed by large flows by constructing a virtual bitmap sharing module over the physical bitmap. We evaluate the method on real-world Internet traces downloaded from CAIDA. The experimental results show that our method saves up to 65.8% of on-chip memory under the same bounds on the false-positive and false-negative rates.
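
To make the idea of sharing one physical bitmap among many per-flow virtual bitmaps concrete, the following is a minimal Python sketch. It is not the method proposed in this paper, whose details are not given in the abstract. It illustrates the general virtual-bitmap technique with a linear-counting style estimate, s·ln(V_m) − s·ln(V_s), where V_m and V_s are the fractions of zero bits in the physical and virtual bitmaps. The class name `VirtualBitmapDetector`, its parameters, the BLAKE2b-based hashing, and the query-time threshold check are all assumptions chosen for illustration.

```python
import hashlib
import math


class VirtualBitmapDetector:
    """Illustrative virtual-bitmap sharing sketch (hypothetical, not the paper's design)."""

    def __init__(self, physical_bits=1 << 16, virtual_bits=256, threshold=100):
        self.m = physical_bits          # size of the shared physical bitmap
        self.s = virtual_bits           # size of each flow's virtual bitmap
        self.threshold = threshold      # cardinality threshold for detection
        self.bits = bytearray(self.m)   # one byte per bit, for readability

    @staticmethod
    def _hash(*parts):
        # Deterministic 64-bit hash of the given parts.
        data = "|".join(str(p) for p in parts).encode()
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

    def _position(self, flow, slot):
        # Map slot `slot` of the flow's virtual bitmap to a physical bit.
        return self._hash("pos", flow, slot) % self.m

    def insert(self, flow, element):
        # The element selects one virtual slot; duplicates hit the same bit,
        # so repeated (flow, element) pairs do not change the state.
        slot = self._hash("slot", flow, element) % self.s
        self.bits[self._position(flow, slot)] = 1

    def estimate(self, flow):
        # Fraction of zero bits in the flow's virtual bitmap (V_s) and in the
        # whole physical bitmap (V_m); the latter accounts for noise that
        # other flows introduce through sharing.
        zeros_virtual = sum(
            1 for i in range(self.s) if self.bits[self._position(flow, i)] == 0
        )
        zeros_physical = self.m - sum(self.bits)
        if zeros_virtual == 0 or zeros_physical == 0:
            return float("inf")  # bitmap saturated; estimate is unbounded
        v_s = zeros_virtual / self.s
        v_m = zeros_physical / self.m
        return self.s * (math.log(v_m) - math.log(v_s))

    def is_high_cardinality(self, flow):
        return self.estimate(flow) >= self.threshold


if __name__ == "__main__":
    det = VirtualBitmapDetector()
    for i in range(500):                 # one flow contacting many destinations
        det.insert("scanner", f"dst-{i}")
    for f in range(200):                 # many small background flows
        for j in range(3):
            det.insert(f"host-{f}", f"dst-{j}")
    print(det.estimate("scanner"), det.is_high_cardinality("scanner"))
    print(det.estimate("host-7"), det.is_high_cardinality("host-7"))
```

In this sketch the threshold is checked only when a flow is queried, and each query scans all s virtual bits. An online detector would need a cheaper per-packet decision, which is the problem the paper addresses.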


Acknowledgements

This research was supported by the National Natural Science Foundation of China (Grant Nos. 62072322, 61873177, and U20A20182) and the Natural Science Research Project of Jiangsu Higher Education Institution (Grant No. 18KJA520010).

Corresponding author

Correspondence to He Huang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Du, Y., Huang, H., Sun, YE., Liu, A., Gao, G., Zhang, B. (2021). Online High-Cardinality Flow Detection over Big Network Data Stream. In: Jensen, C.S., et al. Database Systems for Advanced Applications. DASFAA 2021. Lecture Notes in Computer Science, vol 12681. Springer, Cham. https://doi.org/10.1007/978-3-030-73194-6_28

  • DOI: https://doi.org/10.1007/978-3-030-73194-6_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-73193-9

  • Online ISBN: 978-3-030-73194-6

  • eBook Packages: Computer Science (R0)
