Per-Flow Cardinality Measurement

  • Chapter

Part of the book series: Wireless Networks (WN)

Abstract

Per-flow cardinality measurement over big network data consisting of numerous flows is a fundamental problem with many practical applications. Traditionally, research on this problem has focused on using a small amount of memory to estimate each flow's cardinality over a large range (up to 10^9). However, although the memory needed for each flow has been greatly compressed, when the number of flows is extremely large, the overall memory demand can still be very high, exceeding what is available in some important scenarios, such as implementing online measurement modules in network processors using only on-chip cache memory. In this chapter, instead of allocating a separate data structure (called an estimator) for each flow, we take a different path by viewing all the flows together as a whole: each flow is allocated a virtual estimator, and these virtual estimators share a common memory space. We show that sharing at the register (multi-bit) level is superior to sharing at the bit level. We present a framework of virtual estimators that allows us to apply the idea of sharing to an array of cardinality estimation solutions, achieving far better memory efficiency than the best existing work. Experimental results show that the new solution can work in a tight memory space of less than 1 bit per flow, or even one tenth of a bit per flow, a goal that had not been achieved before.
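The abstract describes register sharing only at a high level, so the following is a minimal sketch of the idea rather than the chapter's own algorithm. It assumes HyperLogLog-style registers, a BLAKE2b-based hash, and a noise-removal step derived from the assumption that other flows' elements spread uniformly over the shared pool. The class name `SharedRegisterSketch`, the helpers `_h` and `_hll`, and the parameters `m` (physical registers) and `s` (registers per virtual estimator) are all hypothetical names chosen for illustration.

```python
import hashlib
import math


def _h(*parts):
    """Deterministic 64-bit hash of the given parts (an illustrative choice)."""
    data = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def _hll(regs):
    """Standard HyperLogLog estimate from a list of register values."""
    k = len(regs)
    alpha = 0.7213 / (1 + 1.079 / k)  # bias constant, valid for k >= 128
    raw = alpha * k * k / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if raw <= 2.5 * k and zeros:  # small-range (linear counting) correction
        return k * math.log(k / zeros)
    return raw


class SharedRegisterSketch:
    """Hypothetical sketch: every flow's virtual estimator draws s registers
    from one shared pool of m physical registers."""

    def __init__(self, m=4096, s=128):
        self.m, self.s = m, s
        self.regs = [0] * m  # the common memory space shared by all flows

    def _slot(self, flow, i):
        # Physical position of the i-th register of this flow's virtual estimator.
        return _h("slot", flow, i) % self.m

    def record(self, flow, element):
        x = _h("elem", flow, element)
        i = x % self.s   # which virtual register this element updates
        w = x // self.s  # remaining hash bits drive the geometric rank
        rank = 1
        while w & 1 == 0 and rank < 31:  # cap at 31 so a value fits in 5 bits
            w >>= 1
            rank += 1
        j = self._slot(flow, i)
        self.regs[j] = max(self.regs[j], rank)

    def estimate(self, flow):
        m, s = self.m, self.s
        # Cardinality seen by the virtual estimator: the flow plus noise.
        n_s = _hll([self.regs[self._slot(flow, i)] for i in range(s)])
        # Cardinality of all flows combined, from the whole pool.
        n_all = _hll(self.regs)
        # Assumed noise model: n_s ~ n_f + (n_all - n_f) * s / m, solved for n_f.
        return (m * s / (m - s)) * (n_s / s - n_all / m)


if __name__ == "__main__":
    sketch = SharedRegisterSketch()
    for e in range(5000):
        sketch.record("flow-A", e)
    for f in range(2000):
        for e in range(5):
            sketch.record(f"flow-{f}", e)
    print(round(sketch.estimate("flow-A")))
```

Because the noise term is subtracted in expectation, estimates for very small flows can come out negative in this sketch; a practical implementation would need to clamp or otherwise handle that case.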

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Chen, S., Chen, M., Xiao, Q. (2017). Per-Flow Cardinality Measurement. In: Traffic Measurement for Big Network Data. Wireless Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-47340-6_3

  • DOI: https://doi.org/10.1007/978-3-319-47340-6_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47339-0

  • Online ISBN: 978-3-319-47340-6

  • eBook Packages: Engineering, Engineering (R0)
