An Algorithm for Estimation of Flow Length Distributions Using Heavy-Tailed Feature

  • Weijiang Liu
  • Jian Gong
  • Wei Ding
  • Guang Cheng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3994)


Routers can output statistics about the packets and flows of packets that traverse them. However, because the generation of detailed traffic statistics does not scale well with link speed, passive traffic measurement increasingly employs sampling at the packet level. Packet sampling has become an attractive and scalable means of measuring flow data on high-speed links. Some applications, however, need to know the number and lengths of the original flows. This paper provides an algorithm that uses flow statistics formed from a sampled packet stream to infer the absolute frequencies of flow lengths in the unsampled stream. We achieve this through statistical inference and by exploiting the heavy-tailed feature. We also investigate the impact of different packet sampling rates on our results. The experimental results show that the inferred distributions are accurate in most cases.
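
To make the measurement problem concrete, the following Python sketch simulates independent packet sampling on synthetic heavy-tailed flows. It is not the paper's inference algorithm. The Pareto shape `alpha`, the length cap `max_len`, the sampling probability `p`, and all function names are assumptions chosen only for illustration.

```python
import random
from collections import Counter


def pareto_flow_lengths(num_flows, alpha, rng, max_len=10_000):
    """Draw integer flow lengths (>= 1) from a heavy-tailed Pareto law.

    The cap max_len is an illustrative assumption that bounds the run time.
    """
    # paretovariate returns a float >= 1, so truncation keeps lengths >= 1.
    return [min(int(rng.paretovariate(alpha)), max_len) for _ in range(num_flows)]


def sample_packets(lengths, p, rng):
    """Keep each packet independently with probability p.

    Returns the number of sampled packets per original flow (possibly 0).
    """
    return [sum(1 for _ in range(n) if rng.random() < p) for n in lengths]


def length_histogram(lengths):
    """Count flows by length, ignoring flows with no packets at all."""
    return Counter(n for n in lengths if n > 0)


rng = random.Random(0)   # fixed seed so the run is repeatable
p = 0.1                  # assumed packet sampling probability

original = pareto_flow_lengths(5000, alpha=1.2, rng=rng)
sampled = sample_packets(original, p, rng)

observed = length_histogram(sampled)            # what a collector would see
missed = sum(1 for s in sampled if s == 0)      # flows with no sampled packet
estimated_packets = sum(sampled) / p            # simple scaling estimate

print("original flows:", len(original))
print("flows seen after sampling:", sum(observed.values()))
print("flows missed entirely:", missed)
print("true packets:", sum(original), "scaled estimate:", round(estimated_packets))
```

In a simulation like this, many short flows yield no sampled packets at all, so the observed histogram undercounts flows and shifts lengths downward. Scaling by 1/p recovers the total packet count reasonably well but not the flow length frequencies, which is why they have to be inferred.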


Length Distribution; Pareto Distribution; Expectation Maximization Algorithm; Bloom Filter; Proxy Cache
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Weijiang Liu (1)
  • Jian Gong (1)
  • Wei Ding (1)
  • Guang Cheng (1)
  1. Department of Computer Science and Engineering, Southeast University, Nanjing, China
