Estimating the Number of Hosts Corresponding to an Address while Preserving Anonymity

  • Alif Wahid
  • Christopher Leckie
  • Chenfeng Zhou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7645)


Estimating the number of hosts that have been assigned to an Internet address is a challenging problem due to confounding factors such as the dynamic allocation of addresses and the prohibition of access to privacy-sensitive data that can reveal user identities and remove anonymity. We propose a probabilistic method that strikes a desired balance between protection of anonymity and accuracy of estimation. By utilising the phenomenon of preferential attachment, we show that the number of hosts corresponding to an address is accurately predicted by the number of times that the address appears in a series of alternating ON and OFF intervals. We validate our method using a four-month trace of dynamic address allocations at a campus wireless network. In so doing, we demonstrate the practical significance and utility of such an anonymity-preserving method for estimating the number of hosts corresponding to a dynamic address.
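The observable that the abstract relies on, the number of ON intervals in which an address appears, can be illustrated with a minimal sketch. This is not the authors' estimator, which the abstract does not specify. It only counts ON intervals per address from allocation records that carry no host identifiers. The record layout `(address, start, end)`, the function name `count_on_intervals`, and the `min_gap` merging threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def count_on_intervals(records, min_gap=0.0):
    """Count ON intervals per address (hypothetical helper, not from the paper).

    records: iterable of (address, start, end) tuples. No host identifier
             (e.g. a MAC address) is needed, which is what keeps the count
             anonymity preserving.
    min_gap: assumed threshold; two records for the same address are treated
             as one ON interval when the OFF gap between them is <= min_gap.
    """
    by_address = defaultdict(list)
    for address, start, end in records:
        if end < start:
            raise ValueError(f"interval for {address} ends before it starts")
        by_address[address].append((start, end))

    counts = {}
    for address, intervals in by_address.items():
        intervals.sort()
        n_on = 0
        current_end = None
        for start, end in intervals:
            if current_end is None or start - current_end > min_gap:
                # A genuine OFF gap precedes this record: new ON interval.
                n_on += 1
                current_end = end
            else:
                # Overlapping or back-to-back record: extend the current one.
                current_end = max(current_end, end)
        counts[address] = n_on
    return counts


# Made-up trace for illustration only (times in arbitrary units).
trace = [
    ("10.0.0.1", 0, 10),
    ("10.0.0.1", 20, 30),
    ("10.0.0.1", 25, 40),  # overlaps the previous record, so it is merged
    ("10.0.0.2", 5, 6),
]
counts = count_on_intervals(trace)
print(counts)
```

Under the abstract's claim, addresses with larger counts would be expected to correspond to more hosts. Turning a count into a host estimate requires the probabilistic model described in the paper itself.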


Preferential Attachment, Network Address Translator, Full Binary Tree, Internet Measurement, Classic Hidden Markov Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Alif Wahid (1)
  • Christopher Leckie (1)
  • Chenfeng Zhou (1)
  1. Department of Computing and Information Systems, The University of Melbourne, Australia
