Pitfalls in HTTP Traffic Measurements and Analysis

  • Fabian Schneider
  • Bernhard Ager
  • Gregor Maier
  • Anja Feldmann
  • Steve Uhlig
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7192)

Abstract

Being responsible for more than half of the total traffic volume in the Internet, HTTP is a popular subject for traffic analysis. From our experience with HTTP traffic analysis, we identified a number of pitfalls that can render a carefully executed study flawed. Often these pitfalls can be avoided easily. Based on passive traffic measurements of 20,000 European residential broadband customers, we quantify the potential error of three issues: non-consideration of persistent or pipelined HTTP requests, mismatches between the Content-Type header field and the actual content, and mismatches between the Content-Length header and the actual transmitted volume. We find that 60% (30%) of all HTTP requests (bytes) are persistent (i.e., not the first in a TCP connection) and 4% are pipelined. Moreover, we observe a Content-Type mismatch for 35% of the total HTTP volume. In terms of Content-Length accuracy, our data shows a factor of at least 3.2 more bytes reported in the HTTP header than actually transferred.
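
To make the first and third pitfalls concrete, the following is a minimal sketch, not the authors' tooling. It assumes a hypothetical per-connection record (Transaction) with request and response timestamps and byte counts. A request is treated as persistent when it is not the first on its TCP connection, as defined above. Treating a request as pipelined when it is sent before the previous response has finished is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Transaction:
    """One HTTP request/response pair on a TCP connection (hypothetical record)."""
    request_ts: float       # time the request was observed
    response_end_ts: float  # time the last response byte was observed
    content_length: int     # value announced in the Content-Length header
    body_bytes: int         # response body bytes actually seen on the wire


def classify_requests(transactions: List[Transaction]) -> List[Tuple[bool, bool]]:
    """Return (is_persistent, is_pipelined) per request, in request order.

    Persistent: not the first request on the connection.
    Pipelined (assumption of this sketch): sent before the response to the
    previous request on the same connection has completed.
    """
    ordered = sorted(transactions, key=lambda t: t.request_ts)
    flags = []
    for i, t in enumerate(ordered):
        persistent = i > 0
        pipelined = persistent and t.request_ts < ordered[i - 1].response_end_ts
        flags.append((persistent, pipelined))
    return flags


def content_length_ratio(transactions: List[Transaction]) -> float:
    """Ratio of bytes announced via Content-Length to bytes actually transferred."""
    announced = sum(t.content_length for t in transactions)
    transferred = sum(t.body_bytes for t in transactions)
    return announced / transferred if transferred else float("inf")


# Example connection: the second request is sent while the first response
# is still in flight, and its download is aborted after 50 of 200 bytes.
conn = [
    Transaction(0.0, 1.0, 100, 100),
    Transaction(0.5, 1.5, 200, 50),
    Transaction(2.0, 2.5, 300, 300),
]
flags = classify_requests(conn)
ratio = content_length_ratio(conn)
```

An analysis that inspects only the first request of each connection would see one of the three requests in this example and miss the other two. Summing Content-Length headers here would overstate the transferred volume by a third.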

Keywords

Mime Type, Browser Popularity, Mismatch Category, Persistent Request, Browser Category
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Fabian Schneider (1, 2)
  • Bernhard Ager (2)
  • Gregor Maier (2, 3)
  • Anja Feldmann (2)
  • Steve Uhlig (4)

  1. NEC Laboratories Europe, Heidelberg, Germany
  2. Telekom Innovation Laboratories, TU Berlin, Berlin, Germany
  3. International Computer Science Institute, Berkeley, USA
  4. Queen Mary, University of London, London, UK
