World Wide Web, Volume 2, Issue 1–2, pp 15–28

Changes in Web client access patterns: Characteristics and caching implications

  • Paul Barford
  • Azer Bestavros
  • Adam Bradley
  • Mark Crovella
Abstract

Understanding the nature of the workloads and system demands created by users of the World Wide Web is crucial to properly designing and provisioning Web services. Previous measurements of Web client workloads have been shown to exhibit a number of characteristic features; however, it is not clear how those features may be changing with time. In this study we compare two measurements of Web client workloads separated in time by three years, both captured from the same computing facility at Boston University. The older dataset, obtained in 1995, is well known in the research literature and has been the basis for a wide variety of studies. The newer dataset was captured in 1998 and is comparable in size to the older dataset. The new dataset has the drawback that the collection of users measured may no longer be representative of general Web users; however, using it has the advantage that many comparisons can be drawn more clearly than would be possible using a new, different source of measurement. Our results fall into two categories. First we compare the statistical and distributional properties of Web requests across the two datasets. This serves to reinforce and deepen our understanding of the characteristic statistical properties of Web client requests. We find that the kinds of distributions that best describe document sizes have not changed between 1995 and 1998, although specific values of the distributional parameters are different. Second, we explore the question of how the observed differences in the properties of Web client requests, particularly the popularity and temporal locality properties, affect the potential for Web file caching in the network. We find that for the computing facility represented by our traces between 1995 and 1998, (1) the benefits of using size‐based caching policies have diminished; and (2) the potential for caching requested files in the network has declined.
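The comparison between size-based and recency-based caching policies mentioned in the abstract can be illustrated with a small trace-driven simulation. The following is a minimal sketch, not the authors' methodology: the function `simulate`, the policy names `"lru"` and `"size"`, and the trace `toy_trace` are all hypothetical and invented for illustration, and each document is assumed to keep the same size across requests.

```python
from collections import OrderedDict


def simulate(trace, capacity, policy):
    """Replay (url, size) requests through a byte-limited cache.

    policy: "lru" evicts the least recently used document,
            "size" evicts the largest cached document.
    Returns (hit_rate, byte_hit_rate).
    """
    if not trace:
        return 0.0, 0.0
    cache = OrderedDict()  # url -> size, least recently used first
    used = hits = hit_bytes = total_bytes = 0
    for url, size in trace:
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)  # mark as most recently used
            continue
        if size > capacity:
            continue  # document can never fit, so it is not cached
        # Evict until the new document fits.
        while used + size > capacity:
            if policy == "lru":
                victim = next(iter(cache))
            elif policy == "size":
                victim = max(cache, key=cache.get)
            else:
                raise ValueError(f"unknown policy: {policy}")
            used -= cache.pop(victim)
        cache[url] = size
        used += size
    return hits / len(trace), hit_bytes / total_bytes


# Hypothetical toy trace (sizes in arbitrary units), not data from the paper.
toy_trace = [("a", 10), ("b", 10), ("big", 80), ("c", 10),
             ("a", 10), ("b", 10), ("c", 10)]

if __name__ == "__main__":
    for policy in ("lru", "size"):
        print(policy, simulate(toy_trace, 100, policy))
```

On this invented trace the size-based policy keeps the small, frequently requested documents and so scores more hits than LRU. The sketch only shows how such a comparison is set up; whether the advantage holds depends on the document size and popularity distributions of a real trace, which is the question the study examines for the 1995 and 1998 datasets.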

Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • Paul Barford (1)
  • Azer Bestavros (1)
  • Adam Bradley (1)
  • Mark Crovella (1)

  1. Computer Science Department, Boston University, Boston, USA
