Mining Web Logs to Improve Web Caching and Prefetching

  • Conference paper
Web Intelligence: Research and Development (WI 2001)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2198)

Abstract

Caching and prefetching are well-known strategies for improving the performance of Internet systems. The heart of a caching system is its page replacement policy, which selects the pages to be replaced in a proxy cache when a request arrives. By the same token, the essence of a prefetching algorithm lies in its ability to accurately predict future requests. In this paper, we present a method for caching variable-sized web objects using an n-gram based prediction of future web requests. Our method mines a prediction model of document access patterns from the web logs and uses the model to extend the well-known GDSF caching policy. In addition, we present a new method to integrate this caching algorithm with a prediction-based prefetching algorithm. We empirically show that the system performance is greatly improved using the integrated approach.
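
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names, the toy sessions, and the `predicted` term are hypothetical. The GDSF key follows the commonly stated form K = L + F * C / S, and adding a predicted-access weight to the frequency is only one assumed way such a model could extend that key.

```python
from collections import Counter, defaultdict


def train_ngram(sessions, n=2):
    """Count which URL follows each run of n consecutive URLs in logged sessions."""
    model = defaultdict(Counter)
    for session in sessions:
        for i in range(len(session) - n):
            context = tuple(session[i:i + n])
            model[context][session[i + n]] += 1
    return model


def predict(model, recent, n=2):
    """Return {url: estimated probability} for the next request after the last n URLs."""
    counts = model.get(tuple(recent[-n:]))
    if not counts:
        return {}
    total = sum(counts.values())
    return {url: count / total for url, count in counts.items()}


def gdsf_key(inflation, frequency, cost, size, predicted=0.0):
    """GDSF priority L + F * C / S; `predicted` is an assumed extra frequency weight."""
    return inflation + (frequency + predicted) * cost / size


# Toy sessions standing in for per-user request sequences extracted from a web log.
sessions = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["x", "a", "b", "c"],
]
model = train_ngram(sessions, n=2)
next_probs = predict(model, ["a", "b"], n=2)

# Objects with the lowest key are evicted first; a likely future request raises the key.
plain_key = gdsf_key(inflation=0.0, frequency=2, cost=1.0, size=4.0)
boosted_key = gdsf_key(inflation=0.0, frequency=2, cost=1.0, size=4.0,
                       predicted=next_probs.get("c", 0.0))
```

In this sketch the same prediction table could also drive prefetching, for example by fetching any URL whose estimated probability exceeds a chosen threshold, which is one plausible reading of the integrated approach the abstract mentions.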

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, Q., Zhang, H.H., Li, I.T.Y., Lu, Y. (2001). Mining Web Logs to Improve Web Caching and Prefetching. In: Zhong, N., Yao, Y., Liu, J., Ohsuga, S. (eds) Web Intelligence: Research and Development. WI 2001. Lecture Notes in Computer Science, vol 2198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45490-X_62

  • DOI: https://doi.org/10.1007/3-540-45490-X_62

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42730-8

  • Online ISBN: 978-3-540-45490-8

  • eBook Packages: Springer Book Archive
