Using Similarity Measure to Enhance the Robustness of Web Access Prediction Model

  • Ben Niu
  • Simon C. K. Shiu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3683)


Prefetching web content by predicting users’ web requests can reduce the web server's response time and optimize network traffic. The Markov model, which is based on conditional probability, has been studied by many researchers for web access path prediction, and its prediction accuracy can reach 60 to 70 percent. A drawback of this type of model, however, is that as the access path grows longer, the chance of a successful path match decreases and the model becomes inapplicable. To preserve applicability while improving accuracy, we extend the model by introducing a similarity measure among access paths. The matching process thus becomes less rigid, and the model becomes more applicable and more robust to changes in path length.
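The contrast between exact path matching and similarity-based matching can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, so everything here is an assumption made for illustration: `path_similarity` (the fraction of agreeing positions when two paths are aligned at their most recent page), the `threshold` value, the function names, and the toy sessions are all hypothetical and are not the authors' method.

```python
from collections import Counter, defaultdict


def path_similarity(a, b):
    """Fraction of positions that agree when two paths are aligned at
    their most recent page. Illustrative measure only (an assumption)."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(reversed(a), reversed(b)))
    return matches / max(len(a), len(b))


def train(sessions, order):
    """Count which page follows each access path of length `order`."""
    model = defaultdict(Counter)
    for session in sessions:
        for i in range(len(session) - order):
            path = tuple(session[i:i + order])
            model[path][session[i + order]] += 1
    return model


def predict_exact(model, path):
    """Markov-style prediction that requires an exact match of the path."""
    counts = model.get(tuple(path))
    return counts.most_common(1)[0][0] if counts else None


def predict_similar(model, path, threshold=0.5):
    """Relaxed prediction: each stored path that is similar enough votes
    for its successors, weighted by similarity times observed count."""
    votes = Counter()
    for stored, counts in model.items():
        sim = path_similarity(path, stored)
        if sim >= threshold:
            for page, n in counts.items():
                votes[page] += sim * n
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy sessions, not data from the paper.
sessions = [
    ["home", "news", "sports", "scores"],
    ["home", "news", "sports", "scores"],
    ["home", "blog", "sports", "tickets"],
]
model = train(sessions, order=3)

query = ["search", "news", "sports"]      # never observed as an exact path
exact = predict_exact(model, query)       # None: no exact match exists
relaxed = predict_similar(model, query)   # "scores": 2 of 3 positions agree
```

In this sketch the exact matcher fails as soon as a single page in the path differs, which becomes more likely as paths get longer, while the relaxed matcher still returns a prediction from the closest stored paths.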


Similarity Measure; Access Point; User Group; Access Path; Pattern Path
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Ben Niu
    • 1
  • Simon C. K. Shiu
    • 1
  1. Department of Computing, Hong Kong Polytechnic University, Hong Kong, China
