A Similarity-Aware Multiagent-Based Web Content Management Scheme

  • Jitian Xiao
  • Jun Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3930)


This paper presents a similarity-aware multiagent-based web content management scheme. Based on a set of similarity measures that assess similarities between web documents, we propose a similarity-aware multi-cache architecture in which the cached web documents are organized into a number of sub-caches according to their content similarity. A predictor is then developed to predict which cached documents a user might access next. Once a pre-fetching plan is formed, a set of agents is employed to work together to pre-fetch documents between proxy caches and browsing clients. Preliminary experiments show that our predictor outperforms some existing prediction algorithms.
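To make the idea of organizing cached documents into sub-caches by content similarity concrete, here is a minimal sketch. It is not the authors' method: the abstract does not specify the similarity measures or the grouping procedure, so this example assumes cosine similarity over bag-of-words term-frequency vectors (a common vector space model choice) and a simple greedy grouping. The function names, the `threshold` value, and the sample documents are all hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector over lowercased whitespace tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def assign_to_sub_caches(documents, threshold=0.3):
    """Greedy grouping (an assumption, not the paper's algorithm).

    Each document joins the first sub-cache whose representative (the first
    document placed in it) is at least `threshold` similar; otherwise it
    opens a new sub-cache.
    """
    sub_caches = []  # list of (representative_vector, [urls])
    for url, text in documents.items():
        vec = term_vector(text)
        for rep, members in sub_caches:
            if cosine_similarity(vec, rep) >= threshold:
                members.append(url)
                break
        else:
            sub_caches.append((vec, [url]))
    return [members for _, members in sub_caches]


# Hypothetical cached documents keyed by URL path.
docs = {
    "/sports/a": "football match score goal team",
    "/sports/b": "team wins football match",
    "/cooking/a": "recipe pasta tomato sauce",
}
groups = assign_to_sub_caches(docs)
print(groups)
```

With these sample documents, the two sports pages share three terms and land in the same sub-cache, while the cooking page shares none and opens its own. A predictor could then restrict its candidate set to the sub-cache of the document the user is currently viewing, though how the paper's predictor actually uses the sub-caches is not stated in the abstract.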


Access Pattern; Latent Semantic Analysis; Vector Space Model; Proxy Cache; Document Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jitian Xiao (1)
  • Jun Wang (2, 1)

  1. School of Computer and Information Science, Edith Cowan University, Mount Lawley, Australia
  2. School of Computer Science and Engineering, Wenzhou University, Wenzhou, China
