Predicting Web Information Content

  • Tingshao Zhu
  • Russ Greiner
  • Gerald Häubl
  • Bob Price
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3169)


This paper introduces a novel method for predicting the current information need of a web user from the content of the pages the user has visited and the actions the user has applied to these pages. This inference is based on a parameterized model of how the sequence of actions chosen by the user indicates the degree to which page content satisfies the user’s information need. We show that the model parameters can be estimated using standard methods from a labelled corpus. Data from lab experiments demonstrate that the prediction model can effectively identify the information needs of new users, browsing previously unseen pages. The paper concludes with an overview of our “complete-web” recommendation system, WebIC, which uses the prediction model to recommend useful pages to the user, from anywhere on the Web.


Keywords: Association Rule, Page Content, Information Retrieval Technique, Current Page, General User Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Tingshao Zhu (1)
  • Russ Greiner (1)
  • Gerald Häubl (2)
  • Bob Price (1)

  1. Department of Computing Science, University of Alberta, Canada
  2. School of Business, University of Alberta, Canada
