Machine Learning, Volume 27, Issue 3, pp 313–331

Learning and Revising User Profiles: The Identification of Interesting Web Sites

  • Michael Pazzani
  • Daniel Billsus


We discuss algorithms for learning and revising user profiles that can determine which World Wide Web sites on a given topic would be interesting to a user. We describe the use of a naive Bayesian classifier for this task, and demonstrate that it can incrementally learn profiles from user feedback on the interestingness of Web sites. Furthermore, the Bayesian classifier can easily be extended to revise user-provided profiles. In an experimental evaluation we compare the Bayesian classifier to computationally more intensive alternatives, and show that it performs at least as well as these approaches across a range of domains. In addition, we empirically analyze the effects of providing the classifier with background knowledge in the form of user-defined profiles, and examine the use of lexical knowledge for feature selection. We find that both approaches can substantially increase prediction accuracy.
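As an illustration of the idea summarized above, the following is a minimal sketch of a naive Bayesian classifier over Boolean word-presence features that is updated one rated page at a time. It is not the authors' implementation: the class name `ProfileNaiveBayes`, the labels `"hot"` and `"cold"`, the Laplace smoothing, and the use of weighted virtual pages to encode a user-provided profile are all assumptions made for this example.

```python
import math


class ProfileNaiveBayes:
    """Naive Bayes over Boolean word-presence features (illustrative sketch)."""

    LABELS = ("hot", "cold")  # hypothetical labels: interesting / not interesting

    def __init__(self, vocabulary):
        self.vocabulary = set(vocabulary)
        # Number of (possibly weighted) rated pages seen per label.
        self.class_counts = {label: 0.0 for label in self.LABELS}
        # word_counts[label][word]: weighted number of pages of that label
        # in which the word occurs at least once.
        self.word_counts = {
            label: {word: 0.0 for word in self.vocabulary}
            for label in self.LABELS
        }

    def update(self, words, label, weight=1.0):
        """Incorporate one rated page. A weight above 1 acts like several
        virtual pages, which is one assumed way to seed a user-provided
        profile before any feedback has been collected."""
        self.class_counts[label] += weight
        for word in set(words) & self.vocabulary:
            self.word_counts[label][word] += weight

    def _log_score(self, present, label):
        total = sum(self.class_counts.values())
        n = self.class_counts[label]
        # Laplace-smoothed class prior.
        score = math.log((n + 1.0) / (total + len(self.LABELS)))
        for word in self.vocabulary:
            # Laplace-smoothed probability that the word occurs in a page
            # of this label; always strictly between 0 and 1.
            p = (self.word_counts[label][word] + 1.0) / (n + 2.0)
            score += math.log(p if word in present else 1.0 - p)
        return score

    def prob_hot(self, words):
        """Estimated probability that a page with these words is interesting."""
        present = set(words) & self.vocabulary
        scores = {label: self._log_score(present, label) for label in self.LABELS}
        top = max(scores.values())
        exp = {label: math.exp(s - top) for label, s in scores.items()}
        return exp["hot"] / sum(exp.values())


# Usage: learn incrementally from two rated pages.
model = ProfileNaiveBayes(["goat", "sheep", "stock"])
model.update({"goat", "sheep"}, "hot")
model.update({"stock"}, "cold")

# Usage: start from a user-provided profile instead of rated pages.
seeded = ProfileNaiveBayes(["goat", "sheep", "stock"])
seeded.update({"goat"}, "hot", weight=3.0)
```

Because the model stores only counts, each new rating is a constant-time update per word, and later feedback gradually outweighs whatever the seeded profile contributed.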

Keywords: information filtering, intelligent agents, multistrategy learning, World Wide Web, user profiles



Copyright information

© Kluwer Academic Publishers 1997

Authors and Affiliations

  • Michael Pazzani (1)
  • Daniel Billsus (1)
  1. Department of Information and Computer Science, University of California, Irvine, Irvine
