User Modeling and User-Adapted Interaction, Volume 17, Issue 3, pp 217–255

A content-collaborative recommender that exploits WordNet-based user profiles for neighborhood formation

Original Paper

Abstract

Collaborative and content-based filtering are the recommendation techniques most widely adopted to date. Traditional collaborative approaches compute a similarity value between the current user and each other user by taking into account their rating style, that is, the set of ratings given on the same items. Based on the ratings of the most similar users, commonly referred to as neighbors, collaborative algorithms compute recommendations for the current user. The problem with this approach is that the similarity value is only computable if users have rated items in common. The main contribution of this work is a possible solution to overcome this limitation. We propose a new content-collaborative hybrid recommender which computes similarities between users relying on their content-based profiles, in which user preferences are stored, instead of comparing their rating styles. In more detail, user profiles are clustered to discover the current user's neighbors. Content-based user profiles play a key role in the proposed hybrid recommender. Traditional keyword-based approaches to user profiling are unable to capture the semantics of user interests. A distinctive feature of our work is the integration of linguistic knowledge in the process of learning semantic user profiles, which represent user interests more effectively than classical keyword-based profiles thanks to sense-based indexing. Semantic profiles are obtained by integrating machine learning algorithms for text categorization, namely a naïve Bayes approach and a relevance feedback method, with a word sense disambiguation strategy based exclusively on the lexical knowledge stored in the WordNet lexical database. Experiments carried out on a content-based extension of the EachMovie dataset show an improvement in the accuracy of sense-based profiles with respect to keyword-based ones, when coping with the task of classifying movies as interesting (or not) for the current user.
An experimental session has also been performed in order to evaluate the proposed hybrid recommender system. The results highlight the improvement in the predictive accuracy of collaborative recommendations obtained by selecting like-minded users according to user profiles.
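
The limitation described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy ratings, and the sense-weighted profile vectors (keys such as `sense:space`) are hypothetical, and plain cosine similarity between profiles stands in for the profile clustering the paper uses. It only shows that a rating-based similarity is undefined when two users share no rated items, while a profile-based similarity can still be computed.

```python
from math import sqrt

def pearson_on_corated(ratings_a, ratings_b):
    """Pearson correlation over co-rated items; None when it is undefined."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return None  # too few co-rated items: no rating-based similarity
    xs = [ratings_a[i] for i in common]
    ys = [ratings_b[i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs)) * sqrt(sum((y - my) ** 2 for y in ys))
    return num / den if den else None

def cosine(profile_a, profile_b):
    """Cosine similarity between two sparse feature-weight profiles."""
    shared = set(profile_a) & set(profile_b)
    num = sum(profile_a[k] * profile_b[k] for k in shared)
    den = (sqrt(sum(v * v for v in profile_a.values()))
           * sqrt(sum(v * v for v in profile_b.values())))
    return num / den if den else 0.0

# Hypothetical toy data: the two users have rated disjoint sets of movies.
alice_ratings = {"m1": 5, "m2": 4}
bob_ratings = {"m3": 5, "m4": 2}

# Hypothetical content-based profiles (feature -> weight), assumed already learned.
alice_profile = {"sense:space": 0.8, "sense:robot": 0.6}
bob_profile = {"sense:space": 0.6, "sense:alien": 0.8}

rating_sim = pearson_on_corated(alice_ratings, bob_ratings)
profile_sim = cosine(alice_profile, bob_profile)
```

Here `rating_sim` is `None` because the users share no rated movies, whereas `profile_sim` is still a usable number because both profiles weight a common feature.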

Keywords

User modeling, Collaborative filtering, Content-based filtering, Hybrid recommenders, Machine learning, Neighborhood formation in recommender systems, WordNet

Copyright information

© Springer Science+Business Media B.V. 2007

Authors and Affiliations

  • Marco Degemmis (1)
  • Pasquale Lops (1)
  • Giovanni Semeraro (1)

  1. Department of Informatics, University of Bari, Bari, Italy
