Neighborhood-Based Collaborative Filtering

Abstract

Neighborhood-based collaborative filtering algorithms, also referred to as memory-based algorithms, were among the earliest algorithms developed for collaborative filtering. These algorithms are based on the observation that similar users display similar patterns of rating behavior and similar items receive similar ratings. There are two primary types of neighborhood-based algorithms: user-based and item-based collaborative filtering.

Keywords

  • Target Item
  • Rating Matrix
  • Target User
  • Offline Phase
  • Latent Factor Model

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Chapter
  • DOI: 10.1007/978-3-319-29659-3_2
  • Chapter length: 42 pages

Notes

  1. In many cases, k valid peers of target user u with observed ratings for item j might not exist. This scenario is particularly common in sparse ratings matrices, such as the case where user u has fewer than k observed ratings. In such cases, the set P_u(j) will have cardinality less than k.

  2. The precise method used by Netflix is proprietary and therefore not known. However, item-based methods do provide a viable methodology to achieve similar goals.

  3. There can be some minor differences depending on how the mean is computed for each row within the Pearson coefficient. If the mean for each row is computed using all the observed entries of that row (rather than only the mutually specified entries), then the Pearson correlation coefficient is identical to the cosine coefficient for row-wise mean-centered matrices.

  4. Diagonal matrices are usually square. Although this matrix is not square, only entries with equal indices are nonzero. This is a generalized definition of a diagonal matrix.

  5. A discussion of linear regression is provided in section 4.4.5 of Chapter 4, but in the context of content-based systems.

  6. The approach can be adapted to arbitrary ratings matrices. However, the main advantages of the approach are realized for non-negative ratings matrices.

  7. It is noteworthy that imposing an additional constraint, such as non-negativity, can never improve, and typically reduces, the quality of the optimal solution on the observed entries. On the other hand, imposing constraints increases the model bias and reduces the model variance, which might reduce overfitting on the unobserved entries. In fact, when two closely related models have contradicting relative performances on the observed and unobserved entries, it is almost always a result of differing levels of overfitting in the two cases. You will learn more about the bias-variance trade-off in Chapter 6. In general, it is more reliable to predict item ratings with positive item-item relationships than with negative relationships. The non-negativity constraint is based on this observation. The incorporation of model biases in the form of such natural constraints is particularly useful for smaller data sets.
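The situation described in Note 1 can be illustrated with a minimal sketch. The function name `peer_set`, the dictionary layout of the ratings, and the similarity values below are all hypothetical and chosen only for illustration; the sketch assumes that similarities to the target user have already been computed by some means.

```python
def peer_set(ratings, similarity_to_u, u, j, k):
    """Return up to k users most similar to u who have rated item j.

    ratings: dict mapping user -> dict mapping item -> rating (assumed layout).
    similarity_to_u: dict mapping user -> similarity with u (assumed precomputed).
    """
    # Only users other than u who have an observed rating for item j qualify.
    candidates = [
        v for v in ratings
        if v != u and j in ratings[v] and v in similarity_to_u
    ]
    # Rank the qualifying users by decreasing similarity to u.
    candidates.sort(key=lambda v: similarity_to_u[v], reverse=True)
    # Slicing returns fewer than k users when fewer than k qualify.
    return candidates[:k]


# Hypothetical toy data: only v1 and v2 have rated item "j".
ratings = {
    "u": {"a": 5, "b": 3},
    "v1": {"a": 4, "j": 2},
    "v2": {"b": 3, "j": 5},
    "v3": {"a": 1, "b": 2},
}
similarity_to_u = {"v1": 0.9, "v2": 0.4, "v3": 0.7}

peers = peer_set(ratings, similarity_to_u, "u", "j", k=3)
print(peers, len(peers))
```

Here k = 3 peers are requested, but only two users have an observed rating for the item, so the returned set has cardinality less than k, as the note describes.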
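The equivalence stated in Note 3 can be checked numerically. The following is a sketch under stated assumptions: both coefficients are summed over the mutually rated items only, missing ratings are represented as `None`, and the two toy rating rows are invented for illustration. The helper names are hypothetical and not taken from the chapter.

```python
import math


def row_mean(row):
    """Mean over all observed (non-None) entries of a ratings row."""
    observed = [r for r in row if r is not None]
    return sum(observed) / len(observed)


def common_indices(a, b):
    """Indices of items rated in both rows."""
    return [i for i in range(len(a)) if a[i] is not None and b[i] is not None]


def pearson(a, b, mutual_mean=False):
    """Pearson coefficient summed over the mutually rated items.

    mutual_mean=False: each row mean uses all observed entries of that row.
    mutual_mean=True: each row mean uses only the mutually rated entries.
    """
    idx = common_indices(a, b)
    if mutual_mean:
        mu_a = sum(a[i] for i in idx) / len(idx)
        mu_b = sum(b[i] for i in idx) / len(idx)
    else:
        mu_a, mu_b = row_mean(a), row_mean(b)
    num = sum((a[i] - mu_a) * (b[i] - mu_b) for i in idx)
    den = math.sqrt(sum((a[i] - mu_a) ** 2 for i in idx)) * math.sqrt(
        sum((b[i] - mu_b) ** 2 for i in idx)
    )
    return num / den


def mean_center(row):
    """Subtract the row mean (over all observed entries) from each rating."""
    mu = row_mean(row)
    return [None if r is None else r - mu for r in row]


def cosine(a, b):
    """Cosine coefficient summed over the mutually rated items."""
    idx = common_indices(a, b)
    num = sum(a[i] * b[i] for i in idx)
    den = math.sqrt(sum(a[i] ** 2 for i in idx)) * math.sqrt(
        sum(b[i] ** 2 for i in idx)
    )
    return num / den


# Hypothetical rating rows; None marks an unobserved entry.
user_a = [5, 3, None, 1, 4]
user_b = [4, None, 2, 1, 5]

p_all = pearson(user_a, user_b)                      # means over all observed entries
p_mutual = pearson(user_a, user_b, mutual_mean=True)  # means over mutual entries only
c_centered = cosine(mean_center(user_a), mean_center(user_b))
print(p_all, c_centered, p_mutual)
```

With these toy rows, `p_all` and `c_centered` agree up to floating-point rounding, whereas `p_mutual` comes out slightly different, which is the minor difference the note refers to.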

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Aggarwal, C.C. (2016). Neighborhood-Based Collaborative Filtering. In: Recommender Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-29659-3_2

  • DOI: https://doi.org/10.1007/978-3-319-29659-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-29657-9

  • Online ISBN: 978-3-319-29659-3

  • eBook Packages: Computer Science, Computer Science (R0)