Measure prediction capability of data for collaborative filtering

  • Regular Paper
Knowledge and Information Systems

Abstract

Collaborative filtering (CF) approaches have been widely employed in e-commerce to help users find items they like. Whereas most existing work focuses on improving algorithmic performance, it is also important to know whether the recommendations made for particular users and items can be trusted. In this paper, we propose a metric, “relatedness,” to measure the potential that a user’s preference on an item can be accurately predicted. The relatedness of a user–item pair is determined by a community consisting of the users and items most related to the pair. The relatedness is computed by solving a constrained \(\ell_{1}\)-regularized least-squares problem with a generalized homotopy algorithm, and we design a homotopy-based community search algorithm that identifies the community by alternately selecting the most related users and items. As an application of the relatedness metric, we develop the data-oriented combination (DOC) method for recommender systems, which integrates a group of benchmark CF methods based on the relatedness of user–item pairs. In experimental studies, we examine the effectiveness of the relatedness metric and validate the performance of the DOC method by comparing it with the benchmark methods.
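For readers unfamiliar with \(\ell_{1}\)-regularized least squares, the following is a minimal sketch of the generic, unconstrained problem \(\min_x \tfrac{1}{2}\Vert Ax-b\Vert_2^2 + \lambda \Vert x\Vert_1\). It is not the method of the paper: the abstract does not state the constraints or how the matrices are built, and the sketch uses plain proximal gradient descent (ISTA) rather than the generalized homotopy algorithm. The names `ista` and `soft_threshold`, the toy data, and all parameter values are assumptions made purely for illustration.

```python
# Illustrative only: generic, unconstrained l1-regularized least squares
#   minimize 0.5 * ||A x - b||^2 + lam * ||x||_1
# solved with ISTA (proximal gradient), NOT the paper's generalized homotopy.

def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ista(A, b, lam, step, iters=500):
    """Run `iters` proximal-gradient steps; A is a list of rows."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - b
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # Gradient of the smooth part: A^T r
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by soft-thresholding
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x

# Toy data (hypothetical): b is exactly twice the first column of A.
A = [[1.0, 0.0, 0.5],
     [0.0, 1.0, 0.5],
     [1.0, 1.0, 0.0],
     [2.0, 0.0, 0.0]]
b = [2.0, 0.0, 2.0, 4.0]
# step must not exceed 1 / (largest eigenvalue of A^T A); 0.1 is safe here.
x_hat = ista(A, b, lam=0.5, step=0.1)
print(x_hat)
```

On this toy input the penalty keeps only the first coefficient nonzero and shrinks it slightly below 2. That kind of sparse selection is what makes an \(\ell_{1}\) penalty a plausible tool for picking out a small set of related users or items, although how the paper maps the solution to a relatedness score is not described in the abstract.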

Author information

Corresponding author

Correspondence to Xijun Liang.

Additional information

This work is partially supported by the Natural Science Foundation of China under Grants 11171049 and 61503412, the Natural Science Foundation of Shandong Province under Grant No. ZR2014AP004, and the Fundamental Research Funds for the Central Universities under Grant No. 15CX02051A.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or permissions@acm.org.

About this article

Cite this article

Liang, X., Xia, Z., Pang, L. et al. Measure prediction capability of data for collaborative filtering. Knowl Inf Syst 49, 975–1004 (2016). https://doi.org/10.1007/s10115-016-0920-5

  • DOI: https://doi.org/10.1007/s10115-016-0920-5
