The Journal of Supercomputing, Volume 75, Issue 1, pp 123–141

An efficient parallel similarity matrix construction on MapReduce for collaborative filtering

  • Seunghee Kim
  • Hongyeon Kim
  • Jun-Ki Min

Abstract

Collaborative filtering has become a popular technique for recommendation systems. However, as data volumes grow rapidly, constructing the similarity matrix becomes a performance bottleneck. The MapReduce framework proposed by Google has recently been widely used for data-intensive applications. In this work, we therefore propose ConSimMR, an efficient parallel algorithm for constructing a similarity matrix using MapReduce. We first partition the set of items into disjoint groups such that items rated by similar users tend to fall into the same group. We next compute the similarity of every pair of items belonging to the same group. Finally, we compute the similarity of every pair of items belonging to different groups. In this step, by using the rating list of each user rather than that of each item, we can compute the similarities in parallel, which improves performance. We conducted experiments comparing ConSimMR with previous algorithms on real-life data sets and confirmed both the efficiency and the scalability of ConSimMR.
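
The idea of working from each user's rating list can be illustrated with a small single-machine sketch. It is not the ConSimMR algorithm itself: the abstract does not name the similarity measure or the partitioning method, so cosine similarity is assumed here, the grouping step is skipped, and the names map_user and cosine_similarities are hypothetical. Each user's list yields partial products for the item pairs that user rated (the map side), and summing them per pair (the reduce side) gives the dot products.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def map_user(ratings):
    """Map side (hypothetical name): from one user's rating list,
    emit a partial product for every pair of items that user rated."""
    for (i, ri), (j, rj) in combinations(sorted(ratings.items()), 2):
        yield (i, j), ri * rj


def cosine_similarities(user_ratings):
    """Reduce side (hypothetical name): sum the partial products per item
    pair and normalise. Cosine similarity is an assumption of this sketch."""
    dot = defaultdict(float)      # (item_i, item_j) -> sum of rating products
    sq_norm = defaultdict(float)  # item -> sum of squared ratings
    for ratings in user_ratings.values():
        for item, r in ratings.items():
            sq_norm[item] += r * r
        for pair, partial in map_user(ratings):
            dot[pair] += partial
    return {
        (i, j): d / (sqrt(sq_norm[i]) * sqrt(sq_norm[j]))
        for (i, j), d in dot.items()
    }


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = {
        "u1": {"a": 1.0, "b": 1.0},
        "u2": {"a": 1.0, "b": 1.0, "c": 2.0},
    }
    for pair, sim in sorted(cosine_similarities(toy).items()):
        print(pair, round(sim, 4))
```

Because each user's list is processed independently, the map side could in principle run on separate workers, with only the per-pair sums needing to be brought together.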

Keywords

Collaborative filtering · Recommendation · MapReduce · Big data

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of IT Convergence Software Engineering, Korea University of Technology and Education, Cheonan, Korea
  2. School of Computer Science and Engineering, Korea University of Technology and Education, Cheonan, Korea
