User Modeling and User-Adapted Interaction, Volume 26, Issue 1, pp 69–101

Towards reproducibility in recommender-systems research

  • Joeran Beel
  • Corinna Breitinger
  • Stefan Langer
  • Andreas Lommatzsch
  • Bela Gipp

Abstract

Numerous recommendation approaches are in use today. However, comparing their effectiveness is a challenging task because evaluation results are rarely reproducible. In this article, we examine the challenge of reproducibility in recommender-systems research. We conduct experiments using Plista’s news recommender system and Docear’s research-paper recommender system. The experiments show that there are large discrepancies in the effectiveness of identical recommendation approaches in only slightly different scenarios, as well as large discrepancies for slightly different approaches in identical scenarios. For example, in one news-recommendation scenario, the performance of a content-based filtering approach was twice as high as that of the second-best approach, while in another scenario the same content-based filtering approach was the worst-performing approach. We found several determinants that may contribute to the large discrepancies observed in recommendation effectiveness. Determinants we examined include user characteristics (gender and age), datasets, weighting schemes, the time at which recommendations were shown, and user-model size. Some of the determinants have interdependencies. For instance, the optimal size of an algorithm’s user model depended on users’ age. Since minor variations in approaches and scenarios can lead to significant changes in a recommendation approach’s performance, ensuring reproducibility of experimental results is difficult. We discuss these findings and conclude that to ensure reproducibility, the recommender-system community needs to (1) survey other research fields and learn from them, (2) find a common understanding of reproducibility, (3) identify and understand the determinants that affect reproducibility, (4) conduct more comprehensive experiments, (5) modernize publication practices, (6) foster the development and use of recommendation frameworks, and (7) establish best-practice guidelines for recommender-systems research.
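To make the idea of such determinants concrete, the following is a minimal, illustrative sketch (not the evaluation code or data used in the article) of a toy content-based filtering recommender in which only one determinant, the user-model size, is varied. The candidate texts, the user history, and the sizes compared are hypothetical.

```python
# Illustrative sketch only: a toy TF-IDF content-based recommender whose user
# model is truncated to its most frequent terms. The documents, user history,
# and user-model sizes are invented for illustration; this is not the
# experimental setup described in the article.
from collections import Counter

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

candidates = [
    "neural collaborative filtering for news recommendation",
    "mind map based user modeling for research paper recommendation",
    "click through rate prediction with logistic regression",
]
user_history = [
    "research paper recommender systems literature survey",
    "user modeling based on mind maps",
    "news recommendation with context aware ensembles",
]

def rank_candidates(user_model_terms: int) -> list[tuple[str, float]]:
    """Rank candidates by cosine similarity to a user model that keeps only
    the `user_model_terms` most frequent terms of the user's history."""
    term_counts = Counter(" ".join(user_history).split())
    user_model = " ".join(t for t, _ in term_counts.most_common(user_model_terms))

    # Vectorize the user model together with the candidates and score them.
    tfidf = TfidfVectorizer().fit_transform([user_model] + candidates)
    scores = cosine_similarity(tfidf[0], tfidf[1:]).ravel()
    return sorted(zip(candidates, scores), key=lambda pair: -pair[1])

# Varying only the user-model size already changes the similarity scores and
# can change the resulting ranking -- a simplified illustration of how a
# single determinant may affect a recommendation approach's measured
# effectiveness.
for size in (3, 15):
    print(f"user-model size {size}:")
    for title, score in rank_candidates(size):
        print(f"  {score:.3f}  {title}")
```

In the article’s experiments, such seemingly minor choices (user-model size, weighting scheme, dataset, or evaluation time) are among the factors associated with large differences in measured effectiveness; the sketch only illustrates the mechanism, not the magnitudes reported.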

Keywords

Recommender systems · Evaluation · Experimentation · Reproducibility

Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Joeran Beel (1, 5)
  • Corinna Breitinger (1, 2)
  • Stefan Langer (1, 3)
  • Andreas Lommatzsch (4)
  • Bela Gipp (1, 5)

  1. Docear, Konstanz, Germany
  2. School of Computer Science, Physics and Mathematics, Linnaeus University, Växjö, Sweden
  3. Department of Computer Science, Otto-von-Guericke University, Magdeburg, Germany
  4. DAI-Lab, Technische Universität Berlin, Berlin, Germany
  5. Department of Information Science, Konstanz University, Konstanz, Germany
