Research Paper Recommender Systems: A Subspace Clustering Approach
Researchers from the same lab often spend a considerable amount of time searching for published articles relevant to their current project. Despite having similar interests, they conduct independent, time-consuming searches. While they may share the results afterwards, they are unable to leverage previous search results during the search process. We propose a research paper recommender system that avoids such time-consuming searches by augmenting existing search engines with recommendations based on previous searches performed by others in the lab. Most existing recommender systems were developed for commercial domains with millions of users. The research paper domain, by contrast, has relatively few users compared to the large number of online research papers. The two major challenges with this type of data are the large number of dimensions and the sparseness of the data. The novel contribution of the paper is a scalable subspace clustering algorithm (SCuBA) that tackles these problems. Both synthetic and benchmark datasets are used to evaluate the clustering algorithm and to demonstrate that it performs better than traditional collaborative filtering approaches when recommending research papers.
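For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of a traditional user-based collaborative filtering recommender over binary user-paper data. It is not SCuBA, whose details are not given here. The names `jaccard`, `recommend`, and `histories`, the choice of Jaccard similarity, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two sets of paper ids (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(target, histories, k=2, top_n=3):
    """Rank papers the target has not seen by the summed similarity
    of the k most similar lab members who have them."""
    seen = histories[target]
    # Score every other user against the target and keep the k closest.
    neighbours = sorted(
        ((jaccard(seen, papers), user)
         for user, papers in histories.items() if user != target),
        reverse=True,
    )[:k]
    scores = defaultdict(float)
    for sim, user in neighbours:
        if sim == 0.0:
            continue  # users with no overlap contribute nothing
        for paper in histories[user] - seen:
            scores[paper] += sim
    # Highest score first; ties broken by paper id for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


# Hypothetical toy data: each lab member maps to the papers they selected.
histories = {
    "alice": {"p1", "p2", "p3"},
    "bob": {"p2", "p3", "p4"},
    "carol": {"p7", "p8"},
}
print(recommend("alice", histories))
```

With few users and many papers, most pairs of users share no papers at all (as with the hypothetical user `carol` here), so similarity over the full paper space is often zero. That is the sparsity and dimensionality problem the abstract describes.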
Keywords: Recommender System, Hash Table, Subspace Cluster, Similar User, Embed Cluster