Abstract
Researchers from the same lab often spend considerable time searching for published articles relevant to their current project. Despite having similar interests, they conduct independent, time-consuming searches. While they may share the results afterwards, they cannot leverage one another's previous search results during the search process. We propose a research paper recommender system that avoids such time-consuming searches by augmenting existing search engines with recommendations based on previous searches performed by others in the lab. Most existing recommender systems were developed for commercial domains with millions of users. The research paper domain, by contrast, has relatively few users compared to the large number of online research papers. The two major challenges with this type of data are the large number of dimensions and the sparseness of the data. The novel contribution of the paper is a scalable subspace clustering algorithm (SCuBA) that tackles these problems. Both synthetic and benchmark datasets are used to evaluate the clustering algorithm and to demonstrate that it performs better than traditional collaborative filtering approaches when recommending research papers.
Supported by grants from Prop 301 (No. ECR A601) and CEINT 2004.
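To make the comparison point concrete, the following is a minimal sketch of the kind of traditional user-based collaborative filtering baseline the abstract refers to, applied to sparse binary access data (a researcher either accessed a paper or did not). It is not SCuBA, whose details are not given here. The names `jaccard`, `recommend`, and `history`, the choice of Jaccard similarity, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two sets of paper IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(history, target, top_n=3):
    """Rank papers the target has not accessed, weighting each paper by the
    similarity of the lab members who did access it.

    `history` maps a researcher name to the set of paper IDs they accessed
    (a sparse binary representation; all names here are hypothetical).
    """
    seen = history[target]
    scores = defaultdict(float)
    for other, papers in history.items():
        if other == target:
            continue
        sim = jaccard(seen, papers)
        if sim == 0.0:
            continue  # no overlap, so this member contributes nothing
        for paper in papers - seen:
            scores[paper] += sim
    # Sort by descending score, breaking ties by paper ID for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [paper for paper, _ in ranked[:top_n]]


# Hypothetical toy data: four lab members and the papers each accessed.
history = {
    "alice": {"p1", "p2", "p3"},
    "bob": {"p2", "p3", "p4"},
    "carol": {"p3", "p4", "p5"},
    "dave": {"p9"},
}

print(recommend(history, "alice"))
```

In this toy data the member with no overlapping papers receives no recommendations at all. This is the sparsity problem the abstract describes: with few users and many papers, most pairs of users share nothing, so whole-profile similarity has little to work with.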
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Agarwal, N., Haque, E., Liu, H., Parsons, L. (2005). Research Paper Recommender Systems: A Subspace Clustering Approach. In: Fan, W., Wu, Z., Yang, J. (eds) Advances in Web-Age Information Management. WAIM 2005. Lecture Notes in Computer Science, vol 3739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11563952_42
DOI: https://doi.org/10.1007/11563952_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29227-2
Online ISBN: 978-3-540-32087-6
eBook Packages: Computer Science, Computer Science (R0)