Topic Based Author Ranking with Full-Text Citation Analysis
Author metadata is an important characterization of a scientific publication and often carries significant domain knowledge. Publications by established or emerging reputable authors motivate further research, in the spirit of “standing on the shoulders of giants”. This paper addresses the author ranking problem for information retrieval and recommendation, and its contributions are four-fold. First, we employ full-text citation analysis (citation context) to enhance the classical author citation network. Second, a supervised topic modeling method is used to determine the contribution of a specific author (as a vertex) or a citation (as an edge). Third, PageRank with prior and transition topical probability distributions measures the importance of authors in the graph for each scientific topic. Finally, we propose a novel evaluation method to compare the results of PageRank with prior against classical ranking methods, i.e., BM25, TFIDF, Language Model, and PageRank. The results show that our ranking method with full-text citation analysis significantly (p<0.001) outperforms the other ranking methods.
Keywords: Author ranking, Labeled-LDA, Full-text citation extraction, PageRank with prior
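The third contribution can be illustrated with a minimal sketch of PageRank with prior on a weighted author citation graph for a single topic. This is not the authors' implementation. It assumes the general form of the algorithm, in which a random walker follows citation edges in proportion to their topical weight and, with probability `beta`, jumps back to an author drawn from the topical prior. The function name, the author names, the weights, the value of `beta`, and the handling of authors with no outgoing citations are all hypothetical choices made for illustration.

```python
def pagerank_with_prior(edges, prior, beta=0.15, iterations=200, tol=1e-12):
    """Rank authors for one topic (illustrative sketch, not the paper's code).

    edges: dict mapping (citing_author, cited_author) -> non-negative topical
           weight of that citation (assumed to come from a topic model).
    prior: dict mapping author -> non-negative topical prior weight.
    beta:  probability of jumping back to the prior at each step.
    """
    nodes = sorted(set(prior) | {a for pair in edges for a in pair})
    total_prior = sum(prior.get(n, 0.0) for n in nodes)
    if total_prior <= 0:
        raise ValueError("prior must contain at least one positive weight")
    # Normalize the prior into a probability distribution over authors.
    p = {n: prior.get(n, 0.0) / total_prior for n in nodes}

    # Total outgoing weight per author, used to normalize edge weights
    # into transition probabilities.
    out_weight = {n: 0.0 for n in nodes}
    for (src, _dst), w in edges.items():
        out_weight[src] += w

    rank = dict(p)  # start the walk from the prior
    for _ in range(iterations):
        new_rank = {n: beta * p[n] for n in nodes}
        # Assumption: authors with no outgoing weight hand their mass
        # back to the prior, so the scores keep summing to 1.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0)
        for (src, dst), w in edges.items():
            if w > 0:
                new_rank[dst] += (1 - beta) * rank[src] * w / out_weight[src]
        for n in nodes:
            new_rank[n] += (1 - beta) * dangling * p[n]
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical topical weights for a single topic.
edges = {("alice", "carol"): 0.9, ("bob", "carol"): 0.8, ("carol", "dave"): 0.1}
prior = {"alice": 0.2, "bob": 0.2, "carol": 0.5, "dave": 0.1}
ranks = pagerank_with_prior(edges, prior)
for author, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{author}: {score:.4f}")
```

Running the function once per topic, each time with that topic's edge weights and priors, would give one author ranking per topic, which matches the per-topic ranking described in the abstract.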