Learning to Diversify Expert Finding with Subtopics

  • Hang Su
  • Jie Tang
  • Wanling Hong
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7301)


Expert finding is concerned with finding people who are knowledgeable about a given topic. It has many applications in enterprise search, social networks, and collaborative management. In this paper, we study the problem of diversification for expert finding. Specifically, using an academic social network as the basis for our experiments, we aim to answer the following question: given a query and an academic social network, how can the ranking list be diversified so that it captures the whole spectrum of relevant authors’ expertise? We precisely define the problem and propose a new objective function that incorporates topic-based diversity into the relevance ranking measurement. A learning-based model is presented to optimize the objective function. Our empirical study on a real system validates the effectiveness of the proposed method, which achieves significant improvements (+15.3% to +94.6% in terms of MAP) over alternative methods.


Information Retrieval; Topic Model; Latent Dirichlet Allocation; Retrieval Model; Mean Average Precision
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Hang Su (1)
  • Jie Tang (2)
  • Wanling Hong (1)

  1. School of Software, Beihang University, China
  2. Department of Computer Science and Technology, Tsinghua University, China
