Journal of Intelligent Information Systems, Volume 49, Issue 2, pp 255–279

A survey on expert finding techniques

Abstract

Finding experts in specified areas is an important task and has attracted much attention in the information retrieval community. Research on this topic has made significant progress in the past few decades and various techniques have been proposed. In this survey, we review the state-of-the-art methods in expert finding and summarize these methods into different categories based on their underlying algorithms and models. We also introduce the most widely used data collection for evaluating expert finding systems, and discuss future research directions.

Keywords

Expert finding, Data resources, Algorithms

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Shuyi Lin (1, 2)
  • Wenxing Hong (3)
  • Dingding Wang (4)
  • Tao Li (5, 6)

  1. Department of Automation, Shanghai Jiao Tong University, Shanghai, China
  2. Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
  3. Department of Automation, Xiamen University, Xiamen, China
  4. Department of CEECS, Florida Atlantic University, Boca Raton, USA
  5. School of Computer Science, Florida International University, Miami, USA
  6. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China