Modeling Documents as Mixtures of Persons for Expert Finding

  • Pavel Serdyukov
  • Djoerd Hiemstra
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4956)

Abstract

In this paper we address the problem of searching for knowledgeable persons within an enterprise, known as the expert finding (or expert search) task. We present a probabilistic algorithm based on the assumption that the terms in a document are produced by the people mentioned in it. We represent the documents retrieved for a query as mixtures of the candidate experts' language models. We propose two methods for extracting personal language models, along with a way of combining them with other evidence of expertise. Experiments on the TREC Enterprise collection show that our approach outperforms the best of the existing solutions.
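The mixture idea can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so the code below is not the authors' method: it assumes uniform mixture weights over the persons mentioned in a document and uses a plain EM loop to estimate a term distribution per person. The function name `estimate_person_models` and the input layout are hypothetical.

```python
from collections import defaultdict

def estimate_person_models(docs, iterations=20):
    """Estimate P(term | person) with EM (illustrative sketch only).

    docs: list of (terms, persons) pairs, where `terms` is a list of tokens
    and `persons` lists the candidates mentioned in that document.
    Assumption: every mentioned person is a priori equally likely to have
    produced any given term occurrence (uniform mixture weights).
    """
    vocab = {t for terms, _ in docs for t in terms}
    persons = {p for _, ps in docs for p in ps}
    # Start from uniform models; differing document sets break the symmetry.
    models = {p: {t: 1.0 / len(vocab) for t in vocab} for p in persons}

    for _ in range(iterations):
        counts = {p: defaultdict(float) for p in persons}
        for terms, ps in docs:
            for t in terms:
                # E-step: share this term occurrence among the mentioned
                # persons in proportion to their current P(term | person).
                denom = sum(models[p].get(t, 0.0) for p in ps)
                if denom == 0.0:
                    continue  # no mentioned person can explain this term
                for p in ps:
                    counts[p][t] += models[p].get(t, 0.0) / denom
        # M-step: renormalise expected counts into probability distributions.
        for p in persons:
            total = sum(counts[p].values())
            if total > 0.0:
                models[p] = {t: c / total for t, c in counts[p].items()}
    return models
```

Given such models, candidates could be ranked for a query by how likely their personal model is to generate the query terms; smoothing and the combination with other evidence of expertise are left out of this sketch.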

Keywords

Query Term; Query Expansion; Mean Average Precision; Mean Reciprocal Rank; Expert Find
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Pavel Serdyukov (1)
  • Djoerd Hiemstra (1)
  1. Database Group, University of Twente, Enschede, The Netherlands
