Modeling Documents as Mixtures of Persons for Expert Finding
Conference paper
Abstract
In this paper we address the problem of searching for knowledgeable persons within the enterprise, known as the expert finding (or expert search) task. We present a probabilistic algorithm based on the assumption that the terms in a document are produced by the people mentioned in it. We represent the documents retrieved for a query as mixtures of the candidate experts' language models. We propose two methods for extracting personal language models, as well as a way of combining them with other evidence of expertise. Experiments conducted on the TREC Enterprise collection show that our approach outperforms the best of the existing solutions.
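The mixture assumption in the abstract can be illustrated with a small sketch. It is one possible reading of the idea, not the algorithm from the paper: the abstract gives no equations, so the uniform weight for each mentioned person, the fixed background weight, the EM estimation, the query-likelihood ranking, and the names `estimate_person_models` and `rank_candidates` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def estimate_person_models(docs, background_weight=0.5, iterations=20):
    """Estimate P(term | person) with EM (illustrative assumption).

    docs: list of (tokens, people) pairs, where people are the candidates
    mentioned in the document. Each document is treated as a mixture of a
    collection-wide background model and the models of its mentioned people,
    with the non-background mass split uniformly among them.
    """
    counts = [Counter(tokens) for tokens, _ in docs]
    total = Counter()
    for c in counts:
        total.update(c)
    n_tokens = sum(total.values())
    background = {t: n / n_tokens for t, n in total.items()}
    vocab = list(background)

    candidates = {e for _, people in docs for e in people}
    # Start every personal model from a uniform distribution.
    models = {e: {t: 1.0 / len(vocab) for t in vocab} for e in candidates}

    for _ in range(iterations):
        expected = {e: defaultdict(float) for e in candidates}
        # E-step: attribute each term occurrence to the people in the document.
        for c, (_, people) in zip(counts, docs):
            if not people:
                continue
            share = (1.0 - background_weight) / len(people)
            for t, n in c.items():
                denom = background_weight * background[t] + sum(
                    share * models[e][t] for e in people
                )
                for e in people:
                    expected[e][t] += n * share * models[e][t] / denom
        # M-step: renormalize the expected counts into distributions.
        for e in candidates:
            z = sum(expected[e].values())
            if z > 0:
                models[e] = {t: expected[e][t] / z for t in vocab}
    return models, background


def rank_candidates(query_terms, models, background, smoothing=0.1):
    """Rank people by smoothed query log-likelihood under their models."""
    scores = {}
    for e, model in models.items():
        log_p = 0.0
        for q in query_terms:
            p = (1 - smoothing) * model.get(q, 0.0) + smoothing * background.get(q, 0.0)
            if p > 0:  # terms unseen in the collection are skipped for everyone
                log_p += math.log(p)
        scores[e] = log_p
    return sorted(scores, key=scores.get, reverse=True)


# Toy data with hypothetical names, not taken from the TREC collection.
docs = [
    (["retrieval", "ranking", "retrieval", "index"], ["alice"]),
    (["protein", "genome", "protein"], ["bob"]),
    (["retrieval", "protein"], ["alice", "bob"]),
]
models, background = estimate_person_models(docs)
print(rank_candidates(["retrieval"], models, background))
```

In this toy setup the shared third document is explained mostly by whichever person's other documents already account for each term, which is the intuition behind treating a document as a mixture of the people mentioned in it.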
Keywords
Query Term, Query Expansion, Mean Average Precision, Mean Reciprocal Rank, Expert Find
These keywords were added by machine and not by the authors.
Copyright information
© Springer-Verlag Berlin Heidelberg 2008