Knowledge and Information Systems, Volume 16, Issue 3, pp 259–280

Voting techniques for expert search

Regular Paper

Abstract

In an expert search task, users need to identify people who have relevant expertise on a topic of interest. An expert search system predicts and ranks the expertise of a set of candidate persons with respect to the users’ query. In this paper, we propose a novel approach for predicting and ranking candidate expertise with respect to a query, called the Voting Model for Expert Search. In the Voting Model, we treat the problem of ranking experts as a voting problem, which we model using 12 voting techniques inspired by the data fusion field. We investigate the effectiveness of the Voting Model and its associated voting techniques across a range of document weighting models, in the context of the TREC 2005 and TREC 2006 Enterprise tracks. The evaluation results show that the voting paradigm is very effective, without using any query-specific or collection-specific heuristics. Moreover, we show that improving the quality of the underlying document representation can significantly improve the retrieval performance of the voting techniques on an expert search task. In particular, we demonstrate that applying field-based weighting models improves the ranking of candidates. Finally, we demonstrate that the relative performance of the voting techniques is stable on a given task regardless of the weighting models used, suggesting that some of the proposed voting techniques will always perform better than others.
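The abstract does not spell out the individual voting techniques, so the following is only a minimal sketch of the general idea, assuming that each retrieved document associated with a candidate counts as a vote for that candidate. The function name `rank_candidates`, the input dictionaries, and the choice of three aggregation rules (a plain vote count, plus the CombSUM and CombMNZ rules known from data fusion) are illustrative assumptions, not the authors' exact formulation.

```python
from collections import defaultdict


def rank_candidates(doc_scores, doc_candidates, technique="combsum"):
    """Rank candidates by aggregating the scores of their associated documents.

    doc_scores:     dict mapping doc_id -> retrieval score for the query
                    (hypothetical output of any document weighting model).
    doc_candidates: dict mapping doc_id -> list of candidates associated
                    with that document (hypothetical candidate profiles).
    technique:      "votes", "combsum" or "combmnz" (illustrative subset).
    """
    votes = defaultdict(int)        # number of retrieved documents per candidate
    score_sum = defaultdict(float)  # sum of document scores per candidate

    for doc_id, score in doc_scores.items():
        for candidate in doc_candidates.get(doc_id, ()):
            votes[candidate] += 1
            score_sum[candidate] += score

    if technique == "votes":
        final = {c: float(n) for c, n in votes.items()}
    elif technique == "combsum":
        final = dict(score_sum)
    elif technique == "combmnz":
        final = {c: votes[c] * score_sum[c] for c in votes}
    else:
        raise ValueError(f"unknown technique: {technique}")

    # Highest aggregated score first; ties broken alphabetically by candidate.
    return sorted(final.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    doc_scores = {"d1": 5.0, "d2": 2.0, "d3": 2.0, "d4": 0.5}
    doc_candidates = {"d1": ["alice"], "d2": ["bob"], "d3": ["bob"], "d4": ["bob"]}
    for technique in ("votes", "combsum", "combmnz"):
        print(technique, rank_candidates(doc_scores, doc_candidates, technique))
```

In this toy input, one candidate has a single highly scored document and the other has several weaker ones, so the three rules disagree on who ranks first. This is the kind of difference between voting techniques that the evaluation in the paper compares.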

Keywords

Voting, Expert finding, Expertise modelling, Expert search, Information retrieval, Ranking, Data fusion



Copyright information

© Springer-Verlag London Limited 2007

Authors and Affiliations

  1. Department of Computing Science, University of Glasgow, Glasgow, Scotland, UK
