Ranking Users for Intelligent Message Addressing

  • Vitor R. Carvalho
  • William W. Cohen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4956)

Abstract

Finding persons who are knowledgeable on a given topic (i.e., Expert Search) has become an active area of recent research [1,2,3]. In this paper we investigate the related task of Intelligent Message Addressing, i.e., finding persons who are potential recipients of a message under composition, given its current contents, its previously specified recipients, or a few initial letters of the intended recipient contact (intelligent auto-completion). We begin by providing quantitative evidence, from a very large corpus, of how frequently email users are subject to message addressing problems. We then propose several techniques for this task, including adaptations of well-known formal models of Expert Search. Surprisingly, a simple model based on the K-Nearest-Neighbors algorithm consistently outperformed all other methods. We also investigated combinations of the proposed methods using fusion techniques, which led to significant performance improvements over the baseline models. In auto-completion experiments, the proposed models also outperformed all standard baselines. Overall, the proposed techniques showed ranking performance of more than 0.5 in MRR over 5202 queries from 36 different email users, suggesting that intelligent message addressing can be a welcome addition to email.
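
To make the reported metric concrete, the following is a minimal sketch of how Mean Reciprocal Rank (MRR) can be computed for a recipient-ranking task. It assumes the common definition in which each query contributes the reciprocal of the rank of the first correct recipient (and 0 if no correct recipient is ranked); the abstract does not spell out the exact variant used in the paper. The function names and the toy addresses are hypothetical and are not taken from the paper or its data.

```python
from typing import Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant candidate, or 0.0 if none is ranked."""
    for position, candidate in enumerate(ranking, start=1):
        if candidate in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over all (ranking, true recipients) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranking, relevant) for ranking, relevant in runs) / len(runs)


# Hypothetical toy queries: (ranked candidate recipients, true recipients).
queries = [
    (["alice", "bob", "carol"], {"bob"}),   # first correct recipient at rank 2
    (["dave", "erin"], {"dave", "erin"}),   # first correct recipient at rank 1
    (["frank", "grace"], {"heidi"}),        # no correct recipient in the ranking
]

mrr = mean_reciprocal_rank(queries)
print(mrr)
```

Under this definition, an MRR above 0.5 means that, averaged over queries, the first correct recipient tends to appear at roughly rank 2 or better.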

Keywords

Mean Average Precision; Prediction Task; Initial Letter; Mean Reciprocal Rank; Email Client
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Balog, K., Azzopardi, L., de Rijke, M.: Formal models for expert finding in enterprise corpora. In: SIGIR 2006 (2006)
  2. Fang, H., Zhai, C.: Probabilistic models for expert finding. In: Amati, G., Carpineto, C., Romano, G. (eds.) ECIR 2007. LNCS, vol. 4425, pp. 418–430. Springer, Heidelberg (2007)
  3. Macdonald, M., Ounis, I.: Voting for candidates: Adapting data fusion techniques for an expert search task. In: CIKM, Arlington, USA, November 6-11 (2006)
  4. Carvalho, V.R., Cohen, W.W.: Predicting recipients in the Enron email corpus. Technical Report CMU-LTI-07-005 (2007)
  5. Shetty, J., Adibi, J.: Enron email dataset. Technical report, USC Information Sciences Institute (2004), http://www.isi.edu/~adibi/Enron/Enron.htm
  6. Cohen, W.W.: Enron Email Dataset Webpage, http://www.cs.cmu.edu/~enron/
  7. Joachims, T.: A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization. In: Proceedings of the ICML 1997 (1997)
  8. Salton, G., Buckley, C.: Term weighting approaches in automatic text retrieval. Information Processing and Management 24(5), 513–523 (1988)
  9. Yang, Y., Liu, X.: A re-examination of text categorization methods. In: 22nd Annual International SIGIR, August 1999, pp. 42–49 (1999)
  10. Klimt, B., Yang, Y.: The Enron corpus: A new dataset for email classification research. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) ECML 2004. LNCS (LNAI), vol. 3201, pp. 217–226. Springer, Heidelberg (2004)
  11. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley, Reading (1999)
  12. Aslam, J.A., Montague, M.: Models for metasearch. In: Proceedings of ACM SIGIR, pp. 276–284 (2001)
  13. Ogilvie, P., Callan, J.P.: Combining document representation for known item search. In: ACM SIGIR (2003)
  14. Dom, B., Eiron, I., Cozzi, A., Zhang, Y.: Graph-based ranking algorithms for e-mail expertise analysis. In: Data Mining and Knowledge Discovery Workshop (DMKD2003) in ACM SIGMOD (2003)
  15. Campbell, C.S., Maglio, P.P., Cozzi, A., Dom, B.: Expertise identification using email communications. In: CIKM (2003)
  16. Sihn, W., Heeren, F.: Expert finding within specified subject areas through analysis of e-mail communication. In: Proceedings of the Euromedia 2001 (2001)
  17. Pal, C., McCallum, A.: Cc prediction with graphical models. In: CEAS (2006)
  18. Carvalho, V.R., Cohen, W.W.: Preventing information leaks in email. In: Proceedings of SIAM International Conference on Data Mining (SDM 2007), Minneapolis, MN (2007)

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Vitor R. Carvalho (1)
  • William W. Cohen (1, 2)
  1. Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA
  2. Machine Learning Department, Carnegie Mellon University, Pittsburgh, USA
