Applying Maximum Entropy to Known-Item Email Retrieval

  • Sirvan Yahyaei
  • Christof Monz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4956)


It is becoming increasingly common in information retrieval to combine evidence from multiple sources when computing the retrieval status value of documents. Although this has led to considerable improvements in several retrieval tasks, one outstanding issue is how to estimate the weights that should be associated with the different sources of evidence. In this paper we propose to use maximum entropy in combination with the limited-memory BFGS (L-BFGS) algorithm to estimate feature weights. An evaluation on the known-item finding task of the TREC Enterprise track shows that our approach significantly outperforms a standard retrieval baseline and leads to competitive performance.
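To make the weight-estimation idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the binary conditional maximum entropy model (equivalent to logistic regression) over a handful of per-document evidence scores, and fits the weights with a small hand-written L-BFGS loop. The feature layout (bias, subject score, body score, thread score), the toy training pairs, the L2 penalty, and all function names are hypothetical illustrations; the paper's actual features and settings are not given in this abstract.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def neg_log_likelihood(w, data, l2=0.1):
    """Return (loss, gradient) of the L2-regularised negative log-likelihood."""
    loss = 0.5 * l2 * dot(w, w)
    grad = [l2 * wi for wi in w]
    for features, relevant in data:
        p = min(max(sigmoid(dot(w, features)), 1e-12), 1.0 - 1e-12)
        loss -= math.log(p) if relevant else math.log(1.0 - p)
        for j, f in enumerate(features):
            grad[j] += (p - relevant) * f
    return loss, grad

def lbfgs(fn, w, memory=5, iterations=50, tol=1e-6):
    """Minimise fn (returning loss and gradient) with limited-memory BFGS."""
    loss, grad = fn(w)
    history = []  # most recent (s, y, rho) correction pairs, oldest first
    for _ in range(iterations):
        if math.sqrt(dot(grad, grad)) < tol:
            break
        # Two-loop recursion: apply the approximate inverse Hessian to grad.
        q, alphas = list(grad), []
        for s, y, rho in reversed(history):
            a = rho * dot(s, q)
            alphas.append(a)
            q = [qi - a * yi for qi, yi in zip(q, y)]
        if history:
            s, y, _ = history[-1]
            gamma = dot(s, y) / dot(y, y)
            q = [gamma * qi for qi in q]
        for (s, y, rho), a in zip(history, reversed(alphas)):
            b = rho * dot(y, q)
            q = [qi + (a - b) * si for qi, si in zip(q, s)]
        direction = [-qi for qi in q]
        # Backtracking line search with the Armijo sufficient-decrease test.
        step, slope = 1.0, dot(grad, direction)
        while True:
            new_w = [wi + step * di for wi, di in zip(w, direction)]
            new_loss, new_grad = fn(new_w)
            if new_loss <= loss + 1e-4 * step * slope or step < 1e-10:
                break
            step *= 0.5
        s = [a - b for a, b in zip(new_w, w)]
        y = [a - b for a, b in zip(new_grad, grad)]
        sy = dot(s, y)
        if sy > 1e-12:  # keep only pairs that preserve positive curvature
            history = (history + [(s, y, 1.0 / sy)])[-memory:]
        w, loss, grad = new_w, new_loss, new_grad
    return w, loss

# Hypothetical training pairs: ([bias, subject, body, thread], is_target_item).
TRAINING = [
    ([1.0, 0.9, 0.4, 0.2], 1), ([1.0, 0.8, 0.7, 0.1], 1),
    ([1.0, 0.7, 0.2, 0.6], 1), ([1.0, 0.2, 0.5, 0.3], 0),
    ([1.0, 0.1, 0.6, 0.2], 0), ([1.0, 0.3, 0.3, 0.5], 0),
]

initial_loss, _ = neg_log_likelihood([0.0] * 4, TRAINING)
weights, final_loss = lbfgs(lambda w: neg_log_likelihood(w, TRAINING), [0.0] * 4)
```

In this toy data only the subject score separates the two classes, so the fitted weight on that feature should come out positive and dominant. At query time, documents could then be ranked by `sigmoid(dot(weights, features))`; that ranking rule is also part of the sketch's assumptions rather than something stated in the abstract.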


Keywords (added by machine, not by the authors): Information Retrieval, Maximum Entropy, Maximum Entropy Method, Retrieval Task, Maximum Entropy Principle





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Sirvan Yahyaei (1)
  • Christof Monz (1)
  1. Department of Computer Science, Queen Mary University of London, London, UK
