An evaluation of term dependence models in information retrieval

  • G. Salton
  • C. Buckley
  • C. T. Yu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 146)

Abstract

In practical retrieval environments the assumption is normally made that the terms assigned to the documents of a collection occur independently of each other. The term independence assumption is unrealistic in many cases, but its use leads to a simple retrieval algorithm. More realistic retrieval systems take into account dependencies between certain term pairs and possibly between term triples. In this study, methods are outlined for generating dependency factors for term pairs and term triples and for using them in retrieval. Evaluation output is included to demonstrate the effectiveness of the suggested methodologies.
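To make the idea of a pairwise dependency factor concrete, the following is a minimal sketch, not the method of the paper (which is not reproduced on this page). It assumes one simple and common choice of factor: the ratio of the observed co-occurrence probability of two terms to the probability expected if the terms occurred independently. The function name `pair_dependence` and the toy collection are hypothetical and used only for illustration.

```python
from itertools import combinations


def pair_dependence(docs):
    """Return {(a, b): P(a, b) / (P(a) * P(b))} for term pairs that co-occur.

    Assumption for illustration only: probabilities are estimated as plain
    document frequencies over the collection. A value near 1 is consistent
    with independence; values above 1 suggest positive dependence.
    """
    n = len(docs)
    term_count = {}
    pair_count = {}
    for doc in docs:
        terms = sorted(set(doc))  # count each term at most once per document
        for t in terms:
            term_count[t] = term_count.get(t, 0) + 1
        for a, b in combinations(terms, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    # P(a,b) / (P(a) P(b)) simplifies to count(a,b) * n / (count(a) * count(b))
    return {
        (a, b): c * n / (term_count[a] * term_count[b])
        for (a, b), c in pair_count.items()
    }


# Hypothetical four-document collection.
docs = [
    ["information", "retrieval"],
    ["information", "retrieval"],
    ["information", "theory"],
    ["graph", "theory"],
]
factors = pair_dependence(docs)
```

In this toy collection, "information" and "retrieval" co-occur more often than independence would predict, so their factor exceeds 1, while "information" and "theory" fall below 1. The same counting idea could in principle be extended to term triples, at the cost of many more combinations to track.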



Copyright information

© Springer-Verlag 1983

Authors and Affiliations

  • G. Salton (1)
  • C. Buckley (1)
  • C. T. Yu (2)
  1. Department of Computer Science, Cornell University, Ithaca, USA
  2. Department of Information Engineering, University of Illinois/Chicago Circle, Chicago, USA
