Term weighting for information retrieval based on term’s discrimination power

Multimedia Tools and Applications

Abstract

Term weighting for document ranking and retrieval, as in TF*IDF and BM25, is one of the most important research topics in Information Retrieval. We propose a term weighting method that uses past retrieval results: the queries that contain a particular term, the documents retrieved for those queries, and their relevance judgments. A term's Discrimination Power (DP) is based on the difference between the term's average weight in the relevant retrieved documents and its average weight in the non-relevant retrieved documents. This difference-based DP performs better than the ratio-based DP introduced in previous research. Our experimental results show that a term weighting scheme based on discrimination power outperforms a TF*IDF-based scheme.
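To make the distinction between the two DP variants concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give the exact formulas, so the function names, the input layout (a list of `(term_weight, is_relevant)` pairs collected from past retrieval results for queries containing the term), the handling of empty sets, and the `eps` guard in the ratio are all assumptions made for illustration.

```python
from statistics import mean


def average_weights(judged_docs):
    """Average the term's weight over relevant and non-relevant retrieved docs.

    judged_docs: iterable of (term_weight, is_relevant) pairs taken from past
    retrieval results for queries containing the term (hypothetical layout).
    An empty set averages to 0.0 here; that convention is an assumption.
    """
    docs = list(judged_docs)
    relevant = [w for w, is_rel in docs if is_rel]
    non_relevant = [w for w, is_rel in docs if not is_rel]
    avg_rel = mean(relevant) if relevant else 0.0
    avg_non = mean(non_relevant) if non_relevant else 0.0
    return avg_rel, avg_non


def dp_difference(judged_docs):
    """Difference-based DP: average relevant weight minus average non-relevant weight."""
    avg_rel, avg_non = average_weights(judged_docs)
    return avg_rel - avg_non


def dp_ratio(judged_docs, eps=1e-9):
    """Ratio-based DP, shown for comparison; eps (an assumption) avoids division by zero."""
    avg_rel, avg_non = average_weights(judged_docs)
    return avg_rel / (avg_non + eps)


# Toy data: the term weighs more in relevant documents than in non-relevant ones.
past_results = [(0.5, True), (0.75, True), (0.25, False), (0.25, False)]
print(dp_difference(past_results))
print(dp_ratio(past_results))
```

In this sketch the difference stays bounded when the term's average weight in non-relevant documents is close to zero, whereas the ratio grows very large. That is one plausible reason the two variants could behave differently, although the abstract itself does not state why the difference-based DP performs better.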

Author information

Corresponding author

Correspondence to Sa-kwang Song.

About this article

Cite this article

Li, Q., Lee, S., Jung, H. et al. Term weighting for information retrieval based on term’s discrimination power. Multimed Tools Appl 71, 769–781 (2014). https://doi.org/10.1007/s11042-013-1420-1

  • DOI: https://doi.org/10.1007/s11042-013-1420-1
