Inspecting Retrieval Engines Based on Term’s Weight

  • Conference paper
Computer Science and its Applications

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 330))

Abstract

We introduce a revised method for inspecting retrieval engines based on a term’s weight, which we call the discrimination power of a term. The aim of this methodology is to help engine developers find the indexing or query processing components that cause problems in ranking retrieved documents. The internal processes of indexing and query processing are modeled as a Bayesian inference network. The nodes in this model can be terms, phrases, sentences, documents, etc. The layers represent the internal step-by-step processing components, such as stemming, tokenizing, POS tagging, and parsing. By computing and showing the extent to which each node contributes to a relevant (or irrelevant) document being ranked high (or low), the proposed method can signal which nodes should be examined and hence which components should be modified. We argue that the proposed method can be used for automatic failure analysis of a system with test runs, and that the system’s effectiveness can then be improved, even automatically.
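The idea of measuring how much each node contributes to a ranking can be illustrated with a toy example. The following is a minimal sketch only, not the authors' Bayesian inference network: it assumes a plain additive TF-IDF score in place of network inference, and the function names (`term_contributions`, `blame_terms`) and the sample data are hypothetical. It shows how per-term contributions can point to the term, and hence the processing step, that lets an irrelevant document outrank a relevant one.

```python
import math
from collections import Counter


def term_contributions(query_terms, doc_tokens, doc_freq, num_docs):
    """Return each query term's additive TF-IDF contribution to a document's score.

    Assumption: a simple smoothed TF-IDF stands in for the paper's term weight.
    """
    tf = Counter(doc_tokens)
    contributions = {}
    for term in query_terms:
        idf = math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))
        contributions[term] = tf[term] * idf  # Counter yields 0 for absent terms
    return contributions


def blame_terms(query_terms, relevant_doc, irrelevant_doc, doc_freq, num_docs):
    """Rank query terms by how strongly they favour the irrelevant document."""
    rel = term_contributions(query_terms, relevant_doc, doc_freq, num_docs)
    irr = term_contributions(query_terms, irrelevant_doc, doc_freq, num_docs)
    gaps = {term: irr[term] - rel[term] for term in query_terms}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: the relevant document contains "engine", which an
# unstemmed index fails to match against the query term "engines".
query = ["retrieval", "engines"]
relevant = ["retrieval", "engine", "inspection"]
irrelevant = ["engines", "engines", "car"]
doc_freq = {"retrieval": 2, "engines": 1}

ranking = blame_terms(query, relevant, irrelevant, doc_freq, num_docs=10)
print(ranking)
```

In this toy setting the term "engines" comes out on top, because it contributes only to the irrelevant document's score. A developer would read that as a hint to examine the stemming step first.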

Author information

Corresponding author

Correspondence to Sa-kwang Song.

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Song, Sk., Shin, S., Kim, YM., Seon, CN., Hong, S., Jung, H. (2015). Inspecting Retrieval Engines Based on Term’s Weight. In: Park, J., Stojmenovic, I., Jeong, H., Yi, G. (eds) Computer Science and its Applications. Lecture Notes in Electrical Engineering, vol 330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45402-2_162

  • DOI: https://doi.org/10.1007/978-3-662-45402-2_162

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45401-5

  • Online ISBN: 978-3-662-45402-2

  • eBook Packages: Engineering, Engineering (R0)
