Inspecting Retrieval Engines Based on Term’s Weight

  • Conference paper

Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 330)

Abstract

We introduce a revised method for inspecting retrieval engines based on term weight, which we call the discrimination power of a term. The aim of this methodology is to help engine developers find the indexing or query processing components that cause problems in ranking retrieved documents. The internal processes of indexing and query processing are modeled as a Bayesian inference network. The nodes in this model can be terms, phrases, sentences, documents, etc. The layers represent the internal step-by-step processing components, such as stemming, tokenizing, POS tagging, and parsing. By computing and showing the extent to which each node contributes to a relevant (or irrelevant) document being ranked high (or low), the proposed method can signal the nodes to be examined and hence the components to be modified. We argue that the proposed method can be used for automatic failure analysis of a system with test runs, and that its effectiveness can even be improved automatically.
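
The idea of attributing a ranking failure to individual nodes and then to processing layers can be illustrated with a small sketch. This is not the paper's Bayesian inference network: it assumes a simple additive term-weight score, and the function names, weights, layer assignments, and documents below are all hypothetical. Each node's contribution is its weight times the difference in its count between a relevant and an irrelevant document, and negative contributions are summed per layer to suggest which component to examine first.

```python
from collections import defaultdict


def node_contributions(weights, relevant_doc, irrelevant_doc):
    """Contribution of each node to the score margin (relevant minus irrelevant).

    Assumes an additive score: sum of weight * count over nodes (an
    illustrative stand-in, not the inference network from the paper).
    """
    return {
        node: w * (relevant_doc.get(node, 0) - irrelevant_doc.get(node, 0))
        for node, w in weights.items()
    }


def suspect_layers(contributions, node_layer):
    """Sum the negative contributions per layer, most negative first."""
    totals = defaultdict(float)
    for node, c in contributions.items():
        if c < 0:
            totals[node_layer[node]] += c
    return sorted(totals.items(), key=lambda kv: kv[1])


# Hypothetical term weights and the processing layer that produced each node.
weights = {"retriev": 2.0, "engine": 1.0, "engin": 1.5, "the": 0.5}
node_layer = {
    "retriev": "stemming",
    "engin": "stemming",
    "engine": "tokenizing",
    "the": "tokenizing",
}

# Hypothetical node counts in one relevant and one irrelevant document.
relevant = {"retriev": 1, "engine": 2}
irrelevant = {"engin": 3, "the": 4}

contrib = node_contributions(weights, relevant, irrelevant)
ranking = suspect_layers(contrib, node_layer)
print(ranking)
```

In this toy data the nodes that favor the irrelevant document come mostly from the stemming layer, so that layer is listed first as the candidate for inspection.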

Keywords

  • Failure Inspection
  • Discrimination Power
  • Retrieval Engine
  • Bayesian Inference Network
  • Term Weighting

Chapter
  • DOI: 10.1007/978-3-662-45402-2_162
  • Chapter length: 6 pages

Corresponding author

Correspondence to Sa-kwang Song.

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Song, Sk., Shin, S., Kim, YM., Seon, CN., Hong, S., Jung, H. (2015). Inspecting Retrieval Engines Based on Term’s Weight. In: Park, J., Stojmenovic, I., Jeong, H., Yi, G. (eds) Computer Science and its Applications. Lecture Notes in Electrical Engineering, vol 330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45402-2_162

  • DOI: https://doi.org/10.1007/978-3-662-45402-2_162

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45401-5

  • Online ISBN: 978-3-662-45402-2

  • eBook Packages: Engineering, Engineering (R0)