Abstract
We introduce a revised method for inspecting retrieval engines based on a term weight that we call the discrimination power of a term. The methodology aims to help engine developers find the indexing or query processing components that cause problems in ranking retrieved documents. The internal processes of indexing and query processing are modeled as a Bayesian inference network. The nodes in this model can be terms, phrases, sentences, documents, etc. The layers represent the internal, step-by-step processing components such as stemming, tokenizing, POS tagging, and parsing. By computing and showing the extent to which each node contributes to a relevant (or irrelevant) document being ranked high (or low), the proposed method can signal which nodes should be examined and hence which components should be modified. We argue that the proposed method can be used for automatic failure analysis of a system with test runs, and that its effectiveness can even be improved automatically.
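The abstract does not give the formula for discrimination power or the inference-network computation, so the following is only a rough sketch of the general idea under stated assumptions, not the authors' method. It assumes that each retrieved document's score can be broken down into per-node contributions, and it approximates a node's discrimination as its mean contribution to relevant documents minus its mean contribution to non-relevant documents. The function names, the node labels (prefixed with a hypothetical component name), and the toy data are all illustrative.

```python
from collections import defaultdict


def discrimination_by_node(scored_docs, relevant_ids):
    """Per node: mean contribution to relevant docs minus mean contribution
    to non-relevant docs. This difference is an assumed stand-in for the
    paper's discrimination power, not its actual definition."""
    # node -> [sum over relevant docs, sum over non-relevant docs]
    totals = defaultdict(lambda: [0.0, 0.0])
    n_rel = sum(1 for doc_id, _ in scored_docs if doc_id in relevant_ids)
    n_non = len(scored_docs) - n_rel
    for doc_id, contributions in scored_docs:
        slot = 0 if doc_id in relevant_ids else 1
        for node, value in contributions.items():
            totals[node][slot] += value
    return {
        node: (rel / n_rel if n_rel else 0.0) - (non / n_non if n_non else 0.0)
        for node, (rel, non) in totals.items()
    }


def suspicious_nodes(discrimination, threshold=0.0):
    """Nodes whose contribution favours non-relevant documents, worst first."""
    flagged = [(node, d) for node, d in discrimination.items() if d < threshold]
    return sorted(flagged, key=lambda item: item[1])


# Toy test run (hypothetical): each document carries the score contribution
# of every node, labelled "<component>:<node>".
scored_docs = [
    ("d1", {"stem:retriev": 0.6, "phrase:search engine": 0.1}),
    ("d2", {"stem:retriev": 0.2, "phrase:search engine": 0.7}),
    ("d3", {"stem:retriev": 0.5, "phrase:search engine": 0.0}),
    ("d4", {"stem:retriev": 0.1, "phrase:search engine": 0.9}),
]
relevant_ids = {"d1", "d3"}

discrimination = discrimination_by_node(scored_docs, relevant_ids)
flagged = suspicious_nodes(discrimination)
for node, value in flagged:
    print(f"{node}: {value:+.2f}")
```

In this toy run the phrase node contributes mostly to the non-relevant documents, so it is the one flagged for inspection, which in turn points at the (hypothetical) phrase-processing component.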
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Song, Sk., Shin, S., Kim, YM., Seon, CN., Hong, S., Jung, H. (2015). Inspecting Retrieval Engines Based on Term’s Weight. In: Park, J., Stojmenovic, I., Jeong, H., Yi, G. (eds) Computer Science and its Applications. Lecture Notes in Electrical Engineering, vol 330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45402-2_162
DOI: https://doi.org/10.1007/978-3-662-45402-2_162
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45401-5
Online ISBN: 978-3-662-45402-2
eBook Packages: Engineering, Engineering (R0)