Abstract
Information Retrieval (IR) systems combine a variety of techniques stemming from logical, vector-space and probabilistic models. These combinations have produced a significant increase in retrieval effectiveness since the early 1990s. Nevertheless, the quest for new frameworks has been no less intense than research on optimizing and experimenting with the most common retrieval models. This paper presents a new framework for IR based on the Discrete Fourier Transform (DFT). The model represents each query term as a sine curve and a query as the sum of these curves, which gives the query an elegant and sound mathematical form. The sinusoidal representation of the query is transformed from the time domain to the frequency domain through the DFT, and the result is a spectrum. Each document in the collection corresponds to a set of filters, and retrieval corresponds to filtering the spectrum: for each document, the spectrum is filtered and the result is a power. Documents are then ranked by this power: the more a document decreases the power of the spectrum, the higher its rank. This paper is mainly theoretical, and the retrieval algorithm is reported to suggest the feasibility of the proposed model. Small-scale experiments carried out to test the effectiveness of the algorithm indicate performance comparable to the state of the art.
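The pipeline described in the abstract (sum of sines, DFT, per-document filtering, ranking by residual power) can be illustrated with a minimal sketch. The abstract does not specify how frequencies are assigned to terms or how a document's filters are built, so everything concrete here is an assumption made for illustration: each query term gets its own integer frequency, a document acts as a set of notch filters that attenuate the bins of the terms it contains, and the hypothetical `doc_weights` values in [0, 1] control the attenuation. The function names are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def query_spectrum(query_terms, n_samples=64):
    """Build the query signal as a sum of sines and return its DFT.

    Assumption: the i-th query term is assigned the integer frequency
    i + 1 (cycles per window), so each term falls in its own DFT bin.
    n_samples must exceed twice the number of terms to avoid aliasing.
    """
    t = np.arange(n_samples) / n_samples
    freqs = {term: i + 1 for i, term in enumerate(query_terms)}
    signal = np.zeros(n_samples)
    for f in freqs.values():
        signal += np.sin(2 * np.pi * f * t)
    return np.fft.rfft(signal), freqs


def residual_power(spectrum, freqs, doc_weights):
    """Filter the spectrum with one document and return the remaining power.

    Assumption: doc_weights maps a term to a weight in [0, 1]; the bin of
    that term is scaled by (1 - weight), i.e. a simple notch filter.
    """
    filtered = spectrum.copy()
    for term, k in freqs.items():
        filtered[k] *= 1.0 - doc_weights.get(term, 0.0)
    return float(np.sum(np.abs(filtered) ** 2))


def rank(query_terms, docs):
    """Rank documents so that the largest power reduction comes first."""
    spectrum, freqs = query_spectrum(query_terms)
    powers = {name: residual_power(spectrum, freqs, w) for name, w in docs.items()}
    return sorted(powers, key=powers.get)


# Hypothetical toy collection: term weights per document.
docs = {
    "d1": {"fourier": 0.9, "retrieval": 0.8},
    "d2": {"fourier": 0.2},
    "d3": {},
}
print(rank(["fourier", "retrieval"], docs))
```

In this toy setting a document that matches both query terms with high weights removes most of the spectrum's power and is ranked first, while a document sharing no terms leaves the power unchanged and is ranked last.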
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Costa, A., Melucci, M. (2010). An Information Retrieval Model Based on Discrete Fourier Transform. In: Cunningham, H., Hanbury, A., Rüger, S. (eds) Advances in Multidisciplinary Retrieval. IRFC 2010. Lecture Notes in Computer Science, vol 6107. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13084-7_8
DOI: https://doi.org/10.1007/978-3-642-13084-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13083-0
Online ISBN: 978-3-642-13084-7
eBook Packages: Computer Science, Computer Science (R0)