Inverted index; Full text inverted index; Postings file
An inverted file is an index data structure that maps content to its locations in a database file, in a document, or in a set of documents. It normally consists of (i) a vocabulary containing all the distinct words found in a text and (ii) for each word t of the vocabulary, a list containing statistics about the occurrences of t in the text. This list is known as the inverted list of t. The inverted file is the most popular data structure used in document retrieval systems to support full-text search.
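The two components described above can be illustrated with a minimal Python sketch. It is not taken from the entry itself: the function names `build_inverted_file` and `documents_containing`, the lowercase word tokenization, and the choice of storing word positions as the per-occurrence statistics are all assumptions made for illustration. Real systems typically store other statistics (such as term frequencies) and compress the lists.

```python
import re
from collections import defaultdict


def build_inverted_file(docs):
    """Build a toy inverted file from a list of document strings.

    Returns a dict mapping each distinct word t (the vocabulary) to its
    inverted list: a list of (doc_id, positions) pairs, one pair per
    document that contains t. Storing positions is an illustrative choice.
    """
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        # Collect the positions of every word within this document.
        positions = defaultdict(list)
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            positions[word].append(pos)
        # Append one entry per word; doc ids stay in increasing order.
        for word, pos_list in positions.items():
            index[word].append((doc_id, pos_list))
    return dict(index)


def documents_containing(index, word):
    """Return the ids of the documents whose text contains the word."""
    return [doc_id for doc_id, _ in index.get(word.lower(), [])]


docs = ["the cat sat", "the dog sat on the mat"]
index = build_inverted_file(docs)
vocabulary = sorted(index)  # all distinct words found in the text
```

Here `index["the"]` is the inverted list of the word "the", and a lookup such as `documents_containing(index, "dog")` only touches that word's list rather than scanning every document, which is the property that makes the structure suitable for full-text search.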
Efforts to index electronic texts appear in the literature from the earliest days of computing. For example, descriptions of electronic information search systems that are able to index and search text can be found in the early 1950s.
In a seminal work, Gerard Salton wrote a book in 1968, containing the basis for the modern inf ...
- Baeza-Yates R. and Ribeiro-Neto B. Modern Information Retrieval. Addison Wesley, Reading, MA, 1999.
- Kaszkiel M. and Zobel J. Term-ordered query evaluation versus document-ordered query evaluation for large document databases. In Proc. 21st Annual Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, 1998, pp. 343–344.
- Long X. and Suel T. Three-level caching for efficient query processing in large Web search engines. In Proc. 14th Int. World Wide Web Conference, 2005, pp. 257–266.
- Luhn H.P. A statistical approach to mechanized encoding and searching of literary information. IBM J. Res. and Dev., 309–317, October 1957.
- de Moura E.S., dos Santos C.F., Fernandes D.R., Silva A.S., Calado P., and Nascimento M.A. Improving web search efficiency via a locality based static pruning method. In Proc. 14th Int. World Wide Web Conference, 2005, pp. 235–244.
- Salton G. Automatic Information Organization and Retrieval. McGraw-Hill, New York, NY, 1968.
- Witten I., Moffat A., and Bell T. Managing Gigabytes. Morgan Kaufmann, Los Altos, CA, 1999.
- Zobel J. and Moffat A. Inverted files for text search engines. ACM Comput. Surv., 38:1–56, 2006.
Inverted Files. In Encyclopedia of Database Systems, pp. 1571–1574. Springer US.
Editor affiliations: College of Computing, Georgia Institute of Technology; Database Research Group, David R. Cheriton School of Computer Science, University of Waterloo.
Author affiliations: Federal University of Amazonas, Manaus, Brazil; FUCAPI, Manaus, Brazil.