Abstract
In this paper we present a new approach to the compression of inverted lists in the indexes of information retrieval systems. The technique exploits contextual information obtained from an unsupervised clustering process run on the document collection. A substantial improvement in the compression factor is achieved.
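The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea that cluster information can make inverted lists more compressible. It is not the authors' method. It assumes a toy collection of 1000 documents with hypothetical cluster labels (`doc_id % 4`), renumbers documents so that members of the same cluster receive consecutive identifiers, and compares the Elias gamma coded size of the resulting gap sequences. All function names are illustrative.

```python
def gamma_bits(n):
    """Length in bits of the Elias gamma code of a positive integer n."""
    return 2 * (n.bit_length() - 1) + 1


def gaps(postings):
    """Turn a list of document ids into its gap (d-gap) sequence."""
    ids = sorted(postings)
    return [ids[0]] + [b - a for a, b in zip(ids, ids[1:])]


def encoded_bits(postings):
    """Total gamma-coded size of an inverted list, in bits."""
    return sum(gamma_bits(g) for g in gaps(postings))


def renumber_by_cluster(cluster_of):
    """Map old ids to new ids so that each cluster occupies a contiguous range."""
    order = sorted(cluster_of, key=lambda d: (cluster_of[d], d))
    return {old: new for new, old in enumerate(order, start=1)}


# Hypothetical toy collection: 1000 documents, 4 clusters assigned round-robin.
cluster_of = {d: d % 4 for d in range(1, 1001)}

# A term that occurs in every document of cluster 0 and nowhere else.
postings = [d for d in cluster_of if cluster_of[d] == 0]

mapping = renumber_by_cluster(cluster_of)
renumbered = [mapping[d] for d in postings]

bits_before = encoded_bits(postings)
bits_after = encoded_bits(renumbered)
print(bits_before, bits_after)
```

In this contrived case every gap shrinks from 4 to 1, so the gamma-coded list becomes considerably shorter. Real collections are far less regular, and the gain depends on how well the clusters predict term occurrence.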
Copyright information
© 2007 Springer Science+Business Media, LLC
About this paper
Cite this paper
Czerski, D., Ciesielski, K., Dramiński, M., Kłopotek, M.A., Wierzchoń, S.T. (2007). Inverted Lists Compression Using Contextual Information. In: Pejaś, J., Saeed, K. (eds) Advances in Information Processing and Protection. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-73137-7_6
DOI: https://doi.org/10.1007/978-0-387-73137-7_6
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-73136-0
Online ISBN: 978-0-387-73137-7
eBook Packages: Computer Science, Computer Science (R0)