Rank-Sensitive Data Structures
Output-sensitive data structures result from preprocessing n items and can report the items satisfying an on-line query in O(t(n) + ℓ) time, where t(n) is the cost of traversing the structure and ℓ ≤ n is the number of reported items satisfying the query. In this paper we focus on rank-sensitive data structures, which are additionally given a ranking of the n items, so that only the top k best-ranking items are reported at query time, sorted in rank order, at a cost of O(t(n) + k) time. Note that k is a query parameter under the control of the user (as opposed to ℓ, which is query-dependent). We explore the problem of adding rank-sensitivity to data structures such as suffix trees or range trees, where the ℓ items satisfying the query form O(polylog(n)) intervals of consecutive entries from which we choose the top k best-ranking ones. Letting s(n) be the number of items (including their copies) stored in the original data structures, we increase the space by an additional term of O(s(n) lg^ε n) memory words, each of O(lg n) bits, for any positive constant ε < 1. We also allow the ranking to change on the fly during the lifetime of the data structures, with ranking values in 0 ... O(n). In this case, query time becomes O(t(n) + k) plus O(lg n / lg lg n) per interval; each change in the ranking and each insertion or deletion of an item takes O(lg n) time; and the additional term in space occupancy increases to O(s(n) lg n / lg lg n).
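To make the query semantics concrete, the following is a minimal Python sketch of top-k reporting over a set of intervals of consecutive entries. It is not the construction from the paper and does not achieve the O(t(n) + k) bound: it swaps in a textbook combination of a sparse-table range-minimum structure and a binary heap, which reports the k best items in rank order in O((m + k) lg(m + k)) time for m intervals, after O(n lg n) preprocessing. The function names (`build_sparse_table`, `argmin`, `top_k`) and the convention that a smaller rank value means a better rank are assumptions made for illustration.

```python
import heapq


def build_sparse_table(rank):
    """Precompute, for every window of length 2^j, the index of its best
    (smallest) rank. Assumption: smaller rank value = better item."""
    n = len(rank)
    table = [list(range(n))]  # windows of length 1
    j = 1
    while (1 << j) <= n:
        prev = table[-1]
        half = 1 << (j - 1)
        row = []
        for i in range(n - (1 << j) + 1):
            a, b = prev[i], prev[i + half]
            row.append(a if rank[a] <= rank[b] else b)
        table.append(row)
        j += 1
    return table


def argmin(rank, table, lo, hi):
    """Index of the best-ranked entry in the inclusive range lo..hi."""
    j = (hi - lo + 1).bit_length() - 1
    a = table[j][lo]
    b = table[j][hi - (1 << j) + 1]
    return a if rank[a] <= rank[b] else b


def top_k(rank, table, intervals, k):
    """Report up to k entry indices from the given inclusive intervals,
    sorted from best rank to worst."""
    heap = []
    for lo, hi in intervals:
        if lo <= hi:
            m = argmin(rank, table, lo, hi)
            heapq.heappush(heap, (rank[m], m, lo, hi))
    out = []
    while heap and len(out) < k:
        _, m, lo, hi = heapq.heappop(heap)
        out.append(m)
        # Split the interval around the reported entry and re-queue
        # the best candidate of each non-empty half.
        for a, b in ((lo, m - 1), (m + 1, hi)):
            if a <= b:
                c = argmin(rank, table, a, b)
                heapq.heappush(heap, (rank[c], c, a, b))
    return out


rank = [5, 2, 9, 1, 7, 3, 8, 0]
table = build_sparse_table(rank)
print(top_k(rank, table, [(0, 2), (4, 6)], 3))  # prints [1, 5, 0]
```

Here the query result consists of two intervals, entries 0..2 and 4..6, and k = 3 is chosen by the caller independently of how many entries (ℓ = 6) the intervals contain. Asking for more items than the intervals hold simply returns all of them in rank order.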