Rank-Sensitive Data Structures
Output-sensitive data structures result from preprocessing n items and can report the items satisfying an on-line query in O(t(n) + ℓ) time, where t(n) is the cost of traversing the structure and ℓ ≤ n is the number of reported items satisfying the query. In this paper we focus on rank-sensitive data structures, which are additionally given a ranking of the n items, so that only the top k best-ranking items are reported at query time, sorted in rank order, at a cost of O(t(n) + k) time. Note that k is a query parameter under the control of the user, whereas ℓ is determined by the query itself. We explore the problem of adding rank-sensitivity to data structures such as suffix trees or range trees, where the ℓ items satisfying the query form O(polylog(n)) intervals of consecutive entries, from which we choose the top k best-ranking ones. Letting s(n) be the number of items (including their copies) stored in the original data structure, we increase the space by an additional term of O(s(n) lg^ε n) memory words, each of O(lg n) bits, for any positive constant ε < 1. We also allow the ranking to change on the fly during the lifetime of the data structure, with ranking values in the range 0 to O(n). In this dynamic case, query time becomes O(t(n) + k) plus O(lg n / lg lg n) per interval; each change in the ranking and each insertion or deletion of an item takes O(lg n) time; and the additional term in space occupancy increases to O(s(n) lg n / lg lg n).
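To make the query model concrete, the following minimal Python sketch reports the top k items, in rank order, from a set of intervals of consecutive entries. It is not the construction of this paper: it swaps in a textbook combination of a sparse table for range-minimum queries and a binary heap, which uses O(n lg n) words and pays a logarithmic heap cost per reported item rather than achieving the O(t(n) + k) bound. The class name `RankedIntervals`, the method `top_k`, the convention that a smaller rank value is better, and the requirement that the query intervals be disjoint are all assumptions made for illustration.

```python
import heapq


class RankedIntervals:
    """Hypothetical helper: top-k reporting over intervals of a ranked array."""

    def __init__(self, ranks):
        # Assumption: ranks[i] is the rank of entry i, smaller means better.
        self.ranks = list(ranks)
        n = len(self.ranks)
        # table[j][i] = index of the best rank in ranks[i : i + 2**j]
        self.table = [list(range(n))]
        j = 1
        while (1 << j) <= n:
            prev = self.table[j - 1]
            half = 1 << (j - 1)
            row = []
            for i in range(n - (1 << j) + 1):
                a, b = prev[i], prev[i + half]
                row.append(a if self.ranks[a] <= self.ranks[b] else b)
            self.table.append(row)
            j += 1

    def _best(self, lo, hi):
        # Index of the best-ranking entry in the inclusive range lo..hi.
        j = (hi - lo + 1).bit_length() - 1
        a = self.table[j][lo]
        b = self.table[j][hi - (1 << j) + 1]
        return a if self.ranks[a] <= self.ranks[b] else b

    def top_k(self, intervals, k):
        # intervals: disjoint inclusive (lo, hi) pairs of entry indices.
        heap = []

        def push(lo, hi):
            if lo <= hi:
                p = self._best(lo, hi)
                heapq.heappush(heap, (self.ranks[p], p, lo, hi))

        for lo, hi in intervals:
            push(lo, hi)

        out = []
        while heap and len(out) < k:
            _, p, lo, hi = heapq.heappop(heap)
            out.append(p)
            # Split the interval around the reported entry and continue.
            push(lo, p - 1)
            push(p + 1, hi)
        return out


ranks = [5, 2, 9, 1, 7, 3, 8, 0]
structure = RankedIntervals(ranks)
best_three = structure.top_k([(0, 2), (4, 6)], 3)
```

Here the query touches entries 0 to 2 and 4 to 6, so the entries with ranks 1 and 0 (indices 3 and 7) are excluded even though they rank best overall. The caller chooses k independently of how many entries the intervals contain, which mirrors the distinction between k and ℓ drawn above.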