Efficient Representation of Multidimensional Data over Hierarchical Domains
We consider the problem of representing multidimensional data where the domain of each dimension is organized hierarchically, and each query asks for summary information at a node chosen independently in the hierarchy of each dimension. This is the typical case of OLAP databases. A basic approach is to represent each hierarchy as a one-dimensional line and recast the queries as multidimensional range queries. This approach can be implemented compactly by generalizing to more dimensions the \(k^2\)-treap, a compact representation of two-dimensional points that supports efficient summarization queries over generic ranges. We propose a more flexible generalization that, rather than applying a generic quadtree-like partition of the space, follows the domain hierarchy of each dimension to organize the partitioning. The resulting structure is much more efficient than a generic multidimensional structure, since queries are resolved by aggregating far fewer nodes of the tree.
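The basic approach mentioned above, laying each hierarchy out on a line and answering a query as a multidimensional range query, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy hierarchies, the names `leaf_ranges`, `range_sum`, and `aggregate`, and the choice of sum as the summary function. The linear scan in `range_sum` only stands in for a compact range structure such as a multidimensional \(k^2\)-treap; it is not the structure proposed here.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each internal node maps to its ordered children.
Hierarchy = Dict[str, List[str]]
Range = Tuple[int, int]


def leaf_ranges(tree: Hierarchy, root: str) -> Dict[str, Range]:
    """Number the leaves left to right and give every node the inclusive
    range of leaf positions that lie below it."""
    ranges: Dict[str, Range] = {}
    next_pos = 0

    def visit(node: str) -> Range:
        nonlocal next_pos
        children = tree.get(node, [])
        if not children:  # leaf: takes the next free position on the line
            ranges[node] = (next_pos, next_pos)
            next_pos += 1
        else:  # internal node: spans from its first to its last descendant leaf
            spans = [visit(child) for child in children]
            ranges[node] = (spans[0][0], spans[-1][1])
        return ranges[node]

    visit(root)
    return ranges


def range_sum(points: Dict[Tuple[int, ...], int],
              lo: List[int], hi: List[int]) -> int:
    """Naive stand-in for a range structure: scan all points and sum the
    values whose coordinates fall in [lo, hi] on every dimension."""
    return sum(
        value
        for coords, value in points.items()
        if all(l <= c <= h for c, l, h in zip(coords, lo, hi))
    )


def aggregate(points: Dict[Tuple[int, ...], int],
              ranges_per_dim: List[Dict[str, Range]],
              nodes: List[str]) -> int:
    """Recast a query given as one hierarchy node per dimension
    into a multidimensional range query."""
    lo = [ranges[node][0] for ranges, node in zip(ranges_per_dim, nodes)]
    hi = [ranges[node][1] for ranges, node in zip(ranges_per_dim, nodes)]
    return range_sum(points, lo, hi)


# Toy data (assumed): two dimensions, place and product.
places: Hierarchy = {"World": ["Europe", "America"],
                     "Europe": ["Spain", "France"],
                     "America": ["Chile"]}
products: Hierarchy = {"All": ["Food", "Tools"],
                       "Food": ["Bread", "Milk"],
                       "Tools": ["Hammer"]}

place_ranges = leaf_ranges(places, "World")
product_ranges = leaf_ranges(products, "All")

# Sales keyed by (place leaf position, product leaf position).
sales = {(0, 0): 5, (1, 1): 3, (2, 0): 7, (0, 2): 2}

europe_food = aggregate(sales, [place_ranges, product_ranges], ["Europe", "Food"])
print(europe_food)  # 8: Spain/Bread (5) plus France/Milk (3)
```

In this sketch every query becomes a generic hyper-rectangle, so the range structure has no knowledge of where the hierarchy boundaries fall. A partition that follows the hierarchies instead makes each query node coincide with nodes of the tree, which is the source of the savings described in the abstract.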