Parallel Data Cube Storage Structure for Range Sum Queries and Dynamic Updates
I/O parallelism is considered a promising approach to achieving high performance in parallel data warehousing systems, where huge amounts of data and complex analytical queries have to be processed. This paper proposes a parallel secondary data cube storage structure (PHC for short) that efficiently supports range sum queries and dynamic updates on data cubes using parallel computing systems. Based on PHC, two parallel algorithms are also proposed, one for processing range sum queries and one for processing updates. Both algorithms have the same time complexity, O(log^d n/P). The analytical and experimental results show that PHC and the parallel algorithms perform well and achieve optimal speedup.
Keywords: data warehouse, parallel processing, cube, range query processing
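To make the two operations named in the abstract concrete, the following is a minimal Python sketch of a range sum query and a point update on a small two-dimensional cube. It is not the PHC structure or either of the paper's parallel algorithms; it only contrasts a naive scan with a prefix-sum array, a standard sequential baseline. All function names (`range_sum`, `update`, `build_prefix`, `prefix_range_sum`) and the sample data are hypothetical and chosen for illustration.

```python
def range_sum(cube, lo, hi):
    """Naive range sum: scan every cell with lo[k] <= index[k] <= hi[k]."""
    return sum(cube[i][j]
               for i in range(lo[0], hi[0] + 1)
               for j in range(lo[1], hi[1] + 1))


def update(cube, cell, delta):
    """Point update: add delta to a single cell of the cube."""
    i, j = cell
    cube[i][j] += delta


def build_prefix(cube):
    """Prefix-sum array: prefix[i][j] holds the sum of cube[0..i-1][0..j-1]."""
    rows, cols = len(cube), len(cube[0])
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            prefix[i + 1][j + 1] = (cube[i][j] + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])
    return prefix


def prefix_range_sum(prefix, lo, hi):
    """Range sum from four prefix lookups (inclusion-exclusion)."""
    (r0, c0), (r1, c1) = lo, hi
    return (prefix[r1 + 1][c1 + 1] - prefix[r0][c1 + 1]
            - prefix[r1 + 1][c0] + prefix[r0][c0])


# Hypothetical 3x3 cube, e.g. sales by (region, month).
cube = [[3, 1, 4],
        [1, 5, 9],
        [2, 6, 5]]
```

In this sketch the naive scan makes updates cheap and queries expensive, while the prefix-sum array answers a query with four lookups but has to be rebuilt (or patched in many cells) after an update. Structures for dynamic data cubes aim to balance these two costs.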