I/O parallelism is considered a promising approach to achieving high performance in parallel data warehousing systems, where huge amounts of data and complex analytical queries have to be processed. This paper proposes a parallel secondary data cube storage structure (PHC for short) to efficiently support the processing of range sum queries and dynamic updates on data cubes using parallel computing systems. Based on PHC, two parallel algorithms are also proposed, one for processing range sum queries and one for updates. Both algorithms have the same time complexity, O(log^d n / P). The analytical and experimental results show that PHC and the parallel algorithms have high performance and achieve optimum speedup.
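To make the two operations named in the abstract concrete, the following is a minimal sequential sketch of range sum queries and point updates on a two-dimensional cube. It does not implement PHC, whose layout is not described here. It uses a standard Fenwick (binary indexed) tree as a stand-in, since that structure also answers both operations in O(log^d n) for d dimensions. The class name `Fenwick2D` and its method names are illustrative assumptions, and the division of work across P processors is not modeled.

```python
class Fenwick2D:
    """2-D Fenwick tree (a stand-in, NOT the paper's PHC structure).

    Supports point updates and range sum queries, each in O(log^2 n)
    for an n x n cube.
    """

    def __init__(self, rows, cols):
        self.rows = rows
        self.cols = cols
        # Internal storage is 1-indexed, so allocate one extra row and column.
        self.tree = [[0] * (cols + 1) for _ in range(rows + 1)]

    def update(self, r, c, delta):
        """Add delta to cell (r, c), using 0-indexed coordinates."""
        i = r + 1
        while i <= self.rows:
            j = c + 1
            while j <= self.cols:
                self.tree[i][j] += delta
                j += j & -j  # move to the next node covering column j
            i += i & -i      # move to the next node covering row i

    def prefix_sum(self, r, c):
        """Sum of cells in [0..r] x [0..c]; 0 if r or c is negative."""
        total = 0
        i = r + 1
        while i > 0:
            j = c + 1
            while j > 0:
                total += self.tree[i][j]
                j -= j & -j
            i -= i & -i
        return total

    def range_sum(self, r1, c1, r2, c2):
        """Sum of cells in [r1..r2] x [c1..c2] by inclusion-exclusion."""
        return (self.prefix_sum(r2, c2)
                - self.prefix_sum(r1 - 1, c2)
                - self.prefix_sum(r2, c1 - 1)
                + self.prefix_sum(r1 - 1, c1 - 1))


# Hypothetical 3 x 3 cube used only for illustration.
cube = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
fw = Fenwick2D(3, 3)
for r, row in enumerate(cube):
    for c, value in enumerate(row):
        fw.update(r, c, value)

before = fw.range_sum(1, 1, 2, 2)  # cells 5, 6, 8, 9
fw.update(1, 1, 10)                # dynamic update: cell (1, 1) grows by 10
after = fw.range_sum(1, 1, 2, 2)
```

Both operations touch only a logarithmic number of nodes per dimension, which is the sequential counterpart of the bound quoted in the abstract. How PHC distributes that work over P processors is specific to the paper and is not shown.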
Supported by the National Natural Science Foundation of China under Grant No.60273082 and the Natural Science Foundation of Heilongjiang Province under Grant No.F0208.
Hong Gao received the B.Sc. degree in computer science from Heilongjiang University, China, the M.S. degree in computer science from Harbin Engineering University, China, and the Ph.D. degree in computer science from Harbin Institute of Technology, China. She is a member of the database research group. Her research interests include data warehousing, data mining, and techniques for compressed database management systems. She has published more than 20 technical papers in refereed journals and conference proceedings in the area of databases.
Jian-Zhong Li is a full professor and the chairman of the Department of Computer Science and Engineering at the Harbin Institute of Technology, China. He worked at the University of California at Berkeley as a visiting scholar in 1985. From 1986 to 1987, he was a staff scientist in the Information Research Group at Lawrence Berkeley National Laboratory, Berkeley, USA. He was also a visiting professor at the University of Minnesota, Minneapolis, Minnesota, USA, from 1991 to 1992 and from 1998 to 1999. His current research interests include data warehousing, data mining, XML databases, bioinformatics, and sensor networks. He has authored many books and published more than 200 technical papers in refereed journals and conference proceedings in database areas. He has delivered a number of invited presentations and participated in panel discussions on many topics. His professional activities include service on various program committees. He is a member of the IEEE Computer Society and a member of the ACM.
Gao, H., Li, J. Parallel Data Cube Storage Structure for Range Sum Queries and Dynamic Updates. J Comput Sci Technol 20, 345–356 (2005). https://doi.org/10.1007/s11390-005-0345-1
Keywords: data warehouse, parallel processing, range query processing