Parallelizing the Data Cube

  • Frank Dehne
  • Todd Eavis
  • Susanne Hambrusch
  • Andrew Rau-Chaplin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1973)


This paper presents a general methodology for the efficient parallelization of existing data cube construction algorithms. We describe two partitioning strategies, one for top-down and one for bottom-up cube algorithms. Both assign subcubes to individual processors so that the processor loads are balanced. Our methods reduce inter-processor communication overhead by partitioning the load in advance, rather than computing each individual group-by in parallel as previous parallel approaches do. In fact, after the initial load distribution phase, each processor can compute its assigned subcube without communicating with the other processors. Our methods also enable code reuse: existing sequential (external memory) data cube algorithms can be used for the subcube computation on each processor, which supports the transfer of optimized sequential data cube code to a parallel setting. The bottom-up partitioning strategy balances the number of single-attribute external memory sorts performed by each processor. The top-down strategy partitions a weighted tree in which the weights reflect algorithm-specific cost measures such as estimated group-by sizes. Both partitioning approaches can be implemented on any shared-disk parallel machine composed of p processors connected via an interconnection fabric and with access to a shared parallel disk array. Our experimental results show that the partitioning strategies generate a close-to-optimal load balance between processors.
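To make the load-balancing idea concrete, the following is a minimal sketch and not the paper's algorithm. It enumerates every group-by of a small cube, weights each one by a crude size estimate, and assigns the group-bys to p processors with a greedy longest-processing-time heuristic. The function names, the size estimate (product of attribute cardinalities, capped at the row count), and the greedy heuristic are all assumptions made for illustration. The paper instead partitions a weighted tree of subcubes (top-down) or balances single-attribute sorts (bottom-up), which lets each processor share work within its subcube; this sketch ignores that structure.

```python
from itertools import combinations
import heapq


def group_bys(dims):
    """Return all 2^d group-bys (attribute subsets) of a d-dimensional cube."""
    return [c for k in range(len(dims), -1, -1) for c in combinations(dims, k)]


def estimated_size(group_by, cardinality, n_rows):
    """Hypothetical cost measure: product of cardinalities, capped at n_rows."""
    size = 1
    for attr in group_by:
        size *= cardinality[attr]
    return min(size, n_rows)


def greedy_assign(weights, p):
    """Greedy heuristic (assumption, not the paper's tree partitioning):
    hand the heaviest remaining item to the currently least-loaded processor."""
    heap = [(0, proc) for proc in range(p)]
    heapq.heapify(heap)
    assignment = {proc: [] for proc in range(p)}
    # Sort by descending weight; break ties by name so the result is deterministic.
    for item, w in sorted(weights.items(), key=lambda kv: (-kv[1], kv[0])):
        load, proc = heapq.heappop(heap)
        assignment[proc].append(item)
        heapq.heappush(heap, (load + w, proc))
    loads = {proc: sum(weights[i] for i in items)
             for proc, items in assignment.items()}
    return assignment, loads


# Illustrative input: three attributes with made-up cardinalities.
dims = ["A", "B", "C"]
cardinality = {"A": 10, "B": 5, "C": 2}
weights = {g: estimated_size(g, cardinality, n_rows=50) for g in group_bys(dims)}
assignment, loads = greedy_assign(weights, p=2)
print(loads)
```

Because the whole assignment is computed before any group-by is materialized, each processor could then work through its own list without further coordination, which is the property the abstract emphasizes.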


Keywords: Span Tree, Message Passing Interface, External Memory, Data Cube, Weighted Tree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Frank Dehne (1)
  • Todd Eavis (2)
  • Susanne Hambrusch (3)
  • Andrew Rau-Chaplin (2)
  1. Carleton University, Ottawa, Canada
  2. Dalhousie University, Halifax, Canada
  3. Purdue University, West Lafayette, USA
