On Rectangular Partitionings in Two Dimensions: Algorithms, Complexity and Applications

  • S. Muthukrishnan
  • Viswanath Poosala
  • Torsten Suel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1540)

Abstract

Partitioning a multi-dimensional data set into rectangular partitions subject to certain constraints is an important problem that arises in many database applications, including histogram-based selectivity estimation, load balancing, and the construction of index structures. While provably optimal and efficient algorithms exist for partitioning one-dimensional data, the multi-dimensional problem has received less attention, except for a few special cases. As a result, the heuristic partitioning techniques used in practice are not well understood and come with no guarantees on the quality of the solution. In this paper, we present algorithmic and complexity-theoretic results for the fundamental problem of partitioning a two-dimensional array into rectangular tiles of arbitrary size in a way that minimizes the number of tiles required to satisfy a given constraint. Our main results are approximation algorithms for several partitioning problems that provably approximate the optimal solutions within small constant factors and that run in linear or near-linear time. We also establish the NP-hardness of several partitioning problems; it is therefore unlikely that efficient (i.e., polynomial-time) algorithms exist for solving these problems exactly.
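To make the problem statement concrete, the following is a minimal sketch of the one-dimensional analogue only, not the paper's two-dimensional algorithm. It assumes non-negative values and a hypothetical constraint that each interval's sum must not exceed a given bound; under those assumptions, greedily extending each interval as far as possible uses the fewest intervals. The function name and the choice of constraint are illustrative assumptions.

```python
from typing import List, Tuple


def greedy_partition_1d(values: List[float], bound: float) -> List[Tuple[int, int]]:
    """Split `values` into the fewest contiguous intervals with sum <= bound.

    Assumption (not from the paper): values are non-negative and the
    constraint is an upper bound on each interval's sum.
    Returns half-open (start, end) index pairs.
    """
    intervals: List[Tuple[int, int]] = []
    start = 0
    running = 0.0
    for i, v in enumerate(values):
        if v < 0:
            raise ValueError("values must be non-negative")
        if v > bound:
            raise ValueError("a single element exceeds the bound")
        # Close the current interval when adding v would violate the bound.
        if running + v > bound:
            intervals.append((start, i))
            start = i
            running = 0.0
        running += v
    # Emit the final, still-open interval (if the input was non-empty).
    if start < len(values):
        intervals.append((start, len(values)))
    return intervals
```

In two dimensions no such simple greedy rule is available in general, which is consistent with the hardness results and approximation algorithms the abstract describes.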

We also discuss a few applications in which partitioning problems arise. One such application is the construction of multi-dimensional histograms. Our results give, for example, an efficient algorithm for constructing V-Optimal histograms, which are known to be the most accurate histograms in several selectivity estimation problems. Our algorithms are the first to provide guaranteed bounds on the quality of the solution.
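As an illustration of the kind of error measure involved, the sketch below evaluates a given rectangular tiling of a two-dimensional array by its total sum of squared deviations from each tile's mean, which is the quantity V-Optimal histograms are usually defined to minimize. It only scores a tiling; it does not construct one and is not the paper's algorithm. The names `prefix_sums`, `rect_sum`, and `tiling_sse`, and the half-open tile representation, are assumptions made for this example, and the code does not check that the tiles actually partition the array.

```python
from typing import List, Sequence, Tuple

# Hypothetical tile representation: (row_start, row_end, col_start, col_end),
# with half-open ranges.
Tile = Tuple[int, int, int, int]


def prefix_sums(grid: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return a (rows+1) x (cols+1) table of 2D prefix sums of `grid`."""
    rows, cols = len(grid), len(grid[0])
    p = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            p[r + 1][c + 1] = grid[r][c] + p[r][c + 1] + p[r + 1][c] - p[r][c]
    return p


def rect_sum(p: List[List[float]], tile: Tile) -> float:
    """Sum of the original grid over `tile`, read from the prefix table."""
    r0, r1, c0, c1 = tile
    return p[r1][c1] - p[r0][c1] - p[r1][c0] + p[r0][c0]


def tiling_sse(grid: Sequence[Sequence[float]], tiles: Sequence[Tile]) -> float:
    """Total squared error when each tile is summarized by its mean."""
    sums = prefix_sums(grid)
    squares = prefix_sums([[v * v for v in row] for row in grid])
    total = 0.0
    for tile in tiles:
        r0, r1, c0, c1 = tile
        count = (r1 - r0) * (c1 - c0)
        tile_sum = rect_sum(sums, tile)
        # Sum of (x - mean)^2 equals sum(x^2) - (sum x)^2 / count.
        total += rect_sum(squares, tile) - tile_sum * tile_sum / count
    return total
```

With prefix sums, each tile is scored in constant time after a single pass over the array, so comparing candidate tilings is cheap even when tiles have arbitrary sizes.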

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • S. Muthukrishnan (1)
  • Viswanath Poosala (1)
  • Torsten Suel (2)

  1. Bell Laboratories, Murray Hill
  2. Six MetroTech Center, Polytechnic University, Brooklyn
