Abstract
Horizontal data partitioning is a non-redundant optimization technique used in data warehouse design. Most of today’s commercial database systems offer native data definition language support for defining horizontal partitions of a table. Two types of horizontal partitioning are available: primary and derived horizontal fragmentation. In the first type, a table is decomposed into a set of fragments based on its own attributes, whereas in the second type, a table is fragmented based on the partitioning schemes of other tables. In this paper, we first show the hardness of selecting an optimal partitioning schema for a relational data warehouse. Because of this high complexity, we develop a hill climbing algorithm that selects a near-optimal solution. Finally, we conduct extensive experimental studies to compare the proposed algorithm with existing ones using a mathematical cost model. The fragmentation schemes generated by these algorithms are validated on Oracle 10g using the data set of the APB-1 benchmark.
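To make the two fragmentation types concrete, the following is a minimal Python sketch rather than the paper's method. The tables (customers, sales), the attribute names, and the helper functions primary_fragments and derived_fragments are all hypothetical and chosen only for illustration. A dimension table is first split on one of its own attributes (primary fragmentation), and the fact table is then split by following its foreign key into those fragments (derived fragmentation).

```python
def primary_fragments(rows, attribute, domains):
    """Primary fragmentation: split rows by the value of one of their own attributes.

    `domains` maps a fragment name to the set of attribute values it holds.
    Rows whose value matches no domain are left out of this sketch.
    """
    fragments = {name: [] for name in domains}
    for row in rows:
        for name, values in domains.items():
            if row[attribute] in values:
                fragments[name].append(row)
                break
    return fragments


def derived_fragments(fact_rows, foreign_key, dim_fragments, dim_key):
    """Derived fragmentation: split fact rows according to the fragment
    that holds the dimension row they reference (a semi-join per fragment).

    Assumes every fact row references a dimension row present in some fragment.
    """
    key_to_fragment = {
        row[dim_key]: name
        for name, rows in dim_fragments.items()
        for row in rows
    }
    fragments = {name: [] for name in dim_fragments}
    for row in fact_rows:
        fragments[key_to_fragment[row[foreign_key]]].append(row)
    return fragments


# Hypothetical toy data: one dimension table and one fact table.
customers = [
    {"cust_id": 1, "city": "Poitiers"},
    {"cust_id": 2, "city": "Paris"},
    {"cust_id": 3, "city": "Paris"},
]
sales = [
    {"sale_id": 10, "cust_id": 1, "amount": 50},
    {"sale_id": 11, "cust_id": 2, "amount": 20},
    {"sale_id": 12, "cust_id": 3, "amount": 70},
    {"sale_id": 13, "cust_id": 1, "amount": 5},
]

# The dimension table is fragmented on its own attribute "city" ...
customer_frags = primary_fragments(
    customers, "city", {"c_poitiers": {"Poitiers"}, "c_paris": {"Paris"}}
)
# ... and the fact table inherits that scheme through its foreign key.
sales_frags = derived_fragments(sales, "cust_id", customer_frags, "cust_id")
```

In this sketch the fact table ends up with one fragment per dimension fragment, so a query restricted to one city only needs to read the matching pair of fragments.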
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bellatreche, L., Boukhalfa, K., Richard, P. (2008). Data Partitioning in Data Warehouses: Hardness Study, Heuristics and ORACLE Validation. In: Song, IY., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2008. Lecture Notes in Computer Science, vol 5182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85836-2_9
DOI: https://doi.org/10.1007/978-3-540-85836-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85835-5
Online ISBN: 978-3-540-85836-2
eBook Packages: Computer Science (R0)