Data Partitioning in Data Warehouses: Hardness Study, Heuristics and ORACLE Validation

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5182)

Abstract

Horizontal data partitioning is a non-redundant optimization technique used in designing data warehouses. Most of today's commercial database systems offer native data definition language support for defining horizontal partitions of a table. Two types of horizontal partitioning are available: primary and derived horizontal fragmentation. In the first type, a table is decomposed into a set of fragments based on its own attributes, whereas in the second type, a table is fragmented based on the partitioning schemes of other tables. In this paper, we first establish the hardness of selecting an optimal partitioning schema for a relational data warehouse. Because of this high complexity, we develop a hill climbing algorithm that selects a near-optimal solution. Finally, we conduct extensive experimental studies to compare the proposed algorithm with existing ones using a mathematical cost model. The fragmentation schemes generated by these algorithms are validated on Oracle 10g using the data set of the APB1 benchmark.
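
The difference between the two fragmentation types described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy tables `customers` and `sales`, their columns, and the helper names `primary_partition` and `derived_partition` are all hypothetical, and in-memory lists of dictionaries stand in for warehouse tables.

```python
from collections import defaultdict

# Hypothetical toy star schema (not from the paper): one dimension table
# and one fact table that references it through cust_id.
customers = [
    {"cust_id": 1, "city": "Paris"},
    {"cust_id": 2, "city": "Lyon"},
    {"cust_id": 3, "city": "Paris"},
]
sales = [
    {"sale_id": 10, "cust_id": 1, "amount": 50},
    {"sale_id": 11, "cust_id": 2, "amount": 20},
    {"sale_id": 12, "cust_id": 3, "amount": 70},
    {"sale_id": 13, "cust_id": 1, "amount": 5},
]


def primary_partition(rows, attribute):
    """Primary fragmentation: split a table on one of its own attributes."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[attribute]].append(row)
    return dict(fragments)


def derived_partition(fact_rows, dim_fragments, key):
    """Derived fragmentation: split a table according to the fragments
    of another table that it references through `key`."""
    # Map every key value of the referenced table to its fragment name.
    key_to_fragment = {
        row[key]: name
        for name, rows in dim_fragments.items()
        for row in rows
    }
    fragments = defaultdict(list)
    for row in fact_rows:
        fragments[key_to_fragment[row[key]]].append(row)
    return dict(fragments)


customer_fragments = primary_partition(customers, "city")
sales_fragments = derived_partition(sales, customer_fragments, "cust_id")
```

Each row lands in exactly one fragment, which is the sense in which the technique is non-redundant. In a real system the same effect would be declared through the database's partitioning DDL rather than computed in application code.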

Editor information

Il-Yeol Song, Johann Eder, Tho Manh Nguyen

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bellatreche, L., Boukhalfa, K., Richard, P. (2008). Data Partitioning in Data Warehouses: Hardness Study, Heuristics and ORACLE Validation. In: Song, IY., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2008. Lecture Notes in Computer Science, vol 5182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85836-2_9

  • DOI: https://doi.org/10.1007/978-3-540-85836-2_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85835-5

  • Online ISBN: 978-3-540-85836-2

  • eBook Packages: Computer Science, Computer Science (R0)
