Clustering-Based Materialized View Selection in Data Warehouses

  • Kamel Aouiche
  • Pierre-Emmanuel Jouve
  • Jérôme Darmont
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4152)

Abstract

Materialized view selection is a non-trivial task. Hence, its complexity must be reduced. A judicious choice of views must be cost-driven and influenced by the workload experienced by the system. In this paper, we propose a framework for materialized view selection that exploits a data mining technique (clustering), in order to determine clusters of similar queries. We also propose a view merging algorithm that builds a set of candidate views, as well as a greedy process for selecting a set of views to materialize. This selection is based on cost models that evaluate the cost of accessing data using views and the cost of storing these views. To validate our strategy, we executed a workload of decision-support queries on a test data warehouse, with and without using our strategy. Our experimental results demonstrate its efficiency, even when storage space is limited.
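The greedy, cost-driven selection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or cost model: the `CandidateView` class, the `greedy_select` function, the view names, and all size and benefit figures are hypothetical. The sketch also assumes each view's benefit is fixed and independent of the other views chosen, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateView:
    name: str
    size: float     # hypothetical storage cost (must be > 0), e.g. in MB
    benefit: float  # hypothetical reduction in workload cost if materialized


def greedy_select(candidates, storage_budget):
    """Pick views by decreasing benefit per unit of storage while they fit."""
    selected = []
    remaining = storage_budget
    # Rank by benefit density; ties are broken by name for determinism.
    ranked = sorted(candidates, key=lambda v: (-v.benefit / v.size, v.name))
    for view in ranked:
        # Skip views that bring no benefit or no longer fit in the budget.
        if view.benefit > 0 and view.size <= remaining:
            selected.append(view)
            remaining -= view.size
    return selected


# Illustrative candidates only; these figures are made up.
views = [
    CandidateView("v_sales_by_month", size=40.0, benefit=120.0),
    CandidateView("v_sales_by_region", size=25.0, benefit=50.0),
    CandidateView("v_sales_by_product", size=60.0, benefit=90.0),
]
chosen = greedy_select(views, storage_budget=70.0)
print([v.name for v in chosen])
```

With a budget of 70, the sketch keeps the two views with the highest benefit per unit of storage and skips the third because it no longer fits. A fuller treatment would recompute each remaining view's benefit after every pick, since views that answer the same queries overlap.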

Keywords

Cost Model; Storage Space; Data Cube; Merging Process; Similar Query

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Kamel Aouiche (1)
  • Pierre-Emmanuel Jouve (1)
  • Jérôme Darmont (1)
  1. ERIC Laboratory, University of Lyon 2, Bron, France
