Inferring Aggregation Hierarchies for Integration of Data Marts

  • Conference paper
Database and Expert Systems Applications (DEXA 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6262)

Abstract

Integrating heterogeneous data marts is an important problem in building enterprise data warehouses. In particular, identifying compatible dimensions is crucial to successful integration. Existing notions of dimension compatibility rely on exact dimension hierarchy information being given. In this paper, we propose to infer aggregation hierarchies for dimensions from a database instance and to use these inferred hierarchies for the integration of data marts. We formulate the problem of inferring aggregation hierarchies as computing a minimal directed graph from data, and develop algorithms to this end. We extend previous notions of dimension compatibility in terms of inferred aggregation hierarchies.
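The idea of deriving a minimal directed graph from a database instance can be illustrated with a small sketch. It is not the paper's algorithm, which is not reproduced on this page. It assumes that a roll-up edge from level a to level b can be approximated by checking that every value of column a maps to exactly one value of column b in the instance, and that the minimal graph is the transitive reduction of those edges. The function names, column names, and sample rows are hypothetical.

```python
from itertools import permutations


def determines(rows, a, b):
    """Return True if each value of column a maps to exactly one value of column b."""
    seen = {}
    for row in rows:
        # setdefault stores the first b-value seen for this a-value and
        # returns the stored one on later rows, so a mismatch means a
        # single a-value maps to two different b-values.
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


def infer_hierarchy(rows, columns):
    """Infer roll-up edges from the instance and drop the implied ones.

    Assumption (ours, for illustration): no two columns determine each
    other, so the edge set is acyclic.
    """
    edges = {
        (a, b)
        for a, b in permutations(columns, 2)
        if determines(rows, a, b)
    }
    for a, b in edges:
        if (b, a) in edges:
            raise ValueError(f"columns {a!r} and {b!r} determine each other")
    # The "determines" relation is transitive on a fixed instance, so the
    # edge set is already transitively closed. An edge (a, b) is therefore
    # redundant exactly when some intermediate column c lies between them.
    return {
        (a, b)
        for a, b in edges
        if not any((a, c) in edges and (c, b) in edges for c in columns)
    }


# Hypothetical location dimension of a data mart.
rows = [
    {"city": "Melbourne", "state": "VIC", "country": "AU"},
    {"city": "Geelong", "state": "VIC", "country": "AU"},
    {"city": "Sydney", "state": "NSW", "country": "AU"},
    {"city": "Auckland", "state": "AKL", "country": "NZ"},
]

hierarchy = infer_hierarchy(rows, ["city", "state", "country"])
print(sorted(hierarchy))  # [('city', 'state'), ('state', 'country')]
```

On this sample the direct edge from city to country is dropped because it is implied by the path through state. A dependency that holds in one instance may be accidental, so edges inferred this way are candidates rather than confirmed hierarchy levels.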

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Riazati, D., Thom, J.A., Zhang, X. (2010). Inferring Aggregation Hierarchies for Integration of Data Marts. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds) Database and Expert Systems Applications. DEXA 2010. Lecture Notes in Computer Science, vol 6262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15251-1_7

  • DOI: https://doi.org/10.1007/978-3-642-15251-1_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15250-4

  • Online ISBN: 978-3-642-15251-1

  • eBook Packages: Computer Science, Computer Science (R0)
