Abstract
Integrating heterogeneous data marts is an important problem in building enterprise data warehouses. In particular, identifying compatible dimensions is crucial to successful integration. Existing notions of dimension compatibility rely on exact dimension hierarchy information being given. In this paper, we propose to infer aggregation hierarchies for dimensions from a database instance and to use these inferred hierarchies for the integration of data marts. We formulate the problem of inferring aggregation hierarchies as computing a minimal directed graph from data, and develop algorithms to this end. We extend previous notions of dimension compatibility in terms of inferred aggregation hierarchies.
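As a rough illustration of what "computing a minimal directed graph from data" could look like, the sketch below checks which attributes functionally determine which others in a small dimension instance, then removes edges that are implied by transitivity. It is not the paper's algorithm: the function names, the attribute names (city, state, country) and the sample rows are all hypothetical, and the reduction step assumes the inferred graph is acyclic.

```python
from itertools import permutations


def determines(rows, a, b):
    """Return True if every value of attribute a maps to one value of b."""
    seen = {}
    for row in rows:
        # setdefault records the first b-value seen for this a-value
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


def infer_hierarchy(rows, attributes):
    """Infer roll-up edges (a, b) and drop those implied by transitivity."""
    edges = {(a, b) for a, b in permutations(attributes, 2)
             if determines(rows, a, b)}
    # Dependencies that hold on an instance are transitively closed, so an
    # edge (a, c) is redundant when some b gives a -> b and b -> c.
    # Assumption: no two attributes determine each other (graph is acyclic).
    redundant = {(a, c) for (a, b) in edges for (b2, c) in edges
                 if b == b2 and a != c and (a, c) in edges}
    return edges - redundant


# Hypothetical location dimension, for illustration only.
rows = [
    {"city": "Melbourne", "state": "VIC", "country": "AU"},
    {"city": "Geelong", "state": "VIC", "country": "AU"},
    {"city": "Sydney", "state": "NSW", "country": "AU"},
    {"city": "Auckland", "state": "AUK", "country": "NZ"},
]

hierarchy = infer_hierarchy(rows, ["city", "state", "country"])
```

On this sample, the direct city-to-country edge is dropped because it follows from city to state and state to country, leaving a two-step roll-up chain. Dependencies inferred from one instance may be accidental, so a real system would need to treat them as candidates rather than facts.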
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Riazati, D., Thom, J.A., Zhang, X. (2010). Inferring Aggregation Hierarchies for Integration of Data Marts. In: Bringas, P.G., Hameurlain, A., Quirchmayr, G. (eds) Database and Expert Systems Applications. DEXA 2010. Lecture Notes in Computer Science, vol 6262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15251-1_7
DOI: https://doi.org/10.1007/978-3-642-15251-1_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15250-4
Online ISBN: 978-3-642-15251-1
eBook Packages: Computer Science, Computer Science (R0)