Data Integration Patterns for Data Warehouse Automation

  • Kalle Tomingas
  • Margus Kliimask
  • Tanel Tammet
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 312)

Abstract

The paper presents a mapping-based, metadata-driven modular data transformation framework designed to address extract-transform-load (ETL) automation, impact analysis, data quality, and integration problems in data warehouse environments. We introduce a declarative mapping formalization technique, an abstract expression pattern concept, and a related template engine technology for flexible ETL code generation and execution. The feasibility and efficiency of the approach are demonstrated in pattern detection and data lineage analysis case studies using large real-life SQL corpora.

Keywords

data warehouse, ETL, data mappings, template-based SQL generation, abstract syntax patterns, metadata management

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Kalle Tomingas (1)
  • Margus Kliimask (2)
  • Tanel Tammet (1)
  1. Tallinn University of Technology, Tallinn, Estonia
  2. Eliko Competence Center, Tallinn, Estonia
