E-ETL: Framework For Managing Evolving ETL Processes

Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 185)

Abstract

Data warehouses integrate external data sources (EDSs), which very often change their data structures (schemas). In many cases, such changes cause an already deployed ETL workflow to execute erroneously. Because structural changes of EDSs are frequent, automatic reparation of an ETL workflow after such changes is of high importance. This paper presents E-ETL, a framework for handling the evolution of an ETL layer. Detecting a change in an EDS triggers reparation of the fragment of the ETL workflow that interacts with the changed EDS. The proposed framework was developed as a module external to an ETL engine, accessing the engine by means of its API. The innovation of this framework lies in its algorithms for semi-automatic reparation of an ETL workflow.
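To make the idea of change detection concrete, the following is a minimal sketch and not the E-ETL implementation. It assumes a schema can be represented as a mapping from column name to type and a workflow as a mapping from step name to the source columns it reads. The names `SchemaChange`, `detect_changes`, and `affected_steps` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaChange:
    kind: str    # "dropped", "added", or "retyped" (hypothetical change kinds)
    column: str


def detect_changes(old: dict[str, str], new: dict[str, str]) -> list[SchemaChange]:
    """Compare two snapshots of a source schema (column name -> type)."""
    changes = []
    for col in sorted(old.keys() - new.keys()):
        changes.append(SchemaChange("dropped", col))
    for col in sorted(new.keys() - old.keys()):
        changes.append(SchemaChange("added", col))
    for col in sorted(old.keys() & new.keys()):
        if old[col] != new[col]:
            changes.append(SchemaChange("retyped", col))
    return changes


def affected_steps(workflow: dict[str, set[str]],
                   changes: list[SchemaChange]) -> list[str]:
    """Return the workflow steps that read a dropped or retyped column."""
    broken = {c.column for c in changes if c.kind in ("dropped", "retyped")}
    return sorted(step for step, cols in workflow.items() if cols & broken)


# Illustrative data only; not taken from the paper.
old_schema = {"id": "int", "name": "varchar", "city": "varchar"}
new_schema = {"id": "int", "name": "text", "zip": "varchar"}
workflow = {
    "extract_customers": {"id", "name"},
    "lookup_city": {"city"},
    "load_ids": {"id"},
}

changes = detect_changes(old_schema, new_schema)
to_repair = affected_steps(workflow, changes)
```

In this sketch only the steps that touch a changed column are flagged, which mirrors the abstract's point that reparation targets the fragment of the workflow interacting with the changed EDS. How a flagged step is actually repaired is left open here.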

Keywords

Data Warehouse; Evolution Rule; Common Data Model; External Data Source; Reparation Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Institute of Computing Science, Poznań University of Technology, Poznań, Poland
