A Domain-Specific Language for ETL Patterns Specification in Data Warehousing Systems

  • Conference paper
Progress in Artificial Intelligence (EPIA 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9273)

Abstract

In recent years, considerable research effort has gone into improving the design of ETL (Extract-Transform-Load) systems. ETL systems are considered very time-consuming, error-prone, and complex, involving several participants from different knowledge domains. ETL processes are among the most important components of a data warehousing system and are strongly influenced by the complexity of business requirements and by how those requirements change and evolve. These aspects influence not only the structure of a data warehouse but also the structures of the data sources involved. To minimize the negative impact of such variables, we propose the use of ETL patterns to build specific ETL packages. In this paper, we formalize this approach using BPMN (Business Process Model and Notation) to model more conceptual ETL workflows, and we map them to real execution primitives through a domain-specific language that allows the generation of specific instances that can be executed in a commercial ETL tool.

Author information

Corresponding author

Correspondence to Bruno Oliveira.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Oliveira, B., Belo, O. (2015). A Domain-Specific Language for ETL Patterns Specification in Data Warehousing Systems. In: Pereira, F., Machado, P., Costa, E., Cardoso, A. (eds) Progress in Artificial Intelligence. EPIA 2015. Lecture Notes in Computer Science, vol 9273. Springer, Cham. https://doi.org/10.1007/978-3-319-23485-4_60

  • DOI: https://doi.org/10.1007/978-3-319-23485-4_60

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23484-7

  • Online ISBN: 978-3-319-23485-4

  • eBook Packages: Computer Science (R0)