Blueprints and Measures for ETL Workflows

  • Conference paper
Conceptual Modeling – ER 2005 (ER 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3716)

Abstract

Extract-Transform-Load (ETL) workflows are data-centric workflows responsible for transferring, cleaning, and loading data from their respective sources to the data warehouse. Previous research has identified graph-based techniques for constructing blueprints of the structure of such workflows. In this paper, we extend these results by explicitly incorporating the internal semantics of each activity in the workflow graph. Beyond the value that blueprints have per se, we exploit our modeling to introduce rigorous techniques for the measurement of ETL workflows. To this end, we build upon an existing formal framework for software quality metrics and formally prove how our quality measures fit within this framework.
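
The abstract does not show the paper's actual graph model or measures, so the following is only a minimal sketch of the general idea: an ETL workflow represented as a directed graph, with a candidate size measure checked against size properties in the style of property-based measurement frameworks (non-negativity, null value, additivity over disjoint modules). The `Workflow` class, the `size` function, and all node names are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical ETL workflow graph (an assumption, not the paper's model).

    Nodes stand for data stores or activities; edges are
    (provider, consumer) pairs describing the data flow.
    """
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def add_edge(self, provider, consumer):
        self.nodes.update((provider, consumer))
        self.edges.add((provider, consumer))

    def union(self, other):
        return Workflow(self.nodes | other.nodes, self.edges | other.edges)


def size(wf):
    """Illustrative size measure: the number of nodes (an assumption)."""
    return len(wf.nodes)


# Two disjoint sub-workflows with made-up node names.
extract = Workflow()
extract.add_edge("source_orders", "not_null_check")
extract.add_edge("not_null_check", "staging_orders")

load = Workflow()
load.add_edge("currency_conversion", "dw_orders")

# Check the three size properties on this example.
assert size(extract) >= 0 and size(load) >= 0           # non-negativity
assert size(Workflow()) == 0                            # null value
assert extract.nodes.isdisjoint(load.nodes)
assert size(extract.union(load)) == size(extract) + size(load)  # additivity
```

Passing these checks on one example does not prove that a measure satisfies the properties in general; a formal proof, as the abstract describes, would argue over all workflow graphs.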

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vassiliadis, P., Simitsis, A., Terrovitis, M., Skiadopoulos, S. (2005). Blueprints and Measures for ETL Workflows. In: Delcambre, L., Kop, C., Mayr, H.C., Mylopoulos, J., Pastor, O. (eds) Conceptual Modeling – ER 2005. ER 2005. Lecture Notes in Computer Science, vol 3716. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11568322_25

  • DOI: https://doi.org/10.1007/11568322_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29389-7

  • Online ISBN: 978-3-540-32068-5

  • eBook Packages: Computer Science (R0)
