Abstract
Extract-Transform-Load (ETL) workflows are data-centric workflows responsible for transferring, cleaning, and loading data from their respective sources to the warehouse. Previous research has identified graph-based techniques that construct the blueprints for the structure of such workflows. In this paper, we extend existing results by explicitly incorporating the internal semantics of each activity in the workflow graph. Apart from the value that blueprints have per se, we exploit our modeling to introduce rigorous techniques for the measurement of ETL workflows. To this end, we build upon an existing formal framework for software quality metrics and formally prove how our quality measures fit within this framework.
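The abstract does not reproduce the paper's graph definitions or its measures, so the following is only an illustrative sketch of the general idea: an ETL workflow treated as a directed graph whose nodes are recordsets and activities, with simple structural quantities computed over it. The node names, the edge list, and the functions `size`, `degrees`, and `longest_path_length` are all hypothetical and are not taken from the paper; the workflow is assumed to be acyclic.

```python
from functools import lru_cache

# Hypothetical toy workflow (not from the paper): nodes are recordsets or
# activities, edges point from a data provider to its consumer.
nodes = {
    "SRC_ORDERS": "recordset",
    "not_null_check": "activity",
    "currency_convert": "activity",
    "DW_ORDERS": "recordset",
}
edges = [
    ("SRC_ORDERS", "not_null_check"),
    ("not_null_check", "currency_convert"),
    ("currency_convert", "DW_ORDERS"),
]


def size(nodes):
    """Illustrative size measure: the number of nodes in the graph."""
    return len(nodes)


def degrees(nodes, edges):
    """Return (in_degree, out_degree) dictionaries for every node."""
    in_deg = {n: 0 for n in nodes}
    out_deg = {n: 0 for n in nodes}
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    return in_deg, out_deg


def longest_path_length(nodes, edges):
    """Illustrative length measure: edges on the longest path (assumes a DAG)."""
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)

    @lru_cache(maxsize=None)
    def depth(node):
        # Longest chain of edges starting at this node.
        return max((1 + depth(nxt) for nxt in successors[node]), default=0)

    return max((depth(n) for n in nodes), default=0)


in_deg, out_deg = degrees(nodes, edges)
print("size:", size(nodes))
print("longest path:", longest_path_length(nodes, edges))
print("in-degree of DW_ORDERS:", in_deg["DW_ORDERS"])
```

Whether quantities like these satisfy the properties required by a formal measurement framework is exactly the kind of question the paper addresses; the sketch makes no such claim.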
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vassiliadis, P., Simitsis, A., Terrovitis, M., Skiadopoulos, S. (2005). Blueprints and Measures for ETL Workflows. In: Delcambre, L., Kop, C., Mayr, H.C., Mylopoulos, J., Pastor, O. (eds) Conceptual Modeling – ER 2005. ER 2005. Lecture Notes in Computer Science, vol 3716. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11568322_25
DOI: https://doi.org/10.1007/11568322_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29389-7
Online ISBN: 978-3-540-32068-5
eBook Packages: Computer Science, Computer Science (R0)