Approaching ETL Processes Specification Using a Pattern-Based Ontology

  • Conference paper
Data Management Technologies and Applications (DATA 2016)

Abstract

Software projects are often developed by composing existing components into new products and components, which promotes reuse. These pre-configured components are sometimes based on well-known, validated design patterns that describe abstract solutions to recurring problems. The data warehouse ETL development life cycle shares its main steps with the typical phases of any software development process. Since patterns have been broadly used in many software areas to increase reliability, reduce development risks and enhance standards compliance, a pattern-oriented approach to the development of ETL systems can be achieved, providing a more flexible approach to ETL implementation. Drawing on an ontology specification, in this paper we present and discuss contextual data for describing ETL patterns based on their structural properties. The use of an ontology allows ETL patterns to be interpreted by a computer and subsequently used to govern their instantiation into physical models that can be executed using existing commercial tools.

Author information

Corresponding author

Correspondence to Orlando Belo.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Oliveira, B., Belo, O. (2017). Approaching ETL Processes Specification Using a Pattern-Based Ontology. In: Francalanci, C., Helfert, M. (eds) Data Management Technologies and Applications. DATA 2016. Communications in Computer and Information Science, vol 737. Springer, Cham. https://doi.org/10.1007/978-3-319-62911-7_4

  • DOI: https://doi.org/10.1007/978-3-319-62911-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62910-0

  • Online ISBN: 978-3-319-62911-7

  • eBook Packages: Computer Science (R0)
