Data Mapping Diagrams for Data Warehouse Design with UML

  • Sergio Luján-Mora
  • Panos Vassiliadis
  • Juan Trujillo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3288)

Abstract

In Data Warehouse (DW) scenarios, ETL (Extraction, Transformation, Loading) processes are responsible for extracting data from heterogeneous operational data sources, transforming them (conversion, cleaning, normalization, etc.), and loading them into the DW. In this paper, we present a framework for the design of the DW back-stage (and the respective ETL processes) based on the key observation that this task fundamentally involves dealing with the specificities of information at very low levels of granularity, including transformation rules at the attribute level. Specifically, we present a disciplined framework for modeling the relationships between sources and targets at different levels of granularity (from coarse mappings at the database and table levels to detailed inter-attribute mappings at the attribute level). To accomplish this goal, we extend UML (Unified Modeling Language) to model attributes as first-class citizens. To provide complementary views of the design artifacts at different levels of detail, our framework relies on a principled use of UML packages, which allows zooming in and out of the design of a scenario.
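
The idea of mapping sources to targets at several levels of granularity can be illustrated with a minimal sketch. This is not the UML notation the paper proposes; it is a plain Python approximation under stated assumptions. The class `AttributeMapping`, the function `zoom_out`, the dotted `database.table.attribute` path convention, and all table and attribute names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeMapping:
    """One attribute-level mapping (hypothetical structure, not from the paper)."""
    source: str  # assumed path convention: "database.table.attribute"
    target: str  # same convention on the DW side
    rule: str    # free-text transformation rule


def zoom_out(mappings, level):
    """Collapse attribute-level mappings to 'database', 'table', or 'attribute' granularity."""
    depth = {"database": 1, "table": 2, "attribute": 3}[level]

    def prefix(path):
        # Keep only the leading path components for the requested level.
        return ".".join(path.split(".")[:depth])

    # A set removes duplicates, so many attribute mappings become one coarse mapping.
    pairs = {(prefix(m.source), prefix(m.target)) for m in mappings}
    return sorted(pairs)


# Illustrative data only: source and target names are invented.
mappings = [
    AttributeMapping("ops.customers.first_name", "dw.dim_customer.full_name",
                     "concatenate with last_name"),
    AttributeMapping("ops.customers.last_name", "dw.dim_customer.full_name",
                     "concatenate with first_name"),
    AttributeMapping("ops.orders.amount", "dw.fact_sales.amount_eur",
                     "convert currency to EUR"),
]

if __name__ == "__main__":
    for level in ("database", "table", "attribute"):
        print(level, zoom_out(mappings, level))
```

Storing the mappings once at the finest level and deriving the coarser views from them keeps the views consistent with each other, which loosely mirrors the zooming in and out described above.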

Keywords

data mapping, ETL, data warehouse, UML

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Sergio Luján-Mora (1)
  • Panos Vassiliadis (2)
  • Juan Trujillo (1)
  1. Dept. of Software and Computing Systems, University of Alicante, Spain
  2. Dept. of Computer Science, University of Ioannina, Hellas
