Computing a Canonical Hierarchical Schema

  • Jens Lemcke
  • Gunther Stuhec
  • Michael Dietrich
Conference paper
Part of the Proceedings of the I-ESA Conferences book series (IESACONF, volume 5)


We present a novel approach to constructing a canonical data model from a set of hierarchical schemas. The canonical data model is a well-known pattern for enterprise integration and a key enabler for many business applications such as business warehousing, business intelligence, data cleansing, and sustainable business-to-business integration. Once the correspondences between schemas have been determined by applying existing schema or ontology matching, the task of building the overarching canonical schema remains. A canonical schema must be able to integrate extremely different and even conflicting structures. Furthermore, the schema should exhibit the most commonly used structures of the sources and be stable with respect to the order in which the sources are imported. Because of these requirements, manual construction is cumbersome and error-prone, and it becomes a major cost driver of integration projects. Our approach models the task as finding an optimal solution of a constraint satisfaction problem. Our comparison with manual integration shows that our prototype quickly reduces human effort by multiple person days as the size of the integration task grows. With our techniques as a baseline, data models of enterprise applications can be converged and kept in sync to reduce integration costs in the long run.
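As a rough illustration of what "an optimal solution of a constraint satisfaction problem" can mean here, the following minimal sketch merges toy schemas by brute force. It is not the formulation used in the paper, which the abstract does not spell out. It assumes that schema matching has already mapped every element to a shared concept name, that each source schema is given as a child-to-parent mapping, that the constraint is "the result is a single rooted tree", and that the objective is "agree with as many source edges as possible". The names `canonical_parents`, `is_tree`, and the sample schemas are hypothetical.

```python
from itertools import product


def is_tree(assignment):
    """Constraint: exactly one root, and every node reaches it without a cycle."""
    roots = [n for n, p in assignment.items() if p is None]
    if len(roots) != 1:
        return False
    for node in assignment:
        seen = set()
        while node is not None:
            if node in seen or node not in assignment:
                return False  # cycle, or parent that is not a known node
            seen.add(node)
            node = assignment[node]
    return True


def canonical_parents(sources):
    """Pick one parent per node so the result is a tree with maximal source support.

    sources: list of dicts mapping child -> parent (None marks the root).
    Returns (assignment, score), where score counts the supporting source edges.
    """
    # Count how many sources propose each (child, parent) edge.
    votes = {}
    for schema in sources:
        for child, parent in schema.items():
            votes.setdefault(child, {})
            votes[child][parent] = votes[child].get(parent, 0) + 1

    nodes = sorted(votes)
    # Candidate parents per node, sorted so the result does not depend on import order.
    domains = [sorted(votes[n], key=lambda p: (p is None, p or "")) for n in nodes]

    best, best_score = None, -1
    for choice in product(*domains):  # exhaustive search; fine only for toy inputs
        assignment = dict(zip(nodes, choice))
        if not is_tree(assignment):
            continue
        score = sum(votes[n][p] for n, p in assignment.items())
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score


# Hypothetical sources: two nest Address under Buyer, one nests Buyer under Address.
s1 = {"Order": None, "Buyer": "Order", "Address": "Buyer"}
s2 = {"Order": None, "Address": "Order", "Buyer": "Address"}
s3 = {"Order": None, "Buyer": "Order", "Address": "Buyer"}

result = canonical_parents([s1, s2, s3])
reversed_result = canonical_parents([s3, s2, s1])
print(result)
```

In this toy setting the conflicting structure from `s2` is outvoted, the cyclic combination (Buyer under Address and Address under Buyer) is rejected by the tree constraint, and importing the sources in reverse order yields the same result. Exhaustive enumeration grows exponentially with the number of nodes, so a realistic implementation would need a proper constraint solver or search heuristics.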


Enterprise application integration · Enterprise information integration · e-business standard · Canonical data model



Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  • Jens Lemcke (1)
  • Gunther Stuhec (2)
  • Michael Dietrich (1)
  1. SAP Research Karlsruhe, Karlsruhe, Germany
  2. SAP AG, Walldorf, Germany
