A User-Guided Approach for Large-Scale Multi-schema Integration

  • Muhammad Wasimullah Khan
  • Jelena Zdravkovic
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 134)

Abstract

Schema matching plays an important role in various fields of enterprise system modeling and integration, such as databases, business intelligence, knowledge management, and interoperability. The matching problem is that of finding the semantic correspondences between two or more schemas. Most research in schema and ontology matching focuses on pairwise matching, where two schemas are compared at a time. While a few semi-automatic approaches have recently been proposed in pairwise matching to involve the user, current multi-schema approaches mainly rely on statistical information in order to avoid user interaction, which is largely limited to parameter tuning. In this study, we propose a user-guided iterative approach for large-scale multi-schema integration. Given n schemas, the goal is to match schema elements iteratively and demonstrate that the learning approach results in improved accuracy across iterations. The research was conducted at SAP Research Karlsruhe, followed by an evaluation using large e-business schemas. The evaluation results demonstrated an improvement in the accuracy of matching proposals based on the user’s involvement, as well as an easier accomplishment of a unified data model.

Keywords

Schema Integration, Business Intelligence, System Interoperability

Copyright information

© IFIP International Federation for Information Processing 2012

Authors and Affiliations

  • Muhammad Wasimullah Khan (1)
  • Jelena Zdravkovic (2)
  1. School of Information and Communication Technology, Royal Institute of Technology, Sweden
  2. Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden
