A Survey of Schema-Based Matching Approaches

  • Pavel Shvaiko
  • Jérôme Euzenat
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3730)

Abstract

Schema and ontology matching is a critical problem in many application domains, such as the semantic web, schema/ontology integration, data warehouses, and e-commerce. Many different matching solutions have been proposed so far. In this paper we present a new classification of schema-based matching techniques that builds on the state of the art in both schema and ontology matching. Its innovations lie in new criteria based on (i) general properties of matching techniques, (ii) the interpretation of input information, and (iii) the kind of input information. In particular, we distinguish between approximate and exact techniques at schema-level, and between syntactic, semantic, and external techniques at element- and structure-level. Based on the proposed classification, we give an overview of some recent schema/ontology matching systems, pointing out which part of the solution space they cover. The proposed classification provides a common conceptual basis and hence can be used for comparing existing schema/ontology matching techniques and systems as well as for designing new ones, taking advantage of state-of-the-art solutions.
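
The classification dimensions named in the abstract can be illustrated with a small sketch. It is not taken from the paper: the class and function names are hypothetical, and the string similarity from Python's standard `difflib` module merely stands in for one possible element-level, syntactic technique.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from enum import Enum


class Granularity(Enum):
    # Levels named in the abstract.
    ELEMENT = "element-level"
    STRUCTURE = "structure-level"


class Interpretation(Enum):
    # How the input information is interpreted, as listed in the abstract.
    SYNTACTIC = "syntactic"
    SEMANTIC = "semantic"
    EXTERNAL = "external"


@dataclass(frozen=True)
class Technique:
    # Hypothetical record that places a technique in the classification.
    name: str
    granularity: Granularity
    interpretation: Interpretation


def name_similarity(a: str, b: str) -> float:
    """Assumed stand-in for a string-based technique: compare labels only."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_elements(source, target, threshold=0.7):
    """Return (source, target, score) triples whose score meets the threshold."""
    pairs = []
    for s in source:
        for t in target:
            score = name_similarity(s, t)
            if score >= threshold:
                pairs.append((s, t, round(score, 2)))
    return pairs


string_based = Technique("string-based", Granularity.ELEMENT, Interpretation.SYNTACTIC)
```

Calling `match_elements(["Price"], ["price", "colour"])` keeps only the pair whose labels are close as strings. A semantic or external technique would instead rely on a formal interpretation of the labels or on an outside resource, which this sketch does not attempt.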

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Pavel Shvaiko (1)
  • Jérôme Euzenat (2)
  1. University of Trento, Povo, Trento, Italy
  2. INRIA, Rhône-Alpes, France
