Matching Strategies

  • Jérôme Euzenat
  • Pavel Shvaiko

Abstract

The basic techniques presented in Chap. 5 and the global techniques provided in Chap. 6 are the building blocks on which a matching system is built. Once the similarity or dissimilarity between ontology entities is available, the alignment remains to be computed. This involves more comprehensive treatments. In particular, the following aspects of building a working matching system are considered in this chapter:
  • preparing, if necessary, to handle large scale ontologies (Sect. 7.1.1),

  • organising the combination of various similarities or matching algorithms (Sect. 7.2),

  • exploiting background knowledge sources (Sect. 7.3),

  • aggregating the results of the basic methods in order to compute the compound similarity between entities (Sect. 7.4),

  • learning matchers from data (Sect. 7.5) and tuning them (Sect. 7.6),

  • extracting alignments from the resulting (dis)similarity: indeed, different alignments with different characteristics may be extracted from the same (dis)similarity (Sect. 7.7),

  • improving alignments through disambiguation, debugging and repair (Sect. 7.8).
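
As a rough illustration of the aggregation and extraction steps listed above, the following minimal sketch combines two similarity tables by a normalised weighted sum and then extracts two different alignments from the same combined similarity. All entity names, similarity values, weights and function names are hypothetical, and the greedy one-to-one extraction is only a simple stand-in chosen for brevity, not necessarily the method described in the chapter.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]
Sim = Dict[Pair, float]


def weighted_sum(sims: List[Sim], weights: List[float]) -> Sim:
    """Aggregate similarity tables with a normalised weighted sum.

    A pair missing from a table is treated as having similarity 0.0
    (an assumption of this sketch).
    """
    total = sum(weights)
    pairs = set().union(*sims)
    return {
        p: sum(w * s.get(p, 0.0) for s, w in zip(sims, weights)) / total
        for p in pairs
    }


def extract_threshold(sim: Sim, threshold: float) -> List[Pair]:
    """Keep every pair whose similarity reaches the threshold (many-to-many)."""
    return sorted(p for p, v in sim.items() if v >= threshold)


def extract_greedy_one_to_one(sim: Sim, threshold: float) -> List[Pair]:
    """Pick pairs by decreasing similarity, using each entity at most once."""
    used_left, used_right = set(), set()
    result: List[Pair] = []
    for (a, b), v in sorted(sim.items(), key=lambda kv: (-kv[1], kv[0])):
        if v < threshold:
            break  # remaining pairs are all below the threshold
        if a in used_left or b in used_right:
            continue  # one of the two entities is already matched
        result.append((a, b))
        used_left.add(a)
        used_right.add(b)
    return sorted(result)


# Hypothetical outputs of two basic matchers (values are made up).
name_sim: Sim = {
    ("Book", "Volume"): 0.4,
    ("Book", "Publication"): 0.6,
    ("Author", "Writer"): 0.5,
    ("Author", "Publication"): 0.2,
}
struct_sim: Sim = {
    ("Book", "Volume"): 0.8,
    ("Book", "Publication"): 0.8,
    ("Author", "Writer"): 0.9,
    ("Author", "Publication"): 0.2,
}

combined = weighted_sum([name_sim, struct_sim], [1.0, 1.0])
many_to_many = extract_threshold(combined, 0.5)
one_to_one = extract_greedy_one_to_one(combined, 0.5)
print(many_to_many)
print(one_to_one)
```

With these made-up values, the threshold extraction keeps both candidate correspondences for Book, whereas the one-to-one extraction keeps only the stronger of the two. This is the sense in which the same similarity can give rise to alignments with different characteristics.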

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Jérôme Euzenat (1)
  • Pavel Shvaiko (2)

  1. INRIA and LIG, Grenoble, France
  2. Informatica Trentina SpA, while at Department of Engineering and Computer Science (DISI), University of Trento, while at Web of Data, Bruno Kessler Foundation - IRST, Trento, Italy
