Managing Uncertainty in Schema Matching with Top-K Schema Mappings

  • Avigdor Gal
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4090)


In this paper, we propose to extend current practice in schema matching with the simultaneous use of top-K schema mappings rather than a single best mapping. This is a natural extension of existing methods (which can be considered to fall into the top-1 category), taking into account the imprecision inherent in the schema matching process. The essence of this method is the simultaneous generation and examination of the K best schema mappings to identify useful mappings. The paper discusses efficient methods for generating top-K mappings and proposes a generic methodology for the simultaneous utilization of top-K mappings. We also propose a concrete heuristic that aims at improving precision at the cost of recall. We have tested the heuristic on real as well as synthetic data and analyze the empirical results.

The novelty of this paper lies in the robust extension of existing methods for schema matching, one that can gracefully accommodate less-than-perfect scenarios in which the exact mapping cannot be identified in a single iteration. Our proposal represents a step forward in achieving fully automated schema matching, which is currently semi-automated at best.
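
To make the top-K idea concrete, the following is a minimal Python sketch and not the paper's algorithm. It assumes a small similarity matrix `sim` (hypothetical values) in which `sim[i][j]` scores the match between source attribute `i` and target attribute `j`, scores a 1:1 mapping by the sum of its pairwise similarities, and enumerates all mappings by brute force as a stand-in for the efficient generation methods the paper discusses. The function names `top_k_mappings` and `stable_correspondences` are illustrative. The second function shows one possible precision-oriented heuristic, assumed here for illustration: keep only the attribute correspondences that appear in every one of the K best mappings, which discards uncertain pairs (lower recall) in exchange for fewer false positives.

```python
import heapq
from itertools import permutations


def top_k_mappings(sim, k):
    """Return the k highest-scoring 1:1 mappings as (score, mapping) pairs.

    sim[i][j] is the similarity of source attribute i and target attribute j.
    A mapping is a tuple m where m[i] is the target matched to source i.
    Brute-force enumeration: only suitable for tiny schemas (assumption:
    the number of source attributes does not exceed the number of targets).
    """
    n_src, n_tgt = len(sim), len(sim[0])
    scored = (
        (sum(sim[i][j] for i, j in enumerate(m)), m)
        for m in permutations(range(n_tgt), n_src)
    )
    return heapq.nlargest(k, scored)


def stable_correspondences(mappings):
    """Keep only (source, target) pairs present in every given mapping."""
    pair_sets = [set(enumerate(m)) for _, m in mappings]
    return set.intersection(*pair_sets)


# Hypothetical similarity matrix for two schemas with three attributes each.
sim = [
    [0.9, 0.1, 0.2],
    [0.2, 0.6, 0.5],
    [0.1, 0.5, 0.6],
]

top = top_k_mappings(sim, k=2)
stable = stable_correspondences(top)
print([m for _, m in top])  # the two best mappings, best first
print(stable)               # pairs on which both mappings agree
```

In this example the two best mappings disagree on how to match the second and third attributes, whose similarity values are close, so only the first attribute's correspondence survives the intersection.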


Bipartite Graph, Attribute Mapping, Good Mapping, Exact Mapping, Schema Match
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Avigdor Gal
    • Technion – Israel Institute of Technology, Haifa, Israel
