Abstract
We address the problem of automating the process of deciding whether two data schema elements match (that is, refer to the same actual object or concept), and propose several methods for combining evidence computed by multiple basic matchers. One class of methods uses Bayesian networks to account for the conditional dependency between the similarity values produced by individual matchers that use the same or similar information, so as to avoid overconfidence in match probability estimates and improve the accuracy of matching. Another class of methods relies on optimization switches that mitigate this dependency in a domain-independent manner. Experimental results under several testing protocols suggest that the matching accuracy of the Bayesian composite matchers can significantly exceed that of the individual component matchers, and the careful selection of optimization switches can improve matching accuracy even further.
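The overconfidence effect mentioned above can be illustrated with a small numerical sketch. It is not the chapter's model: the prior, the two discretized matchers ("high" versus "low" similarity), and every probability below are hypothetical values chosen only to show how treating two dependent matchers as independent inflates the match probability, whereas a small Bayesian network with an edge between the matchers does not.

```python
# Illustrative sketch only: all numbers are assumed, not taken from the chapter.
# Two matchers report "high" similarity; matcher 2 reuses much of matcher 1's
# information, so its output depends on matcher 1's output as well as on the
# true match status.

PRIOR_MATCH = 0.1  # assumed prior probability that two elements match

# P(s1 = high | match status)
P_S1_HIGH = {True: 0.8, False: 0.2}

# P(s2 = high | match status, s1 is high?) -- the dependency between matchers
P_S2_HIGH = {
    (True, True): 0.95,
    (True, False): 0.20,
    (False, True): 0.90,
    (False, False): 0.025,
}


def marginal_s2_high(match: bool) -> float:
    """P(s2 = high | match status), with s1 summed out."""
    p_s1 = P_S1_HIGH[match]
    return p_s1 * P_S2_HIGH[(match, True)] + (1 - p_s1) * P_S2_HIGH[(match, False)]


def normalize(score_match: float, score_non_match: float) -> float:
    """Turn two unnormalized scores into P(match | evidence)."""
    return score_match / (score_match + score_non_match)


def naive_posterior() -> float:
    """P(match | both high), wrongly assuming the matchers are independent."""
    score = {
        m: prior * P_S1_HIGH[m] * marginal_s2_high(m)
        for m, prior in ((True, PRIOR_MATCH), (False, 1 - PRIOR_MATCH))
    }
    return normalize(score[True], score[False])


def network_posterior() -> float:
    """P(match | both high), using the network match -> s1, (match, s1) -> s2."""
    score = {
        m: prior * P_S1_HIGH[m] * P_S2_HIGH[(m, True)]
        for m, prior in ((True, PRIOR_MATCH), (False, 1 - PRIOR_MATCH))
    }
    return normalize(score[True], score[False])


if __name__ == "__main__":
    print(f"naive (independence assumed): {naive_posterior():.3f}")
    print(f"network (dependency modeled): {network_posterior():.3f}")
```

With these assumed numbers, the independence assumption counts essentially the same evidence twice and reports a much higher match probability than the network that models the dependency between the two matchers.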
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nikovski, D., Esenther, A., Ye, X., Shiba, M., Takayama, S. (2013). Matcher Composition Methods for Automatic Schema Matching. In: Cordeiro, J., Maciaszek, L.A., Filipe, J. (eds) Enterprise Information Systems. Lecture Notes in Business Information Processing, vol 141. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40654-6_7
DOI: https://doi.org/10.1007/978-3-642-40654-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40653-9
Online ISBN: 978-3-642-40654-6
eBook Packages: Computer Science, Computer Science (R0)