Abstract
Experience suggests that fully automated schema matching is infeasible, especially for n-to-m matches involving semantic functions. A matching algorithm should therefore not only do as much as possible automatically, but also accurately identify the critical points where user input is maximally useful. Our matching algorithm combines several existing approaches, with a new emphasis on using the context provided by the way elements are embedded in paths. A prototype tested on biological data (gene sequence, DNA, RNA, etc.) and on bibliographic data shows significant performance improvements from user feedback and context checking. In non-interactive mode on the purchase order schemas, it compares favorably with COMA, the most mature schema matching system in the literature, and also correctly identifies critical points for user input.
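The two ideas named in the abstract, path context and critical points, can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes, purely for illustration, that schema elements are given as slash-separated paths, that similarity is a weighted blend of leaf-name and enclosing-path string similarity, and that a "critical point" is an element whose two best candidates score within a small margin of each other. The names `path_sim`, `match`, `context_weight`, and `margin` are hypothetical.

```python
from difflib import SequenceMatcher


def name_sim(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def path_sim(p: str, q: str, context_weight: float = 0.5) -> float:
    """Blend leaf-name similarity with similarity of the enclosing path.

    Assumption: context is simply the path with its last step removed.
    """
    *p_ctx, p_leaf = p.split("/")
    *q_ctx, q_leaf = q.split("/")
    leaf = name_sim(p_leaf, q_leaf)
    ctx = name_sim("/".join(p_ctx), "/".join(q_ctx))
    return (1 - context_weight) * leaf + context_weight * ctx


def match(source, target, context_weight: float = 0.5, margin: float = 0.1):
    """Return (best match per source path, source paths needing user input).

    Assumption: a source path is flagged as critical when its best and
    second-best candidates differ in score by less than `margin`.
    """
    matches, critical = {}, []
    for s in source:
        scored = sorted(
            ((path_sim(s, t, context_weight), t) for t in target),
            key=lambda pair: pair[0],
            reverse=True,
        )
        matches[s] = scored[0][1]
        if len(scored) > 1 and scored[0][0] - scored[1][0] < margin:
            critical.append(s)
    return matches, critical


if __name__ == "__main__":
    # Hypothetical purchase-order style paths, not taken from the paper.
    src = ["order/billTo/city", "order/shipTo/city"]
    tgt = ["po/billing/city", "po/shipping/city"]
    print(match(src, tgt))
```

With `context_weight=0.0` the two `city` targets are indistinguishable for either source element, so both source paths are flagged for user input. Raising the weight lets the enclosing path contribute to the ranking, which is the role the abstract assigns to context checking.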
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, G., Goguen, J., Nam, YK., Lin, K. (2004). Critical Points for Interactive Schema Matching. In: Yu, J.X., Lin, X., Lu, H., Zhang, Y. (eds) Advanced Web Technologies and Applications. APWeb 2004. Lecture Notes in Computer Science, vol 3007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24655-8_71
DOI: https://doi.org/10.1007/978-3-540-24655-8_71
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21371-0
Online ISBN: 978-3-540-24655-8
eBook Packages: Springer Book Archive