Abstract
Schema matching is a key task in several applications such as data integration and ontology engineering. These applications all require matching several schemas at once, a task known as "holistic matching", but the difficulty of the problem has drawn far more attention to pairwise schema matching. In this paper, we propose a new approach for holistic matching. We model the problem with techniques borrowed from the field of combinatorial optimization. We propose a linear program, named LP4HM, which extends the maximum-weighted graph matching problem with additional linear constraints. These constraints comprise matching setup constraints, in particular cardinality and threshold constraints, and schema structural constraints, in particular superclass/subclass and coherence constraints. The matching quality of LP4HM is evaluated on a recent benchmark dedicated to assessing schema matching tools. Experiments show competitive results compared to other tools, in particular for the recall and HSR quality measures.
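To make the idea of a maximum-weighted matching restricted by cardinality and threshold constraints more concrete, the following is a minimal sketch and not the LP4HM model itself. It assumes three toy schemas with hypothetical element names and similarity scores, assumes a 1:1 cardinality (each element matches at most one element in each other schema), and replaces the linear-programming solver with brute-force enumeration, which is only practical at this scale. The structural constraints (superclass/subclass and coherence) are not modelled here.

```python
from itertools import combinations

# Hypothetical similarity scores between elements of three toy schemas A, B, C.
# Each key is a pair of (schema, element) tuples taken from different schemas.
SIM = {
    (("A", "name"), ("B", "full_name")): 0.9,
    (("A", "name"), ("B", "city")): 0.2,
    (("A", "town"), ("B", "city")): 0.8,
    (("A", "town"), ("B", "full_name")): 0.1,
    (("A", "name"), ("C", "title")): 0.6,
    (("A", "town"), ("C", "title")): 0.5,
    (("B", "full_name"), ("C", "title")): 0.7,
    (("B", "city"), ("C", "title")): 0.3,
}


def feasible(selection):
    """Assumed cardinality constraint: an element is matched to at most
    one element of each other schema."""
    seen = set()
    for u, v in selection:
        for elem, other in ((u, v), (v, u)):
            key = (elem, other[0])  # (element, schema of its partner)
            if key in seen:
                return False
            seen.add(key)
    return True


def solve(sim, threshold):
    """Return the feasible set of pairs with maximum total similarity.

    Threshold constraint: pairs scoring below `threshold` are never selected.
    Brute force stands in for an LP solver in this sketch.
    """
    candidates = [pair for pair, score in sim.items() if score >= threshold]
    best, best_weight = (), 0.0
    for k in range(1, len(candidates) + 1):
        for selection in combinations(candidates, k):
            if not feasible(selection):
                continue
            weight = sum(sim[pair] for pair in selection)
            if weight > best_weight:
                best, best_weight = selection, weight
    return set(best), best_weight


if __name__ == "__main__":
    matches, weight = solve(SIM, threshold=0.5)
    for pair in sorted(matches):
        print(pair, SIM[pair])
    print("total similarity:", round(weight, 2))
```

With a threshold of 0.5, the pair (`town`, `title`) survives the threshold but is excluded by the cardinality constraint, because `title` is already matched to the higher-scoring `name` from schema A. In a linear-programming formulation, the same restrictions would appear as inequalities over binary decision variables rather than as a feasibility check.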
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Berro, A., Megdiche, I., Teste, O. (2015). A Linear Program for Holistic Matching: Assessment on Schema Matching Benchmark. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds) Database and Expert Systems Applications. Globe DEXA 2015. Lecture Notes in Computer Science, vol 9262. Springer, Cham. https://doi.org/10.1007/978-3-319-22852-5_33
DOI: https://doi.org/10.1007/978-3-319-22852-5_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22851-8
Online ISBN: 978-3-319-22852-5
eBook Packages: Computer Science, Computer Science (R0)