A Linear Program for Holistic Matching: Assessment on Schema Matching Benchmark

  • Conference paper
Database and Expert Systems Applications (Globe 2015, DEXA 2015)

Abstract

Schema matching is a key task in several applications, such as data integration and ontology engineering. All of these application fields require matching several schemas at once, also known as “holistic matching”, but the difficulty of the problem has drawn far more attention to pairwise schema matching than to the holistic case. In this paper, we propose a new approach to holistic matching that models the problem with techniques borrowed from combinatorial optimization. We propose a linear program, named LP4HM, which extends the maximum-weighted graph matching problem with additional linear constraints. These constraints cover the matching setup, in particular cardinality and threshold constraints, and the schema structure, in particular superclass/subclass and coherence constraints. The matching quality of LP4HM is evaluated on a recent benchmark dedicated to assessing schema matching tools. Experiments show results competitive with other tools, in particular for the recall and HSR quality measures.
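
To make the idea of maximum-weighted matching under cardinality and threshold constraints concrete, here is a minimal sketch in Python. It is not the LP4HM formulation from the paper: it brute-forces a tiny 0-1 program instead of calling an LP solver, it omits the structural (superclass/subclass and coherence) constraints, and the schemas, the similarity table `SIM`, the threshold value, and the function name `holistic_match` are all hypothetical, chosen only for illustration.

```python
from itertools import combinations, product

# Hypothetical toy similarities between element names (symmetric, default 0).
SIM = {
    frozenset(("name", "fullname")): 0.9,
    frozenset(("addr", "address")): 0.8,
    frozenset(("name", "address")): 0.6,
    frozenset(("fullname", "nom")): 0.7,
    frozenset(("address", "adresse")): 0.95,
    frozenset(("name", "nom")): 0.4,
}


def sim(a, b):
    """Look up the toy similarity of two element names."""
    return SIM.get(frozenset((a, b)), 0.0)


def holistic_match(schemas, threshold=0.5):
    """Exhaustively solve a tiny 0-1 matching program over several schemas.

    schemas maps a schema name to its list of element names.
    Returns (best_weight, chosen_edges); each edge is
    ((schema_a, elem_a), (schema_b, elem_b), weight).
    """
    # Threshold constraint: only pairs from two different schemas whose
    # similarity reaches the threshold become candidate edges.
    edges = []
    for (sa, elems_a), (sb, elems_b) in combinations(sorted(schemas.items()), 2):
        for ea, eb in product(elems_a, elems_b):
            w = sim(ea, eb)
            if w >= threshold:
                edges.append(((sa, ea), (sb, eb), w))

    best_weight, best = 0.0, []
    # One binary decision per candidate edge; enumerate all assignments.
    for bits in product((0, 1), repeat=len(edges)):
        chosen = [e for e, b in zip(edges, bits) if b]
        # Cardinality constraint: an element is matched to at most one
        # element of each other schema.
        seen = set()
        feasible = True
        for u, v, _ in chosen:
            for node, other in ((u, v), (v, u)):
                key = (node, other[0])
                if key in seen:
                    feasible = False
                seen.add(key)
        if not feasible:
            continue
        weight = sum(w for _, _, w in chosen)
        if weight > best_weight:
            best_weight, best = weight, chosen
    return best_weight, best


schemas = {
    "A": ["name", "addr"],
    "B": ["fullname", "address"],
    "C": ["nom", "adresse"],
}
weight, matches = holistic_match(schemas, threshold=0.5)
for u, v, w in matches:
    print(u, "<->", v, w)
```

Enumeration is only workable for a handful of candidate edges; the point of stating the problem as a linear program is that a solver can handle realistic sizes and that further constraints can be added as extra linear inequalities.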

Notes

  1. http://www.scotland.gov.uk/Topics/Statistics/Browse/Crime-Justice/TrendData, http://data.gov.uk/dataset/seizures-drugs-england-wales

  2. http://ontosim.gforge.inria.fr/

  3. http://sourceforge.net/projects/simmetrics/

  4. http://secondstring.sourceforge.net/

  5. https://code.google.com/p/ws4j/

  6. https://wordnet.princeton.edu/

Author information

Corresponding author

Correspondence to Imen Megdiche.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Berro, A., Megdiche, I., Teste, O. (2015). A Linear Program for Holistic Matching: Assessment on Schema Matching Benchmark. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds) Database and Expert Systems Applications. Globe 2015, DEXA 2015. Lecture Notes in Computer Science, vol 9262. Springer, Cham. https://doi.org/10.1007/978-3-319-22852-5_33

  • DOI: https://doi.org/10.1007/978-3-319-22852-5_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22851-8

  • Online ISBN: 978-3-319-22852-5

  • eBook Packages: Computer Science, Computer Science (R0)
