Towards a Holistic Schema Matching Approach Designed for Large-Scale Schemas

  • Conference paper
Computational Collective Intelligence (ICCCI 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12496)

Abstract

Holistic schema matching is a fundamental challenge in big data integration. Ideally, clusters of semantically corresponding elements are created and then updated as more schemas are matched. A high-quality holistic schema matching approach is critical for two main reasons: first, it identifies as many accurate holistic semantic correspondences as possible from the outset; second, it considerably reduces the search space. The problem is nevertheless challenging because overlapping schema elements are not available in advance. Identifying schema overlaps is further complicated by two factors: (1) the number of schemas is large; and (2) overlaps vary from one schema to another. In this paper, we present HMO, a Holistic schema Matching approach based on schema Overlaps and designed for large-scale schemas. HMO balances the size of the search space against the quality of the holistic semantic correspondences. To narrow the search space, HMO matches schemas based on their overlaps. To obtain high accuracy, HMO uses an existing high-quality semantic similarity measure. Experimental results on four real-world domains show the effectiveness and scalability of our matching approach.
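To make the idea of clusters that are "created and updated as more schemas are matched" concrete, here is a minimal sketch of incremental holistic clustering. It is not HMO itself: the abstract does not describe how HMO estimates overlaps or which similarity measure it uses. The function names, the underscore-token Jaccard similarity, and the 0.5 threshold are all illustrative assumptions.

```python
from typing import Callable, Dict, List, Tuple

# An element is identified by (schema name, element label).
Element = Tuple[str, str]


def token_jaccard(a: str, b: str) -> float:
    """Placeholder similarity (an assumption, NOT the measure used by HMO):
    Jaccard overlap of the underscore-separated, lowercased tokens."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def match_holistically(
    schemas: Dict[str, List[str]],
    similarity: Callable[[str, str], float] = token_jaccard,
    threshold: float = 0.5,  # hypothetical cut-off, chosen for illustration
) -> List[List[Element]]:
    """Process schemas one at a time, adding each element to the most
    similar existing cluster or opening a new cluster for it."""
    clusters: List[List[Element]] = []
    for schema_name, labels in schemas.items():
        for label in labels:
            best, best_score = None, threshold
            for cluster in clusters:
                # Score against a cluster = best score against any member.
                score = max(similarity(label, other) for _, other in cluster)
                if score >= best_score:
                    best, best_score = cluster, score
            if best is None:
                clusters.append([(schema_name, label)])
            else:
                best.append((schema_name, label))
    return clusters


if __name__ == "__main__":
    example = {
        "s1": ["author_name", "title"],
        "s2": ["name_author", "book_title"],
        "s3": ["price"],
    }
    for cluster in match_holistically(example):
        print(cluster)
```

Each incoming element is compared against existing clusters rather than against every element of every other schema, which is one simple way to keep the number of comparisons down as the number of schemas grows. A real system would replace `token_jaccard` with a proper semantic similarity measure.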

Notes

  1. http://metaquerier.cs.uiuc.edu/repository

Author information

Corresponding author

Correspondence to Aola Yousfi.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Yousfi, A., Yazidi, M.H.E., Zellou, A. (2020). Towards a Holistic Schema Matching Approach Designed for Large-Scale Schemas. In: Nguyen, N.T., Hoang, B.H., Huynh, C.P., Hwang, D., Trawiński, B., Vossen, G. (eds) Computational Collective Intelligence. ICCCI 2020. Lecture Notes in Computer Science, vol 12496. Springer, Cham. https://doi.org/10.1007/978-3-030-63007-2_1

  • DOI: https://doi.org/10.1007/978-3-030-63007-2_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63006-5

  • Online ISBN: 978-3-030-63007-2

  • eBook Packages: Computer Science, Computer Science (R0)
