Evaluation of Terminological Schema Matching and Its Implications for Schema Mapping

  • Sarawat Anam
  • Yang Sok Kim
  • Byeong Ho Kang
  • Qing Liu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8862)

Abstract

Large amounts of schema data, which describe the data structures of various domains such as purchase orders, health, publications, geography, agriculture, environment and music, have recently become available on the Web. Schema mapping aims to solve the schema heterogeneity problem in schema data. This research thoroughly examines how string similarity metrics and text processing techniques affect the performance of terminological schema mapping and highlights their limitations. Our experimental study demonstrates that text processing techniques significantly improve the performance of terminological schema matching. However, the improvement differs slightly between datasets because of their characteristics, and even after all text processing techniques are applied, some datasets still exhibit low performance. Our research supports the claim that a system which can manage the context-dependent characteristics of terminological schema matching is essential for better schema mapping algorithms.
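
As an illustration of why text processing can change the outcome of string-based matching, the following minimal sketch compares two schema element names before and after a simple normalisation step. It is not the method evaluated in the paper: the choice of a normalised Levenshtein similarity, the tokenisation rules, and the function names (tokenize, levenshtein, similarity, normalized_similarity) are all assumptions made for this example.

```python
import re


def tokenize(name):
    """Split a schema element name into lowercase tokens (illustrative rules only)."""
    # Insert a space at lower-to-upper camelCase boundaries, e.g. "orderDate" -> "order Date".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Treat any run of non-alphanumeric characters (underscore, hyphen, dot) as a separator.
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def levenshtein(a, b):
    """Edit distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Levenshtein distance scaled to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def normalized_similarity(a, b):
    """Similarity computed after tokenising and lowercasing both names."""
    return similarity(" ".join(tokenize(a)), " ".join(tokenize(b)))


if __name__ == "__main__":
    # Hypothetical element names, not taken from the paper's datasets.
    left, right = "PurchaseOrder_ID", "purchase-order-id"
    print("raw:       ", similarity(left, right))
    print("normalised:", normalized_similarity(left, right))
```

In this sketch the two names differ only in case and delimiters, so the normalised score reaches 1.0 while the raw score does not. Names that differ in vocabulary (synonyms, abbreviations) would not be helped by this kind of normalisation, which is consistent with the abstract's point that some datasets remain hard after text processing.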

Keywords

Schema mapping, terminological schema matching, string metrics

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sarawat Anam (1, 2)
  • Yang Sok Kim (1)
  • Byeong Ho Kang (1)
  • Qing Liu (2)
  1. School of Computing and Information Systems, University of Tasmania, Sandy Bay, Australia
  2. Intelligent Sensing and Systems Laboratory, CSIRO Computational Informatics, Hobart, Australia
