KnoE: A Web Mining Tool to Validate Previously Discovered Semantic Correspondences


The problem of matching schemas or ontologies consists of identifying corresponding entities in two or more knowledge models that belong to the same domain but have been developed separately. Many techniques and tools now address this problem; however, the complex nature of the matching problem means that existing solutions are not fully satisfactory in real situations. The Google Similarity Distance has appeared recently. Its purpose is to mine knowledge from the Web using the Google search engine in order to semantically compare text expressions. Our work consists of developing a software application for validating results discovered by schema and ontology matching tools using the philosophy behind this distance. Moreover, we are interested in using this similarity distance not only with Google but also with other popular search engines. The results reveal three main facts. Firstly, some web search engines can help us to validate semantic correspondences satisfactorily. Secondly, there are significant differences among the web search engines. Thirdly, the best results are obtained when using combinations of the web search engines that we have studied.
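
As a rough illustration of the idea behind this distance, the sketch below computes the Normalized Google Distance as it is commonly defined in the literature, from the number of hits a search engine returns for each expression alone and for both together. It is a minimal sketch and not the KnoE implementation: the function name, its parameters, and the hit counts in the usage lines are hypothetical, and no search engine is actually queried.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Normalized Google Distance between two expressions x and y.

    fx, fy -- hit counts for x alone and for y alone
    fxy    -- hit count for pages containing both x and y
    n      -- assumed total number of indexed pages (must exceed fx and fy)
    """
    # Without any hits, or without co-occurrence, the distance is
    # treated as infinite (the expressions look unrelated).
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Hypothetical hit counts, chosen only to exercise the formula.
always_together = normalized_google_distance(1000, 1000, 1000, 10**6)
rarely_together = normalized_google_distance(1000, 100, 10, 10**6)
```

Lower values suggest that two expressions tend to appear on the same pages, which is the kind of evidence that could support a correspondence proposed by a matching tool. Supplying hit counts from a different search engine only changes the inputs, not the formula.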

This work was supported by the Spanish Ministry of Innovation and Science through REALIDAD: Gestion, Analisis y Explotacion Eficiente de Datos Vinculados under Grant No. TIN2011-25840.

Martinez-Gil, J., Aldana-Montes, J.F. KnoE: A Web Mining Tool to Validate Previously Discovered Semantic Correspondences. J. Comput. Sci. Technol. 27, 1222–1232 (2012).

  • database integration
  • data and knowledge engineering
  • similarity distance