Journal of Computer Science and Technology, Volume 27, Issue 6, pp 1222–1232

KnoE: A Web Mining Tool to Validate Previously Discovered Semantic Correspondences

  • Jorge Martinez-Gil
  • José F. Aldana-Montes
Regular Paper


The problem of matching schemas or ontologies consists of providing corresponding entities in two or more knowledge models that belong to the same domain but have been developed separately. Many techniques and tools now address this problem; however, the complex nature of the matching problem makes existing solutions not fully satisfactory in real situations. The Google Similarity Distance has appeared recently. Its purpose is to mine knowledge from the Web using the Google search engine in order to semantically compare text expressions. Our work consists of developing a software application for validating results discovered by schema and ontology matching tools using the philosophy behind this distance. Moreover, we are interested in using not only Google but also other popular search engines with this similarity distance. The results reveal three main facts. Firstly, some web search engines can help us to validate semantic correspondences satisfactorily. Secondly, there are significant differences among the web search engines. Thirdly, the best results are obtained when using combinations of the web search engines that we have studied.
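As a rough illustration of the idea behind the similarity distance mentioned above, the following sketch computes the normalized distance of Cilibrasi and Vitányi from search-engine hit counts. It is not the KnoE implementation: the function names, the way hit counts are supplied, and the plain averaging over engines are assumptions made for this example only. The formula is NGD(x, y) = (max(log f(x), log f(y)) − log f(x, y)) / (log N − min(log f(x), log f(y))), where f(x) and f(y) are the hit counts for each term, f(x, y) is the hit count for both terms together, and N is the (estimated) number of pages indexed by the engine.

```python
import math


def normalized_web_distance(hits_x, hits_y, hits_xy, total_pages):
    """Normalized distance between two terms from raw hit counts.

    hits_x, hits_y: hit counts for each term alone.
    hits_xy: hit count for both terms together.
    total_pages: estimated index size of the engine (an assumed input).
    """
    if hits_x <= 0 or hits_y <= 0 or hits_xy <= 0:
        # No evidence that the terms occur (together): treat as unrelated.
        return float("inf")
    log_x, log_y, log_xy = math.log(hits_x), math.log(hits_y), math.log(hits_xy)
    numerator = max(log_x, log_y) - log_xy
    denominator = math.log(total_pages) - min(log_x, log_y)
    return numerator / denominator


def combined_distance(counts_per_engine):
    """Average the distance over several engines (hypothetical combination rule).

    counts_per_engine maps an engine name to a tuple
    (hits_x, hits_y, hits_xy, total_pages).
    """
    distances = [normalized_web_distance(*c) for c in counts_per_engine.values()]
    finite = [d for d in distances if math.isfinite(d)]
    return sum(finite) / len(finite) if finite else float("inf")
```

A value near 0 suggests that the two expressions almost always appear together, so a correspondence between them would be supported; larger values suggest weaker support. In practice the hit counts would come from each search engine's query interface, which is deliberately left out of this sketch.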


database integration; data and knowledge engineering; similarity distance




Copyright information

© Springer Science+Business Media New York & Science Press, China 2012

Authors and Affiliations

  1. Department of Computer Languages and Computing Sciences, University of Málaga, Málaga, Spain
