Earth Science Informatics, Volume 8, Issue 1, pp 95–102

SEM+: tool for discovering concept mapping in Earth science related domain

  • Jin Guang Zheng
  • Linyun Fu
  • Xiaogang Ma
  • Peter Fox
Research Article


The number of Earth science domain concepts and vocabularies encoded in popular Semantic Web languages such as OWL and SKOS is growing rapidly as more domain scientists realize the power of Semantic Web technologies. Interlinking these concepts enables data integration and identity recognition, which are crucial for developing applications that use data from multiple sources. In this paper, we discuss SEM+, a new tool for performing concept mapping. In SEM+, we designed the Information Entropy based Weighted Similarity Model to compute the semantic similarity between entity data and suggest possible links. We also adopted a blocking approach that groups possibly matching entities into one block and thereby reduces the computation space. We evaluated SEM+ using the Integrated Ocean Observatory System ontology and the Marine Metadata Interoperability ontology, and we discuss the results and new findings.
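The abstract does not spell out the similarity model or the blocking scheme, so the following is only a minimal sketch of the general idea, not the SEM+ implementation. It assumes that each entity is a flat dictionary of string-valued properties, that a property's weight is its Shannon entropy over the observed values (normalised so the weights sum to 1), that per-property similarity is token-level Jaccard overlap, and that blocking groups entities sharing at least one label token. All function names, property names, the sample entities, and the 0.5 threshold are hypothetical.

```python
import math
from collections import Counter, defaultdict


def property_entropy(entities, prop):
    """Shannon entropy (in bits) of the values `prop` takes across entities."""
    values = [e[prop] for e in entities if prop in e]
    if not values:
        return 0.0
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_weights(entities, props):
    """Normalise per-property entropies into weights that sum to 1."""
    h = {p: property_entropy(entities, p) for p in props}
    z = sum(h.values())
    if z == 0:
        # No property discriminates at all: fall back to uniform weights.
        return {p: 1.0 / len(props) for p in props}
    return {p: h[p] / z for p in props}


def tokens(text):
    return set(text.lower().split())


def token_jaccard(a, b):
    """Jaccard overlap of whitespace tokens; 0.0 if either side is empty."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def weighted_similarity(e1, e2, weights):
    """Entropy-weighted sum of per-property similarities."""
    return sum(w * token_jaccard(e1.get(p, ""), e2.get(p, ""))
               for p, w in weights.items())


def candidate_pairs(source, target, key_prop):
    """Blocking: pair only entities that share a token in `key_prop`."""
    blocks = defaultdict(list)
    for j, e in enumerate(target):
        for tok in tokens(e.get(key_prop, "")):
            blocks[tok].append(j)
    pairs = set()
    for i, e in enumerate(source):
        for tok in tokens(e.get(key_prop, "")):
            for j in blocks.get(tok, []):
                pairs.add((i, j))
    return pairs


# Hypothetical entities standing in for concepts from two vocabularies.
source = [
    {"label": "sea water temperature", "unit": "degree Celsius"},
    {"label": "wind speed", "unit": "metre per second"},
]
target = [
    {"label": "water temperature", "unit": "degree Celsius"},
    {"label": "wind direction", "unit": "degree"},
    {"label": "salinity", "unit": "psu"},
]

weights = entropy_weights(source + target, ["label", "unit"])
pairs = candidate_pairs(source, target, "label")

THRESHOLD = 0.5  # assumed cut-off for suggesting a link
suggested = sorted(
    (i, j) for (i, j) in pairs
    if weighted_similarity(source[i], target[j], weights) >= THRESHOLD
)

for i, j in suggested:
    print(source[i]["label"], "<->", target[j]["label"])
```

In this toy data, blocking reduces the six possible source-target comparisons to two candidate pairs, and the label property receives a larger weight than the unit property because its values are more diverse and therefore more discriminating.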


Ontology matching · Instance matching · owl:sameAs · Entity resolution · Linked data



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Jin Guang Zheng (1, corresponding author)
  • Linyun Fu (1)
  • Xiaogang Ma (1)
  • Peter Fox (1)
  1. Tetherless World Constellation, Computer Science Department, Rensselaer Polytechnic Institute, Troy, USA
