Ontology Alignment Based on Word Embedding and Random Forest Classification

  • Ikechukwu Nkisi-Orji
  • Nirmalie Wiratunga
  • Stewart Massie
  • Kit-Ying Hui
  • Rachel Heaven
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11051)

Abstract

Ontology alignment is crucial for integrating heterogeneous data sources and forms an important component of the semantic web. Accordingly, several ontology alignment techniques have been proposed and used for discovering correspondences between the concepts (or entities) of different ontologies. Most alignment techniques depend on string-based similarities, which cannot handle the vocabulary mismatch problem. Determining which similarity measures to use and how to combine them effectively in alignment systems also remains a persistent challenge in this area. In this work, we introduce a random forest classifier approach for ontology alignment which relies on word embedding for determining a variety of semantic similarity features between concepts. Specifically, we combine string-based and semantic similarity measures to form feature vectors that are used by the classifier model to determine when concepts align. By harnessing background knowledge and relying on minimal information from the ontologies, our approach can handle knowledge-light ontological resources. It also eliminates the need for learning the aggregation weights of a composition of similarity measures. Experiments using the Ontology Alignment Evaluation Initiative (OAEI) dataset and real-world ontologies highlight the utility of our approach and show that it can outperform state-of-the-art alignment systems. Code related to this paper is available at: https://bitbucket.org/paravariar/rafcom.
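
The idea of combining string-based and semantic similarity measures into one feature vector per concept pair can be sketched as follows. This is a minimal illustration and not the authors' implementation: the two features, the helper names (`tokenize`, `mean_vector`, `feature_vector`) and the tiny `TOY_EMBEDDINGS` table with made-up values are all assumptions chosen for the example. In practice the vectors would come from a pretrained word embedding model, and the resulting labeled feature vectors would be passed to a random forest classifier.

```python
import math
from difflib import SequenceMatcher

# Hypothetical 3-dimensional "embeddings" with made-up values.
# A real system would load vectors from a pretrained model instead.
TOY_EMBEDDINGS = {
    "author": [0.9, 0.1, 0.0],
    "writer": [0.8, 0.2, 0.1],
    "paper": [0.1, 0.9, 0.2],
    "article": [0.2, 0.8, 0.3],
    "review": [0.0, 0.3, 0.9],
}


def tokenize(label):
    """Split a concept label such as 'Paper_Author' into lowercase words."""
    return label.replace("_", " ").replace("-", " ").lower().split()


def mean_vector(tokens):
    """Average the embeddings of the known tokens; None if none are known."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def string_similarity(a, b):
    """String-based feature: character-level similarity ratio of the labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def semantic_similarity(a, b):
    """Semantic feature: cosine similarity of the averaged word vectors."""
    vec_a = mean_vector(tokenize(a))
    vec_b = mean_vector(tokenize(b))
    if vec_a is None or vec_b is None:
        return 0.0  # out-of-vocabulary labels carry no semantic evidence
    return cosine(vec_a, vec_b)


def feature_vector(a, b):
    """Feature vector for a candidate concept pair: [string, semantic]."""
    return [string_similarity(a, b), semantic_similarity(a, b)]


# 'Author' and 'Writer' share few characters but have close toy vectors,
# so the semantic feature is much higher than the string feature.
features = feature_vector("Author", "Writer")
```

Because the classifier receives every similarity score as a separate feature, no aggregation weights have to be set by hand; a library such as scikit-learn (its `RandomForestClassifier`) could be trained on vectors of this shape, with labels indicating whether each pair aligns.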

Keywords

  • Ontology alignment
  • Word embedding
  • Machine classification
  • Semantic web

Notes

Acknowledgement

This work was supported in part by the British Geological Survey (BGS) through the BGS University Funding Initiative (BUFI S291). We are grateful for the valuable comments of our reviewers.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Robert Gordon University, Aberdeen, UK
  2. British Geological Survey, Nottingham, UK
