Online Relation Alignment for Linked Datasets

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10249)

Abstract

The large number of linked datasets on the Web, together with their diversity in schema representation, has led to a fragmented dataset landscape. Querying and addressing information needs that span disparate datasets requires aligning their schemas. The majority of schema and ontology alignment approaches focus exclusively on class alignment. Relation alignment, in contrast, has not been fully addressed, and existing approaches fall short in handling the dynamics and size of datasets.

In this work, we address the problem of relation alignment across disparate linked datasets. Our approach focuses on two main aspects. First, we perform online relation alignment: we do not require full access to the data and instead sample a minimal subset of it. This addresses the main limitation of existing work, namely coping with the large scale of linked datasets, and also covers cases where datasets provide only query access. Second, we learn supervised machine learning models that employ various features, or matchers, accounting for the diversity of linked datasets at the instance level. We perform an experimental evaluation on three real-world linked datasets: DBpedia, YAGO, and Freebase. The results show superior performance over state-of-the-art approaches in schema matching, with an average relation alignment accuracy of 84%. In addition, we show that relation alignment can be performed efficiently at scale.
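
As a rough illustration of the two aspects above, the following Python sketch samples a small set of (subject, object) pairs for a source relation and computes two instance-level overlap scores against a candidate target relation. It is only a sketch under stated assumptions, not the method of the paper: the function names, the toy data, and the choice of Jaccard and containment as features are hypothetical, and entities in both datasets are assumed to be already linked to shared identifiers. In a supervised setting, scores like these could serve as input features to a classifier.

```python
import random


def sample_pairs(pairs, k, seed=0):
    """Draw at most k (subject, object) pairs.

    Hypothetical stand-in for a query-limited sample of a relation.
    """
    ordered = sorted(pairs)  # fixed order so the seeded sample is reproducible
    if len(ordered) <= k:
        return set(ordered)
    return set(random.Random(seed).sample(ordered, k))


def overlap_features(source_pairs, target_pairs):
    """Instance-level overlap scores for a candidate alignment source -> target.

    Assumes both relations use the same entity identifiers.
    """
    source_pairs, target_pairs = set(source_pairs), set(target_pairs)
    shared = source_pairs & target_pairs
    union = source_pairs | target_pairs
    return {
        # Symmetric overlap of the two pair sets.
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Share of sampled source pairs that also hold for the target relation.
        "containment": len(shared) / len(source_pairs) if source_pairs else 0.0,
    }


# Toy data (invented for illustration, not taken from any dataset).
birth_place = {("Goethe", "Frankfurt"), ("Kant", "Königsberg"), ("Bach", "Eisenach")}
was_born_in = {("Goethe", "Frankfurt"), ("Kant", "Königsberg"), ("Hegel", "Stuttgart")}

features = overlap_features(sample_pairs(birth_place, k=10), was_born_in)
```

Containment is asymmetric by design: a high value for source to target combined with a low value in the opposite direction would hint at a subsumption rather than an equivalence between the two relations.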

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. FIZ Karlsruhe – Leibniz Institute for Information Infrastructure, Karlsruhe, Germany
  2. Institute AIFB, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
  3. University of Paris-Saclay, Versailles, France
  4. ETIS CNRS, University of Cergy-Pontoise, Cergy-Pontoise, France
