Leveraging Terminological Structure for Object Reconciliation

  • Jan Noessner
  • Mathias Niepert
  • Christian Meilicke
  • Heiner Stuckenschmidt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6089)

Abstract

It has been argued that linked open data is the major benefit of semantic technologies for the web, as it provides a huge amount of structured data that can be accessed more effectively than web pages. While linked open data avoids many problems connected with the use of expressive ontologies, such as the knowledge acquisition bottleneck, data heterogeneity remains a challenging problem. In particular, identical objects may be referred to by different URIs in different datasets. Identifying such representations of the same object is called object reconciliation. In this paper, we propose a novel approach to object reconciliation that is based on an existing semantic similarity measure for linked data. We adapt the measure to the object reconciliation problem, present exact and approximate algorithms that efficiently implement the methods, and provide a systematic experimental evaluation based on a benchmark dataset. As our main result, we show that the use of light-weight ontologies and schema information significantly improves object reconciliation in the context of linked open data.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Jan Noessner (1)
  • Mathias Niepert (1)
  • Christian Meilicke (1)
  • Heiner Stuckenschmidt (1)
  1. KR & KM Research Group, University of Mannheim, Mannheim, Germany
