Abstract
It has been argued that linked open data is the major benefit of semantic technologies for the web, as it provides a huge amount of structured data that can be accessed more effectively than web pages. While linked open data avoids many problems connected with the use of expressive ontologies, such as the knowledge acquisition bottleneck, data heterogeneity remains a challenging problem. In particular, identical objects may be referred to by different URIs in different data sets. Identifying such representations of the same object is called object reconciliation. In this paper, we propose a novel approach to object reconciliation that is based on an existing semantic similarity measure for linked data. We adapt the measure to the object reconciliation problem, present exact and approximate algorithms that efficiently implement the methods, and provide a systematic experimental evaluation based on a benchmark dataset. As our main result, we show that the use of light-weight ontologies and schema information significantly improves object reconciliation in the context of linked open data.
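To make the problem statement concrete, the following is a minimal sketch of object reconciliation on toy data. It is not the method of the paper, whose similarity measure and algorithms are not spelled out in this abstract. The URIs, labels, class names, the `SUPERCLASS` hierarchy, and the scoring rule (string similarity on labels, set to zero when two classes share no ancestor) are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import permutations

# Hypothetical toy data sets: URI -> (label, class). Not from the paper's benchmark.
SOURCE = {
    "http://a.example/p1": ("Alice Meyer", "Professor"),
    "http://a.example/c1": ("Mannheim", "City"),
}
TARGET = {
    "http://b.example/x9": ("Mannheim", "Person"),
    "http://b.example/x7": ("A. Meyer", "Person"),
    "http://b.example/x3": ("Mannheim", "Place"),
}
# Hypothetical light-weight ontology: class -> direct superclass.
SUPERCLASS = {"Professor": "Person", "City": "Place"}


def ancestors(cls):
    """Return cls together with all of its superclasses."""
    seen = {cls}
    while cls in SUPERCLASS:
        cls = SUPERCLASS[cls]
        seen.add(cls)
    return seen


def similarity(src, tgt, use_schema):
    """Label similarity in [0, 1]; zero if the schema says the classes are unrelated."""
    (label_s, class_s), (label_t, class_t) = src, tgt
    if use_schema and not (ancestors(class_s) & ancestors(class_t)):
        return 0.0
    return SequenceMatcher(None, label_s.lower(), label_t.lower()).ratio()


def reconcile(source, target, use_schema=True):
    """Exact one-to-one matching by exhaustive search (only viable for toy inputs)."""
    src_uris = list(source)
    best, best_score = {}, -1.0
    for perm in permutations(target, len(src_uris)):
        total = sum(
            similarity(source[s], target[t], use_schema)
            for s, t in zip(src_uris, perm)
        )
        if total > best_score:
            best, best_score = dict(zip(src_uris, perm)), total
    return best


if __name__ == "__main__":
    for src_uri, tgt_uri in reconcile(SOURCE, TARGET).items():
        print(src_uri, "->", tgt_uri)
```

In this toy setting the two target objects labelled "Mannheim" are indistinguishable by label alone, so the class information is what separates the city from the person. Exhaustive search grows factorially with the number of objects, which is one reason both exact and approximate algorithms are of interest for realistic data sets.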
Keywords
- Integer Linear Programming
- Mixed Integer Linear Programming
- Graph Match
- Integer Linear Programming Problem
- Ontology Match
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Noessner, J., Niepert, M., Meilicke, C., Stuckenschmidt, H. (2010). Leveraging Terminological Structure for Object Reconciliation. In: Aroyo, L., et al. The Semantic Web: Research and Applications. ESWC 2010. Lecture Notes in Computer Science, vol 6089. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13489-0_23
DOI: https://doi.org/10.1007/978-3-642-13489-0_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13488-3
Online ISBN: 978-3-642-13489-0
eBook Packages: Computer Science (R0)