Abstract
The massively distributed publication of linked data has drawn the scientific community's attention to the limitations of classic data integration methods, and to the opportunity to push the boundaries of the field by experimenting with the collective enterprise that is the Linking Open Data cloud. While reusing existing ontologies is the preferred choice, exploiting ontology alignments is still a required step for easing the burden of integrating heterogeneous data sets. Alignments, even between the most widely used vocabularies, are still poorly supported in current systems, whereas links between instances are the most widely used means of bridging the gap between different data sets. In this paper we report our statistical and qualitative analysis of the network of instance-level equivalences in the Linking Open Data cloud (i.e. the sameAs network), carried out in order to automatically compute alignments at the conceptual level. Moreover, we explore the effect of ontological information when applying classical Jaccard methods to the ontology alignment task. Automating this task would make it possible to achieve a clearer conceptual description of the data at the cloud level, while improving the level of integration between datasets.
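As a rough illustration of the idea summarised above, the following minimal sketch scores candidate concept alignments by the Jaccard overlap of their instance sets, after translating instances across datasets through sameAs links. It is not the method evaluated in the paper: the function names, the threshold, and the toy identifiers (a:City, b:Town, and so on) are all hypothetical, and sameAs is simplified to a one-directional lookup table rather than a full equivalence network.

```python
from itertools import product


def jaccard(a: set, b: set) -> float:
    """Classical Jaccard coefficient: |A ∩ B| / |A ∪ B| (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def align_concepts(extension_a, extension_b, same_as, threshold=0.5):
    """Score every concept pair by the overlap of their instance sets.

    extension_a / extension_b: hypothetical maps from concept to its instances.
    same_as: hypothetical map from an instance in dataset A to its
             owl:sameAs counterpart in dataset B (simplifying assumption).
    """
    alignments = {}
    for (concept_a, inst_a), (concept_b, inst_b) in product(
        extension_a.items(), extension_b.items()
    ):
        # Translate A's instances into B's identifier space; instances
        # without a sameAs link are dropped in this simplified sketch.
        mapped = {same_as[i] for i in inst_a if i in same_as}
        score = jaccard(mapped, inst_b)
        if score >= threshold:
            alignments[(concept_a, concept_b)] = score
    return alignments


# Toy data, invented for illustration only.
extension_a = {
    "a:City": {"a:paris", "a:rome", "a:oslo"},
    "a:Person": {"a:ada"},
}
extension_b = {
    "b:Town": {"b:paris", "b:rome", "b:lima"},
    "b:Agent": {"b:ada", "b:acme"},
}
same_as = {"a:paris": "b:paris", "a:rome": "b:rome", "a:ada": "b:ada"}

alignments = align_concepts(extension_a, extension_b, same_as)
for pair, score in sorted(alignments.items()):
    print(pair, round(score, 3))
```

With this toy data, only the pairs (a:City, b:Town) and (a:Person, b:Agent) reach the assumed threshold of 0.5. A realistic setting would also have to account for the ontological information the abstract mentions, such as subclass relations, which this sketch ignores.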
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Correndo, G., Penta, A., Gibbins, N., Shadbolt, N. (2012). Statistical Analysis of the owl:sameAs Network for Aligning Concepts in the Linking Open Data Cloud. In: Liddle, S.W., Schewe, KD., Tjoa, A.M., Zhou, X. (eds) Database and Expert Systems Applications. DEXA 2012. Lecture Notes in Computer Science, vol 7447. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32597-7_20
DOI: https://doi.org/10.1007/978-3-642-32597-7_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32596-0
Online ISBN: 978-3-642-32597-7
eBook Packages: Computer Science, Computer Science (R0)