Statistical Analysis of the owl:sameAs Network for Aligning Concepts in the Linking Open Data Cloud

  • Gianluca Correndo
  • Antonio Penta
  • Nicholas Gibbins
  • Nigel Shadbolt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7447)


The massively distributed publication of linked data has brought to the attention of the scientific community both the limitations of classic data integration methods and the opportunity to push the boundaries of the field by experimenting with the collective enterprise that is the Linking Open Data cloud. While reusing existing ontologies is the preferred choice, exploiting ontology alignments is still a required step for easing the burden of integrating heterogeneous data sets. Alignments, even between the most widely used vocabularies, are still poorly supported in current systems, whereas links between instances are the most widely used means of bridging the gap between different data sets. In this paper we give an account of our statistical and qualitative analysis of the network of instance-level equivalences in the Linking Open Data cloud (i.e. the sameAs network), with the aim of automatically computing alignments at the conceptual level. Moreover, we explore the effect of ontological information when applying classical Jaccard methods to the ontology alignment task. Automating this task would yield a clearer conceptual description of the data at the cloud level, while improving the level of integration between data sets.
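To make the idea of instance-based alignment concrete, the following is a minimal sketch of a classical Jaccard score computed over class extensions that are bridged by sameAs links. It is an illustration under stated assumptions, not the procedure used in the paper: the function names (`same_as_clusters`, `jaccard_alignments`), the 0.5 threshold, and the toy URIs are all hypothetical, and the sketch does not use any of the ontological information the paper explores.

```python
from collections import defaultdict


def same_as_clusters(same_as_pairs):
    """Return a function mapping each URI to a representative of its
    sameAs equivalence class (union-find over the sameAs links)."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
    return find


def jaccard_alignments(types_a, types_b, same_as_pairs, threshold=0.5):
    """Score every class pair (one class per data set) with the Jaccard
    index of their instance sets, after merging sameAs-equivalent URIs."""
    find = same_as_clusters(same_as_pairs)
    ext_a, ext_b = defaultdict(set), defaultdict(set)
    for instance, cls in types_a:
        ext_a[cls].add(find(instance))
    for instance, cls in types_b:
        ext_b[cls].add(find(instance))

    results = []
    for cls_a, inst_a in ext_a.items():
        for cls_b, inst_b in ext_b.items():
            union = inst_a | inst_b
            score = len(inst_a & inst_b) / len(union) if union else 0.0
            if score >= threshold:
                results.append((cls_a, cls_b, score))
    return sorted(results, key=lambda r: -r[2])


# Toy data (hypothetical URIs): (instance, class) pairs for two data sets.
same_as = [("a:paris", "b:Paris"), ("a:rome", "b:Roma"), ("a:alice", "b:Alice")]
types_a = [("a:paris", "a:City"), ("a:rome", "a:City"),
           ("a:berlin", "a:City"), ("a:alice", "a:Person")]
types_b = [("b:Paris", "b:Town"), ("b:Roma", "b:Town"), ("b:Alice", "b:Human")]

alignments = jaccard_alignments(types_a, types_b, same_as)
for cls_a, cls_b, score in alignments:
    print(f"{cls_a} ~ {cls_b}: {score:.2f}")
```

In this toy example `a:Person` and `b:Human` share their only instance, while `a:City` and `b:Town` share two of three distinct instances; class pairs with no linked instances fall below the threshold and are dropped.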


Linked Data, ontology alignment, owl:sameAs





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Gianluca Correndo (1)
  • Antonio Penta (1)
  • Nicholas Gibbins (1)
  • Nigel Shadbolt (1)
  1. Electronics and Computer Science, University of Southampton, UK
