Co-evolution of RDF Datasets

  • Sidra Faisal
  • Kemele M. Endris
  • Saeedeh Shekarpour
  • Sören Auer
  • Maria-Esther Vidal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9671)

Abstract

Linking Data initiatives have fostered the publication of a large number of RDF datasets in the Linked Open Data (LOD) cloud, as well as the development of query processing infrastructures to access these data in a federated fashion. However, several experimental studies have shown that the availability of LOD datasets cannot always be ensured, so RDF data replication is required to build reliable federated query frameworks. Although it enhances data availability, RDF data replication requires synchronization and conflict resolution when replicas and source datasets are allowed to change over time, i.e., co-evolution management is needed to ensure consistency. In this paper, we tackle the problem of RDF data co-evolution and devise an approach for conflict resolution during the co-evolution of RDF datasets. Our approach is property-oriented and allows semantics about RDF properties to be exploited during co-evolution management. We empirically evaluate the quality of our approach in different scenarios on the DBpedia-live dataset. Experimental results suggest that the proposed techniques have a positive impact on the quality of data in source datasets and replicas.

Keywords

Dataset synchronization, Dataset co-evolution, Conflict identification, Conflict resolution, RDF dataset

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Sidra Faisal (1)
  • Kemele M. Endris (1)
  • Saeedeh Shekarpour (1)
  • Sören Auer (1)
  • Maria-Esther Vidal (1)

  1. University of Bonn and Fraunhofer IAIS, Bonn, Germany
