Ensuring License Compliance in Linked Data with Query Relaxation

Part of the book series: Lecture Notes in Computer Science (TLDKS, volume 12920)

Abstract

When two or more licensed datasets participate in the evaluation of a federated query, the query result is reusable only if it is protected by a license compliant with the license of every dataset involved. Because licenses can be incompatible or contradictory, such a license does not always exist, in which case the query result can be neither licensed nor reused on a legal basis. We propose to deal with this issue during federated query processing by dynamically discarding datasets with conflicting licenses. However, this solution may produce an empty query result. To address this problem, we use query relaxation techniques. Our problem statement is: given a SPARQL query and a federation of licensed datasets, how can we guarantee a relevant, non-empty query result whose license is compliant with the license of each dataset involved? To detect and prevent license conflicts, we propose FLiQue, a license-aware query processing strategy for federated query engines. Our challenge is to limit communication costs when query relaxation is necessary. Experiments show that FLiQue guarantees license compliance and, when necessary, finds relevant relaxed federated queries with limited overhead in execution time.
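The compliance condition stated in the abstract can be illustrated with a minimal sketch. This is not the FLiQue implementation: the license labels, the `COMPATIBLE_WITH` relation, and both function names are hypothetical, chosen only to show how a conflict between the licenses of the involved datasets might be detected.

```python
from typing import Dict, Iterable, Set

# Hypothetical compatibility relation: COMPATIBLE_WITH[x] is the set of
# licenses that may protect a result derived from data licensed under x.
# Labels and edges are illustrative only, not taken from the chapter.
COMPATIBLE_WITH: Dict[str, Set[str]] = {
    "lic_a": {"lic_a", "lic_b", "lic_c"},
    "lic_b": {"lic_b", "lic_c"},
    "lic_c": {"lic_c"},
    "lic_d": {"lic_d"},
}


def compliant_licenses(dataset_licenses: Iterable[str]) -> Set[str]:
    """Return every license compliant with all the given dataset licenses."""
    candidate_sets = [COMPATIBLE_WITH[lic] for lic in dataset_licenses]
    if not candidate_sets:
        return set()
    # A result license must be acceptable to every involved dataset.
    return set.intersection(*candidate_sets)


def has_license_conflict(dataset_licenses: Iterable[str]) -> bool:
    """True when no single license can protect the combined result."""
    return not compliant_licenses(dataset_licenses)
```

Under these assumed edges, `compliant_licenses(["lic_a", "lic_b"])` yields the licenses shared by both sets, while `has_license_conflict(["lic_a", "lic_d"])` reports a conflict, which is the situation where a dataset would have to be discarded.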

Notes

  1. This compatibility graph conforms to the license compatibility chart shown in https://wiki.creativecommons.org/wiki/Wiki/cc_license_compatibility.

  2. To simplify, we show the same licenses as in Fig. 1. However, a compatibility graph of licenses can contain many more licenses and is not limited to Creative Commons ones.

  3. In French, FLiQue is a homophone of flic, which means cop.

  4. https://creativecommons.org/ns.

  5. https://ns.inria.fr/l4lod/.

  6. Creative Commons also proposes its licenses in RDF: https://github.com/creativecommons/cc.licenserdf/tree/master/cc/licenserdf/licenses.

  7. https://www.dalicc.net/license-library.

  8. A partial order is any binary relation that is reflexive, antisymmetric, and transitive.

  9. A demonstration tool to define, step by step, a compatibility graph of licenses with the CaLi approach can be found here https://saas.ls2n.fr/cali/.

  10. That is \(|S|^{|A|}\) minus the licenses discarded by constraints, where \(S\) is a set of statuses and \(A\) a set of actions. The three statuses considered by Creative Commons licenses are permissions, duties, and prohibitions.

  11. The following compatibility graphs of licenses illustrate the CaLi approach: http://cali.priloo.univ-nantes.fr/ld/graph and http://cali.priloo.univ-nantes.fr/rep/graph.

  12. https://www.w3.org/TR/rdf-sparql-query/#OptionalMatching.

  13. https://www.w3.org/TR/void/.

  14. DARQ is an extension of ARQ http://jena.sourceforge.net/ARQ/.

  15. The VoID vocabulary was proposed after DARQ.

  16. Datasets without licenses can be associated with the most permissive license (e.g., CC Zero or ODbL) or can be discarded from the federation.

  17. Other choices could be defined, for example, based on the cardinality estimations of result sets or based on the number of involved data sources.

  18. https://github.com/dice-group/CostFed.

  19. https://github.com/benjimor/FLiQuE.

  20. https://github.com/dice-group/LargeRDFBench.

  21. This is a compilation of all ontologies we found for LargeRDFBench datasets: https://raw.githubusercontent.com/benj-moreau/FLiQue/master/flique/ontologies/ontology.n3.
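Notes 8 and 10 can be made concrete with a short sketch. The helper names `is_partial_order` and `max_license_count` are hypothetical, and the numbers used to exercise them are illustrative; the code only checks the three properties named in note 8 and evaluates the \(|S|^{|A|}\) upper bound of note 10 before any constraint discards licenses.

```python
from itertools import product
from typing import Hashable, Iterable, Set, Tuple

Pair = Tuple[Hashable, Hashable]


def is_partial_order(elements: Iterable[Hashable], relation: Set[Pair]) -> bool:
    """Check that `relation` is reflexive, antisymmetric, and transitive."""
    elems = list(elements)
    reflexive = all((x, x) in relation for x in elems)
    antisymmetric = all(
        x == y or not ((x, y) in relation and (y, x) in relation)
        for x, y in product(elems, repeat=2)
    )
    transitive = all(
        (x, z) in relation
        for x, y, z in product(elems, repeat=3)
        if (x, y) in relation and (y, z) in relation
    )
    return reflexive and antisymmetric and transitive


def max_license_count(num_statuses: int, num_actions: int) -> int:
    """Upper bound |S|^|A| on expressible licenses, before constraints."""
    return num_statuses ** num_actions
```

For instance, with the three statuses of note 10 and an assumed set of four actions, `max_license_count(3, 4)` gives 81 candidate licenses, from which those violating constraints would still have to be removed.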

Author information

Corresponding author

Correspondence to Patricia Serrano-Alvarado.

A Supplemental material

Figures c to l (not reproduced).

Copyright information

© 2021 Springer-Verlag GmbH Germany, part of Springer Nature

About this chapter

Cite this chapter

Moreau, B., Serrano-Alvarado, P. (2021). Ensuring License Compliance in Linked Data with Query Relaxation. In: Hameurlain, A., Tjoa, A.M., Amann, B., Goasdoué, F. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XLIX. Lecture Notes in Computer Science, vol 12920. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-64148-4_4

  • DOI: https://doi.org/10.1007/978-3-662-64148-4_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-64147-7

  • Online ISBN: 978-3-662-64148-4

  • eBook Packages: Computer Science, Computer Science (R0)
