Abstract
When two or more licensed datasets participate in the evaluation of a federated query, the query result is reusable only if it is protected by a license compliant with the license of each involved dataset. Due to incompatibilities or contradictions among licenses, such a license does not always exist, leading to a query result that can be neither licensed nor reused on a legal basis. We propose to deal with this issue during federated query processing by dynamically discarding datasets with conflicting licenses. However, this solution may produce an empty query result. To address this problem, we use query relaxation techniques. Our problem statement is: given a SPARQL query and a federation of licensed datasets, how can we guarantee a relevant, non-empty query result whose license is compliant with the license of each involved dataset? To detect and prevent license conflicts, we propose FLiQue, a license-aware query processing strategy for federated query engines. Our challenge is to limit communication costs when the query relaxation process is necessary. Experiments show that FLiQue guarantees license compliance and, when necessary, finds relevant relaxed federated queries with limited overhead in execution time.
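The idea of discarding datasets whose licenses conflict can be illustrated with a minimal sketch. It is not FLiQue's algorithm: the compatibility relation below is a simplified, hand-written assumption limited to a few Creative Commons licenses, and the names `COMPATIBLE_WITH`, `compliant_licenses`, and `largest_licensable_subsets` are hypothetical. The sketch only shows how a result license can fail to exist and which dataset subsets remain licensable.

```python
from itertools import combinations

# ASSUMPTION: a simplified "can be relicensed as" relation, written out with
# its reflexive and transitive pairs. It is illustrative, not authoritative.
COMPATIBLE_WITH = {
    "CC0": {"CC0", "CC BY", "CC BY-SA", "CC BY-NC", "CC BY-NC-SA"},
    "CC BY": {"CC BY", "CC BY-SA", "CC BY-NC", "CC BY-NC-SA"},
    "CC BY-SA": {"CC BY-SA"},
    "CC BY-NC": {"CC BY-NC", "CC BY-NC-SA"},
    "CC BY-NC-SA": {"CC BY-NC-SA"},
}


def compliant_licenses(licenses):
    """Return the licenses that every given license is compatible with."""
    targets = [COMPATIBLE_WITH[lic] for lic in licenses]
    return set.intersection(*targets) if targets else set()


def largest_licensable_subsets(dataset_licenses):
    """Return the largest dataset subsets that admit a common compliant license.

    dataset_licenses maps a (hypothetical) dataset name to its license.
    """
    names = sorted(dataset_licenses)
    for size in range(len(names), 0, -1):
        found = [
            set(subset)
            for subset in combinations(names, size)
            if compliant_licenses(dataset_licenses[d] for d in subset)
        ]
        if found:
            return found
    return []


if __name__ == "__main__":
    federation = {"D1": "CC BY", "D2": "CC BY-SA", "D3": "CC BY-NC"}
    # No license is compliant with all three, so at least one dataset must go.
    print(compliant_licenses(federation.values()))
    print(largest_licensable_subsets(federation))
```

In this toy federation, CC BY-SA and CC BY-NC have no common compliant license, so a query touching both D2 and D3 cannot be licensed. Dropping either one restores licensability but may empty the result, which is the situation query relaxation is meant to handle.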
Notes
- 1.
This compatibility graph conforms to the license compatibility chart shown in https://wiki.creativecommons.org/wiki/Wiki/cc_license_compatibility.
- 2.
To simplify, we show the same licenses as in Fig. 1. However, a compatibility graph of licenses can contain many more licenses and is not limited to Creative Commons ones.
- 3.
In French, FLiQue is a homophone of flic, which means cop.
- 4.
- 5.
- 6.
Creative Commons also provides its licenses in RDF: https://github.com/creativecommons/cc.licenserdf/tree/master/cc/licenserdf/licenses.
- 7.
- 8.
A partial order is any binary relation that is reflexive, antisymmetric, and transitive.
- 9.
A demonstration tool to define, step by step, a compatibility graph of licenses with the CaLi approach can be found here https://saas.ls2n.fr/cali/.
- 10.
That is \(|S|^{|A|}\) minus the licenses discarded by constraints, where S is a set of statuses and A is a set of actions. The three statuses considered by Creative Commons licenses are permissions, duties, and prohibitions.
- 11.
Next compatibility graphs of licenses illustrate the CaLi approach:
- 12.
- 13.
- 14.
DARQ is an extension of ARQ http://jena.sourceforge.net/ARQ/.
- 15.
The VoID vocabulary was proposed after DARQ.
- 16.
Datasets without licenses can be associated with the most permissive license (e.g., CC Zero, and ODbL) or can be discarded from the federation.
- 17.
Other choices could be defined, for example, based on the cardinality estimations of result sets or based on the number of involved data sources.
- 18.
- 19.
- 20.
- 21.
This is a compilation of all ontologies we found for LargeRDFBench datasets: https://raw.githubusercontent.com/benj-moreau/FLiQue/master/flique/ontologies/ontology.n3.
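The definition in note 8 and the bound in note 10 can be made concrete with a short sketch. The helper names `is_partial_order` and `max_license_count` are hypothetical, the three-element relation is a toy example, and the choice of seven actions is an assumption used only to show the arithmetic.

```python
def is_partial_order(elements, relation):
    """Check that `relation` (a set of ordered pairs) is a partial order on `elements`."""
    elements = set(elements)
    # Reflexive: every element is related to itself.
    reflexive = all((a, a) in relation for a in elements)
    # Antisymmetric: a <= b and b <= a together imply a == b.
    antisymmetric = all(a == b or (b, a) not in relation for (a, b) in relation)
    # Transitive: a <= b and b <= c together imply a <= c.
    transitive = all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )
    return reflexive and antisymmetric and transitive


def max_license_count(num_statuses, num_actions):
    """Upper bound |S|^|A| on distinct licenses, before any constraint is applied."""
    return num_statuses ** num_actions


if __name__ == "__main__":
    licenses = {"a", "b", "c"}  # toy placeholders, not real licenses
    order = {(x, x) for x in licenses} | {("a", "b"), ("a", "c")}
    print(is_partial_order(licenses, order))
    print(is_partial_order(licenses, order | {("b", "a")}))
    # ASSUMPTION: seven actions, chosen only for illustration.
    print(max_license_count(3, 7))
```

Adding the pair `("b", "a")` to the toy relation breaks antisymmetry, so the second check fails. The count shows how quickly the space of candidate licenses grows with the number of actions, which is why constraints that discard meaningless licenses matter.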
A Supplemental material
Copyright information
© 2021 Springer-Verlag GmbH Germany, part of Springer Nature
About this chapter
Cite this chapter
Moreau, B., Serrano-Alvarado, P. (2021). Ensuring License Compliance in Linked Data with Query Relaxation. In: Hameurlain, A., Tjoa, A.M., Amann, B., Goasdoué, F. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XLIX. Lecture Notes in Computer Science, vol 12920. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-64148-4_4
DOI: https://doi.org/10.1007/978-3-662-64148-4_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-64147-7
Online ISBN: 978-3-662-64148-4
eBook Packages: Computer Science (R0)