Abstract
This paper focuses on the current legal barriers to text and data mining (TDM) in the context of smart disclosure systems (SDSs), whose aim is to provide consumers with improved access to the data needed to make informed decisions. The use of intellectual property rights and contracts, combined with technological protection measures, can hinder TDM and the deployment of SDSs. Further, those legal constraints can negatively affect artificial intelligence innovation, which requires improved access to data. There are thus various arguments for enhanced “machine legibility”. However, the TDM exceptions included in the recently approved Directive on Copyright in the Digital Single Market do not appear to clear the way for enhanced “machine legibility”. In relation to SDSs, we also argue that the principle of transparency, which is embedded in consumer and data protection laws, can serve as a last line of defence against prohibition of TDM.
Notes
Sunstein (2012).
One of the first projects in the area of automated analysis of legal documents is “Usable Privacy Policy” (www.usableprivacy.org), a consortium led by Carnegie Mellon University. Their tool aims to help users navigate the text of a privacy policy and identify the privacy options and choices available (Sadeh et al. 2013). More recently, an international team formed by researchers from Switzerland's Federal Institute of Technology, the University of Wisconsin and the University of Michigan has launched two tools: Polisis (https://pribot.org/polisis), a tool that visualises the content of a privacy policy very effectively; and Pribot (https://pribot.org/bot), a chatbot that answers questions about a specific privacy policy (Harkous et al. 2018). In the field of the automated analysis of T&C, we must mention CLAUDETTE, a research project carried out by an interdisciplinary team at the European University Institute (https://claudette.eui.eu). The tool, based on machine learning techniques, assesses the fairness of consumer standard terms (https://claudette.eui.eu/use-our-tools/). This functionality will be extended to the analysis of privacy policies. For more information, see Contissa et al. (2018); Lippi et al. (2018). Another interdisciplinary project, SaToS (Software Aided Analysis of ToS), is conducted by the chair of Software Engineering for Business Information Systems (Sebis) at TU Munich. The German research group is developing a solution to automatically identify the Terms of Service of e-commerce websites and summarise the key points of the contract in simplified language (Braun et al. 2018).
This is precisely one of the objects of “The Internet of Platforms: an empirical research on private ordering and consumer protection in the sharing economy”, carried out at UCLouvain. The project aims to address the issue of the lack of transparency in sharing economy transactions and improve the information users receive from and about the platform (http://www.rosels.eu/research/research-project-iop/). This paper presents some of the preliminary results of this project.
In the recently approved Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC (OJ L 130, 17.5.2019, pp. 92–125; hereinafter the “Directive on Copyright in the DSM” or “CDSM Directive”), TDM is defined as “any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations” (Art. 2.2, CDSM Directive). The definition is sufficiently broad to embrace the current TDM application panorama. For a technical definition of text and data mining, see Hearst (2003). Specifically on text mining, Feldman and Sanger (2007). For an extensive analysis of the definition of TDM, see Triaille et al. (2014).
On the crucial need to train algorithms on different datasets, see Hall and Pesenti (2017).
By “private ordering”, we refer to contractual, technological and informal measures used as tools to enforce platforms’ rights and interests towards their users. In the absence of a clear legislative framework or effective (and efficient) remedies, contracts and technology can be used to expand the prerogatives and powers of platforms, restricting the legitimate uses and faculties of the weak party, as noted with reference to the intellectual property domain by Dussolier (2007), pp. 1393–1394.
Arts. 3 and 4, CDSM Directive.
Art. 4(1), General Data Protection Regulation.
Art. 1(2), Database Directive.
For the classification of a website as a database, see Strowel and Derclaye (2001), pp. 311–312.
ECJ, Case C-604/10, Football Dataco Ltd, para. 38.
Ibid., para. 39.
In line with the US leading case Feist Publications Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991). See Waelde et al. (2013), p. 65.
Art. 7, Database Directive.
Derclaye (2008), p. 107.
ECJ, C-46/02, Fixtures Marketing Ltd v. Oy Veikkaus Ab [2004] ECLI:EU:C:2004:694; C-338/02, Fixtures Marketing Ltd v. Svenska Spel AB [2004] ECLI:EU:C:2004:696; C-444/02, Fixtures Marketing Ltd v. Organismos Prognostikon Agonon Podosfairou AE (OPAP) [2004] ECLI:EU:C:2004:697; C-203/02, The British Horseracing Board Ltd and Others v. William Hill Organization Ltd [2004] ECLI:EU:C:2004:695.
The maker of the database could state the existence of the sui generis right in the T&C, but this is quite rare. Usually, T&C contain general formulas such as “All rights reserved”, leaving ample room for interpretation to the end-user, who in the majority of cases is not a lawyer, much less an IP expert.
As known, the originality requirement is not harmonised in the InfoSoc Directive (it is mentioned in the Software, Term and Database directives). However, the concept has been interpreted as “the author’s own intellectual creation” and applied to works in several decisions of the European Court of Justice. See, ECJ, C-5/08, Infopaq International [2009] ECLI:EU:C:2009:465, para. 37; C-403/08, Football Association Premier League and Others [2011] ECLI:EU:C:2011:631, para. 97; C-393/09, BSA [2010], ECLI:EU:C:2010:816, paras. 44–45; C-355/12, Nintendo and Others [2014], ECLI:EU:C:2014:25, paras. 21–22. For an overview of the “Europeanization” of originality, Strowel (2012).
Rosati (2013a). See also ECJ, Case C-161/17, Land Nordrhein-Westfalen v. Dirk Renckhoff [2018] ECLI:EU:C:2018:634 (on a photograph of the city of Cordoba).
ECJ, Case C-5/08 Infopaq International, para. 45.
Ibid., para. 48.
District Court of Milan, specialised section, decision No. 6057/2014, available online here: https://www.giurisprudenzadelleimprese.it/wordpress/wp-content/uploads/2014/08/20140512_RG79952-20111.pdf.
District Court of Venice, specialised section, decision of 17 December 2014, RG 1522/2011 (not published).
Madrid Provincial Court (s. 12) of 3 March 2004, cited in Vallés (2009), pp. 114–115.
See Bernault et al. (2017), p. 116, footnote 94.
Paris Criminal Court, 17 January 1968, Gaz. Pal. 1968, I, p. 197.
Paris Commercial Court, 67ème ch., 4 Sept. 1989, Expertises, 1991, p. 273, obs. Gross (contract proposed by a credit provider to traders); Paris Court of Appeal, 4th ch., 27 Nov. 2002, Expertises, 2003, p. 190 (software licence); Douai Court of Appeal, 27 March 2013, Propr. Intell., 2013, p. 285, obs. Brugière (terms of reference for public procurement).
Brussels Court of Appeal, 28 January 1997, cited in de Visscher and Michaux (2000), p. 31, footnote 125.
Supreme Court of Judicature – Court of Appeal, Elanco v. Mandops [1980] RPC 213.
Bently and Sherman (2009), p. 64.
Federal Supreme Court, 10 October 1991 – I ZR 147/89 (“Bedienungsanweisung”).
Federal Supreme Court, 11 April 2002 – I ZR 231/99 (“Technische Lieferbedingungen”).
Federal Supreme Court, 22 September 1999 – I ZR 48/97 (“Planungsmappe”).
The comment of Senator Kennedy at the April 2018 Congressional hearing of Mark Zuckerberg regarding the Cambridge Analytica affair was widely disseminated online: “Here's what everybody's been trying to tell you today, and – and I say this gently. Your user agreement sucks […] The purpose of that user agreement is to cover Facebook's rear end. It's not to inform your users about their rights. Now, you know that and I know that. I'm going to suggest to you that you go back home and rewrite it. And tell your $1,200 an hour lawyers, no disrespect. They're good. But – but tell them you want it written in English and non-Swahili, so the average American can understand it. That would be a start”.
See point 5 of the Creative Commons Terms, https://creativecommons.org/terms/.
See for instance the privacy policy elaborated by the designer Stefania Passera: https://juro.com/policy.html.
TDM may potentially also entail the reproduction and/or adaptation and/or communication and/or distribution of the database itself. However, we do not analyse this case here.
Triaille et al. (2014), p. 32. According to the authors, this is not the case if the tool simply spots one or two words through the text without making a copy of the work (e.g. spotting and counting the occurrences of the word “malaria”). In particular, p. 31. Similarly, Stamatoudi (2016), p. 1261; Montagnani and Aime (2017), p. 379 ff.
We do not take into consideration Art. 9(a) Database Directive, since the exception for personal use applies only to non-electronic databases, which do not permit TDM in any case.
Art. 8(2), Database Directive.
Art. 8(3), Database Directive.
As required by Art. 5(3)(a) InfoSoc Directive and Art. 6(1)(a) Database Directive, but not by Art. 9(b) Database Directive. See Derclaye (2008), p. 131.
See Art. 5(3)(a) InfoSoc Directive and Arts. 6(2)(b) and 9(1)(b) Database Directive. Another inconsistency between the two Directives must be noted with regard to the research exception. Under the Database Directive, the reference to the source appears to be a mandatory requirement, while the InfoSoc Directive admits the possibility of not indicating the author’s name if “this turns out to be impossible” (Art. 5.3.a. InfoSoc Directive). According to some authors, this difference is more a declamation than a substantial matter, considering the general principle of “ad impossibilia nemo tenetur”, which will be applicable to the exception for databases in any case [see Montagnani and Aime (2017), p. 387; citing Walter and Von Lewinski (2010)]. Other authors interpret the provisions literally [cf. Triaille et al. (2014), p. 70, as reported in Montagnani and Aime (2017), p. 387].
Caspers and Guibault (2016a), p. 34.
Valenti (2007b), p. 195.
Ibid. p. 202, para. III and the bibliography thereby cited.
Despite the potential “open-ended” nature of the three-step test, as reported by Margoni (2012). See also Hilty et al. (2008). While, for phonograms and videograms, there is a specific provision: the reproduction is permitted if done by a physical person solely for personal use and for non-commercial purposes, in compliance with the applicable TPMs (Art. 71sexies (1), Law 633/1941). The exception will not apply if the reproduction is done by a third party (Art. 71sexies (2), Law 633/1941) and if the works are available on-demand and protected by TPMs or contracts (Art. 71sexies (3), Law 633/1941). On the limits of the private copying exception for the digital context and the interplay with TPMs, cf. Caso (2004); Montagnani (2007); Mazziotti (2008).
See Derclaye and Favale (2010); Helberger et al. (2013); Triaille et al. (2013); Caspers and Guibault (2016a). Regarding TPMs, Art. 6(4) does not require that the Member States take appropriate measures to ensure that the beneficiaries of the private copying exception do in practice benefit from the exception.
Derclaye and Favale (2010), p. 90.
Kretschmer et al. (2010), p. 13.
In Italy, see Art. 68bis, Law 633/41. Cf. Valenti (2007a).
Recital 33, InfoSoc Directive.
ECJ, C-5/08, Infopaq International A/S v. Danske Dagblades Forening [2009], ECLI:EU:C:2009:465; C-403/08, Football Association Premier League Ltd and others v. QC Leisure [2011], ECLI:EU:C:2011:631; case C-302/10, Infopaq International A/S v. Danske Dagblades Forening [2012], ECLI:EU:C:2012:16; case C-360/13, Public Relations Consultants Association Ltd v. Newspaper Licensing Agency Ltd and Others [2014], ECLI:EU:C:2014:1195.
Triaille et al. (2014).
ECJ, C-302/10, Infopaq International A/S v. Danske Dagblades Forening, para. 54.
Ibidem.
Triaille et al. (2014), pp. 31–32.
The “normal use” of the content of the database has been established, for example, in the national proceedings of the Ryanair case (ECJ, Case C-30/14, Ryanair Ltd v. PR Aviation BV [2015], ECLI:EU:C:2015:10). In that case, the Netherlands court stated that the activity of an online intermediary comparing the prices of flights, which also involved extracting information from the Ryanair website, was a “normal use” of that database, and thus any contrary contractual provision was considered unenforceable. However, the ECJ noted that the existence of the sui generis right was not proven in the case: as a consequence, the prohibition of contractual overriding did not apply.
https://www.uber.com/legal/terms/be/ (accessed August 2018).
Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions, “A Digital Single Market Strategy for Europe”, COM/2015/0192 final.
Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions, “Towards a modern, more European copyright framework”, COM/2015/0626 final.
Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions, “Promoting a fair, efficient and competitive European copyright-based economy in the Digital Single Market”, COM/2016/592.
See the version of the text dated 25 May 2018 (Council of the EU, Interinstitutional File: 2016/0280(COD), doc. 9134/18; hereinafter “Council text”, available at: https://www.consilium.europa.eu/media/35373/st09134-en18.pdf).
Here the Council tries to fix the wording of Recital 8 (Commission text). The latter has been criticised in the literature for being a potential source of confusion, since it “wrongly suggests that carrying out TDM is per se of relevance to copyright. The explanations given in Recital 8 (Council text), according to which an authorisation to undertake such acts must be obtained from rightholders if no exception or limitation applies, are too sweeping”. Hilty and Richter (2017), p. 3. However, the clarification offered by the Council may not be sufficient: the temporary reproduction exception is only one of the possible legitimate activities that users can lawfully perform without authorisation or a specific TDM exception.
Defined at Art. 2(3), Draft Directive.
The wording of the Recital was far from being clear. On the one hand, by recognising the importance of peer review and verification, the proposal seems to allow the retention of the copies made under the exception “in certain cases” (not specified). Such copies must “be stored in a secure environment and not be retained for longer than is necessary for the scientific research activities”. The text leaves to the Member State the task to determine the concrete modalities for retaining the copies. However, the hard issue to ascertain would have been: when will the copies no longer be necessary? Is this something easy to determine in the context of research? The CDSM Directive has not maintained such a provision for the "scientific" TDM exception, but it appears at Art. 4(2) only.
See, for instance, the amendments proposed to the text of Art. 3 of the Commission proposal: amendment 538 by Julia Reda, Nessa Childers, Max Andersson, Michel Reimon, Brando Benifei (deleting any reference to research organisation, research purposes and lawful access), amendment 539 by Jytte Guteland (extending TDM to cultural heritage institutions), amendments 546 and 547 (respectively encouraging and obliging Member States to allow research organisations, without lawful access to works and other subject-matter, to perform TDM), amendment 548 (protecting the mandatory TDM exception against TPMs), amendments 551–555 (limiting the scope of the measures that the rightholder can adopt to ensure the security and integrity of the networks and databases where the works or other subject-matter are hosted) and amendment 564 (mandating the adoption of open formats for publicly-funded research and data in order to enable TDM). The text of the amendments presented by the Members of the EP is available here: https://euractiv.eu/wp-content/uploads/sites/2/2017/05/JURI-copyright-amendments.pdf.
Committee on Legal Affairs, Report on the proposal for a directive of the European Parliament and of the Council on copyright in the Digital Single Market (COM(2016)0593 – C8-0383/2016 – 2016/0280(COD)), 29 June 2018, available here: http://www.europarl.europa.eu/sides/getDoc.do?type=REPORT&mode=XML&reference=A8-2018-0245&language=EN.
In this sense, see the second sentence added at para. 1 of Art. 3, EP text.
See above Communication, “A Digital Single Market Strategy for Europe”, COM/2015/0192 final, p. 7: “Innovation in research for both non-commercial and commercial purposes, based on the use of text and data mining (e.g. copying of text and datasets in search of significant correlations or occurrences) may be hampered because of an unclear legal framework and divergent approaches at national level. The need for greater legal certainty to enable researchers and educational institutions to make wider use of copyright-protected material, including across borders, so that they can benefit from the potential of these technologies and from cross-border collaboration will be assessed, as with all parts of the copyright proposals in the light of its impact on all interested parties”.
Margoni and Kretschmer (2018) and Caspers and Guibault (2016a). As acutely pointed out by Margoni and Kretschmer: “The EU legislature is fully aware of this contradiction but failed to address it properly. In fact, Art. 6 of the Proposal (“common provisions”) clarifies that the provisions of the first, third and fifth subparagraph of Art. 6(4) InfoSoc directive apply. In plain English, this means that if a user qualifies for an exception to copyright (e.g. TDM) but a Technological Protection Measure prevents them from doing it, Member States have an obligation to take appropriate measures to ensure that right holders make available to the beneficiary an exception or limitation. In the almost 20 years since when the InfoSoc directive was enacted, the UKIPO, which has correctly put in place a specific procedure for this type of situations, has received less than a handful of requests”.
These thoughts were firstly elaborated in Strowel (2018).
See also Poort (2018).
Authors Guild v. Google, Inc. No. 13-4829-cv (2d Cir. Oct. 16, 2015). On April 18, 2016, the Supreme Court denied the petition for a writ of certiorari, leaving the Second Circuit ruling in Google's favour intact. To make available parts of the corpus of books, Google has scanned the digital copies and established a publicly available search function, the ngrams tool.
The following US decisions quoted in Authors Guild v. Google, Inc. have, for instance, exempted several uses that, without the application of the work use requirement, cannot escape copyright’s exclusivity in the EU: A.V. ex rel. Vanderhye v. iParadigms, LLC, 562 F.3d 630, 638–640 (4th Cir. 2009) (justifying as transformative fair use purpose the complete digital copying of a manuscript to determine whether the original included matter plagiarized from other works); Perfect 10, Inc. v. Amazon.com, Inc., 508 F.3d 1146, 1165 (9th Cir. 2007) (justifying as transformative fair use purpose the use of a digital, thumbnail copy of the original to provide an Internet pathway to the original); Kelly v. Arriba Soft Corp., 336 F.3d 811, 818–819 (9th Cir. 2003) (same); Bond v. Blum, 317 F.3d 385 (4th Cir. 2003) (justifying as fair use purpose the copying of author’s original unpublished autobiographical manuscript for the purpose of showing that he murdered his father and was an unfit custodian of his children).
On the theoretical foundation of platform cooperativism, see Scholz (2016).
Blablacar does not prohibit TDM on the whole content of the website, but only on a substantial part of it.
See Caspers and Guibault (2016a), p. 8.
According to the T&C of Bar d’Office, users cannot obtain (or attempt to obtain) any material or information through any means not intentionally made available by the platform. We therefore conclude that third-party applications aiming at analysing Bar d’Office legal documents do not fall within the permitted uses. Wibee does not allow users to “exploit in any way the content”, but this clause has to be read together with the contractual provision that allows the use of the platform for non-commercial purposes only. Finally, Menu Next Door contained the broad, but vague, formulation “All rights reserved”.
On the origin and functioning of this file, see http://www.robotstxt.org/robotstxt.html.
Rotenberg and Compañó (2009).
Art. 6(3), InfoSoc Directive.
Ibidem.
In Europe, the robots.txt file has been questioned with reference to the issue of implied license only. See, for instance, the Copiepresse v. Google saga, commented in Strowel (2007, 2011). Doubts about the classification of robots.txt as a TPM in the US context are expressed by Jasiewicz (2012). In the US case Healthcare Advocates, Inc. v. Harding [497 F. Supp. 2d 627, 643 (E.D. Pa. 2007)], the Court for the Eastern District of Pennsylvania incidentally discussed the nature of the robots.txt file. The judge recognised the protocol as a TPM under the DMCA in that specific case. However, the Court expressly affirmed that robots.txt is not “analogous to digital password protection or encryption” and its nature must be assessed case-by-case (“This finding should not be interpreted as a finding that a robots.txt file universally qualifies as a technological measure that controls access to copyrighted works under the DMCA”).
Buijze (2013). The analysis presented in this paragraph has been further developed in Ducato (forthcoming).
The transparency principle, as a duty to provide information before the conclusion of the contract, is envisaged in the Annex of the Unfair Terms Directive (UTD), which includes, among the list of potential unfair terms, the contractual provision which: “irrevocably binds the consumer to terms with which he had no real opportunity of becoming acquainted before the conclusion of the contract” (Annex, 1.i, UTD). It can also be derived from Recital 20, UTD. It is further recalled at Art. 6(1) of the Consumer Rights Directive (CRD).
The principle of transparency, sub specie of understandability of the information provided to the consumer, is specifically mentioned in several legislative instruments. The duty to provide the consumer with information in a clear and comprehensible manner is recalled at Art. 5, UTD: “In the case of contracts where all or certain terms offered to the consumer are in writing, these terms must always be drafted in plain, intelligible language”. Moreover, it is expressed at Arts 5(1), 6(1) CRD and further expanded at Art. 8 CRD. In addition, when the contract is concluded “through a means of distance communication which allows limited space or time to display the information” (Art. 8.4, CRD), like the screen of a mobile phone, the trader will have to provide at least a set of pre-contractual information, such as the main characteristics of the goods or services, the identity of the trader, the total price, the right of withdrawal, the duration of the contract and, if the contract is of indeterminate duration, the conditions for terminating the contract. Among the appropriate means to display information to the consumer, the Commission suggested the adoption of a set of icons, making also available a model. However, such a measure does not seem to have taken hold (EC Commission, DG Justice Guidance document concerning Directive 2011/83/EU of the European Parliament and of the Council of 25 October 2011 on consumer rights, amending Council Directive 93/13/EEC and Directive 1999/44/EC of the European Parliament and of the Council and repealing Council Directive 85/577/EEC and Directive 97/7/EC of the European Parliament and of the Council, June 2014, available here: https://ec.europa.eu/info/sites/info/files/crd_guidance_en_0.pdf).
Micklitz et al. (2009), p. 136.
Kästle-Lamparter (2018), p. 474.
When it has interpreted the plainness and intelligibility requirements, the ECJ has always excluded formalistic readings. In Kásler, for instance, the Court held that the requirement of transparency of terms, under the UTD, cannot “be reduced merely to being formally and grammatically intelligible” (ECJ, C-26/13, Árpád Kásler and Hajnalka Káslerné Rábai v. OTP Jelzálogbank Zrt. [2014] ECLI:EU:C:2014:282, para. 71). This principle has been confirmed in the subsequent case-law: see ECJ, Bogdan Matei and Ioana Ofelia Matei v. SC Volksbank România SA [2015] ECLI:EU:C:2015:127; see also ECJ, C-191/15, Verein für Konsumenteninformation v. Amazon EU Sàrl [2016] ECLI:EU:C:2016:612. Terms must be transparent “so that the consumer can foresee, on the basis of clear, intelligible criteria, the economic consequences for him which derive from it” (ECJ, C-26/13, Árpád Kásler and Hajnalka Káslerné Rábai v. OTP Jelzálogbank Zrt, para. 73). As later specified in Gutiérrez Naranjo, under the transparency principle, the consumer has to be able to understand not only the economic consequences but also the legal ones (ECJ, C-154/15, Francisco Gutiérrez Naranjo v. Cajasur Banco SAU, Ana María Palacios Martínez v. Banco Bilbao Vizcaya Argentaria SA (BBVA), Banco Popular Español SA v. Emilio Irles López and Teresa Torres Andreu [2016] ECLI:EU:C:2016:980).
About the interplay between consumer protection and data protection, Helberger et al. (2017).
EDPB, Guidelines on Transparency under Regulation 2016/679, 13 April 2018.
Art. 12.7 GDPR. See also Recital 60. The link between transparency, information and visualisation is further stressed at Recital 58, GDPR: “The principle of transparency requires that any information addressed to the public or to the data subject be concise, easily accessible and easy to understand, and that clear and plain language and, additionally, where appropriate, visualisation be used. Such information could be provided in electronic form, for example, when addressed to the public, through a website. This is of particular relevance in situations where the proliferation of actors and the technological complexity of practice make it difficult for the data subject to know and understand whether, by whom and for what purpose personal data relating to him or her are being collected, such as in the case of online advertising”.
OECD (2016), p. 17. A version for the GDPR set of icons was released in the Annex of the first reading of the European Parliament (European Parliament legislative resolution of 12 March 2014 on the proposal for a regulation of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation) [COM(2012)0011 – C7-0025/2012 – 2012/0011(COD)] (Ordinary legislative procedure: first reading)). However, no other official initiatives have been taken in this direction. An interesting methodology to answer the challenges of GDPR’s icons has been developed within the research project run by the Cirsfid group at the University of Bologna: http://gdprbydesign.cirsfid.unibo.it/.
Art. 12.1, GDPR. See also Recitals 39 and 42, GDPR.
Art. 7.2, GDPR.
EDPB, Guidelines on Transparency under Regulation 2016/679, p. 8.
Ibid., p. 11.
Ibid., p. 12.
Ibidem.
References
Ayres I, Schwartz A (2014) The no-reading problem in consumer contract law. Stanf Law Rev 66(3):545–610
Bakos Y et al (2014) Does anyone read the fine print? Consumer attention to standard form contracts. J Legal Stud 43(1):1–35
Bar-Gill O (2015) Defending (smart) disclosure: a comment on more than you wanted to know. Jerus Rev Legal Stud 11(1):75–82
Ben-Shahar O (2009) The myth of the ‘opportunity to read’ in contract law. Eur Rev Contract Law 5(1):1–28
Ben-Shahar O (2013) Regulation through boilerplate: an apologia. Mich L Rev 112:883–903
Bently L, Sherman B (2009) Intellectual property law. Oxford University Press, Oxford
Bernault C et al (2017) Traité de la propriété littéraire et artistique. Litec, Paris
Bernhardt B et al (2015) Revolutionizing scholarship: a panel discussion on text and data mining. Ser Rev 41(3):184–186
Beunen AC (2007) Protection for databases: the European Database Directive and its effects in the Netherlands. Wolf Legal Publishers, Nijmegen
Borghi M, Karapapa S (2015) Contractual restrictions on lawful use of information: sole-source databases protected by the back door? Eur Intellect Prop Rev 37(8):505–514
Braun D et al (2018) Customer-centered LegalTech: automated analysis of standard form. Internationales Rechtsinformatik Symposium (IRIS), pp 627–634
Buijze A (2013) The six faces of transparency. Utrecht L. Rev. 9(3):3–25
Busch C (2016) The future of pre-contractual information duties: from behavioural insights to big data. In: Twigg-Flesner C (ed) Research handbook on EU consumer and contract law. Edward Elgar, Cheltenham, pp 221–240
Busch C (2019) Implementing personalized law: personalized disclosures in consumer law and privacy law. Univ Chicago Law Rev 86(2):309–332
Caso R (2004) Digital rights management: il commercio delle informazioni digitali tra contratto e diritto d’autore. CEDAM, Padua
Caspers M, Guibault L (2016a) Baseline report of policies and barriers of TDM in Europe. https://www.futuretdm.eu/wp-content/uploads/FutureTDM_D3.3-Baseline-Report-of-Policies-and-Barriers-of-TDM-in-Europe-1.pdf. Accessed 18 August 2018
Caspers M, Guibault L (2016b) A right to ‘read’ for machines: assessing a black-box analysis exception for data mining. Computer Sciences 53(1):1–5
Contissa G et al (2018) CLAUDETTE meets GDPR. Automating the evaluation of privacy policies using artificial intelligence. https://www.beuc.eu/publications/beuc-x-2018-066_claudette_meets_gdpr_report.pdf. Accessed 18 October 2018
de Visscher F, Michaux B (2000) Précis du droit d’auteur et des droits voisins. Bruylant, Brussels
Derclaye E (2008) The legal protection of databases: a comparative analysis. Edward Elgar, Cheltenham
Derclaye E (2012) Football Dataco: skill and labour is dead. http://copyrightblog.kluweriplaw.com/2012/03/01/football-dataco-skill-and-labour-is-dead/. Accessed 18 August 2018
Derclaye E (2014) The Database Directive. In: Stamatoudi I, Torremans P (eds) EU copyright law: a commentary. Edward Elgar, Cheltenham, pp 298–354
Derclaye E, Favale M (2010) Paper 3: user contracts. In: Kretschmer M et al (eds) The relationship between copyright and contract law. http://eprints.bournemouth.ac.uk/16091/1/_contractlaw-report.pdf. Accessed 18 August 2018
Ducato R (forthcoming) Transparency by (legal) design. Paper presented at the Private Law Consortium, Harvard Law School, 15 May 2018 and at the Younger Scholar Informal Symposium, organised within the General Congress of the International Academy of Comparative Law, Fukuoka, 25 July 2018
Dussolier S (2007) Sharing access to intellectual property through private ordering. Chicago Kent Law Rev 82(3):1391–1438
Feldman R, Sanger J (2007) The text mining handbook: advanced approaches in analyzing unstructured data. Cambridge University Press, Cambridge
Geiger C et al (2018) The exception for text and data mining (TDM) in the proposed Directive on Copyright in the Digital Single Market—legal aspects. Centre for International Intellectual Property Studies (CEIPI) Research Paper No. 2018-02:1–34
Groom J (2004) Are agent exclusion clauses a legitimate application of the EU Database Directive? SCRIPTed 1:83–118
Guibault L (2002) Copyright limitations and contracts. Kluwer Law Int, The Hague
Guibault L et al (2012) Study on the implementation and effect in Member States’ laws of Directive 2001/29/EC on the harmonisation of certain aspects of copyright and related rights in the information society. Report to the European Commission, DG Internal Market, February 2007. Amsterdam Law School Research Paper No. 2012-28. Institute for Information Law Research Paper No. 2012-23. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2006358. Accessed 18 August 2018
Hall W, Pesenti J (2017) Growing the artificial intelligence industry in the UK. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/652097/Growing_the_artificial_intelligence_industry_in_the_UK.pdf. Accessed 18 August 2018
Harkous H et al (2018) Polisis: automated analysis and presentation of privacy policies using deep learning. arXiv preprint arXiv:1802.02561. Accessed 18 August 2018
Hearst MA (2003) Text data mining. In: Mitkov R (ed) The Oxford handbook of computational linguistics. Oxford University Press, Oxford, pp 616–662
Helberger N (2013) Form matters: informing consumers effectively (Study commissioned by BEUC). https://www.ivir.nl/publicaties/download/Form_matters.pdf. Accessed 18 August 2018
Helberger N, Hugenholtz PB (2007) No place like home for making a copy: private copying in European copyright law and consumer law. Berkeley Tech LJ 22(3):1061–1093
Helberger N et al (2013) Digital content contracts for consumers. J Consum Policy 36(1):37–57
Helberger N et al (2017) The perfect match? A closer look at the relationship between EU consumer law and data protection law. Common Mark Law Rev 54(5):1427–1465
Helleringer G, Sibony A-L (2017) European consumer protection through the behavioral lens. Columbia J Eur Law 23(3):607–649
Hillman RA (2006) Online boilerplate: would mandatory website disclosure of e-standard terms backfire? Mich Law Rev 104(5):837–856
Hilty RM, Richter H (2017) Position statement of the Max Planck Institute for Innovation and Competition on the proposed modernisation of European copyright rules part B exceptions and limitations (Art. 3–Text and Data Mining). https://www.ip.mpg.de/fileadmin/ipmpg/content/stellungnahmen/MPI_Position_Statement_Part_B_Chapter_1_Update23022017.pdf. Accessed 18 August 2018
Hilty RM et al (2008) Declaration on a balanced interpretation of the “three-step test” in copyright law. Int Rev Intellect Prop Compet Law 39(6):707–712
Hugenholtz PB (2000) Copyright, contract and code: what will remain of the public domain? Brook J Int Law 26:77–90
Jasiewicz MI (2012) Copyright protection in an opt-out world: implied license doctrine and news aggregators. Yale Law J 122(3):837–850
Kästle-Lamparter D (2018) Pre-contractual information duties. In: Jansen N, Zimmermann R (eds) Commentaries on European contract laws. Oxford University Press, Oxford, pp 384–504
Kretschmer M et al (2010) The relationship between copyright and contract law. http://eprints.bournemouth.ac.uk/16091/1/_contractlaw-report.pdf. Accessed 18 August 2018
Lippi M et al (2018) CLAUDETTE: an automated detector of potentially unfair clauses in online terms of service. arXiv preprint arXiv:1805.01217. Accessed 18 October 2018
Loos M (2015) Transparency of standard terms under the Unfair Contract Terms Directive and the proposal for a common European sales law. Eur Rev Priv Law 23(2):179–193
Margoni T (2012) Eccezioni e limitazioni al diritto d’autore in Internet = Exceptions and Limitations to Copyright Law in the Internet. https://www.ivir.nl/publicaties/download/Giurisprudenza_Italiana_2011_8_9.pdf. Accessed 18 August 2018
Margoni T, Dore G (2016) Why we need a text and data mining exception (but it is not enough). https://interop2016.github.io/pdf/INTEROP-13.pdf. Accessed 18 August 2018
Margoni T, Kretschmer M (2018) The text and data mining exception in the proposal for a Directive on Copyright in the Digital Single Market: why it is not what EU copyright law needs. https://www.create.ac.uk/blog/2018/04/25/why-tdm-exception-copyright-directive-digital-single-market-not-what-eu-copyright-needs/. Accessed 18 August 2018
Mazziotti G (2008) EU digital copyright law and the end-user. Springer, Berlin
Micklitz H-W et al (2009) Understanding EU consumer law. Intersentia, Antwerp
Montagnani ML (2007) Dal peer-to-peer ai sistemi di Digital Rights Management: primi appunti sul melting pot della distribuzione online. Il diritto d’autore 1:1–57
Montagnani ML, Aime G (2017) Il text and data mining e il diritto d’autore. AIDA XXVI:376–394
Myska M, Harasta J (2016) Less is more: protecting databases in the EU after Ryanair. Masaryk UJL Tech 10:170–198
OECD (2016) Protecting consumers in peer platform markets: exploring the issues. https://unctad.org/meetings/en/Contribution/dtl-eWeek2017c05-oecd_en.pdf. Accessed 18 August 2018
Poort J (2018) Borderlines of copyright protection: an economic analysis. In: Hugenholtz PB (ed) Copyright reconstructed: rethinking copyright’s economic rights in a time of highly dynamic technological and economic change. Wolters Kluwer, Alphen aan den Rijn, pp 283–338
Porat A, Strahilevitz LJ (2013) Personalizing default rules and disclosure with big data. Mich Law Rev 112(8):1417–1478
Radin MJ (2013) Boilerplate: the fine print, vanishing rights, and the rule of law. Princeton University Press, Princeton
Rosati E (2013a) Originality in EU copyright. Edward Elgar, Cheltenham
Rosati E (2013b) Towards an EU-wide copyright? (judicial) pride and (legislative) prejudice. Intellect Prop Q 1:47–68
Rosati E (2018) National and EU text and data mining exceptions: room for coexistence? http://ipkitten.blogspot.com/2018/03/national-and-eu-text-and-data-mining.html. Accessed 18 August 2018
Rotenberg B, Compañó R (2009) Search engines for audio-visual content: copyright law and its policy relevance. In: Curwen P et al (eds) Telecommunication markets. Springer, Heidelberg, pp 113–139
Sadeh N et al (2013) The usable privacy policy project. http://reports-archive.adm.cs.cmu.edu/anon/isr2013/CMU-ISR-13-119.pdf. Accessed 18 August 2018
Scholz T (2016) Platform cooperativism. Challenging the corporate sharing economy. Rosa Luxemburg Stiftung, New York
Sire G (2015) Inclusion exclue: le code est un contrat léonin. Enquête sur la valeur technique et juridique du protocole robots.txt. Réseaux 189(1):187–214
Stamatoudi IA (2016) Text and data mining. In: Stamatoudi IA (ed) New developments in EU and international copyright law. Wolters Kluwer, Alphen aan den Rijn, pp 251–282
Strowel A (1993) Droit d’auteur et copyright, Divergences et convergences. Bruylant, Paris
Strowel A (2007) Google et les nouveaux services en ligne: quels effets sur l’économie des contenus, quels défis pour la propriété intellectuelle. Journal des tribunaux 22:589–598
Strowel A (2011) Quand Google défie le droit. Larcier, Bruxelles
Strowel A (2012) European copyright: beyond the additions made by the European Court of Justice, some pieces are still missing. In: Janssens M-C, Van Overwalle G (eds) Harmonisation of European IP law, in Honor of Fr. Gotzen. De Boeck, Bruxelles, pp 73–98
Strowel A (2014) Droit d’auteur et copyright. Convergences des droits, régulation différente des contrats. In: Mélanges en l’honneur du Professeur André Lucas. LexisNexis, Paris, pp 699–717
Strowel A (2015) Fair compensation for private copying. In: Copyright and the digital agenda for Europe: current regulations and challenges for the future. Sakkoulas Publications, Athens, pp 189–195
Strowel A (2018) Reconstructing the reproduction and communication to the public rights: how to align copyright with its fundamentals. In: Hugenholtz PB (ed) Copyright reconstructed: rethinking copyright’s economic rights in a time of highly dynamic technological and economic change. Wolters Kluwer, Alphen aan den Rijn, pp 203–240
Strowel A, Derclaye E (2001) Droit d’auteur et numérique: logiciels, bases de données, multimédia. Bruylant, Bruxelles
Sunstein C (2012) Informing consumers through smart disclosure. https://obamawhitehouse.archives.gov/blog/2012/03/30/informing-consumers-through-smart-disclosure. Accessed 28 August 2018
Triaille JP et al (2013) Study on the application of Directive 2001/29/EC on copyright and related rights in the information society. https://publications.europa.eu/en/publication-detail/-/publication/9ebb5084-ea89-4b3e-bda2-33816f11425b. Accessed 18 October 2018
Triaille JP et al (2014) Study on the legal framework of text and data mining (TDM). https://publications.europa.eu/en/publication-detail/-/publication/074ddf78-01e9-4a1d-9895-65290705e2a5/language-en. Accessed 18 October 2018
Valenti R (2007a) Art. 68-bis. In: Ubertazzi LC (ed) Diritto d’autore. Estratto da L.C. Ubertazzi “Commentario breve alle leggi su proprietà intellettuale e concorrenza”. CEDAM, Padua, pp 203–204
Valenti R (2007b) Introduzione agli artt. 65-71-quinquies. In: Ubertazzi LC (ed) Diritto d’autore. Estratto da L.C. Ubertazzi “Commentario breve alle leggi su proprietà intellettuale e concorrenza”. CEDAM, Padua, pp 190–196
Vallés RC (2009) The requirement of originality. In: Research handbook on the future of EU copyright. Edward Elgar, Cheltenham
Vila T et al (2003) Why we can’t be bothered to read privacy policies: models of privacy economics as a lemons market. In: ICEC ’03 proceedings of the 5th International Conference on Electronic Commerce, pp 403–407
Waelde C et al (2013) Contemporary intellectual property: law and policy. Oxford University Press, Oxford
Walter M, Von Lewinski S (2010) European copyright law. A commentary. Oxford University Press, Oxford
Wilhelmsson T (2004) The abuse of the “confident consumer” as a justification for EC consumer law. J Consum Policy 27(3):317–337
Acknowledgements
The research is supported by the Innoviris Grant 2016-BB2B-9. Sincere thanks go to Dr. Guido Noto La Diega for the constructive discussion of an early version of this paper. The authors jointly conceived the paper and share the views expressed therein. Nonetheless, while Section 5 is attributable to Alain Strowel, Sections 2, 6 and 7 are specifically attributable to Rossana Ducato. Both authors contributed equally to the drafting of Sections 1, 3, 4 and 8.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ducato, R., Strowel, A. Limitations to Text and Data Mining and Consumer Empowerment: Making the Case for a Right to “Machine Legibility”. IIC 50, 649–684 (2019). https://doi.org/10.1007/s40319-019-00833-w