Limitations to Text and Data Mining and Consumer Empowerment: Making the Case for a Right to “Machine Legibility”

  • Article
  • Published in: IIC - International Review of Intellectual Property and Competition Law

Abstract

This paper focuses on the current legal barriers to text and data mining (TDM) in the context of smart disclosure systems (SDSs) whose aim is to provide consumers with improved access to the data needed to make informed decisions. The use of intellectual property rights and contracts, combined with technological protection measures, can hinder TDM and the deployment of SDSs. Further, those legal constraints can negatively impact on artificial intelligence innovation, because that requires improved access to data. There are thus various arguments for enhanced “machine legibility”. However, the TDM exceptions included in the recently approved Directive on Copyright in the Digital Single Market do not appear to clear the way for enhanced “machine legibility”. In relation to SDSs, we also argue that the principle of transparency, which is embedded in consumer and data protection laws, can serve as a last line of defence against prohibition of TDM.

Notes

  1. Sunstein (2012).

  2. On the advantages of smart disclosures and targeted information, see Ben-Shahar (2009); Helberger (2013); Porat and Strahilevitz (2013); Bar-Gill (2015); Busch (2016); Helleringer and Sibony (2017); Busch (2019).

  3. One of the first projects in the area of automated analysis of legal documents is “Usable Privacy Policy” (www.usableprivacy.org), a consortium led by Carnegie Mellon University. Their tool aims to help users navigate the text of a privacy policy and identify the privacy options and choices available (Sadeh et al. 2013). More recently, an international team of researchers from the Swiss Federal Institute of Technology, the University of Wisconsin and the University of Michigan has launched two tools: Polisis (https://pribot.org/polisis), a tool to visualise the content of a privacy policy in a very effective way; and Pribot (https://pribot.org/bot), a chatbot that answers questions about a specific privacy policy (Harkous et al. 2018). In the field of the automated analysis of T&C, we must mention CLAUDETTE, a research project carried out by an interdisciplinary team at the European University Institute (https://claudette.eui.eu). The tool, based on machine learning techniques, assesses the fairness of consumer standard terms (https://claudette.eui.eu/use-our-tools/). This functionality will be extended to the analysis of privacy policies. For more information, see Contissa et al. (2018); Lippi et al. (2018). Another interdisciplinary project, SaToS (Software Aided Analysis of ToS), is conducted by the chair of Software Engineering for Business Information Systems (Sebis) at TU Munich. The German research group is developing a solution to automatically identify the Terms of Service of e-commerce websites and summarise the key points of the contract in simplified language (Braun et al. 2018).

  4. This is precisely one of the objectives of “The Internet of Platforms: an empirical research on private ordering and consumer protection in the sharing economy”, a project carried out at UCLouvain. The project aims to address the lack of transparency in sharing economy transactions and to improve the information users receive from and about the platform (http://www.rosels.eu/research/research-project-iop/). This paper presents some of the preliminary results of this project.

  5. The literature on users’ tendency not to read contracts is extensive. Ex multis, Vila et al. (2003); Wilhelmsson (2004); Hillman (2006); Ben-Shahar (2013); Radin (2013); Ayres and Schwartz (2014); Bakos et al. (2014).

  6. In the recently approved Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC (OJ L 130, 17.5.2019, pp. 92–125; hereinafter the “Directive on Copyright in the DSM” or “CDSM Directive”), TDM is defined as “any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations” (Art. 2.2, CDSM Directive). The definition is sufficiently broad to embrace the current TDM application panorama. For a technical definition of text and data mining, see Hearst (2003). Specifically on text mining, see Feldman and Sanger (2007). For an extensive analysis of the definition of TDM, see Triaille et al. (2014).

  7. Triaille et al. (2014); Bernhardt et al. (2015); Caspers and Guibault (2016a); Margoni and Dore (2016); Stamatoudi (2016); Hilty and Richter (2017); Geiger et al. (2018); Margoni and Kretschmer (2018); Rosati (2018).

  8. On the crucial need to train algorithms on different datasets, see Hall and Pesenti (2017).

  9. By “private ordering”, we refer to contractual, technological and informal measures used as tools to enforce platforms’ rights and interests towards their users. In the absence of a clear legislative framework or effective (and efficient) remedies, contracts and technology can be used to expand the prerogatives and powers of platforms, restricting the legitimate uses and faculties of the weak party, as noted with reference to the intellectual property domain by Dussolier (2007), pp. 1393–1394.

  10. Arts. 3 and 4, CDSM Directive.

  11. Triaille et al. (2014); Margoni and Kretschmer (2018); Stamatoudi (2016); Montagnani and Aime (2017); Strowel (2018).

  12. Art. 4(1), General Data Protection Regulation.

  13. Art. 3, Database Directive. For a comprehensive overview, see Beunen (2007); Derclaye (2008, 2014).

  14. Art. 1(2), Database Directive.

  15. For the classification of a website as a database, see Strowel and Derclaye (2001), pp. 311–312.

  16. ECJ, Case C-604/10, Football Dataco Ltd and Others v. Yahoo! UK Ltd and Others [2012], ECLI:EU:C:2012:115, para. 38. Derclaye (2012); Rosati (2013b).

  17. ECJ, Case C-604/10, Football Dataco Ltd, para. 38.

  18. Ibid., para. 39.

  19. In line with the US leading case Feist Publications Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991). See Waelde et al. (2013), p. 65.

  20. Art. 7, Database Directive.

  21. Derclaye (2008), p. 107.

  22. ECJ, C-46/02, Fixtures Marketing Ltd v. Oy Veikkaus Ab [2004] ECLI:EU:C:2004:694; C-338/02, Fixtures Marketing Ltd v. Svenska Spel AB [2004] ECLI:EU:C:2004:696; C-444/02, Fixtures Marketing Ltd v. Organismoa Prognostikon Agnon Podosfairou AE (OPAP) [2004] ECLI:EU:C:2004:697; C-203/02, The British Horseracing Board Ltd and Others v. William Hill Organization Ltd [2004] ECLI:EU:C:2004:695.

  23. The maker of the database could state the existence of the sui generis right in the T&C, but this is quite rare. Usually, T&C contain general formulas such as “All rights reserved”, leaving ample room for interpretation to the end-user, who in most cases is not a lawyer, much less an IP expert.

  24. As is well known, the originality requirement is not harmonised in the InfoSoc Directive (it is mentioned in the Software, Term and Database directives). However, the concept has been interpreted as “the author’s own intellectual creation” and applied to works in several decisions of the European Court of Justice. See ECJ, C-5/08, Infopaq International [2009] ECLI:EU:C:2009:465, para. 37; C-403/08, Football Association Premier League and Others [2011] ECLI:EU:C:2011:631, para. 97; C-393/09, BSA [2010], ECLI:EU:C:2010:816, paras. 44–45; C-355/12, Nintendo and Others [2014], ECLI:EU:C:2014:25, paras. 21–22. For an overview of the “Europeanization” of originality, see Strowel (2012).

  25. Rosati (2013a). See also ECJ, Case C-161/17, Land Nordrhein-Westfalen v. Dirk Renckhoff [2018] ECLI:EU:C:2018:634 (on a photograph of the city of Cordoba).

  26. ECJ, Case C-5/08 Infopaq International, para. 45.

  27. Ibid., para. 48.

  28. District Court of Milan, specialised section, decision No. 6057/2014, available online here: https://www.giurisprudenzadelleimprese.it/wordpress/wp-content/uploads/2014/08/20140512_RG79952-20111.pdf.

  29. District Court of Venice, specialised section, decision of 17 December 2014, RG 1522/2011 (not published).

  30. Madrid Provincial Court (s. 12) of 3 March 2004, cited in Vallés (2009), pp. 114–115.

  31. See Bernault et al. (2017), p. 116, footnote 94.

  32. Paris Criminal Court, 17 January 1968, Gaz. Pal. 1968, I, p. 197.

  33. Paris Commercial Court, 67ème ch., 4 Sept. 1989, Expertises, 1991, p. 273, obs. Gross (contract proposed by a credit provider to traders); Paris Court of Appeal, 4th ch., 27 Nov. 2002, Expertises, 2003, p. 190 (software licence); Douai Court of Appeal, 27 March 2013, Propr. Intell., 2013, p. 285, obs. Brugière (terms of reference for public procurement).

  34. Brussels Court of Appeal, 28 January 1997, cited in de Visscher and Michaux (2000), p. 31, footnote 125.

  35. Supreme Court of Judicature – Court of Appeal, Elanco v. Mandops [1980] RPC 213.

  36. Bently and Sherman (2009), p. 64.

  37. Federal Supreme Court, 10 October 1991 – I ZR 147/89 (“Bedienungsanweisung”).

  38. Federal Supreme Court, 11 April 2002 – I ZR 231/99 (“Technische Lieferbedingungen”).

  39. Federal Supreme Court, 22 September 1999 – I ZR 48/97 (“Planungsmappe”).

  40. The comment of Senator Kennedy at the April 2018 Congressional hearing of Mark Zuckerberg regarding the Cambridge Analytica affair was widely disseminated online: “Here's what everybody's been trying to tell you today, and – and I say this gently. Your user agreement sucks […] The purpose of that user agreement is to cover Facebook's rear end. It's not to inform your users about their rights. Now, you know that and I know that. I'm going to suggest to you that you go back home and rewrite it. And tell your $1,200 an hour lawyers, no disrespect. They're good. But – but tell them you want it written in English and non-Swahili, so the average American can understand it. That would be a start”.

  41. See point 5 of the Creative Commons Terms, https://creativecommons.org/terms/.

  42. See for instance the privacy policy elaborated by the designer Stefania Passera: https://juro.com/policy.html.

  43. TDM may also involve the reproduction, adaptation, communication and/or distribution of the database itself. However, we do not analyse this case here.

  44. In the same sense, Caspers and Guibault (2016a); Triaille et al. (2014).

  45. Triaille et al. (2014), p. 32. According to the authors, this is not the case if the tool simply spots one or two words through the text without making a copy of the work (e.g. spotting and counting the occurrences of the word “malaria”). In particular, p. 31. Similarly, Stamatoudi (2016), p. 1261; Montagnani and Aime (2017), p. 379 ff.

  46. We do not take into consideration Art. 9(a) Database Directive, since the exception for personal use applies only to non-electronic databases, which do not permit TDM in any case.

  47. Art. 8(2), Database Directive.

  48. Art. 8(3), Database Directive.

  49. As required by Art. 5(3)(a) InfoSoc Directive and Art. 6(1)(a) Database Directive, but not by Art. 9(b) Database Directive. See Derclaye (2008), p. 131.

  50. For a complete analysis of the activities that could fall within the exception, Triaille et al. (2013), p. 359 ff.; Guibault et al. (2012), p. 49 ff.; Montagnani and Aime (2017), pp. 385 ff.

  51. See Art. 5(3)(a) InfoSoc Directive and Arts. 6(2)(b) and 9(1)(b) Database Directive. Another inconsistency between the two Directives must be noted with regard to the research exception. Under the Database Directive, the reference to the source appears to be a mandatory requirement, while the InfoSoc Directive admits the possibility of not indicating the author’s name if “this turns out to be impossible” (Art. 5.3.a. InfoSoc Directive). According to some authors, this difference is more declaratory than substantive, considering the general principle of “ad impossibilia nemo tenetur”, which will be applicable to the exception for databases in any case [see Montagnani and Aime (2017), p. 387; citing Walter and Von Lewinski (2010)]. Other authors interpret the provisions literally [cf. Triaille et al. (2014), p. 70, as reported in Montagnani and Aime (2017), p. 387].

  52. Caspers and Guibault (2016a), p. 34. See also Helberger and Hugenholtz (2007). The private copying exception has traditionally received little attention in the literature and the case law, apart from the issues related to the copyright levies and fair compensation. See Strowel (2015).

  53. Caspers and Guibault (2016a), p. 34.

  54. Valenti (2007b), p. 195.

  55. Ibid. p. 202, para. III and the bibliography thereby cited.

  56. This is despite the potential “open-ended” nature of the three-step test, as reported by Margoni (2012). See also Hilty et al. (2008). For phonograms and videograms, by contrast, there is a specific provision: the reproduction is permitted if done by a physical person solely for personal use and for non-commercial purposes, in compliance with the applicable TPMs (Art. 71sexies (1), Law 633/1941). The exception will not apply if the reproduction is done by a third party (Art. 71sexies (2), Law 633/1941) and if the works are available on-demand and protected by TPMs or contracts (Art. 71sexies (3), Law 633/1941). On the limits of the private copying exception for the digital context and the interplay with TPMs, cf. Caso (2004); Montagnani (2007); Mazziotti (2008).

  57. See Derclaye and Favale (2010); Helberger et al. (2013); Triaille et al. (2013); Caspers and Guibault (2016a). Regarding TPMs, Art. 6(4) does not require Member States to take appropriate measures to ensure that the beneficiaries of the private copying exception do in practice benefit from the exception.

  58. Derclaye and Favale (2010), p. 90.

  59. The conflict between freedom of contract and copyright limitations is extensively investigated in Guibault (2002). The issue was already pointed out, albeit with reference to the proposal for the InfoSoc Directive, by Hugenholtz (2000).

  60. Kretschmer et al. (2010), p. 13.

  61. In Italy, see Art. 68bis, Law 633/41. Cf. Valenti (2007a).

  62. Recital 33, InfoSoc Directive.

  63. ECJ, C-5/08, Infopaq International A/S v. Danske Dagblades Forening [2009], ECLI:EU:C:2009:465; C-403/08, Football Association Premier League Ltd and others v. QC Leisure [2011], ECLI:EU:C:2011:631; case C-302/10, Infopaq International A/S v. Danske Dagblades Forening [2012], ECLI:EU:C:2012:16; case C-360/13, Public Relations Consultants Association Ltd v. Newspaper Licensing Agency Ltd and Others [2014], ECLI:EU:C:2014:1195.

  64. Triaille et al. (2014).

  65. ECJ, C-302/10, Infopaq International A/S v. Danske Dagblades Forening, para. 54.

  66. Ibidem.

  67. Triaille et al. (2014), pp. 31–32.

  68. The “normal use” of the content of the database has been established, for example, in the national proceedings of the Ryanair case (ECJ, Case C-30/14, Ryanair Ltd v. PR Aviation BV [2015], ECLI:EU:C:2015:10). In that case, the Netherlands court stated that the activity of an online intermediary comparing the prices of flights, which also involved extracting information from the Ryanair website, was a “normal use” of that database, and that any contrary contractual provision was therefore unenforceable. However, the ECJ noted that the existence of the sui generis right was not proven in the case: as a consequence, the prohibition of contractual overriding did not apply.

  69. https://www.uber.com/legal/terms/be/ (accessed August 2018).

  70. ECJ, Case C-30/14, Ryanair Ltd v. PR Aviation BV, paras. 44–45. Cf. Borghi and Karapapa (2015); Myska and Harasta (2016).

  71. Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions, “A Digital Single Market Strategy for Europe”, COM/2015/0192 final.

  72. Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions, “Towards a modern, more European copyright framework”, COM/2015/0626 final.

  73. Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions, “Promoting a fair, efficient and competitive European copyright-based economy in the Digital Single Market”, COM/2016/592.

  74. See the version of the text dated 25 May 2018 (Council of the EU, Interinstitutional File: 2016/0280(COD), doc. 9134/18, hereinafter “Council text”, available at: https://www.consilium.europa.eu/media/35373/st09134-en18.pdf).

  75. Here the Council tries to fix the wording of Recital 8 (Commission text). The latter has been criticised in the literature for being a potential source of confusion, since it “wrongly suggests that carrying out TDM is per se of relevance to copyright. The explanations given in Recital 8 (Council text), according to which an authorisation to undertake such acts must be obtained from rightholders if no exception or limitation applies, are too sweeping”. Hilty and Richter (2017), p. 3. However, the clarification offered by the Council may not be sufficient: the temporary reproduction exception covers only one of the possible legitimate activities that users can lawfully perform without authorisation or a specific TDM exception.

  76. Defined at Art. 2(3), Draft Directive.

  77. The wording of the Recital was far from clear. On the one hand, by recognising the importance of peer review and verification, the proposal seems to allow the retention of the copies made under the exception “in certain cases” (not specified). Such copies must “be stored in a secure environment and not be retained for longer than is necessary for the scientific research activities”. The text leaves to the Member States the task of determining the concrete modalities for retaining the copies. However, the hard issue to ascertain would have been: when will the copies no longer be necessary? Is this something easy to determine in the context of research? The CDSM Directive has not maintained such a provision for the “scientific” TDM exception; it now appears at Art. 4(2) only.

  78. See, for instance, the amendments proposed to the text of Art. 3 of the Commission proposal: amendment 538 by Julia Reda, Nessa Childers, Max Andersson, Michel Reimon, Brando Benifei (deleting any reference to research organisation, research purposes and lawful access), amendment 539 by Jytte Guteland (extending TDM to cultural heritage institutions), amendments 546 and 547 (respectively encouraging and obliging Member States to allow research organisations, without lawful access to works and other subject-matter, to perform TDM), amendment 548 (protecting the mandatory TDM exception against TPMs) and amendments 551–555 (limiting the scope of the measures that the rightholder can adopt to ensure the security and integrity of the networks and databases where the works or other subject-matter are hosted), amendment 564 (mandating the adoption of open formats for publicly-funded research and data in order to enable TDM). The text of the amendments presented by the Members of the EP is available here: https://euractiv.eu/wp-content/uploads/sites/2/2017/05/JURI-copyright-amendments.pdf.

  79. Committee on Legal Affairs, Report on the proposal for a directive of the European Parliament and of the Council on copyright in the Digital Single Market (COM(2016)0593 – C8-0383/2016 – 2016/0280(COD)), 29 June 2018, available here: http://www.europarl.europa.eu/sides/getDoc.do?type=REPORT&mode=XML&reference=A8-2018-0245&language=EN.

  80. In this sense, see the second sentence added at para. 1 of Art. 3, EP text.

  81. Many scholars have argued that the exception should be broadened. For instance, Margoni and Kretschmer (2018); Caspers and Guibault (2016b); Margoni and Dore (2016); Hilty and Richter (2017); Geiger et al. (2018).

  82. See above Communication, “A Digital Single Market Strategy for Europe”, COM/2015/0192 final, p. 7: “Innovation in research for both non-commercial and commercial purposes, based on the use of text and data mining (e.g. copying of text and datasets in search of significant correlations or occurrences) may be hampered because of an unclear legal framework and divergent approaches at national level. The need for greater legal certainty to enable researchers and educational institutions to make wider use of copyright-protected material, including across borders, so that they can benefit from the potential of these technologies and from cross-border collaboration will be assessed, as with all parts of the copyright proposals in the light of its impact on all interested parties”.

  83. Margoni and Kretschmer (2018) and Caspers and Guibault (2016a). As acutely pointed out by Margoni and Kretschmer: “The EU legislature is fully aware of this contradiction but failed to address it properly. In fact, Art. 6 of the Proposal (“common provisions”) clarifies that the provisions of the first, third and fifth subparagraph of Art. 6(4) InfoSoc directive apply. In plain English, this means that if a user qualifies for an exception to copyright (e.g. TDM) but a Technological Protection Measure prevents them from doing it, Member States have an obligation to take appropriate measures to ensure that right holders make available to the beneficiary an exception or limitation. In the almost 20 years since when the InfoSoc directive was enacted, the UKIPO, which has correctly put in place a specific procedure for this type of situations, has received less than a handful of requests”.

  84. These thoughts were first elaborated in Strowel (2018).

  85. See also Poort (2018).

  86. See the review of the history and principles of copyright in both legal traditions: Strowel (1993). More recently, Strowel (2014), pp. 701–703.

  87. Authors Guild v. Google, Inc. No. 13-4829-cv (2d Cir. Oct. 16, 2015). On April 18, 2016, the Supreme Court denied the petition for a writ of certiorari, leaving the Second Circuit ruling in Google's favour intact. To make available parts of the corpus of books, Google has scanned the digital copies and established a publicly available search function, the ngrams tool.

  88. The following US decisions quoted in Authors Guild v. Google, Inc. have, for instance, exempted several uses that, without the application of the work use requirement, cannot escape copyright’s exclusivity in the EU: A.V. ex rel. Vanderhye v. iParadigms, LLC, 562 F.3d 630, 638–640 (4th Cir. 2009) (justifying as transformative fair use purpose the complete digital copying of a manuscript to determine whether the original included matter plagiarized from other works); Perfect 10, Inc. v. Amazon.com, Inc., 508 F.3d 1146, 1165 (9th Cir. 2007) (justifying as transformative fair use purpose the use of a digital, thumbnail copy of the original to provide an Internet pathway to the original); Kelly v. Arriba Soft Corp., 336 F.3d 811, 818–819 (9th Cir. 2003) (same); Bond v. Blum, 317 F.3d 385 (4th Cir. 2003) (justifying as fair use purpose the copying of author’s original unpublished autobiographical manuscript for the purpose of showing that he murdered his father and was an unfit custodian of his children).

  89. Caspers and Guibault (2016a); Stamatoudi (2016); Triaille et al. (2014).

  90. On the theoretical foundation of platform cooperativism, see Scholz (2016).

  91. Blablacar does not prohibit TDM on the whole content of the website, but only on a substantial part of it.

  92. See Caspers and Guibault (2016a), p. 8.

  93. According to the T&C of Bar d’Office, users cannot obtain (or attempt to obtain) any material or information through any means not intentionally made available by the platform. We therefore conclude that third-party applications aiming at analysing Bar d’Office legal documents do not fall within the permitted uses. Wibee does not allow users to “exploit in any way the content”, but this has to be read in combination with the contractual provision that allows the use of the platform for non-commercial purposes only. Finally, Menu Next Door contained the broad, but vague, formulation “All rights reserved”.

  94. On the origin and functioning of this file, see http://www.robotstxt.org/robotstxt.html.

  95. Rotenberg and Compañó (2009).

  96. Art. 6(3), InfoSoc Directive.

  97. Ibidem.

  98. Sire (2015). Contra, Groom (2004), according to which a visiting agent programmed to systematically ignore the robots.txt can be seen as a strategy to circumvent a technological protection measure.

  99. In Europe, the robots.txt file has been questioned with reference to the issue of implied license only. See, for instance, the Copiepresse v. Google saga, commented in Strowel (2007, 2011). Doubts about the classification of robots.txt as a TPM in the US context are expressed by Jasiewicz (2012). In the US case Healthcare Advocates, Inc. v. Harding [497 F. Supp. 2d 627, 643 (E.D. Pa. 2007)], the Court for the Eastern District of Pennsylvania incidentally discussed the nature of the robots.txt file. The judge recognised the protocol as a TPM under the DMCA in that specific case. However, the Court expressly affirmed that robots.txt is not “analogous to digital password protection or encryption” and its nature must be assessed case-by-case (“This finding should not be interpreted as a finding that a robots.txt file universally qualifies as a technological measure that controls access to copyrighted works under the DMCA”).

  100. Buijze (2013). The analysis presented in this paragraph has been further developed in Ducato (forthcoming).

  101. Loos (2015); Kästle-Lamparter (2018), pp. 429–430, 474 and 481. See also Micklitz et al. (2009), pp. 135 ff.

  102. The transparency principle, as a duty to provide information before the conclusion of the contract, is envisaged in the Annex of the Unfair Terms Directive (UTD), which includes, among the list of potential unfair terms, the contractual provision which: “irrevocably binds the consumer to terms with which he had no real opportunity of becoming acquainted before the conclusion of the contract” (Annex, 1.i, UTD). It can also be derived from Recital 20, UTD. It is further recalled at Art. 6(1) of the Consumer Rights Directive (CRD).

  103. The principle of transparency, sub specie of understandability of the information provided to the consumer, is specifically mentioned in several legislative instruments. The duty to provide the consumer with information in a clear and comprehensible manner is recalled at Art. 5, UTD: “In the case of contracts where all or certain terms offered to the consumer are in writing, these terms must always be drafted in plain, intelligible language”. Moreover, it is expressed at Arts. 5(1) and 6(1) CRD and further expanded at Art. 8 CRD. In addition, when the contract is concluded “through a means of distance communication which allows limited space or time to display the information” (Art. 8.4, CRD), like the screen of a mobile phone, the trader will have to provide at least a set of pre-contractual information, such as the main characteristics of the goods or services, the identity of the trader, the total price, the right of withdrawal, the duration of the contract and, if the contract is of indeterminate duration, the conditions for terminating the contract. Among the appropriate means to display information to the consumer, the Commission suggested the adoption of a set of icons and also made a model available. However, such a measure does not seem to have taken hold (EC Commission, DG Justice Guidance document concerning Directive 2011/83/EU of the European Parliament and of the Council of 25 October 2011 on consumer rights, amending Council Directive 93/13/EEC and Directive 1999/44/EC of the European Parliament and of the Council and repealing Council Directive 85/577/EEC and Directive 97/7/EC of the European Parliament and of the Council, June 2014, available here: https://ec.europa.eu/info/sites/info/files/crd_guidance_en_0.pdf).

  104. Micklitz et al. (2009), p. 136.

  105. Kästle-Lamparter (2018), p. 474.

  106. In interpreting the plainness and intelligibility requirements, the ECJ has always excluded formalistic readings. In Kásler, for instance, the Court held that the requirement of transparency of terms, under the UTD, cannot “be reduced merely to being formally and grammatically intelligible” (ECJ, C-26/13, Árpád Kásler and Hajnalka Káslerné Rábai v. OTP Jelzálogbank Zrt. [2014] ECLI:EU:C:2014:282, para. 71). This principle was confirmed in the subsequent jurisprudence (see ECJ, Bogdan Matei and Ioana Ofelia Matei v. SC Volksbank România SA [2015] ECLI:EU:C:2015:127; see also ECJ, C-191/15, Verein für Konsumenteninformation v. Amazon EU Sàrl [2016] ECLI:EU:C:2016:612). Terms must be transparent “so that the consumer can foresee, on the basis of clear, intelligible criteria, the economic consequences for him which derive from it” (ECJ, C-26/13, Árpád Kásler and Hajnalka Káslerné Rábai v. OTP Jelzálogbank Zrt, para. 73). As later specified in Gutiérrez Naranjo, under the transparency principle, the consumer has to be able to understand not only the economic consequences but also the legal ones (ECJ, C-154/15, Francisco Gutiérrez Naranjo v. Cajasur Banco SAU, Ana María Palacios Martínez v. Banco Bilbao Vizcaya Argentaria SA (BBVA), Banco Popular Español SA v. Emilio Irles López and Teresa Torres Andreu [2016] ECLI:EU:C:2016:980).

  107. On the interplay between consumer protection and data protection, see Helberger et al. (2017).

  108. EDPB, Guidelines on Transparency under Regulation 2016/679, 13 April 2018.

  109. Art. 12.7 GDPR. See also Recital 60. The link between transparency, information and visualisation is further stressed at Recital 58, GDPR: “The principle of transparency requires that any information addressed to the public or to the data subject be concise, easily accessible and easy to understand, and that clear and plain language and, additionally, where appropriate, visualisation be used. Such information could be provided in electronic form, for example, when addressed to the public, through a website. This is of particular relevance in situations where the proliferation of actors and the technological complexity of practice make it difficult for the data subject to know and understand whether, by whom and for what purpose personal data relating to him or her are being collected, such as in the case of online advertising”.

  110. OECD (2016), p. 17. A version of the GDPR set of icons was released in the Annex of the first reading of the European Parliament (European Parliament legislative resolution of 12 March 2014 on the proposal for a regulation of the European Parliament and of the Council on the protection of individuals with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation) [COM(2012)0011 – C7-0025/2012 – 2012/0011(COD)] (Ordinary legislative procedure: first reading)). However, no other official initiatives have been taken in this direction. An interesting methodology to answer the challenges of GDPR’s icons has been developed within the research project run by the Cirsfid group at the University of Bologna: http://gdprbydesign.cirsfid.unibo.it/.

  111. Art. 12.1, GDPR. See also Recitals 39 and 42, GDPR.

  112. Art. 7.2, GDPR.

  113. EDPB, Guidelines on Transparency under Regulation 2016/679, p. 8.

  114. Ibid., p. 11.

  115. Ibid., p. 12.

  116. Ibid.

References

  • Ayres I, Schwartz A (2014) The no-reading problem in consumer contract law. Stanf Law Rev 66(3):545–610

  • Bakos Y et al (2014) Does anyone read the fine print? Consumer attention to standard form contracts. J Legal Stud 43(1):1–35

  • Bar-Gill O (2015) Defending (smart) disclosure: a comment on more than you wanted to know. Jerus Rev Legal Stud 11(1):75–82

  • Ben-Shahar O (2009) The myth of the ‘opportunity to read’ in contract law. Eur Rev Contract Law 5(1):1–28

  • Ben-Shahar O (2013) Regulation through boilerplate: an apologia. Mich L Rev 112:883–903

  • Bently L, Sherman B (2009) Intellectual property law. Oxford University Press, Oxford

  • Bernault C et al (2017) Traité de la propriété littéraire et artistique. Litec, Paris

  • Bernhardt B et al (2015) Revolutionizing scholarship: a panel discussion on text and data mining. Ser Rev 41(3):184–186

  • Beunen AC (2007) Protection for databases: the European Database Directive and its effects in the Netherlands. Wolf Legal Publishers, Nijmegen

  • Borghi M, Karapapa S (2015) Contractual restrictions on lawful use of information: sole-source databases protected by the back door? Eur Intellect Prop Rev 37(8):505–514

  • Braun D et al (2018) Customer-centered LegalTech: automated analysis of standard form. Internationales Rechtsinformatik Symposium (IRIS), pp 627–634

  • Buijze A (2013) The six faces of transparency. Utrecht L. Rev. 9(3):3–25

  • Busch C (2016) The future of pre-contractual information duties: from behavioural insights to big data. In: Twigg-Flesner C (ed) Research handbook on EU consumer and contract law. Edward Elgar, Cheltenham, pp 221–240

  • Busch C (2019) Implementing personalized law: personalized disclosures in consumer law and privacy law. Univ Chicago Law Rev 86(2):309–332

  • Caso R (2004) Digital rights management: il commercio delle informazioni digitali tra contratto e diritto d’autore. CEDAM, Padua

  • Caspers M, Guibault L (2016a) Baseline report of policies and barriers of TDM in Europe. https://www.futuretdm.eu/wp-content/uploads/FutureTDM_D3.3-Baseline-Report-of-Policies-and-Barriers-of-TDM-in-Europe-1.pdf. Accessed 18 August 2018

  • Caspers M, Guibault L (2016b) A right to ‘read’ for machines: assessing a black-box analysis exception for data mining. Computer Sciences 53(1):1–5

  • Contissa G et al (2018) CLAUDETTE meets GDPR. Automating the evaluation of privacy policies using artificial intelligence. https://www.beuc.eu/publications/beuc-x-2018-066_claudette_meets_gdpr_report.pdf. Accessed 18 October 2018

  • de Visscher F, Michaux B (2000) Précis du droit d’auteur et des droits voisins. Bruylant, Brussels

  • Derclaye E (2008) The legal protection of databases: a comparative analysis. Edward Elgar, Cheltenham

  • Derclaye E (2012) Football Dataco: skill and labour is dead. http://copyrightblog.kluweriplaw.com/2012/03/01/football-dataco-skill-and-labour-is-dead/. Accessed 18 August 2018

  • Derclaye E (2014) The Database Directive. In: Stamatoudi I, Torremans P (eds) EU copyright law: a commentary. Edward Elgar, Cheltenham, pp 298–354

  • Derclaye E, Favale M (2010) Paper 3: user contracts. In: Kretschmer M et al (eds) The relationship between copyright and contract law. http://eprints.bournemouth.ac.uk/16091/1/_contractlaw-report.pdf. Accessed 18 August 2018

  • Ducato R (forthcoming) Transparency by (legal) design. Paper presented at the Private Law Consortium, Harvard Law School, 15 May 2018 and at the Younger Scholar Informal Symposium, organised within the General Congress of the International Academy of Comparative Law, Fukuoka, 25 July 2018

  • Dusollier S (2007) Sharing access to intellectual property through private ordering. Chicago Kent Law Rev 82(3):1391–1438

  • Feldman R, Sanger J (2007) The text mining handbook: advanced approaches in analyzing unstructured data. Cambridge University Press, Cambridge

  • Geiger C et al (2018) The exception for text and data mining (TDM) in the proposed Directive on Copyright in the Digital Single Market—legal aspects. Centre for International Intellectual Property Studies (CEIPI) Research Paper No. 2018-02:1–34

  • Groom J (2004) Are agent exclusion clauses a legitimate application of the EU Database Directive. SCRIPTed 1:83–118

  • Guibault L (2002) Copyright limitations and contracts. Kluwer Law Int, The Hague

  • Guibault L et al (2012) Study on the implementation and effect in Member States’ laws of Directive 2001/29/EC on the harmonisation of certain aspects of copyright and related rights in the information society. Report to the European Commission, DG Internal Market, February 2007. Amsterdam Law School Research Paper No. 2012-28. Institute for Information Law Research Paper No. 2012-23. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2006358. Accessed 18 August 2018

  • Hall W, Pesenti J (2017) Growing the artificial intelligence industry in the UK. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/652097/Growing_the_artificial_intelligence_industry_in_the_UK.pdf. Accessed 18 August 2018

  • Harkous H et al (2018) Polisis: automated analysis and presentation of privacy policies using deep learning. arXiv preprint arXiv:1802.02561. Accessed 18 August 2018

  • Hearst MA (2003) Text data mining. In: Mitkov R (ed) The Oxford handbook of computational linguistics. Oxford University Press, Oxford, pp 616–662

  • Helberger N (2013) Forms matter: informing consumers effectively (Study commissioned by BEUC). https://www.ivir.nl/publicaties/download/Form_matters.pdf. Accessed 18 August 2018

  • Helberger N, Hugenholtz PB (2007) No place like home for making a copy: private copying in European copyright law and consumer law. Berkeley Tech LJ 22(3):1061–1093

  • Helberger N et al (2013) Digital content contracts for consumers. J Consum Policy 36(1):37–57

  • Helberger N et al (2017) The perfect match? A closer look at the relationship between EU consumer law and data protection law. Common Mark Law Rev 54(5):1427–1465

  • Helleringer G, Sibony A-L (2017) European consumer protection through the behavioral lens. Columbia J Eur Law 23(3):607–649

  • Hillman RA (2006) Online boilerplate: would mandatory website disclosure of e-standard terms backfire? Mich Law Rev 104(5):837–856

  • Hilty RM, Richter H (2017) Position statement of the Max Planck Institute for Innovation and Competition on the proposed modernisation of European copyright rules part B exceptions and limitations (Art. 3–Text and Data Mining). https://www.ip.mpg.de/fileadmin/ipmpg/content/stellungnahmen/MPI_Position_Statement_Part_B_Chapter_1_Update23022017.pdf. Accessed 18 August 2018

  • Hilty RM et al (2008) Declaration on a balanced interpretation of the “three-step test” in copyright law. Int Rev Intellect Prop Compet Law 39(6):707–712

  • Hugenholtz PB (2000) Copyright, contract and code: what will remain of the public domain. Brook J Int Law 26:77–90

  • Jasiewicz MI (2012) Copyright protection in an opt-out world: implied license doctrine and news aggregators. Yale Law J 122(3):837–850

  • Kästle-Lamparter D (2018) Pre-contractual information duties. In: Jansen N, Zimmermann R (eds) Commentaries on European contract laws. Oxford University Press, Oxford, pp 384–504

  • Kretschmer M et al (2010) The relationship between copyright and contract law. http://eprints.bournemouth.ac.uk/16091/1/_contractlaw-report.pdf. Accessed 18 August 2018

  • Lippi M et al (2018) CLAUDETTE: an automated detector of potentially unfair clauses in online terms of service. arXiv preprint arXiv:1805.01217. Accessed 18 October 2018

  • Loos M (2015) Transparency of standard terms under the Unfair Contract Terms Directive and the proposal for a common European sales law. Eur Rev Priv Law 23(2):179–193

  • Margoni T (2012) Eccezioni e limitazioni al diritto d’autore in Internet = Exceptions and Limitations to Copyright Law in the Internet. https://www.ivir.nl/publicaties/download/Giurisprudenza_Italiana_2011_8_9.pdf. Accessed 18 August 2018

  • Margoni T, Dore G (2016) Why we need a text and data mining exception (but it is not enough). https://interop2016.github.io/pdf/INTEROP-13.pdf. Accessed 18 August 2018

  • Margoni T, Kretschmer M (2018) The text and data mining exception in the proposal for a Directive on Copyright in the Digital Single Market: why it is not what EU copyright law needs. https://www.create.ac.uk/blog/2018/04/25/why-tdm-exception-copyright-directive-digital-single-market-not-what-eu-copyright-needs/. Accessed 18 August 2018

  • Mazziotti G (2008) EU digital copyright law and the end-user. Springer, Berlin

  • Micklitz H-W et al (2009) Understanding EU consumer law. Intersentia, Antwerp

  • Montagnani ML (2007) Dal peer-to-peer ai sistemi di Digital Rights Management: primi appunti sul melting pot della distribuzione online. Il diritto d’autore 1:1–57

  • Montagnani ML, Aime G (2017) Il text and data mining e il diritto d’autore. AIDA XXVI:376–394

  • Myska M, Harasta J (2016) Less is more: protecting databases in the EU after Ryanair. Masaryk UJL Tech 10:170–198

  • OECD (2016) Protecting consumers in peer platform markets: exploring the issues. https://unctad.org/meetings/en/Contribution/dtl-eWeek2017c05-oecd_en.pdf. Accessed 18 August 2018

  • Poort J (2018) Borderlines of copyright protection: an economic analysis. In: Hugenholtz PB (ed) Copyright reconstructed: rethinking copyright’s economic rights in a time of highly dynamic technological and economic change. Wolters Kluwer, Alphen aan den Rijn, pp 283–338

  • Porat A, Strahilevitz LJ (2013) Personalizing default rules and disclosure with big data. Mich Law Rev 112(8):1417–1478

  • Radin MJ (2013) Boilerplate: the fine print, vanishing rights, and the rule of law. Princeton University Press, Princeton

  • Rosati E (2013a) Originality in EU copyright. Edward Elgar, Cheltenham

  • Rosati E (2013b) Towards an EU-wide copyright? (judicial) pride and (legislative) prejudice. Intellect Prop Q 1:47–68

  • Rosati E (2018) National and EU text and data mining exceptions: room for coexistence? http://ipkitten.blogspot.com/2018/03/national-and-eu-text-and-data-mining.html. Accessed 18 August 2018

  • Rotenberg B, Compañó R (2009) Search engines for audio-visual content: copyright law and its policy relevance. In: Curwen P et al (eds) Telecommunication markets. Springer, Heidelberg, pp 113–139

  • Sadeh N et al (2013) The usable privacy policy project. http://reports-archive.adm.cs.cmu.edu/anon/isr2013/CMU-ISR-13-119.pdf. Accessed 18 August 2018

  • Scholz T (2016) Platform cooperativism. Challenging the corporate sharing economy. Rosa Luxemburg Stiftung, New York

  • Sire G (2015) Inclusion exclue: le code est un contrat léonin. Enquête sur la valeur technique et juridique du protocole robots.txt. Réseaux 189(1):187–214

  • Stamatoudi IA (2016) Text and data mining. In: Stamatoudi IA (ed) New developments in EU and international copyright law. Wolters Kluwer, Alphen aan den Rijn, pp 251–282

  • Strowel A (1993) Droit d’auteur et copyright, Divergences et convergences. Bruylant, Paris

  • Strowel A (2007) Google et les nouveaux services en ligne: quels effets sur l’économie des contenus, quels défis pour la propriété intellectuelle. Journal des tribunaux 22:589–598

  • Strowel A (2011) Quand Google défie le droit. Larcier, Bruxelles

  • Strowel A (2012) European copyright: beyond the additions made by the European Court of Justice, some pieces are still missing. In: Janssens M-C, Overwalle GV (eds) Harmonisation of European IP law, in Honor of Fr. Gotzen. De Boeck, Bruxelles, pp 73–98

  • Strowel A (2014) Droit d’auteur et copyright. Convergences des droits, régulation différente des contrats. In: Mélanges en l’honneur du Professeur André Lucas. LexisNexis, Paris, pp 699–717

  • Strowel A (2015) Fair compensation for private copying. In: Copyright and the digital agenda for Europe: current regulations and challenges for the future. Sakkoulas Publications, Athens, pp 189–195

  • Strowel A (2018) Reconstructing the reproduction and communication to the public rights: how to align copyright with its fundamentals. In: Hugenholtz PB (ed) Copyright reconstructed: rethinking copyright’s economic rights in a time of highly dynamic technological and economic change. Wolters Kluwer, Alphen aan den Rijn, pp 203–240

  • Strowel A, Derclaye E (2001) Droit d’auteur et numérique: logiciels, bases de données, multimédia. Bruylant, Bruxelles

  • Sunstein C (2012) Informing consumers through smart disclosure. https://obamawhitehouse.archives.gov/blog/2012/03/30/informing-consumers-through-smart-disclosure. Accessed 28 August 2018

  • Triaille JP et al (2013) Study on the application of Directive 2001/29/EC on copyright and related rights in the information society. https://publications.europa.eu/en/publication-detail/-/publication/9ebb5084-ea89-4b3e-bda2-33816f11425b. Accessed 18 October 2018

  • Triaille JP et al (2014) Study on the legal framework of text and data mining (TDM). https://publications.europa.eu/en/publication-detail/-/publication/074ddf78-01e9-4a1d-9895-65290705e2a5/language-en. Accessed 18 October 2018

  • Valenti R (2007a) Art. 68-bis. In: Ubertazzi LC (ed) Diritto d’autore. Estratto da L.C. Ubertazzi “Commentario breve alle leggi su proprietà intellettuale e concorrenza”. CEDAM, Padua, pp 203–204

  • Valenti R (2007b) Introduzione agli artt. 65-71-quinquies. In: Ubertazzi LC (ed) Diritto d’autore. Estratto da L.C. Ubertazzi “Commentario breve alle leggi su proprietà intellettuale e concorrenza”. CEDAM, Padua, pp 190–196

  • Vallés RC (2009) The requirement of originality. In: Research handbook on the future of EU copyright. Edward Elgar, Cheltenham

  • Vila T et al (2003) Why we can’t be bothered to read privacy policies: models of privacy economics as a lemons market. In: ICEC ’03 proceedings of the 5th International Conference on Electronic Commerce, pp 403–407

  • Waelde C et al (2013) Contemporary intellectual property: law and policy. Oxford University Press, Oxford

  • Walter M, Von Lewinski S (2010) European copyright law. A commentary. Oxford University Press, Oxford

  • Wilhelmsson T (2004) The abuse of the “confident consumer” as a justification for EC consumer law. J Consum Policy 27(3):317–337

Acknowledgements

The research is supported by the Innoviris Grant 2016-BB2B-9. A sincere thanks to Dr. Guido Noto La Diega for the constructive discussion on an early version of this paper. The authors have jointly conceived the paper and share the views expressed therein. Nonetheless, while Section 5 is attributable to Alain Strowel, Sections 2, 6 and 7 are specifically attributable to Rossana Ducato. Both authors equally contributed to the drafting of Sections 1, 3, 4 and 8.

Author information

Corresponding author

Correspondence to Rossana Ducato.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ducato, R., Strowel, A. Limitations to Text and Data Mining and Consumer Empowerment: Making the Case for a Right to “Machine Legibility”. IIC 50, 649–684 (2019). https://doi.org/10.1007/s40319-019-00833-w

  • DOI: https://doi.org/10.1007/s40319-019-00833-w
