1 Introduction

Public databases have become a backbone of modern data-driven biological research. Especially in the European Union (EU), it is expected that large-scale research data infrastructure projects such as the European Open Science Cloud (EOSC) will expand existing databases over the coming decade and bring novel databases into existence. However, several long-running curated repositories for biological reference data (legacy databases) have failed to respond to the needs of their scientific communities and do not provide up-to-date services, for example, programmatic access via application programming interfaces (APIs), fine-grained and transparent data versioning and the use of persistent identifiers (PIDs). Replication, derivation and recreation of such legacy databases may address these gaps, but such activities may be hampered by the sui generis database right, a form of intellectual property rights (IPRs), enshrined in the EU’s Database Directive.

Public biological databases are sometimes subject to central creation and central management. For example, a research institution or consortium develops or brings together related datasets for the purposes of future analysis or future use.Footnote 1 In other instances, a central institution or a research consortium operates a platform on which external contributors can integrate their own datasets for the purpose of making such data available for downstream use.Footnote 2 Operators of the central platforms are often responsible only for hosting the provided datasets. Other times, operators also provide value-added services to data contributors. Value-added services include curating the data to improve its technical interoperability with other available datasets or creating enriched datasets from the original input data via data analysis, data aggregation, or data visualisation tools.Footnote 3 Alternatively, some public biological databases are decentralised, with a formal or informal arrangement of research institutions or citizen scientists each contributing data, cloud computing, or cloud storage resources, and analysis tools for common purposes without appointing a central data custodian.Footnote 4

Here, we discuss the implications of the EU Database Directive on the establishment by research communities of biological data resources, including both databases and value-added services. Our legal analysis raises a number of policy considerations relevant to the functioning of public biological databases. For our purposes, public biological databases refer to collections of biological data, or secondary outputs derived from such data, that are made available to the public using an open access, registered access, or controlled access mechanism.Footnote 5 In Sect. 2, we discuss the relationship between the public policy justifications for the creation of intellectual property rights and those for open science. In Sect. 3, we discuss the law, jurisprudence, and doctrine establishing the ambit and scope of application of the EU sui generis database right. In Sect. 4, we consider the potential for the sui generis database right to find application to public biological databases. We also consider the potential for downstream services that enable data discovery and analysis to infringe the sui generis database right. Section 5 details public policy reforms that may better align the database right with contemporary approaches to open science and reduce the transaction costs inherent in negotiating downstream rights in IPR-encumbered data. In Sect. 6, we address the potential repeal of the sui generis database right. Section 7 outlines generalisable conclusions directed to the broader effort to establish EU information policy.

2 Intellectual Property Rights versus Open Science

2.1 Justifications For and Against Intellectual Property Rights

The design of IPRs attempts to find balance between two conflicting imperatives, first, to provide creators of intangible goods with protection from reuse of their goods absent suitable compensation and/or authorisation, and second, to enable the secondary use of existing intangible goods, especially for those that are necessary economic inputs into other productive activities.Footnote 6 In balancing these interests, IPRs are limited in scope and duration. Upon the expiration of IPRs, the protected intangibles and associated rights of use return to the public domain. Thus, IPRs are commonly conceived of as a bargain between the creator and society, because these rights are granted to incentivise the disclosure of the intangible goods. In protecting the rights of creators in intangibles, IPRs enable producers of such goods to disclose them to third parties or to the public while retaining the ability to exclude others from using them, or to impose conditions on their use. In their absence, trade secrets protect information of commercial value that is held confidentially.Footnote 7

IPRs are designed to address the unique character of intangible goods, which can be considered non-rivalrous, in whole or in part. The use or consumption of non-rivalrous goods by one individual does not inhibit another’s use or consumption of the same goods. For example, the use of a dataset as part of an analysis does not preclude others from using the dataset afterwards.Footnote 8 This is distinct from tangible goods whose consumption is rivalrous, in that the consumption of the good exhausts it in whole or in part and precludes its future use. Further, absent reliance on specialised technological tools, such as digital rights management (DRM) technologies, intangible goods, such as datasets, are usually non-excludable, meaning that it is usually not possible to stop a user from making a second copy and transmitting it to another recipient.

IPRs are an instrument of industrial or economic policy in that they provide a measure of excludability for a set period of time, thereby creating markets in intangible goods (IPRs can also act to condition access to a commons upon respect for its rules). Examples of IPRs include copyright protection for literary works, software, and databases, and patent protection for new and useful or industrially applicable inventions.Footnote 9 IPRs enable their holders to control access to their protected intangible goods, via licenses or assignment to third parties. The latter are forms of contracts which specify enforceable rights over intangible goods between contracting parties and largely fall within the realm of private law. Public law may provide further limitations on use of information, for example, regulatory oversight over personal data.Footnote 10

Some economists argue that absent the creation of IPRs, the potential for third parties to free-ride on the efforts and investments of others will create market failures that disincentivise investment in the production of intangible goods.Footnote 11 In other words, these goods create positive externalities that benefit third parties who are unwilling to bear the costs of purchasing such goods. If the producers of intangibles cannot capture value through market transactions, they may opt not to produce them or to make them available. The resultant lack of a competitive market may result in suboptimal investment in the production of intangible goods.Footnote 12 Further, parties that bear the costs of first production of an intangible good are exposed to the risk that other distributors will engage in competition without bearing the fixed cost of original production of the intangible good. That is, absent IPRs, the original distributor that bears the cost of producing an intangible good is left at a competitive disadvantage relative to other distributors. Imposing costs upon third parties as a precondition to accessing intangibles, however, deprives these parties of benefits that could be realised at little or no cost, because sharing an intangible good creates additional value at no additional cost.Footnote 13

Therefore, insofar as free markets fail to generate sufficient investment in the creation of intangibles, which may be considered “public goods”, it may be beneficial to recognise time-limited IPRs to stimulate investment in their production. Exclusive, albeit time-limited IPRs guarantee the creators who bear the cost of production a minimum market share that entitles them to sufficient profits to justify the risk of investment.Footnote 14 However, “strong” IPRs that permit the unilateral and categorical exclusion of third parties from the use of intangible goods can inhibit innovation.Footnote 15 This inhibition arises from a rights-holder precluding access to intangible goods that are necessary inputs into the creation of downstream innovations, thereby maintaining market dominance. It also arises from the transaction costs entailed in negotiating access to all of the intangible goods required for an anticipated purpose. For example, in the context of data-intensive biological research, IPRs over aspects of data can limit the value derived from generating and combining datasets of disparate origin, in part, due to legal uncertainty. Attempts to train machine learning (ML) algorithms might require access to a large number of datasets that are each potentially subject to copyright, moral rights, or sui generis database rights. Though the creators of such datasets may have no interest in precluding third parties from using them, the default operation of law may invest such creators with legal rights to exclude third parties from using their protected datasets. Third parties, therefore, lack certainty about whether their intended use of a dataset will infringe IPRs. This encourages conservatism in the use of available intangibles, due to real or imagined risks of legal non-compliance.Footnote 16

Gauging the amount of protection to confer via IPRs is difficult. Over-protection might stifle the pro-social use of existing intangibles, whilst under-protection might inhibit investment in their initial generation. Studies demonstrate the complex relationship between IPRs and domestic economic growth or contribution to innovation. IPRs best contribute to economic development when they are time-limited,Footnote 17 and their economic benefit is contingent on other characteristics, such as access to a stable enforcement infrastructure. IPRs best benefit developed economies that have a strong innovation sector but can prove a detriment to developing economies without significant domestic IPR holdings. The latter benefit more from liberal access to intangible goods developed elsewhere.Footnote 18

For these reasons, legislators and courts have adopted numerous measures to nuance the relationship between intellectual property protection, innovation, and the prosocial use of existing intangible goods. Legislatures may create new IPRs or fine-tune existing ones through statutes or regulations. Courts contribute by interpreting the scope of IPRs, for example, allowable subject matter for IPRs and tests for IPR criteria, or in devising exceptions to the application of IPRs. In so doing, courts avoid restricting the downstream use of infrastructural intangible goods.Footnote 19 These act as necessary inputs into downstream production processes, and limitations in access or use could preclude economic activity and innovation. Courts nuance the application of IPRs for reasons of public policy, but also increase the overall complexity of the applicable legal rules, especially in supranational legal systems, such as that of the EU. For example, the national courts of individual EU Member States can create legal fragmentation through heterogeneous interpretation of supposedly harmonised inter-State intellectual property law regimes. In partial response to the difficulties inherent in administering IPRs and determining their boundaries, and in recognition of the increasing role of non-market institutions (e.g. academia, crowdsourcing) in producing intangibles, open science initiatives are coming to be seen as a privileged vehicle for incentivising the production of valuable intangibles that are minimally encumbered by private rights.

2.2 Justifications For and Against Open Data in Support of Open Science

Open science is an umbrella term that captures an ongoing shift in the policy considerations that determine how the outputs of scientific research are to be disseminated, and which categories of actors can participate in the scientific enterprise. Though there is no consensus definition of open science, the concept has been recognised in numerous international policy documents, including the recent UNESCO Recommendation on Open ScienceFootnote 20 and multiple OECD instruments.Footnote 21

The policy justifications for open science are numerous and contested. Central themes in the open science literature include the following: (1) democratising science to enable participation and comprehension of research results by members of the public; (2) decreasing research costs by enabling access to inputs such as software, data, and publications, as well as infrastructure such as computational and analytic resources; and (3) increasing the accountability of scientific researchers in making data, methodologies, and results more accessible to enable replication and validation of findings.Footnote 22

Open science initiatives acknowledge public participation in defining research agendas and performing research (e.g. citizen science), and produce mechanisms that foster research reproducibility and scientific integrity.Footnote 23 However, they also reflect pragmatic changes in science policy that flow from technological advances. As the costs of storing and transferring data decline, and as data are increasingly generated as a residual output of other activities (e.g. clinical care or public-sector service provision), it becomes easier to repurpose existing data to drive novel research efforts at low cost. Therefore, the trend toward privileging open science activities also reflects a response to fundamental changes in the cost-benefit proposition of repurposing existing data to enable further downstream research and of the heightened potential to coordinate the activities of scientific researchers and the public toward cooperative efforts using decentralised platforms.Footnote 24

Open science raises policy tensions. The determination of the best combination of incentives and institutions to generate intangibles and translate them into downstream outputs is often context-specific and indeterminate. For example, if generating data is cost-intensive and difficult to coordinate through non-market institutions, it might be preferable to use property rights in intangibles to stimulate production. In contrast, in those instances where researchers, public sector institutions, or members of the public have strong incentives to generate primary data or downstream intangibles absent market-based incentives, it might prove more effective in generating downstream research outputs to require the open dissemination of data and to limit or exclude the application of IPRs. However, placing limitations on the potential to commercialise downstream outputs resulting from the use of data and other intangibles must be done with great caution; limiting the prospect of commercialising or using IPRs to protect downstream innovations such as drugs, therapies, or software tools could have a chilling effect on their production.

In practice, open science, and open access to data more generally, has been championed through a number of different approaches. Government and public-sector approaches include requirements to release biomedical research data and public sector data to the public. National funding agencies impose on researchers open science terms that apply to data generation and other intangible outputs. Some research communities and consortia voluntarily enable the open use of their data and other outputs. Mechanisms include open publication, licenses, or dedications to the public domain to enable downstream access.

3 EU Laws Relevant to Database Rights and Open Science

3.1 The Sui Generis Database Right

The sui generis database right is a novel IPR first implemented in EU law, and subsequently adopted in three other jurisdictions.Footnote 25 This right protects the contents of a database from the unauthorised “extraction and/or re-utilisation of the whole or of a substantial part.”Footnote 26 This right is applicable to databases if a “substantial investment”Footnote 27 is made in their compilation. The Database Directive elaborates on what comprises the extraction and/or reutilisation of a “substantial part” of the database. The “substantial part” criterion is assessed from both a quantitative and a qualitative perspective.Footnote 28 Consequently, the database right could be qualitatively infringed, even if the portion extracted or reutilised does not reflect a large portion of the overall contents of the database.Footnote 29

The right is accorded to the “maker” of the database, but “maker” is not defined and is understood to include both natural persons and legal persons (e.g. corporations) who are “nationals of a Member State or who have their habitual residence in the territory of the community.”Footnote 30 In contrast to copyright, which protects creative expression, the right protects the database if a “substantial investment”Footnote 31 has been made in its compilation. Natural and legal persons can benefit from sui generis database protection, whilst subcontractors cannot benefit from such a right.Footnote 32

This right is one of the strongest forms of IPR created by legislatures, because its term is in practice indefinite and there are only limited exceptions to its application. Formally, the right expires after a 15-year period.Footnote 33 However, because the right is re-actualised each time a database is modifiedFootnote 34 (i.e. the 15-year period restarts), the duration of the sui generis database right is functionally unlimited.Footnote 35 Unlike its closest IPR relative, copyright, it lacks fair-use exceptions and other similar limitations.

EU Member States can implement limited exceptions in national legislation. First, an exception can be applied to extraction that is performed for the purpose of non-commercial teaching or non-commercial scientific research activities.Footnote 36 Second, an exception can be made for the purposes of private use of non-electronic databases.Footnote 37 Third, Member States can introduce an exception for purposes of “public security or an administrative or judicial procedure.”Footnote 38

Text and data mining (TDM) activities do not infringe the extraction prong of the sui generis database right,Footnote 39 unless the holder of the sui generis database right makes an express preclusion of such activities.Footnote 40 TDM performed for scientific research purposes does not infringe the extraction prong of the sui generis database right, regardless of whether or not the holder of the sui generis database right expressly precludes such activities.Footnote 41

Finally, the Database Directive limits the extent to which the holders of sui generis database rights can restrict the actions of lawful users of protected databases.Footnote 42 Rights-holders cannot contractually assert rights in databases that go beyond the protection provided by the sui generis database right. Specifically, the rights-holder cannot, contractually or otherwise, preclude lawful users from extracting or reutilising insubstantial portionsFootnote 43 of protected databases.Footnote 44 One of the critical challenges inherent in the interpretation of this provision is that “lawful user” is not a defined term in the Database Directive, and Member States have diverged in their definitions in their local implementations of the Database Directive.Footnote 45

3.2 Sui Generis Database Right: Historical and Policy Context

While the sui generis database right provides considerable protection, the scope of its application has been limited by EU courts.Footnote 46 Understanding this limitation from a policy perspective requires consideration of the historical origins of the database right. In the late 1980s, EU policymakers convened to reform EU intellectual property law,Footnote 47 for the purpose of enhancing EU competitiveness in the emergent digital technologies sector. An objective of the reforms was to incentivise the production of data to match that of the United States.Footnote 48 In crafting the EU Database Directive, policymakers first harmonised the copyright protections afforded to databases, while respecting the distinct copyright traditions of EU Member States.Footnote 49 This approach attempted to harmonise the intellectual property protection afforded to databases throughout the EU, while respecting the distinct intellectual property traditions of continental Europe, the United Kingdom, and those of select other (mostly Nordic) EU Member States.Footnote 50

Historically, continental Member States required a compilation, such as a database, to demonstrate “originality” to be considered eligible for copyright protection. Conversely, the United Kingdom adopted a more liberal “sweat of the brow” doctrine, which provides copyright protection if effort, labour, or other resources are expended in the creation of a database. The Database Directive adopted the continental approach to determine whether subject matter was eligible for copyright protection throughout the EU.Footnote 51 This approach lowered the legal standard required to obtain copyright protection in a select number of EU jurisdictions, such as Germany, but raised it in the United Kingdom.Footnote 52

To offset concerns that the ensuing protection afforded by copyright to databases was too narrow, EU policymakers created the sui generis database right.Footnote 53 This right strengthened IPR protections available to database producersFootnote 54 by protecting databases where an “investment”, considered with respect to capital, human labour, or technological resources, was made in their production.Footnote 55 It applies legal protection for compilation of pre-existing data that was historically recognised in the Nordic EU Member StatesFootnote 56 and can be understood as applying the United Kingdom’s “sweat of the brow” approach, despite the rejection of this standard for copyright.Footnote 57

The EU initially advocated for the database right to be implemented in other jurisdictions, and for its translation into an international standard of intellectual property protection.Footnote 58 However, numerous commentators have expressed significant reservations about the right’s effectiveness in stimulating investment in the production of databases since its implementation.Footnote 59 Indeed, the European Commission has itself joined in the critique of the Database Directive and its unintended deleterious effects.

In December 2005, the Commission released its first evaluation of the Database Directive since its coming into force. The evaluation considered whether the Database Directive had achieved its stated policy objectives, and whether the sui generis database right had stifled economic activity by precluding competition.Footnote 60 The European Commission concluded that there was no empirical evidence to suggest that the sui generis database right had generated increased investment in database production and that there had been no demonstrable increase in database production in the EU, either before or after the implementation of the sui generis right.Footnote 61 Instead, research stakeholders interviewed for the report suggested that the database right had led to the strengthening of monopolies on database production and dissemination amongst the largest market participants.Footnote 62 The result is not surprising based on IPR economics literature, much of which concludes that the creation of strong IPRs of long duration entrenches the economic advantages of established producers or aggregators of intangible goods, rather than bolstering the competitiveness of new market entrants.

The European Commission concluded that harmonisation efforts had not succeeded because of ambiguous language used to define the scope and subject matter of the sui generis database right.Footnote 63 These ambiguities are more pronounced for the sui generis database right than for copyright, which has been streamlined through a rich history of national legislative implementation and jurisprudential interpretation. The database right left national courts to adopt novel and often conflicting interpretations.Footnote 64 Divergence in interpretation further arose from the implementation of the Directive in the national law of each Member State.Footnote 65 In addition, the European Commission found that the breadth of the sui generis database right almost amounted to a “pure” property right in raw data.Footnote 66

In contrast, consulted members of the EU database industry overwhelmingly felt that the database right protected them well, and that EU harmonisation of rights in databases was of crucial importance.Footnote 67 This view did not extend to representatives of scientific research bodies, libraries, and academic institutions, who decried the narrow exception for scientific research and its restriction to non-commercial research use.Footnote 68

The 2005 evaluationFootnote 69 was followed by a rigorous, empirical 2018 study in support of a second evaluation,Footnote 70 which assessed the economic and legal effects of the Database Directive. The 2018 study focused on the perspectives of industry participants, such as “database users”, “database user-producers”, and “database producers”, on the economic and legal effects of the Database Directive.Footnote 71 Representatives from research institutions, academic institutions, and legal experts also contributed to the study.Footnote 72

The authors of the 2018 study considered the effectiveness of the sui generis database right and made reform recommendations. They concluded, first, that the economic value of data is increasingly realised through the combination, analysis, and modification of existing datasets. Thus, granting database creators exclusive rights in databases can deter productive activities.Footnote 73 Second, the exceptions to the sui generis database right are more limited than for copyright,Footnote 74 which creates barriers to the secondary use of existing databases. Third, the content and ambit of the sui generis database right remain unclear, both due to drafting ambiguities and divergent statutory implementation and jurisprudential interpretation of the Directive in each Member State.Footnote 75 Fourth, the Court of Justice of the European Union (CJEU)’s holding in Ryanair Ltd v. PR Aviation BV (discussed below) enabled database producers to hold their IPR-ineligible databases to a higher standard of protection than if their databases were eligible for sui generis database right or copyright protections, through contractual restrictions on their use.Footnote 76 Finally, the database right coexists uneasily with later EU initiatives to foster the creation of open data repositories and requirements for freely accessible public sector information (PSI).

Proposed policy reforms strike a balance between maintaining the benefits of the sui generis database right and limiting its more detrimental features. They include: (1) implementing a registration requirement or other legal formalities as a precondition to the application of the sui generis database right, rather than having the rights apply through default operation of the law;Footnote 77 (2) expanding exceptions to the sui generis database right, such as those applied to copyright in the Information Society Directive;Footnote 78 (3) introducing a compulsory licensing scheme for sole-source databases;Footnote 79 and (4) implementing legislative reforms to overturn the CJEU’s Ryanair decision.Footnote 80

3.3 Sui Generis Database Right: Jurisprudential Interpretation and Other Limiting Factors

Having addressed the substantive contents and the historical evolution of the sui generis database right, we now discuss the judicial interpretation of that right by the CJEU and the national courts of EU Member States. This jurisprudence is consistent with the conclusion that the sui generis database right serves to protect investment in producing a database more than the database itself.

In The British Horseracing Board and Others, the CJEU interpreted both the breadth of the database right and the circumstances that give rise to it.Footnote 81 The CJEU interpreted the substantive contents of the right expansively, concluding that the “quantitative” prong of the statutory test protects against the extraction or the reutilisation of a large proportion of the database’s contents.Footnote 82 However, the right also protects a small proportion of the contents of a database if the concerned proportion represents a significant amount of the effort or other resources utilised in the creation of the database.Footnote 83 Further, the court established that the sui generis database right is not exhausted if the maker of the database renders the database public, either directly or by allowing a third party to make the database public. In other words, the database right that limits extraction or reutilisation continues to apply to databases that have been made public.Footnote 84 Merely consulting a database, however, does not constitute extraction or reutilisation, and therefore does not violate the sui generis database right.Footnote 85 Repeated extraction or reutilisation of an insubstantial portion of a database can cumulatively lead to the breach of the sui generis database right.Footnote 86

Despite its liberal interpretation of the breadth of the sui generis database right, the CJEU in British Horseracing Board went on to restrict its application. The CJEU held that the “substantial investment” protected by the database right is the investment in compiling the database, not the investment in creating or producing the data contained in the database.Footnote 87 In further jurisprudence, the CJEU confirmed that investment in producing the contents of a database is of no relevance to satisfying the prerequisite of “substantial investment” for a sui generis database right to arise.Footnote 88 The CJEU stipulates that the creator of data can in some instances claim a sui generis database right in a database assembled from data that it has generated, but that the efforts made to generate such data do not contribute to establishing a legal entitlement to the sui generis database right.Footnote 89 The CJEU justifies its restrictive interpretation based on its reading of Recital 39 of the Database Directive,Footnote 90 which states that the purpose of the right is to incentivise the organisation of data into datasets and the maintenance of those datasets, rather than to foster the generation of data.Footnote 91

Commentators agree that the Database Directive applies to “collected” databases but not to “created” databases, and that this interpretation is consistent with the Database Directive’s objective to protect investment in the curation of new databases. Our own interpretation is that the “creation” of data is used as a “proxy” for the underlying policy issue that the courts are trying to address: denying protection to sole-source databases, to foster competition in the market for information and related services. That is, database-production efforts that entail the primary creation of data have a high likelihood of producing “sole-source” databases.Footnote 92 It is easier to verify whether a database is “created” than it is to perform a deeper analysis of whether it is a sole-source database. Therefore, the creation/collection test is used to distinguish non-protected databases from protected databases.

Further, databases composed of data generated without an independent investment through the regular business activities or functions of an institution or an organisation do not require additional legal protection (these are referred to as spin-off databases). No public policy justification exists to protect such data, because these data are generated regardless of the incentive value that the sui generis database right might provide.Footnote 93 Courts have therefore denied legal protection to spin-off databases through the requirement of an independent investment to be made in the compilation of a database.

Thus, the scope of protection afforded by the database right is broad, once such a right is confirmed to exist. Nonetheless, the application of the database right is curtailed by the narrow interpretation of eligible “investments” in producing databases, which excludes investment in the generation or production of data, and the incidental compilation of data arising from other activities. The principal policy justification for this narrowFootnote 94 reading of the right is to limit the potential for data producers to assert database rights on databases comprised of sole-source data and on “spin-off” databases. Precluding the assertion of rights in “created” databases prevents the creators of valuable data from engaging in anticompetitive practices related to data, of which such created databases are often the exclusive source.Footnote 95 Precluding the assertion of rights in spin-off datasets ensures that such datasets – in which no investment was made – are not subject to the sui generis database right, as this would preclude downstream use of such data without the restriction serving to reward an investment in database production.

In its 2021 decision, CV-Online Latvia SIA v. Melons (Latvia), the CJEU further narrowed the scope of application of the sui generis database right, through a restrictive interpretation of the conditions causing the breach of the database right. The defendant in Latvia reutilised a significant portion of a protected database. According to prior jurisprudence, this would have constituted an unqualified infringement of the sui generis database right. The court instead determined that the reutilisation of a significant portion of a protected database infringes the sui generis database right only if such reutilisation risks preventing the rights-holder from benefiting from the fruits of their original investment, since enabling cost recovery and financial profit is the raison d’être of the sui generis database right.Footnote 96 It also held that the existence of a risk to the investment is not sufficient to constitute a breach: where such a risk is identified, the court must consider whether the impugned act furthers an objective of the Database Directive that justifies the perpetuation of such risk, such as making information available for downstream use, or creating valuable downstream products.Footnote 97 The presence or absence of a risk posed to a protected investment nonetheless is, since Latvia, the principal criterion used to evaluate whether or not the sui generis database right has been breached.Footnote 98

In Ryanair Ltd v. PR Aviation BV (Ryanair), the court considered the legal effects of the Database Directive on databases that are not protected by either the database right or copyright.Footnote 99 The Database Directive protects lawful users of databases against the assertion by rights-holders of restrictions that exceed those contemplated in the Database Directive.Footnote 100 The court in Ryanair therefore considered whether this “right” favoring database users was applicable to databases if neither copyright nor the database right was applicable.Footnote 101 The court answered in the negative.Footnote 102 Counterintuitively, the effect of this decision is to enable producers of databases that the Database Directive does not protect to impose even stricter protections via contract.Footnote 103 Here it is important to recollect that, by contrast, rights-holders in protected databases cannot contractually assert rights that go beyond the protection provided by the sui generis database right.

3.4 The Open Data Directive

In contrast to the Database Directive, the Open Data Directive encourages public sector bodies to openly share data. The Open Data Directive entered into force in 2019, amending and replacing the Public Sector Information Directive (PSI Directive) that was adopted in 2003 and first updated in 2013.Footnote 104 These successive Directives established baseline rules that must be integrated into the access-to-information laws of EU Member States, whilst leaving to Member States the discretion to determine which documents and information will be subject to such regimes.Footnote 105

The PSI Directive implemented a minimal set of rules enabling the reuse of documents that public sector bodies had made available to the public and whose reuse they had authorised. These rules did not compel public sector bodies to make their documents available to the public, nor to authorise their reuse. However, the PSI Directive applied when public sector bodies voluntarily chose to make their documents public for the purpose of reuse (or national law required them to do so). It created obligations that determined the conditions according to which such data were to be shared.Footnote 106 The PSI Directive mandated that public sector bodies provide access to their documents for commercial and non-commercial purposes. Such bodies were required to respond to requests for the use of documents expediently, limit charges levied to a “reasonable return on investment”,Footnote 107 subject access to minimally restrictive licenses,Footnote 108 and provide all market participants equitable access to the documents unless exclusive arrangements were exceptionally justified.Footnote 109

The 2013 iteration modified the PSI Directive.Footnote 110 The 2013 PSI Directive introduced a general right for third parties to reuse documents that public sector entities make public.Footnote 111 It also modified the provisions on cost recovery to limit charges to the “marginal costs incurred for [the] reproduction, provision and dissemination [of such documents]” in most instances.Footnote 112 Further, it extended the application of the Directive to categories of organisations previously excluded from its ambit, such as “libraries, museums, and archives”,Footnote 113 but it preserved the right of those organisations to a “reasonable return on investment”Footnote 114 in establishing access costs. Public bodies required to recover costs to fund their operation were also exempted from the requirement to limit their charges to the marginal costs of reproduction and related activities.Footnote 115

In 2019, the Open Data Directive replaced the PSI Directive, reprising most of its provisions whilst integrating select changes and novelties that build upon and extend the guarantees of openness integrated into the 2003 and 2013 versions of the PSI Directive. One of the most significant changes, for our purposes, is to subject public-funded research data to the full range of reuse requirements that the Open Data Directive imposes, so long as “researchers, research performing organizations, or research funding organizations”Footnote 116 have made those datasets public through research repositories or other channels.Footnote 117 The other significant change is to adopt both articlesFootnote 118 and RecitalsFootnote 119 confirming that public sector bodies cannot exercise the sui generis database right in a manner that would frustrate the rights to access provided for in the Open Data Directive.Footnote 120

The straightforward conclusion that might be drawn here is that the Open Data Directive has mooted the Database Directive relative to research data and related research outputs. However, the situation is more complex than it might first appear. First, the Open Data Directive is not applicable to numerous categories of datasets that are carved out from its application.Footnote 121 These include those that are subject to IPRs belonging to third parties. If both third-party IPRs and public sector sui generis database rights find application to the same dataset, this could re-actualise the right of public sector actors or public-funded research organisations to use the sui generis database right to preclude downstream use of affected datasets or documents.Footnote 122 This exception could be leveraged in a strategic manner to defeat the intent of the Open Data Directive. One other crucial carve-out applies to identifiable personal data, as defined in EU data protection legislation, insofar as the relevant Union and Member State access-to-information regimes create limitations to the access to such information. Therefore, where research data derived from human participant research is considered to be regulated personal data to which relevant national or Union access-to-information regimes preclude or caveat reuse, the Open Data Directive will not operate to limit the application of the sui generis database right.Footnote 123 Last, it bears mentioning that the Open Data Directive is not applicable to documents that are made public and available for reuse at the discretion of public bodies in a manner that falls “outside the scope of the public task”,Footnote 124 or that fall outside of the relevant national access-to-information regimes.Footnote 125

Second, there are significant definitional challenges in determining the interplay between the obligation to enable reuse on the one hand, and the residual right to preclude extraction or reutilisation on the other. That is: the degree to which public sector entities subject to the obligations in the Open Data Directive can exercise the sui generis database right, whilst still fulfilling their obligation to enable the reuse of data (i.e. to refrain from its exercise), remains ambiguous.

Nonetheless, the general trend of the iterative changes made to the PSI Directive is to increase its application to research-related documents and public-funded research data so as to enable their downstream use in a manner that reads down or overrides reliance on private rights therein. There is good reason to imagine that a policy-minded court would interpret the right to reuse in the Open Data Directive as curtailing reliance on the sui generis database right. Indeed, the Open Data Directive is but one example of an emergent trend on the part of EU legislators to enact laws privileging both open science and a broad conception of the scientific and public information commons.Footnote 126 Further examination of the contents of the Open Data Directive illustrates this well.

Documents that fall within the scope of the Open Data Directive must be made available in a form that is “open, machine-readable, accessible, findable and re-usable.”Footnote 127 Public bodies must respond to requests for access to documents.Footnote 128 Costs charged for access to most documents must be limited to the marginal costs of providing access, and of select other related acts (e.g. anonymising them).Footnote 129 No costs can be charged to access research data.Footnote 130 Public bodies must provide information about the charges applicable to accessing documentsFootnote 131 and also inform third parties of the redress mechanisms that exist concerning applications for access to documents.Footnote 132

The Open Data Directive generally prohibits public sector bodies from subjecting third parties’ access to documents to restrictive conditions,Footnote 133 or granting inequitable rights of use to different users.Footnote 134 These limits include express preclusions on granting exclusive rights to documents to select partiesFootnote 135 and on imposing discriminatory conditions of access to documents, relative to similar use-cases.Footnote 136

3.5 The Data Act

The European Commission’s proposed Data Act, as now drafted, would further read down the scope of application of the sui generis database right. Article 35 establishes that the database right would not find application to “databases containing data obtained from or generated by the use of a product or a related service.”Footnote 137 If implemented, this should be construed as the near elimination of the database right. The modification is described in the Recitals and explanatory memorandum of the Data Act as directed to ensuring that IPRs do not find application to data that derive from “internet-of-things” (IoT) devices.Footnote 138 Taken alone, this might militate for a narrow interpretation of its breadth of application. In the Impact Assessment Report that accompanies the Data Act, the article is construed first as a clarification detailing that the sui generis database right does not find application to data arising from the use of connected devices – as might already be the case without revision because such data arise from pre-existing activities without independent investment – in other words, IoT databases are spin-off databases.Footnote 139 This amendment is also construed as the limited paring down of the database right to preclude such right from generating heightened transaction costs and occasioning reduced potential for competition, and analogised to wider EU policy efforts to redraft the sui generis database right.Footnote 140 Both due to its broad language and its association to the broader effort to limit the negative effects of the sui generis database right, the proposed amendment would create wide latitude for courts to further narrow the databases captured by the database right, through a broad construction of the exclusion at Art. 35 of the proposed Data Act.Footnote 141

4 Policy Considerations for Public Research Databases

Having described general legal and policy considerations relevant to the sui generis database right, the Open Data Directive, and select other EU laws, we now consider their application to public biological databases.

Multiple distinct categories of stakeholder engage in the production of public data repositories, including public biological databases. The resulting repositories include research databases funded through public monies, or through a combination of private and public funds, as well as co-created databases that are produced through open science and citizen science initiatives. Some of these repositories are large databases produced to house data from one study or from a network of related studies (e.g. the International Cancer Genome Consortium, UK Biobank).Footnote 142 Others are larger-scale repositories that are created to store and enable access to data from multiple unrelated studies, acting as infrastructure at the disposal of researchers that independently choose to deposit data (e.g. dbGaP, the European Genome-phenome Archive).Footnote 143 Last, some tools act as “discovery tools” that enable searches to be performed across multiple independent databases or repositories without centralising the queried datasets or dataset metadata (e.g. a Beacon network, CanDIG).Footnote 144

The creation of public biological databases often relies on academic research or infrastructure funds from national or international funding agencies, such as the National Science Foundation (NSF), the National Institutes of Health (NIH), the European Union’s Horizon 2020 initiative, and Canada’s Tri-Councils. Other government agencies may also support such activities, including the Defense Advanced Research Projects Agency (DARPA) in the United States. Private charities, the private sector, or public-private partnerships may also support research data generation, curation, and related initiatives. The funding often flows to one or multiple research institutions, government agencies, or consortia, which support, through the provision of data, intra- or extra-mural biomedical research. These categories of repositories differ in their organisational structure, their purpose, and in the nature of the information stored.

The benefit of public data lies in its potential for reuse. Therefore, the default operation of law – through the sui generis database right – granting exclusive IPRs on public datasets conflicts with the rationale that justifies their creation. This could lead to the automatic creation of private rights in public datasets that run contrary to the intentions of public research institutions, public funding agencies, and self-organised open science communities.

In the following section, we apply the tests discussed above to determine when the sui generis database right might apply to public biological databases. The subsequent section determines the conditions according to which the creation of platforms used to discover data or to submit queries to existing public biological databases will breach the sui generis database right.

4.1 Does the Sui Generis Database Right Apply?

4.1.1 Barriers to the Recognition of the Sui Generis Database Right that Relate to Public Biological Databases Meeting Its Preconditions

Three reasons, each relating to the investment test and the conditions according to which such databases are compiled, support the conclusion that the database right is not applicable to public biological databases. First, the database right will not apply to databases that generate their own data as per British Horseracing Board and related decisions.Footnote 145 Because the production of such databases is not subject to an investment independent from that made to create the concerned data, it is not subject to the sui generis database right. That is, funds (often public) are invested to perform the first-instance generation of research data, rather than to curate pre-existing data into databases.Footnote 146 While the CJEU has not directly addressed the application of the sui generis database right to public research databases, the national courts of Member States have. National courts have allowed the private sector participants in public-private partnerships to benefit from the sui generis database right, whilst denying the right to the public sector partners. Thus, national courts have sometimes found that institutions/agencies that obtain public funding to create a database, or to generate raw data, do not benefit from the sui generis database right.Footnote 147 Second, research databases are often created through the principal research activities of the consortium or its constituent research organisations, and therefore no independent investment is made in the production of such databases (i.e. these are spin-off databases).Footnote 148 Third, the independent investment made in the creation of a public biological database likely carries no investment risk, which is a precondition to the database right being recognised. That is, the data were produced through funding directed to assembling databases and other research outputs that act as pure public goods – the output data were not intended to be translated into profits, and therefore no associated investment risk exists.

One notable case that might counterbalance the foregoing analysis is the CJEU holding in Directmedia. In Directmedia, a university invested EUR 34,900 in compiling a list of the most common German poems featured in existing anthologies, which required archival research and statistical analysis.Footnote 149 This investment was considered to be subject to the protection of the sui generis database right. However, it is worth considering that in this instance, significant resources were exceptionally dedicated to the compilation of data from existing sources (rather than to data generation) – and the question of whether or not the concerned database was subject to protection was not itself referred to the court.Footnote 150 The court was asked to produce a holding on the issue of infringement, alone. This holding could temper our conclusion that public-funded and research-funded efforts to compile databases absent a profit motive are less subject to the sui generis database right than private-funded efforts that envision profit, or public-funded efforts that envision cost-recovery (e.g. through the levy of access fees).

Certain ambiguities therefore remain. First, it is unclear whether public-sector research funding arrangements that enable funded parties to benefit from cost-recovery mechanisms qualify for sui generis database protection. Second, the threshold of additional investment required to translate a created, spin-off, public-funded, or otherwise presumptively unprotected database into a protected database is difficult to ascertain. Third, it is unclear how a recipient of research funding could invest additional funds and effort at its own risk to translate a non-protected deliverable into a protected object that is the result of additional, risk-bearing investment.

4.1.2 Barriers to the Recognition of the Sui Generis Database Right Arising from the Definition of a Database “Maker”

The identification of an appropriate database “maker” that benefits from the sui generis database right can be unclear in research collaborations. For example, the right vests in “the person who takes the initiative and the risk of investing”,Footnote 151 leaving ambiguous whether the right could vest in actors that initiate the database or those that bear the risk inherent in creating the database, alone.Footnote 152 Alternatively, it is possible that both conditions must be met,Footnote 153 but it is unclear whether each collaborator must meet both requirements, or whether it is sufficient for separate collaborators to meet each of the conditions (so as to hold either separate or joint rights).Footnote 154 Recent case law does appear to suggest that one actor must perform the initiation of database creation, and also bear the associated investment risk, to benefit from the database right.Footnote 155

Equally ambiguous are the categories of risk sufficient to satisfy the “investment risk” criterion.Footnote 156 Some legal scholars conclude that the Database Directive refers to the “organisational” risk engaged in creating the database, rather than the financial or reputational risk inherent in its creation.Footnote 157 This interpretation implies that “pure contractors” (i.e. persons that hire third parties to produce a database) could not benefit from the sui generis database right.Footnote 158 We disagree, because such limitation of the categories of “risk” to “organisational risk” does not flow from the text of the Database Directive. Instead both the historical origin of the rightFootnote 159 and the exclusion of subcontractorsFootnote 160 at Recital 41Footnote 161 favor an interpretation that accepts pure or principally financial risk as sufficient to give rise to the database right.Footnote 162 Recital 40 appears to contemplate the expenditure of “financial resources” and the “expending of time, effort and energy” as discrete and valid forms of risk-taking that could give rise to the right independent from one another.Footnote 163 Courts appear to consider that one organisation must take both the “initiative” and the “risk” of investing, in cumulation, to benefit from the sui generis database right. Conversely, it is not required for one singular organisation to bear a specific category of risk to benefit from the sui generis database right (i.e. to expend resources that are explicitly of a financial, organisational, or other category in the production of the database).Footnote 164

Based on this discussion, the potential for individual or institutional members of research consortia to benefit from the database right may be restricted in practice, depending on the interpretation of “database maker”. Research consortia will oftentimes not meet the definition if all potential rights-holders must meet both the precondition of initiation and risk, or if the risk must be organisational in character. Conversely, if a more liberal interpretation is applied, collaborators within research consortia, either individually or as a group, may benefit from the database right.

Interpretive debates notwithstanding, the policy objective of the Database Directive is to create incentives to invest in database compilation. This suggests that courts will favor a broad definition of database maker, including network-based, market-based, or firm-based production, rather than recognising elements contingent on specific organisational forms or financial arrangements. In practice, the “investment” test will be more determinative of whether a sui generis database right is recognised, and the apportionment of roles and responsibilities between research collaborators will be less so. This interpretation accords with the investment test and the public policy considerations that underpin the sui generis database right; the definition of database maker is tangential.

4.1.3 Barriers to the Recognition of the Sui Generis Database Right Arising from the Structure of Research Collaborations and Research Consortia

It is often unclear how national law apportions joint ownership of property rights, including IPRs. Thus, singular institutions within research consortia, or groups of unincorporated researchers, could potentially hold joint sui generis database rights in the public research databases created via joint efforts. Whether these rights can be exercised by a singular consortium member, or must be exercised in a collective manner, might depend on the contents of applicable national law. The potential for conflicts in interpretation between the national property laws of each EU Member State gives rise to further uncertainty.

Application of the database right also leaves indeterminate the rights of non-EU participants in collaborations, as the Directive restricts the sui generis database right to “nationals of a Member State or who have their habitual residence in the territory of the community”Footnote 165 and “companies and firms formed in accordance with the law of a Member State and having their registered office, central administration or principal place of business within the Community.”Footnote 166 It is unclear how EU courts would consider the entitlement of non-EU collaborators to the joint sui generis database right arising from an international research collaboration.Footnote 167 In transcontinental collaborations, it is conceivable that EU beneficiaries could license the sui generis database right to their non-EU collaborators, frustrating the intent of the Database Directive to restrict benefits to EU nationals or residents.

Moreover, it remains unclear whether a research collaborator who benefits from the sui generis database right could license such rights to collaborators who do not, as part of a consortium agreement or other contractual agreement. Contrary to the Database Directive, such licensing would enable collaborators that produce databases as part of their existing activities to benefit from the sui generis database right.

A further complication is the application of the Database Directive to collaborations in which the majority of the investment and attendant risk is borne by parties that do not qualify for the database right. For example, if an unincorporated public-private partnership or an international effort invests in the creation of a database, and the majority of the investment risk is borne by the non-EU collaborators, it is unclear whether such a collaboration would entitle the qualifying participants to benefit from the sui generis database right. Some authors have attempted to resolve the question of joint ownership of sui generis database rights by considering the legal nature of the sui generis database right itself (i.e. the text of the Database Directive), rather than the applicable Member State property law rules governing how property rights arise and are apportioned.Footnote 168 Lipton contends that the language of the Database Directive is itself ambiguous about the potential for joint sui generis database rights to arise in a database, the creation of which results from a collaborative effort.Footnote 169 Other legal scholarsFootnote 170 consider this question relative to the non-binding definition of “database maker” that is provided at Recital 41 of the Database Directive, and the inconsistent Member State implementations.Footnote 171 This line of inquiry attempts to determine the minimum preconditions that one collaborator in a collective endeavor must satisfy to meet the definition of “database maker”, as discussed above.

4.2 Can the Database Right Limit Use or Reuse of Public Research Data?

In the previous section, we demonstrated that public biological databases will seldom be subject to the sui generis database right due to the application of the investment test and the “created/collected” distinction, and that difficulties might also arise from the definition of the database maker and the joint allocation of rights between collaborators.

In circumstances where a sui generis database right is recognised in a database, we now consider the categories of downstream data use that the database right might preclude. We focus on the creation of data discovery platforms, search engines, and other, similar tools that enable pre-existing external databases to be queried and accessed with greater facility.

Do the creation and operation of platforms that facilitate data discovery or data access have the potential to infringe the Database Directive? Platforms host information about available datasets, known as metadata, and triage the metadata to help researchers find information that is relevant to their intended research activities, hosted on other pre-existing repositories. For example, a platform could enable multiple repositories of biological data to be queried for features that are of research interest. Queries could be based on genetic variant, phenotype, pathology, or other elements of clinical metadata, and formatted using standardised ontologies.Footnote 172 These platforms could act as pure “discovery” tools, enabling researchers to search for relevant dataset-level information from across multiple restricted-access (controlled) repositories. Users then proceed to negotiate access to the datasets of interest with their individual data custodians. Alternatively, platforms enabling data “access” could centralise the direct download links to multiple external datasets that are of common research interest to downstream data users. Still other technologies could bring together some features of “discovery” and “access” tools, for example, those that produce aggregate data from dataset-level metadata or from record-level data within protected datasets. These could add value to the datasets used to produce them, or could render them obsolete.
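To make the notion of a pure “discovery” tool concrete, the following is a minimal sketch in Python, under stated assumptions. The record structure, the repository names, the function `discover`, and the ontology-style identifiers are all hypothetical and serve only as illustration; they do not describe any existing platform. The sketch searches dataset-level metadata across several repositories and returns only pointers to matching datasets, leaving access to be negotiated with each data custodian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Dataset-level metadata only; no record-level data is held (hypothetical schema)."""
    dataset_id: str
    repository: str
    phenotypes: frozenset = frozenset()
    variants: frozenset = frozenset()


# Illustrative placeholder metadata; identifiers are invented for this example.
REPOS = {
    "repo_a": [
        DatasetRecord("DS-001", "repo_a",
                      frozenset({"HP:0001250"}), frozenset({"GENE1:c.100A>T"})),
        DatasetRecord("DS-002", "repo_a", frozenset({"HP:0000118"})),
    ],
    "repo_b": [
        DatasetRecord("DS-101", "repo_b",
                      frozenset({"HP:0001250", "HP:0000118"})),
    ],
}


def discover(repositories, phenotype=None, variant=None):
    """Return sorted (repository, dataset_id) pointers for matching datasets.

    Only pointers are returned: the tool does not copy or centralise the
    underlying datasets.
    """
    hits = []
    for records in repositories.values():
        for rec in records:
            if phenotype is not None and phenotype not in rec.phenotypes:
                continue
            if variant is not None and variant not in rec.variants:
                continue
            hits.append((rec.repository, rec.dataset_id))
    return sorted(hits)


if __name__ == "__main__":
    print(discover(REPOS, phenotype="HP:0001250"))
```

A tool of this kind reproduces only a thin layer of dataset-level metadata. An “access” tool would additionally store download links, and an aggregating tool would compute summaries from record-level data, which is where the questions of extraction and reutilisation discussed below become more pressing.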

4.2.1 Creating Downstream Databases from Existing Databases and the Sui Generis Database Right

The direct reutilisation of a pre-existing database as part of a second, downstream database could infringe the sui generis database right if elements of one public biological database are integrated into a downstream database that fulfils a similar purpose. In Directmedia, the CJEU held that broad categories of actions could be held to constitute “extraction” or “reutilisation”, infringing the sui generis database right. These acts are not limited to direct “copying” of databases through technological duplication but also extend to their labor-intensive manual reconstitution in full or in part.Footnote 173 Therefore, in producing downstream databases that derive considerable content from other existing databases, the operative analysis is not directed to the categories of “actions” performed to create copies of the affected databases. Rather, it is directed to whether their outcome constitutes the “extraction” or “reutilisation” of a substantial part of a protected database. This requires the application of the test elaborated in Latvia, which seeks to determine, first, whether the extraction or reutilisation of a substantial part of a protected database has occurred, and second, whether this extraction or reutilisation creates a “significant detriment, evaluated qualitatively or quantitatively, to the investment.”Footnote 174 In performing this evaluation, the CJEU in Latvia also considers the balance between precluding the “potential risk to the substantial investment of the maker”Footnote 175 and privileging the entitlement of third parties to access information and/or to translate it into downstream innovations.

The creation of a derived database that replicates a pre-existing database – or brings content from multiple such databases together – therefore does not necessarily infringe the sui generis database right. First, we determine whether the derived database extracts and reutilises a substantial portion of the original database. Second, we determine whether or not this detracts a great deal from the potential for the rights-holder to benefit from their investment. Third, where such a risk arises, we evaluate its severity relative to the harm to functioning markets in information that would be occasioned in upholding the right. This means evaluating the value of repurposing such information, or of a value-added service that relies on it.

A derived database that acts as a strict replica of a pre-existing, protected database will likely violate the sui generis database right. Conversely, a derived database that brings together information from one or multiple databases without detracting from the value of the original database (for example, because its function differs, or it is directed to a different audience), would not breach the database right. Databases that extract and/or reutilise a “substantial part” of one or multiple databases in a manner that depreciates the value of the original databases will not necessarily breach the sui generis database right, so long as these produce value that is independent from that of the original database, sufficient to justify the risk to the investment.Footnote 176 Nonetheless, the presence or absence of a risk to a protected investment remains the main criterion that is used to determine whether or not the sui generis database right is infringed.Footnote 177

4.2.2 Creating Platforms and Tools Enabling Data Discovery, Analysis, and Access that Aggregate Data from Multiple Databases or Repositories

A similar analysis determines whether or not the integration of data, metadata, download links, and the like from other public biological databases into downstream discovery tools or search engines breaches the sui generis database right.

The CJEU has explicitly addressed the relationship between value-added search platforms and the database right. In Innoweb B.V. v. Wegener I.C.T. Media B.V. (Innoweb), the CJEU considered whether a “meta search engine”Footnote 178 breached the database right. The court held in the affirmative because the meta search engine reutilised a substantial portion of the underlying databases searched.Footnote 179 The meta search engine, GasPedaal, accepts queries for car advertisements according to select filters and search terms. Once a search is made using the platform, the meta search engine runs the search directly via underlying search platforms. GasPedaal then presents the search results to the end user, integrating information in a similar order to the underlying search platforms queried. End users are therefore not required to consult the underlying platforms.Footnote 180 The CJEU stated that the objective of the database right is “to ensure that the person who has taken the initiative and assumed the risk of making a substantial investment in terms of human, technical, and/or financial resources in the setting up and operation of a database receives a return on his investment.”Footnote 181 GasPedaal infringed the database right, because it rendered irrelevant the search engines from which the presented information was drawn; it deprived such platforms of income.Footnote 182 Further, the meta search engine, despite providing additional value by aggregating the results of multiple search engines, operated in a similar fashion to the source search engines. It presented search queries in a comparable manner, and performed searches in real time.Footnote 183 In sum, the court concluded that these features rendered the GasPedaal meta search engine “close to the manufacture of a parasitical competing product” and that it therefore infringed the database right.Footnote 184

The CJEU considered meta search engines again, on different facts, in Latvia. The search platform concerned indexed the “meta tags” of numerous websites hosting job advertisements, which such websites provided to make their content discoverable to general search engines. In contrast to Innoweb, the search engine in Latvia did not reuse the native search engines of the websites from which it drew its content.Footnote 185 Instead, it first indexed the content of the websites that hosted the job advertisements and then provided deep links to those websites’ primary copies of the job advertisements.Footnote 186 The impugned search engine added functionality in bringing together in one place the results of queries that would otherwise need to be performed on numerous independent search engines. Unlike GasPedaal, it did not constitute a parasitic imitation of the search engines used to produce it. GasPedaal did little but run searches through the existing search engines, simultaneously depriving the underlying sites of usership and revenue.Footnote 187

4.2.3 Conclusion as to the Effects of the Database Right

Our general position on the application of the sui generis database right to public biological databases is the following. There is strong, but not unqualified, support for the position that the sui generis database right would not apply to most public-funded consortium-produced public biological databases. The strongest argument in support of this claim is that such efforts often “create” rather than “collect” data, which does not create an entitlement to the sui generis database right. Further, recent case law provides heightened support for the position that the sui generis database right does not find application to public-funded databases due to a lack of associated investment risk, though this position remains contentious. If the database is produced as the incidental result of pre-existing research or clinical activities, it is almost certain that no sui generis database right will be recognised. Last, different actors taking the initiative and bearing the risk of investing in the creation of a database could lead to their disqualification from database protection due to the cumulative nature of such criteria. Our general conclusion is that researchers operating as a consortium to produce a database will often be deprived of the protection of the sui generis database right, though the preclusion is not strict and absolute.

In those residual circumstances in which the sui generis database right does find application to a public biological database, the prospect for downstream use of that database to infringe the database right is restricted. First, some authors contend that because the framing of infringement in Latvia requires that the downstream use create a risk to the fruition of the protected investment, public-funded research deliverables would struggle to overcome the infringement test even where subject to the database right. Second, there is a high likelihood that either derived databases or discovery-and-access tools that add value to existing databases would not infringe the sui generis database right even if these created a risk to the realisation of the investment in a protected database. This is the case because of the application of the balancing test introduced in Latvia.

The final residual consideration is the application of the Open Data Directive and the Data Governance Act in these circumstances. These also contain provisions that enable reuse of research data and public sector information and limit the potential to invoke the sui generis database right. Though relevant, it is not clear how broad the right of reuse anticipated in these acts is, and there is significant potential for research data not to be subject to these laws due to the carve-outs anticipated for certain categories of data. Therefore, these instruments are anticipated to be less impactful than the considerations raised in the preceding paragraphs.

The caveat to our position is the holding in Ryanair. This holding enables organisations that release databases that are not subject to the sui generis database right to use strong contractual protections or digital rights management (DRM) technologies to condition the use and reuse of those databases. In using such protections, organisations need not respect the minimal rights of reuse that the Database Directive anticipates for those databases that are subject to the sui generis database right. Therefore, it is possible that the producers of public biological databases could assert strong contractual protections – or use digital rights management technologies – to prohibit the reuse of their databases, where the sui generis database right does not find application. However, the use of such conditions in the public sector, at least, is often precluded by open access policies of funding agencies and research institutions.

The lack of clarity as to whether the database right applies to research databases and associated information, including metadata, annotations, and ontologies, combined with the potential for unintentional infringement, could deter third parties from producing data discovery platforms, data access platforms, enriched datasets, and other value-added derivatives. Legal uncertainty may negatively impact open science initiatives that rely on data and tools contributed by third parties as well as ongoing analysis and reutilisation of data. Reform is therefore recommended.

4.2.4 Conclusion as to the Benefits and Consequences of the Database Right

The legal uncertainty of the database right, compared to copyright, is exacerbated because of its broad scope and lack of time limitation. A more limited right might still compensate database producers for the high cost of first production that places them at a competitive disadvantage relative to other participants in the same market.Footnote 188 It might stimulate investment in the production of intangibles in the form of databases. Without some protection, the market might under-provide such goods because they are non-excludable, precluding prospective creators from realising profits on their investment in creation.Footnote 189 However, a more certain and time-limited right would decrease the risk of exclusion inhibiting the creation of downstream value. Under the current regime, rights-holders may engage in rent-seeking behaviors, commanding high prices to access IPR-protected intangibles that could otherwise be shared or reproduced at zero or minimal cost. In the case of the database right, this risk is exacerbated because recent policy analyses conclude that there is limited need to introduce legal rights to incentivise the production of databases – database producers have increasing incentives to do so as other economic activities become increasingly data-reliant.

It is apparent from our discussion that the CJEU and national courts have attempted to optimise the protection conferred, with respect to both its scope and its subject matter. They have limited the scope of the database right in step with the emergence of a mature information society, in which numerous incentives to generate valuable data and datasets exist. They recognise the economic value of enabling access to data and datasets, limiting exclusive rights that purportedly incentivise their production and dissemination.Footnote 190

Our discussion highlights the interpretive ambiguities and potential inflexibilities of IPRs, such as the sui generis database right established in the Database Directive. Such IPRs are problematic because the proliferation of differing statutory interpretations risks real or perceived legal non-compliance. Ambiguities in the breadth of application of IPRs discourage investment in the production or disclosure of intangibles for fear that these will not be protected. They also encourage the reticent use of available intangibles to avoid potential infringement.

Unfortunately, opportunities for legal reform are resource-intensive and slow. They require action by the judiciary or legislators. It is therefore difficult to ensure that the scope of IPRs remains context-sensitive and attuned to economic, societal, and technological transformation.

5 Proposed Amendments to the Sui Generis Database Right

5.1 Proposals to Reform the Sui Generis Database Right

Academic commentators and the formal evaluations of the Database Directive have proposed a number of reforms, up to and including the elimination of the database right. The arguments against repeal of the right are first, that its elimination could expose firms whose database rights have financial value to financial loss. Second, such action could potentially constitute a breach of the right to intellectual property, which is established in EU human rights law.Footnote 191 Furthermore, such action could cause firms to limit their production of EU-sourced intangibles (or purchase of rights therein) for fear of the subsequent elimination or aggressive modification of the IPRs. Finally, the Database Directive not only created the database right but also articulated the compromise position between the continental and the UK approach to copyright protection. Despite the departure of the UK from the EU, the continued harmonisation of the database right and copyright in both jurisdictions creates economic advantages for firms resident in both jurisdictions.Footnote 192

A second reform proposal is the implementation of a registration requirement and/or a notification requirement as a precondition to asserting the database right.Footnote 193 This would, for instance, require firms that assert their database right to provide such notice on their database(s) and require firms to register the database’s digital object identifier (DOI) in a public register. These mechanisms alleviate some legal ambiguity, because they give notice to third parties that extract or reutilise data from a database. Conversely, the absence of rights assertion provides greater certainty regarding freedom to operate. The principal disadvantage of this approach, however, is that it could entrench the economic advantage of large relative to smaller firms, because legally sophisticated firms are more likely to take action on notification and registration.

Other potential reforms include the implementation of novel exemptions to the database right, for example, an exemption for commercial and non-commercial research that extracts and reutilises data and an exemption that enables TDM. The TDM exemption as currently construed enables database right-holders to opt out of its application,Footnote 194 creating uncertainty for users, and limits its application to the extraction prong of the database right. Numerous methods of performing TDM are algorithmic, reducing the ability of users to determine if an extracted or reutilised database is subject to an opt-out notice.Footnote 195 The limitation of the exemption to extraction, and not reutilisation, diminishes its usefulness even where it does find application.Footnote 196 Broad and clear exemptions would provide greater legal certainty for users, including users of public research databases.Footnote 197

A further issue with the database right that needs to be addressed is the information asymmetry that inhibits rights-holders and downstream users of protected databases from appreciating whether an act constitutes the infringement of the sui generis database right with equal precision. The investment test, which is the principal mechanism used to determine if the right arises and, since Latvia, to assess its infringement, relies on information about the existence of a protected investment that is not available to the potentially infringing party.Footnote 198 A solution may be to mimic copyright exemptions, which include fair use and its subcategory of “transformative fair use” in US copyright law. The test for transformative fair use considers whether creative effort has transformed the object of copyright protection into a novel creative work. Applied to the database right, the exemption might consider whether a “substantial investment in terms of human, technical and/or financial resources” creates a new database or other derivative output that is distinct from the protected database, so as to constitute transformative fair use.Footnote 199 This is similar to the test introduced in Latvia. However, where the Latvia test uses information available to the right-holder to determine whether the right is breached, the “transformative fair use” test would use information available to the downstream user to assess whether or not infringement had occurred.

A fair use exemption would continue to provide database right-holders with a similar breadth of protection but enable the potentially infringing party to objectively assess whether a derivative output that entails a substantial investment is sufficiently distinct from the initial database to meet the standard of “transformative fair use.” The principal benefit is that the test used to determine if a database is protected and if a prospective derivative infringes the original right is assessed on the basis of information that is available to all parties.

Finally, the Database Directive could be replaced by a regulation that harmonises the database right throughout the EU.Footnote 200 This could explicitly clarify the relationship between the Database Directive, which is intended to protect investments in database production, and later EU public policy efforts that are directed to guaranteeing the downstream openness of both public-funded and private-funded data, such as the Open Data Directive, the Data Governance Act, and the Data Act. This effort would be welcome because the policy priorities of EU legislatures have shifted from incentivising the production of databases to incentivising the open release and reuse of data. This shift reflects changes in the market for information and downstream products/services, rather than a shift in the objectives of the European Commission in information policy.

Though these possibilities are appealing, the transaction costs inherent in interpreting the boundaries of the sui generis database right and negotiating access rights, and the continued potential for litigation costs or civil penalties to be imposed on good-faith downstream users of data would remain even where the right was amended or its ascription was mediated through a registration requirement or other formality. It is dubious that the database right creates value that equals or surpasses the transaction costs and chilling effect that its ambiguous boundaries impose. For these reasons, and those stated below, we urge the repeal of the sui generis database right.

6 Repealing the Sui Generis Database Right

Despite potential challenges, we contend that the sui generis database right should be repealed. While minor modifications and interpretations have the advantages discussed above, the evaluation and administration of the sui generis database right creates considerable costs for database creators, database users, and for courts. Such costs are incurred in developing contracts to manage and to assign the respective database rights of collaborators, and are borne by both upstream and downstream users of shared data and databases. Costs are also incurred in hiring legal advisors and other relevant experts to determine the probable application of the sui generis database right to particular factual circumstances, on the part of both prospective claimants and prospective users of potentially protected databases. Parties must continue to bear the costs of legal ambiguities that lead to disputes or to unintended outcomes (e.g. unanticipated infringement of the sui generis database right, or reliance on a right that courts deem not to exist).

The costs of the sui generis database right are not limited to those that database producers and data users must bear. The administration of the database right, and the negative externalities that it generates, can create costs in which all members of society must share. The resources that courts expend in interpreting and refining the ambit of the sui generis database right for reasons relating to both legal clarity and public policy are examples. Similarly, the hesitation of prospective users to utilise databases that could be IPR-protected may have a deterrent effect that leads to a systematic decline in the translation of available data into valuable outputs, depriving society of the benefits.

Approaches to reform that would create a registration requirement or specialised administrative body responsible for confirming the ambit and application of the database right alleviate some, but not all of the above difficulties. For example, the operation and decision-making activities of such an administrative organ, and the performance and verification of the associated formalities, still create high costs for database producers, database users, and for society at large. Furthermore, rendering legal rights contingent on performing legal formalities (e.g. registration of the right) can favor well-capitalised commercial entities and academic research institutions, relative to smaller organisations and individuals. The better-capitalised institutions will often retain legal counsel and expend additional resources to ensure that their rights are secured in databases that these institutions generate.

Bearing some of the costs might be worth the while if the sui generis database right created benefits that justified these additional costs. However, our analysis demonstrates that the incentives that the database right creates are neither effective in stimulating database production, nor required to such end. Database producers often have incentives to compile databases that are not conditional on the assertion of proprietary rights therein. In those instances in which asserting proprietary rights in databases is valuable, digital rights management or contractual mechanisms are both more effective, and more flexible, than the database right.

There is extensive literature on the potential for legal complexity to arise from the interaction of simple rules. This literature further notes that there are few empirical criteria available to determine whether a particular combination of legal rules is “too complex”, such that it is difficult to measure the complexity of law using a meaningful set of metrics. It is also difficult to compare the success of a law in meeting its objectives, relative to its complexity, or to the complexity it introduces to the legal system.Footnote 201 Despite these limitations, it is possible for us to conclude that the complexity produced by the sui generis database right outweighs the benefits that it creates. However, at minimum, in the absence of repeal initiatives, publicly funded databases and those that are made available for downstream use as a matter of law should be excluded from the operation of the sui generis database right.

One final issue that ought to be addressed in reforming the sui generis database right is the result of the Ryanair decision. In the context of public biological databases, the repeal of the sui generis right would require strictures on the use of contracts and DRM technologies to prevent the downstream use of such databases. Indeed, the same critiques that we have levied concerning the transaction costs and prospect for infringement inherent in navigating the sui generis database right could be levied at the need to navigate contracts of adhesion, licenses, and DRM technologies that are appended to databases. To this end, we recommend legislating an unqualified right to the reuse of public databases similar to those anticipated in the Database Directive, the Open Data Directive, or the Data Governance Act. This proposal is coherent with the broader EU pivot away from recognising exclusive rights in databases, and toward encouraging the translation of publicly-funded data into valuable downstream outputs.

7 Conclusion: The Sui Generis Database Right in Context

To reiterate: the sui generis database right was devised to foster investment in database production, and to stimulate markets in information and related services. In concise legislation that has remained open to jurisprudential refinement and redefinition, the Database Directive creates a strong and novel IPR, offered to database makers as an incentive to translate pre-existing data into novel databases.

However, the database right has a deleterious effect on the reuse of available data. Firms, public bodies, and research organisations bear strong incentives to generate and compile data for their own purposes, absent the database right acting to incentivise such behavior. Much of the economic value that arises from data use can be ascribed to databases, search engines, and other downstream products that are produced from pre-existing databases. Therefore, the database right does not appear to be needed to ensure that databases are produced. Once recognised in a database, it creates friction with the competing policy imperative to enable the liberal downstream use of data. To promote this latter objective, courts and legislatures have created considerable exceptions to the database right, limiting the circumstances in which it is recognised, the conditions that give rise to its breach, and the circumstances according to which it can be exercised.

These modifications have not led to a streamlined right that elegantly balances incentivising database production against freeing up information reuse. Instead, the database right as now articulated applies in a conditional manner, and determining the circumstances of its application requires jurists to balance numerous competing factors. These include determining whether a database is protected, whether legislation precludes the rights-holder’s reliance on the database right, and whether contemplated downstream uses pose a risk to a protected investment.

In performing this evaluation, established firms with considerable resources to allocate to legal counsel are favored relative to smaller organisations. These firms seldom need the protection of the database right, as these hold the resources to use contracts and DRM technologies to protect databases that are valuable to them. Smaller firms, public bodies, and research organisations often conduct their activities through decentralised networks. These organisations will less often have the resources to determine whether their own databases benefit from the database right, and in fact, the preconditions to the recognition thereof might operate to disfavor such networked production from giving rise to protected databases. In reusing available public databases, it is anticipated that smaller organisations of this nature will refrain from the reuse of available and non-protected databases, for fear of non-compliance with applicable rights. In contrast to larger and more established firms, these do not possess the financial resources and access to in-house legal expertise required to distinguish infringing conduct from non-infringing conduct. The net effect of the database right is therefore to further entrench information monopolies, an outcome that is in direct conflict with the stated objectives of EU information policy. In all cases, establishing whether a database is subject to the database right, and whether a proposed downstream use breaches the database right, imposes high costs on organisations in the EU, leaving them at a competitive disadvantage relative to non-EU organisations that need not direct precious legal compliance resources to performing such analyses.

Our conclusion creates reason to be cautious about the broader EU effort to balance context-specific protected interests in information (e.g. commercial rights, data protection rights) against downstream uses thereof that might conflict with those protected interests. Recent legislative efforts including the General Data Protection Regulation (GDPR), the Data Governance Act, the proposed Data Act, and the proposed European Health Data Space Regulation (EHDS Regulation) attempt to develop concerted public policy proposals directed to balancing individual and/or private interests in information against the framing thereof as part of a commons that is open to unqualified use.

Legal mechanisms used to balance contextually recognised private interests in information against the general public interest in enabling its reuse include individual, enforceable rights, licensing requirements, reporting requirements, and the requirement to describe and to balance risks in impact assessments. Other mechanisms include the creation of novel information regulators, such as supervisory authorities and competent bodies, and of bodies responsible for issuing interpretive guidance, such as the European Data Protection Board (EDPB) and the European Data Innovation Board (EDIB).

The EU effort to invest considerable legislative and public-sector resources in the creation of clear rules balancing contingent rights to restrict information use against the general right to reuse such information is laudable. However, the lesson of the sui generis database right is that these efforts will create complicated legal structures that are cost-intensive to understand and to administer. Systematically, reliance on this approach will likely contribute to the indeterminacy of the applicable laws because of the intricacy of the applicable legal tests and the long timespans required to obtain feedback from courts and regulators regarding best-efforts compliance. It will also entrench the advantages of centralised organisations that can better coordinate their activities according to a singular and shared response to applicable legislation, and entrench the advantages of large organisations with deep pockets to dedicate to mitigating or withstanding the risk of non-compliance. Last, it will render EU organisations less competitive relative to non-EU organisations acting in the same sphere of activities, as the former are required to bear heightened transaction costs and heightened costs of compliance.

Though suggesting remedies to these broader public policy challenges is outside of the scope of this paper, a few brief comments can be made. First, the iterative effort to review and to refine the real-world effects of EU information policy is welcome. It is difficult to anticipate the effects that legislation will have on the regulated object. Reviewing and recasting legislation can help to mitigate its unanticipated deleterious effects, or to achieve similar benefits in a more cost-effective manner. It can also ensure that the law remains responsive to shifting incentives or changing practices in the regulated sector, which motivate a change in regulatory strategy. Second, in favoring the interests of pro-social efforts such as research efforts, bright-line exceptions to the infringement of public law, or to the infringement of private rights, should be favored, instead of contextual and conditional exceptions thereto. This is the case because the organisations engaged in these categories of pro-social enterprise do not have the resources required to engage in intensive legal interpretation, nor to bear the attendant risks. Therefore, a contextually applicable exception to onerous regulatory requirements or to the infringement of private rights, such as the sui generis database right, can fail altogether to alleviate the applicable regulatory burden, where the costs of assessing whether such an exemption finds application remain high. These latter approaches – reviewing legislation often, and integrating blanket exceptions favoring privileged categories of actors such as SMEs and research organisations – can help to reconcile the need of low-capacity organisations for low-cost solutions, with the wide and ambitious project of creating comprehensive information policy throughout the EU, through the continuous implementation of sweeping omnibus statutes.