1 Introduction

The Data Governance Act (DGA) is part of the larger regulatory framework pursued by the EU for digitalization, the data economy, artificial intelligence, and other important policy goals often approached under the label of digital sovereignty (Footnote 1). Data is a key ingredient in this framework. Artificial intelligence needs data. Science needs data. Digital applications and services need data.

Data is the new oil, so the saying goes. But, in many ways, the new economy and the giant companies built around data share the same notoriety as the ruthless oil barons of the past (Lahtiranta & Hyrynsalmi, 2018). Against this backdrop, it is understandable that the EU’s focus has long been on data protection. This focus culminated in the enactment of the General Data Protection Regulation (GDPR) in 2016. In recent years, the focus has shifted toward facilitating the data economy and data sharing in Europe. To some extent, however, the data economy focus was already present during the policy-making of the GDPR. For many politicians and stakeholders, the regulation had the twin goals of protecting personal data and facilitating the free flow of personal data across the internal market (König, 2022). Data reuse was also a hot topic during the negotiations (Starkbaum & Felt, 2019). Moreover, the data economy focus was to a certain degree explicitly embedded in the GDPR, whose Article 20 gave data subjects a new right to data portability between different data controllers. While the idea was to facilitate data sharing and interoperability, the new portability right turned out to be problematic in many ways, particularly with respect to data reuse (van Ooijen & Vrabec, 2019). To this end, the DGA seeks to facilitate further sharing of personal data by introducing a concept of data altruism. Another core tenet in the new regulation is the reuse of data held by public sector bodies.

The goals of the DGA are ambitious. The primary goal is to facilitate the data economy in Europe and improve the EU’s digital single market. Particular emphasis is placed upon small- and medium-sized enterprises (SMEs) and start-ups, for which the planned data reuse and data sharing provide new material to innovate in artificial intelligence and digital applications. Scientific research is also an important part of the goals. In general, data is seen as necessary for tackling climate change and facilitating the green transition, and for improving the energy infrastructure, healthcare, financial services, and so forth. These goals are framed with a distinct “European way” to data and the data economy. Therefore, fairness, data protection, and lawfulness receive considerable attention in the regulation. In what follows, the main content of the new regulation is briefly reviewed. After the review, a few reflections are provided about potential challenges ahead.

2 The DGA in Brief

The Data Governance Act was proposed by the Commission in 2020 and approved by the Parliament in 2022. This Regulation (EU) 2022/868 will apply from September 2023 (see European Commission, 2022c). The first article in the regulation specifies the scope. Accordingly, the regulation lays down (1) the conditions for reuse of data held by European public sector bodies; (2) a notification and supervisory framework for the provision of data intermediation services; (3) a framework for the voluntary registration of entities that collect and process data made available for altruistic purposes; and (4) a framework for establishing a new European board for innovation in the data economy. Data altruism is defined in the second article; it refers to “the voluntary sharing of data on the basis of the consent of data subjects to process personal data pertaining to them, or permissions of data holders to allow the use of their non-personal data without seeking or receiving a reward that goes beyond compensation related to the costs that they incur where they make their data available for objectives of general interest as provided for in national law, where applicable, such as healthcare, combating climate change, improving mobility, facilitating the development, production and dissemination of official statistics, improving the provision of public services, public policy-making or scientific research purposes in the general interest”. In other words, data altruism is based either on the permission given by an organization for not-for-profit processing of non-personal data or, when personal data is involved, on the notion of consent.

The categories of data for reuse are defined in Article 3. Accordingly, the regulation applies to data held by public sector bodies that is protected on the grounds of commercial confidentiality, statistical confidentiality, protection of intellectual property, or protection of personal data. Thus, personal data held by public sector bodies is covered, and hence the GDPR also applies. There are also exclusions. The regulation does not cover data held by public undertakings, data held by public service broadcasters and their subsidiaries, data held by cultural establishments and educational institutions, data protected on the grounds of national security and defense, and data falling outside the scope of the public tasks of the public sector bodies concerned.

The conditions for data reuse are defined in Article 5. The general principles are non-discrimination, transparency, proportionality, and proper justification without attempts to restrict competition. To ensure that data is properly protected, public sector bodies must ensure that personal data is anonymized and commercially confidential data is properly modified, aggregated, or otherwise handled with proper disclosure controls. Thus, the GDPR’s concept of pseudonymization is not sufficient: proper anonymization is generally required for reuse of personal data (Footnote 2). That said, the fifth article also provides two alternative options: remote access through a secure processing environment controlled by a public sector body, or reuse and processing at the physical premises of a public sector body. In all cases security must be guaranteed. The fifth article also prohibits users of reused data from any attempts to re-identify data subjects. To help public sector bodies with their new tasks, Article 7 specifies that the member states are obligated to designate specific competent bodies. The support provided by these competent bodies includes technical guidance for data storage and data processing; help with anonymization, suppression, randomization, and other techniques that ensure privacy, confidentiality, integrity, and accessibility of personal data; state-of-the-art privacy-preserving methods; deletion of commercially confidential information; support for consent and permission requests for reuse; and relevant contractual commitments. According to Article 6, public sector bodies may also charge fees for allowing reuse of data they possess.
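The difference between pseudonymization and the anonymization that Article 5 generally requires can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the regulation: the records are invented, the function names are hypothetical, and generalization followed by suppression up to k-anonymity is only one of several possible techniques. Satisfying k-anonymity does not by itself amount to anonymization in the legal sense.

```python
import hashlib
from collections import Counter

# Hypothetical records held by a public sector body (invented for illustration).
RECORDS = [
    {"name": "Aino", "age": 34, "postcode": "20100", "diagnosis": "asthma"},
    {"name": "Ben", "age": 37, "postcode": "20140", "diagnosis": "diabetes"},
    {"name": "Carla", "age": 52, "postcode": "20500", "diagnosis": "asthma"},
    {"name": "Dmitri", "age": 58, "postcode": "20540", "diagnosis": "flu"},
    {"name": "Eeva", "age": 81, "postcode": "33100", "diagnosis": "flu"},
]


def pseudonymize(record, salt):
    """Replace the direct identifier with a salted hash.

    Age and postcode stay exact, so each row still describes one person.
    """
    out = dict(record)
    digest = hashlib.sha256((salt + record["name"]).encode("utf-8")).hexdigest()
    out["name"] = digest[:12]
    return out


def generalize(record):
    """Drop the direct identifier and coarsen the quasi-identifiers."""
    low = (record["age"] // 20) * 20
    return {
        "age_band": f"{low}-{low + 19}",
        "region": record["postcode"][:2],
        "diagnosis": record["diagnosis"],
    }


def k_anonymize(records, k):
    """Generalize, then suppress rows whose (age_band, region) group is smaller than k."""
    rows = [generalize(r) for r in records]
    sizes = Counter((r["age_band"], r["region"]) for r in rows)
    return [r for r in rows if sizes[(r["age_band"], r["region"])] >= k]
```

With these invented records, `k_anonymize(RECORDS, 2)` keeps four rows and suppresses the single 81-year-old, whose age band and region are unique, whereas `pseudonymize` leaves the exact age and postcode of every person in place.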

The regulation also introduces a concept of data intermediation services. According to Article 2, a data intermediation service is “a service which aims to establish commercial relationships for the purposes of data sharing between an undetermined number of data subjects and data holders on the one hand and data users on the other, through technical, legal or other means”. The keyword is commercial relationships; other data sharing services of public sector bodies are excluded together with data services without a commercial relationship between data holders and data users (Footnote 3). Copyright-protected data is also excluded (Footnote 4). In general, data intermediation services are only about sharing data; such services should not use the data for purposes other than delivery, although auxiliary functionalities such as anonymization services are allowed (Article 12). In other words, the goal is to promote exchange of data via platforms, databases, and data infrastructures in general through common protocols and data formats that ensure interoperability and security (Article 12). The data subject rights granted by the GDPR are also specified to apply to data intermediation services (Article 10). The establishment of these services requires official registration with public authorities (Article 11). The services are also supervised by competent public authorities that the member states are obliged to designate (Article 13). Analogously to the GDPR, according to the DGA’s Article 14, these new national authorities are empowered to impose fines and even cancel data intermediation services in case of infringements.

The other important concept is data altruism, which the member states are instructed to promote and facilitate (Article 16). The regulation speaks about specific data altruism organizations, which are legal persons that operate on a not-for-profit basis without any dependencies on for-profit entities (Article 18). As with data intermediation services, all organizations wanting to be officially recognized as data altruism organizations must be registered in public registries maintained by competent public sector bodies (Articles 17 and 19). These organizations must keep rigorous track of those processing the data they hold (Article 20). Regarding data subjects, the regulation requires that the objectives of processing personal data are clearly defined, the geographic location of processing is specified, and tools are provided for consent management (Article 21). Finally, the regulation specifies compliance monitoring requirements that are similar to those specified for data intermediation services. That is, the member states must designate specific competent authorities for the registration of data altruism organizations (Article 23) and for monitoring their compliance (Article 24).

Coordination at the EU level is specified to occur through the establishment of a specific European Data Innovation Board. It is designed to operate in cooperation with the EU-level data protection and cyber security institutions, the new competent national public authorities, envoys of SMEs, and other related bodies with relevant expertise, including academic institutions and civil society groups (Article 29). A final important point is about data transfers to countries outside of the EU. Somewhat similarly to the GDPR, such transfers should be prevented when possible, but they are still allowed based on international agreements such as mutual legal assistance treaties (Article 31). Given the ongoing controversies and legal cases (see, e.g., Jurcys et al., 2022), it seems safe to assume that the Data Governance Act will also be subjected to legal scrutiny regarding potential international data transfers.

3 Reflections

The new Data Governance Act lays down frameworks for data reuse and data altruism under the supervision of competent public sector authorities. When compared to other recently enacted laws, such as the Digital Services Act and the Digital Markets Act, the legal motivation and background for the DGA are different. In terms of EU law, perhaps the closest reference point is Directive (EU) 2019/1024 on open data and reuse of public sector information. As noted in recital 9 of the DGA, the member states are encouraged to follow the directive’s principle of “open by design” also with the new regulation. Another related law, as noted in the DGA’s recital 7, is the Regulation (EU) 557/2013 for statistical micro-data research. Furthermore, the new regulation resembles national laws already enacted by the member states. For instance, Finland passed a law in 2019 for the secondary use of public sector social and healthcare data (Footnote 5). It shares many similarities with the DGA; the goal is to promote scientific research and innovation, there is a specific national agency that handles the delivery of data and its anonymization, and so forth.

The regulation further resembles the initiatives taken by non-governmental organizations and civil society groups. Notably, the so-called MyData initiative (Footnote 6), which has been actively promoted by data and privacy activists (Lehtiniemi, 2017; Lehtiniemi & Haapoja, 2020), shares similarities with the regulation and particularly its concept of data altruism. Analogous information systems for personal data management, data governance, and data altruism have recently been presented also in academic research (Zichichi et al., 2022). However, these initiatives, information systems, and the DGA all seem problematic in that they rely on consent for the sharing and processing of personal data.

Although in Europe consent builds upon the foundational concept of informational self-determination, the use of consent as a legal basis for processing personal data has long been criticized from empirical, legal, and technical perspectives (Custers et al., 2013; González & de Hert, 2019; Hjerppe et al., 2023). Good practical examples would be the complex 20,000-word bulletproof legalese documents on one hand and the vague click-through banners used in the current World Wide Web on the other (Lahtiranta & Hyrynsalmi, 2018; Lundgren, 2020). In other words, consumers and users of digital applications and services do not really understand what they are consenting to, even though the GDPR’s Article 4 explicitly states that “consent means any freely given, specific, informed and unambiguous indication of the data subject’s wishes by which he or she, by a statement or by a clear affirmative action, signifies agreement to the processing of personal data relating to him or her”. According to skeptical viewpoints, there is little reason to believe that things would be different for the noble goal of data altruism (Ruohonen, 2021). Similar points have been raised also regarding the reuse of public sector data for which the conditions for consent are often different (McKeown et al., 2021). In particular, it is difficult to specify the purpose of processing personal data at the time of initial data collection in a context that involves further processing (Mantelero & Vaciago, 2015). To some extent, the European politicians and lawmakers seem to have been aware of these issues, given that Article 25 in the DGA mentions the development of a specific European data altruism consent form. However, it can be challenging for data altruism consent to fully comply with the GDPR’s consent requirements, as reaching the full potential of the data economy requires flexibility in processing activities.

The DGA also raises other concerns about data protection and the GDPR. Three such concerns deserve a brief discussion. First, the DGA seems to conflict with some of the fundamental principles of the GDPR. In particular, the GDPR’s Article 5 explicitly states that personal data should only be collected for “specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes”. Although the same article specifies that this purpose limitation does not apply to public interest data archiving, scientific research, and statistical applications, the DGA’s goal of public sector data reuse still raises a concern about whether personal data collected by public sector bodies will be used in a manner which is unexpected or risky to the data subjects. Given that the GDPR does not apply to anonymized data, the DGA’s provision for data reuse under the GDPR’s purpose limitation rests upon proper anonymization. The second concern follows. As is well-known, there are efficient algorithms for de-anonymization and re-identification of data subjects (Henriksen-Bulmer & Jeary, 2016; Narayanan & Shmatikov, 2008; Rocher et al., 2019). The efficiency of such algorithms is likely to only increase with advances in machine learning and artificial intelligence. Hence, it remains debatable how well the state-of-the-art privacy-preserving methods mentioned in the DGA can prevent de-anonymization and re-identification attempts. This concern applies equally to non-personal data held by public sector bodies under commercial confidentiality (Kapoor & Nanda, 2021). The last concern is about national data protection authorities, whose duties seem to substantially increase with the DGA. For instance, according to the DGA’s recital 15, prior to granting access for reuse of data, public sector bodies should carry out data protection impact assessments and consult data protection authorities in line with the GDPR’s Articles 35 and 36. Such consultations also cover questions about anonymization. The DGA also mentions, in recital 26, that the new competent bodies for monitoring intermediation services and data altruism organizations do not have a strict supervisory function, which is reserved for data protection authorities. Given the resourcing, coordination, and other problems already faced by European data protection authorities (Ruohonen & Hjerppe, 2022), a concern remains about how well the DGA will be administered and enforced. The problems with the GDPR’s enforcement provide an alarming precedent.
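The re-identification concern can be made concrete with a minimal Python sketch of a linkage attack, in which a released dataset without names is joined to auxiliary data on shared quasi-identifiers. This is an illustration under stated assumptions, not a method taken from the cited studies: both datasets are invented, and the names `quasi_key` and `link` as well as the choice of quasi-identifiers are hypothetical.

```python
from collections import defaultdict

# Hypothetical released dataset: names removed, quasi-identifiers kept.
RELEASED = [
    {"birth_year": 1985, "postcode": "20100", "sex": "F", "diagnosis": "asthma"},
    {"birth_year": 1985, "postcode": "20100", "sex": "F", "diagnosis": "flu"},
    {"birth_year": 1962, "postcode": "33100", "sex": "M", "diagnosis": "diabetes"},
]

# Hypothetical auxiliary data an attacker might hold, such as a public register.
AUXILIARY = [
    {"name": "Person A", "birth_year": 1985, "postcode": "20100", "sex": "F"},
    {"name": "Person B", "birth_year": 1962, "postcode": "33100", "sex": "M"},
    {"name": "Person C", "birth_year": 1990, "postcode": "00100", "sex": "M"},
]

# Assumed quasi-identifiers shared by both datasets.
QUASI_IDENTIFIERS = ("birth_year", "postcode", "sex")


def quasi_key(record):
    """Build the join key from the quasi-identifier values."""
    return tuple(record[q] for q in QUASI_IDENTIFIERS)


def link(released, auxiliary):
    """Map names to diagnoses wherever the quasi-identifier match is one-to-one."""
    released_by_key = defaultdict(list)
    for row in released:
        released_by_key[quasi_key(row)].append(row)

    people_by_key = defaultdict(list)
    for person in auxiliary:
        people_by_key[quasi_key(person)].append(person)

    matches = {}
    for key, people in people_by_key.items():
        rows = released_by_key.get(key, [])
        # Only an unambiguous one-to-one match counts as a re-identification.
        if len(people) == 1 and len(rows) == 1:
            matches[people[0]["name"]] = rows[0]["diagnosis"]
    return matches
```

For the invented data, `link(RELEASED, AUXILIARY)` returns `{"Person B": "diabetes"}`: Person A matches two released rows and thus remains ambiguous, while Person B is singled out even though the released data contains no names.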

A further concern relates to the data intermediation services specified in the regulation. Here, it remains unclear whether the existing Big Tech companies are allowed to act as data intermediation services, and how it is possible to ensure that such companies only provide data sharing without attempts to use the data exchanged. Analogous concerns have already been raised in the context of European cloud computing initiatives (Sheikh, 2022). Nor does the regulation answer the question of how SMEs and start-ups can compete against Big Tech companies in providing data intermediation services. Furthermore, there are potential issues related to other functionalities provided by companies acting as data intermediation services. According to recital 33 in the DGA, a structural separation is needed to avoid conflicts of interest; data intermediation services should be provided through a legal person that is separate from other activities of a given data intermediation service provider.

It remains to be seen how data altruism plays out together with the not-for-profit data altruism organizations. The regulation is rather vague in this regard, mainly emphasizing data protection, trust, and the idea of data repositories for scientific and related purposes. Here, the GDPR generally acts both as a barrier and as an enabler for data sharing and data reuse (Vukovic et al., 2022). In general, the DGA further increases the regulatory complexity with regard to personal data processing, particularly in the context of scientific research. According to some surveys, Europeans have generally positive attitudes toward reuse of their healthcare data, but still have concerns about commercialization, security, and misuse of reused data (Skovgaard et al., 2019). Similar results presumably apply also to voluntary data sharing for not-for-profit purposes. Finally, there are two related regulations in the making: the so-called Data Act and a proposal for the creation of European health data spaces (European Commission, 2022b; European Commission, 2022a). The former augments the DGA with a goal of further data sharing, portability, and interoperability, also providing users with means to gain access to data generated by them. It may also solve some of the challenges raised in this commentary. The latter continues the same theme; the goal is to empower people by giving them access to their health data in their home country or any other member state. In addition, the idea is again to strengthen the single market for digital health services and products through health data sharing across the member states. Given the sensitivity of health data, concerns over confidentiality and security are graver with this proposal than with the DGA and the Data Act.