
1 Introduction

Graduate tracking for tertiary education has moved up the European Union’s policy agenda in recent years. A Recommendation of the Council of the European Union sets 2022 as the deadline for the Commission to report on the implementation of graduate tracking (The Council of the European Union 2017). The initiative builds on various policy goals and agendas which highlight the importance of a solid evidence base for evaluating the employability of tertiary education graduates, such as the standards for quality assurance in higher education (ENQA 2009) and vocational education (European Council and European Parliament 2009) or the European Commission’s Communication “New Skills Agenda for Europe” (European Commission 2016).

The Recommendation indicates two approaches for data collection: administrative data, also referred to as register/registry/registrar data, and questionnaires. The 2020 report on the state of graduate tracing in the European Union (Beadle et al. 2020), an exercise requested in the text of the Recommendation, adds another source: big data. Overall, the report indicates that questionnaires and administrative data are the main sources of information for graduate tracing, while big data is still an emerging instrument.

We focus in this paper on register data, which builds on the possibility “to link, on an anonymised basis, data from different sources, in order to build a composite picture of graduate outcomes” (The Council of the European Union 2017). Such systems, or in some cases such analyses, are reported to be institutionalized in most EU countries (and other countries included in the Report), with the exception of Croatia, Cyprus, France, Greece, Malta, and Romania, according to the 2020 report (Beadle et al. 2020). Even though it is claimed that the secondary use of administrative data is less intrusive than surveys (Crato and Paruolo 2018, p. 4), the personal data protection regime in the European Union was reported to be an obstacle to linking the administrative data sources needed to construct a graduate tracing system using this approach (Beadle et al. 2020, p. 21). The administrative data needed for an analysis of the labor outcomes of tertiary education graduates are generally collected by different entities and stored in different databases. They need to be linked and queried for a secondary use, one that differs from the purpose for which they were originally collected. Of course, this process is governed by personal data protection restrictions. Afterwards, micro-data would need to be made available, preferably in a de-identified format, to an analyst who would compute statistics and provide a report. Researchers claiming access to micro-data have long been regarded as ‘intruders’ by registrars (Jackson 2018). A new thrust was given to the culture of statistical confidentiality and micro-data access, especially after 2007: “to enable conditional gateways through the non-disclosure laws and policies that apply to statistical and other government outputs derived from personal records” (ibid, p.20).

The General Data Protection Regulation (GDPR) was enacted in the European Union to stop the overuse of personal data by private operators, but it poses the risk of underusing administrative data as key public infrastructure for evidence-based policy and research (Crato and Paruolo 2018; Świȩcicki 2019). Our contribution focuses on the interaction between the establishment of graduate tracer studies as a policy instrument and GDPR as a new source of constraints and opportunities.

We begin by laying the conceptual grounds for this discussion in the next section. We follow it with an empirical section in which we analyze how tracer studies function in two different cases, focusing on the intersection between the policy process and GDPR. The first case we study is Sweden, a country with a functional graduate tracer at the time when GDPR was enacted. The second case is Romania, a country in which the adjustment of personal data legislation coincided with the start-up phase of graduate tracer studies’ development. We conclude by outlining the two approaches we have identified for overcoming the obstacles that data protection legislation raises for tracer studies. As GDPR is an EU-specific issue, our discussion is EU-specific and includes actionable solutions for decision-makers with a legitimate interest in developing graduate tracing studies.

2 Employability, Employment, and Personal Data Protection: A Conceptual Discussion

Employability and employment have been thoroughly and diversely defined in the academic literature; see Nilsson (2017) for an overview. An excursus into the concept and history of the terms would diverge from the pragmatic approach we promised in this paper. Thus, we will cherry-pick some perspectives for a working representation of the concepts we consider necessary for the discussion on policy, registry data, and personal data protection.

In quantitative sociology, employment is modeled as a search process determined by “individual’s own resources, [...] the resources of all others in the job market and upon the available jobs” (Coleman 1991, pp. 4–5). “[W]orkers and jobs possess resource” (Coleman 1991, p. 5) which make them desirable for their counterparts. Leaving aside structural determinants (for an overview, Nilsson (2017) is again an inspired companion), one of the key resources of individuals looking for employment is (considered to be) formal education. Economists have been interested mainly in the monetary returns to education, though more recent accounts also focus on non-monetary returns to education (McMahon 2009). One of the key findings of the Nobel-winning human capital theory is that college education pays off in the labor market (Becker 1993). The relationship between schooling and earnings has been translated into an influential model by Mincer (1974). He explains wages mainly through schooling and experience.Footnote 1
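
For orientation, the Mincer earnings function is usually written in the following textbook form. The symbols are our own notational assumptions and do not come from the sources cited above: \(w_i\) is the wage of individual \(i\), \(s_i\) the years of schooling, \(x_i\) the years of labor market experience, and \(\varepsilon_i\) an error term.

\[
\ln w_i = \beta_0 + \beta_1 s_i + \beta_2 x_i + \beta_3 x_i^2 + \varepsilon_i
\]

In this form, \(\beta_1\) is read as the approximate proportional wage gain associated with one additional year of schooling, while the quadratic term in experience allows earnings to grow at a decreasing rate over a career.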

Current (quantitative) research on the match between education and labor outcomes is facing a methodological challenge: how to isolate causal relations in a setting in which experimental designs, i.e. withholding education from a “control” group in order to estimate the effect of tertiary education on another group, termed ‘treated’, would be unethical (Crato and Paruolo 2018)? We also approached (with different authorships) this matter in more technical terms elsewhere (Crăciun et al. 2020; Orosz et al. 2020). Here, we will use an example to illustrate its stakes: a boy is raised in a family in which the father earns enough to pay not only for the basic needs, but also for private classes. The mother finds time and finesse to keep him focused on his education. The boy is among the 41.6%Footnote 2 of the children of his age group who graduate high school and pass the baccalaureate exam. Then he goes to higher education, graduates, and finds a job, benefitting not only from a fine education but also from a network of acquaintances and collaborators of his parents. Of course, this is a success story, and the boy will reward his schools in statistics, but the question is, in our opinion, what is the impact of his university education on his career? The story is specific to the Romanian context, its gendering being intentional.

However, the public and the policymakers are less interested in such questions and are keener on crude evidence of what happens with whole cohorts of graduates, grouped in terms of likeness: graduates of ICT, of arts, of a certain university or a certain study program. Of course, such an approach, in most cases, omits the contribution of non-random selection into higher education trajectories. The interest in the employability of tertiary education graduates, i.e., their gain from formal schooling, measured as their employment status, is motivated in relation to structural changes in our societies. Such structural changes are used as a justification for urgency, which seems to be a rhetorical artifice used in most official texts we consulted on the topic, and not only on this topic. The Council of the European Union relates its interest in the employability of tertiary education graduates to the incomplete recovery from the 2008 financial crisis. Nilsson (2017) relates it to the expansion of tertiary education (in Sweden in the 1960s and 1970s), though such concerns about “intellectual unemployment” were articulated decades, almost a hundred years, earlier; see, for example, the discussion regarding the “proletariat of the pen” in 1880s Austria (Janos 1978, p. 108). This narrative was also imported into Romania via Mihai Eminescu, the national poet of Romania, who borrowed it in his editorial writings (Eminescu 1984).

Noting that the employment of university graduates has been an issue of concern [for some] at least since the 19th century, we will add an operational definition of labor market outcomes to the conceptual references to employment and employability we provided in the previous paragraphs. According to the data collected by Beadle et al. (2020), labor market effects are quantified using indicators pertaining to: “employment status (employed, full-time, part-time, unemployed, self-employed, in maternity leave, etc.), NACE [economic sector] code of employer, duration of employment/unemployment, length of job search, salary level, geographical/sectoral mobility, job history, [...], location of work [...]”Footnote 3 (p. 29) and “classification of occupations and/or skills on the basis of ILOSTAT, ESCO/ISCO” (p. 62).

As these indicators come from multiple sources, mainly registers for population, social security, education achievements, unemployment, tax, and European Social Fund beneficiaries (Beadle et al. 2020, p. 31), and imply the processing of personal data, they must comply with specific restrictions. The previously quoted authors identify a set of “common requirements”: “anonymization of personal data”, “aggregation of data for too small groups”, “access to data only for accredited people”, “access for researchers who want to work with data via secure data centers or secure work rooms”, “data handling and storage” (p. 33). Most of these precautions are determined by compliance with the current legislation regarding the protection of personal data, which represents the transposition of the GDPR into domestic law by individual member states (European Commission 2021).
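
To make the requirement of “aggregation of data for too small groups” more tangible, the sketch below counts graduates per group and withholds any count below a threshold. It is a minimal illustration under our own assumptions: the function name, the field name `program`, the sample records, and the threshold of 5 are all hypothetical and are not taken from Beadle et al. (2020) or from any statute.

```python
from collections import Counter


def aggregate_with_suppression(records, group_field, min_group_size=5):
    """Count records per group and hide counts for groups that are too small.

    min_group_size=5 is an illustrative assumption, not a legal threshold.
    Suppressed groups are reported with None instead of a count.
    """
    counts = Counter(record[group_field] for record in records)
    return {
        group: (count if count >= min_group_size else None)
        for group, count in counts.items()
    }


# Hypothetical de-identified graduate records.
graduates = (
    [{"program": "Computer Science"}] * 12
    + [{"program": "Musicology"}] * 2
)

table = aggregate_with_suppression(graduates, "program")
```

In this toy example, the count for the larger program is published, while the program with only two graduates is reported without a figure, since a very small cell could allow individuals to be re-identified.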

The European Union’s view on data protection is closely linked to privacy issues. The privacy concept, as outlined in Art. 8 of the European Convention on Human Rights, refers mainly to the right to respect for private and family life, home, and correspondence. According to Salecl (2002, p. 8), a universally accepted definition of the term ‘privacy’ is bound to face major difficulties, intrinsically linked to the different cultural renditions of what privacy is. The present “information age” adds another dimension to the cultural and anthropological definitions of what should be kept away from prying eyes, namely access by “the public” to numerous “private aspects”, to which individuals unwillingly fall prey due to the massive spread of state-of-the-art technology in recent decades (Vertes-Olteanu and Racolta 2019, p. 122).

The European concept of privacy “as a form of protection of a right to respect a personal dignity” differs from the American conception of privacy as a form of “liberty against the State”, or “the right to Freedom from intrusion by the state, especially in your own home” (Whitman 2004, p. 1161). Information privacy is “an individual’s claim to control the terms under which personal information—information identifiable to the individual—is acquired, disclosed, and used”, as defined in Principles for Providing and Using Personal Information (“IITF Principles”) (The Privacy Working Group 1995). Not surprisingly, the key component of information privacy is the term personal information, in other words, information identifiable to the individual.

In Europe, information privacy has been recognized for a long time, or at least since the European courts began to recognize a right to informational self-determination. The term “informational self-determination” was used for the first time in the context of a decision of the German Constitutional Court regarding the personal data collected upon the occasion of the 1983 census, the German term being “informationelle Selbstbestimmung” (Frosio 2017, p. 313).

However, data protection values are not exclusively privacy-related but partially autonomous, and they ground a fundamental right of their own: the right to data protection, as recognized by Article 8 of the Charter of Fundamental Rights of the European Union: “Protection of personal data: Everyone has the right to the protection of personal data concerning him or her. Such data must be processed fairly for specified purposes and on the basis of the consent of the person concerned or some other legitimate basis laid down by law. Everyone has the right of access to data which has been collected concerning him or her, and the right to have it rectified”. We stress that consent and “some other legitimate basis” are joined by a disjunction; in simple terms, there are legitimate grounds for processing personal data other than the informed consent of the subject, such as the public interest.

As such, data protection can be understood as the right of a person to know which data is gathered about them, how the data is used, aggregated, and protected, and where the data is transmitted. The right to informational self-determination is a major achievement in recognizing users’ rights, later included in Article 12 (b) of the Data Protection Directive (The Privacy Working Group 1995) by the rule that allows the data subject to request the operator to “rectify, erase or block data the processing of which does not comply with the provisions of this Directive, in particular because of the incomplete or inaccurate nature of the data” (Frosio 2017, p. 314). The Data Protection Directive, the first important step in the recognition of data protection by the EU legal framework, granted and protected both the free movement of personal information and the fundamental rights and freedoms of the individual.

The newly coined “right to be forgotten”, enshrined in the already famous Regulation (EU) 2016/679 of the European Parliament and of the Council, only translated the right to informational self-determination into the digital domain. The Court of Justice of the European Union had already clarified, under the earlier Directive 95/46/EC, that search engines perform data control and must therefore be considered “operators” within the meaning of its Article 2 letter (d), and must thus comply with the provisions of the Directive. The right to informational self-determination empowers individuals against data processing entities, such as advertisers, insurers, supermarkets, Big Pharma, and data brokers, by guaranteeing the “authority of the individual in principle to decide for himself whether or not his personal data should be divulged or processed” (Rouvroy and Poullet 2009, p. 45).

Any systematic handling of data corresponds to the notion of ‘processing’ under the material scope of the GDPR. Data means electronically stored information, signs or indications. However, data has to be “personal” in order to fall within the scope of application of the Regulation. Data is deemed personal if the information relates to an identified or identifiable individual, that is, if a person can be identified, directly or indirectly, by reference to an identifier. This is the case if the data can be assigned to one or more characteristics that express a physical, physiological, psychological, genetic, economic, cultural or social identity. Examples include a person’s name; identification numbers, such as a social insurance number, a personnel number or an ID number; location data; and online identifiers (which may involve IP addresses or cookies).

We hereby exemplify with several definitions of personal data according to EU court decisions:

  • the name of a person in conjunction with his/her telephone number and information about his/her working conditions or hobbies constitute personal data (C-101/01. (2003). Sweden v. Bodil Lindqvist 2003).

  • the information published in the press release was personal data, since the data subject was easily identifiable, under the circumstances (C-101/01. (2007). Nikolaou v. Commission 2007). The fact that the applicant was not named did not protect her anonymity.

  • the surname and given name of certain natural persons whose income exceeds certain thresholds, in conjunction with the amount of their earned and unearned income, constitute personal data (C-73/07. (2008). Tietosuojavaltuutettu [Finnish Data Protection Ombudsman] v. Satakunnan Markkinaporssi OY and Satamedia OY 2008).

  • transferred tax data are personal data, since they are “information relating to an identified or identifiable natural person” (C-201/14 Smaranda Bara and Others v. Presedintele Casei Nationale de Asigurări de Sănătate, Casa Natională de Asigurări de Sănătate, Agentia Natională de Administrare Fiscală (ANAF), 2015).

For the proper protection of such data and their subsequent processing, the GDPR sets out stricter requirements for obtaining valid consentFootnote 4 (especially for the processing of special categories of personal data).

However, the GDPR institutes a set of derogations that can constitute grounds for processing administrative data for research and evidence-based policy. One of these avenues is the statutory permission under Art. 6 of GDPR. In our case, the processing falls under letter (e) of Article 6: processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller. The processing should have a basis in EU or EU Member State law, and the law in question need not be a legislative act adopted by parliament (Recs. 41, 45 GDPR). Nevertheless, the legal basis should be clear and precise, and its application should be foreseeable to persons subject to it. Such a law might cover multiple processing operations at the same time (Voigt and von dem Bussche 2017, p. 107). Another avenue for grounding access to register data was outlined by Trivellato (2018, p. 32):

The main exemption is in Article 5(1b), which states that ‘further processing for scientific research purposes [of data collected for other specified, explicit and legitimate purposes] shall, in accordance with Article 89(1), not be considered to be incompatible with the initial purposes’, [...] where Article 89(1) stipulates that ‘processing for scientific research purposes shall be subject to appropriate safeguards, in accordance with this Regulation, for the rights and freedoms of the data subject’.

To conclude, the so-called “right to be forgotten”, as developed by the EU Court of Justice in the 2014 “Google Spain” case (C-131/12, Google Spain SL and Google Inc. v. Agencia Española de Protección de Datos (AEPD) and Mario Costeja González, 2014), puts a spotlight on the right of individuals to exercise control over their personal data, by deciding what information should be accessible to the public through search engines. However, it brings nothing new to the table, apart from the heated discussions around the possibilities of its use. Moreover, we strongly believe that its overall effectiveness was in fact limited, given that the GDPR contains much stricter provisions (coupled with broader and stronger exceptions and limitations) than did the Data Protection Directive or the now-famous “Google Spain” decision, which held that the right to be forgotten could be exercised when the data proved to be “inadequate, irrelevant or no longer relevant, or excessive in relation to the purposes for which they were processed and in the light of the time that has elapsed”. The Directive provided “exemptions or derogations [...] for the processing of personal data carried out solely for journalistic,” artistic, or literary purposes, but these exceptions may be used only when “necessary to reconcile the right to privacy with the rules governing freedom of expression.” The GDPR more generously instructs Member States to “reconcile the right to the protection of personal data [...] with the right to freedom of expression and information, including processing for journalistic purposes and the purposes of academic, artistic or literary expression.”

The right to be forgotten, or the right to erasure, is (like the majority of rights) not absolute, and it only applies in certain circumstances. Controllers, as is the case with graduate tracking, may process personal data if “processing is necessary for the performance of a task carried out in the public interest”. Furthermore, necessity is interpreted in light of proportionality: the data processed must have a close link to the attainment of the processing’s objectives. National law, for example, may specify that certain entities are able to rely on the public interest legal basis, or that processing necessary for scientific research may rely on the public interest legal basis, but with additional safeguards. Relying on this legal basis also allows for potentially curtailing the right to object. Following this line of thought, graduate tracking does benefit from the legal basis of public interest, both from a national perspective, as per the National Strategy for Tertiary Education 2015–2020 (Ministerul Educatiei Nationale si Cercetării Stiintifice, 2015), the National Reform Programme (The Government of Romania 2020), and the strategic institutional plan of the ministry responsible for education (The World Bank 2019), and from a European one, as per the Council Recommendation of 20 November 2017 on tracking graduates (2017/C 423/01).

This is also the reason why, in the case of graduate tracking, the controller is exempt from the obligation to inform data subjects of their rights to object to processing. According to Rec. 156 and Art. 21(6) GDPR, where personal data are processed for scientific and historical research purposes or statistical purposes, the data subject has the right to object, unless the processing is necessary for the performance of a task carried out for reasons of public interest (Voigt and von dem Bussche 2017, p. 179). Such an exemption in no way represents an opportunity to avoid or bypass the GDPR provisions, but rather a way to make the requirements practical and flexible when they would otherwise be impossible to execute or would involve a disproportionate effort.

3 Case Study 1: Sweden

For Sweden, Beadle et al. (2020) list seven instruments for tracing graduates on the labor market. Two instruments monitor the employment of tertiary education graduates using solely administrative data and target the general population of students: “Establishment on the labor market after higher education—Etablering på arbetsmarknaden efter högskolestudier” and “Establishment on the labor market after higher vocational education—Etablering på arbetsmarknaden efter kvalificerade yrkesutbildningar och yrkeshögskoleutbildningar”. We will briefly describe the instrument tailored for higher education graduates. The National Agency for Higher Education, the predecessor of the agency currently listed in the European Register for Quality Assurance for SwedenFootnote 5 (Swedish Higher Education Authority, UKÄ), was commissioned by the government to monitor higher education graduates’ “establishment” on the labor market (Nilsson and Viberg 2015). According to the same authors, the first report was issued in 2003 and covered the graduates from academic years 1994/95 to 1999/2000. The reports were compiled as narrative analyses and were issued yearly, at least until 2015. Currently, UKÄ includes a short section on the link between higher education and employment in the annual status report on higher education.Footnote 6

The main source of data for the analyses is LISA, the Longitudinell Integrationsdatabas för Sjukförsäkrings- och Arbetsmarknadsstudier [the Swedish Longitudinal Integrated Database for Health Insurance and Labour Market Studies] (Nilsson and Viberg 2015). According to Ludvigsson et al. (2019), LISA “was launched [in 2003] in response to rising levels of sick leave in the country” (p. 423). LISA includes “the Education Register, Register of Income and Taxation, Occupation Register, [...], Structural Business Statistics from Statistics Sweden, the Swedish Social Insurance Agency, and the Swedish Public Employment Service”, in conjunction with general databases on population (p. 424). Most of the links are made via the personal identity number (PIN). LISA’s coverage is national: “it includes all individuals aged ≥ 15 years (≥ 16 years between 1990 and 2009) and living in Sweden” (p. 433). LISA is compiled by SCB (Statistics Sweden), the national statistics office.Footnote 7

However, anybody in Sweden could provide an alternative graduate tracing mechanism, provided they have the skills to do it and prove a legitimate interest in it. MONA (Microdata Online Access) is a research infrastructure for humanities and social science through which researchers in Sweden can get access to microdata generated from registries.Footnote 8 It was set up in 2004 due to claims “that data provided to researchers were used for other purposes or by other users than those authorized” (Swedish Research Council 2014, p. 11). The process starts with a request for disclosure of data for research purposes, accompanied by “a detailed description of the project, and in some cases a supportive judgment of an Ethical Review Board” (Swedish Research Council 2014, p. 12). The request is put to a “harm test” which consists of making sure that “individuals, or someone close to them, will not suffer injury or harm” (ibidem). The micro-data is “usually” de-identified; if needed, i.e. for longitudinal studies, a “code key” is saved to allow updating or supplementing the data at a later stage. This process of licensing is delegated to state universities and governmental authorities for requests coming from their staff.
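
The role of a “code key” in de-identification can be sketched as follows. This is only an illustration of the general idea under our own assumptions, not a description of how Statistics Sweden or MONA implement it: the function `deidentify`, the field names `pin` and `study_code`, and the sample records are hypothetical.

```python
import secrets


def deidentify(records, id_field="pin", code_key=None):
    """Replace the personal identifier with a random study code.

    Returns the de-identified records and the code key (identifier -> code).
    The code key stays with the data custodian and is not given to the analyst.
    Passing an existing code_key reuses earlier codes, so that a later extract
    can supplement the same study.
    """
    code_key = {} if code_key is None else code_key
    released = []
    for record in records:
        identifier = record[id_field]
        if identifier not in code_key:
            code_key[identifier] = secrets.token_hex(8)
        clean = {k: v for k, v in record.items() if k != id_field}
        clean["study_code"] = code_key[identifier]
        released.append(clean)
    return released, code_key


# Hypothetical extracts for two consecutive years.
wave_2019 = [{"pin": "A1", "income": 100}, {"pin": "B2", "income": 200}]
wave_2020 = [{"pin": "A1", "income": 120}]

released_2019, key = deidentify(wave_2019)
released_2020, key = deidentify(wave_2020, code_key=key)
```

Because the second call reuses the saved key, the same person carries the same study code in both extracts, which is what makes a longitudinal analysis possible without the analyst ever seeing the identifier.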

The relationship between Statistics Sweden and the end-user of the microdata is governed by GDPR: “the recipient is the personal data controller for the personal data that they process, while Statistics Sweden is a personal data processor”.Footnote 9 The two parties sign a “personal data processing agreement”. The administrators of MONA do not follow up on the research performed by the end-users, nor on the actions they perform in order to comply with GDPR. According to our email communication, the end-users are fully responsible for their actions, and hence also for the way they understand and put into practice issues arising from GDPR, such as subjects’ “right to be forgotten”, discussed in the conceptual part of the paper. The system does not offer functionalities for communicating with the subjects, such as a button that sends an information note to the subjects in a sample whose data was extracted from the database. According to our informal discussions with researchers using the system, the only thing that changed for the end-users of MONA with the enactment of GDPR was an update of the personal data processing agreement.

In Sweden, the state took an active role in interpreting the legislation in order to facilitate the use of data collected for administrative purposes for research and policy purposes, such as tracing tertiary education graduates. The databases are linked by state agencies. Generally, the system is predictable, and individual and corporate actors, such as universities, state agencies or private entities, have clearly assigned roles. Administrative information is treated rather as a public good and a critical research infrastructure. The literature indicates other countries with a similar approach to allowing access to micro-data: Administrative Data for Research—ADRP in the UKFootnote 10 and ELA in PolandFootnote 11 (Bozykowski et al. 2019 apud Świȩcicki 2019).

4 Case Study 2: Romania

In terms of tracer studies designed for university graduates, Romania is in the stage of developing “a platform aimed at interconnecting student databases and other national databases with relevant information on employees” (Beadle et al. 2020, p. 19). However, the report does not include detailed information on such initiatives coming from universities. We will fill in this gap by telling the story of the transition from a tracer study of the West University of Timisoara (Proteasa et al. 2018) to the development of a platform designed to offer employment indicators for the entire higher education system, as mentioned in the status report cited above (Beadle et al. 2020). The facts will constitute anchors for ordering considerations on the limitations and opportunities brought by the enactment of the GDPR. We were both actively involved in the two initiatives.

The primary source for employment data was the (electronic) register of employees in Romania (ReviSal), which was instituted as mandatory for all employers (with some exceptions) in 2006. It is administered by a state agency named Work Inspection, which has been specifically authorized since 2017 to grant access to micro-data from the register to public entities, provided their by-laws specify such an entitlement (HOTĂRÂRE 905 14/12/2017 Privind Registrul General de Evidentă a Salariatilor, 2017). This specification adds to the derogations from the GDPR, which expand, in our opinion, the list of possible derogations instituted in the previous legislation. Thus, from the legal perspective, the opportunity structure for entrepreneurial action in terms of linking the employee register with student registers can be characterized as rather expanding. It was not matched by action from central authorities until the development of the platform mentioned by Beadle et al. (2020) was initiated. The empty seat in the policy arena has thus become the subject of entrepreneurial action from other actors that could claim a legitimate basis for accessing employment micro-data: the universities. We cannot establish how many universities managed to get access to employment micro-data to link them to their student registers. We are aware of two. One of them is the initiative of the West University of Timisoara, which developed a platform that matches data and provides access to employment indicators calculated for cohorts, and which will be scaled up to the national level in the platform mentioned by Beadle et al. (2020). Another university in Timisoara seems to have accessed employment micro-data, but we could not find a report in the public domain, only media statements (Redactie [The Editors], 2019; Unspecified author, 2021).

The legitimate basis for claims of access to micro-data from the employees’ register can be constituted by the obligation of the rector to report annually on “the state of professional insertion of the graduates from the previous promotions”, as part of the “public accountability” of the universities (Legea Educatiei Nationale [Law on National Education], 2011, Art 130). The press releases of the Polytechnic University of Timisoara can be considered as such, but at the same time, they can be considered as part of the marketing efforts of the university, thus serving private interests.

The claim of the West University of Timisoara was grounded on the derogations from the data protection legislation: processing personal data for the purpose of statistical or historical research, before GDPR, and processing personal data for public utility purposes, after the enactment of GDPR. The two registers were linked via the personal identification number, the same as for Sweden’s LISA. The platform at the West University of Timisoara delivers indicators on insertion, occupational match, income, employers’ size and economic sector, and internal migration flows determined by the transition from secondary education to tertiary education (Proteasa et al. 2018); of course, the coverage of the data is limited to the students enrolled at the University. It is user-driven in the sense that it allows the users to select and aggregate the cohorts they are interested in via filters for academic years and study programs attributed to fields of specialization and faculties. We cannot account for how many times it was used for substantive aspects of quality assurance, such as updating the curriculum in light of the occupational destinations of the graduates, but we can document its use for procedural aspects of quality assurance related to study program accreditation and the listing of study programs in the national register of qualifications. Both procedures are imposed through “Member State law”, and the graduate tracing mechanism we discuss does “provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject” in order to “respect the essence of the right to data protection” (GDPR, Article 9(2j)), a derogation indicated by Trivellato (2018). In our opinion, formal quality assurance procedures substantiate the claims of public utility beyond the public accountability of the rector stipulated in the Law of Education.
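
The logic of linking a student register to an employee register and deriving a cohort-level insertion indicator can be sketched as below. This is a simplified illustration under our own assumptions and does not reproduce the platform described above: the function name, the field names (`pin`, `graduation_year`, `program`, `employer_sector`), and the sample records are hypothetical, and “insertion” is reduced here to having at least one record in the employee register.

```python
from collections import defaultdict


def employment_rate_by_cohort(students, employees, id_field="pin"):
    """Link graduates to the employee register via a shared identifier and
    compute, per cohort, the share of graduates with an employment record.

    A cohort is defined here as a (graduation year, study program) pair.
    """
    employed_ids = {row[id_field] for row in employees}
    totals = defaultdict(int)
    employed = defaultdict(int)
    for student in students:
        cohort = (student["graduation_year"], student["program"])
        totals[cohort] += 1
        if student[id_field] in employed_ids:
            employed[cohort] += 1
    return {cohort: employed[cohort] / totals[cohort] for cohort in totals}


# Hypothetical extracts from the two registers.
students = [
    {"pin": "001", "graduation_year": 2016, "program": "Sociology"},
    {"pin": "002", "graduation_year": 2016, "program": "Sociology"},
    {"pin": "003", "graduation_year": 2016, "program": "Physics"},
]
employees = [
    {"pin": "001", "employer_sector": "P"},
    {"pin": "003", "employer_sector": "M"},
]

rates = employment_rate_by_cohort(students, employees)
```

Only the aggregated rates leave the function; the identifier is used solely for the match, which mirrors the idea that users of such a platform see cohort indicators rather than individual records.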

The reservations regarding the use of administrative data for graduate tracing invoke, formally or informally, the “Bara” case (C-201/14 Smaranda Bara and Others v. Presedintele Casei Nationale de Asigurări de Sănătate, Casa Natională de Asigurări de Sănătate, Agentia Natională de Administrare Fiscală (ANAF), 2015). According to the judgment of the Court of Justice of the European Union, which resulted from a reference for a preliminary ruling by the Romanian Court of Appeals, persons whose personal data are transferred between two public administrative bodies and then processed must be informed in advance.

To summarize the facts of the case briefly: the applicants earned income from self-employment. Data relating to their declared income were transferred by ANAF (Agentia Natională de Administrare Fiscală, the National Agency for Fiscal Administration) to CNAS (Casa Natională de Asigurări de Sănătate, the National Health Insurance House). The latter sought payment of arrears of contributions to the health insurance regime, based on these data. The applicants challenged the lawfulness of the transfer of tax data relating to their income, alleging that the data were used for purposes other than those for which they had initially been provided to ANAF, without their prior explicit consent and without their having been previously informed.

The question referred to the court was whether personal data may be processed by authorities for which such data were not intended where such an operation gives rise, retroactively, to financial loss. The Court of Justice held that the requirement of fair processing of personal data (Footnote 12) obliges a public administrative body to inform the data subjects that their data will be transferred to another public administrative body, for processing by the latter in its capacity as recipient of those data. The directive expressly requires that any restrictions on the requirement to provide information be imposed by legislative measures. National law required the transfer to CNAS of the data necessary to certify that the person concerned qualifies as an insured person. However, such data do not include data relating to income, since the law recognizes the right of persons without a taxable income to qualify as insured. Therefore, that legal provision cannot serve as “prior information” under Article 10 as far as income data are concerned. Thus, within the meaning of Directive 95/46, the tax data transferred are personal data, since they are “information relating to an identified or identifiable natural person”, and both the transfer of the data by ANAF and the subsequent processing by CNAS constitute processing of personal data. Furthermore, the transfer of data was made on the basis of a protocol between the two authorities (ANAF and CNAS), which is not a legislative measure and is not subject to official publication, thus infringing the conditions stipulated in Article 13.

However, in law, to distinguish a case means to decide that the holding of a prior case, the precedent, will not apply in a subsequent trial because the facts of the two cases are materially different. Accordingly, the Romanian courts deciding cases after “Bara” chose whether or not to apply its holding. Given the direct effect of the CJEU's findings in the national legal order, the precedent was later invoked (Footnote 13) in cases related to the incompatibility between the office of vice-mayor and the quality of being a trader, the state of conflict of interests, tax law (the establishment of VAT), a convention concluded between an autonomous administration (of transport) and the local council of city B., the lack of payment of contributions to health insurance, and the annulment of a decision on tax liability (Sandru et al. 2017).

In the “Bara” case, the decision was mainly based on Article 13(1)(e) of Directive 95/46, which concerns restrictions necessary to safeguard “an important economic or financial interest of a Member State or of the European Union, including monetary, budgetary and taxation matters”. The ruling does not apply when the transmission of personal data is stipulated by law; moreover, as stated above, the processing does not require the law in question to be a legislative act adopted by parliament, meaning that it might also take the form of by-laws or protocols. Furthermore, the processing of data is deemed lawful if it is necessary for compliance with a legal obligation to which the controller is subject or if it is necessary for the performance of a task carried out in the public interest or utility (Voigt and von dem Bussche 2017, p. 107). Such is the case with the tracking of graduates using administrative data, where the goal of the processing is the public interest. In addition, as stated above, processing for scientific research purposes shall, in accordance with Article 89(1) of Regulation 2016/679, not be considered incompatible with the initial purposes. Subject to appropriate safeguards, the Member States may restrict the data subject’s right to object when it comes to the processing of their personal data for scientific, historical or statistical purposes. Recital 159 of the GDPR states that scientific research should be “interpreted in a broad manner” and includes studies carried out in the public interest. In order for the processing to be considered statistical in nature, Recital 162 of the aforementioned Regulation states that the result of the processing should not be “personal data, but aggregate data” and should not be used to support measures or decisions regarding a particular individual.
We find it highly unlikely, if not impossible, that a tracing mechanism, with all safeguard measures taken, would affect in any way the graduates who are subject to data collection.

5 Discussions and Conclusions

Matching employment and education registers represents a promising avenue for graduate tracing at the national and European levels, with both strengths and limitations (Crato and Paruolo 2018, p. 3). Once the links are created and the algorithms for computing indicators are established, electronic platforms could provide access to up-to-date, user-driven, comprehensive, objective, and accurate statistics. Some limitations inherent to administrative data exist: they lack important details that cannot be captured without interviews, and their coverage is bounded by the jurisdiction of the administrative processes through which the data are collected. The set-up costs may be substantial, but the effort required to maintain and update the system would be far lower than for any other instrument currently under consideration.

The development of instruments that use register data to trace tertiary education graduates has to take into consideration the legislation on personal data, and specifically the GDPR. Though most of the data in the registries can be de-identified, the links between databases often require the use of official personal identification numbers, which are personal data. However, solutions have been found: Austria is nominated “for tracking graduates using administrative data [...] with total respect for data privacy regulations” (European Commission 2021, p. 16), which we understand as matching databases without interaction with personal data. In our view, encryption software tools, in conjunction with ingenious designs of the data transactions between the different owners of the databases, can also overcome this obstacle.
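To make the idea of linking registers without exposing identification numbers more concrete, the following is a minimal sketch of one possible design, not a description of any system discussed in this chapter. It assumes that each register owner replaces the personal identification number with a keyed hash (HMAC-SHA256) before handing data to the analyst, and that the secret key is shared among the register owners only. The function names, field names, key, and toy records are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize(pin: str, secret_key: bytes) -> str:
    """Return a keyed hash (HMAC-SHA256) of a personal identification number."""
    return hmac.new(secret_key, pin.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(records, secret_key):
    """Replace the 'pin' field of each record with a pseudonym.

    Assumed to run on the register owner's side, before any data leave it.
    """
    out = []
    for rec in records:
        rest = {k: v for k, v in rec.items() if k != "pin"}
        rest["pseudonym"] = pseudonymize(rec["pin"], secret_key)
        out.append(rest)
    return out


def link(education, employment):
    """Join two de-identified registers on the pseudonym (analyst side)."""
    jobs = {rec["pseudonym"]: rec for rec in employment}
    linked = []
    for rec in education:
        job = jobs.get(rec["pseudonym"])
        if job is not None:
            linked.append({**rec, **job})
    return linked


# Hypothetical key and toy records, for illustration only.
SECRET_KEY = b"held-by-register-owners-only"
education = [
    {"pin": "1900101123456", "programme": "Sociology"},
    {"pin": "2910202123457", "programme": "Physics"},
]
employment = [
    {"pin": "1900101123456", "sector": "Education"},
]

linked = link(
    deidentify(education, SECRET_KEY),
    deidentify(employment, SECRET_KEY),
)
```

In this sketch the analyst only ever sees pseudonyms, yet records belonging to the same person can still be matched because both owners apply the same key. Whoever holds the key could recompute the pseudonyms, so the output is pseudonymised rather than anonymous, and key management would need to be part of the design.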

The alternative would be to put to work the legislation regarding personal data protection. We presented two approaches in the empirical part of the chapter. In the case of Sweden, the general legal framework was complemented by formal assignments of roles and responsibilities in linking registers (LISA) and accompanied by access instruments (MONA). In the case of Romania, tracing was grounded on legislation that pre-dated the adjustment of national law to the GDPR and on the derogations from the GDPR. In the case of Sweden, at least for the events we have covered in this chapter, the model resembles a state-coordinated intervention. In the case of Romania, the events can rather be described in terms of policy entrepreneurialism (Mintrom 2019).

Though the 2016 European Union Regulation on data protection is probably invoked in any discussion regarding the development of graduate tracer studies using register data, we do not consider it an obstacle that cannot be handled with proper legislative and coding skills. In the case of Sweden, where a system was built for granting access to register data for research purposes, its functioning was largely unaffected in substance by the enactment of national legislation following the adoption of the EU’s GDPR. We argued in the conceptual part of the paper that the GDPR also offers opportunities to grant access to register data for research purposes. We also exemplified how such a case can be built in the absence of a proper top-down definition of roles and responsibilities, with grounds in the existing higher education legislation (see the case study on Romania). However, this approach does not provide safeguards against administrative action stemming from exaggerated or, in our opinion, even dubious interpretations of the GDPR. We did not witness such an event in the realm of higher education, but we documented a case of ‘misuse of GDPR’ in order to ‘muzzle media’: the Organized Crime and Corruption Reporting Project accused the Romanian authorities of wrongfully invoking the GDPR to halt a corruption investigation targeting top politicians (OCCRP 2018). We argued earlier in this text that journalistic investigations are among the derogations from the GDPR, as is research in the public interest. We conclude that the development of graduate tracer studies using register data, in the context of the GDPR, is a matter of political will, not a technical impossibility deriving from subsequent national legislation.