
1 Introduction

Fostering access to administrative data is without doubt one prerequisite to make policy evaluation more systematic. Administrative data have the advantage of providing detailed and accurate information on large samples of individuals, over repeated periods of time. Since administrative information is often collected for management and monitoring purposes, the possibility of using this data source for promoting evidence-based policy-making might also be more cost efficient than the use of alternative sources of information such as survey data.

This chapter documents how widespread the use of administrative data is for counterfactual impact evaluation (CIE) of active labour market policies (ALMPs). The analysis is based on articles and working papers published since 2000 that evaluate the impact of ALMPs implemented in Europe. After documenting differences across countries and over time in the use of CIE-based evidence of ALMPs, this chapter discusses how data availability might affect the choice of the econometric methods employed for measuring the causal effect of the underlying intervention. It is argued that the data source, whether administrative or survey based, correlates with the comprehensiveness of the CIE. In other words, the feasibility of measuring the impact of the intervention on an array of outcomes, comparing the effectiveness of alternative interventions and making comparisons across different treated groups seems to be related to the type of data used.

While various meta-analyses have been carried out to assess the effectiveness of ALMPs (Card et al. 2010; Kluve 2010; Bratu et al. 2014; Card et al. 2017), this is, to the best of the authors’ knowledge, the first study to document a link between administrative data availability and CIEs’ use and comprehensiveness.

The analysis presented in this chapter is largely drawn from the report “Knowledge gaps in evaluating labour market and social inclusion policies” (Bratu et al. 2014) and the associated online Counterfactual Evaluation Archive (CEA). The Centre for Research on Impact Evaluation (CRIE) of the Joint Research Centre of the European Commission was commissioned to write this report by the Directorate-General for Employment, Social Affairs and Inclusion. It reviews evidence on the impact of labour market policies of the type funded by the European Social Fund. The focus on ALMPs was a response to the need for deeper knowledge about the results of the interventions implemented in Europe to address a wide range of labour market problems, such as youth unemployment and social inclusion. In light of both the increasing focus of the European Commission on policy effectiveness and the tightened national budgets for the current 2014–2020 programming period, the ultimate objective of the report was to identify possible areas of priority for the CIE of the interventions funded through the European Social Fund.

On the basis of the results of this report, in 2015, the CRIE launched the CEA, to summarise information on what works for whom in the areas of employment and social inclusion. The CEA, an online database which is regularly updated, compiles information on published articles and working papers related to CIE of ALMPs. The discussion in this chapter is based on the most recent update of the CEA. More precisely, the information is drawn from 111 papers presenting CIEs of ALMP interventions, published over the period January 2000–October 2016. The interventions evaluated took place in the EU-28.

The remainder of the chapter is structured as follows. Section 2 explains the data collection protocol, while Sect. 3 discusses the main findings and lessons learnt from the analysis of the CEA, and Sect. 4 concludes.

2 Data Collection

2.1 Definition of the Active Labour Market Policies Included in the Analysis

Three broad categories of interventions, corresponding to the classification of ALMPs used by both Eurostat and the Organisation for Economic Co-operation and Development, have been considered in the analysis below. These are (1) training, (2) employment incentives and (3) labour market services.

Trainings encompass measures aimed at enhancing the human capital of participants. A distinction is made between (a) classroom/vocational training, (b) on-the-job training and (c) other types of training that are neither classroom/vocational training nor on-the-job training (e.g. training for auxiliary/basic skills).

Employment incentives aim at offering work experience to unemployed individuals to keep them in contact with the labour market. Employment incentives support both the private and the public sectors. Private sector employment incentives include hiring and wage subsidies. Start-up incentives (e.g. tax incentives) or grants, which help unemployed individuals to start their own business, also fall under this category. Public sector employment incentives include public employment schemes to create jobs in public firms or in public activities that produce public goods or services.

Labour market services aim at supporting the unemployed throughout their job search process. They include a wide range of forms of assistance for job seekers, such as career management, job boards, job search assistance, career coaching and job referral and placement. For the CEA, in line with the literature, labour market service interventions have been divided into the following sub-interventions: (a) job search assistance, (b) counselling and monitoring and (c) job placement and relocation assistance.

2.2 Search Strategy: Identification of the Literature Examining the Effect of Active Labour Market Policies

Four online databases were searched for relevant academic articles: (1) Scopus, (2) RePEc IDEAS, (3) SSRN and (4) IZA Discussion Papers Database. The protocol used to identify the relevant studies involved the following three steps.

First, within each database a search was carried out using the terms (“labor market” OR “labour market” OR “job”) AND (“evaluation” OR “impact” OR “data” OR “intervention” OR “program”), to broadly identify studies related to labour market and evaluation. The sets are connected by AND, while individual search terms within a set are connected by OR.

Intervention-specific keywords and the publication period (between January 2000 and October 2016) were also added to this search process. At this stage there was no restriction in the query regarding the target groups or the evaluation methods, to ensure that the search would deliver the most comprehensive set of results along these dimensions.
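The query logic of this first step can be sketched in a few lines of Python. This is only an illustration of how the term sets combine, not the tool actually used for the search: the function name `build_query` and the two intervention-specific keywords are hypothetical, since the actual keyword list is not reproduced here, and the publication period would be applied as a separate filter in each database.

```python
def build_query(*term_sets):
    """Join terms within a set with OR and the sets themselves with AND."""
    groups = []
    for terms in term_sets:
        # Drop duplicate terms while keeping their original order.
        unique = list(dict.fromkeys(terms))
        groups.append("(" + " OR ".join(f'"{t}"' for t in unique) + ")")
    return " AND ".join(groups)


labour_terms = ["labour market", "job"]
evaluation_terms = ["evaluation", "impact", "data", "intervention", "program"]
# Hypothetical intervention-specific keywords, for illustration only.
intervention_terms = ["training", "wage subsidy"]

query = build_query(labour_terms, evaluation_terms, intervention_terms)
print(query)
```

Spelling variants of a term can simply be appended to the relevant list; repeated entries are dropped before the query string is assembled.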

Second, given the large number of papers that the search identified, the results were further filtered by title. Third, the studies in the resulting list were scrutinised to select those to be included in the CEA. At this stage, a thorough analysis of potentially relevant studies was carried out to identify the intervention types, outcome variables, evaluation methods and target groups.

In order to be selected, the evaluation had to (1) be based on CIE methods (regression, randomisation, propensity score matching, difference in differences, regression discontinuity design and instrumental variable methods), (2) be focused on interventions aiming at individuals and (3) examine the impact of ALMPs on specific labour market outcomes, namely, employment status, duration of employment or income/wage level. The studies included in the archive are all reported in the online appendix. For a brief explanation of CIE, see Box 1 below.

Box 1: Counterfactual Impact Evaluation in a Nutshell

The purpose of a CIE is to assess the causal relationship between participation in an intervention and the outcome of interest (e.g. employment probability). It therefore requires comparing the participants’ outcome in the actual situation with that which would have occurred had they not participated. This is called the fundamental evaluation problem, as it is impossible to observe the outcomes of the participants in the latter situation, i.e. in the counterfactual state. Therefore, it is necessary to find an adequate control group of non-participants with which the participant group can be compared.

The two groups should be as similar as possible so as to ensure that the difference in their outcomes can be causally attributed to the intervention. Usually, groups of participants and non-participants are different in dimensions other than participation, either because the intervention is targeted at a particular group of individuals or because individuals self-select into the intervention. In either case, this leads to selection bias, making it misleading to simply compare the outcomes of participants and non-participants.
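The selection bias can be written down explicitly. The potential-outcome notation below is standard in the evaluation literature but is not used elsewhere in this chapter, so the symbols are assumptions of this sketch: $D = 1$ denotes participation, $Y_1$ the outcome with the intervention and $Y_0$ the outcome without it. The naive comparison of participants and non-participants then decomposes as:

```latex
\underbrace{E[Y \mid D=1] - E[Y \mid D=0]}_{\text{naive difference}}
  = \underbrace{E[Y_1 - Y_0 \mid D=1]}_{\text{effect on participants}}
  + \underbrace{E[Y_0 \mid D=1] - E[Y_0 \mid D=0]}_{\text{selection bias}}
```

The last term vanishes only if participants would, in the absence of the intervention, have had the same average outcome as non-participants.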

CIE methods are statistical and econometric techniques to compare the outcomes of participants and non-participants, while taking into consideration the selection bias problem by controlling for pre-existing differences between the two groups. The most compelling way to tackle the selection bias is the experimental setting (randomised controlled trial (RCT)), whereby individuals are allocated randomly to one of the two groups (participants or non-participants). This eliminates the selection bias problem because, given the randomised assignment, the two groups are similar in all respects but the intervention participation. RCTs however have limited external validity, i.e. results cannot easily be generalised to different contexts. When experiments are not economically or ethically feasible, nonexperimental evaluation methods can be applied to ensure the comparability of groups. Nonexperimental methods include regression and matching methods, difference in differences, regression discontinuity design and instrumental variables. These approaches consist of statistical methods to control for the selection bias so as to be able to identify a proper comparison group.

2.3 Article Coding: Collected Information

For each selected paper, the following features were coded: the country where the intervention took place, the year of the intervention, the target population (unemployed, young unemployed, disadvantaged young unemployed, elderly unemployed, long-term unemployed, low-skilled unemployed, employed, inactive, disabled and women), the evaluation method (propensity score matching, regression discontinuity design, instrumental variables, difference in differences, regression) and the data used (administrative and/or survey) for evaluating the impact of the underlying intervention. For the purpose of this chapter, additional information was also coded, relating to the data quality (sample size and the number of data sources used in the empirical analysis) and the comprehensiveness of the CIE analysis (number of outcome indicators considered, a measure of the intervention impact on different groups of participants).

3 Findings

3.1 Active Labour Market Policies Subject to Counterfactual Impact Evaluation Studies, Target Groups and Outcome Indicators

Applying the search protocol resulted in the identification of 111 relevant papers, among which, as shown in Fig. 1, 30.6% evaluate training interventions, 30.6% examine the impact of private and public employment incentives and 18% study the effect of labour market services interventions. The remaining 20.7% measure the effect of more than one ALMP.
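As a quick consistency check, the shares above can be reproduced from whole numbers of studies. The counts in the sketch below are not reported in the text; they are inferred here as the integers that add up to 111 and yield the stated percentages, and should be read as an assumption.

```python
TOTAL_STUDIES = 111

# Study counts per category. These counts are NOT reported in the text;
# they are inferred as the integers that reproduce the stated shares.
inferred_counts = {
    "training": 34,
    "employment incentives": 34,
    "labour market services": 20,
    "more than one ALMP": 23,
}

# Share of each category in per cent, rounded to one decimal place.
shares = {
    category: round(100 * count / TOTAL_STUDIES, 1)
    for category, count in inferred_counts.items()
}

for category, share in shares.items():
    print(f"{category}: {inferred_counts[category]} studies, {share}%")
```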

Fig. 1 Counterfactual impact evaluation studies and categories of active labour market policies. Source: https://crie.jrc.ec.europa.eu/CIE_database/cieDatabase.php; authors’ calculations

Around 12% of CIE studies examining training interventions look at on-the-job training programmes, while the other interventions relate to classroom/vocational training (52.9%) or to a combination of these two categories of intervention (35.3%). The majority of CIEs on employment incentives assess the impact of private sector incentives (64.7%), while only 14.7% specifically evaluate incentives in the public sector and 20.6% assess both sets of interventions. In 50% of the cases, CIEs of labour market services are related to job search assistance programmes. Counselling and monitoring make up 20% of these CIEs, whereas the remainder falling under this category includes an assessment of several labour market services at the same time.

Most of the interventions target unemployed individuals (91%). In addition, among these studies, those targeting subcategories of unemployed individuals often concentrate on the long-term unemployed and the young unemployed. Only 7.2% of the interventions focus on employed individuals. The most common outcome indicators used in the CIEs measure the labour market status of the participants and non-participants, namely, the employment rate at the end of the intervention or the rate of exit from unemployment at different periods of time following the conclusion of the intervention.

Around 61.3% of the CIEs selected for the CEA examine the impact of an intervention on more than one outcome indicator. Occasionally this depends on the availability of longitudinal data, i.e. on whether or not information on the labour market history of the population targeted by the intervention is accessible. Such longitudinal information allows an estimation of the intervention effect on employment status at repeated periods in time. In other instances, CIE studies examine the effect of the intervention on both employment and income outcome indicators.

Some 69.4% of the studies test for the presence of heterogeneous effects across population groups (by age, such as young and elderly, or by gender). More specifically, the impact of the underlying intervention is frequently measured separately for males and females and/or for different regional locations within a given country.

3.2 Distribution of Counterfactual Impact Evaluation Studies Across EU Countries and Authorship

The 111 studies, published between January 2000 and October 2016, estimate the effect of interventions taking place in 19 different EU-28 countries, with Germany being by far the country where most of the interventions subject to a CIE took place (51 of the 111 CIEs). The preponderance of CIE studies in Germany can be linked with the introduction of the so-called Hartz labour market reforms between 2003 and 2005, whose evaluation the German government commissioned from a number of research institutes (Jacobi and Kluve 2006). As shown in Fig. 2, Sweden and Denmark rank second and third, with ten and eight CIEs, respectively. In general, there is a contrast between the number of CIE studies in West and East European countries. Six, five and four CIE studies were found for France, Italy and the United Kingdom, respectively. CIEs have also been carried out in Poland, Romania, Slovakia, Slovenia, Bulgaria and Latvia, but generally only one CIE study could be found per country. No interventions subject to a CIE were found in Greece, Estonia or Lithuania. The majority of German studies concern training interventions and employment incentives. CIEs of labour market services are more evenly distributed across the EU countries.

Fig. 2 Number of counterfactual impact evaluation studies per country. Source: https://crie.jrc.ec.europa.eu/CIE_database/cieDatabase.php; authors’ calculations

The analysis of scientific collaborations in Germany helps to provide a better understanding of the distribution of CIE studies within the country.

For this, as in Newman (2001, 2004a, 2004b) and Ball et al. (2013), network analysis is employed to examine the structure of collaboration networks on the basis of individuals’ co-authorship. In Fig. 3 two authors are considered connected if they have co-authored at least one paper. The strength of collaborative ties is measured on the basis of the number of papers co-authored by pairs of authors and is represented by the thickness of the lines connecting them. The size of the nodes is proportional to the number of papers authored by each researcher. Although in a merely visual way, Fig. 3 provides a proxy of how concentrated the authorship of CIE studies is. Names are shown for the authors who have each published more than two CIE studies (maximum 13) based on German data. The concentration of studies among a few core authors may depend on several factors, such as an established tradition in specific departments of performing CIE of ALMPs or wider access to administrative data, in particular for the evaluation of the Hartz reforms. Since most studies build on the IZA/IAB Administrative Evaluation Dataset provided by the Institute for Employment Research (see Eberle and Schmucker 2015), the connectedness within the co-authorship networks and the affiliation of the researchers (at the time of the CIE studies) clearly underline the importance of access to these data for any CIE of ALMPs in Germany. Since the IAB database has been opened to researchers from outside Germany, CIEs of ALMPs in Germany tend to be less concentrated among only a few authors. This is a first indication that the availability of high-quality administrative data is probably related to the application of counterfactual methods for data-driven evidence-based policy.
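The two quantities that drive such a graph, node size and tie strength, can be derived from a plain list of author lists. The following is a minimal sketch under stated assumptions: the function name `coauthor_network` and the author names are hypothetical placeholders, not CEA records, and the drawing step itself is left out.

```python
from collections import Counter
from itertools import combinations


def coauthor_network(papers):
    """Return (papers per author, co-authored papers per author pair)."""
    node_size = Counter()    # number of papers authored by each researcher
    edge_weight = Counter()  # number of papers co-authored by each pair
    for authors in papers:
        unique = sorted(set(authors))
        node_size.update(unique)
        # Each unordered pair of co-authors on a paper gains one unit of tie strength.
        edge_weight.update(combinations(unique, 2))
    return node_size, edge_weight


# Hypothetical author lists, one per paper (placeholders, not CEA records).
papers = [
    ["Author A", "Author B"],
    ["Author A", "Author B", "Author C"],
    ["Author C"],
]

node_size, edge_weight = coauthor_network(papers)
for (first, second), weight in sorted(edge_weight.items()):
    print(f"{first} - {second}: {weight} joint paper(s)")
```

Sorting the authors of each paper before pairing them ensures that a given pair is always stored under the same key, whatever the order of names on the paper.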

Fig. 3 Network of counterfactual impact evaluation studies in Germany. Source: https://crie.jrc.ec.europa.eu/CIE_database/cieDatabase.php; authors’ calculations. Note: This network graph maps the community of authors’ networks, through a “node–link” diagram, where the circular nodes represent the authors of CIE studies on German ALMPs and the linear links represent the relationships given by scientific collaborations. The size of the nodes is a proxy of the number of papers of each author

3.3 Time Patterns

The number of CIE publications has clearly increased in recent years (Fig. 4), with 91 of the 111 CIE studies published in 2007 or later. While CIE studies could be identified in 12 countries before 2007, this number increases to 19 when the most recent period is also taken into account. In particular, CIE studies of interventions based in Bulgaria, Ireland, Portugal, Slovakia and Slovenia are observed for the first time in the second period.

Fig. 4 Counterfactual impact evaluation studies: time patterns. Source: https://crie.jrc.ec.europa.eu/CIE_database/cieDatabase.php; authors’ calculations

The surge of CIE studies in the recent period is certainly partly driven by the rising demand for evidence-based policy. For instance, in the EU provisions for the 2014–2020 programming period, impact evaluations have been made compulsory for ALMPs funded through the European Social Fund. This positive time trend in terms of the number of CIE studies is also probably associated with the increasing accessibility of administrative data in several EU countries, as discussed in more detail below.

3.4 Counterfactual Impact Evaluation Methods

As regards the methodology applied in the CIE studies, Table 1 shows that propensity score matching is the approach most commonly employed (54.9%) for evaluating the impact of ALMPs in Europe. The predominance of propensity score matching is even greater if studies combining propensity score matching with other CIE methods are also taken into account (11.7%). This finding holds for all three categories of ALMPs. Randomised design-based papers rank second (9.9% of all CIE studies). Designs based on a random assignment to the intervention under scrutiny are particularly frequent when it comes to measuring the effect of labour market services (35%). Finally, the difference in differences methodology has been implemented in 8.1% of CIE studies.

Table 1 Distribution of studies by counterfactual impact evaluation method

The pattern observed for the EU-28 is also valid when the summary statistics are limited to Germany. More than half of CIE studies in this country are based on propensity score matching. This method is most frequently applied for the evaluation of training interventions or employment incentives (respectively, 70.6% and 58.8% of these subject-specific studies), while randomisation is more common for measuring the impact of labour market services (63.6%).

3.5 Data Sources

In Germany, the IAB made a 2% random sample of its integrated employment biographies (IEBs) available to researchers, which probably contributed substantially to promoting CIE-based evidence. The IEBs contain observations on unemployment benefits, job search and participation in ALMPs, combining four data sources. In Nordic countries, such as Finland and Sweden, administrative data have been available to researchers for several years, and hence, unsurprisingly, these countries also rank high in terms of the number of CIEs of ALMPs.

More generally, Fig. 5 reports the distribution of CIE studies by data source. Around 68% of studies are exclusively based on administrative data. The predominance of CIEs based on administrative data is true for the three categories of ALMP, with CIE of training, employment incentives and labour market services based on administrative data in 67.6%, 55.9% and 60% of cases, respectively. The preponderance of administrative data relative to survey data is observed for all CIE methods, though it is more often associated with nonexperimental than with experimental CIE methods (67% and 45.4%, respectively).

Fig. 5 Counterfactual impact evaluation studies and data sources. Source: https://crie.jrc.ec.europa.eu/CIE_database/cieDatabase.php; authors’ calculations

As shown in Table 2, propensity score matching is strongly associated with the use of administrative data, with around 67.2% of these studies based on this type of data. This figure rises to 88.5% if the CIEs that rely on administrative data merged with survey data are also taken into consideration. CIEs combining propensity score matching with other nonexperimental methods, such as the difference in differences approach, are also largely dependent on administrative data or administrative data merged with survey data. As highlighted in the literature, among others by Sianesi (2002, p. 8) with reference to the evaluation of Swedish ALMPs, the richness of administrative data may justify the use of methods of analysis based on “selection on observables”. Indeed, in contrast to an experimental approach (or other nonexperimental methods such as regression discontinuity design), the reliability of the propensity score matching approach for measuring the impact of an intervention critically depends on the validity of the “ignorability” assumption. This CIE approach supposes that the assignment to an ALMP intervention depends only on characteristics (age, previous labour market experience, educational level, etc.) observable by the evaluator. If this is indeed the case then, provided that this selection process is controlled for, variations in the outcome indicators between the participants and non-participants should be due to the participation in the intervention. Along the same lines, the reliability of CIEs employing a difference in differences approach hinges on the availability of longitudinal information (before and after the intervention).
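The role of the “selection on observables” assumption can be illustrated with simulated data. The sketch below is not propensity score matching as implemented in the studies reviewed: it uses exact stratification on a single binary covariate as a simplified stand-in, and all names and parameter values (`simulate`, `stratified_att`, the participation and employment probabilities, the true effect of 0.10) are assumptions chosen for illustration. Participation depends only on the observed covariate `x`, so comparing participants and non-participants within each value of `x` should recover the effect, whereas the raw comparison should not.

```python
import random
from statistics import mean


def simulate(n=40000, true_effect=0.10, seed=1):
    """Simulate (x, d, y): observed covariate, participation, employment."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random() < 0.5                  # e.g. recent work experience
        d = rng.random() < (0.8 if x else 0.2)  # participation depends on x only
        p_employed = (0.5 if x else 0.3) + (true_effect if d else 0.0)
        y = 1.0 if rng.random() < p_employed else 0.0
        data.append((x, d, y))
    return data


def naive_difference(data):
    """Raw difference in mean outcomes, ignoring the covariate."""
    treated = [y for x, d, y in data if d]
    controls = [y for x, d, y in data if not d]
    return mean(treated) - mean(controls)


def stratified_att(data):
    """Within-stratum differences, weighted by the number of participants."""
    n_treated = sum(1 for _, d, _ in data if d)
    total = 0.0
    for x_value in (False, True):
        treated = [y for x, d, y in data if d and x == x_value]
        controls = [y for x, d, y in data if not d and x == x_value]
        total += len(treated) * (mean(treated) - mean(controls))
    return total / n_treated


data = simulate()
print(f"naive difference:   {naive_difference(data):.3f}")
print(f"stratified estimate: {stratified_att(data):.3f}")
```

If participation also depended on a characteristic missing from the data, the stratified estimate would remain biased, which is why the richness of the covariates available in administrative records matters for this family of methods.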

Table 2 Distribution of studies by counterfactual impact evaluation method and data source

Biewen et al. (2014, p. 838) summarise the importance of data completeness as follows: “for the analysis of ALMP, detailed information on employment and earnings histories prior to program participation seems important to justify matching estimators for treatment effects that rely on a selection on observables assumption. Accurate longitudinal information on labor market transitions is also useful to account for the dynamics of program assignment and to carefully align treated and comparison units in their elapsed unemployment experience”.

Administrative data, and preferably a combination of several administrative data sources, allow working on databases containing a large set of information on the participants and non-participants in the interventions. This type of data makes the use of nonexperimental methods for CIE more reliable.

3.6 Administrative Data and Completeness of Counterfactual Impact Evaluation Studies

Although CIE is essential to promoting evidence-based policy, it is also true that not all CIE methods are equally rigorous and informative. The purpose here is not to discuss the assumptions underlying each CIE method but to document whether or not some data characteristics are associated with the comprehensiveness of the impact evaluation. In particular, although it is necessary to document the average effect of an intervention on the participants, it is also of the utmost importance to check if the underlying intervention’s impact varies across subgroups of participants. A specific ALMP might not work for the participants as a whole but could be very effective for some subpopulations. If this is not taken into account, the CIE might produce misleading conclusions.

Administrative and survey data differ in terms of population coverage and hence sample size. Indeed, administrative records tend to cover the whole universe of a specific population (for instance, welfare recipients), with this population being tracked for administrative and monitoring purposes, independently of the CIE or any research project. This implies that the sample size used for the CIE of an intervention targeting the population recorded in the administrative database is potentially very large. In contrast, survey data are usually gathered for research purposes, and, as such, sample sizes tend to be much smaller. This is confirmed by the CIE studies in the CEA. In almost 85% of CIE studies based on administrative data, the sample size is ≥5000 observations, while this is the case for only 39% of survey-based CIEs. Furthermore, the type of data used in the studies is associated with the likelihood of searching for heterogeneous effects. As shown in Fig. 6, around 72% of CIEs relying on administrative data test for the existence of heterogeneous effects, while this is the case for only 55% of survey-based CIEs.
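Why sample size matters for subgroup analysis can be seen from the standard error of a difference in employment rates. The sketch below uses the textbook formula for two independent proportions; the function name and the numbers (a 50% employment rate, 5000 observations split evenly, a subgroup making up one tenth of the sample) are illustrative assumptions, not figures from the CEA.

```python
from math import sqrt


def se_diff_in_shares(p_treated, n_treated, p_control, n_control):
    """Standard error of the difference between two independent proportions."""
    return sqrt(
        p_treated * (1 - p_treated) / n_treated
        + p_control * (1 - p_control) / n_control
    )


# Full sample: 5000 observations split evenly between the two groups.
full_sample = se_diff_in_shares(0.5, 2500, 0.5, 2500)
# A subgroup making up one tenth of that sample.
subgroup = se_diff_in_shares(0.5, 250, 0.5, 250)

print(f"full sample SE: {full_sample:.4f}")
print(f"subgroup SE:    {subgroup:.4f}")
```

Cutting the sample to one tenth multiplies the standard error by the square root of ten, so an effect that is clearly detectable in the full sample can be statistically indistinguishable from zero within a subgroup.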

Fig. 6 Data sources and counterfactual impact evaluation completeness. Source: Authors’ calculations

Along the same lines, CIE studies that examine the effect of ALMPs on short- and long-term outcomes have shown that programme effectiveness can have wide dynamics, from short-term locking-in effects to long-term positive effects on labour market outcomes. In that respect, to ensure a comprehensive CIE, it is important to use a data set allowing for such a dynamic analysis. This is relatively complicated with surveys that are carried out just once or which often suffer from endogenous attrition when repeated over time. In contrast, administrative records generally have a longitudinal component, with the same units being observed repeatedly until being withdrawn from the specific population monitored in the administrative database (e.g. an unemployed person registered in the unemployment office exiting the database when getting hired). As stated in Rehwald et al. (2015, p. 13), “exploring longitudinal register data also allows us to go beyond a single baseline and a single follow-up”.

In addition, linking several types of administrative data, such as tax records with employment records, provides the option to examine the effect of a given intervention on a wide range of outcome variables. Among the 93 CIE studies using administrative data, 77.4% examined the effect of the intervention on more than one outcome variable or on the same outcome over a long period of time. This is also the case for 83.1% of the studies that relied on more than one administrative data source or combined administrative data with survey data.

4 Conclusions

The Counterfactual Evaluation Archive (CEA) is an online database, developed by the Centre for Research on Impact Evaluation of the Joint Research Centre of the European Commission, which collects published articles and working papers using counterfactual impact evaluation (CIE) to assess the impact of active labour market policies (ALMPs). This archive assembles information on the studies published over the past 16 years in EU-28 countries.

Using data from this archive, this study observes an unequal distribution of CIEs of ALMPs across EU member states. In particular, while Germany counts 51 CIE studies, some countries, such as Greece, Estonia and Lithuania, are not even represented in the database. Even though there has been an increase in CIE studies in terms of country coverage over recent years, there is clearly still room for improvement. Why is there a lack of evidence for some countries? There are various possible reasons, including a lack of CIE expertise and culture within the country and/or a limitation in data accessibility. In particular, the underutilisation of administrative data, despite the many benefits of such data sources, ranging from the sample size to the longitudinal dimension and the near-universal coverage of the population under study, could explain the still insufficient number of CIEs of ALMPs in many European countries. In Germany, and in some Nordic countries such as Finland and Sweden, administrative data have been available to researchers for several years, and this is probably one of the reasons behind the relatively high number of CIEs of ALMPs observed in these countries.

Analysing the characteristics of the studies included in the CEA, it is argued that CIEs based on administrative data tend to be more comprehensive than those relying on survey data. Administrative data sets have large sample sizes, which facilitate the analysis of heterogeneous effects. Measuring heterogeneous effects is important because a specific intervention might not work for the participants as a whole, but the findings might be different for some subgroups. CIEs based on administrative data or on a combination of sources are more likely to estimate the effect of an ALMP on several outcome variables or to study the short- and long-term impacts of the interventions. Taken together, these statistics suggest that the availability of administrative data is important for promoting evidence-based policy.