Temporary refugee protection and labor-market outcomes

We study a Danish reform in 2002 that lowered the ex-ante probability of refugees receiving permanent residency by prolonging the period before they were eligible to apply for such residency. Adherence to the new rules was determined by the date of the asylum application, and the reform was implemented retroactively. Using registry-based micro data, we study the effects on labor-market outcomes and investments in education. While proponents of temporary protection regimes argue that stronger incentives to qualify for residency based on labor-market attachment will speed up the labor-market integration, we find no evidence of positive effects on labor-market outcomes.


Introduction
During the past decade, developments around the world have led to a large inflow of asylum seekers to Europe and in response, many European countries have implemented stricter immigration policies. A motivation has been to reduce immigration and improve the integration of immigrants granted residency (see, e.g., Mansouri et al. 2010). One such policy is the shift from permanent to temporary residence permits for refugees. 1 While several countries have, or are about to, implement such reforms, we have limited empirical evidence on their effects on refugees' integration in society in general and in labor markets in particular. A shift to temporary permits could have both positive and negative effects on integration into society and the labor market. The public debate has been centered around the relative strengths of these effects. On the one hand, the expected return to investment in country-specific human capital falls if the probability of receiving permanent residency falls. There can also be a cost in the form of increased stress from a lower probability of being granted permanent residency. In addition, increased uncertainty may affect firms' hiring decisions, making it more difficult for refugees with longer periods of temporary residence to find qualified work. On the other hand, actions that lead to labor-market attachment during the time with temporary residency are incentivized when they increase the probability for permanent residency. This could strengthen the incentives for labor-market investments in the host country and improve integration. The net effect of a shift from permanent to temporary residence permits for refugees is therefore an empirical question in much need of attention.
This paper aims to fill this gap. Specifically, we address what the effects are of changes in the ex-ante probability of being granted permanent residency. We study the effects of a reform that changed the criteria for eligibility for permanent residency in Denmark and that was implemented in 2002 as part of a reform package. The explicit aim of the reform package was to limit the number of asylum seekers in Denmark, while honoring international obligations, and to speed up the integration process (The Danish Immigration Service 2003). The specific reform component that we study in this paper increased the time period a refugee would have had to have been a legal resident (on a temporary residence permit) in Denmark, before being eligible to apply for permanent residency. During the time with temporary status, a residence permit could be withdrawn if the grounds for protection were no longer valid, and if the individual did not have the right to stay on other grounds, such as having a solid labor-market attachment. The change applied to individuals who lodged their first asylum application on or after February 28, 2002. This meant that refugees who applied for asylum from February 28, 2002, onward faced a longer time period with temporary status, during which they risked losing the grounds for protection, before they could apply for permanent residency. Prior to the reform, three years were sufficient, whereas after the reform a refugee would have to wait for seven years. Furthermore, some supplementary conditions, described in detail in Sect. 2, were included. Obtaining permanent residency was thus made more difficult by the reform, i.e., the ex-ante probability of receiving permanent residency in Denmark on the grounds of asylum was higher prior to the reform. This does not mean that the reform necessarily changed whether an individual got to stay in Denmark or not, ex-post. 
In fact, we show that around 90% of the individuals in our sample (in both the control and the treatment group) are still in Denmark twelve years after their first arrival. Crucially, even if there was a decrease in the ex-ante probability, this can still be consistent with no observed change in the share of asylum seekers staying in Denmark. These individuals could: (1) have had asylum reasons throughout the time period with a temporary permit or (2) established a labor-market attachment, which could be used as grounds for prolonged temporary residency. Although a refugee had no control over the development in the home country, or the Danish authorities' assessment of whether the grounds for asylum were still valid, the refugee could affect the attachment to the labor market and thus affect the probability of staying in Denmark. The reform affected the perceived probability of staying in Denmark, but individuals' responses together with external factors may have stopped this from translating into an observed increase in the risk of deportation. What is key, however, is that there was a change in the perceived probability of receiving permanent residency. Through interviews with NGOs, in a comparative study of temporary permit regimes in Denmark, Germany, and Australia, Mansouri et al. (2010) conclude that introducing temporary residence permits, or prolonging the temporary status, increased uncertainty and they suggest that integration was made more difficult. Although we cannot observe the ex-ante probability of receiving permanent residency, it is clear that the reform was intended to impact this. We can also show that there is a significant decrease in the share of individuals receiving a permanent residence permit following the reform.
We study the impact of the reform using Danish registry-based data. The retroactive implementation of this part of the reform package allows us to distinguish its effect, holding constant other aspects of the reform package. This setup naturally lends itself to a Regression Discontinuity Framework and this is the first step in our empirical analysis. However, while the reform studied in this paper is theoretically ideal for studying the effect of prolonged temporary status, the setting also offers some challenges. Our sample size is limited because we only study outcomes of individuals that are actually given asylum and we need to restrict the time interval to four months before and after the policy change. This poses challenges for the regression discontinuity estimation, and we need to be careful in the interpretation of our results. In a second step, we therefore use descriptive analysis to substantiate our understanding of the impact of the reform. To assess the behavioral responses to this reform component, our focus is on outcomes that are relevant for integration or the assessment of grounds for prolonged residency that the individual could affect herself. We therefore focus on labor-market outcomes and investments in education. 2 Our main result is that there is no evidence of a positive effect on labor-market outcomes, measured in terms of employment and earnings, at any time horizon (up to twelve years after asylum). The coefficient estimates are negative but, in general, not significant using conventional estimates. For investments in education, we estimate a positive effect on enrollment in education using the regression discontinuity framework. While the positive coefficients are in line with our descriptive analysis and we attempt to control for discontinuities in predetermined characteristics, the magnitude and precision of this effect are sensitive in the regression discontinuity estimation. 
Our takeaway is that while the results from our causal estimation, and in particular the magnitudes, are subject to uncertainty, taken together with the descriptive evidence the findings suggest an increase in enrollment in education following the reform. However, these investments in education appear to be too late to have a positive impact on labor-market outcomes and integration relevant for permanent residency. Thus, while proponents of more stringent residency requirements argue that they will improve labor-market integration, we do not find any evidence that this is true for the reform that we study. If stricter requirements had a strong positive impact on labor-market outcomes, we should be able to detect this also with a small sample size. Thus, we add a piece to the puzzle of understanding the impact of restricting access to permanent residency for the labor-market integration of immigrants, showing that a lower ex-ante probability of permanent residency and more stringent requirements do not necessarily translate into improved labor-market attachment. Since, in the policy debate, stricter requirements are often suggested to incentivize actions that lead to stronger labor-market attachment, we argue that this is an important finding and contribution of our paper.
Our paper contributes to an empirical literature that studies the relation between immigration policies and immigrants' outcomes. Closely related are studies on different durations of migration spells. There are various reasons why the duration in the host country may be important for the migrant's economic behavior. 3 For example, Dustmann (1999) shows that investment in host-country-specific human capital depends on the intended duration in the host country. 4 Intended duration may matter for migrants in general, but in this paper we focus explicitly on refugees, a group that is fundamentally different from other types of migrants. 5 While immigrants' entrance into the labor market is relatively well studied, we know less about the integration process of refugees and their labor-market prospects. In a recent paper, Fasani et al. (2020) show that refugees perform worse in the labor market than other immigrants across Europe, and highlight the indecisiveness about the duration and permanence of the stay in the host country as one of the primary reasons for the poor labor-market integration.
3 See Dustmann and Görlach (2016) for an overview on this topic.
4 See also Adda et al. (2021) for a model where intended duration matters for immigrants' career paths and earnings profiles. Chen et al. (2019) study selection into temporary or permanent migration and show that long-term migrants are more strongly positively selected, which they relate to higher returns to matching.
5 Cortes (2004) recognizes the importance of this distinction and focuses on the heterogeneity between refugees and economic immigrants. Assuming that refugees cannot return to their country of origin, and thus face a longer time horizon, they have stronger incentives to invest in country-specific human capital. One benefit of our study is that we can look at the importance of the time horizon and status in the host country within one group of immigrants, refugees.
Temporary protection schemes may be designed in many different ways, making them more or less comparable, and institutional settings vary between countries and over time. Thus, it is key to compare findings from different settings and policies. A couple of recent empirical papers study the impact of restricting access to permanent residency. Blomqvist et al. (2020) study the effect of a policy change in Sweden in 2016 that changed the norm from permanent to temporary residence permits. They analyze the impact on education and labor-market outcomes and find an increase in the probability of enrolling in (and validating existing) education and working, indicating a faster integration process in the intermediate horizon. Jutvik and Robinson (2020) study the impact on labor-market inclusion using another Swedish policy change that was implemented in 2013. The policy change applied to asylum seekers from Syria, most of whom got a temporary residence permit prior to the policy change, but who all got permanent residence permits after the policy change. They find that the asylum seekers with temporary residence status have higher incomes and a lower unemployment rate in the short run compared to permanent residents, but are less likely to invest in education. While both papers find a positive effect on labor-market outcomes from temporary residency permits, the estimated impact on education differs. Because they study different groups of refugees and measure outcomes at somewhat different time horizons, it is not entirely straightforward to compare the results. In addition, the contexts of the two policy changes were quite different. The policy change in 2013 was due to a reassessment of the situation in Syria and meant a shift to permanent residency permits, whereas the policy change in 2016 limited access to permanent residency following a period with a large inflow of migrants to Sweden. In that sense, Blomqvist et al. (2020) is closer to the reform that we study in this paper, which meant a shift further away from permanent residency, increasing the time period on temporary residence permits. This is also conceptually closer to the case in Arendt et al. (2021), who study a further tightening (relative to the reform studied in this paper) of the Danish immigration policy in 2007. 6 In addition to extending the existing legislation, it added a minimum requirement in terms of employment and raised the language requirement for individuals to be eligible for permanent residency. Overall, in line with our finding that labor-market outcomes do not improve from stricter residency requirements and longer times with temporary residency, they estimate a negative effect on employment from the more stringent requirements. Turning to education, the authors find no impact on the propensity to pass the higher-level language test. However, they highlight heterogeneity in the impact of the reform, suggesting that high-productivity individuals were in fact incentivized, whereas low-productivity individuals (who drive the decrease in employment) faced a disincentive effect. We also estimate an increase in investments in education (in terms of enrollment), but the increase is not immediate, implying that, in our case, the increased investment would have been too late to impact labor-market integration during the time horizon when this could improve access to permanent residency.
In addition to policies that regulate the duration of temporary residency permits, or requirements for permanent residency, several other aspects of asylum policies may also matter for the economic behavior of asylum seekers. Examples include waiting times (see, e.g., Hainmueller et al. 2016; Hvidtfeldt et al. 2018), language training (see, e.g., Arendt et al. 2020), employment bans (see, e.g., Hainmueller et al. 2018; Fasani et al. 2021), income support and benefits (see, e.g., LoPalo 2019; Agersnap et al. 2020), and active labor-market programs (see, e.g., Clausen et al. 2009; Sarvimäki and Hämäläinen 2016). Finally, this paper is related to the growing literature that studies the importance of residency status for immigrant outcomes (see, e.g., Pinotti 2017; Mastrobuoni and Pinotti 2015; Baker 2015; Fasani 2018; Felfe et al. 2020, 2018; Kuka et al. 2020; Lozano and Sørensen 2011; Cascio and Lewis 2019; Orrenius and Zavodny 2015; Amuedo-Dorantes et al. 2007; Kossoudji and Cobb-Clark 2002; Bahar et al. 2021; Devillanova et al. 2018; Adda et al. 2020; Bratsberg et al. 2002).
The rest of the paper is organized as follows. Section 2 describes the details of the reform. Section 3 describes the data and empirical strategy. In Sect. 4, we present our empirical results. We perform sensitivity analysis in Sect. 5 to assess the robustness of our results. Section 6 concludes the paper.

The 2002 reform package
The time period of interest to us is the early 2000s. This was a time of substantial change in terms of Danish asylum policies. 7 On November 27, 2001, a new minority center-right coalition government was appointed in Denmark. This shift of government reflected a shift in the public opinion on immigration (see, e.g., Mansouri et al. 2010). The new government introduced a number of legislative changes regarding asylum and immigration policies that were passed by the Danish parliament as amendments to the Aliens Act and the Integration Act. We study the effects of a specific reform component that changed the criteria for eligibility for permanent residency in Denmark (henceforth referred to as the reform). This change was part of a suggestion for a new Bill to amend the Danish Integration Act, presented by the new government in February 2002 (Ersbøll and Gravesen 2010) and passed by the Danish parliament (Folketinget) on June 6, 2002. 8 Both before and after the reform, individuals given asylum were initially granted a temporary residence permit if protection was deemed necessary. While under temporary status, the residence permit could be discontinued if the grounds for residency were no longer valid. Generally, temporary protection would be sustained if the need for protection was intact and there were no legal reasons to withdraw it. 9 Refugees could also be allowed to sustain their temporary residence permits based on labor-market attachments, even if there was no longer any need for protection. After a certain time period as a resident in Denmark, a refugee (above 18 years old) would be eligible to apply for permanent residency. The main change implied by the reform studied here was in how long a refugee would need a temporary residence permit before being considered for permanent residence status. Prior to the reform, three years were sufficient, whereas after the reform a refugee would have to wait for seven years. 
10 This change implied that individuals subject to the new rules lived with temporary protection for a longer time period, facing the risk of having their permit discontinued. Once eligible to apply for permanent residency, refugees would be granted permanent residence if the need for protection remained or if they had a labor-market attachment, given the fulfillment of some supplementary conditions, and unless there were legal reasons to withdraw the residence permit. Prior to the reform, these supplementary conditions included completing an integration program and having limited debt. Under the new rules, in addition to completing the integration course, asylum seekers would now have to pass a (basic level) language test and hold no overdue debt. In addition, while a criminal record used to lead to a longer waiting time, a serious criminal record would prevent permanent residency altogether post-reform (Ersbøll and Gravesen 2010). Obtaining permanent residency was thus made more difficult by the reform and the key takeaway for this paper is that the policy change implied a lower ex-ante probability of being granted permanent residency based on asylum reasons. At the same time, permanent residency could be obtained through labor-market attachment and a potential effect of the reform is that this option became more important.
In addition to changes in the requirements for permanent residency, the 2002 reform package also entailed lower benefit levels, made family reunification more difficult, abolished the de facto status, and removed the possibility to apply for asylum at Danish embassies abroad. We are able to isolate the effect of changes to the eligibility for permanent residency from other parts of the reform package, as this was the only component introduced retroactively and it applied to all individuals who lodged their asylum application on or after February 28, 2002 (the date when the new Bill was proposed). The other components of the reform package took effect after the Bill had been passed, on July 1, 2002. For more details on the other components of the reform, see Online Appendix B. In 2003, another potentially important reform was implemented, allowing immigrants that had lodged their applications on or after February 28, 2002, to apply for permanent residency already after five years if they were "well integrated," i.e., if they had a strong labor-market attachment and had not relied on social welfare. 11 Furthermore, in the case of exceptionally successful integration, it was possible to receive a permanent permit already after three years of legal residency (Ersbøll and Gravesen 2010). The implication for our analysis is that the integration motive (and the incentives to acquire a strong labor-market attachment) was made stronger. 12
9 Paragraph 11 in the Aliens Act.
10 Formally, the reform implied that if the refugee had held a legal permit on the basis of paragraphs 7-9 of the Aliens Act for at least seven years, counting from the date of approval of the temporary permit, the refugee was eligible to apply for permanent residency. Paragraphs 7-9 included permits for all categories of refugees that we consider, and in particular, paragraph 9 included specific permits based on labor-market attachment.

Empirical strategy and data
In this section, we describe our data sources and the empirical strategy used for the analysis.

Data
Our main dataset is register data collected by Statistics Denmark. For the purpose of this study, we combine two sources of Danish micro data. First, from Statistics Denmark we have register data for all immigrants in Denmark 1997-2015. This dataset includes all immigrants who were registered as living in Denmark on January 1 in any of the years 1997 to 2015, which means that we can follow the individuals in our sample up until 12 years after their initial application for asylum was approved. Second, using unique register data from the Danish Immigration Service we observe, for each individual, the grounds for the residence permit held as well as dates of application and approval. Using individual identifiers, these data can be linked to our main dataset and enable us to define relevant treatment and control groups. Our main variables of interest are chosen to be relevant for the purpose of integration and include labor-market outcomes and educational investments. We define enrollment in education as a dummy variable equal to one if the individual, at some point during the 12 years of data that we observe, enrolls in general education or in education at the university level. 13 In terms of labor-market outcomes, we focus on employment status and labor earnings (including earnings from self-employment). Employment status is defined using a dummy equal to one if the individual is registered as employed (or self-employed) at some point during the 12 years that we observe, whereas earnings are measured after three and seven years of residency in Denmark in our benchmark. 14 The motivation for using these definitions of enrollment and employment is that they increase the expected effect size, mitigating the problem of insufficient power due to the small sample size. 15 From the register data, we also collect information on demographic characteristics, specifically age, gender, nationality, marital status, and the number of children in the household, to be used as control variables in the analysis. 16 In addition, from the educational registers, we impute two different measures of skill level at arrival. First, we use the highest level of education completed before arrival in Denmark (primary/secondary or higher). 17 Second, we use the entry level of Danish language courses (1, 2, or 3), because the entry level is determined by the individual's skill level. 18 These measures of initial skill level are used both as control variables and to split the sample in order to study heterogeneous effects.
11 This was an addition to paragraph 11 in the Aliens Act, entered into force as Act no. 425 of June 10, 2003, and the formal requirement implies that the applicant should have lived legally in Denmark for at least five years and have been self-supporting with a solid labor-market attachment for the last three years.
12 During 2002, there were some other important changes to decision practices for specific refugee groups. These are unrelated to the policy changes studied in this paper, but they are relevant to highlight since they affected the approval rates for specific nationalities. In particular, changes applied to asylum seekers from Afghanistan, Iraq, and Kosovo, for whom, following a reassessment of the security situations, the requirements for asylum were made stricter.
13 All information on education comes from the two registers UDDA and VEUV.
14 The data come from the INK and RAS registers.
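As a rough illustration of how the summary outcomes defined above could be built from yearly register rows, consider the following minimal sketch. It is not the authors' code: the record layout (`YearRecord`), the function name `summarize`, and the choice to index years from approval are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class YearRecord:
    """One hypothetical yearly register row for one individual."""
    person_id: int
    years_since_approval: int  # assumed indexing: 1, 2, ..., 12
    employed: bool             # registered as employed or self-employed
    enrolled: bool             # enrolled in general or university education
    earnings: float            # labor earnings incl. self-employment


def summarize(records, horizon=12, earnings_years=(3, 7)):
    """Collapse yearly rows into per-person summary outcomes.

    ever_employed / ever_enrolled equal 1 if the status is observed in
    any year within the horizon; earnings are kept only for the
    benchmark years.
    """
    out = {}
    for r in records:
        if not 1 <= r.years_since_approval <= horizon:
            continue  # ignore rows outside the observation window
        s = out.setdefault(
            r.person_id,
            {"ever_employed": 0, "ever_enrolled": 0, "earnings": {}},
        )
        s["ever_employed"] |= int(r.employed)
        s["ever_enrolled"] |= int(r.enrolled)
        if r.years_since_approval in earnings_years:
            s["earnings"][r.years_since_approval] = r.earnings
    return out
```

The "at some point during the 12 years" definition is what the bitwise OR implements: a single year with the status switched on is enough to set the dummy to one.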

Empirical strategy
The empirical strategy consists of two parts. First, the retroactive implementation of the reform makes the regression discontinuity framework a natural starting point. 19 However, a small sample size is challenging for this type of estimation and makes the setup for regression discontinuity estimation less than ideal. In a first step, we disregard this worry and consider the graphical presentation of the regression discontinuity estimation, as one of several ways to present the data. For transparency we present these graphs (and regression estimates), but we do not rely on these results alone for our conclusions. In a second step, we move on to discuss the descriptive analysis used. We consider the descriptive analysis, while not offering a clean identification, essential as a complementary tool to strengthen our analysis.
15 If the reform changed incentives to invest in education or improve one's position in the labor market, this may have affected outcomes even after individuals were allowed to apply for permanent residency. This implies that the expected effect size is larger using this summarizing measure compared to an analysis using outcome variables restricted to the first 3 or 7 years after approval. This measure is therefore our primary definition, but in Sect. 4 we also present estimates for the main outcomes at each year after approval.
16 These variables are from the population register (BEF). To determine marital status on arrival, we assume that if the date of the first change in marital status is missing, the change must have happened before arriving in Denmark (or it would have been recorded). Children at arrival is defined by considering all children born before the application year and associated with the first family identifier available in the registers after the first asylum application.
17 Primary/secondary education includes early childhood education and primary education as well as lower and upper secondary education. Higher education captures university studies (short cycle tertiary, bachelor, master, and doctoral).
18 Level 1 is for students with no or limited educational background, or those who are considered to have limited learning abilities because of trauma, level 2 is for students with some (normal) educational background, and level 3 is for students with higher education (who often speak several languages).
19 The implementation of the reform implies that refugees who applied for asylum prior to February 28,
Sample restrictions
We construct a sample based on application and approval dates for asylum applicants. In total, 66,614 individuals were granted asylum in Denmark during the time period 1997-2015. First, we remove the 23% of these individuals where information on application date is missing. Without information on the application date, we are not able to classify individuals into the relevant control and treatment groups. 20 Out of the remaining 51,213 individuals, we restrict our sample to the 1191 individuals who applied for asylum from November 1, 2001, to June 30, 2002. Our control group is defined as individuals applying for asylum between November 1, 2001, and February 27, 2002, while the treatment group includes individuals applying between February 28 and June 30, 2002. The sample split is chosen to ensure that no other reform components, which would affect those applying prior to and post the cutoff differentially, interfere. We want to compare asylum holders who only differ in terms of which rules regarding permanent residency they are subject to and not in any other dimensions. As described in Sect. 2, there were several components to the 2002 reform package, apart from the prolonged waiting time for permanent residency. 
To avoid confounding effects from these other components, which mainly relied on the date of approval, we exclude individuals whose applications were approved before July 1, 2002. Due to long processing times, this restriction on the approval date does not reduce our sample to any considerable extent. 21 We also exclude individuals lodging their application from abroad. 22 Throughout the paper, the unit of analysis is the individual. Finally, because we are interested in educational and labor-market outcomes, we focus on individuals who are between the ages of 16 and 60 years at the time of application. We end up with a sample of 635 individuals, where the age restriction is responsible for over 90% of the sample size reduction from the 1191 individuals who applied in the relevant time period. We have explored the option of increasing our sample size by relaxing this restriction, but the age distribution is not suitable for this (the bulk of the remaining individuals are between 0 and 10 years of age, i.e., not a relevant age for labor-market related outcomes or for extending our sample when looking at enrollment in education). Looking at heterogeneous effects by age was also considered, as this type of policy may provide stronger incentives for younger workers. However, the low sample size combined with a skewed age distribution does not easily lend itself to this type of analysis: the distribution is heavily skewed toward younger workers, with an average age at the time of application of around 31 years. We therefore believe that our sample mainly represents an age group that should be susceptible to the policy change.
Identification in the regression discontinuity framework
The decision about the reform was taken on June 6, 2002, and took effect on July 1, 2002, but the part of the reform that we are studying applied retroactively to everyone who applied from February 28, 2002, onward. 
This means that neither immigrants nor the decision makers at the Danish Immigration Service (DIS) could have perfectly manipulated the date of application in order to achieve a certain treatment. Looking at aggregate statistics from The Danish Immigration Service (2003) in Fig. 1, we also note that there is no major change in the number of lodged asylum applications in Denmark from February to March 2002. In our data, we observe only individuals whose asylum applications were subsequently approved. We see no notable change in the number of approvals around the cutoff. As we observe the actual date of application, we also present a histogram of the number of granted asylum applications using the week of application in Fig. 3. 23 The absence of a spike in the density of applications made just before the cutoff is in line with our intuition as the reform was implemented retroactively, leaving no room for manipulation. 24
Figure note: Week 17 excluded, due to too few observations (n < 5), to comply with the rules of Statistics Denmark.
In the regression discontinuity framework, treatment effects are identified by estimating the magnitude of the discontinuity at the cutoff. While the sharp cutoff implied by the reform intuitively lends itself to the regression discontinuity approach, ideally one would want to compare individuals on each side close to the cutoff. As Denmark approves a relatively small number of asylum seekers, we have to use a fairly broad bandwidth of four months on each side of the cutoff (119 days on each side of the cutoff between November 2001 and June 2002) in order to use as many observations as possible. We cannot extend the window further out from the cutoff because of the other reform components. This is the main drawback in our specific setting and draws attention to the inherent tradeoff between precision and bias in the regression discontinuity framework, where extending the bandwidth around the cutoff increases the precision, but also the risk of introducing a bias. 
Furthermore, we are restricted in the extent to which we can extend the bandwidth and, even after extending the bandwidth as far as possible, we still face a small number of observations. As our running variable is the date of application, we have to estimate treatment effects parametrically in order to avoid confounding time-varying effects. The regression equation is specified as

$$Y_i = \alpha + \beta T_i + h(x_i) + T_i \cdot h(x_i) + \varepsilon_i \qquad (1)$$

where $Y_i$ is the outcome of individual $i$, $\alpha$ is a constant, $x_i$ is the normalized date of application such that February 28, 2002, is set to zero and $h(\cdot)$ is a continuous function of the date of application, $T_i$ is an indicator for treatment status with $T_i = 0$ if $x_i < 0$ and $T_i = 1$ otherwise, and $\varepsilon_i$ is the error term. We include an interaction between $h(x_i)$ and the treatment indicator $T_i$, to allow for different trends over time on each side of the cutoff. 25 β is the coefficient of interest measuring the effect of being subject to the new rules on permanent residency. In our main specification, h(·) is specified as a linear function. In Sect. 5, we vary the order of this polynomial to test the robustness of our results. All specifications are estimated with and without a vector of predetermined individual characteristics, $Z_i$, to increase efficiency and confirm that covariates do not affect the point estimates. 26
For the linear specification, we obtain a p-value of 0.121, while the quadratic specification gives a p-value of 0.935.
In addition to the main analysis which is performed on the full sample, we split the data by: (i) level of education at arrival (below/above university level, henceforth referred to as low/high skilled) and (ii) gender, to capture potential heterogeneity in response to the reform. 27 The reform may have had a different impact on these different groups of refugees since differences in access to the labor market could determine how much they were able to affect their probability of being granted residency based on labor-market attachment. 
28 To assess the continuity assumption underlying the regression discontinuity framework, i.e., whether individuals on either side of the cutoff are comparable, we study predetermined characteristics in our sample (see Figs. 4 and 5). Specifically, we consider demographic variables and educational background. For demographics, we look at the fraction of males, household characteristics and average age, as well as nationality. For educational background, we consider both the self-reported measure of the highest level of education achieved and the level of Danish studies to which the individual is assigned. We find some statistically significant discontinuities that we discuss here. By the nature of immigration, the characteristics of asylum seekers can fluctuate month by month and we do observe a positive jump for the category other nationalities (0.175) and a negative jump for Afghans (−0.128) at the cutoff. 29 In addition, there is a discontinuity in the share of males (0.136) at the cutoff that is marginally significant, at the 10% level. In addition to the graphical evidence, Table 1 shows the comparison of means of predetermined characteristics for the control and treatment group as well as their normalized difference. 30 The normalized difference gives us a scale-invariant measure of the magnitude of the difference between groups. We consider differences above 0.25 to indicate sizable differences. Based on this measure, the groups are generally well balanced over the whole 8-month period that defines our sample of interest. Once more, the biggest differences arise in terms of nationalities: There are on average more Iraqis in the control group and more individuals from the category other nationalities in the treatment group. In addition, there are fewer males in the treatment group. The conclusion is very similar using a t-test instead. Apart from these variables, the two groups seem balanced.
One concern is that the detected discontinuities, and the overall large variation in observable characteristics, make it difficult to compare individuals on the two sides of the cutoff. We again stress that the way this reform was implemented, by design, rules out the possibility that individuals self-selected into the treatment or control group. Discontinuities and variation may, however, still arise due to the nature of immigration (with trends over time that may coincide with our time period) and because of random factors due to the small sample size. While the analysis is conducted separately for females and males, we cannot do the same for nationalities as the number of observations in each cell becomes too small. Instead, in our analysis we run all regressions with and without (predetermined) covariates, including nationality, to address this concern. In addition, we also assess the comparability of the two groups by regressing a proxy for the ability to form a labor-market attachment, labor earnings one year after approval, on predetermined characteristics. Then, we estimate the regression discontinuity using the predicted values for earnings and, reassuringly, do not find any discontinuities in this variable (see Fig. 4). Hence, even though there are differences in some of the characteristics, this suggests that at least in terms of this short-term measure of earnings ability, the two groups are balanced.
Fig. 4 Predetermined characteristics, Danish language courses, and education levels. Notes: The graphs are generated using evenly spaced bins, a linear polynomial (order 1), and a uniform kernel
Fig. 5 Country of origin. Notes: The graphs are generated using evenly spaced bins, a linear polynomial (order 1), and a uniform kernel
25 Excluding this interaction does not affect the conclusions in Sect. 4.
26 Although it has been standard practice in regression discontinuity designs to cluster on the running variable, we choose to follow Kolesár and Rothe (2018) and abstain from clustering, using only conventional heteroskedasticity-robust standard errors. We have repeated all estimations for the full sample with clustering on the running variable and find that not clustering is the more conservative approach for all outcomes. The results from estimations where standard errors are clustered are available upon request.
27 Around 47% of the females are classified as low skilled and around 22% as high skilled. Among males, the division is similar with 43% of the men classified as low skilled and 26% classified as high skilled.
28 Further sample splits have been considered, for example, by nationality, but not implemented due to the small sample size. Such an analysis could have strengthened the external validity, by informing us whether effects are dependent upon source country. Finding a sample of "representative" asylum seekers is however almost impossible, because this type of immigration is caused by disruptive events that force individuals to leave their home countries. We therefore focus on gender and skill level in our heterogeneity analysis.
29 Other nationalities are defined as a dummy equal to one if the individual is not from one of the most common countries of origin: Afghanistan, Former Yugoslavia, Iraq, or Somalia.
30 See Imbens and Wooldridge (2009) for a motivation for using the normalized difference. The measure is defined as:
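As an illustration of the balance measure discussed above, the normalized difference in Imbens and Wooldridge (2009) is the difference in group means divided by the square root of the sum of the two sample variances, (X̄_T − X̄_C) / sqrt(S_T² + S_C²). The sketch below assumes that this is the form used for Table 1; the function name and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, variance


def normalized_difference(treated, control):
    """(mean_T - mean_C) / sqrt(var_T + var_C), using sample variances."""
    return (mean(treated) - mean(control)) / sqrt(
        variance(treated) + variance(control)
    )


# Hypothetical 0/1 indicators (e.g. a male dummy); not the paper's data.
treated = [1, 1, 0, 1, 0, 1, 1, 0]
control = [1, 0, 0, 1, 0, 0, 1, 0]

nd = normalized_difference(treated, control)
is_sizable = abs(nd) > 0.25  # threshold for a sizable difference used in the text
```

Because the measure is scaled by the dispersion of the variable rather than by the sample size, it does not mechanically shrink or grow as observations are added, which is why it complements a t-test in a small sample.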

Alternative empirical strategies
To mitigate the small-sample problem, there could be several potential strategies. Because of data availability, we rule out the option of considering alternative comparison groups consisting of, for example, other types of immigrants or native Danes. The reason is that a valid comparison group needs to be defined in relation to the application date of the asylum seekers. This is not possible for these other groups because our data are at the annual frequency. Data availability also makes a difference-in-difference strategy unfeasible, as we do not observe individuals in the register data before they arrive in Denmark and apply for a residence permit. This means that we cannot compare the groups pretreatment to, for example, evaluate the crucial parallel trends assumption. A regression discontinuity difference-in-difference could potentially remedy a concern about calendar effects, but this is not a big concern per se and it would not make the control and treatment groups more balanced. Finally, we have considered increasing the bandwidth to a larger time window (12 months on each side of the cutoff), at the expense of clean identification. However, because of fluctuations and a downward trend in the number of asylum applicants, this mainly results in a larger control group, whereas the treatment group is still small in comparison. The differences detected in our main sample are still evident in this extended sample and the estimation is not much more precise, whereas we now have introduced confounding factors from the other reform components. Because of this, four months remain the preferred time window and we restrict our attention to the main sample. Instead, we focus on additional descriptive analysis to deal with the limitations at hand and study average outcomes for the two groups, without relying on the regression discontinuity setup.

Results
This section presents and discusses our empirical findings for education and labor-market outcomes separately. First, from the regression discontinuity framework, we present both graphical evidence and coefficient estimates. 31 Then, we turn to the descriptive analysis and study average outcomes.
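To make the coefficient estimates concrete, here is a minimal sketch of how the jump at the cutoff can be computed under the main specification (linear h, full interaction with the treatment indicator, uniform kernel, no covariates). In that case the OLS estimate of β equals the difference between the intercepts of two separate linear fits, one on each side of the cutoff. The function names and the synthetic data are assumptions for illustration only; this is not the authors' code and it does not compute standard errors.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rd_estimate(x, y, bandwidth=119):
    """Jump at x = 0 from separate linear fits within the bandwidth."""
    left = [(xi, yi) for xi, yi in zip(x, y) if -bandwidth <= xi < 0]
    right = [(xi, yi) for xi, yi in zip(x, y) if 0 <= xi < bandwidth]
    a_left, _ = fit_line(*zip(*left))
    a_right, _ = fit_line(*zip(*right))
    return a_right - a_left


# Synthetic illustration (not the paper's data): days relative to the cutoff,
# a linear trend, and a true jump of 0.17 for those at or after day 0.
x = list(range(-119, 119))
y = [0.20 + 0.001 * xi + (0.17 if xi >= 0 else 0.0) for xi in x]

beta_hat = rd_estimate(x, y)
```

On real data the outcome would be a 0/1 indicator or earnings in DKK, and inference would require heteroskedasticity-robust standard errors, which this sketch leaves out.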

Educational outcomes
Investments in human capital can be viewed as part of the integration process. Here, we study enrollment defined using a dummy equal to one if the individual, at some point during the 12 years we observe, enrolls in any type of formal education (primary, secondary, or university). We first turn to the estimation results from the regression discontinuity analysis. Columns (1) and (2) in the first panel of Table 2 report estimates for the effect of the reform on enrollment in formal education, with and without covariates, for the full sample. Related to the regression equation (1), the coefficient estimate of β is around 0.17, i.e., we estimate an increase of around 17 percentage points at the cutoff. Panel (a) in Fig. 6 shows the corresponding graphical representation, and we observe an upward jump at the threshold. 32 Columns (3)-(10) in Table 2 report estimates for the subgroups based on gender and skill level. We note that the estimated effect is driven by females and, to a lesser degree, low-skilled individuals. For females, the estimated effect is an increase of around 22 percentage points at the cutoff. In Table 2, we also report estimates of the effect on enrollment at the university level. This variable is a dummy equal to one for individuals that enroll in university education at some point in time during the 12 years that we observe. The estimated coefficient is positive at around 6 percentage points for the full sample, but not significant. Panel (b) in Fig. 6 shows this upward jump graphically. While these results allow for a positive effect on overall enrollment, we urge caution in the interpretation of these estimates since the magnitude and precision are sensitive to the specification used (see Sect. 5) and to whether or not bias-correction is applied. 33
Second, we test for the difference in means between the control and treatment group (see Table 4), with the null hypothesis that the means of the two groups are equal. There is a marginally significant difference (at the 10% significance level) between the two groups for enrollment in general education, with higher average enrollment rates for the treatment group (0.25 compared to 0.20). 34 This comparison of averages does not attempt to identify a difference at the cutoff, but the groups compared are the same as in the regression discontinuity analysis. Although we still face the problem of a small sample size, we conclude that the differences in average outcomes are in line with our estimation results.
To further exploit our data in a more descriptive manner, and more specifically focus on the dynamic aspect, we show the evolution of enrollment rates over time in panel (a) of Fig. 7. This figure plots the share of individuals enrolled in education in a specific year (through a fitted quadratic polynomial with 95% confidence intervals), relative to the year of approval. We note that the treatment group has overall higher enrollment rates over time for general education. In panel (a) of Fig. 8, we also present regression estimates based on equation (1) for the effect on enrollment for 1,...,12 years after arrival, separately. It is evident from this figure that standard errors are quite large when we consider the different years separately, but the estimated coefficients show a pattern similar to the dynamic evolution plotted in Fig. 7. Similarly, we also estimate the effect of the reform on the likelihood of being classified as a student for 1,...,12 years after arrival and the results indicate that there is an increase in this likelihood around the time when the treatment group was eligible for permanent residency (i.e., after 7 years). 35
To sum up, the positive coefficient estimates on the broad measure of enrollment in education could be interpreted as an increased investment in human capital and integration. However, as we have discussed here and emphasize in Sect. 5, this effect is sensitive to specification and, while we consistently estimate a positive coefficient β, the estimated effect varies in size and precision. Furthermore, these investments in education appear to be too late to have a positive impact on labor-market outcomes and integration relevant for improving access to permanent residency.
Notes: Regressions are estimated for the different groups using a polynomial of order 1 and a uniform kernel. High skilled is defined as having a university education, while low skilled is defined as having a primary or secondary education upon arrival in Denmark. Covariates include age at application, gender, partner, number of children, education level, and dummies for the most common nationalities (Afghanistan, Former Yugoslavia, Iraq, and Somalia). Enrollment is a dummy variable equal to one if the individual at some point is enrolled in general education. Enrollment university is the corresponding variable for university education. Employed is a dummy equal to one if the individual was ever employed in Denmark. Earnings is total annual labor earnings in DKK from employment and/or self-employment after three and seven years. *** p
Fig. 6 Education and labor-market outcomes. Notes: The graphs are generated using evenly spaced bins, a linear polynomial (order 1), and a uniform kernel. Enrollment is a dummy variable equal to one if the individual at some point is enrolled in general education. Enrollment in university education is the corresponding variable for university education. Employed is a dummy equal to one if the individual was ever employed in Denmark. Earnings is total labor earnings from employment and/or self-employment after three and seven years, respectively
31 We use the full bandwidth of 119 days. We plot the mean of each outcome for evenly spaced bins of the running variable. The number of bins is selected using a data-driven procedure in order to best approximate the underlying regression function. For each plot, we fit a global linear polynomial, h, to approximate the population CEF, using a uniform kernel and no covariates. These plots are produced using the Stata package rdplot (Calonico et al. 2014).
32 We do not observe an increase in the likelihood of completing an education. The results are available upon request.
33 We report conventional estimates as our baseline (since we do not use data-driven methods for bandwidth selection). Using bias-corrected robust estimates, the estimated effect for general enrollment is smaller in magnitude and less precise. For enrollment in university education, the estimated effect is larger in magnitude, but precision is sensitive to the inclusion of covariates.
34 If we instead test the null hypothesis that the average enrollment rate in the treatment group is lower than in the control group, the null hypothesis is rejected at the 5% significance level.
35 The results are available upon request.
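The comparison of average outcomes between the two groups, including the one-sided variant mentioned in footnote 34, can be sketched as follows. This is a large-sample z-test with unequal variances; the paper does not state which test statistic was used, so the choice of a normal approximation, the function name, and the toy data are all assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean


def diff_in_means_test(treated, control):
    """Large-sample z-test for a difference in means (unequal variances).

    Returns (z, two-sided p-value, one-sided p-value), where the one-sided
    null hypothesis is that the treated mean is not higher than the control mean.
    """
    m_t, m_c = mean(treated), mean(control)
    n_t, n_c = len(treated), len(control)
    var_t = sum((v - m_t) ** 2 for v in treated) / (n_t - 1)
    var_c = sum((v - m_c) ** 2 for v in control) / (n_c - 1)
    z = (m_t - m_c) / sqrt(var_t / n_t + var_c / n_c)
    upper_tail = 1 - NormalDist().cdf(z)
    p_two_sided = 2 * min(upper_tail, 1 - upper_tail)
    return z, p_two_sided, upper_tail


# Hypothetical enrollment dummies; group sizes and rates are illustrative only.
treated = [1] * 30 + [0] * 70
control = [1] * 20 + [0] * 80

z, p_two, p_one = diff_in_means_test(treated, control)
```

When the treated mean is higher, the one-sided p-value is half the two-sided one, which is how a difference can be marginally significant at the 10% level in a two-sided test and significant at the 5% level in a one-sided test.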

Labor-market outcomes
Labor-market outcomes are direct measures of attachment to the labor market. Here, we study the effect on whether an individual was ever employed (including self-employment) in Denmark. This measure is defined, and motivated, in the same way as for enrollment, i.e., as a dummy equal to one for individuals that are ever employed or self-employed during the 12 years following their initial approval of asylum. In addition to having a job (or being self-employed), earnings are another important measure of labor-market attachment. We focus on labor earnings after three and seven years of residency, which gives us a short-term and a long-term measure of earnings.
First, based on our regression discontinuity estimations, the second panel in Table 2 reports our coefficient estimates for the effect of the reform on the share ever employed, for the full sample and our four subgroups. Columns (1) and (2) show negative coefficient estimates of β around −0.04 (or −0.10 including covariates), but the estimated coefficients are not significant at conventional levels. This is also the case for the different subgroups based on gender and skill level. Panel (c) in Fig. 6 shows the graphical representation of the estimated effect, and we observe a small downward shift at the threshold. Turning to our measure of earnings, the coefficient estimates are negative, but not significant at conventional levels, for the full sample. We do pick up significant negative estimates for primarily females, in particular at the long horizon. This is consistent with the positive effect on females' enrollment in education. For males, who were also not more likely to enroll in education, there is no significant effect on any of the labor-market outcomes. Panels (c)-(d) in Fig. 6 show the graphical presentation of these results. To sum up, for the full sample, all coefficients are negative but imprecisely estimated. The graphical evidence also reveals a small decrease, but there are no indications of a sizeable and significant negative effect. While the negative effects are imprecisely estimated, we find no evidence of a positive effect on labor-market outcomes, contrary to the argument made by proponents of temporary protection regimes. 36 Second, we consider differences in means between the control and treatment group (see Table 4). There are no significant differences between the two groups in terms of the labor-market outcomes, but the treatment group has lower averages. As for the education outcomes, these differences in average outcomes are in line with our estimation results. We again turn to the dynamic development in Fig. 
7, which confirms this picture looking graphically at employment and earnings for each year, and in Fig. 8 where we estimate the coefficients for each of the different years after arrival. Although the coefficient estimates are negative for each year, and, if anything, there are signs of a more negative impact on earnings in later years, the standard errors are very large and we generally cannot rule out a null effect.
To better understand potential mechanisms of the reform, we turn our attention to a discussion of some additional outcomes where we can form hypotheses on the effects. Specifically, we look at the highest skill level ever achieved in the labor market during the years we observe (for more details on skill level see Online Appendix C) and the number of times an individual changes workplaces. We view the highest skill level as a measure of the quality of the job. One hypothesis is that individuals in our treatment group accept lower-quality jobs in order to achieve and/or maintain a labor-market attachment. As for the number of job changes, there could be both a positive and a negative effect from the reform. For example, one hypothesis is that asylum holders may be locked into lower-quality jobs that they get early on, because they need to keep their labor-market attachment. This would tend to lower the job-changing rate.
Fig. 9 (Heterogeneous) Earnings conditional on employment over time. Notes: The graphs show a quadratic polynomial and 95% confidence intervals
We have studied the effects of the reform on these variables, but because of an even smaller sample size (as we only observe these variables for individuals with some labor-market attachment) we settle for a brief discussion of the results. 37 We find no difference in the number of times individuals change workplaces during the 12 years we observe them in Denmark, but they generally appear to do so at a decreasing rate over time (which is consistent with a more stable labor-market attachment). For high-skilled individuals, we note that the treatment group, on average, reaches a lower skill level in the labor market. This could be in line with our hypothesis and imply that the high skilled accept jobs for which they are potentially over-qualified. Figure 9 looks at the fitted quadratic polynomial over time for earnings conditional on employment in the different subgroups (since the subgroups are small already before conditioning on employment, we do not estimate regressions for this outcome variable). We note that there is a divergence for the high-skilled individuals, with the control group experiencing stronger earnings over time. Interestingly, although this is based purely on a descriptive analysis, the divergence between the control and the treatment group appears 3-5 years after approval for this subgroup. This is around the time when individuals in the control group are eligible to apply for permanent residency status and is in line with high-skilled individuals in the treatment group potentially accepting jobs with lower earnings compared to the control group. This could be a sign of weaker bargaining power of the individuals in the treatment group or that employers are more reluctant to invest in individuals whose future in the country is more uncertain.
37 The results are available upon request.

Sensitivity analysis
While our main result is that there is no positive effect on labor-market outcomes, we turn to a sensitivity analysis to assess the robustness of our results. First, we discuss potential attrition due to asylum holders leaving Denmark. Second, we investigate the importance of calendar effects. Third, we consider standard tests for the validity of the regression discontinuity approach.

Duration and permanent residency in Denmark
We are interested in asylum holders' duration and permanent residency in Denmark, for three reasons. First, a reduction in the share being granted permanent residency would indicate that the policy worked as intended (i.e., making it harder to get permanent residency). It also strengthens our argument that individuals would have expected a change in the likelihood of getting permanent residency ex-ante. Second, individuals may stay in Denmark without permanent residence permits by continuously applying for temporary permits. However, the reform may have affected both the willingness and ability to stay in the country. This is in itself an interesting outcome. Individuals in the treatment group faced the risk of losing their residence status for a longer time before they were eligible to apply for permanent residency. This could lead to more individuals leaving Denmark, because their asylum claim was no longer valid and they did not qualify for residency based on labor-market attachment. In addition, asylum holders may have left Denmark by choice, due to the change in regulations. Third, if the fraction staying in Denmark changed, the results on other outcomes may be driven by this selection rather than by behavioral responses among those staying in Denmark. We ask whether the reform had an impact on the share of asylum holders that were granted permanent residency. Estimation results in Table 3 and Fig. 11 show a significant reduction in the share being granted permanent residency within three and seven years from the first approval of temporary residence. Unfortunately, we do not have the data to determine whether this depends on a reduction of applications for permanent residency or an increased rejection rate. 
In addition, we ask whether the reform had an impact on whether asylum holders stayed in Denmark during the 12 years that we observe them. 38 The results in Table 3 show that there is no significant discontinuity in the share that is still registered in Denmark in 2015, confirmed graphically in panel (a) of Fig. 10. This facilitates the interpretation of our other results, as it is unlikely that any effects we find are driven by selection. 39

Calendar effects
Our treatment group, by definition, arrives in Denmark later than the control group. One potential concern is therefore that any observed effects depend on this difference rather than on the reform itself. For example, the state of the labor market may differ between the points in time when the control and the treatment group receive their approval. In 2002, asylum seekers were not allowed to work until their applications were approved, but the distribution of approval dates for the treatment and control groups is quite similar, suggesting that there are no substantial differences in when our control and treatment groups are allowed to enter the labor market. 41 Figure 12 shows that there is no clear change in the approval rate during our time period of interest. Because the approval dates of the two groups look rather similar, it is possible that the processing times instead differ. We note that the control group had somewhat longer processing times, implying that these individuals spent more time in the asylum center awaiting their decision. This could be of relevance if we believe that the time in processing matters. This may be the case, for example, because of discouragement from a lack of meaningful activities or because a longer time spent in Denmark gave the control group an advantage before entering the labor market. The differences are, however, not so large that we believe they are likely to impact our results. 42 Furthermore, Fig. 4 shows that there is no discontinuity in the processing time at the cutoff.
The design of the reform rules out that individuals could self-select into the treatment or the control group. However, the appointment of a new government on November 27, 2001, was clearly associated with stricter immigration policies to come. Discussions of these policies started formally in January-February 2002, and there was media coverage of the intentions to implement measures aimed at reducing immigration. This means that immigrants could have been aware of the intention to reform Danish asylum policies, even if they would not have been able to foresee the exact timing of the reform.
38 Unfortunately, for individuals who no longer appear in our data, we cannot distinguish whether they have left Denmark or died.
39 In panel (b) of Fig. 10, we present regression discontinuity estimates of the share still in Denmark for each individual year up until 12 years after application. These results confirm that there is no significant difference in the share staying in Denmark. In Table A.1 in Online Appendix E, we also confirm that the groups are still relatively well balanced in 2015.
40 Another concern could be that asylum seekers arriving between November and February are inherently different compared to asylum seekers arriving in the spring. We can control for potential differences in observed characteristics but are not able to control for differences in unobserved characteristics.
41 Descriptive graphs are available upon request.
42 Descriptive graphs are available upon request.
Notes: Regressions are estimated for the full sample using a polynomial of order 1 and a uniform kernel. We split the sample into two halves at the cutoff. Then, we run the regression on each sample using the median as the cutoff. Covariates include age, gender, partner, number of children, education level (all measured at application), and dummies for the most common nationalities (Afghanistan, Former Yugoslavia, Iraq, and Somalia). Enrollment is a dummy variable equal to one if the individual at some point is enrolled in general education. Enrollment university is the corresponding variable for university education. Employed is a dummy equal to one if the individual was ever employed in Denmark. Earnings is total annual labor earnings in DKK from employment and/or self-employment after three and seven years. *** p < 0.01, ** p < 0.05, * p < 0.1

Placebo tests
A standard test in this type of design is to test for placebo effects by estimating the same model, but varying the location of the cutoff. Discontinuities at other cutoff points may suggest that any estimated discontinuities at the real cutoff are not due to the reform. We split the main sample into the control and treatment group separately. Then, following Imbens and Lemieux (2008), we test for discontinuities in our outcome variables at the median date of application in each of the two groups. The advantage of splitting the sample into the control and treatment group is that we avoid fitting a regression function over a point where we expect a discontinuity to occur. We could test for discontinuities at other points within each of these sub-samples, but using the median gives us more power to detect potential discontinuities. As we have emphasized, we face a problem of a small sample size already in the main estimation and, thus, we use this strategy to maximize the number of observations used. Table 5 presents the results from this placebo analysis. For most variables, we do not find any significant discontinuities at the placebo cutoff. However, for enrollment and employment, we estimate a marginally significant (at the 10% level) discontinuity when covariates are included. For earnings after three years, we estimate a marginally significant discontinuity without covariates. However, once controls are included, this is no longer the case. Given the narrow bandwidth that we have to implement for this test, and the even lower number of observations we end up with in each group, it is not surprising that we detect one discontinuity, as we are not able to estimate the time trend and control function as well.
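The placebo procedure described above can be sketched as follows: split the sample at the true cutoff, then estimate a jump at the median application date within each half. The sketch uses separate linear fits on each side of the placebo cutoff and omits covariates and standard errors; the function names and synthetic data are hypothetical and this is not the authors' code.

```python
from statistics import mean, median


def jump_at(x, y, cutoff):
    """Difference in fitted values at `cutoff` from separate linear fits."""

    def intercept(points):
        xs = [a - cutoff for a, _ in points]  # re-center at the cutoff
        ys = [b for _, b in points]
        xb, yb = mean(xs), mean(ys)
        sxy = sum((a - xb) * (b - yb) for a, b in zip(xs, ys))
        sxx = sum((a - xb) ** 2 for a in xs)
        return yb - (sxy / sxx) * xb

    left = [(a, b) for a, b in zip(x, y) if a < cutoff]
    right = [(a, b) for a, b in zip(x, y) if a >= cutoff]
    return intercept(right) - intercept(left)


def placebo_jumps(x, y):
    """Estimate a jump at the median date within the control and treatment halves."""
    results = {}
    for name, keep in (("control", lambda a: a < 0), ("treatment", lambda a: a >= 0)):
        xs = [a for a in x if keep(a)]
        ys = [b for a, b in zip(x, y) if keep(a)]
        results[name] = jump_at(xs, ys, median(xs))
    return results


# Synthetic data with a jump only at the true cutoff (day 0).
x = list(range(-119, 119))
y = [0.20 + 0.001 * xi + (0.17 if xi >= 0 else 0.0) for xi in x]

placebo = placebo_jumps(x, y)
```

Each placebo regression only uses one half of the sample, so with roughly half the observations on a shorter window the estimates are noisier than in the main specification.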

Choice of bandwidth
Given that the placebo test detected one discontinuity at a value of the running variable other than the true cutoff point, we want to assess the robustness of our results in greater detail. More specifically, we investigate the sensitivity of the results to changes in the bandwidth. Our main results, presented in Table 2, are estimated using a bandwidth of 119 days around the cutoff point. We cannot extend the bandwidth further without including individuals in the treatment group who were also subject to, for example, the change in benefit levels. 43 For this reason, our sensitivity analysis is restricted to analyzing the effects of decreasing the bandwidth. For both predetermined characteristics and outcome variables, we present coefficients and confidence intervals from estimating the regression discontinuity equation using bandwidths starting at 21 days and then increasing the bandwidth by two days at a time until reaching 119 days (our benchmark bandwidth). Figures 13 and 14 present the results from this analysis for predetermined characteristics. Even at narrower bandwidths, the coefficients are in general not significantly different from zero. Still, the coefficients become much more stable at broader bandwidths (and the confidence intervals narrower). This analysis corroborates our choice of a bandwidth of 119 days. Figure 15 presents the same type of analysis for our outcome variables and confirms our interpretation of the results. At broader bandwidths, the coefficients in general become more stable.
Many papers that use the regression discontinuity approach use optimal bandwidth selection, a data-driven approach to select how many observations on each side of the cutoff should be used in the estimation. Because we aim to maximize the number of observations used, we have instead chosen the broadest bandwidth that still isolates the effect of this reform, i.e., we use as many observations as possible without including individuals who were also subject to other components of the 2002 reform. This gives us the bandwidth of 119 days. However, we also estimate our main regression specifications using the optimal bandwidth. The results are available in Table A.3 in Online Appendix E. Using the optimal bandwidth selection, about 100 observations are used in the estimations, compared to the sample size of 635 when using the 119-day bandwidth. In general, coefficients estimated with the optimal bandwidth are in line with, or larger in magnitude than, those from our preferred specification, but standard errors are substantially larger. For enrollment in education, standard errors are about twice as high and the coefficient estimates, while of the same magnitude as in the main specification, are not significant. For labor-market outcomes, the estimates are more negative and significant to a higher degree than our results in Sect. 4. In light of the low number of observations used in these estimations, the 119-day bandwidth remains our preferred choice.
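The bandwidth sweep from 21 to 119 days in steps of two days can be sketched as follows, again in Python on simulated data. The names `rd_fit` and `bandwidth_sweep`, the homoskedastic standard errors, and the normal-approximation 95% intervals are assumptions for illustration; the source does not state which standard errors underlie Figures 13 to 15.

```python
import numpy as np


def rd_fit(x, y, bandwidth, cutoff=0.0):
    """Local linear RD within `bandwidth` days; returns (jump, std. error, n)."""
    keep = np.abs(x - cutoff) <= bandwidth          # uniform kernel: keep or drop
    xc, yk = x[keep] - cutoff, y[keep]
    d = (xc >= 0).astype(float)
    X = np.column_stack([np.ones_like(xc), d, xc, d * xc])
    beta, *_ = np.linalg.lstsq(X, yk, rcond=None)
    resid = yk - X @ beta
    sigma2 = resid @ resid / (len(yk) - X.shape[1])  # homoskedastic variance (assumption)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], np.sqrt(cov[1, 1]), len(yk)


def bandwidth_sweep(x, y, start=21, stop=119, step=2):
    """Re-estimate the jump for each bandwidth from `start` to `stop` inclusive."""
    rows = []
    for h in range(start, stop + 1, step):
        jump, se, n = rd_fit(x, y, h)
        rows.append({"bandwidth": h, "n": n, "jump": jump,
                     "ci_low": jump - 1.96 * se, "ci_high": jump + 1.96 * se})
    return rows


# Simulated data (hypothetical, not the registry data).
rng = np.random.default_rng(1)
days = rng.integers(-119, 120, size=635).astype(float)
outcome = 0.30 + 0.0005 * days - 0.05 * (days >= 0) + rng.normal(0, 0.1, size=635)

results = bandwidth_sweep(days, outcome)
for row in results[::10]:                            # print every tenth bandwidth
    print(row)
```

Plotting `jump` with `ci_low` and `ci_high` against `bandwidth` gives the type of figure discussed above: narrow windows use few observations and yield wide intervals, while broader windows stabilize the estimate.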

Assumptions on the regression specification
We also estimate all our main specifications using a quadratic polynomial, rather than the linear function for h(·) of Sect. 4. The main reason to include higher-order polynomials is to capture nonlinearities in the underlying data. However, in our case, using a higher-order polynomial often appears to lead to overfitting and, thus, overestimation of the effect. Using the linear specification is therefore the more conservative choice for most outcomes. Again, however, the results for enrollment (general education) are sensitive to the inclusion of a second-order polynomial: the estimated effect is smaller in magnitude and imprecisely estimated. Thus, we reiterate the need to be careful in the interpretation of our results for education outcomes.
The benchmark estimations employ a uniform kernel, but we have also estimated all the specifications with the full sample using a triangular kernel. The motivation for using a triangular kernel is that it gives more weight to observations close to the cutoff, but given the low number of observations in our sample, the uniform kernel remains our preferred choice. In general, the coefficients using a triangular kernel are in line with, or even larger in magnitude than, those from our preferred specification. For enrollment, the effect is slightly weaker compared to our benchmark specification. 44
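The two specification checks, a second-order polynomial for h(·) and a triangular kernel, can be sketched with weighted least squares. The names `kernel_weights` and `rd_jump_wls`, the simulated data, and the choice to let the polynomial differ on each side of the cutoff are assumptions for illustration and are not taken from the estimation code of the paper.

```python
import numpy as np


def kernel_weights(xc, bandwidth, kernel="uniform"):
    """Weights for centered running variable `xc` within `bandwidth`."""
    u = np.abs(xc) / bandwidth
    if kernel == "uniform":
        return (u <= 1).astype(float)        # equal weight inside the window
    if kernel == "triangular":
        return np.clip(1.0 - u, 0.0, None)   # weight falls linearly to zero at the edge
    raise ValueError(f"unknown kernel: {kernel}")


def rd_jump_wls(x, y, bandwidth, order=1, kernel="uniform", cutoff=0.0):
    """Jump at the cutoff from weighted least squares with a polynomial of `order`."""
    xc = np.asarray(x, dtype=float) - cutoff
    y = np.asarray(y, dtype=float)
    w = kernel_weights(xc, bandwidth, kernel)
    keep = w > 0                             # drop observations with zero weight
    xc, y, w = xc[keep], y[keep], w[keep]
    d = (xc >= 0).astype(float)
    cols = [np.ones_like(xc), d]
    for p in range(1, order + 1):            # polynomial terms, separate on each side
        cols += [xc ** p, d * xc ** p]
    X = np.column_stack(cols)
    sw = np.sqrt(w)                          # WLS = OLS on sqrt-weighted data
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[1]


# Simulated data (hypothetical, not the registry data).
rng = np.random.default_rng(2)
days = rng.integers(-119, 120, size=635).astype(float)
outcome = 0.30 + 0.0005 * days - 0.05 * (days >= 0) + rng.normal(0, 0.1, size=635)

for order in (1, 2):
    for kernel in ("uniform", "triangular"):
        est = rd_jump_wls(days, outcome, 119, order=order, kernel=kernel)
        print(f"order={order} kernel={kernel}: jump={est:.4f}")
```

With few observations, both the quadratic terms and the triangular weights reduce the effective amount of information used near the cutoff, which is one reason to treat the linear, uniform-kernel specification as the benchmark in a small sample.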

Conclusion
We study the effects of lowering the ex-ante probability of receiving permanent residency status on refugees' outcomes. We exploit a Danish reform in 2002 that prolonged the period for which a refugee was required to have been a legal resident before being eligible to apply for a permanent residence permit. While the results from our causal estimation are subject to uncertainty, together with our descriptive analysis they provide no evidence that the reform improved labor-market integration. Our findings are also consistent with increased enrollment in education in the treatment group. However, we want to be cautious about the interpretation of these results, as they are sensitive to the regression specification. We further find that the reform had a significant negative impact on the share receiving permanent residence permits, but it did not affect the share of asylum seekers who actually stayed in Denmark. To sum up, while proponents of more stringent residency requirements argue that such policies will improve labor-market integration, we do not find any evidence that this is true for the reform that we study. In particular, if stricter requirements had a strong positive impact, we should be able to detect it even with a small sample.
We emphasize some limitations of our study that highlight the need for further research. While the reform studied in this paper is theoretically ideal for studying the effect of prolonged temporary status, the setting also poses some notable challenges, in particular because of the small sample size. Furthermore, the external validity of our results, as in any study of a specific reform, depends on the institutional setting. The composition of refugees changes over time and depends on many factors outside the control of the policy maker. Temporary protection schemes may also be designed in many different ways, making them more or less comparable to the reform we study in this paper. Therefore, it is important to compare our results to the findings of future studies of temporary permits in other settings. Comparing our findings to some recent empirical papers, we highlight the difference between policy changes that apply to specific groups of refugees, due to the situation in their home country, and policy changes aimed at generally limiting access to permanent residency or incentivizing investments to promote labor-market attachment, as well as the differences between institutional contexts and points in time. Finally, we focus on a specific set of outcomes that we argue are particularly relevant for the integration of refugees, but have abstracted from many others. How to design policies that successfully promote and incentivize labor-market attachment for different groups of refugees remains a crucial question. Furthermore, exploring other outcomes and assessing potential mechanisms at work, such as the determinants of investment in different types of education and the timing of these investments, remain interesting tasks for future research. For example, the role of intra-household relationships may be important for understanding heterogeneous responses of females and males, as well as the timing of investments in education.

Author Contributions
The authors' individual contributions to the paper should be considered equal, and all authors have approved the final article.

Funding
Open access funding provided by Royal Danish Library. The research leading to these results received funding from The Royal Swedish Academy of Sciences under Grant Agreement No. SO2015-0084. This work was supported by Handelsbankens forskningsstiftelser (Jan Wallander and Tom Hedelius foundation (Grant Number W18-0052) and Tore Browaldh foundation) and gratefully acknowledged by Elisabet Olme and Matilda Kilström.

Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the research described in this paper.

Data and code availability
Regarding data availability, as stated in our cover letter, the dataset analyzed in the paper is accessible to us via a server operated by Statistics Denmark. We are not allowed to download or distribute any of the data from the server, and because of this, we request data exemption. However, we are allowed to download and disclose the scripts (do-files in Stata format) that create the final dataset and replicate our analysis. In the case of a positive decision, we are willing and able to supply them. We can also provide more detail on the process to obtain access to this data.

Ethical approval
There is in general no ethical approval of data in Denmark. Access to the data is provided through a partnership between Copenhagen Business School and Statistics Denmark. The project has been approved by Copenhagen Business School, and researcher agreements have been signed with Statistics Denmark.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.