Does compulsory schooling affect health? Evidence from ambulatory claims data

Using claims data on more than 23 million statutorily insured individuals, we investigate the causal effect of schooling on health in the largest and most comprehensive analysis for Germany to date. In a regression discontinuity approach, we exploit changes in compulsory schooling in West Germany to estimate the reduced form effect of the reforms on health, measured by doctor diagnoses in ICD-10 format covering physical as well as mental health conditions. To mitigate the problem that empirical results depend on subjective decisions made by the researcher, we perform specification curve analyses to assess the robustness of findings across various model specifications. We find that the reforms have, at best, very small impacts on the examined doctor diagnoses. In most of the specifications we estimate insignificant effects that are close to zero and often of the “wrong” sign. Therefore, our study questions the presence of the large positive effects of education on health that are found in the previous literature.
Supplementary Information: The online version contains supplementary material available at 10.1007/s10198-021-01404-y.

academic track is the most demanding and academically oriented track, leading to a university-entrance diploma (Abitur) after grade 13. 1 A lower-level qualification, the technical school degree (Fachhochschulreife), which allows students to attend a polytechnic (Fachhochschule), can be obtained after finishing grade 12 [3]. In addition, there is a fourth type of secondary school, the so-called comprehensive school (Gesamtschule), which combines all three secondary school tracks in a single school and awards all secondary school leaving certificates. However, comprehensive schools do not exist in all states and are of minor importance, with only about 10 percent of all children in Germany attending them [4]. Since the first comprehensive schools were introduced only on a trial basis between 1973 and 1982 [5], this type of school is also relatively new and therefore not relevant for our study: we restrict our analysis to individuals born between 1930 and 1959, who finished secondary school in the late 1970s at the latest.
The allocation to one of the secondary school tracks after primary school depends to a great extent on the student's grades in primary school and thus on the student's ability. On that basis, the primary school, usually the class teacher, recommends the type of secondary school the child should attend after grade four. In ten of the 16 federal states this recommendation is binding; in the remaining six states the final decision is taken by the parents [4,6]. Since Germany tracks students very early, at the age of ten, when information about a student's learning potential is likely to be incomplete, there is a high risk of misallocating students to tracks. Initial track decisions can in general be corrected at a later point, when more information about the student's ability is available. In principle, switching tracks within secondary schooling is possible at any grade, but in practice this happens rarely. Dustmann et al. [6] show that only about 2 percent of students switch between tracks during secondary school. As a consequence, once students are allocated to a secondary school track, they generally stay in it until they complete it. However, upgrading and downgrading between tracks at later stages of the educational career is very common. Dustmann et al. provide evidence of substantial movement from the basic and intermediate tracks to the academic track after graduation from these tracks. There is also a large amount of downgrading in the form of not enrolling in university after graduating from the academic track. The authors find that, because of this possibility of upgrading and downgrading, the assignment to a particular secondary school track at the end of primary school has only a small effect on the highest degree attained and on long-term labour market outcomes.

Compulsory schooling reforms: Differences in introduction dates
For some of the federal states our reform introduction dates differ from those reported in other studies that use the same German compulsory schooling reforms. Most of these studies refer to Pischke and von Wachter [3], who were the first to exploit the German reforms to estimate returns to education. For some federal states the reported dates differ only slightly; for others, e.g. Saarland and Schleswig-Holstein, the difference is quite large (see Table 1). For Saarland, Pischke and von Wachter [3] report that the reform was introduced in 1964, while we date it to 1958. According to them, the compulsory ninth grade was introduced in Schleswig-Holstein in 1956, whereas we report that it was already implemented in 1947. One reason for these differences may be that in some federal states municipalities could offer a voluntary ninth grade before statewide implementation was stipulated by law. Our implementation dates refer to the date when the compulsory ninth grade was implemented in the whole state. Pischke and von Wachter [3] do not state which sources they use. We took the information from contemporaneous literature on the German school system by Leschinsky and Roeder [7] and Backhaus [8] and from the respective federal state laws [9-16]. Moreover, we reviewed data from the Federal Statistical Office of Germany on ninth grade attendance in the basic track between 1957 and 1973 [17-30]. Taken together, these references suggest that the implementation dates reported by Pischke and von Wachter [3] are sometimes incorrect. Recent work by Cygan-Rehm [31,32] questions the reform dates in Pischke and von Wachter [3] as well. Cygan-Rehm also refers to Leschinsky and Roeder [7] and data from the Federal Statistical Office of Germany and reports reform dates that match ours (see Table 1). The two exceptions are Hesse and North Rhine-Westphalia, with a difference of one year each.
Based on data from the Federal Statistical Office of Germany, we plot the number of basic track students in the ninth grade in school year t as a fraction of the number of basic track students attending the eighth grade in the previous school year t-1 (see Figure 1). For Saarland, for example, we observe that the reform became effective in 1958, as ninth grade attendance jumps from 0 percent in 1957 to 99 percent in 1958. This matches exactly the timing reported by Leschinsky and Roeder [7] and Backhaus [8] and clearly contradicts the date reported by Pischke and von Wachter [3]. Unfortunately, the data for Schleswig-Holstein do not go back far enough to verify our reported date. According to Leschinsky and Roeder [7] and Backhaus [8], the reform was introduced as early as 1947, but data are available only from 1957 onwards. However, we are quite confident that the date reported by Pischke and von Wachter [3] is incorrect, because the compulsory ninth grade was already stipulated in the state law of Schleswig-Holstein in 1947 [33]. For the remaining federal states our reported implementation dates are also consistent with the data, as ninth grade attendance jumps exactly when the additional school year was fully implemented.
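The attendance ratio described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the student counts are hypothetical, the function names are ours, and the 0.9 threshold used to flag full implementation is an arbitrary choice that does not appear in the text.

```python
def ninth_grade_ratio(grade8_counts, grade9_counts):
    """Return {t: ninth graders in year t / eighth graders in year t-1}."""
    ratios = {}
    for year, n9 in grade9_counts.items():
        n8_prev = grade8_counts.get(year - 1)
        if n8_prev:  # skip years with no (or zero) eighth graders in t-1
            ratios[year] = n9 / n8_prev
    return ratios


def first_full_year(ratios, threshold=0.9):
    """First year in which the ratio reaches the threshold (assumed cut-off)."""
    full_years = [year for year, r in sorted(ratios.items()) if r >= threshold]
    return full_years[0] if full_years else None


# Hypothetical basic track student counts for one federal state.
grade8 = {1956: 10000, 1957: 10000, 1958: 10200}
grade9 = {1957: 0, 1958: 9900, 1959: 10100}

ratios = ninth_grade_ratio(grade8, grade9)
implementation_year = first_full_year(ratios)
```

With these made-up counts the ratio jumps from 0 in 1957 to 0.99 in 1958, the kind of pattern the text reports for Saarland.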

Appendix B: Further analyses using SOEP
Since the KBV claims data do not contain information on years of schooling, we cannot estimate the first stage effect of the compulsory schooling reforms on years of schooling with this data set. However, we complement our analysis with the Socio-Economic Panel (SOEP) 2 to estimate a first stage and to test whether measurement error in the treatment indicator due to regional mobility is a potential source of bias in the estimates of the reduced form effect.

The SOEP data
The SOEP is a large representative longitudinal survey of private households in Germany that has been conducted annually since 1984 and interviews around 30,000 respondents in nearly 11,000 households every year [34]. For our analysis we pool the waves 1992 to 2018 to obtain a sufficiently large sample. To prevent individuals from entering the sample more than once, we use for each individual only the observation from 2009 or, if none exists, the observation closest to 2009. Taking observations from 2009 ensures comparability between the SOEP data and the KBV claims data. Furthermore, we impose the same sample restrictions as on the KBV data. In particular, we consider only German inhabitants of the birth cohorts 1930 to 1959 who are statutorily insured and live in the ten West German federal states excluding Berlin. We also exclude individuals who graduated in former East German states or abroad and restrict our sample to individuals with valid information on the federal state of residence, the federal state of last school attendance and self-reported health status. 3 For further analyses we consider cohorts born five years before and after each compulsory schooling change.
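The rule of keeping one observation per individual can be illustrated as follows. This is a minimal sketch, not the procedure actually applied to the SOEP: the field names `person_id` and `year` are hypothetical, and the tie-breaking rule (keep the record listed first when two survey years are equally close to 2009) is our assumption, since the text does not specify one.

```python
def closest_to_reference(observations, reference_year=2009):
    """Keep, per person, the record whose survey year is closest to reference_year.

    Ties keep the record that appears first in the input (assumed rule).
    """
    best = {}
    for obs in observations:
        pid = obs["person_id"]
        current = best.get(pid)
        if current is None or (
            abs(obs["year"] - reference_year) < abs(current["year"] - reference_year)
        ):
            best[pid] = obs
    return best


# Hypothetical pooled person-year records.
records = [
    {"person_id": 1, "year": 2005},
    {"person_id": 1, "year": 2009},
    {"person_id": 1, "year": 2012},
    {"person_id": 2, "year": 1995},
    {"person_id": 2, "year": 2001},
]
selected = closest_to_reference(records)
```

Here person 1 is represented by the 2009 record and person 2, who was never interviewed in 2009, by the 2001 record.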
The SOEP contains information only on the highest secondary school degree. Following Pischke and von Wachter [3], we combine information on an individual's highest secondary school degree and the school years usually required for this degree to construct the schooling variable. Accordingly, academic track graduates are assigned 13 years if they hold a university entrance certificate (Abitur) and twelve years if they hold an advanced technical college certificate (Fachhochschulreife). Intermediate track graduates are assigned the standard duration of ten years of schooling. For individuals with a basic track degree and for drop-outs without a school leaving certificate, we use the federal state of residence, the month and year of birth and the timing of the reform to determine whether an individual should have graduated from the basic track after eight or nine years. Table 2 shows descriptive statistics for the final SOEP sample. The sample consists of 45 percent men, which mirrors almost exactly the share of men in the KBV sample. The mean age is 57 (compared to 58 in the KBV claims data). 4 Individuals have on average 9.7 years of schooling, with women having 0.3 years less than men. The probability of self-rated poor or bad health is of similar size for men and women: 23 percent of men and 21 percent of women report being in poor or bad health. The probability of having an academic school degree is almost twice as large for men. In contrast, many more women than men have an intermediate school degree. The share of basic school graduates is the same for men and women: 53 percent of women report having a basic school degree. 5
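The construction of the schooling variable described above can be sketched as a simple mapping. The degree labels, the function name and the cut-off cohort in the example are hypothetical. The sketch also assumes that the state-specific first affected birth cohort has already been looked up and is passed in as a (year, month) pair.

```python
def years_of_schooling(degree, birth, first_affected):
    """Assign years of schooling from the highest secondary school degree.

    birth and first_affected are (year, month) tuples; first_affected is the
    first birth cohort subject to the compulsory ninth grade in the person's
    federal state (assumed to be supplied by the caller).
    """
    if degree == "abitur":
        return 13
    if degree == "fachhochschulreife":
        return 12
    if degree == "intermediate":
        return 10
    if degree in ("basic", "none"):
        # Basic track graduates and drop-outs: nine years if born on or
        # after the first affected cohort, eight years otherwise.
        return 9 if birth >= first_affected else 8
    raise ValueError(f"unknown degree: {degree}")


# Hypothetical cut-off: first affected cohort born in April 1949.
cutoff = (1949, 4)
before = years_of_schooling("basic", (1949, 3), cutoff)
after = years_of_schooling("basic", (1949, 4), cutoff)
```

Only the basic track and drop-out cases depend on the reform timing, which is why the instrument shifts schooling only for this group in the constructed variable.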

Potential measurement error in the treatment indicator?
In the KBV claims data only the federal state of residence in 2009 is available, although we would need information on the federal state of last school attendance to ensure a precise assignment of the treatment. By using the state of residence as a proxy, we implicitly assume that individuals attended school in the federal state where they lived in 2009 and thus, the treatment might be imprecisely assigned if cross-state mobility in Germany exists. This measurement error in the treatment indicator would then downward bias our estimates of the treatment effect.
4 As already mentioned above, we pool several waves of the SOEP and keep the observation from 2009 or closest to 2009 to make the sample comparable to the KBV sample. The survey year in our final SOEP sample is 2008, on average, and thus explains the difference in mean age between the SOEP and KBV samples of about one year.
5 The probability of having a basic school degree as well as the average number of school years are almost unchanged when we consider the federal state of school attendance instead of the federal state of residence.
SOEP started collecting information on the federal state of last school attendance in 2001. Using these data, we find that using the state of current residence as a proxy for the state of school attendance is unlikely to be problematic. The cross-tabulation of the treatment indicators based on the federal state of residence and of school attendance in Table 3 shows that cross-state mobility in Germany is quite low, leading to only small differences in the treatment indicator. About 95 percent of the people who are coded as treated when the instrument is based on the federal state of last school attendance are also coded as treated when the instrument is based on the federal state of residence. Moreover, about 96 percent of the non-treated under the school attendance instrument are also non-treated when the instrument is based on the state of residence. We also test whether measurement error in the treatment indicator attenuates IV estimates of the effect of education on self-reported health (see Table 4). To this end, we first estimate the first stage using the instrument based on the federal state of residence and find that the reforms led to an average increase of about 0.56-0.58 years of schooling (Panel A). The first stage regression results in Panel B show that assigning the instrument based on the federal state of school attendance generates slightly smaller first stage coefficients of about 0.52-0.54. Note that the F statistic for the excluded instrument is rather small in both panels compared to other studies (e.g. Pischke and von Wachter [3] and Kemptner et al. [35]), which can be explained by the small sample sizes. Moreover, we find almost unchanged reduced form coefficients when using the federal state of last school attendance instead of the federal state of residence to construct the instrument. Overall, this leads to a very small upward bias in the IV estimates that are based on information on the state of residence.
Since the reduced form and IV point estimates are almost identical, we are quite confident that it is not a major problem to proxy the state of school attendance with the state of residence.
Notes: ... (2), as well as the corresponding reduced form and first stage coefficients. Panel A refers to the instrument based on the federal state of residence, while Panel B relates to the instrument based on the federal state of last school attendance. All regressions are based on the pooled sample of men and women, use a bandwidth of five cohorts around the cut-off point and include fixed effects for year of birth, month of birth and federal state of residence as well as linear state-specific cohort trends, gender and a quadratic age term. Standard errors are clustered on the cohort×state level and presented in parentheses. * p<0.1; ** p<0.05; *** p<0.01. Source: German Socio-Economic Panel (SOEP).
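The link between the first stage, reduced form and IV estimates discussed above can be made concrete with the standard result that, with a single instrument and identical controls, the IV estimate equals the reduced form coefficient divided by the first stage coefficient. The sketch below is only illustrative: the first stage values are midpoints of the ranges quoted in the text, whereas the reduced form coefficient is a purely hypothetical number and not an estimate from Table 4.

```python
def wald_iv(reduced_form, first_stage):
    """Just-identified IV estimate: reduced form divided by first stage."""
    if first_stage == 0:
        raise ZeroDivisionError("first stage coefficient must be non-zero")
    return reduced_form / first_stage


# Midpoints of the first stage ranges reported in the text.
first_stage_residence = 0.57  # Panel A: 0.56-0.58
first_stage_school = 0.53     # Panel B: 0.52-0.54

# Hypothetical reduced form coefficient, held equal across panels because
# the text reports almost unchanged reduced form estimates.
reduced_form = -0.01

iv_residence = wald_iv(reduced_form, first_stage_residence)
iv_school = wald_iv(reduced_form, first_stage_school)
```

Because the two first stage coefficients differ only slightly, the implied IV estimates are close to each other for any given reduced form coefficient.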
Appendix C: "Retirement effect"

Figure: Back pain (M54). Axes: cohort relative to first affected cohort; morbidity rate (%).
Notes: The figure plots morbidity rates in percent by month-year of birth for five cohorts before and after the first birth cohort affected, for the pooled sample of men and women. All data points in the figure represent averages by month-year of birth. The vertical line denotes the first birth cohort affected by the law changes. Source: KBV claims data.