1 Introduction

One of the greatest achievements of the world during the twentieth century has been the progress made toward gender equality. Comparing the role of women in economic and social life in 2000 relative to that in 1900 reveals the remarkable changes that occurred in terms of women’s legal status, political rights, access to the labor market, and other areas. Yet, nowhere is this change more visible than when looking at female educational attainment. In many countries of the world, women nowadays outperform men at all levels of education, to the point that the relative under-performance of men is being considered an emerging problem (OECD 2015).

The progress achieved by women in terms of their educational attainment relative to men can be seen in Fig. 1. The figure depicts the evolution of the female-to-male ratio of average years of schooling over the twentieth century for a broad sample of 146 countries based on data from Barro and Lee (2013). At the beginning of the century, this ratio fluctuated around 0.75, implying that women had on average only 3/4 of the years of schooling that men had. Following World War II, though, we see a clear upward trend in this ratio, as female educational attainment began to catch up. By 1990 women had levels of schooling similar to those of men in many countries of the world, and subsequently their educational attainment began to surge ahead.

Fig. 1 Global evolution of the education gender gap. Notes: This figure depicts the global evolution of the education gender gap measured as the female-to-male ratio of average years of schooling. The data are from Barro and Lee (2013) and reflect the schooling levels of the cohort that was 5–9 years old in the respective year. For the construction of the global series we average the values reported for all 146 countries covered in the data set.

Understanding the driving forces behind this remarkable transition has been the focus of a growing literature in economics. Early contributions by Goldin (1995), Galor and Weil (1996) as well as Goldin et al. (2006) have stressed the role of improved labor market opportunities for women. More recent work by Chiappori et al. (2009), Fernandez and Wong (2011) and Reijnders (2018) has highlighted changes in marriage patterns as an important factor. Fernandez et al. (2004), Beaman et al. (2012), Fernandez (2013), and Hazan and Zoabi (2015) have emphasized the importance of changing social norms and the elimination of biases regarding the role of women in society. Lagerlof (2003) as well as Doepke and Tertilt (2009) have underscored the role of improvements in women’s rights. Greenwood et al. (2016) have stressed the role of the decline in the price of household durable goods, which freed women from housework. The literature has also explored the role of some medical advances in raising female schooling, such as the introduction of the birth control pill (Goldin and Katz 2002) and improvements in maternal health (Albanesi and Olivetti 2016).

Within this literature, most of the existing contributions have focused on the experiences of developed countries and particularly on the case of the United States. When it comes to the evolution of the education gender gap, though, developed countries do not necessarily provide the most striking examples. This can be seen in Fig. 2, where we present the female-to-male ratio of average years of schooling separately for low-income, middle-income and high-income countries in 1900, 1950 and 1990. Despite the big differences in terms of economic development between these groups of countries, the figure shows a relative rise in female educational attainment over the second half of the twentieth century in all three groups. At the same time, the rise appears to have been more rapid in low- and middle-income countries, where many of the aforementioned factors have played a less important role.

Fig. 2 The closing of the education gender gap in different country groups. Notes: This figure depicts the education gender gap measured as the female-to-male ratio of average years of schooling at three points in time (1900, 1950, 1990) for the cohorts that were 5–9 years old in the respective years. It does that separately for low-income, middle-income and high-income countries. The education data are from Barro and Lee (2013). The income groups are defined according to the World Bank classification.

In this paper we argue that the similarity in the timing of reductions in the education gender gap across countries can be explained by the global health improvements that took place after World War II. We explore this hypothesis, as health improvements have commonly been argued to promote educational attainment. Healthier individuals, who expect to live longer and more productive lives, are bound to have stronger incentives to invest in their own education. Similarly, parents who have healthier children are more inclined to invest in their children’s education. The nature of the relationship between health improvements and educational attainment has been highlighted in a series of theoretical models starting with Ben-Porath (1967) and more recent contributions by Boucekkine et al. (2002), Kalemli-Ozcan (2002), Cervellati and Sunde (2005, 2015), Soares (2005), Hazan and Zoabi (2006) and de la Croix and Licandro (2012). At the same time, Soares (2006), Bleakley (2007), Jayachandran and Lleras-Muney (2009), Lucas (2010), Oster et al. (2013) and Hansen and Strulik (2017) have provided empirical evidence in support of this relationship.

Following the conclusions from this line of research, we would expect that if female health improves more than male health, female schooling will rise faster than male schooling and the gender gap in educational attainment, observed initially, will eventually be eliminated. While this hypothesis has not been explored in the literature so far, there are good reasons to consider it as a potential explanation for the evolution of the education gender gap in the post-war period. As shown in Table 1, a simple comparison of the evolution of life expectancy at birth, a broad measure of a population’s health status, and average years of schooling of males and females over the twentieth century indeed suggests this pattern. While up until 1950 the ratios between males and females in terms of life expectancy and average years of schooling were fairly constant, in the subsequent years we see life expectancy for women rising more sharply than for men and the education gender gap improving visibly.

Table 1 Life expectancy and educational attainment, 1900–1990

As the similarity in the time trends of life expectancy and average years of schooling is only suggestive, in our analysis we explore more carefully the link between the two by exploiting exogenous variation in life expectancy triggered by the so-called International Epidemiological Transition (IET). This term refers to the period of rapid decline in mortality from previously highly fatal infectious diseases which started after the end of World War II and resulted in unprecedented improvements in life expectancy around the globe (Becker et al. 2005; Cutler et al. 2006). These improvements were brought about by a series of important medical innovations related to the development of vaccines, antibiotics and other treatment methods. Largely products of medical research in developed countries, these innovations diffused quickly across countries following the coordinated efforts by the United Nations and the World Health Organization. As a consequence of this diffusion process, many infectious diseases, which previously had affected large shares of the world’s population, were largely eradicated or brought under control within a few decades.

Our analysis builds upon prior work in the literature that has utilized the exogenous nature of the IET-related health improvements from the perspective of individual countries to analyze their impact on various economic and social outcomes (Acemoglu and Johnson 2007; Cervellati and Sunde 2011; Hansen 2013). Similar to this line of work, we exploit variation in the disease environment across countries prior to the onset of the IET with the rationale that the introduction of the new methods of disease control should have had a larger impact in places where mortality from infectious diseases was initially higher. This prior variation allows us to estimate the effects of the IET-related health improvements with an instrumental variables strategy similar to that of Acemoglu and Johnson (2007), where the potential health improvements given the initial mortality environment are used as an instrument for the actual ones. We extend this analysis by noting an important dimension that the literature has until now largely overlooked: the fact that the IET-related health improvements were different for men and women. Employing a similar instrumental variables strategy, we demonstrate that women benefited more than men in terms of life expectancy from the medical advances associated with the IET. This in turn resulted in differential increases in female and male schooling and contributed to sizeable reductions in the pre-existing education gender gap.

In our analysis we particularly explore the role of vaccines in giving rise to these differential gains in life expectancy and schooling. This is motivated by a growing medical literature, which has demonstrated the existence of biological differences between males and females in their immune responses to vaccines and that vaccines are more effective in providing immunity to females (Cook 2008; Klein et al. 2010). This important conclusion about gender differences in vaccine efficacy stands in contrast to antiviral and antibacterial drugs for which the literature has not documented any systematic differences in their effectiveness across genders. In light of this evidence, we exploit variation in the role of vaccines as a method of disease control for different infectious diseases by separating the IET-related mortality reductions into those that can be attributed at least partially to the introduction and diffusion of vaccines and those that were clearly due to other medical innovations. Following this approach, we document that women experienced larger increases in terms of life expectancy and years of schooling than men in cases where mortality from infectious diseases was subsequently brought under control with the help of vaccines. We also show that in cases where mortality reductions were driven by other medical innovations, the resulting increases in life expectancy and schooling were similar for women and men.

Taking into account the gender-specific nature of the IET-related health improvements allows us to explain a sizeable share of the reduction in the education gender gap that occurred across countries after World War II. Based on our main estimates we are able to explain 39% of the actual life expectancy increases of women and 33% of those of men. In terms of educational attainment, the estimated effects correspond to 26% and 21% of the observed increases in female and male education. These differential increases in male and female education in turn imply that the differential impacts of the medical innovations associated with the IET on male and female health can account for approximately 80% of the observed global reductions in the education gender gap.

Beyond establishing the quantitative importance of this link between the IET and the education gender gap, we also subject it to an extensive series of robustness checks. In particular, we repeat our regression analysis with different measures of educational attainment, different notions of life expectancy and different groups of infectious diseases, all of which yield similar results. We also show that the results do not hinge on any of the particularities of our regression setup. Moreover, we consider the role of alternative factors that may have affected the relative rise in female schooling and show that our results are robust to controlling for these factors. In addition, we demonstrate that our findings are not driven by differences in the health-education elasticity across genders and that female schooling appears equally responsive to life expectancy changes as male schooling. Finally, we show that the gender-specific effects of the IET on life expectancy and the positive relationship between life expectancy and educational attainment for males and females can be observed over the post-war period not only across countries but across U.S. states as well.

Going one step further, we also study the broader macroeconomic implications of the differential improvements in male and female health that resulted from the IET. Focusing on the impact that these health improvements had on GDP per capita, we show that this impact was clearly positive in the case of health improvements that benefited females more than males. We further show that this positive impact was primarily observed in countries that had already undergone the demographic transition at the time of the IET and where fertility was already low. Taken together these results suggest that policies targeted at improving female health can also yield broader benefits in terms of economic development.

To establish the aforementioned results we proceed as follows. Section 2 reviews the evidence from the medical literature on gender differences in terms of infectious diseases and vaccination more specifically. Section 3 outlines the empirical strategy that we follow in the paper. Section 4 describes the data that we use. Section 5 presents our baseline results regarding the effect of the IET-related health improvements on educational attainment of men and women. Section 6 discusses a series of robustness checks on our baseline results, while Sect. 7 considers the role of other factors and different mechanisms in accounting for the relative rise in female schooling. Section 8 explores the relationship between health improvements and educational attainment for men and women based on data from U.S. states. Section 9 provides evidence regarding the effects of improvements in male and female health on GDP per capita. Section 10 offers some concluding remarks.

2 Gender differences related to infectious diseases

The fact that immune responses to pathogens differ between men and women has long been recognized in the medical literature (Grossman 1985). Men generally tend to exhibit weaker immune responses than women. This makes them more susceptible to contracting infectious diseases and to subsequently suffering more severe disease outcomes. In this section we summarize the key evidence regarding the different ways in which men and women are affected by infectious diseases and how they respond to vaccination. These differences across genders are explored later in our empirical analysis.Footnote 1

The greater susceptibility of men to infectious diseases compared to women has been documented in a large number of clinical studies, and it is often referred to in the literature as infectious diseases exhibiting a ‘male bias’ (Klein 2004). Recent survey articles reviewing these studies highlight how the male bias applies to a wide range of infectious diseases (Muenchhoff and Goulder 2014; Giefing-Kroell et al. 2015). For some diseases, such as tuberculosis, this bias is so pronounced that incidence among males is almost twice that among females (World Health Organization 2019). The male bias is also evident when looking at mortality from infectious diseases. This was first shown by Owens (2002) using data from the United States and later by Lozano et al. (2012) based on vital statistics from 187 countries reported in the Global Burden of Disease Study. Comparing gender-specific mortality rates for 235 causes of death between 1990 and 2010, Lozano et al. report that mortality rates are on average 13% higher for males than for females across the 27 most important infectious diseases.

While traditionally some authors have attributed the higher prevalence of infectious diseases among males to behavioral or environmental factors, recent studies have cast doubt on the relative importance of these factors (Borgdorff et al. 2000; Guerra-Silveira and Abad-Franch 2013). Instead, there is increasing evidence that attributes the male bias to physiological differences related to sex hormones and chromosomes. In particular, the fact that females have two X chromosomes, a maternal and a paternal one with a varying pattern of expression, contributes to a biological advantage due to the process of chromosome inactivation (Migeon 2006; Libert et al. 2010). The role of sex hormones, on the other hand, is evident when comparing the male-to-female ratios of incidence and mortality rates at different ages (Guerra-Silveira and Abad-Franch 2013; Giefing-Kroell et al. 2015). These ratios have been shown to peak during puberty and reproductive ages, when the variation in sex hormone levels between males and females is the highest. The conclusion that the male bias in disease susceptibility and mortality from infectious diseases is related to variation in sex hormones has also been strongly supported by laboratory experiments with animals and specific case studies with humans.Footnote 2, Footnote 3

These biological differences between men and women are not limited to infectious disease outcomes. They have also been shown to affect acquired immunity levels following vaccination. As has been documented in a series of studies for a variety of vaccines (Stanberry et al. 2002; Ovsyannikova et al. 2004; Kennedy et al. 2009), vaccine efficacy tends to be higher among women. This means that the relative reduction in disease susceptibility of a vaccinated group of individuals, compared to a non-vaccinated one, is larger for women. While early studies focused on the cases of particular vaccines, more recent work has established that females generally exhibit stronger antibody responses following vaccination, as highlighted in the review articles by Cook (2008) and Klein et al. (2010).
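As a brief illustration of the efficacy notion described above, the sketch below computes vaccine efficacy as the relative reduction in the attack rate of a vaccinated group compared to an unvaccinated one. The function name and all attack rates are hypothetical and chosen only to make the arithmetic visible; they are not taken from any of the cited studies.

```python
def vaccine_efficacy(attack_rate_vaccinated, attack_rate_unvaccinated):
    """Relative reduction in disease risk among the vaccinated.

    Both arguments are shares of the group that contract the disease.
    """
    if attack_rate_unvaccinated <= 0:
        raise ValueError("attack rate of the unvaccinated group must be positive")
    return 1.0 - attack_rate_vaccinated / attack_rate_unvaccinated


# Hypothetical attack rates (illustrative assumptions, not data):
# unvaccinated women and men both face a 10% risk, while vaccinated
# women face 2% and vaccinated men 4%.
efficacy_women = vaccine_efficacy(0.02, 0.10)
efficacy_men = vaccine_efficacy(0.04, 0.10)

print(f"women: {efficacy_women:.2f}, men: {efficacy_men:.2f}")
```

With these made-up inputs the same vaccine yields a higher efficacy for women than for men, which is the kind of gap the studies cited above report.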

These gender differences in vaccine efficacy have also been shown to be substantial. For example, Engler et al. (2008) report that in the case of influenza the antibody response of females to a half dose of the vaccine is comparable with the antibody response of males to the full dose. What is important to note here is that this pattern is not due to differences in the vaccine doses administered to men and women. In fact, the standard medical practice is to administer the same dose universally (Poland et al. 2011).

Just as in the case of the male bias in disease susceptibility, there is evidence that the stronger female immune response to vaccines can be attributed to sex hormones. Specifically, estrogens have been shown to stimulate the activity of immune cells while testosterone has been shown to suppress it (Furman et al. 2013; Sakiani et al. 2013).Footnote 4 The importance attributed to the role of sex hormones is also reflected in the common finding that immune responses to vaccines are similar for pre-pubertal boys and girls (Davidkin et al. 1995; Wu et al. 1999). It is also consistent with a stronger female immune response among young adults (van der Wielen et al. 2006; Hoehler et al. 2007), which weakens above the age of 60 (Wolters et al. 2003; Cook et al. 2006). Beyond the role played by sex hormones, differences in vaccine efficacy between males and females have also been linked to genetic factors (Fish 2008; Poland et al. 2011).

These well-established gender differences in the effectiveness of vaccines against infectious diseases contrast with the evidence on drugs and other methods of disease control. In their analysis of the effectiveness of 113 drugs, Simonovsky et al. (2019) find no systematic differences between men and women. Only for drugs acting on the central nervous system, such as anti-psychotic drugs and antidepressants, and for beta-blockers, which reduce the heart rate and systolic blood pressure, is there weak evidence of more beneficial effects in women than in men (Franconi et al. 2007). For antivirals, antibiotics and other drugs acting on the immune system, though, there is no evidence of clear differences in effectiveness between men and women.

Overall, these medical studies underscore an important pattern: immune responses of men and women to infectious diseases differ, and this is true both for exposure to naturally occurring pathogens and for exposure to the associated vaccines. These differences appear to be rooted largely in biological differences related to sex hormones and chromosomes. This suggests that public health interventions to control infectious diseases are expected to trigger larger mortality reductions among females than among males.Footnote 5 This should be the case particularly for vaccination campaigns against infectious diseases, which should be more effective among females based on the aforementioned evidence. Yet, this should not necessarily apply to other public health campaigns aimed at controlling infectious diseases which do not rely on vaccination.

Looking at historical mortality data from the United States provides some first evidence in line with this prediction. Figure 3a displays the evolution of the population-wide mortality rates over the course of the twentieth century for two groups of infectious and parasitic diseases: vaccine-preventable diseases for which vaccines played a role as a method of control, and non-vaccine-preventable diseases for which vaccines did not play a role as a method of control. Figure 3b displays for the same two groups of diseases the ratio of female mortality rates relative to the corresponding rates for males.

Fig. 3 (a) Mortality rates from vaccine- and non-vaccine-preventable diseases. Notes: This figure depicts the evolution of mortality rates from two groups of infectious and parasitic diseases in the U.S., calculated based on data from the U.S. Vital Statistics. The rates reflect mortality of the entire U.S. population. The group of vaccine-preventable diseases includes diphtheria, influenza, measles, pneumonia, tuberculosis, smallpox and whooping cough. The group of non-vaccine-preventable diseases includes cholera, diarrhea, malaria, plague, scarlet fever, typhoid fever and typhus.
(b) Gender differences in mortality from vaccine- and non-vaccine-preventable diseases. Notes: This figure depicts the female-to-male ratio of mortality rates from two groups of infectious and parasitic diseases in the U.S., calculated based on data from the U.S. Vital Statistics. The rates reflect mortality of the total female and male U.S. population. The group of vaccine-preventable diseases includes diphtheria, influenza, measles, pneumonia, tuberculosis, smallpox and whooping cough. The group of non-vaccine-preventable diseases includes cholera, diarrhea, malaria, plague, scarlet fever, typhoid fever and typhus.

As Fig. 3a highlights, mortality rates from both groups of infectious diseases declined dramatically during the twentieth century. As Fig. 3b reveals, though, the evolution of the female-to-male ratio of mortality rates from these two groups of infectious diseases was different. Starting around the 1930s, female mortality from vaccine-preventable infectious diseases clearly fell more than male mortality, while a relative decrease of this sort is not visible for non-vaccine-preventable infectious diseases.Footnote 6 In the following section we describe how we exploit the role of vaccines versus other methods of disease control for different infectious diseases in the context of our empirical strategy.

3 Empirical strategy

Our empirical analysis builds on the approach of Becker et al. (2005) and Acemoglu and Johnson (2007). These authors investigate the country-wide effects of the large health improvements that took place over the second half of the twentieth century following the IET. These improvements were triggered by the global spread of western medical innovations after World War II, which led to a more effective control of infectious diseases and a sharp decline in mortality rates from these diseases all over the world. As a result, countries where the mortality burden from infectious diseases was highest prior to the IET benefited the most in terms of mortality reductions from the new methods of disease control. This observation permits an identification strategy, originally proposed by Acemoglu and Johnson, for estimating the effects of the IET-related health improvements. The strategy is implemented as an instrumental variables regression in which potential changes in mortality from infectious diseases, determined by the initial variation in mortality rates, serve as an instrument for the actual health improvements that took place in each country.
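The logic of this instrumental variables approach can be illustrated with a deliberately simplified sketch. It uses one instrument, one endogenous regressor and four made-up countries observed in long differences, so it is not the specification estimated in the paper. All variable names and numbers are hypothetical. An unobserved confounder raises both life expectancy and schooling and is uncorrelated with the instrument by construction, so OLS is biased while the IV ratio recovers the assumed effect.

```python
from statistics import mean


def cross_dev(a, b):
    """Sum of cross-products of deviations from the mean (covariance numerator)."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))


# Hypothetical long differences (1980 minus 1940) for four countries.
d_pred_mortality = [-8.0, -6.0, -4.0, -2.0]  # instrument: predicted mortality change
confounder = [1.0, -1.0, -1.0, 1.0]          # unobserved; orthogonal to the instrument
TRUE_ALPHA = 0.2                             # assumed effect of life expectancy on schooling

# Life expectancy rises where predicted mortality falls, plus the confounder.
d_life_exp = [2.0 - z + u for z, u in zip(d_pred_mortality, confounder)]
# Schooling responds to life expectancy and, separately, to the confounder.
d_schooling = [0.5 + TRUE_ALPHA * x + 0.5 * u for x, u in zip(d_life_exp, confounder)]

# OLS slope: contaminated by the confounder.
alpha_ols = cross_dev(d_life_exp, d_schooling) / cross_dev(d_life_exp, d_life_exp)
# IV (ratio) estimator: uses only variation induced by the instrument.
alpha_iv = cross_dev(d_pred_mortality, d_schooling) / cross_dev(d_pred_mortality, d_life_exp)

print(f"OLS: {alpha_ols:.3f}, IV: {alpha_iv:.3f}, assumed effect: {TRUE_ALPHA}")
```

The constant terms in the two generating equations play the role of common time effects and do not affect either slope.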

Our approach builds on this identification strategy, but differs from previous contributions in the literature as we consider the gender-specific nature of the IET-related health improvements and their effects on educational attainment. In order to do so, we take into consideration whether the initial mortality environment in a given country was dominated by infectious diseases that were subsequently controlled by vaccination or by other methods. This distinction allows us to explore the patterns discussed in the previous section regarding how different public health interventions aimed at controlling infectious diseases can have different effects on males and females depending on the method of disease control. Specifically, in countries where the mortality environment prior to the IET was dominated by infectious diseases that were at least partially brought under control with the introduction of new or improved vaccines, we would expect to see bigger health improvements for females than for males. In contrast, in countries where the mortality environment prior to the IET was dominated by infectious diseases that were brought under control thanks to other medical innovations, we would expect to see similar improvements in female and male health.

As our interest is to estimate the effects of these health improvements on educational attainment, we follow an estimation strategy similar to Hansen (2013). We use life expectancy at birth as a broad measure of health and average years of schooling as a proxy for educational attainment. We focus on long-run changes in these variables and we estimate how they relate to each other for men and women in a long-differences panel with two time periods, before and after the IET. For most of our analysis, we take these two time periods to be 1940 and 1980. Specifically, our main estimation equation is:

$$\begin{aligned} AYS_{gct}=\alpha \cdot LE_{gct}+\mu _{gc}+\gamma _{t}+u_{gct}. \end{aligned}$$
(1)

\(AYS_{gct}\) denotes the average years of schooling of a given cohort of gender g in country c which started school in year t and \( LE_{gct}\) denotes the life expectancy at birth of gender group g in country c in year t. The specification includes gender-country fixed effects, \(\mu _{gc},\) and year fixed effects \(\gamma _{t}.\) Thus, our fixed-effects panel regression with two time periods is equivalent to a specification in first differences. Conditional on these fixed effects, a positive \(\alpha \) coefficient would suggest that life expectancy increases between 1940 and 1980, which reflect improvements in the general health of the population, were associated with increases in educational attainment for men and women.
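As a brief sketch of why this equivalence holds, let \(\Delta \) denote the change between 1940 and 1980 (a shorthand introduced here only for illustration). Differencing Eq. (1) across the two periods for a given gender-country pair gives

$$\begin{aligned} \Delta AYS_{gc}=\alpha \cdot \Delta LE_{gc}+\Delta \gamma +\Delta u_{gc}, \end{aligned}$$

where \(\Delta \gamma =\gamma _{1980}-\gamma _{1940}\). The fixed effect \(\mu _{gc}\) cancels, and the difference in year effects acts as a common intercept, so \(\alpha \) is identified from how changes in schooling vary with changes in life expectancy across gender-country pairs.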

To account for the possible endogeneity bias in the estimation of \(\alpha ,\) we employ the identification strategy described above and estimate Eq. (1) with two-stage least squares (2SLS). Specifically we instrument \(LE_{gct}\) based on the first-stage specification:

$$\begin{aligned} LE_{gct}=\beta \cdot \underset{d}{\sum }M_{dct}+\beta ^{f}\cdot I_{g}^{f}\cdot \underset{d}{\sum }M_{dct}+\eta _{gc}+\delta _{t}+\varepsilon _{gct}. \end{aligned}$$
(2)

The subscripts g, c and t again denote gender, country, and year, while the subscript d denotes different infectious diseases. \(M_{dct}\) is the potential mortality rate from infectious disease d in country c and year t given the state of the available medical technology. Following Acemoglu and Johnson (2007), we take \(M_{dct}\) in 1940 to be equal to the actual country-specific mortality rates in that year, before the IET, and in 1980 to the mortality rates at the global health frontier. As \(M_{dct}\) takes the same values for all countries in 1980, changes in this variable between 1940 and 1980 do not capture the actual changes in mortality from disease d. Instead they reflect the potential changes in mortality from a given infectious disease that could be achieved as a consequence of the IET-related medical innovations and their global diffusion after 1940. Thus, changes in \(M_{dct}\) are treated as a predictor for the actual changes in mortality from infectious diseases that occurred across countries between 1940 and 1980 and this predictor is used as an instrument for the associated changes in life expectancy.
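The construction of the potential mortality variable described above can be sketched as follows. The country labels, disease names and all rates are hypothetical placeholders, and the frontier rates are set to zero purely as an illustrative assumption. The point is only that the 1940 values differ across countries while the 1980 values do not, so the change is driven entirely by initial mortality.

```python
# Hypothetical 1940 mortality rates by country and disease
# (illustrative numbers, not data from the paper).
mortality_1940 = {
    "country_A": {"tuberculosis": 0.20, "malaria": 0.10},
    "country_B": {"tuberculosis": 0.05, "malaria": 0.01},
}

# Assumed frontier mortality rates in 1980, identical for all countries.
frontier_1980 = {"tuberculosis": 0.0, "malaria": 0.0}


def potential_mortality(country, year):
    """Sum over diseases of the potential mortality rate M_dct."""
    if year == 1940:
        return sum(mortality_1940[country].values())
    if year == 1980:
        return sum(frontier_1980.values())
    raise ValueError("only 1940 and 1980 are defined in this sketch")


# Change in potential mortality between the two periods, by country.
potential_change = {
    country: potential_mortality(country, 1980) - potential_mortality(country, 1940)
    for country in mortality_1940
}

for country, change in potential_change.items():
    print(country, round(change, 2))
```

In this toy example the country with the higher initial burden has the larger potential reduction, which is the source of variation that the instrument exploits.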

To assess the role played by the method of disease control, the mortality rates from different infectious diseases are in some cases summed up altogether, as in Eq. (2), and in some cases combined into two groups, as in Eq. (3). The two groups correspond to a group of vaccine-preventable diseases which we denote by VP and refer to simply as the VP group, and a group of non-vaccine-preventable diseases, which we denote by NVP and refer to as the NVP group. These mortality rates are interacted with a dummy variable for all female observations, \(I_{g}^{f},\) in order to estimate the differential effects of the potential mortality reductions across genders. The specification also includes gender-country and year fixed effects in line with Eq. (1).

$$\begin{aligned} LE_{gct}&= \beta ^{VP}\cdot \underset{d\in VP}{\sum }M_{dct}+\beta ^{VPf}\cdot I_{g}^{f}\cdot \underset{d\in VP}{\sum }M_{dct} \nonumber \\&\quad +\,\beta ^{NVP}\cdot \underset{d\in NVP}{\sum }M_{dct}+\beta ^{NVPf}\cdot I_{g}^{f}\cdot \underset{d\in NVP}{\sum }M_{dct}+\eta _{gc}+\delta _{t}+\varepsilon _{gct}. \end{aligned}$$
(3)
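To make the right-hand side of Eq. (3) concrete, the sketch below builds the four mortality regressors for a single gender-country-year observation. The disease lists are borrowed from the notes to Fig. 3 as an assumption; the grouping used in the regressions may differ. The function name, dictionary keys and rates are hypothetical.

```python
# Disease groups, assumed here to match the lists in the notes to Fig. 3.
VP_DISEASES = {"diphtheria", "influenza", "measles", "pneumonia",
               "tuberculosis", "smallpox", "whooping cough"}
NVP_DISEASES = {"cholera", "diarrhea", "malaria", "plague",
                "scarlet fever", "typhoid fever", "typhus"}


def first_stage_regressors(mortality_by_disease, is_female):
    """Return the four mortality terms of Eq. (3) for one observation."""
    m_vp = sum(rate for disease, rate in mortality_by_disease.items()
               if disease in VP_DISEASES)
    m_nvp = sum(rate for disease, rate in mortality_by_disease.items()
                if disease in NVP_DISEASES)
    female = 1 if is_female else 0  # the dummy I_g^f
    return {
        "M_VP": m_vp,
        "M_VP_x_female": female * m_vp,
        "M_NVP": m_nvp,
        "M_NVP_x_female": female * m_nvp,
    }


# Hypothetical potential mortality rates for one country and year.
rates = {"tuberculosis": 0.20, "measles": 0.05, "malaria": 0.10}
female_row = first_stage_regressors(rates, is_female=True)
male_row = first_stage_regressors(rates, is_female=False)
print(female_row)
print(male_row)
```

The interaction terms are zero for male observations, so their coefficients capture only the additional effect for females.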

As larger changes in \(M_{dct}\) over time indicate greater potential reductions in mortality from infectious diseases, they should be associated with larger increases in life expectancy. Thus, we would expect both \(\beta ^{VP}\) and \(\beta ^{NVP}\) to be negative. Comparing these effects between males and females, we would expect the interaction coefficient \(\beta ^{VPf}\) to be negative. This is because, in light of the evidence from the medical literature discussed in Sect. 2, reductions in potential mortality rates due to the introduction of vaccines are bound to increase life expectancy for women more than for men. At the same time, we expect the interaction coefficient \(\beta ^{NVPf}\) to be zero, as there is no evidence that reductions in potential mortality rates due to other methods of disease control have differential effects on male and female life expectancy.

The unbiased estimation of the key coefficients of interest in our empirical setup requires the exogeneity of changes in the potential mortality rates, \( M_{dct}\), for the different groups of infectious diseases over our sample period. In that respect, the crucial assumption is that the initial mortality rates, which reflect the mortality environment in each country prior to the IET, are uncorrelated with other time-varying country- and gender-specific characteristics that influence education through channels other than life expectancy. The validity of this assumption is investigated as part of our robustness analyses. There we control for several time-varying correlates of male and female health and educational attainment, the omission of which could bias our results. Other factors giving rise to persistent differences in the level of educational attainment of males and females in a given country are not explicitly controlled for in our regression, as they will be filtered out by the gender-country fixed effects.

4 Data

For our main regression analysis we use a panel data set covering 75 countries over two time periods.Footnote 7 As already mentioned in the previous section, the two time periods correspond to the years 1940 and 1980 in most regressions.Footnote 8 We focus on changes between these two time periods in order to assess the impact of the medical innovations associated with the IET on life expectancy and schooling before the start of the global HIV/AIDS epidemic, as in Acemoglu and Johnson (2007), Cervellati and Sunde (2011) and Hansen (2013). As we are interested in the extent to which these medical advances had different effects on the outcomes of males and females, our data set includes for each country and year two observations, one for the female population and one for the male population. This leads to a total sample size of 300 observations.

Section A of the appendix lists the 75 countries that are included in our main sample. They cover all regions of the world well, with the exception of Sub-Saharan Africa, for which we have data for only two countries. In this section, we briefly describe the key variables of interest, namely average years of schooling, life expectancy at birth and mortality rates for different infectious diseases. Further information on the data sources for these variables and for all additional data that we employ in our analysis can be found in Section B of the appendix.

To measure educational attainment, we use the average years of schooling data provided by Barro and Lee (2013), which are gender- and cohort-specific. Following the approach of Hansen (2013), we focus on the cohort of individuals that were between 5 and 9 years old in the two respective time periods (1940, 1980) and measure their educational attainment 10 years later (1950, 1990). These are the cohorts of boys and girls that started their formal schooling between 1937 and 1941, and between 1977 and 1981, respectively.Footnote 9 By comparing them we can assess how educational attainment in different countries was affected by the life expectancy improvements that resulted from the IET.

Data on life expectancy at birth in 1940 are drawn mainly from the various editions of the UN Demographic Yearbook. These are supplemented with additional sources for selected countries, as explained in Section B of the appendix. The respective information for 1980 is obtained from the electronic version of the World Population Prospects database of the UN Population Division. We furthermore collect information on life expectancy at higher ages. The sources for these data are the same as for life expectancy at birth, namely UN Demographic Yearbooks for 1940 and World Population Prospects for 1980.

The information on mortality rates from infectious diseases that we use in our main analysis is drawn from Acemoglu and Johnson (2007). The authors report disease-specific mortality rates for 13 infectious diseases in 1940. In line with our empirical strategy described in Sect. 3, we either sum up the mortality rates for all these diseases or consider separately the sum of mortality from diseases that fall in the VP and NVP groups. These groups are defined based on the role that vaccines played as a method of disease control during the post-war era. The VP group of vaccine-preventable diseases includes the seven diseases for which vaccines played at least a partial role as a method of disease control. These are: diphtheria, influenza, measles, pneumonia, smallpox, tuberculosis, whooping cough. The NVP group of non-vaccine-preventable diseases includes the remaining six diseases for which vaccines were not important as a method of control. These are: cholera, malaria, plague, scarlet fever, typhoid fever, typhus. Section C of the appendix provides detailed information on the characteristics of each disease and the key methods of control based on which we assign the 13 diseases to these two groups.Footnote 10

For the year 1980 we do not collect data on disease-specific mortality rates for all countries. This is because our potential mortality instrument should not reflect the actual country-specific mortality environment in that period, but the conditions at the global health frontier. In particular, for our main analysis we follow Acemoglu and Johnson (2007) and assume that the frontier mortality rates in 1980 were zero for all infectious diseases. As an alternative to this approach, we assume that all countries experienced the same proportionate reductions in mortality between 1940 and 1980, rather than reaching the same level in 1980. In this case, the potential mortality rate for each disease in 1980 is taken to be equal to the country-specific rate in 1940 scaled down by the average rate at which mortality for that disease fell at the global level. As a second alternative we assume that the mortality rates for all countries in 1980 were equal to the values observed in the United States in that year, which were close to, but not equal to zero. As a third alternative we assume that the mortality rates for all countries in 1980 were equal to the average values observed in other countries at the health frontier in that year.Footnote 11 Constructed as the difference between the country-specific mortality rates in 1940 and the rates at the global health frontier in 1980, our mortality figures reflect the potential changes in mortality rates from infectious diseases that could have been achieved in each country following the IET. These can function as an instrument for the actual changes in life expectancy. This is because potential changes in mortality rates are correlated with actual changes in mortality rates, which in turn determine the evolution of life expectancy, but are not affected by it.
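The four constructions of the instrument can be sketched for a single disease as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the country labels, the 1940 rates, the global decline, and the two frontier rates are all hypothetical numbers, and the function name is invented.

```python
def potential_mortality_change(m_1940, frontier_1980):
    """Potential reduction in mortality: 1940 country rate minus the assumed 1980 rate."""
    return m_1940 - frontier_1980


# Hypothetical 1940 mortality rates for one disease in three countries
# (deaths per 100,000 people); none of these values come from the paper.
m_1940 = {"A": 120.0, "B": 60.0, "C": 15.0}

# Hypothetical inputs for the three alternative assumptions.
global_decline = 0.9      # share by which mortality from this disease fell worldwide
us_rate_1980 = 1.5        # U.S. mortality rate for this disease in 1980
frontier_avg_1980 = 2.0   # average rate in other health-frontier countries in 1980

# Baseline: frontier mortality in 1980 is zero for every country.
baseline = {c: potential_mortality_change(m, 0.0) for c, m in m_1940.items()}

# Alternative 1: same proportionate decline everywhere, so the assumed 1980 rate
# is the country's own 1940 rate scaled down by the global decline.
proportional = {
    c: potential_mortality_change(m, m * (1.0 - global_decline))
    for c, m in m_1940.items()
}

# Alternative 2: all countries reach the U.S. rate observed in 1980.
us_frontier = {c: potential_mortality_change(m, us_rate_1980) for c, m in m_1940.items()}

# Alternative 3: all countries reach the average rate of other frontier countries.
other_frontier = {
    c: potential_mortality_change(m, frontier_avg_1980) for c, m in m_1940.items()
}

for name, values in [("baseline", baseline), ("proportional", proportional),
                     ("US frontier", us_frontier), ("other frontier", other_frontier)]:
    print(name, values)
```

Under the baseline the instrument equals the 1940 rate itself, whereas under the proportional alternative it is a fixed fraction of that rate, so the two differ only by a disease-specific scaling factor.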

We should note here that the figures we have for potential mortality are not gender-specific, but refer to the total population of each country. Neither Acemoglu and Johnson (2007) nor the original sources of their data provide any gender-specific mortality rates for our sample of countries. When estimating our first-stage regression specifications, therefore, we assign the same figure to the male and female observations in each respective country and only allow their effect on life expectancy to differ across genders, as indicated in Eqs. (2) and (3). For our robustness analysis in Sect. 6.2, however, we do employ gender-specific mortality rates from infectious diseases in 1940, which we impute from information available for selected countries. This allows us to compare the estimated effects based on gender-specific and non-gender-specific mortality rates. We perform a similar comparison when we conduct our analysis based on data from U.S. states in Sect. 8, for which we have gender-specific mortality rates.

Table 2 shows the descriptive statistics for all key variables. For average years of schooling and life expectancy at birth the statistics are reported separately for males and females. As these figures clearly indicate, both schooling and life expectancy increased substantially between 1940 and 1980 and these increases were larger for females than for males. The exact nature of this relationship is what we investigate in the next section.

Table 2 Descriptive statistics of key variables

5 Baseline regression results

Table 3 presents the results from the estimation of our main specification with 2SLS. Panel A shows the results of the first-stage estimation based on variants of Eqs. (2) and (3), while panel B shows the results of the corresponding second-stage estimation of Eq. (1). Standard errors, reported in brackets, are clustered at the gender-country level in line with the employed fixed effects.

Table 3 Life expectancy and educational attainment: baseline estimates

In column 1 we present a simple variant of the first-stage specification where we omit the interaction term with the female dummy and we do not split the mortality rates into different disease groups. Instead we look at the overall effect of the potential mortality reductions from the 13 infectious diseases on life expectancy at birth. As the estimated coefficient indicates, the improvements in the mortality environment that took place globally between 1940 and 1980 as a result of the IET had a large and statistically significant effect on life expectancy. The coefficient implies that on average the mortality reductions contributed to an increase in life expectancy at birth of 7 years, which is similar to the magnitude reported by Acemoglu and Johnson (2007).

In column 2 we allow the effect of the potential mortality reductions to vary between males and females by including in the specification the interaction term between potential mortality and the female dummy. The estimated coefficient for the interaction term is negative and statistically significant, indicating that female life expectancy, on average, rose faster between 1940 and 1980 than male life expectancy in response to the same improvements in the mortality environment. Specifically, the estimated coefficients of \(-12.9\) and \(-4.9\) imply that male life expectancy increased on average by 5.6 years and female life expectancy increased by 7.8 years as a consequence of the IET-related medical advances. This corresponds to 33% and 39% of the actual life expectancy increases for males and females respectively observed on average over this period in our sample of countries.

When interpreting these estimates, it is important to keep in mind that the relatively larger estimated effect for females implies that female life expectancy increased more in response to the same potential, rather than the same actual, reduction in mortality. The actual changes in mortality rates for males and females are by construction already reflected in the corresponding life expectancy figures. The potential changes in mortality rates, however, will only be partially reflected in life expectancy, as by 1980 not all countries in our sample had achieved zero mortality rates from these 13 major infectious diseases. Still, the fact that female life expectancy gains were systematically higher across countries suggests that some of the IET-related medical advances clearly benefited women more than men.

To understand better the source of these stronger life expectancy gains for females, in column 3 we estimate the first-stage specification distinguishing between potential mortality reductions in terms of vaccine-preventable diseases (VP group) and non-vaccine-preventable diseases (NVP group). We also allow the effects of these reductions to vary across genders. As we can see from the estimated coefficients, the potential reductions in mortality from both groups of diseases were associated with significant increases in life expectancy. Moreover, as the interaction terms with the female dummy reveal, the same potential reductions in mortality were associated with significantly larger increases in female life expectancy than in male life expectancy only in the case of vaccine-preventable diseases. For non-vaccine-preventable diseases the corresponding interaction coefficient is statistically insignificant. This result is not surprising given the extensive evidence from the medical literature, summarized in Sect. 2, regarding the higher efficacy of vaccines among females. Of all IET-related medical advances, vaccines are the ones that most likely benefited women more than men. As a consequence, countries where the IET-related mortality reductions were more closely related to the introduction and diffusion of vaccines are the ones expected to experience relatively larger increases in female life expectancy.

Interpreting the magnitudes of the estimated coefficients in column 3, we find that male life expectancy rose on average by 4.2 years and female life expectancy by 6 years, as a consequence of the improved control methods of vaccine-preventable diseases. This means that the effect for females is 43% higher than that for males. Looking at the corresponding magnitudes for non-vaccine-preventable diseases instead, we find that the potential mortality reductions were associated with increases in life expectancy on average by 1.5 years for females and by 1.2 years for males, with the difference between the two being statistically insignificant.Footnote 12 Taken together, these figures imply that out of the actual differential life expectancy gain of 2.76 years between men and women observed on average in our sample, 2.2 years can be explained by the overall reductions in mortality from infectious diseases and 1.8 years can be explained solely by the reductions in mortality from vaccine-preventable diseases.
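The decomposition in this paragraph follows from simple arithmetic on the implied gains reported in the text. The short sketch below reproduces it using only those rounded figures; the variable names are chosen for illustration, and the shares of the observed gap are derived here rather than reported by the authors.

```python
# Implied life expectancy gains in years, as reported in the text.
gain_all = {"male": 5.6, "female": 7.8}  # all 13 infectious diseases (column 2)
gain_vp = {"male": 4.2, "female": 6.0}   # vaccine-preventable diseases (column 3)
actual_gap_change = 2.76                 # observed differential gain, females minus males

# Differential gains of females relative to males.
diff_all = gain_all["female"] - gain_all["male"]
diff_vp = gain_vp["female"] - gain_vp["male"]

# Relative size of the female VP effect compared with the male VP effect.
relative_vp = gain_vp["female"] / gain_vp["male"] - 1.0

# Shares of the observed differential gain accounted for by each component.
share_all = diff_all / actual_gap_change
share_vp = diff_vp / actual_gap_change

print(f"differential gain, all diseases: {diff_all:.1f} years")
print(f"differential gain, VP diseases:  {diff_vp:.1f} years")
print(f"female VP effect relative to male: {relative_vp:.0%} higher")
print(f"share of observed gap, all diseases: {share_all:.0%}")
print(f"share of observed gap, VP diseases:  {share_vp:.0%}")
```

Because the inputs are rounded to one decimal place, the derived shares should be read as approximate.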

In columns 4, 5 and 6 we present the estimation results for the same regression specification when using instead the alternative potential mortality instruments. These are constructed assuming that potential mortality rates in 1980 are not zero, but follow the alternative assumptions described in Sect. 4. Specifically for the estimation in column 4 we assume that mortality rates in all countries fell proportionately to the global average, for the estimation in column 5 we assume that mortality rates fell to the levels observed in the United States in 1980 and for the estimation in column 6 we assume that mortality rates fell to the average levels observed in other health-frontier countries in 1980. In all cases the estimation results are very similar to the results reported in column 3. Together they further corroborate the clearly stronger response of female life expectancy to mortality improvements related to vaccination, but not to other methods of disease control.

Turning to panel B of Table 3, we can see the second-stage estimates of the 2SLS estimations that correspond to the first-stage estimates described above. The results in all cases are very similar. Irrespective of the exact setup employed in the first stage, the second-stage estimates suggest that improvements in life expectancy between 1940 and 1980 led to statistically significant increases in average years of schooling. Also when estimating Eq. (1) with ordinary least squares (OLS), as reported in column 7, we obtain a positive coefficient of similar magnitude. The last row of the table shows the first-stage effective F-statistic proposed by Montiel-Olea and Pflueger (2013), which is appropriate for our panel setup with clustered standard errors. Looking at the critical values of the test suggests that our regressions do not suffer from a weak instruments problem as the resulting bias of the 2SLS estimates relative to OLS is always below 20%.Footnote 13

While in our second-stage estimation we obtain a common life-expectancy coefficient for both males and females, our estimates still imply that the mortality reductions associated with the IET gave rise to larger increases in average years of schooling for females than for males.Footnote 14 This is because in the first-stage estimation we have already shown that the same potential mortality reductions from the diffusion of IET-related medical advances led to larger gains in life expectancy for females than for males. In particular, the coefficient estimate of 0.115 in column 2, in combination with the changes in life expectancy predicted from the first-stage regression, implies increases in schooling, on average, of 0.65 years for males and 0.9 years for females. This corresponds to a reduction of 0.25 years in the education gender gap, which is 80% of the actually observed reduction over our sample period.
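As a rough check of these magnitudes, the second-stage coefficient can be multiplied by the predicted life expectancy gains of 5.6 and 7.8 years reported for column 2 of panel A. The calculation below uses only these rounded figures, so it matches the reported values only up to rounding. The symbols \(\Delta S^{m}\) and \(\Delta S^{f}\) are shorthand introduced for this illustration and denote the implied changes in average years of schooling for males and females.

$$\begin{aligned} \Delta S^{m}&\approx 0.115\times 5.6\approx 0.64, \\ \Delta S^{f}&\approx 0.115\times 7.8\approx 0.90, \\ \Delta S^{f}-\Delta S^{m}&\approx 0.115\times 2.2\approx 0.25. \end{aligned}$$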

An alternative way to assess the differential effect of the IET-related medical advances on male and female schooling is to estimate the reduced-form relationship between our potential mortality instrument and average years of schooling. This is done in columns 8, 9 and 10 of Table 3. In column 8 we present the reduced-form regression using potential mortality from all infectious diseases, in column 9 we interact potential mortality from all diseases with the female dummy and in column 10 we further distinguish potential mortality stemming from the VP and NVP groups of diseases. In all cases the obtained reduced-form estimates confirm the conclusions that emerge from the 2SLS results. Reductions in potential mortality are associated with increases in schooling overall and these increases are higher for women than for men. Moreover, this differential effect appears to be driven by mortality reductions from VP diseases. The effect of mortality reductions from NVP diseases is not only statistically insignificant, but also quantitatively small when evaluated in terms of its implied magnitude.Footnote 15

Comparing the magnitudes of our estimates with previous work in the literature is also reassuring. Our second-stage estimates suggest that one extra year of life led to an increase in schooling by 0.115 years. This is almost identical to the effect size of 0.11 reported by Hansen (2013), who estimates a similar specification over the same time period, but looks at average years of schooling for the whole population, without distinguishing between men and women. Similar effects are also found by Jayachandran and Lleras-Muney (2009), who report effect sizes between 0.11 and 0.15 years of schooling for each additional year of life from improvements in maternal mortality.

Alternatively, we can look at the implied elasticities for the response of schooling to changes in life expectancy. These elasticities are found to be between 0.6 and 1 by Jayachandran and Lleras-Muney (2009) based on data from Sri Lanka, and between 0.8 and 1.3 by Oster et al. (2013) based on data from the United States.Footnote 16 Given the initial levels of life expectancy and schooling in our cross-country sample, we find that on average mortality reductions associated with the IET increased life expectancy by 16% for women and 12% for men. This in turn resulted in a 21% increase in average years of schooling for females and a 14% increase for males, which implies an elasticity of 1.18 for males and 1.34 for females.
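Written out, each elasticity is the ratio of the percentage change in schooling to the percentage change in life expectancy. Using the rounded percentages quoted above gives values close to, but slightly below, the reported ones, presumably because the reported elasticities are based on unrounded changes. The symbols \(\epsilon ^{m}\) and \(\epsilon ^{f}\) are notation introduced for this illustration only.

$$\begin{aligned} \epsilon ^{m}&=\frac{\%\Delta S^{m}}{\%\Delta LE^{m}}\approx \frac{14}{12}\approx 1.17, \\ \epsilon ^{f}&=\frac{\%\Delta S^{f}}{\%\Delta LE^{f}}\approx \frac{21}{16}\approx 1.31. \end{aligned}$$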

6 Robustness checks

Having demonstrated in our baseline regressions the quantitative importance and the statistical significance of the link between the differential health improvements across genders associated with the IET and the evolution of the education gender gap, we proceed in this section to establish the robustness of our findings. For this purpose, we first carefully check our first-stage estimation by comparing mortality rates from different groups of infectious diseases, by controlling for mortality rates from other important causes of death and by employing gender-specific mortality rates. We then scrutinize our second-stage estimation by contrasting the effects for different cohorts and alternative measures of educational attainment. Further robustness checks related to the composition of our country sample, the employed regression specification and the time periods of our analysis are provided in the appendix.

6.1 Different groupings of infectious diseases

If improvements in mortality due to the introduction of vaccines led to larger life expectancy gains for females than for males, as established in our first-stage estimation, then this pattern should be observable with mortality rates for individual diseases as well as for sub-groups of diseases. With that in mind, in columns 1, 2 and 3 of Table 4 we repeat our first-stage estimation focusing on potential changes in mortality from the three most important infectious diseases of that time: malaria, pneumonia and tuberculosis. Doing so is instructive as these three diseases together account for 87% of mortality from the 13 infectious diseases in 1940. In each of the three columns we report the effect on life expectancy of the potential changes in mortality from one of the three diseases, as indicated in the top part of the table, while controlling at the same time for the potential changes in mortality from the remaining 12 diseases with the residual mortality variable.

Table 4 Robustness checks with different disease groupings

Comparing the estimation results across the three columns, we see that the interaction term between the female dummy and potential mortality is statistically insignificant for the case of malaria, but negative and statistically significant for pneumonia and tuberculosis. These results are in line with our earlier conclusion that the relatively larger gains in female life expectancy over this period were driven by the stronger immune responses of females to vaccination. As we explain in greater detail in Section C of the appendix, both pneumonia and tuberculosis are diseases for which vaccines played a role as a method of control after 1940. Malaria, on the other hand, was largely controlled by newly developed insecticides, such as DDT, and to this date no effective vaccine against it exists.

Given the importance of these three diseases over our sample period, we need to ensure that our results are not driven by a differential response of females to potential changes in mortality from just these three diseases. Therefore, in column 4 we focus on the remaining ten diseases, splitting them once again into a group of diseases that effectively became vaccine-preventable after 1940 and a group of diseases that did not. In line with the previous estimations, we also control for potential mortality changes from the three major diseases with the residual mortality variable. As the results demonstrate, we see again a clear differential change in female life expectancy resulting from potential reductions in mortality from diseases of the VP group, even with pneumonia and tuberculosis excluded, but not from diseases of the NVP group.Footnote 17, Footnote 18

A related concern with our results is the fact that mortality from diseases in the VP group was on average higher in 1940 than mortality from diseases in the NVP group. To address this concern, we have collected additional data on mortality from diarrhea in 1940, which is the most important infectious disease not covered in the data set of Acemoglu and Johnson (2007). Its death toll in 1940 was about 17% of the death toll from all other infectious diseases in our sample. Diarrhea also clearly falls in the NVP group of diseases, as no vaccine for any form of diarrhea existed during our sample period. The reason we do not include diarrhea in our baseline mortality measure is that the available data for diarrheal mortality in 1940 cover only 43 out of the 75 countries in our sample.

Focusing on these 43 countries, however, we can check whether potential reductions in mortality from diarrhea had differential effects on female and male life expectancy. As we can see from column 5 of Table 4, this is clearly not the case. The estimated coefficient of the interaction term with the female dummy is statistically insignificant. Furthermore, we can re-estimate our main specification with diarrhea included in the NVP group of diseases. Looking at the estimates reported in column 6, we see again that the potential mortality reductions from diseases in the NVP group, even with diarrhea included, do not have a clear differential effect on female and male life expectancy.

A final concern regarding our first-stage estimation is that the observed differential gains in life expectancy for females and males may be due to variation in the causative agent behind each disease rather than the method of disease control. While the medical literature has not documented any differences in the immune response of females and males across different types of causative agents, we nevertheless test for this. For this purpose we separate the 13 diseases with respect to their causative agent (bacteria, viruses, parasites) as well as their main method of control. As all viral diseases in our data set (influenza, measles, smallpox) are vaccine-preventable and malaria is the only parasitic disease, for this robustness check we focus on just bacterial diseases. In column 7 we separate the nine bacterial diseases in our data set into vaccine- and non-vaccine-preventable ones and interact the potential mortality rates from these two groups of diseases with the female dummy. As before, we control at the same time for the changes in potential mortality from the remaining four non-bacterial diseases with the residual mortality variable. Once again we find only potential mortality reductions from vaccine-preventable bacterial diseases to be associated with larger gains in life expectancy for females than for males. The patterns that we observe, hence, do not appear to be driven by the causative agent behind the different infectious diseases.

Looking at the second-stage estimates across the different columns of Table 4, we see that the effects of life expectancy on average years of schooling that we obtain are not very different from our baseline estimation in column 3 of Table 3. Only in columns 5 and 6 do we obtain a substantially higher coefficient, which is solely driven by the change in the sample composition.Footnote 19 Thus, we can safely conclude that the 2SLS estimates of the relationship between life expectancy and average years of schooling do not hinge on the exact set of infectious diseases that we consider for the first-stage estimation.

6.2 Other sources of mortality

In our discussion of the first-stage estimation results we have focused on the effects that potential reductions in mortality from infectious diseases had on male and female life expectancy. Yet, the observed changes in life expectancy between 1940 and 1980 were not just driven by changing mortality from infectious diseases, but also by changes in other causes of death. Given this fact, in Table 5 we control for changes in mortality from two other important causes of death. These are maternal mortality, whose importance has been highlighted among others by Albanesi and Olivetti (2016), and mortality from cancer and cardiovascular diseases, emphasized by Deaton (2003). These sources of mortality are of particular importance, as they changed dramatically over the sample period, with maternal mortality falling and mortality from cancer and cardiovascular diseases rising. Moreover, these causes of death exhibit clear gender-specific patterns with maternal mortality affecting only women and cancer and cardiovascular diseases being more frequent among men. Controlling for these sources of mortality in our first-stage specification in columns 1 and 2, though, does not alter our main results. Female life expectancy still exhibits a stronger response to potential reductions in mortality from vaccine-preventable diseases, but not to reductions in mortality from non-vaccine-preventable diseases.

Table 5 Robustness checks with other sources of mortality

Another important concern regarding our first-stage estimation may be the fact that the employed mortality data in 1940 are not gender-specific and therefore our potential mortality instrument does not vary across genders in the same country. We deal with this concern in two ways. First, we use gender-specific causes-of-death statistics from the United States, published in the U.S. Vital Statistics (Census Bureau 1940), and calculate for each of the 13 infectious diseases in our sample the U.S. female-to-male mortality ratio in 1940. We then use these ratios to convert the population-wide mortality rates for each of the 75 countries in our sample to gender-specific ones, assuming that the female-to-male mortality ratio was the same as in the United States. As an alternative, we utilize the information provided by Preston et al. (1972) who report gender-specific mortality rates from different causes for 22 of our sample countries at various points in time during the twentieth century.Footnote 20 Based on this information we calculate for each of these countries a female-to-male mortality ratio for different groups of infectious diseases and use these country-specific ratios to convert the population-wide mortality rates for the 22 countries to gender-specific ones.Footnote 21
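One way to carry out such a conversion is sketched below. The text does not spell out the exact formula, so this is an illustration under assumptions: the population-wide rate is treated as a population-weighted average of the male and female rates, females are assumed to make up half of the population by default, and the function name and example numbers are hypothetical.

```python
def split_by_gender(total_rate, female_to_male_ratio, female_share=0.5):
    """Split a population-wide mortality rate into male and female rates.

    Solves total = female_share * m_f + (1 - female_share) * m_m
    subject to m_f = female_to_male_ratio * m_m.
    The default female_share of 0.5 is an assumption made for this sketch.
    """
    male_share = 1.0 - female_share
    male_rate = total_rate / (male_share + female_share * female_to_male_ratio)
    female_rate = female_to_male_ratio * male_rate
    return male_rate, female_rate


# Hypothetical example: a population-wide rate of 100 deaths per 100,000 and a
# female-to-male mortality ratio of 0.8 taken from a reference country.
male, female = split_by_gender(total_rate=100.0, female_to_male_ratio=0.8)
print(f"male rate: {male:.2f}, female rate: {female:.2f}")
```

By construction the two gender-specific rates average back to the population-wide rate, and a ratio below one assigns the higher rate to males.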

The estimation results with gender-specific mortality rates can be seen in columns 3 and 4. Column 3 corresponds to the case where we use the mortality rates obtained with the first approach and column 4 to the case where we use the second approach.Footnote 22 In both cases we see that using gender-specific mortality rates does not alter the qualitative nature of our results. Only quantitatively do we see some changes in the estimated magnitudes for the first-stage coefficients. Interestingly, our main coefficient of interest, \(\beta ^{VPf},\) is now larger in absolute terms and even more statistically significant. This suggests that by not employing gender-specific mortality rates in our main regressions we most likely end up underestimating the differential increases in female life expectancy that resulted from the IET. This underestimation is most likely caused by the fact that female mortality rates in 1940 were already lower than the male rates. As discussed in Sect. 2, this probably has to do with the stronger immune response of women to pathogens, which makes them less susceptible to contracting infectious diseases and to suffering severe disease outcomes. Using the mortality rates for the whole population to construct our mortality instrument will, thus, lead to an understatement of the potential mortality reductions faced by males and an overstatement of the potential mortality reductions faced by females. As a consequence, any resulting bias in our estimation will attenuate our female-specific coefficient \(\beta ^{VPf}\) toward zero.

Looking at the second-stage estimates across the different columns of Table 5, we reach again similar conclusions regarding the relationship between life expectancy and average years of schooling as in our baseline estimates in Table 3. The slightly higher estimated coefficients in columns 1 and 2 are again solely driven by changes in the sample composition rather than by the inclusion of the additional controls. Thus, we can conclude that using population-wide rather than gender-specific mortality rates, given the absence of comprehensive gender-specific mortality data, does not pose a particular problem other than causing potentially an underestimation of the relative changes in male and female education.

6.3 Different cohorts and alternative schooling measures

Having shown that our estimation results do not hinge on the exact specification of our first-stage regression, it is instructive to investigate the relationship between life expectancy and alternative measures of educational attainment. This is what we focus on in Table 6. In column 1 we begin by looking at how the improvements in life expectancy between 1940 and 1980 affected educational attainment for the cohort of males and females that were between the ages of 0 and 9 in each respective year. These are individuals who all started school within the ten years after our observation of life expectancy. Similarly, in column 2 we look at educational attainment for the cohort of males and females that were between the ages of 5 and 14 in each respective year and started school within the 5 years before and after our observation of life expectancy.

Table 6 Robustness checks with different cohorts and alternative schooling measures

As the only thing that changes in these two specifications, compared to our main one in Table 3, is the dependent variable in the second stage, the first stage estimates are the same as in column 3 of Table 3. Looking at the second stage estimates, we see that measuring average years of schooling based on broader age cohorts yields qualitatively similar results. This is reassuring, as it implies that our findings do not hinge on studying the schooling outcomes of a particular age cohort. The estimated coefficient reported in column 2 is a bit lower, but this is probably driven by the fact that changes in life expectancy between 1940 and 1980 were less relevant for the older individuals in the cohort that had already completed part of their education before the full gains in life expectancy were realized.Footnote 23

Having established this result, in columns 3–5 we proceed to assess whether the link between life expectancy and educational attainment operates at all schooling levels or not. For this purpose, we use as our dependent variable in the second-stage estimation, instead of the average years of schooling, the share of individuals from each cohort that has completed primary education in column 3, completed secondary education in column 4, and completed tertiary education in column 5. While the point estimates in these three cases and our baseline setup are not directly comparable, qualitatively the results are similar, showing a significant positive effect of life expectancy improvements on completion rates at all levels of education.

Specifically, the estimates imply that the improvements in life expectancy observed between 1940 and 1980 led on average to a 19 percentage point increase in the primary school completion rates of males and to a 21 percentage point increase in that of females. For secondary education, the corresponding increases in the completion rates for males and females are 1.8 and 2.3 percentage points respectively, while at the tertiary level they are 0.3 and 0.7 percentage points. These figures suggest that the global health improvements over the post-war period had the largest impact on primary education. This is not surprising given that many of the new methods of controlling infectious diseases benefited primarily young children, who may have otherwise been forced to drop out of school at a young age or never started school to begin with.Footnote 24 Given that among cohorts who started school around 1940 less than 15% of the population completed secondary education and less than 5% completed tertiary education, the relatively larger increases for females in terms of primary schooling completion rates over the subsequent 40 years appear as the main driver behind the observed changes in the education gender gap.

In the last three columns of Table 6 we repeat the regressions shown in columns 3–5 for a sub-sample of developing countries.Footnote 25 The dependent variable in column 6 is again the share of a given cohort with completed primary education, in column 7 the share with completed secondary education and in column 8 the share with completed tertiary education. Overall, the obtained estimates are not very different from those for the full sample. For developing countries the effect of life expectancy on primary education is a bit larger in magnitude, while the effects on secondary and tertiary education are slightly weaker and statistically insignificant. The differences, however, are small and they could simply be driven by the differential quality of the data.

7 Comparing the relevance of different factors and mechanisms

Having established the robustness of our main findings regarding the role of the health improvements associated with the IET on the evolution of the education gender gap over the post-war period, in this section we proceed to compare its relevance to other explanations for the post-war expansion of schooling and female schooling in particular. For this purpose we control in our main specification for changes in other variables that have been associated in the literature with health and educational outcomes of males and females. Furthermore, we investigate whether the observed differential changes in educational attainment across genders could be attributed to differences in the health-education elasticity between men and women. Lastly, we expand our analysis considering alternative measures of life expectancy and other indicators of health.

7.1 Exploring the role of additional control variables

In the regression results that we have presented up to this point we have ignored the role played by other factors that may have influenced the evolution of health and educational outcomes of men and women over our sample period and the estimated relationship between the two. With that in mind, in Table 7 we include in our main specification various control variables that reflect such factors and explore whether their inclusion affects our main findings. All details regarding the construction of these variables are provided in Section B of the appendix.

Table 7 Exploring the role of additional control variables

In column 1 we control for each country’s level of GDP per capita in both the first- and second-stage of our estimation. Economic development is bound to have a positive effect on both life expectancy and schooling. As the estimation results reveal, however, controlling for GDP per capita does not alter our main findings. In the first-stage estimation the coefficient on GDP per capita is not statistically different from zero. Its inclusion also does not change the key pattern in the first-stage estimation with potential mortality reductions from vaccine-preventable infectious diseases being associated with relatively larger increases in female life expectancy. In the second-stage estimation, GDP per capita is positively associated with average years of schooling, confirming the role of economic development in fostering educational attainment. In spite of that, its inclusion appears to increase the quantitative importance of the effect of life expectancy on educational attainment rather than reduce it.

In column 2 we control for the quality of institutions in each country, measured by the constraints on the executive score from the Polity IV database. As we can see from the results, on average countries with better institutions experienced smaller gains in life expectancy. Controlling for this effect of institutional quality, however, we still observe in the first stage a differential change in female and male life expectancy. In the second stage the coefficient on institutional quality is insignificant and does not alter the relationship between life expectancy and schooling.

Considering that educational attainment may be influenced also by the concentration of population in cities, in column 3 we control for urbanization. Specifically, we measure each country’s urbanization rate in 1950 and 1990 to see whether changes in that rate relate to the observed changes in schooling, which are also measured in 1950 and 1990, as explained in Sect. 4. Including this variable does not alter our main findings. Moreover, the coefficient estimates for the urbanization rate are statistically insignificant both in the first and in the second stage.

In column 4 we consider the role played by the sectoral structure of the economy, in particular the size of the service sector. This is to capture the idea that structural change may have led to a higher demand for schooling. As we did in the case of the urbanization rate, we collected data on each country’s share of employment in the service sector in 1950 and 1990. We find the service-sector employment variable to be significant in the first-stage estimation, but not in the second stage. In either case, its inclusion does not change the estimates for our main coefficients of interest.

Another factor that could be important in this context is the set of national laws on compulsory schooling. As over the post-war period there were important changes in these laws, it is conceivable that these changes may have influenced the observed expansion of schooling and the closing of the gender gap. To assess this, in column 5 we control for the years of compulsory schooling in each country in 1950 and 1990, which once again is in line with the years in which we measure average years of schooling. As we can see from the second-stage regression results, there is a clear positive association between the post-war expansion of compulsory schooling and the observed increases in average years of schooling. From the first-stage results we also see that the expansion of compulsory schooling is positively related to the observed improvements in life expectancy. Nevertheless, the inclusion of this variable does not alter our main results and its effect appears to be largely complementary to that of health improvements.

In column 6 we consider the role of voting rights for women, which relate to broader changes in the legal status of women and which have been emphasized by many authors in the literature.Footnote 26 To control for this channel, we include in our specification a dummy variable reflecting whether women had the right to vote in the respective sample year. As we can see in panel A, this variable is positively associated with life expectancy, but its inclusion does not affect the link between life expectancy and the mortality reductions associated with the IET. The variable is also positively correlated with average years of schooling, as we can see in panel B. However, the estimated coefficient is statistically insignificant and its inclusion does not overturn the positive effect of life expectancy in the second stage.

In column 7 we explore the role that characteristics of the marriage market played in the educational choices made by males and females. This relates to arguments regarding the importance of education in the marriage market presented by Chiappori et al. (2009) in the case of developed countries as well as Ashraf et al. (2016) in the case of developing countries. To capture this idea we control for the ratio of divorces to marriages in 1950 and 1990 respectively. This variable reflects the likelihood of individuals experiencing a divorce relative to that of getting married. While the first-stage estimates in this case suggest that countries in which divorce rates grew faster relative to marriage rates experienced smaller gains in life expectancy, including this variable in the specification does not change the key coefficients of interest. It also does not alter qualitatively the relationship between life expectancy and schooling.Footnote 27

In column 8 we investigate the role played by changes in the labor force participation of males and females over our sample period. This is motivated by the argument made by Goldin (2006) that improved labor market conditions for women increased their returns to schooling in comparison to men. To account for this effect, we collect data for the economic activity rates of males and females in 1950 and 1990 and include them in our specification as a control. Our regression results in this case suggest that while this mechanism may have been important in the case of some developed countries, such as the United States, in the context of our broad country-sample it does not appear to be quantitatively important. Moreover, controlling for it does not alter the observed patterns for the remaining variables, either in the first stage or in the second stage.

In columns 9 and 10 we assess the importance of changes in fertility behavior over our sample period. Theories about the demographic transition clearly link fertility with life expectancy and educational attainment.Footnote 28 With that in mind, in column 9 we control in our specification for fertility rates, measured in 1950 and 1990. As the regression results indicate, higher fertility rates are associated with lower levels of life expectancy and lower levels of educational attainment, in line with demographic transition theories. Controlling for this effect, however, does not affect the main regression coefficients, either in the first stage or in the second stage. In column 10, we also allow the effect of fertility to differ between high and low income countries. To do so, we interact the fertility rate with a dummy variable indicating countries that in 1950 were above the median income level in our sample of countries. The positive coefficient estimates for the interaction term signify that the adverse effect of fertility on education and the negative correlation between fertility and life expectancy are more pronounced in poorer countries. Conditioning on that effect as well, our main results still hold.

Finally, in column 11 we include in our specification all the control variables that we introduced in columns 1–9. For practical reasons we do not display all their coefficient estimates, but we only report the p value for the F-test regarding their joint statistical significance. As expected from the estimates reported in columns 1–9, the null hypothesis of this test is rejected, indicating that the control variables are jointly significant. Even when we include all control variables, we still see that changes in potential mortality from vaccine-preventable diseases clearly have a stronger effect on the life expectancy of females than on that of males, and these differential changes in life expectancy are associated with corresponding changes in schooling in the second stage.

In sum, the estimation results in Table 7 present a very consistent picture. Our main findings are robust to various alternative explanations for the evolution of life expectancy and the rise in educational attainment over the post-war period. The statistical significance and the quantitative importance of our key parameter estimates are similar across all specifications. There are some fluctuations in the size of the estimated effects of life expectancy on schooling, but these fluctuations are mostly due to the change in sample size across columns and not to the inclusion of the control variables.

7.2 Exploring potential heterogeneity in the health-education elasticity

The main contribution of our empirical analysis has been to document how the evolution of the education gender gap that took place over the second half of the twentieth century was influenced by the differential health improvements across genders triggered by the IET. As women gained more years of life than men, this promoted further female educational attainment and led to relatively larger increases in female average years of schooling. While the empirical evidence that we have presented so far supports this explanation, it does not preclude an alternative explanation that attributes the closing of the education gender gap to differential returns to schooling. In this case the larger female schooling increases could be the result of women experiencing stronger increases in their returns to schooling compared to men, independently of their health status. In this case, women might choose to invest more in their education even if they did not gain more in terms of health than men. In this section we assess the plausibility of this alternative explanation in two ways. First, we test directly whether the health-education elasticity that we estimate in the second-stage specification differs in general between men and women. Second, we explore whether there is heterogeneity in the health-education elasticity among the countries in our sample with a series of interaction regressions.

The estimation results for these tests are reported in Table 8. For practical reasons we only report the second-stage estimation results. We should stress, however, that when performing these estimations we treat life expectancy as well as all interaction terms that include life expectancy as endogenous and instrument them with our potential mortality instrument multiplied by the relevant interaction variable. This implies that we have multiple endogenous regressors and multiple first-stage regressions. To alleviate concerns about weak instruments we report for each of the underlying first-stage estimations the effective F-test statistic of Montiel-Olea and Pflueger (2013) for the excluded instruments. This allows us to assess the strength of the instruments for each of the endogenous regressors.
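The instrumenting strategy described above can be illustrated with a minimal sketch in Python. It is not the authors' code: the data are simulated, all variable names are hypothetical, and the sketch omits the fixed effects, the split into disease groups, and the effective F-statistic. It treats life expectancy and its interaction with a female dummy as endogenous and instruments them with a potential mortality variable and that variable multiplied by the female dummy. Because the number of instruments equals the number of endogenous regressors, the 2SLS estimator coincides with the simple IV estimator used here.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def iv_estimate(Z, X, y):
    """Just-identified IV estimator: solves (Z'X) beta = Z'y."""
    k = len(X[0])
    ZtX = [[sum(z[i] * x[j] for z, x in zip(Z, X)) for j in range(k)] for i in range(k)]
    Zty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    return solve(ZtX, Zty)

# Simulated data (purely illustrative, not calibrated to the paper).
rng = random.Random(0)
Z, X, y = [], [], []
for i in range(2000):
    female = i % 2
    mortality = rng.uniform(0.0, 1.0)      # hypothetical potential mortality instrument
    confounder = rng.gauss(0.0, 1.0)       # unobserved factor driving health and schooling
    # First stage: mortality reductions matter more for females (assumed coefficients).
    life_exp = 1.0 - 2.0 * mortality - 1.0 * mortality * female + confounder
    # Second stage: a common health-education elasticity of 0.3 for both genders.
    schooling = 1.0 + 0.5 * female + 0.3 * life_exp + 0.0 * life_exp * female + confounder
    X.append([1.0, float(female), life_exp, life_exp * female])
    Z.append([1.0, float(female), mortality, mortality * female])
    y.append(schooling)

# Order of coefficients: constant, female, life expectancy, life expectancy x female.
beta_hat = iv_estimate(Z, X, y)
```

With more interaction variables, each additional term of the form life expectancy times an interaction variable would get its own instrument (the mortality variable times that interaction variable), so that the system stays just identified.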

Table 8 Exploring potential heterogeneity in the health-education elasticity

In column 1 we introduce in our second stage specification an interaction term of life expectancy with the female dummy. Looking at the resulting estimates, we see that the coefficient on the interaction term is statistically insignificant and effectively zero. This implies that the response of female schooling to a given life expectancy increase was not different from that of male schooling.

The finding of a common health-education elasticity for men and women weighs against the hypothesis of stronger increases in the returns to female schooling. Nevertheless, it is still possible that for some countries in our sample the returns to female schooling did increase more over the post-war period due to factors other than changes in life expectancy. With that in mind, in the remaining columns of the table we interact life expectancy and its interaction with the female dummy with several additional control variables. The variables that we use for this exercise include GDP per capita, institutional quality, the urbanization rate, the service-sector employment share, years of compulsory schooling, female voting rights, abortion laws and fertility.Footnote 29 We focus on these variables, as they may affect in different ways the returns to schooling for men and women and, hence, the health-education elasticity.Footnote 30 To simplify the estimation, we do not construct the interaction effects based on the actual values of the control variables, but based on dummy variables that take a value of one for countries that are above the median in terms of the respective control variable and a value of zero otherwise.Footnote 31
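As an illustration of the median-split construction just described, the following is a minimal sketch in Python. The country labels and all values are invented, and the helper name `above_median_dummy` is hypothetical; it is meant only to show the mechanics, not to reproduce the authors' procedure.

```python
from statistics import median

def above_median_dummy(values):
    """Return 1 for units strictly above the sample median, 0 otherwise."""
    cutoff = median(values.values())
    return {unit: int(v > cutoff) for unit, v in values.items()}

# Hypothetical control variable (e.g. an urbanization rate) for four countries.
urbanization = {"A": 0.20, "B": 0.35, "C": 0.50, "D": 0.80}
high_urban = above_median_dummy(urbanization)

# Hypothetical life expectancy values for the female observations of each country.
life_expectancy = {"A": 45.0, "B": 52.0, "C": 60.0, "D": 68.0}
female = 1

# Interaction terms that would enter the second stage:
# life expectancy x dummy, and life expectancy x female x dummy.
le_x_dummy = {c: life_expectancy[c] * high_urban[c] for c in life_expectancy}
le_x_female_x_dummy = {c: le_x_dummy[c] * female for c in life_expectancy}
```

Using a binary split keeps each interaction coefficient interpretable as the difference in the elasticity between countries above and below the median of the respective control variable.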

Looking at the estimation results in columns 2–9 we see that none of the coefficients on the interaction terms with the additional control variables are statistically significant. This suggests that differences in the aforementioned variables across countries did not affect the nature of the relationship between life expectancy and years of schooling. Female schooling does not appear to be more sensitive or to respond more strongly than male schooling to the post-war increases in life expectancy, on average as well as in particular countries. Hence, we can conclude that the relative rise in female schooling and the closing of the education gender gap should be attributed to differential health improvements between men and women, and not to differential changes in the returns to schooling.

Having said that, we should note that our analysis does not necessarily preclude any differences in the returns to schooling between men and women. Persistent differences in the returns to schooling across genders in any of our sample countries, which may account for level differences in the average years of schooling for males and females, will be picked up in our specification by the gender-country fixed effects. This set of fixed effects will capture, for example, the role of gender norms and other cultural or institutional attributes of countries, which are expected to influence health and educational outcomes of men and women and which have been shown to be highly persistent (Alesina et al. 2013; Hansen et al. 2015). At the same time, there are common factors that played a role in the relative rise of female education over our period of interest, such as the introduction of the birth control pill, stressed by Goldin and Katz (2002), or the decline in the price of household appliances, highlighted by Greenwood et al. (2016), which we do not explicitly consider. In the context of our analysis, the effect of these common factors would be largely captured by the year fixed effects.Footnote 32

7.3 Comparing the effects for life expectancy at different ages and other health indicators

Our analysis up to this point has been focused on estimating the effect of life expectancy at birth on average years of schooling. We focus on life expectancy at birth as it is the most commonly used measure of the overall health status of a country’s population. Yet, changes in the mortality environment may have different effects on the health status of individuals depending on their ages. Moreover, the IET may also have affected other dimensions of health that are not fully captured by life expectancy. As these may be important for educational attainment, in this section we broaden our analysis to consider life expectancy at different ages and other indicators of health.

We begin in columns 1 and 2 of Table 9 by conducting our 2SLS estimation using life expectancy at age 5 and age 10 instead of life expectancy at birth. Using these life expectancy measures allows us to exclude the effects that the IET had on the survival rates of infants and very young children and focus on its impact from the age of schooling onward. Despite limited data availability, which forces us to conduct our analysis with a smaller sample, the results that we obtain are both qualitatively and quantitatively similar to our baseline results with life expectancy at birth. Potential mortality reductions from both disease groups also led to improvements in life expectancy at age 5 and 10 and there is a clear differential effect on female life expectancy of potential mortality reductions from vaccine-preventable diseases. The estimated coefficients in the second stage appear slightly higher. This is driven by two things: the change in the sample composition and the smaller coefficient estimates in the first stage. The latter implies that the variation in life expectancy at age 5 or 10 explained by our mortality instrument is smaller than in our baseline estimation using life expectancy at birth. Qualitatively, however, the patterns in the first and second stage estimation are unchanged.

Table 9 Comparing the effects for life expectancy at different ages and other health indicators

In columns 3 and 4 we investigate how the observed changes in average years of schooling of males and females are affected by health improvements during childhood versus adulthood. For this purpose, in column 3, we focus on life expectancy between the ages of 0 and 15 and in column 4 on life expectancy between the ages of 15 and 60. Again, due to data limitations, we are forced to work with a smaller country sample. Looking at the first-stage estimates, we see clearly that potential mortality reductions from both disease groups contributed positively to the rise in both child and adult life expectancy. When the dependent variable is life expectancy during childhood, the interaction effects with the female dummy are insignificant for both disease groups. When the dependent variable is life expectancy during adulthood, though, the interaction effect with the female dummy is significant for the vaccine-preventable disease group, as in our main specification, and insignificant for the non-vaccine-preventable disease group. This is not surprising given the evidence provided by the medical literature, which we discussed in Sect. 2. Differences in the immune response to vaccines between males and females have been shown to be largely related to sex hormones. Hence, we would expect to observe them primarily during reproductive ages when differences in hormones between men and women are largest. Our results are also consistent with the studies reported in Sect. 2 that have documented the absence of gender differences in vaccine efficacy in pre-pubertal children.

Turning our attention to the second-stage estimates, we see that life expectancy increases during both childhood and adulthood are clearly associated with schooling increases. Thus, educational attainment over the post-war period appears responsive to improvements in child as well as adult health. Looking at the point estimates for the second-stage coefficients, it seems as if the effect of health improvements in children is larger than that of health improvements in adults. When expressed in terms of standard deviation changes, though, the effects are similar in magnitude with a standardized effect of 0.65 in the case of child health and 0.6 in the case of adult health. The estimates of column 4 are consistent with the standard Ben-Porath mechanism, which suggests that improvements in adult life expectancy foster education investments by individuals, as they expand the time horizon over which the returns from these investments can be earned. At the same time, the estimates of column 3 lend support to alternative mechanisms, such as those proposed by Kalemli-Ozcan (2002) and Hazan and Zoabi (2006), according to which improvements in child health encourage parents to invest more in the education of their children as their chances of survival to adulthood increase.
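To see why raw point estimates and standardized effects can rank differently, consider the following minimal Python sketch. The text does not spell out the exact standardization used, so the formula here (the coefficient multiplied by the ratio of the regressor's standard deviation to the outcome's standard deviation) is an assumption, and all numbers are invented; they do not reproduce the 0.65 and 0.6 reported above.

```python
from statistics import stdev

def standardized_effect(coef, x, y):
    """Effect of a one-standard-deviation change in x, in standard deviations of y."""
    return coef * stdev(x) / stdev(y)

# Hypothetical data: the child health measure varies less than the adult one.
schooling = [2.0, 4.0, 6.0, 8.0, 10.0]
child_le = [1.0, 2.0, 3.0, 4.0, 5.0]
adult_le = [2.0, 4.0, 6.0, 8.0, 10.0]

# Hypothetical raw coefficients: the child coefficient is twice the adult one.
child_std = standardized_effect(1.0, child_le, schooling)
adult_std = standardized_effect(0.5, adult_le, schooling)
```

In this constructed case the larger raw coefficient is exactly offset by the smaller variation in the regressor, so both standardized effects equal 0.5.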

In columns 5 and 6 we try to distinguish between the effect of health improvements related to mortality and those related to morbidity. To do so, we collected data on infectious disease incidence. While these data are only available for some of our sample countries and part of the period under consideration in our analysis, they allow us to control for changes in disease incidence in our main specification.Footnote 33 In column 5, we include as a control variable in our main regression specification the disease incidence rate, measured as the average number of incidences over each country’s population, for 8 out of the 13 infectious diseases in our data. As we can see, disease incidence enters with a negative coefficient in both the first and the second stage regression. In the first stage this is not surprising, as it implies that our potential mortality instrument leaves out other important sources of health improvements which seem to be correlated with the disease incidence rate. More interestingly, in the second stage we see that the observed reduction in disease incidence over the period was associated with an increase in educational attainment after accounting for the effect of life expectancy.

In column 6 we use as an alternative control variable in our specification the average case fatality ratio, which is constructed based on the above-described disease incidence data. It measures the ratio of total deaths from all diseases over the total number of incidences. Looking at the estimation results, we see that changes in the case fatality ratio do not have any major effects on life expectancy at birth in the first stage regression or on average years of schooling in the second stage. The estimated coefficients are statistically insignificant and their inclusion does not seem to alter the estimates for the remaining variables. This suggests that our results are not driven by some infectious diseases becoming more or less fatal.
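The definition of the average case fatality ratio given above can be sketched in a few lines of Python. The disease labels and counts are invented placeholders, and the function name `case_fatality_ratio` is hypothetical.

```python
def case_fatality_ratio(deaths, cases):
    """Total deaths from all diseases divided by the total number of incidences."""
    total_cases = sum(cases.values())
    if total_cases == 0:
        raise ValueError("case fatality ratio is undefined without any reported cases")
    return sum(deaths.values()) / total_cases

# Hypothetical counts for one country and year.
deaths = {"disease_a": 30, "disease_b": 20, "disease_c": 10}
cases = {"disease_a": 1000, "disease_b": 200, "disease_c": 800}

cfr = case_fatality_ratio(deaths, cases)  # 60 deaths over 2000 cases
```

Because deaths and incidences are pooled before dividing, diseases with many reported cases carry more weight in this ratio than they would in a simple average of disease-specific ratios.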

Finally, in column 7 we include in our main specification adult height as a control variable to proxy for health improvements unrelated to mortality. Average height is a commonly-used indicator of population health in the literature (Weil 2014) and has been shown to coevolve with the level of economic development (Dalgaard and Strulik 2015).Footnote 34 Moreover, the advantage of using height data is that they are available already in 1940 and cover all of our sample countries. Including height in our first-stage specification does not affect the estimated relationship between changes in potential mortality and life expectancy at birth, either on average or for women in particular. However, in our second-stage regression the estimated coefficient on height is positive and statistically significant. This suggests that increases in height are associated with increases in schooling conditional on the effect of life expectancy.

Overall, the results presented in Table 9 suggest that life expectancy at birth is capturing only part of the effect of health improvements on educational attainment. This is not surprising as life expectancy at birth only measures health improvements reflected in mortality and hence may not fully capture all the improvements in global health that took place during the post-war period as a consequence of the IET. As highlighted by Hazan and Zoabi (2006), educational attainment may also be fostered by health improvements which are unrelated to mortality and not reflected in our life expectancy measure. Hence, our baseline estimates are likely to understate the true importance of health improvements in influencing the educational attainment of men and women.

8 Evidence from U.S. States

The analysis that we have conducted up to this point uses variation in the mortality environment across countries prior to the IET in order to estimate the effect that changes in life expectancy of men and women had on their educational attainment and, as a result, on the education gender gap. Yet, systematic differences in the mortality environment before the IET were not only present across countries, but also within countries. With that in mind, in this section we conduct a similar empirical exercise which exploits variation in mortality from infectious diseases across U.S. states.

While the mortality transition in the United States had already started before 1940 (Cutler and Miller 2005; Goldin and Lleras-Muney 2018), the U.S. witnessed sizeable improvements in mortality from the late 1930s onward. These improvements were driven by medical innovations related to the IET, which spread gradually across the country (Jayachandran et al. 2010; Hansen 2014). Hence, our empirical setup is still applicable. Moreover, conducting our analysis based on data from U.S. states has some further advantages. First, it allows us to compare changes across geographical units with similar health and education systems. Second, it enables us to use the detailed U.S. causes-of-death statistics, which are collected and reported based on a uniform system of death registrations stratified by different social groups including gender and race.Footnote 35 Below we briefly describe the data sources that we use to construct our state-level data set and then present the findings from the estimation of our main specification.

8.1 Data

To conduct our analysis we build a two-period panel data set with information on average years of schooling, life expectancy at birth and mortality rates from different infectious diseases for men and women in the 48 contiguous U.S. states and the District of Columbia.Footnote 36 For comparability with our global analysis, we focus again on the years 1940 and 1980.Footnote 37

As in our global data set, we measure educational attainment in terms of the average years of schooling of individuals who were 5–9 years old in the sample years. We construct this measure for the male and female population based on the 1% U.S. census data provided by IPUMS (Ruggles et al. 2019). For this purpose, we convert the categorical information on educational attainment reported in the census to average years of schooling using the approach explained in Section B of the appendix. The data on life expectancy at birth for 1940 are taken from the State Life Tables 1939–1941, published by the Federal Security Agency, and the data for 1980 are taken from the U.S. Decennial Life Tables for 1979–1981. As the 1940 figures are only available for the white population, we also use for 1980 the data for the white population.

Mortality rates for infectious diseases in 1940 are computed based on the causes-of-death statistics reported in the 1940 Vital Statistics of the United States, which are stratified by gender and race. To ensure comparability with our global analysis, we focus on the same 13 diseases plus diarrhea.Footnote 38 For consistency with our life expectancy data, we focus solely on death statistics for whites. For 1980, we follow the same approach we took in our estimation with the country sample and assume that the mortality rates for all diseases were zero for men and women in all states. This implies that, once again, our mortality figures will reflect potential changes rather than actual changes in mortality.
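The construction of the potential mortality variable can be sketched as follows. This is a minimal Python illustration under the assumption stated in the text that all disease-specific rates are set to zero in 1980; the state label, disease labels, rates, and the function name `potential_mortality_panel` are hypothetical.

```python
def potential_mortality_panel(rates_1940):
    """Build a two-period panel of summed mortality rates by state and gender.

    rates_1940 maps (state, gender) to a dict of disease-specific rates in 1940.
    The 1980 value is set to zero by assumption, so the change between the two
    periods reflects potential rather than actual mortality reductions.
    """
    panel = {}
    for (state, gender), by_disease in rates_1940.items():
        panel[(state, gender, 1940)] = sum(by_disease.values())
        panel[(state, gender, 1980)] = 0.0  # assumption: all rates zero in 1980
    return panel

# Hypothetical gender-specific rates (e.g. deaths per 100,000) for one state.
rates_1940 = {
    ("State X", "male"): {"disease_a": 40.0, "disease_b": 25.0},
    ("State X", "female"): {"disease_a": 35.0, "disease_b": 20.0},
}

panel = potential_mortality_panel(rates_1940)
change_male = panel[("State X", "male", 1980)] - panel[("State X", "male", 1940)]
```

Under this construction, all cross-sectional variation in the change in potential mortality comes from the 1940 levels, which is what makes the pre-IET mortality environment the source of identification.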

8.2 Regression results

Using the above-described data set, we estimate our main 2SLS specification outlined in Sect. 3. The estimation results are reported in Table 10, with panel A presenting the estimates for the first stage and panel B the estimates for the second stage. We begin in columns 1 and 2 with the case where we sum up the mortality rates from all 14 diseases. In column 1, we employ mortality data that are not gender-specific, in order to allow for direct comparison with our baseline results in Table 3. In column 2, we report the same regression based on gender-specific mortality rates. The results obtained in the two columns are qualitatively similar. In both cases we find reductions in potential mortality rates from infectious diseases to be associated with increases in life expectancy and this relationship to be stronger for women. Quantitatively, though, the obtained estimates in column 2 are slightly higher.

Table 10 Life expectancy and educational attainment across U.S. States

In columns 3 and 4 we split the potential mortality rates between the VP and the NVP disease groups. We do that first with the non-gender-specific mortality rates in column 3 and then with the gender-specific ones in column 4. In both cases we see that the interaction term with the female dummy is significant and negative for mortality from the VP group, but it is statistically insignificant for mortality from the NVP group. This confirms our finding from the country-level analysis. Potential mortality reductions from vaccine-preventable diseases were associated with larger increases in female life expectancy than in male life expectancy. But this was not the case for mortality reductions from non-vaccine-preventable diseases. Comparing the obtained coefficients in columns 3 and 4, we see that the estimate for the female-specific effect is larger in absolute magnitude when using the gender-specific potential mortality rates. This mirrors the comparison between the estimates of column 3 in Table 3 and column 3 in Table 4, which we discussed in Sect. 6.2.

Looking more carefully at the magnitudes of the estimated effects in Table 10, we find them to be similar to those of the global sample. While at first glance the coefficients in Table 10 appear larger in absolute terms than those in Table 3, this is counterbalanced by the lower variation in the variables in the U.S. states sample. Expressed in terms of standard deviations, our estimates imply that a one standard deviation reduction in mortality from all 14 infectious diseases combined is associated with a 0.27 standard deviation increase in life expectancy on average across U.S. states. This is similar to the 0.33 standard deviation increase we found across countries. In terms of implied magnitudes, the coefficient estimates in column 1 imply that, in response to the changes in potential mortality, male life expectancy increased on average by 1.5 years and female life expectancy by 4.2 years. If we employ the gender-specific mortality rates in column 2, the implied effect sizes are 1.6 years for men and 4.3 years for women.
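
The conversion into standard-deviation units used in this comparison can be sketched as follows. This is only an illustration: the function name and all numerical inputs are hypothetical placeholders and are not taken from Table 10 or Table 3.

```python
def standardized_effect(coef: float, sd_x: float, sd_y: float) -> float:
    """Change in y, in standard deviations of y, per one-SD increase in x."""
    return coef * sd_x / sd_y


# Hypothetical inputs (NOT values from the paper):
beta = -0.5          # years of life expectancy per unit of potential mortality
sd_mortality = 2.0   # cross-sectional SD of the mortality measure
sd_life_exp = 4.0    # cross-sectional SD of life expectancy

# A one-SD *reduction* in mortality flips the sign of the effect.
effect_of_reduction = -standardized_effect(beta, sd_mortality, sd_life_exp)
print(effect_of_reduction)
```

Rescaling in this way is what makes coefficients from samples with different dispersion, such as U.S. states and countries, directly comparable.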

Turning to the second-stage estimates reported in panel B, we see that, independently of how we specify the first-stage regression, the life expectancy changes are associated with significant increases in schooling. Depending on which first-stage specification we employ, we find that one additional year of life is associated with 0.166–0.179 additional years of schooling. This is higher than the OLS coefficient, which we report in column 5, and is similar to what we saw in Table 3. Combining the coefficient estimate of 0.17 in column 1 with the increases in life expectancy predicted by the first-stage regression, we find that on average the medical advances associated with the IET contributed to schooling increases of 0.72 years for women and 0.25 years for men.
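
The back-of-the-envelope calculation behind these implied schooling gains can be reproduced as a simple product of the second-stage coefficient and the first-stage life expectancy gains. The sketch below uses the rounded figures quoted in the text for column 1 (0.17, 1.5 and 4.2); the variable names are illustrative. With rounded inputs the products come out at about 0.255 and 0.714, close to the reported 0.25 and 0.72, which presumably rest on unrounded estimates.

```python
# Rounded figures quoted in the text for column 1 of Table 10.
beta_schooling = 0.17                       # years of schooling per year of life
life_exp_gain = {"men": 1.5, "women": 4.2}  # implied life expectancy gains (years)

# Implied schooling gain = second-stage coefficient x first-stage gain.
schooling_gain = {
    group: beta_schooling * gain for group, gain in life_exp_gain.items()
}

for group, gain in schooling_gain.items():
    print(f"{group}: {gain:.3f} additional years of schooling")
```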

In columns 6–9, we present the estimates for the corresponding reduced-form regressions as in Table 3. As we did with the 2SLS regressions, we conduct the estimation using population-wide and gender-specific mortality rates. In both cases we see that female schooling responded more strongly than male schooling to the potential improvements in mortality. Moreover, when we separate the effects of mortality reductions between the VP and the NVP disease groups, we see again that the differential effect is driven by potential changes in mortality from diseases of the VP group. In the case of diseases of the NVP group we find the interaction effect to be statistically insignificant.

Overall, the results that we obtain based on the U.S. states data lead to conclusions similar to those obtained with the cross-country data. Health improvements for women were clearly larger than for men in the second half of the twentieth century. These differential changes in male and female health were largely driven by reductions in mortality from infectious diseases that became preventable with the help of vaccines. Moreover, these differential health improvements played an important role in the relative rise in female educational attainment. At the same time, the comparison between the obtained estimates when using population-wide as opposed to gender-specific mortality rates suggests that the implied magnitudes reported in Sect. 5 may even understate the true size of this differential effect.

9 Implications for economic development

With our regression analysis so far we have established that: (a) the global reductions in mortality from infectious diseases associated with the IET led to substantial gains in life expectancy, (b) these gains led to sizeable increases in years of schooling of men and women, and (c) women experienced larger life expectancy gains and as a consequence their schooling levels rose more than those of men. In this section we explore the broader implications that these mortality reductions had for economic development.

An extensive body of literature, summarized by Deaton (2003) and Weil (2014), has highlighted various channels through which improvements in health can benefit the process of economic development. In the context of the IET-related health improvements, though, Acemoglu and Johnson (2007) have demonstrated that countries experiencing the largest gains in life expectancy were not the ones that gained the most in terms of GDP per capita. This is because the resulting increases in total GDP were in many cases offset by increases in the size of the population triggered by the lower mortality rates. Cervellati and Sunde (2011) have further explored this pattern and have shown that changes in life expectancy did not affect the GDP per capita levels of all countries in the same way. In countries that had already experienced the demographic transition by 1940 and where fertility was already low, larger gains in life expectancy led to larger increases in GDP per capita, as the effects on aggregate GDP were stronger than on population. In countries that had not yet experienced the demographic transition, on the other hand, larger life expectancy gains did not lead to larger increases in GDP per capita, as the effects on population ended up being stronger.

Given our main finding that the IET resulted in differential gains for males and females in terms of life expectancy, it is instructive to assess whether these differential gains had distinct effects on per capita GDP. In fact, this possibility has already been suggested by some work in the literature. The models of the demographic transition proposed by de la Croix and van der Donckt (2010) as well as Bloom et al. (2015) highlight how improvements in female health tend to be more conducive to economic development than improvements in male health. This is because health improvements for males and females, while both fostering educational attainment, have differential effects on fertility and labor force participation. An improvement in male health generates primarily an income effect for households, which increases consumption as well as desired fertility. An improvement in female health, on the other hand, has, beyond an income effect, also a substitution effect for households, which tends to lower fertility and raise labor force participation. This is due to the fact that women naturally devote a larger share of their time to child-rearing activities.

To our knowledge, the existing literature has not provided any empirical evidence on these distinct effects that male and female health improvements have on economic development. This is largely due to the fact that standard measures of male and female health, such as life expectancy, are very highly correlated and their distinct effects cannot be easily estimated. Our empirical setup, which links the differential improvements in male and female life expectancy to the exogenous mortality reductions from vaccine- and non-vaccine-preventable diseases, however, allows us to indirectly test how economic development is affected by improvements in male and female health. Specifically, this can be done by estimating the following reduced-form specification:

$$\begin{aligned} \ln y_{ct}=b^{VP}\underset{d\in VP}{\sum }M_{dct}+b^{NVP}\underset{d\in NVP}{ \sum }M_{dct}+d_{c}+d_{t}+e_{ct}. \end{aligned}$$
(4)

Here \(y_{ct}\) denotes the level of GDP per capita in country c in year t, while the terms \(\underset{d\in VP}\sum M_{dct}\) and \(\underset{d\in NVP}\sum M_{dct}\) correspond to the potential mortality rates for the groups of vaccine-preventable (VP) and non-vaccine-preventable (NVP) infectious diseases. The specification includes country and year fixed effects, \(d_{c}\) and \(d_{t},\) and is estimated in long-differences between 1940 and 1980 using a similar sample of countries as in our main regression setup.Footnote 39 If the estimates for \(b^{VP}\) and \( b^{NVP}\) are negative, this suggests that across countries larger reductions in potential mortality from these two groups of diseases were associated with larger increases in GDP per capita. This would correspond to the same effects we saw in Table 3 that changes in potential mortality had on life expectancy and schooling.
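
To illustrate the mechanics of estimating Eq. (4) in long differences, the following minimal sketch uses purely synthetic data. With only two periods, the country fixed effects drop out in differences and the year effects collapse into a constant, so the specification reduces to an OLS regression of the change in log GDP per capita on the changes in the two potential mortality measures. The sample size, the "true" coefficients and all variable names are hypothetical and chosen for illustration only; the sketch is not the estimation code behind Table 11.

```python
import random

random.seed(0)
n = 200  # hypothetical number of countries

# Potential mortality in 1940; it is set to zero in 1980 by construction,
# so the long difference equals minus the 1940 level.
m_vp_1940 = [random.uniform(0.5, 3.0) for _ in range(n)]
m_nvp_1940 = [random.uniform(0.5, 3.0) for _ in range(n)]
d_vp = [0.0 - m for m in m_vp_1940]
d_nvp = [0.0 - m for m in m_nvp_1940]

# Synthetic outcome with made-up coefficients (NOT estimates from the paper).
B_VP, B_NVP, TREND = -0.10, 0.20, 1.0
d_ln_y = [
    TREND + B_VP * a + B_NVP * b + random.gauss(0.0, 0.05)
    for a, b in zip(d_vp, d_nvp)
]


def demean(values):
    """Subtract the sample mean; this absorbs the constant (year effect)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# OLS with two regressors: solve the 2x2 normal equations by Cramer's rule.
x1, x2, y = demean(d_vp), demean(d_nvp), demean(d_ln_y)
s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
s1y, s2y = dot(x1, y), dot(x2, y)
det = s11 * s22 - s12 ** 2
b_vp_hat = (s22 * s1y - s12 * s2y) / det
b_nvp_hat = (s11 * s2y - s12 * s1y) / det

print(f"b_VP = {b_vp_hat:.3f}, b_NVP = {b_nvp_hat:.3f}")
```

In this synthetic example, a negative estimate for the VP coefficient means that larger mortality reductions (more negative changes) go along with larger increases in log GDP per capita, whereas a positive estimate for the NVP coefficient implies the opposite.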

The results of estimating Eq. (4) can be seen in Table 11. In column 1 we estimate first a simpler variant of the specification in which we do not split the diseases into the two groups. This way we can assess how the reductions in potential mortality from all 13 infectious diseases combined affected per capita GDP. The estimated coefficient that we obtain in this case is instead positive and statistically significant. This implies that countries experiencing smaller reductions in mortality witnessed larger increases in GDP per capita compared to countries experiencing bigger reductions in mortality following the IET, which echoes the findings of Acemoglu and Johnson (2007).

Table 11 Effects of female and male health improvements on per capita GDP

In column 2 we proceed to estimate separately the effect of the potential mortality reductions coming from the VP and the NVP disease groups. As the estimation results indicate, there is a clear positive association between changes in mortality and GDP per capita for the NVP group, but a weak and statistically insignificant association for the VP group. This implies that the inverse relationship between increases in GDP per capita and health improvements, as proxied by potential mortality reductions, occurred largely as a result of reductions in mortality from non-vaccine-preventable diseases. In contrast, reductions in mortality from vaccine-preventable diseases were not inversely related to increases in GDP per capita.

In column 3 we explore this pattern further by allowing for differential effects between countries that had already experienced the demographic transition (DT) by 1940 and those that had not. For this purpose, we follow the country classification of Cervellati and Sunde (2011).Footnote 40 Starting from the specification of column 1 and interacting the changes in potential mortality from all infectious diseases combined with a dummy variable indicating all post-DT countries, we find that the effect of these mortality reductions on per capita GDP was different for pre- and post-DT countries. While in the former group of countries, reductions in potential mortality were inversely related to changes in GDP per capita, in the latter group they were positively related. This confirms the finding of Cervellati and Sunde (2011) that the IET-related medical advances were beneficial for economic development in countries that had already undergone the demographic transition by 1940.

Column 4 combines the regression specifications of columns 2 and 3 by interacting the post-DT dummy separately with potential mortality from the VP and the NVP disease groups. As the estimation results in this case indicate, there is a negative and statistically significant association between changes in GDP per capita and changes in potential mortality in post-DT countries. However, this relationship only applies to mortality reductions from VP diseases. For NVP diseases the association between changes in GDP per capita and changes in potential mortality is positive and statistically significant. These results highlight a clear distinction between the effects of the IET-related improvements in mortality for different groups of countries. Post-DT countries that experienced large mortality reductions from vaccine-preventable diseases clearly gained in terms of income. All other countries, however, which either had not yet undergone the demographic transition or experienced mortality reductions primarily from non-vaccine-preventable diseases witnessed no income gains.

In columns 5 and 6, we estimate two additional specifications similar to columns 3 and 4. The difference is that we now use the initial fertility rate in each country as the interaction variable instead of the binary post-DT dummy. This allows us to look more carefully at how fertility affected the relationship between changes in mortality and GDP per capita by utilizing the full range of fertility rates observed in our sample. The estimates that we obtain in this case have the opposite signs, as lower fertility rates were characteristic of countries that had achieved the demographic transition. The interpretation, though, is consistent with the results in columns 3 and 4, obtained with the post-DT dummy variable.

The estimates in column 5 suggest that in countries with initially low levels of fertility the relationship between changes in potential mortality and changes in GDP per capita was on average negative. For these countries the health improvements associated with the IET clearly contributed to increases in income levels. In contrast, in countries with initially high levels of fertility the estimated relationship between changes in mortality and changes in GDP per capita would turn positive. In particular, based on the coefficient estimates in column 5, the positive relationship between changes in mortality and income would emerge for all countries with fertility rates above 28.8 births per 1000 population, which corresponds to about 60% of our sample. For these countries larger mortality reductions were associated with smaller increases in GDP per capita. Looking further at the results in column 6, we see that the negative association between changes in mortality and income in low-fertility countries was only observed in cases of mortality reductions from vaccine-preventable diseases, consistent with what we found in column 4.
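
A fertility threshold of this kind follows from setting the fertility-dependent marginal effect of mortality to zero. As a sketch, suppose the interacted specification takes the following long-difference form, where \(F_{c}\) denotes the initial fertility rate, \(\Delta M_{c}\) the change in potential mortality from all diseases combined, and \(\beta_{0}\) and \(\beta_{1}\) the main and interaction coefficients. This notation is assumed here for illustration, abstracts from any further controls, and is not taken from Table 11.

$$\begin{aligned} \Delta \ln y_{c}&=\left( \beta_{0}+\beta_{1}F_{c}\right) \Delta M_{c}+\delta +\varepsilon_{c}, \\ \frac{\partial \, \Delta \ln y_{c}}{\partial \, \Delta M_{c}}&=\beta_{0}+\beta_{1}F_{c}=0\quad \Longleftrightarrow \quad F^{*}=-\frac{\beta_{0}}{\beta_{1}}. \end{aligned}$$

With \(\beta_{0}<0\) and \(\beta_{1}>0\), the marginal effect is negative for \(F_{c}<F^{*}\) and positive for \(F_{c}>F^{*}\). Under these assumptions, the value of 28.8 births per 1000 population reported in the text corresponds to \(F^{*}\).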

To understand the implications of these results, we should remember that the countries experiencing larger mortality reductions from vaccine-preventable diseases are also the ones where on average females gained more in terms of life expectancy than males. With that in mind, the results imply that female health improvements are more conducive to economic development. This is because, in comparison to similar improvements in male health, female health improvements are associated with larger increases in income per capita. At the same time, the interaction effects with the post-DT dummy or the fertility rate suggest that this positive association between changes in health and income will not materialize in all countries. It will only emerge once a country has achieved the demographic transition and freed itself from the pressures of a rising population, for which female health improvements also play a role, as we already noted in Sect. 7.1.

10 Conclusion

In this paper we establish a connection between two important developments that took place over the second half of the twentieth century: the unprecedented improvements in global health and the sharp rise in the educational attainment of women relative to men. We argue that both these developments were to a large extent driven by the diffusion of western medical innovations around the world after World War II, following the coordinated efforts of the United Nations and the World Health Organization. These efforts brought about the so-called International Epidemiological Transition (IET), which led to large reductions in mortality from previously highly fatal infectious diseases all over the world. While the effect that the IET had on life expectancy across countries is well documented, the differential nature of the life expectancy gains across genders and their consequences for the evolution of the education gender gap are a novel contribution of our analysis.

To establish this connection econometrically and determine the direction of causality, we rely on the exogenous nature of the IET-related mortality reductions from the perspective of individual countries. This exogeneity assumption is justified by the fact that the mortality reductions were a product of medical research in a small group of developed countries, which diffused rapidly as a result of systematic efforts by international organizations. In order to establish the differential effect that these medical innovations had on life expectancy across genders, we further exploit biological differences in the responsiveness of males and females to the different methods of disease control. As the medical literature has recently established, vaccine efficacy is higher among females than among males. Hence, health improvements achieved with the introduction of new and improved vaccines are bound to be larger for women. This stands in contrast to health improvements achieved with methods other than vaccines.

Using this empirical strategy, we document that life expectancy gains across countries following the IET were higher among females than among males and that this was particularly the case for gains that can, at least partially, be attributed to vaccines. Moreover, we show that the higher life expectancy gains among females led to greater increases in the educational attainment of females compared to males and this can explain a substantial share of the decline in the education gender gap observed over this period. These findings are scrutinized in a series of robustness checks and assessed relative to alternative explanations for the observed patterns. They are further supported by a similar empirical analysis conducted across states within the United States. Finally, we present evidence suggesting that countries where increases in life expectancy were larger among females than among males also witnessed greater increases in their level of GDP per capita over the post-war period.

Taken together, these results indicate that improvements in female health not only play an important role in empowering women and promoting gender equality, but also benefit the process of economic development. While most of the literature has focused on exploring the direct relationship between gender equality and economic development (Duflo 2012), our findings suggest that there may be policies that can promote both at the same time. These policies may require some time before their impact becomes visible and they may also necessitate strong commitments, such as those made by the United Nations and the World Health Organization in the 1940s to control infectious diseases. In the long run, though, they could prove more effective in achieving gender equality, given the limited evidence that this important policy objective can be achieved solely as a by-product of economic development.