1 Introduction

Human health is an essential part of any society and economic activity. It also occupies a prominent position in Maslow's hierarchy of human needs. As the current pandemic shows, COVID-19 has disrupted many economic activities around the world because of its infectious nature, and its spread has harmed global health conditions. Without good health conditions, an economy loses its ability to develop competitive productivity, which might subsequently hinder economic growth. In general, when global health conditions are disrupted by a pandemic, a severe global economic crisis may emerge, as has been witnessed during the COVID-19 pandemic since 2020, a situation that will potentially continue in the years ahead (World Bank 2021). Clearly, the current situation will create more awareness among economists of the significant positive effect of health on growth. In pre-pandemic times, however, the effect of health on growth was more often the subject of a lively debate among economists. This debate stemmed from a number of conceptual and methodological research problems, in particular the measurement of health and growth and the heterogeneity of their relationship across countries or regions in terms of size and direction. To address the importance of health in maintaining economic activity and to encompass the long discussions on the heterogeneous effects of the health–growth relationship, this paper aims to comprehensively and quantitatively synthesise the existing literature on how health drives economic growth by means of a meta-analytical modelling approach.

Endogenous growth theories emphasise that economic growth is an endogenous result of an economic system (Romer 1997), and they assert that human capital is a key source of endogenous growth. On the empirical side, however, human capital has mostly been defined in terms of schooling alone, and little attention has been paid to the role of health in human capital (see, for example, Bloom et al. 2004). Only in studies from the 2000s onwards has health gradually been acknowledged as an alternative measure of human capital. The argument is that healthier workers tend to be more productive and energetic, which subsequently stimulates higher output.

Numerous studies have attempted to estimate the impact of human health on economic activity and growth. Their findings are not always identical, however, and sometimes even contradict each other, so there is a need for a quantitative synthesis of the existing empirical literature. To analyse the ‘average’ empirical evidence of the effect of health on growth, as documented by many researchers and studies, we quantitatively review the empirical literature on the relationship between health and economic growth. To do this, we employ meta-analysis methods that enable us to examine the systematic dependence of empirical results on study characteristics and other moderating factors that might influence how health affects economic growth. Meta-analysis also allows us to investigate the existence of publication bias and the role of study characteristics and heterogeneity between countries in explaining the health–economic growth relationship.

Meta-analysis is the analysis of quantitative empirical studies which attempts to integrate and explain the literature using some specific key parameters (Jarrell and Stanley 1989). It offers a way to assess the pros and cons of previous statistical or econometric findings by combining important parameter estimates from earlier documented studies and then testing them; most meta-analyses focus on the relationships between core variables (Borenstein et al. 2009). This method has been quite well developed in both health and social science research. Several studies have applied it to identify the variables which affect economic growth (see, for example, Afonso et al. 2020; Benos and Zotou 2014; Longhi et al. 2010; Nijkamp and Poot 2004; Ridhwan et al. 2010; Nunkoo et al. 2020; Valickova et al. 2014). This development has not only provided much empirical evidence in the economic literature but has also stimulated more systematic quantitative comparative research. Nevertheless, to our knowledge, no study has thus far comprehensively analysed the effect of health on economic growth using meta-analysis methods. We therefore aim to close this gap by conducting a meta-analysis of the relationship between health and economic growth.

In our study, we also test for initial evidence of publication bias towards positive effects of health on economic growth; this hypothesis is in line with the majority of the literature, which argues that health improves economic productivity. In addition, we find some evidence of publication bias towards negative and insignificant effects. Our study also suggests that the literature implies a genuine health effect on economic growth. The variation in the health effect on economic growth is influenced by the available data, the estimation procedure, the model specification, the publication channel of the study, and the country-specific characteristics of each study. In our econometric analysis, we apply several estimators and models to examine whether our findings are robust to different estimators and specifications. Overall, we find a greater health effect in studies with newer data from recent years. Studies that use cross-country data also appear to document larger effects, while not accounting for endogeneity seems to create an upward bias. The effect of health on economic growth also tends to be larger in countries with a higher education policy, longer working experience, and better environmental conditions.

The remaining part of the paper is structured as follows. In Sect. 2, we conduct a literature review of the effect of health on economic growth, while in Sect. 3, we summarise our estimates of the effect of health on economic growth from an extensive set of collected studies and test for the initial evidence of publication selection. In Sect. 4, we then employ different econometric specifications for our meta-regression analysis so as to explore the heterogeneity of the estimates. Section 5 concludes our findings.

2 Literature review

2.1 Theoretical background

Human health is a compound variable that is difficult to measure in an unambiguous way. In the literature, a distinction is often made between subjective health conditions (e.g. based on self-reported health perceptions) and objective health indicators (based on official statistics). The latter category can be subdivided into supply variables (e.g. the presence of advanced healthcare services) and client variables (e.g. mortality rates, absence due to illness). In the literature, we find a range of different definitions (see, for example, Sartorius 2006).

To understand how health can affect economic growth, we need to elaborate on the concept of health in a general sense. From an economic perspective, being healthy is not only a matter of not having any disease but also of having the potential to carry out productive activities. Workers with better health will be able to perform at a higher level and will be absent less. Good health also allows people to acquire more education and skills. Health affects economic growth directly by increasing labour productivity and decreasing the costs of illness. Healthy individuals also affect economic growth indirectly by having a healthy family, which may subsequently create healthier future generations. Besides physical health, mental health is an important part of human well-being, as an improvement in an individual’s mental state can increase social and economic participation, engagement and connectedness, and work productivity (Doran and Kinchin 2020).

Health can be seen as one of the components of the aggregate human capital stock that generates output. Workers with better health, higher education, and more experience will be able to contribute more efficiently to economic growth. As summarised by Weil (2007), there are several channels by which health affects economic growth: first, the direct effect through the better productivity of healthier workers; second, an improvement in health increases the incentive for people to acquire more schooling, which subsequently increases the level of education; and third, because people expect to age in good health, more of them save for retirement, thus raising investment and physical capital. The empirical economic model of the health–growth relationship used in the literature (see, among others, Bloom et al. 2019) can generally be described as follows, starting with the conventional Cobb–Douglas production function that takes as its arguments capital and a composite labour input:

$$Y_{it} = A_{it} K_{it}^{\alpha } H_{it}^{1 - \alpha }$$
(1)

where \(Y\) is output; \(K\) is physical capital; \(H\) is the aggregate human capital stock; \(A\) is a country-specific productivity term; \(i\) indexes countries; and \(t\) indexes time periods. The first effect is the direct effect of the better productivity of healthier workers, which operates through the aggregate human capital stock. With \(i = 1,2,3, \ldots\) and \(t = 1,2,3, \ldots\), the corresponding model is:

$$Y_{it} = c + a_{1} h_{it} + \gamma \boldsymbol{X}_{it} + \delta_{i} + \mu_{t} + \epsilon_{it} ,$$
(2)

where \(Y\) is output (measured by real GDP or GDP per capita); \(c\) is the constant term; \(h\) is the health variable (life expectancy in number of years, or the adult survival rate); \(\boldsymbol{X}\) is a vector of other control variables that are important for economic growth (for example, initial income, trade, political stability, macroeconomic stability, institutional quality, geography, demography, and other variables); \(a_{1}\) is the coefficient of interest on the health variable; \(\gamma\) is the coefficient on the control variables; \(\delta\) captures country-specific effects; \(\mu\) captures common time-specific effects; and \(\epsilon\) is the regression error term. In this equation, we assume that the data are in panel form. If a study uses time series data, the cross-sectional dimension is dropped, and if it uses cross-sectional data, the time dimension is dropped. The second effect is that an improvement in health increases the incentive for people to acquire more schooling, which subsequently increases the level of education:

$$Y_{it} = c + a_{1} h_{it} + a_{2} E_{it} + \gamma \boldsymbol{X}_{it} + \delta_{i} + \mu_{t} + \epsilon_{it} ,$$
(3)

where \(E\) is the education variable (for example, school enrolment, years of schooling, and other proxy variables), which enters as part of the aggregate human capital stock, and \(a_{2}\) is the coefficient of interest on the education variable. The remaining symbols are as defined for Eq. (2).

Following Jarrell and Stanley (1989), we collect the estimates reported in studies based on Eqs. (2) and (3) and analyse them by means of meta-analysis with the model:

$$b_{j} = \beta + \mathop \sum \limits_{k = 1}^{K} \partial_{k} X_{jk} + \varepsilon_{j} ,$$
(4)

where \(b_{j}\) is the estimate reported in the jth study, also called the effect size, which in this case is the coefficient or t-statistic of the health variable in the economic growth equation; \(\beta\) is the 'true' value of the parameter of interest; \(X_{jk}\) is a meta-independent variable that measures a characteristic of the empirical study and explains the systematic variation of its results from the rest of the literature; \(\partial_{k}\) is the meta-regression coefficient that reflects the bias effect of a particular study characteristic; and \(\varepsilon_{j}\) is the disturbance term of the meta-regression. The meta-independent variables (\(X_{jk}\)), which determine the design of the meta-analysis, might include dummy variables, specification variables that account for the heterogeneity of the studies, the quality of the data, the sample size or number of observations, and other selected characteristics.

2.2 Empirical evidence

The early empirical approach to analysing the effect of health on economic growth was to regress income growth on the initial level of health using a cross-sectional sample of countries (see, for example, Barro 1991, 1998; Durlauf et al. 2005). These studies found that initial health appears to be a better predictor of economic growth than initial education. More recent work uses dynamic panel data analysis, in which the lagged dependent variable is included as one of the regressors.

A significant amount of empirical work has found that health, which is usually measured by life expectancy, has a positive influence on economic growth (see, for example, Sachs and Warner 1997a; Bloom et al. 2004; Suri et al. 2011). This is naturally in line with the explanation of health as a part of human capital that improves productivity. Despite that, Acemoglu and Johnson (2007) have argued that the first-order effect of increased life expectancy is the increase in population growth, which initially increases capital dilution, and subsequently decreases income growth. While the decrease is later compensated by higher economic activity as more people become productive, this compensation might not be enough if the benefits from increased life expectancy are limited.

Bloom et al. (2019) described the different channels by which health affects economic growth in less developed and developed countries. The main channel for health to affect economic growth in less developed countries is the demographic transition and the timing of the onset of sustained long-run economic growth. The increase in life expectancy has led to a demographic transition, whereby human capital investments become higher because the working life is longer (Ben-Porath 1967; Cervellati and Sunde 2013). The decline in mortality also drives parents to have fewer children, which leads to a better-educated population and subsequently creates an economic–demographic transition. The take-off towards sustained growth is then supported by the demographic dividend. As the population becomes more productive (lower youth and old-age dependency), investments in education, infrastructure, and health increase, all of which subsequently transform economic development into sustained long-run growth.

In developed countries, the relationship between health and economic growth is more complicated. The discourse about whether health might deter economic growth in developed countries centres around two main topics (Bloom et al. 2018). The first is that health gives an improvement in longevity mainly for the elderly (Breyer et al. 2010; Eggleston and Fuchs 2012). Further longevity gain by the elderly might increase the old-age dependency ratio that leads to a decline in the consumption level. The productivity gain from health improvement might also not be enough to offset the elderly’s high medical costs. The second is that the high health expenditure shares in developed countries might deter economic performance because of the excessive absorption of productive assets by the ‘oversized’ health sectors (Pauly and Saxena 2012). While the decrease of chronic diseases could bring productivity improvements, the longevity improvements disproportionately apply to the elderly who are more economically inactive. Nevertheless, within developed economies, the advantages from even a limited increase in health would likely far surpass the losses from forgone consumption (Kuhn and Prettner 2016). The medical development provided by a generous healthcare system also complements these positive outcomes.

3 Data and methodology

In this section, we explain the data collection process and the estimates used in the study.

3.1 The meta-data set

There are guidelines for conducting a meta-analysis that need to be followed step by step. Because MRA (meta-regression analysis) is widely accepted throughout the scientific literature, members of the meta-analysis of economics research network (MAER-Net) believe that it is appropriate to offer guidelines for reporting meta-regression analyses which can serve as minimal standards for academic journals (Stanley et al. 2013).

3.1.1 Data strategy and selection criteria

We searched for empirical studies using the Publish or Perish (PoP) software and Google Scholar with the keywords ‘health’, ‘economic growth’, ‘GDP’, and ‘estimate’; the search ended on 12 December 2020. We read the abstract of each paper to determine whether the study discusses the effect of health on economic growth, and we retained all studies that contain empirical estimates. We used studies from various sources that analyse the relationship between health and economic growth, including published journals, working papers, conference papers, and unranked journals. Using this approach, we initially collected 151 studies for consideration. We then selected the studies that report complete empirical results (that is, the regression coefficient, standard error, and t-statistic) so that each estimate can be used as a data point. We also excluded some outliers from the data. Our final data set contains 64 studies that provide a total of 719 estimates. There are two approaches to modelling economic growth from the health perspective. The first is a micro-based approach in which health conditions such as height, age of menarche, and other variables affect the economic growth of a region or country (Weil 2007). The second, according to Bloom et al. (2019), is a macro-based approach in which health outcomes (life expectancy, adult survival rate) and health expenditure affect economic growth. We chose life expectancy and the adult survival rate as the health outcomes, as these are the most common measures of a country’s health condition. As shown in Table 1, most studies use life expectancy as the measure of health and gross domestic product (GDP) per capita as the measure of a country's economic growth.

Table 1 Summary of health measures and economic growth

Table 2 shows the distribution of studies and estimates by continent. Most of the studies use cross-country data that include countries from different continents. Some notable studies contribute many estimates. Barro and Sala-i-Martin (1995) and Barro and Lee (1994) stated that the effect of health on economic growth is quite large. Both use the same data from 1965 to 1985, covering countries from all continents, and both experiment with various variables that could affect economic growth. In contrast, Acemoglu and Johnson (2007), who used a sample from the 1900s, stated that health growth will initially harm the economy because of the increase in population size and the interactions between humans in pandemic conditions. He and Li (2020) used time series methods to examine the effect of health on the economic growth of each country using data from the 2000s; their estimates differ widely between countries. When Barro (2002) repeated his research from an endogenous growth perspective with the same data as before, the estimated effect of health was again positive. Besides health, Barro (2002) also focused on political and social variables as complementary variables that increase economic growth.

Table 2 Distribution of studies and estimates by continent

3.1.2 Outliers

Following Iršová and Havránek (2013), from a total of 747 estimates, we excluded 28 estimates whose t-statistic exceeded 10 as that might lead to misinterpretation of the 'true effect' or the publication bias of the effect of health on economic growth.

3.1.3 Effect size measures

Because the studies differ in how they measure the dependent variable (GDP or GDP per capita) and health (life expectancy or adult survival rates), their estimates are not directly comparable. Hence, we use the partial correlation coefficient (PCC), a standardised effect size that is commonly used in meta-analysis (Doucouliagos and Ulubas 2006; Valickova et al. 2014). The partial correlation coefficient can be obtained from the t-statistic and the degrees of freedom of an estimate (Greene 2008):

$$pcc_{ij} = \frac{{t_{ij} }}{{\sqrt {t_{ij}^{2} + dof_{ij} } }} ,$$
(5)

where \(pcc_{ij}\) is the partial correlation coefficient of estimate i of the effect of health on economic growth from study j, which ranges from -1 to 1; \(t_{ij}\) is the t-statistic; and \(dof_{ij}\) is the degrees of freedom, both collected from the estimates of the selected studies. The partial correlation coefficient is the standardised effect size of health on economic growth. The standard error of each partial correlation coefficient is obtained from the PCC and the t-statistic:

$$SEpcc_{ij} = \frac{{pcc_{ij} }}{{t_{ij} }} ,$$
(6)

where \(SEpcc_{ij}\) is the standard error of the partial correlation coefficient. Table A1 in Appendix 1 and Figures A1 to A3 in Appendix 2 show that the distribution of the estimates in each study is quite diverse: the lower limit of the distribution of the partial correlation coefficient across studies is -0.79, and the upper limit is 0.86. In the next steps, we assess the true effect of health on economic growth using several methods and strategies.
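As an illustration of Eqs. (5) and (6), the conversion from a reported t-statistic to a partial correlation coefficient and its standard error can be sketched in a few lines of Python. The function names and the example values are hypothetical and are not taken from the meta-data set. The standard error is written as \(1/\sqrt{t^{2} + dof}\), which is algebraically identical to \(pcc/t\) in Eq. (6) but remains defined when the t-statistic is zero.

```python
import math


def partial_correlation(t_stat: float, dof: int) -> float:
    """Eq. (5): partial correlation coefficient from a t-statistic and its degrees of freedom."""
    return t_stat / math.sqrt(t_stat ** 2 + dof)


def pcc_standard_error(t_stat: float, dof: int) -> float:
    """Eq. (6): SEpcc = pcc / t, rewritten as 1 / sqrt(t^2 + dof) to avoid dividing by t = 0."""
    return 1.0 / math.sqrt(t_stat ** 2 + dof)


# Hypothetical estimate: t = 2.0 with 96 degrees of freedom.
pcc = partial_correlation(2.0, 96)
se = pcc_standard_error(2.0, 96)
print(pcc, se)
```

Because \(t^{2} + dof\) is always larger than \(t^{2}\), the resulting coefficient lies strictly between -1 and 1, as stated in the text.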

3.1.4 Conditional and unconditional averages

The result of the average effect size using the simple mean with 95% CI is 0.255 [0.234, 0.276]. However, a simple average suffers from several shortcomings. First, it does not consider the precision of the estimate, as each partial correlation coefficient is ascribed the same weight regardless of its sample size. Second, it does not consider possible publication selection, which can bias the average effect (Valickova et al. 2014).

According to Borenstein et al. (2009), a meta-statistical summary should be performed using the fixed-effect average or the random-effects average to obtain more precise results. In the fixed-effect model, we assume that all studies in the meta-analysis share a common (true) effect size, that is, all factors that could influence the effect size are the same in all the studies. In the random-effects model, we assume that the studies have enough in common that it makes sense to synthesise the information but, in general, there is no reason to assume that they are identical in the sense that the true effect size is the same in all the studies. The decision to use the random-effects model should be based on our understanding of whether or not all studies share a common effect size, and not on the outcome of a statistical test.

In the meta-analysis, we look at the effect size from a different point of view compared with the usual empirical approach. Generally, a regression is carried out to determine the projection of the observed effects that are viewed from the population effect, while in the meta-analysis we determine the projection of the population effect that is viewed from the observed effect (Borenstein et al. 2009). Table 3 shows the results of the meta-analysis using several methods. In the fixed-effect model, the inverse-variance method weights each study by the inverse of its variance. In the random-effects model, because we assume that the estimates are drawn from different populations, we also consider the between-study variance or residual heterogeneity, called τ² (tau-square). According to Harbord (2008), there are several methods to estimate τ², namely the method of moments (MM) of DerSimonian and Laird (1986); restricted maximum likelihood (REML) (Harville 1977); and empirical Bayes (EB) (Morris 1983).
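To make the weighting schemes concrete, the following Python sketch computes the inverse-variance fixed-effect average and a random-effects average that uses the method-of-moments (DerSimonian and Laird) estimate of τ². It is a minimal illustration under stated assumptions, not the procedure used to produce Table 3: the function names are hypothetical, at least two estimates with differing precision are assumed, and the REML and EB estimators are not covered.

```python
from typing import Sequence, Tuple


def fixed_effect_average(effects: Sequence[float], ses: Sequence[float]) -> float:
    """Inverse-variance weighted average: each estimate is weighted by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def dersimonian_laird_tau2(effects: Sequence[float], ses: Sequence[float]) -> float:
    """Method-of-moments estimate of the between-study variance, truncated at zero."""
    weights = [1.0 / se ** 2 for se in ses]
    fe = fixed_effect_average(effects, ses)
    # Cochran's Q: weighted squared deviations from the fixed-effect average.
    q = sum(w * (e - fe) ** 2 for w, e in zip(weights, effects))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    return max(0.0, (q - (len(effects) - 1)) / c)


def random_effects_average(effects: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Random-effects average with weights 1 / (SE^2 + tau^2); returns (average, tau^2)."""
    tau2 = dersimonian_laird_tau2(effects, ses)
    weights = [1.0 / (se ** 2 + tau2) for se in ses]
    average = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return average, tau2


# Hypothetical partial correlations and standard errors.
print(random_effects_average([0.1, 0.5], [0.1, 0.1]))
```

When τ² is large relative to the sampling variances, the random-effects weights become nearly equal, which is why a random-effects average tends to move towards the simple mean.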

Table 3 Summary of the effect size of health on economic growth

Table 3 shows that the effect size is around 0.25, which means that the effect of health on economic growth is positive overall. Using the fixed-effect method, we find an effect size with 95% CI of 0.113 [0.110, 0.117], which is quite low compared with the simple average. This might be a sign of publication bias because, when we give more weight to studies with a larger sample, the effect size decreases. Nevertheless, the fixed-effect model has a disadvantage because it assumes that all factors that affect the effect size are the same in each estimate, which is unlikely to be true. When we allow other factors to determine the effect size of health on economic growth, the value of tau-square is quite large, indicating substantial dispersion in the distribution of the effect size. The results obtained using random effects are close to the simple average, at approximately 0.248. In this case, the random-effects model might be better at summarising the true effect size of health on economic growth. We also find that, on average, the effect of increasing life expectancy or the adult survival rate by one year is 0.024, implying that a one-year increase in a population’s life expectancy improves economic growth by 2.4 per cent.

3.2 Identification of publication bias

We need to know which other factors might influence the effect size, and one of them is publication bias (Sutton et al. 2005). Publication bias might occur because studies that report statistically significant results are more likely to be published. Consequently, even a careful review of the existing published literature will not provide an accurate overview of the body of research in an area if the literature itself reflects selection bias. Therefore, the presence of publication bias is usually tested both formally and graphically, the latter by means of a funnel plot. A funnel plot is a scatter diagram of precision against the estimated effect (such as regression coefficients or partial correlation coefficients). Precision is best measured by the inverse of the standard error of the partial correlation coefficient (Doucouliagos and Stanley 2009):

$$prec_{ij} = \frac{1}{{SEpcc_{ij} }} ,$$
(7)

When there is no publication selection, estimates should vary randomly and symmetrically around the ‘true’ population effect. Because small-sample studies with typically lower precision form the base of the graph, the plot will be more spread out there than at its top. As shown in Table 1, we have 64 studies and 719 estimates from these studies. In detail, 58 estimates are negative and statistically significant; 55 estimates are negative but not statistically significant; 95 estimates are positive but not statistically significant; and 501 estimates are positive and statistically significant. Together, these data produce a funnel plot that is asymmetrical and leans to the right. Figure 1 shows a publication bias on the right of the funnel plot, which can be interpreted as follows: (i) researchers may treat statistically significant results more favourably and (ii) researchers may prefer a particular direction of the estimate. However, the interpretation of the funnel plot is rather subjective, which requires us to use a more formal method to assess publication bias (Iršová and Havránek 2013).

Fig. 1 Funnel plot

Funnel asymmetry testing (FAT) and precision effect testing (PET) are used to detect publication bias and genuine effects, and they perform quite well even when the incidence of publication selection is severe. Following Stanley (2006), we perform a FAT regression analysis to test for the presence of publication bias and of a true effect by regressing the estimated effect size on its standard error (Doucouliagos and Stanley 2009):

$$pcc_{ij} = \beta_{0} + \beta_{1} SEpcc_{ij} + \mu_{ij} ;i = 1, \ldots ,M ;j = 1, \ldots ,N$$
(8)

where \(i\) indexes the estimates within the \(j\)th study, and N is the total number of studies. The coefficient \(\beta_{1}\) indicates the publication bias, and \(\beta_{0}\) indicates the true effect. We include \(SEpcc_{ij}\) because, if there is publication selection, the authors of small-sample studies tend to search for larger estimates to compensate for their large standard errors (Benos and Zotou 2014). If the null hypothesis \(\beta_{1} = 0\) is rejected, the sign of the estimate of \(\beta_{1}\) indicates the direction of the publication bias. If the null hypothesis \(\beta_{0} = 0\) is rejected, this implies the existence of a genuine effect of health on economic growth beyond publication bias. Stanley (2006) examined the properties of the test using Monte Carlo simulations and concluded that it is a powerful method for testing for the presence of a genuine effect and that it is effective regardless of the extent of publication selection.

Because the explanatory variable in Eq. (8) is the standard error of each estimate, and the studies use different sample sizes and different econometric models and techniques, the error term of this MRA model, \(\mu_{ij}\) in Eq. (8), is likely to be heteroscedastic. We address this by applying weighted least squares (WLS), that is, by dividing Eq. (8) by the standard error of the effect size measure (\(SEpcc_{ij}\)):

$$\frac{{pcc_{ij} }}{{SEpcc_{ij} }} = t_{ij} = \frac{{\beta_{0} }}{{SEpcc_{ij} }} + \beta_{1} + \frac{{\mu_{ij} }}{{SEpcc_{ij} }} ,$$
(9)
$$t_{ij} = \beta_{1} + \frac{{\beta_{0} }}{{SEpcc_{ij} }} + \varphi_{ij} , \quad \varphi_{ij} \mid SEpcc_{ij} \sim N\left( {0,\theta } \right) ,$$
(10)
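The transformed regression in Eq. (10) can be illustrated with a short Python sketch that regresses the t-statistic on precision by ordinary least squares. In this form the intercept estimates \(\beta_{1}\) (the FAT term) and the slope on \(1/SEpcc_{ij}\) estimates \(\beta_{0}\) (the PET term). The function name and the example data are hypothetical; the sketch returns point estimates only, assumes that the standard errors are not all equal, and ignores both the between-study variance and the dependence of estimates within a study.

```python
from typing import Sequence, Tuple


def fat_pet(pccs: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Point estimates of Eq. (10): returns (beta0, beta1).

    beta0 is the slope on precision (PET, 'true' effect);
    beta1 is the intercept (FAT, publication bias).
    """
    t_stats = [p / s for p, s in zip(pccs, ses)]   # dependent variable t_ij
    precision = [1.0 / s for s in ses]              # regressor 1 / SEpcc_ij
    n = len(t_stats)
    mean_x = sum(precision) / n
    mean_t = sum(t_stats) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(precision, t_stats))
    beta0 = sxt / sxx
    beta1 = mean_t - beta0 * mean_x
    return beta0, beta1


# Hypothetical data generated as pcc = 0.2 + 1.5 * SE, so beta0 = 0.2 and beta1 = 1.5.
ses = [0.05, 0.1, 0.2, 0.25]
pccs = [0.2 + 1.5 * s for s in ses]
print(fat_pet(pccs, ses))
```

A positive intercept in this regression corresponds to estimates that grow with their standard errors, which is the pattern that the FAT interprets as selection for positive results.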

Generally, it is very unlikely that all heterogeneity can be explained so that no ‘residual heterogeneity’ is left. Accordingly, random-effects meta-regression analysis is more appropriate than fixed effects. We therefore weight the equation using both the within-study sampling variance (based on \(SEpcc_{ij}\)) and the between-study variance (τ²). For estimating the between-study variance in meta-regressions, Harbord (2008) suggests several methods based on either iterative or non-iterative estimation. For example, the residual heterogeneity of the random-effects model can be computed by an iterative estimation such as restricted maximum likelihood (REML) or the empirical Bayes (EB) method, while the method-of-moments (MM) estimator is non-iterative. In this case, the residual heterogeneity or between-study variance represents the variation in the observed growth effects of health in excess of that expected from the imprecision of the results within each study.

After dividing Eq. (8) by the standard error of the partial correlation coefficient, the t-statistic becomes the dependent variable in Eq. (10). The FAT and PET tests can be carried out using Eq. (10). However, because we use a large number of studies and multiple estimates per study, we need to control for the potential dependence of the estimates within a study by employing a mixed-effect multilevel model (Doucouliagos and Stanley 2009; Doucouliagos and Ulubas 2006; Valickova et al. 2014):

$$t_{ij} = \beta_{1} + \frac{{\beta_{0} }}{{SEpcc_{ij} }} + \varepsilon_{j} + \delta_{ij} , \quad \varepsilon_{j} \mid SEpcc_{ij} \sim N\left( {0,\theta } \right), \quad \delta_{ij} \mid SEpcc_{ij} \sim N\left( {0,\varphi } \right) ,$$
(11)

The overall error term (φij) from Eq. (10) now breaks down into two components: study-level random effects (\(\varepsilon_{j}\)), and estimate-level disturbances (δij). The multilevel framework is more suitable because it takes into account the unbalanced nature of the data, allowing for nested multiple random effects, and it is more flexible (Rusnak et al. 2013).

To check the robustness of our FAT and PET results (Eq. 11), we use additional ME models, namely the WAAP (weighted average of the adequately powered) estimator and a model that includes outliers. The WAAP estimator, proposed by Ioannidis et al. (2017), can correct for publication bias when the data lack statistical power. An estimate is considered adequately powered when its standard error is smaller than the absolute value of the fixed-effect average divided by 2.8, where 2.8 is the sum of 1.96 and 0.84 (for further explanation, see, for example, Gallet and Doucouliagos 2017). We then estimate a model that includes the outliers so that we can see how the results differ before and after the outliers are excluded.
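The adequate-power rule described above can be sketched as follows. This is a simplified illustration, not the mixed-effect specification estimated in the paper: the function names are hypothetical, the fixed-effect average is used as the benchmark effect, the comparison is taken to be a strict inequality, and the function returns `None` when no estimate passes the filter (an assumption about how to handle that case).

```python
from typing import Optional, Sequence


def inverse_variance_average(effects: Sequence[float], ses: Sequence[float]) -> float:
    """Fixed-effect (inverse-variance weighted) average of the estimates."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def waap(effects: Sequence[float], ses: Sequence[float]) -> Optional[float]:
    """Weighted average of the adequately powered estimates.

    An estimate is kept when its standard error is below |fixed-effect average| / 2.8,
    where 2.8 = 1.96 + 0.84 (5% two-sided significance plus 80% power).
    """
    threshold = abs(inverse_variance_average(effects, ses)) / 2.8
    powered = [(e, s) for e, s in zip(effects, ses) if s < threshold]
    if not powered:
        return None  # assumption: no adequately powered estimates, so no WAAP value
    kept_effects = [e for e, _ in powered]
    kept_ses = [s for _, s in powered]
    return inverse_variance_average(kept_effects, kept_ses)


# Hypothetical data: the imprecise third estimate (SE = 0.5) is filtered out.
print(waap([0.3, 0.3, 0.9], [0.05, 0.1, 0.5]))
```

Because only precise estimates survive the filter, the WAAP average is less sensitive to small, imprecise studies, which are the ones most exposed to publication selection.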

The results in Table 4 show that the null hypothesis of β1 = 0 is rejected in the ME and ME-MM methods, indicating publication bias towards positive results. In all methods, the null hypothesis of β0 = 0 is rejected, implying the existence of a genuine effect of health on economic growth. We reject the null hypothesis of no between-study heterogeneity at the 1% level, which is confirmed by likelihood ratio tests. We also find that the within-study correlation is large, indicating that it is more appropriate to use the mixed-effect estimator (Rusnak et al. 2013). We therefore report the estimates from the mixed-effect multilevel model rather than OLS or WLS. However, this specification still assumes that all heterogeneity is caused solely by publication bias and sampling error, which is unrealistic (Rusnak et al. 2013; Valickova et al. 2014).

Table 4 Funnel asymmetry test (FAT) and precision effect testing (PET)

According to Doucouliagos (2011), if the ‘true effect’ (β0) is below 0.07, the effect is considered to be ‘very small’; if it is between 0.07 and 0.17, it is ‘small’; if it is between 0.17 and 0.33, it is ‘medium’; and if it is more than 0.33, then the result has a ‘large’ effect. Without weighting by ‘residual heterogeneity’, Table 4 shows a ‘true effect’ of 0.038 (column 1), which is categorised as a very small effect. Adding the weight results in a larger effect and produces an insignificant publication bias. The MM, REML, and EB methods produce a higher effect size than the mixed-effect method without between-study variance. Using the ME, ME-MM, and ME-REML methods, we find a relatively ‘small’ effect. By contrast, using only studies that use GDP as the dependent variable and using the WAAP method, we find a ‘medium’ effect of health on economic growth from the FAT-PET tests. This result suggests that when we use estimates with adequate statistical power, the effect size of health on economic growth becomes larger.
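For reference, the thresholds above can be written as a small lookup. The function name `classify_effect` is hypothetical, and two conventions are assumptions not stated in the text: the classification is applied to the absolute value, and a value of exactly 0.17 is treated as ‘medium’.

```python
def classify_effect(pcc):
    """Label a partial correlation using the thresholds quoted above."""
    size = abs(pcc)      # assumption: classify by magnitude, ignoring sign
    if size < 0.07:
        return "very small"
    if size < 0.17:      # assumption: exactly 0.17 falls in 'medium'
        return "small"
    if size <= 0.33:
        return "medium"
    return "large"


label = classify_effect(0.038)   # the column 1 estimate from Table 4
```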

4 Key findings

In this section, we discuss the findings of the meta-regression analysis (MRA). Using the mixed-effect (ME) method, we examine the effect of health on growth and the role of country characteristics. In addition, the estimation characteristics, health measurement, and model characteristics are explained in detail. As in previous studies, our findings vary across the MRA features.

4.1 Meta-regression analysis

Because the effect size of each study might depend on the specifications used by each study, we use multivariate meta-regression to determine whether the effect size varies between different contexts and specifications. The difference in the effect size might be caused by the heterogeneity of the study design and the country characteristics in each study. Publication bias is also a frequent concern in systematic reviews, and meta-regression analysis (MRA) allows us to correct for it. We also identify the drivers of the differences between studies. Table 5 shows the moderator variables that we codified. We categorise the variables in several groups: difference in the dependent variables; data characteristics; estimation characteristics; health measure difference; model characteristics; and publication characteristics (Jarrell and Stanley 1989; Stanley et al. 2013). In addition, we look at the country characteristics that might affect the effect size. We follow Jarrell and Stanley (1989) and estimate the following equation:

$$t_{ij} = \beta_{1} + \frac{\beta_{0}}{SEpcc_{ij}} + \sum\limits_{k = 1}^{K} \frac{\partial_{k} X_{ijk}}{SEpcc_{ij}} + \varepsilon_{j} + \delta_{ij}, \quad \varepsilon_{j} \mid SEpcc_{ij} \sim N\left(0,\theta\right), \quad \delta_{ij} \mid SEpcc_{ij} \sim N\left(0,\varphi\right),$$
(12)

where X stands for the set of moderator variables that are assumed to affect the reported estimates, each weighted by the standard error or standard error plus τ2 to correct for heteroscedasticity; ∂k is the meta-regression coefficient which reflects the biasing effect of a particular study characteristic; and K denotes the total number of moderator variables. This specification assumes that publication bias (β1) and true effect size (β0) vary randomly across studies.
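It may help to see the same specification before weighting. Assuming, as described for the FAT-PET equation, that the t-statistic is the partial correlation coefficient divided by its standard error (tij = pccij / SEpccij), multiplying Eq. (12) through by SEpccij gives the following form, where the composite error uij is notation introduced here for illustration only:

$$pcc_{ij} = \beta_{0} + \sum\limits_{k = 1}^{K} \partial_{k} X_{ijk} + \beta_{1} SEpcc_{ij} + u_{ij}, \quad u_{ij} = SEpcc_{ij} \left( \varepsilon_{j} + \delta_{ij} \right).$$

Read this way, β0 is the effect expected for a perfectly precise estimate with all moderators at zero, each ∂k shifts that effect according to a study characteristic, and β1 measures how strongly reported effects rise with their standard errors. Because the spread of uij grows with SEpccij, the unweighted form is heteroscedastic, which is why the estimation is carried out on the weighted version.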

Table 5 Summary statistics

4.2 Explaining the heterogeneity

The first category of our estimates’ characteristics is data characteristics. The first difference is in the data structure: some estimates use cross sections, some use panel data, and some use time series. We also look at the period covered by the sample to find whether there is a difference between decades, because more recent data might be more complete and analysed with better approaches. We also categorise the countries included in the sample by income, using a categorical variable: low-income countries, with GDP per capita of at most US$ 1,026 at constant 2010 value; middle-income countries, with GDP per capita between US$ 1,027 and 12,475 at constant 2010 value; and high-income countries, with GDP per capita above US$ 12,475 at constant 2010 value. We presume that income corresponds directly with a country’s stage of development, as GDP per capita is one of the basic components of the human development index.Footnote 4
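The income grouping amounts to a simple threshold rule, sketched below. The function name `classify_income` is hypothetical, and the handling of boundary values (for example, treating exactly US$ 12,475 as middle income) is an assumption made only so that the categories do not overlap.

```python
LOW_MAX = 1026       # upper bound for low income (constant 2010 US$)
MIDDLE_MAX = 12475   # upper bound for middle income (constant 2010 US$)


def classify_income(gdp_per_capita):
    """Assign an income group from GDP per capita at constant 2010 US$."""
    if gdp_per_capita <= LOW_MAX:
        return "low"
    if gdp_per_capita <= MIDDLE_MAX:   # assumption: boundary is 'middle'
        return "middle"
    return "high"


groups = [classify_income(v) for v in (800, 5000, 30000)]
```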

Table 5 summarises the heterogeneity in our estimates and their study and country characteristics. We use these characteristics as independent variables in our meta-regression analysis using a mixed-effect estimator because we use multiple estimates from one study. The FAT-PET tests indicate a high within-study correlation, which suggests that the mixed-effect estimators are more appropriate. We use regular ME, ME-MM, and ME-REML estimators to check the robustness of our estimation.

In Fig. 2, we can see the heterogeneity of effect size in each country. As shown in Fig. 2, most of the American and European countries have a positive health effect on the economy. Indeed, Belize is the only country in the Americas that has a negative effect size. It can also be seen from Appendix 3, Figures A4 to A7, that there is visible variability in the health effect within each continent. We also note that a less developed country, like Sudan, has a high health effect on economic growth, while a developed country, like Finland, has a relatively low effect. This phenomenon might be explained by the difference in how health affects economic growth in developed and less developed countries. Less developed countries with a high health effect might be in the middle of an economic–demographic transition that spurs their economic growth. On the other hand, developed countries with a low health effect might be experiencing a high old-age dependency ratio because of the high number of elderly people and a high absorption of productive assets by the ‘oversized’ healthcare system. Overall, being a high-income country does not necessarily translate into a high effect of health on economic growth.

Fig. 2 Country heterogeneity in the effect size of the impact of health on economic growth

The second category of our variables concerns the difference in estimation characteristics used in each estimate, which we divide into six parts, namely ordinary least squares (OLS); generalised method of moments (GMM); instrumental variable (IV); fixed effect (FE); random effect (RE); and time series estimation. Apart from the differences in estimation, the variables used in the model are grouped into several groups of variables.Footnote 5 Based on the relevant mainstream literature, we classify these models into model 1,Footnote 6 model 2,Footnote 7 model 3,Footnote 8 and model 4,Footnote 9 and a specification that does not fit any of these four models is called ‘the other model’.

The third category is the difference in the dependent variable to measure economic growth. In this connection, there are studies that use the country's GDP or GDP per capita as a measure of the country's economic growth. Also in terms of the health measure difference, two common measures are widely used to represent the health of a country: life expectancy and adult survival rate.

The fourth category is the publication characteristics. To see the difference between studies that are published in different types of journals, we use the quartile index from Scimago Journal and Country Rank. If a journal is not listed on Scimago or has not yet been assigned to a quartile, we categorise it as an unranked journal. Besides published journal articles, we also use some studies that are published as working papers. We therefore divide the studies into Q1, Q2, Q3, Q4, unranked journals, and working papers. We also look at the year of publication, as perceptions of the importance of health for economic growth might have changed over the years, along with the econometric techniques used in recent studies. Lastly, we add impact factors to capture any differences in quality that might not be captured by the methodological moderator variables.

The fifth category is country characteristics that might influence the quality of a country’s health and economy and, subsequently, the effect of health on economic growth. These are collected from an open-source online database, the World Development Indicators (WDI). We look for variables that might affect both health and economic growth. Some of these variables are explained in Table 5. We use the number of years of compulsory education, workers’ experience, and CO2 emissions as some of the characteristics that might influence how health affects economic growth. Our goal is to see how the different characteristics of a country might affect the effect size, in particular non-economic variables that are necessary to support a country’s economic growth and also the environmental aspect. To avoid the reverse causality problem between the health effect and our country characteristics variables, we choose variables that arguably do not have a bidirectional relationship with the health effect on economic growth. For example, to capture the education effect on health, we employ years of compulsory education, which is considered an exogenous policy shock (see, for example, Angrist and Krueger 1991; Vella and Klein 2006).

In Table 6, we employ various estimators and models to observe the sensitivity of our estimates. In the first column, we use the regular mixed-effect (ME) method, while in the third and fourth columns we use ME-MM and ME-REML, which take the between-study variance or residual heterogeneity, τ2 (tau-squared), as an additional weight for the estimates. In the second column, we estimate only effect sizes that address endogeneity (using studies with the IV estimator only). We also check the robustness of our result by doing the reverse, estimating only effect sizes that do not address endogeneity (not shown in Table 6 for brevity); overall, the result is quite similar to the full-sample estimation. In the fifth column, we include only statistically significant variables. In the sixth column, we test the sensitivity of our model by adding the outliers that were initially excluded. In the last column, we examine whether using only estimates from rated journals could compromise the results.

Table 6 Explaining the differences in the estimates of the health–growth nexus

The effect size of health on economic growth, indicated by the 1/SE coefficient, is 0.819 for the specific model and is statistically significant in all models, indicating the existence of a true effect of health on economic growth. The result from using only observations from rated journals gives a similar interpretation to the results from using all observations. Studies that use GDP as the dependent variable increase the effect by around 0.07 compared with studies that use GDP per capita as the dependent variable. Regarding the time of the data used, studies that use more modern data, with the year 2000 as the midpoint, report a higher health effect on growth. This might indicate that as overall health conditions around the world improve, the contribution of health to economic growth also becomes higher. The data structure also plays a role in the effect size heterogeneity: time series studies seem to show higher estimates than panel and cross-sectional studies. This might be because cross-country samples pool countries with varying characteristics, which diminishes the estimated effect size of health relative to samples that use only one country.

We found only a small effect of a country’s income on the effect size, but there might be a negative impact on the effect size when a country becomes richer. When we include only estimates that account for endogeneity (column 2), this negative impact becomes more pronounced and statistically significant. As we noted in the literature review section, more developed countries might be experiencing a lower health effect because of high old-age dependency, an ‘oversized’ health sector, and the emergence of diseases such as diabetes, strokes, and mental health disorders that might hinder the contribution of health to economic growth. Biases from other estimates that do not account for endogeneity might suppress this effect.

Regarding the model estimator characteristics, we found that studies that account for endogeneity (using the IV estimator) recorded a lower effect size. This evidence suggests that not accounting for endogeneity may create an upward bias in the estimated effect of health on economic growth. We found no difference by the type of health measure used in a study, whether it is life expectancy or the adult survival rate. The empirical specification of an estimate has an influence on the effect size: studies with more comprehensive variables that explain economic growth report a higher estimate of the health effect. Studies from Q1 journals and working papers also seem to report a higher effect.

Besides study characteristics, we also consider some country characteristics as control variables that might influence the effect size. We consider the effect of human capital on the effect size by using compulsory education and workers’ experience. An extra year of compulsory education increases the effect size, as a higher education level in a country might raise the population's awareness of health and subsequently make the country more productive. We also control for the years of experience that workers have in a country because a country with a high life expectancy tends to have an older labour force with longer working experience (Bloom et al. 2004). We find that countries with longer workers’ experience have a higher effect size. An increase in health investment might be more beneficial in a country where the population has long working experience because the productive activities of a healthy population will be spread over a higher number of labour hours.

The environmental condition of a country also influences the health effects: countries with higher CO2 emissions have a lower health effect. Higher economic growth might escalate the use of fossil fuel that is responsible for environmental pollution through the emission of carbon, sulphur, etc. Our findings suggest that, while fossil fuel consumption might accelerate a country’s economic growth through high production rates, its detrimental effect on health might eventually be large enough to offset the productivity gain.

5 Conclusion

In the current pandemic, it is becoming more important to consider health as an integral part of economic growth. This study aims to analyse the impact of health on economic growth based on data from 64 studies with 719 estimates that cover four continents: Asia, Europe, America, and Africa. Our results reveal the following important findings, which should be relevant for theoretical discourse and policymaking.

First, based on our study list, we find that, on average, increasing life expectancy or the adult survival rate by one year corresponds to a 2.4% increase in economic growth. We then find evidence of publication bias towards a positive effect of health on economic growth. After accounting for the heterogeneity of the estimates, we show that health has a genuine positive effect on economic growth despite the mixed findings of previous studies.

Second, the effect of health on growth seems to be higher in less developed countries, indicating that improvements in health might induce an economic–demographic transition and a take-off towards long-run growth in developing countries. The lower effect in developed countries may be related to high old-age dependency, an ‘oversized’ health sector, and the emergence of some diseases (diabetes, strokes, mental health, etc.).

Third, our findings suggest that the variation of the health effect on economic growth is also influenced by several data characteristics of the study. Studies that use more modern data report a higher health effect on growth, indicating that as overall health conditions, technology, and data around the world improve, the contribution of health to economic growth also becomes higher. Studies that account for endogeneity report a lower effect size, indicating the existence of upward bias in studies that do not account for endogeneity. Studies with more comprehensive variables report a higher estimate of the effect of health on growth. These results suggest that the specifications of the study play an important role in determining the magnitude of the effect size. Estimates that do not address causal issues and do not include sufficient variables to explain economic growth should be treated with caution.

Finally, the characteristics of the countries in the studies also influence the effect size. More years of compulsory education, longer working experience, and better environmental conditions are found to increase the effect size. Overall, we find that better country conditions allow health to channel its benefit to economic growth more effectively. Nevertheless, there might be some other country characteristics that could not be incorporated in the meta-regression, such as the health condition of the countries (e.g. healthcare spending and the prevalence of obesity); these may suffer from reverse causality, and we leave this for future research.

As an implication of our review on the effect of health on economic growth, policymakers should be aware that a healthy population is a necessary condition for accelerating economic growth: it allows economic activities to be performed, and good health itself improves labour productivity and induces sustainable economic growth. As a result, the maintenance and improvement of health conditions should be one of the most important agendas in policymaking, especially in the light of the recent global economic disruption caused by the COVID-19 pandemic.