1 Introduction

At least since Smith’s (1776) “The Wealth of Nations,” economists have tried to understand why some countries are richer than others. One of the most widely used approaches to address this question, development accounting, combines measured inputs according to an aggregate production function and compares the estimated outputs to countries’ observed gross domestic products (GDPs). The current consensus is that differences in physical capital account for only a limited fraction of differences in GDP. On the other hand, the relative importance of total factor productivity (TFP) and human capital for explaining income differences remains an unsettled question. For instance, Manuelli and Seshadri (2014) and Jones (2014) find that differences in human capital account for four-fifths of cross-country income differences, while earlier work by Hall and Jones (1999) concluded that they explain only one-fifth. There are two main reasons for such pronounced differences in findings: (1) how human capital is measured and (2) how inputs to human capital are combined, i.e. the functional form chosen for the production function (or composite) of human capital.

The current paper mainly focuses on the first aspect: measurement of human capital. Accurately measuring a country’s stock of human capital is challenging. The literature has pointed out various shortcomings of the traditionally used measure of average years of schooling and has highlighted the importance of taking into account qualitative measures of human capital. One can envision that a broad notion of human capital would furthermore include aspects related to accumulation and investments beyond formal schooling such as job experience, and that for experience to translate into more skills, lifelong learning and training could potentially be important. Last but not least, individuals’ health also has a clear impact on human capital. However, until recently data limitations made it impossible to construct such a broad measure for different countries. This changed when data from the “Programme for the International Assessment of Adult Competencies” (PIAAC) became available. In the current paper we construct a comprehensive measure of aggregate human capital using many different variables related to individual human capital from PIAAC.

In particular, PIAAC data is available for a sample of 30 upper-middle and high-income countries.Footnote 1 GDP per worker among these countries ranges from 48,325 USPPP$ in Estonia to 151,909 USPPP$ in Norway. Given that all ingredients for our measure of human capital—schooling, cognitive skills, job experience, on-the-job-training, and health—thus come from a common source, we are able to circumvent measurement problems that arise when using different data bases. To obtain parameters for the weight of each dimension of human capital in the human capital composite, we use individual level PIAAC data which also include information on wages, and we estimate Mincerian wage regressions for the US. We then combine these so-constructed measures of human capital with data on the stock of physical capital from the Penn World Tables, and we carry out a classical development accounting exercise. Our results show that differences in physical capital together with our multidimensional measure of human capital can account for 42% of the variance in income, compared to 27% when using years of formal schooling only. Differences in cognitive skills play the largest role while experience and health are of lesser importance.

The current paper contributes to the development accounting literature which is extensively reviewed in Caselli (2005) and Hsieh and Klenow (2010). First, while some purely quantitative measures of human capital such as average years of schooling are readily available across countries (see Barro and Lee (2013)), finer measures that take into account aspects of quality are harder to come by. Unlike previous literature that relies on student test scores as a proxy for the quality of human capital (e.g. Caselli (2005) or Hanushek and Woessman (2008)), we are able to approximate the quality of human capital actually used in production with test scores on cognitive skills of the working-age population. Alternative approaches by Schoellman (2012) or Hendricks (2002) who use returns to schooling or average earnings of foreign-educated individuals in the US to proxy for differences in the quality of education across countries are potentially affected by the selection of migrants or wage discrimination.

Second, unlike previous papers, we consider a broad measure of human capital including not only average years of formal schooling and cognitive test scores but also measures of job experience, health, and on-the-job-training. While subsets of these measures have been considered in the development accounting literature, to the best of our knowledge ours is the first paper that combines them all. For instance, Klenow and Rodriguez-Clare (1997) find that adding experience to a measure of human capital that includes only years of schooling increases the explanatory power of the standard model used in development accounting by a mere 4–5% points, in stark contrast to increases by almost 70% in Lagakos et al. (2012). These differences are partly due to the fact that the former, like the current paper, assume the same returns to experience across countries, while Lagakos et al. (2012) estimate different returns to years of job experience across countries. Furthermore, there is no question that health is an important determinant of human capital. However, Acemoglu and Johnson (2007) point out that the effect on output per capita is ambiguous, as improvements in health that raise individual productivity also imply larger populations due to increased life expectancy. Nevertheless, Weil (2007) finds that differences in health outcomes across countries, measured by average height, survival rates and age at menarche, contribute around 10% of the variance of log GDP per worker, while Shastry and Weil (2003) consider the prevalence of anemia and adult survival rates and find that differences in these variables explain 1.3 and 19% respectively. Differences in results hence clearly hinge on how health outcomes are measured. For our sample of upper-middle and high-income countries, where we observe very little variation in objective measures such as survival rates or average height, we rely on self-reported health data.Footnote 2 Regarding on-the-job-training, Manuelli and Seshadri (2014) show that human capital investments that individuals undertake after completion of formal schooling constitute an important component of human capital, in particular for richer countries such as those in our sample.

In our benchmark exercise we follow the standard assumptions in the development accounting literature and consider competitive factor markets and perfect substitutability of workers. However, Jones (2014), Caselli and Coleman (2006) and most recently Malmberg (2017) show that considering imperfect substitutability of workers with different education levels increases the explanatory power of human capital in development accounting. We hence extend our basic framework and consider a modified model with imperfect substitutability of individuals with and without college education. For reasonable degrees of substitutability we can account for up to 45% of the cross-country variance in output per worker, i.e. 8% more compared to our baseline model with perfect substitutability. We then confirm the robustness of our results along other dimensions, including alternative measures of cognitive skills and experience, as well as considering additional moments from cognitive skill distributions. The remainder of this paper is organized as follows: Sect. 2 outlines the theoretical framework behind development accounting, and we explain how our multidimensional measure of human capital fits into this framework. Sect. 3 describes the data used for our exercise, and in Sect. 4 we explain how we estimate the parameters for the human capital composite. Sect. 5 reports and discusses our results, in Sect. 6 we provide robustness checks, and Sect. 7 concludes.

2 Development accounting: framework

Let \(y_{j}\) be country \(j\)'s GDP per worker. Following Hall and Jones (1999), we propose that \(y_{j}\) is produced according to a Cobb-Douglas production function

$$\begin{aligned} y_{j}=A_{j}k_{j}^\alpha h_{j}^{1-\alpha }, \end{aligned}$$

where \(A_{j}\) is total factor productivity (TFP), \(k_{j}\) is capital per worker, \(h_{j}\) is average human capital per worker, and the parameter \(\alpha \) is the capital share in GDP. The so-called “factor-only model” which allows us to calculate counter-factual income gaps assumes that TFP is equal across countries:

$$\begin{aligned} y_{KHj}=k_{j}^\alpha h_{j}^{1-\alpha }. \end{aligned}$$
(2.1)
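As a minimal sketch of Eq. (2.1), the snippet below computes factor-only output for two hypothetical countries. The function name and the input values are illustrative assumptions and are not taken from the paper's data; the capital share of 0.33 is the value adopted later in Sect. 4.

```python
ALPHA = 0.33  # capital share; the value the paper adopts in Sect. 4


def factor_only_output(k: float, h: float, alpha: float = ALPHA) -> float:
    """Counterfactual GDP per worker y_KH = k^alpha * h^(1 - alpha), with TFP set equal across countries."""
    if k <= 0 or h <= 0:
        raise ValueError("k and h must be positive")
    return k ** alpha * h ** (1.0 - alpha)


# Hypothetical inputs (not the paper's data): same human capital, capital doubled.
y_a = factor_only_output(k=300_000.0, h=3.0)
y_b = factor_only_output(k=150_000.0, h=3.0)
ratio = y_a / y_b  # doubling k alone scales output by 2 ** alpha
```

Because TFP is held fixed, any gap between two countries in this sketch comes only from differences in \(k_{j}\) and \(h_{j}\).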

In a fairly broad set-up one can envision that the average worker’s human capital depends on years of formal schooling \({s_{j}}\)—with qualitative aspects approximated by measures of cognitive skills \(c_{j},\)—work experience \({x_{j}}\), on-the-job-training \({ojt_{j}}\), as well as the individual’s health status \(hl_{j}\)

$$\begin{aligned} h_{j}=g(s_{j}, c_{j}, x_{j}, ojt_{j}, hl_{j}). \end{aligned}$$
(2.2)

While in principle g(.) can take on many different functional forms, we follow the literature and assume

$$\begin{aligned} h_{j}=exp(\phi s_{j}+\tau c_{j}+f(x_{j}) +\varsigma ojt_{j} + \theta hl_{j}), \end{aligned}$$
(2.3)
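The composite in Eq. (2.3) could be evaluated as in the sketch below. The quadratic form for f(.) mirrors the Mincer regression in Sect. 4. The weights for schooling, cognitive skills, on-the-job-training and health echo the magnitudes reported there, while `psi1` and `psi2` are hypothetical placeholders; all input values are invented for illustration.

```python
import math

# Placeholder weights. phi, tau, varsigma and theta echo magnitudes reported in
# Sect. 4; psi1 and psi2 are hypothetical values chosen only for illustration.
PARAMS = {"phi": 0.06, "tau": 0.12, "psi1": 0.03, "psi2": -0.0005,
          "varsigma": 0.09, "theta": 0.06}


def experience_term(x: float, psi1: float, psi2: float) -> float:
    """Quadratic experience profile f(x) = psi1 * x + psi2 * x^2."""
    return psi1 * x + psi2 * x ** 2


def human_capital(s, c, x, ojt, hl, p=PARAMS) -> float:
    """h = exp(phi*s + tau*c + f(x) + varsigma*ojt + theta*hl)."""
    index = (p["phi"] * s + p["tau"] * c
             + experience_term(x, p["psi1"], p["psi2"])
             + p["varsigma"] * ojt + p["theta"] * hl)
    return math.exp(index)


# Hypothetical country means: one extra year of schooling, all else equal.
base = human_capital(s=12.0, c=0.0, x=20.0, ojt=0.3, hl=3.0)
more_school = human_capital(s=13.0, c=0.0, x=20.0, ojt=0.3, hl=3.0)
schooling_ratio = more_school / base  # equals exp(phi)
```

The exponential form means each input shifts human capital proportionally, which is what allows log-wage regression coefficients to serve as weights.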

where f(.) denotes the functional form for experience in human capital accumulation. Under perfect competition and free entry, private returns to human capital accumulation—the wage increase due to additional units of human capital—equal social returns—increases in aggregate human capital. In such an environment, the exponential form for \(h_{j}\) has the advantage that its parameters can be directly obtained from Mincerian wage regressions, see Sect. A.2 of the Appendix for details.

Development accounting then compares the observed variation in GDP per worker across countries to the estimated counter-factual variation due to differences in measured factors of production only. In particular, following Caselli (2005) we measure our model’s success by the following ratio:

$$\begin{aligned} success=\frac{var(\log (y_{KH}))}{var(\log (y))}. \end{aligned}$$
(2.4)

This ratio indicates how much of the variance of observed log output per worker is accounted for by the variance of the constructed log output per worker from the model. Hence, the larger the measure of success, the smaller is the role assigned to TFP in explaining cross-country differences in income.
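The success ratio in Eq. (2.4) can be sketched as follows. The function name and the toy numbers are assumptions for illustration only. The sketch uses the population variance; since numerator and denominator share the same number of countries, the sample variance would give the same ratio.

```python
import math
from statistics import pvariance


def success(y_kh, y):
    """Ratio var(log y_KH) / var(log y) across countries."""
    if len(y_kh) != len(y):
        raise ValueError("both series must cover the same countries")
    log_kh = [math.log(v) for v in y_kh]
    log_y = [math.log(v) for v in y]
    return pvariance(log_kh) / pvariance(log_y)


# Toy example: factor-only output is the square root of observed output, so its
# log is half as dispersed and the variance ratio is one quarter.
observed = [1.0, 4.0, 16.0]
factor_only = [1.0, 2.0, 4.0]
share_explained = success(factor_only, observed)
```

A value of one would mean that measured factors reproduce the full observed dispersion in log output, leaving no role for TFP differences.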

3 Data

For constructing our multidimensional measure of the stock of human capital in each country we rely on PIAAC data. PIAAC can be described as the OECD’s adult version of its better-known “Programme for International Student Assessment” (PISA). While PISA assesses students’ cognitive skills, PIAAC attempts to do so for a country’s working-age population. Up to now PIAAC has been conducted twice: in 2011 and 2012 in 24 countries, and between 2014 and 2015 in nine additional countries. Apart from cognitive test scores in numeracy and literacy, PIAAC also provides information about individuals’ schooling, continuous education, work experience, income etc. In particular, PIAAC data satisfy three important criteria that we exploit for our analysis: i) they are collected for a nationally representative sample of the working-age population (16–65 years), ii) they are comparable across countries, and iii) they contain information on several dimensions of human capital. In sum, PIAAC data offer a relatively accurate picture of skills potentially relevant to the labor market and could hence help explain differences in income across countries. The sample used for our analysis consists of the following 30 countries: Austria, Belgium, Canada, Chile, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Ireland, Israel, Italy, Japan, Korea, Lithuania, Netherlands, New Zealand, Norway, Poland, Singapore, Slovakia, Slovenia, Spain, Sweden, Turkey, United Kingdom, and United States.Footnote 3 Compared to most development accounting studies, we are thus able to include only a limited number of countries. The advantage of this sample of developed countries, however, is that assuming the same aggregate production function and thus omitting any country-specific factors should be less of an issue.

For GDP and capital per worker we use data from the Penn World Tables (Feenstra et al. 2015) for 2011 and 2015 for countries from the first and second wave of PIAAC respectively. Table 1 displays summary statistics for our sample. Average GDP per worker among the countries in our sample is 78,192 USPPP$, ranging from 48,325 USPPP$ in Estonia to 151,909 USPPP$ in Norway. Even though our sample only includes upper-middle and high-income countries, the magnitude of variation in GDP per worker is significant, with a cross-country standard deviation equal to 28% of the mean. Average capital per worker is equal to around 4 times GDP per worker.

Table 1 Summary statistics: country-level data

Table 1 also displays summary statistics for inputs used in our human capital composite. All measures are calculated using PIAAC data, and they are weighted means for individuals age 25–65. Average schooling in our sample of countries is 12.5 years, ranging from 8.1 years in Turkey to 14.6 years in Ireland. The variance in average years of schooling is relatively low, which in part explains the difficulty of accounting for cross-country income differences among developed countries when only using this narrow measure of human capital. However, at least since PISA 2000 many studies have pointed out important cross-country differences in the quality of education systems and as a consequence in the quality of a country’s stock of human capital (see e.g. Hanushek and Woessman 2012 or Barro 2001). To account for the qualitative aspect of human capital we consider individuals’ cognitive skills. In particular, we focus on cognitive skills in numeracy and literacy.Footnote 4 According to the OECD, numeracy “is defined as the ability to access, use, interpret and communicate mathematical information and ideas [...] to engage in [...] a range of situations in adult life” (OECD 2016a pg. 18). Test questions in the numeracy domain range from being presented with a picture of a thermometer and subtracting 30 degrees from the temperature shown to calculating how many wind power stations are needed to substitute for a nuclear plant, which requires correctly converting kilowatt hours into gigawatt hours and vice versa. Literacy skills are defined as “the ability to understand, evaluate, use and engage with written texts to participate in society [...] and to develop one’s knowledge and potential” (OECD 2016a pg. 18). Test questions range from reading a report about an election and identifying the candidate with the fewest votes to understanding and evaluating a bibliographical search about a given topic amid distracting information. Each skill domain is scored on a scale ranging from 0 to 500 points. Cross-country differences in cognitive test scores are quite large. In particular, average numeracy scores range from 203 in Chile to 289 in Japan, and average literacy scores from 216 in Chile to 296 in Japan.Footnote 5

Experience refers to years of actual job experience as reported by the individual. The average individual in our sample has around 20 years of job experience, ranging from 14 years in Turkey to 24 years in Denmark. Note that variability in this measure across countries is limited, approximately equal to 10% of the mean. Compared to typically used measures of potential experience, defined as the number of years that have elapsed since an individual finished schooling, our measure of actual experience has the following advantage. For individuals who enter and exit the labor market frequently (e.g. women, migrants) or for those who study and work at the same time, potential and actual experience may differ significantly. However, to be comparable to findings in the literature, in the robustness section we also replicate our results using measures of potential experience. As a measure of on-the-job-training we use the share of individuals who report having participated in on-the-job-training in the year prior to the survey. This share varies significantly across countries, ranging from 8% in Greece to 45% in Finland.Footnote 6 Finally, individuals’ health status in PIAAC is self-assessed on the following 5-point scale: 1 = excellent, 2 = very good, 3 = good, 4 = fair, 5 = poor. We invert this scale such that higher values correspond to better health. In our sample of countries, the mean self-reported health status ranges from 2.5 in Korea to 3.8 in Israel, displaying rather low variability across countries. Note that self-reported health is not available for Canada and Turkey, which is why we use US and Italian means respectively for these countries. According to data on perceived health by the OECD (2017), self-reported health in the US and Italy in 2011 is closest to values for Canada and Turkey in 2011 and 2015 respectively.Footnote 7 Table 14 in the Appendix displays country means for all dimensions of our human capital composite together with the number of observations available for each country.

Table 2 Summary statistics: individual level data US

3.1 Individual level data for Mincer regressions

We rely on individual level data from PIAAC for the US to run Mincerian wage regressions and to estimate the weight of each dimension of human capital in our composite. We use US data because we conjecture that the assumption of perfectly competitive markets, which allows us to use estimated Mincer coefficients as parameter values for the human capital composite, is most likely met in the US. Our sample includes US wage and salary employees between 25 and 65 years of age, and hence we exclude self-employed and younger workers (16–24 years). We do so for the following reasons: (i) missing information on hours worked by self-employed individuals implies that we cannot construct hourly wages for this group, (ii) part of the income of the self-employed comes from sources other than labor such as capital (see Gollin 2002), and (iii) workers below the age of 25 might still be completing their formal schooling. Given that individual wages in PIAAC for the US are only available in deciles, we follow Hanushek et al. (2015): we assign the mean wage per decile proposed by the authors and trim the bottom and top 1% of the wage distribution. Table 2 displays descriptive statistics for US individual level data from PIAAC. Note that to be able to better interpret coefficients we have standardized cognitive test scores such that for the sample of the entire US working-age population they have mean zero and standard deviation one. Regarding all other dimensions of human capital, such as years of schooling, experience, health and participation in on-the-job-training, means for US workers exceed those of average workers in the majority of countries in our sample.
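The standardization of test scores described above could look like the following sketch, in which scores for the regression sample are rescaled using the mean and standard deviation of a broader reference population. The function name and the scores are hypothetical; the sketch also ignores survey weights and uses the population standard deviation, both of which are simplifying assumptions.

```python
from statistics import mean, pstdev


def standardize(scores, reference):
    """Rescale scores so that the reference population has mean 0 and SD 1."""
    mu = mean(reference)
    sigma = pstdev(reference)
    if sigma == 0:
        raise ValueError("reference scores have no variation")
    return [(v - mu) / sigma for v in scores]


# Hypothetical numeracy scores on the 0-500 scale.
working_age_population = [210.0, 250.0, 290.0, 230.0, 270.0]
employees = [250.0, 290.0]  # a subsample, e.g. wage and salary employees

z_employees = standardize(employees, working_age_population)
z_population = standardize(working_age_population, working_age_population)
```

Because the reference group is the whole working-age population, the standardized scores of the employee subsample need not have mean zero themselves.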

4 Parametrization

We estimate the following Mincer regression to obtain weights for each dimension of human capital

$$\begin{aligned} \log w_{i}=\beta _{0}+\phi s_{i}+\tau c_{i} +\psi _{1}x_{i}+\psi _{2}x_{i}^2+\theta hl_{i}+\varsigma ojt_{i}+\epsilon _{i}, \end{aligned}$$
(4.1)

where \(w_{i}\) is the hourly wage of individual i, \(s_{i}\) refers to years of schooling, \(c_{i}\) are standardized numeracy skills, \(x_{i}\) denotes years of actual experience, \(hl_{i}\) is an index for self-reported health status, \(ojt_{i}\) is a dummy variable that takes on value one if the individual has participated in on-the-job-training, and \(\epsilon _{i}\) is the error term. We do not include both literacy and numeracy test scores at the same time, because both measures are highly correlated at the individual level. Nevertheless, we assess the robustness of our results using literacy rather than numeracy test scores in Sect. 6. Note also that we explicitly do not control for individual characteristics such as gender or migrant status, given that we want to estimate the returns to skills of the “average” worker independently of his or her characteristics.
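To illustrate the mechanics of Eq. (4.1), the sketch below fits the regression by ordinary least squares on synthetic, noise-free data and recovers the coefficients used to generate it. Everything here is an assumption for illustration: the data are simulated rather than drawn from PIAAC, the "true" coefficients are placeholders, and the small normal-equations solver stands in for whatever estimation software was actually used.

```python
import random


def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(p)]
    for col in range(p):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if A[pivot][col] == 0.0:
            raise ValueError("regressors are perfectly collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def design_row(s, c, x, hl, ojt):
    """Regressors in the order beta0, phi, tau, psi1, psi2, theta, varsigma."""
    return [1.0, s, c, x, x * x, hl, ojt]


# Placeholder coefficients used only to simulate log wages.
TRUE = [2.0, 0.06, 0.12, 0.03, -0.0005, 0.06, 0.09]

rng = random.Random(0)
X, y = [], []
for _ in range(300):
    row = design_row(s=rng.uniform(8, 18), c=rng.gauss(0, 1), x=rng.uniform(0, 40),
                     hl=float(rng.randint(1, 5)), ojt=float(rng.randint(0, 1)))
    X.append(row)
    y.append(sum(b * v for b, v in zip(TRUE, row)))  # noise-free log wage

estimates = ols(X, y)
```

With real survey data one would add an error term, apply sampling weights and report standard errors; the sketch only shows how the regressors line up with the parameters of the human capital composite.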

Table 3 Mincer regression for US wage and salary employees, age 25–65

Table 3 displays the results from this estimation when adding one dimension of human capital at a time. Results in the second column show that once we control for years of schooling and numeracy skills, the estimated coefficients remain relatively unchanged. In our preferred specification, displayed in the last column, one additional year of schooling is associated with 6% higher hourly wages, similar to a one point higher self-reported health status. One additional standard deviation in numeracy is associated with 12% higher hourly wages, and having received on-the-job training with 9% higher hourly wages. Finally, an individual’s first year of job experience is related to approximately 2.9% higher hourly wages.

Table 4 Benchmark parameters

Our estimated coefficients for years of schooling and cognitive skills are quite similar to those obtained by Hanushek et al. (2015) who also use PIAAC data for the US. However, we can also compare estimated returns to others in the literature. For instance, Psacharopoulos and Patrinos (2004) estimate very similar returns to schooling of 6.8% for the OECD average. Other existing estimates for returns to cognitive skills in the US tend to be higher than ours. Using data from the International Adult Literacy Survey (IALS), Blau and Kahn (2005) estimate returns to a one standard deviation increase in cognitive test scores of 0.16 while Hanushek and Zhang (2009) find returns of 0.2. Our estimated returns to one additional year of job experience for US workers lie between estimates by Hanushek et al. (2015) (0.015 and \(-0.028\)) and those in Manuelli and Seshadri (2014) (0.10 and \(-0.02\)). Regarding health outcomes, Contoyannis and Rice (2001) and Jäckle and Himmler (2010) also use categorical variables for self-reported health measures and provide estimates for the effect on wages separately for men and women in the UK and Germany respectively. The latter find that wages of very healthy men are between 1.3 and 7.8% higher compared to those in poor health. Our estimated coefficients on the other hand indicate that an improvement by one health category increases wages by 7%. Previous findings on the relationship between on-the-job-training and wages by Dearden et al. (2006) show that a 1% point increase in training is associated with an increase in hourly wages of about 0.3%. Parameter values for our development accounting exercise are based on estimated coefficients from our preferred specification, see Table 4. The capital share in production, \(\alpha \), is set to the standard value of 0.33 (see Caselli 2005).

5 Results

In Table 5 we display the results from our development accounting exercise that tests to what extent variation in factors of production can explain variation in output per worker across countries. Our results indicate that differences in physical capital alone account for 22% of the difference in GDP per worker across the 30 countries in our sample. This number is consistent with findings in previous literature indicating that about 20% of the variation in income across countries is due to differences in physical capital, see Hsieh and Klenow (2010).

Table 5 Results: development accounting

When adding differences in average years of schooling, our measure of success increases only slightly to 26.8%. This is lower compared to previous findings indicating that years of schooling account for around 10% of cross-country differences in output per worker. As mentioned before, in our exercise the limited contribution of years of schooling is partly due to little variation in this measure across the group of countries in our sample. However, by expanding our measure of human capital to also include cognitive skills, the share of explained variance increases to 33%, i.e. the model’s explanatory power increases by 6.2% points.Footnote 8 Hence, the quality of human capital, as proxied by workers’ cognitive skills seems to be somewhat more important than years of schooling. Schoellman (2012) on the other hand finds that education quality differences, estimated using migrants’ returns to schooling, are roughly as important as differences in years of schooling. Including years of job experience adds around 4.5% points to our measure of success, in line with findings by Klenow and Rodriguez-Clare (1997). While including measures of health and on-the-job training further increases the model’s explanatory power both effects are smaller, adding only 3.1 and 0.8% points respectively to our measure of success. When we approximate human capital using all its components, our measure of success is 15% points or 56% higher compared to a model which uses differences in physical capital and average years of schooling only.

5.1 Imperfect substitutability of workers with different levels of education

Most of the development accounting literature assumes that a country’s aggregate human capital can be expressed in average efficiency units, and that workers with different levels of human capital are perfect substitutes. One implication of this assumption is that the distribution of human capital does not affect relative wages of workers with different levels of education, which stands in stark contrast to empirical evidence (see e.g. Katz and Goldin 2009). As mentioned before, Jones (2014), Caselli and Coleman (2006) and Malmberg (2017) show that allowing for imperfect substitutability of workers with different education levels has the potential to increase the importance of human capital for development accounting.

We hence extend our benchmark model to allow for imperfect substitutability of individuals with and without college education.Footnote 9 To this end, we modify Eq. (2.1) in the following way

$$\begin{aligned} y_{KHj}=k_{j}^\alpha \left( \gamma _{c}\left( h_{c,j}L_{c,j}\right) ^{\rho }+(1-\gamma _{c})\left( h_{nc,j}L_{nc,j}\right) ^{\rho }\right) ^{\frac{1-\alpha }{\rho }}, \end{aligned}$$
(5.1)

where \(\rho \) determines the elasticity of substitution between workers with and without college degrees, and \(\gamma _{c}\) determines the labor share of college educated workers. \(L_{c,j}\) and \(L_{nc,j}\) indicate the proportion of college and non-college educated workers in country j. Human capital of college (\(h_{c,j}\)) and non-college (\(h_{nc,j}\)) educated workers is defined in a similar manner as \(h_{j}\) in Eq. 2.3, but with potentially education-specific parameters \(\phi _{e}\), \(\varsigma _{e}\),\(\psi _{e}\), \(\theta _{e}\), \(\tau _{e}\) for \(e=c,nc\). To obtain these parameters we run separate Mincer regressions for college and non-college educated workers in the US. Results from these regressions displayed in Table 15 in the Appendix show that returns to years of schooling, numeracy skills, and on-the-job-training are substantially larger for college educated workers. Note that different from our benchmark model where we disregard the constant term from the Mincer regression, we now need to take these terms into account when constructing (\(h_{c,j}\)) and (\(h_{nc,j}\)).Footnote 10
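Equation (5.1) can be sketched as the function below. The names and the numerical inputs are hypothetical; in the paper, \(h_{c,j}\) and \(h_{nc,j}\) come from education-specific Mincer regressions and \(\gamma _{c}\) is calibrated separately.

```python
def ces_output(k, h_c, l_c, h_nc, l_nc, *, alpha, gamma_c, rho):
    """y_KH = k^alpha * (gamma_c*(h_c*L_c)^rho + (1-gamma_c)*(h_nc*L_nc)^rho)^((1-alpha)/rho)."""
    if rho == 0:
        raise ValueError("rho must be non-zero in this CES form")
    labor = (gamma_c * (h_c * l_c) ** rho
             + (1.0 - gamma_c) * (h_nc * l_nc) ** rho)
    return k ** alpha * labor ** ((1.0 - alpha) / rho)


# Hypothetical inputs. When both groups supply the same efficiency units
# (here 2.0 * 0.5 = 1.0 * 1.0), the CES aggregate collapses to that common
# value whatever gamma_c and rho are.
y_equal = ces_output(1.0, 2.0, 0.5, 1.0, 1.0, alpha=0.33, gamma_c=0.4, rho=0.6)

# With rho = 1 the two groups are perfect substitutes and the aggregate is linear.
y_linear = ces_output(1.0, 3.0, 0.5, 1.0, 0.5, alpha=0.33, gamma_c=0.5, rho=1.0)
```

The lower \(\rho \) is, the more the relative scarcity of each type of worker matters for output, which is why the distribution of education enters the accounting exercise in this extension.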

The elasticity of substitution (ES) between the two types of workers is given by \(ES={\frac{1}{1-\rho }}\).Footnote 11 Ciccone and Peri (2005) using an instrumental variable approach for a panel of US states from 1950–1990, estimate an elasticity of substitution between more and less educated workers of around 1.5, while Card (2009) reports values between 1.5 and 2.5. Krusell et al. (2000) use a value of 1.67 for the elasticity of substitution between workers with at most a high-school diploma and those with higher education. Using data on relative wages and the skill intensity of exports across countries, Malmberg (2017) estimates an elasticity of substitution of 1.3. In our exercise we allow the elasticity of substitution to vary between 2.5 (\(\rho =0.6\)) and 1.3 (\(\rho =0.23\)).
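The mapping between the elasticity of substitution and \(\rho \) used in this paragraph can be checked with a few lines of arithmetic. The helper names are assumptions for illustration.

```python
def es_from_rho(rho: float) -> float:
    """Elasticity of substitution ES = 1 / (1 - rho)."""
    if rho >= 1:
        raise ValueError("rho must be below 1")
    return 1.0 / (1.0 - rho)


def rho_from_es(es: float) -> float:
    """Inverse mapping rho = 1 - 1 / ES."""
    if es <= 0:
        raise ValueError("ES must be positive")
    return 1.0 - 1.0 / es


upper_es = es_from_rho(0.6)   # elasticity implied by rho = 0.6
lower_rho = rho_from_es(1.3)  # rho implied by an elasticity of 1.3
```

These reproduce the two endpoints quoted in the text: an elasticity of 2.5 for \(\rho =0.6\), and \(\rho \approx 0.23\) for an elasticity of 1.3.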

Under perfect competition the wage of college educated workers \(w_c\) (non-college educated workers \(w_{nc}\)) equals their marginal product of \(\frac{\partial Y}{\partial L_{c}}\) (\(\frac{\partial Y}{\partial L_{nc}}\)). Using this result together with data on average hourly wages of college and non-college educated workers in the US from PIAAC, we can obtain \(\gamma _{c}\) from the following equation:

$$\begin{aligned} \frac{w_c}{w_{nc}}=\frac{\gamma _{c}}{1-\gamma _{c}}\left( \frac{ h_{c}}{h_{nc}}\right) ^{\rho }\left( \frac{ L_{c}}{L_{nc}}\right) ^{\rho -1}. \end{aligned}$$
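Solving the wage-ratio condition above for \(\gamma _{c}\) gives \(\gamma _{c}=q/(1+q)\) with \(q=\frac{w_c}{w_{nc}}\left( \frac{h_{c}}{h_{nc}}\right) ^{-\rho }\left( \frac{L_{c}}{L_{nc}}\right) ^{1-\rho }\). The sketch below implements this inversion and checks it by a round trip. The function names and all numerical inputs are hypothetical, not the paper's calibration targets.

```python
def implied_wage_ratio(gamma_c, h_ratio, l_ratio, rho):
    """w_c / w_nc = gamma_c/(1-gamma_c) * (h_c/h_nc)^rho * (L_c/L_nc)^(rho-1)."""
    return gamma_c / (1.0 - gamma_c) * h_ratio ** rho * l_ratio ** (rho - 1.0)


def calibrate_gamma_c(wage_ratio, h_ratio, l_ratio, rho):
    """Invert the wage-ratio condition for gamma_c."""
    odds = wage_ratio / (h_ratio ** rho * l_ratio ** (rho - 1.0))
    return odds / (1.0 + odds)


# Hypothetical inputs: college wage premium, relative human capital and
# relative employment shares of college versus non-college workers.
gamma = calibrate_gamma_c(wage_ratio=1.6, h_ratio=1.3, l_ratio=0.7, rho=0.6)
round_trip = implied_wage_ratio(gamma, h_ratio=1.3, l_ratio=0.7, rho=0.6)
```

Repeating the calibration for each value of \(\rho \) yields one \(\gamma _{c}\) per assumed elasticity of substitution.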

For different values for the elasticity of substitution, Table 6 shows the calibrated values for \(\gamma _{c}\).

Table 6 Calibrated values for \(\gamma _c\)

Results from the corresponding development accounting exercise that considers imperfect substitutability of workers with and without college education are displayed in Table 7. The smaller the elasticity of substitution between college and non-college educated workers, the better the index of success. Depending on the human capital composite considered, results show that for reasonable parameter values the model has 8–12% more explanatory power than our baseline model.

Table 7 Development accounting with imperfect substitutability of college and non-college educated workers

6 Robustness

We test the robustness of our results along various dimensions. First, we consider an alternative definition for experience, i.e. potential instead of actual experience. Second, we measure cognitive skills with test scores in literacy instead of numeracy. Third, we look at the distribution of test scores, and fourth we consider alternative samples for our Mincer regressions that also include self-employed individuals or that exclude potential students, i.e. individuals age 25–29. Finally, we present our main results when excluding Norway, a country whose GDP is highly dependent on a specific factor of production, crude oil, which is not captured in our production function.

Actual versus potential experience. Most previous literature uses potential experience instead of actual experience as reported by individuals. To be comparable, we repeat our exercise using potential experience which we compute as the difference between the year of the PIAAC survey and the year individuals finished their highest level of education.Footnote 12 Results from the corresponding Mincer regression in the last column of Table 16 show that coefficients for potential experience are smaller than those for actual experience in our baseline regression. Regarding all other coefficients, estimated returns to schooling, numeracy skills and on-the-job training are larger while those for health are smaller. Results from the corresponding development accounting exercise display a somewhat lower explanatory power when using this approximation for job experience; see Table 8.

Table 8 Robustness: using potential instead of actual experience

Test domain cognitive skills. Our main analysis uses numeracy test scores to proxy for individuals’ cognitive skills. However, as mentioned before we also have data on literacy test scores. While both measures are highly correlated at the individual level and Chile and Japan are the worst and best performers in both domains respectively, the entire ranking of countries is not invariant to the use of either measure. We hence repeat our exercise using test scores in literacy instead. Table 17 in the Appendix shows the results from the corresponding Mincer regressions. While all other coefficients are basically invariant to this change, literacy test scores seem to be somewhat more weakly related to individuals’ wages than numeracy test scores. We thus repeat our development accounting exercise using estimated coefficients from the last column of Table 17 as parameter values. Results hardly change; see Table 9.

Table 9 Robustness: using literacy instead of numeracy test scores
Table 10 Robustness: using proficiency levels instead of average numeracy test scores

Distribution of cognitive skills. Our baseline human capital composite includes country means of cognitive test scores. However, it may be the case that the productivity of a country’s stock of human capital is better captured by the entire distribution of cognitive skills. In particular, results from PISA studies highlight important differences even across countries with similar average educational achievements but very different distributions of achievers (see e.g. OECD 2016b). To check for the importance of the distribution of cognitive skills for explaining income differences, we consider the share of adults who score at each of the six so-called proficiency levels defined by PIAAC (see footnote 13). This modification leads us to estimate the following Mincer regression, which includes dummy variables dc for each proficiency level:

$$\begin{aligned} \log w_{i}=\beta _{0}+\phi s_{i} +\sum _{j=1,2,3,4,5}\tau _{j} dc_{j,i}+\psi _{1}x_{i}+\psi _{2}x_{i}^2+\theta hl_{i}+\varsigma ojt_{i}+\epsilon _{i}, \end{aligned}$$

where individuals with the lowest proficiency level (level 0) constitute the reference group. All other variables are as defined before. The estimated coefficients from the above equation are displayed in Table 18 in the Appendix. Returns to schooling, experience, on-the-job training, and health basically remain unchanged. Only the coefficients on the dummy variables for proficiency levels three and above are significantly different from zero, while wage returns for individuals scoring at levels zero, one, and two are statistically indistinguishable. Table 19 in the Appendix displays the share of individuals at each proficiency level in our US sample as well as in the pooled sample of countries. Repeating our development accounting exercise, we find that our baseline model, which uses average cognitive skills, has slightly stronger explanatory power than the model that takes into account the distribution of cognitive skills (see Table 10). This might be surprising given that the modified human capital composite includes more detailed information regarding cognitive skills. However, the insignificant estimates for the coefficients on proficiency levels one and two could be offsetting the gain from this additional information.
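To make the structure of the regression above concrete, the following sketch estimates it by ordinary least squares on simulated data. All numbers, including the coefficients used to generate log wages, are hypothetical and chosen only for illustration; they are not our estimates, and the data are not from PIAAC. The variable names mirror the symbols in the equation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated worker characteristics (hypothetical, not PIAAC data).
s = rng.integers(8, 21, size=n).astype(float)   # years of schooling
x = rng.uniform(0, 40, size=n)                   # years of experience
hl = rng.integers(0, 2, size=n).astype(float)    # health indicator
ojt = rng.integers(0, 2, size=n).astype(float)   # on-the-job training indicator
level = rng.integers(0, 6, size=n)               # proficiency level 0..5

# Dummies dc_1..dc_5; level 0 is the omitted reference group.
dc = (level[:, None] == np.arange(1, 6)[None, :]).astype(float)

# Assumed coefficients used only to generate log wages: levels one and
# two carry no premium over level zero, higher levels do.
tau_true = np.array([0.0, 0.0, 0.10, 0.20, 0.30])
log_w = (1.0 + 0.08 * s + dc @ tau_true + 0.03 * x - 0.0005 * x**2
         + 0.05 * hl + 0.04 * ojt + rng.normal(0.0, 0.3, size=n))

# Design matrix in the order of the equation:
# constant, s, dc_1..dc_5, x, x^2, hl, ojt.
X = np.column_stack([np.ones(n), s, dc, x, x**2, hl, ojt])
beta, *_ = np.linalg.lstsq(X, log_w, rcond=None)

phi_hat = beta[1]      # return to schooling
tau_hat = beta[2:7]    # premia of levels 1..5 relative to level 0
print("phi:", round(phi_hat, 3))
print("tau:", np.round(tau_hat, 3))
```

Because level 0 is omitted, each element of `tau_hat` measures the log wage premium of a proficiency level relative to the lowest one, holding the other regressors fixed.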

Alternative sample including self-employed individuals. In our main Mincer regression, we restricted our attention to dependent workers. However, in trying to account for output differences across countries, the productivity of self-employed individuals could potentially also play an important role. We hence repeat our Mincer regression using an alternative sample in which we include self-employed individuals. In this case we use log monthly earnings including bonuses for dependent workers and the self-employed (see Table 20 in the Appendix). Returns to schooling and cognitive skills remain almost unchanged compared to our baseline estimation. However, returns to all other dimensions of human capital change substantially. Returns to experience and on-the-job training double, while returns to health increase by around two-thirds, leading to larger measures of success in our development accounting exercise (see Table 11). Note, however, that these results have to be taken with a grain of salt for two reasons: (i) in our Mincer regression that includes the self-employed and their earnings, we cannot control for hours worked, which makes the estimated returns less precise, and (ii) the exercise attributes all earnings of the self-employed to labor, which is far from realistic; see Gollin (2002).

Table 11 Robustness: using parameters from an alternative sample that includes self-employed individuals

Alternative sample of individuals aged 30–65. According to data from the National Center for Education Statistics (2017), 13–15% of 25–30 year olds in the US were enrolled in school in 2011–2015. While this number is much smaller than the fraction of 20–25 year olds enrolled in school (39–40%), we want to make sure that this aspect is not affecting our results. We hence run our Mincer regression for an alternative sample that excludes individuals below the age of 30 (see Table 21 in the Appendix). Compared to our baseline estimation, returns to schooling and health remain basically unchanged. However, returns to experience and on-the-job training are lower, while returns to numeracy skills are higher for this older sample of individuals. This leads to slightly improved measures of success in our development accounting exercise (see Table 12).

Table 12 Robustness: using parameters from an alternative sample with individuals age 30–65

Excluding Norway. Norway’s GDP is highly dependent on crude oil, something that is not captured in our production function. Caselli (2005) points out that some authors adjust countries’ income measures by value added in mining industries to account for such dependencies. We simply repeat our baseline estimation excluding Norway. Not surprisingly, as Table 13 reveals, measures of success increase significantly, by around 10 percentage points. Without Norway, differences in physical capital together with our broad measure of human capital account for 52% of the variance in output per worker.
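The mechanics behind this result can be illustrated with a small sketch. It assumes that the measure of success is the variance of log output predicted from factors alone relative to the variance of log observed output, in the spirit of Caselli (2005), and that output per worker is Cobb-Douglas in physical and human capital with a capital share of one third. The four countries and all numbers are made up; country "D" stands in for an economy whose output is high relative to its measured inputs.

```python
import numpy as np


def success(log_y_kh, log_y):
    """Variance of log factor-only output relative to the variance of
    log observed output (assumed form of the measure of success)."""
    return np.var(log_y_kh) / np.var(log_y)


alpha = 1.0 / 3.0  # assumed capital share

# Hypothetical countries (made-up numbers, not our data).
countries = ["A", "B", "C", "D"]
k = np.array([1.0, 2.0, 4.0, 4.0])   # physical capital per worker
h = np.array([1.0, 1.2, 1.5, 1.5])   # human capital per worker
y = np.array([1.0, 1.8, 3.0, 6.0])   # observed output per worker

# Log output implied by factors only, with TFP held constant.
log_y_kh = alpha * np.log(k) + (1.0 - alpha) * np.log(h)
log_y = np.log(y)

full = success(log_y_kh, log_y)

# Drop country "D", whose output is not matched by its measured inputs.
keep = np.array([c != "D" for c in countries])
restricted = success(log_y_kh[keep], log_y[keep])

print("all countries:", round(full, 3))
print("excluding D:  ", round(restricted, 3))
```

In this toy example, dropping the outlier removes output variance that the factors cannot reproduce, so the ratio rises. This is the same direction as the change reported in Table 13, although the magnitudes here carry no empirical meaning.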

Table 13 Robustness: excluding Norway from the sample of countries

Additional dimensions of human capital. We also test for the importance of additional dimensions of human capital potentially captured by interaction terms between the components of our human capital composite, as well as by additional measures such as non-cognitive skills or parental background. For non-cognitive skills we consider the following three variables included in PIAAC: (i) an index of social trust, (ii) the extent to which respondents agree with the claim “I like learning new things”, and (iii) the extent to which respondents agree with the claim “I get to the bottom of difficult things”. Regarding parental background we consider two variables: (i) mother’s education and (ii) the number of books at home when individuals were 16 years old. However, when adding these variables to our baseline Mincer regression, the corresponding coefficients are not significantly different from zero, nor do the estimated returns to the other variables change. This is why we abstain from including these dimensions in our aggregate human capital composite. With respect to interaction terms between different measures of human capital, we do not estimate any significant coefficients given our data, and hence we cannot consider these terms in our development accounting exercise.

7 Conclusions

Taking advantage of nationally representative and comparable data from PIAAC, we provide multidimensional measures of the stock of human capital at the country level based on years of schooling, cognitive skills, experience, on-the-job training, and health. The same database allows us to obtain estimates for the weight of each dimension in the human capital composite. We use the resulting measures of aggregate human capital to carry out a classical development accounting exercise. Our measure of success indicates that 27% of the variance in output per worker can be accounted for by differences in physical capital per worker and average years of schooling. Including our broad measure of human capital increases the model’s explanatory power to 42%, or even 45% when considering imperfect substitutability of workers with different levels of education.

Our findings point to the importance of using broad measures of human capital when trying to account for income differences. While broader than most measures used in the literature, our human capital composite is far from complete. For instance, findings by Cubel et al. (2016) highlight the importance of non-cognitive skills for individual productivity, which suggests that these also represent an important part of a country’s stock of human capital. In this regard it would be desirable if future waves of PIAAC were to include better measures of individuals’ non-cognitive skills, for instance measures of personality traits such as extraversion, agreeableness, neuroticism, conscientiousness, and openness to experience, characteristics which psychologists call “the big five.” Extending our wish list to the OECD, incorporating additional countries, in particular developing ones, into future PIAAC waves would open up many opportunities for future research.