Abstract
A novel hypothesis posits that levels of genetic diversity in a population may partially explain variation in the development and success of countries. Our paper extends evidence on this question by subjecting the hypothesis to an alternative context that eliminates many competing hypotheses. We do this by aggregating representative individual-level data for high schools from a single US state (Wisconsin) in 1957, when the population was composed nearly entirely of individuals of European ancestry. Using this sample of high school aggregations, we too find a strong association between school-level genetic diversity and a range of student socioeconomic outcomes. Our use of survey data also allows for a greater exploration into the potential mechanisms of genetic diversity. In doing so, we find positive associations between genetic diversity and indexes for openness to experience and extraversion, two personality traits tied to creativity and divergent thinking.
Notes
The more ultimate rationale for these two mechanisms is tied to the survival advantages of increased genetic diversity weighed against the resulting weakening of kin networks.
Roughly 75% of the population belongs to one of five ethnicities—British, Irish, Norwegian, German, and Polish—with 47% being derived solely from Germany.
Additional findings suggest that personality may change over the life course (Srivastava et al. 2003); however, the personality traits that show change (conscientiousness, agreeableness, and neuroticism) are not of primary interest for the current work.
This idea is discussed more fully in Arbatli et al. (2018).
To increase sample size, school-level gene frequencies are calculated from genetic data for both graduates and siblings. Online Appendix Section 6 replicates all estimations using a genetic diversity score that is calculated only from WLS graduates.
One potential source of bias from violating this assumption is favorable interaction among members of the population who possess a particular trait tied to cognition. However, complex phenotypes, such as differences in economic wellbeing, are unlikely to be linked to single genetic variants (Chabris et al. 2012, 2013).
This is also a potential issue in AG, who do not use population representative genetic data.
In recent work, Domingue et al. (2017) show correcting for mortality selection does not alter genetic associations.
As shown in Online Appendix Table A2, genetic diversity is unrelated to IQ for our base specification. Furthermore, Online Appendix Table A5 performs a propensity score matching exercise that attempts to account for potential selection into schools. As shown, the matched effects do not substantially differ from the OLS estimates.
As shown in Online Appendix Section 5, we observe no significant quadratic effect of genetic diversity.
Roughly 75% of the population belongs to one of five ethnicities—British, Irish, Norwegian, German, and Polish, with 47% being derived solely from Germany.
The within-county standard deviation of high school genetic diversity is 0.0138 compared to 0.0162 for the overall sample.
Concerns of generalizability are addressed in part by the replication of Table 12.
Following Ager and Brueckner (2018) and Nunn and Wantchekon (2011), we compare the coefficient from the restricted bivariate estimation of column (1) to our baseline unrestricted model in column (7) to examine the potential for omitted variable bias (Altonji et al. 2005). For years of schooling, the Altonji et al. ratio (hereafter AR) is 1.8, implying that selection on unobservables would have to be roughly twice as strong as selection on observables to explain away the estimated association.
For column (5) of Panels A and B, the coefficient of genetic diversity is not significant at conventional levels. We attribute this to reduction in confounding after adding school controls. The point estimate is similar across both panels for column (7), but the standard error is reduced from the addition of individual/family/historic controls that account for variation in the outcome.
The AR is 0.42 for Duncan job prestige, 0.49 for Siegel job prestige, and 1.68 for the occupational education score. See footnote 15 for further discussion.
The AR is 7.33 for the estimated relationship of genetic diversity with the natural log of family income in 1972. See footnote 15 for further discussion.
The absolute value of the AR is 8.98 when comparing column (1) to column (7) in Panel B of Table 4. See footnote 15 for further discussion.
The closest estimations to those of the current work are found in Table 2 (p. 29), which regress contemporary county incomes on county-level genetic diversity for 1870 while controlling for income in 1870.
A number of studies have focused on the negative effects of genetic diversity on income, paying particular attention to the formation of ethnic groups and resulting ethnic conflict (Alesina and La Ferrara 2005; Arbatli et al. 2018; Ashraf and Galor 2013b). To our knowledge, only one other study focuses on the positive aspects of diversity (Depetris-Chauvin and Özak 2018).
Personality indices are recorded for the 1992 and 2004 waves of the WLS. We take the average of each index for individuals who responded to both waves and use the single available value for individuals with data for only one wave.
The estimates of column (5) in Panel A suggest high school genetic diversity is partially confounded by other school level variables. This confounding is addressed through the addition of our school-level controls. As with Panels A and B of Table 3, the point estimate remains smaller for column (7), but the standard error is reduced from the addition of individual/family/historic controls that account for variation in the outcome.
For Table 5, the AR is 0.46 for Panel A, 3.16 for Panel B, and 0.83 for Panel C. See footnote 15 for further discussion.
The AR for Table 7 is 0.61. See footnote 15 for further discussion.
We thank an anonymous reviewer for suggesting this analysis and discussion.
The coefficient of genetic diversity increases by 8% when controlling for the proposed channels of personality and task diversity.
Each SNP is available for roughly 4500 WLS graduates. To keep sample sizes maximized, we replace missing values with the mean. A dummy variable for those with missing values is also included in the estimation of columns (5) and (7).
It is possible that those parents with high school graduates differ from those without high school graduates; however, when comparing the WLS parents to people of a similar age (within one standard deviation of mean year of birth) in the 1960 5% census sample we see very similar average years of schooling: 9.78 for WLS fathers and 9.49 for men in the 1960 census, and 10.47 for WLS mothers and 10.30 for women in the 1960 census.
The 1940 Census is the most recent census with full coverage. Ideally, we would use a census closer to the 1957 graduation date; however, the 1% and 5% samples for 1950 and 1960 do not contain spatial data for areas with populations under 100,000. This would exclude a majority of the rural counties that are central to our study.
Omitting this control does not substantially alter the coefficient of interest.
County-level foreign-born country fractions are included only for those countries exceeding 10,000 migrants in the state of Wisconsin. These are Germany, Poland, Norway, Sweden, Russia (USSR), Canada, Czechoslovakia, Austria, England, Italy, and Hungary.
When using spatially adjusted standard errors (50 km), the association with father’s within-industry occupational diversity is significant at the 1% level.
We have no direct comparison between generations for the WLS’s SES index. We report estimated effects for the bivariate and baseline estimations for WLS graduates to capture a range of outcomes that correspond to the parental estimates that have few controls.
In creating comparisons, we use the coefficients from the simple within-county estimations (column 2 in the specified table) for WLS graduates. The standard deviation for the individual sample is used for individual estimates (i.e., education and job prestige), and the standard deviation for the aggregate sample is used for occupational diversity.
The limited ethnic setting of our study, while beneficial in limiting confounding factors associated with unobserved ethno-cultural associations, limits generalizing our findings to the global estimates of AG. Furthermore, the sample being restricted to European ancestry limits potential bias from European vs. non-European origins (Ager and Brueckner 2013; Easterly and Levine 2016).
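The Altonji et al. ratios (AR) reported in the notes above can be illustrated with a short sketch. It assumes the ratio takes the form used in Nunn and Wantchekon (2011), namely the full-model coefficient divided by the difference between the restricted and full-model coefficients. The function name and the example coefficients are hypothetical and are not taken from the paper's tables.

```python
def altonji_ratio(beta_restricted: float, beta_full: float) -> float:
    """Ratio beta_full / (beta_restricted - beta_full).

    beta_restricted: coefficient from the model with few or no controls.
    beta_full: coefficient from the model with the full control set.
    Assumed form, following Nunn and Wantchekon (2011).
    """
    gap = beta_restricted - beta_full
    if gap == 0:
        # Controls do not move the coefficient at all; the ratio is undefined.
        raise ValueError("restricted and full coefficients are identical")
    return beta_full / gap


# Hypothetical coefficients chosen only for illustration: 9 / (14 - 9) = 1.8
print(altonji_ratio(beta_restricted=14.0, beta_full=9.0))
```

A larger absolute ratio means the unobservables would need to matter more, relative to the observed controls, to account for the full-model estimate. The ratio is negative when adding controls increases the coefficient, which is presumably why one note reports its absolute value.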
References
Ager, P., & Brueckner, M. (2013). Cultural diversity and economic growth: Evidence from the US during the age of mass migration. European Economic Review, 64, 76–97.
Ager, P., & Brueckner, M. (2018). Immigrants’ genes: Genetic diversity and economic development in the United States. Economic Inquiry, 56(2), 1149–1164.
Alesina, A., Harnoss, J., & Hillel, R. (2016). Birthplace diversity and economic prosperity. Journal of Economic Growth, 21(2), 101–138.
Alesina, A., & La Ferrara, E. (2005). Ethnic diversity and economic performance. Journal of Economic Literature, 43(3), 762–800.
Altonji, J. G., Elder, T. E., & Taber, C. R. (2005). Selection on observed and unobserved variables: Assessing the effectiveness of catholic schools. Journal of Political Economy, 113(1), 151–184.
Arbatli, E., Ashraf, Q., Galor, O., & Klemp, M. (2018). Diversity and Conflict. Working Paper. https://ideas.repec.org/p/bro/econwp/2018-6.html. Accessed 17 June 2017.
Ashraf, Q., & Galor, O. (2013a). The ‘out of Africa’ hypothesis, human genetic diversity, and comparative economic development. American Economic Review, 103(1), 1–46.
Ashraf, Q., & Galor, O. (2013b). Genetic diversity and the origins of cultural fragmentation. American Economic Review, Papers and Proceedings, 103(3), 528–533.
Ashraf, Q., & Galor, O. (2018). The macrogenoeconomics of comparative development. Journal of Economic Literature, 56(3).
Ashraf, Q., Galor, O., & Klemp, M. (2014). The out of Africa hypothesis of comparative development reflected by light intensity. Working Paper. https://ideas.repec.org/p/bro/econwp/2014-4.html. Accessed 17 June 2017.
Ashraf, Q., Galor, O., & Klemp, M. (2015). Heterogeneity and productivity. Working Paper. https://ideas.repec.org/p/bro/econwp/2015-4.html. Accessed 17 June 2017.
Cavalli-Sforza, L. L. (2005). The human genome diversity project: Past, present and future. Nature Reviews Genetics, 6(4), 333–340.
Chabris, C., Hebert, B. M., Benjamin, D. J., Beauchamp, J. P., Cesarini, D., van der Loos, J. H. M. M., et al. (2012). Most published genetic associations with general intelligence are probably false positives. Psychological Science, 23(11), 1314–1323. https://doi.org/10.1177/0956797611435528.
Chabris, C., Lee, J., Benjamin, D., Beauchamp, J., Glaeser, E., Borst, G., et al. (2013). Why is it hard to find genes that are associated with social science traits? Theoretical and empirical considerations. American Journal of Public Health, 103(S1), S152–S166.
Costa, P. T., Jr., & McCrae, R. R. (1994). Set like plaster: Evidence for the stability of adult personality. In T. F. Heatherton & J. L. Weinberger (Eds.), Can personality change? (pp. 21–40). Washington, DC: American Psychological Association.
Depetris-Chauvin, E., & Özak, Ö. (2018). The Origins of the Division of Labor in Pre-modern Times. https://doi.org/10.2139/ssrn.3130747.
Domingue, B. W., Belsky, D. W., Harrati, A., Conley, D., Weir, D. R., & Boardman, J. D. (2017). Mortality selection in a genetic sample and implications for association studies. International Journal of Epidemiology. https://doi.org/10.1093/ije/dyx041.
Duncan, O. D. (1961). A socioeconomic index for all occupations. In Occupations and social status. New York: Free Press of Glencoe.
Easterly, W., & Levine, R. (2016). The European origins of economic development. Journal of Economic Growth, 21(3), 225–257.
Feist, G. J. (1998). A meta-analysis of the impact of personality on scientific and artistic creativity. Personality and Social Psychology Review, 2, 290–309.
Freeman, R. B., & Huang, W. (2015). Collaborating with people like me: Ethnic coauthorship within the United States. Journal of Labor Economics, 33(S1), S289–S318.
Furnham, A., & Chamorro-Premuzic, T. (2004). Personality, intelligence, and art. Personality and Individual Differences, 36, 705–715.
Furnham, A., & Bachtiar, V. (2008). Personality and intelligence as predictors of creativity. Personality and Individual Differences, 45, 613–617.
Hirsh, J. B., DeYoung, C. G., & Peterson, J. B. (2009). Metatraits of the big five differentially predict engagement and restraint of behavior. Journal of Personality, 77(4), 1085–1102.
Hong, L., & Page, S. (2001). Problem solving by heterogeneous agents. Journal of Economic Theory, 97, 123–163.
Hong, L., & Page, S. (2004). Groups of diverse problem solvers can outperform groups of high-ability problem solvers. PNAS, 101, 16385–16389.
Hunt, J., & Gauthier-Loiselle, M. (2010). How much does immigration boost innovation. American Economic Journal: Macroeconomics, 2, 31–56.
Kaufman, S. B., Quilty, L. C., Grazioplene, R. G., Hirsh, J. B., Gray, J. R., Peterson, J. B., et al. (2016). Openness to experience and intellect differentially predict creative achievement in the arts and sciences. Journal of Personality, 84(2), 248–258.
Kemeny, T. (2017). Immigrant diversity and economic performance in cities. International Regional Science Review, 40(2), 164–208.
King, L., Walker, L., & Broyles, S. (1996). Creativity and the five factor model. Journal of Research in Personality, 30, 189–203.
Lazear, E. (1999). Globalization and the market for teammates. The Economic Journal, 109, 15–40.
McCrae, R. R., & Costa, P. T., Jr. (1996). Toward a new generation of personality theories: Theoretical contexts for the five-factor model. In J. S. Wiggins (Ed.), The five-factor model of personality: Theoretical perspectives (pp. 51–87). New York: Guilford Press.
McCrae, R. R., Costa, P. T., Ostendorf, F., Angleitner, A., Hrebickova, M., Avia, M. D., et al. (2000). Nature over nurture: Temperament, personality, and life span development. Journal of Personality and Social Psychology, 78, 173–186.
Nunn, N., & Wantchekon, L. (2011). The slave trade and the origins of mistrust in Africa. American Economic Review, 101(7), 3221–3252.
Olson, C., & Ackerman, D. (1998). Wisconsin high school district information for 1954–1957. https://www.ssc.wisc.edu/wlsresearch/documentation/supdoc/hsdistrict.txt. Accessed 17 June 2017.
Ottaviano, G., & Peri, G. (2006). The economic value of cultural diversity: Evidence from US cities. Journal of Economic Geography, 6, 9.
Parrotta, P., Pozzoli, D., & Pytlikova, M. (2014). The nexus between labor diversity and firm’s innovation. Journal of Population Economics, 27, 303–364.
Peri, G. (2012). The effect of immigration on productivity: Evidence from US states. The Review of Economics and Statistics, 94, 348–358.
Peri, G., & Sparber, C. (2009). Task specialization, immigration, and wages. American Economic Journal: Applied Economics, 1, 135–169.
Pickering, A. D., Smillie, L. D., & DeYoung, C. G. (2016). Neurotic individuals are not creative thinkers. Trends in Cognitive Sciences, 20(1), 1–2.
Ruggles, S., Genadek, K., Goeken, R., Grover, J., & Sobek, M. (2015). Integrated public use microdata series: Version 6.0 [dataset]. Minneapolis: University of Minnesota. https://doi.org/10.18128/D010.V6.0.
Sampson, R. J., & Sharkey, P. (2008). Neighborhood selection and the social reproduction of concentrated racial inequality. Demography, 45(1), 1–29.
Siegel, P. M. (1971). Prestige in the American occupational structure. Doctoral dissertation, University of Chicago.
Srivastava, S., John, O., Gosling, S., & Potter, J. (2003). Development of personality in early and middle adulthood: Set like plaster or persistent change? Journal of Personality and Social Psychology, 84(5), 1041–1053.
Additional information
This research uses data from the Wisconsin Longitudinal Study (WLS) of the University of Wisconsin–Madison. Since 1991, the WLS has been supported principally by the National Institute on Aging (AG-9775, AG-21079, AG-033285, and AG-041868), with additional support from the Vilas Estate Trust, the National Science Foundation, the Spencer Foundation, and the Graduate School of the University of Wisconsin–Madison. Since 1992, data have been collected by the University of Wisconsin Survey Center. A public use file of data from the WLS is available from the Wisconsin Longitudinal Study, University of Wisconsin–Madison, 1180 Observatory Drive, Madison, Wisconsin 53706 and at http://www.ssc.wisc.edu/wlsresearch/data/. The opinions expressed herein are those of the authors.
Appendices
Variable Appendix
1.1 Regressors of interest
High school genetic diversity
An expected heterozygosity score calculated from high school gene frequencies. In 2007, the Wisconsin Longitudinal Study (WLS) began collecting data on 96 single nucleotide polymorphisms (SNPs). These data were collected for roughly half of the original WLS respondents and selected siblings (~ 7000). To construct our high school-level genetic diversity score, we first tabulate the high school frequency of each SNP variant using all available genetic data (graduates and siblings). These high school-specific gene frequencies are then used to calculate expected heterozygosity as specified in Ashraf and Galor (2013a, b). This gives us a high school-specific measure of genetic diversity.
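As a minimal sketch of the calculation described above, the code below computes expected heterozygosity for one school. It assumes biallelic SNPs coded as additive variant counts (0, 1, or 2), takes one minus the sum of squared allele frequencies at each SNP, and averages across SNPs. The function names, the skipping of missing genotypes, and the example data are illustrative assumptions, not the authors' code.

```python
from typing import Optional, Sequence


def allele_frequency(genotypes: Sequence[Optional[int]]) -> Optional[float]:
    """Frequency of the counted variant among non-missing genotypes."""
    observed = [g for g in genotypes if g is not None]
    if not observed:
        return None
    # Each individual carries two alleles, so divide by 2 * n.
    return sum(observed) / (2 * len(observed))


def expected_heterozygosity(school: Sequence[Sequence[Optional[int]]]) -> float:
    """Mean over SNPs of 1 - p^2 - (1 - p)^2 for a single school.

    school[i][l] is individual i's variant count (0, 1, 2, or None) at SNP l.
    SNPs with no observed genotypes in the school are skipped (an assumption).
    """
    n_snps = len(school[0])
    per_snp = []
    for l in range(n_snps):
        p = allele_frequency([person[l] for person in school])
        if p is not None:
            per_snp.append(1.0 - p ** 2 - (1.0 - p) ** 2)
    if not per_snp:
        raise ValueError("no SNP has observed genotypes")
    return sum(per_snp) / len(per_snp)


# Hypothetical school: three individuals, two SNPs, one missing genotype.
example_school = [[0, 2], [1, 2], [1, None]]
print(expected_heterozygosity(example_school))
```

In the example, the first SNP has a variant frequency of 1/3 and contributes 4/9, while the second SNP is fixed in the school and contributes zero, so the school score is their average.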
County genetic diversity
This measure is for 1920 and comes from Ager and Brueckner (2018). It is found by matching the ancestry of European immigrants to the estimated genetic diversity score for 1500 CE from Ashraf and Galor (2013a, b).
1.2 Outcomes
Years of schooling
The number of completed years of education for the WLS graduate. From the WLS variable rb003red.
Duncan job prestige
A measure of job prestige based on income, education, and surveyed perceptions of general social standing for certain occupations (Duncan 1961). Measured for the WLS graduate’s first job. From the WLS variable ocsx1u2.
Siegel job prestige
A measure of job prestige based on surveys that evaluated perceptions on the “general” or “social” standing of certain occupations (Siegel 1971). Measured for the WLS graduate’s first job. From the WLS variable ocpx1u2.
Occupational education score
A measure of job prestige for the WLS graduate’s first job that is based on percentage of people in an occupation that completed 1 year of college. From the WLS variable ocex1.
Family income, 1974
Total earnings for WLS graduate’s family during 1974. From the WLS variable yfam74.
Family income, 1992
Total earnings for the WLS graduate’s family during 1992. From the WLS variable rp044hef.
Openness to experience
An additive score from a series of questions intended to measure the WLS graduate’s personality trait of openness. We use the average from two waves of the WLS—1992 and 2004. From the WLS variables mh032rei and ih032rei.
Extraversion
An additive score from a series of questions intended to measure the WLS graduate’s personality trait of extraversion. We use the average from two waves of the WLS—1992 and 2004. From the WLS variables mh001rei and ih001rei.
Conscientiousness
An additive score from a series of questions intended to measure the WLS graduate’s personality trait of conscientiousness. We use the average from two waves of the WLS—1992 and 2004. From the WLS variables mh017rei and ih017rei.
Agreeableness
An additive score from a series of questions intended to measure the WLS graduate’s personality trait of agreeableness. We use the average from two waves of the WLS—1992 and 2004. From the WLS variables mh009rei and ih009rei.
Neuroticism
An additive score from a series of questions intended to measure the WLS graduate’s personality trait of neuroticism. We use the average from two waves of the WLS—1992 and 2004. From the WLS variables mh025rei and ih025rei.
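The two-wave averaging used for the five personality scores above can be sketched as follows. The sketch assumes that a graduate observed in only one wave keeps that single score, and that a graduate observed in neither wave stays missing. The function name and the example values are hypothetical.

```python
from typing import Optional


def wave_average(score_1992: Optional[float],
                 score_2004: Optional[float]) -> Optional[float]:
    """Average the 1992 and 2004 index values, using whichever are available."""
    available = [s for s in (score_1992, score_2004) if s is not None]
    if not available:
        return None  # no personality data in either wave
    return sum(available) / len(available)


# Hypothetical index values for three graduates.
print(wave_average(20.0, 24.0))   # both waves observed
print(wave_average(None, 24.0))   # only the 2004 wave observed
print(wave_average(None, None))   # neither wave observed
```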
High school occupational diversity
Using data on detailed occupation codes, we construct the high school frequency of each occupation. These frequencies are then used to calculate the high school’s occupational diversity in the same manner as genetic diversity (both being roughly identical to a Herfindahl index). From the WLS variable ocx1u.
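A minimal sketch of a diversity score of this kind, assuming it is computed as one minus the sum of squared category shares (the complement of a Herfindahl concentration index). The same function would apply to any categorical variable, such as reported ancestry or industry. The function name and the occupation labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable


def diversity_index(categories: Iterable[Hashable]) -> float:
    """One minus the sum of squared shares of each category."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical first-job occupations for four graduates of one school.
occupations = ["farmer", "farmer", "clerk", "teacher"]
print(diversity_index(occupations))
```

The index is zero when every graduate shares one occupation and approaches one as graduates spread evenly over many occupations.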
Controls
2.1 Individual
IQ
WLS graduate’s IQ score mapped from raw Henmon-Nelson test score. For the high school level sample of Tables 8, 9 and 10, the high school mean is used. From WLS variable gwiiq_bm.
Female
An indicator for the WLS graduate’s sex. For the high school level sample of Tables 8, 9 and 10, the high school mean is used. From WLS variable sexrsp.
Birth year
The WLS graduate’s year of birth. For the high school level sample of Tables 8, 9 and 10, the high school mean is used. From WLS variable brdxdy.
2.2 Family
SES, 1957
An index composed of the WLS graduate’s father’s years of schooling, mother’s years of schooling, father’s Duncan job prestige, and parental income in 1957, the year of the initial wave of the WLS. For the high school level sample of Tables 8, 9 and 10, the high school mean is used. From WLS variable ses57.
Father’s years of schooling
WLS graduate’s father’s years of schooling. For the high school level sample of Tables 8, 9 and 10, the high school mean is used. From WLS variable bmfaedu.
Mother’s years of schooling
WLS graduate’s mother’s years of schooling. For the high school level sample of Tables 8, 9 and 10, the high school mean is used. From WLS variable bmmaedu.
2.3 School
High school size
The size of the WLS graduate’s graduating class. From WLS variable hssize.
Madison/Milwaukee indicator
An indicator for whether the WLS graduate lived in either Madison or Milwaukee during the 1957 collection wave. From WLS variable pop57.
Teacher salary
The average teacher salary by district for years 1954–1957. High school resource data were compiled by Olson and Ackerman (1998). To preserve sample size, missing values are imputed by the county-level mean (an indicator for imputed observations is included); roughly 2% of observations have been imputed.
Teacher experience
The average years of experience for teachers by district for years 1954–1957. Two measures of experience are used: within school district experience and teacher total experience. High school resource data were compiled by Olson and Ackerman (1998). To preserve sample size, missing values are imputed by the county-level mean (an indicator for imputed observations is included); roughly 2% of observations have been imputed.
Teacher years of schooling
The average number of postsecondary years of schooling for teachers by district for years 1954–1957. High school resource data were compiled by Olson and Ackerman (1998). To preserve sample size, missing values are imputed by the county-level mean (an indicator for imputed observations is included); roughly 2% of observations have been imputed.
Number of school days
The average number of school days by district for years 1954–1957. High school resource data were compiled by Olson and Ackerman (1998). To preserve sample size, missing values are imputed by the county-level mean (an indicator for imputed observations is included); roughly 2% of observations have been imputed.
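The county-mean imputation described for the district resource variables above can be sketched as follows. The sketch assumes each record carries a county label and a possibly missing value; missing values are replaced by the mean of observed values in the same county and flagged with an indicator. The function name, the county label, and the salary figures are hypothetical.

```python
from collections import defaultdict
from typing import List, Optional, Tuple


def impute_county_mean(
    rows: List[Tuple[str, Optional[float]]]
) -> List[Tuple[Optional[float], int]]:
    """Return (value, imputed_flag) pairs in the same order as rows.

    rows: (county, value) pairs, with value None when missing.
    A missing value stays None if its county has no observed values.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for county, value in rows:
        if value is not None:
            sums[county] += value
            counts[county] += 1

    filled = []
    for county, value in rows:
        if value is not None:
            filled.append((value, 0))
        elif counts[county] > 0:
            filled.append((sums[county] / counts[county], 1))
        else:
            filled.append((None, 1))
    return filled


# Hypothetical district salaries in one county, one of them missing.
records = [("Dane", 4000.0), ("Dane", None), ("Dane", 4400.0)]
print(impute_county_mean(records))
```

Including the indicator as a regressor lets the imputed observations have their own intercept shift, so the imputed values do not silently drive the coefficient on the resource variable.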
Classroom size
The average classroom size by district for years 1954–1957. The variable is created from the total number of enrolled high school students (variables en_bt_7 and en_gt_7 in Olson and Ackerman 1998) divided by the total number of teachers across the same time frame (variable te_tea_t).
2.4 Historic
All historic controls are found by matching the WLS graduate’s father’s ancestry to country-level data in Ashraf and Galor (2013a, b). These data are then averaged at the high school level. The set of country-level variables includes absolute latitude, the fraction of arable land, the mean temperature, mean precipitation, mean elevation, an index of roughness, the mean distance to the coast or navigable river, and the fraction of land within 100 km of the coast or a navigable river. All variables and prior sources are found in Ashraf and Galor (2013a, b).
2.5 Robust
Indicator for father’s nationality
The WLS graduate’s reported father’s ancestral nationality. For the high school level sample of Tables 9 and 10, the high school fraction of each reported ancestry is used. From the WLS variable natfth.
Ethnic fractionalization
An ethnic fractionalization score derived from the high school frequency of father’s nationality. Calculated in identical manner as high school genetic diversity and high school occupational diversity. From the WLS variable natfth.
Genetic markers
An additive score (e.g., 0, 1, or 2) of the variant for each SNP in the WLS. These data are for roughly 4500 WLS graduates. To prevent a loss in sample size, missing graduates are assigned the mean for each SNP; an individual-level indicator for those with missing genetic data is also included. For the high school level sample of Tables 9 and 10, the high school frequency is used.
Principal component of genetic markers
The first principal component of all SNPs used to calculate high school genetic diversity.
Industry diversity
A measure of diversity based on the WLS graduate’s industry associated with their first job. Graduates are assigned to one of twelve industry classifications. The fraction of the high school within each industry is then used to calculate a measure of diversity. From WLS variable inmx1u.
Cite this article
Cook, C.J., Fletcher, J.M. High-school genetic diversity and later-life student outcomes: micro-level evidence from the Wisconsin Longitudinal Study. J Econ Growth 23, 307–339 (2018). https://doi.org/10.1007/s10887-018-9157-3
DOI: https://doi.org/10.1007/s10887-018-9157-3