Background

In recent decades, the world has seen major shifts in human reproduction. The Western world, along with a few ‘tiger’ economies of Asia, has witnessed an unprecedented fertility slowdown with women giving birth to fewer children and doing so later in life [1, 2]. These changes have become so extensive that some demographers, sociologists, and economists have dubbed them the “global fertility crisis” [3, 4]. The slowdown has been particularly pronounced in socially and technologically advanced societies [5, 6], including the United States where the fertility rate has dropped below the replacement level, from 2.12 children per woman in 2006 to 1.78 in 2020 [7]. Although potentially beneficial for pressing global issues, such as carbon emissions [8, 9], the slowdown has caused concerns among economists for its projected effect on long-term economic growth [10, 11]. Developed countries spend 1–4% of GDP on family support and birth stimulation initiatives [12, 13] but, notwithstanding some successes, few have managed to stop or reverse the fertility decline [14, 15].

In contrast, lower-income regions of the world, such as countries of sub-Saharan Africa, continue to experience high adolescent fertility rates that presumably impede the growth of human capital in young women [16, 17]. High fertility rates in those nations are expected to increase the world’s population by another 4 billion by the end of this century [18].

Not only do women across the globe give birth to different numbers of children, but they do so at drastically different life stages, spanning from adolescence to middle age [2, 19]. Such variability raises its own concerns at both extremes: older age of childbearing, a feature of developed countries, is associated with undesirable health outcomes for women and children [20, 21], while fertility in adolescence has negative effects on young women’s human capital [22].

Social and behavioral policies targeting human reproduction, whether they seek to boost or constrain it, are based on the current consensus about its driving forces [23]. Demographers, economists, and sociologists most often explain human reproductive dynamics with reference to economic, technological, and social factors: the availability of contraception, women obtaining higher education, or entering the workforce [24,25,26,27]. Despite the utility of these explanations, they fail to fully account for global variability in human reproduction. For example, they have difficulty explaining why fertility rates are further decreasing in the most industrialized nations, where education and employment have become more (not less) accommodating of childcare responsibilities: even in countries with comprehensive birth stimulation programs, such as France or Sweden, fertility rates have increased only slightly, remaining substantially below their mid-twentieth century levels [28, 29]. Existing explanatory frameworks also have difficulty accounting for the fact that some lower-income nations (e.g., Niger, Congo, or Gabon) that have seen relatively sharp increases in women attaining higher education over the past 70 years have not seen the same declines in fertility rates that more progressive societies have [19].

In this work, we offer a different perspective on global reproductive patterns, a perspective inspired by life history theory (LHT), a conceptual framework from evolutionary behavioral science that seeks to understand the diversity of reproductive strategies and life cycle traits in species and individual organisms. As suggested by LHT, people, just like other animals, may adjust the timing and number of offspring to mortality risk in their local ecology [30, 31]. Based on this perspective, socioeconomic incentives targeting fertility through education, employment, and contraception may have a limited effect because they overlook the biology of human reproduction.

While another major theoretical perspective on human reproduction – demographic transition theory (DTT) – also acknowledges a link between mortality and population dynamics, LHT suggests a direct causal link between the two and generates somewhat different predictions than DTT. Below, we briefly review prior evidence and report novel data supporting a life history approach to human reproductive dynamics.

Demographic transition theory

Classic demographic transition theory (DTT) [32, 33] has established links between mortality and fertility in humans. According to DTT, human societies progress from a mode of high fertility and high mortality to that of low fertility and low mortality. In DTT, changes in mortality and fertility are orthogonal factors that independently cause declines in population growth. The driving forces behind declining mortality – mainly, changes in extrinsic mortality sources, such as contagious diseases – were obvious to the theory’s authors. The causes of declining fertility were less evident: Frank Notestein wrote that it was “impossible to be precise about [its] various causal factors”, attributing fertility decline to youth mobility, “anonymity of city life,” changes in cultural values, and “rational point of view” [33]. The idea that decreasing fertility could be ultimately caused by decreasing mortality through a biological mechanism shared with other species was not part of the classic theory. DTT was later challenged as applicable to only one historical era and as having less predictive power for future population dynamics [34, 35].

Life history theory

The reproductive behavior of many animal species is adaptively calibrated to features of the immediate ecology, as suggested by life history theory, a framework in evolutionary biology that seeks to understand the diversity of reproductive and developmental strategies observed among organisms [36]. Research in this domain has explored factors determining how organisms allocate finite resources toward growth, reproduction, and survival over their lifetimes. Apart from intrinsic factors (e.g., metabolic rates, reproductive physiology), it has been established that extrinsic factors, such as predation pressures, resource availability, and habitat stability, may cause organisms to adjust their life history strategies to maximize reproductive success in a given ecology [37].

Interspecies variability

A notable pattern established in the domain of life history is the link between levels of environmental harshness and unpredictability and a species’ reproductive strategy. Typically, species exposed to higher levels of morbidity and mortality risk (as well as higher stochastic fluctuations of this risk across times and contexts) tend to prioritize immediate reproduction over long-term development, exhibiting the so-called “fast” life history strategy. Mammals in ecologies with higher predation risk mature earlier, have larger litters and shorter gestation periods – house mice being a prototypical example, with their reproductive cycles measured in mere weeks, frequent and large litters of 4–12 pups several times a year, and short lifespans of 6–18 months [38]. The accelerated reproductive timing helps such organisms reap maximum genetic benefits in uncertain environments where long-term development involves higher risk of death before reproduction. Larger numbers of offspring, in turn, reflect a genetic bet-hedging strategy when the threat of offspring mortality is high.

In contrast, species living in safer and more predictable ecologies show signs of a “slower” life history strategy by investing a greater proportion of bioenergetic resources in long-term development prior to reproduction. They have fewer offspring and invest more resources in each, which typically pays off given that these offspring are likely to survive to reproductive age. For example, naked mole rats have a strikingly “slower” life history than mice: mole rats live up to 30 years, reach sexual maturity around 6–9 months of age, and have an extended period of maternal care lasting for 4–6 weeks [39]. Similarly, the little brown bat reaches sexual maturity at age 1–2 and has 1–2 pups per litter, a striking contrast to the reproductive frequency and abundance of mice [40]. Such differences have been attributed to the safer and more predictable ecologies that the “slower” species inhabit: those are often characterized by lower predation rates and stabler habitats. Neither life history strategy – fast or slow – is inherently more adaptive than the other. Rather, both are well-calibrated to optimize reproductive success with respect to the harshness and unpredictability of the immediate ecology.

Intraspecies variability

Apart from interspecies variation in life history strategies, there is some evidence of intraspecific differences in response to environmental variation. In some insects, fish, wild birds, and even plants, organisms within the same species living in different environments have manifested variation in life history traits [41, 42]. Typically, longer lifespans and lower mortality have been associated with delayed fertility and allocation of fewer resources towards reproduction [43]. It must be noted that direct application of life history theory principles to intraspecies variability has been challenged [37], especially in application to humans [44, 45].

Human life histories

By some developmental features, humans are among the “slowest” animals [45], although considerable variability exists across people and world populations [46, 47]. The time we invest in our offspring after birth is 2–4 times longer than that of our closest genetic relatives (chimpanzees), who typically start procuring their own food by age 5 [47, 48]. In some human ecologies, such as post-industrial societies, women reach sexual maturity as late as age 16 [49], give birth after 25 [50], and care intensively for offspring for up to 21 years: a rare length of parental investment in the animal kingdom.

At the same time, there is considerable intraspecific variability in human life history trajectories [46]. The reproductive window of modern humans spans from 5 to 59 years of age [51], and the documented number of children a woman has ever had in her lifetime ranges from zero to well over twenty and, in one case, even as high as 69 children [52, 53]. While these numbers depict stark extremes, population averages also vary across times and ecologies: e.g., the average number of births per woman in eighteenth century Belgium (6.2) [54] or in modern Niger (6.91) is over six times higher than in modern Taiwan (0.87) [19], while the average age of women at birth of first child varies from 18 in Angola to 31.2 in Italy [19]. From the standpoint of evolutionary biology, such variability could stem from varying levels of harshness and unpredictability in people’s local environments and thus reflect the adaptive calibration of human developmental systems.

Population-level reproductive patterns

Evidence from samples of world societies indicates that variability in fertility and childbearing age is associated with mortality risk [55,56,57]. Prior work examined nation-level indicators of reproductive timing (age of birth of the first child, adolescent fertility, age of marriage), as well as fertility, showing that reproductive age is higher, while rates of adolescent fertility are lower, in nations with lower mortality rates [58, 59]. Similarly, recent work suggests that earlier age of menarche is associated with higher fertility and higher mortality rates [60].

Individual developmental trajectories

Research in the field of human evolutionary psychology has focused on the ontogenetic development of life history traits and documented a conceptually similar pattern: physical development and sexual behaviors of teenagers and young adults showed signs of adjustment to safer vs. riskier early environments [61,62,63]. Specifically, harshness and unpredictability of early environments have been associated with the prevalence of “faster” reproductive trajectories in adolescents [64, 65], characterized by earlier age of sexual debut, greater number of sexual partners, and earlier sexual maturity. Girls who experienced unpredictable relationships with childhood caregivers reach sexual maturity at a younger age [66]. Such effects are theorized to reflect the adaptive calibration of human reproduction to environmental mortality threat.

Limitations of prior research

Prior work examining the association between mortality risk and human reproductive dynamics, although suggestive, had a few methodological limitations. Population-level analyses used indicators aggregated at the national level, which raised critiques about the “ecological fallacy,” i.e., a potentially false assumption that aggregated data from large entities adequately reflects local ecological conditions [67]. In contrast, studies of individual life histories did not fall prey to the ecological fallacy, but they had their own limitation: indirect measures of life history variables. In this field of research, mortality risk has been operationalized through indirect socioeconomic indicators, such as low income or absence of father, while reproductive trajectories have been represented by sexuality (e.g., timing of sexual maturation) rather than reproductive outcomes per se. No work has yet applied a single analytic framework to link mortality risk and reproductive outcomes at different levels of analysis.

Another methodological concern of prior research has been the non-independence of data points due to shared variance in ecological conditions between neighboring entities within the same geographical area [67]. Finally, past work discovered signs of nonlinearity in the data [55] that traditional linear methods have limited capacity to explore.

The current work advances this literature by examining reproductive and mortality data on different populational and individual levels, while addressing previous methodological limitations.

Current research

This work applies a life history framework to suggest an explanation for global, local, and individual variability in human reproductive outcomes. In line with prior work, we suggest that the variability of reproductive timing and abundance in modern human populations might reflect a broader biological pattern in which organisms slow their reproduction in response to increased safety and stability – or speed their pace in response to environmental risk. However, apart from analyzing nations and individuals, we leverage data on a third, previously unexplored level of analysis – U.S. counties – that serves as an intermediary population level connecting the other two. Moreover, rather than using socioeconomic indicators as proxies for mortality, we leverage data on the actual mortality risk for all three levels and use socioeconomic indicators as covariates, thus examining effects specific to mortality.

Methodological overview

Focusing on human reproductive timing and abundance in connection to local mortality, we leveraged public archival data from 217 world nations, 3,242 U.S. counties, and 2,808 individuals (see Table S1 in the Supplemental for the full list of variables and data sources). At each level, we examined whether reproductive outcomes (age of parents at birth of first child, rates of adolescent fertility, number of offspring) would be predicted by mortality risk (total life expectancy was chosen as the most cumulative mortality measure, see Methods for details). Using both linear and nonlinear models, we tested whether those links would hold after controlling for social/economic indicators (wealth, contraception, education, urbanization, female participation in labor force). Thus, we applied a consistent analytic framework to test the relationships between objective indicators of human mortality and reproductive outcomes on national, subnational, and individual-level data. We controlled not only for most conventional socioeconomic indicators, but also for shared region-level variance. Alongside more traditional hierarchical linear models, we applied machine learning techniques to explore non-linear relationships between variables of interest.
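
The hierarchical linear models described above can be written compactly. The following is a minimal sketch, assuming a single random intercept for the grouping unit (e.g., world region) and additive linear fixed effects; all symbols are notation introduced here for illustration, and the exact specification is the one given in Methods.

```latex
y_{ij} = \beta_0 + \beta_1\,\mathrm{LE}_{ij} + \sum_{k=1}^{K} \gamma_k\, x_{kij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

Here $y_{ij}$ is the reproductive outcome for unit $i$ in region $j$, $\mathrm{LE}_{ij}$ is life expectancy, $x_{kij}$ are the $K$ socioeconomic covariates, and $u_j$ captures variance shared within a region. Under this sketch, the question of interest is whether $\beta_1$ differs from zero once the $\gamma_k$ terms are in the model.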

Results

Summary

In hierarchical linear models of human reproductive outcomes on national, county, and individual levels of analysis, life expectancy was a significant predictor of reproductive timing and number of offspring after controlling for effects of socioeconomic variables (see Tables 1, 2 and 3, S2, S8). In some models, mortality had a stronger performance than all socioeconomic predictors considered together: e.g., among world nations, life expectancy alone explained a greater amount of variance in adolescent fertility than did the five socioeconomic predictors. Among individuals, mortality was the only significant (and positive) predictor of the number of children, aside from age. In most random forest models, life expectancy was among the top three variables by feature importance.

Table 1 Nation-level predictors of adolescent fertility rates using hierarchical linear modeling
Table 2 U.S. County-level predictors of adolescent fertility rates using hierarchical linear modeling
Table 3 Predictors of individual-level number of children from individual and local features, hierarchical linear model

Below, we report detailed results for adolescent fertility on nation and county level, as well as number of children on the individual level. For other indicators, results are reported briefly in the main text, while tables and figures are reported in the Supplement.

Adolescent fertility

Nation-level

In a mixed-effect model, after controlling for the random effect of world region, the fixed effect of life expectancy explained 48.8% of the variance in adolescent fertility. In comparison, fixed effects of five socioeconomic predictors together accounted for 43.5% of the variance in adolescent fertility. In a combined model, life expectancy remained significant while controlling for all socioeconomic covariates (Table 1). Moreover, it explained the largest proportion of unique variance (semi-partial r² = 0.24), more than twice that of the next most significant predictor, female literacy (semi-partial r² = 0.10). Adding life expectancy to a mixed-effect model with socioeconomic predictors resulted in significantly improved model fit (χ²diff = 28.13, DF = 1, p < 0.001). See Fig. 1 for visualization.
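
The model-comparison statistic reported in this paragraph can be checked by hand. The sketch below assumes that the χ² difference is a likelihood-ratio statistic for nested models compared against a chi-square distribution with one degree of freedom (an assumption; the test is not named in this section). The function name `lr_test_p_value` is hypothetical. For one degree of freedom the chi-square tail probability reduces to a complementary error function, so only the standard library is needed.

```python
import math


def lr_test_p_value(chi2_diff: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    Assumption: the reported statistic compares two nested models that
    differ by a single parameter (DF = 1). For DF = 1 the tail probability
    is P(X > x) = erfc(sqrt(x / 2)).
    """
    if chi2_diff < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(chi2_diff / 2.0))


# Statistic reported above for the nation-level adolescent fertility model.
p_nation = lr_test_p_value(28.13)
print(f"p = {p_nation:.2e}")
```

For the value reported above (28.13), the resulting p-value falls far below the 0.001 threshold, in line with the text. The same function applies only to comparisons with DF = 1; other degrees of freedom would need a general chi-square routine.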

Fig. 1 Nation-level adolescent fertility as predicted by the strongest ecological predictor – life expectancy (top) vs. by the strongest socioeconomic predictor – female literacy rates (bottom). Strength of predictors was defined as the highest partial R² in a mixed effect model. Graphs depict dispersion of nation-level data. Size of dots reflects country’s GDP per capita (PPP), colors code continents. See Supplemental materials for plots visualizing the relationship of adolescent fertility with other predictors on the level of world nations

County-level

A mixed-effect model of county-level adolescent fertility with life expectancy had a significantly better fit than one with socioeconomic predictors only (χ²diff = 16.29, DF = 1, p < 0.001). In a combined model, life expectancy remained significant while controlling for county-level covariates: educational attainment, female participation in labor force, median income, and degree of urbanization (see Table 2, Fig. 2).

Fig. 2 U.S. county-level adolescent fertility as predicted by the ecological predictor – life expectancy (top) vs. by the strongest socioeconomic predictor – youth college rates (bottom). Strength of predictors was defined by the semi-partial R² in a mixed effect model. Graphs depict dispersion of data and slopes of relationship by U.S. regions. Size of dots reflects county’s median household income, colors code for regions. See Supplemental materials for plots visualizing the relationship of adolescent fertility with other predictors on the U.S. county level

Random forests

In random forest models of both nation-level and county-level data (500 and 250 randomly generated trees, respectively), life expectancy was among the three most important variables for predicting adolescent fertility (see Table 4 for feature importance, Fig. 3 and Fig. S1 for decision trees).

Table 4 Feature importance for predicting reproductive outcomes with random forests on three levels of analysis
Fig. 3 Decision tree predicting adolescent fertility rates on the level of U.S. counties (see similar trees for national and individual levels in the Supplementary). The model used one mortality risk indicator (life expectancy at birth) alongside five socioeconomic indicators (youth and adult college rates, median household income, female participation in labor force, urbanization for the county level). Decision trees are built using recursive binary splitting which selects significant predictors (nodes) and finds the optimal splitting threshold value of each, with respect to all other predictors in the model; see Methods for details. The top-down order of variables indicates their feature importance, i.e., relevance for predicting the outcome
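
To make the idea of an optimal splitting threshold concrete, the sketch below shows a single step of binary splitting on one predictor. It is a simplified, CART-style illustration that picks the threshold minimizing the summed squared error of the two resulting groups; this criterion is an assumption and may differ from the algorithm described in Methods. The function names and the toy values are hypothetical and are not data from this study.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the group mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(predictor, outcome):
    """Return (threshold, cost) for the binary split with the lowest cost.

    Candidate thresholds are midpoints between consecutive distinct
    predictor values; cost is the summed squared error of both groups.
    Returns None if the predictor has no two distinct values.
    """
    pairs = sorted(zip(predictor, outcome))
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    best = None
    for i in range(1, len(pairs)):
        if xs[i] == xs[i - 1]:
            continue  # no threshold can separate identical predictor values
        threshold = (xs[i - 1] + xs[i]) / 2
        cost = sum_squared_error(ys[:i]) + sum_squared_error(ys[i:])
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Made-up illustrative values, NOT taken from the study's data.
life_expectancy = [55, 58, 60, 62, 75, 78, 80, 82]
adolescent_fertility = [120, 110, 100, 105, 20, 15, 10, 12]

threshold, cost = best_split(life_expectancy, adolescent_fertility)
print(threshold, cost)
```

A full tree repeats this search over every predictor at each node and then recurses into the two resulting groups; a random forest averages many such trees grown on resampled data.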

Age at first birth / Childbearing age

Nation-level

A nation-level mixed effect model predicting women’s age at first birth (AFB) by life expectancy alone accounted for 78.7% of the variance, while an analogous model with five socioeconomic predictors considered together accounted for 74.5%. A combined model with life expectancy and socioeconomic predictors had significantly higher model fit than a model with socioeconomic indicators only (χ²diff = 4.26, DF = 1, p = 0.03). Life expectancy remained a significant predictor of AFB, while controlling for all other predictors (b = 0.81, CI = 0.01 – 1.61, p = 0.047), along with female rates of tertiary education (b = 1.35, CI = 0.58 – 2.11, p = 0.001). See Table S2 and Fig. S2 for details.

County-level

For county-level analyses, data on AFB were not publicly available; instead, we used age of childbearing (ACB) – an imperfect proxy for the onset of reproduction that confounds signs of “slow” and “fast” reproductive strategy (see Methods for a detailed explanation; Table S8, Fig. S11, S12 for fertility in older mothers as an alternative proxy). In a model of ACB, adding life expectancy to socioeconomic indicators resulted in a significantly better model fit (χ²diff = 6.95, DF = 1, p = 0.008). In the combined model, life expectancy remained a significant predictor (b = 0.19, CI = 0.05 – 0.33, p = 0.009). See Table S3 and Fig. S3 for details.

Random forests

Life expectancy was estimated among the most important predictors of nation-level AFB. On the county level, it was not among the top three most important predictors of ACB but had a score of 66.4/100 which suggests statistical significance (Table 4, Fig. S4, S5).

Fertility rates

Nation-level

Life expectancy alone accounted for a smaller but comparable share of the variance (60.1%) relative to the five socioeconomic predictors considered together (67.2%). Life expectancy significantly improved model fit compared to a model with socioeconomic indicators only (χ²diff = 16.93, DF = 1, p < 0.001) and remained significant over and above them (b = -0.53, CI = -0.79 – -0.28, p < 0.001). See Table S4, Fig. S6 for details.

County-level

On the level of U.S. counties, a model containing both life expectancy and socioeconomic indicators had a significantly better fit than a similar model with socioeconomic predictors only (χ²diff = 5.17, DF = 1, p = 0.023). Life expectancy remained a significant predictor (b = -2.20, CI = -3.99 – -0.40, p = 0.016) over and above socioeconomic indicators (Table S5, Fig. S7).

Random forests

Life expectancy was estimated as the most important predictor of fertility rates on both the national and the county levels (see Table 4, Fig. S8, S9).

Number of children

Individual level

In a mixed-effect model on individual data, local (county) life expectancy was a significant predictor of respondents’ number of children after controlling for age, income, educational attainment, social class, employment of a female in the household, religiosity and urbanization of the area (b = -0.06, CI = -0.12 – -0.00, p = 0.042; see Table 3). This combined model had a significantly better fit than one with socioeconomic predictors only (χ²diff = 4.14, DF = 1, p < 0.041); respondent’s age was the only other significant predictor (b = 0.50, CI = 0.46 – 0.56, p < 0.001). Notably, life expectancy was one of two population-level predictors among six individual ones.

Random forest

Life expectancy was among the three features of highest importance in predicting the number of children (see Table 4).

Population density

Recent work suggests that fertility rates may decrease with the increase in local population density [68], so we tested models of fertility rates with population density among predictors. Nation-level density alone accounted for 2% of the between-country variance in fertility rates and had no significant effects while controlling for other predictors (Table S6, Fig. S10). On the county level, the effect of population density was significant, but not after accounting for life expectancy or socioeconomic indicators (Table S7, Fig. S11). On the individual level, the level of urbanization (a proxy for population density) was not a significant predictor of the number of children beyond other predictors (Table 3).

Discussion

This work provides novel evidence for the relationship between mortality and human reproductive outcomes on global, local, and individual levels. Consistent with life history theory [45], human reproductive timing and abundance are linked to local mortality indicators. Moreover, those effects hold while controlling for economic and social variables – including education, employment, wealth, industrialization, availability of contraception – conventionally used to explain human reproductive dynamics.

As shown above, the relationship between mortality and reproductive outcomes held in mixed-effect models after controlling for socioeconomic indicators on all levels of analyses. Machine learning models that allowed for nonlinear associations provided a more nuanced picture: decision trees demonstrated how life expectancy may ‘split’ the entire world – and the U.S. as one of its corners – into clusters of “faster” vs. “slower” reproduction. Random forests, in turn, suggested that such splitting may at times be of higher predictive power than wealth or education.

While life history perspectives have previously been used to understand human reproduction, the current research is the first to apply the same analytic approach to data on three levels, including the previously unexplored level of U.S. counties, and to tie individual reproductive outcomes to observable mortality threat in local environments. While work that uses only nation-level indicators may lack the granularity to capture meaningful variability within nations, and work that uses only individual-level analyses may lack in-sample variability, the current research takes advantage of global, local, and individual variance in both mortality and reproductive outcomes. Such an approach may be fruitful for further exploring the varying fertility dynamics in urban and rural areas, different genders and ethnicities, socioeconomic backgrounds, or social strata.

This evidence has implications for the current economic and demographic understanding of human fertility. An examination of mortality alongside education, contraception, female employment, and urbanization emphasizes the predictive power of the former, showing that it explains a unique portion of variance, even after controlling for socioeconomic indicators. At the same time, socioeconomic factors, such as education and income, remain robustly significant on all levels of analysis, across linear and nonlinear models. Thus, we do not conclude that socioeconomic factors matter less: instead, we suggest that they may operate at a different ‘level of explanation.’ Using a Nobel Prize-winning taxonomy [69], socioeconomic factors may serve as proximate mechanisms underlying immediate behavioral changes – the “how” of the human reproductive slowdown. But behind those proximate mechanisms may reside the “why” – a more ultimate biological process that involves the adaptive calibration of human developmental systems to features of local environments.

This evidence addresses critiques of “ecological fallacy” raised against previous findings in the field [67]. We find that the key pattern holds across multiple levels of analyses, from nations to counties to individuals. Thus, it may be premature to dismiss nation-level analyses of human reproduction only because they use large-scale population aggregation. We propose U.S. counties as a local unit that may strike a proper balance between population-level variability and individual-level granularity. It must be noted, however, that variability in mortality conditions in the U.S. is considerably lower than the world’s (for U.S. life expectancy in years, SD = 2.38; for the world, SD = 8.90). Consequently, county-level analysis does not represent the true global range of human ecological conditions.

Previous work compared performance of economic versus mortality models – for example, in explaining the demographic transition in Matlab, Bangladesh [70], and dismissed mortality as a weaker explanation for fertility dynamics. In our work, however, mortality serves as an important predictor of reproductive behavior after controlling for socioeconomic variables on three different levels of analysis. A possible explanation is that analysis of a narrow subset of human population (one rural area in one developing country) may have limited the variability in ecological conditions, thus reducing the potential association between mortality and reproduction. In contrast, all nations of the world and all urban and rural areas of one country may have enough variability in mortality risk to capture such an association.

This research has broader implications for social policies addressing reproductive issues. High adolescent fertility, a pressing issue in lower-income areas because of its negative effects on women’s human capital, is commonly viewed as resulting from poor access to contraception and insufficient educational opportunities [71, 72]. Notwithstanding these factors, it can also be conceptualized as a biologically adaptive response to unpredictable ecologies with cues of higher riskiness. In the current work, mortality was the strongest predictor of adolescent fertility across both world nations and U.S. counties. Although adolescent fertility can have dysfunctional consequences for women, their families, and the larger society, it may be reproductively adaptive: having children sooner helps young women ensure their reproductive success when their environments threaten survival to reproductive age. Social policies addressing adolescent pregnancies may therefore benefit from reducing local morbidity and mortality rates (e.g., through investment in health and safety initiatives), in addition to increasing access to contraception and education.

For higher-income societies, the fertility decline and older age of childbearing might represent a biologically adaptive response to stabilizing ecologies, in which humans delay reproduction and have fewer offspring in favor of higher-quality childcare later in life. Such a ‘slow’ reproductive trajectory allows individuals to reap maximum genetic benefits in safe and predictable ecologies. Social policies encouraging fertility through economic incentives have often been less successful than expected, possibly because they fail to address the biological mechanisms suggested by life history theory. One such mechanism suggested by prior research could be physiological calibration of human reproductive systems through neuroendocrine changes in early ontogeny [73,74,75].

Limitations and future directions

The current research should be considered in light of its methodological limitations. First, although suggestive, this work is correlational and cross-sectional, thus hindering our ability to generate causal claims about the effect of ecological variables on reproductive behavior. In theory, reverse causality is possible: e.g., adolescent fertility and younger age at birth may drive mortality through health complications in teenagers.

This study focuses on life expectancy as a single aggregate variable. This limits our ability to identify specific sources of mortality and their varying effects in different environments. Future research would benefit from investigating specific effects of extrinsic mortality risks, such as famine, diseases, or healthcare availability, on human reproduction. It would also benefit from employing longitudinal designs to explore the link between changes in mortality and corresponding changes in reproduction over time.

Effects of mortality risk are often confounded with those of socioeconomic status (SES), as life expectancy correlates highly with SES on all levels of analysis [76, 77]. While the current work controls for the effects of the main socioeconomic variables, such as income, education, social class, and wealth, on both individual and population levels, there may still be socioeconomic variance left unaccounted for.

Due to the aggregated nature of most data used in this work, we were not able to robustly distinguish between male and female reproductive outcomes, while prior evidence suggests that they can vary significantly [78, 79]. Moreover, newly emerging evidence suggests that, beyond extrinsic factors, human reproductive behavior has a strong genetic component [80, 81]. Future work would benefit from analyzing additional groups of factors and examining their weights in accounting for the variability in human reproduction.

Conclusions

The above evidence may suggest (although not causally establish) a possible explanation for why adolescent fertility remains an issue in regions with high mortality, as well as why no developed country in the world has seen its fertility rate rise to its mid-twentieth-century level, despite massive efforts taken by some world governments. The biological calibration of human reproductive timing might impose a glass ceiling on the effect of socioeconomic incentives. As is the case for other animal species, humans may adjust their reproductive behavior to fit the survival threat in their local ecology. Social and health policies governing human reproduction, whether they seek to boost or constrain fertility, may thus benefit from incorporating a focus on mortality risk, both addressing it as a potential “accelerator” of human reproduction in riskier regions of the world with high mortality, and accounting for the natural “slowdown” of reproduction humans may adopt in more stable and safer ecologies.

Methods

Summary

We synthesized 17 large public datasets from trusted sources to obtain (1) nation-level data on reproduction, mortality, and socioeconomic progressivity from 217 world economies, (2) county-level data from 3,242 counties and equivalents in the United States, and (3) person-level reports on similar indicators from 2,808 respondents. On all three levels, we applied hierarchical linear modeling and random forests with K-fold cross-validation [82] to examine predictors of people’s aggregate reproductive timing and abundance, operationalized as: (1) adolescent fertility rates, (2) population-averaged age of childbearing, and (3) population-averaged or individually reported number of children.
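As an illustration of the K-fold cross-validation mentioned above, the following is a minimal sketch of how observations can be partitioned into folds. The function name `k_fold_indices`, the number of folds, and the seed are assumptions made for this example; the text does not state the value of K used in the analyses.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Every observation appears in exactly one test fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle reproducibly
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Illustrative call: 10 observations, 5 folds (both numbers are hypothetical).
splits = list(k_fold_indices(10, 5))
```

A model would be fitted on each `train` list and evaluated on the matching `test` list, with the evaluation metric averaged across folds.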

At each level, we used two groups of predictors: (1) socioeconomic indicators including wealth, participation of females in workforce, access to contraception and education, urbanization, or industrialization, and (2) local mortality risk level operationalized as life expectancy at birth. Other measures of mortality were considered, such as risk for different age ranges (e.g., mortality in infants under 1, children aged 1 to 4, ages 5 to 9, etc.). However, there is no consensus in the literature on what age is most critical: as suggested by theoretical models [83], comparative analyses [45], and experimental data [43], variation in life history traits is not limited to child mortality and has also been linked to adult mortality. Total life expectancy was chosen in this work because it provides a cumulative measure of mortality across age ranges and does not confine the analysis to any one age range. Notably, age-specific mortality measures are all highly correlated with one another and with total life expectancy (see Tables S10, S11 in the Supplement).

Depending on the level of analysis, we used the most relevant variable to represent the theorized predictor, e.g., GDP per capita (PPP) reflected wealth on the national level, median household income represented wealth on the county level, while reported household income was used on the individual level (see Table S1 for variables).

For each reproductive outcome we examined (1) the explanatory power of a hierarchical linear (mixed-effect) model with socioeconomic predictors alongside a similar model with one mortality risk predictor, and (2) the explanatory power of individual variables in combined models including all the above indicators. Across machine learning analyses, we compared predictors by their (1) top-down order in decision trees built using recursive binary splitting, (2) average feature importance throughout 500/250 trees of the random forest.

Levels of analyses

We examined variability in human reproduction, as well as ecological and socioeconomic features on three levels: 1) world countries (here referred to as ‘nations’ to avoid confusion with ‘counties’); 2) U.S. counties or county equivalents; 3) individuals.

Nation-level data included 217 world economies per World Bank Development Indicators: https://datatopics.worldbank.org/world-development-indicators/. County-level data covered 3,242 U.S. Census areas including 3,006 counties, 14 boroughs and 11 census areas in Alaska, the District of Columbia, 64 parishes in Louisiana, Baltimore city, MD, St. Louis city, MO, part of Yellowstone National Park in Montana, Carson City, NV, 41 independent cities in Virginia, 78 municipalities of Puerto Rico, 3 main islands of the United States Virgin Islands, Guam, 4 municipalities of the Northern Mariana Islands, 3 districts and 2 atolls of American Samoa, 9 islands of the U.S. Minor Outlying Islands.

Individual-level data included 2,808 survey responses of Americans residing in 363 different counties, collected as part of the World Values Survey (Wave 7, U.S. subset). At each level, the units were nested within larger geographical entities to account for shared variance: world nations were nested within 22 world regions (by the International Organization for Standardization (ISO) classification), and counties and individuals were nested within 57 states and territories of the United States.

Data

For each level of analysis, we synthesized several large public datasets from trusted sources that contained secondary de-identified aggregated data. For the nation-level analyses, the datasets included: World Bank Development Indicators, United Nations World Fertility Report, United Nations Economic Commission for Europe Statistical Database, Organization for Economic Cooperation and Development Family Database, and United Nations Industrial Development Organization report. For U.S. county-level analyses, the datasets included: U.S. Census (American Community Survey), Institute for Health Metrics and Evaluation: Global Health Data Exchange, Centers for Disease Control and Prevention data on infant mortality rates, Bureau of Labor Statistics data on labor participation, and U.S. Department of Agriculture Atlas of Rural and Small Town America and data on education, income, and population. For the individual level, the datasets included the World Values Survey (Wave 7, U.S. subset) and county-level data from the above sources. Links to original datasets, as well as preprocessed datasets used in the analyses, are publicly available at OSF: https://osf.io/qtf84/.

For nation-level analyses, data were aggregated for years 2000–2020 to eliminate year-to-year fluctuations and missing cases. For U.S. county-level analyses, depending on the variable, data were either aggregated for years 2016–2020 (to match a standard 5-year aggregation cycle of Census data collection). For individual level, data came from years 2017–2021 (the most recent wave of World Values Survey). Datasets were merged using standardized ISO country codes (on the national level) and FIPS codes (on the county and individual levels).
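The key-based merge described above can be sketched as follows. This is an illustrative sketch only, not the preprocessing code used for the analyses: the helper names `normalize_fips` and `merge_on_key` and all data values are hypothetical. It shows one practical detail of merging on FIPS codes, namely that county codes are five-character strings whose leading zeros are easily lost when a file is read as numbers.

```python
def normalize_fips(code):
    """Return a 5-character county FIPS string, restoring leading zeros."""
    return str(code).strip().zfill(5)


def merge_on_key(left, right):
    """Inner-join two {key: {field: value}} tables on their shared keys."""
    merged = {}
    for key in left.keys() & right.keys():
        merged[key] = {**left[key], **right[key]}
    return merged


# Hypothetical values for illustration; not taken from the study's datasets.
income = {
    normalize_fips(1001): {"median_income": 60000},  # integer lost its zero
    normalize_fips("06037"): {"median_income": 70000},
}
life_expectancy = {
    normalize_fips("01001"): {"life_expectancy": 75.0},
    normalize_fips("48201"): {"life_expectancy": 78.0},
}

merged = merge_on_key(income, life_expectancy)
```

Only the county present in both tables survives the inner join; the same pattern applies to nation-level tables keyed by ISO country codes.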

Variables

We focused on 20 variables suggested by the mainstream demographic paradigm of human fertility and life history theory (see Table S1 for measurement details on each variable; items enumerated with an S refer to Supplemental materials). As dependent variables we used (1) adolescent fertility rates, (2) age of birth of the first child (or average age of childbearing for county-level analyses), (3) fertility rates, (4) individually reported number of children. As socioeconomic predictor variables we used: on the level of world’s nations, (5) contraceptive prevalence, (6) share of women in tertiary education, (7) female literacy rates, (8) female participation in the workforce, (9) generalized economic prosperity measured as GDP per capita (PPP). On the county level, (10) educational attainment measured as youth and adult college rates and/or percent of population with a bachelor’s degree, (11) female participation in labor force, (12) median household income, (13) degree of urbanization. On the individual level, (14) educational attainment, (15) social class, (16) household income, (17) employment of females in the household, (18) religiosity, (19) age. To assess mortality risk, on all three levels of analyses we used (20) local life expectancy at birth (see Table S1 for sources of data and measurement units for all analyzed variables). For predictor variables that correlate at r > 0.7 (see Table S2) we ran additional multicollinearity checks, making sure variance inflation factors were not above 2.5.
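To make the multicollinearity check concrete, the sketch below computes a variance inflation factor (VIF) for the simplest case of exactly two predictors, where VIF = 1 / (1 - r²). This is a simplification assumed for illustration: with more than two predictors, as in the models described here, r² is replaced by the R² obtained from regressing each predictor on all the others. The function names and data values are hypothetical.

```python
import math


def pearson_r(x, z):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    cov = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_z = sum((b - mean_z) ** 2 for b in z)
    return cov / math.sqrt(var_x * var_z)


def vif_from_r(r):
    """VIF for a model with exactly two predictors correlated at r."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictor values, constructed to be strongly correlated.
education = [10.0, 20.0, 30.0, 40.0, 50.0]
income = [12.0, 18.0, 33.0, 37.0, 52.0]

r = pearson_r(education, income)
vif = vif_from_r(r)
needs_attention = vif > 2.5  # the cutoff named in the text
```

In this two-predictor case, a correlation of 0.7 corresponds to a VIF of about 1.96, and the 2.5 cutoff is reached at a correlation of roughly 0.77.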

Note that on the level of the U.S., mortality has limited variability compared to the world: the range of life expectancy averages across U.S. counties is 66.8–86.8 years, SD = 2.38 while the range for worldwide national averages is 54.3–85.2 years, SD = 8.89.

For age of birth of the first child, there is no source that consistently gathers and reports data on all world countries [71]. To increase the sample size on the level of nations, data on these indicators were collected from multiple sources: United Nations Economic Commission for Europe Statistical Database reporting data from 2019 and United Nations World Fertility Report reporting data from 2012 or latest available. We only included data coming from 2000 to 2020. Whenever more than one data point per country was available, they were averaged. Among countries that had data points from two different decades, these data correlated at r = 0.88.

On the county level, data on the age of birth of the first child were not available; instead, we used cohort-weighted age of women at all births for a given period of time. As mentioned in the Main text, such an indicator is an imperfect proxy for life history strategy as it confounds a larger number of offspring – usually, a sign of a “faster” strategy – with older age – a sign of a “slower” strategy. For example, a woman who had her first child at age 18 (Case 1) and thus demonstrated a relatively early reproductive onset, at least for a developed country, may go on to have children at ages 23, 26, and 29 – thus, her average age of childbearing (24) will be higher than that of a woman who had one child at age 23 (Case 2). While by the conventional life history criteria, Case 1 clearly represents a faster reproductive strategy than Case 2, estimating by age of childbearing would lead us to consider Case 2 a faster one.
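The arithmetic behind the two cases above can be checked with a few lines of code. The ages are the ones given in the text; the variable names are chosen for this example.

```python
from statistics import mean

# Ages at each birth, as given in the two cases above.
case_1_ages = [18, 23, 26, 29]  # early onset, four children
case_2_ages = [23]              # later onset, one child

case_1_mean = mean(case_1_ages)  # (18 + 23 + 26 + 29) / 4
case_2_mean = mean(case_2_ages)

# Age at first birth ranks Case 1 as the faster strategy...
first_birth_says_case_1_faster = min(case_1_ages) < min(case_2_ages)
# ...while mean age of childbearing ranks Case 2 as the faster one.
mean_age_says_case_2_faster = case_2_mean < case_1_mean
```

The two indicators order the same pair of women in opposite ways, which is the confound the proxy introduces.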

Analytic approach

Mixed effect models

We used hierarchical linear (mixed effect) modeling to predict each reproductive outcome using the following types of predictor variables: (1) socioeconomic indicators, (2) life expectancy, (3) both sets of variables together. We compared the variance explained by each model; then, for each pair of models, we conducted a χ² difference test of model fit to evaluate whether each set of indicators explained a significant amount of variance while controlling for the other set. We also report semi-partial r² for each individual predictor in the model. Units of analyses (nations, counties) were nested within larger geographical areas (world regions, U.S. states) to account for variance shared with neighboring units.
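A χ² difference test between two nested models is commonly computed as a likelihood-ratio test. The sketch below covers only the case where the fuller model adds a single parameter (for example, adding life expectancy to the socioeconomic model), so the statistic has one degree of freedom and its p-value can be obtained from the standard library. It assumes both models were fitted by maximum likelihood on the same data; the function name and the log-likelihood values are hypothetical, and testing the multi-predictor socioeconomic set would require a χ² distribution with more degrees of freedom.

```python
import math


def lr_test_one_df(loglik_reduced, loglik_full):
    """Likelihood-ratio (chi-square difference) test with 1 degree of freedom.

    Returns the test statistic and its p-value.
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # For 1 df, P(chi2 > x) equals P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p_value = lr_test_one_df(loglik_reduced=-105.0, loglik_full=-100.0)
```

A small p-value indicates that the added predictor improves model fit beyond what the reduced model already explains.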

Decision trees and random forests

Because linear models work with sample averages, they provide limited insight into the structure of predictors and their relationships. To provide a more nuanced picture, we applied a non-parametric machine learning algorithm known as a decision tree (Rokach & Maimon, 2005) to predict each reproductive outcome from mortality and socioeconomic predictors considered together. For each outcome variable, we then built a random forest [82] to obtain a more robust (as compared to a single tree) estimate of the feature importance of individual predictors. A random forest is an ensemble algorithm that constructs a large number of uncorrelated (or minimally correlated) decision trees. A decision tree, in turn, relies on an algorithm known as recursive binary splitting: it identifies variables (conventionally called “features”) that significantly predict the outcome variable and recursively splits the data at a certain threshold of each predictor, with respect to all other predictors in the model. Every split breaks the available data down into two subsets in such a manner that the data in the two subsets are maximally different from each other, whereas the data within each subset are maximally similar. The decision tree algorithm finds the optimal sequence of predictors (features) and their splitting thresholds. Because the trees of a random forest are largely uncorrelated, the forest provides both more accurate predictions and more robust feature importance estimates than any single tree. To keep the trees as independent as possible, random forests use bagging (simulating multiple datasets by removing and repeating random observations from the original dataset) and feature randomness (randomly sampling a subset of features and using only this subset to create a tree). Assessing feature importance is a way to obtain relatively robust estimates for the explanatory power of predictor variables across a large sample of simulated datasets.
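
A single step of recursive binary splitting can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation used in the analyses: it assumes a continuous outcome and chooses the split that minimizes the summed squared error within the two resulting subsets, and the function names and all data values are made up for the example.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, target):
    """Find the (feature, threshold) pair minimizing within-subset error.

    rows: list of {feature_name: value} dicts; target: list of outcomes.
    Returns (feature, threshold, total_sse_after_split).
    """
    best = None
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, target) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, target) if row[feature] > threshold]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (feature, threshold, total)
    return best


# Made-up toy data, constructed so that one feature separates the outcome.
rows = [
    {"life_expectancy": 55, "gdp": 3},
    {"life_expectancy": 58, "gdp": 9},
    {"life_expectancy": 80, "gdp": 4},
    {"life_expectancy": 82, "gdp": 10},
]
adolescent_fertility = [120.0, 110.0, 10.0, 12.0]

split = best_split(rows, adolescent_fertility)
```

A full tree repeats this search within each resulting subset until a stopping rule is met; a random forest repeats the whole procedure on bootstrapped samples with random feature subsets.
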
In the above analyses, feature importance was calculated across 500 trees of the random forest for nation-level analyses and 250 trees for county-level and individual-level analyses (due to limited computational resources). Feature importance of each predictor is measured in relative units scaled to the feature of the highest importance that is assigned a value of 100.
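
The relative scaling described above can be sketched as follows. The function name and the raw importance values are hypothetical; only the rule (the most important feature is assigned 100 and the others are expressed relative to it) comes from the text.

```python
def scale_importance(raw):
    """Rescale raw importances so that the largest one equals 100."""
    top = max(raw.values())
    if top <= 0:
        raise ValueError("at least one importance must be positive")
    return {name: 100.0 * value / top for name, value in raw.items()}


# Hypothetical raw importances, for illustration only.
raw_importance = {"feature_a": 0.5, "feature_b": 0.25, "feature_c": 0.0}
scaled = scale_importance(raw_importance)
```

Scaled values are therefore comparable within a single forest but not across forests built for different outcomes or levels of analysis.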