Introduction

Since the neolithic demographic transition, human societies have often lived along a continuum of unsustainability (Ponting 2007; Gowdy and Krall 2013), as depicted in Fig. 1. At the most extreme end of this continuum is socio-ecological “collapse”, defined here as a rapid decline in the population and/or development of a society due to a combination of ecological and social factors (Cumming and Peterson 2017). Civilizations that may have experienced socio-ecological collapse in whole or part due to their over-extraction of resources include the Mayans, Romans, Greeks, the civilizations of Easter and Pitcairn Islands, the Norse colony in Greenland, the Cahokia in the American Midwest, and the Anasazi (Lopinot and Woods 1993; Ponting 2007; Hughes 2011; Kennett and Beach 2013). In these cases, human systems may have expanded beyond their ability to absorb energy and materials from the environment, and these resource constraints interacted with socio-political conflict to generate socio-ecological collapse. The precise mechanisms and the relative weight of social versus ecological factors are debated in these and other cases (Butzer 2012; Butzer and Endfield 2012; Boersema 2015; Cumming and Peterson 2017), and we do not imply that every case of social collapse is primarily resource driven. Instead, we simply note that human populations have repeatedly experienced collapse and that, in many of these cases, environmental and resource constraints are implicated.

Fig. 1 Sustainability and a continuum of human unsustainability

While these cases of permanent collapse are particularly notable, humans have also spent a great deal of their history in a state of socio-ecological crisis. We define socio-ecological crisis as a situation in which human populations have expanded close to or beyond their ability to gather energy and materials from their environment, which imperils the population and its standard of living. Socio-ecological crisis may lead to collapse, but societies might also be resilient. Examples of socio-ecological crisis might include the 1828 years (out of 2000) between 108 BCE and 1910 CE in which there was a famine in at least one province in China; the European famine of 1315–17 in which the population resorted to cannibalism; or the Finnish famine of 1696–97 in which a quarter of the population perished (Ponting 2007). In each case, crisis did not lead to collapse, but for our purposes, it is only important that so much of recorded human history has been spent in a state of crisis.

In addition to states of socio-ecological crisis and socio-ecological collapse, human populations have spent much of the remainder of our post-paleolithic history in a state of unsustainability. We define unsustainability as a growth phase in which humans extract increasing quantities of energy and materials from the environment and use them for both maintenance and growth. Unsustainability is associated with both economic growth (Daly 1991) and population growth (Ehrlich and Ehrlich 2009) and may lead to socio-ecological crisis. We might consider the growth phase of the ancient Greek (Hughes 2011) or contemporary Western civilizations to exist in a state of unsustainability. Given our history and our contemporary society, we might wonder why human civilization since the neolithic revolution seems to be dominated by periods of unsustainability punctuated with periods of socio-ecological crisis and collapse (Ponting 2007; Gowdy and Krall 2013). The purpose of this paper is to provide a conceptual model for understanding why humans may be predisposed to environmental unsustainability, socio-ecological crisis, and collapse.

Anthroecological theory (AET) hypothesizes that human social and cultural evolution is the ultimate cause of the ecological crises currently damaging earth systems (Ellis 2015; Ellis et al. 2018). According to AET, human damage to earth systems is a consequence of socio-cultural niche construction which has evolved via a multi-level group selection process. In this understanding, humans have acted as ecosystem engineers and niche builders since at least the agricultural revolution, and perhaps since their burning of ancient savannahs for corralling game. This ecosystem engineering led to progressively greater “social scales” and population size, requiring ever-increasing levels of ecosystem engineering.

Here, we attempt to merge AET with Lotka’s maximum power principle (Lotka 1922) and concepts from evolutionary ecology. Lotka argued that natural selection acted so that organisms sought to maximize the rate at which they extracted energy from the environment; H.T. Odum later named this hypothesis the maximum power principle (Sciubba 2011). If Lotka was correct, then the ecosystem engineering and niche construction described by AET can be understood as a consequence of the maximum power principle acting via natural selection. While this energetic view is similar to Ellis’s AET model, we use energy extraction as a target of selection and as a metric of environmental load. This formalization might allow novel tests and models of AET. In the remainder of the paper, we first review the available literature on the evolution of unsustainability including prior approaches from agriculture, evolutionary psychology, and AET. We then describe our conceptual model of energetic AET and its relationship to the evolution of carrying capacity. Finally, we discuss the implications of energetic AET for our understanding of two contemporary issues in socio-ecology and end with conclusions.

Prior literature on the evolution of unsustainability

Agriculture, energy and socio-ecological crisis

As in the present work, Gowdy and Krall (2013, 2014, 2016) attempted to understand the cause of human unsustainability. They argued that human ultra-sociality (called eusociality in the sociobiological literature) evolved as a consequence of the Neolithic demographic transition and that this ultra-sociality was the ultimate cause of the Anthropocene’s environmental crisis. To Gowdy and Krall, the ultra-social nature of human groups allowed for a shift in the primary level of selection from the individual level to the group level. Thus, “With the transition to agriculture the group as an adaptive unit comes to constitute a wholly different gestalt driven by the imperative to produce surplus” (Gowdy and Krall 2014, 139). This imperative to produce surplus leads to environmental crisis in Gowdy and Krall’s model, but the same phenotype could also be understood as an expression of Lotka’s maximum power principle in our model.

Of course, Gowdy and Krall were not the first to link agriculture and socio-ecological crisis. Thomas Malthus’ classic work on agriculture and population, and the responses to it, are, to our knowledge, the first studies of the energetic sustainability of human societies. In his 1798 An Essay on the Principle of Population, Malthus observed that population grew exponentially, while improvements in agricultural yields grew linearly at most (Malthus and Pullen 1989). The result was what has been called a Malthusian catastrophe, in which the human demand for food energy exceeds our ability to extract it from the environment, with consequent decreases in population size, standard of living, or both. Phrased in the language of the present paper, Malthus argued that the rate of increase of energy extraction from the environment was principally resource (rather than technology) limited and that this resulted in socio-ecological crisis when the rate of population growth exceeded the rate of energy extraction growth.
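Malthus' contrast between exponential and linear growth can be illustrated with a minimal numerical sketch. The function name `first_shortfall_year` and all parameter values below are hypothetical, chosen only to make the crossover visible; they are not estimates from Malthus or from any data set.

```python
def first_shortfall_year(pop0, growth_rate, food0, food_step, max_years=1000):
    """Return the first year in which population exceeds food supply.

    Population grows geometrically: pop0 * (1 + growth_rate) ** year.
    Food supply, measured in people fed, grows arithmetically:
    food0 + food_step * year.
    Returns None if no shortfall occurs within max_years.
    """
    for year in range(max_years + 1):
        population = pop0 * (1.0 + growth_rate) ** year
        food_capacity = food0 + food_step * year
        if population > food_capacity:
            return year
    return None


# Hypothetical starting values (assumptions for illustration only).
year = first_shortfall_year(pop0=100.0, growth_rate=0.03, food0=200.0, food_step=5.0)
print("first year of shortfall:", year)
```

For any positive growth rate the geometric series eventually overtakes the arithmetic one, so under these assumptions a shortfall is only a matter of time; with zero population growth the function returns `None`.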

Malthus’ ideas have been challenged many times since the eighteenth century, perhaps most notably by Esther Boserup (2014). Boserup argued that agricultural productivity was principally limited by the technology and labor humans dedicated to agriculture. In her view, as population density rose, farming practices shifted through “agricultural intensification” which would serve to increase yields at the cost of increased labor. Phrased in the language of the present paper, such a change would amount to technological and cultural evolution towards increasing energy extraction from the environment. This increasing energy extraction could then, contra Malthus, support an exponentially growing population.

Since Boserup’s work, the Green Revolution increased staple crop productivity via new crop varieties and natural gas-made fertilizer. This revolution was able to ameliorate the potential socio-ecological crisis associated with the rapid population growth in the developing world in the latter half of the twentieth century (Ehrlich and Ehrlich 2009). Thus, consistent with the outlines of Boserup’s thesis, humans were able to use technology to increase their energy extraction from the environment; this technologically-mediated energy extraction supported increased population size and occurred commensurate with a cultural shift (evolution) that allowed for increasing technological use in agriculture (Possas et al. 1996). This process is consistent with the energetic AET hypothesis developed here.

While Boserup and Malthus provide important and contrasting lenses through which to understand the role of agriculture in human population, and while Gowdy and Krall identify agriculture as a critical turning point in human evolution, agriculture is no longer the predominant energy source for humans (Smil 2017). Thus, we seek to build a conceptual evolutionary model of the human socio-ecological system that is consistent with these insights from agricultural systems but is more evolutionary and more general and incorporates extra-somatic energy, defined as energy that is used by humans but not used in direct human metabolism (Price 1995). Furthermore, we propose a positive feedback that is missing from prior models of the energetics of socio-ecological systems, and we also integrate a ratchet-effect that causes the evolution of humans towards socio-ecological crisis to be unidirectional.

Evolutionary psychology and socio-ecological crisis

Where Gowdy and Krall (2013, 2014) saw agriculture as the key to understanding human unsustainability, Van Vugt et al. (2014) saw the roots of unsustainability in the paleolithic, before the invention of agriculture. Using an evolutionary psychological approach, Van Vugt et al. argued that ancestral selection in paleolithic humans created psychological conditions that make humans ill prepared to live sustainably. For example, they argue that relative values (valuing status rather than absolute quantities of goods) would have been adaptive in prehistory but have become maladaptive for sustainability in the contemporary world. In the present paper, we understand the traits described by Van Vugt et al. (2014) as pre-adaptations that are exploited by cultural evolution.

Anthroecological theory and energy

AET proposes that humans are predisposed for unsustainability through their biological and cultural evolution. Specifically, AET proposes that human unsustainability has evolved via a multi-level selection process (see Waring et al. (2015) for a relevant overview of multi-level selection) acting on the genome and occurring in concert with selective and non-selective mechanisms acting on culture and technology. In AET, this process results in a species that is prone to niche construction and ecosystem engineering, and the scale of these processes continues to increase as the population rises. This increasing scale coupled with human propensity for niche construction leads to human unsustainability (Ellis 2015; Ellis et al. 2018).

Here, we hypothesize that humans have evolved to increase their carrying capacity (scale) by increasing their rate of energy extraction from the environment. This hypothesis is identical to that of AET, with the exception that it emphasizes the cause of the process (maximization of energy extraction) rather than the mechanism (niche construction and ecosystem engineering). That is, niche construction and ecosystem engineering are two possible mechanisms by which humans maximize their extraction of energy from the environment. Because the substrate of this evolution is cultural and technological as well as genetic, it outpaces the ability of the ecological system to evolutionarily respond. The human and non-human ecological systems are thus locked in a form of Red Queen coevolution (Van Valen 1973; Stenseth and Smith 1984), but human technology evolves faster than the biotic ecological system’s ability to keep up.

The Red Queen hypothesis is often invoked to describe coevolutionary systems in which one species coevolves to keep up with selective pressure from another species or species group and in which there is a difference in the evolutionary rate of the coevolutionary partners (e.g., pathogens and their hosts). Here, we adapt the Red Queen to include coevolution between a species (humans) and the biotic components of the ecosystem (the biosphere). Thus, we hypothesize that as human population size increases, human populations demand more energy from the environment. This places a selective pressure on the biotic environment because energetic resources are limited and, because of the 2nd law of thermodynamics, zero-sum; that is, if humans consume energy from the environment, there is necessarily less energy available for other organisms. However, the human population evolves via techno-cultural evolution, which is far faster than the biotic environment can evolve via genetic evolution. Thus, the analogy between a fast-evolving parasite (humans) and a slow-evolving host (the biotic environment) is apt. Note that we do not intend to imply that the environment itself coevolves with human techno-cultural evolution, but that the biotic populations that compose the ecosystem evolve, with consequent impacts on the abiotic environment (but see Lovelock and Margulis 1974; Levin 1998; Swenson et al. 2000).

While the conceptual differences between AET and the present hypothesis are modest, there are important differences in their implications. Ellis’s AET proposes that humans are somewhat unique because we are expert ecosystem engineers and niche builders. The present hypothesis suggests the alternative. Humans, just like all other species, maximize their energy extraction from the environment. Just like all other species, humans are opposed in this process by a coevolutionary response in the biotic components of the ecosystem. Humans are unique only because the rate of cultural evolution in energy extraction exceeds our biotic environment’s coevolutionary response. In this view, human unsustainability is a result of our uniquely advanced culture coupled with our non-unique drive towards energy maximization; any species that developed a similar rate of cultural evolution would behave similarly. Niche building and ecosystem engineering (Ellis 2015) or agriculture (Gowdy and Krall 2014) are simply the means by which we maximize energy extraction from the environment.

Conceptual model of unsustainability

Energy and selection

Natural selection works to maximize the number of offspring an individual contributes to a following generation (i.e., fitness). However, energy and fitness are closely linked, because fitness is the result of a physiological process by which energy is captured from the environment and converted into offspring. That is, growth and reproduction at either the group or individual level require energy input from the environment coupled with high entropy waste output to the environment. This physiological understanding of fitness has been well studied in non-humans and is the foundation for much of life history theory (e.g., Weiner 1992; Garland Jr and Carter 1994; Ricklefs and Wikelski 2002). In this context, Pianka (1970) argued that, “…natural selection will usually act to maximize the amounts of matter and energy gathered per unit time.” Brown et al. (1993) likewise offered an energetic definition in which fitness is “reproductive power, or the rate of conversion of energy into offspring.” This reproductive power was taken to be a function of both the rate of assimilation of energy from the environment and the rate of conversion of energy to offspring (but see Kozlowski 1996).

Based on Brown et al.’s definition, fitness (F) of individual i is a function of energy extracted (X) from the environment and the efficiency of energy conversion to offspring (E):

$$F_{i} = X_{i} \times E_{i} , \tag{1}$$

where F is offspring number per lifetime, X is in calories extracted by the individual over its lifetime, and E is offspring produced per calorie. In this model, fitness is on the individual level, but identical logic applies at the group level, assuming heritable group variation in X or E. This physiological understanding of fitness is tautological when applied to non-humans, because the only source of energy non-human organisms use is somatic. For example, applied to a lion, Eq. 1 simply argues that the lion’s offspring number is the total number of calories it eats multiplied by its offspring number divided by the number of calories it eats. However, both X and E can be selected independently, given heritable variation. That is, lions could be selected for better foraging ability (increased total energy gain from the environment), or for more efficient foraging, digestion, metabolism, etc. (increased efficiency). Unlike non-humans, humans can use extra-somatic energy sources (e.g., fossil fuels), and can convert those energy sources into genetic fitness. For example, humans can burn savannas to harvest game (Scherjon et al. 2015), burn forests to create terra preta (Glaser et al. 2001), and reform fossil fuels to create fertilizer; all of these uses of energy use non-trophic energy to indirectly increase fitness. Thus, for humans X and E are not limited by photosynthesis and respiration as they are in non-human populations.
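A minimal sketch of Eq. 1 makes the independence of X and E concrete. The class name `Organism` and the numerical values are hypothetical and serve only as an illustration: a 10% gain in extraction and a 10% gain in conversion efficiency raise fitness equally, but only the former raises the energy drawn from the environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organism:
    energy_extracted: float       # X: calories extracted over a lifetime
    conversion_efficiency: float  # E: offspring produced per calorie

    def fitness(self) -> float:
        """Eq. 1: F = X * E, in offspring per lifetime."""
        return self.energy_extracted * self.conversion_efficiency


# Hypothetical values (assumptions for illustration, not empirical estimates).
baseline = Organism(energy_extracted=1.0e6, conversion_efficiency=2.0e-6)
better_forager = Organism(energy_extracted=1.1e6, conversion_efficiency=2.0e-6)
better_converter = Organism(energy_extracted=1.0e6, conversion_efficiency=2.2e-6)

for label, organism in [("baseline", baseline),
                        ("10% more X", better_forager),
                        ("10% more E", better_converter)]:
    print(f"{label}: F = {organism.fitness():.2f}, X = {organism.energy_extracted:.0f}")
```

In this sketch the two variants are equally fit, so selection on fitness alone cannot distinguish them; they differ only in environmental load, which is why X is treated here as both a target of selection and a metric of that load.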

We hypothesize that humans have been selected for increased X and E and that this may have occurred through a combination of individual selection acting on genetic/behavioral traits that favor high individual X, as well as group selection favoring groups which, due to their genes, culture, or technology, more rapidly absorb energy from the environment and convert that energy to fitness. We hypothesize that this selection has acted to increase both somatic X (e.g., food) and extra-somatic X that can contribute to fitness (e.g., fossil fuels). We hypothesize that humans have been preferentially selected to maximize X rather than E, that this is the cause of our predilection for socio-ecological crisis, and that this preferential selection on X occurs because of thermodynamic limits on the change in E. Furthermore, even selection for increased resource efficiency can lead to increased environmental impacts. As resources are more efficiently converted into fitness, population size increases, which in turn leads to increased absolute resource use. This is analogous to Jevons’ Paradox and the ideas of steady state economics which propose that even increased energy efficiency can result in increased environmental impacts through increased economic growth (Daly and Cobb Jr 1994).

At the individual level, we can see the impacts of selection to maximize energy extraction in human social behavior (Penn 2003). As we accumulate more energy from the environment, demonstrated socially in the form of material possessions, we improve our status in the group and presumably improve our mating success (Kruger 2008; Sundie et al. 2011). This might apply both to indirect material indicators of energy gain (e.g., objects) as well as direct physical indicators of energy gain [e.g., body mass as an indicator of male attractiveness (Swami and Tovée 2005)]. Of course, the aggregate accumulation of material possessions and status comes at an environmental cost. We use energy and materials to create and obtain status bearing objects (cars, houses, jewelry, clothes, etc.), and this resource use results in high entropy wastes output to the environment. Thus, one way to understand our socio-ecological crisis is as a result of ancestral selection acting at the individual level and favoring social dominance expressed via the accumulation of embodied energy (Odum 1996). This adapted drive to accumulate material and embodied energy is a preadaptation for environmental extraction which is exploited by later cultural evolution (Van Vugt et al. 2014).

Genetic selection might also operate at a group level (Wilson and Sober 1994). In this case, groups of individuals that can more effectively accumulate resources outcompete groups that cannot or do not. If there is heritable (genetic) variation in the efficiency of resource extraction or resource use, these traits would be selected for (we consider non-genetic mechanisms in the next section). As in Eq. 1, the fitness of the group is a function of the group’s ability to extract energy from the environment and the group efficiency of conversion. For example, groups of hunter-gatherers that hunt more effectively due to genetic mutations in social intelligence might be selected over groups without such traits, even if the trait exhibits a cost at the individual level. See Henrich (2004) for a discussion of both the potential and limits of genetic group selection.

We might view human social organization in general in this lens: social organization exists to maximize the extraction of energy from the environment to the group and individual (X), and the efficiency of the conversion of extracted energy into offspring (E). This is identical to the claim that social organization exists to maximize the fitness of the group (Wilson and Sober 1994) and/or the individuals which compose the group (Nowak et al. 2010), given an energetic definition of fitness. What is unique is the implication. If social organization acts to maximize group fitness (or the fitness of individual group members), and if fitness is dependent on energy extraction from the environment, then social organization results in increasing energy extraction from the environment. Again, either increased X or E leads to environmental unsustainability, and thus social organization is an evolutionary cause of the socioecological crisis.

Cultural-technological coevolution

Multi-level selection theory argues that the phenotypes, particularly social phenotypes, observed in nature can be understood as the result of a selective process that acts on both individual and higher levels of social organization simultaneously. In multi-level selection theory, it is the balance of variance and selective pressures at the group and individual levels that determines the genotypes and phenotypes that result. Multi-level selection is increasingly used in biology as a means of explaining insect social behavior (Nowak et al. 2010) and in social science as a means of explaining human social behavior (Bell et al. 2009; Zefferman and Mathew 2015; Richerson et al. 2016), including pro-environmental behavior (Van den Bergh and Gowdy 2009; Safarzyńska and van den Bergh 2010; Safarzyńska et al. 2012; Waring et al. 2017; Brooks et al. 2018).

One of the traditional weaknesses of multi-level selection theory is that ancestral human groups seemed poorly suited to maintaining the genetic variance between groups required for group-level selection (Henrich 2004). However, selection is not limited to genetic transmission. Cultural group selection posits that culture, rather than genes, is the source of heritable variation on which selection acts at the human group level. Because there are differences in the transmissibility and variance generation between cultural and genetic inheritance, the maintenance of cultural variation across groups is more plausible (Henrich 2004; Bell et al. 2009; Waring et al. 2015), thus the attraction of group-level cultural selection as a means of explaining human traits.

Table 1 describes the levels of selection acting to create human unsustainability and the speed with which they act. The strength of selection is a function of the heritable variation in a trait and the trait’s association with fitness (Henrich 2004; Freeman and Herron 2007). The speed of selection is a function of its strength and the generation time (Van Valen 1973). Cultural group selection can be more rapid than genetic forms of selection, because cultural change does not need to wait for biological reproduction to change phenotypic ratios, and new cultural innovations can be added to the population through non-random guided mutation (Perreault 2012; Brooks et al. 2018).

Table 1 Multi-level selection acting in energetic-AET

We hypothesize that genetic, individual-level selection for behavioral and psychological traits that increase X results in pre-adaptations for a form of gene-culture coevolution (Feldman and Laland 1996; Ambrose 2010) that results in technology that further increases X. Technological-cultural evolution spreads via cultural group selection (or other non-genetic means of transmission) (Cavalli-Sforza and Feldman 1981; Mesoudi et al. 2004; Claidière et al. 2014; Jordan 2014). As a result, technology might change far more quickly than natural systems can respond via natural selection (Wilson 2012). In species without technology or a technological culture, changes in X and E will occur on evolutionary time and within a biological system that will evolve in response to any such change. That is, in the absence of culture and technology, all of the biotic components of the system are limited to gradual changes in gene frequencies over time such that the rate of change in one species can be adapted to by another species. In humans, cultural evolution may have been too fast for other species to respond (Perreault 2012). Thus, another way to understand our socioecological crisis is as a result of our ability to increase carrying capacity (K) by rapid, non-genetic processes. This rapid increase in technology creates an unsustainable situation in which the human system changes faster than the biotic components of the non-human system can co-evolutionarily respond.

By analogy to the genetic definition of evolution (change in gene frequencies over time), consider techno-cultural evolution to be the change in the frequency of use of a specific technology over time. This frequency can change extremely rapidly. For example, the frequency of automobile use in the early twentieth century increased from 10,000 vehicles registered in the U.S. in 1900, to 10 million vehicles 20 years later (Nakicenovic 1986). We hypothesize that societies that heavily employed automobiles might have experienced greater economic and population growth in this period than those that did not, extracted more resources from the environment, and increased their carrying capacity. Thus, in the early twentieth century, the U.S. increased X through a change in technology and this change occurred far more rapidly than the non-human system could respond via coevolution.
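The speed of this change can be made explicit with a short calculation. Assuming, purely for illustration, that registrations grew at a constant compound rate between the two figures quoted above, the implied rate is roughly 41% per year, with a doubling time of about two years. The helper names below are hypothetical.

```python
import math


def annual_growth_rate(start, end, years):
    """Constant compound annual growth rate that takes start to end in years."""
    return (end / start) ** (1.0 / years) - 1.0


def doubling_time(rate):
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2.0) / math.log(1.0 + rate)


# Figures quoted in the text: 10,000 vehicles in 1900, 10 million 20 years later.
rate = annual_growth_rate(start=10_000, end=10_000_000, years=20)
print(f"implied growth rate: {rate:.1%} per year")
print(f"implied doubling time: {doubling_time(rate):.1f} years")
```

A doubling time on the order of two years is far shorter than the generation time of most of the organisms affected, which is the sense in which techno-cultural change outpaces a genetic coevolutionary response.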

This technologically mediated increase in X in the early twentieth century U.S. resulted in increased survival and reproduction in the U.S., and thus in increased population growth. However, as population and affluence grew, new sources of energy were required to support the larger population and its energetic (e.g., economic) expectations. Despite the challenge of maintaining the product of X and E during the twentieth century, the U.S. did so. Several factors contributed to this capability, but as the twentieth century progressed the increasing population increased the number of minds that could be dedicated to increasing X and E. This increased the rate of innovation which increased the rate of energy extraction, which increased the ability of the society to support the population. This positive feedback loop is at the core of the energetic AET hypothesis (Fig. 2).

Fig. 2 Hypothetical positive feedback mechanism in energetic AET which leads to unsustainability
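The feedback loop described above can be sketched as a toy simulation. This is an illustrative model under strong assumptions, not a formal statement of the hypothesis: the function `simulate_feedback`, its functional forms (innovation proportional to population size, logistic growth towards an extraction-determined carrying capacity), and all parameter values are hypothetical.

```python
def simulate_feedback(steps, innovation_per_person, n0=100.0, x0=150.0,
                      energy_per_person=1.0, growth_rate=0.05):
    """Iterate a toy population-innovation-energy loop.

    Returns a list of (population, energy_extraction) tuples, one per step,
    starting with the initial state.
    """
    population, extraction = n0, x0
    history = [(population, extraction)]
    for _ in range(steps):
        # More minds -> faster innovation -> faster growth in extraction (X).
        extraction *= 1.0 + innovation_per_person * population
        # Extraction sets the carrying capacity (K) the group experiences.
        carrying_capacity = extraction / energy_per_person
        # Population grows logistically towards the current carrying capacity.
        population += growth_rate * population * (1.0 - population / carrying_capacity)
        history.append((population, extraction))
    return history


# Hypothetical parameter values, chosen only to contrast the two regimes.
with_feedback = simulate_feedback(steps=100, innovation_per_person=1e-4)
without_feedback = simulate_feedback(steps=100, innovation_per_person=0.0)
print("final population with feedback:   ", round(with_feedback[-1][0]))
print("final population without feedback:", round(without_feedback[-1][0]))
```

Without the innovation term, the population saturates at the fixed carrying capacity; with it, both extraction and population keep growing, and the growth of extraction accelerates as the population rises.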

The maximum power principle

The result of genetic and cultural evolution for increased X is Lotka’s maximum power principle (Lotka 1921, 1922). The maximum power principle states that open systems tend to maximize the rate at which they can absorb useful energy from the environment. Systems do not necessarily maximize energy gain, nor do they minimize energy loss, but instead systems maximize the useable power (energy divided by time) absorbed from the environment (Hall 2004). This is similar to the argument expressed in the preceding sections, because X is a rate (energy per lifetime). Therefore, to say evolution acts to maximize X is to say selection acts to maximize power from the environment, consistent with Lotka’s principle. Note that Odum and Pinkerton (Odum and Pinkerton 1955; Odum 2007) expanded Lotka’s work and their model might draw different conclusions about efficiency than those presented here (Hall et al. 1986). Here, we use the maximum power principle sensu Lotka rather than Odum.

Lotka (1922) writes:

If sources are presented, capable of supplying available energy in excess of that actually being tapped by the entire system of living organisms, then an opportunity is furnished for suitably constituted organisms to enlarge the total energy flux through the system …In every instance considered, natural selection will so operate as to increase the total mass of the organic system, to increase the rate of circulation of matter through the system, and to increase the total energy flux through the system, so long as there is presented a unutilized residue of matter and available energy (147–148)

Thus, Lotka imagines a system which absorbs exergy X from its environment and sustains N subsystems. If X increases, Lotka argues that N will change to take advantage of the increased X. Humans have been able to repeatedly increase X via technology and as a result, we increase N. According to Lotka, we are expected to continue to do so, “so long as there is presented a unutilized residue of matter and available energy.” Unsustainability arises because humans have evolved to employ increasingly sophisticated means of finding new unutilized sources of matter and energy. Empirical tests of the maximum power principle are limited but extant (Hall 2004; Cai et al. 2006; DeLong 2008; Li et al. 2013), as are criticisms (Månsson and McGlade 1993).

Perhaps the most relevant example of the maximum power principle comes from the Neolithic revolution. Bowles (2011) compared the energy efficiency of hunter gatherers to early farmers and found that the efficiency in terms of energy extracted per unit of labor was higher for hunter-gatherers than for farmers (see also Ponting 2007; Smil 2017). This led Bowles to conclude that non-energetic factors explain the transition to agriculture. However, according to Lotka, natural selection favors maximum power (energy per unit time) rather than maximum efficiency (energy per unit input). Early farmers were able to extract more energy from the environment per unit time than hunter gatherers and this explains the success of the Neolithic revolution. That is, in a given year (time), a farming society could extract more energy than a hunter-gatherer society (Ponting 2007; Smil 2017) and could convert this energy to fitness. Thus, the transition to agriculture is consistent with the maximum power principle even while being inconsistent with explanations based on labor or energy efficiency.
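The distinction between efficiency and power can be illustrated with a small worked example. The numbers below are invented for illustration and are not taken from Bowles (2011), Ponting (2007), or Smil (2017); the class name `Subsistence` is likewise hypothetical. The only point is arithmetic: a society with lower energy return per hour of labor can still extract more energy per year if it applies more labor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subsistence:
    name: str
    energy_per_labor_hour: float  # efficiency: energy extracted per hour of labor
    labor_hours_per_year: float   # total labor the society applies in a year

    def power(self) -> float:
        """Energy extracted per year: efficiency times labor applied."""
        return self.energy_per_labor_hour * self.labor_hours_per_year


# Hypothetical values in arbitrary energy units (assumptions, not data).
foragers = Subsistence("hunter-gatherers", energy_per_labor_hour=1500.0,
                       labor_hours_per_year=20_000.0)
farmers = Subsistence("early farmers", energy_per_labor_hour=1000.0,
                      labor_hours_per_year=100_000.0)

for society in (foragers, farmers):
    print(f"{society.name}: efficiency = {society.energy_per_labor_hour:.0f} per hour, "
          f"power = {society.power():.0f} per year")
```

Under these assumed values the foragers are the more efficient group and the farmers the more powerful one, which is the pattern the maximum power principle would favor.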

Lotka stipulated that although natural selection will act to maximize power flow through a system, evolution will not necessarily follow if there is a lack of variance in the phenotypes leading to increased power flow; he called this variance “generating influences”. We propose that human technology and culture are the primary generative influences that allow humans to obey the maximum power principle.

In nonhumans, metabolic systems are highly conserved across species such that the same molecular pathways are present in diverse taxa. Respiration, for example, is similar in plants, fungi and animals. This conservation is due to the lack of generative influence (variance) in respiration efficiency and results in a thermodynamic limit to the efficiency by which one generation can create the next generation. As a result, the evolution of E is limited. Likewise, in non-humans, the evolution of X is limited by coevolution such that every adaptation that increases X in one species acts as a selective pressure on another species.

Using non-somatic energy, humans have freed X and E from these limitations. However, we hypothesize that there is more variation in the ability to capture energy flow into the system than in the efficiency of energy conversion. Energy efficiency is bounded by thermodynamic limits, while energy extraction, because of rapid evolution through culture and technology, is freed from the effects of coevolutionary responses. In other words, it has been easier for the system to evolve towards greater energy extraction than towards greater energy efficiency. For example, we might expect that the rate of increase of global wood extraction will exceed the rate of increase in wood-fired energy efficiency. Similar hypotheses could be generated for other fuels.

The evolution of carrying capacity

Carrying capacity model

Human carrying capacity (K) is not a fixed quantity, but an emergent property of the earth system of which humans are a part (Sayre 2008; Chapman and Byron 2018). That is, K is not exogenous to humans but determined by the technology and behavior of humans interacting with climate, biodiversity, and biogeochemical cycles. Over the past several centuries, humans have repeatedly increased K using technology to increase our extraction of energy and materials from the environment and to more efficiently convert that energy and those materials into fecundity and survival. Thus, selection has acted on individuals and groups to preferentially favor those groups that more rapidly increase the group-experienced K. In other words, carrying capacity is a group-level phenotype that is an important target of selection and, in humans, emerges from the interaction of genes, culture, technology and environment. We hypothesize that if there is heritable cultural or genetic variation in the social ability to increase K, humans will have been selected towards increased K. That is, in the absence of top-down regulation, human populations have been selected to overcome bottom-up regulation via genetic and cultural evolution.

More formally, assume a discrete population growth system in which generation t + 1 is composed only of the offspring of generation t (that is, generation t yields generation t + 1 and dies). In this case

$$N_{t} = \sum F_{t - 1}$$
(2)

where Nt is the population size at time t and F is fitness as defined in Eq. 1. Thus, population size is simply the sum of all individual fitness. Substituting from Eq. 1:

$$N_{t} = \sum X_{t - 1} \times E_{t - 1} .$$
(3)

That is, the population in the next generation is the sum of all of the energy extracted by all members of the population, multiplied by each individual’s conversion efficiency. Assuming the population is at K, Nt equals Nt-1, and K equals Nt. Therefore

$$K = \sum X_{t} \times E_{t} .$$
(4)
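As a concrete reading of Eqs. 3 and 4, the following minimal Python sketch sums individual extraction multiplied by conversion efficiency. The function name and the four-person population are hypothetical and chosen only for illustration; the magnitudes of X and E echo the values used later in the simulation.

```python
def next_generation_size(extraction, efficiency):
    """Eq. 3: population size as the sum over individuals of X * E
    from the previous generation."""
    if len(extraction) != len(efficiency):
        raise ValueError("need one X and one E value per individual")
    return sum(x * e for x, e in zip(extraction, efficiency))


# Hypothetical population of four individuals (illustrative values only).
X = [5e6, 5e6, 6e6, 4e6]      # kcal extracted per generation
E = [4e-7, 4e-7, 4e-7, 5e-7]  # offspring per kcal

n_next = next_generation_size(X, E)
# By Eq. 4 the population is at K only when n_next equals the current size;
# here four individuals leave more than four offspring, so it is still growing.
at_carrying_capacity = abs(n_next - len(X)) < 1e-9
```

In this sketch, raising any individual's X or E raises the sum, which is the sense in which selection on either trait increases K.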

At the individual level, we assume that there is heritable (genetic) variation in either X or E. As selection favors variants with higher X or E, K increases, because K is positively related to both (Eq. 4).

At the group level, we suppose that each group experiences a local K, and a group-level X and E, and that there is heritable, between-group variation in X and E. In this case, groups with higher X or E would increase K, thereby increasing their population size. The increased population size relative to other groups might allow for social dominance, which could increase the rate of spread of the genes and technologies used to increase X.

The ability of a species to shift its carrying capacity is critical. If K is exogenously determined, selection cannot act to increase the extraction of materials and energy from the environment, because this has already been maximized. However, if K is endogenously determined, that is, if a species can determine its K through culture and technology, then we hypothesize that there may be cultural or genetic selection towards increasing K. The increase in K would, in turn, allow the population to expand to this new K, again placing a selective pressure on the population. However, the population is now larger with greater collective cognitive power and even better adapted at increasing K. Thus, we might expect the rate of increase of K to increase over time (Fig. 3; see Meyer and Ausubel (1999)). This is analogous to an increase in the total frequency of mutation in a population as population size increases. As the size of a population increases, the probability of any given beneficial mutation increases in each generation. Likewise, as the size of the human population increases, the probability of innovation also increases as the number of brains that may be dedicated to innovation increases.
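The population-size argument can be illustrated with a standard probability calculation. As a sketch only, assume (our simplification, not a claim of the model) that each of N individuals independently produces a given innovation with a small probability p per generation. The probability that at least one individual does so is then

$$1 - (1 - p)^{N} \approx Np \quad \text{for } Np \ll 1,$$

which rises with N and approaches 1 in large populations. The same expression describes the chance that a given beneficial mutation arises at least once in a generation.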

Fig. 3

Rate of change of K increases as population size increases due to increasingly rapid technological innovation

Carrying capacity simulation

To illustrate the argument, we built a simple simulation population model that tracks population growth, energy extraction and the evolution of carrying capacity. The model is based on a standard discrete population growth model in which energy extraction from the environment replaces carrying capacity. Thus

$$N_{t + 1} = N_{t} + R \times N_{t} \left( {\frac{{X_{{\max}} - X_{t} }}{{X_{{\max}} }}} \right),$$
(5)

where R is the intrinsic growth rate and is equal to F from Eq. 1, Xmax is the energetic carrying capacity of the system and Xt is the total population energy extraction. In other words, Xmax and Xt substitute for K and Nt in a standard discrete population growth model (as in Eqs. 3 and 4). Xmax is determined both by the energy available in the environment and by the technology used to capture that energy. Therefore, Xmax can change via technological innovation. We propose that the change in Xmax is a linear function of population size; however, we also propose that Xmax is more likely to change as Xt approaches Xmax. That is, as the total energy extracted by the population nears the maximum energy available for extraction, it creates a selective pressure to increase Xmax. Therefore, we define the probability of innovation, P, at time t as

$$P_{t} = 0.01N_{t} \times \left( {\frac{{X_{t} }}{{X_{{\max}} }}} \right).$$
(6)

The coefficient 0.01 indicates that we assume a 1% chance of innovation per person per generation. The model generates a random integer between 0 and 100 and compares the random number to Pt. If Pt exceeds the random number, innovation occurs and Xmax increases by 10%; otherwise Xmax stays constant. Changes to Xmax are assumed to be positive and permanent. Substituting Eqs. 1 and 3 into Eq. 5 and solving for Xt yields

$$X_{t} = \frac{{N_{t} X_{{\max}} + N_{t} FX_{{\max}} }}{{X_{{\max}} E + N_{t} F}}$$
(7)

where F and E are as defined in Eq. 1, above. For this illustration, we assume that E is fixed at 4 × 10⁻⁷ offspring per kcal and Xi is fixed at 5 × 10⁶ kcal per generation. These assumptions yield an R (or F) of 2.
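The intermediate algebra behind Eq. 7 can be recovered as follows. This is a sketch that assumes E is the same for all individuals, so that Eq. 3 gives N_{t+1} = X_t E with X_t the population total, and that R = F as stated above. Substituting into Eq. 5 and collecting the terms in X_t:

$$X_{t} E = N_{t} + F N_{t} \left( \frac{X_{\max} - X_{t}}{X_{\max}} \right) \;\Rightarrow\; X_{t} \left( E + \frac{F N_{t}}{X_{\max}} \right) = N_{t} (1 + F) \;\Rightarrow\; X_{t} = \frac{N_{t} X_{\max} + N_{t} F X_{\max}}{X_{\max} E + N_{t} F}.$$

Setting X_t = X_max in the final expression gives N_t = X_max E, so under these assumptions the equilibrium population is the energetic carrying capacity multiplied by the conversion efficiency, in agreement with Eq. 4.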

Results of 10 simulations are depicted in Fig. 4. Unsurprisingly, the energetic carrying capacity, Xmax, increases over time, indicating an increasing load on the environment. However, the rate of increase of Xmax also increases over time. Figure 5 shows the mean Xmax from the simulations depicted in Fig. 4, with a best-fit exponential equation. The fact that the relationship between generation number and Xmax is exponential suggests that the rate of increase in energy extraction will itself increase over time. Unless the biotic ecosystem can co-evolve to limit this increasing energy extraction, an unsustainable situation will result.
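The simulation described above can be sketched in a few lines of Python. This is a minimal reconstruction from Eqs. 3 and 5 to 7, not the authors' code. The initial population (n0), the initial energetic carrying capacity (x_max0), the number of generations, the order of operations within a generation and the use of a seeded generator are all assumptions made for illustration; only E, Xi, the 0.01 coefficient, the 0 to 100 random integer and the 10% increment come from the text.

```python
import random

E = 4e-7     # conversion efficiency, offspring per kcal (from the text)
X_I = 5e6    # per-capita extraction, kcal per generation (from the text)
F = E * X_I  # fitness, used as the intrinsic growth rate R (about 2)


def extraction(n, x_max):
    """Total population energy extraction X_t from Eq. 7."""
    return (n * x_max + n * F * x_max) / (x_max * E + n * F)


def simulate(n0=10.0, x_max0=2.5e9, generations=100, seed=None):
    """Run one realization; n0, x_max0 and generations are assumed values.

    Returns a list of (generation, N_t, X_t, Xmax) tuples, with Xmax
    recorded before any innovation in that generation.
    """
    rng = random.Random(seed)
    n, x_max = n0, x_max0
    history = []
    for t in range(generations):
        x_t = extraction(n, x_max)
        p_t = 0.01 * n * (x_t / x_max)  # Eq. 6
        history.append((t, n, x_t, x_max))
        # Innovation occurs if P_t exceeds a random integer in [0, 100];
        # the increase in Xmax is positive and permanent.
        if p_t > rng.randint(0, 100):
            x_max *= 1.10
        # Eq. 3 with a common E; algebraically equal to Eq. 5 given Eq. 7.
        n = x_t * E
    return history


# Ten independent realizations, mirroring the ten runs reported in the text.
runs = [simulate(seed=s) for s in range(10)]
```

Averaging Xmax across the runs at each generation gives a series to which an exponential curve could then be fitted; the shape of that curve will depend on the assumed initial conditions.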

Fig. 4

Results of a simulation model of the evolution of energetic carrying capacity with technological innovation

Fig. 5

Change in energetic carrying capacity over time

Note that this model does not include top-down factors such as parasites and pathogens that may act to limit human populations. Parasites and pathogens may represent one way in which the biotic system can co-evolutionarily “keep up” with the techno-cultural evolution of the human system.

Applications of energetic AET

Energy and socio-ecological collapse

The model we presented is not intended to explain socio-ecological collapse per se, but rather the human propensity towards unsustainability and socio-ecological crisis, which may lead to socio-ecological collapse. Nonetheless, some of the critiques of the concept of socio-ecological collapse may help clarify our argument. Tainter (2006) argued that the role of ecological change in socio-ecological collapse is historically unsupported. Instead, Tainter attributes the collapse of ancient societies principally to social failures. For example, Tainter ascribes the collapse of the Sumerian population around Ur in the late third millennium B.C.E. to poor irrigation practices combined with a failure of leadership to recognize the worsening agricultural situation. Thus, Tainter might disagree that population growth and environmental energy extraction were the primary causes of the collapse of Sumer. However, phrased in the context of carrying capacity evolution, Tainter is suggesting that the carrying capacity declined due to salinization of the soils, which was in turn a consequence of poor soil management. From our perspective, the decline in K was only possible because K had been inflated by unsustainable irrigation practices. That is, societies use technology to increase Xmax (and thus K) and their populations grow accordingly. In some cases, societies might use this new technology to extract too many resources and the population might overshoot K with negative consequences; deforestation or prey extinction might be examples. In other cases, populations may not overshoot K, but K might decrease below the population size because the system used to increase X was unsustainable (e.g., dryland agriculture in Sumer). In either case, the “ecological” in socio-ecological collapse is critical.

We propose that human populations face two problems related to carrying capacity evolution. First, continued increases in K may become increasingly difficult due to thermodynamic constraints, eventually leading to a situation in which the human population exceeds carrying capacity and the population collapses. Second, increases in K may not be permanent and may depend on unsustainable extraction of energy and embodied energy from the environment. In this case, the population may not overshoot K but may nonetheless exceed K when K declines and experience collapse.

Butzer (2012) proposes socio-ecological collapse begins with economic and fiscal decline over decadal to centennial timescales, coupled with and precipitating economic crises at decadal timescales. Butzer proposes that many of these economic and fiscal conditions are ecologic, for example, declining agricultural productivity and anthropogenic degradation, while others are socio-economic (economic depression; foreign attacks). However, from the perspective of the energetic version of AET described above, nearly all of the preconditions and triggers described by Butzer are energetic and ecologic. This is due to the thermodynamic linkage between economic activity and energy (Georgescu-Roegen 1975; Daly 1991). That is, all economic activity implies the dissipation of energy so that when Butzer identifies “economic depression” as a trigger for socio-ecological collapse, this implies that human societies are no longer able to extract energy from the environment and convert that energy into economic activity as they had in the recent past. Similarly, when Butzer identifies “foreign attacks” as a stimulus for collapse, the energetic-AET model proposed here sees invasion and conflict as a mechanism for extracting increasing quantities of energy from the environment and thereby increasing K.

Eco-modernism and the environmentalist’s paradox

Energetic-AET has implications for the ecomodernism versus technological pessimism debate and the interconnected environmentalist’s paradox. The environmentalist’s paradox is the observation that human wellbeing seems to be improving in the face of declining natural system wellbeing. Raudsepp-Hearne et al. (2010) both defined the environmentalist’s paradox and evaluated four potential explanations for it. The four non-mutually exclusive potential explanations were that: (1) human well-being has been measured improperly; (2) human well-being is dependent only on food-related ecosystem services which have increased; (3) technology has decoupled human well-being from ecosystem services; and (4) time lags have insulated contemporary populations from future declines. While a complete critique of their analysis is beyond the scope of the present article, Raudsepp-Hearne et al. argued that human well-being has not been evaluated improperly but could not reject the other three hypotheses.

Alternatively, energetic-AET proposes that humans have been able to avoid the negative social impacts of environmental perturbation by increasing their energy demand on the environment. Contra explanation (3), technology has not decoupled human dependence on ecosystem services, but shifted which ecosystem services we depend on and increased our dependence on them. Specifically, humans have used energy subsidies in the form of fossil fuels, renewables and nuclear energy to compensate for declines in energy flows from natural systems. For example, humans use energy in natural gas to reduce nitrogen for fertilizer production. Human wellbeing has increased because energy extraction has increased faster than population: per capita primary energy consumption rose from 1336 kg of oil equivalent per person in 1971 to 1920 kg of oil equivalent per person in 2014.
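The size of this per capita change follows directly from the two figures quoted above. As a worked calculation, assuming compound growth over the 43 years between 1971 and 2014:

$$\frac{1920 - 1336}{1336} \approx 0.437, \qquad \left( \frac{1920}{1336} \right)^{1/43} - 1 \approx 0.0085,$$

that is, a rise of roughly 44% in per capita primary energy consumption, or about 0.85% per year.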

Future directions

Energy extraction of competing societies

The suggestion that humans typically live on a continuum of unsustainability as depicted in Fig. 1 may seem inconsistent with the observed sustainability of some indigenous societies (Trosper 2002; Campbell and Butler 2010). Nonetheless, it is a consequence of the present hypothesis. To phrase the energetic AET hypothesis in the language of game theory, we hypothesize that the evolutionarily stable strategy of human societies is a maximized and increasing rate of energy extraction. Cases of sustainability in which societies fail to increase their energy extraction from the environment, as in some indigenous societies, are hence viewed as temporary, evolutionarily unstable strategies that will be outcompeted and replaced by more extractive strategies. This suggests a testable hypothesis: that societies with higher, less sustainable rates of energy extraction have outcompeted those with lower, more sustainable rates of energy extraction. Note that evolution in general, and cultural evolution specifically, is value-neutral (Sommers and Rosenberg 2003), so while it may be troubling to view sustainable cultures as evolutionarily unstable, this does not falsify the hypothesis.

Integration with pathogen evolution

The model discussed here assumes human carrying capacity is controlled by bottom-up factors, specifically energy extraction from the environment (Hopfenberg 2003). As the global SARS Coronavirus-2 pandemic of 2020 demonstrates, humans are also susceptible to top-down population regulation by pathogens, as human pathogens may be one of the few parts of the ecosystem that can evolve as rapidly as human techno-culture. Thus, human carrying capacity may be limited both by energy gain from the environment and by disease.

Future researchers may attempt to integrate disease evolution into the feedback loop, as depicted in Fig. 2. All else equal, as population density increases, pathogen virulence may be expected to increase (van Baalen and Sabelis 1995; Lively 2006; Mennerat et al. 2010; Borovkov et al. 2013), suggesting that pathogens may place a constraint on the feedback loop in Fig. 2. However, the evolution of pathogen virulence is complex and depends on ecological and genetic factors in both the pathogen and the host and the history of interactions between the two (Knolle 1989; Antia et al. 1994; Levin 1996). Furthermore, the germ theory of disease, vaccination, and the development of antibiotics represent techno-cultural changes in how many human societies respond to pathogens, and these changes may be positively associated with population density and energy extraction from the environment. Thus, it may be interesting for future scholars to explore whether pathogens act to limit population growth and energy extraction, or whether technological innovation and population density have acted to remove the ability of pathogens to function as top-down population regulators.

Conclusions

Understanding how societies transition from unsustainable to sustainable states may be one of the most important questions in environmental social science in the twenty-first century. While such research is ongoing, less effort has focused on understanding why human societies are generally unsustainable in the first place. We propose that human societies are prone to unsustainability because they have evolved to maximize their rate of energy extraction from the environment through a multi-level selective process acting on both genetic and cultural heritable variation. Specifically, genetic evolution at the individual level creates behavioral pre-adaptations (Van Vugt et al. 2014) that work with cultural group selection to favor traits and technologies that increase energy extraction per unit time. This increased energy extraction is used to support increased population sizes, and this increased population size further improves the ability of the group to innovate new means of energy extraction, leading to a positive feedback loop that is the ultimate cause of human unsustainability (recall Fig. 2).