Introduction

Humanity needs to reduce greenhouse gas (GHG) emissions by 45% by 2030 to reach net zero emissions by 2050 to prevent the worst effects of climate change (IPCC 2022). Despite a crushing sense of urgency to act, most countries are falling short of their climate targets. A recent analysis examined 180 countries’ national efforts to mitigate climate change, and found that 97% of them are not on track to reach net zero by 2050 (Wolf et al. 2022). In fact, global GHG emissions from fossil fuels have increased by 1% from 2021 to 2022, setting a new record of 37.5 billion tons (4.7 tons per person), making emission reduction even more difficult (“Record-Breaking Carbon Emissions, and More—This Week’s Best Science Graphics,” 2022).

To reduce emissions, we need to address the action gap where collective behavior change is lacking. For example, electric vehicles (EVs) continue to have a low market share in most countries, despite their increasing availability and affordability (Xue et al. 2021). In the U.S., only around 5% of people follow a vegetarian diet and 0.5% are vegan (Kunst 2022); people throw out an average of their own body weight in garbage every month (Hoornweg et al. 2013); and the majority of the public continues to consume fossil-fuel-based energy rather than renewable energy (Kartal et al. 2022), despite robust evidence that these behaviors contribute to anthropogenic climate change (Ivanova et al. 2020). For civic action, only 1% of the U.S. population is participating in a campaign to convince elected officials to take action to reduce global warming (Goldberg et al. 2021).

Changes to people’s behavior, infrastructure, and technology can reduce emissions by 40–80% in industry, food, transport, and building end-use sectors (Creutzig et al. 2022). A recent study estimates that living car-free can reduce GHG emissions by 2 tCO2e per capita per year on average, eating a plant-based diet by 0.9 tCO2e/cap., sharing and consuming services rather than material items by 0.3 tCO2e/cap., and using renewable energy by 1.5 tCO2e/cap. (Ivanova et al. 2020). Collectively, these four actions can reduce emissions by 4.7 tCO2e/cap., which would help reach net zero from the current level of 4.7 tCO2e/cap.
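The total can be checked with a few lines of Python. This is only an arithmetic sketch: the dictionary keys and variable names are illustrative, and the calculation assumes that the four per-capita estimates are independent and simply add up.

```python
# Per-capita reduction estimates (tCO2e per year) as quoted in the text
# from Ivanova et al. (2020). The labels are illustrative shorthand.
reductions = {
    "living car-free": 2.0,
    "plant-based diet": 0.9,
    "sharing and consuming services": 0.3,
    "renewable energy": 1.5,
}

# Current per-capita level quoted in the text (tCO2e per person).
current_per_capita = 4.7

# Assumption: the four reductions are additive and do not overlap.
total_reduction = round(sum(reductions.values()), 1)
remaining = round(current_per_capita - total_reduction, 1)

print(f"Total reduction from the four actions: {total_reduction} tCO2e/cap.")
print(f"Remaining per-capita emissions: {remaining} tCO2e/cap.")
```

If the actions overlap in practice (for example, renewable electricity also powering transport), the combined reduction would be smaller than this simple sum.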

It is important to first recognize the enormous inequality in GHG emissions across countries, where higher-income countries emit disproportionately more than lower-income countries (Kenner 2019). Even within a given country, higher-income individuals emit disproportionately more than lower-income individuals (Feng et al. 2021). This inequality emphasizes the need for equity considerations when promoting climate action across income groups and countries. In this paper, we focus on climate action within the higher-income industrialized context, although proposed solutions may also apply to other contexts.

The big question is: Why are people not taking action to combat climate change? There are several barriers at the individual level and the system level that contribute to a lack of climate action. At the individual level, climate action tends to emphasize the need for personal sacrifice to reduce consumption-related emissions (e.g., drive less, fly less, eat less meat), framing the choice as an agonizing tradeoff between immediate individual well-being and future planetary well-being (Fanning and O’Neill 2019; Wynes et al. 2021). Climate action is often moralized, with undertones of shame and guilt to try to make people feel responsible for their carbon-emitting behavior (Jacquet 2017). In addition, the doom-and-gloom narratives that portray the devastations from climate change can leave people feeling paralyzed, anxious, and afraid (Clayton 2020; Ettinger et al. 2021; Wu et al. 2020). Moreover, climate action, such as taking public transit, purchasing renewable energy, and paying for carbon taxes, is often perceived as inconvenient and costly (Drews and van den Bergh 2016). Beyond cost concerns, cognitive biases (e.g., status quo bias, present bias) can prevent people from switching their current high-emission behavior to low-emission behavior (Luo and Zhao 2021; Zhao and Luo 2021). Finally, tangible rewards for climate action, such as rebates for purchasing EVs, and receiving $0.10 for every recycled bottle, are often rare, infrequent, or too small to meaningfully change behavior (Helveston et al. 2015; Iverson 2020).

At the system level, there is a lack of climate-friendly policy (e.g., subsidies for plant-based foods) and infrastructure (e.g., renewable energy, bike lanes, public transit) (Steg et al. 2022). The current climate-unfriendly policies that remain in place (e.g., government subsidies to fossil fuel and cattle industries) increase the difficulty for people to take climate action (Samarajeewa et al. 2012; Stephens 2014). These system-level barriers are incredibly hard to remove, especially since they are partly driven by strong interest groups from the fossil fuel industry (Stokes 2020). The lack of immediate and effective government action on climate change has caused many people, especially youth, to feel hopeless and depressed (Thompson 2021), which does not encourage individual climate action.

Beyond barriers, there are also enablers at the individual and system levels that contribute to the maintenance of existing high-emission behavior. In industrialized countries at the individual level, driving a gasoline-powered vehicle is still the most convenient mode of transport for most people (EPA 2022); beef, lamb, and dairy attract consumers based on their taste and relative affordability (Stampa et al. 2020); it is often easier to throw away and replace clothes and items rather than repairing them; and single-family homes powered by fossil-fuel-based energy are the most common housing type in developed countries and are viewed as a status symbol (Vestergaard 2006). These enablers are supported by system-level driving forces, such as car-friendly transportation systems, government subsidies to fossil fuel and cattle industries, and waste dumping practices and policies. To reach net zero, we need a complete shift in the incentive structure to enable the collective adoption of low-emission behavior.

Challenges with previous behavioral interventions

Behavioral interventions can help remove the barriers to climate action and remove the enablers of current climate-unfriendly action, with the goal of instigating, spreading, and sustaining behavior change to reduce GHG emissions (Nisa et al. 2019; van der Linden and Goldberg 2020). However, changing human behavior without policy or infrastructure support is often futile. Policy and infrastructure change (e.g., installing bike lanes) should enable individuals to easily adopt low-emission behaviors by providing sufficient incentives and removing the barriers to behavior change. Even so, previous behavioral interventions have not been able to significantly reduce GHG emissions. In past studies, personal values, beliefs, and attitudes have been shown to influence climate action; therefore, many existing interventions have focused on changing values, beliefs, and attitudes about climate change to elicit climate action (Bouman et al. 2021; Prati et al. 2017). Climate anxiety (i.e., feeling worried about climate change) has been shown to predict some, but not all, types of environmental action (Whitmarsh et al. 2022). Yet, values, beliefs, and attitudes take a long time to establish and are often not predictive of actual behavior (Prati et al. 2017). Motivations for engaging in climate-friendly behavior also often differ between socio-economic groups. Higher-income individuals tend to engage in climate-friendly behavior because of environmental concerns, whereas lower-income individuals tend to engage in this behavior because of financial concerns (Lempert et al. 2019).

Recent studies have focused on other ways to nudge climate behavior, such as changing the choice architecture (e.g., making renewable energy the default), using social comparisons (e.g., showing how others are doing), and providing personalized information (e.g., using tailored recommendations), which have shown small to moderate effects (Grilli and Curtis 2021; Mertens et al. 2022; Nisa et al. 2019; van der Linden and Goldberg 2020). Many nudges to promote climate action have not been highly effective. For example, a recent study conducted five large-scale field experiments nudging employees of a large organization to carpool. They sent letters and emails, provided non-cash incentives, and created personalized travel plans, but found no effect of these interventions; the reason is likely due to difficulties in changing habitual behavior, the lack of cultural norms, and the lack of climate-friendly infrastructure (Kristal and Whillans 2019). Moreover, the longevity of intervention effects is not well understood, since most studies do not track long-term behavior change beyond a few weeks.

Even if desirable behavior change does occur, it does not mean that the behavior change will result in the intended benefits (e.g., reduced GHG emissions) because of potential negative spillover. Negative spillover is when an intervention leads to a change in the target behavior but decreases the likelihood of engaging in a subsequent behavior, which can cancel out the overall effects (Truelove et al. 2014). For example, a nudge that makes renewable energy the default may conversely reduce public support for a carbon tax policy (Hagmann et al. 2019). Another example is new consumers of cheaper, more efficient solar-powered electricity knowingly or unknowingly increasing their energy consumption (Reimers et al. 2021). By some measures, negative spillover can reduce efficiency gains by 60–100% across the economy (Ruzzenenti et al. 2019). However, spillovers are not always negative. Positive spillover occurs when an intervention leads to a change in the target behavior and also increases the likelihood of engaging in a subsequent behavior. An example is that a small fee ($0.06) reduced the use of single-use plastic bags and increased the use of reusable bags, and also increased people’s support for other environmental policies (Thomas et al. 2019).

To date, there is no consensus on when or why spillovers occur. Only a few existing frameworks have been proposed to explain spillovers (Carrico 2021; Nilsson et al. 2017; Truelove et al. 2014). These frameworks suggest that negative affect-based decisions tend to produce negative spillovers, whereas role-based decisions that enhance environmental identity tend to produce positive spillovers (Truelove et al. 2014). Interventions targeting intrinsic motivations or similar behavior tend to produce positive spillovers (Maki et al. 2019). Interventions supporting personal autonomy, with an explicit rationale explaining why the behavior is important, and addressing normative goals (environmental protection) or personal gain goals (financial savings) tend to produce positive spillovers (Geiger et al. 2021). Positive spillovers are likely due to environmental identity, a desire for consistency, and self-efficacy beliefs; in contrast, negative spillovers are likely due to moral licensing and rebound effect (Carrico 2021). However, the evidentiary basis for these frameworks has been weak, since multiple recent meta-analyses suggest that there are no consistent overall spillovers across pro-environmental behavior or intentions (Carrico 2021; Geiger et al. 2021; Maki et al. 2019; Truelove et al. 2014).

The challenge of changing behavior, mitigating negative spillover, and promoting positive spillover to address climate change has proved to be an arduous task. We believe that past behavioral interventions have shown limited efficacy because they have not considered the fundamental principles of behavior change from an operant conditioning perspective. Previous interventions have primarily focused on the determinants of a target behavior (van Valkengoed et al. 2022), by removing the restraining forces or increasing the driving forces of that behavior (Lewin 1939). However, rarely has any intervention examined what happens after the behavior, despite the consequence of the behavior playing a critical role in shaping that behavior.

To close the action gap and reduce emissions, operant conditioning can be particularly useful as one set of tools to promote climate action. It involves using reinforcement or punishment that follows the behavior to influence that behavior (Burchard & Tyler 1964; Harper 1975; Skinner 1963). Operant conditioning principles have been used over the past century to effectively change behavior in a large number of domains, including healthy eating (Normand et al. 2021), smoking cessation (Roll et al. 1996), reducing energy and water consumption (Agras et al. 1980; Bekker et al. 2010), improving sports performance (Schenk and Miltenberger 2019), and reducing procrastination (Perrin et al. 2011). In fact, many industries such as gambling and social media have used operant conditioning principles successfully to reap profits by promoting user behavior that can be harmful to the users themselves (e.g., to increase the use of slot machines, or user engagement on social media) (Ceylan et al. 2023). Our goal in this paper is to use operant conditioning principles ethically to promote positive behavior change to increase human and planetary well-being.

Here we propose a framework using operant conditioning principles to encourage low-emission behavior and to discourage high-emission behavior. In particular, we propose a specific behavior change technique called differential reinforcement of alternative behavior (DRA) which, in the context of environmental sustainability, involves reinforcing low-emission behavior while extinguishing high-emission behavior, with the goal of replacing the high-emission behavior with the low-emission behavior (Petscher et al. 2009). We describe how DRA can be implemented in the domains of transportation, food, waste, housing, and civic actions. The reason to focus on behavior in these domains is that they tend to have the largest carbon reduction potentials to reach net zero emissions (Ivanova et al. 2020). The principles underlying this framework can be universal across cultures (Sigler et al. 2015); however, their application may vary across different groups and socio-economic contexts. This framework also offers new insights to understand spillovers, and it suggests strategies to mitigate negative spillover and promote positive spillover.

An operant conditioning framework for climate action

Operant conditioning is a form of associative learning about the relationship between a consequence and a behavior. Behavior refers to both covert internal behavior, such as decision making, and overt, externally observable behavior, such as executing a decision. The premise of operant conditioning is that behavior is modified by its consequences (Thorndike 1898). Specifically, reinforcement is when a consequence increases a behavior, and punishment is when a consequence decreases a behavior (Skinner 1963). This framework aims to reinforce the behaviors themselves rather than the outcomes. This means that reinforcement occurs following a target behavior (e.g., biking), not the outcome of the behavior (e.g., reduced emissions). This is because focusing on outcomes rather than the behavior itself can result in maladaptive behavior (e.g., cheating). This is why most interventions reward healthy behavior (e.g., exercising, healthy food choices) that leads to weight loss instead of rewarding how many pounds are lost (Staiano et al. 2017). In our context, the proposed interventions are designed to reward the behaviors that lead to reduced emissions, instead of rewarding reduced emissions per se.

Reinforcement

There are two types of reinforcement: positive and negative. Positive reinforcement is when a consequence is added following a behavior that increases the likelihood of that behavior occurring in the future. Negative reinforcement is when a consequence is removed following a behavior that increases the likelihood of that behavior occurring in the future (Skinner 1963). “Positive” and “negative” are not value-laden terms; rather, they refer to the addition or removal of a consequence, respectively. It is also important to distinguish between rewards and reinforcers. A reward is a consequence perceived to be of positive value, such as natural rewards (e.g., sugar), financial rewards (e.g., money), social rewards (e.g., praise), and symbolic rewards (e.g., stars). A reinforcer is a consequence that functions to increase the desired behavior in the future, regardless of perceived value. A reward is not a reinforcer if it does not increase the behavior (Skinner 1963). Since we do not know whether a reward will function as a reinforcer until its effect on the behavior is known, we will use reward and reinforcer interchangeably throughout the paper.

The type of reward can influence the efficacy of reinforcement. Which reward works best is highly idiosyncratic, depending on the preferences and experiences of the individual. Financial rewards include cash, gift cards, or any monetary reward. Studies have found that financial incentives can promote the initial behavior change more effectively than social rewards (Demurie et al. 2011; Wang et al. 2020), but are often ineffective in maintaining long-term behavior change (Dickinson 1989). Natural rewards occur automatically, such as sugar and endocannabinoids, and are effective in maintaining a new behavior over time (Bradley-Johnson 1997). Social rewards involve social approval, recognition, or connection, such as praise, smiling faces, and spending time with friends, and are often more effective for changing and sustaining the behavior than financial rewards (Handgraaf et al. 2013). Symbolic rewards do not have any inherent value but represent something else that is rewarding, such as tokens and gold stars, and can be as effective as financial rewards (Lossin et al. 2016). Intrinsic rewards are internal, such as a sense of satisfaction and achievement, and have been shown to motivate behavior change even when no external rewards are present (Deci 1976).

Reinforcement can help remove the individual-level and system-level barriers mentioned earlier. For example, some climate actions can feel punishing because they involve substantial upfront cost or effort (e.g., purchasing renewable energy, taking public transit). Positive reinforcement can use a variety of rewards (e.g., financial, social, symbolic, natural) to reduce or remove these barriers to encourage low-emission behavior. Negative reinforcement can also promote climate action by reducing feelings of guilt, shame, and anxiety (predicated, of course, on someone experiencing these feelings) (Whitmarsh et al. 2022).

There are different schedules of reinforcement (Fig. 1). A schedule of reinforcement is defined as the response requirement to produce reinforcement (Ferster and Skinner 1957). A response can be reinforced based on the time elapsed since the preceding reinforcement (i.e., interval), or based on the number of responses required to obtain reinforcement (i.e., ratio). A schedule may be fixed (i.e., unchanging) or variable. As a result, there are four basic schedules of reinforcement: fixed-interval, fixed-ratio, variable-interval, and variable-ratio (Ferster and Skinner 1957). Continuous reinforcement (i.e., a fixed-ratio schedule where each instance of a behavior is reinforced) is often best when establishing a new behavior because it allows people to associate the new behavior with the reinforcer quickly (Ferster and Skinner 1957; Schoenfeld et al. 1956). A variable-ratio schedule (i.e., an individual receives a reinforcer after a variable number of behavioral responses) tends to be the most effective in sustaining the new behavior over time (Ferster and Skinner 1957) and is the most resistant to extinction (i.e., when the behavior is no longer reinforced and subsequently diminishes) (Bijou 1957; Morgan 2010). In general, variable schedules (i.e., reinforcement is delivered unpredictably) tend to be more effective than fixed schedules (i.e., reinforcement is delivered predictably) for long-term persistence of behavior, and ratio schedules (i.e., reinforcement is contingent on the number of behavioral responses) tend to be more effective than interval schedules (i.e., reinforcement is contingent on the behavioral response after a certain amount of time) for producing higher rates of behavior (Ferster and Skinner 1957; Morgan 2010).
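The four basic schedules, with continuous reinforcement as a special case, can be sketched as small Python functions that decide whether a given response is reinforced. This is an illustrative sketch rather than an implementation taken from the cited literature: the function names are hypothetical, variable requirements are assumed to be drawn from a uniform distribution around the stated mean, and responses are assumed to arrive at known times.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response (n=1 is continuous reinforcement)."""
    count = 0

    def schedule(t):  # t (response time) is unused by ratio schedules
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return schedule


def variable_ratio(mean_n, rng):
    """Reinforce after a random number of responses averaging mean_n."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)  # assumed uniform requirement

    def schedule(t):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return schedule


def fixed_interval(duration):
    """Reinforce the first response at least `duration` after the last reinforcer."""
    last = 0.0

    def schedule(t):
        nonlocal last
        if t - last >= duration:
            last = t
            return True
        return False

    return schedule


def variable_interval(mean_duration, rng):
    """Like fixed_interval, but the required wait varies around mean_duration."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_duration)  # assumed uniform wait

    def schedule(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_duration)
            return True
        return False

    return schedule


def count_reinforcers(schedule, response_times):
    """Count how many of the responses are reinforced under a schedule."""
    return sum(schedule(t) for t in response_times)


# Hypothetical scenario: 100 responses (e.g., bus rides), one per time unit.
rides = range(1, 101)
continuous = count_reinforcers(fixed_ratio(1), rides)
fr5 = count_reinforcers(fixed_ratio(5), rides)
fi10 = count_reinforcers(fixed_interval(10), rides)
vr5 = count_reinforcers(variable_ratio(5, random.Random(0)), rides)
vi10 = count_reinforcers(variable_interval(10, random.Random(0)), rides)
```

In this sketch, the fixed schedules deliver reinforcers at predictable points, whereas under the variable schedules the person cannot tell which response will be reinforced, which is the property the text associates with greater persistence of behavior.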

Fig. 1 Schedules of reinforcement and punishment

Punishment

While reinforcement can encourage low-emission behavior, it does not directly discourage high-emission behavior. To discourage high-emission behavior, punishment may be needed. There are two types of punishment: positive punishment and negative punishment. Positive punishment occurs when a consequence is added following a behavior that decreases the likelihood of that behavior occurring in the future, such as social punishments like disapproval and reprimands. Negative punishment (we will use penalty instead of negative punishment throughout this paper to avoid confusion between positive and negative punishment) occurs when something is removed following a behavior that decreases the likelihood of that behavior occurring in the future, such as money removed through fines (Daniels 2016). Extinction is similar to a penalty: the reinforcer that is maintaining a previous behavior is removed, and the behavior decreases as a result (Petscher and Bailey 2008). However, a key difference between extinction and a penalty is that extinction removes the reinforcer of the previous behavior, whereas a penalty removes something positive, such as money, and does not need to be associated with the previous behavior. It is important for punishment to follow a continuous schedule when possible, because if it is not continuous (i.e., an individual is only sometimes punished for an undesirable behavior), it can become variable reinforcement for the undesirable behavior, which becomes harder to extinguish in the future (Bijou 1957; Schoenfeld et al. 1956).
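The definitions of reinforcement and punishment given so far form a two-by-two classification: a consequence is either added or removed, and the behavior either increases or decreases. The following Python sketch encodes that classification as a lookup table. The function name and the string labels are hypothetical, and extinction is deliberately left out because it concerns the removal of the specific reinforcer that maintains a behavior rather than a new consequence.

```python
# Two-by-two classification of consequences, following the definitions in
# the text. Keys: (what happens to the consequence, what happens to behavior).
CONSEQUENCE_TYPES = {
    ("added", "increases"): "positive reinforcement",
    ("removed", "increases"): "negative reinforcement",
    ("added", "decreases"): "positive punishment",
    ("removed", "decreases"): "negative punishment (penalty)",
}


def classify_consequence(stimulus_change, behavior_change):
    """Return the operant term for a consequence.

    stimulus_change: "added" or "removed"
    behavior_change: "increases" or "decreases"
    """
    key = (stimulus_change, behavior_change)
    if key not in CONSEQUENCE_TYPES:
        raise ValueError(f"Unknown combination: {key}")
    return CONSEQUENCE_TYPES[key]


# Illustrative examples drawn from the text:
praise_for_riding = classify_consequence("added", "increases")
relief_from_guilt = classify_consequence("removed", "increases")
social_disapproval = classify_consequence("added", "decreases")
monetary_fine = classify_consequence("removed", "decreases")
```

Note that the classification depends on the observed effect on behavior: praise counts as positive reinforcement only if the praised behavior actually increases.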

Although the use of punishment can quickly decrease unwanted behavior, it can also produce unintended negative side effects such as response substitution, response facilitation, generalized suppression, punishment contrast, resentment, escape, avoidance, and concealment (Newsom et al. 1983; Skinner 1971; Solomon 1964). Response substitution occurs when the undesirable behavior is punished, and other undesirable behaviors increase instead. Response facilitation occurs when the punishment that normally reduces behavior results in an increase in the behavior instead, therefore acting as a reinforcer. This often occurs when the punishment is relatively weak, for example, when the monetary fine is too small. Generalized suppression occurs when other desirable behaviors also decrease as a result of the punishment of an undesirable behavior. Punishment contrast occurs when a behavior is punished in one situation and leads to an increase in the behavior in other situations where punishment is not administered.

To limit the negative side effects of punishment, punishment should always be administered in conjunction with positive reinforcement to promote the desired behavior (Newsom et al. 1983; Van Houten and Doleys 1983). Moreover, penalty is preferable to positive punishment, as penalty tends to result in fewer negative side effects (Poling et al. 2002). However, despite the potential negative side effects, punishment alone is one of the most commonly used interventions to promote climate action today (e.g., carbon tax, single-use item fees, shaming narratives) (Aghion et al. 2012; Gneezy and Rustichini 2000; Jacquet 2017). It is perhaps no surprise that many people have grown to resent the climate movement (resentment), vote against parties who support carbon taxes (escape), and even outright deny the existence of climate change (avoidance) (Kondo et al. 2019; Lachapelle and Kiss 2019; Newsom et al. 1983; Norgaard 2011). Businesses have also reacted negatively to punishment by moving to areas with fewer environmental restrictions (escape) and greenwashing (i.e., misleading consumers about their environmental performance, concealment) (Delmas and Burbano 2011; Leonard 2006; Newsom et al. 1983; Skinner 1971). If we hope to elicit widespread climate action, it is imperative that when punishment is used, we strive to pair it with positive reinforcement.

Differential reinforcement of alternative behavior

Within the context of environmental sustainability, differential reinforcement of alternative behavior (DRA) involves reinforcing low-emission behavior while extinguishing high-emission behavior, with the goal of replacing the high-emission behavior with the low-emission behavior. Although DRA has not yet been tested in the domain of climate change, DRA has been effective in changing a plethora of behaviors in other domains, from minor issues such as increasing appropriate classroom behavior, to severe problems such as reducing aggressive and self-injurious behavior (MacNaul and Neely 2018; Petscher et al. 2009).

DRA has many advantages over reinforcement or punishment alone. A review of empirical support for DRA suggests that DRA leads to long-term behavior change over at least one year (Petscher et al. 2009). DRA can reduce or eliminate the negative side effects associated with punishment and be effective for behavior change even if punishment is not implemented continuously because of the reinforcement of the alternative behavior (Petscher et al. 2009). In two recent reviews, DRA has been found to produce positive collateral changes, such as increased self-care and social behavior (Scotti et al. 1991), less stress, and increased attention to tasks (Petscher et al. 2009). Finally, the general public shows greater support for DRA over punishment because of the opportunity for reward (Petscher and Bailey 2008).

The switch to low-emission behavior does not necessarily require punishing high-emission behavior. In a systematic review, 90% of the reinforcement interventions significantly reduced the undesirable behavior without the use of extinction or punishment (MacNaul and Neely 2018). For example, Athens and Vollmer (2010) increased the duration and quality of reinforcement and decreased the delay of the reward to favor the desirable behavior, and the effect size was largest when several dimensions of reinforcement were combined. This suggests that the key to lowering emissions is to provide greater reinforcement for the low-emission behavior than for the high-emission behavior (Vollmer et al. 2020).

Factors to consider

Several factors can influence the effectiveness of a DRA intervention, such as immediacy, satiation and deprivation, and the magnitude of the reinforcer (Pierce et al. 1986; Powell et al. 2016). One important factor that is often neglected in climate incentive programs is the immediacy of the reinforcement. For an intervention to be effective, reinforcement should usually be delivered immediately following the behavior. Reinforcement delivered with a delay can reduce the impact on the behavior (Chung and Herrnstein 1967). Many current incentives (e.g., EV rebates) are provided weeks or months following the initial behavior, which can weaken the effect of the reinforcement (Yang et al. 2016). Other rebates (e.g., carbon tax rebates) are automatically processed in the tax system months later, which means that people may not remember what the rebate is for (Rivers and Shaffer 2022). It is imperative that reinforcement occurs as close to the target behavior as possible for maximum effectiveness (Chung and Herrnstein 1967).

Satiation is when the impact of the reinforcer is reduced if the individual perceives that they already have enough of the reinforcer being offered, and deprivation is when the impact of the reinforcer is strengthened if the individual perceives that they are deprived of the reinforcer being offered (Pierce et al. 1986). This means that the same financial incentive from a climate rebate may be less attractive to higher-income individuals than to lower-income individuals, because of the relative magnitude of the incentive. The magnitude of the reinforcer also correlates with the effectiveness (Wang et al. 2017). This means that the small financial rewards for recycling bottles or bringing reusable bags may not be sufficiently incentivizing for most people. Continuous reinforcement could lead to satiation which can reduce the effectiveness of the intervention, whereas satiation is less likely to occur when using a variable schedule of reinforcement.

Generalization is another factor to consider when designing interventions. It occurs when a behavior change has lasted over time, occurred in many environments, or spread to related behavior (Arnold-Saritepe et al. 2009). One type of generalization is response generalization, when the individual spontaneously engages in a new behavior that is functionally equivalent (i.e., serves the same function) to the target behavior (Arnold-Saritepe et al. 2009). Generalization can help inform our understanding of spillovers in climate action. While most studies on pro-environmental behavior interventions examine spillovers as an unintended consequence of the intervention (Geiger et al. 2021; Maki et al. 2019), here we view spillovers as response generalization that can be incorporated into the intervention.

Value formation may occur when rewards are framed in terms of environmental benefits of the prior behavior. Previous research suggests that behavior change can influence the formation of new values and attitudes (Sussman and Gifford 2019). When rewards highlight the environmental benefits of prior behavior (e.g., praising someone for biking to work because it reduces emissions from driving), the reward may not only reinforce the prior behavior, but also build an association between the behavior and its environmental benefits, which may foster pro-environmental attitudes, values, or identities through internalization. Any subsequent behavior that is consistent with the pro-environmental value can serve as intrinsic reward for that behavior (e.g., warm glow) (Taufik et al. 2015). While previous studies have argued for value change to increase climate action (Bouman et al. 2021; Prati et al. 2017), here we suggest that reinforcing climate action may help form environmental values and attitudes.

In what follows, we propose specific individual-level and system-level interventions to encourage and sustain low-emission behavior while discouraging high-emission behavior, in the domains of transportation, food, waste, housing, and civic action (Fig. 2). Since different rewards may function as reinforcers for different behavior, we will provide examples of a variety of rewards in each domain. The diversity of examples is intended to highlight the broad range of potential interventions that could be designed using the DRA-based operant conditioning framework. However, an important next step will be to determine which of these interventions are likely to be the most impactful and should be prioritized.

Fig. 2 Examples of positive reinforcement to encourage low-emission behavior and penalty to discourage high-emission behavior in transportation, food, waste, and housing domains

Transportation behavior

Increasing low-emission transportation behavior such as biking, taking public transit, driving EVs, and carpooling can greatly reduce GHG emissions. At the system level, reducing the costs or effort associated with low-emission transportation behavior can involve installing bike lanes and making the public transit system safer. Once it becomes feasible and safe for people to bike and take public transit, a continuous reinforcement schedule can be used to establish the new bus-taking or subway-taking behavior by providing positive reinforcement with financial rewards (e.g., a free bus pass for a month for new bus riders) (Gravert and Olsson Collentine 2021), or providing praise every time they board the bus or subway as a social reward (e.g., “Thank you for riding the bus, you are helping save the planet!”).

To sustain the new behavior, a variable ratio schedule can be used by providing a free ticket to passengers as a financial reward after an unpredictable number of rides. A variable interval schedule can be used to provide praise at random intervals on the ticket machine as a social reward. Additionally, biking offers natural rewards through physical exercise and the euphoric feeling it produces (Sparling et al. 2003). Charging stations for EVs, and bike or car rental stations, could be set up like slot machines to provide rewards for driving EVs or renting bikes or cars. For example, the station could generate a reward at random intervals (e.g., a gift card). Lotteries may also serve as positive reinforcement for low-emission transportation. For example, a company-based lottery in which employees were rewarded for decreasing their average miles driven per day successfully reduced the average daily miles driven by those in the lottery condition by 11.6%, while those without the lottery increased their average daily miles by 21.2% (Foxx and Schaeffer 1981). Carpooling can also be positively reinforcing as it provides the social reward of spending time with friends or family in the car, but these social rewards are likely insufficient to overcome barriers such as the difficulty of aligning people’s schedules (Kristal and Whillans 2019). Other forms of positive reinforcement (e.g., a free delicious meal for everyone who carpooled to work) can be used to encourage more people to begin carpooling. For instance, one study provided reserved parking spots and 25-cent coupons to those who carpooled to a university campus and found that this intervention significantly increased the rate of carpooling compared to a control lot (Jacobs et al. 1982). Working from home or videoconferencing may also result in negative reinforcement if it decreases negative emotions associated with the time and monetary cost of travel.
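To make the distinction between the schedules named above concrete, the following is a minimal illustrative sketch in Python. It is not part of the framework itself: the function names, the use of a fixed per-response probability to approximate a variable ratio schedule, and the exponentially distributed waiting times for the variable interval schedule are all assumptions made for illustration. Each function returns a rule that decides whether a given response (e.g., one bus ride) is rewarded.

```python
import random


def continuous():
    """Continuous reinforcement: every response is rewarded."""
    return lambda: True


def fixed_ratio(n):
    """Fixed ratio: every n-th response is rewarded (e.g., a stamp card)."""
    count = 0

    def on_response():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return on_response


def variable_ratio(mean_ratio, rng):
    """Variable ratio (approximation, assumed here): each response is
    rewarded with probability 1 / mean_ratio, so the number of responses
    between rewards is unpredictable but averages mean_ratio."""
    return lambda: rng.random() < 1.0 / mean_ratio


def variable_interval(mean_interval, rng):
    """Variable interval: the first response after an unpredictable amount
    of time has elapsed is rewarded. Waiting times are drawn from an
    exponential distribution (an assumption for this sketch)."""
    next_available = rng.expovariate(1.0 / mean_interval)

    def on_response(now):
        nonlocal next_available
        if now >= next_available:
            next_available = now + rng.expovariate(1.0 / mean_interval)
            return True
        return False

    return on_response


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the illustration is repeatable
    rides = 10_000
    free_ticket = variable_ratio(10, rng)   # about 1 free ticket per 10 rides
    praise = variable_interval(5.0, rng)    # praise roughly every 5 time units
    tickets = sum(free_ticket() for _ in range(rides))
    praises = sum(praise(float(t)) for t in range(rides))
    print(f"free tickets: {tickets}, praise messages: {praises}")
```

The key design difference is what each rule counts: ratio schedules depend on the number of responses, whereas interval schedules depend on elapsed time, so riding more often earns more free tickets under the variable ratio rule but not proportionally more praise under the variable interval rule.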

Alongside reinforcement, punishment can be used to discourage high-emission behavior and encourage people to switch modes of transport. One form of penalty is adding a carbon tax to gasoline or diesel, or adding congestion fees to reduce driving (Aghion et al. 2012). However, there are negative side effects of punishment that can decrease its efficacy. For example, one requirement of punishment is that it has to be sufficiently large at the outset to discourage the undesirable behavior. If it is not large enough, it gives people time to cope with the punishment rather than change their behavior, or it may even increase the undesirable behavior (i.e., response facilitation) (Newsom et al. 1983; Powell et al. 2016). Many carbon tax policies keep the tax relatively small to avoid provoking public outrage, and then slowly increase it over time (Harrison 2013). This strategy is unlikely to decrease driving unless the cost is substantially increased. Extinction is preferred over punishment to limit potential negative side effects, for example, by removing government subsidies to the fossil fuel industry and reallocating them to renewable energy industries as a form of differential reinforcement. To encourage low-emission transportation behavior, governments can redirect carbon tax revenue, along with funding that was budgeted for new parking structures and highways, toward improving public transit (e.g., building rapid transit).

Food behavior

Encouraging higher consumption of plant-based food can help reduce GHG emissions (Xu et al. 2021). At the system level, governments can provide subsidies to the plant-based food industry as positive reinforcement to businesses. Businesses, in turn, can make plant-based food more tasty, nutritious, and affordable, and offer rewards (e.g., free meals) as positive reinforcement to consumers, removing the barriers of cost and the less appealing taste of some plant-based food. For example, restaurants can offer discounted but delicious plant-based meals to attract consumers first, and then use a reward program that allows people to receive a random plant-based meal for free as a variable ratio schedule of reinforcement. The reward program itself acts as a symbolic reward, the free plant-based meal is a financial reward, and the tastiness of the meal is a natural reward. Stamp cards can also work as a fixed ratio schedule of reinforcement that provides every nth meal for free. The response required to obtain plant-based food can also be decreased by removing the barriers of inconvenience or lack of availability of plant-based food. For example, restaurants and grocery stores can make plant-based food more readily available and easier to access for consumers. Recently, 11 public hospitals in New York City have made plant-based meals the default primary dinner option for inpatients (Mayor Adams & NYC Health + Hospitals Announce Successful Rollout and Expansion of Plant-Based Meals as Primary Option for Patients in NYC Public Hospitals, 2022). Making plant-based meals the default can decrease the effort involved in food decision-making for people who want to eat plant-based food, but may be punishing for people who want to eat meat because the effort of ordering a meat-based meal is increased.

High-emission food behavior (e.g., eating beef, lamb, and dairy products) can be discouraged with extinction or punishment, at the same time as using reinforcement to encourage people to switch to a plant-based diet. Extinction can involve removing government subsidies to the cattle industry, while punishment can involve adding a meat tax to beef, lamb, and dairy products to make them more expensive than plant-based alternatives. These subsidies and the revenue from the meat tax can be reallocated to the plant-based food industry as a form of differential reinforcement. That said, punishment should be used with caution to avoid exacerbating the existing food insecurity problems in certain communities (Hasegawa et al. 2018).

Waste behavior

Reducing consumer waste is an important step toward reducing GHG emissions. Low-emission behavior includes reducing food waste, repairing or donating clothes and technology products, and reusing and recycling items. At the system level, policies can be enacted to incentivize companies to upcycle or donate food, clothing, and consumer products instead of throwing them away. Right-to-repair policies can be set up to support businesses and manufacturers in offering repair services that are easy to access and affordable for consumers. To encourage waste reduction behavior at the individual level, positive reinforcement can include providing financial rewards for using meal planning services to reduce food waste, creating tasty and attractive dishes using leftover ingredients or food products that are about to be thrown away, and providing sufficiently large financial rewards for repairing, reusing, or recycling personal items. Several studies have found that positive reinforcement (e.g., rewarding those who pick up and turn in litter) results in the highest rates of cleaning up litter and the most improvement in the appearance of previously littered areas (Gelino et al. 2021; Kohlenberg and Phillips 1973).

There are at least two reasons to provide sufficiently large financial rewards. First, they can prevent the crowding out of intrinsic motivation that small financial incentives can cause. Crowding out effects occur when people experience a decrease in intrinsic motivation for a behavior after being offered too small a financial reward for engaging in the behavior (Dickinson 1989; Gneezy and Rustichini 2000). Second, small incentives may not be sufficiently motivating for people to engage in the behavior. For example, some recycling policies provide $0.05 or $0.10 for each bottle returned (Iverson 2020), which follows a continuous reinforcement schedule, but the amount may be too small to motivate most people. A more effective intervention is to change this policy to a variable ratio schedule that provides a larger financial reward after a variable number of bottles returned (e.g., instead of receiving $0.10 per bottle, there is a 1% chance of getting $10 per bottle). Making rewards uncertain has been shown to increase the frequency of a repetitive behavior, even when the certain reward is larger in magnitude (Tversky and Kahneman 1992). This reinforcing-uncertainty effect is consistent with the fourfold pattern of risk preference, where people prefer a small chance to win a large reward over getting a guaranteed small reward (Sholanke and Gutberlet 2022). The variable ratio schedule should complement rather than replace the continuous reinforcement schedule, since some individuals rely on the certain rewards from bottle returns for their livelihood (DiGiacomo et al. 2018). Another positive reinforcement intervention may be to encourage people to get together with friends to swap used items, which can serve as a social reward (social interactions) and a financial reward (a free item). Reducing the effort and time involved in accessing these services (e.g., repair, recycle, upcycle) can also lead to waste reduction (DiGiacomo et al. 2018; Lempert et al. 2019).
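The bottle-return example above can be checked with simple arithmetic: a 1% chance of $10 per bottle has the same expected value as a certain $0.10 per bottle (0.01 × $10 = $0.10), so the variable ratio schedule changes how the reward is delivered rather than its average cost. The short Python sketch below illustrates this under stated assumptions; the function names are hypothetical, and treating each bottle as an independent 1% draw is a simplification for illustration, not a description of any actual deposit program.

```python
import random


def expected_payout_per_bottle(prize, chance):
    """Expected payout per bottle when each bottle wins `prize` with
    probability `chance` (assumed independent across bottles)."""
    return prize * chance


def continuous_payout(bottles, per_bottle=0.10):
    """Total payout under continuous reinforcement: a fixed amount per bottle."""
    return bottles * per_bottle


def variable_ratio_payout(bottles, prize=10.0, chance=0.01, rng=None):
    """Simulated total payout when each bottle has a small chance of a
    larger prize. `rng` can be a seeded random.Random for repeatability."""
    rng = rng or random.Random()
    wins = sum(1 for _ in range(bottles) if rng.random() < chance)
    return wins * prize


if __name__ == "__main__":
    bottles = 100_000
    rng = random.Random(42)  # seeded so the illustration is repeatable
    certain = continuous_payout(bottles)
    uncertain = variable_ratio_payout(bottles, rng=rng)
    print(f"expected per bottle (uncertain): "
          f"${expected_payout_per_bottle(10.0, 0.01):.2f}")
    print(f"total, certain schedule:   ${certain:,.2f}")
    print(f"total, uncertain schedule: ${uncertain:,.2f}")
```

Over many bottles the two totals converge, but for a person returning only a few bottles the uncertain schedule will often pay nothing, which is one way to see why the text recommends that it complement rather than replace the certain reward.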

To discourage high-emission waste behavior such as dumping waste in landfills, we can increase the response requirement for doing so. This can involve removing the convenience of garbage disposal by reducing the number of garbage bins or putting them in inconvenient places. As for punishment, many cities impose a small fee for single-use items (e.g., plastic bags) (Kish 2018), but these fees can be increased to more effectively discourage the use of single-use items. An additional penalty could involve imposing a fine for excessive waste dumping. The revenue from the fees can be used as incentives for upcycling and repair services. However, punishment alone may backfire (e.g., public outrage, contamination in recycling and compost bins) (Katz and Lattal 2021), so it is important to use punishment selectively and strategically in tandem with reinforcement.

Housing behavior

Living in an attached home (e.g., apartments, townhomes) with renewable energy can greatly decrease GHG emissions compared to living in detached houses with fossil fuel-based energy (Ivanova et al. 2020). At the system level, housing policies can increase property taxes for single detached houses, reduce property taxes for attached houses, and make renewable energy and heat pumps the default in newly constructed buildings. Governments can incentivize the switch to renewable energy by providing substantial subsidies. At the individual level, the decision of choosing where to live and what energy to use is rare compared to decisions around transportation, food, and waste behavior. Thus, there are fewer interventions for reinforcing housing behavior, and they are mostly constrained to continuous schedules of reinforcement or punishment. For example, to encourage people to live in attached homes or to choose renewable energy, positive reinforcement such as large financial incentives could be helpful (e.g., a $10,000 moving bonus, a $500 bonus to offset an electricity bill when a household switches to renewable energy). The likelihood of someone choosing renewable energy can be increased by removing the effort involved in switching to renewable energy or getting solar panels (e.g., simplifying the application process). A penalty to discourage people from choosing to live in detached houses can include substantially increasing property tax and inheritance tax for detached homes. Extinction can involve removing subsidies and financial aid for detached home buyers and owners. The revenue from the taxes and removed subsidies can be reallocated to incentivize living in attached homes and the switch to renewable energy.

Civic behavior

Civic behavior (voting, protesting, and signing petitions) can lead to system-level changes to decrease GHG emissions (Wynes et al. 2021). Social rewards such as social support, connection, and recognition can reinforce joining climate rallies and voting. These social rewards can occur through social interactions at the events in person or on social media. Free public transit can be provided to people attending the rallies or going to vote as a financial reward. The “I voted” sticker is a symbolic and social reward that can reinforce voting behavior. A lottery could also be used for people who have voted, similar to the vaccination lottery from the Government of Canada to encourage Canadians to get vaccinated against COVID-19 (Dubé et al. 2022).

To help people make a decision on whether to participate in civic actions in the first place, we can reduce the response requirement to participate. Most civic behavior (e.g., voting, contacting elected officials) takes a lot of effort and time which can serve as punishment. Thus, reducing the effort involved in voting, contacting elected officials, and participating in climate rallies can increase the frequency with which people engage in these actions. For example, creating an accessible “cheat sheet” of different candidates that outlines their policy goals and proposals would make it easier for people to decide on who to vote for. Another intervention is to increase the number of polling stations to reduce wait times to make voting less time consuming. Making information readily available (e.g., through a website) about each elected official, their contact information, and what concerns they handle, would make it easier for constituents to decide who to contact in their jurisdiction to discuss climate policy. Finally, for people experiencing climate anxiety and depression, engaging in civic action can help alleviate these negative emotions which may serve as negative reinforcement (Schwartz et al. 2022).

Spillovers

Behavioral interventions at the individual level have been criticized for the resulting negative spillover onto system-level interventions (Chater and Loewenstein 2022). We argue that this occurs because the behavior targeted by the intervention is rarely positively reinforced. We propose two conditions for positive spillover to occur from an initial behavior to a subsequent behavior: (1) the initial behavior is positively reinforced (e.g., by social or symbolic rewards, or identity reinforcers like the warm glow) (Taufik et al. 2015), and (2) the subsequent behavior is perceived to be followed by a naturally occurring positive reinforcer. For example, a small fee ($0.06) reduced the use of single-use plastic bags and increased the use of reusable bags, and also generated positive spillover to increase public support for other similar policies such as adding charges for plastic bottles and excessive packaging in the UK (Thomas et al. 2019). This may be because the increased use of reusable bags enhanced people’s environmental identity, which serves as natural positive reinforcement (e.g., feeling good about themselves for using reusable bags). The resulting increase in support for similar policies may in turn further enhance people’s environmental identity as natural positive reinforcement. This effect also shows up in organizational behavior in terms of discretionary effort, which is the effort that employees engage in that is above and beyond their basic requirements. It has been suggested that the only way to promote discretionary effort is through positive reinforcement, whereas negative reinforcement and punishment only result in the bare minimum required to avoid punishment (Daniels 2016).

We also propose two conditions for negative spillover to occur from an initial behavior to a subsequent behavior: (1) the initial behavior is not positively reinforced, and (2) the subsequent behavior is perceived to not be followed by positive reinforcement. Negative spillover is especially likely if the initial or subsequent behavior involves personal sacrifice (e.g., costs, effort), which functions as a form of punishment that can decrease the behavior. For example, a default nudge made people purchase renewable energy which involved more costs, but subsequently lowered people’s support for a carbon tax policy which would cost them even more (Hagmann et al. 2019). This may be due to the possibility that the additional cost was perceived as a punishment and the behavior was not positively reinforced, which made subsequent behavior that also involved financial cost less likely (e.g., paying for carbon taxes). This could also be interpreted as a case of the side effect of punishment known as generalized suppression (Newsom et al. 1983). This framework bridges a critical gap in the literature by highlighting the importance of reinforcement and punishment for understanding spillovers.

The type of reward also influences the likelihood of spillovers. For example, financial rewards, which are often perceived as compensation for a given behavior rather than an inherent consequence of the behavior, are unlikely to generate positive spillover to other behavior. Free bus passes are likely to reinforce bus-taking behavior, but they may not lead people to take shorter showers. Other types of rewards, such as natural, social, and intrinsic (e.g., biker’s high, feeling good after hanging out with friends) are more likely to generate positive spillovers. For example, if an individual experiences a positive mood after biking to work, their behavior might spill over to walking to work, if walking is also followed by the positive mood.

The current framework can explain previous accounts of spillovers. Attitude change (Henn et al. 2020) from an intervention (i.e., greater care for the planet) can broaden the set of behaviors that are experienced as functionally equivalent, making seemingly different behaviors serve the same function to help the environment. However, changing attitudes may take a long time. Additionally, identity reinforcement (Truelove et al. 2014) and increased self-efficacy (Carrico 2021) can be considered as forms of positive reinforcement in our framework. For negative spillovers, the moral licensing effect (when people allow themselves to do something bad after doing something good) can be explained by a lack of positive reinforcement following the initial behavior, which leads people to seek a reward for the behavior by generating moral credits. This could also be interpreted as the side effect of punishment known as response substitution (Newsom et al. 1983). The crowding out effect (when a reward reduces the intrinsic motivation of the behavior) is a case of a financial reward not functioning as a reinforcer. The rebound effect (increased use of more efficient products) is a form of negative reinforcement because the efficient product removes the higher cost of the inefficient product, which increases the consumption of the more efficient product. Finally, risk compensation occurs when a risk-reducing intervention lowers the perceived risk of the behavior as a form of negative reinforcement and therefore increases the risky behavior itself (Carrico 2021). These examples suggest that negative reinforcement alone is not sufficient to promote positive spillovers. Existing climate action tends to be negatively reinforced (e.g., reduced feelings of guilt, shame, and anxiety), but negative reinforcement is likely insufficient for generating positive spillover (Jacquet 2017; Whitmarsh et al. 2022). This framework highlights the critical importance of using positive reinforcement to promote positive spillover in climate action.

Conclusion

In summary, the DRA-based operant conditioning framework can encourage low-emission behavior using positive reinforcement and discourage high-emission behavior using extinction or punishment across multiple domains, while promoting positive spillover with positive reinforcement. Researchers can test the efficacy of this framework by examining the different schedules of reinforcement, different types of rewards, and spillovers across diverse sets of behavior and audiences. Policymakers, businesses, and stakeholders can implement this framework at the system level and individual level to help achieve the net zero target by 2050.