1 Introduction

Most large-scale environmental problems, such as climate change, are the aggregate result of consumption and production decisions made by many actors in numerous countries. To achieve national and multinational goals, such as the Paris Agreement or Agenda 2030, extensive changes in behavior are needed. In a first-best world, most of these changes would result from economic policies, such as carbon taxes coupled with technological development. However, standard policies are often not stringent enough to reach ambitious and necessary goals. Therefore, attention has been given to alternative approaches to influence consumer behavior. These approaches often rely on the assumption that consumers have pro-social or pro-environmental preferences (Andreoni 1990; Kotchen and Moore 2007), and are willing to make costly sacrifices to reduce the environmental impact of their consumption.

One challenge associated with complex environmental problems is that consumers often lack sufficient information on their contribution to these problems and their extent. Therefore, a first step toward influencing consumers is to provide them with this information coupled with information on how to reduce these problems (Jessoe and Rapson 2014). Combining this information with a comparison of one’s own and others’ performance and information on appropriate behavior, or moral suasion, has been shown to reduce resource use by around 1–5% in various domains (Carlsson et al. 2021). The most prominent example of such an attempt is the use of so-called home energy reports (Allcott 2011), but similar interventions have been evaluated for water use (Brent et al. 2020; Ferraro and Price 2013; Torres and Carlsson 2018), non-residential energy use (Klege et al. 2022), food waste (Kallbekken and Sælen 2013), and household waste (Ek and Söderberg 2021). At the same time, there is evidence that the effect of information on behavior decreases over time (Ferraro et al. 2011), and there are examples of zero effects in some settings (e.g., Gravert and Collentine 2021).

Product labels are another way to inform consumers about the environmental impact of their consumption and facilitate their choices. When it comes to product labels related to greenhouse gas (GHG) emissions, they have primarily been evaluated in the food domain, where the findings suggest that consumers are indeed affected by the provision of information (see, e.g., Kurz 2018; Vanclay et al. 2011; Kažukauskas et al. 2021). Indirectly related to carbon labels is the use of energy efficiency labels, which also have a sizeable impact on consumer behavior (see, e.g., Newell and Siikamäki 2014; Allcott and Taubinsky 2015).

A key characteristic of the information provisions and product labels discussed above is that they focus on one domain or even one product category. First, this means that by definition they have a limited impact, since each domain represents only a small fraction of the total environmental impact of consumption. Second, a comprehensive evaluation of information provisions in one domain requires information about potential spillover effects on other domains to understand the total effect (Carlsson et al. 2021).Footnote 1 Third, it is unclear what scaling up these interventions to several domains would entail and what it would achieve. For example, an information intervention that includes all consumption domains should increase the chances of identifying behavioral adjustments that are deemed feasible for the consumer, in which case the mitigation potential should be higher compared to more narrowly targeted interventions.

Apart from the approaches of individual information provision or product labels mentioned above, alternative approaches that target behavior at a larger scale have also been explored. One such approach is a system of personal carbon trading, in which each individual receives carbon allowances to cover emissions from energy use and transport (see, e.g., Fawcett 2010; Fuso Nerini et al. 2021; Fawcett and Parag 2010). However, this approach has not been implemented on a large scale, and consequently there is no empirical evaluation of its effectiveness. Another potential large-scale approach is the formation of carbon communities, where coordination and cooperation among community members facilitate a reduction of emissions. There are several case studies indicating an effect on individual behavior (Heiskanen et al. 2010). However, formally evaluating the impact is challenging in this case, since it requires large-scale implementation and communities self-select into becoming carbon communities.

What is the potential for individuals to reduce their total carbon footprint? Previous research has estimated the technical potential for households to reduce total emissions. The analysis by Ivanova et al. (2020) implies a mitigation potential for the US population of roughly 55%.Footnote 2 Bjelle et al. (2018) estimated the direct mitigation potential for Norwegian households to be 24–35%. Carlsson Kanyama et al. (2021) estimated the mitigation potential for Swedish households to be roughly 40% when considering potential consumption adjustments from food, transport, and housing to less carbon-intensive products and services.

In this paper, we provide the first empirical examination of the potential to reduce people’s total carbon footprints using a carbon calculator App. Specifically, we investigate how motivated and pro-environmental Swedish users of a carbon calculator App change their behavior after App installation. The App provides information about users’ carbon footprints based on financial transactions paired with an environmentally extended input–output analysis, data from official registers, and data entered by the users themselves (Andersson 2020). The App provides detailed information on carbon footprints, both in total and for different categories and individual purchases. In addition, the App provides suggestions on how to reduce the carbon footprint, and social comparison with groups. Compared with the interventions previously described, the detailed information provided across all consumption domains increases the chances that the App pinpoints feasible areas of improvement also for low-emitting individuals. Users who connect their bank to the App receive 2–216 weeks of past transaction data, which provides a baseline estimate of their carbon footprint. To identify the behavioral response after App installation, we use a difference-in-differences design with staggered treatment adoption (Sun and Abraham 2021). Hence, the identification comes from comparing emission levels in a particular week between those who have installed the App and those who have not yet installed it, controlling for individual fixed effects.

The increased use of smartphones has resulted in a wide variety of Apps intended to help people make better decisions, ranging from eating healthier and losing weight (Semper et al. 2016) to exercising more (Kennelly et al. 2018). One important area where the role of Apps has been evaluated is medication adherence, i.e., ensuring that patients take the medications they have been prescribed (Al-Ubaydli et al. 2017). There is, for example, evidence that reminders through Apps or simple text messages improve medication adherence (Vervloet et al. 2012).

Online carbon calculators have also become more common, but there have been few rigorous evaluations of their role and effect on behavior (Salo et al. 2019). Recently, Fosgaard et al. (2021) evaluated the impact of providing information on GHG emissions of grocery purchases in a randomized experiment. They found that the provision of information had a substantial impact in the short run (a 27% reduction in emissions in the first month). This suggests that the provision of information could play a non-negligible role. At the same time, they also found that this effect did not persist over time. Over the whole period of four months, the reduction in emissions was not statistically significant.

Our data contains 173,246 week-by-user observations from 2517 self-selected Swedish users of the App, spanning from 2016 to early 2020 (the beginning of the COVID-19 pandemic). Five hundred thirty users in our unbalanced panel provided emissions data after App installation, and we use them to estimate the behavioral response. These were highly active and engaged App users who held pro-environmental attitudes. We find that the users decreased their carbon footprint by around 10% in the first few weeks after App installation, but over the subsequent weeks, this reduction fades. Additional analysis suggests that the carbon footprint reduction was not driven by any specific consumption category, but rather by a combination of a shift from high- to low-emitting consumption categories, and a temporary decrease in overall transactions. The results were not driven by self-reported changes in lifestyle factors that affected users’ estimated carbon footprints, suggesting that the estimates do not suffer from any of the biases associated with self-reported data, such as social desirability bias (Lopez-Becerra and Alcon 2021). We find no evidence that users started to decrease their footprint before App installation; instead, there was a strong and immediate drop directly after.

Was the reduction driven by the App or some other underlying factor that induced both carbon footprint reduction and App installation? Several factors support the interpretation that the effect was driven by the content of the App. First, the App is designed to induce and facilitate carbon footprint reduction and, as will be explained in Sect. 2, has features similar to behavioral interventions that have been shown in experimental studies to induce behavioral change. Second, we find no anticipatory behavioral adjustment before App installation, i.e., we find parallel pre-treatment trends, implying that alternative explanations are confined to a triggering event concurrent with App installation. Hence, the results are inconsistent with motivated pro-environmental consumers decreasing their footprint and then installing the App some weeks later to see how they are progressing. Third, the immediate reduction and then gradual reversal is consistent with the behavioral responses documented in several experimental studies using similar treatments (e.g., Allcott and Rogers 2014; Bonan et al. 2021; Fosgaard et al. 2021). Fourth, our users were highly active in the App, showing that they were interested and engaged in its content. The experimental study by Fosgaard et al. (2021) found particularly strong and persistent effects among those engaged in their App, which has features similar to those of Svalna but focuses on food consumption. Fifth, our users hold particularly pro-environmental attitudes and values. Previous research suggests that individuals with sufficient room for improvement respond more strongly to similar pro-environmental behavioral interventions if they have strong environmental values (Bonan et al. 2021). Finally, we see no indication that users installed the App in response to any obvious pro-environmental events, such as Earth Day or large climate demonstrations. Instead, there was a gradual influx of users during our study period.
For these reasons, it is plausible that the estimated effect was caused by the content of the App, at least partly. However, since this is not an experiment, we cannot rule out the possibility that the effects were partly driven by some other underlying factor that concurrently triggered both App installation and behavioral adjustment. In either case, our results indicate the potential for motivated consumers to reduce their total carbon footprint—motivated in the sense that they are willing to change, as indicated by their choice to install and engage with the App, as well as by their pro-environmental attitudes.

The rest of the paper is organized as follows. Section 2 describes the carbon calculator, and how and why its content can induce behavioral change. Section 3 presents and discusses the data used. Section 4 describes the empirical strategy, Sect. 5 the results, and Sect. 6 concludes.

2 The Carbon Calculator

The carbon calculator used in this study is provided by Svalna Inc. The mobile application, called Svalna, estimates each user’s carbon footprint associated with purchases made.Footnote 3 By connecting their bank account(s) and/or credit cards to the App, users can get an overview of the carbon footprint of their spending in different consumption categories. Since its launch in 2016, the App has gained users through media coverage in the national press and a community intervention campaign developed in collaboration with Uppsala Municipality.

2.1 Estimating the Carbon Footprint

The App uses a hybrid approach that relies on data from three primary sources to calculate the GHG emissions related to the user’s consumption behavior: (1) financial transaction data from the user’s bank coupled with data on GHG emissions per monetary unit for different categories, (2) data from official databases, and (3) profile data entered by the user.

First, each transaction is automatically categorized and assigned an estimated carbon footprint. Additional data is received when users manually connect their bank again within the App. Users are asked to classify transactions that the system does not automatically identify, and the algorithm improves with each piece of additional information.Footnote 4 Users can also indicate whether they bought a specific item second-hand and receive a lower carbon footprint for that purchase. For some consumption categories, such as food and heating, other data sources are used in combination with transaction data to estimate the carbon footprint of those specific purchases.

Second, the App uses data from official registers where financial information is not provided. Users living in multi-dwelling houses are typically connected to the local district heating network and pay for heating as part of their rent, which means financial data are not sufficient. Instead, the App collects data on the energy performance of the users’ homes (as identified via their home addresses) from the energy performance database maintained by the National Board of Housing, Building and Planning.Footnote 5 For vehicles, data on fuel type, fuel efficiency, and distance traveled are automatically collected from the Swedish Transport Agency database if the user fills in their vehicle’s registration number.Footnote 6 This information is then used to estimate an individual GHG intensity for fuel purchases.

Third, users are asked to answer some questions in a user profile when generating their account. The questionnaire contains questions on, e.g., dietary habits, the number of household members, and information on air travel (departure/arrival destination) in the past two years. Users are free to specify lifestyle changes at any time, and when this is done, the carbon footprints from associated consumption categories are adjusted by the calculator for future purchases. For example, if a user switches from a mixed to a vegetarian diet, the carbon-equivalent footprint from each food-related purchase will be adjusted downwards; see Andersson (2020) for a comprehensive description.
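The hybrid approach described above can be illustrated with a minimal sketch. It is not the App's actual implementation: the emission intensities, the diet multipliers, and the second-hand discount are hypothetical placeholder values, and the function names are ours. The sketch only shows the general logic of multiplying spending by a category-specific intensity and adjusting it with profile data.

```python
# Hypothetical emission intensities in kg CO2e per SEK spent (illustrative values only).
INTENSITY = {"fuel": 0.20, "clothing": 0.05, "food": 0.04, "services": 0.01}

# Hypothetical multipliers applied to food purchases, based on the self-reported diet.
DIET_FACTOR = {"mixed": 1.0, "vegetarian": 0.7, "vegan": 0.5}

# Hypothetical multiplier for purchases the user marks as second-hand.
SECOND_HAND_FACTOR = 0.2


def transaction_footprint(amount_sek, category, diet="mixed", second_hand=False):
    """Estimated footprint (kg CO2e) of a single categorized transaction."""
    kg = amount_sek * INTENSITY[category]
    if category == "food":
        # Profile data adjust the intensity of food-related purchases.
        kg *= DIET_FACTOR[diet]
    if second_hand:
        # Second-hand purchases receive a lower footprint.
        kg *= SECOND_HAND_FACTOR
    return kg


def weekly_footprint(transactions, diet="mixed"):
    """Sum the estimated footprints of all transactions in a week."""
    return sum(
        transaction_footprint(
            t["amount"], t["category"], diet, t.get("second_hand", False)
        )
        for t in transactions
    )
```

With these placeholder values, a week with 1000 SEK of food and 500 SEK of second-hand clothing yields a lower footprint for a vegetarian profile than for a mixed-diet profile, because only the food component is rescaled.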

2.2 Features in the App

There are four sections of the App, all illustrated in Fig. 1: (i) an Overview, (ii) an Emissions section, (iii) a Goals section, and (iv) a Groups and comparison section. Here, we only briefly describe how they work and are designed, and refer to Andersson (2020) for a more detailed description.

Fig. 1 Main sections and features of the mobile application. Note: The figure illustrates the App’s sections for the general overview (top left), emissions from individual purchases (top middle), goal-setting and tailored suggestions (top right), and group functions and comparisons (bottom two)

The overview section provides information on the current estimated yearly carbon footprint, a trend line describing the variation over time, and an indicator of the percentage change over the last few months. It also includes a breakdown of the estimated yearly carbon footprint, and a list of the most recent purchases, their categorization, and their carbon footprint.

The emissions section of the application enables users to explore how their carbon footprint has evolved over time and learn how consumption activities are connected to GHG emissions. The carbon footprint calculated by the App is divided into four main categories: (1) transportation, (2) goods and services, (3) food and beverages, and (4) residential energy (with several sub-categories). It is also possible to examine sub-categories and individual transactions and their respective GHG emissions to gain a more detailed understanding of the impact related to different activities and transactions.

In the goals section, the user can set a goal to reduce their annual carbon footprint. Given the difficulties of setting a reasonable goal, the App suggests a goal that the user can then change. In addition, the section illustrates how different behavioral changes would affect a user’s future carbon footprint. The system currently includes 15 suggested adjustments that describe large-scale lifestyle changes. The suggested actions are based on the user’s data, and users are only exposed to suggested adjustments that will reduce their carbon footprint. For instance, while a meat-eater might be suggested to become vegetarian, a vegan will not. This means that users who already maintain a more sustainable lifestyle might not receive any suggested actions from the App.

The groups section comprises both automatically generated geographical groups and user-generated groups. All users are by default assigned to a municipality group. In these groups, users can compare their carbon footprint with the average in the four sub-domains and see their ranking compared with other users in the municipality. Users are also free to form their own groups or join existing groups. The users organize themselves into these groups by sending invites or requests. User-generated groups offer the same functionality as the geographical groups and, in addition, allow users to set a joint goal. It is also possible for workplaces to form a group and organize their staff into competing sub-groups with a leaderboard.

2.3 How Can Users Reduce Their Carbon Footprint?

There are primarily three ways in which users can reduce their carbon footprint, as measured by the calculator. First, users can reduce the carbon intensity of their financial transactions. This is achieved by reducing spending in a category that has a high emission coefficient (e.g., electricity, household appliances, clothing, furniture, and fuels for transport and heating) and instead increasing spending in a category that has a lower emission coefficient (e.g., various services, cultural and sporting events, and entertainment). The same effect occurs if users increase financial transactions to charity or savings accounts (deferred consumption). In addition, by indicating that something was bought second-hand, the carbon footprint of that purchase is reduced. Second, users can decrease overall spending, at least temporarily. Third, users can change lifestyle factors such as car ownership, the number of household members, and dietary habits, which affects the emission coefficient of many consumption categories. For example, the App uses reported dietary habits to determine the carbon footprint of food consumption. Changing one’s dietary habit status from meat-eater to vegetarian reduces the carbon footprint from all food consumption.
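The three channels can be made explicit with a simple decomposition. The notation is ours and is meant only as a sketch: x_{ic} denotes user i's spending in category c, e_c(\theta_i) the emission coefficient of category c given the user's self-reported lifestyle factors \theta_i, X_i total spending, and s_{ic} the spending share of category c.

```latex
E_i \;=\; \sum_{c} e_c(\theta_i)\, x_{ic}
    \;=\; X_i \sum_{c} s_{ic}\, e_c(\theta_i),
\qquad
X_i = \sum_{c} x_{ic}, \quad s_{ic} = \frac{x_{ic}}{X_i}.
```

In this notation, the first channel lowers E_i by shifting the shares s_{ic} toward categories with lower coefficients, the second by lowering X_i, and the third by changing \theta_i and thereby the coefficients e_c(\theta_i).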

2.4 How Can the App Influence Behavior?

The information provided in the emissions section can influence consumption behavior in at least three ways. First, individuals who care about the environmental impact of their consumption might receive new or better information about the carbon footprint of various consumption activities and adjust their behavior accordingly. One feature of the carbon calculator is that the informational feedback is tailored and personalized, enabling the user to identify key areas where the carbon footprint can be reduced. The information feedback is vast and highly detailed since users can get feedback on emission levels for all individual transactions. This enables users to easily infer emission-reducing consumption adjustments across a vast array of consumption categories and domains. In this regard, efforts to reduce the carbon footprint can be directed toward the “low-hanging fruit” for everyone, be it transportation habits, household energy conservation, a reduced rate of replacement of clothing/furniture/appliances, or dietary habits. A similar argument has been underlined by Buchanan et al. (2015) and Tiefenbeck et al. (2018) regarding feedback on household energy consumption. Furthermore, in a study on electricity use with a feedback system that breaks electricity consumption down into household appliances, electricity use was reduced by around 8% (Asensio and Delmas 2015), compared to the around 2% reduction induced by ‘home energy reports’ that give aggregated feedback on past electricity usage (Allcott 2011; Allcott and Taubinsky 2015; Costa and Kahn 2013).

Second, the behavior might also be affected by the fact that information on the carbon footprint is made more salient. For example, the empirical literature has documented that consumers pay attention to the price before tax and hence ignore the sales tax (Chetty et al. 2009), or that they ignore the fuel and electricity costs of cars and light bulbs compared to the sales prices of such products (Allcott 2011; Allcott and Taubinsky 2015). Moreover, a key role of product labels is to make certain characteristics more salient. For example, energy labels have been introduced to increase the attention to energy use in the purchase of durable goods (Andor et al. 2020). In the case of choices involving a moral dimension, such as climate change, it could be important to influence individuals to make attentive choices (Löfgren and Nordblom 2020), through for example, a carbon calculator.

Third, the information from the App also illustrates how the carbon footprint to a large extent is related to the level of consumption. This can have a direct effect on the overall level of consumption, not only due to concerns about the environmental impact of the consumption, but also because of concerns about one’s financial situation.

In addition to this, the carbon calculator App has two behavioral features that might play a role. In the App, users can set goals. By encouraging people to set non-binding goals or commit to certain actions, the likelihood of behavioral change can be increased. This is because people dislike being inconsistent and not following through with their plans and promises (Festinger 1962). Voluntary goal setting has been found to affect behavior in areas such as energy saving (Harding and Hsiaw 2014) and vehicle use (Kormos et al. 2015). Commitment, which is presumably a stronger level of engagement than merely setting a goal, has been found to affect behavior in areas such as smoking cessation (Giné et al. 2010) and environmentally friendly behavior (Baca-Motes et al. 2013). Since users of the carbon calculator App are not obligated to set goals, and the goals are not formulated as commitments to a certain behavior, we do not expect any sizeable overall effects of these App features.

The App also provides information about how the user compares with other users. In particular, the user can compare their footprint with that of an average Swede and with the average in their municipality. Mounting empirical evidence suggests that an individual’s behavior is influenced both by information about what other people do, so-called descriptive norms, and by what other people approve of, so-called injunctive norms (Allcott 2011; Cialdini 2003; Carlsson et al. 2021; Czajkowski et al. 2017; Ferraro and Price 2013; Goeschl et al. 2018; Nyborg et al. 2016).Footnote 7 Social influence is likely to be more effective when the behavior is observable by others and easily scrutinized. As emphasized by Nyborg et al. (2016), individual GHG emissions originate from a wide array of behaviors, many of which are barely observable. In the App, users interact in groups, and can compare themselves with others in much more detail. This is expected to strengthen the effect of social information on the users’ behavior.

3 Data

The data provided by Svalna Inc. includes weekly estimates of total carbon-equivalent emissions based on detailed purchasing data, survey questions, and register data, as described in Sect. 2. The total carbon footprint is divided into four main categories: (1) transportation, (2) goods and services, (3) food and beverages, and (4) residential energy. Each of these in turn consists of several sub-categories, but we will restrict our analysis to total emissions and these four categories, which correspond to the categories users see in the App. We use pre-pandemic data, as purchasing behavior was severely affected thereafter, making it difficult to identify the effect of the App and to generalize the results. The availability of transaction data for each user depends on when and how often the user decides to connect to their bank. With each manual bank connection, the App downloads and analyzes historic transactions. This means that we have data both before and after App installation for those who connected to their bank at least twice, and data before installation for those who connected to their bank exactly once.

To identify the behavioral response after App installation, we use a difference-in-differences design with staggered treatment adoption. We follow the method recently proposed by Sun and Abraham (2021), which is robust to heterogeneous treatment effects and uses the units adopting treatment last as controls. Hence, in our setting, the control group consists of all users who installed the calculator at the end of our sample period, i.e. in early 2020 before the pandemic.Footnote 8 Thus, the control group provides data only on purchase behavior before App installation. The exact length of these historical data depends on the bank, but the average number of weeks is 68. The treatment group consists of users who installed the App before 2020. These users are considered treated from the point in time they install the App.Footnote 9 Following the method of Sun and Abraham (2021), we exclude in the empirical analysis the period after which control users install the App.Footnote 10 Therefore, and as explained in Sect. 4, identification comes from comparing consumption in a particular week between those who have installed the App and those who have not yet installed it, controlling for individual fixed effects. The main identifying assumption is that, in the absence of treatment, carbon footprints in the control group have parallel trends to those in the treatment group.
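The sample construction described above can be sketched as follows. This is only an illustration under our own assumptions, not the authors' code, and it does not implement the Sun and Abraham (2021) estimator itself. Weeks are represented as integer indices, and `control_cutoff` is a hypothetical parameter standing in for the first installation week of the last-adopting cohort ("early 2020").

```python
from typing import NamedTuple, Optional


class Obs(NamedTuple):
    user: str
    week: int          # calendar week index (0 = first week of the sample)
    install_week: int  # week in which the user installed the App


def label_observation(obs: Obs, control_cutoff: int) -> Optional[dict]:
    """Return group and event time for one user-week, or None if excluded."""
    if obs.week >= control_cutoff:
        # The period in which control users install the App is excluded.
        return None
    is_control = obs.install_week >= control_cutoff
    # Weeks relative to installation; negative values are pre-installation weeks.
    event_time = obs.week - obs.install_week
    return {
        "user": obs.user,
        "group": "control" if is_control else "treatment",
        # Control users contribute only pre-installation weeks, so no event time is used.
        "event_time": None if is_control else event_time,
        # Treatment-group users count as treated from the installation week onward.
        "treated": (not is_control) and event_time >= 0,
    }
```

In a regression, the `event_time` values of the treatment group would typically be turned into relative-week indicators, with user and week fixed effects absorbing level differences between users and common shocks.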

Column (1) of Table 1 presents data on all users in our sample who provide the identifying variation and therefore allow us to estimate behavioral change after App installation. Column (2) presents data for the treatment group, i.e. users who adopted the calculator before 2020 and provided post-treatment data. Column (3) presents data for the control group, i.e. users who adopted the calculator in early 2020. Column (4) indicates the difference in means between these two groups. While there is no statistically significant difference in pre-treatment carbon footprints between the groups, there are some differences in mean characteristics, such as gender, commuting distance, and household size. Although balanced characteristics between these two groups are not necessary for our empirical strategy, one might worry that these differences would indicate that the parallel trends assumption is violated.Footnote 11 The most common way to evaluate whether the parallel trends assumption is plausible in a setting with staggered treatment adoption is by estimating pre-treatment effects in a regression framework (Cunningham 2021). In Sect. 4, we will demonstrate that pre-treatment effects are small and generally insignificant, offering support for the identifying assumption. To complement this, Appendix Fig. 5 plots the raw unconditional carbon footprint pre-treatment trends for the treatment and control groups separately. The figure demonstrates that carbon footprints in the treatment and control groups follow very similar trends in the weeks prior to App installation.

The identifying sample consists of 1240 users in total, of whom 730 and 510 are in the control and treatment groups, respectively.Footnote 12 The mean age is around 32–33, and most people live in apartments with one other person and no children, own zero cars, and eat a mixed diet. Ninety-five percent of the users use the mobile version of the calculator, while the rest use the web page version. Before the adoption of the calculator, users had a mean weekly footprint of 140 kg CO2eq and spent, on average, 12,700 SEK per week.Footnote 13 The large variation in the availability of pre-treatment data is due to differences in data availability between different banks, while the variation in the availability of post-treatment data is due to between-user differences in the number of days between the first and last bank connection.

Table 1 User characteristics

There are some notable differences between the typical user and the Swedish population. Users are younger, with an average age of around 32 compared to around 48 among adults in Sweden overall. They are also more likely to live in a big city (28% compared to 18%) and in an apartment (73% compared to 55%). There are no official statistics on the number of vegetarians and vegans, but the share among the users (around 30%) is higher than what other studies have shown, where the share is often around 10%. There is also survey evidence that the users of the App are more pro-environmental.Footnote 14 In sum, users in both the treatment and control groups differ somewhat from the general Swedish population, and it is important to keep this in mind when assessing the external validity of the results of this study.

In Appendix Table 6, and Appendix Figs. 6 and 7, we present and illustrate data on App usage in the treatment group. On average, users engage with the App 10 times after App installation and explore all sections of the App. A third of the users set an emission reduction target. Appendix Fig. 6 shows how users gradually drop out of our sample as they stop updating their bank data (solid black line). While the number of users active in the App in a particular week drops quickly after App installation (grey solid line), the share of users in the sample that engages with the App each week is at least 20% (dashed line). In a review of carbon calculators, Salo et al. (2019) emphasized that engaging people more than once is a particular challenge for most calculators. Compared to this, users in our treatment group are highly active.

Although comprehensive and detailed, the purchasing data and thus emission estimates come with measurement errors. First, emissions are based on which of the 65 consumption sub-categories each transaction belongs to, which most often is determined by the calculator’s algorithm based on a small descriptive text for each transaction. Because incorrect categorizations occur, we manually identify transactions that are obviously incorrect categorizations by the App. For such expenses up to a threshold of 50,000 SEK/week/sub-category, we set the emissions at the user-average emission intensity. Expenses higher than this threshold are classified as financial transactions with a zero carbon-equivalent intensity. As we will show in Sect. 5, the results are not sensitive to where this arbitrary threshold is set.
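The correction rule for obviously mis-categorized transactions can be written compactly. The sketch below is our own illustration of the rule as described in the text. The function name is hypothetical, and we assume that the user-average emission intensity is expressed in kg CO2e per SEK.

```python
THRESHOLD_SEK = 50_000  # per week and sub-category, as stated in the text


def corrected_emissions(expense_sek, user_avg_intensity, threshold=THRESHOLD_SEK):
    """Emissions (kg CO2e) assigned to a weekly sub-category expense that was
    manually flagged as an obviously incorrect categorization."""
    if expense_sek > threshold:
        # Treated as a financial transaction with zero carbon-equivalent intensity.
        return 0.0
    # Up to the threshold, the expense is valued at the user's average intensity.
    return expense_sek * user_avg_intensity
```

Passing a different `threshold` value makes it straightforward to rerun the analysis under alternative cut-offs, which is the kind of sensitivity check referred to above.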

Second, users can manually re-categorize transactions that are erroneously categorized by the App. Twenty percent of the users in our sample have made such re-categorizations. Half of the re-categorizations concern transactions smaller than 1,000 SEK, and around 5% concern transactions larger than 20,000 SEK. Re-categorizations might be done incorrectly and in a manner that systematically reduces the user’s reported carbon footprint. Some users might want to reduce their reported carbon footprint by re-categorizing transactions from categories with high carbon intensity to categories with low carbon intensity, for example, to improve relative to peers also using the calculator. Note that in the App, users can re-categorize transactions that occurred before the App was installed. If users systematically re-categorize transactions occurring before App installation differently from those occurring after it, this would bias our results. However, there is no obvious reason why users would differentiate between transactions before and after App installation when making re-categorizations. Nevertheless, since we have data on the total expenditures re-categorized each week by each user, we can formally test this.Footnote 15 Table 2 presents the results from regressing the logged total weekly amount re-categorized on a dummy for post-treatment periods, controlling for user and time fixed effects. We cannot reject the null hypothesis of no difference between the amounts re-categorized for the weeks before and after the App installation, suggesting that manual re-categorizations are unlikely to bias our results.

Table 2 Manually re-categorized transactions before and after treatment

Third, as described in Sect. 2, the carbon intensity of many consumption categories is partly based on self-reported lifestyle factors, such as dietary habits and household size. To the extent that users systematically report lifestyle changes inaccurately, for example, to manipulate the footprint downwards, this could bias our results. However, we will show that our main results are not driven by those who self-report lifestyle changes after treatment.Footnote 16 It is also important to understand that one important role of the carbon calculator is to induce lifestyle changes. Thus, users reporting such changes should not be seen as a problem in itself.

Fourth, to fine-tune the estimated long-term footprint, the calculator sometimes automatically assigns emission values to weeks without corresponding financial transactions. This is done when official databases and self-reported lifestyle factors imply that there should be regular emissions from a certain category, but the algorithm is unable to detect any corresponding transaction.Footnote 17 While this improves the precision of the emission estimates in the longer run, it decreases the temporal precision of the calculator because immediate changes in consumption behavior will not be detected. Furthermore, it also makes it difficult to measure the carbon intensity (CO2eq/SEK) of purchases. To address this, we recalculate emissions in these categories from observed transactions only, to increase the temporal precision of consumer behavior.Footnote 18

Fifth, although there is a consensus among scholars that multi-regional input–output databases currently represent the preferred approach to calculating carbon footprints (Tukker et al. 2018), putting this data to use to provide carbon estimates for specific consumer goods will inevitably lead to results that deviate from the most accurate estimate for certain products. Hence, using price as a proxy for emissions will, on average, overestimate the emissions associated with the purchases of expensive and high-quality products and underestimate the emissions related to the production of the cheapest products. Since high-income households are more likely to buy expensive products and services, this means that we are likely to overestimate the carbon footprint of high-income households (Girod and De Haan 2010).
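The direction of this bias can be illustrated with a small sketch. All numbers and names below are hypothetical and chosen only to show the mechanism: two products with the same true footprint but different prices receive different spend-based estimates.

```python
def estimated_emissions(price_sek, category_intensity):
    """Spend-based estimate: price times the category's kg CO2eq per SEK."""
    return price_sek * category_intensity


# Hypothetical: two garments with the same true footprint of 10 kg CO2eq,
# priced at 250 SEK and 1,000 SEK, in a category with 0.02 kg CO2eq/SEK.
TRUE_KG = 10.0
cheap = estimated_emissions(250, 0.02)        # 5 kg: underestimates
expensive = estimated_emissions(1_000, 0.02)  # 20 kg: overestimates
```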

Sixth, users of the App may change their behavior in ways that may not be completely noticeable through their spending patterns. For example, the App does not take into consideration the potential emission reductions from altered waste management practices. Also related, if users live in households with more than one member (around half of the users in our sample) and some expenditures are shared, the App would naturally only partly capture altered consumption.Footnote 19 Some consumption categories are, however, very well captured by the transaction data, in particular, those with relatively homogeneous products within them and/or those where product heterogeneity is well captured by official registers (e.g., gas purchases combined with registered information on car ownership).

In addition to the 1240 users presented in Table 1, we also include in our estimation users who adopted the calculator before 2020, but only connected their bank once and thus only provide pre-treatment data. Unlike our control group, these users will not provide any variation enabling us to estimate post-treatment effects, which is of primary interest. However, we still include them to increase statistical power to detect any behavioral change before App installation, i.e. an anticipation effect. These additional 1303 users are presented in Appendix Table 7 and compared to the treatment group previously presented. Figure 2 illustrates data availability before 2020 from the three user groups used in the empirical analysis.Footnote 20

Fig. 2

Data availability by user group. Note The figure shows data availability for each week in our sample from the three user groups used in the empirical analysis. Treatment group indicates the number of users with available post-treatment data. Control group are users who adopted the calculator in early 2020 and who provide pre-treatment emissions data. The long-dashed line indicates pre-treatment data available from users who adopted the calculator before 2020 but only provide pre-treatment data, i.e., they only connected to their bank once. Note that pre-treatment data from the treatment group are not included in the figure

Table 3 reports summary statistics for all users active in our sample period, including those in early 2020. The mean weekly carbon footprint is 149 kg CO2eq, which corresponds to a yearly footprint of 7.7 tons CO2eq. The average carbon footprint of the adult population in Sweden is roughly 178 kg CO2eq per week, corresponding to 9.3 tons CO2eq per year, if estimated in the same way as in the App.
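The weekly and yearly figures can be reconciled with simple arithmetic. The sketch below assumes a factor of 52 weeks per year, which the text does not state but which reproduces both rounded yearly values; the function name is illustrative.

```python
WEEKS_PER_YEAR = 52  # assumption: the text does not state the factor used


def weekly_kg_to_yearly_tons(weekly_kg):
    """Convert a weekly footprint in kg CO2eq to a yearly footprint in tons."""
    return weekly_kg * WEEKS_PER_YEAR / 1000


users = weekly_kg_to_yearly_tons(149)    # 7.748, i.e. about 7.7 tons
sweden = weekly_kg_to_yearly_tons(178)   # 9.256, i.e. about 9.3 tons
```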

There is substantial variation in the carbon footprint, and the median weekly footprint is 78 kg CO2eq. The subsequent rows break this down into the four main consumption categories: transportation, goods and services, food and beverages, and residential energy. The categories with the highest average footprint are goods and services, and food and beverages. However, the largest variation in carbon footprint is found in the transportation group, suggesting that some users have a substantial carbon footprint from transportation.

Table 3 Summary statistics

4 Empirical Strategy

To evaluate whether users alter their purchasing behavior after App installation, we estimate various versions of a standard difference-in-differences model with staggered treatment adoption:

$$\begin{aligned} y_{it} = \tau _{it} App_{it} + \eta _{t} + \rho _{i} + \epsilon _{it} \end{aligned}$$
(1)

where \(y_{it}\) is a measure of purchasing behavior of individual i in week t. Our main outcome variable is weekly carbon-equivalent emissions from total consumption, but we also examine emissions from the four sub-categories of consumption. In addition, we consider the emission intensity of weekly spending (CO2eq/SEK), and total weekly spending. We log-transform the dependent variable in our main specifications for primarily two reasons. First, because the main dependent variable is based on individual consumption, it is highly skewed and sensitive to outliers. The log transformation reduces this problem. Second, we find it more plausible that the consumption response is proportional to the individuals’ own consumption level rather than absolute.Footnote 21
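With a logged dependent variable, coefficients are measured in log points. For small values, 100 times the coefficient approximates the percent change, while the exact change is 100·(exp(β) − 1). The sketch below shows the conversion for an illustrative coefficient that is not taken from the paper; the text does not state which convention the reported percentages follow.

```python
import math


def log_points_to_percent(beta):
    """Exact percent change implied by a coefficient on a logged outcome."""
    return 100 * (math.exp(beta) - 1)


# Illustrative coefficient, not an estimate from the paper.
pct = log_points_to_percent(-0.10)  # about -9.5%
```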

\(App_{it}\) is the treatment indicator of individual i in week t and is equal to one after App installation. Thus, this is the point in time after which the user receives information about the consumption-based carbon footprint and experiences the associated features of the carbon calculator App. \(\eta _{t}\) is week fixed effects and controls for any time-variant shocks or trends that affect all users similarly, such as inflation, economic shocks, and seasonality in income or spending. For example, if there are general trends in environmental awareness or the supply of climate-friendly products, they will be accounted for by the time fixed effects. \(\rho _{i}\) is individual fixed effects and controls for any time-invariant observed or unobserved individual characteristics that affect individuals’ purchasing behavior. \(\epsilon _{it}\) is an idiosyncratic error term. The behavioral response before and after treatment is thus captured by \(\tau _{it}\). To account for within-user serial correlation in the error term, we cluster standard errors at the user level.
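To make the roles of the fixed effects concrete, the following sketch estimates a single, constant treatment effect on a toy balanced panel by removing user and week means from both the outcome and the treatment indicator. It is an illustration under strong assumptions (balanced panel, homogeneous effect, no noise, hypothetical data and function names), not the estimator used in the paper, and it does not compute clustered standard errors.

```python
def two_way_demean(x):
    """Within-transform a balanced panel x[i][t]: remove unit and period means."""
    n, T = len(x), len(x[0])
    grand = sum(map(sum, x)) / (n * T)
    unit_mean = [sum(row) / T for row in x]
    time_mean = [sum(x[i][t] for i in range(n)) / n for t in range(T)]
    return [[x[i][t] - unit_mean[i] - time_mean[t] + grand for t in range(T)]
            for i in range(n)]


def twfe_effect(y, d):
    """OLS slope of demeaned y on demeaned d (a single, constant tau)."""
    yd, dd = two_way_demean(y), two_way_demean(d)
    num = sum(a * b for ry, rd in zip(yd, dd) for a, b in zip(ry, rd))
    den = sum(b * b for rd in dd for b in rd)
    return num / den


# Toy data (hypothetical): 3 users, 6 weeks; installs in week 2, week 4,
# and not within the window.
install = [2, 4, None]
rho = [5.0, 4.0, 4.5]                      # user fixed effects
eta = [0.0, 0.1, -0.1, 0.2, 0.0, 0.1]      # week fixed effects
tau = -0.10                                # constant effect in log points
d = [[1.0 if s is not None and t >= s else 0.0 for t in range(6)] for s in install]
y = [[rho[i] + eta[t] + tau * d[i][t] for t in range(6)] for i in range(3)]

tau_hat = twfe_effect(y, d)  # recovers tau because the effect is homogeneous
```

The estimate matches the true effect here only because the effect is the same for every user and week; with heterogeneous effects this simple estimator can be misleading.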

Estimating Eq. (1) using the commonly used two-way fixed effects estimator and OLS can lead to severely biased results when the treatment effect is heterogeneous, either within-unit over time or between groups of units treated at different times, as shown in a number of recent studies (e.g., Borusyak et al. 2021; De Chaisemartin and d’Haultfoeuille 2020; Callaway and Sant’Anna 2021; Goodman-Bacon 2021; Sun and Abraham 2021). This is because the usual two-way fixed effects estimation procedure provides a weighted average of many 2-by-2 difference-in-differences treatment effects. If estimated with a two-way fixed effects regression, some of these weights can be negative or incorrect due to heterogeneous treatment effects.Footnote 22 To address this, we use the approach suggested by Sun and Abraham (2021), where separate event studies are estimated for each treatment cohort, defined as the week of App installation in our setting. The average leads and lags across treatment cohorts are then calculated, weighted by the number of users in each treatment cohort. This procedure ensures correct and non-negative weights and accounts for possible heterogeneous treatment effects. As proposed by Sun and Abraham (2021), we use as our control group the last treated units in our sample, which in our setting are users who installed the App in early 2020.
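The aggregation step described above, averaging cohort-specific estimates with weights proportional to cohort size, can be sketched as follows. This shows only the weighting for one lead or lag, not the full Sun and Abraham (2021) procedure; the function name, cohort labels, cohort sizes, and effect values are all hypothetical.

```python
def cohort_weighted_average(estimates, cohort_sizes):
    """Average cohort-specific effects for one relative week, weighting by users.

    estimates: {cohort: estimated effect at a given lead or lag}
    cohort_sizes: {cohort: number of users installing in that week}
    """
    total = sum(cohort_sizes[c] for c in estimates)
    return sum(estimates[c] * cohort_sizes[c] / total for c in estimates)


# Hypothetical install-week cohorts and effects (log points) at one lag.
effects = {10: -0.12, 11: -0.08, 12: -0.10}
sizes = {10: 10, 11: 30, 12: 10}
avg = cohort_weighted_average(effects, sizes)  # weights 0.2, 0.6, 0.2
```

Because the weights are user shares, they are non-negative and sum to one by construction.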

What is the interpretation of \(\tau _{it}\)? A key underlying assumption for a standard difference-in-differences model such as Eq. (1) is that, in the absence of treatment, baseline outcomes follow parallel trends in the treatment and control groups, so that control users provide a plausible counterfactual to the treated users (Angrist and Pischke 2008; Sun and Abraham 2021; Roth et al. 2022). Note that this assumption does not require baseline characteristics or even outcomes to be balanced across the treated and control groups, as these are absorbed by the user fixed effects \(\rho _{i}\), but merely that they follow parallel trends. As is standard in the difference-in-differences literature, we test this by estimating individual pre-treatment effects, where no effects would support the assumption of parallel trends. Roth et al. (2022) and others have recently raised the concern that such tests may fail to reject no pre-treatment effects due to low statistical power. This may be a particular concern in our study given the small number of users and identifying observations. To address this, as mentioned in Sect. 3, we include in all our analyses users who installed the App before 2020 and only connected their bank once and therefore only provide pre-treatment data. While these users do not provide any identifying variation to estimate treatment effects after App installation, they increase the power to detect any effect prior to treatment. This test will also detect any anticipation effect. Individuals might start reducing their footprint for reasons unrelated to the calculator, and then start using the calculator later, perhaps after a couple of weeks of consumption adjustment to see how they are progressing. In this case, we expect to detect gradual decreases in footprints prior to treatment.

Given parallel trends and no anticipation effects, and given that the calculator App induces individuals to adjust their consumption behavior, \(\tau _{it}\) can be interpreted as the causal effect of the carbon calculator for the type of subjects who install the App. However, since we are using self-selected users of the carbon calculator and have no information on why users start using it, we cannot entirely rule out the possibility that \(\tau _{it}\) partly captures an underlying factor or event that immediately triggers both App installation and behavioral adjustment. Thus, in the absence of pre-treatment effects, we can interpret \(\tau _{it}\) as the consumption response after App installation for individuals motivated to reduce their footprint, where this motivation is induced by the calculator App or some other underlying cause (such as a climate campaign), or a combination of the two.

Finally, note that the treatment indicator captures an intention to treat. Although all users in the treatment group have made at least two manual bank connections, we know that some users are very inactive in the App, and we have limited information on how they process the information. Additionally, the small number of users and identifying observations makes it difficult to identify differences in behavior between different types of users, for example by level of App activity.

5 Results

We begin with the results from a flexible version of Eq. (1), an event study model with week-specific treatment effects for up to 8 weeks before App installation and 30 weeks after. Using 8 weeks before should allow us to detect any anticipation effect and evaluate the plausibility of the parallel trends assumption. We bin all weeks outside of this time frame into two variables and use the week before App installation as the reference period. The results, illustrated in Fig. 3, show that users reduce their consumption-based carbon emissions by around 11% directly after App installation. The figure reports main estimates for the initial 20 weeks after App installation together with 90% confidence intervals; full results are presented in Appendix Table 8. Coefficient estimates for the weeks following the initial periods indicate a gradual slide back to footprint levels prior to App installation, but caution is warranted because the statistical precision is low and decreases as we move further away from the time of App installation. Moreover, we observe no effects on emissions before App installation, indicating no anticipation effect and offering support to the parallel trends assumption. This supports the interpretation that we are catching the immediate behavioral response for individuals who have decided to reduce their footprint, either due to the content of the calculator and/or some underlying shift in preferences occurring at the same time.
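The event-time coding described above can be sketched as a mapping from calendar weeks to event-study dummies. The function and label names are hypothetical; the window of 8 weeks before and 30 weeks after installation, the two outer bins, and the reference week come from the text.

```python
def event_time_label(week, install_week, min_lead=-8, max_lag=30):
    """Map a calendar week to its event-study dummy, or None for the reference."""
    k = week - install_week          # weeks relative to App installation
    if k == -1:
        return None                  # reference period: week before installation
    if k < min_lead:
        return "pre_bin"             # all weeks more than 8 weeks before
    if k > max_lag:
        return "post_bin"            # all weeks more than 30 weeks after
    return f"k={k}"                  # week-specific effect
```

For a user installing in week 100, for example, week 100 maps to `k=0`, week 99 is the omitted reference, and week 131 falls in the outer post-installation bin.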

Fig. 3

App installation and carbon footprints. Note The figure illustrates estimated coefficients and 90% confidence intervals from the model in Eq. (1), using as dependent variable total weekly CO2eq emissions. Week zero is the week of App installation. Full results are available in Appendix Table 8

Figure 4 reports estimates from the same model specification but for each of the main consumption categories: (1) transportation, (2) goods and services, (3) food and beverages, and (4) residential energy. The estimates for the separate categories are less precise, as the wide confidence intervals indicate, mostly because they are based on fewer transactions. Although the estimates vary from week to week, the overall pattern is that the carbon footprint is reduced directly after App installation but reverts to pre-installation levels after a few weeks.

Fig. 4

App installation and carbon footprints in four consumption categories. Note The figures illustrate estimated coefficients and 90% confidence intervals from the model in Eq. (1), using as dependent variable CO2eq emissions from one of four consumption categories. Week zero is the week of App installation. Full results are available in Appendix Table 8

Next, we focus on a specification where we bin the effects for weeks 0–4, 5–12, and after 12 weeks, respectively. We do this for two reasons: it increases the precision of our estimates, and many household costs are paid monthly rather than weekly. Thus, the immediate effect covers the first month after App installation; we then allow for a medium-term effect over two additional months and, finally, an effect beyond the first three months. We also include a binned effect for the four weeks prior to App installation, i.e., a similar length as the immediate effect, making all weeks before that the reference period. Column (1) in Table 4 presents the effect on total carbon-equivalent emissions using this specification and is hence analogous to the results in Fig. 3. We see a highly statistically significant average decrease of almost 10% in weekly carbon footprint in the first five weeks after App installation. The reduction shrinks to around 6.5% for weeks 5–12. For the weeks beyond the first 12 weeks, the coefficient estimate suggests an average decrease of around 4.5%, but the statistical precision is low, and the coefficient estimate is not statistically significant at conventional levels.Footnote 23 Subsequent columns in Table 4 split the outcome variable into the four sub-categories of consumption. By doing so, we can investigate whether there are negative spillover effects between categories, whereby decreases in one consumption category are counterbalanced by increases in other categories. The results indicate that carbon footprints decreased in all four consumption domains in the first five weeks, although the statistical precision decreases substantially when splitting the data. As such, this suggests that there are, on average, no negative spillover effects between the four consumption categories. Short-run emission decreases are statistically significant at the 5% level for goods and services, while they are significant at the 10% level for transportation, residential energy, and food and beverages. Beyond the first 12 weeks, none of the estimated coefficients are statistically significant.
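The binned specification can be summarized as a mapping from relative weeks to period indicators. The sketch below uses hypothetical label names; the cut-offs (four weeks before installation, weeks 0–4, weeks 5–12, and beyond 12 weeks, with all earlier weeks as reference) follow the text.

```python
def period_bin(k):
    """Assign relative week k (0 = installation week) to a binned effect."""
    if k < -4:
        return "reference"      # all earlier weeks
    if k < 0:
        return "pre_1_4"        # four weeks before installation
    if k <= 4:
        return "weeks_0_4"      # immediate effect
    if k <= 12:
        return "weeks_5_12"     # medium-term effect
    return "weeks_13_plus"      # beyond the first 12 weeks
```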

Table 4 App installation and carbon footprints: main results

Table 5 investigates the mechanisms underlying the above results. As discussed in Sect. 2, users can reduce their footprint in primarily three ways: (i) shifting consumption from high intensity (CO2eq/SEK) to lower intensity categories, (ii) decreasing overall spending, and (iii) changing lifestyle factors (e.g., address, car ownership, or dietary habits) by manually reporting the change in the App.

Table 5 App installation and carbon footprints: mechanisms and robustness

Column (1) in Table 5 examines the first mechanism by estimating the effect on the logged carbon-equivalent intensity of consumption, i.e., the logged weekly average emission per SEK spent. We find suggestive evidence that consumption is shifted towards consumption with lower carbon-equivalent intensity in the first 12 weeks, although some caution is warranted as the coefficients are only statistically significant at the 10% level. Column (2) examines whether overall spending is reduced. The results suggest that this is the case for the first five weeks. Consequently, users increase their savings in the short run, and these savings will at some later point be used for consumption and/or investments, which in turn results in a carbon footprint.
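The first two mechanisms are linked by an accounting identity. As a sketch, assume that weekly intensity is defined as total weekly emissions divided by total weekly spending, for weeks with positive spending; the symbols \(E_{it}\), \(S_{it}\), and \(I_{it}\) are introduced here only for illustration:

$$\begin{aligned} \log E_{it} = \log I_{it} + \log S_{it}, \qquad I_{it} = E_{it}/S_{it} \end{aligned}$$

Under that assumption, and if the three logged outcomes are estimated with the same linear specification on the same sample, the estimated effect on logged emissions equals the sum of the estimated effects on logged intensity and logged spending.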

In Columns (3) and (4), we also investigate whether the results are driven by changes in lifestyle factors self-reported by the user. We do this by splitting the sample of users into those who do not have any lifestyle factor changes during the sample period (column 3) and those who do (column 4). Out of the 510 users in the treatment group, a majority (n = 352) do not have any reported lifestyle changes.Footnote 24 The results indicate that reported lifestyle changes do not explain the results. Those with lifestyle changes do not have a larger reduction in their carbon footprint compared to those who do not have any lifestyle changes reported.

Taken together, the results in Table 5 suggest that the overall decrease in carbon footprint is driven by a combination of a shift towards expenditures with lower carbon intensity and a temporary decrease in overall expenditures.

Since this is not an experiment, we cannot rule out the possibility that the effects are partly driven by some underlying factor, unrelated to the App, that concurrently triggers both App installation and behavioral adjustment. However, taking a holistic view of the empirical analysis (including the descriptive statistics, user characteristics and behavior, content of the App, and past experimental research on related interventions), the overall results suggest that the content of the App induces carbon footprint reduction, at least partly. In particular, (i) the App has features similar to those that are known to induce pro-environmental behavioral change, (ii) there is no anticipation effect, confining alternative explanations to concurrent triggering events, (iii) our users are highly engaged, likely have room for improvement, and hold pro-environmental attitudes, suggesting they are particularly likely to respond to the App’s content (e.g., Fosgaard et al. 2021; Bonan et al. 2021), (iv) the immediate reduction and gradual reversal is consistent with previous experimental findings (e.g., Allcott and Rogers 2014; Fosgaard et al. 2021; Bonan et al. 2021), and (v) the gradual influx of users suggests that the result is not driven by a few obvious pro-environmental events, such as Earth Day or climate demonstrations.

6 Conclusions

Reducing the carbon footprint at the individual level is not as straightforward as it seems at first glance. Reducing the overall level of consumption and using less energy from carbon sources are the most straightforward options, but beyond that, it is more complex for motivated individuals to reduce their footprint. Using carbon calculators to facilitate pro-environmental choices is consequently an interesting alternative, in particular if the provision of information can be coupled with behavioral nudges, such as social information or moral suasion.

As far as we are aware, there have been no evaluations of a comprehensive carbon calculator and its role in reducing people’s carbon footprint. Using data on pro-environmental consumers who adopt and actively use a carbon calculator, we investigate their efforts to reduce their carbon footprint. The reduction in carbon footprint is substantial in the first month, with around a 10% decrease. This reduction is driven by a combination of a shift towards expenditures with lower carbon intensity and a temporary decrease in overall expenditures. This effect size is large compared to those found in other studies on resource use, which typically report a reduction of between 2 and 5% from receiving so-called home-energy reports (Carlsson et al. 2021). However, this is an effect on energy use and not the carbon footprint. The effect on carbon footprint depends on the energy mix and type of marginal energy source. Let us illustrate with two studies. In Allcott (2011), the average effect size on electricity use is 2%. Back-of-the-envelope estimates suggest that this would result in a reduction of around 0.8% in CO2eq emissions. The study in Germany by Andor et al. (2020) finds a considerably smaller effect size of 0.7%, which would result in an estimated reduction of around 0.1% in CO2eq emissions.Footnote 25 Compared to these studies, our estimated short-run effect is very large. Furthermore, for a country like Sweden, the average CO2eq emissions per individual are very low, which means that the effect on the carbon footprint from reducing electricity use is relatively minor.Footnote 26 However, after the first month, the effect in our study fades, although it is still statistically significant in the first subsequent weeks. Similar patterns of effects of information have been found in other studies (Allcott and Rogers 2014; Ferraro et al. 2011; Fosgaard et al. 2021). In particular, there is even evidence of “action and backsliding” in response to home-energy reports, i.e., that users reduce their electricity use directly after receiving the information but that these efforts decay over time. This suggests that the information treatment should be repeated to have a consistent effect. The long-run effects of home-energy reports, after they have stopped being received, are indeed decaying but are still statistically significant after a few years (Allcott and Rogers 2014).

As we have indicated, the sample size in this study is not very large, and a larger-scale study would be necessary to estimate carbon footprint reduction efforts and the App’s effect more precisely, especially beyond the short run. In addition, we are not able to explore the individual effects of the various features of the App. Moreover, the users of the carbon calculator App are not a random sample of the population. On the contrary, because users have self-selected into the App and are reported to hold particularly pro-environmental attitudes, it is likely that they are motivated to change. To know if the use of this type of carbon calculator could spur behavioral changes in a larger population and to understand the mechanisms, a proper experiment with randomization would need to be conducted.

Even if the use of carbon calculators would result in emission reductions, how could such solutions scale to a larger share of the population? One answer is that banks around the world are currently taking steps to integrate carbon footprint calculators into their digital platforms (see, for example, BNP Paribas,Footnote 27 NatWest,Footnote 28 Nordea,Footnote 29 and KlarnaFootnote 30). Banking platforms are already trusted and used by a majority of the population in most countries in the world, so offering carbon footprint information to customers might provide a competitive advantage and also relate to other green revenue streams such as green funds and loans. Therefore, carbon footprint information could potentially scale through banking platforms. However, since banks have no strong incentive to help customers reduce their footprint or to make such information very salient on their platforms, understanding how this information could affect people is likely to be decisive for the effectiveness of this solution.

Assuming that the carbon calculator App drives the observed effect, the fact that there is only a short-run impact may not be surprising when considering the features of the App. There is little in it that encourages the user to return and use it repeatedly, and there are essentially no reminders sent to users to check their progress. Scaling a carbon calculator would require a design in which the user is repeatedly exposed to the App content. Our results are also in line with a similar study on information provision relating to food consumption, where only an initial statistically significant effect was found (Fosgaard et al. 2021). Again, a proper evaluation using an experiment would be important in order to learn what could keep users at a lower level of their carbon footprint.