1 Introduction

It is widely agreed among the scientific and institutional communities that there is an urgent need to reduce greenhouse gas (GHG) emissions that contribute to climate change (IPCC 2018). Given that political constraints and the influence of industrial lobbying may limit the extent to which GHG emissions can be addressed using price mechanisms, there is growing interest in behavioral and informational interventions aimed at shifting individual consumption (Arora and Mishra 2021; IPCC 2022). According to Cafaro (2011), consumers have the potential to reduce their carbon footprint by 15 billion tons by 2060 through the adoption of new consumption practices.

Prior work has successfully used field informational interventions to reduce energy usage and increase demand for more energy-efficient technologies at the household level, such as cars, appliances, and lightbulbs (Guthrie et al. 2015; Hummel and Maedche 2019; Wynes et al. 2018). Indeed, household consumption is estimated to be responsible for up to 72% of global greenhouse gas emissions, mostly attributable to transport, housing, and food (Hertwich and Peters 2009; UNEP 2017). An estimated 20–30% of all GHG emissions originate from food production, making it a critical target for GHG reductions (Commission 2006; IPCC 2018; Vermeulen et al. 2012; Willett et al. 2019). The share of emissions due to food production is similar to, if not higher than, that of household energy usage, which has been a primary focus of prior informational interventions (Goldstein et al. 2020). Informational interventions could potentially also be effective for food consumption, as investigated by recent literature (Dannenberg and Weingärtner 2023; Kanay et al. 2021; Lohmann et al. 2022; Muller et al. 2017; Panzone et al. 2021a; Panzone et al. 2021b; Perino et al. 2014; Suchier et al. 2023).

Two features of food consumption make it a promising target for informational interventions. First, while there is growing awareness that the food production process contributes to climate change, information on the carbon footprint of particular food groups is not readily available and people generally underestimate the impacts (Camilleri et al. 2019; Macdiarmid et al. 2016). For example, despite recent attention to beef as a high-emissions food group, many may not be aware of the magnitude: producing a single serving of beef (100 g, 3.53 oz, or 0.22 pounds) generates GHG emissions equivalent to driving 49.86 km (30.98 miles), about the average daily commute in the U.S. (Federal Highway Administration 2017). Second, because emissions vary greatly by food group, shifts in composition can have a large impact (Garnett 2011). For example, emissions related to the production of ground beef are ten times higher than those for chicken (Poore and Nemecek 2018). More generally, interventions that encourage individuals to switch to lower-carbon substitutes or reduce their meat consumption may contribute significantly to the reduction of greenhouse gas emissions: eating fewer animal products can reduce individuals’ total carbon footprint by an estimated 22% (Lacroix 2018). Therefore, there is significant potential to reduce carbon emissions by promoting more sustainable food choices among consumers. Furthermore, research shows that consumers are willing to make changes to their dietary choices to reduce their environmental impact, particularly if supported by informative labeling or educational campaigns (Visschers and Siegrist 2015). Nevertheless, food consumption behaviors are very difficult to shift (Liu et al. 2014) and so may not be responsive to light-touch informational interventions.

The goal of this study is to examine whether it is possible to shift actual consumer behavior in response to the environmental impact of food choices. We implement a randomized field experiment among a national sample of the Danish population to test the impact of providing individuals with information about the GHG emissions of their grocery purchases. We provide the information through a novel smartphone application (app) that collects data on individuals’ grocery purchases both before and during our 19-week intervention. We compare the impact of a “Carbon” app that provides item-level carbon-equivalent emissions of individuals’ grocery purchases to a “Spending” app that provides item-level cost information. The Carbon app reports GHG emissions converted into kilograms of Carbon Dioxide equivalents (kg of CO2-e). Both apps provide real-time individualized weekly, monthly, and yearly feedback on purchases, broken down by categories and single items (Footnote 1). To measure revealed preferences for the two types of information, we also include a treatment group that makes both apps available to participants. Indeed, while we see growing interest in the use of smart technology to provide personalized feedback and shape individual behavior, little is known about the use or impact of such mobile apps (Footnote 2).

We observe 175,146 item-level grocery purchases for 258 participants over a 19-week baseline period and a 19-week treatment period. Our primary outcome of interest is the carbon-equivalent emissions of participants’ weekly grocery purchases. We estimate the impact of the Carbon app on carbon emissions using a difference-in-differences analysis based on random assignment to treatment. In the first month of treatment, participants who receive the Carbon app significantly reduce carbon emissions from groceries relative to participants who receive the Spending app. We estimate an average decrease of 5.8 kilograms (kg) in weekly carbon emissions (p = 0.003). The size of the reduction corresponds to 27% of baseline emissions. The magnitude is equivalent to reducing driving by 49 km (30 miles) per week. During this period, the Carbon app decreases both overall purchases and emissions per purchase, with an estimated 45% decrease in emissions from beef (p = 0.019), which has been the focus of prior work. However, over the full 19-week treatment period, the impact of the Carbon app is smaller – an estimated 2.4 kg per week decrease – and not statistically significant.

The pattern of treatment effects over time mirrors the pattern of app usage over time. Engagement with the app is concentrated in the first four weeks of treatment, with over half of total app usage taking place in the first month. App usage is similar in the first four weeks for the Carbon and Spending treatments, with participants in both groups checking the app on average a little over once a week. Over the full treatment period, app usage is lower in the Carbon treatment than in the Spending treatment, though the differences are not statistically significant. We find similar results for the treatment group that received access to both apps. Providing both apps increases total app usage but crowds out the usage of the individual apps, particularly for longer-term usage of the Carbon app. Over the 19-week treatment period, usage of the Spending app is almost 40% higher than the Carbon app (p = 0.074). These results suggest a weak preference for spending information compared to emissions information over the longer term.

We make several contributions to the existing literature. Our experiment is the first to test the impact of an individualized intervention, in a real-life setting, aimed at decreasing the environmental impact of regular grocery purchases. Previous studies have tested the impact of carbon labels and information on a limited set of products, on a one-time purchase at an experimental store, through self-reported data, on the intention to consume or in the lab (Brownback et al. 2023; Brunner et al. 2018; Camilleri et al. 2019; Elofsson et al. 2016; Kanay et al. 2021; Muller et al. 2017; Osman and Thornton 2019; Panzone et al. 2021a; Panzone et al. 2021b; Perino et al. 2014; Spaargaren et al. 2013; Suchier et al. 2023; Vlaeminck et al. 2014). In related work, for instance, researchers find that informing students about the environmental consequences of meat consumption reduced the demand for meat at educational institutions (Jalil et al. 2020; Lohmann and Gsottbauer 2022).

We are also the first to estimate the effect of providing individuals with real-time information about the climate impacts of their groceries. While interventions aimed at changing environmental behaviors are motivated by the externalities of energy usage, prior work has not provided real-time feedback on how individual food behavior translates into environmental impact. Instead, prior interventions have largely provided information about individuals’ direct costs (Allcott and Knittel 2019; Allcott and Taubinsky 2015; Davis and Metcalf 2016; Jessoe and Rapson 2014). Other work has provided individualized feedback in terms of usage rather than environmental impact or carbon emissions (Brandon et al. 2019; Hahn and Metcalfe 2016; List and Price 2016). Tiefenbeck et al. (2018), for instance, couple individualized feedback on water usage with a picture of a polar bear on an ice cap that shrinks as water usage increases. A comparable study employing an app with a carbon footprint calculator covering a range of household activities has recently been published (Enlund et al. 2023); similarly, the gamification of individual CO2 emissions related to mobility has been explored by means of a smartphone app in a study at the working paper stage (Goetz et al. 2022). Related work tests general messages about the need for conservation but does not provide individualized information (Ferraro and Price 2013; Ito et al. 2018). Our study examines whether people are responsive to individualized feedback about the externalities of their behavior.

Third, our smart technology allows us to directly measure engagement with the informational interventions. Participants can only receive the information if they open the app, which we track throughout the treatment period. Prior studies that provide individualized feedback over time – for example through smart meters, home energy reports, or robocalls – are not able to measure whether people actually hear or read the information (Allcott and Rogers 2014; Brandon et al. 2019; Ferraro and Price 2013). We measure revealed preferences for the emissions information by comparing engagement with the Carbon app to engagement with the Spending app. Our analysis also examines the relationship between app engagement and treatment impacts on behavior.

Taken together, our findings demonstrate that providing people with personalized emissions information can affect their food purchasing behavior. However, our results also suggest that the impact of the informational intervention requires sustained engagement. In periods with regular app usage, we find meaningful treatment effects on carbon emissions, which decline along with app engagement. We also find suggestive evidence that the impact on emissions is sustained over the longer term for users who remain engaged with the app. The results of this study provide a foundation for the potential of using low-cost, highly scalable informational interventions to shift food purchasing behavior in a green direction.

The rest of the paper is organized as follows. The second section describes the experimental design. The third section discusses the results, and the fourth section concludes.

2 Experimental Design, Materials, and Methods

2.1 Sample and Recruitment

We recruited a national Danish sample to participate in the study (Footnote 3). To do so, we worked with Statistics Denmark, the Danish governmental organization that produces statistics on Danish society. On our behalf, Statistics Denmark selected a representative sample of 100,000 Danish adults. In two waves, mid-January 2020 and mid-June 2020, we sent an invitation to participate in our study through the mandatory public electronic mail system in Denmark (96,324 invitations were effectively delivered). The invitation letters included a description of the research project and the requirements for participation, which consisted of answering a brief survey, downloading an app, and setting up a profile to use it. We varied the framing of the language describing the purpose of the study (the title and one sentence in the description changed) across letters using an environmental frame, an economic frame, or a neutral frame to study any potential differences in motivations to participate along these dimensions (see Figure 4 in the Appendix for the letters). No significant difference in study participation or impact on consumption was found across the invitation framings. The variation of the letter framings did not depend on and was not linked to the randomization into treatment. We acknowledge that there is selection into the study. We characterize this selection by comparing the climate attitudes reported in the baseline survey by those who completed the survey (n = 2711) and by those who fully participated in the study by using the app (our actual sample, n = 258); we report this evidence in the Descriptive Statistics subsection below.

In order to participate in the study, participants clicked on a link at the bottom of the letter. The link took them to a survey about perceptions and attitudes toward food in relation to health, the environment, and money. We also asked participants to rank five food items (potatoes, beef, chicken, cheese, and orange) on three dimensions: pollution, cost, and health. The survey questions and tasks are listed in Table 4.

Upon survey completion, we randomly assigned participants to receive the Carbon app, the Spending app, or both apps. Respondents downloaded the assigned app(s) to their smartphone, activated an app-user profile that included optional demographic questions, and connected the app to an e-receipt system in widespread use in Denmark. The e-receipt system collects data from all supermarkets in the country using individual payment card data. Participants who did not yet have the e-receipt system set up could easily sign up; a quick guide to doing so was provided in the online survey platform. The automatic e-receipt system registers all food purchases at the individual level without the need for any manual entries. Once set up, none of our participants disconnected the data collection, suggesting that the participants did not perceive the data collection as problematic after experiencing the app(s). Moreover, our data collection provides historical data on grocery purchases prior to the intervention. The app platform therefore served as both the data collection device and the information provider. Because the app was required to collect the outcome data on grocery purchases, we were not able to include a group that received no app and no information but, given the longitudinal structure of the data, we can observe behavior before the information was provided. We define each participant’s individual t0 as the time at which they enter the experiment and start using the app.

2.2 Treatment and Control Apps

The two apps are identical except for the consumption metric shown. Respondents in the Carbon treatment received the Carbon app, which provides an overview of the CO2 emissions associated with their food purchases. In the Spending treatment, participants received the Spending app, which provides an overview of the expenditures related to their own food consumption. In the Both treatment, respondents received both apps. The two apps were developed by the same company (Spenderlog) and share the same structure and visual design. In both apps, the overview is organized by food groups (e.g., dairy, meat and fish, fruit, and vegetables), individual foods (e.g., cheese, fresh milk, beef, chicken, apples), and item-level purchases. Users can see weekly, monthly, or yearly summaries. It was not possible to include a control group that used a “white” app: the app not only served as a data collector but also provided information, making it challenging to have a separate group solely dedicated to data recording without any informational features. The apps also show comparisons of the user with other households active in the app for all of Denmark (default), as well as by region, household income, household type (e.g., apartment, house), and family type (e.g., single, couple, couple with kids). One additional app feature allows users to set any kind of quantitative goal in relation to their groceries (e.g., a reduction in candy consumption). The effect of the Carbon treatment could therefore be the sum of the emissions information provision, social comparison, and goal setting. We note, however, that only 6 participants set climate-related goals, suggesting that goal setting is unlikely to influence the treatment effect. Furthermore, while the potential for engaging in social comparisons was present across all treatments, variations in the content of these comparisons may have arisen from the differing information provided and subsequent behavioral responses. We do not have access to details regarding these social comparisons, so we cannot observe their impact. However, it is worth noting that within our sample, participants with initial emission levels above the median exhibited no distinct treatment effects compared to those below the median. This observation indicates that social comparisons might not have had a central influence.

Figure 1 shows examples of the app layout from the Carbon app. The analogous screenshots for the Spending app are in Figure 5 in the Appendix.

Fig. 1 Carbon app screenshots. Notes: The figure includes three screenshots taken from the Carbon app, which illustrate the layout and content of the app.

Within this framework, the Spending app provides cost information about participants’ grocery purchases and the Carbon app shows the carbon footprint linked to each item. In the Carbon app, the emissions information is shown both in terms of kilograms of CO2 and kilometers driven by an average passenger vehicle, which is a common measure to ensure that non-experts can relate to the data (Footnote 4). The calculations are based on the methodology developed by the International Organization for Standardization (ISO 2006a), which estimates the greenhouse gas (GHG) emissions of the full food production process and supply chain. This includes the environmental impact of land use change, farming, inputs (e.g., imported feed and fertilizer), outputs (e.g., livestock manure sold to another holding), processing, and transportation. To provide a summary measure, the GHG emissions are converted into kilograms of Carbon Dioxide equivalents (kg of CO2-e). Major greenhouse gases such as methane and nitrous oxide are expressed in terms of their effect relative to one kg of CO2. For example, since methane is 25 times more efficient at retaining heat in the atmosphere than CO2, one kg of methane corresponds to 25 kg of CO2-equivalents. Similarly, one kg of nitrous oxide equals 298 kg of CO2-equivalents (Boardman 2008; Mogensen et al. 2009; ISO 2006). The authors were not responsible for the emission calculations, which were developed by the partner company.
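To make the aggregation concrete, the sketch below applies these global warming potential factors to a hypothetical bundle of gases; it illustrates only the conversion rule cited above, not the partner company’s actual item-level calculation, and the example quantities are invented.

```python
# Global warming potential factors cited in the text (kg CO2-e per kg of gas).
GWP = {"co2": 1.0, "ch4": 25.0, "n2o": 298.0}

def co2_equivalent(emissions_kg: dict) -> float:
    """Aggregate per-gas emissions (in kg) into a single kg CO2-e figure."""
    return sum(GWP[gas] * kg for gas, kg in emissions_kg.items())

# Hypothetical item: 2 kg CO2, 0.1 kg methane, 0.01 kg nitrous oxide.
print(co2_equivalent({"co2": 2.0, "ch4": 0.1, "n2o": 0.01}))  # 2 + 2.5 + 2.98 = 7.48 kg CO2-e
```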

2.3 Estimations

Of the 100,000 invited participants, 96,324 received the email invitation, 2711 completed the enrollment survey, 332 downloaded their randomly assigned app(s) and 258 created a profile and connected it to their e-receipt system. We randomized participants on a rolling basis using the survey software (Qualtrics). Because randomization occurred before participants downloaded the app, we were not able to block the randomization on demographic characteristics or baseline behavior. We tracked participants for at least 19 weeks after they initially enrolled and installed the app(s). We also include 19 weeks of pre-intervention grocery purchases as the baseline comparison in our analysis.

In a difference-in-differences approach, our primary analysis examines weekly greenhouse gas emissions resulting from food purchases in the 19 weeks prior to and the 19 weeks after entering the experiment, across the Carbon and Spending treatments. We estimate changes in emissions using the following regression with individual random effects:

$${y}_{it}= {\alpha }_{i}+{\beta }_{1}{\sigma }_{t}+{\beta }_{2}{\gamma }_{i}+{\beta }_{3}{\gamma }_{i}{\sigma }_{t}+{\varepsilon }_{it}$$

where the dependent variable \({y}_{it}\) is the weekly emissions of individual i in week t; \({\alpha }_{i}\) captures the random effect for individual i; the dummy \({\sigma }_{t}\) is an indicator for the intervention phase (the 19 weeks after entering the experiment); the dummy \({\gamma }_{i}\) is an indicator for the Carbon treatment (1 for the Carbon treatment, 0 otherwise); and \({\beta }_{3}\), the coefficient on the interaction, is the difference-in-differences estimate of interest. We cluster standard errors at the individual level.
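A minimal sketch of how such a specification could be estimated, assuming a long-format panel with hypothetical column names (person_id, week, co2e_kg, after, carbon); this is one possible implementation using the linearmodels package, not the authors’ code.

```python
import pandas as pd
from linearmodels.panel import RandomEffects

# Long-format panel: one row per participant-week (hypothetical column names).
df = pd.read_csv("weekly_emissions.csv")
df["after_x_carbon"] = df["after"] * df["carbon"]  # interaction term (beta_3)
panel = df.set_index(["person_id", "week"])

# Individual random effects with the post-period dummy, the treatment dummy,
# and their interaction; standard errors clustered at the individual level.
mod = RandomEffects.from_formula(
    "co2e_kg ~ 1 + after + carbon + after_x_carbon", data=panel
)
res = mod.fit(cov_type="clustered", cluster_entity=True)
print(res.summary)  # the coefficient on after_x_carbon is the DiD estimate
```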

3 Results

3.1 Descriptive Statistics

Our experimental sample spans the full distribution of the national adult population of Denmark in terms of geography (drawing from all regions of the country), age (ranging from 20 to 72), household income, and household composition. As shown in columns (1) & (2) of Table 1, our sample is nevertheless not nationally representative. We also note that the demographic data are not complete for all respondents (Footnote 5). To account for the non-representative sample and address potential selection into treatment, we reweight our sample to match national averages on a number of dimensions: gender, age, income, employment, children, and region (as shown in the Appendix in Table 6, columns (6) & (12)). Under this Inverse Probability Weighting (IPW), the estimated treatment effects do not change.

Table 1 Descriptive statistics of experimental sample, experimental groups and national population

Columns (3)-(5) of Table 1 report baseline characteristics for each treatment group. We report significance levels from pairwise t-tests comparing the Carbon app and Both app treatment groups to the Spending app treatment group. There are no statistically significant demographic differences between the Spending treatment and the Carbon treatment. The proportions of participants aged 50–59 and 70–79 differ significantly between the Both treatment and the Spending treatment at the 10% level.

The middle panel reports baseline grocery purchases for the experimental sample. We focus our experimental analysis at the weekly level in an effort to find a unit that includes at least one grocery shopping trip per individual and is not driven by heterogeneity in how people spread their shopping throughout the week. In the 19 weeks prior to study enrollment, participants averaged about 2.5 grocery trips per week with average weekly spending of $56 (USD) and weekly carbon-equivalent (CO2-e) emissions of 20.7 kg. The weekly emissions are equivalent to driving 172 km (107 miles), which is about two-thirds of the estimated 252 km that the average Dane drives per week (Christiansen and Baescu 2022). Figure 6 in the Appendix provides a visual representation of the carbon-equivalent emissions composition of the average customer’s weekly grocery basket. Unsurprisingly, a large portion of the carbon emissions basket is linked to dairy and meat products, with beef the largest contributor among meats; beef is also the area where we find a significant and persistent reduction in emissions (see the Mechanisms section).
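The driving equivalences used here and in the introduction imply a conversion factor of roughly 0.12 kg CO2-e per kilometer driven (backed out from the figures in the text rather than independently sourced); as a rough check,

$$\frac{20.7\ \text{kg CO}_2\text{-e per week}}{0.12\ \text{kg CO}_2\text{-e per km}}\approx 172\ \text{km per week}.$$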

There is some baseline imbalance in grocery purchases between the Carbon treatment and the Spending treatment. Participants in the Carbon treatment have higher weekly spending (p = 0.051) and higher carbon emissions (p = 0.073). As discussed in the Methods section, because we had to randomize participants into the assigned app before we knew their demographic information or baseline purchases, we were not able to block the randomization on baseline characteristics to ensure balance. However, as shown in the Appendix in Table 6, when we include demographic controls in the analysis to address the initial imbalance, the baseline difference in emissions between the Carbon treatment and the Spending treatment is small and not statistically significant (columns (1) & (7)).

In the bottom panel of Table 1, we report average responses from the baseline survey that participants completed prior to receiving the app. We report average responses on a climate attitude index and a food emissions awareness index, each measured on a 1–5 Likert scale (Footnote 6). Participants score highly on climate attitude, with an average score of 4.15, indicating a high willingness to address CO2 emissions. Scores are lower, an average of 3.02, for the food emissions awareness index (participants in the Carbon treatment have higher self-reported food emissions awareness scores than those in the Spending treatment, p = 0.043). Consistent with their self-reported lack of food emissions awareness, fewer than 20% of participants correctly rank the emissions impact of five food items (potatoes, beef, chicken, cheese, and oranges). Taken together, these results suggest that participants want to address climate change through their personal behavior but are not fully informed on how to do so through their food purchases.

The baseline survey answers also allow us to study selection into our study. Those who completed the survey but did not participate in the study (n = 2453) on average scored lower on the Climate Attitude Index than participants (4.03, compared with 4.15 among participants, p = 0.0195). Non-participants also made more mistakes in the Environmental Ranking, averaging 2.47 mistakes compared to 2.19 mistakes among participants (p = 0.0014). Non-participants and participants scored similarly on the Food Emissions Awareness Index, 3.01 vs. 3.02 respectively (p = 0.4182). These results suggest that, compared to the broader population, our study participants may be more aware of and motivated to address the climate impact of their food purchases, and, more generally, that those who are more climate-engaged may be more likely to voluntarily take up informational tools related to climate change. We estimate selection into the experiment using the initially measured attitudes (see Table 7 in the Appendix) and observe only modest selection. To evaluate the influence of this selection on our main conclusions, the selection estimates are used as weights in an Inverse Probability Weighting analysis (we return to this at the end of the Results section).
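A minimal sketch of the kind of Inverse Probability Weighting analysis described here, assuming a survey dataset with hypothetical column names; the exact covariates and estimator used by the authors may differ.

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per survey completer (n = 2711); participated = 1 for the 258 app users.
survey = pd.read_csv("baseline_survey.csv")

# Step 1: model selection into the experiment from the baseline attitude measures.
sel = smf.logit(
    "participated ~ climate_attitude + food_awareness + ranking_mistakes",
    data=survey,
).fit()

# Step 2: inverse-probability weights for participants, so the weighted
# participant sample resembles the full pool of survey completers.
survey["p_participate"] = sel.predict(survey)
participants = survey.loc[survey["participated"] == 1].copy()
participants["ipw"] = 1.0 / participants["p_participate"]

# The ipw column can then be supplied as weights when re-estimating the
# weekly-emissions regressions (as in Table 11 in the Appendix).
```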

3.2 App Engagement

Figure 2 Panel A displays the share of people in each experimental week who check the app at least once, pooling all treatment groups (Figure 7 in the Appendix shows app usage over time by treatment group across three panels). App checking is concentrated in the first month of the experiment, with more than 90% of participants checking the app at least once in the first week after they set up their profile in the app, about half of participants checking the app at least once in the second week, almost 40% checking in the third week, and a little over a third checking in the fourth week. App usage steadily declines in the second month of the study, with an average of 23% checking at least once in a given week, and then plateaus at about 10–15% for the remainder of the study through week 19. Given this pattern of app usage, our analysis focuses both on the first month (weeks 1–4), when there is greater engagement (the short-run analysis is insensitive to grouping 3, 4, or 5 weeks together), and on the longer-run outcomes (weeks 1–19) with lower average engagement.

Fig. 2 App engagement. Notes: Panel A shows the time development of the proportion of participants who check the app at least once in a given week, based on each participant’s date of enrollment. The figure pools all treatments. For the Both treatment group, we measure whether a participant checks either app at least once. Panel B shows the average number of app checks by treatment group in the first 4 weeks of treatment and over all 19 weeks of treatment. Bars indicate 95% confidence intervals.

When participants do check the app, it is generally within a few days of a shopping trip. The average app check occurs 1.02 days after the previous shopping trip and 4.4 days before the next one. 41.5% of app checks occur on the same day as a shopping trip; of these same-day checks, 41.5% occur before the shopping trip and 58.5% after it over the overall period. These patterns are similar for the Spending app and the Carbon app, both in the short and longer term.

To examine revealed preferences for receiving spending information compared to receiving emissions information, Fig. 2 Panel B shows the average number of total app checks in the first 4 weeks and the full 19 weeks of the intervention by treatment group. Across all groups, over half of total app checks for the 19-week treatment period take place in the first month. Comparing the Spending treatment and the Carbon treatment, there is little difference in average app engagement in the short term. App usage in both groups averages a little over once a week (1.15 checks per week) during the first month of treatment (p = 0.57 from a rank-sum test of differences across treatments). Over the longer term, however, there is greater engagement with the Spending app than with the Carbon app. In the full 19-week treatment period, usage of the Spending app averages about 0.47 checks per week compared to 0.36 checks per week for the Carbon app, though the differences are not statistically significant (p = 0.25).

A similar pattern emerges for participants who have access to both apps. Usage of the Carbon and Spending apps is similar in the first month of treatment, averaging about once per week (p = 0.31), but engagement with the Spending app is almost 40% higher than with the Carbon app over the 19-week treatment period, averaging 0.39 and 0.28 checks per week respectively (p = 0.074). Providing both apps increases overall app usage compared to providing either of the apps alone (p < 0.01 for all comparisons in both the short and longer term). However, it crowds out the usage of the individual apps. In particular, the usage of the Carbon app is about 30% lower in the short term and about 25% lower over the full treatment period when participants receive both apps compared to receiving the Carbon app alone (p = 0.016 in the short term and p = 0.082 in the longer term).
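The engagement comparisons in this subsection rely on per-participant averages and rank-sum tests; a minimal sketch of such a comparison, with hypothetical column names, is below.

```python
import pandas as pd
from scipy.stats import ranksums

# One row per participant: assigned treatment and average weekly app checks.
checks = pd.read_csv("app_checks.csv")

carbon = checks.loc[checks["treatment"] == "carbon", "weekly_checks"]
spending = checks.loc[checks["treatment"] == "spending", "weekly_checks"]

stat, p = ranksums(carbon, spending)
print(f"Rank-sum statistic = {stat:.2f}, p = {p:.3f}")
```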

Taken together, our results suggest a weak preference for spending information compared to emissions information over the longer term. This could reflect that people like the spending information more, or that once people see the emissions information, they learn about the climate impacts of different food choices and do not need repeated engagement with the app. In Table 6 in the Appendix, we explore correlates of above-median app usage and do not find strong associations between demographics and app engagement. While app engagement in the Carbon treatment is positively associated with food emissions awareness, we find suggestive evidence that climate attitude is negatively associated with app engagement in the Carbon treatment, potentially because climate-interested people gain less new knowledge from the Carbon app than people who are not initially climate-interested.

3.3 Treatment Effects on Carbon Emissions

Our main analysis estimates the difference-in-difference effect on carbon equivalent (CO2-e) emissions of providing participants with the Carbon app compared to providing them with the Spending App. Figure 3 displays, across the Spending and Carbon treatments, the average weekly greenhouse gas emissions 19 weeks prior to receiving the app and 19 weeks after receiving the app. Average emissions are increasing over the baseline period, perhaps because our experiment took place during COVID-19, which shifted consumption from restaurants and cafes to groceries. During the treatment period, participants assigned to the Spending app continue to increase their emissions whereas emissions flatten among participants assigned to the Carbon app.

Fig. 3 Weekly greenhouse gas emissions. Notes: The lines are indexed to week -19 due to initial emission imbalances. A non-indexed graphical representation of the emissions can be found in Figure 8 in the Appendix.

An intuitive illustration of the weekly emissions estimates across all three treatments, before and after the experiment was launched, is shown in Appendix Figure A.5. It is clear from the illustration that the estimated emissions in the Spending treatment increase while staying constant in the Carbon treatment (with a tendency to decrease). The Both treatment also shows an increase, although of a smaller magnitude than the increase in the Spending treatment.

In Table 2, we report results from difference-in-differences individual-level random effects regressions estimating the impact on weekly CO2-equivalent (CO2-e) emissions of the Carbon app treatment compared to the Spending app treatment. Each participant-week is an observation, and we cluster standard errors at the individual level. The first three columns restrict the sample to the first month of treatment. The last three columns include the full 19-week treatment period. Columns (1) & (4) include all participants. Columns (2–3) and columns (5–6) split the sample based on app engagement, measured as above- or below-median app usage during the relevant period. All regressions include 19 weeks of pre-intervention observations (Footnote 7).

Table 2 Effect of carbon app on weekly carbon emissions

Table 2 column (1) reports the estimated effects of providing the Carbon app in the first month. Our coefficient of interest is the interaction term After × Carbon Treatment, which estimates the pre- vs. post-treatment difference in weekly emissions in the Carbon app treatment compared to the difference in the Spending app treatment: this is our difference-in-differences measure of weekly emissions. We estimate that providing information about CO2-e emissions via the app reduces the subsequent CO2-e emissions of weekly food purchases by about 5.8 kg (p = 0.003) compared to the weekly CO2-e emissions in the Spending treatment. It is noteworthy that, in absolute terms, emission levels are stable in the Carbon treatment but increase in the Spending treatment after the intervention is introduced, leading to a relative reduction of weekly emissions in the Carbon treatment compared to the Spending treatment. The size of the reduction corresponds to 27% of the pre-treatment baseline emissions of 21.25 kg in the Carbon treatment. However, the impact of the Carbon app does not persist over time. As shown in column (4), over the 19-week experiment period, we estimate a decline in emissions of about 2.4 kg, an 11.3% decrease that is not statistically significant (p = 0.177).

We also present estimates of providing both apps compared to the Spending app alone, as well as estimates for the Both treatment and for a Pooled treatment (combining the Carbon and Both app groups) (Table 6, columns (4–5) and (9–10), in the Appendix). The pattern of results is similar, though the effect sizes are smaller and not statistically significant for the Both treatment. The larger impact of the Carbon treatment compared to providing both apps may be due to the higher engagement with the Carbon app when it is provided alone. There may also be effects of providing the Spending app to participants that interact with the impact of the Carbon app.

Relatedly, we note that across all regressions there is a positive and significant coefficient for the variable After, which suggests that weekly CO2-e emissions are increasing during our study period. As discussed above, this may be because we implemented the study during the onset of the COVID-19 pandemic, when grocery purchases increased (Chenarides et al. 2021), which mechanically increases CO2-e emissions from groceries. An alternative interpretation is that providing the Spending app to participants affects grocery purchases. With our data, we cannot explicitly disentangle the effect of the Spending app from seasonality (including potentially the COVID-19 effect) because we do not have a participant group that receives no information and no app. Seasonality is, however, unlikely to influence our findings: it affects the participants in our treatments equally, so the comparison across treatments cancels out this influence. We compare the Carbon app users against the Spending app users for both technical and policy reasons. First, as mentioned in the design section, we could not include a no-app control group since our app also served as a data collector and we did not have access to a “white” app that could simply record data without providing any information. Moreover, from a policy perspective, most interventions in this context have focused on providing spending and usage information (Allcott and Knittel 2019; Brandon et al. 2019; Davis and Metcalf 2016; Hahn and Metcalfe 2016; Jessoe and Rapson 2014; List and Price 2016). Our results suggest that, if anything, in our context this may increase carbon emissions and that providing carbon information is a more effective intervention. However, the effects fade out over time.

The fade-out in treatment effects corresponds with a fade-out in engagement as discussed in the section above. To further explore the role of engagement, we split the sample by above and below median app usage. Given that the engagement with the app is endogenous, our analysis is correlational. In the first month, above median and below median users check the app on average 1.77 times and 0.59 times per week, respectively. Over the full treatment period, above-median users sustain their usage at 0.84 times per week compared to below median users who check on average 0.18 times per week during a 19-week intervention.

We estimate that among highly engaged (above-median) participants, providing information about CO2-e emissions reduces weekly CO2-e emissions by about 8.5 kg (p = 0.008) in the first month of treatment compared to the Spending treatment, a 38% decrease relative to pre-treatment baseline weekly emissions of 22.3 kg (column (3)). That compares to the insignificant 3.2 kg reduction in CO2-e emissions for those who are below the median app usage, corresponding to a 16% decrease (column (2)). When we examine the longer-term effects of the Carbon app, we find suggestive evidence that the most engaged participants who sustain their engagement also sustain meaningful treatment effects. As shown in column (6), we estimate that above-median users reduce their CO2-e emissions by an average of 5.6 kg (p = 0.076) per week over the 19-week intervention, which is similar to the short-term effect estimated for the full sample. For the least engaged, the long-run reduction in CO2-e emissions is smaller than for the whole sample and not statistically significant (0.16 kg; p = 0.935). These results suggest that the most engaged users are driving the treatment impact of the Carbon app, though we caution that the estimated treatment effects are statistically indistinguishable across subgroups.

To address the concern that above-median users of the Carbon app may not be comparable to above-median users of the Spending app, we also split the sample based on predicted engagement with the Carbon app. In Table 8 in the Appendix, we regress usage on our models’ covariates among those who received the Carbon app and then use the coefficients to create a predicted Carbon app usage score for all participants. We then split the sample by those predicted to be above- or below-median users and replicate our analysis in Table 9 in the Appendix. Among those predicted to be engaged (above-median) users, the Carbon app reduces CO2-e emissions by an estimated 9.7 kg (p = 0.002), which corresponds to a 45% reduction in the first month of treatment. In the long run, those predicted to be above-median users show a reduction of 4.7 kg in CO2-e emissions (p = 0.095). For the less engaged, the reductions are not significant in either the short or the long run.
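A minimal sketch of this predicted-engagement split, with hypothetical column names; the covariate set is illustrative rather than the authors’ exact specification.

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per participant: treatment assignment, observed app checks,
# and baseline covariates (hypothetical column names).
users = pd.read_csv("participants.csv")

# Step 1: among Carbon-app recipients, regress observed usage on covariates.
usage_model = smf.ols(
    "app_checks ~ age + female + income + children + climate_attitude",
    data=users.loc[users["treatment"] == "carbon"],
).fit()

# Step 2: predict a Carbon-app usage score for every participant and split
# the sample at the median predicted score.
users["predicted_usage"] = usage_model.predict(users)
users["above_median_predicted"] = (
    users["predicted_usage"] > users["predicted_usage"].median()
)
```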

As an additional robustness check, we also compare the Carbon and the Spending treatments with the Both treatment. It is important to note that the impact of the information provision in the Both treatment is influenced by the fact that participants endogenously chose what information to attend to, whereas the information is exogenously imposed in the two other treatments. With this caution in mind, we compare the three treatments in regressions similar to those in Table 2 (see Table 10 in the Appendix). In a regression with the Both treatment as the base category, the weekly emissions after the experiment started are not significantly different between the Spending treatment and the Both treatment, but are borderline significantly smaller in the Carbon treatment compared to the Both treatment. This finding is unsurprising given that participants in the Both treatment attend significantly more to the Spending app than to the Carbon app, and as such are exposed more to information similar to that of the Spending treatment.

To address potential selection into the experiment, we reweight the main treatment analysis with the Inverse Probability Weighting method to match the overall survey sample on the individual attitudes measured in the initial survey (Table 11 in the Appendix). The IPW estimations (columns (2) & (5)) replicate our overall main results (columns (1) & (4)), with only marginal changes in the interaction term, suggesting that our conclusions from the main model are not changed when accounting for the selection effect. When we add socio-demographic controls (columns (3) & (6)), they appear to absorb some of the variation in the treatment and constant terms, without affecting their significance levels.

Taken together, our results suggest that for time periods and people with high app engagement, providing emissions information can have a meaningful and sustained impact on the carbon footprint of grocery purchases.

3.4 Mechanisms

Finally, we explore the mechanisms leading to our observed reduction in weekly CO2-e emissions when providing the Carbon app. Table 3 has the same structure as Table 2 except that we examine different outcomes, which are reported for each column. As shown in columns (1) & (2), we find that overall purchase quantities and money spent are significantly lower among participants who receive the Carbon app (Footnote 8). Purchases decline by about 28% in the first month of the experiment, while smaller and insignificant reductions are observed over the full 19-week intervention (columns (6) & (7)). We note that lower-emissions foods also tend to be less expensive, so the reduced spending could reflect both changes in quantity and changes in basket composition. Indeed, when we decouple the carbon emissions from the total quantities, we find suggestive evidence that, net of total quantities, the Carbon app also reduces carbon emissions per item and per dollar spent (columns (3) & (4)).

Table 3 Mechanisms

In particular, we find a large and significant decrease in emissions from beef consumption in both the short and the longer run (columns (5) & (10)). We estimate a 1.2 kg per week reduction in CO2-e emissions from beef in the first month of treatment (p = 0.019), a 45% decrease that is equivalent to over 21% of the treatment impact on overall emissions. The Carbon app does not directly highlight beef as a high-emissions food group – as prior informational interventions have done (Camilleri et al. 2019; Jalil et al. 2020) – and yet has a large impact on this critical target for reducing the carbon footprint of food production and consumption. Achieving such effects through price changes, or taxes, would require a price increase of over 30 percent, based on estimated price elasticities for beef consumption (Taylor and Tonsor 2013). For future research interventions and app designs, our findings suggest that targeting beef consumption would be particularly powerful.
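As a back-of-envelope illustration of this price-equivalence claim: with a constant own-price elasticity \(\varepsilon\), the price change needed to induce a given quantity change is approximately

$$\frac{\Delta P}{P}\approx \frac{\Delta Q/Q}{\varepsilon }=\frac{-0.45}{\varepsilon },$$

so an elasticity of magnitude around 1.4 (an illustrative value chosen to reconcile the two figures above, not a number taken from Taylor and Tonsor 2013) implies a required price increase of roughly 32%.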

To facilitate a discussion of the substitution dynamics within key product categories, we refer to Table 12 in the Appendix, which reports the treatment effect on weekly carbon emissions by food category. Compared to the Spending treatment, the Carbon treatment shows a significant reduction in emissions from Beef, Charcuterie, Milk, and Processed food in the short run, but only from Beef and Milk in the long run. We have grouped various alternative dairy products, including oat and almond milk, within the Plant-Based category. Although we observe a pronounced negative impact of the Carbon treatment on the consumption of traditional dairy items, the effect on the adoption of plant-based alternatives, while positive, is not statistically significant. The direction of the effect is nonetheless consistent with a spill-over from conventional to plant-based products. Similar to the overall treatment effect, for some food categories, such as Charcuterie, Milk, and Cheese, providing information about CO2-e emissions via the app reduces the subsequent CO2-e emissions of weekly food purchases compared to the weekly CO2-e emissions in the Spending treatment. It is, however, noteworthy that this reduction is to be understood in relative terms: weekly emissions from some food categories increase in the Spending treatment after the intervention is introduced while remaining constant in the Carbon treatment, leading to a relative reduction of weekly emissions in the Carbon treatment compared to the Spending treatment.

4 Discussion and Conclusion

Our study demonstrates that providing access to personalized emissions information can have a meaningful impact on grocery purchases. The estimated 5.8 kg decrease in weekly CO2 equivalent emissions, relative to providing spending information, is equivalent to switching from a beef burger to a plant-based burger (e.g. pea-based burger) or reducing driving by 49 km (30.44 miles) per week (Poore and Nemecek 2018). The magnitude of the short-run effect is almost twice as large as the effect of adding a social comparison to a monthly home energy report (HER). Allcott (2011) estimates that the average treatment effects of HERs translate into 0.62 kWh per day, or 10.4 h of lightbulb use per day for a standard 60-W incandescent lightbulb. Our estimated 5.8 kg decrease in CO2-e emission per week for the Carbon treatment corresponds to 1.172 kWh per day, or 19.53 lightbulb hours (Environmental Protection Agency 2019).
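The kWh equivalence can be reproduced with an electricity emissions factor of roughly 0.71 kg CO2-e per kWh (backed out from the figures above rather than taken directly from the EPA source) and a standard 60-W bulb:

$$\frac{5.8\ \text{kg/week}}{7\ \text{days}}\approx 0.83\ \text{kg/day},\qquad \frac{0.83\ \text{kg/day}}{0.71\ \text{kg/kWh}}\approx 1.17\ \text{kWh/day},\qquad \frac{1.17\ \text{kWh/day}}{0.06\ \text{kW}}\approx 19.5\ \text{lightbulb-hours per day}.$$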

We emphasize that our study should be considered a first step towards understanding how real-time, individualized information provision can be used to lower carbon emissions from food consumption. In future research, it would be beneficial to address the external validity of our findings, drawing on four conditions proposed by John List (2020): selection, attrition, naturalness, and scaling. We briefly discuss each condition. Selection: despite having a diverse participant sample, it is plausible that the sample composition favored certain individuals. Using inverse probability weighting robustness checks, we found that our main conclusion holds after modeling the selection. Attrition: while our randomized treatment assignment before the app download ensured internal validity, it certainly contributed to the difference between the initial response rate and the actual app usage phase. Naturalness: one of our study’s greatest inherent strengths lies in its real-world experimental setting. Participants were aware of their involvement, yet the focus remained solely on their initial engagement, allowing them subsequently to make choices within their day-to-day contexts. This dynamic strengthens the external validity of our evidence in this regard. Scaling: to scale our information approach as a strategy for reducing food-related carbon emissions, larger samples are needed to quantify how the approach performs across different and diverse populations.

Our study also highlights the challenges of sustaining the impact of the emissions intervention. Our results suggest that in periods and among people who remain engaged with the app, the emissions information has meaningful effects. However, the impacts fade quickly along with engagement. This differs somewhat from evidence in the home energy context suggesting that decreases in energy usage may be sustained after people stop receiving home energy reports (Allcott and Rogers 2014; Brandon et al. 2019). This may be in part because those households received reports for a longer period and built up habits during that time. It may also be in part due to the nature of the technology: people can reduce their home energy use, for example, through one-time installations of energy-efficient light bulbs and appliances that have a persistent impact (Brandon et al. 2019). In contrast, grocery purchase decisions are largely made in real time. Future work could examine integrating personalized feedback on climate footprint into grocery receipts or newly implemented scan-and-go tools, which allow customers to scan and purchase items with their smartphones in the grocery store. By applying an intervention based on such real-time feedback, one could test the impact of receiving the carbon footprint information at the moment of purchase, rather than depending on app engagement. As our work demonstrates, it is critical to understand how engagement with these interventions affects their impact and the external validity of the treatments. In future research, it would also be interesting to identify the effect of using a feedback app per se, something which our experimental design could not detect.