1 Introduction

Water scarcity already affects a quarter of the world’s population, causing economic damage (Franzke 2021) and harming human health and well-being (Ebi and Bowen 2016). Among the different actions aimed at mitigating this issue, the reduction of excessive water consumption, which contributes to local water stress, has become a primary sustainability objective (Footnote 1). In this respect, behavioral nudges have been increasingly acknowledged as powerful and cost-effective actions that can supplement or replace traditional economic levers (Footnote 2) to correct market failures, engage people in pro-social behavior, and align them with socially valuable goals (World Bank 2015; Benartzi et al. 2017). The academic literature offers wide and robust evidence on the capacity of the behavioral sciences to correct cognitive biases while preserving fundamental individual freedom of choice (Sunstein 2018; Thaler & Sunstein 2008).

Behavioral insights are particularly relevant to the water sector. In various countries, water tariffs are regulated, which limits the ability of water companies to encourage water saving by raising prices. Moreover, since water is a service of general interest, a price rise might undermine the goals of accessibility and universal service. Non-monetary measures are thus likely to be politically more effective in promoting water conservation attitudes than traditional economic measures. On top of that, in various countries, people do not have access to up-to-date and reliable information on their daily water consumption. This can partly be traced back to widespread technological backwardness in data collection and communication systems. Data on domestic water consumption can be observed directly via water meters, or indirectly via the water bill. Both ways imply non-negligible search and evaluation costs. Concerning the former, in many countries (Italy included), homes are usually equipped with analogue water meters, which can be installed outside private homes (or inside, but not in easily visible places). Moreover, water meters report households’ cumulative consumption, limiting users’ knowledge of their daily consumption. Concerning the latter, water bills usually report the household’s total water consumption over a given period (e.g. a quarter). This information might not be easy to understand, evaluate and compare. Moreover, since self-reported readings and door-to-door readings occur sporadically, water bills are usually calculated from estimated rather than actual data (Footnote 3).

Within this framework, the main goal of this research is to assess the extent to which informative feedback on water consumption can nudge water savings.

The Information Campaign

To address our research question, we designed an online information campaign which involved around 1,000 households in the metropolitan area of Milan over a five-month period, from September 2021 to January 2022. The campaign was launched in partnership with the CAP Holding water company, one of the main Italian water companies, which manages the integrated water services in the metropolitan area of Milan and in other provinces of the Lombardy region. The CAP Holding water company was the first Italian company to start replacing analogue water meters with electronic smart meters. This technology allowed for an automated, remote collection of the households’ water consumption data, which were processed for the information campaign. In particular, treated units received their water consumption diary by email every month: a brief report on the household’s water consumption, including a social comparison component. Households were ranked according to their water consumption on a 1–1000 scale and were informed of both their ranking position and the related consumption category: households belonging to the first, second and third consumption tertiles were ranked ‘low’, ‘medium’ and ‘high’, respectively. Details on the information campaign are reported in Sect. 3.

The Empirical Strategy

We focused our analysis on a sample of more than 13,800 households distributed over 45 municipalities in the metropolitan area of Milan (Italy) that, at the time of the implementation of the intervention, were already equipped with smart meters remotely collecting data on their water consumption. The sample is not determined by self-selection, since families could not apply for smart meters and the water company was not replacing the conventional meters following any pre-determined geographical criteria.

Within this sample, we were legally bound to send informative material only to around 1,000 households, which, at the time of signing their water contract, gave their consent to receive informative and ancillary communication from the water company.

The CAP Holding water company was committed to involving as many households as possible, and it therefore required that the information campaign be sent to all 1,000 eligible households. Due to this exogenous constraint, an RCT was not a viable option, and we had to adopt an alternative empirical strategy. We initially considered as the treated group the entire subset of eligible households (those equipped with a smart meter that had provided privacy consent), while the remaining households (those equipped with a smart meter that had not provided privacy consent) were included in the control group.

Selection into the informative condition was based on a general willingness to receive ancillary communication, with this consent being given several years before the information campaign. For this reason, the adopted selection criterion is likely to be orthogonal to the intervention itself. Nevertheless, we are aware that people voluntarily decided whether or not to consent to receiving such communications. Therefore, following previous relevant literature, we adopted a conditional Two-Way Fixed Effects Diff-in-Diff (TWFEDID) strategy to assess the impact of the information campaign on water consumption. To avoid a potential self-selection bias, we used Propensity Score Matching (PSM) to define a control group which, before the treatment, was not statistically different from the treated one along a variety of observable dimensions.

We first excluded from our sample non-resident households (those that do not live in the apartment for which they hold the water service contract and that could rent it out for long or short periods) and those with missing or inconsistent water consumption data or information. This allowed us to build a balanced dataset composed of around 9,000 households. Next, the two-nearest-neighbor matching procedure selected a sub-sample of 2,355 households: 866 treated units and 1,489 control units, which did not show any pre-treatment statistical difference along a large variety of dimensions.

Treated and control units were matched only with respect to their observable and available characteristics. This implies that the validity of the PSM relies on the assumption that matching on observable characteristics also balances unobservable characteristics. We addressed this potential issue in different ways. First, building on the intuition that individuals sharing similar socio-economic conditions often cluster in the same urban areas, and in light of the small size of the municipalities where the information campaign was developed (the average resident population is 24,000 inhabitants), we proxied several unavailable individual-level characteristics with an array of municipal-level socio-economic, demographic and territorial observable variables. The inclusion of these variables allows us to control for several household-level unobservable factors which might cause differences in water consumption between the treated and control groups.

Moreover, in our diff-in-diff model we included time and household-level fixed effects, which allow us to address a potential omitted-variable bias and to control for potential unobserved heterogeneity. The adoption of a fixed effects model rules out the possibility that differences in water consumption between treated and control units are driven by unobserved heterogeneity that remains fixed within the time period of our trial.
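To illustrate why household and time fixed effects absorb time-invariant heterogeneity, the following minimal sketch uses a two-period simplification, in which the two-way fixed effects estimate coincides with the difference between the mean changes of treated and control households. All numbers, names (`did_estimate`, `TIME_EFFECT`, `TRUE_EFFECT`) and the synthetic panel are hypothetical and are not taken from the study, whose model covers several months.

```python
from statistics import mean

def did_estimate(panel):
    """Two-period difference-in-differences.

    panel: list of (treated, pre, post) tuples, one per household.
    Returns mean change of treated minus mean change of controls.
    """
    treated_changes = [post - pre for treated, pre, post in panel if treated]
    control_changes = [post - pre for treated, pre, post in panel if not treated]
    return mean(treated_changes) - mean(control_changes)

# Hypothetical data: consumption = household effect + time effect + treatment effect.
households = [(True, 150.0), (True, 210.0), (False, 120.0), (False, 260.0)]
TIME_EFFECT = -5.0    # common shock hitting every household in the post period
TRUE_EFFECT = -25.0   # assumed treatment effect, liters per capita per day

panel = [
    (treated, level, level + TIME_EFFECT + (TRUE_EFFECT if treated else 0.0))
    for treated, level in households
]

# Household levels cancel in (post - pre); the common shock cancels across groups.
estimate = did_estimate(panel)
print(estimate)
```

Because each household's own level drops out of its pre/post change, very different baseline consumption levels (120 to 260 in this toy panel) do not affect the estimate.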

The validity of our empirical strategy aimed at avoiding a self-selection bias is corroborated by a variety of tests. The PSM balance test shows that, in the pre-treatment period, treated units are not statistically different from their matched control units with respect to all the considered variables. On top of that, a placebo test and a dynamic analysis show that, in the pre-treatment period, treated units are not statistically different from the matched control units in the level and variation of their water consumption. This suggests that the selected treated and control groups do not differ with respect to those unobservable characteristics that correlate with, and can influence, the level and trend of their water consumption.

Findings and Contribution

With this work, we aim to address several research questions. The first concerns the information campaign’s effectiveness in nudging water-saving behavior. For this purpose, we developed an Intention-to-Treat (ITT) analysis where matched treated and control units were compared through a difference-in-differences (DiD) design. In particular, we found that the campaign was effective in promoting an average per capita reduction of 25.8 liters (L) (6.8 gallons) per day compared to the control group. The results of the logarithmic scale analysis suggest that, thanks to the information campaign, water consumption decreased by around 6%.
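As a quick arithmetic check on these figures, the sketch below converts liters to US liquid gallons and shows how a coefficient from a regression on a log outcome maps into a percentage change. The coefficient value of -0.06 is purely illustrative (an assumption chosen to be close to the reported "around 6%"), not an estimate quoted from the paper; the function names are hypothetical.

```python
import math

LITERS_PER_US_GALLON = 3.785411784  # definition of the US liquid gallon

def liters_to_gallons(liters):
    """Convert liters to US liquid gallons."""
    return liters / LITERS_PER_US_GALLON

def log_coef_to_percent(beta):
    """Exact percentage change implied by a coefficient on a log outcome."""
    return (math.exp(beta) - 1.0) * 100.0

itt_liters = 25.8                      # ITT estimate reported in the text
itt_gallons = liters_to_gallons(itt_liters)

illustrative_beta = -0.06              # hypothetical log-scale coefficient
percent_change = log_coef_to_percent(illustrative_beta)

print(round(itt_gallons, 1))
print(round(percent_change, 2))
```

For small coefficients the exact transformation and the coefficient itself (times 100) are close, which is why a log-scale estimate near -0.06 is commonly read as a reduction of roughly 6%.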

We next developed a dynamic analysis to determine whether the impact of the treatment varied with the number of informative feedback emails sent to the treated units. Indeed, we were interested in establishing whether sending repeated messages on a monthly basis fostered water conservation, thus enhancing water savings over time, or rather, whether the opposite occurred, due to a decrease in consumers’ attention over time. We also compared the matched treated and control groups in the months following the end of the campaign to verify whether the campaign was effective in inducing a permanent change in behavior, or whether its effectiveness was temporary and confined to the campaign period.

The dynamic analysis shows that the effect of the campaign was not constant over time, as estimated additional water saving increased with the number of emails sent. However, we verified that the water conservation effects were not permanent and expired a few months after the end of the trial. This suggests that the information campaign did not represent a sufficient tool to drive structural behavioral change.

Third, we developed an Instrumental Variable (IV) strategy to estimate the Local Average Treatment Effect (LATE), confined to the complying users who actually opened the emails they received. We found that the 25.8 L per day average reduction in water consumption estimated through the ITT analysis is mainly driven by the complying units. In particular, the LATE estimated with the IV method corresponds to an average per capita water saving of 54.9 L per day. In logarithmic terms, this corresponds to a 12% reduction in water consumption.
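A rough consistency check relates the two estimates. In the simplest IV setting, with no covariates and one-sided non-compliance (control households receive no emails), the LATE equals the ITT divided by the share of compliers among the treated, the so-called Wald ratio. The sketch below applies this textbook relation to the two point estimates quoted above. The implied complier share is back-of-the-envelope arithmetic under these simplifying assumptions, not a figure reported in the study, whose IV model also includes fixed effects; the function names are hypothetical.

```python
def wald_late(itt, complier_share):
    """LATE implied by an ITT effect and a first-stage complier share."""
    if not 0.0 < complier_share <= 1.0:
        raise ValueError("complier_share must lie in (0, 1]")
    return itt / complier_share

def implied_complier_share(itt, late):
    """Complier share that reconciles an ITT and a LATE estimate."""
    return itt / late

itt = 25.8    # liters per capita per day, ITT estimate from the text
late = 54.9   # liters per capita per day, LATE estimate from the text

share = implied_complier_share(itt, late)
print(round(share, 2))
print(round(wald_late(itt, share), 1))
```

Under these assumptions, the ratio of the two estimates suggests that slightly less than half of the treated households would count as compliers.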

Moreover, we asked whether the impact of the information campaign varied across types of consumers and depended on the type of feedback they received. To address this question, we first developed a heterogeneity analysis to assess whether changes in water consumption were uniform across consumption classes. We then applied a regression discontinuity design (RDD) around the cut-offs between consumption classes to verify whether sending different feedback to units with comparable consumption levels differentially affected their water saving at the margin. This analysis shows that the impact of the treatment was heterogeneous across the consumption classes and that different feedback differentially affected consumption choices at the margin. This result suggests that the social comparison component of the information campaign is a key driver of the treated units’ improved water saving performance.

Our research contributes to the existing literature on water conservation nudging in several ways. The first concerns the information notification tool. Previous studies delivered the information mainly through printed letters, postcards or handouts (Ferraro and Price 2013; Landon et al. 2018; Schultz et al. 2019; Torres and Carlsson 2018; Carlsson et al. 2021; Fielding et al. 2013; Miranda et al. 2020), printed leaflets in the form of door hangers (Schultz et al. 2007; Goette et al. 2019), or a combination of printed letters, emails and a website (Dolan and Metcalfe 2015; Brent et al. 2015; Bhanot 2017; Jessoe et al. 2021; Schultz et al. 2016; Daminato et al. 2021). These studies mainly conclude that the effectiveness of the intervention depends on the type of notification tool used and find that printed copies tend to be more effective than email notifications, which in general are not associated with a significant effect on water conservation. According to various interpretations, the lower success rate associated with online messages could be due to the extra effort required to open them. Conversely, in our case, the information was provided exclusively via email. To the best of our knowledge, this is the first study providing evidence on the effect of an information campaign based solely on email notifications.

A second novel component of our study is that, being entirely developed online, it allowed us to monitor the rate of compliance, proxied by the users’ email opening rate, and to differentiate treated units according to their compliance status. We document that 17% of the treated units never opened any email, while 44% of the treated units opened at most two of the five emails. Due to this non-negligible non-compliance rate, the ITT approach is likely to underestimate the true treatment effect. Therefore, following previous studies in other fields (Angrist et al. 2010; Leventhal & Brooks-Gunn 2003; Brent and Ward 2019), we complemented the ITT analysis with an Instrumental Variable (IV) model that estimates the Local Average Treatment Effect (LATE) of the information campaign, that is, the effect of the treatment restricted to the units that actually comply with their treatment assignment (Angrist and Pischke 2014). This supporting analysis allows us to further verify the robustness of the ITT results and to eliminate the potential source of underestimation of the treatment effect while, at the same time, addressing the endogeneity of the compliance decision. To the best of our knowledge, this represents a novel contribution of our research to the existing literature on nudging and water conservation behaviour.

Another way our research contributes to the literature concerns the geographical context. Previous water conservation experiments were mainly developed in the US (Ferraro and Price 2013; Brent et al. 2015, 2020; Schultz et al. 2019), with several applications in various other parts of the world, as discussed in Sect. 2. To the best of our knowledge, ours is the first study applied to the Italian case. Our study aims to determine whether the informative feedback tool, previously used in different contexts, remains effective in a new environment characterized by an emerging water scarcity problem. Indeed, this is increasingly becoming a critical issue in Southern Europe, due to the climate-crisis-induced increase in the frequency and intensity of extreme weather events, such as droughts (EEA 2021; IPCC 2022; see Footnote 4). In Italy, this criticality is exacerbated by the unsustainable behavior of Italian consumers, who register among the highest levels of water consumption in Europe.

Compared to the majority of previous trials, we differentiated our campaign with respect to the type of communicated information. Previous studies mainly communicated the total amount of households’ aggregated water consumption over a given period (Ferraro and Price 2013; Brent et al. 2015; Schultz et al. 2019). Other experiments communicated the daily average of households’ aggregated water consumption (Bhanot 2017; Jessoe et al. 2021), and rarely the litres per capita per day (Goette et al. 2019). In our intervention, we communicated the daily average water consumption (instead of the total water consumption) at a per capita level (instead of at the household’s aggregated level). The aim of our choice was to provide information in terms that are as familiar as possible, so that it could be easily quantified and understood by non-expert users (Footnote 5). We have no tools to assess how clear the communicated information was, or whether the adopted unit of measurement was clearer than other options. Nevertheless, the significant water saving promoted by the information campaign and the sizeable estimated LATE suggest that the adoption of a per capita daily measure favours a clear understanding of the amount of consumed water, as this measure is associated with a remarkable reduction in water consumption.

The remainder of the paper is organized as follows: Section 2 places our study within the literature on water consumption. Section 3 provides details of the information campaign. Section 4 describes the sample construction and descriptive statistics; in particular, it discusses the Propensity Score Matching procedure and the related results. Section 5 presents the empirical strategy. Section 6 discusses the results in detail and Sect. 7 shows the robustness checks. Section 8 offers a discussion of specific aspects, and Sect. 9 concludes.

2 Literature Review

Actions to promote pro-environmental and resource conservation attitudes have been extensively studied in the behavioral science literature (see Andor and Fels 2018 for a review). A widely agreed finding is that social information campaigns, on top of being relatively cheap to implement (Wang and Chermak 2021), can be more effective than traditional instruments in promoting sustainable daily habits and stimulating consumers to adopt pro-environmental behaviors (Ferraro and Miranda 2013; Ferraro and Price 2013). This result is confirmed by a variety of field studies, whose designs differ in several respects. We review those that are most closely connected to our research (Footnote 6).

Geographical context

Existing water conservation experiments were mainly developed in the US (e.g. Ferraro and Price 2013; Brent et al. 2015, 2020; Schultz et al. 2019), with various applications in other parts of the world, such as Australia (Sarac et al. 2003; Fielding et al. 2013), Central and South America (Miranda et al. 2020; Torres and Carlsson 2018), South Africa (Smith and Visser 2013), and Asia (Agarwal et al. 2017; Goette et al. 2019). Very few researchers have investigated the impact of a social information program on water consumption in Europe (Ansink et al. 2021 in the UK and Kažukauskas et al. 2021 in Sweden).

Adopted Notification Tool

Households were reached via postcards or mailers (Fielding et al. 2013; Miranda et al. 2020; Brent et al. 2020), handouts (Seyranian et al. 2015), a combination of letters and emails (Brent et al. 2015; Bhanot 2017; Jessoe et al. 2021), a combination of letters and a website (Schultz et al. 2016; Daminato et al. 2021), printed leaflets in the form of door hangers (Goette et al. 2019) or, more frequently, via printed letters (e.g., Ferraro and Price 2013; Landon et al. 2018; Schultz et al. 2019; Torres and Carlsson 2018; Carlsson et al. 2021). A few studies provide real-time feedback with pre-installed in-home displays (Kažukauskas et al. 2021) or shower heads connected to water meters (Agarwal et al. 2017; see Footnote 7). Previous studies show that the effectiveness of an information campaign can depend on how it is communicated. Dolan and Metcalfe (2015) report that printed copies of social norms for electricity conservation are more effective than digital copies delivered via email. Brent et al. (2015) use a combination of letters and emails, and find the effect of their campaign to be insignificant for the category receiving the water report via email. Similarly, Schultz et al. (2016) show that web-based delivery is less effective than postal mail. However, none of these studies provides a solid explanation for this result; the suggestion is that it could depend mainly on the lower success rate associated with online messages, due to the extra effort required to open them (Schultz et al. 2016). More recently, using a combination of letters and real-time feedback through an online portal, Daminato et al. (2021) show that the use of an online tool drives the main result of their experiment on water consumption.

Type of Communicated Information

Recently, Wang and Chermak (2021) argued that the size of the water saving can depend on the unit of measurement used to communicate consumption data, which varies among studies. While some campaigns communicated the household’s aggregated total gallons of water consumed over a given period (Ferraro and Price 2013; Brent et al. 2015; Schultz et al. 2019) or during the main irrigation season (Landon et al. 2018), others communicated the daily average of the household’s aggregated total gallons consumed in one or two months (Bhanot 2017; Jessoe et al. 2021). A few experiments used an app or home-installed meters where households could observe their real-time consumption (Agarwal et al. 2017; Kažukauskas et al. 2021). Apart from the notable exception of Goette et al. (2019), to the best of our knowledge, no paper has so far provided information on water consumption expressed both per person and per day (daily average per capita water consumption), and none has used liters instead of gallons. The liter is the unit of measurement used in Italy. In other countries with the same unit of measurement, data were expressed in cubic meters (Torres and Carlsson 2018; Carlsson et al. 2021), which is a less familiar unit of measurement than liters.

3 Information Campaign

Treated units received a monthly report (the water consumption diary) on their domestic water use over a five-month period, from September 2021 to January 2022. The report was delivered exclusively via email and included informative feedback and a social comparison component. First, households were informed of their monthly average water consumption, which, differing from previous studies, was communicated on a per capita basis in liters per day. Second, we communicated the average per capita water consumption level for the entire treated group, and provided some further information aimed at facilitating the social comparison in terms of water consumption (Footnote 8). In particular, we constructed a ranking on a 1–1000 scale and communicated to each household its ranking position and the related consumption class: whether households were ‘low users’, ‘medium users’ or ‘high users’ (that is, whether they belonged to the first, second or third consumption tertile; see Footnote 9).
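The ranking and classification step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the campaign: it assumes that position 1 corresponds to the lowest per capita consumption, that positions are spread evenly over the 1–1000 scale, and that ties can be ignored. The function name `rank_and_classify` and the sample figures are hypothetical.

```python
def rank_and_classify(consumption, scale=1000):
    """Rank households by per capita daily consumption and assign a tertile label.

    consumption: dict mapping household id -> liters per capita per day.
    Returns dict mapping household id -> (ranking position, class label).
    Assumption: position 1 is the lowest consumer, position `scale` the highest.
    """
    ordered = sorted(consumption, key=consumption.get)
    n = len(ordered)
    labels = ("low", "medium", "high")
    result = {}
    for idx, household in enumerate(ordered):
        # Spread the n households evenly over positions 1..scale.
        position = 1 + (idx * (scale - 1)) // (n - 1) if n > 1 else 1
        # First, second or third consumption tertile.
        tertile = (idx * 3) // n
        result[household] = (position, labels[tertile])
    return result

# Hypothetical per capita daily consumption (liters) for six households.
sample = {"A": 90.0, "B": 150.0, "C": 210.0, "D": 120.0, "E": 300.0, "F": 180.0}
diary = rank_and_classify(sample)
for household in sorted(diary):
    print(household, diary[household])
```

In this toy sample, households A and D fall in the ‘low’ class, B and F in ‘medium’, and C and E in ‘high’.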

Concerning the timing of the field intervention, the amount of water consumed in a given month (e.g. August) is reported by the smart meter at the beginning of the following month (e.g. September). Therefore, the email sent in a given month t (e.g. September) necessarily refers to the water consumption registered in the previous month t − 1 (e.g. August). We can then observe whether the email sent in month t promotes a reduction in water consumption during the same month t, as reported by the smart meter at the beginning of the following month t + 1 (e.g. October). Figure 1 reports an example of the first email of the information campaign, which was sent in September and therefore referred to the water consumption level registered during the previous month, August.

Fig. 1 The water consumption diary

4 Design of the Field Intervention and Sample Construction

A few requirements had to be met for a household to be included in the information campaign. First, households had to be equipped with a smart meter remotely collecting data on their water consumption on a monthly basis (Footnote 10). At the time of running our trial, the CAP Holding water company had replaced old analogue water meters with electronic smart meters for around 13,800 households distributed over 45 municipalities in the metropolitan area of Milan (Italy). This sample was not subject to self-selection. Households could not apply for smart meters and the water company was not replacing the meters following any pre-determined geographical criteria.

Within this sample, we were legally bound to send informative material only to around 1,000 households, which, at the time of signing their water contract (and thus well before the design of our information campaign), gave their consent to receive informative and ancillary communications from the water company.

A proper randomization within the sample of 1,000 eligible households would have represented the first-best option to assess the causal impact of the information campaign on water saving (Footnote 11). Unfortunately, in our case an RCT was not a viable option. The CAP water company was committed to involving as many households as possible, and it therefore required that the information campaign be sent to all 1,000 eligible households. This explicit constraint forced us to adopt an alternative approach to assess the impact of the information campaign (Footnote 12).

We initially considered as the treated group the entire set of eligible households (those equipped with a smart meter that had provided privacy consent), while non-eligible households (those equipped with a smart meter that had not provided privacy consent) were included in the control group (Footnote 13).

We believe this selection into the treatment is not likely to raise a relevant endogeneity issue. If we had selected the treated units among those who wanted to participate in an information campaign aimed at reducing water consumption, then we would have created a serious self-selection bias. However, in our case, selection into the treatment was based on a general willingness to receive ancillary communications, with this consent being given before the information campaign. For this reason, the adopted selection criterion is likely to be orthogonal to the intervention itself. Nevertheless, we are aware that people voluntarily decided whether or not to consent to receiving such communications. Therefore, following previous relevant literature, we addressed potential self-selection issues through a matching procedure.

4.1 Propensity Score Matching

Robust methodologies have been developed to address potential endogeneity issues and to assess causality when randomization cannot be implemented due to some exogenous constraints. Among them, propensity score matching (PSM) represents a widely adopted second-best alternative (Rosenbaum and Rubin 1983; Heckman et al. 1997; Dehejia and Wahba 2002; see Footnote 14). DeShazo et al. (2017) adopt a PSM technique to assess the extent to which the presence of high-occupancy vehicle lanes promotes plug-in electric vehicle adoption. Du and Takeuchi (2019) combine PSM with the DiD approach to assess whether the renewable energy-based clean development mechanism contributes to poverty alleviation. Recently, Clay et al. (2023) use a difference-in-differences propensity score matching approach to assess the causal impact of LEED (Leadership in Energy & Environmental Design) certification on energy consumption. Cole et al. (2021) use PSM with a DiD to analyze how firms’ carbon intensity is affected by the decision to delocalize some productive activities. Marin et al. (2018) combine PSM with a DiD to analyze how the implementation of the EU ETS affects the performance of regulated firms compared to unregulated ones. Castelnovo and Florio (2023) employ a DiD combined with PSM to assess the impact of public procurement on the patenting activity of the procuring firms in the space sector, while Clò et al. (2022) combine PSM with a DiD to assess the incremental impact of development banks’ support to innovation on the financed firms. Lin et al. (2023) adopt a PSM-DiD approach to assess the impact of firms’ export intensity on their environmental performance.

Following this literature, we used PSM to select a control group which, before the treatment, was not statistically different from the treated one along a variety of observable dimensions. Treated and untreated units were matched on the estimated propensity scores (that is, on the estimated probability of being treated given a set of observable characteristics). We first estimated through a Logit model to what extent the probability of being treated was explained by a number of household-level covariates: number of residents, age and gender of the contract holder, the consumer type (whether it is a final consumer or a self-employed individual with a VAT number) and the household’s aggregate pre-treatment water consumption level (Cday; see Footnote 15). Following Marin et al. (2018), we also included the household’s pre-treatment variation in water consumption (ΔCday). The inclusion of this variable in the matching procedure is aimed at ensuring that, in the pre-treatment period, treated and control units show a similar variation in their water consumption. This should therefore allow treated and control households to have parallel trends in the outcome variable before the treatment.

Moreover, we obtained from the CAP Holding water company information on the latest year in which customers updated their consent status (consensus year, that is, whether they consented to receive informative and ancillary communication from the water company), which is the key variable adopted to define whether a family was eligible to receive the treatment. Its inclusion in the matching procedure ensures that the treated and control groups do not show any statistically significant difference with respect to this key variable. In this way, we exclude possible endogeneity issues that could emerge if the propensity to provide consent varied over time.

It should be recognized that treated and control units can be matched only on their observable and available characteristics. The validity of the PSM therefore relies on the assumption that matching on observable characteristics also balances unobservable characteristics, such as income, type of occupation, education or environmental attitudes, which are likely to influence water consumption. To address this potential issue, we proxied some relevant unobservable household features with municipal-level observable variables. Building on the intuition that individuals sharing similar socio-economic conditions often cluster in the same urban areas, and considering the small size of the municipalities where the information campaign was developed (the average resident population is 24,000 inhabitants), we proxied several unavailable individual-level characteristics (income, occupation, education) with a variety of socio-economic and territorial variables referring to the municipality where households live. Among them, we included: i) the municipal average residential housing prices as registered by the real estate market observatory (OMI) of the Italian tax authority (Agenzia delle Entrate); ii) the total number of taxpayers and the total level of declared income per municipality (total income), from the same source; iii) the number of bank branches and the total level of bank deposits and bank loans at the municipal level, from the Bank of Italy database; iv) the municipal-level soil consumption intensity, as provided by the Italian Institute for Environmental Protection and Research (ISPRA); v) the municipal migration rate (ISTAT; see Footnote 16); vi) the municipal old-age index (ISTAT; see Footnote 17); vii) the municipal degree of urbanization, that is, whether it is an urban or peripheral area and whether it is a high- or low-density building area (source: Italian Department for Development and Economic Cohesion).

Considering that our exercise involved a multitude of small municipalities (on average 24,000 inhabitants), we believe that, overall, municipal-level economic, financial, demographic and urban characteristics are a reasonable proxy for relevant, though unobservable, household characteristics. Indeed, people with higher income and better occupations are likely to live in wealthier municipalities. Therefore, the inclusion of municipal-level variables mitigates the risk that significant differences in water consumption between treated and control units persist due to unobserved differences between them. On top of that, we also added the municipal pre-treatment level of rainfall (Agri4cast-JRC) to the matching variables. Since water consumption can depend on weather conditions, the inclusion of this variable controls for potential differences in water consumption between control and treated units due to exogenous climate conditions.

Before running the matching procedure, we excluded users who did not reside in the house equipped with a smart meter (and who could rent out their apartment for long or short periods, for instance via Airbnb), as well as those with missing or inconsistent water consumption data or ancillary information. This left us with a balanced dataset of 8,741 households, with 866 treated units and 7,875 untreated ones.

The probability of being treated was estimated by:

$$P({Treat}_{i}=1)=\Lambda \left(\beta {X}_{i}+\phi {Z}_{m}\right)$$
(1)

where P denotes the propensity of household i to be treated, and \(\Lambda \left(.\right)\) is the logistic distribution function. \({Treat}_{i}\) is a binary indicator variable that takes a value of 1 if household i receives the treatment and 0 otherwise. \({X}_{i}\) is a vector of pre-treatment household characteristics (see Table 3 for a description), while \({Z}_{m}\) is a vector of municipality-level variables for the municipality m where household i lives.
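As a minimal sketch of Eq. (1), the propensity score is the logistic function applied to a linear index of household and municipal covariates. The covariates and coefficient values below are hypothetical placeholders chosen for illustration; they are not the estimates reported in Table 1.

```python
import math

def logistic(v: float) -> float:
    """Logistic distribution function Lambda(v) = 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))

def propensity_score(x, z, beta, phi):
    """P(Treat_i = 1) = Lambda(beta'X_i + phi'Z_m), following Eq. (1)."""
    index = sum(b * xi for b, xi in zip(beta, x))
    index += sum(p * zm for p, zm in zip(phi, z))
    return logistic(index)

# Hypothetical inputs (assumptions, not values from the paper).
x_i = [1.0, 3.0]    # e.g. a constant and the number of residents
z_m = [0.5]         # e.g. a standardized municipal income measure
beta = [-2.0, 0.1]  # illustrative household-level coefficients
phi = [0.4]         # illustrative municipal-level coefficient

score = propensity_score(x_i, z_m, beta, phi)
```

The resulting score always lies strictly between 0 and 1, which is what allows it to be used as a one-dimensional summary of the covariates in the matching step.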

From the results reported in Table 1, we can observe that the size of the estimated coefficients (and related marginal effects) is quite small.

Table 1 Propensity score estimates

Based on the estimated propensity scores, we matched each treated unit to at most its two nearest-neighbor non-treated units (in terms of estimated propensity score). Non-treated units lying outside the common support of the estimated propensity score were excluded from the analysis. This matching procedure restricted our analysis from the initial sample of around 9,000 households to a sub-sample of 2,355 households, with 866 treated units and 1,489 control units.
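The matching step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: we assume matching with replacement, and we define the common support as the range of the treated units' scores. The unit identifiers and scores are hypothetical.

```python
def match_nearest(treated, controls, k=2):
    """Match each treated unit to its k nearest controls by propensity score.

    treated, controls: dicts mapping unit id -> estimated propensity score.
    Controls outside the treated score range (assumed common support) are dropped.
    Matching is done with replacement (assumption).
    """
    lo, hi = min(treated.values()), max(treated.values())
    on_support = {c: p for c, p in controls.items() if lo <= p <= hi}
    matches = {}
    for t, p_t in treated.items():
        ranked = sorted(on_support, key=lambda c: abs(on_support[c] - p_t))
        matches[t] = ranked[:k]
    return matches

# Hypothetical scores for illustration.
treated = {"t1": 0.30, "t2": 0.60}
controls = {"c1": 0.31, "c2": 0.35, "c3": 0.58, "c4": 0.95, "c5": 0.05}

matches = match_nearest(treated, controls)
matched_controls = {c for group in matches.values() for c in group}
```

Because a control can serve as a neighbor for more than one treated unit, the number of distinct matched controls can be lower than twice the number of treated units.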

A first inspection of the density distribution of the propensity scores in both groups, before and after the matching, visually confirms the common support between treatment and comparison groups, and the soundness of the PSM procedure (see Fig. 2).

Fig. 2 Probability of receiving the treatment before and after the matching

The PSM balancing test shows that, along several dimensions, the differences between the treated and the untreated units were significant only before the matching procedure. Conversely, the matched treated and untreated units do not show any statistically significant difference with respect to any of the considered variables, so that the null hypothesis of equal means across the two groups cannot be rejected (Table 2).
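The logic of a balance test can be illustrated with a two-sample comparison of means on a single covariate, before and after matching. The sketch below uses a Welch t statistic as one possible choice; the paper does not specify the exact test statistic, and the data are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch t statistic for the difference in means between two samples."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

# Hypothetical covariate (number of residents), for illustration only.
treated = [2, 3, 3, 4]
controls_before = [1, 1, 2, 2, 1, 2]  # full pool of untreated units
controls_after = [3, 2, 4, 3]         # matched untreated units

t_before = welch_t(treated, controls_before)
t_after = welch_t(treated, controls_after)
```

A large absolute t value before matching and a value close to zero after matching is the pattern that indicates the covariate is balanced in the matched sample.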

Table 2 Balance test

Table 3 presents the description of the household-level variables and the related summary statistics for the matched sample.

Table 3 Summary statistics and description of the variables

Lastly, we focus on the treatment compliance rate. Since the consumption diary was sent via email, we could observe how many of the five monthly emails were actually opened by each user (click rate). We cannot verify whether the households that opened an email actually read it. Nevertheless, the email opening rate can reasonably be adopted as a proxy for the treatment compliance rate. As reported in Table 4, 17% of the treated units never opened any email, while 44% of the treated units opened at most two out of five emails.

Table 4 Descriptive statistics on the users’ email opening rate

5 Empirical Strategy

The first research question we want to address is whether the information campaign has been effective in reducing the water consumption of the treated households compared to the households that were not involved in the campaign. Since all the treated units were included irrespective of their compliance status, this corresponds to an Intention To Treat (ITT) analysis, which we address through the following Two-Way Fixed Effects Diff-in-Diff (TWFEDD) model:

$${y}_{it}=\alpha + \beta DI{D}_{it}+{\gamma }_{i}+{u}_{t}+{\varepsilon }_{it}$$
(2)

where \({y}_{it}\) indicates the average daily per capita water consumption of user i in month t. While the post-treatment phase covers the entire period of the information campaign (from September 2021 to January 2022), we decided to restrict the pre-treatment period to the months from May 2021 to August 2021, when there were no COVID-19 restrictions in place. The variable \(DI{D}_{it}\) identifies unit i as belonging to the treated group in the post-treatment period. It corresponds to the interaction term \({TREAT}_{i}\times {POST}_{t}\), where \({TREAT}_{i}\) is a dummy variable equal to 1 when unit i belongs to the treated group and 0 otherwise, and \({POST}_{t}\) is a dummy variable equal to 1 in the post-treatment period and 0 otherwise. Its parameter \(\beta\) captures the average post-treatment variation in water consumption of the treated group compared to the control group. In our diff-in-diff model we include individual and time fixed effects, respectively \({\gamma }_{i}\) and \({u}_{t}\), which address potential omitted variable bias and control for potential unobserved heterogeneity. Indeed, the adoption of a fixed effects model rules out that differences in water consumption between treated and control units are driven by unobserved heterogeneity that remains fixed within the time period of our trial. Finally, \({\varepsilon }_{it}\) is the error term. Equation (2) is estimated by OLS using the standard fixed effects estimator, with heteroskedasticity-robust standard errors clustered at the household level.
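In a balanced panel where all treated units start the treatment in the same month, the coefficient \(\beta\) of Eq. (2) coincides with the simple difference-in-differences of the four group-by-period means. The sketch below computes that quantity on hypothetical data; it is meant to illustrate what \(\beta\) measures, and it does not reproduce the fixed effects estimator or the clustered standard errors.

```python
from statistics import mean

def did_estimate(rows):
    """Difference-in-differences of group-by-period means.

    rows: iterable of (treat, post, y) with treat and post in {0, 1}.
    Returns (treated post - treated pre) - (control post - control pre).
    """
    cell = {(g, p): [] for g in (0, 1) for p in (0, 1)}
    for treat, post, y in rows:
        cell[(treat, post)].append(y)
    m = {key: mean(values) for key, values in cell.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])

# Hypothetical litres per capita per day, for illustration only.
rows = [
    (0, 0, 200.0), (0, 0, 210.0), (0, 1, 190.0), (0, 1, 200.0),
    (1, 0, 205.0), (1, 0, 215.0), (1, 1, 170.0), (1, 1, 180.0),
]
beta_hat = did_estimate(rows)
```

Here the control group falls by 10 litres and the treated group by 35 litres, so the estimate is a reduction of 25 litres attributable to the treatment under the parallel trend assumption.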

5.1 Dynamic analysis

A second major interest of our research concerns the dynamic effect of the campaign. We are interested in understanding how the impact of the social information campaign varied over time, and in particular whether it increased or decreased with the number of emails sent. An increasing effect would point to the reinforcing contribution of repeated emails in promoting water conservation behavior, while a decreasing effect would point to their limited effectiveness, as water savings would decrease at the margin. Conversely, a constant trend would suggest that sending repeated messages does not affect water conservation behavior at the margin. We address this issue by implementing the following event study model:

$${y}_{it}= \alpha +\sum_{j=1}^{J}{\beta }_{j}{LAG}_{j}+ \sum_{k=1}^{K}{B}_{k}{LEAD}_{k}+\sum_{j=1}^{J}{\theta }_{j}{(LAG}_{j}*{TREAT}_{i})+ \sum_{k=1}^{K}{\theta }_{k}({LEAD}_{k}*{TREAT}_{i})+{\gamma }_{i}+{\varepsilon }_{it}$$
(3)

Lags and leads are binary variables capturing the months preceding and following the first month of the information campaign. In particular, \(LA{G}_{j}\) with \(j=1,\dots ,4\) refers to the months from May 2021 to August 2021, and \({LEAD}_{k}\) with \(k=1,\dots ,5\) refers to the months from September 2021 to January 2022. The inclusion of lags and leads allows us to assess the dynamic trend of the treatment effect: whether it is increasing or decreasing over time, whether it is stable or volatile, and whether it is permanent or temporary. Moreover, this approach allows us to compare the water consumption of the treated group and the control group in the months preceding the launch of the social information campaign, and thus to test the parallel trend assumption, which must be satisfied for the DiD to provide unbiased estimates.

We extend this approach by including observations for the three months following the end of the information campaign (from February 2022 until April 2022). This allows us to highlight the differences in water consumption between the treated group and the control group after the end of the information campaign, and to assess whether its effect was temporary and confined to the treatment period, or whether it managed to induce a structural change in the treated group’s behavior, promoting a permanent reduction in their water consumption.
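The event study coefficients can be read as month-by-month gaps between the treated and the control group, measured relative to a reference month. The sketch below computes these gaps from group means in a balanced panel. The choice of the last pre-treatment month as the reference is an assumption (the text does not state which period is omitted), as is the coding of months as negative integers for lags and positive integers for leads; the data are hypothetical.

```python
from statistics import mean

def event_study_gaps(rows, reference):
    """Treated-minus-control gap in each month, relative to a reference month.

    rows: iterable of (treat, month, y) with treat in {0, 1}.
    reference: the month whose gap is normalized to zero (assumption).
    """
    by_cell = {}
    for treat, month, y in rows:
        by_cell.setdefault((treat, month), []).append(y)
    months = sorted({m for _, m in by_cell})
    gap = {m: mean(by_cell[(1, m)]) - mean(by_cell[(0, m)]) for m in months}
    return {m: gap[m] - gap[reference] for m in months}

# Hypothetical data: months -2 and -1 are lags, months 1 and 2 are leads.
rows = []
for month, y_treated in [(-2, 205.0), (-1, 205.0), (1, 195.0), (2, 185.0)]:
    rows.append((1, month, y_treated))
    rows.append((0, month, 200.0))

theta = event_study_gaps(rows, reference=-1)
```

Gaps close to zero in the lag months are what one would expect under parallel pre-trends, while increasingly negative gaps in the lead months would indicate a treatment effect that grows over time.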

5.2 Compliance and LATE Analysis

In the ITT analysis, all the treated subjects are included according to their original treatment assignment, ignoring noncompliance or withdrawal from the treatment (Hollis and Campbell 1999). The ITT approach preserves the balance between the treated and control groups, allowing for an unbiased estimate of the effect of treatment assignment. However, in case of substantial non-adherence, a shortcoming of this approach is that it may misestimate the magnitude of the treatment effect, since non-complying units (which are de facto untreated) are analyzed as if they were treated (Angrist 2006; Gupta 2011).

Since our intervention is characterized by a non-negligible rate of non-compliance, we are interested in complementing the ITT with an analysis able to eliminate this potential source of under-estimation.

One potential way to capture how the treated units which actually complied with the treatment varied their water consumption in the post-treatment period, compared to the control group, would be to extend the baseline model (Eq. 2) by further distinguishing treated units according to their compliance status. However, while this would disentangle the average change in water consumption of compliers and non-compliers within the treated group, it would lead to biased estimates due to the self-selected nature of the compliance decision.

According to Angrist and Pischke (2014), the instrumental variable (IV) method, in which the treatment assignment is used as an instrumental variable for the effective delivery of the treatment, eliminates the non-compliance selection bias and thus makes it possible to “capture the causal effect of treatment on the treated in spite of the nonrandom compliance decisions made by participants in experiments”. This approach gives an unbiased estimate of the local average treatment effect (LATE), which is the impact of the treatment on compliers only.

To estimate the LATE we adopt an Instrumental Variable (IV) strategy in which the original treatment assignment is used as the instrument. This method isolates the variation in actual compliance that is unrelated to the selection bias, allowing for an unbiased estimate of the effect of compliance on the outcome of interest (Imbens and Angrist 1994).

The IV first stage assesses to what extent the instrument induces the treated units to actually take up the treatment. It is estimated by regressing the treatment take-up variable \(COMPL{Y}_{it}\) (equal to 1 when the treated household i opens the email in month t and 0 otherwise) on the treatment assignment \(TREA{T}_{i}\) (equal to 1 when a household receives the consumption diary via email) in the post-treatment period, that is, on the interaction term \(DI{D}_{it}\) defined in Eq. (2):

$$COMPL{Y}_{it} ={\beta }_{0}+{\beta }_{1}DI{D}_{it}+{\gamma }_{i}+{u}_{t}+{\varepsilon }_{it}$$
(4.1)

The coefficient \({\beta }_{1}\) captures the variation in compliance determined by the instrument. It can be interpreted as a compliance rate, while the residual term captures the variation in compliance related to self-selection. The first stage equation generates the predicted value \(\widehat{COMPL{Y}_{it}}\) which, being uncorrelated with the error term, is unaffected by the potential source of endogeneity, thus eliminating the source of bias. For the treatment assignment to be a good instrument, we must reject the null hypothesis that it is a weak instrument (\({\beta }_{1}=0\)). In the IV second stage we regress the variable of interest on the predicted value \(\widehat{COMPL{Y}_{it}}\) obtained from the first stage regression:

$${y}_{it}=\alpha + \omega \widehat{COMPL{Y}_{it}}+{\gamma }_{i}+{u}_{t}+{\varepsilon }_{it}$$
(4.2)

where the coefficient \(\omega\) captures the LATE. This 2SLS approach isolates that variation in the compliance variable that is unaffected by selection bias and, then, relates it to the outcome of interest. This gives an unbiased estimate of the LATE, that is the unbiased causal effect of the treatment only for the compliers, whose participation in the field intervention is determined exclusively by the treatment assignment.
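With a single binary instrument, the 2SLS coefficient equals the ratio between the effect of assignment on the outcome (the ITT effect) and the effect of assignment on take-up (the first stage). The sketch below computes this ratio on hypothetical cross-sectional data. It omits the household and time fixed effects of Eqs. (4.1) and (4.2) for brevity, so it illustrates the logic of the estimator, not the paper's specification.

```python
from statistics import mean

def wald_late(rows):
    """Ratio (Wald) form of the IV estimator with a binary instrument.

    rows: iterable of (assigned, complied, y), with assigned and complied in {0, 1}.
    Returns (itt, first_stage, late) where late = itt / first_stage.
    """
    rows = list(rows)
    y_assigned = mean(y for z, _, y in rows if z == 1)
    y_control = mean(y for z, _, y in rows if z == 0)
    d_assigned = mean(d for z, d, _ in rows if z == 1)
    d_control = mean(d for z, d, _ in rows if z == 0)
    itt = y_assigned - y_control
    first_stage = d_assigned - d_control
    return itt, first_stage, itt / first_stage

# Hypothetical data: half of the assigned units comply and save 50 litres.
rows = [
    (1, 1, 150.0), (1, 1, 150.0), (1, 0, 200.0), (1, 0, 200.0),
    (0, 0, 200.0), (0, 0, 200.0), (0, 0, 200.0), (0, 0, 200.0),
]
itt, first_stage, late = wald_late(rows)
```

In this example the ITT effect (-25 litres) understates the effect on compliers (-50 litres) because only half of the assigned units take up the treatment.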

5.3 Heterogeneity analysis

Next, we develop a heterogeneity analysis to assess whether the treatment effect varies across types of consumers according to some observable characteristics. First, we group both treated and control units into tertiles (‘low’, ‘medium’ and ‘high’ classes) according to their pre-treatment average level of per capita water consumption. Then, we extend the baseline model (Eq. 2) by further distinguishing treated units according to their consumption class \(j\) and by estimating the following triple DiD:

$${y}_{it}=\alpha +\sum_{j=1}^{3}{\varphi }_{j}{DID}_{it}\times {TERTILE}_{ij}+{\gamma }_{i}+{u}_{t}+{\varepsilon }_{it}$$
(5)

The coefficient \({\varphi }_{j}\) of the triple interaction term captures how the treated units belonging to the consumption tertile \(j\) varied their water consumption in the post-treatment period compared to the control group. We expect the effectiveness of the information campaign to depend on the pre-treatment water consumption level, as previously found in some studies on water and energy consumption (Ferraro et al. 2011; Ferraro and Price 2013; Allcott and Rogers 2014; Andor et al. 2020). In particular, consistent with a convex water saving cost function, we expect high consumers to experience the largest reduction in water consumption, since they should have the most water saving opportunities at lower marginal costs. Conversely, consumers belonging to the first water consumption tertile should have limited opportunities to further reduce their water consumption. We therefore expect the information campaign to have limited or no effect on their behavior.
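The construction of the tertile classes and of the interaction terms in Eq. (5) can be sketched as follows. This is a minimal illustration under assumptions: households are split by rank into three equally sized classes, ties are broken by sort order, and the household identifiers and consumption values are hypothetical.

```python
def assign_tertiles(pre_consumption):
    """Assign each household to a pre-treatment consumption tertile.

    pre_consumption: dict mapping household id -> mean pre-treatment
    litres per capita per day. Returns id -> 1 ('low'), 2 ('medium'), 3 ('high').
    """
    ranked = sorted(pre_consumption, key=pre_consumption.get)
    n = len(ranked)
    return {h: 1 + (3 * pos) // n for pos, h in enumerate(ranked)}

def did_by_tertile(did, tertile):
    """Return the three regressors DID_it x TERTILE_ij for j = 1, 2, 3."""
    return [did * (1 if tertile == j else 0) for j in (1, 2, 3)]

# Hypothetical households, for illustration only.
consumption = {"a": 90.0, "b": 110.0, "c": 150.0, "d": 170.0, "e": 260.0, "f": 300.0}
tertiles = assign_tertiles(consumption)

# A treated high consumer in the post-treatment period (DID = 1).
regressors = did_by_tertile(1, tertiles["f"])
```

Each observation therefore loads on at most one of the three interaction terms, so each \({\varphi }_{j}\) is identified from the treated units in class \(j\) only.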

Moreover, we inspect the heterogeneous effects by further distinguishing consumers according to their age and family size. We expect the information campaign to be more effective for younger users and for smaller families. This is because younger users are expected to be more sensitive to environmental issues. We also expect communication and coordination costs to be lower for smaller families than for larger families. If this is true, then the information conveyed by the campaign should spread more effectively within smaller families, and we should observe a larger reduction in their water consumption.

5.4 Feedback and RDD analysis

We are interested in analyzing whether, within the treated group, consumers’ behavior varies at the margin depending on the type of feedback received. However, due to the endogenous nature of the informative feedback, a direct comparison across high, medium and low users is likely to lead to a biased estimate of the impact of different feedback on water saving behavior. To address this potential endogeneity issue, we develop a regression discontinuity design (RDD) around the cutoffs of the consumption classes. We exploit the fact that consumers were classified into three discrete categories with sharp cutoffs. When all the consumers are considered, the average per capita water consumption differs significantly among classes. Nevertheless, the closer we get to a cutoff, the smaller the difference in consumption between the contiguous categories, with the difference in consumption between households just below and just above the cutoff being insignificant. Despite their similarity, consumers around the cutoff are categorized differently and receive different feedback depending on the side of the cutoff they fall on.Footnote 18 Therefore, we exploit this quasi-random category assignment among users around the cutoff to estimate the effect of different feedback on water saving behavior. The main intuition of the RDD is that, since households just below and just above the cutoff are similar in their consumption behavior, any variation in their respective water consumption can be attributed to the different feedback they received.

To implement the regression discontinuity approach, we build a stacked panel dataset in the following way. Within each treatment month t = 1, …, 5 (from September 2021 to January 2022), we first define \({c}_{t}^{min}\) and \({c}_{t}^{max}\) as the minimum and maximum thresholds of the medium consumption category: \({c}_{t}^{min}\) is the cutoff between the low and medium classes, while \({c}_{t}^{max}\) is the cutoff between the medium and high classes. We then calculate the variables \({D}_{it}^{min}\) and \({D}_{it}^{max}\) as the differences between each household’s consumption (in per capita daily liters) and the \({c}_{t}^{min}\) and \({c}_{t}^{max}\) cutoff points. Next, we retain only the treated units whose distance from the cutoff does not exceed a given threshold d, that is, those satisfying \(|{D}_{it}^{min}|\le d\) for the minimum cutoff and \(|{D}_{it}^{max}|\le d\) for the maximum cutoff. We repeat this procedure for each month and thus construct G = 5 groups (corresponding to as many panel datasets), one for each month t = 1, …, 5 of the treatment period. We then stack the G panel datasets and run the following regression:

$${y}_{igt}=\alpha + \gamma DID\_RD{D}_{igt}+{\gamma }_{ig}+{u}_{gt}+{\varepsilon }_{igt}$$
(6)

Notice that the same subject can appear below the cutoff in a certain month and above the cutoff in another month. This implies that the unit of observation is the subject i within the group g. Therefore, \({y}_{igt}\) represents the water consumption of subject i, belonging to group g, in the month t. \(DID\_RD{D}_{igt}\) is the interaction term \({RDD}_{ig}\times {POST\_RDD}_{t}\), where \({RDD}_{ig}\) is a dummy which equals 1 if the subject i, belonging to group g, falls above the cutoff, and 0 if the subject i falls below the cutoff, while \({POST\_RDD}_{t}\) equals 0 in the month when units receive their informative feedback and 1 in the following month. By interacting these two dummy variables we can estimate our parameter of interest \(\gamma\) which captures whether, at the margin, the selected treated units change their water consumption behavior differentially, depending on which side of the cutoff they belong to. \({\gamma }_{ig}\) are the fixed effects for the subject i within the group g, while \({u}_{gt}\) are time fixed effects referring to the month t within the group g.
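The construction of the stacked dataset can be sketched as follows. The function and variable names, the cutoff values and the consumption figures are hypothetical, and assigning households that sit exactly on the cutoff to the lower side is an assumption made only for this illustration.

```python
def rdd_window(consumption, cutoff, d):
    """Select households within distance d of the cutoff.

    consumption: dict mapping household id -> litres per capita per day
    in the month of the feedback. Returns id -> RDD dummy
    (1 above the cutoff, 0 at or below it).
    """
    selected = {}
    for household, value in consumption.items():
        distance = value - cutoff
        if abs(distance) <= d:
            selected[household] = 1 if distance > 0 else 0
    return selected

def stack_groups(monthly_consumption, cutoffs, d):
    """Build one group g per treatment month and stack (id, g, RDD) rows."""
    rows = []
    for g, consumption in monthly_consumption.items():
        for household, rdd in rdd_window(consumption, cutoffs[g], d).items():
            rows.append((household, g, rdd))
    return rows

# Hypothetical data for two treatment months and one cutoff type.
monthly_consumption = {
    1: {"a": 145.0, "b": 155.0, "c": 300.0},
    2: {"a": 165.0, "b": 158.0, "c": 90.0},
}
cutoffs = {1: 150.0, 2: 160.0}
stacked = stack_groups(monthly_consumption, cutoffs, d=10.0)
```

In this example household "a" falls below the cutoff in the first month and above it in the second, which is why the fixed effects in Eq. (6) are defined for the subject within the group and not for the subject alone.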

We first run this regression separately for the two cutoffs \({c}^{min}\) and \({c}^{max}\), which allows us to compare separately the low category with the medium category, and the medium category with the high one. Next, we estimate Eq. (6) as a single regression for both cutoffs. This allows us to formally test whether the feedback matters differentially across the two cutoff types.

6 Results

All the models have been estimated employing the dependent variable in both levels and logarithmic terms, aiming to discern the influence of the informational campaign on water conservation, measured respectively in liters per day and in percentage terms.

Table 5 reports the results of the ITT analysis obtained by estimating Eq. (2). We find that, on average, the social information campaign had a positive and statistically significant effect on water savings. Indeed, after the treatment, treated units reduced their per capita water consumption by 25.8 L per day on average (6.8 gallons/day) with respect to the control group (Column 1, Table 5). When expressing the dependent variable in logarithmic terms, this corresponds to a water saving equal to 5.7% (see Appendix A1, Table 11).

Table 5 ITT effect on water use

Our results differ from those of previous studies that did not find a significant effect of information campaigns conducted online and are consistent with the recent finding of Daminato et al. (2021).

6.1 Dynamic analysis

Figure 3 displays the results of the dynamic analysis obtained by estimating Eq. (3). Interestingly, we find evidence that the treatment effect is not constant over time, as it increases with the number of reports sent to the treated units. In particular, after the first email was sent, treated households do not show any statistically significant difference in their water consumption compared to the control group. After the second round of the trial, treated units registered an additional per capita water conservation of 20.4 L (5.4 gallons) per day compared to the control group. Nevertheless, the estimated coefficient was significant only at the 10% level. In the following rounds, the additional water saving of the treated group compared to the control group increased from 27.2 L (7.2 gallons) per day in November (significant at the 5% level) to 42 L (11.1 gallons) per day in January (significant at the 1% level).

Fig. 3 ITT effect on water use: Dynamic trend. Note: Point Estimate with 95% confidence interval

When the dependent variable is expressed in logarithmic terms, the treated and control groups do not display any statistically significant difference in their water consumption after the first email was sent. Conversely, from the second to the fifth (last) round of the campaign, the treated group registered an additional water saving that increased from 6.9% to 8.7%, both significant at the 5% level (see Appendix A1, Figure 6).

This finding suggests that sending repeated messages increases water savings over time, and it speaks against the alternative hypothesis that consumers’ attention decreases with the number of messages. Moreover, the figure does not highlight any significant difference in water consumption between the treated group and the control group in the pre-treatment period. This evidence supports the parallel trend assumption that must be satisfied for the DiD to provide unbiased results.

The long-run analysis reveals that the water conservation induced by the information campaign is not permanent. Indeed, the significant water savings of the treated group shrink in the month after the end of the information campaign and vanish in the second month after its end: no statistically significant difference between the treated group and the control group persists two months after the end of the campaign. This suggests that a five-month information campaign was not, per se, a sufficient tool to drive structural behavioral change.

6.2 Compliance and LATE Analysis

The ITT analysis was complemented with an analysis aimed at capturing the impact of the information campaign on the adherent units, i.e. those which actually opened the related emails. To address the potential endogeneity issues associated with the self-selected nature of the compliance status, we run an IV regression to estimate Eqs. 4.1 and 4.2. This allows us to estimate the Local Average Treatment Effect of the information campaign (Table 6). Results of the IV first stage show that the coefficient of the instrumental variable is positive and highly significant. Since it captures the variation in compliance determined by the instrument, it can be interpreted as a 47% compliance rate.

Table 6 Instrumental Variable and LATE effect on water use: static analysis

In the second stage we estimate the effect of the predicted value of compliance on water consumption. The estimated coefficient of interest points to a significant local average treatment effect (LATE), which corresponds to an average per capita water saving of 55 L per day for the complying units compared to the control group. When expressing the dependent variable in logarithmic terms, this corresponds to an additional water saving of 12.2% (see Appendix A1, Table 12). The related endogeneity test rejects the null hypothesis that the compliance variable is exogenous, thus pointing to the need for an IV approach. In the weak identification test, the F value is much larger than the critical value, which allows us to reject the null hypothesis that our instrument is weak.
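As a rough consistency check, and assuming that the just-identified IV estimate equals the ratio of the ITT effect to the first-stage coefficient (which holds when both are estimated on the same sample with the same fixed effects), the rounded figures reported above give:

$$\widehat{\omega }\approx \frac{{\widehat{\beta }}_{ITT}}{{\widehat{\beta }}_{1}}=\frac{25.8}{0.47}\approx 54.9\text{ L per day}$$

which is in line with the estimated LATE of 55 L per day. The same ratio computed on the log specification, 5.7%/0.47 ≈ 12.1%, is close to the reported 12.2%; the small gap is compatible with rounding of the published coefficients.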

To further inspect the mechanisms underlying the local average effect of the information campaign on the complying treated units, we re-estimate Eqs. 4.1 and 4.2 for different time periods. This setting is aimed at capturing the dynamic trend of the LATE and yields two interesting results. First, the IV first-stage analysis shows that, when the second informative feedback was sent, the rate of compliance declined from 57.6% to 44.5%, and it then remained quite stable until the end of the information campaign (see Fig. 4, Panel A). Second, we find that the size of the water saving registered by the complying units increases with the length of the post-treatment period (from 1 to 5 months after the beginning of the information campaign), and thus with the number of notifications sent via email. In particular, the first email notification of the information campaign does not induce a significant local average treatment effect. However, the effect becomes evident and significant after the second notification is sent. The additional water saving monotonically increases from 29 L per day to 54 L per day when all five notifications are considered (Fig. 4, Panel B). When expressing the dependent variable in log terms, this corresponds to an additional water saving which increases from 7.6% to 12% over the same period.

Fig. 4 Instrumental Variable and LATE effect on water use: Dynamic Analysis. Note: Point Estimate with 95% confidence interval

On the one hand, the increased water saving over time suggests that sending repeated feedback helps households to keep this information salient, avoiding a loss of attention due to cognitive limitations or time constraints. On the other hand, this result might depend to a certain extent on a change in the composition of the complying group. Indeed, complying units, or households which decide to opt out of the program, might differ from the full study group along some dimensions.

According to Angrist and Hull (2023), while the LATE approach addresses potential selection bias issues, self-selection into adherence can limit the relevance of IV LATE estimates when specific demographic groups are substantially under-represented among compliers. Therefore, following Angrist and Hull (2023), we compare adherent units with the full study group on a variety of dimensions. Results show that compliers have demographic characteristics broadly representative of the study group at large (see Appendix A4, Table 18). This evidence leads us to exclude the possibility that the increasing water conservation that we observe over time is driven by specific characteristics of non-adherent units that decide to opt out of the information campaign by not opening the emails that were sent on a monthly basis.

6.3 Heterogeneity analysis

Table 7 reports the results of the triple DiD (Eq. 5), where treated units are further distinguished depending, respectively, on their pre-treatment consumption class (low, medium or high), on the number of resident family members (fewer than three, or three or more), and on the age of the water contract subscriber at the time of our study (50 or younger, or over 50). We find that the ITT impact is heterogeneous across the consumption classes (Table 7): while the reduction in consumption is not statistically significant for the low and medium categories, the high consumption category strongly reduces its water consumption (-76.8 L, or 20.3 gallons, per capita per day, i.e. -19%) compared to untreated users belonging to the same category (see Appendix A1, Table 13 for the results of the analysis when expressing the dependent variable in log terms).

Table 7 ITT effect on water use: Heterogeneity Analysis

It is worth stressing that our study was not specifically designed to test for heterogeneity across types of households. Therefore, the heterogeneity results cannot be fully interpreted in terms of causality. Nevertheless, they suggest that the average reduction in water consumption of the treated group is largely driven by the high consumer category, which has the greatest opportunity to save water at lower marginal costs. Conversely, low-level consumers, who already adopt sustainable habits, have higher marginal water conservation costs and do not find significant opportunities to further reduce their water consumption. When exploring heterogeneous effects across age and family size (Table 7, Columns 2 and 3), we did not find any significant difference among the categories of the treated group. Finally, when considering all these categories simultaneously, we find that the average reduction in water consumption observed for the treated group was mainly driven by the “high-users” consumption class (Table 7, Column 4).

To provide additional information on the observable characteristics underlying the heterogeneity analysis, we ran a set of t-tests to compare household-level characteristics across the different groups (see Appendix A3, Table 17). According to the results, for both the control and treatment groups, users belonging to the high-consumption category are on average older than those belonging to the medium and low-consumption categories. Conversely, medium and high treated households do not show a statistically significant difference in their age. With respect to gender, low and medium users are highly comparable, and the same holds when comparing medium and high users. Conversely, the share of women is 10% lower in the high-user group than in the low-user group, with this difference being statistically significant. Finally, concerning the number of residents, high users have on average 0.3 fewer residents than medium and low users.

Although significant in some cases, the differences among consumption classes tend to be quite small. Moreover, it should be highlighted that they are time-invariant characteristics that are absorbed by the fixed effects in our empirical models and that are included in the matching procedure to increase the comparability between treated and control units. Therefore, these differences are unlikely to materially affect the robustness of our results.

6.4 Feedback analysis and RDD

In this section we estimate Eq. (6). Results show that consumers react differently to different feedback (Fig. 5). Indeed, units receiving a ‘low user’ feedback tend to slightly increase their consumption compared to similar units which receive a ‘medium user’ feedback. An opposite result is found when we compare medium and high consumers around the cutoff: those receiving a ‘high user’ feedback significantly reduce their consumption compared to similar users who are in the medium category.

Fig. 5 ITT differential impact effect on water use of different feedbacks: RDD. Note: Point Estimate with 95% confidence interval

A similar finding emerges when the “low-medium” and the “medium–high” cutoffs are considered jointly (Table 8). Running a single regression for both cutoffs allows us to formally test whether the feedback matters differentially across the two cutoff types. In this case too, we find that, around the low-medium cutoff, consumers below the cutoff who receive a “low-users” notification tend to marginally increase their water consumption with respect to medium users who, despite having a comparable level of water consumption, are located above the cutoff. Similarly, around the medium–high cutoff, only those consumers above the cutoff who receive a “high-users” notification tend to marginally reduce their water consumption, while medium users located below the cutoff do not register any significant change in their water consumption. This latter result changes when the dependent variable is expressed in log terms (see Appendix A1, Table 14). In this case, we find that both types of consumers reduce their water consumption in percentage terms, independently of the side of the cutoff on which they are located, and thus independently of the type of notification they received. However, those located above the cutoff, who received a “high-user” notification, reduced their water consumption more than those located below the cutoff.

Table 8 ITT effect of feedbacks on water use: RDD

7 Robustness Checks

In this section we present and discuss a battery of analyses developed to further test the validity of the assumptions that must be satisfied in order to support the adopted empirical strategy. Moreover, we want to test whether our findings are robust to alternative empirical strategies.

Placebo Test

Running an RCT ensures that the units randomly assigned to the treatment are not statistically different from the untreated ones. However, because we were committed to involving all eligible households in the information campaign, we could not run a pure RCT. Given this exogenous institutional constraint, we assessed the impact of the information campaign with a Two-Way Fixed Effects Diff-in-Diff (DID), where the treated and untreated units were selected through a PSM. The related balance tests confirm that, before the treatment, the selected control group did not present any statistically significant difference with respect to the treated one along a variety of dimensions. Moreover, the DID dynamic analysis confirms that, in the months preceding the treatment, the average water consumption of the treated group was not statistically different from that of the control group.

Furthermore, as an additional validation of our results, we perform a placebo test by hypothetically assuming another date for the delivery of the treatment (see Table 9). More precisely, we split the real pre-treatment period into two sub-periods: a pre-treatment period from May 2021 to June 2021 and a false post-treatment period from July 2021 to August 2021. We developed a second placebo test where we take November 2020 to April 2021 as the pre-treatment period and May 2021 to August 2021 as the false treatment months. The results of the placebo tests show that, in the false post-treatment period, there are no statistically significant differences in water consumption between the treated and the control group (see Appendix A1, Table 15, for results in log terms). This suggests that treated and control groups also did not differ with respect to those unobservable characteristics that correlate with, and can influence, the level and trend of their water consumption. Moreover, these results provide further evidence supporting the parallel trend assumption that must hold for the staggered difference-in-differences design to provide unbiased estimates.
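The logic of the first placebo test can be sketched with a simple 2x2 difference-in-differences on group means. This is a simplification of the two-way fixed effects regression used in the paper: the function name `did_estimate`, the row layout and all consumption values are hypothetical, and only the month split (May to June versus July to August 2021) follows the text.

```python
from statistics import mean


def did_estimate(rows, is_post):
    """2x2 difference-in-differences on (treated, month, consumption) rows."""
    def cell(treated, post):
        return mean(y for t, m, y in rows
                    if t == treated and is_post(m) == post)

    treated_change = cell(True, True) - cell(True, False)
    control_change = cell(False, True) - cell(False, False)
    return treated_change - control_change


# Hypothetical pre-treatment data: per capita consumption in L/day.
# Both groups follow the same trend, so the placebo effect should be zero.
pre_treatment_rows = [
    (True, "2021-05", 240.0), (True, "2021-06", 242.0),
    (True, "2021-07", 250.0), (True, "2021-08", 252.0),
    (False, "2021-05", 230.0), (False, "2021-06", 232.0),
    (False, "2021-07", 240.0), (False, "2021-08", 242.0),
]


def fake_post(month):
    """False post-treatment period: July and August 2021."""
    return month >= "2021-07"


placebo_effect = did_estimate(pre_treatment_rows, fake_post)
print(placebo_effect)
```

An estimate close to zero on the false post-treatment period is what one would expect if the two groups were on parallel trends before the real campaign started.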

Table 9 ITT effect on water use: Placebo test

Other robustness checks

Here we show that our main findings are confirmed when we introduce changes to our empirical strategy that alter the sample composition and size. First, descriptive statistics show that the monthly average of per capita water consumption ranges from a minimum of 0.7 L per day to a maximum of 7,781 L per day. Therefore, we test whether our results were driven by outliers, namely those households that registered a very low level of water consumption in at least one month (plausibly because they were on vacation), and those that registered an anomalously high level of water consumption. After trimming the distribution of per capita water consumption at the 2nd and 98th percentiles, the average level of per capita daily consumption declines from 240.3 to 211.3 L per day, and now varies from a minimum of 40 L per day to a maximum of 866.6 L per day. As shown in Table 10 (Panel A), when we exclude these outliers from our sample, the results confirm that the treatment has a significant impact on water consumption, though the size of the estimated water saving is now lower, and amounts to 13 L per day, or -4.8% (Appendix A1, Table 16 reports the results when the dependent variable is expressed in logarithmic terms).
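The trimming step can be illustrated as follows. This is a minimal sketch on hypothetical values rather than the study's procedure, which excludes households on the basis of their monthly records; the function name `trim_outliers` and the choice of the "inclusive" quantile method are assumptions.

```python
from statistics import quantiles


def trim_outliers(values, lower_pct=2, upper_pct=98):
    """Keep only values between the lower and upper percentile cut points."""
    # quantiles(..., n=100) returns 99 cut points: index 0 is the 1st
    # percentile, index 1 the 2nd, and index 97 the 98th.
    cuts = quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [v for v in values if low <= v <= high]


# Hypothetical per capita consumption values in L/day.
consumption = list(range(1, 101))
trimmed = trim_outliers(consumption)
print(len(consumption), len(trimmed), min(trimmed), max(trimmed))
```

Comparing the mean before and after trimming, as done in the text, shows how much the extreme observations pull the average.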

Table 10 ITT effect on water use: robustness checks

As a further robustness check, we adopted a stricter matching criterion. In particular, we selected the single closest eligible control unit to be paired with each treated unit (one-nearest-neighbor matching). This reduces the sub-sample from 2,355 to 1,428 households (half of them being treated). Results reported in Table 10 and Table 16 (Panel B) show that our findings are largely confirmed when we adopt an exact 1:1 matching, which ensures a higher pre-treatment comparability between the control and treated groups.
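A one-nearest-neighbor pairing of this kind can be sketched as below. The sketch assumes that matching is done on a single score (for example an estimated propensity score) and without replacement, which is consistent with half of the matched sample being treated but is not stated explicitly in the text. The greedy processing order, the function name `match_one_to_one` and the scores are hypothetical.

```python
def match_one_to_one(treated_scores, control_scores):
    """Greedy 1-nearest-neighbor matching without replacement.

    Both arguments map a unit identifier to its matching score.
    Returns a dict mapping each matched treated unit to one control unit.
    """
    available = dict(control_scores)  # controls not yet used
    pairs = {}
    for unit, score in sorted(treated_scores.items()):
        if not available:
            break  # more treated units than controls
        closest = min(available, key=lambda c: abs(available[c] - score))
        pairs[unit] = closest
        del available[closest]  # each control is used at most once
    return pairs


# Hypothetical matching scores.
treated = {"t1": 0.30, "t2": 0.62}
controls = {"c1": 0.31, "c2": 0.60, "c3": 0.90}

pairs = match_one_to_one(treated, controls)
print(pairs)
```

Greedy matching depends on the order in which treated units are processed; an optimal-matching routine or a caliper would be natural refinements.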

We further show that the main findings of the ITT analysis do not depend on the chosen pre-treatment period. Our main findings remain robust when Eq. (2) is estimated over a longer pre-treatment time span, from November 2020 to August 2021. Nevertheless, the size of the treatment effect is now smaller and equals a per capita water saving of 17 L/day compared to the control group, corresponding to a reduction of around 8.8% (see Table 10 and Table 16, Panel C).

We finally test the robustness of our results by running our baseline regression when the entire balanced pre-matching sample is used as control group and by adding household fixed effects and time fixed effects at a household’s level. Again, we find consistent results (Table 10 and Table 16, Panel D).

8 Discussion

In this section, we address a few aspects that merit specific attention. First, we discuss the information campaign’s main drivers of induced water saving. Second, we propose a series of exploratory considerations related to the potential water-saving measures that households could implement in response to the intervention. Third, adopting a comparative approach, we discuss the generalizability of our findings both at national (Italy) and international level.

Information campaign drivers of water saving

Following previous research in this field, our information campaign, and the message notified via email to the treated users, contained a plurality of information: among others, the monthly average of per capita daily water consumption; the monthly average level of water consumption of the treated group; the household ranking position and the corresponding consumption class size. Our study was not designed with the clear intent of assessing the differential impact of each piece of information. Therefore, one limitation of our research is that we cannot identify the precise channel driving the water saving promoted by the information campaign.

Nevertheless, the RDD analysis provides some interesting insights into the potential channel driving the treated units’ additional water savings. In this analysis, treated units on one side of the cutoff are compared with other treated units that, despite recording similar levels of water consumption, are placed on the other side of the cutoff and therefore receive a different notification of their class size. The results suggest that the additional water saving cannot be directly attributed to the provision of information of which the user was unaware before the launch of the intervention. Indeed, if the lack of awareness of water consumption were the main cause behind unexploited water savings, then the provision of this information should have promoted additional water saving independently of the side of the cutoff on which the consumer was placed.

Conversely, the results of the RDD analysis show that consumers with comparable levels of water consumption reacted differently to the information campaign depending on the side of the cutoff on which they were placed. This result suggests that the recorded reduction in water consumption was mainly driven by the social and comparative component of the information campaign rather than by the notified amount of water consumption per se. In fact, those who received a low consumption notification, and therefore were labelled more favourably than their counterparts immediately above the cutoff, increased their consumption at the margin. On the contrary, those who received a high consumer notification, and therefore performed worse than their counterparts located below the cutoff, reduced their consumption more significantly. This evidence highlights the relevance of the social comparison component of the information campaign in driving a behavioural change.

Another open issue concerns the reason why the effect of the campaign vanished a few months after its end. Again, we cannot unambiguously identify the underlying motivation. Nevertheless, the plurality of our results can provide some useful insights to interpret this evidence. One potential explanation could be that, due to the adopted measurement unit (per capita daily water consumption), people did not really understand the message embedded in the campaign. Indeed, this information differed from the one reported in the water bill (households’ aggregate water consumption per quarter). We do not consider this explanation exhaustive. Indeed, if the provided information had not been clear enough (due to the chosen measurement unit), we should not have observed any additional water saving during the period of the campaign. Conversely, we document that, when it was in place, the campaign promoted a significant reduction in water consumption.

Another potential explanation is that the campaign, in spite of being effective in bridging a cognitive gap on the users’ water footprint, was too short to drive a structural behavioral change. This potential explanation is consistent with previous evidence on the attention bias and on the memory retrieval bias. Concerning the former, due to cognitive limitations, time constraints, or simply because certain information is more salient or easier to process than others, people can focus only on a limited set of information or stimuli, while ignoring or undervaluing other relevant information, and therefore make sub-optimal choices (Cowan 1988, 1995; Styles 2005). The latter bias refers to the tendency of individuals to recall and retrieve information from memory in a biased or selective manner (de Fockert 2005; Furely and Wood 2016). Moreover, the fact that the social comparison ended with the campaign can partly explain why the savings promoted by the campaign did not persist over time. Since the social norm component is so relevant in driving our result, the lack of persistence in water saving may depend on the short duration of the intervention, which did not favour the emergence of a social norm of water conservation.

Sources of the positive impact of the intervention

Even though our setting does not offer any opportunity to investigate at a micro-level how water is actually used and consumed by household members, we can discuss, in an exploratory way, which actions can drive a reduction in water consumption, although we cannot determine which of them were specifically triggered by the intervention. These actions can be clustered into two broad categories, depending on whether water consumption declines as an effect of a change in behavior or a technological change.Footnote 19 From the study by Ansink et al (2021), which focused on the impact of audits on water conservation and specifically distinguished between the information and technological components, we know that the adoption of devices reducing water pressure is particularly effective in the long run, while the information component of the water audit has a large initial impact that gradually fades. Considering the relatively mild and time-limited nature of our intervention, coupled with the absence of explicit emphasis on technological components, we are inclined to suggest that the water savings triggered by our intervention primarily stem from a heightened adoption of micro water-saving behaviors. This includes, but is not limited to, practices such as shorter showers and running full loads in the dishwasher, as well as a proper use of the toilet flush. We would tend to exclude that an intervention of such nature induced households to buy new and more efficient appliances or to significantly upgrade aspects of the domestic hydraulic system.

External validity of evidence generated through this study

As argued by Andor et al. (2020), generalizing the results of a field study of this sort to other contexts can be challenging. Along a similar line, Vivalt (2020) concludes that the extent to which experimental results can be generalized remains an open question. Like other water consumption studies in the existing literature, our intervention has different characteristics regarding the context, study design, empirical setup, sample size, intervention studied, and treatment. Nevertheless, our findings align with those from other geographic and institutional contexts. Existing literature on household water conservation indicates a reduction in water consumption ranging between 4 and 9% due to nudge programs. Notable references include studies by Carlsson et al. (2021), Schultz et al. (2019), Torres and Carlsson (2018), Miranda et al. (2020), and Jessoe et al. (2021). In essence, the results of our analyses are comparable to those obtained from trials conducted in various geographical and institutional contexts. The consistency of results across studies conducted in different locations suggests a certain external validity of the nudging approach. This is promising as it indicates the potential effectiveness and generalizability of nudge strategies in promoting water conservation in different contexts. As part of our study, we recognize that although our findings are in line with other studies, they cannot be generalized to the entire European population. However, they can be extended to European cities with similar characteristics. The full generalizability of the results is limited by several considerations. Compared to other Italian areas, the Milan metropolitan area is richer and more dynamic, making it more similar to some European cities than to other Italian cities. On the other hand, the levels of water consumption per capita are high and not comparable to those of other European cities, but closer to those recorded in other areas of Italy.
Indeed, the average water consumption in the metropolitan city of Milan aligns with that of other Italian cities, where water stress is also increasing. In detail, in the Italian metropolitan cities, where over 30% of the Italian population resides, the daily water supply provided for authorized uses reaches an overall average of 236 L per inhabitant,Footnote 20 one of the highest water usages per person in Europe.Footnote 21 These differences may be influenced by factors such as climate, lifestyle, water management practices, and individual awareness of water conservation. Italy's context, characterized by relatively high water consumption, offers an intriguing scenario for investigating the potential impacts of informational campaigns on domestic water usage. This underscores the necessity of tailored initiatives to promote water conservation. On top of that, we believe that the region where we developed our analysis shares with other areas a growing problem of water scarcity linked to droughts induced by the climate crisis, as exemplified by Catalonia's recent declaration of a water emergency. This mirrors Italy's past experiences and echoes the challenges witnessed across the continent. Our results are likely to be less applicable in countries where water consumption is already low, but they may hold relevance in areas that, due to escalating droughts stemming from the climate crisis, must take action to reduce consumption. However, we anticipate that external validity may diminish in different settings, such as smaller Italian cities or European cities where per capita water consumption is considerably lower than the Italian average.

9 Conclusions

With this research we analyzed the effects of a water information campaign run from September 2021 to January 2022 on a sample of around 1,000 households, equipped with smart meters and located in the metropolitan area of Milan. The information campaign was designed to make several contributions to the existing literature. First, it is the only campaign developed in Italy, a country characterized by an increasing water scarcity issue, due both to the intensification of climate-related droughts and to the unsustainable behavior of Italian citizens, who, on average, consume 236 L (62.3 gallons) of water per day, one of the highest levels in Europe. Second, compared to previous trials, we communicated the daily average water consumption (instead of the total water consumption) at a per capita level (instead of the households’ aggregated levels), thus providing information in a format as familiar as possible, so that it could be easily quantified and assimilated by non-skilled users. The third contribution of our research relates to the information notification tool. Unlike previous studies, we decided to provide the information on water consumption exclusively via email, which allowed us to monitor the email click (open) rate and the related opt-out rate. Thanks to this, we could adopt the ITT analysis as our main analysis and complement it with a LATE estimation through an IV approach.

The main result of our research suggests that the information campaign was effective in promoting an average reduction in per capita water consumption equal to 25.8 L/day (6.8 gallons/day), corresponding to a 6% water saving. Considering that the intervention involved around 1,000 households with an average of 2.6 inhabitants, this corresponds to roughly 2 million liters (528 thousand gallons) saved per month and more than 10 million liters (2.6 million gallons) over the period of the intervention. It is important to note that this number refers to the average (ITT) effect of the information campaign regardless of its compliance rate. Moreover, we show that this effect is mainly determined by users who have been effectively treated. Indeed, when focusing on compliers, we estimate a 55 L per day water saving (LATE). We further compared adherent and non-adherent units with the full study group across a variety of dimensions (see Appendix A4). Since various demographic groups are well-represented among LATE compliers, we can conclude that the IV estimates of the information campaign’s effect are likely to predict these effects beyond the group that produced them. This evidence leads us to exclude the possibility that the increasing water conservation that we observe over time is driven by specific characteristics of non-adherent units that decide to opt out of the information campaign by not opening the emails that were sent on a monthly basis.
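The aggregate figures above follow from a short back-of-the-envelope calculation, sketched below. The household count, household size, per capita saving and five-month duration come from the text; the 30-day month is an assumption made here for illustration, and the variable names are hypothetical.

```python
HOUSEHOLDS = 1_000               # approximate number of treated households
PERSONS_PER_HOUSEHOLD = 2.6      # average household size
SAVING_L_PER_PERSON_DAY = 25.8   # ITT estimate, liters per person per day
DAYS_PER_MONTH = 30              # assumption: average month length
CAMPAIGN_MONTHS = 5              # September 2021 to January 2022
LITERS_PER_US_GALLON = 3.785411784

daily_saving_l = HOUSEHOLDS * PERSONS_PER_HOUSEHOLD * SAVING_L_PER_PERSON_DAY
monthly_saving_l = daily_saving_l * DAYS_PER_MONTH
campaign_saving_l = monthly_saving_l * CAMPAIGN_MONTHS

print(f"per month: {monthly_saving_l / 1e6:.2f} million L "
      f"({monthly_saving_l / LITERS_PER_US_GALLON / 1e3:.0f} thousand gallons)")
print(f"campaign:  {campaign_saving_l / 1e6:.2f} million L "
      f"({campaign_saving_l / LITERS_PER_US_GALLON / 1e6:.2f} million gallons)")
```

Under these assumptions the monthly total is about 2.0 million liters and the five-month total about 10.1 million liters, in line with the rounded figures reported above.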

Existing literature on domestic water conservation indicates a reduction in water consumption ranging between 4 and 9% due to nudge programs (Torres and Carlsson 2018; Schultz et al. 2019; Miranda et al. 2020; Carlsson et al. 2021; Jessoe et al. 2021). In our study, in response to the intervention, per capita water consumption decreased by around 6%. Compared to the existing literature, we employed an information campaign delivered exclusively via email, communicated a different type of information, operated in a different geographical context, and additionally had the opportunity to observe the compliance status, allowing us to estimate the local average treatment effect.

We believe that our results expand on the previous literature, which found that website campaigns had little or no effect in promoting water conservation habits, and reveal that paperless online campaigns can be an effective and cheap instrument for promoting water-saving behavior.

Our information campaign, and the message notified via email to the treated users, contained a plurality of information (among others, the per capita level of water consumption, the corresponding consumption class size, and the average water consumption). Therefore, one limitation of our research is that we cannot identify the precise channel driving the water saving promoted by the information campaign. Nevertheless, the results of the RDD analysis suggest that water saving was not promoted just by the pure information on the level of water consumption. Indeed, consumers with comparable levels of water consumption reacted differently to the information campaign depending on the side of the cutoff on which they were placed and on the related feedback they received. This result suggests that the change in behavior was largely influenced by the type of feedback received.

Some further considerations can be developed by estimating the economic savings associated with the lower water consumption promoted by the information campaign. In Italy, the water sector is regulated by the national Authority ARERA. According to the Italian regulation, the water tariff is divided into two parts. The fixed quota includes the water service tariffs for aqueduct, sewerage and purification, i.e. the management costs of the water network. The related rate is established by the territorial Agency and can vary from province to province. In Lombardy, it varies not only by type of user but also across the different typologies of private users (final consumers or self-employed individuals with a proper VAT number). The variable quota depends on the number of residents composing a household, the actual per capita water consumption, and any exemption or concession applied to particularly vulnerable classes of consumers. The CAP Holding water company estimated that the average cost per cubic meter of water for a four-member family belonging to the category “residents’ water domestic use” is €1.37 (or about 0.034 euro cents per liter per capita), for a total expense of approximately €450 per year.Footnote 22 With a simplified simulation, we can estimate that, if the 25 L per day of water saving promoted by our intervention were permanent, this would correspond to a per capita annual monetary saving of €3.12. For a four-member family, the annual water bill would decline by €12.5, that is, by about 3%. Considering that the field intervention lasted for five months only and that its effect vanished a few months after its end, and considering the lower number of residents per household in the treated group, the real monetary saving associated with the information campaign was actually modest.
This suggests that the water saving per se, and the related (quite modest) economic saving do not seem to represent the main channel of the change in water behaviour promoted by the intervention.
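The simplified simulation can be reproduced as follows. The sketch reads the per capita price as the price per cubic meter divided by the four family members, an interpretation adopted here because it reproduces the reported €3.12 and €12.5; it is not stated explicitly in the text, and the variable names are hypothetical.

```python
PRICE_EUR_PER_M3 = 1.37         # average cost per cubic meter (CAP Holding)
FAMILY_SIZE = 4                 # reference family in the tariff example
ANNUAL_BILL_EUR = 450           # approximate yearly expense for that family
SAVING_L_PER_PERSON_DAY = 25    # estimated per capita saving, assumed permanent
LITERS_PER_M3 = 1_000
DAYS_PER_YEAR = 365

# Assumption: the per capita price is the household price split over four people.
price_eur_per_liter_per_capita = PRICE_EUR_PER_M3 / LITERS_PER_M3 / FAMILY_SIZE

annual_liters_saved = SAVING_L_PER_PERSON_DAY * DAYS_PER_YEAR
per_capita_saving_eur = annual_liters_saved * price_eur_per_liter_per_capita
family_saving_eur = per_capita_saving_eur * FAMILY_SIZE
share_of_bill = family_saving_eur / ANNUAL_BILL_EUR

print(f"per capita: EUR {per_capita_saving_eur:.2f} per year")
print(f"family:     EUR {family_saving_eur:.2f} per year")
print(f"share of annual bill: {share_of_bill:.1%}")
```

Under this reading, the annual saving is slightly above €3 per person and about €12.5 for the family, just under 3% of the yearly bill, which supports the conclusion that the monetary incentive is small.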

Finally, the dynamic analysis highlighted some interesting insights. Indeed, while we found that the amount of water saving increased with the number of emails, over the long term the water conservation impact of the campaign was not permanent, and it expired a few months after the end of the trial. This suggests that the information campaign was not a sufficient tool to drive a structural behavioral change and casts doubt on whether the observed water saving was effectively induced by a correction of the users’ cognitive biases on their consumption levels, or whether it was induced by the feeling of being observed. Again, we are unable to precisely identify the main reason why water saving expired a few months after the information campaign. We think that the reason cannot be attributed to the notification tool or to the type of provided information, in particular to the possibility that people did not really understand the message embedded in the campaign. Indeed, if that were the case, we should not have observed any water saving during the period of the campaign. Conversely, we document that, when it was in place, the campaign promoted a significant reduction in water consumption. On top of this consideration, the results of the RDD analysis suggest that the main reason behind the vanishing effect of the campaign a few months after its end cannot be attributed to the lack of environmental awareness or to the lack of updated information on water consumption. Indeed, the RDD analysis shows that information on the level of water consumption was not per se the main driver of the observed water saving: people with similar levels of water consumption reacted differently to the information campaign depending on the feedback they received. Therefore, the social comparison component of the campaign, and in particular the relative positioning of each user with respect to the average consumer, played a significant role in influencing water consumption behavior.
In a similar vein, the fact that the social comparison ended with the campaign can partly explain why the savings promoted by the campaign did not persist over time. Moreover, since the social norm component is so relevant in driving our result, the lack of persistence in water saving may depend on the short duration of the intervention, which did not favour the emergence of a social norm of water conservation.

We believe these findings have some relevant policy implications. Given the current and growing water emergency that is affecting advanced and non-advanced countries, our results show that the implementation of information campaigns, through more advanced electronic tools, would help policymakers achieve the objective of reducing water consumption in line with the United Nations’ SDGs. Furthermore, public policies are needed to encourage water companies to adopt information campaigns. Government guarantees, soft loans and tax incentives are some standard examples. However, our results also suggest the importance of complementing these information campaigns with other measures aimed at inducing a structural change towards more sustainable behavior.