1 Introduction

Climate variability and change pose risks to UK water security through altered drought frequency and intensity, changing water demand, and damage to infrastructure by extreme events. These potentially affect water availability for abstraction, storage and supply with consequences for levels of service and environmental quality (Arnell 2011; Prudhomme et al. 2012). These pressures are compounded by rising water demand linked to growth in consumption and the number of households. A decade ago the Department for Environment, Food and Rural Affairs (Defra) predicted that there might be insufficient water resources to meet demand beyond 2025 (Defra 2002). Accordingly, Defra’s (2008) water strategy for England, Future Water, set a target for reducing household consumption by 13 % by 2030, from 150 to 130 l per person per day. More recently, Ofwat incentivised water companies to reduce household water demand by 5 l per property per day over the current Price Review period (Defra 2011).

This paper reviews methods of household water demand estimation and forecasting with particular reference to the UK. Domestic water use responds to weather, technological, economic, demographic and cultural factors so demand profiles vary between regions. However, sampling and modelling techniques are transferable, and contribute to a broader understanding of the predictability of water use. Furthermore, detailed analysis of water demand datasets can provide general insights into patterns of household water use behaviour and variability that may be tested in other regions.

First, we set the scene with a brief history of water management in the UK. Next, we review approaches to UK household water demand estimation and forecasting, drawing on the handful of studies that explicitly examine sensitivity to weather. The literature is organised into studies of water demand for short- (daily, weekly to seasonal) and longer- (annual to decadal) timescales. Short-term forecasts are more relevant to operational management of the water system whereas longer-term projections are needed to schedule new supplies and to underpin Water Resource Management Plans (WRMP). We then use the Anglian Water Services (AWS) ‘Golden 100’ data to illustrate practical issues that arise when quantifying household demand within water resource zones. This case study is part of a larger integrated assessment of the water system of South-East England under the Adaptation and Resilience to a Changing Climate Coordination Network (ARCC CN). Finally, we close with a few priorities for further research.

2 A Brief History of Water Management and Policy in the UK

Table 1 lists events, policies and publications that have shaped household water demand management in the UK since the 1970s. Here, we highlight notable weather and political landmarks, including privatisation of the UK water industry in 1989. Under the new structure each water company became accountable to Defra, Ofwat (the water services regulation authority) and the Drinking Water Inspectorate. During the major UK droughts of 1975/76 water use restrictions were imposed and new methods were developed to estimate frequency of low flows. Subsequent episodes in the 1990s raised public awareness of vulnerability to drought, and highlighted the potential for climate change (with rising water demand) to impact water security, especially in southeast England.

Table 1 Milestones in climate change and UK water management 1973–2011

Prior to 1995 a ‘predict and provide’ approach was taken towards water resource planning and efforts generally focussed on supply-side options. Subsequently, a ‘twin-track’ approach with greater emphasis on demand-side options emerged from the Department of the Environment (1996) national review. This evolution in thinking reflected mounting opposition to new supply infrastructure, such as reservoirs, due to their large environmental and economic impact (Water Resources Act 1991). It also showed awareness of the importance of demand management in balancing water supply and demand (Environment Agency 2003; Defra 2008; WWF 2010).

Early guidance to water companies included methods for assessing potential impacts of climate change on water security. For example, a standard approach for estimating headroom was developed for the Third Periodic Review based on four climate change scenarios (UKWIR 1998). Guidance for the Fourth Periodic Review specified that scenarios of future supply and demand be considered in a sensitivity analysis. Furthermore, it stated that companies should apply ‘low-regret’ options that would be beneficial regardless of the climate outlook (Defra 2003).

UK water companies are required to forecast water demand in annual June Returns and to provide cost-effective, defined levels of service, as part of the 5-year Periodic Review process (Defra 2003). Ofwat encourages water companies to include forecasts of domestic customers’ demand (including micro-component use) within WRMP. Furthermore, household data inform water companies’ pricing policies and underpin their long-term sustainability plans. Despite these benefits, relatively few studies have evaluated household water demand forecasting in the UK (Herrington 1998; Downing et al. 2003; Goodchild 2003; Atkins 2005). No doubt the limited body of work reflects the cost and difficulty of measuring, interpreting and predicting complex water demand behaviours (Memon and Butler 2006; Medd and Chappells 2008). Although outside the main scope of this paper, international research on household water demand estimation and forecasting also provides valuable insights for the UK (e.g., Arbués et al. 2003).

Emphasis is now on improving the sustainability and cost-effectiveness of water supply while remaining mindful of short- and long-term environmental impacts. The Environment Agency (2009) report Water for People and the Environment promotes demand management and greater attention to environmental needs. Defra’s (2011) Water for life sets out a vision for more sustainable water management, including improved interconnection and trading of bulk water supplies between companies. Meanwhile, research has continued into potential impacts of climate change on water security. The Climate Change Act (2008) established the Adaptation Sub-Committee (ASC) which laid the first National Climate Change Risk Assessment (CCRA) before government in January 2012 (Defra 2012). Water sector issues featured prominently in the CCRA and the second annual review of the committee (ASC 2012).

With this historical synopsis in mind, the next section critiques methods of household water demand estimation, drawing attention to the few studies that examine sensitivity to weather and climate.

3 Estimating Domestic Water Demand from Household Data

Most UK studies are based on water-into-supply data as this is most readily available. However, supply data are not sufficiently detailed to resolve household water use behaviour in relation to external factors. Better information on household level and individual consumer behaviour is needed to improve operational management and long-term planning of the water system. In many cases, water companies have to estimate data from characteristics scaled across water resource supply zones. However, household water use is notoriously difficult to infer because it is shaped by local political, social, economic and meteorological factors; by changes in population, uptake of demand reduction measures, and technology; by price elasticity of consumption linked to household size (Arbués et al. 2010); and by interplay between these drivers. Forecasting is further complicated by uncertainties in climate change projections, hydrological modelling and supply calculations (McDonald et al. 2003).

Statistical forecasting methods rely on historic data to define relationships between independent and dependent variables (Memon and Butler 2006). These relationships can then be used to predict future household water demand but forecast accuracy is highly dependent on the quality of the underlying data (Tate 2000). Water demand forecasts are typically derived from explanatory variables (e.g., population, household size and climate variables) using linear regression methods. Alternatively, water companies employ geodemographic profiling to estimate household demand via factor analysis of census data to identify areas with common attributes. For example, the ACORN (A Classification of Residential Neighbourhoods) metric is widely used with household demand coefficients (derived from sample surveys) to estimate household consumption across the regional population.
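As a minimal sketch of the regression approach described above, the following fits a one-variable least-squares line relating household demand to occupancy. The data values and the function name `fit_line` are hypothetical and chosen purely for illustration; operational forecasts would use several explanatory variables and observed data.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical sample (not survey data): occupancy in persons and
# household demand in litres per day, constructed to lie on a line.
occupancy = [1, 2, 3, 4, 5]
demand = [160, 270, 380, 490, 600]

intercept, slope = fit_line(occupancy, demand)
print(intercept, slope)  # 50.0 110.0

# Predicted demand for a six-person household under the fitted line.
predicted_six = intercept + slope * 6
```

Because the fitted intercept is positive, demand per person falls as occupancy rises in this made-up sample, which is the kind of relationship such a model would be used to quantify.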

Water companies also base predictions on ownership-frequency-volume models which consider the number of water using fixtures, how often they are used, and the average quantity of water per use. ‘Microsimulation’ techniques have also been applied to household demand estimation and forecasting (Clarke et al. 1997; Williamson et al. 2002). These are discussed in further detail below as are emerging ideas about the cultural and social dimensions of water use.
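The ownership-frequency-volume idea can be sketched as a simple sum of products. This is an illustrative outline only: the class name `EndUse`, the function `household_demand` and all numerical values are assumptions made for the example, not figures from any water company model.

```python
from dataclasses import dataclass


@dataclass
class EndUse:
    name: str
    ownership: float       # fraction of households with the fixture (0 to 1)
    uses_per_day: float    # average uses per owning household per day
    litres_per_use: float  # average volume per use, in litres

    def daily_litres(self) -> float:
        # Ownership x frequency x volume for one end use.
        return self.ownership * self.uses_per_day * self.litres_per_use


def household_demand(end_uses):
    """Average household demand (litres per day) summed over end uses."""
    return sum(u.daily_litres() for u in end_uses)


# Hypothetical end uses and values, for illustration only.
end_uses = [
    EndUse("toilet", ownership=1.0, uses_per_day=10.0, litres_per_use=8.0),
    EndUse("shower", ownership=0.5, uses_per_day=2.0, litres_per_use=30.0),
    EndUse("washing machine", ownership=0.8, uses_per_day=0.5, litres_per_use=60.0),
]

total = household_demand(end_uses)
print(round(total, 1))
```

Forecasting with such a model amounts to projecting each of the three terms separately for every end use, which is why the result is sensitive to the assumed 'average' values.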

3.1 Short-Term Household Water Demand

3.1.1 Metering Trials, Individual Household Monitors, and Domestic Consumption Monitors (DCMs)

In the early 1990s trials began to take place across the UK to improve understanding of household water consumption and the impact of metering (Ofwat 1993b). For example, Russac et al. (1991) analysed the consumption of 969 properties over a 3-month period in 1989 from meter trials in Potters Bar, Hertfordshire. Average daily household water use was determined by property type: detached houses had the highest average consumption, followed by bungalows using 50 % less and flats or mixed accommodation using 60 % less. Age (as a proxy for occupational status) was also found to shape water consumption: single occupant households with a retired resident used 70 % more water than those with a working-age resident because the latter spends less time at home.

Russac et al. (1991) found that the most important determinants of water consumption were the number of water fixtures and household size. They showed that appliance ownership can be inferred from property type and occupancy rate (i.e., households with fewer appliances tend to live in smaller properties and have fewer occupants). Shove (2003) and Allon and Sofoulis (2006) assert that domestic water use is an outcome of convenience (and access to appliances), as well as notions of comfort and cleanliness driven by changing infrastructure, technology, societal attitudes and perceived ‘norms’. For example, washing machines increased convenience and raised expectations about the cleanliness of clothing and personal washing which, in turn, has increased household water consumption over the decades (Shove 2003).

Russac et al. (1991) further observed that daily average household consumption conceals much variation in water use within households. There are also large variations in demand between individual properties and within ACORN categories. Therefore, methods of demand estimation based on water-into-supply data offer limited insight into individual water use behaviours and household level drivers.

Thames Water used externally fitted individual household monitors to collect domestic consumption data for their Daily Water Use Study (DWUS). Between ~136 and ~1,625 volunteer households on unmetered tariffs were sampled from 1996 onwards. In 2002 they sampled ~1,900 non-volunteer households on unmetered tariffs to investigate any biases within the DWUS data. The results showed that self-selection (volunteers), the Hawthorne effect (knowledge of being monitored) and unrepresentative sample populations (e.g., no flats) all affected domestic consumption estimates.

3.1.2 Micro-Component Studies

There has been growing interest in micro-component approaches since the mid-1990s (Butler 1993; Edwards and Martin 1995; Atkins 2007). Flow meter technologies developed at that time allowed total water consumption to be disaggregated from a single household meter to multiple end uses such as washing machine, bathroom tap or toilet. These data can then be combined with knowledge of demographic profiles, occupancy rates, or climate variables to understand aggregate household water use. This approach has been most popular in America and Australia (e.g., Mayer and DeOreo 1999; Makki et al. 2011) because it allows the researcher to investigate water use in the home. Nonetheless, micro-component water usage still cannot be attributed to any individual other than in single occupancy households.

Current water resources planning (WRP) guidelines from Ofwat, Environment Agency and Defra stipulate that companies should understand how their customers use water in their homes and that this knowledge should be founded on quantitative analysis of micro-components. Furthermore, companies are expected to forecast the impact of climate change on patterns of domestic consumption and to incorporate these effects in their water resource plans.

Despite these imperatives, there are still relatively few micro-component datasets in the UK, and even fewer that were collected with climate in mind (see Table 2). Edwards and Martin (1995) undertook a pioneering study with what was, at the time, the largest survey of household water consumption in the UK. AWS ‘Survey of Domestic Consumption’ (SODCON) recorded the water use of 1,000 properties on metered tariffs, and 1,000 billed by rateable value. One hundred households within SODCON were chosen for micro-component sampling: the so-called ‘Golden 100’ dataset. In addition to an external meter, SODCON properties also had monitors on individual water fixtures. Metered data were supplemented by information on property, appliances and socio-economic status of participating households.

Table 2 Synthesis of UK domestic water consumption surveys (MC = micro-component, H = household, * = explicitly consider weather variables)

Although the study ran for 13 years, preliminary results were presented for April 1992 to October 1993 (Edwards and Martin 1995). These showed that as household size increases per capita consumption (PCC) decreases, but as household income increases so does PCC. The survey also revealed that agricultural villages and affluent suburban houses (ACORN classes A and J respectively) have the highest PCC values whereas council properties (ACORN class F) have the lowest. This supports the finding of Russac et al. (1991) that detached households (typically more affluent suburban houses) use on average more water than other household types. The results further suggest that house type can be used as a proxy for the occupants’ water use profile (which depends on their socio-economic status, family size and composition). This is reflected in water companies’ use of rateable value as a charging mechanism. However, changes in housing stocks and demographics over time mean that rateable value (last updated in 1990) is now considered to be an out-dated and unfair charging mechanism (Walker 2009). Edwards and Martin (1995) also reported higher than expected levels of night time water consumption linked to dishwasher and washing machine use.

The ‘SODCON’ and ‘Golden 100’ studies were novel because they surveyed customers that are not billed by meter alongside those on a metered tariff. This enables investigation of the influence of billing type on household consumption. National Metering Trials found an average 11 % reduction in water use in households billed by meter (Russac et al. 1991), compared with 15 % less in ‘SODCON’ (Edwards and Martin 1995) and 10–15 % less reported by the Walker Review (2009). However, such statistics are susceptible to biases (in sampled age, gender, occupation, socio-economic class), the Hawthorne effect, improved leak detection, or the residential price elasticity of demand for water.

Recent analysis highlights the importance of taking a holistic approach to water efficiency and management as outlined in the Code for Sustainable Homes in England. Fidar et al. (2010) show that although micro-component level water saving measures are designed to reduce overall water demand, consequences for energy consumption and carbon emissions may be overlooked. This is particularly important for hot-water fixtures. For example, Clarke et al. (2009) found that the carbon footprint of white goods is higher than other hot water uses because the carbon intensity of electrical appliances is greater than those fuelled by gas.

Micro-component data allow water use and the impact of demand management activities to be continuously quantified at the finest scale currently possible. These data provide quantitative evidence for assessing cost-effectiveness and suitability of supply and demand side management options. From analysis of the observed relationship between consumption and weather related variables, potential impacts of climate change on demand can be inferred. Given risks to UK water security from climate change, population growth, increased customer expectations about levels of service and growing environmental concerns, it is essential that a full understanding is developed of the benefits and limitations of demand management. This is recognised by the EA and Ofwat, and is reflected in the WRP guidelines.

Micro-component sample sizes (and thus spatial and temporal coverage) are ultimately limited by the capital and operating costs of collecting and maintaining data. Furthermore, they provide limited insight into individual water users. Sharp et al. (2011) claim that the concepts, methods and findings of positivist approaches to water demand forecasting are deeply embedded in the socio-technical system for water supply and forecasting in the UK. Data gained from micro-component and DCM techniques tend to emphasise headline PCC values for future water demand, predicted from a sample. Definitions of the ‘average user’ then become embedded in the regulation and maintenance of the system, implying that a ‘one-size-fits-all’ approach to water management can be adopted (Medd and Shove 2005a; Sofoulis 2011). In practice, there is considerable heterogeneity of water use behaviours at intra- and inter-household scales. Hence, post-positivists suggest that reliance on micro-component and DCM data within the UK is conceptually limiting (Sharp et al. 2011). However, when combined with appropriate qualitative data such micro-component approaches may reveal nuances of demand previously missed by quantitative or qualitative data alone (Brown et al., this issue; Pearce et al., this issue).

3.1.3 Peak Household Water Demands

Despite the above reservations, long-term meter studies provide insights about time-varying factors that influence household water use. For example, the Water Research Council (WRc), Environment Agency and 13 water companies used micro-component data to investigate diurnal variations in household water use (Kowalski and Marshallsay 2005). The study began with ‘identiflow’ meters collecting data from 250 households. Since then the survey has been expanded to 500 unmetered UK properties. These data were combined with socioeconomic and lifestyle indicators from the 2001 Census, ACORN classification, and data held in WRc and water company archives. The resulting dataset is one of the largest in the UK. Even so, metered properties and flats are underrepresented, underlining the challenge of obtaining a representative sample that can then be scaled to a water resource zone.

Kowalski and Marshallsay (2005) separated the data into summer-peak and off-peak consumption. [Stratifying in this way helps regression models explain more variance in the two subsets]. Off-peak water demand was most strongly correlated with ACORN class and occupancy rate which explained 40 % of the variation in consumption. The 60 % unexplained variance is partly due to behaviours and other external drivers (such as weather) not incorporated in the model. Unfortunately, the regression model was not applied to peak summer water consumption as Kowalski and Marshallsay (2005) felt that the sample size was too small and the data too noisy. Instead, comparative analysis of the largest water consumers with the ‘norm’ was undertaken. The former tended to own garden sprinklers, live in detached houses, and fall in the wealthiest ACORN class.
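The effect of stratification on explained variance can be illustrated with a deliberately idealised example. The sketch below assumes a one-variable regression of household demand on occupancy and uses made-up values in which peak and off-peak demand follow different lines; the function name `r_squared` and all data are hypothetical and do not reproduce the 40 % figure reported above.

```python
from statistics import mean


def r_squared(x, y):
    """R^2 of a one-variable least-squares fit (squared Pearson correlation)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Hypothetical data: occupancy (persons) and demand (litres per day).
occupancy = [1, 2, 3, 4]
off_peak = [200, 300, 400, 500]   # made-up off-peak demand
peak = [250, 400, 550, 700]       # made-up summer-peak demand

# Pooled fit ignores the season; stratified fits treat each subset separately.
r2_pooled = r_squared(occupancy + occupancy, off_peak + peak)
r2_off_peak = r_squared(occupancy, off_peak)
r2_peak = r_squared(occupancy, peak)

print(round(r2_pooled, 3), round(r2_off_peak, 3), round(r2_peak, 3))
```

In this contrived case the pooled model explains roughly 81 % of the variance whereas each stratified model fits its subset exactly; real data would be far noisier, as the unexplained variance in the off-peak model shows.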

Kowalski and Marshallsay (2005) also analysed diurnal patterns of micro-component water use on weekdays and weekends during summer peak and off-peak demand, by house type, socio-economic class and occupancy rate. Differences between water usage on summer peak days and the rest of the year were attributed to external tap use. Toilet flushing was the dominant water use during the night period, followed by washing machines and dishwashers. Large weekly and diurnal variations in household water use were also reported. Some of this behaviour depended on ACORN class and occupancy rate but there was still much unexplained variance. More sophisticated analyses involve treating each day of the week separately, or supplementing metered data with qualitative information from water use diaries, interviews or focus groups.

The findings of Kowalski and Marshallsay (2005) are echoed in international studies. For example, Gato et al. (2007) undertook a study of daily household water use in East Doncaster, Victoria, Australia. They built separate models for base use and seasonal use. The former mainly reflected indoor water use and was independent of variations in rainfall and temperature but strongly dependent on the day of the week.

3.1.4 Diary Based Studies

Butler (1993) undertook a diary and questionnaire study of the influence of household occupancy rates and day of the week on frequency and peak usage of water fixtures. The water use habits of 28 households (76 individual participants) were surveyed. Households ranged from single to five person occupancies with an average occupancy of 2.71 (close to the national mean at that time). Half the participants were from professional and managerial groups (compared with a national average of 15 %). This is a common limitation of household demand studies as datasets tend to be biased towards better educated and more water conscious socio-economic groups.

The original purpose of the study was to improve understanding of wastewater discharges for drain and sewer design (Butler 1991). [At that time, neither occupancy rate nor differences in discharge by day of the week were taken into account when installing drainage]. Sampling was undertaken over seven consecutive days in December 1987. Volunteers completed daily water use diaries documenting the time, day, and type of water use alongside the duration of the waste flow. Measurements were taken from fixtures alongside participants’ water use diaries to confirm average volumes and flow rates. The toilet was on average the most frequently used appliance, the sink was used 50 % as often as the toilet, whereas the bath, shower and washing machine were used ~10 % as often. During peak times, the basin was used most frequently followed by the toilet, with frequency of kitchen sink usage 50 % that of the basin (Butler 1993).

Frequency of appliance usage (except shower) and occupancy rate were found to be weakly positively correlated. Hence, household water use volumes can reflect interplay between occupancy and appliance usage. Both Russac et al. (1991) and Butler (1993) assert that household PCC is more representative of household demand than aggregate household water usage. On the other hand, PCC is limited because it masks the diversity of water use behaviours of even similarly ‘average’ consumers in a household. However, the only ways to record individuals’ water use are through diary based studies, questionnaires, interviews or observational techniques which have their own limitations.

Butler (1993) further identified that water use varies during the day and between days (i.e., there is a weekly cycle) with midweek water use behaviours differing from weekend use. These detailed studies show that models of household water appliance usage based on average weekly amounts hide significant variations in water consumption within days and between midweek and the weekend. Qualitative studies like Butler (1993) provide an opportunity to access the domestic routines and habits of the individual water user, what they consider as ‘normal’ usage, and reveal the ‘non-average’ user. One recent UK survey conceptualises domestic water use in terms of practices such as washing or gardening (Browne et al. under review; Pullinger et al. 2012).
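A short sketch shows how a single weekly average can conceal the midweek and weekend contrast described above. The daily readings are invented for illustration, and the function name `weekday_profile` is an assumption of this example rather than part of any cited study.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_profile(daily):
    """Mean consumption per day of the week from a {date: litres} mapping."""
    groups = defaultdict(list)
    for day, litres in daily.items():
        groups[DAY_NAMES[day.weekday()]].append(litres)
    return {name: mean(values) for name, values in groups.items()}


# Hypothetical two-week record: 300 l on weekdays, 400 l at weekends.
start = date(2024, 1, 1)  # a Monday
daily = {}
for offset in range(14):
    day = start + timedelta(days=offset)
    daily[day] = 400 if day.weekday() >= 5 else 300

profile = weekday_profile(daily)
overall_mean = mean(daily.values())

print(profile["Wednesday"], profile["Saturday"], round(overall_mean, 1))
```

The overall mean lies between the midweek and weekend values and matches neither, which is the sense in which weekly averages hide the weekly cycle.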

3.2 Long-Term Household Water Demand

3.2.1 Ownership-Frequency-Volume Models

Herrington (1996) is still one of the most widely cited studies of long-term water demand in the UK, especially in relation to climate change. Per capita household demands were forecast to the 2020s using water-into-supply data. It was assumed that past, present and future household water use could be understood through knowledge of appliance ownership, frequency and habits of appliance usage, and household occupancy rates. Data from Thames Water, Lee Valley, Cambridge and Southern Water’s three divisions were analysed for the period 1976–1991 to build a model and project future demand. Assuming no change in water policy or metering, it was forecast that between 1991 and 2021 per capita household consumption in these regions could increase by 20 % (~31 l per household).

Herrington (1996) applied expert judgement to identify water uses that could be sensitive to climate and then estimated potential impacts. A slight decrease in toilet water use was predicted as low-flush designs replace conventional cisterns. This could be offset by an expected slight increase in appliance numbers combined with greater frequency of usage. Conversely, it was assumed that water for personal washing would rise due to increased shower ownership, greater frequency of usage and wider ownership of power showers. These devices consume on average 60 l per usage; twice the volume of conventional showers. The forecast produced for 2021 suggested that shower related water demand could increase by 12 %.

Lawn irrigation was also expected to increase as a consequence of rising sprinkler ownership. Although volume of each usage was expected to remain the same, frequency of use was forecast to increase with growing evapotranspiration demands: from once every 5 days to twice every 9 days in summer (May to August). Based on these values and an assumed 1.1 °C temperature rise, lawn sprinkling was forecast to increase by 11.7 l per household per day. Other garden watering was forecast to increase by 20 %.
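The change in sprinkling frequency quoted above can be checked with simple arithmetic. The sketch below computes only the relative change in frequency for a household that already owns a sprinkler; it does not account for rising ownership and is not an attempt to reproduce the 11.7 l per household per day figure.

```python
from fractions import Fraction

# Uses per day in summer, taken from the frequencies quoted in the text.
before = Fraction(1, 5)  # once every 5 days
after = Fraction(2, 9)   # twice every 9 days

# With volume per use unchanged, demand per owning household scales
# in proportion to frequency of use.
relative_increase = after / before - 1

print(relative_increase, round(float(relative_increase) * 100, 1))  # 1/9 11.1
```

The frequency change alone therefore implies about 11 % more sprinkler water per owning household, with any further increase coming from wider ownership.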

Future water consumption associated with clothes washing and dish washing was expected to show little change. Water saving technologies in washing machines (25 % per cycle) balance increased ownership of these appliances. Likewise, dishwashers use less water than manual washing. Therefore, an expected 50 % ownership of dishwashers by 2021 has little effect on the forecast demand. Herrington (1996) concluded that without climate change and any changes in demand management, total household consumption would increase by 36 % by 2021 relative to 1991; with climate change, water demand was expected to increase by 41 % over the same period. However, a major limitation of this study was that no quantitative analysis was performed on end use data to determine domestic water consumption relationships. Furthermore, it relies upon the idea that an ‘average’ volumetric use of sampled appliances and water users can be defined and scaled up to accurately forecast the larger population. This masks the heterogeneity of appliance models, water-saving modifications (such as a cistern hippo) and variations in individual water use practices.

3.2.2 Regression Analysis

Atkins (2005, 2007) used AWS ‘SODCON’ data for East of England to build regression models for PCC based on the interaction between water use and occupancy, method of billing, ACORN classification, region and meteorological conditions. They concluded that the model provided a good fit throughout the period (2001–2004) as it explained 80 % of the daily variation in PCC. However, there was a tendency for the model to underestimate the size of summer peaks (also shown by Herrington 1996; Downing et al. 2003) and the effect of individual water use behaviours is aggregated to the household level.

Atkins (2007) also used AWS ‘Golden 100’ data to analyse micro-component water use. They investigated the data using the ownership-frequency-volume approach and regression analysis. They concluded that there was too much volatility within the total usage and significant potential for bias given the small sample (which would not be representative of the wider population). Nonetheless, knowledge of the qualitative patterns in micro-component use and the impact of metering and temperature were still useful for planning purposes.

3.2.3 Long-Term Trends in Peak Household Water Demand

Herrington’s (1998) analysis of southern and eastern England household water micro-components was novel because it investigated peak demands rather than average demands. Micro-components of peak seven-day demands were derived from detailed water use surveys conducted in the previous two decades. Peak water demands were then forecast to 2021 with, and without, climate change. Under a warming scenario of 1.1 °C by 2021 peak PCC was forecast to increase by ~50 %; without the temperature rise peak PCC was predicted to increase by ~40 %. Herrington (1998) found that climate was a key driver of inter-annual trends and was responsible for 30–70 % of the variation in peak-week demands in Sussex, Hampshire, Cambridge, and the Lee Valley. Temperature produced the most consistent results: for a range of peak ratios, areas, companies, models and time periods a 1 °C temperature rise resulted in a 1 to 1.7 % increase in monthly, weekly or peak ratios.

Goodchild (2003) investigated the elasticity of summer 7-day average household water demand to assess the potential impact of climate change. The multivariate stepwise linear-regression model was based on data supplied by Essex and Suffolk Water for 41 houses over a 6-year period, and 10 weather variables. These were evapotranspiration, water content of top 0.15 m of soil, sunshine hours, solar radiation, maximum temperature greater than 25 °C, rain minus evapotranspiration, maximum temperature, dry bulb temperature, number of days since at least 2 mm of rainfall, and daily rainfall total. Future weather was interpolated to a 10 km grid scale from four climate scenarios for the 2020s produced by the UK Climate Impacts Programme (UKCIP). Weather and demand data were smoothed using a daily 7-day moving average to remove the effect of weekday, to discern the effect of antecedent conditions on water demand, and to reduce the impact of the nine-hour lag between measurement of weather and household water demand.
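The smoothing step can be sketched as follows. The source does not state whether the 7-day window was trailing or centred, so a trailing window is assumed here; the function name `moving_average` and the demand values are hypothetical.

```python
def moving_average(values, window=7):
    """Trailing moving average; returns one value per complete window."""
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            smoothed.append(running / window)
    return smoothed


# Hypothetical two weeks of daily demand (litres): weekdays 300, weekends 400.
daily_demand = [300, 300, 300, 300, 300, 400, 400] * 2

smoothed = moving_average(daily_demand, window=7)

# Every complete 7-day window holds five weekdays and two weekend days,
# so the weekly cycle disappears from the smoothed series.
print(len(smoothed), len(set(smoothed)))
```

Applying the same function to each weather variable would put predictors and demand on a common 7-day footing before regression.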

Overall, evapotranspiration was found to be the most powerful predictor. Although the model generally performed well, it under-predicted the highest peak demands (recorded in August 1995). Also, because the model was linear, it predicted reduced water demand below the threshold at which weather variability ceases to have effect. The approach further assumes that people will react to weather variation in the future as they do today, with no adjustment for technological or cultural change. Moreover, the findings are based on aggregate data; no attempt was made to disaggregate household water demand into micro-components.

With these limitations in mind, the model was applied to both present and future weather to assess how (if all other factors remained constant) household water demand might change under climate change. The model predicted that summer 7-day average household water demand could increase by 2.1 % (range 0.8–3.7 %) by the 2020s. In addition, high demand events, defined by a 7-day average PCC of 183 l per day, were predicted to be exceeded 10 % of the time, compared with 5 % at present.
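The notion of exceedance frequency used above can be made concrete with a small sketch. The demand series is invented and was chosen so that a uniform 2.1 % uplift happens to double the share of days above 183 l; it illustrates the calculation only and does not reproduce the cited analysis. The function name `exceedance_fraction` is an assumption of this example.

```python
def exceedance_fraction(values, threshold):
    """Fraction of observations strictly above the threshold."""
    return sum(v > threshold for v in values) / len(values)


THRESHOLD = 183.0  # litres per person per day, as quoted in the text

# Hypothetical present-day 7-day average PCC values (litres per day).
present = [150.0] * 18 + [180.0, 190.0]

# Hypothetical future series: every value uplifted by 2.1 %.
future = [v * 1.021 for v in present]

present_share = exceedance_fraction(present, THRESHOLD)
future_share = exceedance_fraction(future, THRESHOLD)

print(present_share, future_share)  # 0.05 0.1
```

The example shows why a small shift in mean demand can produce a much larger change in the frequency of threshold exceedance when many observations sit just below the threshold.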

3.2.4 Microsimulation and Econometric Modelling

Many water demand forecasting studies advocate a micro-component approach because it enables analysis of demand at the intra-household level (e.g., Herrington 1996; Environment Agency 2001; Atkins 2007). Unfortunately, there is a trade-off between finer-scale analysis and sample size (because of the high cost of data collection and difficulty in recruiting participants). Hence, there are also uncertainties about sample representativeness when results are extrapolated to larger areas or future periods.

Some studies integrate ‘microsimulation’ and econometric modelling to create detailed synthetic, small area, household water demand data (e.g., Clarke et al. 1997; Mitchell 1999; Williamson et al. 2002). Population estimates are derived from a wider range of descriptive variables related to the household than traditional indicators (such as location, tenure, property type, age, gender, occupation and ethnic group). The ‘micro-level’ population is created by chaining conditional probabilities that link these variables. Water appliance ownership rates determined from census data and household surveys can then be combined with knowledge of household micro-component water use related to household size and income.
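
The idea of chaining conditional probabilities can be sketched as follows. This is a toy illustration only: the three variables, the probability tables and the function names (`draw`, `synthesise_household`) are hypothetical and are not taken from the cited studies, which used many more constraints.

```python
import random

# Hypothetical probability tables (illustrative values, not from any census).
P_TENURE = {"owned": 0.6, "rented": 0.4}
P_TYPE_GIVEN_TENURE = {
    "owned": {"detached": 0.5, "terraced": 0.3, "flat": 0.2},
    "rented": {"detached": 0.1, "terraced": 0.4, "flat": 0.5},
}
P_SIZE_GIVEN_TYPE = {
    "detached": {1: 0.1, 2: 0.3, 3: 0.6},
    "terraced": {1: 0.2, 2: 0.4, 3: 0.4},
    "flat": {1: 0.5, 2: 0.4, 3: 0.1},
}


def draw(table, rng):
    """Sample one category from a {category: probability} table."""
    categories = list(table)
    weights = [table[c] for c in categories]
    return rng.choices(categories, weights=weights, k=1)[0]


def synthesise_household(rng):
    """Build one synthetic household by chaining the conditional tables."""
    tenure = draw(P_TENURE, rng)
    prop_type = draw(P_TYPE_GIVEN_TENURE[tenure], rng)  # conditional on tenure
    size = draw(P_SIZE_GIVEN_TYPE[prop_type], rng)      # conditional on type
    return {"tenure": tenure, "type": prop_type, "size": size}


rng = random.Random(42)  # fixed seed so the synthetic population is repeatable
population = [synthesise_household(rng) for _ in range(1000)]
```

Each attribute is drawn conditionally on the one before it, so the synthetic population reproduces the joint structure encoded in the tables rather than treating the attributes as independent.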

Clarke et al. (1997) applied the ‘microsimulation’ method to estimate ward level household water consumption using data collected by Yorkshire Water Services Ltd. for 4039 metered properties in Leeds. They found considerable spatial variation in water use depending on dwelling and household characteristics (ranging from 49 to 416 m³ per annum). It was noted that the majority of volunteers were metered, well-educated and middle class, so the results do not reflect the water use behaviours of other socio-economic groups and assume that all households are similarly billed (Kowalski and Marshallsay 2005).

Williamson et al. (2002) subsequently developed a ‘static microsimulation’ approach that employed 60 individual and household constraints to reweight the micro-component data using anonymous records from the 1991 census as a control for the small area census tabulation. Population projections were linearly scaled from the small area age-gender share to the district age-gender figures. Unfortunately, this is a weak assumption because wards are unlikely to maintain the same age-gender profile. The synthetic 1.83 million household consumption monitor ‘microdata’ were used to forecast household water demand for the area producing time-series projections for 2020.

Household water demand was estimated from a regression model based on number of residents, number of bedrooms, washing machine and dishwasher ownership, property type and tenure. This function explained 44 % of the observed variance; the remainder was attributed to factors such as water use behaviour and garden size which were not included in the model. Williamson et al. (2002) assumed that the dataset was large enough to estimate a statistically robust household water-demand function for all household types and within area variability.

Williamson et al. (2002) then considered three scenarios of appliance ownership and future changes in household micro-component usage. These were: no change since 1991; increased ownership as forecast by Herrington (1996) (to reflect the water industry planning assumptions); and a higher rate based on extrapolated time-series of consumer durable ownership. Two scenarios for washer-driers were considered: no change; and efficiency gains in the drying cycle that match those forecast for the washing cycle. Williamson et al. (2002) extended Herrington’s (1996) work by including population characteristics such as age and economic status alongside predicted changes in household size. On a gross level the results were similar: PCC was forecast to increase by 20.4 % between 1991 and 2021.

Williamson et al. (2002) showed that, when forecast trends in micro-components (except for washer-drier efficiency) are applied without and with population change, household water consumption in the Yorkshire Water Services area could increase by 23 % and 30.7 % respectively by 2025. Forecast appliance-efficiency gains almost outweigh the impact of projected population increases, consistent with Russac et al. (1991). Within the Yorkshire Water Services region, this change is not expected to be spatially uniform, with the percentage increase by ward varying between 10.8 and 46.1 %. This again emphasises the need to consider spatial variability when devising water conservation strategies.

A major advantage of the ‘microsimulation’ method is that population statistics are readily updated from latest birth, death and migration data. The forecast model did not consider changes in water price but the model did enable price elasticity functions to be applied to household water use coefficients. A weakness of this method is that biases in the sample data – such as the influence of billing type – are carried forward into the synthetic dataset. Although the population statistics can be readily updated the method is ultimately limited by the quality of household water use data and ‘hidden population’ biases in census data.

3.2.5 Narrative Based Scenarios

Later work on long-term water demand relies heavily on the CCDEW: Climate Change and Demand for Water report (Downing et al. 2003). This study used the UKCIP02 climate and socio-economic scenarios to estimate future water consumption. The socio-economic scenarios were described by two axes: systems of governance and individual values. These construct four scenarios, each with a narrative describing a different society. The Environment Agency (2001) used the scenarios of anticipated changes in behaviours of water use to investigate the aspects of demand that are most likely to be impacted by climate change.

For example, it is a common assumption that showering frequency may increase as a result of projected warmer summers along with changing attitudes towards personal hygiene. Conversely, water uses such as toilet flushing are less likely to be impacted by climate. The narrative-based scenarios were combined with ownership-frequency-volume models following Herrington (1998) and forecasts of water demand under climate change were then produced. The four storylines and climate scenarios showed a range of future demand possibilities, an advance on the single value predictions of Herrington (1998). The range highlights the need for water management strategies that are robust to a wide set of circumstances (Wilby and Dessai 2010). However, a major limitation of this study, along with Herrington (1996) and Goodchild (2003), was the focus on average PCC. This masks subtle variations in water using practices and their elasticity to variables such as climate.
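
The ownership-frequency-volume calculation referred to above can be illustrated with a short sketch. It assumes the usual form of such models, in which per capita consumption is the sum over micro-components of ownership × frequency of use × volume per use. All numerical values and the names `pcc_from_ofv`, `baseline` and `warmer` are hypothetical; they are not taken from Herrington or the CCDEW report.

```python
def pcc_from_ofv(components):
    """Per capita consumption (l per person per day) as the sum of
    ownership x frequency x volume over micro-components.

    ownership: fraction of people with access to the appliance (0 to 1)
    frequency: uses per person per day
    volume:    litres per use
    """
    return sum(
        c["ownership"] * c["frequency"] * c["volume"]
        for c in components.values()
    )


# Hypothetical baseline values, for illustration only.
baseline = {
    "toilet": {"ownership": 1.0, "frequency": 4.0, "volume": 8.0},
    "shower": {"ownership": 0.8, "frequency": 0.5, "volume": 30.0},
    "bath": {"ownership": 0.9, "frequency": 0.25, "volume": 80.0},
}

# Hypothetical scenario: showering becomes more frequent, toilet use unchanged.
warmer = {name: dict(values) for name, values in baseline.items()}
warmer["shower"]["frequency"] = 0.75

pcc_baseline = pcc_from_ofv(baseline)
pcc_warmer = pcc_from_ofv(warmer)
```

Changing only the shower frequency alters the total while leaving the toilet term untouched, which mirrors how a narrative scenario is translated into component-level assumptions.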

3.3 Summary

A common finding of the UK studies discussed above (and in Table 2) is that household water use reflects numerous, time- and space-dependent factors. This theme is echoed by international research (e.g., Arbués et al. 2003; Mayer and DeOreo 1999; Gato et al. 2007; Fox et al. 2009; Praskievicz and Chang 2009; Makki et al. 2011). Many studies are hampered by biased and small sample sizes, short records, or location specific data. Not surprisingly, levels of explained variance can be low. However, there is scope for improving models by treating days of the week, months and occupancies separately.

Micro-component data offer objective, quantitative and continuous records of water use within the home, but are at the household scale and can conceal marked variations between individuals in the same residence. Furthermore, these data are costly to collect and maintain. The Hawthorne effect, self-selection and sample population biases in age, gender, socio-economic status, access to appliances, billing method and house type can all limit how well the sample represents the wider population. Micro-simulation techniques may offer ways to increase sample size and coverage but the synthetic output is ultimately dependent upon the quality of the input water use data.

Diary based studies, interviews and focus groups offer glimpses of within household variations in water use behaviour. However, these studies are inherently biased by the participants’ knowledge of the experiment. Furthermore, they are costly to undertake and sample sizes may suffer accordingly. Ideally, a holistic approach is required in which qualitative studies supplement objective meter records of micro-component usage.

From the discussion in this section it is clear that both metered and survey approaches have methodological advantages. Positivist approaches tend to define an ‘average user’ for bulk water supply and demand estimation; post-positivist approaches explore the heterogeneity of water using behaviours to target particular attitudes and practices for more effective demand management.

Advocates of post-positivist approaches believe that household demand management would be more successful if the barriers and context of water use behaviour were better understood (Sharp et al. 2011; Sofoulis 2011). This could help water management move from an authoritative top-down approach, in which water use is solicited and water-restrictions imposed, to a partnership between providers and users where people become actively engaged in their own demand management and capable of independently supporting new practices (Sofoulis 2011).

Based on our survey of studies it is also clear that there is limited empirical evidence in the UK of strong relationships between weather and water demand at the micro-component scale. A recent study on Climate Change Approaches in Water Resource Planning confirmed this view and noted that more research is needed on peak household micro-component water demands in relation to climate variables (UKWIR 2012a). Peak demands are important to the operational management of water systems and to the long term investment needed to achieve sustainability targets.

4 Micro-Component Analysis in Practice

The above review collates insights gained from analysis of household and micro-component water use data. This section illustrates some practical issues that were confronted when searching for possible climate signals within water company survey data that were originally collected for different purposes. As a case study we refer again to the AWS ‘Golden 100’ data (Fig. 1 and Table 3) which are based on 100 households with telemetry (Plate 1) fitted to eight micro-component household meters (Table 4). The archive also contains data on occupancy, region, rateable value, billing method, ACORN classification, day of week, year and bank holiday. Alongside these are Met Office data for daily rainfall, maximum temperature, minimum temperature and sunshine hours.

Fig. 1 Sample sizes by region, metering and household occupancy category in the ‘Golden 100’ data

Table 3 Summary of the AWS ‘Golden 100’ data
Table 4 Factors included in the AWS ‘Golden 100’ data

Five stages of data preparation were implemented through an iterative process (Fig. 2): 1) gross error checking; 2) data stratification; 3) secondary screening for outliers; 4) data transformations; and 5) regression modelling. ‘Matlab’ algorithms were developed for consistency throughout data preparation, to reduce the possibility of human error and to reduce processing times. The development of these procedures is outlined below.

Fig. 2 Flow chart showing data preparation method. Numbers refer to the sections in the text. MC is micro-component

4.1 Error Checking/Quality Control

The first step was to reformat all fields to numerical data, ready for import to Matlab (i.e., remove text from the master file). A key justification for using Matlab was that the software can handle very large arrays needed for the multivariate analysis. A quality control algorithm was developed to automatically pass the dataset through a series of error/screening checks (Fig. 2). First, general tests were undertaken to remove gross errors (such as a daily minimum temperature exceeding the same day maximum temperature). In these cases, the whole row of data (i.e., daily reading for that house) was removed.
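
A gross error check of this kind might look like the following Python sketch. It is an illustration rather than the Matlab routine used in the study; the field names (`house`, `tmin`, `tmax`, `pcc`) and the records are hypothetical.

```python
def remove_gross_errors(rows):
    """Drop whole daily records whose minimum temperature exceeds the
    maximum temperature recorded for the same day."""
    return [r for r in rows if r["tmin"] <= r["tmax"]]


# Hypothetical daily records, one per house per day.
rows = [
    {"house": 1, "tmin": 8.0, "tmax": 15.0, "pcc": 140.0},
    {"house": 2, "tmin": 18.0, "tmax": 12.0, "pcc": 150.0},  # impossible pair
    {"house": 3, "tmin": 10.0, "tmax": 10.0, "pcc": 135.0},
]
checked = remove_gross_errors(rows)
```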

Fig. 3 An illustration of the benefit of stratifying data to identify household water use behaviours: (a) all household PCC data versus daily maximum temperature; (b) same data by temperature category; (c) unmetered household subset; (d) unmetered, two person household subset; (e) unmetered, two person household, East region subset; (f) unmetered, two person household, East region, shower micro-component. Error bars are standard error of the mean

Second, completeness checks were undertaken to determine if there were missing data in any of the micro-component daily total values, PCC daily value, or meteorological data. This was important because subsequent multi-variable analysis depended on data being present for all fields. A percentile threshold was then applied to remove the largest 0.05 % of values of each variable. This was a necessary and consistent way of excluding extreme outliers (e.g., one rogue entry recorded 98,020 l for a single occupancy household). The 0.05 % threshold was determined via sensitivity testing.
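
The completeness check and percentile screen can be sketched as below. The function names (`is_complete`, `trim_largest`), the field names and the data are hypothetical, and the rule for converting the 0.05 % threshold into a number of rows (rounding down) is an assumption, since the source does not specify it.

```python
import math


def is_complete(row, fields):
    """True when every required field is present and is not None or NaN."""
    for f in fields:
        v = row.get(f)
        if v is None or (isinstance(v, float) and math.isnan(v)):
            return False
    return True


def trim_largest(rows, field, top_fraction=0.0005):
    """Remove rows holding the largest `top_fraction` of values of `field`.

    The number of rows to drop is rounded down (an assumption). Values
    tied with the cutoff are kept, so ties can reduce the number dropped.
    """
    values = sorted(r[field] for r in rows)
    k = math.floor(len(values) * top_fraction + 1e-9)
    if k == 0:
        return list(rows)
    cutoff = values[-k - 1]  # largest value that is retained
    return [r for r in rows if r[field] <= cutoff]


# Hypothetical data: 1999 plausible readings, one rogue entry, one gap.
rows = [{"pcc": 100.0 + (i % 50), "tmax": 15.0} for i in range(1999)]
rows.append({"pcc": 98020.0, "tmax": 15.0})
rows.append({"pcc": 120.0, "tmax": None})

complete = [r for r in rows if is_complete(r, ["pcc", "tmax"])]
trimmed = trim_largest(complete, "pcc", top_fraction=0.0005)
```

With 2000 complete rows, a 0.05 % threshold removes a single record, which here is the rogue 98,020 l entry.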

Day of the week and month of the year are coded on a linear scale within the ‘Golden 100’ dataset. This is problematic for modelling as it suggests that Sunday (coded 1) is further away from Saturday (coded 7) than from Monday (coded 2). Furthermore, previous research shows that each day exhibits distinct water use behaviours (Butler 1993; Atkins 2005, 2007). Day of week and month were, therefore, recoded using dummy variables. For example, Sunday was coded 000000, Monday as 100000, Tuesday as 010000, and so forth. Hence, day of week was always expressed in relation to Sunday (all columns ‘turned off’); likewise all months in relation to January (00000000000). Another dummy variable was used for bank holidays (flag ‘1’) and non-bank holidays (flag ‘0’).
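
The recoding described above can be reproduced with a few lines of Python. The function name `dummy_code` is hypothetical; the coding scheme (first level as the all-zero reference category) follows the examples given in the text.

```python
def dummy_code(value, n_levels):
    """Code a 1-based category as n_levels - 1 indicator columns.

    Level 1 is the reference category and is coded as all zeros.
    """
    if not 1 <= value <= n_levels:
        raise ValueError("value outside the range of levels")
    return [1 if value == level else 0 for level in range(2, n_levels + 1)]


sunday = dummy_code(1, 7)     # reference day: all columns 'turned off'
monday = dummy_code(2, 7)
tuesday = dummy_code(3, 7)
january = dummy_code(1, 12)   # reference month: eleven zeros
```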

Treatment of days with zero PCC can be awkward, and depends ultimately on the purpose of the study. Removing zero PCC values (which can be assumed to indicate non-occupancy) is legitimate when analysing sensitivity of water use to daily weather. On the other hand, water company annual predictions are based on 365 days of the year. Furthermore, treatment of days in which there is non-zero PCC (indicating occupancy) but a zero micro-component PCC is also problematic. The handling of these events is discussed in section 4.6.

4.2 Stratification

Water use behaviours reflect the interplay of many factors, as noted before. The value of the ‘Golden 100’ dataset is that it provides a large archive of information (Table 4). However, because the relationships between these variables and household micro-component water use are complex, noise can overwhelm any underlying patterns of behaviour. Stratification was undertaken to produce more homogeneous subsets of the data (Fig. 3), using the many factors available in the dataset (such as bank holiday, day of the week, season and occupancy rate). Again, a Matlab algorithm was used to automate this process.

As Fig. 3 demonstrates, ‘noise’ in the data was reduced by stratifying to more homogeneous sub-populations. For example, discerning a temperature signal from all the data is unlikely (Fig. 3a). However, Fig. 3f shows that by investigating one micro-component and the associated water use of a stratified dataset, relationships begin to emerge: in this case greater shower usage with increasing maximum daily temperature up to about 20 °C. Beyond this temperature shower use even begins to decline, perhaps reflecting water saving campaigns in the media.
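
Stratification of this kind amounts to grouping records by a set of factors and summarising each group separately. The following sketch is illustrative only: the function names (`stratify`, `mean_and_se`), the field names, the ‘West’ region label and the values are all hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import fmean, stdev


def stratify(rows, keys):
    """Group records into subsets that share the same values of `keys`."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[k] for k in keys)].append(r)
    return dict(groups)


def mean_and_se(values):
    """Mean and standard error of the mean (NaN for a single value)."""
    n = len(values)
    se = stdev(values) / sqrt(n) if n > 1 else float("nan")
    return fmean(values), se


# Hypothetical daily shower volumes (litres).
rows = [
    {"region": "East", "billing": "unmetered", "occupancy": 2, "shower": 30.0},
    {"region": "East", "billing": "unmetered", "occupancy": 2, "shower": 34.0},
    {"region": "East", "billing": "metered", "occupancy": 2, "shower": 25.0},
    {"region": "West", "billing": "unmetered", "occupancy": 4, "shower": 60.0},
]
strata = stratify(rows, ["region", "billing", "occupancy"])
subset = strata[("East", "unmetered", 2)]
mean_shower, se_shower = mean_and_se([r["shower"] for r in subset])
```

Each additional stratification key narrows the subset, so the gain in homogeneity has to be weighed against the shrinking sample size reflected in the standard error.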

A major complexity of household water demand estimation and forecasting is how best to stratify data to fully capture water use behaviour variation. We have already noted that studies that do not disaggregate the influence of day of the week can mask important variations in water use (Butler 1993). It is reasonable to assume that there is a difference between weekend and mid-week days so these should be grouped separately. Ideally, because each day has different water use behaviour, days of the week should be treated separately (Atkins 2005, 2007).

Figure 4 shows weekly variations in micro-components. In general, water use declines mid-week relative to the weekend. In this case, the data were stratified by region, occupancy rate, billing type and micro-component. Figure 4 indicates that peak washing machine usage in unmetered, two person households in East region occurs on Saturdays. Usage reduces slightly on Sundays and Mondays falling to a midweek low. Toilet usage in the same households also demonstrates a midweek low and gradual rise to peak weekend usage. These behaviours were independent of day to day weather.

Fig. 4 Weekly cycle in washing machine (left) and toilet (right) water use in unmetered two person households in East region

4.3 Secondary Screening

Once data have been stratified, secondary screening may be performed on the micro-component daily totals (excluding external usage). This was a soft coded step within the algorithm that removed all rows of data (daily readings) whenever a micro-component value was above a user defined threshold. The threshold for exclusion can be adjusted to suit the analysis that will be undertaken on the micro-component data. For example, this secondary screening removed suspect values such as 131,218 l bath usage for a three person household.
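
A soft coded threshold of this sort can be expressed as a dictionary of user defined limits passed to the screening function, as in the sketch below. The function name `secondary_screen`, the field names and the 1000 l limits are hypothetical; the study does not report the thresholds that were used.

```python
def secondary_screen(rows, thresholds):
    """Drop any daily record in which a micro-component exceeds its
    user defined threshold (litres per day)."""
    return [
        r for r in rows
        if all(r[mc] <= limit for mc, limit in thresholds.items())
    ]


# Hypothetical records; the second carries a suspect bath total.
rows = [
    {"bath": 80.0, "shower": 35.0},
    {"bath": 131218.0, "shower": 30.0},
    {"bath": 0.0, "shower": 42.0},
]
# Hypothetical limits, adjustable to suit the analysis in hand.
thresholds = {"bath": 1000.0, "shower": 1000.0}
screened = secondary_screen(rows, thresholds)
```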

4.4 Data Transformation

Linear regression analysis assumes normality of the model residuals. Therefore, an automated algorithm was developed to test stratified data for normality and, where the data were not normally distributed, to apply a transformation. The Kolmogorov-Smirnov normality test and Box-Cox transformation were used as required. Box-Cox was chosen as this transforms variables depending on their individual distribution rather than applying a generic transformation to all distributions (Wilks 1995).
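
The test-then-transform logic can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's Matlab algorithm: the Box-Cox parameter is chosen by a coarse grid search over the profile log-likelihood, the Kolmogorov-Smirnov statistic is computed against a normal distribution fitted to the sample, and the 1.36/√n cutoff is the asymptotic 5 % critical value for a fully specified reference distribution (conservative when the mean and standard deviation are estimated from the same data). All function names and the synthetic data are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pstdev


def ks_statistic(sample):
    """KS distance between the sample and a normal distribution with the
    sample's own mean and standard deviation."""
    n = len(sample)
    dist = NormalDist(fmean(sample), pstdev(sample))
    d = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        cdf = dist.cdf(x)
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def box_cox(sample, lam):
    """Box-Cox transform; requires strictly positive data."""
    if any(x <= 0 for x in sample):
        raise ValueError("Box-Cox requires positive values")
    if abs(lam) < 1e-12:
        return [math.log(x) for x in sample]
    return [(x ** lam - 1.0) / lam for x in sample]


def box_cox_loglik(sample, lam):
    """Profile log-likelihood of the Box-Cox parameter (up to a constant)."""
    n = len(sample)
    variance = pstdev(box_cox(sample, lam)) ** 2
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(
        math.log(x) for x in sample
    )


def best_lambda(sample, grid=None):
    """Grid search for the lambda that maximises the log-likelihood."""
    grid = grid or [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: box_cox_loglik(sample, lam))


# Synthetic right-skewed data (log-normal quantiles), for illustration only.
n = 200
z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
skewed = [math.exp(3.0 + 0.8 * zi) for zi in z]

needs_transform = ks_statistic(skewed) > 1.36 / math.sqrt(n)
lam = best_lambda(skewed)
transformed = box_cox(skewed, lam)
```

For log-normal data the search should settle at or near λ = 0, the logarithmic case, and the transformed values lie much closer to a normal distribution than the raw values.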

4.5 Sub-Set Selection

The last stage before regression-modelling was to assess the contribution made by each micro-component to the aggregate daily household water consumption. A Matlab algorithm returned the sample size, standard error of the micro-component mean, and percentage contribution to the household total daily water consumption (Table 5). Figure 1 shows the sample sizes by household occupancy and metering. It is evident that there are limited data for five person households. These samples could be aggregated across regions, but the regional signature would then be lost. Table 5 shows the contribution made by micro-components to gross daily household water consumption within one subset of the data. The ‘ideal’ micro-component for climate regression analysis would be normally distributed (or transformed to normality), have a small standard error, and large sample size. Furthermore, the component would contribute a large fraction of the daily aggregate household water use in relation to others and be responsive to weather variables.
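
The summary statistics described above (sample size, standard error of the mean and percentage contribution) can be computed as in this sketch. The function name `summarise_components`, the field names and the values are hypothetical, and the count of zero-use days is included only because Table 5 reports it.

```python
from math import sqrt
from statistics import fmean, stdev


def summarise_components(rows, components):
    """Per micro-component: sample size, number of zero-use days, mean,
    standard error of the mean, and share of total consumption (%)."""
    total = sum(r[c] for r in rows for c in components)
    summary = {}
    for c in components:
        values = [r[c] for r in rows]
        n = len(values)
        summary[c] = {
            "n": n,
            "n_zero": sum(1 for v in values if v == 0),
            "mean": fmean(values),
            "se": stdev(values) / sqrt(n) if n > 1 else float("nan"),
            "share_pct": 100.0 * sum(values) / total,
        }
    return summary


# Hypothetical daily totals (litres) for one stratified subset.
rows = [
    {"toilet": 60.0, "shower": 30.0, "bath": 0.0},
    {"toilet": 50.0, "shower": 0.0, "bath": 80.0},
    {"toilet": 70.0, "shower": 40.0, "bath": 0.0},
]
summary = summarise_components(rows, ["toilet", "shower", "bath"])
```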

Table 5 Contribution of micro-components to the daily aggregate household water use (two person, unmetered, East region). Zero values refer to days in which 0 l consumption was recorded (non-usage)

The process provided further quality assurance checks. For example, close inspection of the data at micro-component level, stratified by occupancy, region and billing type revealed that one single-person household in Lincoln appeared to be flushing the toilet 22 times per day. After consultation with AWS it was determined that the household size had temporarily increased to four occupants during this period. Thus, despite automated screening and quality checks (section 4.3) there was still no substitute for careful inspection and accounting for unusual values.

4.6 Regression Analysis

No single statistical model is applicable to all micro-components due to the different associated behaviours. For example, an individual tends to use the kitchen sink each day (Fig. 5) whereas behaviours associated with the bath are very different (Fig. 6). In the latter case, the decision about whether to wash (using either a bath or shower) is separate from the amount of water used. This was modelled in two steps. First, the frequency of usage and non-usage of the fixture was analysed in relation to weather, bank holiday or day of the week using a logistic regression model. Second, non-zero (bath) volumes were fitted to predictors such as weather, bank holiday and day of the week using multiple linear regression. In this case, zero litre micro-component values (non-usages) were removed.
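
The two-step structure can be sketched with a deliberately simplified example that uses a single predictor in place of the multiple predictors described above. The logistic model is fitted by plain gradient ascent and the linear model by ordinary least squares; both are stand-ins chosen to keep the example self-contained, not the routines used in the study. The data are invented purely to make the arithmetic easy to follow and do not represent any observed relationship between bath use and temperature.

```python
import math


def fit_logistic(x, y, lr=0.5, n_iter=5000):
    """Single-predictor logistic regression fitted by gradient ascent
    on the mean log-likelihood. Returns (intercept, slope)."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p
            g1 += (yi - p) * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x. Returns (a, b)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


# Invented daily maximum temperature (°C) and bath volume (litres).
tmax = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
bath = [0, 0, 0, 40, 0, 44, 0, 48, 50, 52]

# Step 1: use versus non-use of the fixture (logistic regression).
used = [1 if v > 0 else 0 for v in bath]
t_mean = sum(tmax) / len(tmax)
t_scaled = [(t - t_mean) / 10.0 for t in tmax]  # centred and scaled predictor
b0, b1 = fit_logistic(t_scaled, used)

# Step 2: volume on days of use only (zero values removed).
use_days = [(t, v) for t, v in zip(tmax, bath) if v > 0]
intercept, slope = fit_line([t for t, _ in use_days],
                            [v for _, v in use_days])
```

Separating the two steps means that a predictor can influence whether the fixture is used without being forced to have the same influence on how much water is used when it is.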

Fig. 5 Daily kitchen sink usage in unmetered two person households in East region

Fig. 6 Daily bath usage in unmetered two person households in East region excluding non-usage

Figure 7 shows coefficient weightings from the multiple linear regression model of external water use in unmetered, two person households in East region. This example demonstrates how regression analysis is also a useful diagnostic tool for exploring water use behaviours. As expected, water use is more heavily weighted in summer months when the garden is in bloom and the weather is on average drier and warmer. The relatively large weight attached to outdoor water use in February and March is harder to explain. Again, inspection of the raw data revealed that the sample of non-zero values was very small (N < 10) so there was large uncertainty in the coefficient values during these months. One solution would be to increase the sample size by aggregating data across all three regions.

Fig. 7 Multiple linear regression model coefficients for external water use in unmetered two person households in East region

Table 6 compares multiple linear regression models for three micro-components. These were based on daily data so explained variance is low. All three models were heavily weighted on bank holidays suggesting that dishwasher, toilet and shower usage rise on these days. Dishwasher coefficient weightings are low for all meteorological variables suggesting that this micro-component was relatively insensitive to weather but higher for year indicating a long-term reduction (perhaps due to efficiency gains driven by technology). In contrast, shower use increases with maximum temperature and has a midweek low. Toilet use exhibits strong weekday/weekend variations with higher weights for the latter (see also Fig. 4). Overall, toilet use accounts for the largest fraction of the daily aggregate household water use (Table 5) and appears to decrease at higher temperatures.

Table 6 Multiple linear regression model coefficients for three micro-components based on transformed daily data (two person, unmetered, East region). All models are statistically significant
Plate 1 Telemetry meter fitted to bathroom sink

5 Conclusions

Given the risks posed by climate change and constraints on UK freshwater supply, combined with increasing demand and rising economic and environmental costs involved in the development of new water resources, forecasting has become an important tool in balancing water supply and demand (Memon and Butler 2006; Defra 2008).

We have reviewed methods of household water demand estimation and forecasting with particular reference to the UK. Since the mid 1990s household water management legislation in the UK has shifted from ‘predict and provide’, supply-driven planning towards a ‘twin-track’ approach with greater emphasis on demand-side solutions. This transition reflected mounting opposition to the economic and environmental costs of new infrastructure plus growing recognition of the potential for demand management (Environment Agency 2003; Defra 2008; WWF 2010). Regulators now require UK water companies to forecast demand in annual June Returns and to provide cost effective, defined levels of service, as part of the 5-yearly Periodic Review process (Defra 2003). The business plans on which the WRMP are based require water companies to describe how demand for water will change and, for domestic customers, how such changes are driven by micro-component use.

However, there is surprisingly little literature on UK household water demand estimation and forecasting under a changing climate. This probably reflects wider difficulties in recording, understanding and predicting the complexities of household water use (Memon and Butler 2006; Medd and Chappells 2008). Nonetheless, this is critical to cost-effective management and long-term planning. It is evident from our review that water use behaviours reflect the interplay between many factors. At the moment the only way to record an individual’s water use is through qualitative methods. Diary based studies, questionnaires and interviews are subjective and have their own limitations. Observational techniques (e.g., video analysis) offer a more objective approach but are costly and intrusive for some sites of water use in the home.

Currently, micro-component data are the finest resolution of quantitative water use within the home possible from household metering. However, these data can be combined with knowledge of other variables such as demographic profiles, occupancy rates, or climate variables to interpret aggregate household water use. Complementing micro-component data of actual consumption with observational techniques could provide a robust and detailed approach to understanding individuals’ water use within the home (Richter and Stamminger 2012). Wider installation of smart meters could be a significant research opportunity and management tool. They could be a visual reminder of household water use, could be used to empower and position householders as managers of their own water use and enable a wider range of pricing mechanisms such as seasonal tariffs (Herrington 2007; Environment Agency 2009; Defra 2011).

Research in the UK and elsewhere consistently finds that the complex relationships influencing household water use make underlying patterns of behaviour difficult to detect. Stratification is widely used to tease out behavioural signals and homogenise data. We have illustrated the complexity of household water use analysis through a worked example of the AWS ‘Golden 100’ dataset. A methodology was presented that error checks and prepares household micro-component water data for weather analysis. Within these processes some thresholds must be soft coded into algorithms highlighting the difficulty in applying a generic methodology to this type of data. The treatment of zero total volume days is a special case that depends ultimately on the purpose of the analysis, such as use or non-use of a fixture, versus aggregated water totals over longer periods.

An important lesson learnt during the handling of metered information is that there is no substitute for careful inspection of raw data during error checking, preparation and analysis. For example, unusually high toilet use was not detected by automated PCC error checking procedures until data were stratified into micro-component subsets. Ultimately, the reliability of results derived from this type of data depends on the rigour of quality assurance which begins at the point of monitoring in the home. We must stress, however, that these data were not originally collected with household-level, weather-related research in mind. Rather, the ‘Golden 100’ data were collected to improve leakage detection. Time and effort spent post-processing the data has added value by increasing the range of potential applications.

The next step in this research is to develop the regression models. The data cover several years so there is scope for evaluating time-dependency in the most important loadings of micro-component water demand. For example, the data could be used to investigate time-varying water use before, during and after a major heatwave such as summer 2003. Coupled with detailed logs of water saving campaigns, their actual impact on household water demand could be assessed. Initial analysis suggests that external water use responds non-linearly to average air temperatures. Considering the importance of peak demands to water companies, this aspect should be a priority for future research. We also intend to apply the UKCP09 climate projections to the statistical models to explore sensitivity of household water demand components to climate variability and change. Furthermore, it is important that our findings for East England are compared with other regions to determine the extent to which the climate-sensitivity of micro-components is location dependent.

Micro-component data analysis also supports the current approach to demand forecasting by the UK water industry regulators. One of the key lessons from the research discussed in this paper is that significant effort is required to collect and maintain data on micro-component use and that careful consideration of the costs, benefits and future uses is needed when programmes of this type are set up. This realisation will direct the industry to focus on critical attitudes and water using behaviours and encourage innovation in sampling and meter design to improve household water use data.

The key disadvantage of micro-component analysis is the cost involved in collecting and maintaining the required data. Innovation to reduce these costs would result in more data being collected and better understanding of water using behaviours. The benefits of this would include more sophisticated targeting of demand management measures and less uncertainty in the planning and delivery of large supply-side assets, such as reservoirs. Further research is needed to improve standardisation of micro-component surveys and data management practices, allowing the pooling of data from different areas. Moreover, work to quantify the effects and significance of uncertainty in micro-component analysis and related studies is needed. This should align with research on emerging investment appraisal methodologies, such as Robust Decision Making techniques.