1 Introduction

The growing body of research on water use, scarcity and pollution in relation to consumption, production and trade has led to the emergence of the field of Water Footprint Assessment (WFA). At the foundation of this field is the water footprint (WF) concept, developed by me early in 2002 and introduced to an international audience at an expert meeting on virtual water trade (VWT) in December 2002 (Hoekstra 2003). I introduced the WF as an indicator of water use behind all the goods and services consumed by one individual or the individuals of a country and claimed that “the total water footprint of a nation promises to become a useful indicator of a nation’s call on the global water resources” and that “at consumers level it is useful to show people’s individual footprint as a function of food diet and consumption pattern” (Hoekstra 2003). There was scepticism from researchers who did not believe it makes sense to analyse people’s indirect water use, because water resources management is about allocation to actual water users, not ‘indirect water users’. Besides, it would be incorrect to ‘blame’ consumers for indirect water use or hold them ‘responsible’ for the negative impacts of indirect water use overseas. The concept appeared to be ground-breaking though, together with the idea of VWT from Allan (2001), who had suggested that virtual water import was a mechanism that contributed to solving water shortages in the Middle East. In 2002, we quantified, for the first time, global virtual water flows related to international crop trade (Hoekstra and Hung 2002). By adding the ‘net virtual water import’ of a country to the water use within the country, as shown in traditional national water use statistics, we were able to reveal the ‘real’ water use of people in a country. 
While Allan had looked at VWT from the perspective of the importing country, I proposed to consider VWT from the exporting country perspective as well, because a food importer may ‘save’ water domestically, but the exporting region is left with a WF bigger than necessary to produce its own food, which may relate to sustainability and fairness of water resources allocation in the exporting country. International politics, markets and regulations indirectly influence the way water resources in different places are allocated and used and who finally benefits. Given that water availability and demand are unequally spread around the world, and given the fundamental importance of water as a resource, it is useful to analyse the international and geopolitical dimension of water resources allocation. Some of the early criticisms of the WF and VWT concepts still arise at regular intervals, as will be discussed in this paper, but in the meantime the field of WF and VWT assessment has matured, yielding in-depth studies and examples of practical use. Advances include the development of the full WFA methodology (Hoekstra et al. 2011), the quantification of WFs at high spatial and temporal resolution (Hoekstra and Mekonnen 2012), the study of inter-annual variability and trends in WFs and VWT (Zhuo et al. 2016a), the development of WF benchmarks for crops (Mekonnen and Hoekstra 2014; Zhuo et al. 2016b), the assessment of monthly blue water scarcity at a high spatial resolution based on patterns of blue WFs versus patterns of water availability (Mekonnen and Hoekstra 2016), the computation of water pollution levels in river basins based on grey WFs versus assimilation capacity (Liu et al. 2012), the exploration of the use of remote sensing (Romaguera et al. 2010) and the development of future WF and VWT scenarios (Ercin and Hoekstra 2014).
WFA applications vary widely, from product assessments, sector studies, diet assessments and catchment, municipal and national studies to global assessments. This paper reviews the evolvement of the field by sketching a number of developments over time and reflects on the major issues of debate.

2 Historic Developments

2.1 Roots

The field of WFA is rooted in four basic thoughts. The first is the idea that freshwater is a global resource (Hoekstra and Chapagain 2008), because people in one place can and do make indirect use of freshwater resources elsewhere through VWT (Allan 2001), and because local water allocations and patterns of unsustainable water consumption are increasingly driven by the global economy, which lacks incentives for sustainable water use (Hoekstra 2013). The second idea is that freshwater renewal rates are limited, so we must study the development of consumption, production and trade patterns in relation to these limitations. In a broader sense, when analysing the environmental sustainability of economies, it is necessary to study the ‘footprint’ of human consumption in relation to planetary boundaries. When creating the WF concept, I was inspired by the ‘ecological footprint’ that had been developed by Wackernagel and Rees (1996). The third idea is that for understanding natural resource use and the impacts of consumption, we have to think in terms of supply chains and product life cycles. The fourth idea is that in a comprehensive approach towards freshwater use and scarcity, we must consider both green and blue water consumption (Falkenmark 2000) as well as water pollution (Postel et al. 1996). The field of WFA is thus fundamentally interdisciplinary and integrative, with papers published in both ‘environmental sciences’ and ‘water resources’ journals. Broadly speaking, WFA bridges the two interdisciplinary communities by bringing environmental thinking (footprint and supply chain thinking) into the water resources community and by bringing water resources thinking (water allocation, water productivity, water scarcity) into the environmental sciences community.

2.2 Distinguishing Green, Blue and Grey WFs

The WF is a measure of consumptive and degradative freshwater use. The consumptive WF includes a green component referring to the consumption of rainwater, and a blue component referring to the consumption of surface water or groundwater. The degradative WF, the so-called grey WF, measures the volume of water required to assimilate pollutants entering freshwater bodies (Hoekstra et al. 2011). In early WF studies, the focus was on consumptive water use only. From the start, water consumption was understood to include both green and blue water consumption, but they were presented as a total, because the models applied did not allow an explicit distinction between the two components (Hoekstra and Hung 2002). The inclusion of green water consumption in the WF metric was an important and deliberate decision, inspired by the work of Falkenmark (2000), who had introduced the green-blue water terminology in order to broaden the perspective of water management beyond the historical focus on blue water. The first paper to assess a crop’s green and blue WF separately was by Chapagain et al. (2006b). That same paper introduced the grey WF, albeit not yet under that name, but presented as a ‘dilution water volume’ necessary to assimilate a pollutant load. This turned out to be an unfortunate term, because some took it in a normative sense, as if it were a proposal to solve pollution through dilution. That was of course not the intention; the idea was to express water pollution in terms of the claim it puts on scarce freshwater resources by expressing pollution in terms of the amount of water it takes to assimilate it. Water pollution in that sense competes with water consumption. Hoekstra and Chapagain (2008) presented the green, blue and grey WF for the first time in one coherent framework. Hoekstra et al. (2011) made a slight improvement in the definition of the grey WF by accounting for natural concentrations of substances in water bodies, which decrease the capacity to take up additional loads of anthropogenic origin given maximum allowable concentrations. Whereas the first grey WF studies were limited to pollution through nitrogen, grey WF studies have since been carried out for a variety of water quality parameters, including nutrients, dissolved solids, metals, and pesticides. A few studies have already distinguished between different types of blue WF, depending on the source of the water (surface water, renewable groundwater, fossil groundwater, or capillary rise); it may be expected that this will increasingly be done, when data allow, since the implications of these different shades of blue WF may differ.
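As an illustration, the grey WF definition described above can be sketched as a pollutant load divided by the difference between the maximum allowable and the natural concentration. The function name, the units and the nitrogen figures below are hypothetical and chosen only to show how accounting for the natural concentration reduces the remaining assimilation capacity and therefore raises the grey WF.

```python
def grey_water_footprint(load, c_max, c_nat=0.0):
    """Volume of water needed to assimilate a pollutant load.

    load  : pollutant load entering the water body (e.g. kg/y)
    c_max : maximum allowable concentration (e.g. kg/m3)
    c_nat : natural background concentration (same unit as c_max)
    """
    if c_max <= c_nat:
        raise ValueError("c_max must exceed c_nat: no assimilation capacity left")
    return load / (c_max - c_nat)


# Hypothetical example values (not taken from any study):
load_kg_per_year = 500.0   # nitrogen load reaching the water body
c_max = 10e-3              # 10 mg/l expressed in kg/m3
c_nat = 2e-3               # 2 mg/l expressed in kg/m3

# Ignoring the natural concentration, as in the earliest formulation
without_background = grey_water_footprint(load_kg_per_year, c_max)

# Accounting for the natural concentration, as in the refined definition
with_background = grey_water_footprint(load_kg_per_year, c_max, c_nat)

print(f"grey WF without natural concentration: {without_background:.0f} m3/y")
print(f"grey WF with natural concentration:    {with_background:.0f} m3/y")
```

With the same load, the second figure is larger, because part of the water body's capacity is already taken up by the natural background.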

2.3 From Concept to Field of Analysis

The initial stage of development was centred around the quantification of WFs of crops, VWT related to crop trade and WFs of national consumption (Hoekstra and Hung 2002). The basis for the national WF estimation was the accounting scheme shown in Fig. 1. Hoekstra and Chapagain (2007, 2008) improved the national WF accounts by considering all forms of consumption and trade, including animal and industrial products and municipal water use as well. Until 2008, the focus remained on national WFs in relation to consumption and on accounting. Afterwards, the scope broadened, whereby also the production perspective received increasing attention, driven by the growing interest from companies, which started to discover the use of the WF concept in 2007. Another driver was the interest to analyse aggregate WFs of production within certain geographic areas in order to put them in the context of the limited water availability per area. These advances resulted in the development of a larger conceptual framework, as shown in Fig. 2, allowing the quantification of WFs at the most basic level of a single process or activity, the WFs of products, the WF of consumption at individual or community level, the WF of production in a certain area, and the operational and supply-chain WFs of companies. With the broadening of scope, terminology regarding water consumption per unit of product changed from ‘specific water demand’ (Hoekstra and Hung 2002) or ‘virtual water content’ (Hoekstra 2003) to ‘water footprint of a product’ in order to have consistency when aggregating WFs of products to the WF of a basket of products or further to the WF of a consumption pattern or diet (Hoekstra et al. 2011).

Fig. 1

The water footprint accounting scheme for a spatial unit like a municipality, province, state, nation or river basin, showing the relation between the water footprints of production and consumption and virtual water trade (from Hoekstra et al. 2011)
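The accounting relation behind this scheme can be sketched in a few lines: the WF of consumption in an area equals the WF of production within the area plus the net virtual water import. The class name, field names and all numbers below are hypothetical, and the sketch ignores the distinction between internal and external components and the re-export of imported products.

```python
from dataclasses import dataclass


@dataclass
class AreaAccount:
    """Simplified WF account for a spatial unit (volumes in billion m3/y)."""
    wf_production: float   # WF of production within the area
    vw_import: float       # gross virtual water import
    vw_export: float       # gross virtual water export

    @property
    def net_vw_import(self) -> float:
        return self.vw_import - self.vw_export

    @property
    def wf_consumption(self) -> float:
        # 'Real' water use of consumers: domestic use plus net virtual import
        return self.wf_production + self.net_vw_import


# Two hypothetical countries that trade only with each other
importer = AreaAccount(wf_production=100.0, vw_import=60.0, vw_export=20.0)
exporter = AreaAccount(wf_production=80.0, vw_import=20.0, vw_export=60.0)

total_production = importer.wf_production + exporter.wf_production
total_consumption = importer.wf_consumption + exporter.wf_consumption

print(importer.wf_consumption, exporter.wf_consumption)
print(total_production, total_consumption)
```

In this closed two-country system the net imports cancel out, so the total WF of consumption equals the total WF of production, which mirrors the global identity between consumption and production footprints.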

Fig. 2

The relation between different water footprints. Water footprints of single processes or activities form the basic building blocks for the water footprint of a product, consumer, or producer or for the footprint within a certain geographical area. The footprint of global consumption is equal to the footprint of global production (adapted from Hoekstra et al. 2011)

Around 2008, there was a broadly felt need to move beyond a concept and work on a more elaborate assessment method, recognizing that a quantification of WFs yields interesting figures but does not address the ‘so what’ question and policy implications. The full WFA method was developed in consultation with stakeholders from the private and public sector over the years 2008–2011, which resulted in the Global WFA Standard of the Water Footprint Network (Hoekstra et al. 2011). The method includes four steps: setting scope of analysis, accounting, sustainability assessment, and response formulation. The sustainability assessment step addresses the ‘so what’ question by putting WFs in the context of sustainability, efficiency and fairness, recognizing that WF figures in themselves tell little if not compared to reference levels. In this stage, new concepts were developed, like the idea of the ‘maximum sustainable WF’, to be translated into ‘WF caps’ per river basin, the idea of ‘WF benchmarks’ for processes and products as a reference for what WF level could be achieved based on the use of certain good or best technology or practice, the idea of ‘blue and grey WF permits’ as opposed to water abstraction and wastewater discharge permits, the idea of ‘fair WF shares’ as a tool to discuss WFs of communities, and the concepts of ‘supply-chain water risk’ for companies and ‘imported water risk’ for countries (Hoekstra 2013).

2.4 Relation to Other Research Fields

The maturing of the research field has led to an increasing exchange with other fields of investigation. While initial WFA studies were little integrated within the broader field of integrated water resources management (IWRM), we see a growing integration of WF and VWT notions in regular water management studies. In addition, we see that WFA is integrated into broader environmental and economic research. First of all, the research community working on environmentally extended input-output modelling started to incorporate WFs into their tools (Ewing et al. 2012), allowing for the full tracing of virtual water flows across economic sectors and regions. The life cycle assessment (LCA) community has started to incorporate the WF into LCA (Boulay et al. 2013) and scholars working on corporate environmental indicators, corporate social responsibility and corporate water stewardship started to integrate the WF in their frameworks as well (Herva et al. 2011; Sarni 2011). Furthermore, an increasing number of scholars are working on integrating different footprints in more holistic environmental footprint studies (Hoekstra 2009; Galli et al. 2012) and linking footprint work to the concept of planetary boundaries (Hoekstra and Wiedmann 2014; Fang et al. 2015). With the transition from a fossil-based to a biobased economy, carbon footprint studies will gradually make way for land and water footprint studies, because biobased essentially means based on scarce land and water resources. Finally, the idea of ‘zero WF’ as the ultimate target for industrial processes fits within studies on the circular economy.

2.5 The Emergence of WF Studies at Different Geographic Scales

A series of global WFAs has been carried out over the years. The first WF study estimated the WFs of national consumption for most countries of the world (Hoekstra and Hung 2002). In a second global assessment, improvements were made by including a larger range of products (Hoekstra and Chapagain 2007, 2008). Whereas both assessments were done at the country level, a third global assessment was based on a high spatial resolution (Hoekstra and Mekonnen 2012). Another global WFA around the same time was carried out by Fader et al. (2011). Chen and Chen (2013) were the first to make a global WFA using a multi-region input-output model as opposed to static trade databases to estimate international VWT. Ercin and Hoekstra (2014) were the first to develop future global WF and VWT scenarios.

Country-specific studies have emerged since 2006 (Ma et al. 2006), river-basin studies since 2008 (Aldaya and Llamas 2008), urban studies since 2009 and site-specific studies (for specific crop fields and factories) since around 2010 (see Supporting Material). Whereas the country and urban studies generally consider primarily the internal and external WF of consumption of citizens, the river basin studies tend to focus on the WF of production within the basin. Most site-specific studies focus on the WF from a local production perspective as well, without considering supply chains. Many of the more local studies draw on results from the global studies: local studies can be more specific in terms of spatial detail within the area studied, but for data on the WFs of imported products and on the sustainability of those WFs elsewhere, they have to rely on other studies.

2.6 The Emergence of Product, Sector and Corporate WF Studies

Hoekstra and Hung (2002) estimated the WFs of 38 crops, per country. Hoekstra and Chapagain (2007, 2008) estimated, again per country, WFs of all primary crops (and various derived crop products), WFs of eight types of animal (and animal products like meat, milk, butter, cheese, leather) and WFs of the industrial and municipal sectors. Mekonnen and Hoekstra (2011, 2012a) made improvements and applied a high spatial resolution, thus accounting for spatial variability in climate, soils and other production conditions. More specific product studies started to appear in 2006 with a study on cotton (Chapagain et al. 2006b). WF studies have been published now on a wide variety of products, including food and beverage products (Ercin et al. 2011, 2012), fibre products like textiles (Chico et al. 2013) and paper (Van Oel and Hoekstra 2012), cut flowers (Mekonnen et al. 2012), packages, minerals, construction materials and manufactured products like cars and computers (see Supporting Material). Sector studies were published for instance for beverages, electricity, transport, tourism, and food aid. WF studies from specific companies started to appear after a first study from SABMiller and WWF-UK (2009). A great problem in most of these applications is the tracing of supply chains and obtaining specific data rather than crude global estimates. This is particularly true for products with long and complex supply chains like animal and manufactured products. For animal products, for instance, the diet of the animal and feed origin is crucial, but in many cases it is difficult to trace the precise composition and origin of feed concentrates.

2.7 The WF of Dietary Choices – the Water-Food Nexus

The impact of diet on the WF of consumption has been studied since 2010. Hoekstra (2010) estimated a potential overall WF reduction of 36% in the industrialised world and 15% in the developing world if people replaced meat by nutritionally equivalent crop products. Mekonnen and Hoekstra (2012a) showed that for any animal product there are crop products with equivalent nutritional value that have a substantially smaller WF. The average WF per calorie for beef was found to be 20 times larger than for cereals and starchy roots. The WF per gram of protein for milk, eggs and chicken meat was estimated to be 1.5 times larger than for pulses. For beef, the WF per gram of protein is six times larger than for pulses. Ercin et al. (2012) found the WF of 1 l of cow milk to be three times larger than for 1 l of soy milk, and the WF of a beef burger 15 times larger than for a similar soy burger. Vanham et al. (2013) estimate that a shift from current to vegetarian diets would result in a WF reduction of 41% for Southern and Western Europe and reductions of 27% and 32% for Eastern and Northern Europe, respectively. Jalava et al. (2014) estimate that a global shift from current diets to recommended diets (following the dietary guidelines of the World Health Organization) plus a replacement of animal products by nutritionally equivalent local crop products would reduce the food-related global green WF by 23% and the global blue WF by 16%.

The innovation of these studies on the relation between diet and water consumption lies in the fact that efforts to mitigate water scarcity through water demand management have traditionally focussed on the question how to increase water productivity in crop production and raising livestock, while a more fundamental question remained unaddressed: how water efficient is the food production system as a whole? WF studies open up the possibility to study the ‘nutritional water productivity’ of the global agricultural sector, i.e., how many kilocalories or grams of protein are produced per drop of water. Another focus of research has become the WF of food waste; it has been estimated that the blue WF for the production of total food wastage is about 250 billion m3, which is 3.6 times the blue WF of total USA consumption (FAO 2013).

2.8 The WF of the Energy Mix – the Water-Energy Nexus

Research on the WF of energy started with studies for bio-energy (Gerbens-Leenes et al. 2009; Dominguez-Faus et al. 2009), followed by research on the WF of hydro-electricity (Mekonnen and Hoekstra 2012b). Currently, we have a reasonable understanding of the WF of all different forms of energy, covering both the fossil and renewable sources (Mekonnen et al. 2015). Per unit of energy, the WF of bioenergy and hydroelectricity is two to three orders of magnitude larger than for fossil fuels and nuclear. The variation for bio-energy is large, since the precise form (e.g., first or second generation bio-energy, which crops or other organic material, and which production circumstances) hugely matters. The variation for hydropower is large as well, depending on the location and characteristics of the reservoir. Electricity from concentrated solar power (CSP) has a similar WF to fossil fuels, while geothermal can be an order of magnitude smaller or even less. The WF of photovoltaic (PV) and wind energy is one to two orders of magnitude smaller than for fossil fuels.

WF studies have been instrumental in showing the water implications of the energy transition from fossil to renewable. The ‘greenest’ of the existing energy scenarios (with the quickest and largest carbon footprint reduction) will greatly enlarge the WF of global energy production, because of the large fractions of bio-energy and hydro-electricity in the mix. The only way to reduce both the carbon and the water footprint of energy production appears to be to aim all investments at wind and solar energy (Mekonnen et al. 2016). Future research will undoubtedly focus on how the energy transition will change interregional energy dependencies and thus power relations, because future energy supply will depend on the availability of land, wind and water resources to produce the renewable energy. If only 10% of fossil fuels in today’s global transport sector were replaced by bioethanol from relatively efficient crops, global water consumption would increase by 7% (Gerbens-Leenes and Hoekstra 2011). Future energy scarcity will essentially be land and water scarcity, so the land and water footprints of energy will be at the core of future energy research.

An additional concern is that the energy return on investment (the EROI factor) for renewables is much lower than for fossil fuels; the energy demand for generating energy will thus become substantial, putting additional claims on land and water (Mekonnen et al. 2015). With current energy-intensive agricultural practices, net energy output is far lower than gross energy production, sometimes even near zero.

PV panels and CSP systems are more efficient in capturing incoming solar radiation than photosynthesis, thus generating more energy per square metre. Since substantial growth of bioenergy, beyond using rest streams of organic material, is impossible, our economies will increasingly depend on wind and solar power, which will drive the electrification of the transport sector, but also electric heating, at least where surplus heat from industrial processes or geothermal energy does not offer a solution. Furthermore, we will need to find ways to store energy and design electrical grids that can handle the large variability of both electricity demand and supply.

2.9 Putting WFs and VWT in Context

Since 2009, an increasing number of papers have put WFs of production and consumption and VWT in the context of what is sustainable, fair and efficient (Hoekstra 2013). In a case study for the Netherlands, Van Oel et al. (2009) were the first to put the external WF of national consumption in the context of local scarcity in the regions of production, thus identifying critical hotspots. The approach was refined by Ercin et al. (2013) in a case study for France and further by Hoekstra and Mekonnen (2016) for the UK. The latter study also shows the level of water-use efficiency in all the locations of the UK’s external WF. Lenzen et al. (2013) showed to what extent international virtual water flows in the world originate from water-scarce places.

Based on estimates of WFs at a high temporal and spatial resolution and high-resolution data on freshwater renewal rates, it has become possible to assess water scarcity at a greater level of detail than ever before, showing precisely where WFs exceed maximum sustainable levels and which types of water use (e.g., which crops) are responsible for that. It has been shown that blue WFs exceed maximum sustainable levels by a factor of two for at least one month per year in half of the four hundred largest river basins in the world (Hoekstra et al. 2012) and that about 4 billion people in the world live in areas that experience severe water scarcity at least one month per year (Mekonnen and Hoekstra 2016). It has also become possible to relate WFs and virtual water trade to the overexploitation of specific aquifers, as shown for example by Marston et al. (2015) for the United States. Grey WFs can be put in the context of a river basin’s assimilation capacity. For nitrogen and phosphorus pollution, it has been shown that grey WFs exceed maximum sustainable levels in many catchments in the world (Liu et al. 2012; Mekonnen and Hoekstra 2015).
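The comparison described in this paragraph can be sketched as a monthly ratio of the blue WF to the maximum sustainable blue WF in a basin, followed by a count of the months in which the ratio exceeds a chosen factor. The function names and the twelve monthly values are hypothetical; they do not represent any real basin, and actual studies derive the maximum sustainable level from runoff and environmental flow requirements.

```python
def monthly_scarcity(blue_wf, max_sustainable):
    """Ratio of blue WF to maximum sustainable blue WF, month by month."""
    if len(blue_wf) != len(max_sustainable):
        raise ValueError("both series must cover the same months")
    return [wf / cap for wf, cap in zip(blue_wf, max_sustainable)]


def months_exceeding(scarcity, factor):
    """Number of months in which the scarcity ratio exceeds `factor`."""
    return sum(1 for ratio in scarcity if ratio > factor)


# Hypothetical basin, January to December, in million m3 per month
blue_wf = [10, 10, 12, 20, 35, 50, 60, 55, 30, 15, 10, 10]
max_sustainable = [40, 45, 50, 40, 30, 20, 15, 15, 25, 35, 40, 40]

scarcity = monthly_scarcity(blue_wf, max_sustainable)

# Months in which the blue WF exceeds the sustainable level at all
print(months_exceeding(scarcity, 1.0))
# Months in which it exceeds the sustainable level by a factor of two
print(months_exceeding(scarcity, 2.0))
```

An annual average would hide the dry-season peak in this example, which is why the monthly resolution matters. The same ratio structure applies to grey WFs versus assimilation capacity.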

It has become possible to discuss fairness of water use by comparing the WFs related to the consumption levels and patterns of different communities (Hoekstra 2013). Given that WFs have passed the maximum sustainable level in half of the world’s major river basins, one may conservatively assume that the WF of humanity as a whole, currently averaging 1400 m3/y per capita, should at least not increase in the future. Future population growth implies that the maximum sustainable level per capita will decline. In the hypothetical case that fairness would be interpreted as an equal water share for every world citizen, this would imply an enormous WF reduction challenge for countries with current WFs beyond the average, like the US (Fig. 3). Future research is needed to better understand the complexities involved here, including questions on what precise sustainability levels are, what is fair given human rights to water and food, what reductions can be achieved through greater water-use efficiencies and to what degree consumption patterns would need to be adapted. Another question is what potential VWT may offer. Seekell et al. (2011) and Suweis et al. (2011) find that current VWT is primarily driven by gross domestic product and social development status of countries rather than spatial patterns of water scarcity and solidarity toward water-stressed populations. Studies have shown that VWT results in modest global water saving (Chapagain et al. 2006a) and that global VWT leads to a slightly more equal global distribution of water resources (Seekell 2011), but it comes with adverse environmental impacts and the risk of long-term water dependency for water-scarce nations. This leads to the need for further inquiry into what Suweis et al. (2013) call the water-controlled wealth of nations.

Fig. 3

Hypothetical convergence of the WF of national consumption of different countries towards an equal share in the maximum sustainable global WF. The maximum sustainable global WF per capita will decline due to population growth (UN medium scenario). Water-use efficiencies need to be improved beyond what is expected under a Business as Usual scenario, and consumption patterns will need to become aligned to what is possible within the planetary boundaries for freshwater supply. Data for 2000 from Hoekstra and Mekonnen (2012)
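The arithmetic behind this hypothetical convergence can be sketched as follows: if the total WF of humanity is held at its current level, the equal share per capita shrinks in proportion to population growth. The 1400 m3/y per capita average is taken from the text; the population figures and the national WF of 2800 m3/y per capita are illustrative assumptions only, not UN or study data.

```python
CURRENT_AVG_WF = 1400.0   # m3/y per capita, global average cited in the text

# Illustrative assumptions, not actual projections
population_now = 7.0e9
population_future = 9.5e9
national_wf_per_capita = 2800.0   # hypothetical high-consuming country


def equal_share(total_wf, population):
    """Equal per-capita share in a fixed global WF."""
    return total_wf / population


# Cap the global total at today's level
global_cap = CURRENT_AVG_WF * population_now

share_future = equal_share(global_cap, population_future)

# Reduction the hypothetical country would need to reach the equal share
required_reduction = 1.0 - share_future / national_wf_per_capita

print(f"future equal share: {share_future:.0f} m3/y per capita")
print(f"required reduction: {required_reduction:.0%}")
```

Under these assumptions the per-capita share falls below the current average even though the global total stays constant, so a country already above the average faces a reduction on two counts.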

WF research has resulted in discussions around water-use efficiency from three different perspectives: the production perspective (local water-use efficiency), the trade perspective (global water-use efficiency) and the consumption perspective (consumer water-use efficiency). Local water-use efficiency can be assessed by comparing the WF of a specific process or product to a WF benchmark for that process or product, which can be based for instance on best available technology and practice (Hoekstra 2013; Mekonnen and Hoekstra 2014; Chukalla et al. 2015). Further research is needed on the effectiveness of regulations or economic instruments to motivate water users to reduce WFs to benchmark levels. Global water-use efficiency depends on whether water-intensive commodities are dominantly produced in relatively water-abundant regions with high water productivity and traded to places characterized by the opposite (Hoekstra 2013). Questions remain on how water scarcity can be better factored into the world economy. Water-use efficiency from the consumer point of view refers to the fact that consumers can seek to fulfil certain demands (e.g., certain amount of kcal and protein per day) in alternative ways, some of which will have a much smaller WF than others. It is quite a new field of research to see how consumers can be incentivized to account for indirect environmental impacts in their shopping choices.

Future WFA research will likely concentrate more on questions around the sustainability, equity and efficiency of WFs than more narrowly on quantification of WFs as in the past. In addition, WFs will increasingly be put in the context of associated risks. Water dependency and security can be assessed by analysing the extent to which companies or communities depend on unsustainable water use in their supply chain. Where companies have supply-chain water risks (Sarni 2011), countries have an ‘imported water risk’ (Hoekstra and Mekonnen 2016).

2.10 Data Sources, Models, Spatial and Temporal Resolution, Scenarios and Uncertainties

The first WF studies were done based on FAO’s CropWat model, national production statistics and international trade data (Hoekstra and Hung 2002). The first global grid-based assessment, at 5 × 5 arc minute resolution, was published in 2011, again using the CropWat model for estimating WFs in crop production (Mekonnen and Hoekstra 2011). More recently, FAO’s AquaCrop soil-water-balance and crop-growth model has been employed in several studies, with an added module to partition ET into green and blue ET (Chukalla et al. 2015; Zhuo et al. 2016a). Other models applied to estimate WFs of crop production include EPIC (Liu et al. 2007) and LPJmL (Fader et al. 2011). Next to modelling, the usefulness of remote sensing in assessing WFs has been explored (Romaguera et al. 2010), with the long-term potential of real-time monitoring. Modelling in combination with national statistics, field measurements and remote sensing products will likely improve the quality of the assessments. The field still has to mature in terms of calibrating model results against field data, adding uncertainties to estimates and making inter-model comparisons, as is done in the field of climate studies. Furthermore, past studies mostly focused on average WFs over multi-year periods, although since 2010 an increasing number of studies show historical time series, with data year by year, enabling the analysis of variability and trends (Dalin et al. 2012; Zhuo et al. 2016a). A few WF and VWT scenario studies, considering the future implications of population and economic growth, diet changes, technological advances, the energy transition and climate change, have been published (Ercin and Hoekstra 2014, 2016; Orlowsky et al. 2014), but this branch of study is in its infancy.

2.11 Standards and Guidelines

The first WF standard was developed by the Water Footprint Network (WFN) in consultation with a broad array of stakeholders over the period 2008–2011, a process that resulted in the 2009 draft and 2011 final Global WFA Standard (Hoekstra et al. 2011). The beverage industry published a guideline largely consistent with this standard (BIER 2011). In the years 2012–2013, WFN hosted an international expert group to develop grey WF guidelines, providing additional practical help in assessing the grey WF for a variety of chemicals (Franke et al. 2013). In 2014, ISO published an assessment and reporting standard related to the WF of products, processes and organizations based on LCA (ISO 2014). Unfortunately, this standard is inconsistent with WFN’s standard; the difference partly lies in method, which is understandable, because ISO focusses on product LCAs and environmental impact, while the WFN standard offers a broader framework, in which WFs can be studied with a different focus (product, producer, consumer or geographic focus) and from different perspectives (environmental sustainability, social equity, resource efficiency or water risk). However, ISO also confusingly deviates in terminology. A key difference is that ISO requires water consumption to be multiplied by a ‘characterization factor’, whereby in practice it has been proposed to multiply water consumption by local water scarcity (Ridoutt and Pfister 2010), which has been criticized for being inconsistent with the way other environmental footprints are defined (Hoekstra 2016).
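The difference between a volumetric WF and a scarcity-weighted figure can be illustrated with a minimal sketch. It is not an implementation of the ISO procedure or of any published characterization method; the function names, the scarcity factors and the volumes are hypothetical and serve only to show what multiplying water consumption by a local weighting factor does to the result.

```python
def volumetric_wf(uses):
    """Sum of consumed volumes, regardless of where they occur."""
    return sum(volume for volume, _factor in uses)


def scarcity_weighted(uses):
    """Sum of consumed volumes, each multiplied by a local weighting factor."""
    return sum(volume * factor for volume, factor in uses)


# Each entry: (water consumption in m3, hypothetical local scarcity factor)
product_a = [(100.0, 0.1), (100.0, 0.1)]   # sourced from water-abundant areas
product_b = [(100.0, 0.1), (100.0, 0.9)]   # partly sourced from a scarce area

for name, uses in (("A", product_a), ("B", product_b)):
    print(name, volumetric_wf(uses), scarcity_weighted(uses))
```

Both hypothetical products consume the same volume, so their volumetric WFs are identical, while the weighted figures differ. The weighted result is no longer a volume of water actually consumed, which is where the two approaches part ways.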

3 Discussion

The quick emergence of the new field of WFA and the uptake of the WF concept by companies, governmental organizations, the United Nations, civil society and media have generated substantial discussion about what the concept in a narrow sense and the field in a broader sense can and cannot offer. The WF concept has been praised for creating awareness but has been questioned regarding its policy relevance. Critique of the related concepts of WF and VWT has particularly come from two sides: economists who do not see the need for these two concepts in economic analysis, and LCA scholars in search of an indicator of water use impacts. In the following sections, I reflect on the main issues of debate.

3.1 Local Vs Global

There are good arguments to manage water locally or at basin level whenever possible, but it is valuable as well to see what can or even needs to be done at larger spatial scales, particularly when the driving mechanisms of water problems go beyond the river basin (Hoekstra 2011; Vörösmarty et al. 2015). A strong motivation behind many VWT and WF studies is the idea that understanding local water use and pollution in relation to the structure of the global economy could help identify potential mechanisms of change. Consumer choices, corporate procurement policies, supply agreements, investment policies, product labelling schemes, trade policies and agreements, and international cooperation, shared principles and environmental treaties can all affect the way water is being used elsewhere. It is useful to explore how players beyond the local stakeholders can play a role in solving local water problems. The WF and VWT concepts have proven to be instrumental in obtaining insight into water use along supply chains and identifying critical hotspots. The idea of water as a global resource, however, has received harsh criticism, and with it the VWT and WF concepts. Wichelns (2011) calls them “compelling notions, but notably flawed”. The concepts would fall short as analytical constructs because water scarcity and water quality are local, not global phenomena (Wichelns 2015a). Water scarcity would arise from local water demands exceeding local supplies; water quality degradation would be due to inappropriate practices within a given country. While this is obviously true, it remains unclear how this justifies the conclusion that what happens beyond the local level is irrelevant and that water protection should be an exclusively local task. 
There are boundless examples of local water problems that are part of global mechanisms; consider, for example, the overexploitation of water resources in the dry north-western parts of India for the irrigation of cotton fields or polluted rivers in Bangladesh from the cotton processing industries. In an interconnected world, it is short-sighted to say that problems are caused and are to be solved where they occur. Wichelns (2015a) argues that consumers in one country cannot alleviate water scarcity or improve water quality in other countries. This is indeed too simply stated, but there is no reason not to explore what companies, investors, governments and consumers down the cotton supply chain can do to make water use at the places of production more sustainable, not only in the interest of local communities in India or Bangladesh, but in their own interest as well (because relying on an unsustainable supply chain is not going to last). Wichelns (2015a) further questions the WF and VWT concepts by arguing that consumers in one country are not responsible for environmental harm in another. Legally this is true, ethically one may debate, but whatever position one takes, a product’s WF just shows the factual water use over its supply chain, after which one can analyse the sustainability and efficiency of the water use in each stage, and use that information in debating what can be done. WF and VWT are sometimes erroneously understood as prescriptive tools, but they just offer a way of factually analysing water use along supply chains. About 75% of the total WF of UK consumption is outside the UK and about half of the country’s blue WF is located in places where the blue WF exceeds the maximum sustainable blue WF, with the majority of the hotspots in Spain, USA, Pakistan, India, Iran and South Africa (Hoekstra and Mekonnen 2016). It is difficult to see what is not global in this or why concepts that reveal something we did not know before are flawed and useless.

3.2 How Real Are VWT and WFs?

There has been a philosophical discussion on how real VWT is and whether WFs make sense at all. According to Merrett (2003) and Wichelns (2011), countries import food, not virtual water. The basic critique is that the notions of VWT and WF are redundant and hence do not enhance understanding. Wichelns (2015a) further asserts that it is impossible to say that countries save water by importing virtual water; according to him, VWT does not exist, and neither does the water saving for the importing country that results from it. In the strict vocabulary of some economists, trade is about real things and not about ‘embedded’ or ‘virtual’ things, and ‘saving’ refers to a specific form of economic efficiency gain. A strict neoclassical economic perspective may make it hard to see VWT and WFs, but making them visible is precisely the added value of the concepts. Jordan, a highly water-scarce country, has externalized its WF by 86%; the country is a large net virtual water importer, with a national water saving of 7 billion m³/year through trade, the volume of water that would have been required had Jordan produced all imported commodities itself (Schyns et al. 2015a). This is vital information to understand the economy of Jordan, but economists easily remain blind to it because the water trade is virtual rather than real and because water scarcity is not factored into the price of commodities and is thus invisible to them.
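The two quantities quoted for Jordan can be illustrated with a minimal sketch. It assumes that the national water saving is the sum, over imported commodities, of the imported quantity times the WF per unit that domestic production would have had, and that the externalized share is the external WF divided by the total WF of national consumption. The function names and all numbers are hypothetical and are not taken from Schyns et al. (2015a).

```python
# Hypothetical illustration; none of the figures below come from the cited studies.

def national_water_saving(imports_tonnes, domestic_wf_m3_per_tonne):
    """Water (m3/year) the importing country would have needed
    to produce its imported commodities itself."""
    return sum(
        tonnes * domestic_wf_m3_per_tonne[commodity]
        for commodity, tonnes in imports_tonnes.items()
    )

def externalised_share(internal_wf_m3, external_wf_m3):
    """Fraction of the WF of national consumption located abroad
    (assumed definition: external WF / total WF)."""
    return external_wf_m3 / (internal_wf_m3 + external_wf_m3)

# Made-up import volumes (tonnes/year) and made-up WFs per tonne
# for the case that the crops were grown domestically (m3/tonne).
imports = {"wheat": 1000.0, "maize": 500.0}
domestic_wf = {"wheat": 2000.0, "maize": 1500.0}

saving = national_water_saving(imports, domestic_wf)
share = externalised_share(internal_wf_m3=1.0e6, external_wf_m3=3.0e6)

print(f"national water saving: {saving:.0f} m3/year")
print(f"externalized share of WF: {share:.0%}")
```

The saving is expressed in terms of the importing country's own production conditions, which is why it can differ from the water actually consumed in the exporting countries.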

The VWT concept has been more specifically questioned because of the ‘virtual water hypothesis’ that water-short countries should import water-intensive products from water-abundant countries (Merrett 2003). The hypothesis – ascribed to Allan (2001) – does not exist, however, and is based on misinterpretation; the criticism thus misses its target. Authors on VWT generally use the concept as an analytical, not a prescriptive tool. They point at the relevance of considering VWT when addressing questions around national water security and international dependencies. Indeed, several authors, including Allan (2001), have suggested examining the option of increased net virtual water import in water-scarce countries, but this is essentially different from the proposition that they should increase import. A prescriptive ‘VWT hypothesis’ does not make academic sense; VWT can more productively be viewed as simply happening to a greater or lesser extent, inevitably coming along with all sorts of both positive and negative economic, social and environmental implications. Neither should VWT be interpreted as a trade policy approach to resolving the global water crisis (see e.g., Horlemann and Neubert 2007). Critical examination logically results in the conclusion that VWT is not a panacea; that it would or could be is an odd idea from the start.

3.3 WF and Water Productivity versus the Opportunity Cost of Water Use

The consumptive WF per unit of product is the inverse of ‘water productivity’ and as such relevant in discussions about resource efficiency. The WF has been criticized for considering only one input in production and not properly addressing the opportunity costs of that input. Similarly, the VWT concept has been blamed for showing the volume of water virtually embedded in traded products but not addressing the opportunity costs of production within countries that engage in trade (Gawel and Bernsen 2013; Wichelns 2015b). All this is true, but it is difficult to see why that is a problem. The WF and VWT concepts are apparently expected to account for other inputs (like land, labour) as well and to properly reflect opportunity costs. But this is like taking the wrong tool for a purpose and then blaming the tool for it. The criticism, however, is regularly quoted (e.g., Chenoweth et al. 2014), so worth another reflection. Water productivity expresses how much of a good one gets per unit of water, analogous to concepts like land or labour productivity. Optimizing water productivity in crop production regardless of other factors is as bad an idea as just optimizing crop yields (land productivity). Optimal allocation of scarce resources requires them all to be taken into account. However, that does not render the concept of water productivity or its inverse, the WF, useless. The entire reason to worry about water productivity is that in our economies water is not properly included in allocation decisions, for a variety of reasons. One reason is the fact that water is a common pool resource and in many places not properly priced and regulated (Hoekstra 2013). As a result, farmers and industries optimize the productivity of input factors like land, labour and capital at the cost of overexploiting water resources. 
As Antonelli and Sartori (2015) observe, current patterns of water allocation and use often reflect underlying market failures that could be corrected, or whose effects could be overcome, through appropriate policy interventions. Data on WFs and VWT tell a partial story indeed, but a story that is worth knowing. The concern from some economists seems to stem from their interpretation that virtual water imports into water-scarce countries need to be promoted and that WFs need to be reduced at all cost. There is, however, nothing in the concepts with those implications.

3.4 Accounting for Water Scarcity in WFA

There is an ongoing debate on whether and how water scarcity should be accounted for in WFA. There is broad agreement that WFs within a river basin get meaning when put in the context of local water availability or scarcity. There is no agreement, however, how to do that exactly. The mainstream approach is to compare the aggregate blue WF in a catchment to the blue water availability in the catchment, which will show the degree of blue water scarcity in the catchment and whether environmental flow requirements are met (Hoekstra et al. 2012). Regarding the WF of a product, one can analyse which components of the overall WF along the production chain of the product lie in river basins where they contribute to high water scarcity, thus identifying critical components in the water use along the supply chain. Regarding the green WF, a similar approach has been proposed (Hoekstra et al. 2011), but this requires further elaboration (Schyns et al. 2015b). A second approach – proposed by LCA scholars – is to account for water scarcity in the WF metric itself, by multiplying consumed water volumes by local water scarcity, which yields a scarcity-weighted WF (Ridoutt and Pfister 2010). This approach is product-focussed; when this approach is applied to a catchment, one will find a WF defined as the water consumption in the catchment multiplied by the water scarcity in the catchment. Since water scarcity refers to the ratio of water consumption to water availability, the WF in a catchment will equal the square of the water consumption in the catchment divided by the water availability (Hoekstra 2016). This is obviously an odd metric, illustrating the unsuitability of the LCA approach for application in river basin studies. It makes sense to compare the volumetric WF in a catchment with water availability, not to multiply it with water scarcity. While the WF refers to ‘water consumption’, some LCA scholars want it to refer to ‘environmental impact of water consumption’. 
In WFA, which can take a product focus but a geographic or consumer focus as well, the environmental impact of a WF is studied in the sustainability assessment stage that follows the accounting stage; in that subsequent stage, there is also room for addressing other issues than the environmental impact of WFs, like questions around equitability and efficiency of water use and water dependency. With its focus on products and environmental impacts, LCA has a narrower focus than WFA.
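The argument above about the scarcity-weighted WF at catchment level can be made concrete with a small numerical sketch. It takes water scarcity as the ratio of water consumption to water availability, as stated in the text; the function names and the numbers are hypothetical.

```python
# Hypothetical illustration of the catchment-level argument; numbers are made up.

def blue_water_scarcity(consumption, availability):
    """Scarcity as the ratio of water consumption to water availability."""
    return consumption / availability

def scarcity_weighted_wf(consumption, availability):
    """Consumption multiplied by local scarcity, which at catchment
    level equals consumption squared divided by availability."""
    return consumption * blue_water_scarcity(consumption, availability)

availability = 100.0  # e.g. million m3/year, hypothetical

low = scarcity_weighted_wf(20.0, availability)
high = scarcity_weighted_wf(40.0, availability)

# Doubling the volumetric WF doubles the scarcity ratio
# but quadruples the scarcity-weighted WF.
print(blue_water_scarcity(20.0, availability), low)
print(blue_water_scarcity(40.0, availability), high)
print(high / low)
```

The volumetric comparison (consumption against availability) stays linear and directly interpretable, whereas the weighted metric grows with the square of consumption.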

Where WFA and LCA differ in focus and in the way limited water availability is accounted for in the analysis, a more fundamentally different view comes from economists who argue that WFs lack sufficient information to support policy analysis or to motivate wise decisions by consumers and firms, because WFs neglect information describing water scarcity conditions, implications for livelihoods and beneficial aspects of water use (Gawel and Bernsen 2013; Wichelns 2015a). Their conclusion that the notion of WF falls short altogether, however, ignores the fact that the whole essence of quantifying WFs is to subsequently put them in the context of limited water availability and study water scarcity. The critical economist perspective seems to come from the assumption that the WF is not a good metric if it does not include all that is relevant in the context of allocation decisions. The WF concept does not do so, but does not pretend to either. The LCA approach originates from a similar perception of shortcoming; a WF would not be a good WF if it does not reflect the environmental impact of water use. The volumetric WF as used in WFA does not, which explains the LCA proposal to repair this by multiplying volumetric WFs by local water scarcity. It is better though, and less confusing, if the LCA community speaks about WF impact instead of WF if it is the impact they focus on.

3.5 Assessing Maximum Sustainable WFs and WF Benchmarks

Research on maximum sustainable WFs per river basin and WF benchmarks per process and product has just started and is much less developed than research on the quantification of WFs themselves. This is problematic in the sense that it feeds doubts on the usefulness of WF accounts, because WFs need to be contextualized in order to become relevant in policy making (Witmer and Cleij 2012; Perry 2014). Quantifying maximum sustainable WFs is difficult because water availability strongly fluctuates in time and space, as WFs do, so the comparison needs to be done time- and location-specific. Besides, a question is how much of the water flows are to be reserved as environmental flows to sustain ecosystems and local livelihoods. In addition, climate change and land use changes (e.g., deforestation, wetland drainage, reservoir construction) affect the partitioning of precipitation into green and blue water flows, which in turn affects temporal and spatial water availability patterns over time. Location-specific environmental flow standards need to be established as they exist for water quality; based on such standards, blue WF caps per basin can be institutionalized, which could be translated in a maximum volume of WF permits to be issued. Another challenge is to develop WF benchmarks for processes and products, which will enable companies to formulate WF reduction targets for their operations and supply chain. Besides, a WF benchmark for a certain type of production provides governments with a reference with respect to what is a reasonable WF permit to be issued to specific users. Questions still to be addressed are to which extent benchmarks for water consumption in crop production will need to differentiate between different climate and soil conditions, because a certain best practice may yield a larger WF per tonne of crop on sandy soil in a hot semi-arid climate than in other conditions (Zhuo et al. 2016b). 
Besides, we may need to have benchmarks for different technologies and practices (Chukalla et al. 2015). Obviously, WFs are not to be reduced down to certain benchmark levels at all cost; targets will need to depend on the wider context, for example how costly it is and how important given local water scarcity. Marginal cost curves can be developed to show costs associated with different WF reduction levels.
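Why the comparison between WFs and maximum sustainable WFs needs to be time-specific can be shown with a minimal sketch. It assumes that the monthly blue WF cap is the natural runoff minus a fixed fraction reserved for environmental flows; that fraction, the function names and all numbers are hypothetical and not a proposed standard.

```python
# Hypothetical illustration; the environmental flow fraction and all data are made up.

def monthly_blue_wf_cap(natural_runoff, env_flow_fraction):
    """Cap per month = runoff minus the share reserved as environmental flow."""
    return [q * (1.0 - env_flow_fraction) for q in natural_runoff]

def months_exceeding_cap(blue_wf, cap):
    """1-based indices of the months in which the blue WF exceeds the cap."""
    return [
        month
        for month, (wf, limit) in enumerate(zip(blue_wf, cap), start=1)
        if wf > limit
    ]

runoff = [100.0, 80.0, 40.0, 10.0]   # four example months, same volume unit
blue_wf = [10.0, 15.0, 25.0, 8.0]    # aggregate blue WF in the basin

cap = monthly_blue_wf_cap(runoff, env_flow_fraction=0.5)
exceeded = months_exceeding_cap(blue_wf, cap)

print("aggregated over the period:", sum(blue_wf), "used of", sum(cap))
print("months above the cap:", exceeded)
```

In this example the WF aggregated over the period stays well below the aggregated cap, yet the cap is exceeded in the two driest months, which an annual comparison would hide.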

3.6 Measuring Total or ‘Additional’ Water Consumption

The essence of the green and blue WF is that they measure the consumption of green and blue water resources for a certain purpose, water that as a result will no longer be available in the same catchment and time period for another purpose. In the case of green water consumption, it has been argued that it would be better to look at an activity’s additional green water consumption. Núñez et al. (2013) and Perry (2014) argue that green water consumption in crop production can better be measured relative to natural vegetation, which will always result in much smaller numbers and even negative numbers in many cases. Where rain-fed crop fields have less evapotranspiration than the original vegetation, this would imply that crop production produces water rather than consumes it! Launiainen et al. (2014) similarly argue for forestry that when evapotranspiration from a managed forest equals that of unmanaged forests, it should not be counted as a green WF. These authors argue that there is no WF if the hydrology of a catchment is unchanged. They misinterpret, however, the WF concept, which is not intended to show a change in catchment hydrology but the volume of water appropriated for a certain purpose, and therefore not available for another purpose. A similar misinterpretation happens when Bakken et al. (2015) and Scherer and Pfister (2016) propose to measure the WF of artificial reservoirs and hydroelectricity as the difference between reservoir evaporation and the evapotranspiration from the land that was there prior to the reservoir. This is incorrect, since the evaporative flow from the land prior to the reservoir construction was appropriated for another purpose (e.g., producing food or forestry products) or untouched and left for natural vegetation. With the reservoir, the evaporative flow is used for something else (e.g., hydro-electricity, water supply or other uses, depending on the purposes of the reservoir). 
Since the WF concept is defined to feed discussions on how available green and blue water flows are used for competing purposes, we thus have to stick to measuring all and not ‘additional’ water consumption.

3.7 The Consistency with Other Environmental Footprint Metrics

With other environmental footprints, the WF forms a family of footprint indicators that measure natural resource use or emissions (Galli et al. 2012). Environmental footprints are related to the concept of planetary boundaries; they measure how much of the available capacity within the planetary boundaries is already consumed (Hoekstra and Wiedmann 2014). The ecological footprint (EF) of humanity is to be compared with the available global biocapacity and the carbon footprint (CF) to the maximum level of greenhouse gas emissions given maximally acceptable global warming. The WF is to be compared with available freshwater resources, which can best be done catchment by catchment. Common to all environmental footprints is that they quantify human appropriation of natural capital as a source or a sink: each specific footprint measures either a form of natural resource appropriation or a form of waste generation, or both. The WF measures both the consumption of fresh water as a resource (the green and blue WF) and the use of fresh water to assimilate waste (the grey WF). It has been argued that water volumes consumed should be weighted based on local water scarcity, as an equivalent to EF practice, where used hectares are weighted based on their bioproductivity (kg/ha) (Wichelns 2015a). The right equivalence, however, would be to weight consumed water volumes based on local bioproductivity of the water (kg/m3). Since EF analysis focuses on the use of bioproductive lands, the rationale for normalizing used hectares based on their bioproductivity is that areas may have different value in terms of producing biomass. WFA is not exclusively focused on bioproductivity (relevant in agriculture or forestry), but also on other types of value (in domestic and industrial water supply), so weighting water use based on its bioproductivity does not make sense in the broader discussion of water allocation. 
Besides, the bioproductivity of water is not a property of the water used (as the bioproductivity of land is a property of the land), but a property of the amount applied. Without water, plants do not grow; with increasing water application the bioproductivity of water increases, until it decreases again. Bioproductivity is thus not a proper weighting factor in WFA as it is in EF analysis. Weighting based on local water scarcity instead is not in any way equivalent to the accounting practice in EF analysis. Water scarcity is not a proxy for, or something similar to, water productivity. Weighting consumed water volumes based on local water scarcity would be equivalent to weighting used land based on local land scarcity, which makes no sense and is therefore not done. Measuring plain water volumes used is perfectly equivalent to measuring bioproductive space used, whereby in a next step water volumes used (the WF) need to be compared to the water volume available (the maximum sustainable WF) and the bioproductive space used (the EF) to the bioproductive space available (the maximum sustainable EF) (Hoekstra 2009).

LCA scholars have pointed at the need to weight water consumption based on local water scarcity as well, referring to the practice in CF accounting of weighting emissions of greenhouse gases based on their ‘global warming potential’ (Ridoutt and Pfister 2010). However, the equivalence is again incorrect. The grey WF is comparable to the CF in the sense that it measures emissions; in grey WF accounting, different pollutants are weighted based on their ‘water pollution potential’ like greenhouse gases are weighted based on their global warming potential. The blue and green WF measure resource use (water use) like the EF (land use). The green and blue WF could be weighted based on productivity (as discussed above), not based on water scarcity, but since this is not practically doable (because of different types of productivity and because of the variability in productivity depending on the volume of application), the best we can do is just explicitly distinguish the green and blue WF, because the array of possible applications of blue water resources differs from the array of possible applications of green water resources.

3.8 Policy Relevance

While it has been widely acknowledged that the WF has contributed to awareness raising on water issues, it has been questioned to what extent the WF and VWT concepts have policy relevance (Chenoweth et al. 2014). It has been pointed out that two products may have the same WF but different environmental impact, so that it becomes dangerous to use the WF to guide policy aimed at reducing environmental impact. For the same reason, doubts have been expressed on reporting the WF on a product label or using the concept in a product or production site certification scheme (Postle et al. 2011). Basically, the critique originates from the assumption that the WF metric should provide an all-inclusive message that tells right away what to do. Based on such expectations, Wichelns (2015b) concludes that the WF metric is unsuitable for monitoring company, consumer or country progress towards sustainable water use. It is simplistic thinking, however, to expect an indicator to tell what to do. We need analysis for that, not one number. As many authors have pointed out, WFs need to be put in context to get meaning and water considerations need to be embedded in broader reflections. More useful than a simple numerical WF label would be a graded water label based on criteria such as: is the product’s WF below a certain benchmark level and are most of the components of the product’s WF in basins where the aggregate WF is below the maximum sustainable level. Similarly, governmental policies and corporate strategies can be informed based on a full WFA, not just based on one number. The Dutch Environmental Agency notes that instead of revealing their overall WF in their sustainability reports, companies would do better to report progress made in reducing the separate components of their WF in unsustainable hotspots. 
The strength of this approach would be the involvement of distant consumers, producers, retailers and investors – in addition to local stakeholders and authorities – in addressing water problems in hotspot areas (Witmer and Cleij 2012). Over the past few years, an increasing number of companies and governments have found or started to explore the relevance of WFA (see Supporting Material). A good governmental example is the UK’s Environment Agency, which carried out a detailed WFA for the Hertfordshire and North London Area to assist water resources and water quality regulators in managing the quantity and quality of water resources in a sustainable way (Zhang et al. 2014). Other examples are the Spanish government adopting a regulation that requires WFA as part of the process of developing river basin plans (Aldaya et al. 2010) and the Indian government including the goal of WF reduction in its draft national water framework bill (GoI 2016). In general, WF figures should of course be taken with care, put in a broader context and analysed at the level of detail necessary for a certain purpose. WFA is a partial analysis, as any other analysis, and should always be integrated into or combined with other analyses for developing water or other policy.
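The graded water label suggested in this section could be sketched as follows. The sketch uses the two criteria named in the text; the A/B/C grading scheme, the reading of ‘most’ as more than half of the product’s WF by volume, the function name and all numbers are assumptions made for illustration only.

```python
# Hypothetical grading scheme; grades, threshold and data are illustrative assumptions.

def water_label(components, benchmark_wf):
    """Grade a product from its WF components.

    components: list of (wf_volume, basin_aggregate_wf, basin_max_sustainable_wf),
    one entry per basin in which part of the product's WF is located.
    """
    product_wf = sum(volume for volume, _, _ in components)
    sustainable_part = sum(
        volume
        for volume, basin_wf, basin_max in components
        if basin_wf <= basin_max
    )
    # Criterion 1: the product's WF does not exceed the benchmark level.
    below_benchmark = product_wf <= benchmark_wf
    # Criterion 2: most of the WF (assumed: more than half by volume) lies in
    # basins where the aggregate WF is within the maximum sustainable level.
    mostly_sustainable = sustainable_part > 0.5 * product_wf
    criteria_met = int(below_benchmark) + int(mostly_sustainable)
    return {2: "A", 1: "B", 0: "C"}[criteria_met]

# Made-up product: 600 units of WF in a basin within its sustainable
# limit, 300 units in an overexploited basin.
product = [(600.0, 80.0, 100.0), (300.0, 120.0, 100.0)]

print(water_label(product, benchmark_wf=1000.0))
print(water_label(product, benchmark_wf=800.0))
```

A label of this kind carries the context that a single volumetric number lacks, but the choice of thresholds would itself need to follow from a full WFA.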

4 Conclusions

The innovation of the new field of WFA lies in adding new perspectives to water management. First, it adds the global dimension in efforts to understand patterns of water use, pollution and scarcity. By unveiling indirect drivers of local water problems, it paves the way for analysing what can be done ‘elsewhere’ than locally to improve the sustainability and equity of water use. Previously, water problems have always been thought to be local and to be solved locally, or at least within a river basin. Second, WFA opens the way to analyse the most fundamental driving force behind problems of water pollution and scarcity, namely consumption. Water management has always focussed on matching local water demands and supplies, considering both ‘supply management’ and ‘demand management’ but this approach is too narrow. In water demand management, the focus is on reducing water needs per user, not addressing the more fundamental question, i.e., for which final purposes water is being used, thus avoiding critical discussions like water for food versus feed, water for food versus bio-energy, water for food versus forestry products, and water for producing products for domestic consumption versus export. Third, WFA has introduced supply-chain thinking in water management, bringing in new relevant players into the analysis. Whereas water management has traditionally centred around the question how governments can best govern the public resource water within catchments given competing water users and interests within the catchment, WFA shows the relevance of other actors (consumers, companies, investors), many of whom are seemingly not connected to the catchment. 
WFA is new for business in the sense that it shifts focus from own operations to the supply-chain, from gross to net water abstraction, from securing the ‘right to abstract’ to assessing the actual sustainability of water consumption, and from meeting ‘emission permits’ to assessing the company’s actual contribution to pollution. While WFA is rooted in discourses on globalization and sustainability of footprints and supply-chains, the development of WFA has in turn also contributed to these larger fields of thinking. Given the essential role of water in our food and energy supply, water is a key resource for future development. Further advances in WFA will need to improve our understanding of how different players can contribute to forms of water governance that integrate the important criteria of environmental sustainability, social equity, economic efficiency and supply security.