Evaluation of Sub-National Population Projections: a Case Study for London and the Thames Valley

Sub-national population projections help allocate national funding to local areas for planning local services. For example, water utilities prepare plans to meet future water demand over long-term horizons. Future demand depends on projected populations and households and forecasts of per household and per capita domestic water consumption in supply zones. This paper reports on population projections prepared for a water utility, Thames Water, which supplies water to over nine million people in London and the Thames Valley. Thames Water required an evaluation of the accuracy of the delivered projections against alternatives and estimates of uncertainty. The paper reviews how such evaluations have been made by researchers. The factors leading to variation in sub-national projections are identified. The methods, assumptions and results for English sub-national areas, used in five sets of projections, are compared. There is a consensus across projections about the future fertility and mortality but varying views about the future impact of internal and international migration flows. However, the greatest differences were between projections using ethnic populations and those using homogeneous populations. Areas with high populations of ethnic minorities were projected to grow faster when an ethnic-specific model was used. This result is important for assessing projections for countries housing diverse populations with different demographic profiles. Historic empirical prediction intervals are used to assess the uncertainty of the London and the Thames Valley projections. By 2101 the preferred projection suggests that the population of the Thames Water region will have grown by 85% within an 80% empirical prediction interval between 45 and 125%.


Introduction
Projections of future sub-national populations are needed for public and private sector planning. Sub-national population projections are used in grant allocation from central to local government departments and agencies and are employed in service planning by local governments, police authorities, fire and rescue services and health agencies. Projected populations are important in planning provision of utility services, such as electricity, gas, water and sewage disposal.
The future horizon for which projected populations are needed varies from one to five years for budget planning, to short-term intervals of 25 years in UK official subnational projections through medium horizons of 30 to 50 years in local authority (GLA 2014) or academic work (Rees et al. 2016a) to long-term periods of 100 years in pension planning (Pensions Commission 2005).
Thames Water Utilities Limited (Thames Water or TWUL) commissioned the University of Leeds (LEEDS) to carry out long-term population and household projections to 2101 as an input to forecasts of domestic water demand for Thames Water's Water Resource Zones (WRZs). Thames Water were interested in the impact of additional consumption by households in selected ethnic groups because water consumption records showed that South Asian headed households consumed, per capita, about 53 litres per day more than Other Ethnic headed households (Nawaz et al. 2018). For a geographic context to this study, Fig. 1 shows the Thames Water region, its constituent WRZs and the boundaries of all Local Authority Districts (LADs) which contribute populations to the WRZs. An inset map locates the Thames Water region within the UK. We refer to these WRZs collectively as the Thames Water region or TW region. Projections of populations, households and water demand were produced for Thames Water by a team at the University of Leeds, referred to as LEEDS in the rest of the paper (Thames Water 2017; Rees et al. 2018).
For quality assurance, Thames Water asked LEEDS to compare their projections with those of the Greater London Authority (GLA), the Office for National Statistics (ONS) and Edge Analytics Ltd. (EDGE), using local authority projections converted to WRZs. LEEDS was required to explain how and why their population projections differed from other projections. The aims of this paper are (1) to review approaches used to evaluate sub-national population projections, (2) to describe the methods and assumptions used in five sub-national projections for the English LADs that cover the WRZs, (3) to compare the LEEDS projected populations against the other projections, (4) to propose reasons for the differences, (5) to estimate the uncertainty of the LEEDS projection and (6) to produce an overall evaluation of the results. Thames Water asked us to argue the case for adopting the LEEDS results as the basis for their domestic water demand projections.
The paper is organized as follows. The second section reviews approaches used by practitioners to evaluate alternative projections, drawing on a growing literature. The third section describes data and methods in the projections evaluated. A fourth section discusses the assumptions used in the five projections. These two sections constitute a valuable resource for researchers and practitioners in southern England. The fifth section compares the results of the central forecast in the set of projections across WRZs and compares variants produced by LEEDS and the GLA. The sixth section presents uncertainty ranges for the LEEDS projections using empirical prediction intervals. The final section summarizes findings and discusses how comparative evaluations might be improved.

Review of Approaches to Evaluating Population Projections
This paper aims to evaluate a population projection for London and the Thames Valley water supply area against a set of alternative projections. A typology of approaches for such an evaluation is set out in Table 1. It provides a label for each evaluation method in the first column, a description in the second and citations of selected papers that exemplify the method in the third column.
The first type of evaluation (Table 1A), Interpretative Comparison, involves comparing key numbers, identifying differences and then developing plausible reasons for the differences, based on knowledge of the models used, input data and future [...] (UN 2015) and in the Wittgenstein Centre's SSP2 (Shared Socio-economic Pathway 2) projection provided in Lutz et al. (2014). The comparisons are made between national populations and the sum of their more detailed province/urban/rural populations. The authors are surprised at the consistency of the projected total populations of India but their interpretation indicates that differences in the structure of the projections and assumptions across components may cancel out. Other examples of this evaluation type include the comparison of methods used in UK Sub-National Population Projections reported in ONS (2018b) and the comparison of methods, assumptions and results of European Union regional population projections in Rees et al. (2001).
The second evaluation approach (Table 1B), Controlled Comparison, involves using a fixed set of inputs (populations and components) and assumptions when running a suite of projections which differ in model design for just one component. Wilson and Bell (2004) test ten different models for projecting internal migration, including the net migration flow model, the multi-regional model, a pool model and a gravity-type model. They find major differences between model groups but similarities between multi-regional and bi-regional models (replicating results in a similar evaluation by Rogers 1976). The comparisons are for future populations (unknown at the time of writing), so they adopt the multi-regional model as the gold standard for the comparison. The second example comprises variant projections, frequently generated at national scale, for local areas in Scotland (NRS 2018). The results of adopting low or high assumptions for one component at a time, holding the others fixed at principal projection inputs, are produced and evaluated.
The third evaluation type involves Tested Comparison (Table 1C). The Australian demographer Tom Wilson has improved on the second approach in a suite of papers (Wilson and Rowe 2011; Wilson 2015; Wilson et al. 2018; Wilson 2018) by calibrating models for 5-year inter-census period #1 and then forecasting using the models for period #2. This makes possible assessment of projection outcomes against census results. The method was developed by American demographers for evaluating US census tract and county population projections (Smith et al. 2001; Tayman 2011; Rayer et al. 2009). This approach is more rigorous than Interpretative or Controlled Comparison, though authors caution that the best choice of model for a recent time interval may not be the best for the future.
Wilson (2017) points out, in a useful research note, that projection results should be subject to plausibility tests covering projected total population trends, trends across areas, components of change and age-sex structures (Table 1D). These checks are most important for mid/central/principal projections which producers invite users to use as most likely forecasts. He poses 21 questions for which producers should seek answers (Wilson 2017, Table 1). Some are designed to reveal numerical problems. Examples include checking whether all projections by age and sex are free of negative values and whether projected net internal migration across the country sums to zero. Others reveal information which helps in deciding further actions, such as whether projected sub-national populations should be adjusted to add up to national projected populations.
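The two numerical checks mentioned above (no negative age-sex values, and net internal migration summing to zero across the system) can be sketched as follows. This is a minimal illustration, not Wilson's checklist itself: the function names, the tolerance and the two-area data are hypothetical.

```python
from typing import Dict, List


def find_negative_cells(populations: Dict[str, List[float]]) -> List[str]:
    """Return the areas whose projected age-sex vector contains a negative value."""
    return [area for area, cells in populations.items()
            if any(value < 0 for value in cells)]


def net_internal_migration_balances(net_flows: Dict[str, float],
                                    tolerance: float = 1e-6) -> bool:
    """In a closed system of areas, net internal migration should sum to zero."""
    return abs(sum(net_flows.values())) <= tolerance


# Hypothetical projected populations by age-sex cell for two areas.
projected = {"Area A": [120.0, 95.5, 80.2], "Area B": [60.0, -1.5, 44.0]}
# Hypothetical net internal migration totals for the same areas.
net_internal = {"Area A": 250.0, "Area B": -250.0}

print(find_negative_cells(projected))                 # areas needing attention
print(net_internal_migration_balances(net_internal))  # True if flows balance
```

Checks of this kind only flag numerical problems; deciding what to do about a flagged area remains a matter of judgement.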
A common way of testing the plausibility of principal projections is by running variant projections (Table 1E), in which high and low assumptions for each component are made and the results compared with the main projection. This is standard practice at national scale (ONS 2015) but rarer at sub-national scale (NRS 2018). Developing plausible variants for internal migration is more difficult, with current practice being to use calibration intervals when different migration structures were known to operate (GLA 2014). Reference projections (e.g. no international migration) are also implemented (NRS 2018). Rees et al. (2013) developed a system of reference projections, based on a design by Bongaarts and Bulatao (1999), which assesses the impact of assumptions for each component. This analysis is extended by Caswell and Gassen (2015), who develop a matrix calculus to measure the sensitivity and elasticity of forecast populations to perturbations in assumptions, though the application is to national rather than sub-national projections for Spain.
Variants represent beliefs about alternative futures and are not assigned likelihoods. Three decades ago Nathan Keyfitz emphasized that demographers should be "held responsible for … warning one another and our public what the error of our estimates [of future population] is likely to be" (Keyfitz 1981, p.579). Since then a methodology has been developed for constructing a probability distribution around a preferred projection (Table 1F). Error distributions for future fertility, mortality and migration summary indicators are estimated through one of three approaches: time series analysis (Keilman and Pham 2004), comparison of historical projected populations with later estimated or census populations (e.g. Shaw 2007) and surveys of expert views (Shaw 2008). However, eliciting error distributions from experts can challenge mental capacity when either the number of countries or the number of regions within a country is large. So, Lutz et al. (2014) focus on eliciting the general views of experts about broad trends or scenarios rather than numerical values for parameters to drive probabilistic projections.
Leading indicators are randomly sampled several hundred times from component error distributions and projections generated. The projection outcomes can be described as cumulative probability distributions. Usually the 10th and 90th percentiles are chosen, giving an 80% prediction interval. Probabilistic projection distributions are conditional on the chosen principal projection, which traces a path close to the median of the projection set. Probabilistic projections for national populations have been produced in cross-national projects (Alders et al. 2005, for 18 countries in Europe), in projects by international institutes (Lutz et al. 1996, 2004) for world regions, by the United Nations (UN 2015; Raftery et al. 2012) incorporating fertility and mortality uncertainties, by academic teams (Azose et al. 2016) incorporating fertility, mortality and net international migration uncertainties and by national statistics offices. New Zealand's official demographers construct probabilistic projections using Bayesian methods at national (Statistics New Zealand 2016a), sub-national (Statistics New Zealand 2016b), national by ethnicity (Statistics New Zealand 2015a) and sub-national by ethnicity (Statistics New Zealand 2015b) levels. Wilson and Bell (2007) present a probabilistic projection for the State of Queensland (Australia), which provides a clear guide to methods. Raymer et al. (2012) experiment with different models for representing internal and international migration in a projection of three English super-regions (North, Midlands and South) using probabilistic methods. Sevcikova et al. (2018) use probabilistic methods to forecast sub-national fertilities across a range of countries.
Table 1G describes a method to use historic projection errors in a simpler way. Recent work has focussed on the development of empirical prediction intervals (EPIs) for small, medium and large regions in countries using analysis of historical errors (Smith et al. 2001, 2013; Rayer et al. 2009; Tayman 2011; Wilson 2012). Yamauchi et al. (2017) compare the accuracy of Japanese sub-national projections with those in the USA, Australia and England (two sets). In a later section of the paper we develop the Yamauchi comparison further, prior to using EPIs to evaluate our projections of Thames Water WRZ populations.
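The percentile-based 80% prediction interval used in probabilistic projections can be illustrated with a deliberately simplified sketch. It samples a single annual growth rate rather than separate fertility, mortality and migration indicators, and every number in it (base population, mean and spread of growth, number of runs) is a hypothetical assumption rather than a value from any of the projections discussed here.

```python
import random
import statistics


def simulate_population(base, years, mean_growth, sd_growth, rng):
    """Project a total population forward with a randomly sampled annual growth rate."""
    population = base
    for _ in range(years):
        population *= 1.0 + rng.gauss(mean_growth, sd_growth)
    return population


def prediction_interval_80(outcomes):
    """Return the 10th and 90th percentiles of the simulated outcomes."""
    deciles = statistics.quantiles(outcomes, n=10)  # nine cut points
    return deciles[0], deciles[-1]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
outcomes = [simulate_population(1000.0, 25, 0.007, 0.004, rng)
            for _ in range(500)]

low, high = prediction_interval_80(outcomes)
median = statistics.median(outcomes)
print(f"median={median:.0f}, 80% interval=({low:.0f}, {high:.0f})")
```

By construction, about four fifths of the simulated outcomes fall between the two bounds, and the median plays the role of the principal projection.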
The final evaluation method (Table 1H) concerns providing advice to users about how far in the future projections can be regarded as reliable. Wilson et al. (2018) introduce the concept of the shelf life of a projection, drawing on the "best before" and "use by" dates employed widely in the retail grocery sector. The shelf life is the time interval between the jump-off year and the use-by year, while the display period lasts from the jump-off year to the best-before date. Absolute percentage error (APE) thresholds of 5 and 10% are chosen for the "best before" and "use by" dates. Keilman (2008) offers a preliminary description of how risk functions can be used to judge the benefits and costs of using projection outcomes.
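A rough sketch of how the display period and shelf life might be read off a set of historical errors is given below. It assumes a simplified rule (the longest horizon up to which the absolute percentage error stays within the threshold), which may differ in detail from the procedure in Wilson et al. (2018); the error values by horizon are hypothetical.

```python
def absolute_percentage_error(projected, actual):
    """APE in percent between a projected and an observed population."""
    return abs(projected - actual) / actual * 100.0


def horizon_within_threshold(ape_by_horizon, threshold):
    """Longest horizon (years from jump-off) before the APE first exceeds the threshold."""
    last_ok = 0
    for horizon, ape in sorted(ape_by_horizon.items()):
        if ape > threshold:
            break
        last_ok = horizon
    return last_ok


# Hypothetical typical APE (percent) by projection horizon in years.
ape_by_horizon = {5: 2.1, 10: 4.6, 15: 6.9, 20: 9.4, 25: 12.8}

display_period = horizon_within_threshold(ape_by_horizon, 5.0)   # "best before"
shelf_life = horizon_within_threshold(ape_by_horizon, 10.0)      # "use by"
print(display_period, shelf_life)
```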
This review of evaluation informs our approach to the comparison of alternative projections for the Thames Water study region. Our focus is on "Interpretative Comparison", on "Variants" and on "Empirical Prediction Intervals". We used most of the checks in Wilson's plausibility list in preparing our projections, and they will have been implemented in the official, local government and consultant projections used in the comparison. Ideally, controlled or tested comparisons might have been used but insufficient resources were available to use these methods. Variants were available for two out of the five sets of projections and we examine their results later in the paper. Because a set of empirical prediction intervals based on historic error analysis was available (UKWIR 2015), we use these to gauge uncertainty in our central projections. The shelf life concept is not applied directly but we assess the usefulness of 90-year projections in the discussion section.
Table 2 sets out details of the five sets of projections which are compared in this paper. Note that we use one column for the two ONS sub-national projections because they use virtually the same methodology. They differ only in the way in which internal migration between English local authorities and the other home countries is handled. Each projection produces local authority populations for a sequence of years. The columns of Table 2 identify the organization responsible: the LEEDS team (authors of this paper), the GLA (Greater London Authority) Intelligence team led by Ben Corr with Will Tonkiss providing key software expertise, the Office for National Statistics team led by Andrew Nash and the EDGE (Edge Analytics Ltd) team led by Peter Boden, contracted by Thames Water to produce medium-term projections linked closely to the addition of new properties, both occupied and vacant. Table 2A lists the projections to be compared.
Each organization produces a central projection while LEEDS and GLA also generate variants. Table 2B specifies the geographical units underpinning the projections. The LEEDS, GLA and ONS projections are for all English LADs plus the other home countries of the UK from which results are extracted for LADs covering the Thames Water region (Fig. 1). EDGE generate projections for 80 LADs that cover the wider TW water supply and sewage disposal region. Results for these LADs were extracted from the larger sets and then converted into populations for the six Thames Water WRZs. A Look Down Table (LDT) based on 2011 Census populations is applied to geo-convert LEEDS, GLA and ONS LAD projected populations to WRZ projected populations. The EDGE projections use an LDT based on geo-referenced individual properties (Thames Water 2017). The time horizon differs between projections (Table 2C). The projections adopt a range of future horizons: 25 years for the ONS SNPP projections, 30 years for the EDGE projection, 35 years for the GLA projections, 90 years for the LEEDS projections though we report mainly on information for 50 years. Jump-off years differ between projections from mid-2011 (LEEDS) to mid-2016 (ONS 2018a).  Table 2D sets out the methods used to represent the components of change in the projection models. The base populations are the ONS mid-year estimates for jump-off years, except that the GLA uses its own modified estimates. The LEEDS projections start at mid-year 2011, when mid-year estimates of population by ethnicity were available. The GLA projections jump-off from mid-2015. The ONS SNPP projections use mid-2014 and mid-2016 baseline populations. The EDGE projections use mid-2015 jump-off populations. All population estimates are specified by sex and single year of age.
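The geo-conversion of LAD populations to WRZ populations can be sketched as a weighted sum. The sketch assumes the lookup table holds, for each LAD, the share of its population falling in each WRZ (here imagined as derived from 2011 Census populations); the LAD and WRZ names, shares and populations are all hypothetical, and the actual table structure used by the projection teams is not described in the text.

```python
from collections import defaultdict


def convert_lad_to_wrz(lad_populations, lookup_shares):
    """Allocate each LAD population to WRZs using fixed population shares."""
    wrz_populations = defaultdict(float)
    for lad, population in lad_populations.items():
        for wrz, share in lookup_shares.get(lad, {}).items():
            wrz_populations[wrz] += population * share
    return dict(wrz_populations)


# Hypothetical shares: LAD_B lies partly outside the supply region,
# so its shares sum to less than one.
lookup_shares = {
    "LAD_A": {"WRZ_1": 1.0},
    "LAD_B": {"WRZ_1": 0.25, "WRZ_2": 0.5},
}
lad_populations = {"LAD_A": 200000.0, "LAD_B": 100000.0}

wrz_populations = convert_lad_to_wrz(lad_populations, lookup_shares)
print(wrz_populations)
```

Holding the shares fixed at their base-year values is itself an assumption: it implies that growth within a LAD is spread evenly across the WRZs it straddles.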

Data and Methods Used in the UK Sub-National Projections
All projections employ the cohort-component projection model. Where projections differ is in how internal and international migration are handled. The LEEDS projections use a bi-regional model, in which LAD populations are forecast in pairs, the LAD itself and rest of the United Kingdom. The bi-regional model reduces the number of variables that need estimation compared with the multi-regional model but yields comparable results (Wilson and Bell 2004). In both models, internal migration flows are forecast by multiplying the population of the LAD origin by a forecast rate of outmigration. The GLA and ONS projections both use multi-regional models. The EDGE model uses a cohort-component model, implemented at two levels, COAs and LADs with housing plans changing migration inputs (Thames Water 2017).
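The core of the bi-regional treatment of internal migration, in which flows are obtained by multiplying an origin population by an out-migration rate, can be sketched for a single time step. This is a stripped-down illustration under strong assumptions: it ignores age, sex, ethnicity, births, deaths and international migration, and the populations and rates are hypothetical.

```python
def bi_regional_step(lad_population, rest_population, out_rate, in_rate):
    """Exchange migrants between a LAD and the rest of the UK for one interval.

    out_rate: rate of migration from the LAD to the rest of the UK.
    in_rate:  rate of migration from the rest of the UK to the LAD.
    """
    out_flow = lad_population * out_rate   # origin population x rate
    in_flow = rest_population * in_rate    # origin population x rate
    new_lad = lad_population - out_flow + in_flow
    new_rest = rest_population + out_flow - in_flow
    return new_lad, new_rest


# Hypothetical populations and rates.
lad, rest = bi_regional_step(100_000.0, 60_000_000.0, 0.05, 0.0001)
print(round(lad), round(rest))
```

Because every out-migrant from one region is an in-migrant to the other, the combined population is unchanged by the step, which is the property that makes the paired design internally consistent.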
All projections base their mortality rate assumptions on a combination of ONS national and sub-national estimates, which are computed from registered deaths by age and sex and the corresponding mid-year populations. The LEEDS projections require ethnic specific mortality rates. These must be estimated indirectly because the ethnicity of the deceased is not recorded in the Register of Deaths. The LEEDS ethnic mortality rates are estimated using the geographical distribution of ethnic populations (Rees and Wohland 2008; Rees et al. 2009, 2016a). The variation across ethnic groups in mortality rates is limited. All projections either use or adapt age-specific fertility rates for LADs, estimated by ONS, which are based on birth statistics and mid-year population estimates. The LEEDS projections use rates estimated from a combination of birth statistics for LADs, child-woman ratios by ethnicity from the 2011 Census data and ethnic fertility rates by age estimated from the Labour Force Survey (Norman et al. 2014). The GLA ethnic projections use London Borough ethnic census populations of 0-year olds to compute fertility rates, adjusting to total births by mother's age.
All projections either use or adapt internal migration rates for LADs, estimated by ONS, based on migration origin-destination statistics derived from NHS Register patient records of changes in address. Ethnic specific internal migration rates, required for the LEEDS projections, use commissioned tables from the 2001 and 2011 Censuses and NHS Patient Register migration data for mid-year intervals from 2001-02 to 2010-11. The GLA and ONS projections make use of estimates of internal migration rates for years after the 2011 Census. The EDGE projections use housing plans for LADs in the Thames Water region to adjust internal migration rates to reflect additional in-migrants occupying new dwellings.
All projections either use or adapt ONS estimates of international migration flows to/from LADs. Immigration estimates use flow statistics from the International Passenger Survey (IPS)/Long-Term International Migration (LTIM) at national and regional level and proxy variables from administrative data sets at LAD level. To estimate emigration flows at LAD scale, ONS employs a model with co-variates (e.g. previous immigration flows, internal out-migration rates), constrained to national IPS/LTIM emigration tables. The LEEDS projections make use of published and commissioned 2001 and 2011 Census immigration tables by ethnicity based on citizenship information in the IPS data. Interpolation methods are used to estimate ethnic international migration for mid-year to mid-year intervals between censuses. These LAD level estimates of immigration and emigration are used differently, depending on projection. The LEEDS projections employ immigration and emigration flow assumptions; the GLA projections use emigration rates and immigration flow assumptions. ONS uses net international migration assumptions in the NPP and SNPP 2014-based projections. Experiments by ONS and by the LEEDS team suggest that the choice of method for modelling future international migration can make a substantial difference in population projections.
Table 2E indicates whether LAD level projections are constrained to higher level populations. The LEEDS projections are unconstrained or "bottom-up". The GLA Trend projection is unconstrained, so that the forecast for Greater London is the sum of the London Boroughs projections. The GLA Housing-Led projection is constrained to the GLA Trend projection but only at the Greater London scale. The ONS LAD projections for England are adjusted to sum to the totals for England, derived from the ONS National Population Projections (2014-based or 2016-based).
The EDGE projections use a top-level, housing led LAD model and a bottom level Census Output Area model that links to property information (Thames Water 2017). The Census OA projections are constrained to the LAD projections. Table 2F indicates that all the LEEDS projections use LAD ethnic sub-populations. The GLA only implements ethnic group projections for London Boroughs, adjusted to sum to results from the GLA Trend projection, and not for LADs outside London. Neither ONS nor EDGE produce ethnic population projections.
This review of data and methods used in sub-national projections for England finds similar approaches adopted and a largely common database of population and component estimates. However, some crucial differences are apparent. Only the LEEDS projections use ethnic sub-populations which vary greatly in their growth potential and only the LEEDS projections adopt a bottom-up approach. All other projections constrain results to the ONS England projections. There are also differences in the calibration period used to estimate internal migration rates between LEEDS, GLA and ONS/EDGE projections. Methods of projecting international migration differ between GLA and other projections.
Table 3 describes the component assumptions used in the projections. The approach across all projections is to specify long-term assumptions for national leading indicators for each component and then to trend from currently estimated rates or flows to the long-term assumption. The factors used to scale leading indicators to local scale are assumed constant at their values in the time interval before the mid-year jump-off point. In the UK there has been little investigation of trends in local variation in demographic components. Local areas are assumed to behave in the same way as the national or system population.

Assumptions Used in the UK Sub-National Projections
Mortality trends adopted in the projections follow the ONS 2014-based assumption of an average decline of 1.2% per annum in age-specific mortality rates, based on the average decline between 1914 and 2014. Since 2013 declines in mortality have stalled (ONS 2015; Hiam et al. 2017). In the 2016-based national and sub-national projections (ONS 2018a), the decline is modified, recognizing that mortality rates at the oldest ages have stopped falling. Continuing improvement is assumed for younger ages using a 1.2% decline, but mortality rates from age 65 onwards are assumed to decline more slowly to 2040-41 and resume the 1.2% decline thereafter.
Table 3 (fragment). Component assumptions used in the projections.
Internal Migration: [...] (Table 6); Assumes internal migration rates are constant, with averaging period differing by variant (Table 6); Assumes internal migration rates are constant, using a 5-year period for averaging (Table 6); Assumes internal migration rates are constant, using a 5-year period for averaging (Table 6).
International Migration: High, Mid and Low Variants for immigration and emigration flows assumptions for UK Home Countries factored to LADs (Table 6); Assumes constant emigration and immigration rates linked to ONS SNPP2014, factored to LADs (Table 6); Assumes long-term constant net international migration flows (NIM = +185 k) factored to LADs (Table 6); Assumes long-term constant net international migration flows (NIM = +165 k) factored to LADs (Table 6).
EDGE Assumptions: Housing plans assembled from LAD documents and communications are used to modify total migration flows via a model. Otherwise EDGE projections follow ONS assumptions for fertility, mortality and international migration. See Table 2.
Fertility rate assumptions in the LEEDS, ONS 2014 and EDGE projections are based on the ONS NPP 2014 long-term assumption of a total fertility rate (TFR) of 1.90 (children per woman) for England and 1.89 for the UK.
The ONS assumption is based on a careful analysis of cohort fertility rates (completed number of children ever born), which have been less volatile over time than the period TFR. It is assumed that the tempo shift of the two previous decades, when women postponed births in their twenties only to later bear children in their thirties, has ended. In the ONS 2016-based projection UK long-term fertility was assumed to be 1.85 children per woman, down from the 2014 assumption of 1.89. The LEEDS projections use fertility rates for ethnic groups. The UK total fertility rates (TFRs) in 2011 for the groups comprising the "South Asian" ethnic grouping were 2.20 for Indians, 3.20 for Pakistanis and 3.47 for Bangladeshis, compared with a TFR of 1.83 for the "Other Ethnic" grouping. These high rates for South Asians are coupled with a current youthful age structure, leading to substantially higher growth than for the White British and Irish majority and the other minority ethnic groups. After adjustment to the ONS long-term assumption, factoring to LADs and allowance for a short-term trend, the age-specific fertility rates are held constant. In the GLA ethnic projections, ethnic specific fertility rates are also used but their effect is suppressed by the adjustment of populations by ethnicity to the total population constraints of the GLA Trend projections.
Internal migration involves both origin and destination regions. To take this into account a different approach to assumption setting is used. Internal migration is a redistributor of populations whose size is largely determined by the current national population age structure, natural increase components and international migration. Lomax and Stillwell (2017) and Stillwell et al. (2017) showed that the redistribution effected by internal migration differed between the start, middle and end of the 2001 to 2011 decade, especially in the Greater South East.
GLA have established through their analyses that the level of out-migration from London and in-migration to the Outer Thames Water region differs considerably over time, depending on the state of the economic cycle. GLA proposed variant projections that averaged internal migration rates over different time periods. The first was over a short-term period, heavily influenced by the 2008-09 Global Financial Crisis, which reduced out-migration from Greater London to the Rest of the South East. The second was a longer-term period which covered the boom of the early 2000s, the recession and the recovery to the present. Table 4 lists the periods over which internal out-migration transition rates were averaged and then introduced in the central (ONS, EDGE) and variant (GLA, LEEDS) projections. The average rates are assumed constant from the jump-off year for the rest of the forecast period. In the rightmost column of Table 4A, we indicate the likely impact of the exchanges of migrants between Greater London and the South East.
Table 4B presents a summary of the international migration assumptions used in the ONS and LEEDS projections. GLA and EDGE use the ONS 2014-based National Population Projections assumptions. The ONS sub-national assumptions are based on the 2014-based National Population Principal projections (NPP2014). The long-term assumption was set as a net international migration total of 185 thousand migrants per year. We estimate that this net balance is associated with flows of 519 thousand immigrants and 334 thousand emigrants. In the ONS 2016-based National Population Projections, the net international balance is assumed to decline to a lower long-term constant of +165 thousand per year, anticipating reduced immigration from the European Union revealed in the mid-2016 to mid-2017 estimates.
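The effect of the averaging period on the constant internal migration rate carried into a projection, and the arithmetic linking gross and net international flows, can be sketched as follows. The annual out-migration rates are hypothetical values chosen only to mimic a dip around the financial crisis; the 519 and 334 thousand flows are the figures quoted above.

```python
from statistics import mean


def average_rate(rates_by_year, start_year, end_year):
    """Mean out-migration rate over an inclusive calibration period."""
    selected = [rate for year, rate in rates_by_year.items()
                if start_year <= year <= end_year]
    return mean(selected)


# Hypothetical out-migration rates from London (keyed by interval start year).
rates_by_year = {
    2004: 0.040, 2005: 0.041, 2006: 0.042, 2007: 0.041,
    2008: 0.036, 2009: 0.033, 2010: 0.034, 2011: 0.035, 2012: 0.036,
}

short_period_rate = average_rate(rates_by_year, 2008, 2012)  # recession-dominated
long_period_rate = average_rate(rates_by_year, 2004, 2012)   # boom, recession, recovery
print(f"short={short_period_rate:.4f}, long={long_period_rate:.4f}")

# Net international migration as the balance of gross flows (thousands per year).
immigration, emigration = 519, 334
net_international_migration = immigration - emigration
print(net_international_migration)
```

In this illustration the longer calibration period yields the higher constant rate, which is the mechanism by which the choice of period shifts projected population from London to the surrounding region.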
The LEEDS projections adopt three variants for future international migration flows. The LEEDS HIGH variant is the product of logistic models fitted to time series (1991 to 2015) of immigration and emigration flows. The logistic asymptote generates a long-term level of immigration of 617 thousand immigrants per year to the UK and a long- [...]
The LEEDS LOW variant sets assumptions through analysis of a time series of international migration, using the citizenship data in the International Passenger Survey, which classifies international migrants as British, European Union or Non-European Union citizens. It is assumed that the downward trend observed between mid-2009 and mid-2015 by citizenship for non-EU immigrants and emigrants will also apply to EU citizens post-Brexit, from 2019-20 onwards. In this variant, the long-term limit is set at immigration and emigration levels equivalent to net international migration of 100,000 per year. However, because emigration declines at the same time as immigration, this level is not reached until 2079-80.
Table 5 converts the 2039 projected populations into time series indices (2011 = 100) to compare WRZ populations, small and large, using the same metric. The ordering of the projections differs by WRZ. For the Thames Water region, the ONS 2016-based projected populations are the lowest and 8% below those of the ONS 2014-based projections. These two projections share virtually the same methods, so the differences are due to shifts downwards in assumptions about fertility, survival and net immigration. Determining the relative contributions would require controlled or tested comparisons (Table 1B, C). The LEEDS MID projection produces the highest growth for the Thames Water region, based on high growth for the London and Slough-Wycombe-Aylesbury WRZs.
We argue later in the paper that this reflects the growth potential of ethnic minority populations, which are highest in the UK capital and in the zone including the industrial city of Slough, with its high South Asian population share. For Guildford and Swindon-Oxfordshire the EDGE projections report the highest growth. For Henley and Kennet Valley all projections fall within an 8% range (maximum less minimum); for Guildford, Slough-Wycombe-Aylesbury and Swindon-Oxfordshire the range is 20 or 21%; London experiences the greatest range at 25%.
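The index and range measures used in this comparison can be sketched as follows. The populations are hypothetical (they are not Table 5 values), and the helper names `to_index` and `index_range` are assumptions introduced for illustration; the sketch only shows the arithmetic of rebasing to 2011 = 100 and taking the maximum less the minimum across projections.

```python
def to_index(series, base_year=2011):
    """Rebase a {year: population} series so that the base year equals 100."""
    base = series[base_year]
    return {year: 100.0 * pop / base for year, pop in series.items()}

def index_range(indices_by_projection, year):
    """Maximum less minimum of the index values across projections in one year."""
    values = [idx[year] for idx in indices_by_projection.values()]
    return max(values) - min(values)

# Hypothetical populations (thousands) for one zone under three projections.
projections = {
    "A": {2011: 200.0, 2039: 230.0},
    "B": {2011: 200.0, 2039: 246.0},
    "C": {2011: 200.0, 2039: 238.0},
}

indices = {name: to_index(series) for name, series in projections.items()}
spread = index_range(indices, 2039)

print(indices["A"][2039], indices["B"][2039], indices["C"][2039], spread)
```

Because every series is rebased to the same year, zones of very different size can be compared on one scale, which is the purpose the text gives for the index.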

Results: Interpretative Comparisons of the Thames Water Projections
For the Guildford WRZ, the main contrast is between the EDGE Housing-Led and the other three projections. The higher projections are the result of housing developments planned in LADs contributing to the Guildford WRZ. For the Henley WRZ, the GLA and LEEDS projected populations were considerably higher than in the ONS or EDGE projections. This is likely to be a result of the different internal migration averaging periods used. Both GLA and LEEDS projections include internal migration rates from years prior to the Global Financial Crisis, when out-migration from London was higher than in the recession years included in ONS's averaging period. In the Kennet Valley WRZ, the EDGE and LEEDS projections move in parallel, while the GLA and ONS projections are lower. The EDGE growth is driven by new housing starts, while the LEEDS growth is driven by a combination of favourable internal migration rates and an increasing ethnic minority population, particularly in Reading. For the Slough-Wycombe-Aylesbury WRZ, the LEEDS MID projected populations are much higher than in the GLA, ONS and EDGE projections, paralleling the outcome in the London WRZ (Table 7). The outcome for this WRZ is different from other WRZs outside London because of the high South Asian population share in Slough LAD (Table 6). The Swindon-Oxfordshire WRZ shows the same pattern of population increase across the projections as the Guildford WRZ, where the EDGE projections are higher than the others. This is a highly desirable migration destination, reflected in the housing plans that drive the EDGE projections.
To examine the differences between variant projections the results are graphed for Greater London (Figs. 3 and 4). Greater London covers 32 London Boroughs and the City of London; the London WRZ includes 29 London Boroughs and parts of other LADs to the south and north (Fig. 1). The differences between projections for Greater London are substantial. The LEEDS projections generate almost twice as much growth by 2050 as do the GLA projections (Fig. 3). The main reason is that the LEEDS forecast uses London Borough and LAD populations disaggregated by ethnicity. Ethnic minority groups are growing much faster than the White British and Irish "host" population. Work by the LEEDS team since 2008 using both 2001-based and 2011-based ethnic projections (Rees et al. 2011, 2013, 2016a) has shown that ethnic minority populations are growing very fast. London is one of the most diverse world metropolises. In 2011, many London Boroughs had "minority-majority" populations. While the GLA does produce ethnic population projections for Greater London, the results are constrained to the GLA Trend projections and fail to reflect fully the effect of this heterogeneity on population growth.
The share of the Thames Water … Table 7, reporting on both South Asian and Other Ethnic populations. Against a background of 85% growth in the Thames Water region's population, the highest growths are projected for the Slough-Wycombe-Aylesbury and London WRZs, which will experience increases of 123 and 93% respectively. In all WRZs, the South Asian population is expected to more than triple (a 325% increase), while the Other Ethnic population only increases by 61%. The South Asian population increases more in the WRZs outside London than in the London WRZ, indicating that internal migration redistributes this sub-population outwards. Both the London and Slough-Wycombe-Aylesbury WRZs gain share of the Thames Water region population, by 3.2 and 7.0% respectively. 
These long-term results show the importance of including ethnic heterogeneity in LAD and WRZ projections. How should the user of the projections, Thames Water, cope with this diversity of results? The first coping mechanism would be to plan for all the eventualities embedded in the five competing projections, using the maximum and minimum projected populations as set out in the bottom panel of Table 5. However, the sample of projections is very small, so the full range of possible outcomes may not be catered for. For Greater London, the set of projections was extended by including variant projections produced by LEEDS and by GLA (Fig. 3). This extends the range across the projections by adding LEEDS HIGH and LOW variants. Note that despite the Brexit Referendum result, the UK's international migration balance was still +280 thousand in 2017, with reduced immigration from the European Union compensated by increased immigration from outside the EU (ONS 2018a). The LEEDS HIGH projection adds 12% to total population growth compared with the LEEDS MID forecast, while the LEEDS LOW reduces growth by 3%. The GLA Short Term forecast adds 4% to the GLA Trend, while the GLA Long Term forecast reduces growth by 3%. Contrast these differences with the 34% difference between the LEEDS MID and GLA Trend, due to the ethnic heterogeneity built into the LEEDS projection.

Empirical Prediction Intervals Applied to the Thames Water Projections
The review of evaluation methods earlier in the paper described a growing body of evidence about historical errors in population projections (Table 1G), reported as Absolute Percentage Errors (APEs) (Smith et al. 2001, summarised in Tayman 2011; Yamauchi et al. 2017), as Empirical Prediction Intervals (EPIs) (UKWIR 2015), or as both. In Table 8, we bring together the APE or EPI estimates for a sample of countries which have been studied. The APEs have been converted into EPIs or vice versa, for consistent comparison, assuming the errors are normally distributed. In each country EPIs decrease systematically with population size. Most authors state that this decrease applies to the smaller populations and that above a threshold EPIs are constant, although Table 8F suggests a negative exponential function gives a better fit. As Yamauchi et al. (2017) comment, EPIs are highest in the USA, moderate in Australia, lower in England and lowest in Japan. A second set of EPIs is reported for England (Table 8D), from Simpson et al. (2018), which are broadly similar to the first set (UKWIR 2015), though the authors consider variation by LAD type more important than variation by size, with higher EPIs found in London Boroughs. These differences between countries are associated with international differences in internal migration intensity (Bell et al. 2015, 2018; Rees et al. 2016c) and the degree to which population change is driven by international migration. All studies of EPIs find they systematically increase with time horizon. Table 8C reports the UK Water Industry Research (UKWIR) empirical prediction intervals, which are based on an analysis of historical errors in ONS sub-national projections. For the current study, 80% EPIs are interpolated between or extrapolated beyond the LAD, County and Region tables (UKWIR 2015) using WRZ 2011 populations. The 10th and 90th empirical prediction values, based on the UKWIR (2015) tables, are plotted for the WRZs together with the variant projections (Fig. 4). 
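One way to make the conversion between APEs and EPIs concrete is sketched below. It is a minimal illustration under assumptions the source does not spell out: the errors are taken to be unbiased and normally distributed, and the APE is treated as a mean absolute percentage error. Under those assumptions the mean absolute error equals sigma times the square root of 2/pi, and an 80% interval has a half-width of about 1.28 sigma. The function names are hypothetical and the input value is illustrative, not a Table 8 entry.

```python
import math
from statistics import NormalDist

def mape_to_epi_half_width(mape, coverage=0.80):
    """Half-width (in % points) of a central interval implied by a mean APE.

    Assumes zero-mean normal errors, for which E|error| = sigma * sqrt(2 / pi).
    """
    sigma = mape * math.sqrt(math.pi / 2.0)
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)  # about 1.2816 for 80%
    return z * sigma

def epi_half_width_to_mape(half_width, coverage=0.80):
    """Inverse conversion under the same normality assumption."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    sigma = half_width / z
    return sigma * math.sqrt(2.0 / math.pi)

# Illustrative value only: a mean APE of 5% at some projection horizon.
half_width = mape_to_epi_half_width(5.0)
round_trip = epi_half_width_to_mape(half_width)

print(round(half_width, 2), round(round_trip, 2))
```

If the published APEs are medians rather than means, the constant linking APE to sigma differs, so the sketch should be adapted to whichever statistic a given study reports.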
The variant projections, for the most part, fall well within the 80% empirical prediction range. In two cases, London and Slough-Wycombe-Aylesbury, the HIGH variant populations are close to the 90% EPI and the LOW variant populations pass below the 10% EPI. These are the two WRZs with the highest growth under the MID scenario, which we have associated with the boost to population growth from ethnically diverse populations. Under the LOW variant, international migration flows decline to a limit of 100,000 in the second half of the projection horizon. However, the EPIs are based on sub-national population projections that do not include ethnic heterogeneity, and for these WRZs the 80% EPI may be under-estimated.
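The interpolation of EPIs by population size and the check of variants against the resulting bounds can be sketched as follows. This is an illustration under stated assumptions, not the procedure documented in UKWIR (2015): the lookup table is hypothetical, interpolation is piecewise linear in population with the end segments extended for extrapolation, and the interval is applied symmetrically, in percentage terms, around a central projection.

```python
def interpolate(x, points):
    """Piecewise-linear interpolation through (x, y) points.

    Values of x outside the table are extrapolated along the nearest end segment.
    Requires at least two points.
    """
    points = sorted(points)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            break
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def within_interval(value, central, half_width_pct):
    """True if value lies inside central * (1 +/- half_width_pct / 100)."""
    lower = central * (1.0 - half_width_pct / 100.0)
    upper = central * (1.0 + half_width_pct / 100.0)
    return lower <= value <= upper

# Hypothetical table: (base population, 80% EPI half-width in %) at one horizon.
epi_table = [(100_000, 20.0), (500_000, 14.0), (1_000_000, 12.0)]

half_width = interpolate(300_000, epi_table)

# Hypothetical projected indices (2011 = 100) for one zone.
central, high_variant, low_variant = 130.0, 150.0, 105.0
high_inside = within_interval(high_variant, central, half_width)
low_inside = within_interval(low_variant, central, half_width)

print(half_width, high_inside, low_inside)
```

In this toy case the high variant sits just inside the upper bound and the low variant falls below the lower bound, mirroring the pattern the text reports for the two fastest-growing WRZs.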

Discussion and Conclusions
During this applied demography project, both Thames Water managers and Water Resource Forum stakeholders challenged our methods and assumptions, in far more depth and detail than usually occurs in academic meetings. It is useful to report here on the questions posed and our responses. The question was asked: "Why are the LEEDS MID projections higher than the projections by other organizations?" The following explanation was offered. The LEEDS MID projections include LAD ethnic minority populations, which grow much faster than the White British and Irish (WBI) majority population. The reasons for this faster growth differ across ethnic minority groups (Rees et al. 2012). The Indian, Pakistani and Bangladeshi (South Asian) grouping, one of the sub-populations used in the Water Demand projections, is assumed, based on 2001 and 2011 estimates developed by the Leeds team, to have fertility rates higher than the WBI group and to continue to add population through net international migration gains. The South Asian populations are younger than the WBI population, and their age distribution is highly concentrated in the family building ages. So, they are the fastest growing groups. Mixed groups grow because of their younger age structure, while, for other groups such as the Chinese or White Other (which includes EU immigrants), the main driver is immigration. Ethnic minority populations are concentrated in the London WRZ and in the Slough-Wycombe-Aylesbury WRZ. Other WRZs have much smaller ethnic minority populations and so grow at a slower rate. Note that this effect only occurs where a bottom-up approach to projection is adopted, which assumes that the populations of LADs or WRZs are the sum of their ethnic group populations. GLA adopts a top-down approach and adjusts ethnic group projected populations to sum to the all-group London Borough forecast populations. 
ONS and EDGE forecast the total population with no ethnic breakdown and so both projections miss out on the boost from growing ethnic minorities.
There was surprise among managers and stakeholders about the high growth in the population of the Slough-Wycombe-Aylesbury (SWA) WRZ compared with other WRZs outside London. The SWA WRZ has growth comparable to that of the London WRZ. This WRZ has a high concentration of South Asian ethnic minority groups. In Slough LAD, ethnic minorities constitute a majority of the population. These groups have above average fertility (especially Pakistani families), younger age profiles and continuing immigration through family unification and out-marriage.
We were asked: "Why does growth in population slow down in the last thirty years of the forecast?" The slowing down in the growth of population in the last thirty years of the twenty-first century (Table 7) is a product of the relationship between constant international migration assumptions and decreases in natural increase. The demographic slowdown derives from the assumptions adopted for natural increase, international migration and internal migration. In the long run (e.g. to 2101), an assumption that the national TFR remains constant at only 1.89 (ONS SNPP 2014) or 1.85 (ONS SNPP 2016) will lead to a decrease in population, because a TFR of 2.07 is required for a population to reproduce itself. Following ONS practice, for the LEEDS MID projection we assume in the long run after the 2020s that the balance of immigration and emigration flows remains constant. In the first decades of the projection, the net immigration gain will more than compensate for the natural decrease associated with the ageing of baby boomers into the high mortality age bands. However, late in the century natural increase will become natural decrease because of ageing of the population: a constant net international migration will no longer compensate for the decrease. This is, however, an argument about the national population. To explain effects at local scale we must consider internal migration, which redistributes populations and population change between local areas. For internal migration, we assume that the LAD ethnic group out-migration rates remain constant. Out-migration flows from Greater London will therefore increase as Greater London's population grows, eventually cancelling out the fixed net gains from international migration. On the other hand, WRZs outside London will experience gains through this out-migration from London, which continue to grow, alongside smaller, constant gains from international migration.
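The internal migration part of this argument can be illustrated with a deliberately simplified two-region sketch. It is not the LEEDS model: natural change, age structure and ethnicity are ignored, the out-migration rate is treated as a net rate from London to the rest of the region, and all numbers are hypothetical. The point is only that a constant rate applied to a growing population produces a growing outflow, while the international gain stays fixed.

```python
def project(london, rest, years, intl_london, intl_rest, out_rate):
    """Toy projection with fixed international gains and a constant out-migration rate.

    Each year London loses out_rate * population to the rest of the region
    and both areas receive a fixed net international gain.
    """
    path = [(london, rest)]
    for _ in range(years):
        outflow = out_rate * london
        london = london + intl_london - outflow
        rest = rest + intl_rest + outflow
        path.append((london, rest))
    return path

# Hypothetical inputs, in thousands of people.
path = project(london=8000.0, rest=2000.0, years=80,
               intl_london=100.0, intl_rest=10.0, out_rate=0.01)

# Annual change in the first and last projection years for each area.
london_first = path[1][0] - path[0][0]
london_last = path[-1][0] - path[-2][0]
rest_first = path[1][1] - path[0][1]
rest_last = path[-1][1] - path[-2][1]

print(round(london_first, 1), round(london_last, 1),
      round(rest_first, 1), round(rest_last, 1))
```

In this sketch London's annual growth shrinks towards zero as its outflow approaches the fixed international gain, while annual growth in the rest of the region rises, which is the qualitative pattern described in the text.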
We were asked the almost impossible question "What about the impact of Brexit on the future populations of the Thames Water region?" In the LEEDS MID forecast, we adopt the ONS NPP2014 UK assumptions for international migration, which are factored to the LADs and ethnic groups that cover the Thames Water region. The ONS 2014-based long-term net international migration assumption of +185 thousand per annum is still below recent levels (+248 thousand in 2016) and so has an element of Brexit effect built in. We have also carried out a forecast (LEEDS LOW) which assumes a decline to a net international migration target of 100 thousand per year for the UK, but this target is only achieved in the last two decades of the projection. In summary, we suggested that the LEEDS HIGH projection might indicate a "soft Brexit", the LEEDS MID projection a "moderate Brexit" and the LEEDS LOW forecast a "hard Brexit" with lower and lower immigration over time. Similar views were put forward by Werpachowska and Werpachowski (2017) in a projection of England's population by ethnicity using micro-simulation methods.
In dialogue with Thames Water managers and stakeholders, we were asked to outline the arguments against and for adopting the LEEDS projections as the basis for long-term water demand forecasts and to make some final judgements. The arguments against the LEEDS projections include the following. We cannot measure the ethnic specific demographic rates and flows with sufficient accuracy to justify expanding the population groups to include ethnicity. Therefore, we must adopt a conservative approach and not introduce such heterogeneity into the projections. These arguments can be rebutted. The LEEDS projections capture vital heterogeneity due to ethnicity in the demographic dynamics. Substantial effort has been made to improve the quality of estimates of ethnic specific demographic rates between a 2001-based (Rees et al. 2011, 2012) and a 2011-based (Rees et al. 2016a, 2016b) set of ethnic population projections. For example, we changed our approach to estimating ethnic mortality rates, taking cognizance of findings by other researchers regarding the healthy migrant effect. Although there is still uncertainty in many ethnic specific fertility rates, 2011 Census data on child populations indicate that the fertility rates for the South Asian ethnic minority grouping are much higher than the average. To estimate rates of internal migration by ethnicity in the 2011 projections, use was made of special tabulations from two censuses and improved internal migration estimates. The youthful age structures of ethnic minority populations were revealed in the results of the 2011 Census. These age structures imply a large demographic momentum effect, which will be largely independent of policy. The published immigration statistics (despite their uncertainty) confirm that the boost to growth through immigration to all minority ethnic groups will be substantial (Rees et al. 2013). 
Note that the White British and Irish majority loses population through emigration being higher than immigration. Of course, there will be considerable uncertainty about the level of international migration flows because of Brexit, but we argue that, if the UK economy performs well, immigrants from the EU who have had the freedom to move to the UK will be converted into migrants with work permits and indefinite leaves to remain, because of the need for employers to recruit labour from outside the UK to maintain vital production and vital services.
There are many other scenarios or variants that we could have implemented, e.g. HIGH, MID and LOW on the fertility, mortality, internal migration and international migration components, leading to 81 possible variants. This would have been an extension of what ONS do at the national level to the sub-national level. ONS have been lobbied by users of sub-national projections to implement sub-national variants but have not proceeded as yet, citing the level of resource needed. Scotland (NRS 2018) carries out variant projections and New Zealand implements probabilistic projections (Statistics New Zealand 2015a, 2015b, 2016a, 2016b) for national, sub-national and ethnic populations. Clearly, a full analysis of the sensitivity of future sub-national populations to methods and assumptions is needed to improve their utility. One implication of our evaluation of projected sub-national populations is that future plans to increase supply of utilities to consumers must be flexible in timing. Supply improvements should be timetabled in a sequence of projects that could be brought forward or postponed depending on future forecasts of water demand. The paper evaluates the authors' population projections required by a large public utility against alternatives. However, the findings have lessons beyond the case study. There is a need to plan comparisons at the specification stage of a project. Ideally, the plan should include running case study assumptions with the models used in comparators to discover how important the effect of model specification is (Wilson and Bell 2004). Similarly, comparator assumptions should be run on the case study model to discover the impact of differences. Such comparisons are routinely made in implementing variant projections. Variants are usefully organised in a schema (Bongaarts and Bulatao 1999), adapted for sub-national ethnic population projections by Rees et al. 
(2013) to determine the contributions of component assumptions to population change. Drawing on studies of the errors observed in past projections by component, probabilistic projections can be run to determine whether comparator projections fall within the 80% prediction interval. Or empirical prediction intervals can be projected over space and time and used to ascertain the uncertainty of the projection. This is a formidable list of analyses but one necessary to answer the questions of users. A final lesson learnt in this work is that applied research work for an external client can contribute valuable challenges to academic researchers and push us beyond our comfort zones.
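The count of 81 possible variants mentioned above follows from crossing three levels (HIGH, MID, LOW) with four components of change. A minimal sketch of enumerating such a design is given below; the component and level labels are taken from the text, while the dictionary layout is an assumption made for illustration.

```python
from itertools import product

LEVELS = ("HIGH", "MID", "LOW")
COMPONENTS = ("fertility", "mortality", "internal_migration", "international_migration")

# One dictionary per variant, mapping each component to its assumed level.
variants = [dict(zip(COMPONENTS, combo))
            for combo in product(LEVELS, repeat=len(COMPONENTS))]

# The all-MID combination corresponds to a central projection.
central = {component: "MID" for component in COMPONENTS}

print(len(variants), central in variants)
```

Running every combination at sub-national scale would multiply the computing and checking effort by the same factor, which is consistent with the resource concern attributed to ONS.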