University of Birmingham Understanding high-emitting households in the UK through a cluster analysis

Anthropogenic climate change is a global problem that affects every country and each individual. It is largely caused by human beings emitting greenhouse gases into the atmosphere. In general, a small percentage of the population is responsible for a large amount of emissions. This paper focuses on high emitters and their CO2 emissions from energy use in UK homes. It applies a cluster approach, aiming to identify whether the high emitters comprise clusters where households in each cluster share similar characteristics but are different from the others. The data are mainly based on the Living Cost and Food survey in the UK. The results show that after equivalising both household emissions and income, the high emitters can be clustered into six groups which share similar characteristics within each group, but are different from the others in terms of income, age, household composition, category and size of the dwelling, and tenure type. The clustering results indicate that various combinations of socioeconomic factors, such as a low-income single woman living in a property with at least six rooms, or a high-income retired couple owning a large detached house, could all lead to high CO2 emissions from energy use at home. Policymakers should target each high-emitter cluster differently to reduce CO2 emissions from energy consumption at home more effectively.


Introduction
Climate change has various impacts on water, food, industry, health, ecosystems and coastal systems [1]. To avoid the 'dangerous' impacts of climate change, parties to the United Nations Framework Convention on Climate Change (UNFCCC) came to an international agreement, the Paris Agreement, on 'holding the increase in global average temperature to well below 2°C above pre-industrial levels' [2]. The UK introduced its own carbon budgets and stated its commitment in the Climate Change Act 2008 to achieve an 80% emission reduction by 2050 compared to the 1990 baseline [3]. To achieve these emission reduction targets, the first five UK carbon budgets covering the period from 2008 to 2032 have been set in law [4][5][6]. Anderson [7] and Pye et al. [8] have argued that the Paris Agreement requires more radical and rapid emission reductions than the UK targets.
The energy used in homes accounted for around 27% of territorial-based CO2 emissions in the UK in 2016 categorised by end users, while the industry and transport sectors were responsible for 30% and 36% of the total UK CO2 emissions respectively [9]. Due to the switch from solid fuel to gas in electricity generation and reduced solid fuel use in homes, the CO2 emissions from UK household energy use decreased by 35% between 1990 and 2016 [9]. However, the total energy consumption by households increased by 1% in 2016 compared to the 1990 level, measured by tonnes of oil equivalent [10]. To reduce CO2 emissions from energy use at home, the UK launched Feed-in Tariffs (FIT) in April 2010, which supported households, businesses and other organisations to generate electricity from renewable sources. Households who participate in the FIT are paid by their energy suppliers with both a generation tariff for each unit of energy they generate from renewable sources and an export tariff for each unit that feeds back into the grid [11]. Furthermore, to improve energy efficiency at home, the UK government launched the Green Deal and the Energy Company Obligation (ECO), which required energy companies to provide heating and insulation improvements to households [12][13][14]. The Green Deal allowed households to repay the cost of installed insulation through their saved energy bills [13,14], although participation was low, partly because of uncertainties about the energy savings that could be achieved and about the resale value of the house given the Green Deal loan attached to the property [15,16]. The ECO was also launched early in 2013 to provide additional support, especially for vulnerable households and hard-to-treat homes; it placed legal obligations on energy suppliers to deliver energy efficiency measures to residential energy users [12].
As discrepancies in terms of wealth, well-being and emissions exist at an intranational level, it is important to focus on the high-emitting households who could have a larger potential for reducing their energy consumption and CO2 emissions than others. Previous research did not explore different groups among high emitters, or how emission reduction policies could target each group differently.
This paper aims to identify particular groups within society who are likely to be high emitters from their energy consumption at home. This will facilitate more targeted policies for reducing household CO2 emissions in the UK. Cluster analysis is therefore applied to explore the make-up of the high-emitting household group in terms of socioeconomic factors and dwelling-related characteristics. After the introduction in Section 1, Section 2 draws on the literature related to the distribution of UK household emissions, and outlines the research gap that this paper addresses. Section 3 explains the data used to identify and explore high emitters, as well as the cluster method that is used for the analysis. After presenting the clustering results in Section 4 with further discussions in Section 5, it concludes by considering the importance of the analysis for targeted emission reduction policies among households in the UK in Section 6.

Previous work on socioeconomic factors and household energy consumption
The CO2 emissions produced by households could vary significantly [32,33,43]. Chancel and Piketty [32] have estimated that the top 10% of emitters account for around 45% of direct and embedded greenhouse gas (GHG) emissions globally. Likewise, Oxfam [33] has estimated that the total share of CO2 emissions from the top 10% of high-income people is approximately 50%. Previous literature for the UK [17][18][19][20] and other countries [21][22][23][24][25][26][27][35][36][37][38] has shown that household emissions are related to a variety of socioeconomic factors and dwelling-related characteristics. For example, there is a negative link between CO2 emissions per capita and household size (the number of people in a household), and a positive link between CO2 emissions per household and household size [21][22][23]. This negative link between CO2 emissions per capita and household size could be caused by economies of scale at the household level, as members of the same household could share gas and electricity use for space heating, cooking, lighting and the utilization of various appliances at home most of the time [24]. The age of household members also matters [20]. For example, older people might require more energy for space heating at home due to their poor health [25].
Household income is another key variable discussed in the literature as it influences both emissions from direct energy use at home, and emissions embedded in consumed products and services [26][27][28][37]. Household composition influences the economic resources available to a household measured by income, due to economies of scale [44]. The reason for this is that the members of the same household not only share the energy used for performing different practices, but also share other products and services such as furniture, cookware, cars, Internet service, TV licence, and so on. To address the influence of economies of scale, the disposable household income (gross income after taxes and benefits) is equivalised in this paper with the commonly used Organisation for Economic Co-operation and Development (OECD) [44] modified scales (Table 1).
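As a rough illustration of income equivalisation, the sketch below applies the OECD-modified scale as it is commonly stated (1.0 for the first adult, 0.5 for each additional household member aged 14 or over, and 0.3 for each child under 14). The values in Table 1 take precedence if they differ. The function names and the example households are hypothetical.

```python
def oecd_modified_factor(n_aged_14_plus, n_under_14):
    """Equivalisation factor under the OECD-modified scale (assumed weights)."""
    if n_aged_14_plus < 1:
        raise ValueError("a household needs at least one member aged 14 or over")
    # First adult counts 1.0, every further person aged 14+ counts 0.5,
    # every child under 14 counts 0.3.
    return 1.0 + 0.5 * (n_aged_14_plus - 1) + 0.3 * n_under_14


def equivalise_income(disposable_income, n_aged_14_plus, n_under_14):
    """Divide disposable household income by the household's factor."""
    return disposable_income / oecd_modified_factor(n_aged_14_plus, n_under_14)


# Hypothetical households: the income figures are illustrative only.
single_adult = equivalise_income(400.0, 1, 0)
couple_two_children = equivalise_income(840.0, 2, 2)
print(single_adult, couple_two_children)
```

Under these assumed weights, a couple with two young children needs roughly 2.1 times the income of a single adult to reach the same equivalised income, which is why the two example households end up at about the same level.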
Studies have identified that in general the larger the floor area of the dwelling, the more energy will be required for space heating and other energy use, assuming other conditions of the dwelling are similar [25,26,30,31,36]. Space and water heating accounted for around 81% of total energy use at home in the UK between 1990 and 2013 [10]. For this reason, the size of dwelling plays an important role in overall household energy consumption and related CO2 emissions, especially during the winter, due to the energy required for space heating. The electricity use for lights and appliances is also positively correlated with the size of the dwelling [29,30,45], as there may be more appliances in larger houses that are used in everyday life, than in an average-sized dwelling.
The type of dwelling, whether it is a detached, semi-detached or terraced house, or a bungalow or flat, also influences energy use and the related CO2 emissions, largely due to the diverse amount of energy required for space heating across these different types of dwellings [46,47]. In addition to energy use for space heating, electricity use for lighting and appliances is also influenced by the type of dwelling [20,45]. Büchs and Schnepf [20] [46]. The shared insulation and heat between dwellings also reduce the heat loss among semi-detached houses, terraced houses and flats, compared with detached dwellings [46]. The mean heat loss for different dwelling types in the UK is presented in Table 2, which shows the mean heat loss ranges from 167 W/°C for a flat, up to 342 W/°C for a detached house [46].
Tenure type could also influence the energy consumption at home [36]. In general, private rented domestic buildings have relatively low thermal insulation installed and require more energy for space heating, due largely to the 'tenant-landlord problem' [17]. The 'tenant-landlord problem' refers to the mismatch between landlords who pay the cost of insulation and tenants who receive the benefits [17]. As a financial incentive, the UK government introduced the Landlord's Energy Saving Allowance (LESA) between April 2004 and April 2015, which provided grants to landlords for upfront payments of various energy efficiency measures such as loft and cavity wall insulation, solid wall insulation, draft-proofing and floor insulation [39]. Although results from previous studies [17,20,30,45] indicate that high energy users and high emitters are likely to own a house that is not only large, but also detached, these studies have not explicitly explored whether some high emitters do not own large detached houses; and if this is the case, what other factors could collectively lead to their high energy use level.
In particular, the household income, age of householders, and category of dwelling have been identified as the main influencing factors on household energy expenditure in North Carolina in the US [38]. Likewise in the UK, Palmer et al. [29] have focused only on electricity consumption among households, excluding electric heating and electric showers. They found that 85% of the high electricity-consuming households in their study (39 out of 46 households that belong to the top 20% of electricity users) have at least one key contributing factor that could lead to their higher electricity consumption. The key contributing factors are: at least three people living in the dwelling, the dwelling size being larger than 130 m², the age of the Household Reference Person (HRP; see note 1) being between 45 and 54, the HRP being unemployed but not retired, and householders belonging to the professional and managerial socioeconomic group (see note 2). While Palmer et al. [29] have estimated the average value of each of the socioeconomic factors and dwelling-related characteristics among high electricity-consuming households, they have not identified whether different combinations of these factors are more likely to lead to high electricity use collectively.
Overall, previous studies have identified socioeconomic factors and dwelling-related characteristics that influence energy consumption and related CO2 emissions. However, they have not explored whether various combinations of the factors are all likely to link with particularly high CO2 emissions, which is crucial to understand if the aim of reducing household energy consumption and related CO2 emissions is to be achieved. For example, studies may find that, in general, high-income households are likely to have more emissions than households with an average income. However, such studies do not show whether a household outside the high-income group could still be a high emitter due to a combination of other socioeconomic factors or dwelling-related characteristics, such as household composition, size of dwelling and tenure type. To address this gap in the literature, this paper undertakes cluster analysis within high-emitting households in the UK.
The clustering technique has been used in several other energy and emission related studies, both to cluster countries and households [34,48,49]. The aim of cluster analysis is to classify the whole sample into distinct clusters that are relatively homogeneous within each cluster and heterogeneous across clusters. On a country level, Lamb et al. [49] have applied the clustering technique to identify the similarity and diversity of human development and CO2 emissions between developing and developed nations. On a household level, Pullinger et al. [48] have used the clustering approach to identify distinct household groups according to their water using practices. Likewise, Element Energy [34] have investigated household electricity use in the UK using cluster analysis. They clustered the householders in their sample based on their annual electricity consumption, 6-7 pm peak-time electricity use, socioeconomic factors and dwelling-related characteristics, number and energy efficiency level of appliances, as well as their climate change attitudes and electricity conservation behaviors [34].
1) The HRP is the person who owns the dwelling or is responsible for renting it. If the dwelling is jointly owned or rented, the person with the highest income would be the HRP. If two or more householders have the same highest income or they all have zero income, the oldest should be identified as the HRP.
2) Palmer et al. [29] have divided all households into three socioeconomic groups: professional and managerial; supervisory, clerical and skilled manual; and semi-skilled, unskilled, pensioner and non-working group.
Compared to traditional regression analysis, the clustering technique is especially beneficial for studies exploring whether there are different combinations of independent variables and corresponding values that lead to a similar value for the dependent variable (for example, high CO2 emissions from energy use at home) [48]. The analysis presented in this paper applies the cluster method to classify the high-emitting households in order to identify whether the high emitters comprise several groups which are more homogeneous in terms of socioeconomic factors and dwelling-related characteristics within each group but heterogeneous across different groups. Homogeneity within a cluster means that the households within one cluster are grouped together by well-defined similarities. Conversely, heterogeneity across clusters means that households in one cluster are separated from those in other clusters by well-defined dissimilarities [50,51]. This clustering method has not been used to classify high emitters in other studies, and is an original contribution of this research.

Material
There are currently no data sets in the UK that provide both household CO2 emissions and socioeconomic factors [20]. This research thus estimates household CO2 emissions from energy consumption data. Household expenditure on gas, electricity, and oil is collected from the 2012 Living Cost and Food (LCF) survey, which is a household survey carried out by the Office for National Statistics (ONS) in the UK [40]. The survey covers the whole UK, including England, Scotland, Wales, and Northern Ireland [40]. Households in Northern Ireland are excluded from the analysis presented here, due to the much higher level of oil use at home than in other regions. Using oil leads to around 38% higher CO2 emissions per kWh than natural gas [52]. 5593 households in total were selected using a multistage stratified random sample method from approximately 26.4 million UK households in the 2012 LCF survey [41]. Initially, the first stratum in the sample selection was defined by the Government Office Regions (GORs) and two variables, which were social class of the HRP and ownership of cars. Then, 638 out of 1.8 million postal sectors were randomly selected from the first stratum. All households in each selected postal sector were approached for the survey. In total, 52% of the selected households responded to the survey, which constituted the 5593 households in the data set. Less than 1% of the households use other fuels, such as solid fuel or Calor gas, in the 2012 LCF survey. For this reason, the other fuels are not considered in estimating total CO2 emissions from energy use at home. This analysis estimates the energy used by each household in the selected survey sample by dividing the household energy bills by corresponding energy unit prices as shown in Table 3 [53,54]. The price for domestic oil in 2012 was also obtained from the Department of Energy and Climate Change (DECC) [55], with no regional prices available.
After calculation, gas and electricity use are measured in kWh and oil consumption is measured in liters.
In addition to household energy expenditure, the 2012 LCF survey also provides socioeconomic factors and dwelling-related information for households. The analysis aims to include as many socioeconomic factors and dwelling-related characteristics as possible from the 2012 LCF survey in order to provide a fuller picture of who the high emitters are and why they emit more than others. Based on the studies of household CO2 emissions from energy use at home and socioeconomic factors that are introduced in Section 2, the household variables included in the analysis are household composition, tenure type, category of dwelling, number of rooms in the accommodation, equivalised disposable household income, age of the oldest person, sex of HRP, GORs, as well as ownership of cars and a second dwelling in the UK. The ownership of cars and second dwellings in the UK is included in the analysis as an additional indicator of the wealth level of the households. The sex of HRP is included in the analysis to complement the information on household composition. Education level data were also collected in the 2012 LCF survey, but 32% of household members did not provide this information, so this variable cannot be used reliably in the analysis. All other available socioeconomic factors and dwelling-related characteristics in the 2012 LCF survey are covered by the variables selected to cluster the high-emitting households.
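The bill-to-energy step can be sketched as follows. The only relationship taken from the text is that energy use equals expenditure divided by the unit price; the unit prices and emission factors in the sketch are hypothetical placeholders, not the values from Table 3 or from the cited sources, and the function names are illustrative.

```python
# Hypothetical unit prices: pounds per kWh (gas, electricity) or per litre (oil).
UNIT_PRICE = {"gas": 0.05, "electricity": 0.15, "oil": 0.60}

# Hypothetical emission factors: kg CO2 per kWh (gas, electricity) or per litre (oil).
EMISSION_FACTOR = {"gas": 0.2, "electricity": 0.5, "oil": 2.5}


def estimate_energy_use(bills):
    """Convert expenditure per fuel into physical units (kWh or litres)."""
    return {fuel: spend / UNIT_PRICE[fuel] for fuel, spend in bills.items()}


def estimate_emissions(bills):
    """Sum emissions over fuels: quantity used times the fuel's emission factor."""
    use = estimate_energy_use(bills)
    return sum(quantity * EMISSION_FACTOR[fuel] for fuel, quantity in use.items())


# Illustrative household; the bill period only needs to match the price period.
bills = {"gas": 600.0, "electricity": 450.0, "oil": 0.0}
energy = estimate_energy_use(bills)
emissions_kg = estimate_emissions(bills)
print(energy, emissions_kg)
```

Because a single national price is applied per fuel in this sketch, households on cheaper or dearer tariffs would have their energy use over- or under-estimated, which is a general limitation of expenditure-based estimates.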

Equivalising household CO2 emissions
As explained in Section 2, household composition influences energy requirements at home as members in the same household are able to share the energy used for space heating, lighting, cooking, and appliance use most of the time. The analysis aims to identify who the high emitters are based on CO2 emissions estimates from their energy consumption. Not equivalising the household CO2 emissions estimates is likely to result in the defined high-emitting households comprising a larger percentage of households with more members than average. Thus the analysis presented in this paper applies DECC's [56] Low Income High Cost (LIHC) equivalisation factors (Table 4) to equivalise household CO2 emissions estimates before defining, clustering and identifying the high-emitting households. The LIHC equivalisation scale is based on the energy requirement of the households, and was used by DECC [56] to identify households in fuel poverty.
After equivalising the household CO2 emission estimates from energy use at home, the top 10% of emitting households are defined as high emitters for the cluster analysis, which amounts to 510 households. The 10% range is selected to be consistent with relevant studies conducted by Brand [43], Chancel and Piketty [32], and Oxfam [33].
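A minimal sketch of this selection step, assuming each household record already carries an emissions estimate and an equivalisation factor. The factors in the sample data are made up for illustration and are not the LIHC values from Table 4; only the cut-off rule (keep the highest 10% by equivalised emissions) follows the text.

```python
def equivalised_emissions(household):
    """Household emissions divided by the household's equivalisation factor."""
    return household["emissions"] / household["factor"]


def top_emitters(households, share=0.10):
    """Return the top `share` of households ranked by equivalised emissions."""
    ranked = sorted(households, key=equivalised_emissions, reverse=True)
    n_top = round(len(ranked) * share)
    return ranked[:n_top]


# Twenty hypothetical households; odd ids get factor 1.0, even ids 1.5.
households = [
    {"id": i, "emissions": 1000.0 + 100.0 * i, "factor": 1.0 if i % 2 else 1.5}
    for i in range(20)
]

high_emitters = top_emitters(households)
print([h["id"] for h in high_emitters])
```

In the sample data, household 18 has higher raw emissions than household 17 but a larger factor, so it drops out of the top decile after equivalisation. This is the effect the equivalisation is meant to capture.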

Cluster method
There are three principal clustering methods: hierarchical clustering, k-means clustering and Two-step clustering. Among these three methods, only the Two-step cluster method handles mixed data of continuous and categorical variables [51]. Therefore, the Two-step cluster method is selected for the analysis. The continuous variables are standardised using the standard score (also known as the z-score). The categorical variables are coded as dummy variables, with a numerical value of 0 or 1. In other words, if the answer is 'yes' for the dummy variable (for example, 'do the household occupants live in a detached house?'), the variable has a numerical value of 1. If the answer is 'no', a numerical value of 0 is allocated to the variable. Pearson correlation tests are then undertaken to check the correlation between any two of the selected continuous variables, and Pearson's Chi-square tests are used to check the correlation between any two categorical variables [57].
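The two preprocessing steps can be sketched in a few lines. The sketch assumes the sample standard deviation is used for standardisation (the text does not say which variant is applied), and the variable names and values are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise a continuous variable: subtract the mean, divide by the
    sample standard deviation (assumed variant)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def dummy_code(values):
    """Turn a categorical variable into one 0/1 indicator per category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


# Hypothetical data for four households.
rooms = [4, 6, 8, 10]
dwelling = ["detached", "flat", "detached", "terraced"]

rooms_z = z_scores(rooms)
dwelling_dummies = dummy_code(dwelling)
print(rooms_z)
print(dwelling_dummies[0])
```

After standardisation every continuous variable has mean 0 and standard deviation 1, so variables measured on large scales (such as income) do not outweigh those on small scales (such as number of rooms) in the distance calculation.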
The first step of the Two-step cluster analysis is called pre-cluster, where the cases are scanned one by one to decide whether to merge each case with the previously formed clusters or start a new cluster, according to the log-likelihood distance criterion [58]. The second step of the Two-step cluster analysis merges the sub-clusters identified in the first step, where the final number of clusters is decided through two stages. At stage one, the initial estimate of the number of clusters is computed using the Bayesian information criterion (BIC), which is commonly used as an objective selection criterion to avoid arbitrariness in deciding the number of clusters [59]. At stage two, in order to decide the final number of clusters, the largest relative increase in distance between the two closest clusters is identified using the ratio calculation, shown in Eq. (1) [58].
$$R(k) = \frac{d_{\min}(C_k)}{d_{\min}(C_{k+1})} \qquad (1)$$
where $C_k$ is the cluster model containing $k$ clusters and $d_{\min}(C_k)$ is the minimum cluster distance for cluster model $C_k$. The final number of clusters is decided by comparing the two largest R ratios. If the largest is more than 1.15 times the second largest, the model with the largest R ratio is selected as the optimal number of clusters; otherwise, of the two models with the largest R ratios, the one with the larger number of clusters is selected [58]. The cluster quality is measured by the 'Silhouette measure of cohesion and separation', which is calculated by using Eq. (2).
$$s(x) = \frac{b(x) - a(x)}{\max\{a(x),\, b(x)\}} \qquad (2)$$
where s(x) is the 'Silhouette measure of cohesion and separation', a(x) is the average distance of x to all other cases in the same cluster, and b(x) is the minimum average distance of x to cases in any of the other clusters. The larger the Silhouette measure, the more homogeneous each individual cluster is and the more heterogeneity exists across different clusters. The cluster quality is treated as 'poor' if the Silhouette measure is between -1 and 0.2, while it is 'fair' if it is between 0.2 and 0.5 and 'good' if it is larger than 0.5 [58].
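The cluster-number rule and the Silhouette measure described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used by the statistical software: it takes the R ratio to be the minimum cluster distance for k clusters divided by that for k + 1 clusters, uses Euclidean distance instead of the log-likelihood distance, treats a case in a single-member cluster as scoring 0, and assigns the boundary values 0.2 and 0.5 to 'fair'. All input values are made up.

```python
import math


def choose_k(d_min, threshold=1.15):
    """Pick the number of clusters from d_min[k], the minimum inter-cluster
    distance of the k-cluster model, using the two largest R ratios."""
    ratios = {k: d_min[k] / d_min[k + 1] for k in d_min if k + 1 in d_min}
    ranked = sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)
    (k1, r1), (k2, r2) = ranked[0], ranked[1]
    # A clear winner is kept; otherwise the larger of the two models is chosen.
    return k1 if r1 > threshold * r2 else max(k1, k2)


def silhouette(points, labels):
    """Average of s(x) = (b(x) - a(x)) / max(a(x), b(x)) over all cases."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        by_cluster = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if j != i:
                by_cluster.setdefault(other, []).append(math.dist(p, q))
        if lab not in by_cluster:          # single-member cluster (assumed 0)
            scores.append(0.0)
            continue
        a = sum(by_cluster[lab]) / len(by_cluster[lab])
        b = min(sum(d) / len(d) for c, d in by_cluster.items() if c != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def quality(s):
    """Map the Silhouette measure to the 'poor' / 'fair' / 'good' bands."""
    if s < 0.2:
        return "poor"
    return "fair" if s <= 0.5 else "good"


# Made-up distances: a sharp drop after k = 3 makes R(3) the clear maximum.
k_clear = choose_k({2: 10.0, 3: 9.0, 4: 3.0, 5: 2.5, 6: 2.0})
# Made-up distances: R(2) and R(3) are close, so the larger k is taken.
k_close = choose_k({2: 8.0, 3: 4.0, 4: 2.1, 5: 2.0})

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
s = silhouette(points, [0, 0, 1, 1])
print(k_clear, k_close, round(s, 3), quality(s))
```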

Results
As mentioned in Section 3, Pearson correlation tests are undertaken among the selected continuous variables, and Pearson's Chi-square tests are used to check the correlations between any two categorical variables [57]. The larger the absolute value of the Pearson correlation, the more correlated the two continuous variables are. Likewise, the larger the Cramer's V for Pearson's Chi-square, the more correlated the two categorical variables are. If the absolute value of the Pearson correlation or the Cramer's V is close to 1, it may influence the cluster results. This is because the influence of the two related variables on the clustering results would be similar; including both variables means that the influence is counted twice during the clustering procedure. According to the correlation test results in Table 5, the number of cars and the number of rooms, and the number of rooms and the household income, are more strongly correlated than the other pairs of continuous variables. For categorical variables, the sex of the HRP and the composition of the household are more correlated than others (Table 6). The analysis has included a relatively large sample (510 high-emitting households) to reduce the risk of clustering results being influenced by correlations between variables.
Furthermore, the value distribution of each variable, compared between the high-emitting households and the remaining 90% of households in the 2012 LCF survey sample, is shown in Fig. 1. Figure 1 shows that none of the variables would dominate the cluster results, because the values of each variable for high-emitting households are distributed across all ranges. Likewise, the values of each variable for the remaining 90% of households are also distributed across all ranges. Therefore, all the continuous and categorical variables are included in the clustering process. As a result, six high-emitter clusters are identified, with a 'fair' quality being achieved for the cluster results measured with the 'Silhouette measure of cohesion and separation' (Section 3.2.2). Table 7 lists the selected socioeconomic factors and dwelling-related characteristics for all six identified high-emitter clusters.
Notes: ** indicates that the correlation is significant at the 0.01 level (2-tailed) while * indicates that the correlation is significant at the 0.05 level (2-tailed).
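The two association measures used in the correlation checks at the start of this section can be sketched as follows. The sketch computes the Pearson correlation coefficient and Cramer's V directly from their textbook definitions, without a continuity correction and without significance tests; the example data are made up and do not come from the LCF survey.

```python
import math
from collections import Counter


def pearson_r(x, y):
    """Pearson correlation coefficient between two continuous variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cramers_v(a, b):
    """Cramer's V from the Pearson Chi-square statistic of a two-way table."""
    n = len(a)
    joint, rows, cols = Counter(zip(a, b)), Counter(a), Counter(b)
    k = min(len(rows), len(cols)) - 1
    if k < 1:
        raise ValueError("each variable needs at least two categories")
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = rows[r] * cols[c] / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Made-up continuous variables: rooms and cars rise together.
r_rooms_cars = pearson_r([4, 5, 6, 8, 9], [0, 1, 1, 2, 3])

# Made-up categorical variables: perfectly associated, then unrelated.
v_linked = cramers_v(["f", "f", "m", "m"], ["single", "single", "couple", "couple"])
v_unrelated = cramers_v(["f", "f", "m", "m"], ["x", "y", "x", "y"])
print(round(r_rooms_cars, 3), v_linked, v_unrelated)
```

Values near 1 for either measure would flag a pair of variables whose influence risks being counted twice in the clustering, which is the concern raised above.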
As shown in Table 7, the government office region and ownership of a second dwelling in the UK do not differ markedly among the high-emitting households. The household composition, income, category of dwelling, tenure type, age of the oldest person, sex of HRP, number of vehicles owned and rooms in accommodation collectively influence CO2 emissions from energy use at home. For example, if a two-adult household does not belong to any high-income clusters (Clusters A and B), but rents a dwelling that is poorly insulated without gas central heating, they can require more energy for space heating which would result in high CO2 emissions from energy use at home. On the other hand, if the households are high-income ones who own a flat outright and work outside the home on weekdays, they are less likely to be high emitters, as they are not typical households in any of the identified clusters. According to the cluster results, typical socioeconomic characteristics for each cluster are selected to compare the households in each high-emitter cluster with the remaining 90% households in the 2012 LCF survey sample. The combination of typical socioeconomic characteristics of each cluster shows that:
(1) If the HRP is female, the household is likely to be a high emitter if 1) the age of the oldest person is under 49; they live in a non-detached property; rent it or own it with a mortgage; and they have no more than one car and no more than seven rooms at home. Among the high-emitter cluster C, 55% (41 out of 75) households meet these criteria. In contrast, among the remaining 90% households, only 28% (496 out of 1787) of households with a female HRP meet all these criteria.
2) the age of the oldest person is over 50; the household has at least one car; there are at least seven rooms at home; and the householder owns their property either with a mortgage or outright. Among the high-emitter cluster D, 55% (31 out of 56) households meet these criteria. In contrast, among the remaining 90% households, only 10% (181 out of 1787) of households with a female HRP meet all these criteria.
(2) If the HRP is male, and the age of the oldest person is over 60, the household is likely to be a high emitter if 1) they own a detached house outright; have no children; and have at least two cars and eight rooms at home. Among the high-emitter cluster A, 72% (75 out of 104) households meet these criteria. In contrast, among the remaining 90% households, only 6% (69 out of 1147) of households with a male HRP and the oldest person over 60 meet these criteria.
2) they own a semi-detached house outright; have at least two adults; and have at least one car. Among the high-emitter cluster F, 56% (27 out of 48) households meet these criteria. In contrast, among the remaining 90% households, only 21% (239 out of 1147) of households with a male HRP and the oldest person over 60 meet these criteria.
(3) If the HRP is male, and the age of the oldest person is under 59, the household is likely to be a high emitter if 1) they own a detached house with a mortgage; have at least two adults; and at least two cars and eight rooms at home. Among the high-emitter cluster B, 62% (66 out of 107) households meet these criteria. In contrast, among the remaining 90% households, only 7% (111 out of 1660) of households with a male HRP and the oldest person under 59 meet these criteria.
2) they live in a non-detached house, either renting or owning with a mortgage; have at least two adults; and at least one car and six rooms at home. Among the high-emitter cluster E, 53% (63 out of 120) households meet these criteria. In contrast, among the remaining 90% households, only 24% (402 out of 1660) of households with a male HRP and the oldest person under 59 meet these criteria.
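The shares reported in the list above can be recomputed from the counts, and the ratio between the two shares gives a rough sense of how sharply each profile separates a cluster from the remaining households. The ratio is an illustrative measure introduced here, not one used in the analysis. Cluster A is entered with 104 households, the size implied by the reported 72% and by the six clusters summing to the 510 high emitters.

```python
def share(count, total):
    """Percentage of `total` that meets the profile."""
    return 100.0 * count / total


# (cluster, matching high emitters, cluster size,
#  matching other households, size of the comparison group)
profiles = [
    ("C", 41, 75, 496, 1787),
    ("D", 31, 56, 181, 1787),
    ("A", 75, 104, 69, 1147),   # cluster size 104 is an inferred value
    ("F", 27, 48, 239, 1147),
    ("B", 66, 107, 111, 1660),
    ("E", 63, 120, 402, 1660),
]

total_high_emitters = sum(size for _, _, size, _, _ in profiles)

# Ratio of the share inside the cluster to the share among other households.
ratios = {
    name: share(hit, size) / share(other_hit, other_size)
    for name, hit, size, other_hit, other_size in profiles
}
ranking = sorted(ratios, key=ratios.get, reverse=True)

for name in ranking:
    print(name, round(ratios[name], 1))
```

On these counts, Clusters A, B and D show the largest ratios, while Clusters C, E and F have profiles that are also fairly common among the remaining households.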
The comparison between high-emitter clusters and the remaining 90% households shows that among all the clusters, the households in Cluster A are the most distinguishable ones, followed by those in Cluster B, and then Cluster D. The households in Clusters C, E, and F are less distinguishable from the remaining 90% households, but still have some of the typical characteristics that high emitters in these clusters share. Although the households in Cluster C have lower incomes than the other high-emitter clusters, they may rent a dwelling that is poorly insulated without gas central heating. Therefore, they could require more energy for space heating which would result in high CO2 emissions from energy use at home. They may also be in part-time employment or unemployed and so spend more time at home during the day than people in full-time employment; more energy would therefore be consumed during the day for space heating, cooking, and entertaining. This is consistent with the findings from Büchs and Schnepf [20], where the households with female HRPs are likely to have higher CO2 emissions from direct energy use at home, which could relate to a workless status and a long time spent at home. The clustering results (Table 7) also show that the households that do not own a car or have fewer than six rooms in their accommodation, the retired households that do not own their accommodation, the households with a male HRP and an average equivalised disposable income less than £390, and the households with a female HRP and an average equivalised disposable income less than £290 are less likely to be high emitters compared with other households.

Discussion
The identified six high-emitter clusters support findings in the existing literature that household energy consumption and CO 2 emissions are influenced by various socioeconomic factors and dwelling-related characteristics. Moreover, they also show that in addition to each of the socioeconomic factors and dwelling-related characteristics identified as influential in the literature, various combinations of these characteristics can jointly lead to high CO 2 emissions from energy use at home. Previous studies have mainly used regression analysis to investigate the relationships between household CO 2 emissions (the dependent variable) and socioeconomic or dwelling characteristics (independent variables) [20,28]. Through regression analysis, these studies identified some correlations between household emissions and socioeconomic or dwelling factors. For example, the type of dwelling, tenure type, as well as the age and income levels of householders are all correlated with household CO 2 emissions [20,28,31]. Moreover, the regression model can be used to estimate the likely amount of emissions for a particular household, giving the household's values for all independent variables in the model. However, due to the limitation of the regression technique, it cannot provide insights into whether and what different combinations of independent variables indicate particularly high levels of household emissions. For example, through regression analysis, Büchs and Schnepf [20] show that the age of the HRP positively correlates with the emissions from direct energy use at home. However, they do not specify that younger families may also more likely be high emitters if they are renting an old house that is energy inefficient. 
Likewise, regression analysis can show that the size of the dwelling is positively associated with household CO2 emissions from energy use at home, but it may not disclose that householders living in smaller dwellings (for example, one- or two-bedroom flats) can be high emitters if they have no access to gas at home and mainly use electricity for space heating. In contrast, cluster analysis, which is applied in this paper, can identify these combinations. Among the identified high-emitter clusters, the socioeconomic and dwelling-related factors are more homogeneous within each cluster and heterogeneous across clusters. Using the clustering technique to classify high emitters addresses a gap in the literature on the combinations of socioeconomic and dwelling-related factors that are most likely to be linked to high household CO2 emissions.
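The contrast between regression and clustering can be illustrated with a minimal sketch. It is not the clustering procedure used in this paper: it swaps in plain k-means on two standardised numeric variables, whereas the analysis here draws on a wider set of socioeconomic and dwelling-related variables. The six household rows, the variable choice and the helper names `standardise` and `kmeans` are all assumptions made for illustration only.

```python
import numpy as np

def standardise(X):
    """Z-score each column so income (in pounds) does not swamp room counts."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def kmeans(X, k, n_iter=100):
    """Plain Lloyd's k-means, seeded with the first k rows (a simplification)."""
    centroids = X[:k].copy()
    for _ in range(n_iter):
        # Distance from every household to every centroid, shape (n, k)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its members (keep it if empty)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical high emitters: [equivalised weekly income in GBP, number of rooms]
high_emitters = np.array([
    [250, 6], [800, 9], [270, 6], [850, 8], [260, 7], [820, 9],
], dtype=float)

labels, _ = kmeans(standardise(high_emitters), k=2)
for j in range(2):
    members = high_emitters[labels == j]
    print(f"cluster {j}: mean income GBP {members[:, 0].mean():.0f}, "
          f"mean rooms {members[:, 1].mean():.1f}")
```

With these invented rows, the households separate into a lower-income group in six- or seven-room dwellings and a higher-income group in eight- or nine-room dwellings. Both groups are high emitters, which a single regression coefficient on income or rooms would not reveal on its own.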
The LCF survey was selected as the most appropriate survey for identifying high emitters through cluster analysis, because the LCF data set records household gas and electricity bills separately and also covers a variety of socioeconomic factors and dwelling-related characteristics required to identify their influence on energy consumption and CO2 emissions. However, there are limitations to using the LCF data set, as some of the variables are measured more indicatively than others. For example, the size of a dwelling is measured by its number of rooms, because data on floor area are not available. The rural or urban location of the household is also not available from the LCF survey, although it could affect the level of energy use at home, especially for space heating, owing to the lack of access to gas in some rural areas and the urban heat island effect [30,60]. The urban heat island effect means that the temperature in urban areas is generally higher than in the surrounding rural areas, largely because of deforestation, the replacement of the land surface by non-evaporating and non-porous materials such as asphalt and concrete, and the denser layout of buildings and streets within an urban landscape [61,62]. The identified high-emitter clusters might be different if the input variables were changed, for example if rural or urban location were included and floor area were used instead of the number of rooms. Despite these data limitations, the cluster analysis results based on the LCF still show that high emitters comprise different clusters of households that share similar socioeconomic factors within each cluster but differ from those in other clusters. This provides useful information for emission reduction policies targeted at the different high-emitter clusters.
In addition to socioeconomic factors and dwelling-related characteristics, other factors, such as the energy efficiency of the dwelling and appliances, householders' daily routines and their use of the home, may also lead to different energy consumption and CO2 emission levels [63][64][65]. For example, a middle-aged couple who rent their accommodation can be high emitters because of the 'tenant-landlord problem' discussed in Section 2. They may live in less well-insulated dwellings with less efficient appliances, which require more energy to deliver the same energy services for heating, cooking and cleaning. High-income families with younger children can be high emitters because they cook separately for the children [66]. Retired households that own their dwellings outright can be high emitters because of their more vulnerable health conditions and the longer time they generally spend at home, where more energy can be used for space heating and entertaining [20]. Some high-emitting retired households may also live in larger houses with additional appliances that they had been using before their child(ren) moved out [67]. Further research on people's routines and use of the home is necessary to provide a fuller picture of why these clusters of households are more likely to be high emitters than others.
Rebound effects have been widely discussed in relation to achieving emission reductions from households [68,69]. Rebound effects occur when people spend the money saved on energy bills, through improved energy efficiency or behaviour change in a particular energy service (e.g. lighting, cooking, space heating and cleaning), either on more energy for that service (the direct rebound effect) or on other products and services that have direct or embedded CO2 emissions (the indirect rebound effect) [68,69]. As clarified in Section 1, the analysis aims to identify high-emitter clusters and the potential opportunities to reduce household CO2 emissions from higher-emitting households. Emission reductions from high emitters' energy use are likely to lead to rebound effects. The size of the rebound effects may vary significantly among high-emitter clusters and across carbon mitigation policies. The cluster analysis results in Section 4 show that some high-emitter clusters (such as Clusters A and B) have an average household income about twice as high as that of other clusters (such as Clusters C, E and F). High-income high energy users are more likely to be able already to afford as much gas and electricity as they require. They are less likely to spend the cost savings on more direct energy consumption at home, and more likely to spend them on other products and services (e.g. purchasing more expensive cars or flying abroad for holidays). On the other hand, some high-emitter clusters have lower incomes. If higher-energy-consuming households have not been able to afford as much energy as they need, or have tight budgets, they may spend the money saved through efficiency improvements on more gas and/or electricity use at home.
For example, after switching to efficient LED lights, some householders may leave more lights on while away, because their total payments for lighting would not increase, or would still be lower than with the previous inefficient lights. Policies focusing on energy and emission reductions from higher-income higher-energy users may lead to smaller rebound effects and achieve more net emission reductions than others [60,69,70]. In contrast, energy and emission reductions from lower-income higher-energy users can involve larger rebound effects, which offset the emission reduction effort to a greater extent [60,69,70]. Future research on reducing energy consumption and CO2 emissions needs to consider the likely size of the rebound effect for each high-emitter cluster. Such estimates could provide evidence on whether, and by how much, net CO2 emissions can be reduced from the high-emitter clusters after taking rebound effects into account.
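The way a rebound effect offsets an expected saving can be written as a one-line calculation. The sketch below assumes the rebound is expressed as a single fraction of the expected saving that is taken back; the function name `net_reduction`, the 500 kg saving and the two rebound fractions are hypothetical and are not estimates from this paper.

```python
def net_reduction(gross_kg_co2, rebound_fraction):
    """Reduction remaining after a fraction of the expected saving is taken back."""
    if rebound_fraction < 0:
        raise ValueError("rebound_fraction must be non-negative")
    return gross_kg_co2 * (1.0 - rebound_fraction)

# Hypothetical inputs for illustration only
gross = 500.0  # expected annual saving per household, kg CO2
rebound_by_group = {"higher-income cluster": 0.10, "lower-income cluster": 0.40}

for group, rebound in rebound_by_group.items():
    print(f"{group}: net reduction {net_reduction(gross, rebound):.0f} kg CO2")
```

Under these assumed fractions, the same intervention delivers a smaller net reduction in the cluster with the larger rebound, which is why cluster-specific rebound estimates matter for comparing policies.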
Policy measures that promote renewably-generated electricity (e.g. the FIT) may achieve more net emission reductions from low-income high emitters than other emission reduction policy instruments. This is because an increased share of total energy use provided by renewably-generated electricity reduces the CO2 intensity of energy use. For example, both improving energy efficiency and increasing the use of renewably-generated electricity at home may reduce household energy bills and cause similar direct rebound effects on energy use. If householders rebound into using more electricity, the impact on CO2 emissions will be smaller if that electricity is renewably generated. This can be especially valuable for low-income high emitters, who are likely to have larger direct rebound effects than other high emitters. The CO2 emissions caused by rebound effects can be reduced when a larger percentage of electricity is generated from renewable sources. For this reason, policies such as the FIT targeted at low-income high-electricity users would be attractive for improving carbon mitigation. However, the cost of the FIT scheme is shared by all electricity customers, so households that do not participate in the scheme are likely to pay for those that do. This could widen the gap between the rich and the poor, as there is no provision in the FIT scheme to ensure its uptake by low-income households. Therefore, this paper suggests that incentives could be financed through general income tax and government spending, rather than through the energy market or energy suppliers, where costs are passed on to all customers but benefit only those households that have renewable energy systems installed.
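The argument that a higher renewable share lowers the emissions caused by rebound electricity use can be sketched as follows. The sketch assumes that renewably-generated electricity has zero CO2 emissions in use, which is a simplification; the function name `rebound_emissions`, the 300 kWh of additional use and the 0.4 kg CO2 per kWh intensity of non-renewable generation are hypothetical values chosen only to show the relationship.

```python
def rebound_emissions(extra_kwh, renewable_share, fossil_kg_per_kwh):
    """CO2 from rebound electricity, treating renewable generation as zero-carbon in use."""
    if not 0.0 <= renewable_share <= 1.0:
        raise ValueError("renewable_share must be between 0 and 1")
    return extra_kwh * (1.0 - renewable_share) * fossil_kg_per_kwh

extra_kwh = 300.0        # hypothetical additional electricity use per year
fossil_kg_per_kwh = 0.4  # hypothetical intensity of non-renewable generation

for share in (0.0, 0.5, 1.0):
    kg = rebound_emissions(extra_kwh, share, fossil_kg_per_kwh)
    print(f"renewable share {share:.0%}: {kg:.0f} kg CO2 from rebound use")
```

The rebound in electricity use is identical in each case; only the CO2 attached to it changes with the renewable share.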
As introduced in Section 1, for energy efficiency improvements, the UK government mandates energy companies to provide heating and insulation improvements to lower-income and vulnerable households, for example through the ECO and the previous Green Deal. Because these are supplier obligations, they are financed by raising overall energy prices for customers [42]. The impact is highly regressive, because high-income households generally spend a much smaller share of their income on home energy than low-income households [42]. When energy prices increase, the share of income spent on home energy bills may increase much more among low-income households than among high-income households if the energy savings from efficiency improvements are not sufficient to offset the increased energy prices. This can lead to more serious fuel poverty among low-income high emitters, especially retired low-income households living in large houses after their children have moved out. Retired or older people could require more energy for space heating, in part because of health conditions. In addition, low-income high emitters may rent poorly insulated dwellings and be constrained from insulating them by their tenure type. In Cluster C, 64% of households rent their properties. This category of householders generally receives few benefits from energy efficiency schemes because of the 'tenant-landlord problem' discussed in Section 2 [17]. As a financial incentive, the UK government introduced the LESA programme [39]. However, the programme was not widely known and the amount of grant provided was insufficient [71]. The research presented here suggests that more policies like the LESA should be initiated, with a higher level of financial incentive supported by government spending, and be widely publicised among landlords, for example through the media or letting agents.
Policymakers should continue to assist the private rented sector as well as low-income households with older people, and ensure that emission reduction policies do not result in more serious fuel poverty issues among the low-income high emitters due to increased energy prices as a result of policy interventions. Financial grants, such as the Winter Fuel Payment subsidy in the UK, could target low-income high energy users rather than the current arrangement where people born on or before 5 May 1953 are eligible to apply for the subsidy regardless of income [72].
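The regressive effect of financing efficiency schemes through energy prices, discussed earlier in this section, can be shown with simple arithmetic. The two households, their incomes and bills, the 10% price rise and the function name `share_after_policy` are all hypothetical; the sketch only shows how the same price rise changes the share of income spent on energy by different amounts.

```python
def share_after_policy(bill, income, price_rise, efficiency_saving):
    """Share of income spent on energy after a price rise and an efficiency saving."""
    new_bill = bill * (1.0 + price_rise) * (1.0 - efficiency_saving)
    return new_bill / income

# Hypothetical households: (annual income, annual energy bill) in GBP
households = {"low-income": (15000.0, 1500.0), "high-income": (60000.0, 2400.0)}

for name, (income, bill) in households.items():
    before = bill / income
    after = share_after_policy(bill, income, price_rise=0.10, efficiency_saving=0.0)
    change_pp = (after - before) * 100  # change in percentage points
    print(f"{name}: {before:.1%} -> {after:.1%} of income (+{change_pp:.1f} points)")
```

With no efficiency saving, the assumed low-income household's energy share rises by about one percentage point and the high-income household's by about 0.4. Setting `efficiency_saving` above zero shows how large a saving would be needed to offset the price rise.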
The findings of this paper apply not only to households in the UK but also to other countries, where high emitters could comprise clusters of households whose socioeconomic characteristics are homogeneous within each cluster but heterogeneous across clusters.
Future research can identify the drivers of high energy consumption at a larger scale by comparing the UK with other countries. Such a comparison would partly depend on the availability of household survey data in those countries, which would need to cover both energy consumption at home and socioeconomic factors. International comparison of whether and how the drivers of high energy consumption differ across countries would contribute to global emission reductions by focusing on these drivers. It could also offer insight into supra-national policy making and collaboration for reducing household energy use and CO2 emissions.

Conclusions
Household energy consumption accounts for almost a third of total UK territorial-based CO2 emissions. It is important to reduce emissions from energy use at home in the short to medium term in order to achieve climate mitigation targets in the UK and globally. This paper has focused on high-emitting households and their socioeconomic factors, as high emitters could have a larger potential to reduce their CO2 emissions than others. Through cluster analysis, the study identifies six different combinations of socioeconomic factors and dwelling-related characteristics that can lead to overall high CO2 emissions from energy use at home. The results show that high-emitting households belong to several typical clusters, with households sharing similar socioeconomic factors and dwelling-related characteristics within each cluster but differing from those in other clusters. According to the typical characteristics of households in each cluster, the households most likely to be high emitters are those with a male HRP, an oldest person over 60, a detached house owned outright, at least two cars and eight rooms, and no children at home (Cluster A). The next most likely are households with a male HRP, an oldest person under 59, at least two adults, a detached house owned with a mortgage, and at least two cars and eight rooms (Cluster B). Households with a female HRP are also likely to be high emitters if the oldest person is over 50, they own their property either with a mortgage or outright, and they have at least one car and seven rooms (Cluster D). High emitters in Clusters C, E and F are less distinguishable from the remaining 90% of households than those in Clusters A, B and D, but still show some typical characteristics that the high emitters in these clusters share.
This paper is significant not only for the UK but also for other countries. Unlike mainstream regression analysis, the cluster study of high emitters developed and presented in this paper is an innovative approach that provides useful information for more targeted policies, in the UK and elsewhere, focusing on different high-emitter clusters. As reducing energy consumption at home could lead to rebound effects, it is also important to understand that the size of rebound effects could vary significantly among high-emitter clusters and across policy measures. More targeted policies would facilitate greater emission reductions in the short to medium term.
While the results indicate that different combinations of socioeconomic factors and dwelling-related characteristics could all be linked to high energy consumption and the resulting CO2 emissions, these combinations only partly explain why some householders are responsible for more CO2 emissions than others. Data on the energy efficiency of dwellings and appliances were not available for this cluster analysis, and there is no information on high emitters' daily routines and use of the home, which could require energy. Further research could explore the routines and daily practices of households in the different high-emitting clusters, in order to provide a fuller explanation of why these households are more likely to be high emitters than others.