Introduction

Access to and sustained consumption of clean energy sources are essential for a nation’s overall socio-economic development and improved human welfare [1]. Consumption of clean energy among the population is associated with economic prospects and the provision of basic needs required to sustain human life, including food, housing, health services, and clothing. Hence, sustainable socio-economic development at the household level is directly linked to the preference for and intensity of energy consumed [2]. Global statistics indicate that about 2.7 billion people use solid biomass for cooking, which is associated with 3.5 million deaths annually from indoor air pollution [3]. Further, the statistics portray inter- and intra-regional disparities in energy consumption patterns. Developed economies and members of the Organization for Economic Co-operation and Development (OECD) have nearly universal access to and reliance on modern energy sources [3]. Similarly, other regions with remarkable trends in the utilization of clean energy include Latin America (95%), North Africa (99%), the Middle East (92%), and South East Asia (84%) [4]. In contrast, energy consumption patterns in Sub-Saharan Africa (SSA) are of global concern, as the region dominates the world totals with roughly 80% dependency on biomass [4]. It is estimated that only 43% of the population in SSA has access to electricity, which is considered an efficient and clean energy source [5]. It is further projected that, if the current scenario persists, nearly 880 million people in SSA will rely on non-clean energy for domestic use in the year 2020 [5]. The East Africa region is reported as one of the fastest growing regions in Africa but still exhibits a high dependency (80%) on non-clean energy sources [6]. These trends clearly indicate the extent to which the potential benefits and opportunities of clean energy are being missed among the population.

Kenya’s progress in promoting clean energy consumption at the household level has faced its own hurdles. For decades, biomass has been reported as the dominant energy source in Kenya, accounting for about 68% of the energy utilized [7]. Nearly three-quarters of Kenya’s population rely on biomass to meet their cooking, heating, and lighting needs [8]. Other sectors that rely on biomass include industry and micro and small enterprises. Apart from biomass, the other major energy sources are petroleum products and electricity, which account for 22% and 9% of the total energy consumed, respectively [9]. The number of households relying on non-clean energy is projected to rise from the current 26 million to 45 million by the year 2020 [4]. Access to and consumption of clean and efficient energy remain fundamental enablers that are implemented through various national plans and programmes, such as Kenya’s Least Cost Power Development Plan 2017–2037, Vision 2030, and the medium-term plans, which include grid extension, renewable off-grid solutions, and last mile connectivity. These projects have set a clear path towards universal access to clean energy sources and remain pertinent in changing the landscape of energy preference and consumption among households. For instance, the number of customers connected to electricity under the Rural Electrification Programme and Last Mile Connectivity in Kenya increased from 2,264,508 in March 2013 to 6,526,987 in June 2018, and the access rate stands at 73.42%, up from 32% in 2013 [10].

Considerable efforts have been made to reform the energy sector in Kenya. However, consumption of clean energy sources remains relatively low at the household level [10]. This is evident among households practicing multiple energy use, for which the consumption intensity of clean energy sources is considerably low. According to [11], electricity consumption declined over 5 years, from 2823 kilowatt-hours (kWh) in 2013 to 1338 kWh in 2017. Further, 3.6 million of the 6.5 million households connected to the national grid consume an average of 15 kWh of electricity per month (ERC, 2017). This energy consumption behavior has negative implications for the overall growth of the economy and hinders progress in productive activities such as micro, small, and medium enterprises, as well as overall welfare at the household level [1]. Moreover, indoor pollution from exposure to biomass smoke harms human health: close to 15,000 lives are lost annually in Kenya, and the implications are severe among women and girls, whose household energy use revolves around biomass [12, 13]. With the prevailing energy consumption trends, more people are likely to die annually from respiratory infections and chronic obstructive pulmonary disease [14].

Most energy studies analyze households’ energy preference and consumption behavior as conjoined components of energy use behavior. This study shows that the decision to acquire a certain energy source and the proportion consumed may be affected by different sets of factors. Therefore, there is a need to simultaneously analyze the factors affecting energy preference and consumption intensity at the household level.

Materials and methods

Description of the study area and data acquisition

The study was carried out in Kenya. Based on the research gap identified, the study utilized a cross-sectional household dataset acquired through the National Energy Survey, 2009. The data comprised comprehensive, representative, and reliable household energy use patterns. Sampling was initiated by deriving a sampling frame from the National Sample Survey and Evaluation Programme, comprising 6,371,370 households (Table 1) across the eight former administrative provinces: Nairobi, Coast, Central, Eastern, Western, North Eastern, Nyanza, and Rift Valley. The sampling frame comprised 1800 clusters, each with 100 households. Of the 1800 clusters, 540 were in urban areas and 1260 in rural areas (Table 1). Subsequently, a 20% sub-sample of the clusters was selected, comprising 108 urban and 252 rural clusters. Further, using the proportionate random sampling technique, a sample of 3663 households was derived. It is worth noting that, following the promulgation of the Constitution of Kenya, 2010, the country adopted a devolved system of government, and future energy studies are expected to explore the dynamics in the 47 established counties. However, the data are still useful as they represent the regional aspect of energy preference and consumption intensity at the household level.
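The cluster arithmetic behind the 20% sub-sample can be checked with a short sketch. The cluster counts are those stated for the sampling frame; the function name and the dictionary layout are illustrative assumptions, not part of the survey design documentation.

```python
# Cluster counts per stratum as stated for the sampling frame.
FRAME_CLUSTERS = {"urban": 540, "rural": 1260}
SUBSAMPLE_PERCENT = 20  # the stated 20% sub-sample


def subsample_clusters(frame, percent):
    """Return the number of clusters drawn per stratum at the given percent."""
    return {stratum: count * percent // 100 for stratum, count in frame.items()}


selected = subsample_clusters(FRAME_CLUSTERS, SUBSAMPLE_PERCENT)
print(selected)                # {'urban': 108, 'rural': 252}
print(sum(selected.values()))  # 360
```

Applying the same rate to each stratum keeps the urban-to-rural ratio of the frame in the sub-sample.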

Table 1 Distribution of clusters and households in national sample as per the Kenya National Bureau of Statistics (KNBS)

Data analyses

Consumption intensity for the various energy sources was the dependent variable for this study. It was computed from the household’s expenditure on each energy source, which served as a proxy for the intensity or level of consumption. Consumption intensity is expressed as the ratio of the household’s expenses on a given energy source to its total expenses on all energy sources. The dependent variable was therefore a continuous proportion taking zero or positive values. Zero observations arise from households that do not consume a given energy source. According to [15], the presence of zero observations in the dependent variable poses difficulties when analyzing micro-data. Therefore, there was a need to consider an appropriate estimation model. The independent variables were socio-economic characteristics pertinent to energy consumption behavior at the household level.
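The construction of the dependent variable can be illustrated with a minimal sketch. It assumes that the share is taken relative to the household's total energy expenditure across all sources, so that a single-source household has an intensity of one; the function name, the source labels, and the expenditure figures are hypothetical.

```python
def consumption_intensity(expenditure):
    """Return each source's share of total household energy expenditure.

    `expenditure` maps an energy source to the amount spent on it.
    Sources with no spending get a share of 0.0 (the zero observations).
    """
    total = sum(expenditure.values())
    if total <= 0:
        # Assumed handling for a household with no energy spending at all.
        return {source: 0.0 for source in expenditure}
    return {source: amount / total for source, amount in expenditure.items()}


# Hypothetical monthly expenditure (KSh) for one household.
household = {
    "electricity": 0.0,
    "lpg": 0.0,
    "kerosene": 300.0,
    "charcoal": 900.0,
    "wood_fuel": 0.0,
}
shares = consumption_intensity(household)
print(shares["kerosene"], shares["charcoal"])  # 0.25 0.75
```

In this example the household contributes positive observations to the kerosene and charcoal equations and zero observations to the other three.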

The diagnostic and specification tests

Due to the limited nature of the dependent variable, the study applied diagnostic and specification tests to aid the selection of the most appropriate model. These preliminary tests guard against inconsistent parameter estimates arising from non-normality, heteroscedasticity, and choice of the wrong model [16, 17]. A Lagrange Multiplier (LM) test for homoscedasticity [15] and Conditional Moment (CM) based tests for normality [16] were conducted to ascertain whether the Tobit model was appropriate for the study.

According to Table 2, the LM test values were found to be below the relevant critical value, which is an indication of heteroscedasticity; hence, the Tobit model was rejected as a suitable tool for analysis. Equally, the CM test indicated a non-normal distribution, which also led to the rejection of the Tobit model.

Table 2 Lagrange Multiplier (LM) and Conditional Moment (CM) test values

Because the Tobit model failed the validity tests, specification tests were carried out to assess the suitability of the double-hurdle model as an appropriate technique for analysis. The Tobit model was therefore tested against the double-hurdle model using the likelihood ratio test (often referred to as the superiority test or Tobit test statistic) as defined by [18]. The Tobit test statistic was computed as shown in Eq. (i):

$$ \mathrm{LR}=2\left(\mathrm{lnLDH}-\mathrm{lnLT}\right)\sim {\chi}_k^2 $$
(i)

where

  • LR = Tobit test statistic

  • lnLDH = the log-likelihood of the double-hurdle model

  • lnLT = the log-likelihood of the Tobit model

  • \( {\chi}_k^2 \) = chi-squared distribution with k degrees of freedom, where k is the number of variables in the participation equation, i.e., the number of coefficients that are assumed to be zero under the restricted model. Because the double-hurdle log-likelihood is the sum of the probit and truncated regression log-likelihoods, the statistic can also be written as Tobit test = 2 × (llProbit + lltrncreg − llTobit), or equivalently −2 × (llTobit − llDouble-hurdle).

For this test, the null hypothesis was that there is no significant difference between the double-hurdle model and the Tobit model, in which case the more restrictive Tobit model would be adequate. Rejection of the null hypothesis would imply that the double-hurdle model fits the data better [18].

The log-likelihood values of the two models were estimated, and the resulting Tobit test values for each equation were compared against the critical values of the chi-square distribution with the specified degrees of freedom (Table 3).
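The Tobit test described above can be sketched as follows, using the form 2 × (llProbit + lltrncreg − llTobit). The log-likelihood values and the degrees of freedom in the example are hypothetical and are not taken from Table 3. The helper `chi2_sf` is an illustrative standard-library implementation of the chi-squared upper-tail probability for integer degrees of freedom.

```python
import math


def chi2_sf(x, k):
    """Upper-tail probability P(X > x) for a chi-squared variable, integer k >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting values for the regularized upper incomplete gamma function.
    if k % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < k / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q


def tobit_test(ll_probit, ll_truncreg, ll_tobit, k):
    """Return the LR statistic and its p-value for Tobit versus double-hurdle."""
    ll_double_hurdle = ll_probit + ll_truncreg
    lr = 2.0 * (ll_double_hurdle - ll_tobit)
    return lr, chi2_sf(lr, k)


# Hypothetical log-likelihoods for one energy source, with k = 8 restrictions.
lr, p_value = tobit_test(ll_probit=-1500.0, ll_truncreg=-420.0, ll_tobit=-1950.0, k=8)
print(f"LR = {lr:.1f}, p-value = {p_value:.3g}")
```

A small p-value (equivalently, an LR value above the critical value) leads to rejection of the Tobit restriction in favor of the double-hurdle specification.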

Table 3 Likelihood ratio tests for Tobit model versus double-hurdle model

Results indicate that the LR test values exceeded the critical values of the χ2 distribution (Table 3). This justifies the rejection of the Tobit model and the adoption of the double-hurdle model. It also implies that zero observations could have resulted from either non-participation or participation without consumption [19]. Therefore, the double-hurdle model was considered appropriate in explaining households’ consumption preference and consumption intensity.

Cragg’s double-hurdle model specification and empirical framework

As noted above, Cragg’s double-hurdle model postulates that households must pass two separate hurdles before they are observed with a positive level of consumption [20]. The first hurdle corresponds to factors affecting preference for a certain energy source and the second to the level of consumption. The unique feature of the double-hurdle model is that the factors affecting energy preference and consumption are allowed to differ.

Following the frameworks in [19, 21], with modifications, the double-hurdle equations are specified as follows:

  1. Participation decision

    $$ \begin{array}{l}{y}_{i1}^{\ast }={w}_ia+{u}_i\\ {}{d}_i=\left\{\begin{array}{ll}1& \mathrm{if}\ {y}_{i1}^{\ast }>0\\ {}0& \mathrm{otherwise}\end{array}\right.\end{array} $$
    (ii)

  2. Consumption decision

    $$ \begin{array}{l}{y}_{i2}^{\ast }={x}_i\beta +{v}_i\\ {}{y}_i={x}_i\beta +{v}_i\kern1em \mathrm{if}\ {y}_{i1}^{\ast }>0\ \mathrm{and}\ {y}_{i2}^{\ast }>0\\ {}{y}_i=0\kern1em \mathrm{otherwise}\end{array} $$
    (iii)

In Eq. (ii), y*i1 is the latent variable representing household i’s choice of a particular energy source; wi is a vector of household characteristics explaining that choice; and ui is the disturbance term, distributed as ui ∼ N(0, 1). The latent variable y*i1 is unobserved; d is the observed binary indicator, equal to one if household i consumes the particular energy source under consideration and zero otherwise [19, 21].

In Eq. (iii), y*i2 is the latent endogenous variable representing household i’s level of consumption (energy share) of a particular energy source; xi is a vector of variables explaining the consumption decision; and vi is the error term, distributed as vi ∼ N(0, σ2). yi is the observed dependent variable, the household’s consumption intensity for a particular energy source, which is positive only if the household both chooses the energy source (y*i1 > 0) and consumes it (y*i2 > 0). The parameters a and β in Eqs. (ii) and (iii) are linear parameters capturing the effects on the participation and consumption decisions, respectively [19, 21].
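The two-hurdle mechanism can be illustrated with a small simulation. This is a sketch under stated assumptions: the two error terms are drawn independently, the index values and σ are hypothetical, and the upper bound of one on an expenditure share is ignored for simplicity.

```python
import random


def simulate_household(w_a, x_b, sigma, rng):
    """Draw one observed consumption value under the two-hurdle rule.

    w_a and x_b are the linear indices of the participation and
    consumption equations for a given household.
    """
    y1_star = w_a + rng.gauss(0.0, 1.0)    # participation latent variable, u ~ N(0, 1)
    y2_star = x_b + rng.gauss(0.0, sigma)  # consumption latent variable, v ~ N(0, sigma^2)
    if y1_star > 0 and y2_star > 0:
        return y2_star  # both hurdles passed: positive consumption observed
    return 0.0          # at least one hurdle failed: zero observed


rng = random.Random(0)  # fixed seed so the illustration is reproducible
sample = [simulate_household(0.3, 0.4, 0.25, rng) for _ in range(10_000)]
zero_share = sum(1 for y in sample if y == 0.0) / len(sample)
print(f"share of zero observations: {zero_share:.3f}")
```

The zeros in the simulated sample arise from either hurdle, which is the feature that distinguishes the double-hurdle model from the Tobit model.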

The double-hurdle model estimation

Double-hurdle maximum likelihood estimation is based on the log-likelihood shown in Eq. (iv).

$$ {LL}_{\mathrm{Double}\hbox{-} \mathrm{Hurdle}}=\sum \limits_0\ln \left[1-\Phi \left({w}_ia\right)\Phi \left(\frac{x_i\beta }{\sigma}\right)\right]+\sum \limits_{+}\ln \left[\Phi \left({w}_ia\right)\frac{1}{\sigma}\phi \left(\frac{y_i-{x}_i\beta }{\sigma}\right)\right] $$
(iv)

Here Φ(·) and φ(·) denote the standard normal cumulative distribution and density functions, and the two sums run over the zero (0) and positive (+) observations, respectively. The first term in Eq. (iv) corresponds to the contribution of all the observations with an observed zero [22]. It indicates that the zero observations come not only from the participation decision but also from the level of consumption decision. The second term accounts for the contribution of all the observations with non-zero consumption intensity [23]. Following maximum likelihood estimation, three marginal effects were derived, for the probability of participation and for the unconditional and conditional consumption intensity, to properly estimate the effects of various factors on the dependent variable [23, 24], as illustrated in Eqs. (v) and (vi).
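The log-likelihood in Eq. (iv) can be written out directly, reading the first function in each product as the standard normal cumulative distribution and the last function in the second term as the standard normal density. The sketch below is illustrative only: the function names and the toy data are assumptions, and a numerical optimizer would still be needed to maximize the function over a, β, and σ.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def norm_pdf(z):
    """Standard normal density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def dot(coefs, values):
    """Linear index: sum of coefficient times regressor."""
    return sum(c * v for c, v in zip(coefs, values))


def double_hurdle_loglik(y, W, X, a, beta, sigma):
    """Evaluate the double-hurdle log-likelihood of Eq. (iv).

    y: observed intensities; W, X: rows of regressors for the participation
    and consumption equations; a, beta: coefficient lists; sigma > 0.
    """
    total = 0.0
    for y_i, w_i, x_i in zip(y, W, X):
        p_participate = norm_cdf(dot(a, w_i))
        xb = dot(beta, x_i)
        if y_i > 0:
            # Positive observation: both hurdles passed.
            total += math.log(p_participate * norm_pdf((y_i - xb) / sigma) / sigma)
        else:
            # Zero observation: at least one hurdle not passed.
            total += math.log(1.0 - p_participate * norm_cdf(xb / sigma))
    return total


# Toy data: one zero and one positive observation, intercept-only equations.
ll = double_hurdle_loglik(
    y=[0.0, 0.5], W=[[1.0], [1.0]], X=[[1.0], [1.0]], a=[0.0], beta=[0.5], sigma=1.0
)
print(round(ll, 4))
```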

Marginal effects for probability of participation

p(yi > 0| x): the probability of a positive value of yi given the values of the explanatory variables x, from which the marginal effects for the probability of participation (acquiring a certain energy source) are derived [23].

Marginal effects for unconditional expectation

$$ E\left[{y}_i|x\right]=p\left({y}_i>0|x\right)\,E\left({y}_i|{y}_i>0,x\right) $$
(v)

Equation (v) gives the overall effect on the dependent variable, that is, the expected value of yi for given values of the explanatory variables x, also known as the unconditional expectation of yi, E[yi|x] [23].

Unconditional marginal effects refer to the total effect on the level of consumption, with all households in the study included in the model. A positive marginal effect would therefore suggest an increase in energy consumption across all households. The unconditional marginal effect helps in understanding the overall impact of an explanatory variable when, for instance, the participation effect and the consumption effect have different signs.

Marginal effects for conditional expectation

E(yi| yi > 0, x): the conditional expectation, that is, the expected value of yi given the values of the explanatory variables x and conditional on yi > 0, showing the intensity of consumption conditional on participation [23].
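The three quantities above can be computed for given index values, as in the sketch below. It assumes independent, normally distributed errors in the two hurdles, in which case the conditional expectation takes the truncated-normal form xβ + σ·φ(xβ/σ)/Φ(xβ/σ); this closed form is a standard result under those assumptions and is not stated in the text. The function names and the index values are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def norm_pdf(z):
    """Standard normal density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def double_hurdle_expectations(w_a, x_b, sigma):
    """Return (P(y > 0 | x), E[y | y > 0, x], E[y | x]) for given indices."""
    p_positive = norm_cdf(w_a) * norm_cdf(x_b / sigma)
    inverse_mills = norm_pdf(x_b / sigma) / norm_cdf(x_b / sigma)
    conditional = x_b + sigma * inverse_mills  # assumes independent normal errors
    unconditional = p_positive * conditional   # the product form of Eq. (v)
    return p_positive, conditional, unconditional


def discrete_marginal_effects(base, alternative, sigma):
    """Change in each quantity when a dummy moves the indices from base to alternative.

    base and alternative are (w_a, x_b) pairs of linear indices.
    """
    before = double_hurdle_expectations(base[0], base[1], sigma)
    after = double_hurdle_expectations(alternative[0], alternative[1], sigma)
    return tuple(b - a for a, b in zip(before, after))


# Hypothetical indices for a reference household and one with a dummy switched on.
effects = discrete_marginal_effects(base=(0.0, 0.2), alternative=(0.5, 0.4), sigma=0.3)
print([round(e, 4) for e in effects])
```

Because the unconditional expectation is the product of the other two quantities, a variable can raise the probability of participation while lowering the conditional intensity, and the sign of its unconditional effect then depends on which change dominates.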

The specific estimated equation is shown in Eq. (vi):

$$ {Y}_i=\upalpha +{\beta}_1\ \left(\mathrm{Age}\right)+{\beta}_2\ \left(\mathrm{Education}\right)+{\beta}_3\ \left(\mathrm{Location}\right)+{\beta}_4\ \left(\mathrm{Gender}\right)+{\beta}_5\ \left(\mathrm{Dwelling}\_\mathrm{unit}\right)+{\beta}_6\ \left(\mathrm{Household}\_\mathrm{income}\right)+{\beta}_7\ \left(\mathrm{Marital}\_\mathrm{status}\right)+{\beta}_8\ \left(\mathrm{Decision}\_\mathrm{maker}\right)+\upvarepsilon $$
(vi)

Energy sources considered for double-hurdle estimation include electricity, liquefied petroleum gas, kerosene, charcoal, and wood fuel.

Results and discussion

Characterizing household socio-economic patterns and energy source utilization

The majority of households were situated in rural areas (66.04%), compared with 33.96% in urban areas. Wood fuel was the dominant energy source for rural households (91.06%), while LPG (70.48%) and electricity (67.53%) were the dominant energy sources for urban households (Table 4).

Table 4 Household socio-economic characteristics and energy utilized (%)

Notably, the majority of household heads (69%) were 45 years of age, and female spouses were equally involved in decision-making regarding energy consumption. In terms of education, the largest share of household heads (31.8%) had acquired formal secondary education. Households whose heads had no formal education were found to consume significantly more wood energy (over 80%), while households with higher incomes were observed to rely more on LPG and electricity.

Factors affecting the probability, conditional and unconditional energy consumption intensity among households

Determining the dependent variable

This study defines the dependent variable as a proportion of the household’s consumption intensity on a specific energy source. Preliminary results show that the dependent variable comprises positive proportionate values and zero observations (Table 5).

Table 5 Dependent variable summary statistics for positive consumption

The concept underlying this study is that of single and multiple energy use among households, which gives room for in-depth analysis of the factors influencing consumption intensity for clean and non-clean energy sources. The results indicate that both positive and zero consumption levels for various energy sources were recorded among households. According to Table 5, 79%, 58%, 32%, 33%, and 19% of households consumed kerosene, charcoal, wood fuel, electricity, and liquefied petroleum gas, respectively. Further, the maximum recorded consumption intensity of one indicates that some households used a single source of energy.

The probability and conditional and unconditional marginal effects of households’ socio-economic factors on energy preference and consumption intensity

The log-likelihood parameters are used to estimate the marginal effects which explain how various factors affect the probability for participation, conditional and unconditional consumption for various energy sources. The discussions focus on the significant results identified across the three estimations, whereby significant and positive observations signify an increase in energy consumption based on the reference category while negative observations indicate a decrease in consumption.

Electricity

Results indicate that the location of a household in an urban area does not affect the probability of electricity use. However, the conditional and unconditional marginal effects indicate that households in urban areas consumed higher proportions of electricity than households in rural areas (Table 6).

Table 6 Probability and conditional and unconditional discrete marginal effects for household’s energy consumption intensity

In terms of gender, preference for electricity as a source of energy was lower among female-headed households than among male-headed households. Similarly, the conditional and unconditional consumption intensity decreased among female-headed households. It is further observed that semi-permanent and temporary dwelling units have negative effects on the probability of participation and on the conditional and unconditional levels of energy consumption. This indicates that households in temporary and semi-permanent units are less likely to consume electricity, and consume less of it, than households dwelling in permanent units.

Household heads with primary and secondary level education recorded a higher probability of electricity consumption. However, the negative effects on the conditional and unconditional levels of consumption imply that lower levels of education negatively affect electricity consumption intensity. Conversely, the marginal effects for household heads with a postgraduate degree indicate an increase in electricity consumption intensity.

Liquefied petroleum gas

Results indicate that households in urban areas were more likely to consume higher proportions of liquefied petroleum gas as compared to those in rural areas (Table 6).

In addition, it was observed that female-headed households are less likely to consume LPG as a clean energy source than male-headed households (Table 6). The marginal effects further imply that female-headed households consumed lower proportions of LPG than their male counterparts. In terms of decision-making on acquiring and utilizing LPG as a source of energy, households were less likely to acquire LPG if the decision maker was a spouse or a child. However, the consumption level increases when female spouses and children are the main household decision makers. Results further indicated a low probability of consuming LPG among households dwelling in semi-permanent housing structures. Households with an average monthly income of KSh 100,001 recorded a higher probability of acquiring LPG and, similarly, consumed higher proportions of LPG.

Kerosene

The marginal effects estimates for kerosene vary across household socio-economic factors. Households located in urban areas appear to consume lower proportions of kerosene than their rural counterparts (Table 6).

Further, households headed by males were more likely to consume higher proportions of kerosene than those headed by females. Notably, households with an average monthly income of KSh 100,000 recorded lower consumption intensity for kerosene than households in the lower income brackets. The age of the household head was also found to be a significant factor affecting kerosene consumption intensity. Household heads aged 60 years and above had a high probability of using kerosene but recorded lower consumption intensity.

Charcoal

Households located in urban areas recorded a higher probability of consuming charcoal as an energy source. However, proportions consumed were lower as compared to households in rural areas. Similarly, female-headed households were more likely to use charcoal but in lower proportions as compared to male-headed households (Table 6).

When the decision maker on energy consumption is the spouse, there is a higher preference for charcoal and an increase in consumption. Further, results indicate that an increase in household monthly income increases the chances of using charcoal as an energy source but negatively affects the level of consumption. This is consistent with the notion that well-off households prefer and consume clean and efficient energy sources when compared to poor households. The level of education was also identified as a critical factor for understanding energy use dynamics in Kenya. Results are consistent for households headed by persons with vocational, bachelor’s, and postgraduate qualifications. This implies that educated household heads are more aware of the health risks associated with charcoal and therefore consume lower proportions of it.

Wood fuel

For wood fuel, results indicate that urban households were less likely to use wood fuel and that their consumption intensity was consistently low. This implies that the majority of households in rural areas consumed higher proportions of wood fuel than their urban counterparts (Table 6).

On the other hand, female-headed households were more likely to acquire wood fuel and to consume it in higher proportions than male-headed households. Households dwelling in semi-permanent and temporary units consumed lower proportions of firewood than those in permanent units. Households in the upper-income level (over KSh 100,000) showed a consistent pattern across the model estimates: they were less likely to use wood fuel and consumed lower proportions of it. With non-formal education as the reference category, household heads with higher levels of education are less likely to consume wood fuel. Married household heads are likely to consume higher proportions of wood fuel than single household heads.

Conclusions and policy implication

The present study sought to examine the factors that affect energy preference and consumption intensity for various energy sources by utilizing a nationally representative micro-level energy dataset. It can be concluded that the use of the double-hurdle model supports the notion that households must pass two separate hurdles before a positive level of consumption is observed. The first hurdle corresponds to factors affecting preference for various energy sources and the second to the level of consumption. Results indicate that households’ energy consumption is skewed towards non-clean energy sources. Urban or rural location was observed to be a major factor in determining household preference and consumption intensity. It was further observed that households in rural areas consume higher proportions of non-clean energy sources than urban households. In addition, household heads with a higher level of education tend to consume higher proportions of clean energy such as electricity and liquefied petroleum gas, and of transitional fuel such as kerosene, which is mainly used as a substitute. It can further be concluded that an increase in a household’s income translated to an increase in the proportions of clean energy consumed and lower proportions of kerosene, charcoal, and wood fuel. From the gender perspective, it was observed that electricity consumption decreased among female-headed households as compared to male-headed households.

These findings are essential for deriving specific policies that can enhance the consumption intensity of clean energy sources. In this regard, promotion of clean energy use should target households in rural areas, households with lower education levels, elderly household heads, and households living in semi-permanent and temporary dwelling units, as well as those in the lower income segments. There is a need to encourage liquefied petroleum gas consumption, especially among the urban poor and rural households, by reducing the upfront cost of acquiring liquefied petroleum gas cylinders. Similarly, energy access programs should integrate household sensitization on the utilization of clean energy, focusing on health and productive gains and addressing misconceptions about various clean energy sources. This strategy is especially important for households with low literacy levels, whose preferences and consumption decisions may be based on limited information.