Background & Summary

Many assessments of future electricity demand in India project large increases in electricity consumption from the adoption of air conditioning technologies in the buildings sector over the next two decades1,2,3. This growth is likely to place India among the top nations in terms of electricity consumption, implying that technology choices related to energy consumption and production in India are likely to have a significant impact on global climate change mitigation efforts. Additionally, the Indian government has been pushing for electrification of the transportation sector, starting with two- and three-wheel vehicles, which is likely to further increase overall electricity demand. As of 2020, there were 152,000 registered electric vehicles in India2. Air conditioning (AC) related electricity demand was 32.7 TWh in 2019, less than 2.5% of total demand3. However, both air conditioning and transport electrification are anticipated to introduce structural changes in the temporal and spatial patterns of electricity consumption, which have important ramifications for long-term resource planning in the electricity sector4. This paper presents a bottom-up approach to estimate electricity consumption in India under various scenarios of technology and policy adoption, with a specific focus on providing aggregated consumption estimates as well as spatio-temporally resolved consumption profiles relevant for regional and national electricity system planning studies. The approach enables quantifying the impact of various growth and technology adoption scenarios on the quantity and pattern of electricity consumption. The datasets detailed in this paper include annual energy consumption at India’s state, regional, and national levels, as visualized in Fig. 1, as well as underlying consumption profiles at an hourly time resolution. Annual energy consumption is forecast in five-year increments to 2050. Figure 2 shows one scenario of the national electricity demand forecast. In addition to the snapshot of annual consumption, hourly load profiles are developed at the same resolution, as seen in Fig. 3.

Fig. 1

State and regional level distribution of annual electricity consumption in 2050 for the stable GDP growth, baseline cooling, and home electric vehicle (EV) charging scenario.

Fig. 2

Summary results of India’s electricity demand forecasting at national level with stable GDP growth, baseline cooling, and home electric vehicle (EV) charging.

Fig. 3

2030 load profile for the Southern Region across three days in summer. Scenario: stable GDP growth, baseline cooling, home electric vehicle (EV) charging.

The forecasting is divided into two steps: business-as-usual and technology. The business-as-usual step is a statistical model that can only infer trends present in the data it is trained on, i.e., historical electricity demand. The technology model is a bottom-up approach that adds new loads to the total demand. Among new loads, we focus on residential and commercial cooling as well as various electric vehicles (EVs). Key insights from cooling3 and EV2 studies on peak demand development motivate the need for demand forecasting at hourly resolution. Cooling demand, mainly from split-unit air conditioner installations, is expected to increase the peak-to-mean ratio (sometimes referred to as the “peakiness”) of electricity demand in India and to shift the timing of peak demand from evening to midnight3. While electric vehicles do not constitute a large portion of total demand, certain charging schemes can contribute significantly to peak demand2. Numerous energy demand forecasts for India have recently been published as decadal snapshots1,4,5; however, these studies do not present demand at an hourly resolution. Our approach enables quantifying the impact of different technology and structural elements, such as adopting energy-efficient vs. baseline cooling technology or workplace vs. home charging for EVs, on hourly electricity consumption profiles. These insights and the accompanying data sets are needed for generation and transmission expansion planning as well as distribution network planning, and are thus essential for sustainable energy infrastructure development in the Indian context.

Similar to other forecasting studies, we model gross domestic product (GDP) growth6 as the main econometric driver of business-as-usual demand, and thus introduce three scenarios: slow, stable, and rapid GDP growth. We examine two AC load scenarios: energy-efficient equipment and baseline equipment, per the International Energy Agency’s Future of Cooling study3. Finally, we evaluate three EV charging mechanisms: home, work, and public charging. Combining these three input dimensions (3 GDP growth × 2 cooling × 3 EV charging) yields 18 scenarios, each with its own data set. Technology adoption growth is correlated with economic growth under the assumption that new technologies are adopted faster when the economy grows faster, and vice versa. We present two cooling scenarios to highlight the difference between energy-efficient and regular air conditioning units and to bring attention to the need for policies and programs that favor sales of energy-efficient cooling units. Furthermore, we present various EV charging mechanisms to inspect the demand impacts that electric vehicle charging can have on the electric grid at different times. The data produced can be used as input to electricity infrastructure planning at both the distribution and transmission levels.

Methods

Figure 4 illustrates the major steps of our proposed demand forecasting approach. We use two models to estimate future electricity demand in India. In the first, the business-as-usual model, we use linear regression to project daily peak demand and daily consumption on a regional basis. We then add natural variation to the projections by computing the error between the training data and the fitted results and scaling it to every region based on seasonality. Next, we fit the projected peak and total consumption to an annual hourly load profile for 20157 featuring an evening peak8. In the second, the technology model, we account for AC and EV adoption as an additive component on top of the business-as-usual predictions. GDP, an independent variable in the model, is chosen as the main driver of growth in the business-as-usual scenario as well as of technology adoption rates. The input data used are publicly available and are referenced in Table 1.

Fig. 4

Simplified schematic of methods; inputs in green, models in red, outputs in blue.

Table 1 Input Data Sources.

Input data processing

Although GDP is widely used for forecasting energy demand, it is especially important in the case of India, where economic growth is expected to ramp up over the next few decades, similar to recent trends in China9. We based our demand forecast on GDP projections from a PricewaterhouseCoopers (PwC) report10, which projected India’s GDP to grow from 3.6 trillion USD in 2020 to 28 trillion USD in 2050. Using historical national GDP data for India starting in 1990, we fit and project an exponential curve for rapid growth and a Gompertz curve for slow growth11, as detailed in Table 2. We use PwC’s projections to define the stable GDP growth scenario. Curve fitting and projection results are illustrated in Supplementary Fig. 1. The rapid growth scenario produces an average annual growth rate of 9.5%. PwC’s growth rates start at 7.8% in the first projected decade and end at 6.2% in the final projected decade. The slow growth scenario starts at a 7.2% growth rate in the first projected decade and ends at 3.9% in the final projected decade. To break down the regional energy consumption projections to state level, we use the ratio of each state’s GDP per capita to the GDP per capita of the region it belongs to. For each GDP growth scenario, we fit the same functions to state-wise data to produce GDP forecasts at the same resolution. State-level GDP per capita is computed from the projected GDP data and state-level population projections12.
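
As a quick consistency check on the endpoints quoted above, the following minimal sketch computes the average annual growth rate implied by growth from 3.6 to 28 trillion USD over 30 years. It also writes out the standard exponential and Gompertz functional forms. The parameter names (`a`, `r`, `b`, `c`) and the exact parametrization are assumptions for illustration; the fitted forms and values are those reported in Table 2.

```python
import math


def cagr(start_value, end_value, years):
    """Compound annual growth rate between two endpoint values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def exponential(t, a, r):
    """Exponential growth a * exp(r * t); parameter names are illustrative."""
    return a * math.exp(r * t)


def gompertz(t, a, b, c):
    """Gompertz curve a * exp(-b * exp(-c * t)); approaches a as t grows."""
    return a * math.exp(-b * math.exp(-c * t))


# Endpoints quoted in the text: 3.6 trillion USD (2020) to 28 trillion USD (2050).
stable_cagr = cagr(3.6, 28.0, 2050 - 2020)
print(f"Implied average annual growth, stable scenario: {stable_cagr:.2%}")
```

The implied average of roughly 7% per year lies between the 7.8% and 6.2% decade rates quoted for the stable scenario, as expected for a growth rate that declines over time.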

Table 2 GDP projections curve fit results.

GDP dependence and limitation

Relating growth in electricity demand to GDP is a strong generalization; however, it is not a novel one in the case of India. A strong correlation between economic growth and energy consumption has been established in the Indian context in this study and in other studies13, given data from the past two decades6. We recognize that GDP as a metric of economic growth has several limitations, particularly in projecting how economic growth is distributed across society within a state or nation. This may be the strongest limitation of the data we present in this manuscript. However, the lack of historical records and long-term projections of alternative open-access economic data at the desired spatial and temporal resolution limits the development of a framework to project energy consumption with other metrics. While GDP and energy consumption growth may diverge in the long run, there is an evident correlation between the two that can be used to estimate long-run energy consumption growth. Moving beyond linear regression may yield better results; however, data scarcity again limits the development of more complex models. Furthermore, this manuscript motivates the need for bottom-up projections in addition to regression models, because consumption trends from new demand sources such as cooling and EVs cannot be inferred from historical consumption.

Additionally, since the Future of Cooling study by the International Energy Agency relies on GDP forecasts developed by the International Monetary Fund3, we elected to use a similar metric. We intentionally develop a wide range of projection scenarios to mitigate the limitation of an individual snapshot representing a single assumption. The motivation behind presenting the described results is the ability to compare different scenarios and to analyze the demand growth and the trade-offs afterwards. To produce a wide range of growth scenarios, we needed a straightforward metric with enough historical data to fit various curves for projection.

Business-as-usual model

The business-as-usual projections are modeled with a linear regression on weather and economic growth features. The ground-truth historical daily peak demand and total consumption for each regional electric grid were obtained from the Power System Operation Corporation (POSOCO) for 2014–201914. The GDP data used in the model were obtained as explained in the previous section. Weather data were obtained from the NASA MERRA-2 data set15. The choice of features for the regression model is limited to GDP and weather variation because of the limited availability of data, both historical and projected, at the desired spatial and temporal resolution. GDP is identified as a long-term parameter driving growth in year-over-year demand projections, as highlighted in Fig. 5. Weather is identified as a short-term parameter driving seasonal variation within a year’s demand projections, as highlighted in Fig. 6. Previous parametric analysis of these features and their coefficients for short- and long-term demand forecasting in both the time and frequency domains16 reinforces their use as features for the business-as-usual regression model. We present detailed outcomes for the Southern region, with further details available in ref. 16.

Fig. 5

Southern region back test annual demand growth given GDP projection.

Fig. 6

Southern region back test seasonal demand variation given weather data.

NASA MERRA-2 data acquisition

For each of the five electric grid demand regions highlighted in the right panel of Fig. 1, the largest cities were identified using population data made available by the United Nations17. Each city’s latitude and longitude were then used to retrieve the corresponding environmental data from the NASA MERRA-2 data set. The cities used for each of the five regions are listed here:

  • Northern: Delhi, Jaipur, Lucknow, Kanpur, Ghaziabad, Ludhiana, Agra

  • Western: Mumbai, Ahmadabad, Surat, Pune, Nagpur, Thane, Bhopal, Indore, Pimpri-Chinchwad

  • Eastern: Kolkata, Patna, Ranchi (Howrah was ignored because the environmental factors are the same as Kolkata)

  • Southern: Hyderabad, Bangalore, Chennai, Visakhapatnam, Coimbatore, Vijayawada, Madurai

  • Northeast: Guwahati, Agartala, Imphal

From the NASA set, 11 variables were included for each city: specific humidity, temperature, eastward wind, and northward wind (each at 2 m and at 10 m above the surface, for eight variables in total), precipitable ice water, precipitable liquid water, and precipitable water vapor. In particular, the instantaneous two-dimensional collection “inst1_2d_asm_Nx (M2I1NXASM)” from NASA was used. Detailed descriptions of these variables are available in the MERRA-2 file specification provided by NASA15. The environmental variables in the NASA MERRA-2 dataset are provided on an hourly basis. The daily minimum, daily maximum, and daily average were calculated for each of the 11 variables for each day.
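
The daily aggregation described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function name `daily_stats` and the synthetic temperature series are hypothetical, and the sketch assumes the hourly series starts at midnight and covers whole days.

```python
from statistics import mean


def daily_stats(hourly_values):
    """Collapse a flat list of hourly values into (min, max, mean) per day."""
    if len(hourly_values) % 24 != 0:
        raise ValueError("expected a whole number of 24-hour days")
    stats = []
    for start in range(0, len(hourly_values), 24):
        day = hourly_values[start:start + 24]
        stats.append((min(day), max(day), mean(day)))
    return stats


# Hypothetical two days of hourly 2 m temperature (values are made up).
temps = [20 + (h % 24) * 0.5 for h in range(48)]
features = daily_stats(temps)
```

Applying the same function to each of the 11 variables for each city would give three daily features per variable and city.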

Forecasts

The business-as-usual demand forecasting problem was divided into ten separate problems: one each for peak demand and total consumption in each of the five regional grids shown in Fig. 1. To reduce the risk of overfitting, the models were trained with Elastic Net18 regularization and validated on held-out 2019 data. An L1 ratio of 0.9 (i.e., a penalty weighted mostly toward Lasso) was chosen to minimize error on the 2019 validation set. All of the models were then retrained with an L1 ratio of 0.9 on the full dataset.
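
For reference, one common parametrization of the Elastic Net objective is written out below. It is the form used by scikit-learn; whether the authors' implementation uses exactly this scaling is an assumption. Here $X$ is the feature matrix (GDP and weather features), $y$ the daily peak or consumption target, $n$ the number of training days, $\alpha$ the overall penalty strength (not stated in the text), and $\rho$ the L1 ratio.

```latex
\min_{w}\; \frac{1}{2n}\,\lVert y - Xw \rVert_2^2
  \;+\; \alpha\,\rho\,\lVert w \rVert_1
  \;+\; \frac{\alpha\,(1-\rho)}{2}\,\lVert w \rVert_2^2,
  \qquad \rho = 0.9
```

With $\rho = 0.9$, most of the penalty falls on the L1 term, which tends to drive the coefficients of uninformative weather features to zero.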

Addition of natural variation

This step aimed to match the statistical characteristics of the projected year to those of an actual load year. The year 2019 was used to derive the differences between actual and modeled values. Natural variation was estimated by a distribution characterized by the mean and standard deviation of the absolute differences. A natural variation adjustment drawn from this distribution was then added to each day, with a random true/false bit selecting positive or negative variation. The noise was calculated separately for each region and for peak demand and daily consumption. The natural variation (noise) vectors used are available on the GitHub repository for this paper19. This part of the process is non-deterministic, so replicating the results requires using the same natural variation vectors used in our projections.
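
A minimal sketch of this noise step is given below. It assumes a Gaussian distribution for the noise magnitude (the text specifies only a mean and standard deviation) and omits the seasonal scaling by region. The function name, argument names, and sample values are hypothetical.

```python
import random
from statistics import mean, stdev


def add_natural_variation(projected, actual, fitted, seed=None):
    """Add signed noise to daily projections based on validation-year errors."""
    rng = random.Random(seed)
    # Absolute differences between observed and modeled values (e.g. for 2019).
    abs_diff = [abs(a - f) for a, f in zip(actual, fitted)]
    mu, sigma = mean(abs_diff), stdev(abs_diff)
    noisy = []
    for value in projected:
        magnitude = rng.gauss(mu, sigma)            # assumed Gaussian
        sign = 1.0 if rng.random() < 0.5 else -1.0  # random true/false bit
        noisy.append(value + sign * magnitude)
    return noisy


# Made-up daily values for one region.
actual = [100.0, 104.0, 98.0, 101.0]
fitted = [101.0, 102.0, 99.5, 100.0]
projected = [120.0, 121.0, 119.0]
run_a = add_natural_variation(projected, actual, fitted, seed=42)
run_b = add_natural_variation(projected, actual, fitted, seed=42)
```

Fixing the seed makes the sketch repeatable, which mirrors the role of the published noise vectors in reproducing the projections.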

Hourly profiles

The statistical inference model presented above forecasts daily consumption driven by state-level economic parameters and weather data, so the projections it produces are at a daily resolution. We downscaled the data to hourly load profiles based on the 2015 hourly load profile data7. The regression results are at the regional level; they are broken down to state level by pro-rating according to the ratio of state to regional GDP per capita projections for the respective year. To construct the hourly profiles, we tag each day of the year by its month and by whether it is a weekday or a weekend day. We then cluster demand for each hour by month and day type. Each hour of the day thus has its own cluster of demand data from 2015, based on the assumption that the same hour of the day in a given month and on the same day type will exhibit similar demand behavior. This biases the construction of the profiles toward demand patterns from 2015 only. To minimize the impact of this bias, we use the historical weather data15 of the testing data years (2014–2019) for each day to simulate daily temperature variations that are reflected in higher or lower demand. We sample weather data for each day, compare it to 2015, and use the normalized difference to scale the demand on a daily basis. Finally, we sample demand for each hour of the year from the corresponding cluster (defined by month and weekend or weekday) and scale it accordingly. Constructing the hourly load profiles and fitting them to match the projected daily consumption and the projected daily peak demand then becomes an exercise in sampling and fitting from the corresponding clusters and weather data space. The 2015 hourly demand data used in this study are documented in detail elsewhere and have been used in projecting demand for supply-side modeling efforts8. Limited availability of complete hourly data at state and regional level in India biases the hourly profiles toward the 2015 datasets. However, the business-as-usual projections are for existing demands composed mainly of lighting and appliances at the residential level and large daytime loads at the commercial level20. Our approach implicitly assumes that energy consumption trends for these loads will follow historical patterns, and therefore that sampling from a given year with post-processed noise variation can yield reasonable results.
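
The clustering and sampling idea can be sketched as below. This is an illustration under simplifying assumptions, not the authors' implementation: it matches only the projected daily energy (not the daily peak), skips the weather-based scaling, and uses a synthetic 2015 series. The function names `build_clusters` and `sample_day` are hypothetical.

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta


def build_clusters(timestamps, loads):
    """Group reference-year hourly loads by (month, is_weekend, hour)."""
    clusters = defaultdict(list)
    for ts, load in zip(timestamps, loads):
        clusters[(ts.month, ts.weekday() >= 5, ts.hour)].append(load)
    return clusters


def sample_day(clusters, day, daily_energy, rng):
    """Sample one value per hour from the matching cluster, then rescale
    so that the 24 hourly values sum to the projected daily energy."""
    is_weekend = day.weekday() >= 5
    raw = [rng.choice(clusters[(day.month, is_weekend, h)]) for h in range(24)]
    scale = daily_energy / sum(raw)
    return [scale * value for value in raw]


# Synthetic 2015 hourly series with an evening peak (values are made up).
start = datetime(2015, 1, 1)
stamps = [start + timedelta(hours=i) for i in range(8760)]
loads = [100.0 + 30.0 * (18 <= ts.hour <= 21) for ts in stamps]

clusters = build_clusters(stamps, loads)
profile = sample_day(clusters, datetime(2030, 7, 15), daily_energy=3000.0,
                     rng=random.Random(0))
```

The rescaling preserves the sampled shape while forcing the hourly values to add up to the daily projection; matching the projected peak as well would require an additional fitting step.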

Impact of Climate change on business-as-usual demand

As per the International Energy Agency (IEA) World Energy Outlook (WEO) 201921, only 5% of households in India currently own air conditioning units, and 2.6% of commercial building energy use is from space cooling. Historically, electricity consumption in India has been driven by lighting and appliances in the residential sector20, with the commercial and industrial sectors contributing via larger daytime loads. Since cooling demand is not historically present in the data that the business-as-usual regression model learns from, there is no parametric value in projecting temperature increases, as there is no evident correlation between temperature increase and lighting or appliance use. Moreover, since space cooling is a small percentage of current electricity demand in India, no major trends can be identified given the limited daily training data used for the business-as-usual regression. We therefore assume that weather remains constant for the business-as-usual demand.

Technology model

Since a regression model can only produce forecasts of data it can learn from, additional bottom-up processing must be carried out to get a full picture of India’s demand in the future. We identify trends and data points at the state level of the country to build a regional profile as well as the national one.

Cooling

Cooling is divided into two main categories: residential and commercial. The ratio of commercial to residential consumption is computed from state-level data22 and is used as the ratio of commercial to residential cooling demand. Using the IEA’s baseline and efficient cooling projections from the Future of Cooling study3, we use the annual sales and unit types to calculate energy consumption and growth rate at the national level and pro-rate it down to state level by GDP per capita. Surveyed hourly demand profiles20 are indicators of behavioral cooling energy consumption patterns, as exemplified in Supplementary Figs. 2 and 3. The survey produces various profiles by climate season, household income, and household size. We apply a time-domain convolution of these profiles to generate a representative profile for each state for the various climates and seasons.

We generate the air conditioning demand profiles for two weather seasons (winter and summer) by convolving the sample profiles to obtain a smooth aggregated demand profile. Moreover, coincidence factors must be applied to properly estimate the simultaneity of the demand and its peak. Two coincidence factors are identified, one for weekdays and one for weekends; their values are extracted from a Reference Network Model Toolkit23. We break down the national cooling demand into residential and commercial demand at state level by identifying state-level sector size and growth trends. Scaling the profiles to match the projected cooling energy demand produces hourly energy consumption profiles for residential and commercial cooling. Aggregating the appropriate states yields the corresponding results at the regional level.
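
The smoothing and scaling steps can be illustrated with the sketch below. The kernel `[1, 2, 1]`, the surveyed profile, and the target energy are hypothetical; the text does not specify the kernel the authors used, and coincidence factors are not applied here.

```python
def circular_smooth(profile, kernel):
    """Circular convolution of a 24-hour profile with a normalized kernel.

    Wrapping around midnight keeps the total energy of the profile unchanged.
    """
    n, k = len(profile), len(kernel)
    total = sum(kernel)
    weights = [w / total for w in kernel]
    half = k // 2
    return [
        sum(weights[j] * profile[(i - j + half) % n] for j in range(k))
        for i in range(n)
    ]


def scale_to_energy(profile, target_energy):
    """Rescale a profile so its hourly values sum to the target energy."""
    factor = target_energy / sum(profile)
    return [factor * value for value in profile]


# Hypothetical surveyed household profile: AC runs only from 22:00 to 02:00.
surveyed = [1.0 if h in (22, 23, 0, 1) else 0.0 for h in range(24)]
smoothed = circular_smooth(surveyed, kernel=[1, 2, 1])
state_profile = scale_to_energy(smoothed, target_energy=500.0)
```

Smoothing spreads the sharp on/off edges of a single surveyed household over neighbouring hours, which is closer to what an aggregate of many households would look like.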

More importantly, the IEA’s Future of Cooling study3 stresses the use of Cooling Degree Days (CDD) to project the dependence of cooling demand on temperature. The unit consumption pattern and capacity projections for India’s share of global cooling demand are based on growth in electrification and urbanization as well as Purchasing Power Parity. The IEA Future of Cooling study estimates that a 1-degree Celsius increase in decadal average temperature in 2050 will lead to 25% more CDD and that a 2-degree Celsius increase will lead to 50% more CDD. Climate change impacts are considered in the unit sales and energy consumption data used from the IEA’s Future of Cooling study. In our analysis, we use the IEA’s 50% increase in CDD to model cooling demand in 2050. For prior periods, we interpolate CDD between 2018 and 2050 to model cooling demand. The increase in CDD and the addition of noise variation are introduced to model the projected increase in peak demand due to climate change. This analysis does not consider the frequency or forecasting of extreme weather events.
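
The interpolation of CDD between 2018 and 2050 can be sketched as follows. Linear interpolation is an assumption here, since the text does not state the interpolation method; the function name `cdd_multiplier` is hypothetical.

```python
def cdd_multiplier(year, base_year=2018, end_year=2050, end_increase=0.50):
    """CDD relative to the base year, assuming a linear ramp to +50% in 2050."""
    if not base_year <= year <= end_year:
        raise ValueError("year outside the interpolation range")
    fraction = (year - base_year) / (end_year - base_year)
    return 1.0 + end_increase * fraction


# Multipliers for the five-year snapshot years.
multipliers = {year: cdd_multiplier(year) for year in range(2020, 2051, 5)}
```

Under this assumption the midpoint year 2034 carries a 25% increase in CDD relative to 2018.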

Electric vehicles

The second component of the technology model projects EV demand in India. The data presented here consider electric two-, three-, and four-wheel vehicles. Two-wheelers, the dominant vehicle type in terms of annual sales in India24, are expected to be electrified first, followed by three-wheelers and regular cars25. The Indian government has set a goal of converting 100% of two-wheeler sales and 30% of all vehicle sales to electric by 203026, so the starting point is vehicle sales at the state level24. Using the regression equations of the corresponding GDP growth scenarios, we project car sales, with the 2030 EV targets met in the rapid growth scenario. From vehicle sales and conversion rates, we estimate the number of EVs that will require charging. From a market survey of average commute distances in urban and rural areas25, long- and short-range battery capacities and EV energy use can be estimated. We introduce a mix of EV sales that starts with short-range vehicles as the dominant market product and shifts to long-range vehicles as the dominant product in 2050. This trend reflects the current economic competitiveness of short-range EVs vs. existing internal combustion engine vehicles as well as the long-term competitiveness of long-range EVs with declining battery costs.

Similar to the construction of the cooling profiles, a coincidence factor must be applied so as not to over-predict peak EV charging demand. Since this is a new consumption behavior, and given the relatively small batteries of two-wheelers and three-wheelers, it is assumed that every vehicle needs to charge every other day on average for urban drivers and every day for rural ones. This yields an average daily consumption from EV charging. As shown in Supplementary Fig. 4, three different charging profiles (home, work, and public) are identified in an EV pilot project study in Mexico City27. While Mexico and India differ greatly in many socio-economic aspects, the hourly EV charging profiles were collected for a pilot project to deploy electric two-wheelers and small sedans in the metropolitan area of Mexico City. This presents two synergies that enable the use of these charging profiles for India: the urban setting and the focus on small vehicles. Under the assumption that EV deployment will be more prevalent in urban areas in India, with initial conversion of smaller vehicles (two-wheelers and three-wheelers), the charging data collected27 are a suitable fit for potential EV charging schemes in India. Energy consumption is computed from vehicle sales projections and electrification conversion rates. The resulting energy is then fitted under the chosen charging profile. Time-domain convolution of the profiles is applied to smooth the peakiness of the total constructed hourly time series.
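
The arithmetic behind the average daily EV charging energy and its allocation to an hourly profile can be sketched as below. The charging frequencies follow the text (every other day for urban drivers, every day for rural ones); the fleet sizes, energy per charge, and the shape of the home charging profile are hypothetical placeholders, and the coincidence factor and final smoothing are omitted.

```python
def daily_ev_energy_kwh(n_urban, n_rural, kwh_per_charge,
                        urban_charges_per_day=0.5, rural_charges_per_day=1.0):
    """Average daily charging energy for a fleet of identical vehicles."""
    return kwh_per_charge * (n_urban * urban_charges_per_day
                             + n_rural * rural_charges_per_day)


def fit_to_profile(daily_energy, shape):
    """Distribute daily energy over 24 hours in proportion to a profile shape."""
    total = sum(shape)
    return [daily_energy * s / total for s in shape]


# Hypothetical home-charging shape with an evening plateau (made up).
home_shape = [3 if (h >= 19 or h < 1) else 1 for h in range(24)]

# Hypothetical fleet: 1000 urban and 500 rural two-wheelers, 2 kWh per charge.
energy = daily_ev_energy_kwh(n_urban=1000, n_rural=500, kwh_per_charge=2.0)
hourly_kwh = fit_to_profile(energy, home_shape)
```

Swapping `home_shape` for a work or public charging shape changes when the same daily energy hits the grid, which is the effect the three charging scenarios are designed to expose.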

Data dependence

The technology model relies heavily on surveyed data to produce the representative hourly profiles for cooling and electric vehicle demands at state level. This is indeed a limitation, and our projections assume that future technology adopters will behave like initial adopters. In the absence of a better alternative at a similar spatial and temporal resolution, the bottom-up modeling effort provides a reasonable estimate of the temporal patterns expected from these new demand sources. For the hourly sample cooling profiles, the main assumption is that cooling demand depends only on weather patterns and econometric patterns. Specifically, we apply a weighted-sum convolution of the income-level cooling profiles based on each state’s GDP per capita ranking. For total cooling demand at the national level, we depend on the cooling unit sales projections as well as the breakdown of unit energy consumption under the baseline and efficient scenarios of the IEA’s Future of Cooling report3. We pro-rate residential cooling at state level using the GDP per capita projections. For commercial cooling, we use the state-wise sector growth trends28. A sanity check for this breakdown is to sum residential and commercial state-wise cooling demand and compare the total to the IEA’s all-India annual cooling electricity consumption projections to 2050; the difference is highlighted in Supplementary Figs. 5 and 6. Regarding the EV profiles, while there are alternative choices of charging schemes, we found the synergies with the Berkeley study27 to best reflect the bookend EV charging scenarios across India.
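
The pro-rating and the sanity check described above can be sketched as follows. Pro-rating in direct proportion to each state's weight is one plausible reading of the text and is an assumption; the state names, weights, and energy totals are made up.

```python
def prorate(national_total, weights):
    """Split a national total across states in proportion to their weights."""
    total_weight = sum(weights.values())
    return {state: national_total * w / total_weight
            for state, w in weights.items()}


def relative_difference(state_residential, state_commercial, national_reference):
    """Relative gap between the bottom-up sum and a national reference value."""
    bottom_up = sum(state_residential.values()) + sum(state_commercial.values())
    return (bottom_up - national_reference) / national_reference


# Hypothetical inputs for three states (all values are made up).
gdp_per_capita = {"A": 2000.0, "B": 3000.0, "C": 5000.0}
commercial_weights = {"A": 1.0, "B": 1.0, "C": 2.0}

residential = prorate(60.0, gdp_per_capita)      # TWh
commercial = prorate(40.0, commercial_weights)   # TWh
gap = relative_difference(residential, commercial, national_reference=105.0)
```

A gap close to zero indicates that the state-level breakdown reproduces the national reference; a larger gap flags a mismatch in the residential or commercial split.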

Data Records

The data are uploaded to Zenodo29 and are available for download at https://doi.org/10.5281/zenodo.4564581. The path leading to a CSV file indicates the scenario corresponding to the results in that file. The folder hierarchy is as follows:

  1. GDP Growth: slow, stable, rapid

  2. EV charging: home, work, public

  3. Cooling: baseline, efficient

  4. Type: detailed, summary

The detailed results are tables of the itemized hourly demand profile of each considered scenario; each file contains 8760 rows (the number of hours in a year). The summary results are tables of the itemized annual energy consumption for the considered years; each file contains seven rows (the number of considered future years). Both file types are itemized the same way, as per Table 3. The path of each file identifies the specific scenario that the data in the tables represent. For example, the SR.csv file under slow/home/efficient/summary is the summary file for the case of slow economic growth, home EV charging, and energy-efficient air conditioning.
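
The folder hierarchy can be navigated programmatically with a sketch like the one below. The folder order follows the example path given above; the helper name `scenario_path` is hypothetical, and `SR.csv` is the only file name taken from the text (other file names in the archive are not assumed here).

```python
from itertools import product
from pathlib import PurePosixPath

GDP_GROWTH = ("slow", "stable", "rapid")
EV_CHARGING = ("home", "work", "public")
COOLING = ("baseline", "efficient")
RESULT_TYPES = ("detailed", "summary")


def scenario_path(gdp, ev, cooling, result_type, filename):
    """Relative path of a result file inside the downloaded archive."""
    return PurePosixPath(gdp, ev, cooling, result_type, filename)


# All combinations of the three scenario dimensions.
scenarios = list(product(GDP_GROWTH, EV_CHARGING, COOLING))

example = scenario_path("slow", "home", "efficient", "summary", "SR.csv")
```

Each of the 18 scenario folders holds both a `detailed` and a `summary` subfolder, so iterating over `scenarios` and `RESULT_TYPES` visits every result table.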

Table 3 Output data headers descriptor.

Technical Validation

The business-as-usual statistical model is validated using standard statistical metrics under backtesting. Further details on the backtesting are available elsewhere16. For the technology model, we compare our estimates to the IEA’s WEO1,21,30,31 and Brookings India5. Furthermore, our EV projections compare favorably against those of the IEA’s Global Electric Vehicle Outlook 20202.

Back testing

Daily consumption and peak demand are projected for all five regions; we show the daily consumption back tests for the Southern Region in Fig. 7. More results can be found on the GitHub repository. It is important to note that the regression model captures the organic growth of historical demand as well as the seasonal variation in demand, but is not accurate at predicting daily variation. This shortcoming can be attributed to the small training dataset available. To compensate for it, we add noise variation as discussed earlier in the Methods section. We compare the R-squared value of the regression alone versus the regression plus noise time series, as shown in Table 4. Additionally, selected parameter performance metrics of the model for the Southern Region are presented in Table 5. The model’s independent variables are the historical temperature and humidity data at 2 m and 10 m elevation for the selected cities and GDP data for the state. Various weather parameters have a higher coefficient than GDP since the latter is not as granular a metric, but GDP is still factored in for longer-term growth, as interpreted by its Fourier component16.
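
For readers reproducing the comparison in Table 4, the R-squared metric can be computed as in this minimal sketch. It uses the standard coefficient of determination; the sample values are made up and do not correspond to any result in the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up daily consumption values and regression output.
observed = [10.0, 12.0, 14.0, 16.0]
regression_only = [11.0, 12.0, 14.0, 15.0]
score = r_squared(observed, regression_only)
```

The same function can be applied to the regression output with and without the added noise series to compare the two fits.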

Fig. 7

Back test result for Southern Region regression model.

Table 4 Business-as-usual Regression R-squared consumption results.
Table 5 Business-as-usual Southern Region consumption Regression performance of select parameters.

Cross-comparison

Supplementary Figs. 7 and 8 compare the forecasting results to the WEO 2020 projections of India’s Energy Demand to 2040. Our band of projections is notably wider due to the large number of scenarios that are combined to forecast energy demand. We further compare our results to Brookings India’s study in Supplementary Fig. 9. We also compare our electric vehicle projections to those of the Global EV Outlook in Supplementary Fig. 10. Finally, we compare our air conditioning demand contribution to the peak demand to the Future of Cooling study in Supplementary Fig. 5.

COVID-19 pandemic impact on year 2020

The COVID-19 pandemic has drastically affected the global population in various ways. Energy consumption dropped severely as people were advised to stay at home. While it is not possible to project such “Black Swan” events from historical data, their long-term effects can be modeled as delayed growth under various recovery schemes. Figure 8 shows that our projections for the month of January 2020 align with the realized demand, which is prior to the global outbreak of COVID-19. Evidently, there is a strong mismatch in the following months as the outbreak developed into a global pandemic. However, in the later part of the year, signs of recovery are noticed where the historical daily consumption once again reaches projected levels.

Fig. 8

2020 year-to-date demand comparison with projections.

The impact of extreme events on energy consumption is difficult to predict at a granular level. Our projections are at five-year increments so that such yearly variations are smoothed out and regression toward the mean is observed. Moreover, the recovery from extreme events and their long-term impact can depend on many factors: economic, social, scientific, and more. Without modeling those events in detail, projected growth can model the long-term average growth rate. In the case of a negative extreme event, a smaller growth rate can model the long-term impact caused by the slowdown. Similarly, a positive extreme event can be modeled as a larger growth rate to include the long-term impact of the rapid growth. With signals of a fast recovery in total daily consumption for most regions, we elected to disregard projections that model a long-term COVID-19 pandemic impact to avoid confirmation bias. Moreover, there is little data to support projections modeling a long-term impact on Indian energy consumption. We believe that the model and data presented in this paper are valid beyond the COVID-19 pandemic.

Usage Notes

The format of the results is comma-separated values (CSV). All the results are available on the Zenodo Open-Access repository29.