Identifying Price Index Classes for Electricity Consumers via Dynamic Gradient Boosting

  • Vanh Khuyen Nguyen (Email author)
  • Wei Emma Zhang
  • Quan Z. Sheng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11234)


Electricity retailers buy electricity at spot prices and resell it to their customers at fixed retail prices. The electricity market is complex, however, with highly volatile spot prices, and high-price events can occur during peak periods when energy demand increases significantly, making the setting of retail prices a challenging task. Understanding the consumer price index, a price indicator associated with customers' electricity consumption, helps energy retailers make critical pricing decisions. In this work, we apply a dynamic gradient boosting model, CatBoost, to classify customers into groups according to their price indices. To benchmark our results, we compare the performance of CatBoost against several baselines: Random Forest, AdaBoost, XGBoost, and LightGBM. Our experiments show that CatBoost outperforms the other algorithms thanks to its effective overfitting detector and categorical encoding techniques. The area under the Receiver Operating Characteristic (ROC) curve, commonly known as AUC, is used as the standard metric to evaluate and compare the classifiers. CatBoost achieved the smallest gap between train and test AUC scores, 0.02, outperforming all other models.
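The abstract's evaluation rests on the ROC AUC and on the train/test AUC gap as an overfitting indicator. As a minimal illustration (not the paper's code, and with synthetic data in place of the real customer consumption features and price-index labels), AUC can be computed from the rank-sum (Mann-Whitney U) formulation and the gap compared directly:

```python
# Minimal sketch: ROC AUC from scratch, and the train/test AUC gap
# that the paper uses to compare CatBoost against other boosting
# baselines. All data here is synthetic and purely illustrative.

def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) formulation: the
    probability that a random positive example is scored above a
    random negative one. Tied scores receive averaged ranks."""
    pairs = sorted(zip(scores, labels))
    rank_of = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_of[k] = avg
        i = j
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    rank_sum_pos = sum(r for r, (_, y) in zip(rank_of, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Perfectly separated scores give the maximal AUC of 1.0.
print(roc_auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))  # 1.0

# A small train/test gap (e.g. CatBoost's reported 0.02) suggests
# little overfitting; a large gap suggests the model memorized noise.
train_auc = roc_auc([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.8])
test_auc = roc_auc([0, 1, 1, 0], [0.6, 0.7, 0.3, 0.4])
print(abs(train_auc - test_auc))  # 0.5
```

In practice one would obtain the two scores from a fitted classifier's predicted probabilities on the train and test splits; the rank-based formula above matches what library implementations such as scikit-learn's `roc_auc_score` compute.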


Keywords: Classification learning · CatBoost · Gradient boosting model



This study was funded by the Capital Markets Cooperative Research Centre (CMCRC) and supported for data collection by Mojo Power, Australia.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Vanh Khuyen Nguyen (1, Email author)
  • Wei Emma Zhang (1)
  • Quan Z. Sheng (1)
  1. Department of Computing, Macquarie University, Sydney, Australia
