Representation Learning in Power Time Series Forecasting

  • Janosch Henze
  • Jens Schreiber
  • Bernhard Sick
Part of the Studies in Computational Intelligence book series (SCI, volume 865)


Renewable energy resources have become a fundamental part of the electrical power supply in many countries. In Germany, they contribute up to \(29\%\) of the energy mix. However, integrating these variable energy resources raises a variety of challenges, among them short-term and long-term power generation forecasts, load forecasts, the integration of multiple numerical weather prediction (NWP) models, simultaneous power forecasts for many renewable farms and areas, and scenario generation for renewable power generation. All these tasks vary in difficulty depending on the representation of the input features. As an example, consider formulas that express laws of physics and thereby make cause and effect in otherwise complex problems computable. Similar to the expressiveness of such formulas, deep learning provides a framework to represent data in a way that suits the task at hand. Once a neural network has learned such a representation in a supervised or semi-supervised manner, the representation can be reused across the various tasks in renewable energy. In this chapter, we present different techniques for obtaining appropriate representations for renewable power forecasting tasks, showing the similarities and differences between deep learning-based techniques and traditional algorithms such as (kernel) PCA. We support the theoretical foundations with evaluations of these techniques on publicly available datasets for renewable energy, such as the GEFCOM 2014 data, the Europe Wind Farm data, and the German Solar Farm data. Finally, we give recommendations that assist the reader in building and selecting representation learning algorithms for domains other than renewable energy.
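The abstract contrasts deep representation learners with (kernel) PCA as a traditional baseline. The following minimal sketch illustrates the PCA side of that comparison on a synthetic hourly power signal; the signal, window length, and representation size are hypothetical choices for illustration. A linear autoencoder trained with squared-error loss learns the same principal subspace, which is why the two families of techniques are directly comparable.

```python
import numpy as np

# Hypothetical setup: a toy "power" time series with a daily cycle
# (one sample per hour) plus measurement noise.
rng = np.random.default_rng(0)
t = np.arange(1000)
power = np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(t.size)

# Build a matrix of overlapping 24-hour windows, one window per row,
# and centre the features as PCA requires.
win = 24
X = np.lib.stride_tricks.sliding_window_view(power, win)
Xc = X - X.mean(axis=0)

# PCA via SVD: the top-k right singular vectors span the principal
# subspace; projecting onto them yields a k-dimensional representation.
k = 3
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:k].T        # learned low-dimensional representation
X_hat = Z @ Vt[:k]       # linear reconstruction from that representation

# Fraction of variance retained by the k-dimensional representation.
explained = 1 - ((Xc - X_hat) ** 2).sum() / (Xc ** 2).sum()
print(round(float(explained), 3))
```

Because the sinusoidal windows lie (up to noise) in a two-dimensional subspace, a three-dimensional representation already retains almost all of the variance; a downstream forecasting model can then operate on `Z` instead of the raw windows.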


Keywords: Representation learning · Deep learning · Renewable energy · Time series



This work was supported within the projects Prophesy (0324104A) and C/sells RegioFlexMarkt Nordhessen (03SIN119), funded by the BMWi (Bundesministerium für Wirtschaft und Energie / German Federal Ministry for Economic Affairs and Energy).



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Intelligent Embedded Systems Lab, University of Kassel, Kassel, Germany
