Effective Feature Preprocessing for Time Series Forecasting

  • Jun Hua Zhao
  • ZhaoYang Dong
  • Zhao Xu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)


Time series forecasting is an important area of data mining research. Feature preprocessing techniques have a significant influence on forecasting accuracy and are therefore essential to a forecasting model. Although several feature preprocessing techniques have been applied to time series forecasting, no systematic study has so far compared their performance, and how to select effective preprocessing techniques for a forecasting model remains an open problem. In this paper, the authors conduct a comprehensive study of existing feature preprocessing techniques to evaluate their empirical performance in time series forecasting. The experiments demonstrate that effective feature preprocessing can significantly enhance forecasting accuracy. This research can serve as a useful guide for researchers in selecting feature preprocessing techniques and integrating them with time series forecasting models.


  • Support Vector Machine
  • Feature Selection
  • Independent Component Analysis
  • Support Vector Machine Model
  • Mean Absolute Percentage Error
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jun Hua Zhao (1)
  • ZhaoYang Dong (1)
  • Zhao Xu (2)
  1. The School of Information Technology and Electrical Engineering, The University of Queensland, Australia
  2. Centre for Electric Technology (CET), Ørsted*DTU, Technical University of Denmark, Denmark
