Abstract
Time series forecasting is an important area in data mining research. Feature preprocessing techniques have a significant influence on forecasting accuracy and are therefore essential in a forecasting model. Although several feature preprocessing techniques have been applied in time series forecasting, no systematic research has so far studied and compared their performance, and how to select effective feature preprocessing techniques for a forecasting model remains an open problem. In this paper, we conduct a comprehensive study of existing feature preprocessing techniques to evaluate their empirical performance in time series forecasting. Our experiment demonstrates that effective feature preprocessing can significantly enhance forecasting accuracy. This research can serve as useful guidance for researchers on effectively selecting feature preprocessing techniques and integrating them with time series forecasting models.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhao, J.H., Dong, Z., Xu, Z. (2006). Effective Feature Preprocessing for Time Series Forecasting. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_84
DOI: https://doi.org/10.1007/11811305_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0
eBook Packages: Computer Science (R0)