Abstract
Prediction is one of the most important activities when working with time series. There are many alternative ways to model a time series, and finding the right one is challenging. Most data-centric models, whether statistical or machine learning, have hyperparameters to tune, and setting them correctly is essential for good predictions. The task is even more complex because time series prediction also demands choosing a data preprocessing method that complies with the chosen model. Many machine learning frameworks, such as Scikit-Learn, provide features to build models and tune their hyperparameters. However, few works address the joint tuning of data preprocessing hyperparameters and model building. TSPredIT addresses this issue by providing a framework that seamlessly integrates the tuning of data preprocessing activities with model hyperparameters. TSPredIT is made available as an R package, which provides functions for defining and conducting time series prediction, including data pre(post)processing, decomposition, hyperparameter optimization, modeling, prediction, and accuracy assessment. TSPredIT is also extensible, which significantly expands the framework’s applicability, especially in combination with other languages such as Python.
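The central idea of the abstract — tuning the preprocessing choice jointly with the model’s hyperparameters rather than separately — can be illustrated outside the package. The sketch below is a hypothetical, minimal Python analogue, not TSPredIT’s R API: it grid-searches a preprocessing option (identity vs. differencing, with the inverse transform applied at prediction time as postprocessing) together with a window-size hyperparameter `k` of a simple moving-average forecaster, scoring each combination on a holdout tail of the series.

```python
import statistics

def forecast_next(history, preprocess, k):
    """One-step-ahead forecast: mean of the last k (possibly differenced) values."""
    if preprocess == "diff":
        diffs = [b - a for a, b in zip(history, history[1:])]
        step = statistics.mean(diffs[-k:])
        return history[-1] + step           # postprocessing: invert the differencing
    return statistics.mean(history[-k:])    # identity preprocessing

def holdout_mse(series, preprocess, k, n_test):
    """Walk-forward evaluation of a (preprocessing, k) combination on the tail."""
    errors = []
    for i in range(len(series) - n_test, len(series)):
        pred = forecast_next(series[:i], preprocess, k)
        errors.append((series[i] - pred) ** 2)
    return statistics.mean(errors)

def tune(series, n_test=8):
    """Jointly search the preprocessing choice and the model hyperparameter k."""
    grid = [(p, k) for p in ("none", "diff") for k in (2, 3, 5, 8)]
    return min(grid, key=lambda cfg: holdout_mse(series, cfg[0], cfg[1], n_test))

# On a trending (nonstationary) toy series, the joint search selects differencing,
# which an identity-only model search could never discover.
series = [0.7 * t for t in range(40)]
print(tune(series))
```

The point of the sketch is that the search space is the cross-product of preprocessing options and model hyperparameters, so a combination can win that neither a fixed-preprocessing model search nor a fixed-model preprocessing search would find.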
Acknowledgements
The authors thank CNPq, CAPES (finance code 001), and FAPERJ for partially sponsoring this research.
Copyright information
© 2023 Springer-Verlag GmbH Germany, part of Springer Nature
Cite this chapter
Salles, R. et al. (2023). TSPredIT: Integrated Tuning of Data Preprocessing and Time Series Prediction Models. In: Hameurlain, A., Tjoa, A.M., Boucelma, O., Toumani, F. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems LIV. Lecture Notes in Computer Science(), vol 14160. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-68014-8_2
Print ISBN: 978-3-662-68013-1
Online ISBN: 978-3-662-68014-8
eBook Packages: Computer Science (R0)