Evaluation Procedures for Forecasting with Spatio-Temporal Data
Abstract
The amount of available spatio-temporal data has been increasing as large-scale data collection (e.g., from geosensor networks) becomes more prevalent. This has led to a growing number of spatio-temporal forecasting applications that use geo-referenced time series data, motivated by important domains such as environmental monitoring (e.g., air pollution indices, forest fire risk prediction). Being able to properly assess the performance of new forecasting approaches is fundamental to achieving progress. However, the dependence between observations that the spatio-temporal context implies, besides being challenging in the modelling step, also raises issues for performance estimation, as indicated by previous work. In this paper, we empirically compare several variants of cross-validation (CV) and out-of-sample (OOS) performance estimation procedures that respect data ordering, using both artificially generated and real-world spatio-temporal data sets. Our results show that both CV and OOS procedures provide useful estimates. Further, they suggest that blocking may help address CV’s tendency to underestimate error. OOS estimates can be very sensitive to test size, as expected, but they can be improved by careful management of the temporal dimension in training. Code related to this paper is available at: https://github.com/mrfoliveira/Evaluation-procedures-for-forecasting-with-spatio-temporal-data.
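To make the distinction between the two families of procedures concrete, the following is a minimal Python sketch of how time steps might be partitioned under temporally blocked CV and under a simple OOS holdout. It is not the authors' implementation (their code is in the repository linked above); the function names `temporal_block_folds`, `block_cv_splits`, and `oos_holdout_split` are hypothetical, and the sketch assumes that all locations observed at a given time step are assigned to the same partition. Spatial blocking and the other variants compared in the paper are not shown.

```python
from typing import List, Tuple


def temporal_block_folds(n_times: int, k: int) -> List[List[int]]:
    """Split time indices 0..n_times-1 into k contiguous blocks.

    Hypothetical helper: blocks keep neighbouring time steps together
    instead of scattering them at random across folds.
    """
    if not 1 <= k <= n_times:
        raise ValueError("k must be between 1 and n_times")
    base, extra = divmod(n_times, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` blocks get one more step
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def block_cv_splits(n_times: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train, test) time indices for temporally blocked k-fold CV.

    Each contiguous block is used once for testing; all other blocks
    (both earlier and later in time) are used for training.
    """
    folds = temporal_block_folds(n_times, k)
    splits = []
    for i, test in enumerate(folds):
        train = [t for j, fold in enumerate(folds) if j != i for t in fold]
        splits.append((train, test))
    return splits


def oos_holdout_split(n_times: int, test_fraction: float) -> Tuple[List[int], List[int]]:
    """Return (train, test) time indices for a single OOS holdout.

    The model is trained on the earliest time steps and tested on the
    most recent ones, so no future data is used in training.
    """
    n_test = max(1, round(n_times * test_fraction))
    if n_test >= n_times:
        raise ValueError("test_fraction leaves no data for training")
    cut = n_times - n_test
    return list(range(cut)), list(range(cut, n_times))


if __name__ == "__main__":
    for train, test in block_cv_splits(n_times=10, k=3):
        print("CV  train:", train, "test:", test)
    train, test = oos_holdout_split(n_times=10, test_fraction=0.2)
    print("OOS train:", train, "test:", test)
```

In this sketch the OOS estimate rests on a single test window whose length is set by `test_fraction`, which is one way to see why such estimates can be sensitive to test size, whereas blocked CV averages over several test blocks at the cost of training on data that follows the test period.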
Keywords
Evaluation methods, Performance estimation, Cross-validation, Spatio-temporal data, Geo-referenced time series, Reproducible research
Acknowledgments
This work is partially funded by the ERDF through the COMPETE 2020 Programme within project POCI-01-0145-FEDER-006961, and by National Funds through the FCT as part of project UID/EEA/50014/2013. Mariana Oliveira is supported by a FCT/MAPi PhD research grant (PD/BD/128166/2016). Vítor Santos Costa is supported by the project POCI-01-0145-FEDER-016844.