Abstract
This study investigates the effectiveness of probability forecasts output by standard machine learning techniques (Neural Network, C4.5, K-Nearest Neighbours, Naive Bayes, SVM and HMM) when tested on time series datasets from various problem domains. Raw data was converted into a pattern classification problem using a sliding window approach, with the target label for each window set to a discretised future value of the series. Experiments were conducted in the online learning setting, which mirrors the sequential order in which time series data arrives. The performance of each learner’s probability forecasts was assessed using ROC curves, square loss, classification accuracy and Empirical Reliability Curves (ERC) [1]. Our results demonstrate that effective probability forecasts can be generated on time series data, and we discuss the practical implications of this result.
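The sliding window conversion, the online protocol and the square loss mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the up/down discretisation rule, the `horizon` parameter and the Laplace-smoothed frequency forecaster are all assumptions made for illustration, and the square loss is taken here in its binary form, (p - y)^2 averaged over examples.

```python
from typing import List, Sequence, Tuple


def sliding_window_dataset(
    series: Sequence[float], window: int, horizon: int = 1
) -> List[Tuple[List[float], int]]:
    """Convert a series into (pattern, label) pairs.

    Assumed discretisation (hypothetical, not from the paper): the label is 1
    if the value `horizon` steps after the end of the window is greater than
    the last value inside the window, else 0.
    """
    examples = []
    for start in range(len(series) - window - horizon + 1):
        pattern = list(series[start:start + window])
        future = series[start + window + horizon - 1]
        label = 1 if future > pattern[-1] else 0
        examples.append((pattern, label))
    return examples


def online_frequency_forecasts(labels: Sequence[int]) -> List[float]:
    """Forecast P(label = 1) for each example before its label is revealed.

    A stand-in learner (assumption): the Laplace-smoothed frequency of the
    positive class among the labels seen so far.
    """
    forecasts = []
    ones = 0
    for n, y in enumerate(labels):
        forecasts.append((ones + 1) / (n + 2))  # predict first ...
        ones += y                               # ... then observe the label
    return forecasts


def square_loss(forecasts: Sequence[float], labels: Sequence[int]) -> float:
    """Mean of (p - y)^2 over all examples (binary form of the square loss)."""
    return sum((p - y) ** 2 for p, y in zip(forecasts, labels)) / len(labels)


if __name__ == "__main__":
    data = sliding_window_dataset([1, 2, 3, 2, 4], window=2)
    labels = [y for _, y in data]
    forecasts = online_frequency_forecasts(labels)
    print(data)
    print(forecasts, square_loss(forecasts, labels))
```

Any of the learners listed in the abstract could replace the frequency forecaster, provided each forecast is made before the corresponding label is seen, as the online setting requires.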
References
Lindsay, D., Cox, S.: Improving the Reliability of Decision Tree and Naive Bayes Learners. In: Proc. of the 4th ICDM, pp. 459–462. IEEE, Los Alamitos (2004)
Zadrozny, B., Elkan, C.: Transforming Classifier Scores into Accurate Multiclass Probability Estimates. In: Proc. of the 8th ACM SIGKDD, pp. 694–699. ACM Press, New York (2002)
Dawid, A.P.: Calibration-based empirical probability (with discussion). Annals of Statistics 13, 1251–1285 (1985)
Murphy, A.H.: A New Vector Partition of the Probability Score. Journal of Applied Meteorology 12, 595–600 (1973)
Witten, I., Frank, E.: Data Mining - Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco (2000)
Provost, F., Fawcett, T.: Analysis and Visualisation of Classifier Performance: Comparison Under Imprecise Class and Cost Distributions. In: Proc. of the 3rd ICKDD, pp. 43–48. AAAI Press, Menlo Park (1997)
Fayyad, U., Irani, K.: The attribute selection problem in decision tree generation. In: Proc. of 10th Nat. Conf. on Artificial Intelligence, pp. 104–110. AAAI Press, Menlo Park (1992)
Atkeson, C.G., Moore, A.W., Schaal, S.: Locally Weighted Learning. Artificial Intelligence Review 11, 11–73 (1997)
Smola, A., Bartlett, P., Schölkopf, B., Schuurmans, D.: Advances in Large Margin Classifiers. MIT Press, Cambridge (1999)
Chatfield, C.: The Analysis of Time Series: An Introduction, 4th edn. Chapman and Hall, London (1989)
Keogh, E., Kasetty, S.: On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. In: Proc. of the 8th ACM SIGKDD, pp. 102–111. ACM Press, New York (2002)
Hull, J.C.: Options, Futures, and other Derivatives, 5th edn. Prentice-Hall, Upper Saddle River (2002)
Vovk, V., Takemura, A., Shafer, G.: Defensive Forecasting. In: Proc. of 10th International Workshop on Artificial Intelligence and Statistics. Electronic publication, Cologne University (2005)
Langford, J., Zadrozny, B.: Estimating Class Membership Probabilities Using Classifier Learners. In: Proc. of 10th International Workshop on Artificial Intelligence and Statistics. Electronic publication, Cologne University (2005)
Vlachos, M., Hadjieleftheriou, M., Gunopulos, D., Keogh, E.: Indexing Multi-Dimensional Time-Series with Support for Multiple Distance Measures. In: Proc. of the 9th ACM SIGKDD, pp. 216–225. ACM Press, New York (2003)
Syed, N., Liu, H., Sung, K.: Incremental Learning with Support Vector Machines. In: Proc. of Workshop on Support Vector Machines at IJCAI 1999. Electronic publication, Cologne University (1999)
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lindsay, D., Cox, S. (2005). Effective Probability Forecasting for Time Series Data Using Standard Machine Learning Techniques. In: Singh, S., Singh, M., Apte, C., Perner, P. (eds) Pattern Recognition and Data Mining. ICAPR 2005. Lecture Notes in Computer Science, vol 3686. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551188_4
DOI: https://doi.org/10.1007/11551188_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28757-5
Online ISBN: 978-3-540-28758-2
eBook Packages: Computer Science (R0)