EWMA Based Two-Stage Dataset Shift-Detection in Non-stationary Environments
Abstract
Dataset shift is a major challenge in non-stationary environments, where the input data distribution may change over time. In time-series data, detecting the shift point, where the distribution changes its properties, is of utmost interest. Dataset shift exists in a broad range of real-world systems. Such systems require continuous monitoring of the process behavior and tracking of the state of the shift, so that adaptive corrections can be initiated in a timely manner. This paper presents a novel method to detect the shift point based on a two-stage structure involving an Exponentially Weighted Moving Average (EWMA) chart and the Kolmogorov-Smirnov test, which substantially reduces the type-I error rate. The algorithm is suitable for real-time use. Its performance is evaluated through experiments using synthetic and real-world datasets. The results show the effectiveness of the proposed approach in terms of a decreased type-I error rate with a tolerable increase in detection delay.
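The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation, since the abstract does not give the details. The sketch assumes that stage 1 is a standard EWMA control chart applied to the raw observations, and that stage 2 confirms each stage-1 alarm with a two-sample Kolmogorov-Smirnov test between a reference sample and a window of recent data. The function names, the smoothing constant `lam=0.2`, the limit multiplier `L=3.0`, the window length of 30 and the large-sample critical coefficient 1.358 (significance level 0.05) are all illustrative assumptions.

```python
import math
import statistics


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def detect_shift(stream, reference, lam=0.2, L=3.0, window=30, c_alpha=1.358):
    """Return the first index flagged by the EWMA chart and confirmed by KS.

    Hypothetical helper: `reference` is shift-free data used to estimate the
    in-control mean and standard deviation. Returns None if no shift is
    confirmed.
    """
    mu0 = statistics.mean(reference)
    sigma0 = statistics.stdev(reference)
    n, m = len(reference), window
    # Large-sample critical value of the two-sample KS test.
    critical = c_alpha * math.sqrt((n + m) / (n * m))

    z = mu0  # EWMA statistic, started at the in-control mean
    for t, x in enumerate(stream):
        # Stage 1: update the EWMA and compare it with the control limits.
        z = lam * x + (1 - lam) * z
        half_width = L * sigma0 * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * (t + 1)))
        )
        if abs(z - mu0) <= half_width:
            continue
        # Stage 2: validate the alarm on a window starting at the flagged
        # point. Online, this means waiting for `window` samples to arrive.
        recent = stream[t:t + window]
        if len(recent) < window:
            return None  # not enough data yet to run the second stage
        if ks_statistic(reference, recent) > critical:
            return t
        # Otherwise treat the alarm as a false positive and keep monitoring.
    return None
```

Calling `detect_shift(stream, reference)` on a list of observations returns the index of the first confirmed shift, or `None`. The second stage is what trades a lower false-alarm rate for extra delay, because a stage-1 alarm is only reported once a full window of later samples is available for the test.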
Keywords
Non-stationary, Dataset shift, EWMA, Online shift-detection