Abstract
In this paper, we propose a Geometric Moving Average Martingale (GMAM) method for detecting changes in data streams. The method rests on two components. The first is the exponential weighting of observations, which can reduce false detections. The second is the use of the GMAM value for hypothesis testing: when a new data point is observed, the test decides, based on the GMAM value, whether a change has occurred at that point. Once a change is detected, all variables of the GMAM algorithm are re-initialized so that subsequent changes can be found. Experiments show that the GMAM method is effective in detecting concept changes in two synthetic time-varying data streams and in a real-world dataset, the 'Respiration dataset'.
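The abstract does not state the GMAM statistic itself, so the following is only a minimal sketch of the general pattern it describes: exponentially weight the incoming observations, test the weighted value against a threshold at every new point, and re-initialize all state once a change is flagged. A plain geometric moving average of deviations from the running mean stands in here for the martingale value. The function name `detect_changes` and the parameters `lam`, `threshold` and `warmup` (with their default values) are illustrative assumptions, not taken from the paper.

```python
from typing import Iterable, List


def detect_changes(
    stream: Iterable[float],
    lam: float = 0.2,        # assumed weight given to the newest observation
    threshold: float = 2.0,  # assumed decision threshold for the test
    warmup: int = 5,         # assumed number of points used before testing starts
) -> List[int]:
    """Return the indices at which a change is flagged.

    Illustrative stand-in only: the tested statistic is a geometric
    (exponentially weighted) moving average of deviations from the
    running mean of the current segment, not the paper's GMAM value.
    """
    changes: List[int] = []
    n = 0        # number of points seen in the current segment
    mean = 0.0   # running mean of the current segment
    g = 0.0      # geometric moving average of deviations

    for i, x in enumerate(stream):
        if n >= warmup:
            deviation = x - mean
            # Older deviations decay geometrically with factor (1 - lam).
            g = (1.0 - lam) * g + lam * deviation
            if abs(g) > threshold:
                changes.append(i)
                # Re-initialize every variable so later changes can be found.
                n, mean, g = 0, 0.0, 0.0
        # The current point always joins the (possibly new) segment.
        n += 1
        mean += (x - mean) / n

    return changes


if __name__ == "__main__":
    # Synthetic stream with two abrupt level shifts, at indices 30 and 60.
    data = [0.0] * 30 + [5.0] * 30 + [0.0] * 30
    print(detect_changes(data))
```

Because a single large deviation contributes only a fraction `lam` of its size, one outlier moves the statistic less than a sustained shift does; this is the sense in which exponential weighting can suppress false detections. A smaller `lam` smooths more strongly but delays detection.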
Copyright information
© 2012 Springer-Verlag London
About this paper
Cite this paper
Kong, X.Z., Bi, Y.X., Glass, D.H. (2012). A Geometric Moving Average Martingale method for detecting changes in data streams. In: Bramer, M., Petridis, M. (eds) Research and Development in Intelligent Systems XXIX. SGAI 2012. Springer, London. https://doi.org/10.1007/978-1-4471-4739-8_6
DOI: https://doi.org/10.1007/978-1-4471-4739-8_6
Publisher Name: Springer, London
Print ISBN: 978-1-4471-4738-1
Online ISBN: 978-1-4471-4739-8
eBook Packages: Computer Science (R0)