A Geometric Moving Average Martingale method for detecting changes in data streams

Abstract

In this paper, we propose a Geometric Moving Average Martingale (GMAM) method for detecting changes in data streams. Two components underpin the GMAM method. The first is the exponential weighting of observations, which can reduce the number of falsely detected changes. The second is the use of the GMAM value for hypothesis testing: when a new data point is observed, the test decides, based on the GMAM value, whether a change has occurred at that point. Once a change is detected, all variables of the GMAM algorithm are re-initialized so that further changes can be found. The experiments show that the GMAM method is effective in detecting concept changes in two synthetic time-varying data streams and in a real-world dataset, the ‘Respiration dataset’.
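The detect-then-reset loop described above can be illustrated with a minimal sketch. The abstract does not give the GMAM formulas, so this is not the authors' method: it swaps in a plain geometric (exponentially weighted) moving average of standardized observations, tested against the usual asymptotic control limit. The function name `detect_changes` and the parameters `weight`, `threshold`, and `warmup` are hypothetical choices made for illustration only.

```python
import math
from typing import Iterable, List


def detect_changes(stream: Iterable[float], weight: float = 0.2,
                   threshold: float = 3.0, warmup: int = 10) -> List[int]:
    """Return indices at which a change is flagged.

    Assumption: a plain exponentially weighted moving average stands in
    for the GMAM value, which the abstract does not define.
    """
    changes: List[int] = []
    baseline: List[float] = []   # points used to estimate the current regime
    mean, std = 0.0, 1.0
    gma = 0.0                    # geometric moving average statistic
    # Asymptotic standard deviation of an EWMA of unit-variance inputs.
    limit = threshold * math.sqrt(weight / (2.0 - weight))

    for i, x in enumerate(stream):
        # Warm-up: collect points to estimate the regime's mean and spread.
        if len(baseline) < warmup:
            baseline.append(x)
            if len(baseline) == warmup:
                mean = sum(baseline) / warmup
                var = sum((v - mean) ** 2 for v in baseline) / warmup
                std = max(math.sqrt(var), 1e-8)  # guard against zero spread
            continue

        # Exponential weighting: recent observations count the most.
        z = (x - mean) / std
        gma = (1.0 - weight) * gma + weight * z

        # Hypothesis test on the current statistic.
        if abs(gma) > limit:
            changes.append(i)
            # Re-initialize every variable so later changes can be found.
            baseline = []
            gma = 0.0

    return changes


if __name__ == "__main__":
    data = [0.0, 1.0] * 25 + [10.0, 11.0] * 25
    print(detect_changes(data))
```

In this sketch a smaller `weight` smooths the statistic more, so single outliers are less likely to trigger an alarm, at the cost of reacting more slowly to a genuine shift.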

Keywords

  • Data Stream
  • Control Chart
  • Concept Drift
  • Sequential Probability Ratio Test
  • Martingale Theory

Chapter

  • DOI: 10.1007/978-1-4471-4739-8_6
  • Chapter length: 14 pages

Corresponding author

Correspondence to X. Z. Kong.

Copyright information

© 2012 Springer-Verlag London

About this paper

Cite this paper

Kong, X.Z., Bi, Y.X., Glass, D.H. (2012). A Geometric Moving Average Martingale method for detecting changes in data streams. In: Bramer, M., Petridis, M. (eds) Research and Development in Intelligent Systems XXIX. SGAI 2012. Springer, London. https://doi.org/10.1007/978-1-4471-4739-8_6

  • DOI: https://doi.org/10.1007/978-1-4471-4739-8_6

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4738-1

  • Online ISBN: 978-1-4471-4739-8

  • eBook Packages: Computer Science (R0)