Real-Time Change-Point Detection Using Sequentially Discounting Normalized Maximum Likelihood Coding

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6635)

Abstract

We are concerned with real-time change-point detection in time series. This technology has recently received considerable attention in data mining because it applies to a wide variety of important risk management problems, such as detecting failures of computer devices from computer performance data and detecting masqueraders or malicious executables from computer access logs. In this paper we propose a new method of real-time change-point detection that employs sequentially discounting normalized maximum likelihood (SDNML) coding. SDNML is a method for sequential data compression, which we newly develop in this paper. It attains the least code length for the sequence, and the effect of past data is gradually discounted as time goes on, so the compression adapts to non-stationary data sources. In our method, SDNML is used to learn the mechanism of a time series, and a change-point score at each time is then measured in terms of the SDNML code length. We empirically demonstrate the significant superiority of our method over existing methods, such as the predictive-coding method and the hypothesis-testing method, in terms of detection accuracy and computational efficiency on artificial data sets. We further apply our method to a real security problem, malware detection, and empirically demonstrate that it is able to detect unseen security incidents at significantly early stages.
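The general idea of scoring change points by code length can be illustrated with a small sketch. This is not the SDNML coding proposed in the paper: it swaps in a simpler stand-in, a first-order autoregressive model whose statistics are exponentially discounted, with the plug-in Gaussian code length (negative log predictive density) averaged over a short window as the score. The function names, the discount rate, the window size, and the synthetic mean-shift series are all assumptions made for illustration.

```python
import math
import random
from collections import deque


def discounted_code_length_scores(series, discount=0.05, window=5):
    """Score each step by a windowed average of plug-in Gaussian code lengths.

    Simplified stand-in for illustration only (NOT the paper's SDNML):
    an AR(1) model x_t ~ coef * x_{t-1} whose sufficient statistics are
    exponentially discounted so that old data gradually loses influence.
    scores[i] refers to the observation series[i + 1].
    """
    sxx, sxy, var = 1.0, 0.0, 1.0      # assumed initial statistics
    recent = deque(maxlen=window)      # most recent code lengths
    scores = []
    for prev, cur in zip(series, series[1:]):
        # Predict with parameters learned from the past only.
        coef = sxy / sxx
        err = cur - coef * prev
        # Code length in nats: -log N(cur; coef * prev, var).
        code_length = 0.5 * math.log(2.0 * math.pi * var) + err * err / (2.0 * var)
        recent.append(code_length)
        scores.append(sum(recent) / len(recent))
        # Discounted updates of the statistics.
        sxx = (1.0 - discount) * sxx + discount * prev * prev
        sxy = (1.0 - discount) * sxy + discount * prev * cur
        var = (1.0 - discount) * var + discount * err * err
    return scores


def make_shifted_series(n=200, shift=20.0, seed=0):
    """Hypothetical test data: unit-variance noise whose mean jumps by `shift` at index n."""
    rng = random.Random(seed)
    before = [rng.gauss(0.0, 1.0) for _ in range(n)]
    after = [rng.gauss(shift, 1.0) for _ in range(n)]
    return before + after


if __name__ == "__main__":
    series = make_shifted_series()
    scores = discounted_code_length_scores(series)
    peak = max(range(len(scores)), key=scores.__getitem__)
    print("highest score at observation", peak + 1)
```

Because each code length is computed before the model is updated with the new observation, a sudden change makes the data expensive to encode and the score rises, while the discounting lets the model adapt afterwards so the score falls back.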



Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Urabe, Y., Yamanishi, K., Tomioka, R., Iwai, H. (2011). Real-Time Change-Point Detection Using Sequentially Discounting Normalized Maximum Likelihood Coding. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 6635. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20847-8_16

  • DOI: https://doi.org/10.1007/978-3-642-20847-8_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20846-1

  • Online ISBN: 978-3-642-20847-8

  • eBook Packages: Computer Science, Computer Science (R0)
