Change Detection from Heterogeneous Data Sources

  • Chapter
Data Mining for Service

Part of the book series: Studies in Big Data ((SBD,volume 3))

Abstract

In real business settings, taking the right action at the right time is critical to success. A first step toward this is detecting signs of change in the business situation, which poses an important technical challenge. This chapter focuses on change detection technologies, covering both outlier detection and change-point detection. In particular, we address how to handle the heterogeneous and dynamic nature of the data that is common in service businesses. We describe a singular spectrum transformation approach to change-point detection for heterogeneous data, and we introduce a novel proximity-based outlier detection technique that handles the dynamic nature of the data. We demonstrate the utility of the proposed methods using real-world sensor data.
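
As a rough illustration of the change-point side, the following is a minimal sketch of one common formulation of singular spectrum transformation, and not necessarily the exact variant developed in the chapter. At each time step it compares the dominant subspace of recent past windows with the dominant direction of slightly later windows, and scores how much of that direction falls outside the past subspace. The function names `hankel` and `sst_scores`, and the parameters `w` (window length), `n` (number of windows), `r` (subspace rank), and `lag`, are assumptions introduced here for illustration.

```python
import numpy as np


def hankel(x, end, w, n):
    """Stack the n most recent length-w windows of x as columns.

    The last window covers x[end - w:end]; each earlier column is
    shifted back by one sample.
    """
    return np.column_stack(
        [x[end - w - j:end - j] for j in range(n - 1, -1, -1)]
    )


def sst_scores(x, w=20, n=20, r=3, lag=10):
    """Illustrative change-point score in [0, 1] for a 1-D series.

    Parameter names and defaults are hypothetical choices for this sketch.
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    scores = np.zeros(T)
    for t in range(w + n - 1, T - lag):
        past = hankel(x, t, w, n)          # windows ending at or before t
        future = hankel(x, t + lag, w, n)  # windows shifted forward by lag
        # Top-r left singular vectors span the "typical" past patterns.
        U = np.linalg.svd(past, full_matrices=False)[0][:, :r]
        # Dominant left singular vector of the later windows.
        beta = np.linalg.svd(future, full_matrices=False)[0][:, 0]
        # Fraction of beta lying outside the past subspace.
        scores[t] = max(0.0, 1.0 - float(np.sum((U.T @ beta) ** 2)))
    return scores
```

Calling `sst_scores(x)` on a one-dimensional series returns one score per time step, with positions that lack enough history or future samples left at zero. Because the score depends only on the directions of the singular vectors and not on the scale of the signal, scores from differently scaled variables can be compared, which is one reason this kind of transformation suits heterogeneous data.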

Corresponding author

Correspondence to Tsuyoshi Idé.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Idé, T. (2014). Change Detection from Heterogeneous Data Sources. In: Yada, K. (ed.) Data Mining for Service. Studies in Big Data, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45252-9_13

  • DOI: https://doi.org/10.1007/978-3-642-45252-9_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45251-2

  • Online ISBN: 978-3-642-45252-9

  • eBook Packages: Engineering, Engineering (R0)
