Abstract
In business, taking the right action at the right time is critical to success. A first step is detecting signs of change in the business situation, which is an important technical challenge. This chapter focuses on change detection technologies, covering both outlier detection and change-point detection. In particular, we focus on how to handle the heterogeneous and dynamic nature of the data, which is common in service businesses. We describe singular spectrum transformation, an approach to change-point detection for heterogeneous data. We also introduce a novel proximity-based outlier detection technique that handles the dynamic nature of the data. Using real-world sensor data, we demonstrate the utility of the proposed methods.
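To make the idea of singular spectrum transformation concrete, the following is a minimal sketch of one common textbook formulation, not necessarily the exact variant used in the chapter. At each time point it compares the principal subspace of a "past" trajectory (Hankel) matrix with the leading direction of a lagged "future" one, and reports one minus the squared overlap as the change score. The function name `sst_scores` and the parameters `w`, `n`, `r`, and `lag` (with their default values) are assumptions chosen for illustration.

```python
import numpy as np

def _trajectory(x, start, w, n):
    # Stack n overlapping length-w windows, starting at `start`, as columns.
    return np.column_stack([x[start + j : start + j + w] for j in range(n)])

def sst_scores(x, w=20, n=20, r=3, lag=10):
    """Change score per time index (assumed formulation; 0 where undefined).

    w   : window length (rows of each trajectory matrix)
    n   : number of windows (columns of each trajectory matrix)
    r   : number of past singular vectors that define the "normal" subspace
    lag : shift between the past and the future trajectory matrices
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    scores = np.zeros(T)
    for t in range(w + n - 1, T - lag + 1):
        start = t - w - n + 1
        past = _trajectory(x, start, w, n)          # covers x[start : t]
        future = _trajectory(x, start + lag, w, n)  # covers x[start+lag : t+lag]
        # Top-r left singular vectors of the past matrix.
        U = np.linalg.svd(past, full_matrices=False)[0][:, :r]
        # Leading left singular vector of the future matrix.
        beta = np.linalg.svd(future, full_matrices=False)[0][:, 0]
        # Near 0 when beta lies in span(U), near 1 when it is orthogonal.
        scores[t] = max(0.0, 1.0 - float(np.sum((U.T @ beta) ** 2)))
    return scores
```

Because the score depends only on subspace overlap, it is insensitive to the scale and units of each series, which is one reason this kind of transformation is attractive for heterogeneous data. In practice the window sizes would need tuning to the time scale of the changes of interest.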
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Idé, T. (2014). Change Detection from Heterogeneous Data Sources. In: Yada, K. (eds) Data Mining for Service. Studies in Big Data, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45252-9_13
DOI: https://doi.org/10.1007/978-3-642-45252-9_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45251-2
Online ISBN: 978-3-642-45252-9
eBook Packages: Engineering, Engineering (R0)