A CPM-Based Change Detection Test for Big Data

  • Giada Tacconelli
  • Manuel Roveri
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 529)


Big data analytics is currently one of the most relevant and promising research areas in the field of Big Data. Tools and solutions designed for this purpose are meant to analyse very large sets of data to extract relevant and valuable information. In this direction, this paper addresses the problem of sequentially analysing big streams of data to inspect them for changes. This problem, which has been extensively studied for scalar and multivariate datastreams, has been mostly left unattended in the Big Data scenario. More specifically, the aim of this paper is to introduce a change detection test able to detect changes in datastreams characterized by very large dimensions (up to 1000). The proposed test, based on a change-point method, is nonparametric (in the sense that it does not require any a priori information about the system under inspection or the possible changes) and is designed to detect changes in the mean vector of the datastreams. The effectiveness and efficiency of the proposed change detection test have been evaluated on both synthetic and real datasets.
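To make the idea of a change-point method for mean-vector changes more concrete, the following is a minimal sketch and not the test proposed in the paper. It assumes a batch of d-dimensional observations, scores every candidate split by a per-dimension standardised difference of means, and averages the squared scores across dimensions. The function name `cpm_mean_shift`, the parameter `min_seg`, and the aggregation rule are illustrative assumptions; a real test would also need a threshold calibrated to control false positives, which is not shown here.

```python
import random


def cpm_mean_shift(stream, min_seg=5):
    """Return (best_split, best_stat) for a list of d-dimensional observations.

    Illustrative sketch only: for each candidate split k, the difference of
    means in each dimension is standardised with a pooled variance estimate,
    and the squared scores are averaged over the d dimensions.
    """
    n, d = len(stream), len(stream[0])
    # Prefix sums of values and squared values, so each split costs O(d).
    pre = [[0.0] * d]
    pre_sq = [[0.0] * d]
    for x in stream:
        pre.append([p + v for p, v in zip(pre[-1], x)])
        pre_sq.append([p + v * v for p, v in zip(pre_sq[-1], x)])

    best_k, best_stat = None, float("-inf")
    for k in range(min_seg, n - min_seg + 1):
        stat = 0.0
        for j in range(d):
            s1, s2 = pre[k][j], pre[n][j] - pre[k][j]
            q1, q2 = pre_sq[k][j], pre_sq[n][j] - pre_sq[k][j]
            m1, m2 = s1 / k, s2 / (n - k)
            # Pooled within-segment sum of squares (floored to avoid division by zero).
            ss = (q1 - k * m1 * m1) + (q2 - (n - k) * m2 * m2)
            var = max(ss / (n - 2), 1e-12)
            stat += (m1 - m2) ** 2 / (var * (1.0 / k + 1.0 / (n - k)))
        stat /= d
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


# Synthetic example (assumed setup): 200 samples, 20 dimensions,
# unit-variance Gaussian noise, mean shift of 1.0 in every dimension at t = 120.
rng = random.Random(0)
stream = [
    [rng.gauss(0.0, 1.0) + (1.0 if t >= 120 else 0.0) for _ in range(20)]
    for t in range(200)
]
k_hat, stat = cpm_mean_shift(stream)
print(k_hat, stat)
```

In a sequential setting, a procedure of this kind would be re-evaluated as new samples arrive, and a change would be signalled once the maximum statistic exceeds a threshold chosen for the desired false positive rate.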


Keywords: Real Dataset · Detection Ability · False Positive Detection · Detection Delay · Synthetic Experiment



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Politecnico di Milano, Milano, Italy
