CCM: Controlling the Change Magnitude in High Dimensional Data
The effectiveness of change-detection algorithms is often assessed on real-world datasets by injecting synthetically generated changes. Typically, the magnitude of the introduced changes is not controlled, and most experimental practices lead to results that are difficult to reproduce and compare. This problem becomes particularly relevant as the data dimension grows, as happens in big data applications.
To enable a fair comparison among change-detection algorithms, we have designed “Controlling Change Magnitude” (CCM), a rigorous method to introduce changes in multivariate datasets. In particular, we measure the change magnitude as the symmetric Kullback-Leibler divergence between the pre- and post-change distributions, and introduce changes by applying a roto-translation directly to the data. We present an algorithm to identify the parameters yielding the desired change magnitude, and analytically prove its convergence. Our experiments show the effectiveness of the proposed method and the limitations of tests run on high-dimensional datasets when changes are injected following traditional approaches. The MATLAB framework implementing the proposed method is made publicly available for download.
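As a rough illustration of the idea, the sketch below tunes a roto-translation until it reaches a prescribed symmetric Kullback-Leibler divergence. It is not the CCM algorithm or the MATLAB framework: it assumes the pre-change data follow a single 2-D Gaussian with zero mean and diagonal covariance, so the divergence has a closed form, and it fixes the rotation angle and bisects only on the length of the translation. The function names `sym_kl_gauss_2d` and `find_shift_scale` and all parameter choices are hypothetical.

```python
import math


def sym_kl_gauss_2d(var, theta, shift):
    """Symmetric KL between N(0, diag(var)) and its image under x -> R(theta) x + shift.

    Assumption: the pre-change distribution is a zero-mean 2-D Gaussian
    with diagonal covariance, so the divergence is available in closed form.
    """
    a, b = var
    c, s = math.cos(theta), math.sin(theta)
    # Post-change covariance S1 = R * diag(a, b) * R^T
    s11 = c * c * a + s * s * b
    s22 = s * s * a + c * c * b
    s12 = c * s * (a - b)
    det = s11 * s22 - s12 * s12
    # Entries of the inverse of S1
    i11, i22, i12 = s22 / det, s11 / det, -s12 / det
    tr_a = i11 * a + i22 * b      # tr(S1^{-1} S0)
    tr_b = s11 / a + s22 / b      # tr(S0^{-1} S1)
    # Mahalanobis-type term v^T (S0^{-1} + S1^{-1}) v
    vx, vy = shift
    maha = vx * vx * (1 / a + i11) + vy * vy * (1 / b + i22) + 2 * vx * vy * i12
    # The log-determinant terms of the two KL divergences cancel in the sum.
    return 0.5 * (tr_a + tr_b - 4) + 0.5 * maha


def find_shift_scale(var, theta, direction, target, iters=100):
    """Bisect on the translation length t so that the symmetric KL equals target."""
    def gap(t):
        shift = (t * direction[0], t * direction[1])
        return sym_kl_gauss_2d(var, theta, shift) - target

    if gap(0.0) > 0:
        raise ValueError("the rotation alone already exceeds the target magnitude")
    lo, hi = 0.0, 1.0
    while gap(hi) < 0:            # grow the bracket until it contains the root
        hi *= 2
    for _ in range(iters):        # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical settings: variances (1, 4), rotation of 0.3 rad, shift along the first axis.
t = find_shift_scale((1.0, 4.0), 0.3, (1.0, 0.0), target=1.0)
achieved = sym_kl_gauss_2d((1.0, 4.0), 0.3, (t, 0.0))
print(t, achieved)
```

Under these assumptions the divergence grows monotonically with the translation length, which is what makes bisection applicable here; for real data the divergence generally has to be estimated rather than computed in closed form.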
Keywords: Bisection Method, Change Magnitude, Realistic Monitoring, Multivariate Dataset, Popular Machine Learning