Abstract
An important aspect of classifying applications under a priori uncertainty is operating the algorithms in streaming mode, where measurement data arrive continuously. A distinctive feature of streaming classification is concept drift, which occurs when the phenomenon under study, for which the data were collected, changes over time. For nonstationary data streams, a classifier of mobile applications should therefore be paired with a concept drift detector (CDD). The paper proposes a two-stage algorithm for detecting a concept change in the observed data stream. The algorithm relies on statistical characteristics of the attributes, analyzed with two sliding windows that track changes in the current statistics of the mobile application attributes. In the first stage, key statistics are tested against the Fisher criterion. In the second stage, the Page–Hinkley test is applied. Experiments on an artificial data set yielded dependences that make it possible to evaluate the performance of the proposed two-stage concept drift detection algorithm. It is shown that the CDD reduces the probability of classification error at each concept change by about 5%.
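The abstract does not give the details of the two stages, so the following Python fragment is only a minimal sketch of how such a detector might be arranged, not the authors' implementation. It assumes that the "Fisher criterion" refers to an F-test on the ratio of the sample variances in two adjacent sliding windows, that the Page–Hinkley test runs on the same single attribute, and that drift is declared only when both stages agree. The names `variance_ratio`, `PageHinkley`, and `detect_drift`, the window length, and the values of `f_crit`, `delta`, and `threshold` are all hypothetical choices made for illustration.

```python
from collections import deque
from statistics import variance


def variance_ratio(old, new):
    """F statistic: larger sample variance divided by the smaller one."""
    v_old, v_new = variance(old), variance(new)
    hi, lo = max(v_old, v_new), min(v_old, v_new)
    if lo == 0:
        return 1.0 if hi == 0 else float("inf")
    return hi / lo


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a sequence."""

    def __init__(self, delta, threshold):
        self.delta = delta          # tolerated deviation (assumed value)
        self.threshold = threshold  # alarm level (assumed value)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.cum_min = 0.0

    def update(self, x):
        """Consume one sample; return True if the test signals a change."""
        self.n += 1
        self.mean += (x - self.mean) / self.n       # running mean
        self.cum += x - self.mean - self.delta      # cumulative deviation
        self.cum_min = min(self.cum_min, self.cum)  # lowest value seen so far
        return self.cum - self.cum_min > self.threshold


def detect_drift(stream, window=20, f_crit=2.0, delta=0.05, threshold=10.0):
    """Return the index of the first confirmed drift, or None if there is none."""
    buffer = deque(maxlen=2 * window)  # older window followed by newer window
    ph = PageHinkley(delta, threshold)
    for i, x in enumerate(stream):
        buffer.append(x)
        ph_alarm = ph.update(x)
        if len(buffer) < 2 * window:
            continue
        samples = list(buffer)
        old, new = samples[:window], samples[window:]
        # Stage 1: variance-ratio screen. Stage 2: Page-Hinkley confirmation.
        if variance_ratio(old, new) > f_crit and ph_alarm:
            return i
    return None


# Synthetic attribute (illustrative only): alternates 0/1, then
# switches to alternating 3/4 at index 200.
stable = [float(i % 2) for i in range(200)]
shifted = [3.0 + (i % 2) for i in range(200)]
drift_index = detect_drift(stable + shifted)
```

In this sketch the variance ratio rises while the newer window straddles the change point, which is why a shift in the mean can pass the first stage at all. A real detector would take `f_crit` from the F distribution for the chosen window length and significance level rather than use a fixed constant.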
Ethics declarations
The authors declare no conflict of interest.
About this article
Cite this article
Sheluhin, O.I., Sekretarev, S.A. Concept Drift Detection in Streaming Classification of Mobile Application Traffic. Aut. Control Comp. Sci. 55, 253–262 (2021). https://doi.org/10.3103/S0146411621030093
DOI: https://doi.org/10.3103/S0146411621030093