Abstract
The ability to adapt to new learning environments is a vital feature of contemporary case-based reasoning systems. It is imperative that decision makers know when and how to discard outdated cases and apply new cases to perform smart maintenance operations. Competence-based empirical distance has recently been proposed as a measurement that can estimate the difference between case sample sets without knowing the actual case distributions. It is reportedly one of the most accurate drift detection algorithms on both synthetic and real-world data sets. However, because the construction of competence models has to retain every case in memory, it is not suitable for online drift detection. In addition, its high computational complexity, \(O(n^{2})\), limits its practical application, especially when dealing with large-scale data sets under time constraints. In this paper, therefore, we propose a space-based online case grouping strategy and a new case group enhanced competence distance (CGCD) to address these issues. The experimental results show that the proposed strategy and related algorithms significantly improve the efficiency of the current leading competence-based drift detection algorithm.
Acknowledgment
This work is supported by the Australian Research Council (ARC) under Discovery Grant DP150101645. The authors would also like to thank the anonymous reviewers for their valuable feedback and all members of the Decision Systems and e-Service Intelligence laboratory of the University of Technology Sydney for their discussions.
Copyright information
© 2016 Springer International Publishing AG
Cite this paper
Liu, A., Zhang, G., Lu, J., Lu, N., Lin, C.-T. (2016). An Online Competence-Based Concept Drift Detection Algorithm. In: Kang, B.H., Bai, Q. (eds) AI 2016: Advances in Artificial Intelligence. AI 2016. Lecture Notes in Computer Science, vol 9992. Springer, Cham. https://doi.org/10.1007/978-3-319-50127-7_36
DOI: https://doi.org/10.1007/978-3-319-50127-7_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50126-0
Online ISBN: 978-3-319-50127-7
eBook Packages: Computer Science, Computer Science (R0)