An Online Competence-Based Concept Drift Detection Algorithm

  • Anjin Liu
  • Guangquan Zhang
  • Jie Lu
  • Ning Lu
  • Chin-Teng Lin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9992)

Abstract

The ability to adapt to new learning environments is a vital feature of contemporary case-based reasoning systems. It is imperative that decision makers know when and how to discard outdated cases and apply new cases to perform smart maintenance operations. Competence-based empirical distance has recently been proposed as a measure that can estimate the difference between case sample sets without knowing the actual case distributions. It is reportedly one of the most accurate drift detection algorithms on both synthetic and real-world data sets. However, because constructing the competence model requires retaining every case in memory, it is not suitable for online drift detection. In addition, its high computational complexity, \(O(n^{2})\), limits its practical application, especially when dealing with large-scale data sets under time constraints. In this paper, therefore, we propose a space-based online case grouping strategy and a new case group enhanced competence distance (CGCD) to address these issues. The experimental results show that the proposed strategy and related algorithms significantly improve the efficiency of the current leading competence-based drift detection algorithm.
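The abstract does not give the details of the grouping strategy or of CGCD, so the following is only a minimal sketch of the general idea of summarising a stream by groups instead of stored cases. It assumes numeric cases, a fixed-width grid as the grouping rule, and total variation distance between per-cell frequencies as a stand-in distance; the names `cell_of`, `GroupedWindow`, and `grouped_distance` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import floor
import random


def cell_of(case, width):
    """Map a numeric case (tuple of floats) to the grid cell that contains it."""
    return tuple(floor(x / width) for x in case)


class GroupedWindow:
    """Summarise a stream window by per-cell counts instead of stored cases."""

    def __init__(self, width):
        self.width = width        # assumed grid resolution, chosen by the user
        self.counts = Counter()   # memory grows with occupied cells, not cases
        self.n = 0

    def add(self, case):
        # Constant-time update per arriving case; the case itself is discarded.
        self.counts[cell_of(case, self.width)] += 1
        self.n += 1


def grouped_distance(a, b):
    """Total variation distance between two windows' cell frequencies, in [0, 1].

    This is an illustrative stand-in, not the paper's CGCD.
    """
    if a.n == 0 or b.n == 0:
        raise ValueError("both windows need at least one case")
    cells = set(a.counts) | set(b.counts)
    return 0.5 * sum(abs(a.counts[c] / a.n - b.counts[c] / b.n) for c in cells)


# Demo on synthetic data: two windows from the same distribution, one shifted.
rng = random.Random(0)
old, same, shifted = GroupedWindow(0.5), GroupedWindow(0.5), GroupedWindow(0.5)
for _ in range(2000):
    old.add((rng.gauss(0, 1), rng.gauss(0, 1)))
    same.add((rng.gauss(0, 1), rng.gauss(0, 1)))
    shifted.add((rng.gauss(2, 1), rng.gauss(2, 1)))

print("no drift:", grouped_distance(old, same))
print("drift:   ", grouped_distance(old, shifted))
```

In this sketch each arriving case costs one counter update, and comparing two windows costs time proportional to the number of occupied cells rather than to the square of the number of cases. A real detector would also need a significance test to decide when a distance is large enough to signal drift, which this sketch leaves out.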

Keywords

Case-based reasoning, Concept drift, Online clustering

Notes

Acknowledgment

This work is supported by the Australian Research Council (ARC) under discovery grant DP150101645. The authors would also like to thank the anonymous reviewers for their valuable feedback and all members of the Decision Systems and e-Service Intelligence laboratory at the University of Technology Sydney for their discussions.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Anjin Liu (1), email author
  • Guangquan Zhang (1)
  • Jie Lu (1)
  • Ning Lu (2)
  • Chin-Teng Lin (1)

  1. QCIS, University of Technology Sydney, Ultimo, Australia
  2. SAS Institute Inc., Lane Cove, Australia
