Evolving Systems, Volume 9, Issue 1, pp 1–23

Discussion and review on evolving data streams and concept drift adapting

  • Imen Khamassi
  • Moamar Sayed-Mouchaweh
  • Moez Hammami
  • Khaled Ghédira
Original Paper


Recent advances in computational intelligent systems have focused on addressing complex problems related to the dynamicity of environments. In an increasing number of real-world applications, data are presented as streams that may evolve over time, a phenomenon known as concept drift. Handling concept drift is becoming an attractive research topic that concerns multidisciplinary domains such as machine learning, data mining, ubiquitous knowledge discovery, and statistical decision theory. Therefore, a rich body of literature has been devoted to the study of methods and techniques for handling drifting data. However, this literature is fairly dispersed and does not define guidelines for choosing an appropriate approach for a given application. Hence, the main objective of this survey is to present an easily understood account of concept drift issues and related work, in order to help researchers from different disciplines consider concept drift handling in their applications. This survey covers different facets of existing approaches, evokes discussion, and helps readers identify the key criteria that allow them to properly design their own approach. For this purpose, a new categorization of the existing state of the art is presented, with criticisms, future tendencies and not-yet-addressed challenges.


Adaptive learning; Evolving learning; Evolving data stream; Change detection; Concept drift; Statistical hypothesis test



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Imen Khamassi (1)
  • Moamar Sayed-Mouchaweh (2)
  • Moez Hammami (1)
  • Khaled Ghédira (1)
  1. SOIE, Institut Supérieur de Gestion de Tunis, Université de Tunis, Le Bardo, Tunisia
  2. Mines-Douai, IA, Douai, France
