Progress in Artificial Intelligence, Volume 1, Issue 1, pp 89–101

Learning from streaming data with concept drift and imbalance: an overview

  • T. Ryan Hoens
  • Robi Polikar
  • Nitesh V. Chawla


The primary focus of machine learning has traditionally been on learning from data assumed to be sufficient and representative of the underlying fixed, yet unknown, distribution. Such restrictions on the problem domain paved the way for the development of elegant algorithms with theoretically provable performance guarantees. As is often the case, however, real-world problems rarely fit neatly into such restricted models. For instance, class distributions are often skewed, resulting in the “class imbalance” problem. Data drawn from non-stationary distributions are also common in real-world applications, resulting in the “concept drift” or “non-stationary learning” problem, which is often associated with streaming data scenarios. Recently, these problems have each received increased research attention independently; however, the combined problem of addressing all of the above-mentioned issues has received relatively little. If the ultimate goal of intelligent machine learning algorithms is to address a wide spectrum of real-world scenarios, then the need for a general framework for learning from, and adapting to, a non-stationary environment that may introduce imbalanced data can hardly be overstated. In this paper, we first present an overview of each of these challenging areas, followed by a comprehensive review of recent research toward developing such a general framework.


Class imbalance, Concept drift, Data streams, Classification



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • T. Ryan Hoens (1)
  • Robi Polikar (2)
  • Nitesh V. Chawla (1)

  1. Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, USA
  2. Electrical and Computer Engineering, Rowan University, Glassboro, USA
