Online neural network model for non-stationary and imbalanced data stream classification

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

“Concept drift” and class imbalance are two challenges for supervised classifiers. Concept drift (or non-stationarity) refers to changes in the underlying function being learnt, and class imbalance is a large difference between the numbers of instances in different classes of data. Class imbalance is an obstacle to the efficiency of most classifiers. Previous methods for classifying non-stationary and imbalanced data streams mainly focus on batch solutions, in which the classification model is trained using a chunk of data. Here, we propose an online neural network (NN) model. The NN model is composed of two parts, one for handling concept drift and one for handling class imbalance. Concept drift is handled with a forgetting function, and class imbalance is handled with a specific error function that assigns different importance to errors in separate classes. The proposed method is evaluated on 3 synthetic and 8 real-world datasets. The results show a statistically significant improvement over previous online NN methods.
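
The abstract does not give the forgetting function or the error function, so the following is only a rough sketch of how the two ideas could be combined, not the authors' model. It assumes a single logistic unit, an exponential forgetting function `forget ** age`, fixed per-class error costs, and a short window of recent samples; the class name `ForgettingWeightedNeuron` and all parameter names and default values are hypothetical.

```python
import math
from collections import deque


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class ForgettingWeightedNeuron:
    """Single logistic unit trained online on a weighted squared error.

    Hypothetical illustration only: the forgetting function (forget ** age)
    and the per-class costs are assumptions, not the paper's formulas.
    """

    def __init__(self, n_features, forget=0.9, class_cost=(1.0, 5.0),
                 lr=0.5, window=20):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.forget = forget          # 0 < forget <= 1; smaller forgets faster
        self.class_cost = class_cost  # cost of an error on class 0 / class 1
        self.lr = lr
        self.recent = deque(maxlen=window)  # most recent (x, y) pairs

    def predict_proba(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def sample_weight(self, age, y):
        # Forgetting function times class-dependent error importance.
        return (self.forget ** age) * self.class_cost[y]

    def weighted_error(self):
        # E = sum_i forget^age_i * cost[y_i] * (y_i - p_i)^2 over the window.
        newest = len(self.recent) - 1
        return sum(
            self.sample_weight(newest - i, y) * (y - self.predict_proba(x)) ** 2
            for i, (x, y) in enumerate(self.recent)
        )

    def partial_fit(self, x, y):
        # Store the new sample, then take one gradient step on E.
        self.recent.append((list(x), y))
        newest = len(self.recent) - 1
        grad_w = [0.0] * len(self.w)
        grad_b = 0.0
        for i, (xi, yi) in enumerate(self.recent):
            p = self.predict_proba(xi)
            # dE/dz for one sample: -2 * weight * (y - p) * p * (1 - p)
            g = -2.0 * self.sample_weight(newest - i, yi) * (yi - p) * p * (1.0 - p)
            for j, v in enumerate(xi):
                grad_w[j] += g * v
            grad_b += g
        self.w = [wj - self.lr * gj for wj, gj in zip(self.w, grad_w)]
        self.b -= self.lr * grad_b
```

In this sketch, older samples contribute less to the error because their weight shrinks with age, which lets the unit track a drifting concept, while an error on the class with the larger cost (assumed here to be the minority class, label 1) moves the weights more than an error on the other class. A caller would feed the stream one instance at a time through `partial_fit(x, y)` and query `predict_proba(x)` in between.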



Author information

Corresponding author

Correspondence to Adel Ghazikhani.

About this article

Cite this article

Ghazikhani, A., Monsefi, R. & Sadoghi Yazdi, H. Online neural network model for non-stationary and imbalanced data stream classification. Int. J. Mach. Learn. & Cyber. 5, 51–62 (2014). https://doi.org/10.1007/s13042-013-0180-6

  • DOI: https://doi.org/10.1007/s13042-013-0180-6
