
Dynamic Ensemble Selection for Imbalanced Data Stream Classification with Limited Label Access


Part of the Lecture Notes in Computer Science book series (LNAI, volume 12855)

Abstract

In addition to possible concept drift, real data streams often exhibit a high imbalance ratio. Another important problem in real classification tasks, often overlooked in the literature, is the cost of obtaining labels. This work aims to connect three rarely combined research directions: data stream classification, imbalanced data classification, and limited access to labels. To this end, the behavior of the desisc-sb framework, proposed by the authors in earlier work for the classification of highly imbalanced data streams, was examined under limited label access. Experiments conducted on synthetic and real streams confirmed the potential of desisc-sb for classifying highly imbalanced data streams even when label availability is low.
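To make the interaction between class imbalance and a label budget concrete, the following is a minimal sketch and not the authors' method. It assumes a chunk-based stream, a purely random labeling strategy, and a budget expressed as a fraction of each chunk; the helper names `imbalance_ratio` and `query_labels` are hypothetical and do not come from the paper or its repository.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    if len(counts) < 2:
        # Only one class observed: the ratio is undefined, report infinity.
        return float("inf")
    return max(counts.values()) / min(counts.values())


def query_labels(chunk_labels, budget, rng):
    """Reveal labels for a random `budget` fraction of one chunk.

    Assumption: random labeling; an active learning strategy would
    choose the queried indices differently.
    """
    n_queries = int(round(budget * len(chunk_labels)))
    picked = rng.sample(range(len(chunk_labels)), n_queries)
    return {i: chunk_labels[i] for i in sorted(picked)}


rng = random.Random(0)
# One synthetic chunk: 190 majority (0) and 10 minority (1) samples.
chunk = [0] * 190 + [1] * 10
rng.shuffle(chunk)

revealed = query_labels(chunk, budget=0.1, rng=rng)
print(imbalance_ratio(chunk))  # 19.0
print(len(revealed))           # 20
```

Under these assumptions, a 10% budget on a 200-sample chunk with 10 minority samples reveals on average only one minority label per chunk, which illustrates why combining high imbalance with low label availability is difficult.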

Keywords

  • Data stream
  • Classifier selection
  • Active learning


Notes

  1. https://github.com/w4k2/icaisc21-al-stream


Acknowledgment

This work was supported by the Polish National Science Centre under the grant No. 2017/27/B/ST6/01325.

Author information

Corresponding author

Correspondence to Paweł Zyblewski.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Zyblewski, P., Woźniak, M. (2021). Dynamic Ensemble Selection for Imbalanced Data Stream Classification with Limited Label Access. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2021. Lecture Notes in Computer Science, vol 12855. Springer, Cham. https://doi.org/10.1007/978-3-030-87897-9_20


  • DOI: https://doi.org/10.1007/978-3-030-87897-9_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87896-2

  • Online ISBN: 978-3-030-87897-9

  • eBook Packages: Computer Science, Computer Science (R0)