
Final Remarks and Challenging Problems

  • Leszek Rutkowski
  • Maciej Jaworski
  • Piotr Duda
Chapter
Part of the Studies in Big Data book series (SBD, volume 56)

Abstract

In this book, we studied the problem of data stream mining, which has recently become an important and challenging topic in computer science research. The reason is the enormous growth in the volume of data generated in various areas of human activity. Data streams [1, 2, 3] are potentially infinite in size and often arrive at the system at very high rates, so it is not possible to store all the data in memory. Appropriate algorithms should therefore use synopsis structures to compress the information gathered from past data. Moreover, data stream mining algorithms should be fast enough to keep pace with the incoming data. Most often they are incremental, i.e. each data element is processed at most once. Alternatively, the data stream can be analyzed block by block. Another feature of data streams is that the underlying data distribution may change over time, a phenomenon known in the literature as ‘concept drift’ [4, 5]. A good data stream mining method should be able to react to different types of change. The algorithms studied in this book fall into three groups: methods based on decision trees, probabilistic neural networks, and ensemble methods. A separate part of the book was devoted to each group.
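
The single-pass, synopsis-based processing described above can be illustrated with a minimal sketch. It is not one of the methods studied in the book: Welford's running mean and variance is used here only as a simple, well-known example of a constant-memory synopsis, and the names RunningStats and summarize are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Constant-memory synopsis of a numeric stream (Welford's method).

    Illustrative assumption only; not an algorithm taken from the book.
    """
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Each element is processed once and is not stored afterwards.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Unbiased sample variance; returns 0.0 for fewer than two elements.
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def summarize(stream):
    """Consume an iterable once and return its synopsis."""
    stats = RunningStats()
    for x in stream:
        stats.update(x)
    return stats
```

Memory use stays fixed regardless of how many elements arrive, which is the property the abstract asks of a synopsis structure. Handling concept drift would additionally require forgetting old data, for example through a sliding window or a decay factor, which this sketch does not attempt.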

References

  1. Gama, J.: Knowledge Discovery from Data Streams, 1st edn. Chapman and Hall/CRC, United Kingdom (2010)
  2. Lemaire, V., Salperwyck, C., Bondu, A.: A survey on supervised classification on data streams. In: European Business Intelligence Summer School, pp. 88–125. Springer, Berlin (2014)
  3. Garofalakis, M., Gehrke, J., Rastogi, R. (eds.): Data Stream Management: Processing High-Speed Data Streams. Data-Centric Systems and Applications. Springer, Cham (2016)
  4. Tsymbal, A.: The problem of concept drift: definitions and related work, Technical report, Department of Computer Science, Trinity College Dublin (2004)
  5. Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. 46(4), 44:1–44:37 (2014)
  6. Domingos, P., Hulten, G.: Mining high-speed data streams. In: Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 71–80 (2000)
  7. Rutkowski, L., Pietruczuk, L., Duda, P., Jaworski, M.: Decision trees for mining data streams based on the McDiarmid’s bound. IEEE Trans. Knowl. Data Eng. 25(6), 1272–1279 (2013)
  8. Matuszyk, P., Krempl, G., Spiliopoulou, M.: Correcting the usage of the Hoeffding inequality in stream mining. In: Tucker, A., Höppner, F., Siebes, A., Swift, S. (eds.) Advances in Intelligent Data Analysis XII. Lecture Notes in Computer Science, vol. 8207, pp. 298–309. Springer, Berlin (2013)
  9. De Rosa, R., Cesa-Bianchi, N.: Splitting with confidence in decision trees with application to stream mining. In: 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2015)
  10. Jaworski, M., Duda, P., Rutkowski, L.: New splitting criteria for decision trees in stationary data streams. IEEE Trans. Neural Netw. Learn. Syst. 29, 2516–2529 (2018)
  11. Rutkowski, L., Jaworski, M., Pietruczuk, L., Duda, P.: A new method for data stream mining based on the misclassification error. IEEE Trans. Knowl. Data Eng. 26(5), 1048–1059 (2015)
  12. De Rosa, R., Cesa-Bianchi, N.: Confidence decision trees via online and active learning for streaming data. J. Artif. Intell. Res. 60(60), 1031–1055 (2017)
  13. Rutkowski, L.: Generalized regression neural networks in time-varying environment. IEEE Trans. Neural Netw. 15(3), 576–596 (2004)
  14. Pietruczuk, L., Rutkowski, L., Jaworski, M., Duda, P.: The Parzen kernel approach to learning in non-stationary environment. In: 2014 International Joint Conference on Neural Networks (IJCNN), pp. 3319–3323 (2014)
  15. Duda, P., Jaworski, M., Rutkowski, L.: Knowledge discovery in data streams with the orthogonal series-based generalized regression neural networks. Inf. Sci. 460–461, 497–518 (2017)
  16. Duda, P., Jaworski, M., Rutkowski, L.: Convergent time-varying regression models for data streams: tracking concept drift by the recursive parzen-based generalized regression neural networks. Int. J. Neural Syst. 28(02), 1750048 (2018)
  17. Rutkowski, L.: Adaptive probabilistic neural-networks for pattern classification in time-varying environment. IEEE Trans. Neural Netw. 15(4), 811–827 (2004)
  18. Jaworski, M., Duda, P., Rutkowski, L., Najgebauer, P., Pawlak, M.: Heuristic regression function estimation methods for data streams with concept drift. Lecture Notes in Computer Science, vol. 10246, pp. 726–737 (2017)
  19. Jaworski, M.: Regression function and noise variance tracking methods for data streams with concept drift. Int. J. Appl. Math. Comput. Sci. 28(3), 559–567 (2018)
  20. Pietruczuk, L., Rutkowski, L., Jaworski, M., Duda, P.: A method for automatic adjustment of ensemble size in stream data mining. In: 2016 International Joint Conference on Neural Networks (IJCNN), pp. 9–15 (2016)
  21. Pietruczuk, L., Rutkowski, L., Jaworski, M., Duda, P.: How to adjust an ensemble size in stream data mining? Inf. Sci. 381, 46–54 (2017)
  22. Duda, P., Jaworski, M., Rutkowski, L.: Online GRNN-based ensembles for regression on evolving data streams. In: Huang, T., Lv, J., Sun, C., Tuzikov, A.V. (eds.) Advances in Neural Networks – ISNN 2018, pp. 221–228. Springer International Publishing, Cham (2018)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Leszek Rutkowski (1, 2)
  • Maciej Jaworski (1)
  • Piotr Duda (1)

  1. Institute of Computational Intelligence, Czestochowa University of Technology, Częstochowa, Poland
  2. Information Technology Institute, University of Social Sciences, Lodz, Poland
