Resource-Aware Data Stream Mining Using the Restricted Boltzmann Machine

Jaworski, Maciej; Rutkowski, Leszek; Duda, Piotr; Cader, Andrzej

doi:10.1007/978-3-030-20915-5_35

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11509)

Included in the following conference series:

International Conference on Artificial Intelligence and Soft Computing

Abstract

In this paper, we consider the problem of data stream mining using the Restricted Boltzmann Machine (RBM). When data arrive at a very high rate, the learning algorithm must be resource-aware and process elements as quickly as possible. Two RBM learning algorithms are investigated: Contrastive Divergence and Persistent Contrastive Divergence. We test three strategies for dealing with buffer overflow in high-speed data streams: load shedding, minibatch resizing, and controlling the number of Gibbs steps in the learning algorithm. The considered approaches are evaluated on the real-world MNIST dataset, which is treated as part of a data stream.
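
The ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class `RBM`, the function `process_stream`, the strategy names, the buffer capacity, and the specific reactions to overflow (dropping the oldest elements, doubling the minibatch, falling back to a single Gibbs step) are all assumptions made for illustration, and the paper may define each strategy differently. The sketch assumes binary visible and hidden units and one parameter update per arriving chunk.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli-Bernoulli RBM trained with CD-k or PCD (illustrative sketch)."""

    def __init__(self, n_visible, n_hidden, lr=0.05, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.a = np.zeros(n_visible)  # visible biases
        self.b = np.zeros(n_hidden)   # hidden biases
        self.lr = lr
        self.chain = None             # persistent fantasy particles (PCD only)

    def _sample(self, p):
        return (self.rng.random(p.shape) < p).astype(float)

    def update(self, v0, k=1, persistent=False):
        """One parameter update on minibatch v0 using k Gibbs steps."""
        ph0 = sigmoid(v0 @ self.W + self.b)  # positive phase
        # CD starts the chain at the data; PCD continues the stored chain
        # (re-initialised from the data if the minibatch shape changed).
        reuse = persistent and self.chain is not None and self.chain.shape == v0.shape
        v = self.chain if reuse else v0
        for _ in range(k):
            h = self._sample(sigmoid(v @ self.W + self.b))
            v = self._sample(sigmoid(h @ self.W.T + self.a))
        phk = sigmoid(v @ self.W + self.b)   # negative phase
        if persistent:
            self.chain = v
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - v.T @ phk) / n
        self.a += self.lr * (v0 - v).mean(axis=0)
        self.b += self.lr * (ph0 - phk).mean(axis=0)

def process_stream(rbm, chunks, capacity, strategy, batch_size=20, k=3,
                   persistent=False):
    """Feed arriving chunks through a bounded buffer, one update per arrival.

    The overflow reactions below are hypothetical stand-ins for the three
    strategies named in the abstract.
    """
    buffer = np.empty((0, rbm.W.shape[0]))
    dropped = 0
    for chunk in chunks:
        buffer = np.vstack([buffer, chunk])
        overflow = len(buffer) > capacity
        bs, steps = batch_size, k
        if overflow and strategy == "load_shedding":
            dropped += len(buffer) - capacity
            buffer = buffer[-capacity:]  # discard the oldest elements
        elif overflow and strategy == "minibatch_resizing":
            bs = 2 * batch_size          # assumed: larger batch drains faster
        elif overflow and strategy == "gibbs_control":
            steps = 1                    # assumed: fewer Gibbs steps per update
        batch, buffer = buffer[:bs], buffer[bs:]
        if len(batch) > 0:
            rbm.update(batch, k=steps, persistent=persistent)
    return {"dropped": dropped, "remaining": len(buffer)}

# Synthetic binary "stream": 10 chunks of 50 elements with 16 features each.
data_rng = np.random.default_rng(1)
chunks = [(data_rng.random((50, 16)) < 0.3).astype(float) for _ in range(10)]
rbm = RBM(n_visible=16, n_hidden=8)
stats = process_stream(rbm, chunks, capacity=100, strategy="load_shedding")
```

In this toy setting only load shedding keeps the buffer within `capacity`; the other two branches merely change the per-update work, because the sketch does not model wall-clock processing time. A faithful experiment would measure the actual update time against the arrival rate.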

Acknowledgments

This work was supported by the Polish National Science Centre under grant no. 2017/27/B/ST6/02852.

Author information

Authors and Affiliations

Institute of Computational Intelligence, Czestochowa University of Technology, Czestochowa, Poland
Maciej Jaworski, Leszek Rutkowski & Piotr Duda
Information Technology Institute, University of Social Sciences, Łódź, Poland
Leszek Rutkowski & Andrzej Cader
Clark University, Worcester, MA, USA
Andrzej Cader

Authors

Maciej Jaworski
Leszek Rutkowski
Piotr Duda
Andrzej Cader

Corresponding author

Correspondence to Maciej Jaworski.

Editor information

Editors and Affiliations

Częstochowa University of Technology, Częstochowa, Poland
Leszek Rutkowski
Częstochowa University of Technology, Częstochowa, Poland
Rafał Scherer
Częstochowa University of Technology, Częstochowa, Poland
Marcin Korytkowski
University of Alberta, Edmonton, AB, Canada
Witold Pedrycz
AGH University of Science and Technology, Kraków, Poland
Ryszard Tadeusiewicz
University of Louisville, Louisville, KY, USA
Jacek M. Zurada

Cite this paper

Jaworski, M., Rutkowski, L., Duda, P., Cader, A. (2019). Resource-Aware Data Stream Mining Using the Restricted Boltzmann Machine. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2019. Lecture Notes in Computer Science, vol 11509. Springer, Cham. https://doi.org/10.1007/978-3-030-20915-5_35

DOI: https://doi.org/10.1007/978-3-030-20915-5_35
Published: 27 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20914-8
Online ISBN: 978-3-030-20915-5
eBook Packages: Computer Science, Computer Science (R0)
