Threaded Ensembles of Supervised and Unsupervised Neural Networks for Stream Learning

Dong, Yue; Japkowicz, Nathalie

doi:10.1007/978-3-319-34111-8_37

Yue Dong & Nathalie Japkowicz

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9673)

Included in the following conference series:

Canadian Conference on Artificial Intelligence

Abstract

Most existing model-based approaches to anomaly detection in streaming data are based on decision trees because they are fast to construct [1]. This paper proposes two fast anomaly detectors for evolving data streams, both based on ensembles of neural networks. The first is a supervised online learning algorithm that uses an ensemble of threaded multilayer perceptrons (MLPs). The second is a one-class learning algorithm that uses an ensemble of threaded autoencoders; it requires only positive-class data for training and remains accurate even when anomalous training data are rare. In both models, the ensemble members are trained in multiple threads and evolve with the data stream. Because the stream is processed in parallel, both methods are highly efficient. Our analysis shows that, on streaming data, both methods have small, constant time complexity and a constant memory requirement. Compared with Very Fast Decision Trees (VFDT), a state-of-the-art algorithm, our methods performed favorably in detection accuracy and training time on the datasets under consideration.
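The one-class idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `TinyAutoencoder` and `ThreadedAutoencoderEnsemble`, the round-robin dispatch of instances to threads, the single tanh hidden unit, the transposed-encoder initialisation of the decoder, the learning rate, and the synthetic data are all assumptions made for illustration. Each autoencoder is trained online in its own thread on "normal" instances only, and the ensemble's mean reconstruction error serves as the anomaly score.

```python
import math
import queue
import random
import threading


class TinyAutoencoder:
    """One-hidden-layer autoencoder (tanh hidden, linear output), trained by online SGD."""

    def __init__(self, n_in, n_hidden=1, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        # Encoder weights; the decoder starts as the encoder's transpose (our simplification).
        self.w_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.w_dec = [[self.w_enc[j][i] for j in range(n_hidden)] for i in range(n_in)]
        self.b_enc = [0.0] * n_hidden
        self.b_dec = [0.0] * n_in

    def _forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w_enc, self.b_enc)]
        y = [sum(w * hj for w, hj in zip(row, h)) + b
             for row, b in zip(self.w_dec, self.b_dec)]
        return h, y

    def error(self, x):
        """Squared reconstruction error, used here as the anomaly score."""
        _, y = self._forward(x)
        return sum((yi - xi) ** 2 for yi, xi in zip(y, x))

    def partial_fit(self, x):
        """One SGD step on a single instance."""
        h, y = self._forward(x)
        d_out = [2.0 * (yi - xi) for yi, xi in zip(y, x)]
        # Back-propagate through the decoder before its weights are updated.
        d_hid = [(1.0 - hj * hj) * sum(d_out[i] * self.w_dec[i][j] for i in range(len(x)))
                 for j, hj in enumerate(h)]
        for i, di in enumerate(d_out):
            for j, hj in enumerate(h):
                self.w_dec[i][j] -= self.lr * di * hj
            self.b_dec[i] -= self.lr * di
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(x):
                self.w_enc[j][i] -= self.lr * dj * xi
            self.b_enc[j] -= self.lr * dj


class ThreadedAutoencoderEnsemble:
    """Ensemble whose members each consume their share of the stream in a separate thread."""

    def __init__(self, n_in, n_models=4, **kwargs):
        self.models = [TinyAutoencoder(n_in, seed=k, **kwargs) for k in range(n_models)]

    def fit_stream(self, stream):
        queues = [queue.Queue() for _ in self.models]

        def worker(model, q):
            while True:
                x = q.get()
                if x is None:  # sentinel: end of stream
                    return
                model.partial_fit(x)

        threads = [threading.Thread(target=worker, args=(m, q))
                   for m, q in zip(self.models, queues)]
        for t in threads:
            t.start()
        for n, x in enumerate(stream):  # round-robin dispatch (an assumption)
            queues[n % len(queues)].put(x)
        for q in queues:
            q.put(None)
        for t in threads:
            t.join()

    def score(self, x):
        """Mean reconstruction error over the ensemble; higher means more anomalous."""
        return sum(m.error(x) for m in self.models) / len(self.models)


def demo_scores(n=4000, seed=42):
    """Train on synthetic 'normal' points near the line x2 = x1, then score two points."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n):
        t = rng.uniform(-1.0, 1.0)
        stream.append([t + rng.gauss(0, 0.05), t + rng.gauss(0, 0.05)])
    ensemble = ThreadedAutoencoderEnsemble(n_in=2)
    ensemble.fit_stream(stream)
    # Returns (score of an on-line point, score of an off-line point).
    return ensemble.score([0.5, 0.5]), ensemble.score([0.5, -0.5])
```

Calling `demo_scores()` returns the score of a point on the normal line and of a point off it; the off-line point should receive the higher score. In CPython the global interpreter lock limits true CPU parallelism, so the threads here show the structure of the approach rather than its speed.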

References

[1] Gama, J., Medas, P., Rodrigues, P.: Learning decision trees from dynamic data streams. In: Proceedings of the 2005 ACM Symposium on Applied Computing. ACM (2005)
[2] Tan, S.C., Ting, K.M., Liu, F.T.: Fast anomaly detection for streaming data. In: IJCAI Proceedings of the International Joint Conference on Artificial Intelligence, vol. 22, no. 1 (2011)
[3] Gama, J., Rodrigues, P.P., Sebastião, R.: Evaluating algorithms that learn from data streams. In: Proceedings of the 2009 ACM Symposium on Applied Computing. ACM (2009)
[4] Domingos, P., Hulten, G.: Mining high-speed data streams. In: Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2000)
[5] Hulten, G., Spencer, L., Domingos, P.: Mining time-changing data streams. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2001)
[6] Barnett, V., Lewis, T.: Outliers in Statistical Data, 3rd edn. Wiley, Hoboken (1994)
[7] Abe, N., Zadrozny, B., Langford, J.: Outlier detection by active learning. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2006)
[8] He, Z., Xu, X., Deng, S.: Discovering cluster-based local outliers. Pattern Recogn. Lett. 24(9), 1641–1650 (2003)
[9] Bay, S.D., Schwabacher, M.: Mining distance-based outliers in near linear time with randomization and a simple pruning rule. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2003)
[10] Heller, K., et al.: One class support vector machines for detecting anomalous windows registry accesses. In: Workshop on Data Mining for Computer Security (DMSEC), Melbourne, FL, 19 November 2003
[11] Liu, F.T., Ting, K.M., Zhou, Z.-H.: Isolation forest. In: 8th IEEE International Conference on Data Mining, ICDM 2008. IEEE (2008)
[12] Hahsler, M., Bolanos, M., Forrest, J.: Introduction to stream: an extensible framework for data stream clustering research with R
[13] Last, M.: Online classification of nonstationary data streams. Intell. Data Anal. 6, 129–147 (2002). ISSN 1088-467X
[14] Aggarwal, C.C., Han, J., Wang, J., Yu, P.S.: On demand classification of data streams. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2004, pp. 503–508. ACM, New York, NY, USA (2004)
[15] Aggarwal, C.C., Han, J., Wang, J., Yu, P.S.: A framework for clustering evolving data streams. In: Proceedings of the International Conference on Very Large Data Bases (VLDB 2003), pp. 81–92 (2003)
[16] Hoeglinger, S., Pears, R., Koh, Y.S.: CBDT: A concept based approach to data stream mining. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, T.-B. (eds.) PAKDD 2009. LNCS, vol. 5476, pp. 1006–1012. Springer, Heidelberg (2009)
[17] Polikar, R., et al.: Learn++: An incremental learning algorithm for supervised neural networks. IEEE Trans. Syst. Man Cybern. Part C: Appl. Rev. 31(4), 497–508 (2001)
[18] Carpenter, G.A., et al.: Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps. IEEE Trans. Neural Netw. 3(5), 698–713 (1992)
[19] Shen, F., Ogura, T., Hasegawa, O.: An enhanced self-organizing incremental neural network for online unsupervised learning. Neural Netw. 20(8), 893–903 (2007)
[20] Shen, F., Hasegawa, O.: Self-organizing incremental neural network and its application. In: Diamantaras, K., Duch, W., Iliadis, L.S. (eds.) ICANN 2010, Part III. LNCS, vol. 6354, pp. 535–540. Springer, Heidelberg (2010)
[21] Japkowicz, N., Myers, C., Gluck, M.: A novelty detection approach to classification. In: IJCAI (1995)
[22] Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2(1), 1–127 (2009)
[23] Wang, H., et al.: Mining concept-drifting data streams using ensemble classifiers. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2003)
[24] Widmer, G., Kubat, M.: Learning in the presence of concept drift and hidden contexts. Mach. Learn. 23(1), 69–101 (1996)
[25] Gama, J., et al.: A survey on concept drift adaptation. ACM Comput. Surv. (CSUR) 46(4), 44 (2014)
[26] Elwell, R., Polikar, R.: Incremental learning of concept drift in nonstationary environments. IEEE Trans. Neural Netw. 22(10), 1517–1531 (2011)
[27] Carpenter, G.A., Grossberg, S., Reynolds, J.H.: ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network. Neural Netw. 4(5), 565–588 (1991)

Author information

Authors and Affiliations

Department of Mathematics and Statistics, University of Ottawa, Ottawa, ON, Canada
Yue Dong
School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON, Canada
Nathalie Japkowicz

Corresponding author

Correspondence to Yue Dong.

Editor information

Editors and Affiliations

Lakehead University, Thunder Bay, Ontario, Canada
Richard Khoury
National Research Council Canada, Ottawa, Canada
Christopher Drummond

About this paper

Cite this paper

Dong, Y., Japkowicz, N. (2016). Threaded Ensembles of Supervised and Unsupervised Neural Networks for Stream Learning. In: Khoury, R., Drummond, C. (eds) Advances in Artificial Intelligence. Canadian AI 2016. Lecture Notes in Computer Science, vol 9673. Springer, Cham. https://doi.org/10.1007/978-3-319-34111-8_37

DOI: https://doi.org/10.1007/978-3-319-34111-8_37
Published: 13 May 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-34110-1
Online ISBN: 978-3-319-34111-8
eBook Packages: Computer Science, Computer Science (R0)
