Threaded Ensembles of Supervised and Unsupervised Neural Networks for Stream Learning

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9673)

Abstract

Most existing model-based approaches to anomaly detection in streaming data are based on decision trees because they are fast to construct [1]. This paper proposes two fast anomaly detectors for evolving data streams, both based on ensembles of neural networks. The first is a supervised online learning algorithm that uses an ensemble of threaded multilayer perceptrons (MLPs). The second is a one-class learning algorithm that uses an ensemble of threaded autoencoders; it requires only positive-class data for training and remains accurate even when anomalous training data are rare. In both models, the ensemble members run in multiple threads and evolve with the data stream. Multi-threading makes the methods highly efficient because the stream is processed in parallel. Our analysis shows that, on streaming data, both methods have small, constant time complexity and a constant memory requirement. Compared with Very Fast Decision Trees (VFDT), a state-of-the-art algorithm, our methods performed favorably in detection accuracy and training time on the datasets under consideration.
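The one-class model described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `TinyAutoencoder` and `ThreadedAutoencoderEnsemble`, the linear one-hidden-layer architecture, the learning rate, the window size, and the round-robin sharding of each window are all assumptions made for illustration. Each member is trained on positive-class data only, in its own thread, and the anomaly score is the mean reconstruction error across members.

```python
import random
from concurrent.futures import ThreadPoolExecutor


class TinyAutoencoder:
    """Linear autoencoder (dim -> hidden -> dim) trained by per-instance SGD.
    A hypothetical stand-in for the paper's autoencoders."""

    def __init__(self, dim, hidden, seed, lr=0.05):
        rng = random.Random(seed)
        self.lr = lr
        self.enc = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(hidden)]
        self.dec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(dim)]

    def _forward(self, x):
        z = [sum(w * xi for w, xi in zip(row, x)) for row in self.enc]
        x_hat = [sum(w * zj for w, zj in zip(row, z)) for row in self.dec]
        return z, x_hat

    def error(self, x):
        """Squared reconstruction error of one instance."""
        _, x_hat = self._forward(x)
        return sum((a - b) ** 2 for a, b in zip(x_hat, x))

    def fit(self, chunk, epochs=100):
        for _ in range(epochs):
            for x in chunk:
                z, x_hat = self._forward(x)
                e = [a - b for a, b in zip(x_hat, x)]
                # Backpropagate through the decoder before updating it.
                dz = [sum(self.dec[i][j] * e[i] for i in range(len(x)))
                      for j in range(len(z))]
                for i, row in enumerate(self.dec):
                    for j in range(len(z)):
                        row[j] -= self.lr * e[i] * z[j]
                for j, row in enumerate(self.enc):
                    for i in range(len(x)):
                        row[i] -= self.lr * dz[j] * x[i]
        return self


class ThreadedAutoencoderEnsemble:
    def __init__(self, dim, hidden, n_members=4):
        self.members = [TinyAutoencoder(dim, hidden, seed=k) for k in range(n_members)]

    def update(self, window):
        """Train each member on its own shard of the window, one thread per member."""
        k = len(self.members)
        shards = [window[i::k] for i in range(k)]  # assumed round-robin split
        with ThreadPoolExecutor(max_workers=k) as pool:
            list(pool.map(lambda pair: pair[0].fit(pair[1]), zip(self.members, shards)))

    def score(self, x):
        """Anomaly score: mean reconstruction error across the ensemble."""
        return sum(m.error(x) for m in self.members) / len(self.members)


# Synthetic positive-class stream: points on the line y = x.
rng = random.Random(0)
stream = [[t, t] for t in (rng.uniform(-1, 1) for _ in range(160))]

ensemble = ThreadedAutoencoderEnsemble(dim=2, hidden=1)
for start in range(0, len(stream), 80):  # the model evolves window by window
    ensemble.update(stream[start:start + 80])

normal_score = ensemble.score([0.5, 0.5])    # lies on the training line
anomaly_score = ensemble.score([1.0, -1.0])  # lies off the training line
```

Because the members only ever see points on the line, a point off the line reconstructs poorly and receives the higher score. In CPython the global interpreter lock prevents pure-Python arithmetic from running truly in parallel, so the threads here show the structure of the approach rather than a speedup.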

References

  1. Gama, J., Medas, P., Rodrigues, P.: Learning decision trees from dynamic data streams. In: Proceedings of the 2005 ACM Symposium on Applied Computing. ACM (2005)

  2. Tan, S.C., Ting, K.M., Liu, T.F.: Fast anomaly detection for streaming data. In: IJCAI Proceedings of the International Joint Conference on Artificial Intelligence, vol. 22, No. 1 (2011)

  3. Gama, J., Rodrigues, P.P., Sebastião, R.: Evaluating algorithms that learn from data streams. In: Proceedings of the 2009 ACM Symposium on Applied Computing. ACM (2009)

  4. Domingos, P., Hulten, G.: Mining high-speed data streams. In: Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2000)

  5. Hulten, G., Spencer, L., Domingos, P.: Mining time-changing data streams. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2001)

  6. Barnett, V., Lewis, T.: Outliers in Statistical Data, 3rd edn. Wiley, Hoboken (1994)

  7. Abe, N., Zadrozny, B., Langford, J.: Outlier detection by active learning. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2006)

  8. He, Z., Xu, X., Deng, S.: Discovering cluster-based local outliers. Pattern Recogn. Lett. 24(9), 1641–1650 (2003)

  9. Bay, S.D., Schwabacher, M.: Mining distance-based outliers in near linear time with randomization and a simple pruning rule. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2003)

  10. Heller, K., et al.: One class support vector machines for detecting anomalous windows registry accesses. In: Workshop on Data Mining for Computer Security (DMSEC), Melbourne, FL, 19 November 2003

  11. Liu, F.T., Ting, K.M., Zhou, Z.-H.: Isolation forest. In: 8th IEEE International Conference on Data Mining, ICDM 2008. IEEE (2008)

  12. Hahsler, M., Bolanos, M., Forrest, J.: Introduction to stream: an extensible framework for data stream clustering research with R

  13. Last, M.: Online classification of nonstationary data streams. Intell. Data Anal. 6, 129–147 (2002). ISSN 1088-467X

  14. Aggarwal, C.C., Han, J., Wang, J., Yu, P.S.: On demand classification of data streams. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2004, pp. 503–508. ACM, New York, NY, USA (2004)

  15. Aggarwal, C.C., Han, J., Wang, J., Yu, P.S.: A framework for clustering evolving data streams. In: Proceedings of the International Conference on Very Large Data Bases (VLDB 2003), pp. 81–92 (2003)

  16. Hoeglinger, S., Pears, R., Koh, Y.S.: CBDT: A concept based approach to data stream mining. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, T.-B. (eds.) PAKDD 2009. LNCS, vol. 5476, pp. 1006–1012. Springer, Heidelberg (2009)

  17. Polikar, R., et al.: Learn++: An incremental learning algorithm for supervised neural networks. IEEE Trans. Syst. Man Cybern. Part C: Appl. Rev. 31(4), 497–508 (2001)

  18. Carpenter, G.A., et al.: Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps. IEEE Trans. Neural Netw. 3(5), 698–713 (1992)

  19. Shen, F., Ogura, T., Hasegawa, O.: An enhanced self-organizing incremental neural network for online unsupervised learning. Neural Netw. 20(8), 893–903 (2007)

  20. Shen, F., Hasegawa, O.: Self-organizing incremental neural network and its application. In: Diamantaras, K., Duch, W., Iliadis, L.S. (eds.) ICANN 2010, Part III. LNCS, vol. 6354, pp. 535–540. Springer, Heidelberg (2010)

  21. Japkowicz, N., Myers, C., Gluck, M.: A novelty detection approach to classification. In: IJCAI (1995)

  22. Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2(1), 1–127 (2009)

  23. Wang, H., et al.: Mining concept-drifting data streams using ensemble classifiers. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM (2003)

  24. Widmer, G., Kubat, M.: Learning in the presence of concept drift and hidden contexts. Mach. Learn. 23(1), 69–101 (1996)

  25. Gama, J., et al.: A survey on concept drift adaptation. ACM Comput. Surv. (CSUR) 46(4), 44 (2014)

  26. Elwell, R., Polikar, R.: Incremental learning of concept drift in nonstationary environments. IEEE Trans. Neural Netw. 22(10), 1517–1531 (2011)

  27. Carpenter, G.A., Grossberg, S., Reynolds, J.H.: ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network. Neural Netw. 4(5), 565–588 (1991)


Author information

Corresponding author

Correspondence to Yue Dong.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Dong, Y., Japkowicz, N. (2016). Threaded Ensembles of Supervised and Unsupervised Neural Networks for Stream Learning. In: Khoury, R., Drummond, C. (eds) Advances in Artificial Intelligence. Canadian AI 2016. Lecture Notes in Computer Science, vol 9673. Springer, Cham. https://doi.org/10.1007/978-3-319-34111-8_37

  • DOI: https://doi.org/10.1007/978-3-319-34111-8_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-34110-1

  • Online ISBN: 978-3-319-34111-8

  • eBook Packages: Computer Science, Computer Science (R0)
