A Cluster-Based Prototype Reduction for Online Classification

  • Kemilly Dearo Garcia
  • André C. P. L. F. de Carvalho
  • João Mendes-Moreira
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11314)

Abstract

Data stream mining is a challenging research topic in which data arrive continuously with a probability distribution that may change over time. Depending on how the data distribution changes, different phenomena can occur, for example, concept drift. Concept drift occurs when the concepts associated with a dataset change as new data arrive. This paper proposes a new method based on k-Nearest Neighbors that implements a sliding window requiring fewer stored training instances than existing methods. To this end, a clustering approach is used to summarize data by placing labeled instances considered similar in the same cluster. In addition, instances close to the uncertainty border of existing classes are also stored, in a sliding window, to adapt the model to concept drift. The proposed method is experimentally compared with state-of-the-art classifiers from the data stream literature with regard to accuracy and processing time. According to the experimental results, the proposed method has better accuracy and lower time consumption when less information about the concepts is stored in a single sliding window.
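
The abstract gives only a high-level description, so the following is a minimal sketch of the general idea and not the authors' algorithm. It assumes Euclidean distance, a fixed cluster radius, and a simple relative-distance test for "close to the uncertainty border". The class name PrototypeWindowKNN and the parameters radius, margin, and window_size are hypothetical and chosen only for illustration.

```python
from collections import Counter, deque
import math


class PrototypeWindowKNN:
    """Illustrative only: kNN over class-labeled prototypes plus a sliding window."""

    def __init__(self, k=1, radius=1.0, margin=0.2, window_size=50):
        self.k = k
        self.radius = radius    # assumed: max distance for merging into a prototype
        self.margin = margin    # assumed: relative gap that counts as "near the border"
        self.prototypes = []    # each entry: [centroid (tuple), label, count]
        self.window = deque(maxlen=window_size)  # oldest border instances drop out

    def learn_one(self, x, y):
        x = tuple(x)
        # Distance to the nearest prototype of the same class and of any other class.
        own = other = math.inf
        for centroid, label, _ in self.prototypes:
            d = math.dist(x, centroid)
            if label == y:
                own = min(own, d)
            else:
                other = min(other, d)
        # Border instance: another class is almost as close as the instance's own class.
        if own < math.inf and other <= own * (1.0 + self.margin):
            self.window.append((x, y))
            return
        # Otherwise summarize: merge into the first same-class prototype within radius.
        for proto in self.prototypes:
            centroid, label, count = proto
            if label == y and math.dist(x, centroid) <= self.radius:
                count += 1
                proto[0] = tuple(c + (xi - c) / count for c, xi in zip(centroid, x))
                proto[2] = count
                return
        # No suitable prototype exists yet, so start a new one.
        self.prototypes.append([x, y, 1])

    def predict_one(self, x):
        x = tuple(x)
        stored = [(c, label) for c, label, _ in self.prototypes] + list(self.window)
        if not stored:
            return None
        nearest = sorted(stored, key=lambda item: math.dist(x, item[0]))[: self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]


if __name__ == "__main__":
    model = PrototypeWindowKNN(k=1, window_size=2)
    stream = [((0.0, 0.0), "a"), ((5.0, 5.0), "b"), ((0.1, 0.0), "a"),
              ((5.1, 5.0), "b"), ((2.4, 2.5), "a")]
    for features, label in stream:
        model.learn_one(features, label)
    print(len(model.prototypes), len(model.window), model.predict_one((4.9, 5.2)))
```

In this sketch the memory footprint is bounded by the number of prototypes plus window_size, which is the kind of saving the abstract attributes to storing summaries instead of raw instances. How the published method forms clusters, defines the uncertainty border, and forgets outdated prototypes is not stated in the abstract and is not modeled here.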

Keywords

kNN, Prototyping, Data stream, Online clustering

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. University of Twente, Enschede, Netherlands
  2. ICMC, University of São Paulo, São Paulo, Brazil
  3. Faculty of Engineering, University of Porto, Porto, Portugal
  4. LIAAD-INESC TEC, Porto, Portugal
