On Robustness of Adaptive Random Forest Classifier on Biomedical Data Stream

  • Hayder K. Fatlawi
  • Attila Kiss
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12033)

Abstract

Data streams represent a significant challenge for data analysis and data mining techniques because those techniques were developed for batch training data. A classification technique that deals with a data stream should be able to adapt its model to new samples and forget old ones. In this paper, we present an intensive comparison of the performance of six popular classification techniques, focusing on the power of Adaptive Random Forest. The comparison was based on four real medical datasets; for more reliable results, 40 additional datasets were created by adding white noise to the original datasets. The experimental results showed the dominance of Adaptive Random Forest over the five other techniques, with high robustness against changes in the data and against noise.
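
The abstract does not state how the noisy datasets were generated, so the following is only a minimal sketch of one plausible way to derive noisy variants from an original dataset. The function name `add_white_noise`, the `noise_level` parameterisation (noise standard deviation as a fraction of each feature's standard deviation), the toy data, and the choice of ten variants per dataset (40 variants over 4 datasets) are all assumptions made for illustration, not details taken from the paper.

```python
import random
import statistics


def add_white_noise(rows, noise_level, seed=0):
    """Return a noisy copy of rows; the input is not modified.

    rows: list of (features, label) pairs, where features is a list of floats.
    noise_level: hypothetical parameter; the noise standard deviation is
                 noise_level times the standard deviation of each feature.
    """
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    # Per-feature spread, so the noise is scaled consistently across features.
    stds = [statistics.pstdev([f[j] for f, _ in rows]) for j in range(n_features)]
    noisy = []
    for features, label in rows:
        # Zero-mean Gaussian (white) noise, drawn independently per value.
        perturbed = [x + rng.gauss(0.0, noise_level * stds[j])
                     for j, x in enumerate(features)]
        noisy.append((perturbed, label))  # class labels are left untouched
    return noisy


# Toy data standing in for one medical dataset (values are made up).
original = [([1.0, 10.0], 0), ([2.0, 20.0], 1), ([3.0, 30.0], 0), ([4.0, 40.0], 1)]

# Ten increasing noise levels, giving ten noisy variants of this dataset.
levels = [0.05 * k for k in range(1, 11)]
variants = [add_white_noise(original, level, seed=k)
            for k, level in enumerate(levels)]
```

Each variant can then be fed to a stream classifier sample by sample, which makes it possible to compare how accuracy degrades as the noise level grows.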

Keywords

Classification, Biomedical, Data stream, Ensemble modeling

Notes

Acknowledgment

The project was supported by the European Union, co-financed by the European Social Fund (EFOP-3.6.3-VEKOP-16-2017-00002).

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Faculty of Informatics, Department of Information Systems, Eötvös Loránd University, Budapest, Hungary
  2. Center of Information Technology Research and Development, University of Kufa, Najaf, Iraq
