Improving the Performance of Data Stream Classifiers by Mining Recurring Contexts

  • Yong Wang
  • Zhanhuai Li
  • Yang Zhang
  • Longbo Zhang
  • Yun Jiang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)

Abstract

Traditional research on data stream mining emphasizes only building classifiers with high accuracy, which results in classifiers whose accuracy drops dramatically when the concept drifts. In this paper, we present our RTRC system, which maintains good classification accuracy when the concept drifts, provided that enough samples have been scanned in the data stream. By using a Markov chain and the least-squares method, the system is able to predict not only which concept comes next but also when the concept will drift. Experimental results confirm the advantages of our system over Weighted Bagging and CVFDT, two representative systems in streaming data mining.
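
The abstract does not say how the Markov chain and the least-squares fit are applied, so the following is only a minimal sketch under stated assumptions rather than the RTRC implementation. It assumes each data block has already been labelled with a concept identifier, that a first-order Markov chain over those labels predicts the next concept, and that a straight line fitted by ordinary least squares to past concept durations predicts how long the current concept will last. The function names `predict_next_concept` and `predict_next_duration` are hypothetical.

```python
from collections import Counter, defaultdict


def predict_next_concept(history):
    """Return the most frequent successor of the last concept in `history`.

    Assumption: concept changes follow a first-order Markov chain whose
    transition counts are estimated from the observed concept sequence.
    Returns None if the last concept has never been followed by another.
    """
    transitions = defaultdict(Counter)
    for prev, nxt in zip(history, history[1:]):
        transitions[prev][nxt] += 1
    successors = transitions[history[-1]]
    if not successors:
        return None
    return successors.most_common(1)[0][0]


def predict_next_duration(durations):
    """Extrapolate the next concept duration with a least-squares line.

    Assumption: durations (for example, in data blocks) are fitted as
    duration = slope * index + intercept, and the fit is evaluated at
    the next index.
    """
    n = len(durations)
    if n == 0:
        raise ValueError("at least one past duration is required")
    if n == 1:
        return float(durations[0])
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(durations) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, durations))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope * n + intercept


if __name__ == "__main__":
    concepts = ["A", "B", "A", "B", "A"]   # hypothetical recurring contexts
    lengths = [100, 110, 120]              # hypothetical durations in blocks
    print(predict_next_concept(concepts))
    print(predict_next_duration(lengths))
```

With a recurring sequence such as A, B, A, B, A, the transition counts make B the expected successor of A, which is the sense in which recurring contexts can be anticipated before the drift is observed.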

Keywords

Data Stream, Time Stamp, Data Block, Concept Drift, Target Concept

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yong Wang (1)
  • Zhanhuai Li (1)
  • Yang Zhang (2)
  • Longbo Zhang (1)
  • Yun Jiang (1)

  1. Dept. Computer Science & Software, Northwestern Polytechnical University, China
  2. School of Information Engineering, Northwest A&F University, China