
Mining in Anticipation for Concept Change: Proactive-Reactive Prediction in Data Streams

Original Article · Data Mining and Knowledge Discovery

Abstract

Prediction in streaming data is an important activity in modern society. Two major challenges posed by data streams are that (1) the data may grow without limit, so it is difficult to retain a long history of raw data; and (2) the underlying concept of the data may change over time. The novelty of this paper is fourfold. First, it uses a measure of conceptual equivalence to organize the data history into a history of concepts. This contrasts with the common practice of keeping only recent raw data. The concept history is compact while still retaining essential information for learning. Second, it learns concept-transition patterns from the concept history and anticipates what the concept will be in the case of a concept change. It then proactively prepares a prediction model for the future change. This contrasts with the conventional methodology that passively waits until the change happens. Third, it incorporates proactive and reactive predictions. If the anticipation turns out to be correct, a proper prediction model can be launched instantly upon the concept change. If not, the system promptly resorts to a reactive mode, adapting a prediction model to the new data. Finally, an efficient and effective system, RePro, is proposed to implement these ideas. It carries out prediction at two levels: a general level of predicting each oncoming concept and a specific level of predicting each instance's class. Experiments compare RePro with representative existing prediction methods on benchmark data sets that represent diverse scenarios of concept change. The empirical evidence offers inspiring insights and demonstrates that the proposed methodology is an advisable solution to prediction in data streams.
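To make the workflow concrete, here is a minimal Python sketch of the proactive-reactive idea, written from the description above alone. It is not the authors' RePro implementation: the names ConceptHistory, train_model, conceptually_equivalent, and change_detected are placeholders assumed for illustration, and the windowed loop glosses over the trigger-detection and stable-learning details of the actual system.

```python
# Minimal sketch of the proactive-reactive idea described in the abstract.
# NOT the authors' RePro implementation: all names and the windowed loop
# are simplifications assumed for illustration only.

from collections import defaultdict


class ConceptHistory:
    """Keeps previously seen concepts (models) plus counts of concept transitions."""

    def __init__(self):
        self.models = {}                                          # concept id -> model
        self.transitions = defaultdict(lambda: defaultdict(int))  # counts of i -> j
        self.current = None                                       # id of the active concept

    def record(self, concept_id, model):
        """Store the model and the transition from the previous concept to this one."""
        self.models[concept_id] = model
        if self.current is not None and self.current != concept_id:
            self.transitions[self.current][concept_id] += 1
        self.current = concept_id

    def anticipate_next(self):
        """Proactive step: the historically most frequent successor of the current concept."""
        successors = self.transitions.get(self.current, {})
        return max(successors, key=successors.get) if successors else None


def predict_stream(windows, history, train_model, conceptually_equivalent, change_detected):
    """General level: decide which concept governs each window.
    Specific level: use that concept's model to classify the window's instances."""
    model, concept_id = None, None
    for window in windows:                                 # window = list of (x, y) pairs
        if model is None or change_detected(model, window):
            guess = history.anticipate_next()
            if guess is not None and conceptually_equivalent(history.models[guess], window):
                concept_id, model = guess, history.models[guess]          # proactive hit
            else:                                                         # reactive fallback
                model = train_model(window)
                concept_id = next((cid for cid, old in history.models.items()
                                   if conceptually_equivalent(old, window)),
                                  len(history.models))
                model = history.models.get(concept_id, model)
            history.record(concept_id, model)
        yield [model.predict(x) for x, _ in window]
```

In this sketch the transition table stands in for the learned concept-transition patterns: when a change is detected, the historically most frequent successor of the current concept is tried first (proactive), and a fresh model is learned only if that anticipation does not fit the new data (reactive).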


References

  • Aggarwal, C.C., Han, J., Wang, J., and Yu, P.S. 2003. A framework for clustering evolving data streams. In Proceedings of the 29th International Conference on Very Large Data Bases, pp. 81–92.

  • Blake, C.L. and Merz, C.J. 2005. UCI repository of machine learning databases [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Department of Information and Computer Science, University of California, Irvine.

  • Ganti, V., Gehrke, J., and Ramakrishnan, R. 2001. DEMON: Mining and monitoring evolving data. IEEE Transactions on Knowledge and Data Engineering, 13:50–63.


  • Gehrke, J., Ganti, V., Ramakrishnan, R., and Loh, W.Y. 1999. BOAT: Optimistic decision tree construction. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 169–180.

  • Harries, M.B. and Horn, K. 1996. Learning stable concepts in a changing world. In PRICAI Workshops, pp. 106–122.

  • Hulten, G., Spencer, L., and Domingos, P. 2001. Mining time-changing data streams. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 97–106.

  • Jain, R. 1991. The art of computer systems performance analysis: Techniques for experimental design, measurement, simulation, and modeling. Wiley-Interscience, NY. Winner of the 1991 'Best Advanced How-To Book, Systems' award from the Computer Press Association.

  • Keogh, E. and Kasetty, S. 2002. On the need for time series data mining benchmarks: A survey and empirical demonstration. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 102–111.

  • Kolter, J.Z. and Maloof, M.A. 2003. Dynamic weighted majority: A new ensemble method for tracking concept drift. In Proceedings of the 3rd International IEEE Conference on Data Mining, pp. 123–130.

  • Lanquillon, C. and Renz, I. 1999. Adaptive information filtering: Detecting changes in text streams. In Proceedings of the 8th International Conference on Information and Knowledge Management, pp. 538–544.

  • Quinlan, J.R. 1993. C4.5: Programs for machine learning. Morgan Kaufmann Publishers.

  • Salganicoff, M. 1997. Tolerating concept and sampling shift in lazy learning using prediction error context switching. Artificial Intelligence Review, 11:133–155.


  • Stanley, K.O. 2003. Learning concept drift with a committee of decision trees. Technical Report AI-03-302, Department of Computer Sciences, University of Texas at Austin.

  • Street, W.N. and Kim, Y. 2001. A streaming ensemble algorithm (SEA) for large-scale classification. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 377–382.

  • Tsymbal, A. 2004. The problem of concept drift: Definitions and related work. Technical Report TCD-CS-2004-15, Computer Science Department, Trinity College Dublin.

  • Wang, H., Fan, W., Yu, P.S., and Han, J. 2003. Mining concept-drifting data streams using ensemble classifiers. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 226–235.

  • Widmer, G. and Kubat, M. 1996. Learning in the presence of concept drift and hidden contexts. Machine Learning, 23:69–101.


  • Yang, Y., Wu, X., and Zhu, X. 2004. Dealing with predictive-but-unpredictable attributes in noisy data sources. In Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, pp. 471–483.


Author information

Correspondence to Ying Yang.

Additional information

A preliminary and shorter version of this paper was published in the Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2005), pp. 710–715.

The literature sometimes conflicts when describing these modes; for example, what some papers call concept shift is what other papers call concept drift. The definitions here are clarified to the best of the authors' understanding.

The value in each cell can be a frequency as well as a probability; the latter can be approximated from the former.
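As a toy illustration of this footnote (the numbers and names below are assumed, not taken from the paper), frequencies in such a table can be normalized row by row into probabilities:

```python
# Toy example (assumed values): row-wise normalization of transition frequencies.
freq = {"A": {"B": 6, "C": 2}, "B": {"A": 4}}      # observed concept-transition counts
prob = {src: {dst: n / sum(row.values()) for dst, n in row.items()}
        for src, row in freq.items()}
print(prob["A"])                                   # {'B': 0.75, 'C': 0.25}
```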

If the concept changes so fast that learning cannot catch up with it, prediction will be unreliable. This also applies to human learning.

For example, C4.5rules (Quinlan, 1993) can achieve 100% classification accuracy on the whole data set.

If an attribute value has fewer than 500 instances, all of its instances are sampled without replacement.

If a data set has only nominal attributes, two nominal attributes will be selected. If a data set has only numeric attributes, two numeric attributes will be selected.

One cannot manipulate these degrees in the hyperplane or network-intrusion data, for which no results are presented.

The sample size is chosen to avoid observation noise caused by high classification variance.

These error rates may sometimes be higher than those reported in the original work (Hulten et al., 2001) because the original work used a much larger data size. Many more instances arrive after the new classifier becomes stable and hence can be classified correctly. This longer existence of each concept relieves CVFDT's dilemma and lowers its average error rate.

Please note that for DWCE, the optimal version, whose buffer size equals 10% of its window size, was used on the 3 artificial data streams. However, its prohibitively high time demand makes DWCE intractable for the large number (36) of real-world data streams tested here, so a compromise version whose buffer size is half of its window size is used instead. The results are sufficient to verify that DWCE trades time for accuracy: it can improve prediction accuracy over WCE, but is often too slow to be useful for on-line prediction.


Cite this article

Yang, Y., Wu, X. & Zhu, X. Mining in Anticipation for Concept Change: Proactive-Reactive Prediction in Data Streams. Data Min Knowl Disc 13, 261–289 (2006). https://doi.org/10.1007/s10618-006-0050-x
