Where Are We Going? Predicting the Evolution of Individuals

  • Zaigham Faraz Siddiqui
  • Márcia Oliveira
  • João Gama
  • Myra Spiliopoulou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7619)


When searching for patterns in data streams, we come across perennial (dynamic) objects that evolve over time. These objects are encountered repeatedly, each time with a different definition and different values. Examples are (a) companies registered at a stock exchange that report their progress at the end of each year, and (b) students whose performance is evaluated at the end of each semester. On such data, domain experts also ask how the individual objects will evolve: would it be beneficial to invest in a given company, given both the company’s individual performance thus far and the drift experienced in the model? Or, how will a given student perform next year, given the performance variations observed thus far? While there is much research on how models evolve or change over time [Ntoutsi et al., 2011a], little has been done to predict the change of individual objects when the states are not known a priori. In this work, we propose a framework that learns the clusters to which the objects belong at each moment, uses them as ad hoc states in a state-transition graph, and then learns a mixture model of Markov chains, which predicts the next most likely state/cluster per object. We report on our evaluation on synthetic and real datasets.
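The idea of treating clusters as ad hoc states and predicting the next state per object can be illustrated with a minimal sketch. It is deliberately simplified: it fits a single first-order Markov chain by counting transitions, not the mixture model of Markov chains that the abstract describes, and it assumes the cluster label of each object at each time point is already available. The function names `fit_transition_counts` and `predict_next`, and the toy label sequences, are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def fit_transition_counts(sequences):
    """Count first-order transitions between cluster labels.

    `sequences` is an iterable of per-object label sequences, e.g. the
    cluster an object belonged to at each time point (assumed given).
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each label with its successor in the same sequence.
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, current_state):
    """Return the most frequently observed successor of `current_state`.

    Ties are broken in favour of the smallest label; returns None when
    the state has no observed successors.
    """
    followers = counts.get(current_state)
    if not followers:
        return None
    return max(sorted(followers), key=lambda state: followers[state])


# Toy data (hypothetical): cluster labels of three objects over three time points.
histories = {
    "object_a": [0, 0, 1],
    "object_b": [0, 1, 1],
    "object_c": [0, 1, 2],
}
counts = fit_transition_counts(histories.values())

# Predict the next cluster for each object from its last observed cluster.
predictions = {name: predict_next(counts, seq[-1]) for name, seq in histories.items()}
```

In this sketch an object whose last cluster has never been followed by another observation (here `object_c`, in cluster 2) gets no prediction. A mixture of chains, as proposed in the paper, would instead let objects with different transition behaviour be governed by different chains.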


Keywords: clustering, cluster transition, data streams, evolutionary data mining, label prediction, perennial objects




  1. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological) 39(1), 1–38 (1977)
  2. Gaffney, S., Smyth, P.: Trajectory clustering with mixtures of regression models. In: 5th Int. Conf. on Knowledge Discovery and Data Mining, pp. 63–72 (1999)
  3. Ikonomovska, E., Driessens, K., Dzeroski, S., Gama, J.: Adaptive windowing for online learning from multiple inter-related data streams. In: Workshop on Learning and Data Mining for Robots (LEMIR 2011@ICDM 2011), pp. 697–704 (December 2011)
  4. Krempl, G., Siddiqui, Z.F., Spiliopoulou, M.: Online Clustering of High-Dimensional Trajectories under Concept Drift. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds.) ECML PKDD 2011, Part II. LNCS, vol. 6912, pp. 261–276. Springer, Heidelberg (2011)
  5. Laxman, S., Tankasali, V., White, R.W.: Stream prediction using a generative model based on frequent episodes in event sequences. In: Procs of the 14th ACM Int. Conf. on Knowledge Discovery and Data Mining, pp. 453–461. ACM (2008)
  6. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady 10(1), 707–710 (1966)
  7. Ntoutsi, I., Spiliopoulou, M., Theodoridis, Y.: Summarizing Cluster Evolution in Dynamic Environments. In: Murgante, B., Gervasi, O., Iglesias, A., Taniar, D., Apduhan, B.O. (eds.) ICCSA 2011, Part II. LNCS, vol. 6783, pp. 562–577. Springer, Heidelberg (2011)
  8. Oliveira, M., Gama, J.: A framework to monitor clusters evolution applied to economy and finance problems. Intelligent Data Analysis 16(1), 93–111 (2012)
  9. Siddiqui, Z.F., Spiliopoulou, M.: Combining Multiple Interrelated Streams for Incremental Clustering. In: Winslett, M. (ed.) SSDBM 2009. LNCS, vol. 5566, pp. 535–552. Springer, Heidelberg (2009)
  10. Siddiqui, Z.F., Spiliopoulou, M.: Tree Induction over Perennial Objects. In: Gertz, M., Ludäscher, B. (eds.) SSDBM 2010. LNCS, vol. 6187, pp. 640–657. Springer, Heidelberg (2010)
  11. Spiliopoulou, M., Ntoutsi, I., Theodoridis, Y., Schult, R.: MONIC – modeling and monitoring cluster transitions. In: 12th Int. Conf. on Knowledge Discovery and Data Mining, pp. 706–711. ACM (2006)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Zaigham Faraz Siddiqui (1)
  • Márcia Oliveira (2)
  • João Gama (2)
  • Myra Spiliopoulou (1)
  1. University of Magdeburg, Germany
  2. LIAAD, INESC TEC and University of Porto, Portugal
