ClusPath: a temporal-driven clustering to infer typical evolution paths

Data Mining and Knowledge Discovery

Abstract

We propose ClusPath, a novel algorithm for detecting general evolution tendencies in a population of entities. We show how abstract notions, such as the Swedish socio-economic model (in a political dataset) or companies' fiscal optimization (in an economic dataset), can be inferred from low-level descriptive features. Such high-level regularities in the evolution of entities are detected by combining spatial and temporal features into a spatio-temporal dissimilarity measure and using semi-supervised clustering techniques. The relations between the evolution phases are modeled using a graph structure, inferred simultaneously with the partition under a “slow changing world” assumption. The idea is to ensure a smooth passage for entities along their evolution paths, which captures the long-term trends in the dataset. We also provide a method, based on an evolutionary algorithm, to tune the parameters of ClusPath to new, unseen datasets. This method assesses the fitness of a solution using four opposing quality measures and proposes a balanced compromise.
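The abstract does not state the formula of the spatio-temporal dissimilarity measure. As a minimal sketch of the general idea, the Python function below combines a normalized spatial distance and a normalized temporal distance into a single value in [0, 1]. The multiplicative form, the function name `spatio_temporal_dissimilarity`, and the parameters `max_dx`, `max_dt`, `gamma_d` and `gamma_t` are assumptions made for illustration and should not be read as the definition used by ClusPath.

```python
import math


def spatio_temporal_dissimilarity(x_i, t_i, x_j, t_j, max_dx, max_dt,
                                  gamma_d=1.0, gamma_t=1.0):
    """Illustrative dissimilarity between two timestamped observations.

    x_i, x_j : sequences of descriptive features (same length)
    t_i, t_j : numeric timestamps (e.g. years)
    max_dx   : largest feature-space distance in the dataset (assumed known)
    max_dt   : largest time gap in the dataset (assumed known)
    gamma_d, gamma_t : weights in [0, 1] for the spatial and temporal parts
                       (hypothetical tuning parameters)
    """
    if max_dx <= 0 or max_dt <= 0:
        raise ValueError("max_dx and max_dt must be positive")

    # Normalize both components so that each lies in [0, 1].
    d_space = math.dist(x_i, x_j) / max_dx
    d_time = abs(t_i - t_j) / max_dt

    # The result is 0 only when both components are 0, and it grows
    # when either the features or the timestamps move apart.
    return 1.0 - (1.0 - gamma_d * d_space ** 2) * (1.0 - gamma_t * d_time ** 2)


# Two observations: half the maximum distance apart in space and in time.
d = spatio_temporal_dissimilarity((0.0, 0.0), 2000, (3.0, 4.0), 2005,
                                  max_dx=10.0, max_dt=10.0)
print(d)  # 0.4375
```

With a measure of this kind, two observations that have similar features but are far apart in time are still treated as dissimilar, which pushes a clustering algorithm toward clusters that are coherent in both the feature space and time.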

Notes

  1. A preprocessed version of the CPDS1 dataset can be downloaded from http://goo.gl/17ihsf.

Acknowledgments

NICTA is funded by the Australian Government through the Department of Communications and the Australian Research Council through the ICT Centre of Excellence Program.

Author information

Corresponding author

Correspondence to Marian-Andrei Rizoiu.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Research Involving Human Participants and/or Animals

The authors declare that no part of the research presented in this manuscript involved any humans or animals.

Additional information

Responsible editors: Joao Gama, Indre Zliobaite, Alipio Jorge, Concha Bielza.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 264 KB)

About this article

Cite this article

Rizoiu, MA., Velcin, J., Bonnevay, S. et al. ClusPath: a temporal-driven clustering to infer typical evolution paths. Data Min Knowl Disc 30, 1324–1349 (2016). https://doi.org/10.1007/s10618-015-0445-7

  • DOI: https://doi.org/10.1007/s10618-015-0445-7
