Social Network Analysis and Mining, Volume 3, Issue 4, pp 1179–1194

Extracting ordinal temporal trail clusters in networks using symbolic time-series analysis

  • Aparna Gullapalli
  • Kathleen M. Carley
Original Article


Temporal trails generated by agents traveling to various locations at different time epochs are becoming more prevalent in large social networks. We propose an algorithm to intuitively cluster groups of such agent trails from networks. The proposed algorithm models each trail as a probabilistic finite state automaton (PFSA). The required degree of similarity between trails is set by specifying the depth of the PFSA. Hierarchical agglomerative clustering is used to group trails based on their representative PFSA and the locations that they visit. The algorithm was applied to simulated trails and to real-world network trails obtained from the GPS locations of merchant marine ships. In both cases it was able to intuitively detect and extract the underlying patterns in the trails and form clusters of similar trails.
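The pipeline described above (one PFSA per trail, a depth parameter, then agglomerative clustering) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each trail is already a sequence of location symbols, that a depth-D PFSA state is the last D locations visited, and that two PFSA are compared by the L1 distance between their transition probabilities. The function names `pfsa`, `distance`, and `cluster`, the distance measure, the average-linkage rule, and the stopping threshold are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def pfsa(trail, depth):
    """Estimate P(next location | last `depth` locations) from one trail.

    Assumption: a state is the tuple of the last `depth` symbols.
    """
    counts = Counter()
    for i in range(len(trail) - depth):
        state = tuple(trail[i:i + depth])
        counts[(state, trail[i + depth])] += 1
    totals = Counter()
    for (state, _next), c in counts.items():
        totals[state] += c
    # Normalise each state's outgoing counts into probabilities.
    return {key: c / totals[key[0]] for key, c in counts.items()}


def distance(p, q):
    """L1 distance over the union of transitions (an assumed measure)."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def cluster(trails, depth, threshold):
    """Average-linkage agglomerative clustering of trails by PFSA distance.

    Merging stops once the closest pair of clusters is farther apart
    than `threshold` (a hypothetical stopping rule).
    """
    models = [pfsa(t, depth) for t in trails]
    clusters = [[i] for i in range(len(trails))]

    def linkage(a, b):
        total = sum(distance(models[i], models[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]),
        )
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]  # x < y, so y is safe to drop
        del clusters[y]
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Two trails cycle A -> B -> C, two cycle A -> C -> B.
    trails = [list("ABCABCABCABC"), list("ABCABCABC"),
              list("ACBACBACB"), list("ACBACBACBACB")]
    print(cluster(trails, depth=1, threshold=1.0))
```

In this sketch, a larger `depth` conditions each transition on a longer visit history, so two trails must agree on longer ordered subsequences to be considered close. This mirrors the role the abstract gives to PFSA depth as the control on required similarity.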


Spatiotemporal networks; Network trails; Time-series analysis; Symbolic dynamics



This work was supported in part by the Office of Naval Research (N00014-06-1-0104) for adversarial assessment and (N00014-08-11186) for rapid ethnographic assessment, the Army Research Office and ERDC-TEC (W911NF0710317). Additional support was provided by CASOS, the Center for Computational Analysis of Social and Organizational Systems at Carnegie Mellon University. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Office of Naval Research, the Army Research Institute, the US Army Engineer Research and Development Centers (ERDC), Topographic Engineering Center or the US government.



Copyright information

© Springer-Verlag Wien 2013

Authors and Affiliations

  1. CASOS, Institute of Software Research, School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
