Extracting ordinal temporal trail clusters in networks using symbolic time-series analysis

Abstract

Temporal trails generated by agents traveling to various locations at different time epochs are becoming more prevalent in large social networks. We propose an algorithm that clusters such agent trails from networks into groups of similar trails. The algorithm models each trail as a probabilistic finite state automaton (PFSA), and the depth of the PFSA specifies the required degree of similarity between trails. Hierarchical agglomerative clustering then groups the trails based on their representative PFSA and the locations that they visit. The algorithm was applied to simulated trails and to real-world network trails obtained from the GPS locations of merchant marine ships. In both cases it detected and extracted the underlying patterns in the trails and formed clusters of similar trails.
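The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the PFSA is built or compared, so the sketch assumes that a state is the tuple of the last `depth` locations visited, that two models are compared by total variation distance, and that average-linkage merging stops at a user-chosen `threshold`. All function names are hypothetical, and the paper's additional use of visited locations in the clustering step is not modeled.

```python
from collections import Counter
from itertools import combinations


def pfsa_ngram_distribution(trail, depth):
    """Empirical distribution over (state, next_location) pairs, where a
    state is the tuple of the last `depth` locations (an assumption)."""
    counts = Counter(
        (tuple(trail[i:i + depth]), trail[i + depth])
        for i in range(len(trail) - depth)
    )
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def pfsa_distance(p, q):
    """Total variation distance between two distributions, in [0, 1].
    The choice of distance is an assumption, not taken from the paper."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def cluster_trails(trails, depth, threshold):
    """Average-linkage agglomerative clustering of trails. Merging stops
    once the closest pair of clusters is farther apart than `threshold`."""
    models = [pfsa_ngram_distribution(t, depth) for t in trails]
    clusters = [[i] for i in range(len(trails))]

    def linkage(a, b):
        # Mean pairwise model distance between two clusters of trail indices.
        pairs = [(i, j) for i in a for j in b]
        return sum(pfsa_distance(models[i], models[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        # combinations() yields index pairs with x < y, so deleting y is safe.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return sorted(sorted(c) for c in clusters)


# Two trails cycle A -> B -> C, two cycle A -> C -> B (made-up data).
trails = [
    list("ABCABCABCABC"),
    list("ABCABCABCA"),
    list("ACBACBACBACB"),
    list("ACBACBACB"),
]
print(cluster_trails(trails, depth=1, threshold=0.5))
print(cluster_trails(trails, depth=0, threshold=0.5))
```

With `depth=1` the models capture the order of visits, so the two cycling directions fall into separate clusters. With `depth=0` only visit frequencies are compared and all four trails merge, which illustrates how the depth parameter controls how strict the notion of similarity is.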

Acknowledgments

This work was supported in part by the Office of Naval Research (N00014-06-1-0104) for adversarial assessment and (N00014-08-11186) for rapid ethnographic assessment, the Army Research Office and ERDC-TEC (W911NF0710317). Additional support was provided by CASOS—the center for Computational Analysis of Social and Organizational Systems at Carnegie Mellon University. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Office of Naval Research, the Army Research Institute, the US Army Engineer Research and Development Centers (ERDC), Topographic Engineering Center or the US government.

Author information

Corresponding author

Correspondence to Aparna Gullapalli.

About this article

Cite this article

Gullapalli, A., Carley, K.M. Extracting ordinal temporal trail clusters in networks using symbolic time-series analysis. Soc. Netw. Anal. Min. 3, 1179–1194 (2013). https://doi.org/10.1007/s13278-012-0091-7

Keywords

  • Spatiotemporal networks
  • Network trails
  • Time-series analysis
  • Symbolic dynamics