SSRDVis: Interactive visualization for event sequences summarization and rare detection

  • Chenlu Li
  • Xiaoju Dong (corresponding author)
  • Wei Liu
  • Shiying Sheng
  • Aijuan Qian
Regular Paper

Abstract

This paper presents SSRDVis, a visual approach for summarizing event sequences and interactively detecting rare behaviors. SSRDVis consists of three main components: (1) a sequence embedding module that learns feature vectors of sequences, (2) a sequence grouping and summarization module that finds representative clusters and patterns in the dataset, and (3) a rare detection module that discovers and explains rare cases. Sequences are embedded into a vector space via “mixed-ngram2vec,” which is adapted from “word2vec.” Unsupervised learning models can then be applied in the vector space to group similar sequences and detect anomalies. Furthermore, sequential pattern graphs are built to provide a compact, semantic summary of the sequences. These components work together to present both overall sequential patterns and abnormal behaviors in one visual interface. We demonstrate the feasibility of our approach by applying it to Web clickstreams. Experimental results show that our approach can help identify noticeable patterns, especially rare behaviors, in a large number of event sequences.
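
The embed-then-detect idea can be illustrated with a minimal sketch. It is not the SSRDVis implementation: as a loudly simplified stand-in, it replaces the learned mixed-ngram2vec embedding with normalized counts of mixed-length n-grams, and it replaces the unsupervised anomaly model with a plain "one minus mean cosine similarity" score. The function names, the `max_n` parameter, and the sample sessions are all hypothetical.

```python
from collections import Counter
from math import sqrt


def mixed_ngrams(seq, max_n=3):
    """Return all contiguous n-grams of length 1..max_n as tuples."""
    return [tuple(seq[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(seq) - n + 1)]


def embed(sequences, max_n=3):
    """Map each sequence to a unit-length sparse n-gram count vector.

    Assumption: count vectors stand in for a learned embedding.
    """
    vectors = []
    for seq in sequences:
        counts = Counter(mixed_ngrams(seq, max_n))
        norm = sqrt(sum(c * c for c in counts.values())) or 1.0
        vectors.append({gram: c / norm for gram, c in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    return sum(w * v.get(gram, 0.0) for gram, w in u.items())


def rarity_scores(vectors):
    """Score each vector as 1 minus its mean similarity to all others."""
    scores = []
    for i, u in enumerate(vectors):
        sims = [cosine(u, v) for j, v in enumerate(vectors) if j != i]
        scores.append(1.0 - sum(sims) / max(len(sims), 1))
    return scores


# Hypothetical clickstream sessions; each event is a page category.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "sports", "news"],
    ["home", "news", "weather"],
    ["home", "sports", "news"],
    ["bbs", "travel", "bbs", "travel"],
]

scores = rarity_scores(embed(sessions))
rarest = max(range(len(sessions)), key=scores.__getitem__)
print(rarest, sessions[rarest])
```

Mixing n-gram lengths lets the vector capture both individual events and short ordered subsequences, so two sessions that visit the same pages in a different order are similar but not identical. A learned embedding and a dedicated anomaly detector would replace `embed` and `rarity_scores` in a fuller pipeline.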

Keywords

Visual analytics, Event sequences, Sequential pattern mining, Rare detection

Notes

Acknowledgements

This work was supported by the National Key Research and Development Program of China (Grant No. 2017YFB0701900), the National Natural Science Foundation of China (Grant No. 61100053), and the Key Laboratory of Machine Perception at Peking University (K-2019-09).

Copyright information

© The Visualization Society of Japan 2019

Authors and Affiliations

  1. BASICS, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
