TreeRoses: outlier-centric monitoring and analysis of periodic time series data

  • Hui Tang
  • Shuang Wei
  • Zheng Zhou
  • Zhenyu Cheryl Qian
  • Yingjie Victor Chen
Regular Paper

Abstract

Periodic time series data can reveal temporal patterns as well as anomalies. Although these patterns appear in different ranges and under different measurements, the variables in a dataset can be associated with and affect one another. As datasets grow large, monitoring and analyzing them becomes challenging. We propose a visual analytics framework that transforms a large number of multivariate time series into manageable visual representations, enabling analysts to identify and monitor anomalies and trends. After sorting the data in the temporal and categorical directions, we extract the temporal possibility range patterns and the anomalies of each variable. Associated variables typically exhibit similar anomalies. We therefore employ a hierarchical clustering algorithm to group variables with similar anomaly appearances. Through our interactive visualization toolset, TreeRoses, we find that possibility ranges, anomalies and their summaries, and hierarchical associations are perceptible to different degrees. We apply our method to a real-world periodic time series dataset to showcase how our framework effectively monitors anomalies in meteorological data.
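The pipeline summarized above (per-variable range extraction, anomaly flagging, and hierarchical grouping of variables by their anomalies) can be illustrated with a minimal sketch. The abstract does not state the concrete rules, so everything specific here is an assumption: the "possibility range" is approximated by interquartile fences computed per phase of the period, variables are compared by the Jaccard distance between their sets of anomalous time indices, and clusters are merged with average linkage. The function names `phase_ranges`, `anomaly_indices`, and `cluster_variables` are hypothetical and do not come from TreeRoses.

```python
from itertools import combinations
from statistics import quantiles


def phase_ranges(series, period, k=1.5):
    """Per-phase (low, high) fences from the quartiles of each phase.

    Assumption: an IQR fence stands in for the paper's possibility range.
    """
    ranges = []
    for phase in range(period):
        values = series[phase::period]  # same phase across all cycles
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        spread = k * (q3 - q1)
        ranges.append((q1 - spread, q3 + spread))
    return ranges


def anomaly_indices(series, period, k=1.5):
    """Indices whose value falls outside the fence of its own phase."""
    ranges = phase_ranges(series, period, k)
    return {
        i for i, value in enumerate(series)
        if not ranges[i % period][0] <= value <= ranges[i % period][1]
    }


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two anomaly-free variables count as identical."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)


def cluster_variables(anomalies):
    """Average-linkage agglomerative clustering over anomaly sets.

    Returns the merges in order as (left, right, distance) tuples.
    """
    clusters = {name: [name] for name in anomalies}  # label -> member variables
    merges = []

    def linkage(pair):
        left, right = pair
        dists = [
            jaccard_distance(anomalies[x], anomalies[y])
            for x in clusters[left]
            for y in clusters[right]
        ]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        left, right = min(combinations(sorted(clusters), 2), key=linkage)
        merges.append((left, right, linkage((left, right))))
        clusters[f"({left}+{right})"] = clusters.pop(left) + clusters.pop(right)
    return merges


# Toy data (made up for illustration): period of 4 samples, 6 cycles.
temperature = [10, 20, 30, 20] * 6
humidity = [80, 70, 60, 70] * 6
wind = [5, 5, 5, 5] * 6
temperature[9] = 60   # spike
humidity[9] = 10      # dip at the same time step
wind[14] = 50         # unrelated spike

found = {
    name: anomaly_indices(values, period=4)
    for name, values in
    {"temperature": temperature, "humidity": humidity, "wind": wind}.items()
}
for merge in cluster_variables(found):
    print(merge)
```

In this toy setting the two variables that deviate at the same time step are merged first, which mirrors the idea that associated variables show similar anomalies. A production system would more likely rely on an established clustering library and a detection rule chosen for the data at hand.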

Keywords

Compression techniques · Time-varying data · Multidimensional data · Multiresolution techniques

Copyright information

© The Visualization Society of Japan 2019

Authors and Affiliations

  1. Shenzhen Key Laboratory of Media Security, College of Information Engineering, Shenzhen University, Shenzhen, China
  2. Department of Computer Graphics Technology, Purdue University, West Lafayette, USA
  3. Cerebri AI, Inc., Austin, USA
  4. Department of Art and Design, Purdue University, West Lafayette, USA
