Air quality visualization analysis based on multivariate time series data feature extraction

  • Regular Paper
  • Journal of Visualization

Abstract

Air quality analysis helps analysts understand the state of atmospheric pollution and its trends, providing data and theoretical support for developing and implementing environmental policies. Air quality data are typically represented as multivariate time series, which are challenging to analyze because of their large volume, high dimensionality, and lack of labels. Analysts often struggle to discover internal relationships and patterns within the data. Existing data mining and exploration methods leave significant room for improvement, particularly with respect to perceptual burden and low efficiency. To assist analysts in atmospheric pollution analysis, we propose an air quality visualization scheme based on feature extraction of multivariate time series data. We combine the automated data modeling capability of deep learning with intuitive data visualization to help analysts explore and analyze complex air quality datasets. To extract features of air quality data effectively, we transform the multivariate time series feature extraction task into an automated deep learning self-supervised task and propose a feature extraction method for multivariate time series called CTDCN. Finally, we design and implement a visualization and analysis system for air quality multivariate time series. This system helps analysts discover potential information and patterns in air quality data, providing a foundation for informed decision-making. The system offers rich visualization views, allows users to change data modeling parameters, and lets them interactively analyze the data and extract insights through multiple views. Extensive experiments on UEA public datasets confirm CTDCN’s superior feature extraction capabilities, while case studies and user studies validate the effectiveness and practicality of our visualization approach.
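
As a rough illustration of how an unlabeled multivariate air quality series might be prepared for self-supervised feature extraction, the sketch below normalizes each variable and cuts the series into overlapping windows. It is not the CTDCN method, whose details are not given here. The pollutant columns, the readings, and the function names `z_normalize` and `sliding_windows` are assumptions made only for this example.

```python
from statistics import mean, pstdev


def z_normalize(series):
    """Scale each variable (column) to zero mean and unit variance."""
    columns = list(zip(*series))
    # A constant column has zero deviation; divide by 1.0 instead.
    stats = [(mean(c), pstdev(c) or 1.0) for c in columns]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in series]


def sliding_windows(series, length, stride):
    """Cut a (time x variables) series into overlapping windows."""
    last_start = len(series) - length
    return [series[i:i + length] for i in range(0, last_start + 1, stride)]


# Hypothetical hourly readings [PM2.5, NO2, O3], invented for illustration.
readings = [
    [35.0, 20.0, 60.0],
    [40.0, 22.0, 58.0],
    [55.0, 30.0, 50.0],
    [70.0, 35.0, 45.0],
    [65.0, 33.0, 47.0],
    [50.0, 28.0, 52.0],
]

windows = sliding_windows(z_normalize(readings), length=4, stride=1)
# Number of windows, time steps per window, variables per time step.
print(len(windows), len(windows[0]), len(windows[0][0]))
```

Each window is a small unlabeled sample that a self-supervised encoder could take as input. The window length and stride are free parameters here, and their values in the example are arbitrary.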

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 62272071 and U1836114.

Author information

Corresponding author

Correspondence to Haibo Hu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Results of the classification experiment

See Table 3.

Table 3 Classification accuracy performance of different methods on UEA datasets

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Luo, X., Jiang, R., Yang, B. et al. Air quality visualization analysis based on multivariate time series data feature extraction. J Vis (2024). https://doi.org/10.1007/s12650-024-00981-3

  • DOI: https://doi.org/10.1007/s12650-024-00981-3
