Probabilistic SAX: A Cognitively-Inspired Method for Time Series Classification in Cognitive IoT Sensor Network

  • Research
  • Mobile Networks and Applications

Abstract

Cognitive Internet of Things (CIoT) is a new subfield of the Internet of Things (IoT) that aims to integrate cognition into the IoT's architecture and design. Various CIoT applications require techniques that extract machine-understandable concepts from unprocessed sensory data to provide value-added insights about CIoT devices and their users. Time series classification, which is used for this concept extraction, poses challenges for applications across many domains, and dimensionality reduction strategies have been suggested as an effective way to reduce the dimensionality of time series. The most common approach for time series classification is symbolic aggregate approximation (SAX). However, its main drawback is that it does not select the most significant point from each segment during the piecewise aggregate approximation (PAA) stage. This drawback is especially cumbersome when the data are heterogeneous and massive. Therefore, this research presents a novel technique for selecting the most significant point from a segment during the PAA stage in SAX. The proposed technique chooses the most informative point as the most significant point, using a probabilistic interpretation of the sensory data with an appropriate copula design. The appropriate copula is selected as the one with the minimum Akaike information criterion (AIC) value. Subsequently, the modified SAX uses the most informative points instead of the mean, maximum, or extreme data points of a given segment during the PAA stage. The experimental evaluation on the environmental dataset indicates that the proposed method is more accurate and computationally efficient than classic SAX. In addition, for cross-validation, the method computes the entropy of the information point (i-value) from each dataset to verify the successful transformation of normal data points to information points.

Data availability

The original dataset will be made available upon request.

Acknowledgements

We thank the Editor-in-Chief, the Editor, and the reviewers, who gave their valuable time to offer suggestions and comments on our research work. We are also grateful to everyone who directly or indirectly helped to improve this work.

Funding

Not applicable.

The authors have no relevant financial interests to disclose.

Non-financial interests

The authors have no relevant non-financial interests to disclose.

Author information

Contributions

Vidyapati Jha and Priyanka Tripathi contributed equally to this work.

Corresponding author

Correspondence to Vidyapati Jha.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jha, V., Tripathi, P. Probabilistic SAX: A Cognitively-Inspired Method for Time Series Classification in Cognitive IoT Sensor Network. Mobile Netw Appl (2024). https://doi.org/10.1007/s11036-024-02322-y

  • DOI: https://doi.org/10.1007/s11036-024-02322-y
