Modifying the Symbolic Aggregate Approximation Method to Capture Segment Trend Information

  • Conference paper
Modeling Decisions for Artificial Intelligence (MDAI 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12256)

Abstract

The Symbolic Aggregate approXimation (SAX) is a very popular symbolic dimensionality reduction technique for time series data, with several advantages over other dimensionality reduction techniques. One major advantage is its efficiency, as it uses precomputed distances. The other is that in SAX the distance measure defined on the reduced space lower-bounds the distance measure defined on the original space, which enables SAX to return exact results in query-by-content tasks. Yet SAX has an inherent drawback: it cannot capture segment trend information. Several researchers have attempted to enhance SAX by proposing modifications that include trend information. However, these come at the expense of giving up one or more of the advantages of SAX. In this paper we investigate three modifications of SAX that add trend-capturing ability to it. These modifications retain the features of SAX in terms of simplicity, efficiency, and the exact results it returns. They are simple procedures based on a segmentation of the time series different from that used in classic-SAX. We test the performance of these three modifications on 45 time series datasets of different sizes, dimensions, and nature on a classification task, and we compare it to that of classic-SAX. The results show that one of these modifications outperforms classic-SAX and that another gives slightly better results than classic-SAX.
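
The abstract credits classic-SAX's efficiency to precomputed distances and its exact query results to a lower-bounding distance. As a rough illustration of that baseline only (not of the three modifications studied in the paper, whose segmentation the abstract does not describe), the Python sketch below follows the standard SAX formulation: z-normalization, piecewise aggregate approximation, equiprobable Gaussian breakpoints, and a symbol-to-symbol lookup table used by the MINDIST measure. All function names are hypothetical, and the sketch assumes the series length is divisible by the number of segments.

```python
from bisect import bisect_right
from math import sqrt
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    if len(series) % n_segments != 0:
        raise ValueError("this sketch assumes len(series) is divisible by n_segments")
    width = len(series) // n_segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(n_segments)]


def breakpoints(alphabet_size):
    """Cut points splitting the standard normal into equiprobable regions."""
    normal = NormalDist()
    return [normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax_word(series, n_segments, alphabet_size):
    """Map each segment mean to a letter: 'a' for the lowest region, and so on."""
    cuts = breakpoints(alphabet_size)
    means = paa(znormalize(series), n_segments)
    return "".join(chr(ord("a") + bisect_right(cuts, m)) for m in means)


def symbol_distance_table(alphabet_size):
    """Precomputed symbol distances: 0 for equal or adjacent symbols,
    otherwise the gap between the breakpoints that separate them."""
    cuts = breakpoints(alphabet_size)
    table = [[0.0] * alphabet_size for _ in range(alphabet_size)]
    for r in range(alphabet_size):
        for c in range(alphabet_size):
            if abs(r - c) > 1:
                table[r][c] = cuts[max(r, c) - 1] - cuts[min(r, c)]
    return table


def mindist(word_a, word_b, series_length, table):
    """Distance on SAX words that lower-bounds the Euclidean distance
    between the z-normalized original series."""
    squared = sum(
        table[ord(p) - ord("a")][ord(q) - ord("a")] ** 2
        for p, q in zip(word_a, word_b)
    )
    return sqrt(series_length / len(word_a)) * sqrt(squared)


# Usage: a step up versus a step down, 2 segments, alphabet of 4 symbols.
s = [0, 0, 0, 0, 10, 10, 10, 10]
t = list(reversed(s))
table = symbol_distance_table(4)
word_s, word_t = sax_word(s, 2, 4), sax_word(t, 2, 4)  # "ad" and "da"
lower_bound = mindist(word_s, word_t, len(s), table)
```

Because the table is built once per alphabet size, comparing two words costs only lookups and one square root. Each symbol here summarizes a segment by its mean alone, so a rising and a falling segment with the same mean receive the same symbol, which is the trend-blindness the abstract refers to.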

Author information

Corresponding author

Correspondence to Muhammad Marwan Muhammad Fuad.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Muhammad Fuad, M.M. (2020). Modifying the Symbolic Aggregate Approximation Method to Capture Segment Trend Information. In: Torra, V., Narukawa, Y., Nin, J., Agell, N. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2020. Lecture Notes in Computer Science, vol 12256. Springer, Cham. https://doi.org/10.1007/978-3-030-57524-3_19

  • DOI: https://doi.org/10.1007/978-3-030-57524-3_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-57523-6

  • Online ISBN: 978-3-030-57524-3

  • eBook Packages: Computer Science, Computer Science (R0)
