Modifying the Symbolic Aggregate Approximation Method to Capture Segment Trend Information

Muhammad Fuad, Muhammad Marwan

doi:10.1007/978-3-030-57524-3_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12256)

Included in the following conference series:

International Conference on Modeling Decisions for Artificial Intelligence

Abstract

Symbolic Aggregate approXimation (SAX) is a very popular symbolic dimensionality reduction technique for time series data, with several advantages over other dimensionality reduction techniques. One major advantage is its efficiency, as it uses precomputed distances. The other is that the distance measure SAX defines on the reduced space lower bounds the distance measure defined on the original space, which enables SAX to return exact results in query-by-content tasks. Yet SAX has an inherent drawback: it cannot capture segment trend information. Several researchers have attempted to enhance SAX by proposing modifications that include trend information, but these come at the expense of one or more of the advantages of SAX. In this paper we investigate three modifications of SAX that add trend-capturing ability. These modifications retain the simplicity and efficiency of SAX, as well as the exact results it returns. They are simple procedures based on a segmentation of the time series different from that used in classic-SAX. We test the performance of these three modifications on a classification task over 45 time series datasets of different sizes, dimensions, and nature, and compare it to that of classic-SAX. The results show that one of these modifications outperforms classic-SAX and that another gives slightly better results than classic-SAX.
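
To make the properties named in the abstract concrete, the following is a minimal sketch of classic-SAX as it is commonly described in the literature: z-normalization, segment means, discretization against Gaussian breakpoints, and a precomputed lookup table for the lower-bounding distance (usually called MINDIST). It is not the paper's three modifications, which the abstract does not specify. All function names are hypothetical, and the sketch assumes the series length is divisible by the number of segments.

```python
import bisect
import math
from statistics import NormalDist, fmean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = fmean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Mean of each equal-length segment (assumes len(series) % n_segments == 0)."""
    if len(series) % n_segments != 0:
        raise ValueError("series length must be divisible by n_segments")
    size = len(series) // n_segments
    return [fmean(series[i * size:(i + 1) * size]) for i in range(n_segments)]


def breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax_word(series, n_segments, alphabet_size):
    """Classic-SAX word as a list of integer symbols in 0..alphabet_size-1."""
    cuts = breakpoints(alphabet_size)
    means = paa(znormalize(series), n_segments)
    return [bisect.bisect_right(cuts, m) for m in means]


def dist_table(alphabet_size):
    """Precomputed symbol-to-symbol distances; adjacent symbols are at distance 0."""
    cuts = breakpoints(alphabet_size)
    table = [[0.0] * alphabet_size for _ in range(alphabet_size)]
    for r in range(alphabet_size):
        for c in range(alphabet_size):
            if abs(r - c) > 1:
                table[r][c] = cuts[max(r, c) - 1] - cuts[min(r, c)]
    return table


def mindist(word_a, word_b, series_length, table):
    """Distance on SAX words that lower bounds the Euclidean distance."""
    total = sum(table[r][c] ** 2 for r, c in zip(word_a, word_b))
    return math.sqrt(series_length / len(word_a)) * math.sqrt(total)


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

As a usage illustration of the drawback the abstract mentions, the series `[0, 1, 2, 3, 3, 2, 1, 0]` and `[3, 2, 1, 0, 0, 1, 2, 3]` have identical segment means when split into two segments, so `sax_word` maps them to the same word and `mindist` between them is zero, even though the trends inside each segment are opposite and the Euclidean distance between the normalized series is positive.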

Author information

Authors and Affiliations

Coventry University, Coventry, CV1 5FB, UK
Muhammad Marwan Muhammad Fuad

Corresponding author

Correspondence to Muhammad Marwan Muhammad Fuad.

Editor information

Editors and Affiliations

Department of Computing Science, Umeå University, Umeå, Sweden
Vicenç Torra
Department of Management Science, Tamagawa University, Tokyo, Japan
Yasuo Narukawa
Department of Operations, Innovation and Data Sciences, ESADE, Sant Cugat, Spain
Jordi Nin
Department of Operations, Innovation and Data Sciences, ESADE, Sant Cugat, Spain
Núria Agell

About this paper

Cite this paper

Muhammad Fuad, M.M. (2020). Modifying the Symbolic Aggregate Approximation Method to Capture Segment Trend Information. In: Torra, V., Narukawa, Y., Nin, J., Agell, N. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2020. Lecture Notes in Computer Science, vol 12256. Springer, Cham. https://doi.org/10.1007/978-3-030-57524-3_19

DOI: https://doi.org/10.1007/978-3-030-57524-3_19
Published: 26 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57523-6
Online ISBN: 978-3-030-57524-3
eBook Packages: Computer Science, Computer Science (R0)
