Exploit Every Cycle: Vectorized Time Series Algorithms on Modern Commodity CPUs

  • Conference paper
Data Management on New Hardware (ADMS 2016, IMDM 2016)

Abstract

Many time series algorithms reduce computation cost by pruning unpromising candidates with lower-bound distance functions. In this paper, we focus on an orthogonal research direction that further boosts performance by unlocking the potential of modern commodity CPUs. First, we profile existing algorithms to understand where the time goes. Second, we design vectorized implementations of lower-bound and distance functions that exploit CPU characteristics (e.g., data parallelism, caching, branch prediction). Third, our vectorized methods are general and applicable to many time series problems, such as subsequence search, motif discovery, and kNN classification. Our experimental study on real datasets shows that our proposal achieves a speedup of up to 6 times.
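
The two ideas named in the abstract, lower-bound pruning and a distance computation organized for data parallelism, can be sketched roughly as follows. This is a minimal Python illustration, not the authors' implementation: the paper targets CPU vector instructions, while this sketch only mimics the structure by accumulating the sum in fixed-size blocks and running the early-abandoning test once per block. The names `squared_euclidean_blocked`, `lb_first_last`, `nearest_neighbor`, and the `block` parameter are assumptions made for illustration, and the first/last-point bound is a deliberately simple stand-in for the lower bounds studied in the literature.

```python
def squared_euclidean_blocked(q, c, best_so_far=float("inf"), block=8):
    """Squared Euclidean distance between equal-length sequences q and c.

    The sum is accumulated block by block (a stand-in for SIMD lanes).
    The early-abandoning test runs once per block rather than once per
    element. Returns float("inf") if the candidate is abandoned.
    """
    if len(q) != len(c):
        raise ValueError("sequences must have equal length")
    total = 0.0
    for start in range(0, len(q), block):
        qs = q[start:start + block]
        cs = c[start:start + block]
        # No abandoning test inside the block: just accumulate.
        total += sum((x - y) ** 2 for x, y in zip(qs, cs))
        if total >= best_so_far:
            return float("inf")  # cannot beat the best candidate so far
    return total


def lb_first_last(q, c):
    """Hypothetical cheap lower bound on the squared Euclidean distance.

    The full sum contains the first and last squared differences as two
    of its non-negative terms, so their sum can never exceed it.
    """
    first = (q[0] - c[0]) ** 2
    if len(q) == 1:
        return first
    return first + (q[-1] - c[-1]) ** 2


def nearest_neighbor(query, candidates, lower_bound=lb_first_last):
    """Return (index, squared distance) of the candidate closest to query."""
    best_dist, best_idx = float("inf"), -1
    for i, cand in enumerate(candidates):
        # Prune: if even the lower bound is no better, skip the candidate.
        if lower_bound(query, cand) >= best_dist:
            continue
        d = squared_euclidean_blocked(query, cand, best_dist)
        if d < best_dist:
            best_dist, best_idx = d, i
    return best_idx, best_dist
```

Checking the abandoning condition per block instead of per element trades a little wasted arithmetic for fewer branches, which is the kind of trade-off a vectorized implementation has to make; the block size that works best on real hardware would need to be measured.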

Notes

  1. http://www.physionet.org/physiobank/.

  2. https://en.wikipedia.org/wiki/Apple_mobile_application_processors.

  3. The squared distance preserves the relative ordering of distances, and it avoids expensive square root calculations.

  4. http://www.physionet.org/physiobank/database/edb/.

  5. http://www.physionet.org/physiobank/database/ltstdb/.

  6. http://www.physionet.org/pn6/chbmit/.

  7. http://data.gov.uk/metoffice-data-archive.

  8. http://www.cs.ucr.edu/~mueen/OnlineMotif/index.html.

  9. For consistency, we use the ‘float’ data type to represent time series values in all evaluated methods.

  10. http://en.wikipedia.org/wiki/Cache_pollution.
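
The observation in note 3 can be checked with a small example. The sketch below ranks a few candidates by squared distance and by true Euclidean distance and compares the two orderings; because the square root is monotonically increasing on non-negative values, the rankings agree. The helper names and the sample values are assumptions chosen for illustration only.

```python
import math


def squared_distance(a, b):
    """Sum of squared element-wise differences (no square root)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean_distance(a, b):
    """True Euclidean distance: square root of the squared distance."""
    return math.sqrt(squared_distance(a, b))


def rank_by(query, candidates, dist):
    """Candidate indices sorted from nearest to farthest under dist."""
    return sorted(range(len(candidates)),
                  key=lambda i: dist(query, candidates[i]))


# Illustrative values only, not taken from the paper's datasets.
query = [1.0, 2.0, 3.0]
candidates = [[4.0, 6.0, 3.0], [1.0, 2.0, 5.0], [2.0, 2.0, 3.0]]

same_order = (rank_by(query, candidates, squared_distance)
              == rank_by(query, candidates, euclidean_distance))
```

Since only the ordering matters for nearest-neighbor style queries, the square root can be deferred until a final distance actually has to be reported.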

Acknowledgement

This project was supported by grant GRF 152043/15E from the Hong Kong RGC, grant MYRG2014-00106-FST from the UMAC Research Committee, and grant NSFC 61502548 from the National Natural Science Foundation of China.

Author information

Corresponding author

Correspondence to Bo Tang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Tang, B., Yiu, M.L., Li, Y., U, L.H. (2017). Exploit Every Cycle: Vectorized Time Series Algorithms on Modern Commodity CPUs. In: Blanas, S., Bordawekar, R., Lahiri, T., Levandoski, J., Pavlo, A. (eds) Data Management on New Hardware. ADMS 2016, IMDM 2016. Lecture Notes in Computer Science, vol 10195. Springer, Cham. https://doi.org/10.1007/978-3-319-56111-0_2

  • DOI: https://doi.org/10.1007/978-3-319-56111-0_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56110-3

  • Online ISBN: 978-3-319-56111-0

  • eBook Packages: Computer Science, Computer Science (R0)
