Searching for Patterns in Sequential Data: Functionality and Performance Assessment of Commercial and Open-Source Systems

  • Conference paper
Advances in Conceptual Modeling (ER 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9975)

Abstract

Ubiquitous devices and applications generate data that are naturally ordered by time, so elementary data items can form sequences. The most popular way of analyzing sequences is searching for patterns. To this end, sequential pattern discovery techniques have been proposed in several research contributions and implemented in a few database systems, e.g., Oracle Database, Teradata Aster, and Apache Hive. The goal of this work is to assess the functionality of these systems and to evaluate their performance with respect to pattern queries.
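
To make the notion of a pattern query over time-ordered data concrete, the following is a minimal Python sketch and not the method evaluated in the paper. It assumes each event is a hypothetical `(sequence_id, timestamp, symbol)` tuple with a one-character symbol, groups events into sequences, orders each sequence by time, and searches it with a regular expression. The function name `find_patterns` and the sample data are illustrative only.

```python
import re
from collections import defaultdict


def find_patterns(events, pattern):
    """Return (sequence_id, start_ts, end_ts, matched_symbols) for each match.

    events  -- iterable of (sequence_id, timestamp, symbol); symbol is assumed
               to be a single character so that string offsets map to events
    pattern -- regular expression over the symbols, e.g. r"D+U+"
    """
    # Group the elementary data items by the sequence they belong to.
    sequences = defaultdict(list)
    for seq_id, ts, symbol in events:
        sequences[seq_id].append((ts, symbol))

    regex = re.compile(pattern)
    matches = []
    for seq_id, items in sequences.items():
        # Order each sequence by time before searching it.
        items.sort(key=lambda item: item[0])
        text = "".join(symbol for _, symbol in items)
        for m in regex.finditer(text):
            if m.end() == m.start():
                continue  # ignore empty matches
            start_ts = items[m.start()][0]
            end_ts = items[m.end() - 1][0]
            matches.append((seq_id, start_ts, end_ts, m.group()))
    return matches


# Hypothetical data: "D" marks a falling value, "U" a rising one.
events = [
    ("s1", 3, "U"), ("s1", 1, "D"), ("s1", 2, "D"), ("s1", 4, "U"),
    ("s2", 1, "U"), ("s2", 2, "U"), ("s2", 3, "D"),
]
result = find_patterns(events, r"D+U+")
print(result)
```

Here the pattern `D+U+` looks for one or more falls followed by one or more rises within a single sequence. The systems named in the abstract express such queries with their own language constructs instead of application code.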

Acknowledgement

The research of G. Bakkalian has been funded by the European Commission through the “Erasmus Mundus Joint Doctorate Information Technologies for Business Intelligence Doctoral College (IT4BI-DC)”. The research of W. Andrzejewski, B. Bębel, and R. Wrembel has been funded by the Polish National Science Center, grant “Analytical processing and mining of sequential data: models, algorithms, and data structures”.

Author information

Corresponding author

Correspondence to Gastón Bakkalian.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Andrzejewski, W., Bębel, B., Kłosowski, S., Łukaszewski, B., Wrembel, R., Bakkalian, G. (2016). Searching for Patterns in Sequential Data: Functionality and Performance Assessment of Commercial and Open-Source Systems. In: Link, S., Trujillo, J. (eds) Advances in Conceptual Modeling. ER 2016. Lecture Notes in Computer Science, vol 9975. Springer, Cham. https://doi.org/10.1007/978-3-319-47717-6_8

  • DOI: https://doi.org/10.1007/978-3-319-47717-6_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47716-9

  • Online ISBN: 978-3-319-47717-6

  • eBook Packages: Computer Science, Computer Science (R0)
