Recency-based sequential pattern mining in multiple event sequences

Data Mining and Knowledge Discovery

Abstract

Standard sequential pattern mining largely ignores the positions of events within a sequence, which makes it difficult to focus on the more interesting patterns that better represent causal relationships between events. Without quantifying how close two events are in a sequence, we may fail to evaluate how likely it is that an event is caused by the other events in the pattern, which is a severe drawback for applications such as prediction. Motivated by this, we propose a recency-based sequential pattern mining scheme together with a novel measure of pattern interestingness that captures recency as well as frequency. To efficiently extract all recency-based sequential patterns, we devise a mining algorithm called Recency-based Frequent pattern Miner (RF-Miner), together with an effective prediction method that evaluates the quality of recency-based patterns in terms of their predictive power. The experimental results show that RF-Miner extracts more diverse and important patterns that can be used to predict the next event, and that, by using upper bounds of our measure, it runs more efficiently than baseline algorithms.
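
To make the contrast between frequency and recency concrete, the following is a minimal Python sketch of a gap-penalised support. It is not the interestingness measure defined in the paper, which this abstract does not specify, and it is not part of RF-Miner. The function names, the `decay` parameter, and the scoring rule (each skipped event between two matched events multiplies the score by `decay`) are all assumptions made purely for illustration.

```python
from typing import List, Sequence


def best_recency_score(seq: Sequence[str], pattern: Sequence[str],
                       decay: float = 0.5) -> float:
    """Best gap-penalised score of `pattern` as a subsequence of `seq`.

    Illustrative assumption (not the paper's measure): every event skipped
    between two consecutive matched events multiplies the score by `decay`.
    Returns 0.0 when the pattern does not occur in the sequence.
    """
    if not pattern:
        return 0.0
    # score[i]: best score of an embedding of pattern[:j + 1] ending at position i
    score = [1.0 if event == pattern[0] else 0.0 for event in seq]
    for j in range(1, len(pattern)):
        nxt = [0.0] * len(seq)
        for i, event in enumerate(seq):
            if event != pattern[j]:
                continue
            for k in range(i):
                if score[k] > 0.0:
                    gap = i - k - 1  # number of events skipped between k and i
                    nxt[i] = max(nxt[i], score[k] * decay ** gap)
        score = nxt
    return max(score, default=0.0)


def recency_weighted_support(sequences: List[Sequence[str]],
                             pattern: Sequence[str],
                             decay: float = 0.5) -> float:
    """Average the best per-sequence score over all sequences."""
    if not sequences:
        return 0.0
    total = sum(best_recency_score(s, pattern, decay) for s in sequences)
    return total / len(sequences)


def plain_support(sequences: List[Sequence[str]],
                  pattern: Sequence[str]) -> float:
    """With decay = 1.0 gaps are free, so this is ordinary relative support."""
    return recency_weighted_support(sequences, pattern, decay=1.0)


# Toy database: "a" is directly followed by "b" only in the first sequence.
sequences = [list("abxc"), list("axxb"), list("ba")]
weighted = recency_weighted_support(sequences, ["a", "b"])
unweighted = plain_support(sequences, ["a", "b"])
```

In this toy database the pattern a, b occurs in two of the three sequences, so its plain support is 2/3. The weighted value is (1 + 0.25 + 0) / 3, because the second sequence skips two events between a and b. A frequency-only measure treats both occurrences alike, whereas a recency-aware one ranks the tightly coupled occurrence higher.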

Notes

  1. https://en.wikipedia.org/wiki/Serial-position_effect.

  2. To reduce the execution time of all algorithms, we truncated each sequence in the original FIFA data set, available from http://www.philippe-fournier-viger.com/spmf/index.php?link=datasets.php, to a quarter of its original length.

  3. https://github.com/bigdata-inha/RFMiner.

Acknowledgements

This work was supported in part by the National Research Foundation of Korea (NRF) Grant Funded by the Korea government (MSIT) (NRF-2018R1D1A1B07049934), in part by Institute of Information & Communications Technology Planning & Evaluation (IITP) Grants Funded by the Korea government (MSIT) (2019-0-00240, 2019-0-00064, 2017-0-00396, and 2020-0-01389, Artificial Intelligence Convergence Research Center (Inha University)), and in part by INHA UNIVERSITY Research Grant.

Author information

Corresponding author

Correspondence to Dong-Wan Choi.

Additional information

Responsible editor: M. J. Zaki.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kim, H., Choi, DW. Recency-based sequential pattern mining in multiple event sequences. Data Min Knowl Disc 35, 127–157 (2021). https://doi.org/10.1007/s10618-020-00715-7

  • DOI: https://doi.org/10.1007/s10618-020-00715-7
