Abstract
Trillions of bytes of data are generated every day in many forms, and data mining is the study of extracting useful information from this massive volume of data. Sequential pattern mining is a major branch of data mining that deals with mining frequent sequential patterns from sequence databases. Because items differ in importance in real-life scenarios, they cannot be treated uniformly. With today’s datasets, using weights in sequential pattern mining is much more feasible. For most real-life datasets, pushing weights into the mining process gives a better understanding of the data, because it also measures the importance of each item inside a pattern rather than treating all items equally. Many techniques have been introduced to mine weighted sequential patterns, but these algorithms typically generate a massive number of candidate patterns and take a long time to execute. This work introduces a new pruning technique and a complete framework that takes much less time and generates a small number of candidate sequences without compromising completeness. Performance evaluation on real-life datasets shows that our proposed approach can mine weighted patterns substantially faster than existing approaches.
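To make the idea of weighting a sequential pattern concrete, the following is a minimal sketch and not the framework proposed in this article. It assumes one common convention from the weighted pattern mining literature: the weight of a pattern is the average weight of its items, and its weighted support is that weight multiplied by the number of sequences containing the pattern. The function names, the toy database, and the item weights are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

Itemset = Set[str]
Sequence = List[Itemset]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each pattern itemset must be a subset of some sequence itemset,
    and the matched itemsets must appear in the same order.
    """
    pos = 0  # index of the next pattern itemset still to be matched
    for itemset in sequence:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def support(database: List[Sequence], pattern: Sequence) -> int:
    """Count the sequences in the database that contain the pattern."""
    return sum(1 for seq in database if contains(seq, pattern))


def pattern_weight(pattern: Sequence, weights: Dict[str, float]) -> float:
    """Assumed definition: average weight over all items in the pattern."""
    items = [item for itemset in pattern for item in itemset]
    return sum(weights[item] for item in items) / len(items)


def weighted_support(
    database: List[Sequence], pattern: Sequence, weights: Dict[str, float]
) -> float:
    """Assumed definition: pattern weight multiplied by plain support."""
    return pattern_weight(pattern, weights) * support(database, pattern)


# Hypothetical toy data: three sequences and per-item weights.
database: List[Sequence] = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a"}, {"c"}],
    [{"b"}, {"a"}, {"c"}],
]
weights = {"a": 0.9, "b": 0.2, "c": 0.6, "d": 0.4}

frequent_heavy = [{"a"}, {"c"}]  # appears in all three sequences
rare_light = [{"b"}, {"d"}]      # appears only in the first sequence

print(weighted_support(database, frequent_heavy, weights))
print(weighted_support(database, rare_light, weights))
```

Under these assumed definitions, a pattern made of high-weight items that occurs often scores well above a pattern made of low-weight items that occurs rarely, which is the distinction plain support counting cannot express.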
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
About this article
Cite this article
Islam, M.A., Rafi, M.R., Azad, Aa. et al. Weighted frequent sequential pattern mining. Appl Intell 52, 254–281 (2022). https://doi.org/10.1007/s10489-021-02290-w
DOI: https://doi.org/10.1007/s10489-021-02290-w