Abstract
The main purpose of data mining is to extract hidden, important, and nontrivial information from a database. Sequential Pattern Mining is a data mining technique that aims to obtain and analyze frequent subsequences from sequences of events or items, with or without time constraints. The importance of a sequence can be measured by different factors, such as its frequency of occurrence, its length, and its profit. Pattern mining, the discovery of important and unexpected patterns and information, was first introduced in 1990 with the well-known Apriori algorithm. Then, after many studies on frequent pattern mining, a new approach appeared: Sequential Pattern Mining. In 1995, Agrawal et al. introduced a new Apriori algorithm supporting time constraints. The algorithm studied transactions through time in order to extract frequent patterns from the sequences of products related to a customer. Later, this technique became useful in many applications: DNA research, medical diagnosis and prevention, telecommunications, and so on. Other advanced algorithms and their extensions have appeared since then, such as GSP (1996), SPADE (2001), PrefixSpan (2001), SPAM (2002), CM-SPADE (2014), and CM-SPAM (2014) for the Sequential Mining process; ERMiner (2015) and RuleGrowth (2011) for mining Sequential Rules; and CPT (2013) and CPT+ (2015) for Sequence Prediction. Reviewing the evolution of sequential data mining techniques, this chapter discusses the multiple extensions of Sequential Pattern Mining algorithms and classifies them into Sequential Pattern Mining, Sequential Rule Mining, and Sequence Prediction. It elaborates on the different classes and some of their extensions.
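To make the notion of a frequent subsequence concrete, the following is a minimal Python sketch. It is not taken from the chapter and does not reproduce any of the named algorithms. It assumes each sequence is a plain list of single items (real databases often use itemsets per event), that support is the number of sequences containing the pattern in order with gaps allowed, and that the names `is_subsequence`, `support`, `frequent_patterns`, `min_support`, and `max_length` are illustrative only.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if the items of `pattern` appear in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator up to the first match,
    # so later pattern items must be found after earlier ones.
    return all(item in remaining for item in pattern)


def support(pattern: Sequence[str], database: List[Sequence[str]]) -> int:
    """Number of sequences in the database that contain `pattern`."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def frequent_patterns(
    database: List[Sequence[str]], min_support: int, max_length: int = 3
) -> Dict[Pattern, int]:
    """Level-wise brute-force search (an assumption of this sketch).

    Only frequent patterns are extended, because a pattern cannot be
    frequent unless its prefix is frequent.
    """
    items = sorted({item for seq in database for item in seq})
    frequent: Dict[Pattern, int] = {}
    candidates: List[Pattern] = [(item,) for item in items]
    for _ in range(max_length):
        survivors: List[Pattern] = []
        for candidate in candidates:
            count = support(candidate, database)
            if count >= min_support:
                frequent[candidate] = count
                survivors.append(candidate)
        # Extend each surviving pattern by one item to form the next level.
        candidates = [p + (item,) for p in survivors for item in items]
        if not candidates:
            break
    return frequent


if __name__ == "__main__":
    # Hypothetical toy database: one list of purchased items per customer.
    db = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"], ["a", "b"]]
    for pattern, count in frequent_patterns(db, min_support=3).items():
        print(pattern, count)
```

With the toy database above and a minimum support of 3, the frequent patterns are the single items and the sequence `a` followed by `c`. Algorithms such as GSP, SPADE, or PrefixSpan pursue the same goal with far more efficient candidate generation and database representations.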
References
A.P. Wright, A.T. Wright, A.B. McCoy and D.F. Sittig, “The use of sequential pattern mining to predict next prescribed medications”. Journal of biomedical informatics, 53, pp.73–80, 2015.
G. Bruno and P. Garza, “Temporal pattern mining for medical applications”. Data Mining: Foundations and Intelligent Paradigms, pp.9–18. 2012
K. Uragaki, T. Hosaka, Y. Arahori, M. Kushima, T. Yamazaki, K. Araki and H. Yokota, “Sequential pattern mining on electronic medical records with handling time intervals and the efficacy of medicines”. In IEEE Symposium on Computers and Communication (ISCC), (pp. 20–25). IEEE. 2016.
K. Choi, S. Chung, H. Rhee and Y. Suh, “Classification and sequential pattern analysis for improving managerial efficiency and providing better medical service in public healthcare centers”. Healthcare informatics research, 16(2), pp.67–76, 2010.
C. Bou Rjeily, G. Badr, A. Hajjam El Hassani and E. Andres, “Sequence Prediction Algorithm for Heart Failure Prediction”, International Conference e-Health, ISBN: 978-989-8533-65-4, pp.109–116, 2017.
C. Bou Rjeily, G. Badr, A. Hajjam El Hassani and E. Andres, “Predicting Heart Failure Class using a Sequence Prediction Algorithm”, Fourth International Conference on Advances in Biomedical Engineering (ICABME), 2017
R. Agrawal, T. Imieliński, and A. Swami, “Mining association rules between sets of items in large databases”, In ACM SIGMOD Record (Vol. 22, No. 2, pp. 207–216). ACM, 1993.
P. Fournier-Viger, U. Faghihi, R. Nkambou, and E. Mephu Nguifo, “CMRules: Mining Sequential Rules Common to Several Sequences”, Knowledge-Based Systems, Elsevier, 25(1), pp. 63–76, 2012.
J. Han, J. Pei, Y. Yin, and R. Mao, “Mining frequent patterns without candidate generation: A frequent-pattern tree approach”, Data mining and knowledge discovery, 8(1), pp.53–87, 2004.
M.J. Zaki, “SPADE: An efficient algorithm for mining frequent sequences”, Machine learning, 42(1–2), pp.31–60, 2001.
R. Srikant, and R. Agrawal, “Mining sequential patterns: Generalizations and performance improvements”, In International Conference on Extending Database Technology (pp.1–17). Springer Berlin Heidelberg, 1996.
J. Ayres, J. Flannick, J. Gehrke, and T. Yiu, “Sequential pattern mining using a bitmap representation”, In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 429–435). ACM, 2002.
J. Han, J. Pei, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M.C. Hsu, “PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth”, In Proceedings of the 17th international conference on data engineering, pp. 215–224, 2001.
Z. Yang, Y. Wang, and M. Kitsuregawa, “LAPIN: effective sequential pattern mining algorithms by last position induction for dense databases”, In International Conference on Database systems for advanced applications (pp. 1020–1023). Springer Berlin Heidelberg, 2007.
P. Fournier-Viger, A. Gomariz, M. Campos, and R. Thomas, “Fast vertical mining of sequential patterns using co-occurrence information”, In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 40–52). Springer International Publishing, 2014.
X. Yan, J. Han, and R. Afshar, “CloSpan: Mining Closed Sequential Patterns in Large Datasets”, Proceedings of the 2003 SIAM International Conference on Data Mining, 2003.
J. Wang, and J. Han, “BIDE: Efficient mining of frequent closed sequences”, In Data Engineering, 2004. Proceedings. 20th International Conference on (pp. 79–90). IEEE, 2004.
A. Gomariz, M. Campos, R. Marin, and B. Goethals, “ClaSP: An efficient algorithm for mining frequent closed sequences”, In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 50–61). Springer Berlin Heidelberg, 2013.
P. Fournier-Viger, C.W. Wu, and V.S. Tseng, “Mining maximal sequential patterns without candidate maintenance”, In International Conference on Advanced Data Mining and Applications (pp. 169–180). Springer Berlin Heidelberg, 2013.
P. Fournier-Viger, C.W. Wu, A. Gomariz, and V.S. Tseng, “VMSP: Efficient vertical mining of maximal sequential patterns”, In Canadian Conference on Artificial Intelligence (pp. 83–94). Springer International Publishing, 2014.
H.T. Lam, F. Mörchen, D. Fradkin, and T. Calders, “Mining compressing sequential patterns”, Statistical Analysis and Data Mining, 7(1), pp.34–52, 2014.
P. Tzvetkov, X. Yan, and J. Han, “TSP: Mining Top-k Closed Sequential Patterns”, Knowledge and Information Systems, vol. 7, no. 4, pp. 438–457, 2005.
P. Fournier-Viger, A. Gomariz, T. Gueniche, E. Mwamikazi, and R. Thomas, “TKS: efficient mining of top-k sequential patterns”, In International Conference on Advanced Data Mining and Applications (pp. 109–120). Springer Berlin Heidelberg, 2013.
J. Deogun, and L. Jiang, “Prediction mining–an approach to mining association rules for prediction”, In International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing (pp. 98–108). Springer Berlin Heidelberg, 2005.
P. Fournier-Viger, R. Nkambou, and V.S.M. Tseng, “RuleGrowth: mining sequential rules common to several sequences by pattern-growth”, In Proceedings of the 2011 ACM symposium on applied computing (pp. 956–961), 2011.
P. Fournier-Viger, T. Gueniche, S. Zida, and V.S. Tseng, “ERMiner: sequential rule mining using equivalence classes”, In International Symposium on Intelligent Data Analysis (pp. 108–119). Springer International Publishing, 2014.
P. Fournier-Viger, and V.S. Tseng, “Mining top-k sequential rules”, In International Conference on Advanced Data Mining and Applications (pp. 180–194). Springer Berlin Heidelberg, 2011.
P. Fournier-Viger, and V.S. Tseng, “TNS: mining top-k non-redundant sequential rules”, In Proceedings of the 28th Annual ACM Symposium on Applied Computing, 2013.
P. Fournier-Viger, C.W. Wu, V.S. Tseng, and R. Nkambou, “Mining sequential rules common to several sequences with the window size constraint”, In Canadian Conference on Artificial Intelligence (pp. 299–304). Springer Berlin Heidelberg, 2012.
T. Gueniche, P. Fournier-Viger, and V.S. Tseng, “Compact prediction tree: A lossless model for accurate sequence prediction”, In International Conference on Advanced Data Mining and Applications (pp. 177–188). Springer Berlin Heidelberg, 2013.
J. Cleary, I. Witten, “Data compression using adaptive coding and partial string matching”, IEEE Trans. on Inform. Theory, vol. 24, no. 4, pp. 413–421, 1984.
V.N. Padmanabhan, J.C. Mogul, “Using Prefetching to Improve World Wide Web Latency”, Computer Communications, vol. 16, pp. 358–368, 1998.
J. Pitkow, P. Pirolli, “Mining longest repeating subsequence to predict world wide web surfing”, In: USENIX Symposium on Internet Technologies and Systems, Boulder, CO, pp. 13–25, 1999.
T. Gueniche, P. Fournier-Viger, R. Raman, and V.S. Tseng, “CPT+: Decreasing the time/space complexity of the Compact Prediction Tree”, In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 625–636). Springer International Publishing, 2015.
N.R. Mabroukeh, and C.I. Ezeife, “A taxonomy of sequential pattern mining algorithms”. ACM Computing Surveys (CSUR), 43(1), p.3, 2010.
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Bou Rjeily, C., Badr, G., Al Hassani, A.H., Andres, E. (2018). Overview on Sequential Mining Algorithms and Their Extensions. In: Alja’am, J., El Saddik, A., Sadka, A. (eds) Recent Trends in Computer Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-89914-5_1
DOI: https://doi.org/10.1007/978-3-319-89914-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-89913-8
Online ISBN: 978-3-319-89914-5
eBook Packages: Computer Science, Computer Science (R0)