Overview on Sequential Mining Algorithms and Their Extensions

Bou Rjeily, Carine; Badr, Georges; Al Hassani, Amir Hajjam; Andres, Emmanuel

doi:10.1007/978-3-319-89914-5_1

Abstract

The main purpose of data mining is to extract hidden, important, and nontrivial information from a database. Sequential Pattern Mining is a data mining technique that aims to obtain and analyze frequent subsequences from sequences of events or items, with or without time constraints. The importance of a sequence can be measured by factors such as its frequency of occurrence, its length, and its profit. Pattern mining, the discovery of important and unexpected patterns and information, was introduced in the early 1990s with the well-known Apriori algorithm. After many studies on frequent pattern mining, a new approach appeared: Sequential Pattern Mining. In 1995, Agrawal et al. introduced a new Apriori-based algorithm supporting time constraints. It studied transactions over time in order to extract frequent patterns from the sequences of products associated with each customer. This technique later proved useful in many applications, including DNA research, medical diagnosis and prevention, and telecommunications. Other advanced algorithms and their extensions have appeared since then: GSP (1996), SPADE (2001), PrefixSpan (2001), SPAM (2002), CM-SPADE (2014), and CM-SPAM (2014) for sequential pattern mining; RuleGrowth (2011) and ERMiner (2014) for sequential rule mining; and CPT (2013) and CPT+ (2015) for sequence prediction. Reviewing the evolution of sequential data mining techniques, this chapter discusses the multiple extensions of Sequential Pattern Mining algorithms and classifies them into three classes: Sequential Pattern Mining, Sequential Rule Mining, and Sequence Prediction. It then elaborates on each class and some of its extensions.
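
As a rough illustration of what "frequent subsequence" means in the abstract, the following Python sketch counts the support of a pattern (the number of sequences that contain it in order, gaps allowed) and lists the frequent two-item patterns by brute force. It is not any of the algorithms named above. The function names, the toy database, and the simplification that each event is a single item rather than an itemset are all assumptions made for this example.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Item = Hashable


def is_subsequence(pattern: Sequence[Item], sequence: Sequence[Item]) -> bool:
    """Return True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is respected.
    return all(item in remaining for item in pattern)


def support(pattern: Sequence[Item], database: List[Sequence[Item]]) -> int:
    """Number of sequences in the database that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_pairs(
    database: List[Sequence[Item]], min_support: int
) -> Dict[Tuple[Item, Item], int]:
    """Brute-force search for length-2 patterns meeting `min_support` (illustration only)."""
    items = sorted({item for seq in database for item in seq})
    result = {}
    for first in items:
        for second in items:
            count = support((first, second), database)
            if count >= min_support:
                result[(first, second)] = count
    return result


# Hypothetical toy database: one purchase sequence per customer.
database = [
    ["bread", "milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "bread", "eggs"],
    ["milk", "eggs", "bread"],
]

print(support(("bread", "eggs"), database))
print(frequent_pairs(database, min_support=3))
```

Enumerating every candidate pair, as this sketch does, scales poorly with the number of items. That cost is the motivation for the pruning and projection strategies used by the algorithms listed in the abstract.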

Author information

Authors and Affiliations

Nanomedicine Lab, Université de Bourgogne Franche-Comté, Belfort, France
Carine Bou Rjeily & Amir Hajjam Al Hassani
TICKET Lab, Antonine University, Baabda, Lebanon
Georges Badr
Université de Strasbourg, Centre Hospitalier Universitaire, Strasbourg, France
Emmanuel Andres

Corresponding author

Correspondence to Georges Badr.

Editor information

Editors and Affiliations

Computer Science and Engineering, Qatar University, Doha, Qatar
Jihad Mohamad Alja’am
Faculty of Engineering, University of Ottawa, Ottawa, ON, Canada
Abdulmotaleb El Saddik
Electronic and Computer Engineering, Brunel University London, Uxbridge, UK
Abdul Hamid Sadka

About this paper

Cite this paper

Bou Rjeily, C., Badr, G., Al Hassani, A.H., Andres, E. (2018). Overview on Sequential Mining Algorithms and Their Extensions. In: Alja’am, J., El Saddik, A., Sadka, A. (eds) Recent Trends in Computer Applications. Springer, Cham. https://doi.org/10.1007/978-3-319-89914-5_1

DOI: https://doi.org/10.1007/978-3-319-89914-5_1
Published: 20 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-89913-8
Online ISBN: 978-3-319-89914-5
eBook Packages: Computer Science, Computer Science (R0)
