Frequent Item Set, Sequential Pattern Mining and Sequence Prediction: Structures and Algorithms

Mukherjee, Soumonos; Rajkumar, R.

doi:10.1007/978-981-15-0633-8_21

Part of the book series: Algorithms for Intelligent Systems (AIS)

Abstract

The basic objective of data mining and data analytics is to discover hidden, complex patterns in massive volumes of data that conventional statistical methods cannot process in raw form. This motivates our field of research: mining item sets and patterns from various forms of data. Transaction databases are used mainly in worldwide e-commerce, product recommendation systems, ranking and indexing systems, search results and text analysis. In such databases, a typical task is to find the frequent, unopted and interesting item sets in the huge volume of transactions made by customers or benefactors. This brings us to the study of frequent item set mining and its recent developments, such as high-utility, fuzzy and uncertain item set mining. In several other research areas, such as e-learning and bioinformatics, one of the most important trends is finding patterns in sequential databases. This leads to the adjacent topic of sequential pattern mining algorithms. The paper aims to illustrate and explain the gradual development of sequential pattern mining and transactional item set mining, to identify the functionalities and limitations of various algorithms, and to survey recent studies and advances in pattern mining and sequence prediction, which is a key research area for consumer behaviour analysis, stock market prediction, product tendency analysis, genetic disease prediction, evolutionary computational biology, cognition analytics and many other actively studied fields in today's multidisciplinary world of science.

Author information

Authors and Affiliations

Data Science and Analytics, EPITA (École Pour l’Informatique et les Techniques Avancées), Paris, France
Soumonos Mukherjee
School of Computer Science and Engineering, VIT, Vellore, India
R. Rajkumar

Corresponding author

Correspondence to Soumonos Mukherjee.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs, Gwalior, India
Geetam Singh Tomar
Indian Institute of Technology Indore, Indore, Madhya Pradesh, India
Narendra S. Chaudhari
Applied Computing Graduate Program, University of Vale do Rio dos Sinos, Sao Leopoldo, Rio Grande do Sul, Brazil
Jorge Luis V. Barbosa
Department of Electronics and Communication Engineering, THDC-Institute of Hydropower Engineering and Technology, Tehri, Uttarakhand, India
Mahesh Kumar Aghwariya

About this paper

Cite this paper

Mukherjee, S., Rajkumar, R. (2020). Frequent Item Set, Sequential Pattern Mining and Sequence Prediction: Structures and Algorithms. In: Singh Tomar, G., Chaudhari, N.S., Barbosa, J.L.V., Aghwariya, M.K. (eds) International Conference on Intelligent Computing and Smart Communication 2019. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-0633-8_21

DOI: https://doi.org/10.1007/978-981-15-0633-8_21
Published: 20 December 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0632-1
Online ISBN: 978-981-15-0633-8
eBook Packages: Intelligent Technologies and Robotics (R0)
