
Frequent Item Set, Sequential Pattern Mining and Sequence Prediction: Structures and Algorithms

  • Conference paper
International Conference on Intelligent Computing and Smart Communication 2019

Part of the book series: Algorithms for Intelligent Systems (AIS)

Abstract

The basic objective of data mining and data analytics is to discover hidden, complex patterns in massive volumes of data that conventional statistical methods cannot process in raw form. This motivates our field of research: mining item sets and patterns from various forms of data. Transaction databases are used mainly in worldwide e-commerce, product recommendation systems, ranking and indexing systems, search results and text analysis. In such databases, a typical task is to find the frequent, unopted and interesting item sets in the huge volume of transactions made by customers or benefactors. This brings us to the study of frequent item set mining and its recent developments, such as high utility, fuzzy and uncertain item set mining. In several other kinds of research, such as e-learning and bioinformatics, one of the most important emerging trends is finding patterns in sequential databases. This leads us to the adjacent topic: sequential pattern mining algorithms. The paper aims to illustrate and explain the gradual developments in the mining of sequential patterns and transactional item sets, to identify the functionalities and limitations of various algorithms, and to describe recent studies and advancements in pattern mining and sequence prediction. Sequence prediction is a key research area for consumer behaviour analysis, stock market prediction, product tendency analysis, genetic disease prediction, evolutionary computational biology, cognition analytics and many other actively pursued research fields in today's multidisciplinary world of science.
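To make the two central notions of the abstract concrete, the following is a minimal sketch, not any algorithm from the paper. It counts item set support in a transaction database by brute-force enumeration (not Apriori or FP-growth), and counts how many sequences contain a pattern as an ordered subsequence. The function names, the `max_size` cap and the toy data are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: count} for item sets found in at least
    min_support transactions (brute-force count, illustrative only)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates, fix an order
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def sequence_support(sequences, pattern):
    """Count sequences containing pattern as an ordered,
    not necessarily contiguous, subsequence."""
    def contains(sequence):
        remaining = iter(sequence)
        # 'symbol in remaining' advances the iterator, so order is enforced
        return all(symbol in remaining for symbol in pattern)
    return sum(contains(s) for s in sequences)


# Hypothetical toy data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
sequences = [["a", "b", "c"], ["a", "c"], ["b", "a"]]

itemsets = frequent_itemsets(transactions, min_support=2)
support_ac = sequence_support(sequences, ["a", "c"])
```

With this toy data every single item occurs in three transactions and every pair in two, so all six item sets pass a minimum support of 2. The pattern a, c is supported by two of the three sequences. Enumerating every combination grows exponentially with transaction length, which is the cost that candidate-pruning and pattern-growth algorithms are designed to avoid.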



Author information


Corresponding author

Correspondence to Soumonos Mukherjee.


Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Mukherjee, S., Rajkumar, R. (2020). Frequent Item Set, Sequential Pattern Mining and Sequence Prediction: Structures and Algorithms. In: Singh Tomar, G., Chaudhari, N.S., Barbosa, J.L.V., Aghwariya, M.K. (eds) International Conference on Intelligent Computing and Smart Communication 2019. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-0633-8_21
