Mining frequent Itemsets from transaction databases using hybrid switching framework

Jashma Suresh, P.P; Dinesh Acharya, U; Reddy, N.V. Subba

doi:10.1007/s11042-023-14484-0

Published: 16 February 2023

Volume 82, pages 27571–27591, (2023)

Multimedia Tools and Applications

P.P Jashma Suresh¹,
U Dinesh Acharya¹ &
N.V. Subba Reddy²

Abstract

With the growing volume of data, mining frequent itemsets remains of paramount importance. Frequent itemsets have applications in domains such as market basket analysis, clustering, classification, software bug detection, and web mining. In recent years, several data structures have been employed to mine frequent itemsets. Unfortunately, many of them are inefficient in run time or memory. This has led to the design of hybrid frameworks that combine two or more data structures to extract frequent itemsets, exploiting the benefits of the different data structures while minimizing their drawbacks. This paper combines a tree-based data structure named NegNodesets with the list-based structure N-list to develop a novel hybrid framework for mining frequent itemsets. NegNodesets have the advantage of employing bitmaps to generate a concise representation of itemsets. The N-list structure, on the other hand, relies on a list-based intersection operation for generating frequent itemsets, which is much faster than other conventional approaches. Transaction merging is used in this work to reduce run time by merging several transactions into a single itemset. A switching criterion based on the length of the nodelist is used to switch between the algorithms. The efficiency of the approach is further improved by a hash-based mechanism for generating the final set of frequent itemsets. The algorithms are implemented in Java. A simulation analysis evaluates the run time and memory consumption of the proposed approach and compares them with those of some existing approaches. The comparative analysis shows that the proposed NPLengthSwitch consumes less memory and run time than the other techniques.
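
To make the transaction-merging idea concrete, the following is a minimal Java sketch, not the authors' implementation. The abstract does not state the exact merging rule, so the sketch assumes the simplest variant: transactions with identical item sets are collapsed into one entry that carries a multiplicity, and item supports are then counted on the merged database. The class name `TransactionMergingSketch`, the methods `mergeTransactions` and `frequentItems`, and the sample data are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class TransactionMergingSketch {

    // Assumed merging rule: transactions with the same item set collapse
    // into a single entry whose value is the number of occurrences.
    static Map<TreeSet<String>, Integer> mergeTransactions(List<List<String>> db) {
        Map<TreeSet<String>, Integer> merged = new LinkedHashMap<>();
        for (List<String> transaction : db) {
            // TreeSet normalizes item order, so {milk, bread} equals {bread, milk}.
            merged.merge(new TreeSet<>(transaction), 1, Integer::sum);
        }
        return merged;
    }

    // Count the support of each single item on the merged database and keep
    // only the items whose support reaches minSupport.
    static Map<String, Integer> frequentItems(Map<TreeSet<String>, Integer> merged,
                                              int minSupport) {
        Map<String, Integer> support = new TreeMap<>();
        for (Map.Entry<TreeSet<String>, Integer> entry : merged.entrySet()) {
            for (String item : entry.getKey()) {
                // Each merged entry contributes its multiplicity, not 1.
                support.merge(item, entry.getValue(), Integer::sum);
            }
        }
        support.values().removeIf(count -> count < minSupport);
        return support;
    }

    public static void main(String[] args) {
        List<List<String>> db = List.of(
            List.of("bread", "milk"),
            List.of("milk", "bread"),
            List.of("bread", "butter"),
            List.of("milk"),
            List.of("bread", "milk"));

        Map<TreeSet<String>, Integer> merged = mergeTransactions(db);
        System.out.println(merged);                    // five transactions become three entries
        System.out.println(frequentItems(merged, 2));  // items with support of at least 2
    }
}
```

In this sketch the support-counting loop visits three merged entries instead of five raw transactions, which is the kind of saving merging is meant to provide; how much a real dataset benefits depends on how many transactions coincide.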

Data availability

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
P.P Jashma Suresh & U Dinesh Acharya
Department of Information Technology, Manipal Institute of Technology Bengaluru, Manipal Academy of Higher Education, Bengaluru, Karnataka, 560064, India
N.V. Subba Reddy

Corresponding author

Correspondence to U Dinesh Acharya.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jashma Suresh, P., Dinesh Acharya, U. & Reddy, N.S. Mining frequent Itemsets from transaction databases using hybrid switching framework. Multimed Tools Appl 82, 27571–27591 (2023). https://doi.org/10.1007/s11042-023-14484-0

Received: 13 September 2021
Revised: 21 April 2022
Accepted: 31 January 2023
Published: 16 February 2023
Issue Date: July 2023
DOI: https://doi.org/10.1007/s11042-023-14484-0
