Discovery of closed high utility itemsets using a fast nature-inspired ant colony algorithm

Applied Intelligence

Abstract

High utility itemset mining (HUIM) from a large database is a crucial descriptive task in data mining that considers both the quantity and the unit profit of items in order to reveal the most profitable itemsets. However, it may discover a vast number of high utility itemsets (HUIs), which can be difficult for a user to interpret and which also reduces the efficiency of the mining process. A solution to this problem is to mine closed high utility itemsets (CHUIs), a more compact and lossless representation of HUIs. In this paper, a fast nature-inspired meta-heuristic approach, CHUI-AC (closed high utility itemset mining using an ant colony algorithm), is introduced to mine CHUIs. This is the first work on mining CHUIs using a nature-inspired ant colony algorithm. CHUI-AC maps the feasible solution space to a directed graph with quadratic space complexity to guide the search efficiently. Several experiments on real-world datasets show that the proposed algorithm outperforms the state-of-the-art algorithms in terms of execution time and rate of convergence. Moreover, the scalability experiments demonstrate that CHUI-AC scales linearly with respect to the number of transactions and the number of items.
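
To make the terms in the abstract concrete, the following is a minimal sketch of what a closed high utility itemset is. It assumes the standard definitions: the utility of an itemset is the sum of quantity times unit profit over the transactions that contain it, and an itemset is closed when no proper superset has the same support. The toy database, the profit table and all function names are hypothetical, and the search is a brute-force enumeration for illustration only; it is not the CHUI-AC algorithm.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "b": 1},
    {"a": 1, "b": 1, "c": 3},
    {"c": 2, "d": 1},
]
# Hypothetical unit profit of each item.
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1, "d": 4}


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset.issubset(t))


def utility(itemset, transactions, unit_profit):
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    return sum(
        t[i] * unit_profit[i]
        for t in transactions
        if itemset.issubset(t)
        for i in itemset
    )


def closed_high_utility_itemsets(transactions, unit_profit, min_util):
    """Brute-force enumeration (exponential in the number of items).

    Assumes min_util > 0, so every reported itemset has non-zero support.
    """
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            x = frozenset(combo)
            u = utility(x, transactions, unit_profit)
            if u < min_util:
                continue  # not a high utility itemset
            sup = support(x, transactions)
            # Closed: adding any single extra item strictly lowers the support.
            is_closed = all(
                support(x | {i}, transactions) < sup
                for i in items
                if i not in x
            )
            if is_closed:
                result[x] = u
    return result


chuis = closed_high_utility_itemsets(TRANSACTIONS, UNIT_PROFIT, min_util=15)
for itemset, u in chuis.items():
    print(sorted(itemset), u)
```

With a minimum utility of 15, the HUIs in this toy database are {a} (utility 20), {a, b} (28) and {a, b, c} (20). The itemset {a} is not closed because {a, b} occurs in exactly the same three transactions, so only {a, b} and {a, b, c} are reported. The exponential cost of this enumeration is the kind of search cost that motivates heuristic approaches.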

Notes

  1. The code can be found at https://github.com/chuim-ac/CHUI-AC

  2. The code for the other compared models can be found at https://github.com/chuim-ac/SPMF

  3. The dataset can be found at https://www.kaggle.com/irfanasrullah/groceries

  4. This dataset can be found at https://www.kaggle.com/sulmansarwar/transactions-from-a-bakery

Acknowledgements

This study has been supported by the Indian Institute of Technology, Kharagpur, India.

Author information

Corresponding author

Correspondence to Subhadip Pramanik.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Pramanik, S., Goswami, A. Discovery of closed high utility itemsets using a fast nature-inspired ant colony algorithm. Appl Intell 52, 8839–8855 (2022). https://doi.org/10.1007/s10489-021-02922-1

  • DOI: https://doi.org/10.1007/s10489-021-02922-1
