Soft Computing, Volume 21, Issue 17, pp 5103–5121

A binary PSO approach to mine high-utility itemsets

  • Jerry Chun-Wei Lin
  • Lu Yang
  • Philippe Fournier-Viger
  • Tzung-Pei Hong
  • Miroslav Voznak
Methodologies and Application

Abstract

High-utility itemset mining (HUIM) has become an important topic in recent years because, unlike frequent itemset mining (FIM) or association-rule mining (ARM), it reveals profitable products by considering both quantity and profit. Several algorithms have been presented to mine high-utility itemsets (HUIs), and most of them must handle an exponential search space when the number of distinct items and the size of the database are very large. Previously, a heuristic algorithm, HUPE\(_{\mathrm{umu}}\)-GRAM, was proposed to mine HUIs based on a genetic algorithm (GA). Among evolutionary computation (EC) techniques, particle swarm optimization (PSO) requires fewer parameters than GA-based approaches. Because traditional PSO is designed for continuous problems, this paper adopts discrete PSO, which encodes the particles as binary variables. An efficient PSO-based algorithm, HUIM-BPSO, is proposed to find HUIs. Based on the transaction-weighted utility (TWU) model, HUIM-BPSO first finds the high-transaction-weighted utilization 1-itemsets (1-HTWUIs) and sets the particle size to their number, which greatly reduces the combinatorial problem in the evolution process. The sigmoid function is adopted in the particle updating process. An OR/NOR-tree structure is further developed to reduce the invalid combinations for discovering HUIs. Substantial experiments on real-life datasets show that the proposed algorithm outperforms the other heuristic algorithms for mining HUIs in terms of execution time, number of discovered HUIs, and convergence.
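
To make the two mechanisms named in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It computes TWU values on a toy database, keeps the 1-HTWUIs that fix the particle length, and applies one sigmoid-based binary PSO update in the style of Kennedy and Eberhart's discrete PSO. The transactions, unit profits, threshold, and the parameters `w`, `c1`, `c2`, and `vmax` are all illustrative assumptions, and the OR/NOR-tree pruning is not modeled.

```python
import math
import random

# Toy data (hypothetical): each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
    {"c": 2, "d": 1},
]
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 3}  # hypothetical unit profits


def transaction_utility(tx):
    """Total utility of one transaction: sum of quantity * unit profit."""
    return sum(q * unit_profit[i] for i, q in tx.items())


def twu(item):
    """Transaction-weighted utility: summed utility of transactions containing the item."""
    return sum(transaction_utility(tx) for tx in transactions if item in tx)


def one_htwuis(min_util):
    """Items whose TWU reaches the threshold; their count is the particle length."""
    return sorted(i for i in unit_profit if twu(i) >= min_util)


def utility(itemset):
    """Fitness of an itemset: its utility over transactions that contain every item."""
    return sum(
        sum(tx[i] * unit_profit[i] for i in itemset)
        for tx in transactions
        if all(i in tx for i in itemset)
    )


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def update_particle(position, velocity, pbest, gbest, rng,
                    w=1.0, c1=2.0, c2=2.0, vmax=4.0):
    """One binary PSO step (assumed parameters): velocities stay real-valued,
    and each bit is set to 1 with probability sigmoid(velocity)."""
    new_pos, new_vel = [], []
    for x, v, p, g in zip(position, velocity, pbest, gbest):
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-vmax, min(vmax, v))  # clamp so sigmoid does not saturate
        new_vel.append(v)
        new_pos.append(1 if rng.random() < sigmoid(v) else 0)
    return new_pos, new_vel


if __name__ == "__main__":
    items = one_htwuis(min_util=20)          # bit j of a particle stands for items[j]
    rng = random.Random(0)
    pos = [rng.randint(0, 1) for _ in items]
    vel = [0.0] * len(items)
    pos, vel = update_particle(pos, vel, pbest=pos, gbest=pos, rng=rng)
    chosen = [i for i, bit in zip(items, pos) if bit]
    print(items, pos, utility(chosen) if chosen else 0)
```

In this sketch a particle is a bit vector over the 1-HTWUIs only, so items whose TWU falls below the threshold never enter the search, which is the reduction in particle size the abstract describes.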

Keywords

Binary PSO; OR/NOR-tree; Discrete PSO; High-utility itemsets; TWU model

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Jerry Chun-Wei Lin (1)
  • Lu Yang (1)
  • Philippe Fournier-Viger (2)
  • Tzung-Pei Hong (3, 4)
  • Miroslav Voznak (5)

  1. School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
  2. School of Natural Sciences and Humanities, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
  3. Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan, ROC
  4. Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan, ROC
  5. Department of Telecommunications, Faculty of Electrical Engineering and Computer Science, VSB Technical University of Ostrava, Ostrava-Poruba, Czech Republic
