Applied Intelligence, Volume 48, Issue 5, pp 1148–1160

ETARM: an efficient top-k association rule mining algorithm

  • Linh T. T. Nguyen
  • Bay Vo
  • Loan T. T. Nguyen
  • Philippe Fournier-Viger
  • Ali Selamat

Abstract

Mining association rules plays an important role in data mining and knowledge discovery since it can reveal strong associations between items in databases. Nevertheless, an important problem with traditional association rule mining methods is that they can generate a huge number of association rules, depending on how their parameters are set. Users, however, are often interested only in the strongest rules, and do not want to go through a large number of rules or wait for all of them to be generated. To address these needs, algorithms have been proposed to mine the top-k association rules in databases, where users directly set a parameter k to obtain the k most frequent rules. A major issue with these techniques is that they remain very costly in terms of execution time and memory. To address this issue, this paper presents a novel algorithm named ETARM (Efficient Top-k Association Rule Miner) to efficiently find the complete set of top-k association rules. The proposed algorithm integrates two novel candidate pruning properties to reduce the search space more effectively. These properties are applied during the candidate selection process to identify, based on confidence, items that should not be used to expand a rule, which reduces the number of candidates. An extensive experimental evaluation on six standard benchmark datasets shows that the proposed approach outperforms the state-of-the-art TopKRules algorithm in terms of both runtime and memory usage.
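
To make the problem statement concrete, the following is a minimal brute-force sketch of what "the k most frequent rules" can mean. It is not ETARM and not TopKRules: it enumerates every rule without any pruning, so it is only usable on tiny databases. The function names, the `minconf` parameter (a minimum confidence threshold, assumed here because the abstract says pruning is based on confidence), and the deterministic tie-breaking are all assumptions made for illustration.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def top_k_rules(transactions, k, minconf):
    """Brute-force baseline (hypothetical helper, NOT the paper's algorithm).

    Enumerates every rule X -> Y with X and Y non-empty and disjoint,
    keeps rules whose confidence is at least `minconf`, and returns the
    k rules with the highest support as (support, confidence, X, Y).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            whole = frozenset(combo)
            sup = support_count(whole, transactions)
            if sup == 0:
                continue  # itemset never occurs, so no rule can be built from it
            # Every non-empty proper subset of the itemset is a possible antecedent.
            for r in range(1, size):
                for ante in combinations(combo, r):
                    antecedent = frozenset(ante)
                    conf = sup / support_count(antecedent, transactions)
                    if conf >= minconf:
                        consequent = whole - antecedent
                        rules.append((sup, conf,
                                      tuple(sorted(antecedent)),
                                      tuple(sorted(consequent))))
    # Highest support first; ties are broken by confidence, then alphabetically
    # (an arbitrary but deterministic choice made for this sketch).
    rules.sort(key=lambda rule: (-rule[0], -rule[1], rule[2], rule[3]))
    return rules[:k]


# Toy database, invented for illustration only.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
for sup, conf, x, y in top_k_rules(db, k=3, minconf=0.7):
    print(x, "->", y, "support:", sup, "confidence:", round(conf, 2))
```

The sketch examines every itemset and every way of splitting it into antecedent and consequent, which grows exponentially with the number of items. Pruning properties such as those the abstract describes aim to avoid generating most of these candidates in the first place.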

Keywords

Data mining; Association rule mining; Top-k association rules; Rule expansion

Notes

Acknowledgement

This work was carried out during the tenure of an ERCIM ‘Alain Bensoussan’ Fellowship Programme.

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Faculty of Information Technology, Dong An Polytechnic, Binh Duong, Vietnam
  2. Faculty of Information Technology, Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam
  3. Division of Knowledge and System Engineering for ICT, Ton Duc Thang University, Ho Chi Minh City, Vietnam
  4. Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
  5. School of Humanities and Social Sciences, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
  6. Universiti Teknologi Malaysia, Johor, Malaysia