A Set-Checking Algorithm for Mining Maximal Frequent Itemsets from Data Streams
Online mining of maximal frequent itemsets over data streams is an important problem in data mining. To mine maximal frequent itemsets from data streams under the Landmark Window model, Mao et al. proposed the INSTANT algorithm. INSTANT has a simple structure and saves considerable memory, but it takes a long time to mine the maximal frequent itemsets: whenever a new transaction arrives, it performs too many comparisons with the old transactions. Therefore, in this chapter, we propose the Set-Checking algorithm to mine maximal frequent itemsets from data streams under the Landmark Window model. We store our information in a lattice, which records the subset relationship between each child node and its parent node. Our simulation results show that the Set-Checking algorithm has a shorter processing time than the INSTANT algorithm.
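To make the target of the mining concrete, the following is a minimal brute-force sketch of what "maximal frequent itemsets under a landmark window" means. It is neither the Set-Checking algorithm nor INSTANT, and it uses no lattice. The function name `maximal_frequent_itemsets`, the toy stream, and the absolute support threshold `min_support` are assumptions made only for illustration.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force reference: enumerate every itemset, keep the frequent
    ones, then keep only those with no frequent proper superset."""
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            # Support = number of transactions that contain the itemset.
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                frequent.append(itemset)
    # An itemset is maximal if no other frequent itemset strictly contains it.
    return {f for f in frequent if not any(f < g for g in frequent)}


# Hypothetical toy stream; each transaction is a set of single-letter items.
stream = [frozenset("abc"), frozenset("abd"), frozenset("ab"), frozenset("cd")]

# Landmark window: every transaction since the landmark is retained.
window = []
for transaction in stream:
    window.append(transaction)

result = maximal_frequent_itemsets(window, min_support=2)
print(sorted("".join(sorted(s)) for s in result))  # ['ab', 'c', 'd']
```

Here {a, b} appears in three transactions and has no frequent superset, so it is maximal, while {a} and {b} are frequent but not maximal. The exhaustive enumeration is exponential in the number of items, which is the cost that incremental structures for data streams aim to avoid.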
Keywords: Data stream, Itemset, Landmark Window model, Lattice, Maximal frequent itemset
This research was supported in part by the National Science Council of the Republic of China under Grant No. NSC-101-2221-E-110-091-MY2.
- 1. Agrawal R, Srikant R (1994) Fast algorithm for mining association rules in large databases. In: 20th international conference on very large data bases. Morgan Kaufmann, San Francisco, pp 487–499
- 2. Li H, Zhang N (2010) Mining maximal frequent itemsets over a stream sliding window. In: IEEE youth conference on information computing and telecommunications. IEEE Press, New York, pp 110–113
- 3. Li JW, Lee GQ (2009) Mining frequent itemsets over data streams using efficient window sliding techniques. Int J Expert Syst Appl 36(2):1466–1477. Pergamon Press, New York
- 4. Lin KC, Liao IE, Chen ZS (2011) An improved frequent pattern growth method for mining association rules. Int J Expert Syst Appl 38(5):5154–5161. Pergamon Press, New York
- 5. Mao G, Wu X, Zhu X, Chen G, Liu C (2007) Mining maximal frequent itemsets from data streams. J Inf Sci 33(3):251–262. Sage, Thousand Oaks
- 6. Xin JW, Yang GQ, Sun JZ, Zhang YP (2006) A new algorithm for discovery maximal frequent itemsets based on binary vector sets. In: 5th international conference on machine learning and cybernetics. IEEE Press, New York, pp 1120–1124