Revised ECLAT Algorithm for Frequent Itemset Mining
Data mining is becoming increasingly important due to the availability of large amounts of data. Extracting important information from a data warehouse has become very tedious in some cases. Important applications of data mining include customer segmentation in marketing, demand analysis, campaign management, Web usage mining, text mining, and customer relationship management. Association rule mining is one of the important data mining techniques used for discovering meaningful patterns in huge collections of data. Frequent itemset mining plays an important role in mining association rules and in finding interesting patterns among complex data. Frequent itemset mining from "Big Data" is used to mine important patterns of item occurrence from large unstructured databases. Compared with traditional data warehousing techniques, the MapReduce methodology provides a distributed data mining process. Datasets come in two layouts: horizontal and vertical. The tree-based frequent pattern MapReduce algorithm is considered more efficient than other horizontal frequent itemset mining methods in terms of both memory and time complexity. Another algorithm, ECLAT, operates on a vertical dataset and is compared here with the proposed revised ECLAT algorithm. The proposed revised ECLAT algorithm improves on the performance of the ECLAT algorithm. This paper discusses the improved results and the reasons for the improvement.
Keywords: Big data analytics, MapReduce, ECLAT, BFS, FP tree, FP growth
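To make the distinction between horizontal and vertical datasets concrete, the following is a minimal Python sketch of the classic ECLAT idea: convert transactions into a vertical layout (item to set of transaction IDs) and grow itemsets by intersecting those sets. It is not the revised algorithm proposed in this paper and does not use MapReduce; the function names `to_vertical` and `eclat`, the depth-first traversal, and the sample transactions are assumptions made for illustration only.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Convert a horizontal layout (one item set per transaction)
    into a vertical layout (item -> set of transaction IDs)."""
    vertical = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)


def eclat(transactions, min_support):
    """Return {itemset tuple: support count} for all itemsets whose
    support count is at least min_support (classic depth-first ECLAT)."""
    vertical = to_vertical(transactions)
    frequent = {}

    def extend(prefix, candidates):
        # candidates: list of (item, tidset) pairs, all frequent together with prefix
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect tidsets with the remaining items to build longer itemsets
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_support:
                    suffix.append((other, common))
            if suffix:
                extend(itemset, suffix)

    # Start from frequent single items, in a fixed (alphabetical) order
    initial = sorted(
        ((item, tids) for item, tids in vertical.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), initial)
    return frequent


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the paper
    sample = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, support in eclat(sample, min_support=2).items():
        print(itemset, support)
```

In this sketch the support of an itemset is simply the size of its transaction-ID set, so no second pass over the original transactions is needed once the vertical layout has been built.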
Open Access This chapter is distributed under the terms of the Creative Commons Attribution Noncommercial License, which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.