An Enhancement of the MapReduce Apriori Algorithm Using Vertical Data Layout and Set Theory Concept of Intersection

  • S. Dhanya
  • M. Vysaakan
  • A. S. Mahesh
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 385)


Association rule generation is an important task in data mining and is widely used for market basket analysis. The data used to generate association rules is usually distributed and complex. Rule generation over such data can be implemented efficiently on the Hadoop framework, which processes large datasets at low cost and with good performance. We propose an efficient MapReduce Apriori algorithm, based on the Hadoop MapReduce model, that uses a vertical database layout and the set-theoretic concept of intersection.
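
To illustrate the general idea behind a vertical layout with intersection-based support counting, here is a minimal single-machine sketch in Python. It is not the authors' algorithm and does not use Hadoop or MapReduce. The function names `to_vertical` and `frequent_itemsets`, the absolute `min_support` threshold, and the sample transactions are all assumptions made for illustration. Each item maps to the set of transaction IDs that contain it, and the support of a larger itemset is the size of the intersection of its parents' ID sets.

```python
from itertools import combinations


def to_vertical(transactions):
    """Convert a horizontal layout {tid: items} into a vertical one {item: set of tids}."""
    vertical = {}
    for tid, items in transactions.items():
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset found in at least min_support transactions.

    Hypothetical helper for illustration only; min_support is an absolute count.
    """
    vertical = to_vertical(transactions)
    # Level 1: single items whose transaction-ID set is large enough.
    current = {
        frozenset([item]): tids
        for item, tids in vertical.items()
        if len(tids) >= min_support
    }
    result = {}
    while current:
        result.update({itemset: len(tids) for itemset, tids in current.items()})
        next_level = {}
        # Join pairs of frequent k-itemsets that differ in exactly one item.
        for (a, tids_a), (b, tids_b) in combinations(current.items(), 2):
            candidate = a | b
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            # Support comes from set intersection, with no rescan of the data.
            tids = tids_a & tids_b
            if len(tids) >= min_support:
                next_level[candidate] = tids
        current = next_level
    return result


# Sample data (made up for this sketch).
transactions = {
    1: {"bread", "milk"},
    2: {"bread", "butter"},
    3: {"bread", "milk", "butter"},
    4: {"milk"},
}
supports = frequent_itemsets(transactions, min_support=2)
```

In this sketch, the original transactions are read only once, when the vertical layout is built. Every later level works purely on the ID sets, which is the usual motivation for combining a vertical layout with intersection.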


Keywords: Apriori, Hadoop, MapReduce, Cloud computing, Association rule, Frequent itemset mining





Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Amrita School of Arts and Sciences, Amrita Vishwa Vidyapeetham, Kochi, India
