Design and Performance Analysis of Distributed Implementation of Apriori Algorithm in Grid Environment

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 248)

Abstract

This paper presents the design and performance analysis of a distributed implementation of the Apriori algorithm in a grid environment. Apriori is an important data mining algorithm that enables organizations to mine the large amounts of historical data they gather over time and to discover hidden patterns in that data. Data mining techniques enable organizations to analyze market trends and user behavior. When the data set to be mined is very large, adapting the basic algorithm for execution in a distributed environment makes sense, because distributed technologies generally offer performance benefits. Grids have gained wide popularity for executing tasks in a distributed fashion. In this paper we therefore implement a distributed version of the basic Apriori algorithm in a grid environment constructed using the Globus® Toolkit. Experimental results show that the distributed version offers performance benefits over the basic version of the Apriori algorithm and hence is a good implementation choice when the data to be mined is very large and distributed.
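
As an illustration of the general idea only, the following is a minimal sketch of Apriori in which support counting is split across data partitions and the partial counts are summed (often called count distribution). The abstract does not describe the paper's actual design, so the partitioning scheme, the function names (`local_counts`, `generate_candidates`, `distributed_apriori`) and the sample transactions are all assumptions; the partitions are processed in a plain loop on one machine, and no Globus Toolkit calls are involved.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, candidates):
    """Count, within one partition, the transactions containing each candidate."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune any candidate
    that has an infrequent (k-1)-subset (the Apriori property)."""
    candidates = set()
    for a in frequent:
        for b in frequent:
            union = a | b
            if len(union) == k and all(
                frozenset(subset) in frequent
                for subset in combinations(union, k - 1)
            ):
                candidates.add(union)
    return candidates


def distributed_apriori(partitions, min_support):
    """Return {itemset: global support count} for every frequent itemset.

    partitions  -- list of partitions, each a list of frozenset transactions
                   (hypothetical stand-in for data held on separate grid nodes)
    min_support -- absolute support threshold (number of transactions)
    """
    candidates = {
        frozenset([item]) for part in partitions for t in part for item in t
    }
    result = {}
    k = 1
    while candidates:
        total = Counter()
        # Each partition could be counted on a different node; here we loop.
        for part in partitions:
            total.update(local_counts(part, candidates))
        frequent = {c: n for c, n in total.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return result


# Hypothetical sample data: four transactions split over two partitions.
partitions = [
    [frozenset({"bread", "milk"}), frozenset({"bread", "butter", "milk"})],
    [frozenset({"bread", "butter"}), frozenset({"milk"})],
]
result = distributed_apriori(partitions, min_support=2)
```

In this sketch only candidate sets and counts would need to cross node boundaries, not the transactions themselves, which is one common reason to distribute the counting step when the data is large and already spread over several sites.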

Keywords

Data Mining, Grid Environment, Apriori algorithm, Globus® Toolkit

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Information Technology, Guru Nanak Dev Engineering College, Ludhiana, India
  2. Computer Science and Engineering, UIET, Panjab University, Chandigarh, India
