Datamining in Grid Environment

  • M. Ciglarič
  • M. Pančur
  • B. Šter
  • A. Dobnikar
Conference paper

Abstract

This paper assesses the performance improvements and some implementation issues of two well-known data mining algorithms, Apriori and FP-growth, in the Alchemi grid environment. We compare the execution times and speed-up of two parallel implementations, a pure Apriori version and a hybrid FP-growth/Apriori version, on a grid with one to six processors. As expected, the latter performs better. We also discuss the effects of database characteristics on overall performance, and give guidance on choosing execution parameters and a suitable number of executors.
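
As a rough illustration of the level-wise candidate generation that Apriori is known for, and of how speed-up is conventionally computed (single-executor time divided by the time on p executors), a minimal single-machine sketch follows. The function names `apriori` and `speedup`, the toy transactions, and the use of absolute support counts are assumptions made for illustration. This is not the authors' grid implementation and it does not use the Alchemi API.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Support counting: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


def speedup(time_one_executor, time_p_executors):
    """Conventional speed-up: T(1) / T(p)."""
    return time_one_executor / time_p_executors


if __name__ == "__main__":
    # Hypothetical toy database, not taken from the paper.
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, count in sorted(apriori(db, 3).items(),
                                 key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), count)
```

The support-counting pass is the part that repeats once per level over the whole database, which is why it is the natural candidate for distribution across executors; how the paper actually partitions the work is not stated in the abstract.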

Keywords

Association Rule, Frequent Itemsets, Grid Environment, Mining Frequent Pattern, Minimal Support Threshold
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag/Wien 2005

Authors and Affiliations

  • M. Ciglarič (1)
  • M. Pančur (1)
  • B. Šter (1)
  • A. Dobnikar (1)

  1. Faculty of Computer and Information Science, University of Ljubljana, Slovenia
