Strategies for Partitioning Data in Association Rule Mining

  • Shakil Ahmed
  • Frans Coenen
  • Paul Leng
Conference paper

Abstract

The problem of extracting association rules from databases is well known. The most demanding part of the problem is the determination of the support for all those sets of attributes which occur often enough to be of possible interest. We have previously described methods we have developed that approach the problem by first constructing a tree (the P-tree) that contains a record of all the relevant information in the database and a partial computation of the support totals. This approach offers significant performance advantages over comparable alternative methods, which we have demonstrated experimentally with store-resident datasets. In practice, however, the real focus of interest is on much larger databases. In this paper we discuss strategies for partitioning the data in these cases, and present results of the performance analysis.
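
As an illustration of the support-counting step on partitioned data, the following is a minimal sketch only. It is not the P-tree method referred to in the abstract: it simply enumerates small itemsets in each horizontal partition and sums the per-partition counts. The function names, the `max_size` cap, and the toy database are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def partition_counts(records, max_size):
    """Count every itemset of up to max_size items within one partition."""
    counts = Counter()
    for record in records:
        items = sorted(set(record))  # canonical order, duplicates removed
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def frequent_itemsets(partitions, min_support, max_size=2):
    """Sum per-partition counts, then keep itemsets meeting min_support.

    min_support is a fraction of the total number of records.
    """
    total = Counter()
    n_records = 0
    for part in partitions:
        total.update(partition_counts(part, max_size))  # adds counts
        n_records += len(part)
    threshold = min_support * n_records
    return {s: c for s, c in total.items() if c >= threshold}


# Hypothetical toy database, split into two horizontal partitions.
database = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
]
partitions = [database[:2], database[2:]]
result = frequent_itemsets(partitions, min_support=0.5)
```

Because support totals are additive over disjoint sets of records, the result does not depend on how the records are divided. What partitioning changes is how much data must be held in memory at any one time.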

Keywords

Association Rules; Partial Support; Data Structures; Partitioning

Copyright information

© Springer-Verlag London 2004

Authors and Affiliations

  • Shakil Ahmed (1)
  • Frans Coenen (1)
  • Paul Leng (1)
  1. Department of Computer Science, University of Liverpool, Liverpool, UK