The Parameterized Complexity of Enumerating Frequent Itemsets

  • Matthew Hamilton
  • Rhonda Chaytor
  • Todd Wareham
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4169)

Abstract

A core problem in data mining is enumerating frequently-occurring itemsets in a given set of transactions. The search and enumeration versions of this problem have recently been proven NP- and #P-hard, respectively (Gunopulos et al., 2003), and all known algorithms have running times whose exponential terms are functions of the size of the largest transaction in the input, the size of the largest itemset in the output, or both. In this paper, we analyze the complexity of the size-k frequent itemset enumeration problem relative to a variety of parameterizations. Many of our hardness results are proved using a recent extension of parameterized complexity to solution-counting problems (McCartin, 2002). These results include hardness for versions of this problem based on restricted transaction-set structure. We also derive a collection of fixed-parameter algorithms using off-the-shelf parameterized algorithm design techniques, several of which suggest new algorithmic directions for the frequent itemset enumeration problem.
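
As a rough illustration of the size-k problem described in the abstract, the following Python sketch enumerates every itemset of exactly k items that is contained in at least a given number of transactions. It is a naive brute-force baseline written for this summary, not one of the paper's fixed-parameter algorithms; the function name `frequent_itemsets_of_size_k`, the parameter `min_support`, and the sample transactions are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets_of_size_k(transactions, k, min_support):
    """Return every size-k itemset contained in at least min_support transactions.

    Hypothetical helper for illustration only (not taken from the paper).
    """
    counts = Counter()
    for transaction in transactions:
        # Sort the deduplicated items so each itemset has one canonical tuple form.
        items = sorted(set(transaction))
        # A transaction with m items contributes C(m, k) candidate itemsets,
        # so the work grows quickly with the size of the largest transaction.
        for itemset in combinations(items, k):
            counts[itemset] += 1
    return sorted(itemset for itemset, count in counts.items() if count >= min_support)


# Made-up sample data, chosen only to exercise the function.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
result = frequent_itemsets_of_size_k(transactions, k=2, min_support=2)
print(result)
```

The candidate count per transaction is what makes this baseline impractical for large transactions, which is in line with the abstract's remark that known running times depend exponentially on the largest transaction or the largest output itemset.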

Keywords

Bipartite Graph, Vertex Cover, Frequent Itemset, Truth Assignment, Hardness Result
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Hamilton, M.: B.Sc.h Dissertation, The Parameterized Complexity of Enumerating Frequent Itemsets (2005)
  2. Agrawal, R., Imielinski, T., Swami, A.N.: Mining Association Rules between Sets of Items in Large Databases. In: Buneman, P., Jajodia, S. (eds.) Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pp. 207–216 (1993)
  3. Zaki, M.J., Ogihara, M.: Theoretical Foundations of Association Rules. In: Proceedings of 3rd SIGMOD 1998 Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 1998) (1998)
  4. Koyutürk, M., Grama, A., Szpankowski, W.: An Efficient Algorithm for Detecting Frequent Subgraphs in Biological Networks. In: ISMB/ECCB (Supplement of Bioinformatics), pp. 200–207 (2004)
  5. Grossman, R., Kasif, S., Moore, R., Rocke, D., Ullman, J.: Data mining research: Opportunities and challenges. A Report of three NSF Workshops on Mining Large, Massive, and Distributed Data (1998)
  6. Gunopulos, D., Khardon, R., Mannila, H., Saluja, S., Toivonen, H., Sharma, R.S.: Discovering All Most Specific Sentences. ACM Transactions on Database Systems 28(2), 140–174 (2003)
  7. Lin, D.-I., Kedem, Z.M.: Pincer Search: A New Algorithm for Discovering the Maximum Frequent Set. In: Schek, H.-J., Saltor, F., Ramos, I., Alonso, G. (eds.) EDBT 1998. LNCS, vol. 1377, pp. 105–119. Springer, Heidelberg (1998)
  8. Zaki, M.J., Hsiao, C.J.: Efficient Algorithms for Mining Closed Itemsets and Their Lattice Structure. IEEE Transactions on Knowledge and Data Engineering 17(4), 462–478 (2005)
  9. Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules. In: Bocca, J.B., Jarke, M., Zaniolo, C. (eds.) Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), pp. 487–499. Morgan Kaufmann, San Francisco (1994)
  10. Goethals, B.: Survey on Frequent Pattern Mining (2003)
  11. Zaki, M.J., Gouda, K.: Fast Vertical Mining Using Diffsets. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 326–335. ACM Press, New York (2003)
  12. Goethals, B., Zaki, M.J.: Advances in frequent itemset mining implementations: Introduction to fimi03. In: Proceedings of the ICDM 2003 Workshop on Frequent Itemset Mining Implementations (2003)
  13. Yang, G.: The Complexity of Mining Maximal Frequent Itemsets and Maximal Frequent Patterns. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 344–353 (2004)
  14. Wareham, H.T.: Systematic Parameterized Complexity Analysis in Computational Phonology. PhD thesis, University of Victoria (1999)
  15. Seno, M., Karypis, G.: SLPMiner: An Algorithm for Finding Frequent Sequential Patterns Using Length-Decreasing Support Constraint. In: Proceedings of the 2nd IEEE International Conference on Data Mining (ICDM), pp. 418–425 (2002)
  16. Arvind, V., Raman, V.: Approximation Algorithms for Some Parameterized Counting Problems. In: Bose, P., Morin, P. (eds.) ISAAC 2002. LNCS, vol. 2518, pp. 453–464. Springer, Heidelberg (2002)
  17. Damaschke, P.: Parameterized Enumeration, Transversals, and Imperfect Phylogeny Reconstruction. In: Downey, R.G., Fellows, M.R., Dehne, F. (eds.) IWPEC 2004. LNCS, vol. 3162, pp. 1–12. Springer, Heidelberg (2004)
  18. Fernau, H.: On Parameterized Enumeration. In: Ibarra, O.H., Zhang, L. (eds.) COCOON 2002. LNCS, vol. 2387, pp. 564–573. Springer, Heidelberg (2002)
  19. Flum, J., Grohe, M.: The Parameterized Complexity of Counting Problems. SIAM Journal on Computing 33(4), 892–922 (2004)
  20. McCartin, C.: Parameterized Counting Problems. In: Diks, K., Rytter, W. (eds.) MFCS 2002. LNCS, vol. 2420, pp. 556–567. Springer, Heidelberg (2002)
  21. Valiant, L.G.: The Complexity of Enumeration and Reliability Problems. SIAM Journal on Computing 8(3), 410–421 (1979)
  22. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York (1979)
  23. Downey, R.G., Fellows, M.R.: Fixed-Parameter Tractability and Completeness I: Basic Results. SIAM Journal on Computing 24(4), 873–921 (1995)
  24. Downey, R.G., Fellows, M.R.: Parameterized Complexity. Springer, Heidelberg (1999)
  25. Guo, J., Niedermeier, R.: Exact algorithms and applications for tree-like weighted set cover. Journal of Discrete Algorithms (accepted June 2005) (to appear)
  26. Gottlob, G., Grohe, M., Musliu, N., Samer, M., Scarcello, F.: Hypertree decompositions: Structure, algorithms, and applications. In: Kratsch, D. (ed.) WG 2005. LNCS, vol. 3787, pp. 1–15. Springer, Heidelberg (2005)
  27. Alexe, G., Alexe, S., Crama, Y., Foldes, S., Hammer, P.L., Simeone, B.: Consensus algorithms for the generation of all maximal bicliques. Discrete Applied Mathematics 145(1), 11–21 (2004)
  28. Guo, J., Hüffner, F., Niedermeier, R.: A Structural View on Parameterizing Problems: Distance from Triviality. In: Downey, R.G., Fellows, M.R., Dehne, F. (eds.) IWPEC 2004. LNCS, vol. 3162, pp. 162–173. Springer, Heidelberg (2004)
  29. Fernau, H., Niedermeier, R.: An Efficient Exact Algorithm for Constraint Bipartite Vertex Cover. In: Ambos-Spies, K., et al. (eds.) 38. Workshop über Komplexitätstheorie, Datenstrukturen und effiziente Algorithmen, vol. 44, p. 8 (1999)
  30. Chen, J., Kanj, I.A.: Constrained Minimum Vertex Cover in Bipartite Graphs: Complexity and Parameterized Algorithms. Journal of Computer and System Sciences 67(4), 833–847 (2003)
  31. Chlebík, M., Chlebíková, J.: Improvement of Nemhauser-Trotter Theorem and Its Applications in Parametrized Complexity. In: Hagerup, T., Katajainen, J. (eds.) SWAT 2004. LNCS, vol. 3111, pp. 174–186. Springer, Heidelberg (2004)
  32. Abu-Khzam, F.N., Collins, R.L., Fellows, M.R., Langston, M.A., Suters, W.H., Symons, C.T.: Kernelization Algorithms for the Vertex Cover Problem: Theory and Experiments. In: Proceedings of the Sixth Workshop on Algorithm Engineering and Experiments and the First Workshop on Analytic Algorithmics and Combinatorics (ALENEX/ANALC), pp. 62–69 (2004)
  33. Afrati, F.N., Gionis, A., Mannila, H.: Approximating a collection of frequent sets. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 12–19 (2004)

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Matthew Hamilton (1)
  • Rhonda Chaytor (2)
  • Todd Wareham (2)

  1. Department of Computing Science, University of Alberta, Edmonton, Canada
  2. Department of Computer Science, Memorial University of Newfoundland, St. John’s, Canada
