Deducing Bounds on the Support of Itemsets

  • Chapter
Database Support for Data Mining Applications

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 2682))

Abstract

Mining frequent itemsets is the core operation of many data mining algorithms. This operation, however, is very data intensive and sometimes produces a prohibitively large output. In this paper we give a complete set of rules for deducing tight bounds on the support of an itemset if the supports of all its subsets are known. Based on the derived bounds [l, u] on the support of a candidate itemset I, we can decide not to access the database to count the support of I if l is larger than the support threshold (I will certainly be frequent), or if u is below the threshold (I will certainly fail the frequency test). We can also use the deduction rules to reduce the size of an adequate representation of the collection of frequent sets; all itemsets I with bounds [l, u], where l = u, do not need to be stored explicitly. To assess the usability in practice, we implemented the deduction rules and we present experiments on real-life data sets.
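The abstract does not spell out the deduction rules, so the following is only a minimal sketch of how bounds of this kind can be computed, assuming the standard inclusion-exclusion argument: for every proper subset X of I, the alternating sum of the supports of the sets J with X ⊆ J ⊂ I is an upper bound on the support of I when |I \ X| is odd and a lower bound when it is even. The names `subsets_between`, `support_bounds`, and `decide`, the dictionary layout of `supports`, and the convention that "frequent" means support ≥ threshold are illustrative assumptions, not taken from the chapter.

```python
from itertools import combinations


def subsets_between(lower, upper):
    """Yield every frozenset J with lower ⊆ J ⊆ upper."""
    extra = sorted(upper - lower)
    for r in range(len(extra) + 1):
        for combo in combinations(extra, r):
            yield lower | frozenset(combo)


def support_bounds(itemset, supports):
    """Return (l, u) bounding the support of `itemset`.

    `supports` maps every proper subset of `itemset` (as a frozenset,
    including the empty set, whose support is the number of
    transactions) to its absolute support. Hypothetical layout.
    """
    I = frozenset(itemset)
    lower, upper = 0, supports[frozenset()]  # trivial bounds
    for X in subsets_between(frozenset(), I):
        if X == I:
            continue  # only proper subsets generate a rule
        total = 0
        for J in subsets_between(X, I):
            if J == I:
                continue
            sign = (-1) ** (len(I - J) + 1)
            total += sign * supports[J]
        if len(I - X) % 2 == 1:
            upper = min(upper, total)  # odd |I \ X|: upper bound
        else:
            lower = max(lower, total)  # even |I \ X|: lower bound
    return lower, upper


def decide(lower, upper, minsup):
    """Decide whether the database must be scanned (assumed convention:
    an itemset is frequent when its support is at least `minsup`)."""
    if lower >= minsup:
        return "frequent"
    if upper < minsup:
        return "infrequent"
    return "count"


# 10 transactions, supp(a) = 8, supp(b) = 7.
supports = {frozenset(): 10, frozenset("a"): 8, frozenset("b"): 7}
print(support_bounds("ab", supports))  # (5, 7)
```

With a threshold of 5 the itemset {a, b} in this toy example is certainly frequent and needs no database scan; with a threshold of 6 the bounds are inconclusive and the support has to be counted. When the two bounds coincide, the support is fully determined by the subsets, which is the case the abstract describes as not needing explicit storage.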




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Calders, T. (2004). Deducing Bounds on the Support of Itemsets. In: Meo, R., Lanzi, P.L., Klemettinen, M. (eds) Database Support for Data Mining Applications. Lecture Notes in Computer Science, vol 2682. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44497-8_11

  • DOI: https://doi.org/10.1007/978-3-540-44497-8_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22479-2

  • Online ISBN: 978-3-540-44497-8

  • eBook Packages: Springer Book Archive
