An Exhaustive Covering Approach to Parameter-Free Mining of Non-redundant Discriminative Itemsets

Kameya, Yoshitaka

doi:10.1007/978-3-319-43946-4_10

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9829)

Included in the following conference series:

International Conference on Big Data Analytics and Knowledge Discovery

Abstract

Discriminative pattern mining is a promising extension of frequent pattern mining. This paper proposes an algorithm called ExCover, short for exhaustive covering, for finding non-redundant discriminative itemsets. ExCover outputs non-redundant patterns, each of which best covers at least one positive transaction. With no control parameters limiting the search space, ExCover efficiently performs an exhaustive search for best-covering patterns using branch-and-bound pruning. During the search, candidate best-covering patterns are collected concurrently for each positive transaction. Formal discussion and experimental results show that ExCover efficiently finds a more compact set of patterns than previous methods.
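
The notion of a best-covering pattern can be illustrated with a brute-force sketch. The code below is not ExCover itself: it enumerates every itemset rather than applying branch-and-bound pruning, it does not handle closedness or other redundancy checks, and it assumes the F-score as the quality measure, which the abstract does not specify. The function names and the toy data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def f_score(pattern, positives, negatives):
    """F-score of a pattern (assumed quality measure, not taken from the abstract)."""
    tp = sum(pattern <= t for t in positives)  # positive transactions covered
    fp = sum(pattern <= t for t in negatives)  # negative transactions covered
    # F = 2*TP / (2*TP + FP + FN), where TP + FN = number of positives.
    return Fraction(2 * tp, tp + fp + len(positives))


def best_covering_patterns(positives, negatives):
    """For each positive transaction, keep every covering pattern with the top score."""
    items = sorted(set().union(*positives))
    # Exhaustive enumeration of all non-empty itemsets: no pruning at all.
    candidates = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
    ]
    scores = {p: f_score(p, positives, negatives) for p in candidates}
    selected = set()
    for t in positives:
        covering = [p for p in candidates if p <= t]
        if not covering:  # an empty transaction is covered by no pattern
            continue
        best = max(scores[p] for p in covering)
        # Ties are kept, so all patterns sharing the best score are output.
        selected.update(p for p in covering if scores[p] == best)
    return selected


# Hypothetical toy data: two positive and two negative transactions.
positives = [{"A", "B"}, {"C"}]
negatives = [{"A"}, {"B"}]
result = best_covering_patterns(positives, negatives)
print(sorted(sorted(p) for p in result))  # [['A', 'B'], ['C']]
```

In this toy data, {A} and {B} each cover one negative transaction, so {A, B} scores higher for the first positive transaction, while {C} is the only pattern covering the second.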

Notes

1. In the previous example with the target class \(c=+\), a pattern \(\varvec{x}=\{\mathsf {D}\}\) is excluded, since \(p(\varvec{x}\mid c)=3/5=0.6\) and \(p(\varvec{x}\mid \lnot c)=5/5=1\).
2. Equivalent substitutions are also possible: \(p(c\mid \varvec{x}):=1\), \(p(\lnot \varvec{x}\mid \lnot c):=1\), and so on.
3. In other words, ExCover outputs all patterns having the same best score. We only exclude apparently redundant patterns to avoid the loss of crucial information.
4. Recall that \(\mathcal{D}_c(\varvec{x})\) denotes the set of positive transactions covered by a pattern \(\varvec{x}\).
5. More formally, a visited pattern is a pattern closed on the positives produced at Line 5 in the Grow procedure.
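
The arithmetic in Note 1 can be checked on a toy dataset. The ten transactions below are hypothetical (the example the note refers to is not reproduced on this page) and are chosen only so that item D occurs in 3 of 5 positive and 5 of 5 negative transactions. Reading the exclusion criterion as \(p(\varvec{x}\mid c) < p(\varvec{x}\mid \lnot c)\) is likewise an assumption.

```python
from fractions import Fraction


def cond_prob(pattern, transactions):
    """Fraction of the given transactions that contain every item of the pattern."""
    covered = sum(pattern <= t for t in transactions)
    return Fraction(covered, len(transactions))


# Hypothetical data: D appears in 3 of 5 positives and in all 5 negatives.
positives = [{"A", "D"}, {"B", "D"}, {"C", "D"}, {"A", "B"}, {"A", "C"}]
negatives = [{"D"}, {"B", "D"}, {"C", "D"}, {"D", "E"}, {"B", "C", "D"}]

x = {"D"}
p_pos = cond_prob(x, positives)  # p(x | c)
p_neg = cond_prob(x, negatives)  # p(x | not c)

# Assumed criterion: drop x when it is more frequent among the negatives.
excluded = p_pos < p_neg
print(float(p_pos), float(p_neg), excluded)  # 0.6 1.0 True
```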

Author information

Authors and Affiliations

Department of Information Engineering, Meijo University, 1-501 Shiogama-guchi, Tenpaku-ku, Nagoya, 468-8502, Japan
Yoshitaka Kameya

Corresponding author

Correspondence to Yoshitaka Kameya.

Editor information

Editors and Affiliations

University of Science and Technology, Rolla, Missouri, USA
Sanjay Madria
Osaka University, Osaka, Japan
Takahiro Hara

About this paper

Cite this paper

Kameya, Y. (2016). An Exhaustive Covering Approach to Parameter-Free Mining of Non-redundant Discriminative Itemsets. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science, vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_10

DOI: https://doi.org/10.1007/978-3-319-43946-4_10
Published: 06 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43945-7
Online ISBN: 978-3-319-43946-4
eBook Packages: Computer Science, Computer Science (R0)
