Abstract
Discriminative pattern mining is a promising extension of frequent pattern mining. This paper proposes an algorithm called ExCover, short for exhaustive covering, for finding non-redundant discriminative itemsets. ExCover outputs non-redundant patterns, each of which is a best-covering pattern for at least one positive transaction. With no control parameters limiting the search space, ExCover efficiently performs an exhaustive search for best-covering patterns using branch-and-bound pruning. During the search, candidate best-covering patterns are collected concurrently for each positive transaction. Formal discussions and experimental results show that ExCover efficiently finds a more compact set of patterns than previous methods.
Notes
1. In the previous example with the target class \(c=+\), a pattern \(\varvec{x}=\{\mathsf {D}\}\) is excluded, since \(p(\varvec{x}\mid c)=3/5=0.6\) and \(p(\varvec{x}\mid \lnot c)=5/5=1\).
2. Equivalent substitutions are also possible: \(p(c\mid \varvec{x}):=1\), \(p(\lnot \varvec{x}\mid \lnot c):=1\), and so on.
3. In other words, ExCover outputs all patterns having the same best score. Only evidently redundant patterns are excluded, so as to avoid losing crucial information.
4. Recall that \(\mathcal{D}_c(\varvec{x})\) denotes the set of positive transactions covered by a pattern \(\varvec{x}\).
5. More formally, a visited pattern is a pattern closed on the positives produced at Line 5 in the Grow procedure.
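The quantities used in Notes 1 and 4 can be illustrated with a minimal Python sketch. The toy transactions below are hypothetical and are not the paper's example dataset; they are arranged only so that the item D reproduces the counts quoted in Note 1 (3 of 5 positives, 5 of 5 negatives). The function names and the exclusion test \(p(\varvec{x}\mid c) < p(\varvec{x}\mid \lnot c)\) are likewise assumptions made for illustration, since the exact criterion is not stated in the notes themselves.

```python
from fractions import Fraction

# Hypothetical toy data (NOT the paper's dataset): 5 positive and 5 negative
# transactions, arranged so that item "D" occurs in 3 positives and 5 negatives.
POSITIVES = [
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"B", "D"},
    {"A", "B"},
    {"A", "C"},
]
NEGATIVES = [
    {"C", "D"},
    {"B", "D"},
    {"D"},
    {"A", "D"},
    {"C", "D"},
]


def covered(pattern, transactions):
    """Indices of the transactions that contain every item of the pattern.

    Applied to POSITIVES, this plays the role of D_c(x) in Note 4.
    """
    return [i for i, t in enumerate(transactions) if pattern <= t]


def cond_prob(pattern, transactions):
    """Fraction of the given transactions covered by the pattern.

    With POSITIVES this estimates p(x | c); with NEGATIVES, p(x | not c).
    """
    return Fraction(len(covered(pattern, transactions)), len(transactions))


def is_excluded(pattern):
    """Assumed exclusion test: the pattern is rarer in the positives
    than in the negatives, i.e. p(x | c) < p(x | not c)."""
    return cond_prob(pattern, POSITIVES) < cond_prob(pattern, NEGATIVES)


if __name__ == "__main__":
    x = {"D"}
    print("D_c(x)     =", covered(x, POSITIVES))
    print("p(x | c)   =", cond_prob(x, POSITIVES))
    print("p(x | ~c)  =", cond_prob(x, NEGATIVES))
    print("excluded?  =", is_excluded(x))
```

Exact fractions are used instead of floats so that the comparison between the two class-conditional probabilities is not affected by rounding.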
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Kameya, Y. (2016). An Exhaustive Covering Approach to Parameter-Free Mining of Non-redundant Discriminative Itemsets. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science, vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_10
DOI: https://doi.org/10.1007/978-3-319-43946-4_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43945-7
Online ISBN: 978-3-319-43946-4
eBook Packages: Computer Science (R0)