Abstract
The discovery of frequent patterns is one of the most important problems in data mining. Extensive research has been carried out on discovering positive patterns; however, very little has been offered for discovering patterns with negation. One of the main difficulties with frequent patterns with negation is the huge number of discovered patterns, which exceeds the number of frequent positive patterns by orders of magnitude. The problem can be significantly alleviated by applying concise representations that use generalized disjunctive rules to reason about frequent patterns, both with and without negation. In this paper, we examine three types of generalized disjunction-free representations and derive the relationships between them. We also present two variants of algorithms for building such representations. The results obtained on a theoretical basis are verified experimentally.
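As a rough illustration of why patterns with negation can be reasoned about from positive patterns, the sketch below computes the support of a pattern with negated items from the supports of positive patterns by standard inclusion-exclusion, for example sup(a, not b) = sup(a) - sup(ab). This is not the chapter's algorithm or any of its representations; the toy database and the function names (`support`, `support_with_negation`, `direct_count`) are hypothetical and chosen only for this example.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a"},
    {"b", "c"},
]


def support(items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in TRANSACTIONS if items <= t)


def support_with_negation(positive, negated):
    """Support of the pattern 'all of `positive` present, all of `negated` absent'.

    Uses only supports of positive patterns, combined by inclusion-exclusion
    over the subsets of the negated items.
    """
    positive = set(positive)
    negated = sorted(negated)
    total = 0
    for k in range(len(negated) + 1):
        for subset in combinations(negated, k):
            total += (-1) ** k * support(positive | set(subset))
    return total


def direct_count(positive, negated):
    """Same support, counted directly on the transactions (for checking)."""
    positive, negated = set(positive), set(negated)
    return sum(1 for t in TRANSACTIONS if positive <= t and not (negated & t))


if __name__ == "__main__":
    for pos, neg in [({"a"}, {"b"}), ({"a"}, {"b", "c"}), (set(), {"a"})]:
        print(sorted(pos), sorted(neg),
              support_with_negation(pos, neg), direct_count(pos, neg))
```

The same example hints at the size problem mentioned above: every item of a pattern may occur either positively or negated, so n items give rise to 2^n patterns over those items, whereas only one of them is positive.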
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kryszkiewicz, M., Rybiński, H., Cichoń, K. (2010). On Concise Representations of Frequent Patterns Admitting Negation. In: Koronacki, J., Raś, Z.W., Wierzchoń, S.T., Kacprzyk, J. (eds) Advances in Machine Learning II. Studies in Computational Intelligence, vol 263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05179-1_13
DOI: https://doi.org/10.1007/978-3-642-05179-1_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05178-4
Online ISBN: 978-3-642-05179-1
eBook Packages: Engineering, Engineering (R0)