Abstract
Several recent works have proposed a new way to mine patterns in databases with pathological size. For example, experiments in genome biology usually produce databases with thousands of attributes (genes) but only tens of objects (experiments). In this case, mining the “transposed” database explores a smaller search space, and the Galois connection makes it possible to infer the closed patterns of the original database. We focus here on constrained pattern mining for these unusual databases and give a theoretical framework for database and constraint transposition. We discuss the properties of constraint transposition and examine classical constraints. We then address the problem of generating the closed patterns of the original database that satisfy the constraint, starting from those mined in the “transposed” database. Finally, we show how to generate all the patterns satisfying the constraint from the closed ones.
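To make the role of the Galois connection concrete, the following minimal sketch mines closed patterns by brute force on a toy database and on its transposition, then maps the closed object sets back to closed attribute sets. It is an illustration under stated assumptions, not the algorithm of the chapter: the toy data and the helper names (`transpose`, `support_set`, `common_items`, `closed_patterns`) are hypothetical, constraints are not handled, and the exhaustive enumeration is only practical for very small inputs.

```python
from itertools import combinations

# Hypothetical toy database: 3 objects (experiments) x 5 attributes (genes).
DB = {
    "o1": {"a1", "a2", "a3"},
    "o2": {"a1", "a2", "a4"},
    "o3": {"a2", "a3", "a5"},
}

def transpose(db):
    """Swap the roles of rows and columns (assumes every row is non-empty)."""
    t = {}
    for row, cols in db.items():
        for c in cols:
            t.setdefault(c, set()).add(row)
    return t

def support_set(db, pattern):
    """Rows of db containing every element of pattern (one map of the Galois connection)."""
    return frozenset(r for r, cols in db.items() if pattern <= cols)

def common_items(db, rows):
    """Columns shared by all given rows (the other map of the Galois connection)."""
    all_cols = set().union(*db.values())
    return frozenset(c for c in all_cols if all(c in db[r] for r in rows))

def closed_patterns(db):
    """Brute force: collect the closure of every subset of columns."""
    cols = sorted(set().union(*db.values()))
    closed = set()
    for k in range(len(cols) + 1):
        for subset in combinations(cols, k):
            rows = support_set(db, frozenset(subset))
            closed.add(common_items(db, rows))
    return closed

# Direct mining: enumerates 2**5 = 32 attribute subsets.
direct = closed_patterns(DB)

# Mining the transposed database: enumerates only 2**3 = 8 object subsets,
# then maps each closed object set to the attributes its objects share.
T = transpose(DB)
via_transposition = {support_set(T, objs) for objs in closed_patterns(T)}

print(direct == via_transposition)
```

In this sketch the transposed run explores 8 candidate sets instead of 32, which mirrors the motivation given above for databases with many attributes and few objects.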
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jeudy, B., Rioult, F. (2005). Database Transposition for Constrained (Closed) Pattern Mining. In: Goethals, B., Siebes, A. (eds) Knowledge Discovery in Inductive Databases. KDID 2004. Lecture Notes in Computer Science, vol 3377. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31841-5_6
DOI: https://doi.org/10.1007/978-3-540-31841-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25082-1
Online ISBN: 978-3-540-31841-5
eBook Packages: Computer Science (R0)