Advertisement

A Lossless Data Reduction for Mining Constrained Patterns in n-ary Relations

Conference paper
  • 3k Downloads
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8725)

Abstract

Given a binary relation, listing the itemsets takes exponential time. The problem grows worse when searching for analog patterns defined in n-ary relations. However, real-life relations are sparse and, with a greater number n of dimensions, they tend to be even sparser. Moreover, not all itemsets are searched. Only those satisfying some userdefined constraints, such as minimal size constraints. This article proposes to exploit together the sparsity of the relation and the presence of constraints satisfying a common property, the monotonicity w.r.t. one dimension. It details a pre-processing step to identify and erase n-tuples whose removal does not change the collection of patterns to be discovered. That reduction of the relation is achieved in a time and a space that is linear in the number of n-tuples. Experiments on two real-life datasets show that, whatever the algorithm used afterward to actually list the patterns, the pre-process allows to lower the overall running time by a factor typically ranging from 10 to 100.

Keywords

Binary Relation Pattern Mining Pattern Space Ternary Relation Monotone Constraint 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. 1.
    Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: VLDB 1994: Proceedings of the 20th International Conference on Very Large Data Bases, pp. 487–499. Morgan Kaufmann (1994)Google Scholar
  2. 2.
    Bonchi, F., Giannotti, F., Mazzanti, A., Pedreschi, D.: ExAMiner: Optimized level-wise frequent pattern mining with monotone constraints. In: ICDM 2003: Proceedings of the 3rd International Conference on Data Mining, pp. 11–18. IEEE Computer Society (2003)Google Scholar
  3. 3.
    Bonchi, F., Giannotti, F., Mazzanti, A., Pedreschi, D.: ExAnte: Anticipated data reduction in constrained pattern mining. In: Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds.) PKDD 2003. LNCS (LNAI), vol. 2838, pp. 59–70. Springer, Heidelberg (2003)Google Scholar
  4. 4.
    Bonchi, F., Goethals, B.: FP-Bonsai: the art of growing and pruning small FP-trees. In: Dai, H., Srikant, R., Zhang, C. (eds.) PAKDD 2004. LNCS (LNAI), vol. 3056, pp. 155–160. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  5. 5.
    Boulicaut, J.F., Jeudy, B.: Using constraints during set mining: should we prune or not? In: BDA 2000: Actes des 16ème Journées Bases de Données Avancées, pp. 221–237 (2000)Google Scholar
  6. 6.
    Bucila, C., Gehrke, J., Kifer, D., White, W.M.: DualMiner: a dual-pruning algorithm for itemsets with constraints. In: KDD 2002: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 42–51. ACM Press (2002)Google Scholar
  7. 7.
    Cerf, L., Besson, J., Robardet, C., Boulicaut, J.F.: Closed patterns meet n-ary relations. ACM Transactions on Knowledge Discovery from Data 3(1), 1–36 (2009)CrossRefGoogle Scholar
  8. 8.
    Gallo, A., Mammone, A., Bie, T.D., Turchi, M., Cristianini, N.: From frequent itemsets to informative patterns. Tech. Rep. 123936, University of Bristol, Senate House, Tyndall Avenue, Bristol BS8 1TH, UK (December 2009)Google Scholar
  9. 9.
    Grahne, G., Lakshmanan, L.V.S., Wang, X.: Efficient mining of constrained correlated sets. In: ICDE 2000: Proceedings of the 16th International Conference on Data Engineering, pp. 512–521. IEEE Computer Society (2000)Google Scholar
  10. 10.
    Jaschke, R., Hotho, A., Schmitz, C., Ganter, B., Stumme, G.: Trias–an algorithm for mining iceberg tri-lattices. In: ICDM 2006: Proceedings of the 6th IEEE International Conference on Data Mining, pp. 907–911. IEEE Computer Society (2006)Google Scholar
  11. 11.
    Ji, L., Tan, K.L., Tung, A.K.H.: Mining frequent closed cubes in 3D data sets. In: VLDB’06: Proceedings of the 32nd International Conference on Very Large Data Bases, pp. 811–822. VLDB Endowment (2006)Google Scholar
  12. 12.
    Kwak, H., Lee, C., Park, H., Moon, S.: What is twitter, a social network or a news media? In: WWW 2010: Proceedings of the 19th International World Wide Web Conferences, pp. 591–600. ACM Press (2010)Google Scholar
  13. 13.
    Nataraj, R.V., Selvan, S.: Closed pattern mining from n-ary relations. International Journal of Computer Applications 1(9), 9–13 (2010)CrossRefGoogle Scholar
  14. 14.
    Trabelsi, C., Jelassi, N., Ben Yahia, S.: Scalable mining of frequent tri-concepts from Folksonomies. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds.) PAKDD 2012, Part II. LNCS, vol. 7302, pp. 231–242. Springer, Heidelberg (2012)CrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. 1.Department of Computer ScienceUniversidade Federal de Minas GeraisBelo HorizonteBrazil

Personalised recommendations