The complexity of satisfying constraints on databases of transactions

Calders, Toon

doi:10.1007/s00236-007-0060-1

Original Article
Published: 01 November 2007

Volume 44, pages 591–624, (2007)

Acta Informatica

Toon Calders^1,2

Abstract

Computing frequent itemsets is one of the most prominent problems in data mining. Recently, a related problem, called FREQSAT, was introduced and studied: given a set of itemset–interval pairs, does there exist a database such that, for every pair, the frequency of the itemset falls within the interval? In this paper, we extend the FREQSAT problem by further constraining the database through additional characteristics supplied as part of the input: the maximal transaction length, the maximal number of transactions, and the maximal number of duplicates of a transaction. These extensions and all their combinations are studied in depth, and a hierarchy with respect to complexity is given. To complete the picture, we also study the cases in which a characteristic is bounded by a fixed constant that is not part of the input.
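
To make the problem statement concrete, the following is a minimal brute-force sketch of the bounded decision problem described in the abstract. It is not the method of the paper; it only enumerates candidate databases and checks the constraints. It assumes that the frequency of an itemset is the fraction of transactions containing it, and the names `frequency`, `satisfies`, and `bounded_freqsat` are hypothetical, chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement


def frequency(itemset, database):
    """Fraction of transactions in `database` that contain every item of `itemset`
    (assumed definition of frequency)."""
    hits = sum(1 for transaction in database if itemset <= transaction)
    return Fraction(hits, len(database))


def satisfies(database, constraints):
    """True if every (itemset, (low, high)) pair holds in `database`."""
    return all(low <= frequency(itemset, database) <= high
               for itemset, (low, high) in constraints)


def bounded_freqsat(items, constraints, max_transactions,
                    max_length=None, max_duplicates=None):
    """Exhaustively search for a witness database; return it, or None.

    The three bounds mirror the characteristics named in the abstract:
    number of transactions, transaction length, duplicates per transaction.
    The search is exponential and only suitable for toy inputs.
    """
    max_length = len(items) if max_length is None else max_length
    # All admissible transactions: subsets of `items` of size <= max_length.
    candidates = [frozenset(c)
                  for k in range(max_length + 1)
                  for c in combinations(sorted(items), k)]
    for n in range(1, max_transactions + 1):
        # A database is a multiset of n transactions.
        for db in combinations_with_replacement(candidates, n):
            if max_duplicates is not None and any(
                    db.count(t) > max_duplicates for t in set(db)):
                continue
            if satisfies(db, constraints):
                return list(db)
    return None


half = Fraction(1, 2)
constraints = [
    (frozenset("a"), (half, half)),
    (frozenset("b"), (half, half)),
    (frozenset("ab"), (Fraction(0), Fraction(0))),
]
witness = bounded_freqsat("ab", constraints, max_transactions=2)
```

In this toy instance, a database of two transactions, one containing only `a` and one containing only `b`, is a witness. Tightening a bound can turn a satisfiable instance into an unsatisfiable one: requiring the frequency of `{a, b}` to be at least 1/2 while limiting the transaction length to 1 admits no database at all.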

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, Eindhoven University of Technology, Den Dolech 2, 5600 MB, Eindhoven, The Netherlands
Toon Calders
Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
Toon Calders

Corresponding author

Correspondence to Toon Calders.

About this article

Cite this article

Calders, T. The complexity of satisfying constraints on databases of transactions. Acta Informatica 44, 591–624 (2007). https://doi.org/10.1007/s00236-007-0060-1

Received: 21 February 2006
Revised: 07 July 2007
Accepted: 09 October 2007
Published: 01 November 2007
Issue Date: December 2007
DOI: https://doi.org/10.1007/s00236-007-0060-1
