P. Adriaans and P. Vitányi. Approximation of the two-part MDL code. IEEE TIT, 55(1):444–457, 2009.
Google Scholar
H. Akaike. A new look at the statistical model identification. IEEE TAC, 19(6):716–723, 1974.
MATH
MathSciNet
Google Scholar
L. Akoglu, H. Tong, J. Vreeken, and C. Faloutsos. CompreX (refer MSP): Compression based anomaly detection. In CIKM. ACM, 2012.
Google Scholar
L. Akoglu, J. Vreeken, H. Tong, N. Tatti, and C. Faloutsos. Mining connection pathways for marked nodes in large graphs. In SDM. SIAM, 2013.
Google Scholar
R. Bathoorn, A. Koopman, and A. Siebes. Reducing the frequent pattern set. In ICDM-Workshop, pages 1–5, 2006.
Google Scholar
C. Böhm, C. Faloutsos, J.-Y. Pan, and C. Plant. Robust information-theoretic clustering. In KDD, pages 65–75, 2006.
Google Scholar
F. Bonchi, M. van Leeuwen, and A. Ukkonen. Characterizing uncertain data using compression. In SDM, pages 534–545, 2011.
Google Scholar
S. Chakrabarti, S. Sarawagi, and B. Dom. Mining surprising patterns using temporal description length. In VLDB, pages 606–617. Morgan Kaufmann, 1998.
Google Scholar
D. Chakrabarti, S. Papadimitriou, D. S. Modha, and C. Faloutsos. Fully automatic cross-associations. In KDD, pages 79–88, 2004.
Google Scholar
V. Chandola and V. Kumar. Summarization – compressing data into an informative representation. Knowl. Inf. Sys., 12(3):355–378, 2007.
CrossRef
Google Scholar
R. Cilibrasi and P. Vitányi. Clustering by compression. IEEE TIT, 51(4):1523–1545, 2005.
Google Scholar
T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley-Interscience New York, 2006.
Google Scholar
T. De Bie. An information theoretic framework for data mining. In KDD, pages 564–572. ACM, 2011.
Google Scholar
A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. J. R. Statist. Soc. B, 39(1):1–38, 1977.
MATH
MathSciNet
Google Scholar
C. Faloutsos and V. Megalooikonomou. On data mining, compression and Kolmogorov complexity. Data Min. Knowl. Disc., 15(1):3–20, 2007.
CrossRef
MathSciNet
Google Scholar
U. Fayyad and K. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In UAI, pages 1022–1027, 1993.
Google Scholar
R. A. Fisher. On the interpretation of χ2from contingency tables, and the calculation of P. Journal of the Royal Statistical Society, 85(1):87–94, 1922.
CrossRef
Google Scholar
F. Geerts, B. Goethals, and T. Mielikäinen. Tiling databases. In DS, pages 278–289, 2004.
Google Scholar
A. Gionis, H. Mannila, and J. K. Seppänen. Geometric and combinatorial tiles in 0-1 data. In PKDD, pages 173–184. Springer, 2004.
Google Scholar
P. Grünwald. The Minimum Description Length Principle. MIT Press, 2007.
Google Scholar
T. Guns, S. Nijssen, and L. D. Raedt. Itemset mining: A constraint programming perspective. Artif. Intell., 175(12-13):1951–1983, 2011.
CrossRef
MATH
Google Scholar
E. Halperin and R. M. Karp. The minimum-entropy set cover problem. TCS, 348(2-3):240–250, 2005.
CrossRef
MATH
MathSciNet
Google Scholar
H. Heikinheimo, J. K. Seppänen, E. Hinkkanen, H. Mannila, and T. Mielikäinen. Finding low-entropy sets and trees from binary data. In KDD, pages 350–359, 2007.
Google Scholar
H. Heikinheimo, J. Vreeken, A. Siebes, and H. Mannila. Lowentropy set selection. In SDM, pages 569–580, 2009.
Google Scholar
E. Jaynes. On the rationale of maximum-entropy methods. Proc. IEEE, 70(9):939–952, 1982.
CrossRef
Google Scholar
U. Kang and C. Faloutsos. Beyond caveman communities: Hubs and spokes for graph compression and mining. In ICDM, pages 300–309. IEEE, 2011.
Google Scholar
R. M. Karp. Reducibility among combinatorial problems. In Proc. Compl. Comp. Comput., pages 85–103, New York, USA, 1972.
Google Scholar
E. Keogh, S. Lonardi, and C. A. Ratanamahatana. Towards parameter-free data mining. In KDD, pages 206–215, 2004.
Google Scholar
E. Keogh, S. Lonardi, C. A. Ratanamahatana, L. Wei, S.-H. Lee, and J. Handley. Compression-based data mining of sequential data. Data Min. Knowl. Disc., 14(1):99–129, 2007.
CrossRef
MathSciNet
Google Scholar
P. Kontkanen and P. Myllymäki. A linear-time algorithm for computing the multinomial stochastic complexity. Inf. Process. Lett., 103(6):227–233, 2007.
CrossRef
MATH
Google Scholar
P. Kontkanen, P. Myllymäki, W. Buntine, J. Rissanen, and H. Tirri. An MDL framework for clustering. Technical report, HIIT, 2004. Technical Report 2004–6.
Google Scholar
A. Koopman and A. Siebes. Discovering relational items sets efficiently. In SDM, pages 108–119, 2008.
Google Scholar
A. Koopman and A. Siebes. Characteristic relational patterns. In KDD, pages 437–446, 2009.
Google Scholar
H. T. Lam, F. Mörchen, D. Fradkin, and T. Calders. Mining compressing sequential patterns. In SDM, 2012.
Google Scholar
H. T. Lam, T. Calders, J. Yang, F. Moerchen, and D. Fradkin.: Mining compressing sequential patterns in streams. In IDEA, pages 54–62, 2013.
Google Scholar
M. van Leeuwen and A. Siebes. StreamKrimp: Detecting change in data streams. In ECML PKDD, pages 672–687, 2008.
Google Scholar
M. van Leeuwen, J. Vreeken, and A. Siebes. Compression picks the item sets that matter. In PKDD, pages 585–592, 2006.
Google Scholar
M. van Leeuwen, F. Bonchi, B. Sigurbjörnsson, and A. Siebes. Compressing tags to find interesting media groups. In CIKM, pages 1147–1156, 2009.
Google Scholar
M. van Leeuwen, J. Vreeken, and A. Siebes. Identifying the components. Data Min. Knowl. Disc., 19(2):173–292, 2009.
CrossRef
MathSciNet
Google Scholar
M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and its Applications. Springer, 1993.
Google Scholar
M. Li, X. Chen, X. Li, B. Ma, and P. Vitanyi. The similarity metric. IEEE TIT, 50(12): 3250–3264, 2004.
MathSciNet
Google Scholar
C. Lucchese, S. Orlando, and R. Perego. Mining top-k patterns from binary datasets in presence of noise. In SDM, pages 165–176, 2010.
Google Scholar
M. Mampaey and J. Vreeken. Summarising categorical data by clustering attributes. Data Min. Knowl. Disc., 26(1):130–173, 2013.
CrossRef
MATH
MathSciNet
Google Scholar
M. Mampaey, J. Vreeken, and N. Tatti. Summarizing data succinctly with the most informative itemsets. ACM TKDD, 6:1–44, 2012.
CrossRef
Google Scholar
P. Miettinen and J. Vreeken. Model order selection for Boolean matrix factorization. In KDD, pages 51–59. ACM, 2011.
Google Scholar
P. Miettinen and J. Vreeken. mdl4bmf: Minimum description length for Boolean matrix factorization. ACM TKDD. In Press.
Google Scholar
S. Papadimitriou, J. Sun, C. Faloutsos, and P. S. Yu. Hierarchical, parameter-free community discovery. In ECML PKDD, pages 170–187, 2008.
Google Scholar
B. Pfahringer. Compression-based feature subset selection. In Proc. IJCAI'95 Workshop on Data Engineering for Inductive Learning, pages 109–119, 1995.
Google Scholar
B. A. Prakash, J. Vreeken, and C. Faloutsos. Spotting culprits in epidemics: How many and which ones? In ICDM. IEEE, 2012.
Google Scholar
J. Quinlan. C4.5: Programs for Machine Learning. Morgan-Kaufmann, Los Altos, California, 1993.
Google Scholar
L. D. Raedt. Declarative modeling for machine learning and data mining. In ECML PKDD, pages 2–3, 2012.
Google Scholar
J. Rissanen. Modeling by shortest data description. Automatica, 14(1):465–471, 1978.
CrossRef
MATH
Google Scholar
G. Schwarz. Estimating the dimension of a model. Annals Stat., 6(2):461–464, 1978.
CrossRef
MATH
Google Scholar
H. Shao, B. Tong, and E. Suzuki. Extended MDL principle for feature-based inductive transfer learning. Knowl. Inf. Sys., 35(2):365–389, 2013.
CrossRef
Google Scholar
A. Siebes. Queries for data analysis. In IDA, pages 7–22, 2012.
Google Scholar
A. Siebes and R. Kersten. A structure function for transaction data. In SDM, pages 558–569. SIAM, 2011.
Google Scholar
A. Siebes, J. Vreeken, and M. van Leeuwen. Item sets that compress. In SDM, pages 393–404. SIAM, 2006.
Google Scholar
K. Smets and J. Vreeken. The odd one out: Identifying and characterising anomalies. In SDM, pages 804–815. SIAM, 2011.
Google Scholar
K. Smets and J. Vreeken. Slim: Directly mining descriptive patterns. In SDM, pages 236–247. SIAM, 2012.
Google Scholar
J. Sun, C. Faloutsos, S. Papadimitriou, and P. S. Yu. Graphscope: parameter-free mining of large time-evolving graphs. In KDD, pages 687–696, 2007.
Google Scholar
N. Tatti. Computational complexity of queries based on itemsets. Inf. Process. Lett., 98(5): 183–187, 2006.
CrossRef
MATH
MathSciNet
Google Scholar
N. Tatti and J. Vreeken. Finding good itemsets by packing data. In ICDM, pages 588–597, 2008.
Google Scholar
N. Tatti and J. Vreeken. Discovering descriptive tile trees by fast mining of optimal geometric subtiles. In ECML PKDD. Springer, 2012.
Google Scholar
N. Tatti and J. Vreeken. The long and the short of it: Summarizing event sequences with serial episodes. In KDD. ACM, 2012.
Google Scholar
J. Vreeken and A. Siebes. Filling in the blanks: Krimp minimisation for missing data. In ICDM, pages 1067–1072. IEEE, 2008.
Google Scholar
J. Vreeken, M. van Leeuwen, and A. Siebes. Characterising the difference. In KDD, pages 765–774, 2007.
Google Scholar
J. Vreeken, M. van Leeuwen, and A. Siebes. Preserving privacy through data generation. In ICDM, pages 685–690. IEEE, 2007.
Google Scholar
J. Vreeken, M. van Leeuwen, and A. Siebes. Krimp: Mining itemsets that compress. Data Min. Knowl. Disc., 23(1):169–214, 2011.
CrossRef
MATH
Google Scholar
C. Wallace. Statistical and inductive inference by minimum message length. Springer-Verlag, 2005.
Google Scholar
C. Wang and S. Parthasarathy. Summarizing itemset patterns using probabilistic models. In KDD, pages 730–735, 2006.
Google Scholar
H. Warner, A. Toronto, L. Veasey, and R. Stephenson. A mathematical model for medical diagnosis, application to congenital heart disease. J. Am. Med. Assoc., 177:177–184, 1961.
CrossRef
Google Scholar