E. Achtert, C. Böhm, H.-P. Kriegel, P. Kröger, I. Müller-Gorman, and A. Zimek. Detection and visualization of subspace cluster hierarchies. In 12th International Conference on Database Systems for Advanced Applications (DASFAA), Bangkok, Thailand, pages 152–163, 2007.
E. Achtert, C. Böhm, H.-P. Kriegel, P. Kröger, and A. Zimek. Robust, complete, and efficient correlation clustering. In 7th SIAM International Conference on Data Mining (SDM), Minneapolis, MN, pages 413–418, 2007.
C. C. Aggarwal, C. M. Procopiuc, J. L. Wolf, P. S. Yu, and J. S. Park. Fast algorithms for projected clustering. In ACM International Conference on Management of Data (SIGMOD), Philadelphia, PA, pages 61–72, 1999.
C. C. Aggarwal, A. Hinneburg, and D. Keim. On the surprising behavior of distance metrics in high dimensional space. In 8th International Conference on Database Theory (ICDT), London, UK, pages 420–434, 2001.
C. C. Aggarwal, N. Ta, J. Wang, J. Feng, and M. Zaki. XProj: a framework for projected structural clustering of XML documents. In 13th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), San Jose, CA, pages 46–55, 2007.
R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In 20th International Conference on Very Large Data Bases (VLDB), Santiago de Chile, Chile, pages 487–499, 1994.
R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high dimensional data for data mining applications. In ACM International Conference on Management of Data (SIGMOD), Seattle, WA, pages 94–105, 1998.
M. Ankerst, M. M. Breunig, H.-P. Kriegel, and J. Sander. OPTICS: Ordering points to identify the clustering structure. In ACM International Conference on Management of Data (SIGMOD), Philadelphia, PA, pages 49–60, 1999.
I. Assent. Clustering high dimensional data. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2(4):340–350, 2012.
I. Assent, R. Krieger, E. Müller, and T. Seidl. DUSC: dimensionality unbiased subspace clustering. In 7th IEEE International Conference on Data Mining (ICDM), Omaha, NE, pages 409–414, 2007.
I. Assent, R. Krieger, E. Müller, and T. Seidl. EDSC: efficient density-based subspace clustering. In 17th ACM Conference on Information and Knowledge Management (CIKM), Napa Valley, CA, pages 1093–1102, 2008.
I. Assent, R. Krieger, E. Müller, and T. Seidl. INSCY: indexing subspace clusters with in-process-removal of redundancy. In 8th IEEE International Conference on Data Mining (ICDM), Pisa, Italy, pages 719–724, 2008.
I. Assent, E. Müller, S. Günnemann, R. Krieger, and T. Seidl. Less is more: Non-redundant subspace clustering. In MultiClust: 1st International Workshop on Discovering, Summarizing and Using Multiple Clusterings Held in Conjunction with KDD 2010, Washington, DC, 2010.
E. Bae and J. Bailey. COALA: a novel approach for the extraction of an alternate clustering of high quality and high dissimilarity. In 6th IEEE International Conference on Data Mining (ICDM), Hong Kong, China, pages 53–62, 2006.
C. Baumgartner, K. Kailing, H.-P. Kriegel, P. Kröger, and C. Plant. Subspace selection for clustering high-dimensional data. In 4th IEEE International Conference on Data Mining (ICDM), Brighton, UK, pages 11–18, 2004.
R. Bayardo. Efficiently mining long patterns from databases. In ACM International Conference on Management of Data (SIGMOD), Seattle, WA, pages 85–93, 1998.
K. P. Bennett, U. Fayyad, and D. Geiger. Density-based indexing for approximate nearest-neighbor queries. In 5th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), San Diego, CA, pages 233–243, 1999.
K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft. When is “nearest neighbor” meaningful? In 7th International Conference on Database Theory (ICDT), Jerusalem, Israel, pages 217–235, 1999.
S. Bickel and T. Scheffer. Multi-view clustering. In 4th IEEE International Conference on Data Mining (ICDM), Brighton, UK, pages 19–26, 2004.
R. J. G. B. Campello, D. Moulavi, and J. Sander. Density-based clustering based on hierarchical density estimates. In 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Gold Coast, Australia, pages 160–172, 2013.
C. H. Cheng, A. W.-C. Fu, and Y. Zhang. Entropy-based subspace clustering for mining numerical data. In 5th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), San Diego, CA, pages 84–93, 1999.
Y. Cui, X. Z. Fern, and J. G. Dy. Non-redundant multi-view clustering via orthogonalization. In 7th IEEE International Conference on Data Mining (ICDM), Omaha, NE, pages 133–142, 2007.
X. H. Dang and J. Bailey. Generation of alternative clusterings using the CAMI approach. In 10th SIAM International Conference on Data Mining (SDM), Columbus, OH, pages 118–129, 2010.
I. Davidson and Z. Qi. Finding alternative clusterings using constraints. In 8th IEEE International Conference on Data Mining (ICDM), Pisa, Italy, pages 773–778, 2008.
I. Davidson, S. S. Ravi, and L. Shamis. A SAT-based framework for efficient constrained clustering. In 10th SIAM International Conference on Data Mining (SDM), Columbus, OH, pages 94–105, 2010.
A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 39(1):1–38, 1977.
R. J. Durrant and A. Kaban. When is ‘nearest neighbour’ meaningful: A converse theorem and implications. Journal of Complexity, 25(4):385–397, 2009.
M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In 2nd ACM International Conference on Knowledge Discovery and Data Mining (KDD), Portland, OR, pages 226–231, 1996.
I. Färber, S. Günnemann, H.-P. Kriegel, P. Kröger, E. Müller, E. Schubert, T. Seidl, and A. Zimek. On using class-labels in evaluation of clusterings. In MultiClust: 1st International Workshop on Discovering, Summarizing and Using Multiple Clusterings Held in Conjunction with KDD 2010, Washington, DC, 2010.
D. François, V. Wertz, and M. Verleysen. The concentration of fractional distances. IEEE Transactions on Knowledge and Data Engineering, 19(7):873–886, 2007.
G. Gan, C. Ma, and J. Wu. Data Clustering. Theory, Algorithms, and Applications. Society for Industrial and Applied Mathematics (SIAM), 2007.
D. Gondek and T. Hofmann. Non-redundant data clustering. In 4th IEEE International Conference on Data Mining (ICDM), Brighton, UK, pages 75–82, 2004.
D. Gondek and T. Hofmann. Non-redundant clustering with conditional ensembles. In 11th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), Chicago, IL, pages 70–77, 2005.
S. Günnemann, E. Müller, I. Färber, and T. Seidl. Detection of orthogonal concepts in subspaces of high dimensional data. In 18th ACM Conference on Information and Knowledge Management (CIKM), Hong Kong, China, pages 1317–1326, 2009.
S. Günnemann, I. Färber, E. Müller, and T. Seidl. ASCLU: alternative subspace clustering. In MultiClust: 1st International Workshop on Discovering, Summarizing and Using Multiple Clusterings Held in Conjunction with KDD 2010, Washington, DC, 2010.
S. Günnemann, I. Färber, E. Müller, I. Assent, and T. Seidl. External evaluation measures for subspace clustering. In 20th ACM Conference on Information and Knowledge Management (CIKM), Glasgow, UK, pages 1363–1372, 2011.
J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. ACM SIGMOD Record, 29(2):1–12, 2000.
J. Han, M. Kamber, and J. Pei. Data Mining: Concepts and Techniques. Morgan Kaufmann, 3rd edition, 2011.
J. A. Hartigan. Clustering Algorithms. John Wiley & Sons, New York, London, Sydney, Toronto, 1975.
A. Hinneburg and D. A. Keim. An efficient approach to clustering in large multimedia databases with noise. In 4th ACM International Conference on Knowledge Discovery and Data Mining (KDD), New York City, NY, pages 58–65, 1998.
A. Hinneburg, C. C. Aggarwal, and D. A. Keim. What is the nearest neighbor in high dimensional spaces? In 26th International Conference on Very Large Data Bases (VLDB), Cairo, Egypt, pages 506–515, 2000.
M. E. Houle, H.-P. Kriegel, P. Kröger, E. Schubert, and A. Zimek. Can shared-neighbor distances defeat the curse of dimensionality? In 22nd International Conference on Scientific and Statistical Database Management (SSDBM), Heidelberg, Germany, pages 482–500, 2010.
A. K. Jain. Data clustering: 50 years beyond k-means. Pattern Recognition Letters, 31(8):651–666, 2010.
A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs, 1988.
A. K. Jain, M. N. Murty, and P. J. Flynn. Data clustering: A review. ACM Computing Surveys, 31(3):264–323, 1999.
P. Jain, R. Meka, and I. S. Dhillon. Simultaneous unsupervised learning of disparate clusterings. Statistical Analysis and Data Mining, 1(3):195–210, 2008.
K. Kailing, H.-P. Kriegel, P. Kröger, and S. Wanka. Ranking interesting subspaces for clustering high dimensional data. In 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), Cavtat-Dubrovnik, Croatia, pages 241–252, 2003.
K. Kailing, H.-P. Kriegel, and P. Kröger. Density-connected subspace clustering for high-dimensional data. In 4th SIAM International Conference on Data Mining (SDM), Lake Buena Vista, FL, pages 246–257, 2004.
L. Kaufman and P. J. Rousseeuw. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, 1990.
H.-P. Kriegel, P. Kröger, and A. Zimek. Clustering high dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering. ACM Transactions on Knowledge Discovery from Data (TKDD), 3(1):1–58, 2009.
H.-P. Kriegel, P. Kröger, J. Sander, and A. Zimek. Density-based clustering. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(3):231–240, 2011.
H.-P. Kriegel, P. Kröger, and A. Zimek. Subspace clustering. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2(4):351–364, 2012.
P. Kröger and A. Zimek. Subspace clustering techniques. In L. Liu and M. T. Özsu, editors, Encyclopedia of Database Systems, pages 2873–2875. Springer, 2009.
G. Liu, J. Li, K. Sim, and L. Wong. Distance based subspace clustering with flexible dimension partitioning. In 23rd International Conference on Data Engineering (ICDE), Istanbul, Turkey, pages 1250–1254, 2007.
S. P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–136, 1982.
J. MacQueen. Some methods for classification and analysis of multivariate observations. In 5th Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281–297, 1967.
M. Mampaey, N. Tatti, and J. Vreeken. Tell me what I need to know: Succinctly summarizing data with itemsets. In 17th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), San Diego, CA, pages 573–581, 2011.
G. Moise, J. Sander, and M. Ester. P3C: A robust projected clustering algorithm. In 6th IEEE International Conference on Data Mining (ICDM), Hong Kong, China, pages 414–425, 2006.
G. Moise, J. Sander, and M. Ester. Robust projected clustering. Knowledge and Information Systems (KAIS), 14(3):273–298, 2008.
G. Moise, A. Zimek, P. Kröger, H.-P. Kriegel, and J. Sander. Subspace and projected clustering: Experimental evaluation and analysis. Knowledge and Information Systems (KAIS), 21(3):299–326, 2009.
E. Müller, I. Assent, S. Günnemann, R. Krieger, and T. Seidl. Relevant subspace clustering: Mining the most interesting non-redundant concepts in high dimensional data. In 9th IEEE International Conference on Data Mining (ICDM), Miami, FL, pages 377–386, 2009.
E. Müller, I. Assent, R. Krieger, S. Günnemann, and T. Seidl. DensEst: density estimation for data mining in high dimensional spaces. In 9th SIAM International Conference on Data Mining (SDM), Sparks, NV, pages 173–184, 2009.
E. Müller, S. Günnemann, I. Assent, and T. Seidl. Evaluating clustering in subspace projections of high dimensional data. In 35th International Conference on Very Large Data Bases (VLDB), Lyon, France, pages 1270–1281, 2009.
E. Müller, I. Assent, S. Günnemann, and T. Seidl. Scalable density-based subspace clustering. In 20th ACM Conference on Information and Knowledge Management (CIKM), Glasgow, UK, pages 1077–1086, 2011.
H. S. Nagesh, S. Goil, and A. Choudhary. Adaptive grids for clustering massive data sets. In 1st SIAM International Conference on Data Mining (SDM), Chicago, IL, 2001.
H. V. Nguyen, E. Müller, J. Vreeken, F. Keller, and K. Böhm. CMI: an information-theoretic contrast measure for enhancing subspace cluster and outlier detection. In 13th SIAM International Conference on Data Mining (SDM), Austin, TX, pages 198–206, 2013.
L. Parsons, E. Haque, and H. Liu. Subspace clustering for high dimensional data: A review. ACM SIGKDD Explorations, 6(1):90–105, 2004.
N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for association rules. In 7th International Conference on Database Theory (ICDT), Jerusalem, Israel, pages 398–416, 1999.
J. Pei, X. Zhang, M. Cho, H. Wang, and P. S. Yu. MaPle: A fast algorithm for maximal pattern-based clustering. In 3rd IEEE International Conference on Data Mining (ICDM), Melbourne, FL, pages 259–266, 2003.
J. M. Phillips, P. Raman, and S. Venkatasubramanian. Generating a diverse set of high-quality clusterings. In 2nd MultiClust Workshop: Discovering, Summarizing and Using Multiple Clusterings Held in Conjunction with ECML PKDD 2011, Athens, Greece, pages 80–91, 2011.
C. M. Procopiuc, M. Jones, P. K. Agarwal, and T. M. Murali. A Monte Carlo algorithm for fast projective clustering. In ACM International Conference on Management of Data (SIGMOD), Madison, WI, pages 418–427, 2002.
Z. J. Qi and I. Davidson. A principled and flexible framework for finding alternative clusterings. In 15th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), Paris, France, pages 717–726, 2009.
C. E. Shannon and W. Weaver. The Mathematical Theory of Communication. University of Illinois Press, 1949.
K. Sim, V. Gopalkrishnan, A. Zimek, and G. Cong. A survey on enhanced subspace clustering. Data Mining and Knowledge Discovery, 26(2):332–397, 2013.
P. H. A. Sneath. The application of computers to taxonomy. Journal of General Microbiology, 17:201–226, 1957.
M. Verleysen and D. François. The curse of dimensionality in data mining and time series prediction. In 8th International Work-Conference on Artificial Neural Networks (IWANN), Barcelona, Spain, pages 758–770, 2005.
D. Wishart. Mode analysis: A generalization of nearest neighbor which reduces chaining effects. In A. J. Cole, editor, Numerical Taxonomy, pages 282–311, 1969.
X. Yan, H. Cheng, J. Han, and D. Xin. Summarizing itemset patterns: a profile-based approach. In 11th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), Chicago, IL, pages 314–323, 2005.
M. L. Yiu and N. Mamoulis. Frequent-pattern based iterative projected clustering. In 3rd IEEE International Conference on Data Mining (ICDM), Melbourne, FL, pages 689–692, 2003.
M. L. Yiu and N. Mamoulis. Iterative projected clustering by subspace mining. IEEE Transactions on Knowledge and Data Engineering, 17(2):176–189, 2005.
M. J. Zaki, M. Peters, I. Assent, and T. Seidl. CLICKS: an effective algorithm for mining subspace clusters in categorical datasets. Data & Knowledge Engineering, 60(1):51–70, 2007.
F. Zhu, X. Yan, J. Han, P. S. Yu, and H. Cheng. Mining colossal frequent patterns by core pattern fusion. In 23rd International Conference on Data Engineering (ICDE), Istanbul, Turkey, pages 706–715, 2007.
A. Zimek. Clustering high-dimensional data. In C. C. Aggarwal and C. K. Reddy, editors, Data Clustering: Algorithms and Applications, chapter 9, pages 201–230. CRC Press, 2013.
A. Zimek and J. Vreeken. The blind men and the elephant: On meeting the problem of multiple truths in data from clustering and pattern mining perspectives. Machine Learning, 2013.
A. Zimek, E. Schubert, and H.-P. Kriegel. A survey on unsupervised outlier detection in high-dimensional numerical data. Statistical Analysis and Data Mining, 5(5):363–387, 2012.