Machine Learning, Volume 98, Issue 1–2, pp 31–56

Subjectively interesting alternative clusterings

Abstract

We apply a recently proposed framework for mining subjectively interesting patterns from data to the problem of alternative clustering, where patterns are sets of clusters (clusterings) in the data. This framework outlines how the subjective interestingness of patterns (here, clusterings) can be quantified using sound information-theoretic concepts. We demonstrate how it motivates a new objective function quantifying the interestingness of a clustering, automatically accounting for a user's prior beliefs and for redundancies between the discovered patterns.
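In the framework referred to here, a pattern's subjective interestingness is measured against a maximum-entropy "background" model that encodes the user's prior beliefs (cf. the "Maximum entropy modelling" keyword). The sketch below is an illustration only, not the paper's clustering model: assuming a binary data matrix and prior beliefs about row and column densities, it fits the corresponding max-entropy Bernoulli model by simple gradient steps and scores a tile pattern by its self-information under that model. All function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_maxent(row_sums, col_sums, iters=2000, eta=1.0):
    """Fit the max-entropy distribution over binary matrices whose expected
    row/column sums match the user's prior beliefs. It factorises as
    independent Bernoullis with P_ij = sigmoid(r_i + c_j)."""
    m, n = len(row_sums), len(col_sums)
    r, c = np.zeros(m), np.zeros(n)
    for _ in range(iters):
        P = sigmoid(r[:, None] + c[None, :])
        r += eta * (row_sums - P.sum(1)) / n   # push expected row sums to target
        P = sigmoid(r[:, None] + c[None, :])
        c += eta * (col_sums - P.sum(0)) / m   # push expected column sums to target
    return sigmoid(r[:, None] + c[None, :])

def self_information(P, rows, cols):
    """Surrogate interestingness of a tile (rows x cols of ones): -log2 of
    its probability under the fitted background model. The less the model
    expects the tile, the more subjectively interesting it is."""
    return -np.log2(P[np.ix_(rows, cols)]).sum()
```

Patterns the background model already considers likely get low scores, so anything the user effectively already believes is automatically uninteresting.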

Directly searching for the optimal set of clusterings defined in this way is hard. However, the optimization problem can be solved approximately if clusterings are generated iteratively. In this iterative scheme, each subsequent clustering is maximally interesting given the whole set of previously generated clusterings, automatically trading off interestingness with non-redundancy. Although generating each successive clustering is itself computationally hard, we develop an approximation technique similar to spectral clustering algorithms.
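The iterative, spectral-style scheme can be sketched as follows. This is a hedged stand-in, not the paper's actual algorithm: it enforces non-redundancy by projecting the span of earlier cluster indicator vectors out of a normalised affinity matrix before each new spectral clustering step (in the spirit of orthogonalisation-based alternative clustering); all names are hypothetical.

```python
import numpy as np

def rbf_affinity(X, sigma=1.0):
    # Gaussian (RBF) similarity between all pairs of points
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kmeans(Y, k, iters=50):
    # Plain k-means with deterministic farthest-point initialisation
    centers = [Y[0]]
    for _ in range(k - 1):
        d = ((Y[:, None] - np.array(centers)[None]) ** 2).sum(-1).min(1)
        centers.append(Y[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        labels = ((Y[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = Y[labels == j].mean(0)
    return labels

def alternative_clusterings(X, ks, sigma=1.0):
    """Generate one clustering per entry of ks; before each new one, the
    subspace spanned by earlier cluster indicators is projected out of the
    affinity matrix, forcing non-redundancy with all predecessors. Note
    that ks may contain different cluster counts per round."""
    n = len(X)
    A = rbf_affinity(X, sigma)
    d = A.sum(1)
    L = A / np.sqrt(np.outer(d, d))        # normalised affinity
    P = np.zeros((n, 0))                   # orthonormal basis of past indicators
    clusterings = []
    for k in ks:
        Q = np.eye(n) - P @ P.T            # projector away from past clusterings
        _, V = np.linalg.eigh(Q @ L @ Q)   # eigenvalues ascending
        labels = kmeans(V[:, -k:], k)      # k-means on the top-k eigenvectors
        clusterings.append(labels)
        H = np.eye(k)[labels]              # one-hot cluster indicators
        H = H[:, H.sum(0) > 0]             # drop empty clusters
        P, _ = np.linalg.qr(np.hstack([P, H / np.sqrt(H.sum(0))]))
    return clusterings
```

Because the deflation only grows the projected-out subspace, each round automatically trades interestingness (eigenvector quality) against redundancy with everything found so far.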

Our method can generate as many clusterings as the user requires. Subjective evaluation or the value of the objective function can guide the termination of the process. In addition, our method allows varying the number of clusters in each successive clustering.

Experiments on artificial and real-world datasets show that the mined clusterings fulfill the requirements of a good clustering solution by being both non-redundant and of high compactness. Comparison with existing solutions shows that our approach compares favourably with regard to well-known objective measures of similarity and quality of clusterings, even though it is not designed to directly optimize them.

Keywords

Subjective interestingness · Maximum entropy modelling · Alternative clustering


Copyright information

© The Author(s) 2013

Authors and Affiliations

Intelligent Systems Laboratory, University of Bristol, Bristol, UK
