Cluster Analysis of Face Images and Literature Data by Evolutionary Distance Metric Learning
Evolutionary distance metric learning (EDML) is an efficient technique for solving clustering problems when some background knowledge is available. However, EDML has not previously been applied to real-world data. We therefore demonstrate EDML for cluster analysis and visualization in two applications: a face recognition image dataset and a literature dataset. For facial image clustering, we demonstrate an improvement in the cluster validity index and analyze the distributions of classes (ages) visualized by a self-organizing map and by K-means clustering with a K-nearest-neighbor centroid graph. For the literature dataset, we analyze the topics (i.e., clusters of articles) that are most likely to win the best paper award. Applying EDML to these datasets yielded qualitatively promising visualization results that demonstrate the practicality and effectiveness of EDML.
Keywords: Class Label, Face Image, Latent Dirichlet Allocation, Minority Class, Paper Award
This work was partially supported by the Kayamori Foundation of Informational Science Advancement, and by the cooperative research program of “Network Joint Research Center for Materials and Devices”.