Commonality Preserving Multiple Instance Clustering Based on Diverse Density
Image-set clustering is the problem of decomposing a given image set into disjoint subsets that satisfy specified criteria. For single-vector image representations, a proximity or similarity criterion is widely applied, i.e., proximal or similar images form a cluster. The recent trend in image description, however, is local-feature based: an image is described by multiple local features, e.g., SIFT, SURF, and so on. Under this description, which criterion should be employed for clustering? As an answer to this question, this paper presents an image-set clustering method based on commonality, that is, images sharing strong commonality (coherent local features) form a cluster. Under this criterion, image variations that do not affect the common features are harmless; in the case of face images, for example, hair-style changes and partial occlusions by glasses may not affect cluster formation. We define four commonality measures based on Diverse Density, which are used in agglomerative clustering. Through comparative experiments, we confirmed that two of our methods perform better than the other methods examined.
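The idea above can be sketched in code. The following is a minimal, illustrative sketch only, not the paper's actual method: it assumes the classic noisy-or Diverse Density model with a Gaussian kernel, takes one simple commonality measure (the best DD score over candidate concept points drawn from the instances themselves), and greedily merges the pair of clusters with the highest commonality. All function names are hypothetical, and the paper's four commonality measures differ in detail.

```python
import numpy as np
from itertools import combinations

def bag_prob(bag, t):
    # Noisy-or over the instances in one bag: the probability that the
    # bag contains the concept point t, with a Gaussian kernel on distance.
    d2 = np.sum((np.asarray(bag) - t) ** 2, axis=1)
    return 1.0 - np.prod(1.0 - np.exp(-d2))

def diverse_density(bags, t):
    # Treat every bag (image) as positive: DD is the product of the
    # per-bag noisy-or probabilities at candidate concept point t.
    return float(np.prod([bag_prob(b, t) for b in bags]))

def commonality(bags):
    # One simple commonality measure (an assumption, not the paper's):
    # the best DD over candidate concept points taken from all instances.
    candidates = np.vstack([np.asarray(b) for b in bags])
    return max(diverse_density(bags, t) for t in candidates)

def agglomerative_commonality(bags, n_clusters):
    # Greedy agglomerative clustering: repeatedly merge the pair of
    # clusters whose union preserves the strongest commonality.
    clusters = [[i] for i in range(len(bags))]
    while len(clusters) > n_clusters:
        best_score, best_pair = -1.0, None
        for a, b in combinations(range(len(clusters)), 2):
            merged = [bags[i] for i in clusters[a] + clusters[b]]
            score = commonality(merged)
            if score > best_score:
                best_score, best_pair = score, (a, b)
        a, b = best_pair
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters

# Toy example: bags 0-1 share a local feature near the origin, bags 2-3
# near (10, 10); the unshared "occlusion" features do not disturb merging.
bags = [
    [[0.0, 0.0], [5.0, -5.0]],
    [[0.1, 0.0], [-5.0, 5.0]],
    [[10.0, 10.0], [3.0, 7.0]],
    [[10.1, 10.0], [-7.0, 3.0]],
]
result = agglomerative_commonality(bags, 2)
```

Note how the "occlusion" instances in each bag are harmless: the noisy-or form only rewards bags that contain *some* instance close to the shared concept point, which mirrors the commonality criterion described in the abstract.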
Keywords: Feature Space · Face Image · Hausdorff Distance · Commonality Measure · Normalized Mutual Information
This work was supported by “R&D Program for Implementation of Anti-Crime and Anti-Terrorism Technologies for a Safe and Secure Society”, Funds for integrated promotion of social system reform and research and development of the Ministry of Education, Culture, Sports, Science and Technology, the Japanese Government.