Commonality Preserving Multiple Instance Clustering Based on Diverse Density

  • Takayuki Fukui
  • Toshikazu Wada
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9010)


Image-set clustering is the problem of decomposing a given image set into disjoint subsets that satisfy specified criteria. For single-vector image representations, a proximity or similarity criterion is widely applied, i.e., proximal or similar images form a cluster. The recent trend in image description, however, is local-feature based: an image is described by multiple local features such as SIFT and SURF. Which criterion should be employed for clustering under this description? As an answer to this question, this paper presents an image-set clustering method based on commonality, that is, images preserving strong commonality (coherent local features) form a cluster. Under this criterion, image variations that do not affect the common features are harmless. In the case of face images, for example, hair-style changes and partial occlusions by glasses may not affect cluster formation. We define four commonality measures based on Diverse Density, which are used in agglomerative clustering. Through comparative experiments, we confirmed that two of our methods perform better than the other methods examined in the experiments.
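The abstract does not spell out the four commonality measures, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each image is a "bag" of local feature vectors, scores a candidate point with the standard noisy-or form of Diverse Density, and merges clusters greedily by the highest score. The names `bag_prob`, `commonality`, `agglomerate`, the `scale` parameter, and the geometric-mean normalization are all hypothetical choices made for illustration.

```python
import numpy as np


def bag_prob(bag, t, scale=1.0):
    """Noisy-or probability that a bag (n x d array) has an instance near t."""
    sq_dist = np.sum((bag - t) ** 2, axis=1)
    return 1.0 - np.prod(1.0 - np.exp(-sq_dist / scale))


def commonality(bags, scale=1.0):
    """Hypothetical commonality score (an assumption, not the paper's measure):
    the best geometric-mean Diverse Density over all candidate points,
    where every instance in the bags is tried as a candidate."""
    best = 0.0
    for t in np.vstack(bags):
        probs = np.array([bag_prob(b, t, scale) for b in bags])
        # The geometric mean keeps scores comparable across cluster sizes;
        # the small constant avoids log(0).
        score = float(np.exp(np.mean(np.log(probs + 1e-12))))
        best = max(best, score)
    return best


def agglomerate(bags, n_clusters, scale=1.0):
    """Greedy agglomerative clustering: repeatedly merge the pair of
    clusters whose union has the highest commonality score."""
    clusters = [[i] for i in range(len(bags))]
    while len(clusters) > n_clusters:
        best_pair, best_score = None, -1.0
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members = [bags[i] for i in clusters[a] + clusters[b]]
                score = commonality(members, scale)
                if score > best_score:
                    best_pair, best_score = (a, b), score
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy data: images 0 and 1 share a feature near (0, 0); images 2 and 3
# share one near (10, 10). The second row of each bag is an unrelated
# distractor feature, standing in for a variation such as a hair-style change.
toy_bags = [
    np.array([[0.0, 0.0], [50.0, -50.0]]),
    np.array([[0.1, 0.0], [-60.0, 70.0]]),
    np.array([[10.0, 10.0], [80.0, 5.0]]),
    np.array([[10.1, 10.0], [-90.0, -20.0]]),
]
print(agglomerate(toy_bags, n_clusters=2))
```

In this toy setting the distractor features differ in every image, yet they do not lower the score of a shared feature, which is the sense in which variations that leave the common features intact are harmless. Real local descriptors such as SIFT or SURF would need a `scale` matched to their distance range, and the exhaustive candidate search would likely need to be replaced by something cheaper.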


Feature Space, Face Image, Hausdorff Distance, Commonality Measure, Normalize Mutual Information
These keywords were added by machine and not by the authors.



This work was supported by “R&D Program for Implementation of Anti-Crime and Anti-Terrorism Technologies for a Safe and Secure Society”, Funds for integrated promotion of social system reform and research and development of the Ministry of Education, Culture, Sports, Science and Technology, the Japanese Government.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Graduate School of Systems Engineering, Wakayama University, Wakayama, Japan
