Abstract
We describe a new method for unsupervised structure learning of a hierarchical compositional model (HCM) for deformable objects. The learning is unsupervised in the sense that we are given a training dataset of images containing the object in cluttered backgrounds, but we do not know the position or boundary of the object. The structure learning is performed by a bottom-up process and a top-down process. The bottom-up process is a novel form of hierarchical clustering which recursively composes proposals for simple structures to generate proposals for more complex structures. We combine standard clustering with the suspicious coincidence principle and the competitive exclusion principle to prune the proposals to a practical number and avoid an exponential explosion of possible structures. The hierarchical clustering stops automatically when it fails to generate new proposals, and it outputs a proposal for the object model. The top-down process validates the proposals and fills in missing elements. We tested our approach by using it to learn a hierarchical compositional model for parsing and segmenting horses on the Weizmann dataset. We show that the resulting model is comparable with (or better than) alternative methods. The versatility of our approach is demonstrated by learning models for other objects (e.g., faces, pianos, butterflies, and monitors). It is worth noting that the lower levels of the object hierarchies automatically learn generic image features, while the higher levels learn object-specific features.
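The bottom-up stage can be illustrated with a toy sketch. It is not the authors' implementation: images are reduced to sets of symbolic primitives, the suspicious coincidence test is approximated by a co-occurrence ratio against an independence baseline, and competitive exclusion is approximated by discarding any proposal that shares a primitive with a higher-scoring one. The function names (`support`, `compose_level`, `learn_hierarchy`), the thresholds (`min_ratio`, `max_keep`), and the example data are all hypothetical. The top-down validation step is omitted.

```python
from itertools import combinations


def support(part, images):
    """Fraction of images containing every primitive in `part`."""
    return sum(part <= image for image in images) / len(images)


def compose_level(parts, images, min_ratio=1.2, max_keep=5):
    """Compose pairs of parts into larger proposals, then prune them."""
    scored = {}
    for a, b in combinations(parts, 2):
        candidate = a | b
        if len(candidate) <= max(len(a), len(b)):
            continue  # the union adds nothing new
        joint = support(candidate, images)
        expected = support(a, images) * support(b, images)
        # Suspicious coincidence (toy version): keep a composition only if
        # its parts co-occur more often than independence would predict.
        if joint > 0 and joint / expected >= min_ratio:
            ratio = joint / expected
            scored[candidate] = max(scored.get(candidate, 0.0), ratio)

    # Competitive exclusion (toy version): visit proposals from best to
    # worst and drop any that overlaps an already accepted proposal.
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    kept = []
    for candidate, _ in ranked:
        if all(candidate.isdisjoint(other) for other in kept):
            kept.append(candidate)
    return kept[:max_keep]


def learn_hierarchy(images, **kwargs):
    """Recursively compose proposals until no new ones are generated."""
    primitives = sorted(set().union(*images))
    levels = [[frozenset([p]) for p in primitives]]
    while True:
        proposals = compose_level(levels[-1], images, **kwargs)
        if not proposals:
            break  # automatic stopping: nothing new was proposed
        levels.append(proposals)
    return levels


# Hypothetical data: four object primitives that always appear together,
# plus background clutter ("tree", "car") that appears independently.
images = [
    {"head", "neck", "body", "legs", "tree"},
    {"head", "neck", "body", "legs", "car"},
    {"head", "neck", "body", "legs"},
    {"tree", "car"},
    {"tree"},
    {"car"},
]

levels = learn_hierarchy(images)
for depth, parts in enumerate(levels):
    print(depth, [sorted(p) for p in parts])
```

In this toy setting the clutter primitives never enter a composition, because their co-occurrence with the object primitives falls below the independence baseline, and the recursion ends once a single top-level proposal remains.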
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhu, L., Lin, C., Huang, H., Chen, Y., Yuille, A. (2008). Unsupervised Structure Learning: Hierarchical Recursive Composition, Suspicious Coincidence and Competitive Exclusion. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5303. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88688-4_56
DOI: https://doi.org/10.1007/978-3-540-88688-4_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88685-3
Online ISBN: 978-3-540-88688-4
eBook Packages: Computer Science, Computer Science (R0)