Generative Models for Labeling Multi-object Configurations in Images

  • Yali Amit
  • Alain Trouvé
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4170)


We propose a generative approach to the problem of labeling images containing configurations of objects from multiple classes. The main building blocks are dense statistical models for individual objects. The models assume that binary oriented edge variables are conditionally independent given a hidden instantiation parameter, which also determines the object support. These models are then composed to form models for object configurations with various interactions, including occlusion. Choosing the optimal configuration is entirely likelihood-based, and no decision boundaries need to be pre-learned. Training involves estimating model parameters for each class separately. Both training and classification involve estimating hidden pose variables, which can be computationally intensive. We describe two levels of approximation that facilitate these computations: the Patchwork of Parts (POP) model and the coarse part-based model (CPM). A concrete implementation of the approach is illustrated on the problem of reading zip codes.
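The likelihood-based labeling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a 1-D binary edge array, a single object per image, an integer shift standing in for the hidden instantiation parameter, and a constant background edge probability off the object support. The names `object_loglik`, `classify`, `templates`, and `p_bg` are hypothetical, and occlusion, multi-object composition, POP, and CPM are all omitted.

```python
import math


def bernoulli_loglik(x, p):
    """Log-probability of one binary variable x when P(x = 1) = p."""
    return math.log(p) if x == 1 else math.log(1.0 - p)


def object_loglik(edges, template, shift, p_bg=0.05):
    """Log-likelihood of a binary edge array given one object placed at `shift`.

    `template` holds the edge-on probabilities over the object support.
    Pixels outside the support use the assumed background probability `p_bg`.
    Edge variables are treated as independent given the shift.
    """
    total = 0.0
    for i, x in enumerate(edges):
        j = i - shift
        p = template[j] if 0 <= j < len(template) else p_bg
        total += bernoulli_loglik(x, p)
    return total


def classify(edges, templates, p_bg=0.05):
    """Return (label, shift, loglik) maximizing the likelihood over classes and shifts."""
    best = None
    for label, template in templates.items():
        for shift in range(len(edges) - len(template) + 1):
            ll = object_loglik(edges, template, shift, p_bg)
            if best is None or ll > best[2]:
                best = (label, shift, ll)
    return best


# Hypothetical per-class probability maps, each estimated separately in training.
templates = {
    "A": [0.9, 0.1, 0.9],
    "B": [0.9, 0.9, 0.1],
}
edges = [0, 0, 1, 0, 1, 0]
label, shift, ll = classify(edges, templates)
print(label, shift, round(ll, 3))
```

No decision boundary is learned here: each class contributes only its own probability map, and the label is whichever class and shift give the highest likelihood for the observed edges.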


Keywords: Object Class; Edge Type; Coarse Model; Reference Grid; Bernoulli Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yali Amit (1)
  • Alain Trouvé (2)
  1. Department of Statistics, University of Chicago, Chicago, USA
  2. CMLA, ENS-Cachan, Cachan cedex, France
