
POP: Patchwork of Parts Models for Object Recognition

International Journal of Computer Vision

Abstract

We formulate a deformable template model for objects with an efficient mechanism for computation and parameter estimation. The data consist of binary oriented edge features, which are robust to photometric variation and small local deformations. The template is defined in terms of probability arrays for each edge type. A primary contribution of this paper is the definition of the instantiation of an object in terms of shifts of a moderate number of local submodels (parts), which are subsequently recombined using a patchwork operation to define a coherent statistical model of the data. Object classes are modeled as mixtures of patchwork of parts (POP) models that are discovered sequentially as more class data is observed. We define the notion of the support associated with an instantiation, and use this to formulate statistical models for multi-object configurations including possible occlusions. All decisions on the labeling of the objects in the image are based on comparing likelihoods. The combination of a deformable model with an efficient estimation procedure yields competitive results in a variety of applications with very small training sets, without the need to train decision boundaries: only data from the class being trained is used. Experiments are presented on the MNIST database, reading zipcodes, and face detection.
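The abstract does not spell out the patchwork operation, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a single edge type, that overlapping shifted parts are combined by averaging their edge probabilities at each pixel, that uncovered pixels fall back to a fixed background probability, and that pixels are conditionally independent. The names `patchwork`, `log_likelihood`, `classify`, and the parameter `p_bg` are hypothetical.

```python
import numpy as np

def patchwork(parts, shifts, shape, p_bg=0.05):
    """Combine shifted part probability arrays into one probability map.

    parts:  list of 2-D arrays of edge-on probabilities, one per part
    shifts: list of (row, col) top-left placements, one per part
    shape:  (rows, cols) of the full template
    p_bg:   assumed background edge probability where no part covers

    Assumption: overlapping parts are averaged; every shifted part is
    assumed to lie fully inside the template.
    """
    total = np.zeros(shape)
    count = np.zeros(shape)
    for part, (r, c) in zip(parts, shifts):
        h, w = part.shape
        total[r:r + h, c:c + w] += part
        count[r:r + h, c:c + w] += 1
    covered = count > 0
    probs = np.full(shape, p_bg)
    probs[covered] = total[covered] / count[covered]
    return probs

def log_likelihood(edges, probs):
    """Bernoulli log-likelihood of a binary edge map, pixels independent."""
    p = np.clip(probs, 1e-6, 1 - 1e-6)
    return float(np.sum(edges * np.log(p) + (1 - edges) * np.log(1 - p)))

def classify(edges, models):
    """Return the label whose probability map gives the highest likelihood."""
    scores = {label: log_likelihood(edges, probs)
              for label, probs in models.items()}
    return max(scores, key=scores.get)
```

In this sketch, labeling reduces to building one probability map per candidate class and picking the class with the largest log-likelihood, which mirrors the abstract's statement that all decisions are based on comparing likelihoods. A fuller version would repeat the map for each edge orientation and search over part shifts.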



Author information

Corresponding author

Correspondence to Yali Amit.

About this article

Cite this article

Amit, Y., Trouvé, A. POP: Patchwork of Parts Models for Object Recognition. Int J Comput Vis 75, 267–282 (2007). https://doi.org/10.1007/s11263-006-0033-9

  • DOI: https://doi.org/10.1007/s11263-006-0033-9
