A Shape Reconstructability Measure of Object Part Importance with Applications to Object Detection and Localization

Abstract

We propose a computational model that computes the importance of 2-D object shape parts, and we apply it to detect and localize objects with and without occlusions. The importance of a shape part (a localized contour fragment) is considered from the perspective of its contribution to the perception and recognition of the global shape of the object. Accordingly, the part importance measure is defined by the ability to estimate (recall) the global shape of an object from the local part, namely the part’s “shape reconstructability”. More precisely, the shape reconstructability of a part is determined by two factors: part variation and part uniqueness. (i) Part variation measures the precision of the global shape reconstruction, i.e. the consistency of the reconstructed global shape with the true object shape; and (ii) part uniqueness quantifies the ambiguity of matching the part to the object, i.e. it takes into account that the part could be matched to the object at several different locations. Taking both factors into consideration, we propose an information-theoretic formulation that measures part importance by the conditional entropy of the reconstruction of the object shape from the part. Experimental results demonstrate the benefit of the proposed part importance measure in object detection, including improvements in detection rate, localization accuracy, and detection efficiency. In a comparison with other state-of-the-art object detectors in a challenging but common scenario, object detection with occlusions, the proposed importance measure yields a considerable improvement, with the detection rate increased by over 10 %. On a subset of the challenging PASCAL dataset, the Interpolated Average Precision (as used in the PASCAL VOC challenge) is improved by 4–8 %. Moreover, we perform a psychological experiment that provides evidence suggesting that humans use a similar measure for part importance when perceiving and recognizing shapes.
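To make the conditional-entropy idea concrete, the following is a minimal toy sketch rather than the formulation used in the article. It assumes a discrete set of candidate global shapes and a discrete set of match locations, and it scores a part by the joint entropy of the (location, shape) hypothesis given the part, using the chain rule H(K, S | part) = H(K | part) + Σ_k P(k | part) H(S | k, part). The function names, the decomposition, and all probabilities below are hypothetical and chosen only for illustration.

```python
import math


def entropy_bits(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return sum(-p * math.log2(p) for p in probs if p > 0)


def reconstruction_entropy(posteriors, match_weights):
    """Toy entropy of reconstructing the global shape from one part.

    posteriors[k]    : distribution over candidate global shapes when the
                       part is matched at location k (a wide spread plays
                       the role of high part variation).
    match_weights[k] : probability that location k is the correct match
                       (several plausible locations play the role of low
                       part uniqueness).

    Returns H(K) + sum_k P(k) * H(S | k), in bits. Lower values mean the
    part pins down the global shape better. This decomposition is an
    illustrative assumption, not the article's definition.
    """
    location_term = entropy_bits(match_weights)
    shape_term = sum(w * entropy_bits(post)
                     for w, post in zip(match_weights, posteriors))
    return location_term + shape_term


# A distinctive part: one match location, sharply peaked reconstruction.
distinctive = reconstruction_entropy([[0.97, 0.01, 0.01, 0.01]], [1.0])

# An ambiguous part: two equally likely locations, uninformative reconstruction.
ambiguous = reconstruction_entropy([[0.25] * 4, [0.25] * 4], [0.5, 0.5])

print(f"distinctive part: {distinctive:.3f} bits")
print(f"ambiguous part:   {ambiguous:.3f} bits")
```

In this toy setting the ambiguous part scores 1 bit for the location ambiguity plus 2 bits for the shape ambiguity, while the distinctive part scores well below 1 bit, so a detector that ranks parts by low entropy would prefer the distinctive one.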

Acknowledgments

We gratefully acknowledge the support of the following research grants: 973-2011CBA00400, NSFC-61272027, NSFC-61121002, NSFC-61231010, NSFC-61210005, NSFC-61103087, NSFC-31230029, and Office of Naval Research N00014-12-1-0883.

Author information

Corresponding author

Correspondence to Yizhou Wang.

Additional information

Communicated by M. Hebert.

About this article

Cite this article

Guo, G., Wang, Y., Jiang, T. et al. A Shape Reconstructability Measure of Object Part Importance with Applications to Object Detection and Localization. Int J Comput Vis 108, 241–258 (2014). https://doi.org/10.1007/s11263-014-0705-9

  • DOI: https://doi.org/10.1007/s11263-014-0705-9

Keywords

  • Shape part
  • Part importance
  • Shape reconstruction
  • Object recognition and detection