Learning to Locate Informative Features for Visual Identification

International Journal of Computer Vision

Abstract

Object identification is a specialized type of recognition in which the category (e.g. cars) is known and the goal is to recognize an object’s exact identity (e.g. Bob’s BMW). Two special challenges characterize object identification. First, inter-object variation is often small (many cars look alike) and may be dwarfed by illumination or pose changes. Second, there may be many different instances of the category but only a few, or just one, positive “training” example per object instance. Because variation among object instances may be small, a solution must locate possibly subtle object-specific salient features, such as a door handle, while avoiding distracting ones such as specular highlights. With just one training example per object instance, however, standard modeling and feature selection techniques cannot be used. We describe an on-line algorithm that takes one image from a known category and builds an efficient “same” versus “different” classification cascade by predicting the most discriminative features for that object instance. Our method not only estimates the saliency and scoring function for each candidate feature, but also models the dependency between features, building an ordered sequence of discriminative features specific to the given image. Learned stopping thresholds make the identifier very efficient. To make this possible, category-specific characteristics are learned automatically in an off-line training procedure from labeled image pairs of the category. Our method, using the same algorithm for both cars and faces, outperforms a wide variety of other methods.
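
The abstract does not spell out the scoring function or the form of the stopping thresholds, so the following is only a minimal sketch of how an ordered “same” versus “different” cascade with early stopping might be evaluated. It assumes each feature contributes an additive score (positive values favor “same”) and that each stage has one lower and one upper threshold on the running total. The function name `run_cascade` and its parameters are hypothetical and do not come from the paper.

```python
from typing import Iterable, Sequence, Tuple


def run_cascade(
    feature_scores: Iterable[float],
    reject_below: Sequence[float],
    accept_above: Sequence[float],
) -> Tuple[str, int]:
    """Evaluate features in order and stop once the running score is decisive.

    Assumptions (not from the paper): scores are additive, positive values
    favor "same", and there is one (lower, upper) threshold pair per stage.
    Returns (decision, number_of_features_evaluated).
    """
    total = 0.0
    used = 0
    # feature_scores may be a generator, so later features are never
    # computed once an earlier stage has triggered a stop.
    for score, lower, upper in zip(feature_scores, reject_below, accept_above):
        total += score
        used += 1
        if total <= lower:
            return "different", used
        if total >= upper:
            return "same", used
    # No threshold was crossed: fall back to the sign of the running total.
    return ("same" if total > 0.0 else "different"), used


if __name__ == "__main__":
    decision, used = run_cascade([0.5, 2.0, 1.0], [-1.0] * 3, [2.0] * 3)
    print(decision, used)
```

In this sketch the efficiency comes from the early returns: when the first few features already push the running total past a threshold, the remaining features are never evaluated.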

Author information

Corresponding author

Correspondence to Andras Ferencz.

About this article

Cite this article

Ferencz, A., Learned-Miller, E.G. & Malik, J. Learning to Locate Informative Features for Visual Identification. Int J Comput Vis 77, 3–24 (2008). https://doi.org/10.1007/s11263-007-0093-5

  • DOI: https://doi.org/10.1007/s11263-007-0093-5
