Learning to Locate Informative Features for Visual Identification

  • Andras Ferencz
  • Erik G. Learned-Miller
  • Jitendra Malik

Abstract

Object identification is a specialized type of recognition in which the category (e.g. cars) is known and the goal is to recognize an object’s exact identity (e.g. Bob’s BMW). Two special challenges characterize object identification. First, inter-object variation is often small (many cars look alike) and may be dwarfed by illumination or pose changes. Second, there may be many different instances of the category but only a few, or just one, positive “training” example per object instance. Because variation among object instances may be small, a solution must locate possibly subtle object-specific salient features, like a door handle, while avoiding distracting ones such as specular highlights. With just one training example per object instance, however, standard modeling and feature selection techniques cannot be used. We describe an on-line algorithm that takes one image from a known category and builds an efficient “same” versus “different” classification cascade by predicting the most discriminative features for that object instance. Our method not only estimates the saliency and scoring function for each candidate feature, but also models the dependency between features, building an ordered sequence of discriminative features specific to the given image. Learned stopping thresholds make the identifier very efficient. To make this possible, category-specific characteristics are learned automatically in an off-line training procedure from labeled image pairs of the category. Our method, using the same algorithm for both cars and faces, outperforms a wide variety of other methods.
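
The cascade idea in the abstract (an ordered sequence of features evaluated one at a time, with stopping thresholds that allow an early "same" or "different" decision) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `cascade_decide`, the stage layout, the additive score (here imagined as a log-likelihood ratio) and all threshold values are assumptions chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical stage layout (an assumption, not from the paper):
# (score function, reject-below threshold, accept-above threshold).
Stage = Tuple[Callable[[dict], float], float, float]


def cascade_decide(
    stages: Sequence[Stage],
    candidate: dict,
    final_threshold: float = 0.0,
) -> Tuple[str, int]:
    """Evaluate features in order and stop as soon as a threshold is crossed.

    Returns the decision ("same" or "different") and the number of
    features that were actually evaluated.
    """
    total = 0.0
    for used, (score_fn, reject_below, accept_above) in enumerate(stages, start=1):
        # Accumulate evidence from the next most discriminative feature.
        total += score_fn(candidate)
        if total <= reject_below:
            return "different", used  # early rejection
        if total >= accept_above:
            return "same", used  # early acceptance
    # No stopping threshold was crossed: fall back to a final comparison.
    return ("same" if total >= final_threshold else "different"), len(stages)


# Toy stages: each "feature" just reads a precomputed score from the candidate.
stages = [
    (lambda c: c["a"], -2.0, 2.0),
    (lambda c: c["b"], -3.0, 3.0),
]

strong_match = cascade_decide(stages, {"a": 2.5, "b": 0.0})
late_reject = cascade_decide(stages, {"a": -1.0, "b": -2.5})
undecided = cascade_decide(stages, {"a": 0.5, "b": 0.5})
```

In this toy setup the first candidate is accepted after one feature, the second is rejected only after both features, and the third falls through to the final threshold. The efficiency gain described in the abstract comes from the first kind of case, where most comparisons end after a few features.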

Keywords

  • Object recognition
  • Object identification
  • Parametric models
  • Interclass transfer
  • Learning from new examples
  • One-shot learning

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Andras Ferencz (1)
  • Erik G. Learned-Miller (2)
  • Jitendra Malik (3)
  1. Mobileye Vision Technologies, Princeton, USA
  2. Computer Science, UMass Amherst, Amherst, USA
  3. Computer Science, U.C. Berkeley, Berkeley, USA
