Task-Specific Salience for Object Recognition

  • Jerome Revaud
  • Guillaume Lavoue
  • Yasuo Ariki
  • Atilla Baskurt
Part of the Studies in Computational Intelligence book series (SCI, volume 339)


Object recognition is a complex and challenging problem. It involves examining many different hypotheses about the object class, position, scale, pose, etc., yet the main trend in computer vision systems is to rely lazily on the brute-force capacity of computers, that is, to explore every possibility indiscriminately. Unfortunately, in many cases this scheme is far too slow for real-time or even practical applications. By incorporating salience into the recognition process, several approaches have shown that speed-ups of several orders of magnitude are possible. In this chapter, we demonstrate the link between salience and cascaded processes and show why and how such cascades should be constructed. We illustrate the benefits they provide in terms of detection speed, accuracy and robustness, and how they ease the combination of heterogeneous feature types (i.e. dense and sparse features), drawing on innovative strategies from the state of the art and on a practical application.


task-specific salience, cascades, feature combination, optimization


Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jerome Revaud (1)
  • Guillaume Lavoue (1)
  • Yasuo Ariki (2)
  • Atilla Baskurt (1)

  1. INSA-Lyon, LIRIS, UMR5205, Universite de Lyon, CNRS, France
  2. CS17 Media Laboratory, Kobe University, Japan
