Recursive Coarse-to-Fine Localization for Fast Object Detection

  • Marco Pedersoli
  • Jordi Gonzàlez
  • Andrew D. Bagdanov
  • Juan J. Villanueva
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6316)


Cascading techniques are commonly used to speed up the scan of an image for object detection. However, cascades of detectors are slow to train because of the large number of detectors and corresponding thresholds to learn. Furthermore, they do not use any prior knowledge about the scene structure to decide where to focus the search. To handle these problems, we propose a new way to scan an image, in which we couple a recursive coarse-to-fine refinement with spatial constraints on the object location. To do so, we split an image into a set of uniformly distributed neighborhood regions, and for each of these we apply a local greedy search over feature resolutions. A neighborhood is defined as a scanning region that only one object can occupy. Therefore, the best hypothesis is the location with maximum score, and no thresholds are needed. We present an implementation of our method using a pyramid of HOG features and evaluate it on two standard databases, VOC2007 and the INRIA dataset. Results show that Recursive Coarse-to-Fine Localization (RCFL) achieves a 12x speed-up compared to standard sliding windows. Compared with a cascade-of-multiple-resolutions approach, our method performs slightly better in both speed and Average Precision. Furthermore, in contrast to cascading approaches, the speed-up is independent of image conditions, the number of detected objects, and clutter.
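The scanning scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the detector responses are already available as a list of 2-D score maps ordered from coarse to fine, that each level doubles the resolution of the previous one, and that the total score of a hypothesis is the sum of the scores picked at each level. The names `rcfl_scan` and `best_in_window` and the parameters `neighborhood` and `radius` are hypothetical and chosen only for illustration.

```python
def best_in_window(grid, r_lo, r_hi, c_lo, c_hi):
    """Return (score, row, col) of the maximum of grid inside the window,
    after clipping the window to the grid boundaries."""
    r_lo, c_lo = max(r_lo, 0), max(c_lo, 0)
    r_hi, c_hi = min(r_hi, len(grid)), min(c_hi, len(grid[0]))
    return max(
        (grid[r][c], r, c)
        for r in range(r_lo, r_hi)
        for c in range(c_lo, c_hi)
    )


def rcfl_scan(score_maps, neighborhood=2, radius=1):
    """Toy coarse-to-fine scan (illustrative assumptions, see lead-in).

    score_maps: list of 2-D lists ordered coarse to fine; level k+1 is
    assumed to have twice the resolution of level k.
    Returns one (total_score, (row, col)) per coarse neighborhood, with
    (row, col) expressed in the finest level.
    """
    coarse = score_maps[0]
    hypotheses = []
    # Split the coarsest level into non-overlapping neighborhoods.
    for r0 in range(0, len(coarse), neighborhood):
        for c0 in range(0, len(coarse[0]), neighborhood):
            # One object per neighborhood: keep only the best location.
            total, r, c = best_in_window(
                coarse, r0, r0 + neighborhood, c0, c0 + neighborhood
            )
            # Greedy refinement: at each finer level, search only a small
            # window around the position propagated from the level above.
            for fine in score_maps[1:]:
                score, r, c = best_in_window(
                    fine,
                    2 * r - radius, 2 * r + radius + 1,
                    2 * c - radius, 2 * c + radius + 1,
                )
                total += score
            hypotheses.append((total, (r, c)))
    return hypotheses
```

In this sketch every neighborhood yields exactly one hypothesis, selected by maximum score without any threshold, and the number of locations evaluated depends only on the map sizes, `neighborhood`, and `radius`, not on the image content. This mirrors the abstract's claim that the speed-up is independent of image conditions, although the actual cost model in the paper may differ.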


Keywords (machine-generated, not supplied by the authors): Object Detection, Object Model, Resolution Level, Human Detection, Feature Resolution


Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Marco Pedersoli (1)
  • Jordi Gonzàlez (1)
  • Andrew D. Bagdanov (1)
  • Juan J. Villanueva (1)
  1. Dept. Ciències de la Computació & Centre de Visió per Computador, Edifici O, Campus UAB, 08193 Bellaterra (Cerdanyola), Barcelona, Spain
