Non-maximum Suppression for Object Detection by Passing Messages Between Windows

Rothe, Rasmus; Guillaumin, Matthieu; Van Gool, Luc

doi:10.1007/978-3-319-16865-4_19


Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9003)

Abstract

Non-maximum suppression (NMS) is a key post-processing step in many computer vision applications. In the context of object detection, it is used to transform a smooth response map that triggers many imprecise object window hypotheses into, ideally, a single bounding box for each detected object. The most common approach to NMS for object detection is a greedy, locally optimal strategy with several hand-designed components (e.g., thresholds). Such a strategy inherently suffers from several shortcomings, such as the inability to detect nearby objects. In this paper, we try to alleviate these problems and explore a novel formulation of NMS as a well-defined clustering problem. Our method builds on the recent Affinity Propagation Clustering algorithm, which passes messages between data points to identify cluster exemplars. Contrary to the greedy approach, our method is solved globally and its parameters can be automatically learned from training data. In experiments, we show in two contexts (object class detection and generic object detection) that it provides a promising solution to the shortcomings of greedy NMS.
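To make the greedy baseline concrete, the following is a minimal sketch of the commonly used greedy NMS procedure that the abstract contrasts against; it is not the message-passing method proposed in the paper. The box format (x1, y1, x2, y2), the function names `iou` and `greedy_nms`, the example windows, and both threshold values are assumptions chosen for illustration only.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit windows by decreasing score and keep a window
    only if it overlaps no already-kept window by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical windows (assumed data): windows 0 and 1 cover the same object,
# window 2 covers a nearby second object, window 3 is far away.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (4, 0, 14, 10), (30, 30, 40, 40)]
scores = [0.9, 0.8, 0.85, 0.7]

strict = greedy_nms(boxes, scores, iou_threshold=0.4)  # nearby object is lost
loose = greedy_nms(boxes, scores, iou_threshold=0.5)   # nearby object survives
```

In this toy setting, window 2 overlaps window 0 with an IoU of about 0.43, so whether the nearby object is reported depends entirely on the hand-set threshold. This is the kind of sensitivity the abstract attributes to the greedy strategy.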

Acknowledgement

The authors gratefully acknowledge support by Toyota.

Author information

Authors and Affiliations

Computer Vision Laboratory, ETH Zurich, Zurich, Switzerland
Rasmus Rothe, Matthieu Guillaumin & Luc Van Gool
ESAT - PSI/IBBT, K.U. Leuven, Leuven, Belgium
Luc Van Gool

Corresponding author

Correspondence to Rasmus Rothe.

Editor information

Editors and Affiliations

Technische Universität München, Garching, Bayern, Germany
Daniel Cremers
University of Adelaide, Adelaide, South Australia, Australia
Ian Reid
Keio University, Yokohama, Kanagawa, Japan
Hideo Saito
University of California at Merced, Merced, California, USA
Ming-Hsuan Yang

About this paper

Cite this paper

Rothe, R., Guillaumin, M., Van Gool, L. (2015). Non-maximum Suppression for Object Detection by Passing Messages Between Windows. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_19

DOI: https://doi.org/10.1007/978-3-319-16865-4_19
Published: 16 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16864-7
Online ISBN: 978-3-319-16865-4
eBook Packages: Computer Science, Computer Science (R0)
