Non-maximum Suppression for Object Detection by Passing Messages Between Windows

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9003)

Abstract

Non-maximum suppression (NMS) is a key post-processing step in many computer vision applications. In the context of object detection, it is used to transform a smooth response map that triggers many imprecise object window hypotheses into, ideally, a single bounding box for each detected object. The most common approach to NMS for object detection is a greedy, locally optimal strategy with several hand-designed components (e.g., thresholds). Such a strategy inherently suffers from several shortcomings, such as the inability to detect nearby objects. In this paper, we try to alleviate these problems and explore a novel formulation of NMS as a well-defined clustering problem. Our method builds on the recent Affinity Propagation Clustering algorithm, which passes messages between data points to identify cluster exemplars. Contrary to the greedy approach, our method is solved globally and its parameters can be automatically learned from training data. In experiments, we show in two contexts (object class detection and generic object detection) that it provides a promising solution to the shortcomings of the greedy NMS.
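
For readers unfamiliar with the greedy baseline the abstract refers to, the following is a minimal sketch of a commonly used form of greedy NMS, not the authors' implementation. The box format `(x1, y1, x2, y2)`, the intersection-over-union (IoU) overlap measure, the function names, and the default threshold of 0.5 are all assumptions made for illustration; the abstract itself only states that the greedy strategy relies on hand-designed thresholds.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x1 < x2, y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box],
               scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    Windows are visited in decreasing score order; a window is kept only
    if it does not overlap any already-kept window by more than the
    hand-chosen threshold (an assumed value, not taken from the paper).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Hypothetical detections: box 2 stands for a second, nearby object.
    boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (3, 0, 13, 10), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7, 0.6]
    print(greedy_nms(boxes, scores, iou_threshold=0.5))
    print(greedy_nms(boxes, scores, iou_threshold=0.6))
```

In this toy input, whether the nearby window (index 2) survives depends entirely on the chosen threshold, which illustrates the sensitivity to hand-designed components and the difficulty with nearby objects that the abstract mentions.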

Acknowledgement

The authors gratefully acknowledge support by Toyota.

Author information

Corresponding author

Correspondence to Rasmus Rothe.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material (pdf 525 KB)

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Rothe, R., Guillaumin, M., Van Gool, L. (2015). Non-maximum Suppression for Object Detection by Passing Messages Between Windows. In: Cremers, D., Reid, I., Saito, H., Yang, M.-H. (eds) Computer Vision – ACCV 2014. Lecture Notes in Computer Science, vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_19

  • DOI: https://doi.org/10.1007/978-3-319-16865-4_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16864-7

  • Online ISBN: 978-3-319-16865-4

  • eBook Packages: Computer Science, Computer Science (R0)
