Abstract
Non-maximum suppression (NMS) is a key post-processing step in many computer vision applications. In the context of object detection, it is used to transform a smooth response map that triggers many imprecise object window hypotheses into, ideally, a single bounding box for each detected object. The most common approach to NMS for object detection is a greedy, locally optimal strategy with several hand-designed components (e.g., thresholds). Such a strategy inherently suffers from several shortcomings, such as the inability to detect nearby objects. In this paper, we try to alleviate these problems and explore a novel formulation of NMS as a well-defined clustering problem. Our method builds on the recent Affinity Propagation Clustering algorithm, which passes messages between data points to identify cluster exemplars. Contrary to the greedy approach, our method is solved globally and its parameters can be automatically learned from training data. In experiments, we show in two contexts (object class and generic object detection) that it provides a promising solution to the shortcomings of greedy NMS.
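For orientation, the greedy baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the commonly used greedy scheme, not the authors' implementation or their proposed clustering method; the box format `(x1, y1, x2, y2)`, the function names `iou` and `greedy_nms`, and the default overlap threshold of 0.5 are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the kept boxes, in order of decreasing score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a window only if it does not overlap too strongly with
        # any higher-scoring window that was already kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical data: two heavily overlapping windows and one distant window.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

In this sketch the second window is discarded purely because it overlaps the first one by more than the hand-set threshold, regardless of whether it covers a distinct nearby object. That is the kind of shortcoming the abstract attributes to the greedy strategy.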
Acknowledgement
The authors gratefully acknowledge support by Toyota.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Rothe, R., Guillaumin, M., Van Gool, L. (2015). Non-maximum Suppression for Object Detection by Passing Messages Between Windows. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_19
DOI: https://doi.org/10.1007/978-3-319-16865-4_19
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16864-7
Online ISBN: 978-3-319-16865-4
eBook Packages: Computer Science, Computer Science (R0)