Abstract
As a contactless security technology, X-ray security inspection machines are widely used to detect dangerous objects in densely populated public places. Unlike in a natural image, objects in an X-ray image can be seen overlapping one another because of the transmissive nature of X-ray imaging. This poses a challenge: the traditional NMS (non-maximum suppression) algorithm suppresses the less significant of the overlapping objects. In this paper, we propose Smoother Soft NMS, which is based on the differences in aspect ratio and area between object bounding boxes, to improve the accuracy of overlapping object detection. We also propose a dedicated data augmentation method that simulates complex samples of overlapping objects. On our dataset, we raise the mean Average Precision of ResNet-101 FPN from 89.44% to 96.67% and that of Cascade R-CNN from 96.43% to 97.21%. A detector trained with Smoother Soft NMS improves significantly in overlapping cases.
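The suppression problem described above can be illustrated with a minimal sketch. It compares traditional (hard) NMS with the standard Gaussian Soft-NMS of Bodla et al. (2017), which decays the scores of overlapping boxes instead of discarding them. This is not the Smoother Soft NMS proposed in the paper, whose aspect-ratio and area terms are not specified in the abstract. The function names, the `(x1, y1, x2, y2)` box format, the thresholds, and the toy boxes are all assumptions made for illustration.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def hard_nms(boxes, scores, iou_thresh=0.5):
    """Traditional NMS: drop every box whose IoU with a kept box exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of removing them."""
    scores = list(scores)  # copy so the caller's list is not modified
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        keep.append((best, scores[best]))
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
        # Discard boxes whose decayed score has become negligible.
        remaining = [i for i in remaining if scores[i] >= score_thresh]
    return keep

# Hypothetical toy detections: boxes 0 and 1 overlap heavily (IoU = 2/3), box 2 is isolated.
boxes = [(0, 0, 10, 10), (2, 0, 12, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]

hard_kept = hard_nms(boxes, scores)   # box 1 is suppressed outright
soft_kept = soft_nms(boxes, scores)   # box 1 survives with a reduced score
```

In this toy case hard NMS returns only boxes 0 and 2, whereas Soft-NMS keeps all three and merely lowers the score of box 1, so a genuinely overlapping object can still be reported.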
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, C., Bao, X., Zhou, X. (2019). Smoother Soft-NMS for Overlapping Object Detection in X-Ray Images. In: Cui, Z., Pan, J., Zhang, S., Xiao, L., Yang, J. (eds) Intelligence Science and Big Data Engineering. Visual Data Engineering. IScIDE 2019. Lecture Notes in Computer Science, vol 11935. Springer, Cham. https://doi.org/10.1007/978-3-030-36189-1_9
DOI: https://doi.org/10.1007/978-3-030-36189-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36188-4
Online ISBN: 978-3-030-36189-1
eBook Packages: Computer Science (R0)