Abstract
Object detection is a fundamental task in computer vision, and the emergence of deep neural networks has recently brought considerable progress to it. Deep neural network object detectors can be grouped into two broad categories: two-stage detectors and one-stage detectors. One-stage detectors are faster than two-stage detectors, but they suffer from a severe foreground-background class imbalance during training, which lowers their accuracy. RetinaNet is a one-stage detector with a novel loss function named Focal Loss, which reduces the effect of this class imbalance. As a result, RetinaNet outperforms all the two-stage and one-stage detectors in terms of accuracy. The main idea of focal loss is to add a modulating factor to the cross-entropy loss that down-weights the loss of easy examples during training, so that training focuses on the hard examples. However, cross-entropy loss considers only the loss of the ground-truth classes and therefore receives no loss feedback from the false classes. Consequently, cross-entropy loss does not achieve the best convergence. In this paper, we propose a new loss function named Dual Cross-Entropy Focal Loss, which improves on the focal loss. Dual cross-entropy focal loss adds a modulating factor to the dual cross-entropy loss so that it focuses on the hard samples. Dual cross-entropy loss is an improved variant of cross-entropy loss that receives loss feedback from both the ground-truth classes and the false classes. We changed the loss function of RetinaNet from focal loss to our dual cross-entropy focal loss and ran experiments on a small vehicle dataset. The experimental results show that our new loss function improves vehicle detection performance.
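The modulating factor described in the abstract can be illustrated with a minimal sketch of the standard focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), where p_t is the predicted probability of the ground-truth class. The function names, the default alpha_t = 1.0, and the clamping constant eps are assumptions made for this illustration. The sketch covers only the original focal loss, not the dual cross-entropy focal loss proposed in the paper, whose exact form is not given in the abstract.

```python
import math


def cross_entropy(p_t: float, eps: float = 1e-12) -> float:
    """Cross-entropy loss for one example, given the probability p_t
    that the model assigns to the ground-truth class."""
    p_t = min(max(p_t, eps), 1.0)  # clamp to avoid log(0); eps is an assumption
    return -math.log(p_t)


def focal_loss(p_t: float, gamma: float = 2.0, alpha_t: float = 1.0,
               eps: float = 1e-12) -> float:
    """Standard focal loss: cross-entropy scaled by (1 - p_t) ** gamma.

    gamma = 0 recovers (alpha-weighted) cross-entropy; a larger gamma
    down-weights easy examples (p_t close to 1) more strongly.
    alpha_t = 1.0 is an illustrative default, not a recommended setting.
    """
    p_t = min(max(p_t, eps), 1.0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # Compare an easy, a medium, and a hard example.
    for p_t in (0.9, 0.5, 0.1):
        ce = cross_entropy(p_t)
        fl = focal_loss(p_t)
        print(f"p_t={p_t:.1f}  CE={ce:.4f}  FL={fl:.4f}  FL/CE={fl / ce:.2f}")
```

With gamma = 2, the loss of an easy example (p_t = 0.9) is scaled by (1 - 0.9)^2 = 0.01, while the loss of a hard example (p_t = 0.1) is scaled by only 0.81, so the hard examples dominate the total loss.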
This research was partly supported by NSFC, China (Nos. 61876107, U1803261).
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
He, X., Yang, J., Kasabov, N. (2020). Application of an Improved Focal Loss in Vehicle Detection. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2020. Lecture Notes in Computer Science, vol 12415. Springer, Cham. https://doi.org/10.1007/978-3-030-61401-0_11
DOI: https://doi.org/10.1007/978-3-030-61401-0_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61400-3
Online ISBN: 978-3-030-61401-0
eBook Packages: Computer Science, Computer Science (R0)