EfficientLiteDet: a real-time pedestrian and vehicle detection algorithm

Murthy, Chintakindi Balaram; Hashmi, Mohammad Farukh; Keskar, Avinash G.

doi:10.1007/s00138-022-01293-y

Original Paper
Published: 12 April 2022

Volume 33, article number 47, (2022)

Machine Vision and Applications

Chintakindi Balaram Murthy^1,2,
Mohammad Farukh Hashmi^1,2 (ORCID: orcid.org/0000-0002-3808-9122) &
Avinash G. Keskar^3

Abstract

Safety is the top priority in both unmanned and driver-assistance driving systems, so object detection algorithms must detect captured objects efficiently and accurately in real time. Directly applying existing models to real-time pedestrian and vehicle detection in scenes captured from fast-moving vehicles raises two problems. First, the target scale varies drastically because the vehicle speed changes greatly. Second, the captured images contain both tiny targets and high-density targets, which leads to occlusion between targets. To address these two issues, we propose an efficient, lightweight real-time detection algorithm called EfficientLiteDet. Building on Tiny-YOLOv4, the proposed model adds one more prediction head to detect multi-scale targets effectively. To detect tiny, dense and occluded targets, we use Transformer Prediction Heads (TPH) in place of the original anchor detection heads. To exploit the potential of the self-attention mechanism in TPH, the model integrates the “convolutional block attention module” to locate crucial attention regions in scenes with dense targets. To further improve detection performance, we apply several data augmentation strategies during training: mosaic, mix-up, multi-scale, and random horizontal flip. Extensive experiments on five challenging pedestrian and vehicle datasets show that EfficientLiteDet performs better in real-time scenarios. On the Pascal VOC-2007, Highway and Udacity datasets, the proposed model achieves a mean average precision (mAP) of 87.3%, 80.1% and 77.8%, respectively, exceeding the state-of-the-art Tiny-YOLOv4 algorithm by +2.4%, +1.8% and +2.4%, respectively.
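
Two of the augmentation strategies named in the abstract, random horizontal flip and mix-up, can be illustrated with a minimal pure-Python sketch. This is not the authors' training code: the function names (`horizontal_flip`, `mixup`, `augment`), the `(x_min, y_min, x_max, y_max)` pixel box format, the 0.5 flip probability and the Beta(1.5, 1.5) mixing distribution are all assumptions made for illustration.

```python
import random


def horizontal_flip(image, boxes):
    """Mirror an image left-right and remap its bounding boxes.

    image: list of rows, each row a list of pixel values.
    boxes: list of (x_min, y_min, x_max, y_max) in pixel coordinates
           (an assumed format, not taken from the paper).
    """
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    # A box spanning [x_min, x_max] maps to [width - x_max, width - x_min].
    new_boxes = [(width - x2, y1, width - x1, y2) for (x1, y1, x2, y2) in boxes]
    return flipped, new_boxes


def mixup(image_a, boxes_a, image_b, boxes_b, lam):
    """Blend two same-sized images with weight lam and keep both box sets."""
    blended = [
        [lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]
    return blended, list(boxes_a) + list(boxes_b)


def augment(image_a, boxes_a, image_b, boxes_b, rng, flip_prob=0.5):
    """Randomly flip the first sample, then mix it with the second."""
    if rng.random() < flip_prob:
        image_a, boxes_a = horizontal_flip(image_a, boxes_a)
    # Mixing weight drawn from Beta(1.5, 1.5); the parameters are assumed.
    lam = rng.betavariate(1.5, 1.5)
    return mixup(image_a, boxes_a, image_b, boxes_b, lam)
```

Calling `augment(img_a, boxes_a, img_b, boxes_b, random.Random(0))` returns one blended image together with the boxes of both inputs. Keeping every box after mixing is the usual choice for detection, since both sets of objects remain visible in the blended image.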

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, National Institute of Technology, Warangal, India
Chintakindi Balaram Murthy & Mohammad Farukh Hashmi
National Institute of Technology, Warangal, India
Chintakindi Balaram Murthy & Mohammad Farukh Hashmi
Department of Electronics and Communication Engineering, Visvesvaraya National Institute of Technology, Nagpur, India
Avinash G. Keskar

Corresponding author

Correspondence to Mohammad Farukh Hashmi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Murthy, C.B., Hashmi, M.F. & Keskar, A.G. EfficientLiteDet: a real-time pedestrian and vehicle detection algorithm. Machine Vision and Applications 33, 47 (2022). https://doi.org/10.1007/s00138-022-01293-y

Received: 28 October 2020
Revised: 21 October 2021
Accepted: 28 February 2022
Published: 12 April 2022
DOI: https://doi.org/10.1007/s00138-022-01293-y
