Speeding up inference on deep neural networks for object detection by performing partial convolution

  • Original Research Paper
  • Published in: Journal of Real-Time Image Processing

Abstract

Real-time object detection is a key application of deep neural networks (DNNs). It is commonly achieved by employing graphics processing units (GPUs) or dedicated hardware accelerators. Alternatively, in this work, we present a software scheme to accelerate the inference stage of DNNs designed for object detection. The scheme relies on partial processing within the consecutive convolution layers of a DNN. It exploits the relationships between the locations of the components of an input feature, an intermediate feature representation, and an output feature to efficiently identify the components modified between consecutive frames. The matrix multiplicand is then downsized to cover only those modified components, which accelerates the matrix multiplication within a convolution layer. In addition, the same relationships can be employed to signal the next consecutive convolution layer about the modified components, further reducing the overhead of member-by-member comparisons. The proposed scheme has been experimentally benchmarked against a similar change-based approach, CBinfer, and against the original Darknet on the Tiny-You Only Look Once (Tiny-YOLO) network. The experiments were conducted on a personal computer with a dual CPU running at 3.5 GHz, without GPU acceleration, on video data sets from YouTube. The results show that average improvement ratios of 1.56 and 13.10 in detection frame rate over CBinfer and Darknet, respectively, are attainable. Our scheme was also extended to exploit GPU-assisted acceleration; on an NVIDIA Jetson TX2, it reached a detection frame rate of 28.12 frames per second (1.25× with respect to CBinfer). In all experiments, detection accuracy was preserved at 90% of that of the original Darknet.
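The partial-convolution idea summarized above can be illustrated with a small sketch. This is not the paper's implementation: the function names (`matmul`, `partial_conv`), the list-of-lists data layout, and the member-by-member change test are illustrative assumptions. It shows the core step: after im2col, only the columns whose patch values changed since the previous frame are re-multiplied by the weight matrix, and the unchanged output columns are reused from a cached result.

```python
def matmul(A, B):
    # Plain GEMM on lists of lists: A is m x k, B is k x n.
    m, k = len(A), len(B)
    n = len(B[0]) if B else 0
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

def partial_conv(weights, cols, prev_cols, prev_out):
    """weights: out_ch x k filter matrix; cols: k x n im2col matrix of the
    current frame; prev_cols/prev_out: caches from the previous frame."""
    k, n = len(cols), len(cols[0])
    # Member-by-member comparison to find the modified columns.
    changed = [j for j in range(n)
               if any(cols[p][j] != prev_cols[p][j] for p in range(k))]
    # Downsized multiplicand: only the modified columns are multiplied.
    sub = [[cols[p][j] for j in changed] for p in range(k)]
    partial = matmul(weights, sub)  # out_ch x len(changed)
    # Start from the cached output and patch in the recomputed columns.
    out = [row[:] for row in prev_out]
    for jj, j in enumerate(changed):
        for i in range(len(weights)):
            out[i][j] = partial[i][jj]
    return out, changed
```

With a 1-filter weight matrix and a 3-column input in which only one column changes between frames, `partial_conv` performs a 1×2 by 2×1 multiplication instead of the full 1×2 by 2×3 one, yet returns the same output as a full `matmul`.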




Notes

  1. We make use of the im2col implementation by Berkeley Vision’s Caffe, available at https://github.com/BVLC/caffe/blob/master/LICENSE.
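For readers unfamiliar with im2col, a minimal single-channel sketch (stride 1, no padding) conveys the idea; this is an illustrative Python rendering, not Caffe's C++ implementation. Each k×k patch of the image becomes one column of a matrix, so convolution with flattened filters reduces to a single matrix multiplication — which is what allows the scheme above to shrink the multiplicand column-by-column.

```python
def im2col(img, k):
    """Unroll every k x k patch of a single-channel image (list of lists)
    into one column of the output matrix (stride 1, no padding)."""
    h, w = len(img), len(img[0])
    patches = []
    for r in range(h - k + 1):
        for c in range(w - k + 1):
            # Flatten the patch row-major into a single vector.
            patches.append([img[r + dr][c + dc]
                            for dr in range(k) for dc in range(k)])
    # Transpose so patches become columns: result is (k*k) x n_patches.
    return [list(row) for row in zip(*patches)]
```

For a 3×3 image and k = 2 this yields a 4×4 matrix (four patch elements by four patch positions); multiplying it by a row vector of flattened 2×2 filter weights computes the convolution in one GEMM.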

References

  1. Zhao, Z.-Q., Zheng, P., Xu, H.S., Wu, X.: Object detection with deep learning: a review. arXiv:1807.05511 (2017)

  2. Pathak, A.R., Pandey, M., Rautaray, S.: Application of deep learning for object detection. Proc. Comput. Sci. 132, 1706–1717 (2018)


  3. Vondrick, C., Khosla, A., Pirsiavash, H., Malisiewicz, T., Torralba, A.: Visualizing object detection features. Int. J. Comput. Vis. 119(2), 145–158 (2016)


  4. Matsumoto, M.: SVM-based object detection using self-quotient ε-filter and histograms of oriented gradients. In: Proceedings of the Computational Intelligence. Springer, Berlin, Heidelberg, pp. 277–286 (2012)

  5. Szegedy, C., Toshev, A., Erhan, D.: Deep neural networks for object detection. In: Proceedings of the 26th International Conference on Neural Information Processing Systems—Volume 2, Lake Tahoe, Nevada, pp. 2553–2561 (2013)

  6. Liu, N., Han, J., Zhang, D., Wen, S., Liu, T.: Predicting eye fixations using convolutional neural networks. CVPR (2015)

  7. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR (2014)

  8. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS (2015)

  9. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. CVPR (2016)

  10. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. arXiv:1612.08242 (2016)

  11. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv:1804.02767 (2018)

  12. Huynh, L.N., Lee, Y., Balan, R.K.: Deepmon: Mobile GPU-based deep learning framework for continuous vision applications. In: Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services. ACM, New York, pp. 82–95 (2017)

  13. Mobahi, H., Collobert, R., Weston, J.: Deep learning from temporal coherence in video. In: Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, Quebec, Canada, pp. 737–744 (2009)

  14. Lin, X., Zhao, C., Pan, W.: Towards accurate binary convolutional neural network. In: NIPS 2017, Long Beach, CA, USA, pp. 344–352 (2017)

  15. Bertasius, G., Torresani, L., Shi, J.: Object detection in video with spatiotemporal sampling networks. In: ECCV 2018. arXiv:1803.05549 (2018)

  16. Cavigelli, L., Degen, P., Benini, L.: CBinfer: Change-based inference for convolutional neural networks on video data. arXiv:1704.04313 (2017)

  17. Xu, M., Zhu, M., Liu, Y., Lin, F.X., Liu, X.: DeepCache: Principled cache for mobile deep vision. arXiv:1712.01670 (2018)

  18. Anderson, A., Vasudevan, A., Keane, C., Gregg, D.: Low-memory GEMM-based convolution algorithms for deep neural networks (2017)

  19. Abu-El-Haija, S., Kothari, N.: YouTube-8M: a large-scale video classification benchmark (2016)

  20. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S. E., Fu, C. Y., Berg, A. C.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M., (eds) Computer Vision—ECCV 2016. ECCV 2016. Lecture Notes in Computer Science, vol 9905. Springer, Cham (2016)

Download references

Acknowledgements

This work was supported by the Thailand Research Fund (TRF) and Walailak University, Thailand, under Grant No. RSA6280097.

Author information


Corresponding author

Correspondence to Wattanapong Kurdthongmee.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Kurdthongmee, W. Speeding up inference on deep neural networks for object detection by performing partial convolution. J Real-Time Image Proc 17, 1487–1503 (2020). https://doi.org/10.1007/s11554-019-00906-6

