An Energy-Efficient Hardware Implementation of HOG-Based Object Detection at 1080HD 60 fps with Multi-Scale Support

Abstract

This paper presents a real-time, energy-efficient hardware implementation of a multi-scale object detector. Detection uses Histogram of Oriented Gradients (HOG) features and Support Vector Machine (SVM) classification. Multi-scale detection is essential for robust, practical applications because it allows objects of different sizes to be detected. Parallel detectors with balanced workloads increase throughput, which enables voltage scaling and reduces energy consumption. Image pre-processing is also introduced to further reduce the power and area costs of generating the image scales. The design processes 1080HD video at 60 fps in real time with a clock rate of 270 MHz and consumes 45.3 mW (0.36 nJ/pixel), based on post-layout simulations. The ASIC has an area of 490 kgates and 0.538 Mbit of on-chip memory in a 45 nm SOI CMOS process.
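
The throughput and energy figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that "1080HD" means a 1920×1080 frame and that the per-pixel energy is normalized to input-frame pixels; neither assumption is stated in the abstract, and the function names are illustrative only.

```python
# Figures quoted in the abstract.
POWER_W = 45.3e-3      # 45.3 mW total power
CLOCK_HZ = 270e6       # 270 MHz clock rate
FPS = 60               # frames per second

# Assumption (not stated in the abstract): a 1080HD frame is 1920x1080 pixels.
WIDTH, HEIGHT = 1920, 1080


def pixel_rate(width, height, fps):
    """Input pixels that must be processed per second."""
    return width * height * fps


def energy_per_pixel_nj(power_w, pixels_per_s):
    """Energy spent per input pixel, in nanojoules."""
    return power_w / pixels_per_s * 1e9


def cycles_per_pixel(clock_hz, pixels_per_s):
    """Clock cycles available per input pixel."""
    return clock_hz / pixels_per_s


rate = pixel_rate(WIDTH, HEIGHT, FPS)
energy_nj = energy_per_pixel_nj(POWER_W, rate)
cycles = cycles_per_pixel(CLOCK_HZ, rate)

print(f"pixel rate:       {rate} pixels/s")
print(f"energy per pixel: {energy_nj:.2f} nJ")
print(f"cycle budget:     {cycles:.2f} cycles/pixel")
```

Under these assumptions the result rounds to the quoted 0.36 nJ/pixel, and the clock leaves a budget of only about two cycles per input pixel, which is consistent with the abstract's reliance on parallel detectors to sustain the frame rate.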

Notes

  1. Average precision measures the area under the precision-recall curve. A higher average precision means better detection accuracy.

  2. Different gradient filters are tested in [6], including 1-D, cubic, 3×3 Sobel, and 2×2 diagonal filters. The simple 1-D [-1 0 1] filter works best.

  3. The energy numbers for the 0.6 V and 1.1 V supplies are estimated from ring-oscillator curves of voltage versus power and frequency. The SRAM minimum voltage is 0.72 V.

  4. The energy numbers for the 0.6 V and 1.1 V supplies are estimated from ring-oscillator curves of voltage versus power and frequency. The SRAM minimum voltage is 0.72 V.

  5. The AP number is not reported in [13]. This number is from a single-scale HOG detection simulation.
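
Notes 1 and 2 describe two computations that are easy to illustrate in software. The following is a minimal sketch, not the paper's hardware datapath: `gradient_1d` applies the centered [-1 0 1] mask in both directions, and `average_precision` integrates the precision-recall curve with a simple rectangular rule. The border handling (edge replication), the unsigned 0 to 180 degree orientation range, and the non-interpolated AP variant are assumptions chosen for the example; the evaluation protocol used in the paper is not given here.

```python
import math


def gradient_1d(img):
    """Apply the centered [-1 0 1] mask horizontally and vertically.

    img is a list of rows of numbers. Border pixels replicate the nearest
    edge pixel, which is an assumption made for this sketch.
    """
    h, w = len(img), len(img[0])
    gx = [[0] * w for _ in range(h)]
    gy = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left = img[y][max(x - 1, 0)]
            right = img[y][min(x + 1, w - 1)]
            up = img[max(y - 1, 0)][x]
            down = img[min(y + 1, h - 1)][x]
            gx[y][x] = right - left
            gy[y][x] = down - up
    return gx, gy


def magnitude_and_orientation(gx, gy):
    """Gradient magnitude and unsigned orientation in degrees, in [0, 180)."""
    mag = math.hypot(gx, gy)
    angle = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, angle


def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve (non-interpolated).

    Detections are ranked by descending score; each time recall increases,
    the precision at that rank is weighted by the recall increment.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_positives = 0
    ap = 0.0
    prev_recall = 0.0
    for rank, i in enumerate(order, start=1):
        if is_true_positive[i]:
            true_positives += 1
        precision = true_positives / rank
        recall = true_positives / num_ground_truth
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


# Usage: a vertical step edge produces a purely horizontal gradient.
gx, gy = gradient_1d([[0, 0, 10, 10], [0, 0, 10, 10]])
mag, angle = magnitude_and_orientation(gx[0][1], gy[0][1])

# Usage: four ranked detections, two of which match the two ground-truth objects.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, False], 2)
```

A detector that ranks every true positive ahead of every false positive scores an AP of 1.0 under this definition, which matches the note's statement that higher values indicate better detection accuracy.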

References

  1. Haltakov, V., Belzner, H., & Ilic, S. (2012). Scene understanding from a moving camera for object detection and free space estimation. In Proceedings IEEE Intelligent Vehicles Symposium (pp. 105–110).

  2. Dollar, P., Wojek, C., Schiele, B., & Perona, P. (2012). Pedestrian Detection: An Evaluation of the State of the Art. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4), 743–761.

  3. Meingast, M., Geyer, C., & Sastry, S. (2004). Vision based terrain recovery for landing unmanned aerial vehicles. In Proceedings IEEE Conference on Decision and Control (Vol. 2, pp. 1670–1675).

  4. Myers, B., Burns, J., & Ratell, J. (2001). Embedded Electronics in Electro-Mechanical Systems for Automotive Applications. SAE Technical Paper (2001-01-0691).

  5. INRIA Person Dataset for Pedestrian Detection. http://pascal.inrialpes.fr/data/human/.

  6. Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In Proceedings IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Vol. 1, pp. 886–893).

  7. Dollar, P., Belongie, S., & Perona, P. (2010). The Fastest Pedestrian Detector in the West. In Proceedings of the British Machine Vision Conference (pp. 68.1–68.11).

  8. Oh, J., Kim, G., Hong, I., Park, J., Lee, S., Kim, J.-Y., Woo, J.-H., & Yoo, H.-J. (2012). Low-Power, Real-Time Object-Recognition Processors for Mobile Vision Systems. IEEE Micro, 32(6), 38–50.

  9. Zhang, W., Zelinsky, G., & Samaras, D. (2007). Real-time Accurate Object Detection using Multiple Resolutions. In Proceedings IEEE International Conference on Computer Vision (pp. 1–8).

  10. Benenson, R., Mathias, M., Timofte, R., & Van Gool, L. (2012). Pedestrian detection at 100 frames per second. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (pp. 2903–2910).

  11. Bauer, S., Kohler, S., Doll, K., & Brunsmann, U. (2010). FPGA-GPU architecture for kernel SVM pedestrian detection. In Proceedings IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (pp. 61–68).

  12. Mizuno, K., Terachi, Y., Takagi, K., Izumi, S., Kawaguchi, H., & Yoshimoto, M. (2012). Architectural Study of HOG Feature Extraction Processor for Real-Time Object Detection. In Proceedings IEEE Workshop on Signal Processing Systems (pp. 197–202).

  13. Takagi, K., Mizuno, K., Izumi, S., Kawaguchi, H., & Yoshimoto, M. (2013). A sub-100-milliwatt dual-core HOG accelerator VLSI for real-time multiple object detection. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 2533–2537).

  14. Hahnle, M., Saxen, F., Hisung, M., Brunsmann, U., & Doll, K. (2013). FPGA-Based Real-Time Pedestrian Detection on High-Resolution Images. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 629–635).

  15. Cortes, C., & Vapnik, V. (1995). Support-Vector Networks. Machine Learning, 20(3), 273–297.

  16. Cao, T., & Deng, G. (2008). Real-Time Vision-Based Stop Sign Detection System on FPGA. In Proceedings Digital Image Computing: Techniques and Applications (pp. 465–471).

Acknowledgments

Funding for this research was provided by Texas Instruments and the DARPA YFA grant N66001-14-1-4039. The authors would like to thank the Xilinx University Program (XUP) for the equipment donation.

Author information

Correspondence to Amr Suleiman.

About this article

Cite this article

Suleiman, A., Sze, V. An Energy-Efficient Hardware Implementation of HOG-Based Object Detection at 1080HD 60 fps with Multi-Scale Support. J Sign Process Syst 84, 325–337 (2016). https://doi.org/10.1007/s11265-015-1080-7

Keywords

  • Object detection
  • Histogram of oriented gradients
  • Multi-scale
  • Low power architectures
  • Embedded vision