This paper presents a real-time, energy-efficient hardware implementation of a multi-scale object detector. Detection uses Histogram of Oriented Gradients (HOG) features and Support Vector Machine (SVM) classification. Multi-scale detection is essential for robust, practical applications because objects appear at different sizes. Parallel detectors with balanced workloads increase throughput, which enables voltage scaling and reduces energy consumption. Image pre-processing is also introduced to further reduce the power and area costs of generating the image scales. The design can process 1080HD video at 60 fps in real time with a clock rate of 270 MHz, and consumes 45.3 mW (0.36 nJ/pixel) based on post-layout simulations. The ASIC has an area of 490 kgates and 0.538 Mbit of on-chip memory in a 45 nm SOI CMOS process.
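The throughput and energy figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that "1080HD" means a 1920×1080 frame and that the per-pixel energy is normalised to input-frame pixels only (not to the additional pixels of the scaled images); both are assumptions, not statements from the paper.

```python
# Figures quoted in the abstract.
FPS = 60             # frames per second
CLOCK_HZ = 270e6     # clock rate, 270 MHz
POWER_W = 45.3e-3    # power, 45.3 mW

# Assumption of this sketch: "1080HD" is a 1920x1080 frame.
WIDTH, HEIGHT = 1920, 1080

pixels_per_second = WIDTH * HEIGHT * FPS
energy_per_pixel_nj = POWER_W / pixels_per_second * 1e9
cycles_per_pixel = CLOCK_HZ / pixels_per_second

print(f"{pixels_per_second / 1e6:.1f} Mpixel/s")
print(f"{energy_per_pixel_nj:.2f} nJ/pixel")
print(f"{cycles_per_pixel:.2f} cycles/pixel")
```

Under these assumptions the result agrees with the quoted 0.36 nJ/pixel, and it shows a budget of roughly two clock cycles per input pixel.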
Average precision (AP) measures the area under the precision-recall curve. A higher average precision means better detection accuracy.
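As an illustration of the definition above, the following minimal sketch computes the area under a step-wise (non-interpolated) precision-recall curve from detection scores and ground-truth labels. Benchmarks differ in how they interpolate the curve, so this is one common convention and not necessarily the one used for the numbers reported in the article; the function name `average_precision` and the sample data are hypothetical.

```python
def average_precision(scores, labels):
    """Area under the step-wise precision-recall curve.

    scores: detector confidence per candidate (higher = more confident).
    labels: 1 if the candidate is a true object, 0 otherwise.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")

    # Sweep the decision threshold from the most to the least confident score.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)

    true_pos = 0
    area = 0.0
    prev_recall = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            true_pos += 1
        precision = true_pos / rank
        recall = true_pos / n_pos
        # Recall only grows when a positive is found, so only those
        # ranks contribute a rectangle to the area.
        area += precision * (recall - prev_recall)
        prev_recall = recall
    return area


ap_mixed = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
ap_perfect = average_precision([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 0])
print(ap_mixed, ap_perfect)
```

A detector that ranks every true object above every false alarm reaches an AP of 1.0; each false alarm ranked above a true object lowers it.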
Different gradient filters have been tested in prior work, including 1-D, cubic, 3×3 Sobel, and 2×2 diagonal filters. The simple 1-D [-1 0 1] filter works best.
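To make the 1-D [-1 0 1] filter concrete, the sketch below applies it horizontally and vertically to a small grayscale image and derives the gradient magnitude and orientation per pixel. The border handling (clamping to the image edge) and the use of unsigned orientation in [0, 180) degrees are assumptions of this sketch, and the function name `gradients` is hypothetical; none of it is taken from the hardware design described in the article.

```python
import math


def gradients(image):
    """Apply the centred 1-D [-1 0 1] filter along x and y.

    image: list of rows of pixel intensities.
    Returns (magnitude, orientation) grids of the same size; orientation
    is unsigned, in degrees within [0, 180).
    """
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Assumption: neighbours outside the image are clamped to the edge.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang


# A vertical edge: intensity jumps from 0 to 10 between columns 1 and 2.
vertical_edge = [[0, 0, 10, 10] for _ in range(3)]
mag_v, ang_v = gradients(vertical_edge)

# A vertical ramp: intensity grows from top to bottom.
ramp = [[0, 0, 0], [5, 5, 5], [10, 10, 10]]
mag_r, ang_r = gradients(ramp)

print(mag_v[1][1], ang_v[1][1])
print(mag_r[1][1], ang_r[1][1])
```

The filter needs only one subtraction per direction and no multiplications, which is one reason it is attractive for a low-power implementation.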
The energy numbers for the 0.6 V and 1.1 V supplies are estimated from a ring oscillator's voltage versus power and frequency curves. The minimum SRAM voltage is 0.72 V.
The AP number is not reported in the cited work. The number given here is from a single-scale HOG detection simulation.
Haltakov, V., Belzner, H., & Ilic, S. (2012). Scene understanding from a moving camera for object detection and free space estimation. In Proceedings IEEE Intelligent Vehicles Symposium (pp. 105–110).
Dollar, P., Wojek, C., Schiele, B., & Perona, P. (2012). Pedestrian Detection: An Evaluation of the State of the Art. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4), 743–761.
Meingast, M., Geyer, C., & Sastry, S. (2004). Vision based terrain recovery for landing unmanned aerial vehicles. In Proceedings IEEE Conference on Decision and Control (Vol. 2, pp. 1670–1675).
Myers, B., Burns, J., & Ratell, J. (2001). Embedded Electronics in Electro-Mechanical Systems for Automotive Applications. SAE Technical Paper (2001-01-0691).
INRIA Person Dataset for Pedestrian Detection. http://pascal.inrialpes.fr/data/human/.
Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In Proceedings IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Vol. 1, pp. 886–893).
Dollar, P., Belongie, S., & Perona, P. (2010). The Fastest Pedestrian Detector in the West. In Proceedings of the British Machine Vision Conference (pp. 68.1–68.11).
Oh, J., Kim, G., Hong, I., Park, J., Lee, S., Kim, J.-Y., Woo, J.-H., & Yoo, H.-J. (2012). Low-Power, Real-Time Object-Recognition Processors for Mobile Vision Systems. IEEE Micro, 32(6), 38–50.
Zhang, W., Zelinsky, G., & Samaras, D. (2007). Real-time Accurate Object Detection using Multiple Resolutions. In Proceedings IEEE International Conference on Computer Vision (pp. 1–8).
Benenson, R., Mathias, M., Timofte, R., & Van Gool, L. (2012). Pedestrian detection at 100 frames per second. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (pp. 2903–2910).
Bauer, S., Kohler, S., Doll, K., & Brunsmann, U. (2010). FPGA-GPU architecture for kernel SVM pedestrian detection. In Proceedings IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (pp. 61–68).
Mizuno, K., Terachi, Y., Takagi, K., Izumi, S., Kawaguchi, H., & Yoshimoto, M. (2012). Architectural Study of HOG Feature Extraction Processor for Real-Time Object Detection. In Proceedings IEEE Workshop on Signal Processing Systems (pp. 197–202).
Takagi, K., Mizuno, K., Izumi, S., Kawaguchi, H., & Yoshimoto, M. (2013). A sub-100-milliwatt dual-core HOG accelerator VLSI for real-time multiple object detection. In Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 2533–2537).
Hahnle, M., Saxen, F., Hisung, M., Brunsmann, U., & Doll, K. (2013). FPGA-Based Real-Time Pedestrian Detection on High-Resolution Images. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 629–635).
Cortes, C., & Vapnik, V. (1995). Support-Vector Networks. Machine Learning, 20(3), 273–297.
Cao, T., & Deng, G. (2008). Real-Time Vision-Based Stop Sign Detection System on FPGA. In Proceedings Digital Image Computing: Techniques and Applications (pp. 465–471).
Funding for this research was provided by Texas Instruments and the DARPA YFA grant N66001-14-1-4039. The authors would like to thank the Xilinx University Program (XUP) for its equipment donation.
Suleiman, A., Sze, V. An Energy-Efficient Hardware Implementation of HOG-Based Object Detection at 1080HD 60 fps with Multi-Scale Support. J Sign Process Syst 84, 325–337 (2016). https://doi.org/10.1007/s11265-015-1080-7
- Object detection
- Histogram of oriented gradients
- Low power architectures
- Embedded vision