Abstract
This paper presents a real-time object detection system built around a Histogram of Oriented Gradients (HOG) feature extraction accelerator VLSI. The VLSI [1, 2] enables the system to achieve real-time performance and scalability for multiple object detection under a limited power budget. It employs three techniques: a VLSI-oriented HOG algorithm with early classification in the Support Vector Machine (SVM) classification stage, a dual-core architecture for parallel feature extraction, and a detection-window-size scalable architecture with a reconfigurable MAC array for processing objects of different shapes. The test chip was fabricated in 65 nm CMOS technology. Measurement results show that the VLSI consumes 43 mW at 42.9 MHz and 1.1 V when processing HDTV video (1920 × 1080 pixels) at 30 frames per second (fps). A multiple object detection system and a multiple scale object detection system are presented to demonstrate the flexibility and scalability that the VLSI provides and its applicability to a wide range of object detection tasks. The multiple object detection system achieves real-time object detection on HDTV-resolution video with 84 mW of power consumption when detecting two types of targets, while keeping detection accuracy comparable to that of a software-based system. The multiple scale object detection system detects five scales of a target using a single VLSI; the power consumption of the VLSI on this task is estimated at 102 mW.
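The abstract names early classification in the SVM stage but does not describe its mechanism. One common reading, offered here only as an assumption, is that the linear SVM dot product is accumulated in blocks and a detection window is rejected as soon as the partial score falls below a threshold, so most background windows never need the full computation. The minimal sketch below illustrates that idea in software; the function name `early_classify`, the `reject_threshold` parameter, and the block size are hypothetical and are not taken from the paper or the chip.

```python
def early_classify(features, weights, bias, reject_threshold, block_size):
    """Linear SVM decision with early rejection (illustrative sketch).

    The dot product is accumulated one block at a time. If the partial
    score drops below reject_threshold (a hypothetical tuning parameter),
    the window is rejected without evaluating the remaining blocks.
    Returns (is_object, blocks_evaluated).
    """
    score = bias
    blocks_evaluated = 0
    for start in range(0, len(features), block_size):
        stop = start + block_size
        # Partial dot product over the current block of the feature vector.
        score += sum(f * w for f, w in zip(features[start:stop],
                                           weights[start:stop]))
        blocks_evaluated += 1
        if score < reject_threshold:
            return False, blocks_evaluated  # early rejection
    # All blocks evaluated: ordinary linear SVM sign test.
    return score > 0.0, blocks_evaluated


if __name__ == "__main__":
    feats = [1.0] * 8
    print(early_classify(feats, [-1.0] * 4 + [1.0] * 4, 0.0, -3.0, 4))
    print(early_classify(feats, [1.0] * 8, 0.0, -3.0, 4))
```

Early rejection of this kind is a heuristic: a window whose partial score dips below the threshold is discarded even if later blocks would have raised the final score, so the threshold trades computation against detection accuracy.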
References
Takagi, K., et al. (2013) A sub-100mW dual-core HOG accelerator VLSI for real-time multiple object detection, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
Mizuno, K., Takagi, K., et al. (2013). A sub-100mW dual-core HOG accelerator VLSI for parallel feature extraction processing for HDTV resolution video. IEICE Transactions on Electronics, E96-C(4).
World Health Organization “Decade of Action for Road Safety 2011-2020: saving millions of lives”, May 2011.
Dalal, N., & Triggs, B. (2005) Histograms of oriented gradients for human detection, in Proceedings of the 2005 International Conference on Computer Vision and Pattern Recognition, vol. 2. Washington, DC, USA: IEEE Computer Society, pp. 886–893.
Zhang, L., & Nevatia, R. (2008) Efficient scan-window based object detection using GPGPU, IEEE, CVPRW.
Bauer, S., Brunsmann, U., Schlotterbeck-Macht, S. (2009) FPGA Implementation of a HOG-based pedestrian recognition system, MPC-Workshop, July 2009.
Hiromoto, M., & Miyamoto, R. (2009) Hardware architecture for high-accuracy real-time pedestrian detection with CoHOG Features, IEEE ICCVW.
Kadota, R., Sugano, H., Hiromoto, M., Ochi, H., Miyamoto, R., Nakamura, Y. (2009) Hardware architecture for HOG feature extraction, in Proceedings of the 2009 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, Washington, DC, USA: IEEE Computer Society, pp. 1330–1333.
Yazawa, Y., Yoshimi, T., Tsuzuki, T., Dohi, T., Fujiyoshi, H. (2011) FPGA Hardware with target-reconfigurable object detector by Joint-HOG, in Proceeding of SSII. Yokohama, Japan.
Negi, K., Dohi, K., Shibata, Y., Oguri, K. (2011) Deep pipelined one-chip FPGA implementation of a real-time image-based human detection algorithm, IEEE FPT 2011.
Cao, T. P., & Deng, G. (2008) Real-time vision-based stop sign detection system on FPGA, in Proceeding of Digital Image Computing: Techniques and Applications. Los Alamitos, CA, USA: IEEE Computer Society, pp. 465–471.
Bauer, S., Kohler, S., Doll, K., Brunsmann, U. (2010) FPGA-GPU Architecture for Kernel SVM Pedestrian Detection, IEEE CVPRW 2010.
Mizuno, K., Terachi, Y., Takagi, K., Izumi, S., Kawaguchi, H., Yoshimoto, M. (2012) Architectural study of HOG feature extraction processor for real-time object detection, IEEE SiPS.
Volder, J. E. (1959). The CORDIC trigonometric computing technique. IRE Trans. Electron. Computers., EC-8, 330–334.
INRIA Person Dataset. http://pascal.inrialpes.fr/data/human/
Lowe, D. G. (2004). Distinctive image features from scale invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.
GTI’s Vehicle Image Database. http://www.gti.ssr.upm.es/data/Vehicle_database/Vehicle_database.html
Arróspide, J., Salgado, L., & Nieto, M. Video analysis based vehicle detection and tracking using an MCMC sampling framework. EURASIP Journal on Advances in Signal Processing, 2.
Acknowledgments
The VLSI chip in this study was fabricated through the chip fabrication program of the VLSI Design and Education Center (VDEC), the University of Tokyo, in collaboration with STARC, e-Shuttle, Inc., and Fujitsu Ltd. This research was supported by the Semiconductor Technology Academic Research Center (STARC). This development was performed by the author for STARC as part of the “Silicon Implementation Support Program for Next Generation Semiconductor Circuit Architectures” sponsored by the Japanese Ministry of Economy, Trade and Industry.
Cite this article
Takagi, K., Tanaka, K., Izumi, S. et al. A Real-time Scalable Object Detection System using Low-power HOG Accelerator VLSI. J Sign Process Syst 76, 261–274 (2014). https://doi.org/10.1007/s11265-014-0870-7
DOI: https://doi.org/10.1007/s11265-014-0870-7