Abstract
The Graphics Processing Unit (GPU) has expanded its role from an accelerator for rendering graphics into an efficient parallel processor for general-purpose computing. The GPU, an indispensable component in desktop and server-class computers as well as game consoles, has also become an integrated component in handheld devices such as smartphones. Because handheld devices are mostly battery-powered, the mobile GPU is usually designed with an emphasis on low power rather than on performance. In addition, the memory bus architecture of mobile devices differs considerably from that of desktops, servers, and game consoles. In this paper, we address the following two questions: (1) Can a mobile GPU be used as a powerful accelerator for general-purpose computing on the mobile platform, similar to its role on desktop and server platforms? (2) What is the role of a mobile GPU in energy-optimized real-time mobile applications? We use face recognition, a compute-intensive task and a core process in several mobile applications, as the application driver. The experiments were performed on an Nvidia Tegra development board, which integrates a dual-core ARM Cortex A9 CPU and an Nvidia mobile GPU in a single SoC. The experimental results show that using the mobile GPU achieves a 4.25x speedup in performance and a 3.98x reduction in energy consumption compared with a CPU-only implementation on the same platform.
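The two reported figures can be related through a simple back-of-the-envelope calculation. The sketch below is not taken from the paper: it assumes that energy is average power multiplied by execution time, and that both figures refer to the same workload and the same measurement boundary. The function name `implied_power_ratio` is hypothetical and is introduced only for illustration.

```python
def implied_power_ratio(speedup: float, energy_reduction: float) -> float:
    """Return the average-power ratio P_gpu / P_cpu implied by E = P_avg * T.

    Assumptions (not stated in the abstract):
      speedup          = T_cpu / T_gpu  (same workload)
      energy_reduction = E_cpu / E_gpu  (same measurement boundary)
    Then P_gpu / P_cpu = (E_gpu / T_gpu) / (E_cpu / T_cpu)
                       = speedup / energy_reduction.
    """
    if speedup <= 0 or energy_reduction <= 0:
        raise ValueError("both ratios must be positive")
    return speedup / energy_reduction


# Figures quoted in the abstract: 4.25x speedup, 3.98x energy reduction.
ratio = implied_power_ratio(4.25, 3.98)
print(f"implied average power ratio (CPU+GPU vs CPU-only): {ratio:.3f}")
```

Under these assumptions the ratio is roughly 1.07. This suggests that the GPU-assisted configuration draws slightly more average power while running, and that the energy saving comes mainly from the shorter execution time.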
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, YC., Donyanavard, B., Cheng, KT. (2012). Energy-Aware Real-Time Face Recognition System on Mobile CPU-GPU Platform. In: Kutulakos, K.N. (eds) Trends and Topics in Computer Vision. ECCV 2010. Lecture Notes in Computer Science, vol 6554. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35740-4_32
DOI: https://doi.org/10.1007/978-3-642-35740-4_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35739-8
Online ISBN: 978-3-642-35740-4
eBook Packages: Computer Science, Computer Science (R0)