Journal of Signal Processing Systems, Volume 90, Issue 6, pp 913–929

Image Processing Units on Ultra-low-cost Embedded Hardware: Algorithmic Optimizations for Real-time Performance

  • Suraj Nair (Email author)
  • Nikhil Somani
  • Artur Grunau
  • Emmanuel Dean-Leon
  • Alois Knoll


The design and development of image processing units (IPUs) has traditionally involved trade-offs between cost, real-time properties, portability, and ease of programming. A standard PC can be turned into an IPU relatively easily with the help of readily available computer vision libraries, but the end result will not be portable and may be costly. Similarly, field programmable gate arrays (FPGAs) can serve as the base for an IPU, but they are expensive and require hardware-level programming. Finally, general-purpose embedded hardware tends to be under-powered and difficult to develop for because of its poor support for running advanced software. In recent years a new option has surfaced: single-board computers (SBCs). These generally inexpensive embedded devices are attractive as a platform on which to develop IPUs because of their inherent portability and good compatibility with existing computer vision (CV) software. However, whether their performance is sufficient for real-time image processing has thus far remained an open question. Most SBCs (especially the ultra-low-cost ones which we target) do not offer CUDA/OpenCL support, which makes it difficult to port GPU-based CV applications. To utilize the full power of the SBCs, their GPUs need to be used. In our attempts at doing this, we have observed that the CV algorithms which an IPU uses have to be redesigned according to the OpenGL support available on these devices. This work presents a framework in which a selection of CV algorithms has been designed to optimize performance on SBCs while maintaining portability across devices that offer OpenGL ES 2.0 support. Furthermore, this paper demonstrates an IPU based on a representative SBC (namely the Raspberry Pi) along with two CV applications backed by it. The robustness of the applications as well as the performance of the IPU are evaluated to show that SBCs can be used to build IPUs capable of producing accurate data in real time. This opens up the possibility of large-scale, economical deployment of vision systems, especially in remote and barren areas. Finally, the software developed as part of this work has been released as open source.


Embedded vision, Image processing units, Human tracking



This work was financially supported by the Singapore National Research Foundation under its Campus for Research Excellence And Technological Enterprise (CREATE) programme.



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  • Suraj Nair (1), Email author
  • Nikhil Somani (1)
  • Artur Grunau (2)
  • Emmanuel Dean-Leon (2)
  • Alois Knoll (2)

  1. TUMCREATE, Singapore, Singapore
  2. Technische Universität München, München, Germany
