Journal of Real-Time Image Processing, Volume 14, Issue 3, pp 585–604

Interactive 3D object recognition pipeline on mobile GPGPU computing platforms using low-cost RGB-D sensors

  • Alberto Garcia-Garcia
  • Sergio Orts-Escolano
  • Jose Garcia-Rodriguez
  • Miguel Cazorla
Special Issue Paper


In this work, we propose the implementation of a 3D object recognition system optimized to operate under demanding time constraints. The system must be robust, so that objects are recognized correctly in poor lighting conditions and in cluttered scenes with significant levels of occlusion. An important requirement must also be met: the system must exhibit reasonable performance on a low-power mobile GPU computing platform (NVIDIA Jetson TK1), so that it can be integrated into mobile robotics systems and ambient intelligence or ambient-assisted living applications. The acquisition system relies on the color and depth (RGB-D) data streams provided by low-cost 3D sensors such as the Microsoft Kinect or PrimeSense Carmine. The resulting system recognizes objects in a scene in less than 7 seconds, offering an interactive frame rate and thus allowing its deployment on a mobile robotic platform. As a result, the system has many possible applications, ranging from mobile robot navigation and semantic scene labeling to human–computer interaction systems based on visual information. A video showing the proposed system performing online object recognition in various scenes is available on our project website (


Real-time, GPGPU, RGB-D data, CUDA, Object recognition



This work was partially funded by the national project SIRMAVED (DPI2013-40534-R). Experiments were made possible with a generous donation of hardware from NVIDIA.


Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Alberto Garcia-Garcia (1), corresponding author
  • Sergio Orts-Escolano (1)
  • Jose Garcia-Rodriguez (1)
  • Miguel Cazorla (1)

  1. Computer Technology Department, University of Alicante, Alicante, Spain
