Interactive 3D object recognition pipeline on mobile GPGPU computing platforms using low-cost RGB-D sensors

  • Special Issue Paper
Journal of Real-Time Image Processing

Abstract

In this work, we propose the implementation of a 3D object recognition system optimized to operate under demanding time constraints. The system must be robust enough to recognize objects correctly in poor lighting conditions and in cluttered scenes with significant levels of occlusion. It must also meet an important requirement: reasonable performance on a low-power mobile GPU computing platform (NVIDIA Jetson TK1), so that it can be integrated into mobile robotics systems and into ambient intelligence or ambient-assisted living applications. Acquisition relies on the color and depth (RGB-D) data streams provided by low-cost 3D sensors such as the Microsoft Kinect or the PrimeSense Carmine. The resulting system recognizes objects in a scene in less than 7 seconds, offering an interactive frame rate and thus allowing its deployment on a mobile robotic platform. As a result, the system has many possible applications, ranging from mobile robot navigation and semantic scene labeling to human–computer interaction systems based on visual information. A video showing the proposed system performing online object recognition in various scenes is available on our project website (http://www.dtic.ua.es/~agarcia/3dobjrecog-jetsontk1/).

Notes

  1. https://developer.nvidia.com/meet-jetson-embedded-platform.

  2. https://www.microsoft.com/en-us/kinectforwindows/.

  3. http://ark.intel.com/es-es/products/65702/Intel-Core-i5-3570-Processor-6M-Cache-up-to-3_80-GHz.

  4. http://www.dtic.ua.es/~agarcia/3dobjrecog-jetsontk1/.

Acknowledgments

This work was partially funded by the national project SIRMAVED (DPI2013-40534-R). Experiments were made possible with a generous donation of hardware from NVIDIA.

Author information

Corresponding author

Correspondence to Alberto Garcia-Garcia.

About this article

Cite this article

Garcia-Garcia, A., Orts-Escolano, S., Garcia-Rodriguez, J. et al. Interactive 3D object recognition pipeline on mobile GPGPU computing platforms using low-cost RGB-D sensors. J Real-Time Image Proc 14, 585–604 (2018). https://doi.org/10.1007/s11554-016-0607-x

  • DOI: https://doi.org/10.1007/s11554-016-0607-x
