Abstract
In this work, we propose a 3D object recognition system optimized to operate under demanding time constraints. The system must be robust enough to recognize objects correctly in poor lighting conditions and in cluttered scenes with significant levels of occlusion. It must also meet an important requirement: reasonable performance on a low-power mobile GPU computing platform (NVIDIA Jetson TK1), so that it can be integrated into mobile robotics systems, ambient intelligence, or ambient-assisted living applications. The acquisition system uses the color and depth (RGB-D) data streams provided by low-cost 3D sensors such as the Microsoft Kinect or PrimeSense Carmine. The resulting system recognizes objects in a scene in less than 7 seconds, an interactive response time that allows its deployment on a mobile robotic platform. As a result, the system has many possible applications, ranging from mobile robot navigation and semantic scene labeling to human–computer interaction systems based on visual information. A video showing the proposed system performing online object recognition in various scenes is available on our project website (http://www.dtic.ua.es/~agarcia/3dobjrecog-jetsontk1/).
Acknowledgments
This work was partially funded by the national project SIRMAVED (DPI2013-40534-R). Experiments were made possible with a generous donation of hardware from NVIDIA.
About this article
Cite this article
Garcia-Garcia, A., Orts-Escolano, S., Garcia-Rodriguez, J. et al. Interactive 3D object recognition pipeline on mobile GPGPU computing platforms using low-cost RGB-D sensors. J Real-Time Image Proc 14, 585–604 (2018). https://doi.org/10.1007/s11554-016-0607-x
DOI: https://doi.org/10.1007/s11554-016-0607-x