Object Categorization from RGB-D Local Features and Bag of Words

  • Jesus Martínez-Gómez
  • Miguel Cazorla
  • Ismael García-Varea
  • Cristina Romero-González
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 418)


Object categorization from robot perceptions has become one of the best-known problems in robotics. Selecting suitable representations for these perceptions, especially when using RGB-D images, has received significant attention in recent years. In this paper we present an approach to object categorization from RGB-D images. The approach is based on the Bag of Words (BoW) representation and can integrate any type of 3D local feature implemented in the Point Cloud Library (PCL). Experiments on the challenging RGB-D Object dataset show that competitive object categorization systems can be developed using this procedure.
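As an illustration of the BoW representation mentioned in the abstract, the following is a minimal generic sketch, not the authors' implementation. It assumes that local descriptors have already been extracted (for example by a PCL feature estimator) and are available as fixed-length lists of floats; here they are replaced by synthetic random vectors. The function names (`build_codebook`, `bow_histogram`, `synthetic_descriptors`), the use of plain k-means for the codebook, and all parameter values are assumptions made for this example only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the codebook entry (visual word) closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def build_codebook(descriptors, k, iterations=10, seed=0):
    """Cluster training descriptors into k visual words with plain k-means."""
    rng = random.Random(seed)
    codebook = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centre if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bow_histogram(descriptors, codebook):
    """Normalised histogram of visual-word occurrences for one object."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def synthetic_descriptors(n, dim, seed):
    """Hypothetical stand-in for descriptors computed from a point cloud."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


# Assumed sizes: 200 training descriptors of dimension 8, 5 visual words.
training = synthetic_descriptors(200, 8, seed=1)
codebook = build_codebook(training, k=5)

# One object described by 40 local descriptors becomes a 5-bin histogram.
histogram = bow_histogram(synthetic_descriptors(40, 8, seed=2), codebook)
```

The resulting fixed-length histogram can be passed to any standard classifier. Because only the descriptor vectors enter the pipeline, the type of local feature can be changed without altering the rest of the code, which is the property the abstract highlights.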


Object categorization, 3D features, Classification, Robotics





Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Jesus Martínez-Gómez (1, 2)
  • Miguel Cazorla (2)
  • Ismael García-Varea (1)
  • Cristina Romero-González (1)

  1. Computer System Department, University of Castilla-La Mancha, Ciudad Real, Spain
  2. Department of Computer Science and Artificial Intelligence, University of Alicante, Alicante, Spain
