Combining Textural and Geometrical Descriptors for Scene Recognition

  • Neslihan Bayramoğlu
  • Janne Heikkilä
  • Matti Pietikäinen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7584)


Local image description is a common technique in computer vision research. With recent improvements in RGB-D cameras, local description of 3D data has also become practical, and a growing number of studies make use of this extra information. However, their applicability is limited by the lack of generic methods for combining descriptors. In this paper, we propose combining textural and geometrical descriptors for scene recognition from RGB-D data. The combination methods, together with the normalization stages proposed here, can be applied to any descriptors obtained from the 2D and 3D domains. This study presents and evaluates different ways of combining multi-modal descriptors within the bag-of-words (BoW) approach in the context of indoor scene localization. The rough location of a query is determined from pre-recorded images and depth maps by unsupervised image matching.
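The abstract does not spell out the combination schemes, so the following is only a minimal sketch of two common ways to fuse BoW representations from two modalities: concatenating normalized histograms, and merging per-modality similarity scores after min-max normalization. All function names, the weighting parameter, and the choice of histogram intersection as similarity are assumptions made for illustration, not the authors' method.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Quantize local descriptors to their nearest visual word and
    return an L1-normalized word histogram (hypothetical helper)."""
    # Squared Euclidean distance from every descriptor to every word.
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def fuse_histograms(h_texture, h_geometry, weight=0.5):
    """Histogram-level fusion: weighted concatenation of two
    L1-normalized histograms (assumed scheme)."""
    return np.concatenate([weight * h_texture, (1.0 - weight) * h_geometry])

def histogram_intersection(h1, h2):
    """Similarity between two histograms; 1.0 for identical L1-normalized inputs."""
    return float(np.minimum(h1, h2).sum())

def fuse_scores(s_texture, s_geometry, weight=0.5):
    """Score-level fusion: min-max normalize each modality's scores
    over the database, then take a weighted sum (assumed scheme)."""
    def minmax(s):
        s = np.asarray(s, dtype=float)
        span = s.max() - s.min()
        return (s - s.min()) / span if span > 0 else np.zeros_like(s)
    return weight * minmax(s_texture) + (1.0 - weight) * minmax(s_geometry)
```

In a localization setting, the query would be compared against every pre-recorded frame, and the frame with the highest fused similarity would give the rough location. Normalizing before fusion matters because the textural and geometrical descriptors generally produce histograms and scores on different scales.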


2D/3D description, feature fusion, localization



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Neslihan Bayramoğlu (1)
  • Janne Heikkilä (1)
  • Matti Pietikäinen (1)
  1. Center for Machine Vision Research, University of Oulu, Finland
