Abstract
This paper presents a novel multi-cue framework for scene segmentation that combines appearance cues (grayscale images) with depth cues (dense stereo vision). An efficient 3D environment model is used to create a small set of meaningful free-form region hypotheses for object location and extent. These regions are subsequently categorized into several object classes using an extended multi-cue bag-of-features pipeline. To this end, we augment grayscale bag-of-features with bag-of-depth-features computed on dense disparity maps, and add height pooling to incorporate a 3D geometric ordering into our region descriptor.
In experiments on a large real-world stereo vision data set, we obtain state-of-the-art segmentation results at significantly reduced computational costs. Our dataset is made public for benchmarking purposes.
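The region descriptor outlined above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how height pooling is computed, so the code assumes that each local feature in a region has already been quantized to a visual-word index (one index per cue), that each feature has a known height above ground, and that pooling means building one word histogram per height band and concatenating them. The function names, the band boundaries, and the use of metres are all hypothetical.

```python
from bisect import bisect_right


def height_pooled_histogram(words, heights, vocab_size, band_edges):
    """Concatenate one visual-word histogram per height band, then L1-normalize.

    words      -- visual-word index of each local feature in the region
    heights    -- height above ground of each feature (assumed unit: metres)
    vocab_size -- number of visual words in the codebook
    band_edges -- ascending inner boundaries between height bands (assumed)
    """
    if len(words) != len(heights):
        raise ValueError("words and heights must have the same length")
    n_bands = len(band_edges) + 1
    hist = [0.0] * (n_bands * vocab_size)
    for word, height in zip(words, heights):
        # Band 0 lies below the first edge; a height equal to an edge
        # falls into the band above it.
        band = bisect_right(band_edges, height)
        hist[band * vocab_size + word] += 1.0
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist


def multi_cue_descriptor(gray_words, depth_words, heights,
                         gray_vocab, depth_vocab, band_edges):
    """Concatenate the height-pooled histograms of the two cues.

    Assumes grayscale and depth features are sampled at the same
    locations, so both share one list of heights.
    """
    gray = height_pooled_histogram(gray_words, heights, gray_vocab, band_edges)
    depth = height_pooled_histogram(depth_words, heights, depth_vocab, band_edges)
    return gray + depth
```

With a codebook of 2 grayscale words, 3 depth words, and a single band boundary at 1.0, `multi_cue_descriptor([0, 1, 1], [2, 0, 1], [0.2, 0.2, 1.5], 2, 3, [1.0])` yields a 10-dimensional vector: 2 bands times 2 words for the grayscale cue, followed by 2 bands times 3 words for the depth cue. Keeping the bands in a fixed bottom-to-top order is what gives the descriptor a geometric ordering in this sketch.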
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Scharwächter, T., Enzweiler, M., Franke, U., Roth, S. (2013). Efficient Multi-cue Scene Segmentation. In: Weickert, J., Hein, M., Schiele, B. (eds) Pattern Recognition. GCPR 2013. Lecture Notes in Computer Science, vol 8142. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40602-7_46
DOI: https://doi.org/10.1007/978-3-642-40602-7_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40601-0
Online ISBN: 978-3-642-40602-7
eBook Packages: Computer Science, Computer Science (R0)