Efficient Multi-cue Scene Segmentation

  • Conference paper
Pattern Recognition (GCPR 2013)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8142)

Abstract

This paper presents a novel multi-cue framework for scene segmentation that combines appearance cues (grayscale images) with depth cues (dense stereo vision). An efficient 3D environment model is used to generate a small set of meaningful free-form region hypotheses for object location and extent. These regions are subsequently classified into several object classes using an extended multi-cue bag-of-features pipeline. To this end, we augment grayscale bag-of-features with bag-of-depth-features operating on dense disparity maps, and we add height pooling to incorporate a 3D geometric ordering into our region descriptor.
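As a rough illustration of the region descriptor described above, the following sketch pools quantized features by height and concatenates the grayscale and depth histograms. It is a minimal example under stated assumptions, not the authors' implementation: the abstract does not specify the local descriptors, codebook sizes, number of height bands, or normalization, so the function names (`height_pooled_histogram`, `region_descriptor`), the band edges, and the global L1 normalization are all hypothetical choices.

```python
import numpy as np

def height_pooled_histogram(word_ids, heights, vocab_size, height_edges):
    """Build one visual-word histogram per height band and concatenate them.

    word_ids     : codebook index of each local feature in the region (assumed given)
    heights      : height above ground of each feature, e.g. in metres (assumed given)
    height_edges : ascending band boundaries; an illustrative assumption
    """
    word_ids = np.asarray(word_ids, dtype=int)
    heights = np.asarray(heights, dtype=float)
    n_bands = len(height_edges) - 1
    # Assign each feature to a band; out-of-range heights fall into the first or last band.
    bands = np.clip(np.digitize(heights, height_edges) - 1, 0, n_bands - 1)
    hist = np.zeros((n_bands, vocab_size), dtype=float)
    # Unbuffered accumulation so repeated (band, word) pairs are all counted.
    np.add.at(hist, (bands, word_ids), 1.0)
    total = hist.sum()
    if total > 0:
        hist /= total  # L1-normalise over all bands jointly (assumed normalisation)
    return hist.ravel()

def region_descriptor(gray_words, depth_words, heights,
                      gray_vocab, depth_vocab, height_edges):
    """Concatenate height-pooled grayscale and depth histograms for one region.

    Assumes grayscale and depth features are sampled at the same locations,
    so a single heights array applies to both.
    """
    gray = height_pooled_histogram(gray_words, heights, gray_vocab, height_edges)
    depth = height_pooled_histogram(depth_words, heights, depth_vocab, height_edges)
    return np.concatenate([gray, depth])
```

With three features at heights 0.5, 1.5, and 3.0 and band edges `[0, 1, 2, 4]`, each feature lands in its own band, so the ordering of words by height is preserved in the descriptor rather than being averaged away as in a single orderless histogram.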

In experiments on a large real-world stereo vision data set, we obtain state-of-the-art segmentation results at significantly reduced computational costs. Our dataset is made public for benchmarking purposes.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Scharwächter, T., Enzweiler, M., Franke, U., Roth, S. (2013). Efficient Multi-cue Scene Segmentation. In: Weickert, J., Hein, M., Schiele, B. (eds) Pattern Recognition. GCPR 2013. Lecture Notes in Computer Science, vol 8142. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40602-7_46

  • DOI: https://doi.org/10.1007/978-3-642-40602-7_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40601-0

  • Online ISBN: 978-3-642-40602-7

  • eBook Packages: Computer Science, Computer Science (R0)
