Segmentation and Recognition Using Structure from Motion Point Clouds

Brostow, Gabriel J.; Shotton, Jamie; Fauqueur, Julien; Cipolla, Roberto

doi:10.1007/978-3-540-88682-2_5

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5302)

Included in the following conference series:

European Conference on Computer Vision

Abstract

We propose an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion. We motivate five simple cues designed to model specific patterns of motion and 3D world structure that vary with object category. We introduce features that project the 3D cues back to the 2D image plane while modeling spatial layout and context. A randomized decision forest combines many such features to achieve a coherent 2D segmentation and recognize the object categories present. Our main contribution is to show how semantic segmentation is possible based solely on motion-derived 3D world structure. Our method works well on sparse, noisy point clouds, and unlike existing approaches, does not need appearance-based descriptors.

Experiments were performed on a challenging new video database containing sequences filmed from a moving car in daylight and at dusk. The results confirm that indeed, accurate segmentation and recognition are possible using only motion and 3D world structure. Further, we show that the motion-derived information complements an existing state-of-the-art appearance-based method, improving both qualitative and quantitative performance.
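
The abstract describes features that project 3D cues back to the 2D image plane while modeling spatial layout. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a pinhole camera with points already expressed in the camera frame, a single illustrative cue (height relative to the camera), and a square image region offset from the pixel being classified. All function and parameter names here are hypothetical. In a pipeline like the one described, many such responses with different cues, offsets, and region sizes would be fed to the randomized decision forest, which is not shown.

```python
from typing import Callable, List, Tuple

Point3D = Tuple[float, float, float]
Projected = Tuple[float, float, float]  # (u, v, cue value)


def height_cue(point: Point3D) -> float:
    """Illustrative cue: height relative to the camera (assumes the y axis points down)."""
    return -point[1]


def project_with_cue(points: List[Point3D], focal: float, cx: float, cy: float,
                     cue_fn: Callable[[Point3D], float]) -> List[Projected]:
    """Project camera-frame 3D points to pixels with a pinhole model and attach a cue value."""
    projected = []
    for x, y, z in points:
        if z <= 0.0:
            continue  # skip points behind the camera
        u = focal * x / z + cx
        v = focal * y / z + cy
        projected.append((u, v, cue_fn((x, y, z))))
    return projected


def box_feature(projected: List[Projected], pixel: Tuple[float, float],
                offset: Tuple[float, float], half_size: float) -> float:
    """Sum the cue values of projected points inside a square centered at pixel + offset."""
    center_u = pixel[0] + offset[0]
    center_v = pixel[1] + offset[1]
    total = 0.0
    for u, v, cue in projected:
        if abs(u - center_u) <= half_size and abs(v - center_v) <= half_size:
            total += cue
    return total


# Hypothetical sparse point cloud: two visible points and one behind the camera.
cloud = [(0.0, 0.0, 10.0), (1.0, -1.0, 10.0), (0.0, 0.0, -5.0)]
projected = project_with_cue(cloud, focal=100.0, cx=50.0, cy=50.0, cue_fn=height_cue)
response = box_feature(projected, pixel=(50.0, 50.0), offset=(10.0, -10.0), half_size=2.0)
```

Because the region is offset from the pixel, a feature of this kind can capture context, for example whether elevated structure lies above and to the right of the pixel, even when no 3D point projects onto the pixel itself. That property matters when the point cloud is sparse.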

Author information

Authors and Affiliations

University College London, UK, and ETH Zurich, Switzerland
Gabriel J. Brostow
Microsoft Research Cambridge, UK
Jamie Shotton
University of Cambridge, UK (now with MirriAd Ltd.)
Julien Fauqueur
University of Cambridge, UK
Roberto Cipolla

Editor information

Editors and Affiliations

Computer Science Department, University of Illinois at Urbana Champaign, 3310 Siebel Hall, Urbana, IL 61801, USA
David Forsyth
Department of Computing, Oxford Brookes University, OX33 1HX, Wheatley, Oxford, UK
Philip Torr
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

Cite this paper

Brostow, G.J., Shotton, J., Fauqueur, J., Cipolla, R. (2008). Segmentation and Recognition Using Structure from Motion Point Clouds. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88682-2_5

DOI: https://doi.org/10.1007/978-3-540-88682-2_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88681-5
Online ISBN: 978-3-540-88682-2
eBook Packages: Computer Science, Computer Science (R0)
