3D Layout Propagation to Improve Object Recognition in Egocentric Videos

Rituerto, Alejandro; Murillo, Ana C.; Guerrero, José J.

doi:10.1007/978-3-319-16199-0_58

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8927)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Intelligent systems need complex and detailed models of their environment to perform more sophisticated tasks, such as assisting the user. Vision sensors provide rich information and are broadly used to obtain these models; for example, indoor scene modeling from monocular images has been widely studied. A common initial step in those settings is the estimation of the 3D layout of the scene. While most previous approaches obtain the scene layout from a single image, this work presents a novel approach to estimate the initial layout and addresses the problem of how to propagate it through a video. We propose to use a particle filter framework for this propagation process and describe how to generate and sample new layout hypotheses for the scene in each of the following frames. We present different ways to evaluate and rank these hypotheses. The experimental validation is run on two recent and publicly available datasets and shows promising results on the estimation of a basic 3D layout. Our experiments demonstrate how this layout information can be used to improve detection tasks useful for a human user, in particular sign detection, by easily rejecting false positives.
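The propagation loop summarized above can be illustrated with a generic particle filter step. The following Python sketch is not the authors' implementation: the tuple-of-floats hypothesis representation, the Gaussian perturbation, and the names `normalize`, `propagate_step` and `score_fn` are assumptions made for illustration, and `score_fn` stands in for whichever hypothesis evaluation is actually used.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical parameterization: a layout hypothesis is a tuple of floats
# (for instance, image positions of wall-floor boundaries). This is an assumption.
Hypothesis = Tuple[float, ...]


def normalize(scores: Sequence[float]) -> List[float]:
    """Turn non-negative scores into weights that sum to 1."""
    total = sum(scores)
    if total <= 0:
        # No hypothesis has support: fall back to uniform weights.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def propagate_step(
    particles: Sequence[Hypothesis],
    weights: Sequence[float],
    score_fn: Callable[[Hypothesis], float],
    rng: random.Random,
    noise_std: float = 1.0,
) -> Tuple[List[Hypothesis], List[float]]:
    """One generic particle filter iteration for a new frame.

    1. Sample parent hypotheses in proportion to their previous weights.
    2. Perturb each parent with Gaussian noise to generate new hypotheses.
    3. Evaluate the new hypotheses on the current frame and normalize.
    """
    parents = rng.choices(list(particles), weights=list(weights), k=len(particles))
    children = [
        tuple(x + rng.gauss(0.0, noise_std) for x in parent) for parent in parents
    ]
    new_weights = normalize([score_fn(child) for child in children])
    return children, new_weights


def best_hypothesis(
    particles: Sequence[Hypothesis], weights: Sequence[float]
) -> Hypothesis:
    """Return the highest-weighted hypothesis, used as the frame's layout estimate."""
    return max(zip(particles, weights), key=lambda pw: pw[1])[0]
```

In use, `propagate_step` would be called once per frame with a `score_fn` bound to that frame's image evidence, and `best_hypothesis` would give the layout reported for the frame. The choice to return the single best hypothesis rather than a weighted average is also an assumption of this sketch.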

We would like to thank Prof. Roberto Manduchi for his comments and suggestions, which helped us to improve the present work. This work was supported by the Spanish FPI grant BES-2010-030299 and Spanish projects DPI2012-31781, DGA-T04-FSE and TAMA.

Author information

Authors and Affiliations

Instituto de Investigación en Ingeniería de Aragón, University of Zaragoza, Zaragoza, Spain
Alejandro Rituerto, Ana C. Murillo & José J. Guerrero

Corresponding author

Correspondence to Alejandro Rituerto.

Editor information

Editors and Affiliations

University College London, London, United Kingdom
Lourdes Agapito
University of Lugano, Lugano, Switzerland
Michael M. Bronstein
Technische Universität Dresden, Dresden, Germany
Carsten Rother

About this paper

Cite this paper

Rituerto, A., Murillo, A.C., Guerrero, J.J. (2015). 3D Layout Propagation to Improve Object Recognition in Egocentric Videos. In: Agapito, L., Bronstein, M., Rother, C. (eds) Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science, vol 8927. Springer, Cham. https://doi.org/10.1007/978-3-319-16199-0_58

DOI: https://doi.org/10.1007/978-3-319-16199-0_58
Published: 20 March 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16198-3
Online ISBN: 978-3-319-16199-0
eBook Packages: Computer Science, Computer Science (R0)
