Multi-modal Manhattan World Structure Estimation for Domestic Robots

Zhou, Kai; Mahesh Varadarajan, Karthik; Zillich, Michael; Vincze, Markus

doi:10.1007/978-3-662-43859-6_1

Part of the book series: Cognitive Systems Monographs (COSMOS, volume 23)

Abstract

Spatial structures typically encountered by robots in domestic environments conform to Manhattan spatial orientations. In other words, much of the 3D point cloud space conforms to one of three primal planar orientations. Hence, analysis of such planar spatial structures is significant in robotic environments. This process has become a fundamental component of diverse robot vision systems since the introduction of low-cost RGB-D cameras such as the Kinect, ASUS and PrimeSense devices, which have been widely mounted on various indoor robots. These commercial structured-light/time-of-flight depth cameras are capable of providing high-quality 3D reconstruction in real time. A number of techniques can be applied to determine multi-plane structure in 3D scenes. Most of these techniques require prior knowledge of the modality of the planes or the inlier scale of the data points in order to discriminate successfully between different planar structures. In this paper, we present a novel approach to estimating multi-plane structures without prior knowledge, based on the Jensen-Shannon divergence (JSD), a similarity measure used to represent pairwise relationships between data. Our JSD-based model incorporates information about whether pairwise relationships exist in a model's inlier data set, as well as the pairwise geometrical relationships between data points.

Tests on datasets comprising noisy inliers and a large percentage of outliers demonstrate that the proposed solution can efficiently estimate multiple models without prior information. Experimental results obtained with our model also demonstrate successful discrimination of multiple planar structures in both real and synthetic scenes. Practical tests with a robot vision system further demonstrate the validity of the proposed approach. Furthermore, it is shown that our model is not restricted to linear kernel models such as planes but can also be used to fit data using non-linear kernel models.
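
For readers unfamiliar with the Jensen-Shannon divergence named in the abstract, the following is a minimal sketch of its standard textbook definition for two discrete distributions: JSD(P, Q) = ½ KL(P ‖ M) + ½ KL(Q ‖ M), with M = ½ (P + Q). The abstract does not say how the chapter derives distributions from pairs of data points, so the function names and the example histograms below are illustrative assumptions rather than the authors' implementation.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    p and q are equal-length sequences of non-negative numbers summing to 1.
    With base-2 logarithms the result lies between 0 and 1.
    """
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical histograms, used only to exercise the function.
identical = js_divergence([0.5, 0.5], [0.5, 0.5])
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])
partial = js_divergence([0.7, 0.2, 0.1], [0.1, 0.2, 0.7])
```

Unlike the Kullback-Leibler divergence, this quantity is symmetric in its arguments and stays finite even when the two distributions have disjoint support, which is what makes it usable as a pairwise similarity measure.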

Author information

Authors and Affiliations

Automation and Control Institute, Vienna University of Technology, A-1040, Vienna, Austria
Kai Zhou, Karthik Mahesh Varadarajan, Michael Zillich & Markus Vincze

Corresponding author

Correspondence to Kai Zhou.

Editor information

Editors and Affiliations

Computer Science and Engineering, University of South Florida, Tampa, FL, USA
Yu Sun
Electrical and Computer Engineering and NanoScience Technology Center, University of Central Florida, Orlando, FL, USA
Aman Behal
Dept. of Mech. and Automation Eng., Vocational Training Council of Hong Kong, Hong Kong, China
Chi-Kit Ronald Chung

About this chapter

Cite this chapter

Zhou, K., Mahesh Varadarajan, K., Zillich, M., Vincze, M. (2015). Multi-modal Manhattan World Structure Estimation for Domestic Robots. In: Sun, Y., Behal, A., Chung, CK. (eds) New Development in Robot Vision. Cognitive Systems Monographs, vol 23. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43859-6_1

DOI: https://doi.org/10.1007/978-3-662-43859-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43858-9
Online ISBN: 978-3-662-43859-6
eBook Packages: Engineering, Engineering (R0)
