
Spatially-dependent Bayesian semantic perception under model and localization uncertainty

Autonomous Robots

Abstract

Semantic perception can provide autonomous robots operating under uncertainty with a more efficient representation of their environment and a better ability to obtain correct loop closures than geometric features alone. However, accurate inference of semantics requires measurement models that correctly capture properties of semantic detections such as viewpoint dependence, spatial correlations, and intra- and inter-class variations. Such models should also gracefully handle open-set conditions that may be encountered, keeping track of the resulting model uncertainty. We propose a method for robust visual classification of an object of interest observed from multiple views in the presence of significant localization uncertainty, classifier noise, and possible dataset shift. We use a viewpoint-dependent measurement model to capture viewpoint dependence and spatial correlations in classifier scores, and show how to use it in the presence of localization uncertainty. Assuming a Bayesian classifier that provides a measure of uncertainty, we show how its outputs can be fused in the context of the above model, allowing robust classification under model uncertainty when novel scenes are encountered. We present a statistical evaluation of our method both in synthetic simulation and in a 3D environment where rendered images are fed into a deep neural network classifier. We compare against baseline methods in scenarios of varying difficulty, showing improved robustness of our method to localization uncertainty and dataset shift. Finally, we validate our contribution with respect to localization uncertainty on a dataset of real-world images.
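To make the fusion idea described above concrete, the following is a minimal Python sketch of Bayesian multi-view fusion with a viewpoint-dependent score likelihood, marginalized over sampled camera poses to account for localization uncertainty. It is an illustration only, not the paper's method: the functions expected_score, score_likelihood, and fuse_views, and the simple cosine-shaped score model, are hypothetical stand-ins for the learned, spatially correlated classifier model used in the paper.

```python
import numpy as np

def expected_score(viewpoint, cls):
    # Toy stand-in for a learned viewpoint-dependent score model:
    # class 0 is assumed to be recognized best near angle 0, class 1 near pi/2.
    peak = 0.0 if cls == 0 else np.pi / 2
    return 0.5 + 0.4 * np.cos(viewpoint - peak)

def score_likelihood(score, viewpoint, cls, var=0.05):
    # Gaussian likelihood of an observed classifier score given class and viewpoint.
    mean = expected_score(viewpoint, cls)
    return np.exp(-0.5 * (score - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def fuse_views(scores, pose_samples, prior=(0.5, 0.5)):
    """Fuse per-view classifier scores into a class posterior, marginalizing
    each view's likelihood over pose samples drawn from the localization belief."""
    log_post = np.log(np.asarray(prior, dtype=float))
    for score, samples in zip(scores, pose_samples):
        for c in range(len(log_post)):
            # Average the viewpoint-dependent likelihood over pose samples.
            lik = np.mean([score_likelihood(score, v, c) for v in samples])
            log_post[c] += np.log(lik + 1e-12)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

# Example: three views with noisy pose estimates around nominal viewing angles.
rng = np.random.default_rng(0)
nominal_views = [0.1, 0.4, 1.2]
pose_samples = [rng.normal(v, 0.2, size=50) for v in nominal_views]
scores = [0.85, 0.80, 0.55]  # classifier scores for the object at each view
print(fuse_views(scores, pose_samples))  # posterior over the two candidate classes
```

Averaging the likelihood over pose samples is one simple way to approximate marginalization over the camera-pose posterior; the paper's model additionally captures spatial correlations between scores of nearby views and the classifier's own model uncertainty, which this sketch omits.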





Author information


Corresponding author

Correspondence to Yuri Feldman.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was partially supported by the Israel Ministry of Science & Technology (MOST).


About this article


Cite this article

Feldman, Y., Indelman, V. Spatially-dependent Bayesian semantic perception under model and localization uncertainty. Auton Robot 44, 1091–1119 (2020). https://doi.org/10.1007/s10514-020-09921-0

