Autonomous Robots, Volume 42, Issue 2, pp 391–421

Monte Carlo planning for active object classification

  • Timothy Patten
  • Wolfram Martens
  • Robert Fitch
Part of the following topical collections:
  1. Active Perception


Classifying objects in complex unknown environments is a challenging problem in robotics and is fundamental in many applications. Modern sensors and sophisticated perception algorithms extract rich 3D textured information, but are limited to the data that are collected from a given location or path. We are interested in closing the loop around perception and planning, in particular to plan paths for better perceptual data, and focus on the problem of planning scanning sequences to improve object classification from range data. We formulate a novel time-constrained active classification problem and propose solution algorithms that employ a variation of Monte Carlo tree search to plan non-myopically. Our algorithms use a particle filter combined with Gaussian process regression to estimate joint distributions of object class and pose. This estimator is used in planning to generate a probabilistic belief about the state of objects in a scene, and also to generate beliefs for predicted sensor observations from future viewpoints. These predictions consider occlusions arising from predicted object positions and shapes. We evaluate our algorithms in simulation, in comparison to passive and greedy strategies. We also describe similar experiments where the algorithms are implemented online, using a mobile ground robot in a farm environment. Results indicate that our non-myopic approach outperforms both passive and myopic strategies, and clearly show the benefit of active perception for outdoor object classification.
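As a rough illustration of the tree-search ingredient mentioned above, the sketch below shows UCB1 action selection, the bandit rule commonly used inside Monte Carlo tree search. The abstract does not say which variation of the search the authors use, so this is an assumption for illustration only: the function name `ucb1_select`, its arguments, and the default exploration constant are hypothetical and not taken from the paper.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Return the index of the action with the highest UCB1 score.

    counts[i] is how many times action i has been tried so far.
    values[i] is the mean reward observed for action i.
    c is an assumed exploration constant (sqrt(2) gives standard UCB1).
    """
    # Try every untried action once before applying the bound.
    for i, n in enumerate(counts):
        if n == 0:
            return i

    total = sum(counts)
    # Mean reward plus an exploration bonus that shrinks as an action
    # is tried more often.
    scores = [
        v + c * math.sqrt(math.log(total) / n)
        for v, n in zip(values, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

In a viewpoint-planning setting, each action could stand for a candidate scan location and the reward for an estimate of classification improvement; setting `c=0.0` reduces the rule to a purely greedy choice over mean rewards.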


Active classification, Object classification, Sequential Monte Carlo, Monte Carlo tree search



This research is supported in part by the Australian Centre for Field Robotics, the New South Wales State Government, the Australian Research Council’s Discovery Projects funding scheme (Project Number DP140104203), and the Faculty of Engineering and Information Technologies at The University of Sydney under the Faculty Research Cluster Program. We thank Joel Veness, Oliver Cliff, and Graeme Best for helpful discussions. Thanks to Andrew Bate, Jocie Bate, Lasitha Piyathilaka, and Grant Louat at SwarmFarm Robotics for use of the robot and assistance with the hardware experiments. Thanks also to James Underwood and Alen Alempijevic for assistance with sensor calibration.

Supplementary material

Supplementary material 1 (mpeg 55018 KB)



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Australian Centre for Field Robotics (ACFR), The University of Sydney, Sydney, Australia
  2. Centre for Autonomous Systems (CAS), University of Technology Sydney, Ultimo, Australia
