A Deep Learning Approach for Object Recognition with NAO Soccer Robots

  • Dario Albani
  • Ali Youssef
  • Vincenzo Suriani
  • Daniele Nardi
  • Domenico Daniele Bloisi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9776)

Abstract

The use of identical robots in the RoboCup Standard Platform League (SPL) has made software development the key factor in achieving good results in competitions. In particular, the visual detection process is crucial for extracting information about the environment. In this paper, we present a novel approach for object detection and classification based on Convolutional Neural Networks (CNN). The approach is designed to be used by NAO robots and consists of two stages: image region segmentation, for reducing the search space, and Deep Learning, for validation. The proposed method can be easily extended to deal with different objects and adapted for use in other RoboCup leagues. Quantitative experiments have been conducted on a data set of annotated images captured in real conditions from NAO robots in action. The data set used is made available to the community.
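The two-stage pipeline described above can be sketched as follows. This is a minimal, illustrative example only: the paper's actual segmentation and CNN classifier are not reproduced here, so `propose_regions` stands in for the region-segmentation stage (here, naive intensity thresholding) and `validate` is a placeholder for the CNN validation stage; all function names and thresholds are assumptions.

```python
import numpy as np

def propose_regions(img, thresh=200):
    """Stage 1 (illustrative): segment candidate regions to reduce the
    search space. Here we simply bound all bright pixels in one box;
    the paper's method performs proper region segmentation."""
    ys, xs = np.nonzero(img > thresh)
    if len(xs) == 0:
        return []
    return [(int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1)]

def validate(img, box):
    """Stage 2 (placeholder): accept a candidate region. A real system
    would run a CNN classifier on the cropped region instead."""
    x0, y0, x1, y1 = box
    return float(img[y0:y1, x0:x1].mean()) > 150

def detect(img):
    """Full pipeline: propose candidate regions, then validate each one."""
    return [b for b in propose_regions(img) if validate(img, b)]

# Toy 8x8 grayscale image with one bright blob standing in for a ball.
img = np.zeros((8, 8), dtype=np.uint8)
img[2:5, 3:6] = 255
print(detect(img))  # [(3, 2, 6, 5)]
```

The design point the sketch illustrates is the division of labor: a cheap segmentation pass keeps the number of regions handed to the (comparatively expensive) neural validator small, which is what makes the approach feasible on the NAO's limited hardware.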

Keywords

Robot vision · Deep Learning · RoboCup SPL · NAO robots

Notes

Acknowledgment

We wish to acknowledge the Multi-Sensor Interactive Systems Group, Faculty 3 (Mathematics and Computer Science), University of Bremen, for providing a large part of the images used in the SPQR NAO image data set.


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Dario Albani (1)
  • Ali Youssef (1)
  • Vincenzo Suriani (1)
  • Daniele Nardi (1)
  • Domenico Daniele Bloisi (1)
  1. Department of Computer, Control, and Management Engineering, Sapienza University of Rome, Rome, Italy