Interactive Open-Ended Learning for 3D Object Recognition: An Approach and Experiments

Abstract

3D object detection and recognition is increasingly used for manipulation and navigation tasks in service robots. It involves segmenting the objects present in a scene, computing a feature descriptor for each object view and, finally, recognizing the object view by comparing it to the known object categories. This paper presents an efficient approach capable of learning and recognizing object categories in an interactive and open-ended manner. Here, “open-ended” means that the set of object categories to be learned is not known in advance. The training instances are extracted from the on-line experiences of a robot, and thus become available gradually over time rather than at the beginning of the learning process. The paper focuses on two open questions: (1) How can objects in 3D scenes be automatically detected, conceptualized and recognized in an open-ended manner? (2) How can high-level knowledge obtained from interaction with human users, namely the category labels they provide, be acquired and used to improve system performance? The approach starts with a pre-processing step that removes irrelevant data and prepares a suitable point cloud for subsequent processing. Clustering is then applied to detect object candidates, and object views are described using a 3D shape descriptor called the spin-image. Finally, a nearest-neighbor classification rule is used to predict the categories of the detected objects. Leave-one-out cross-validation is used to compute precision and recall, in a classical off-line evaluation setting, for different system parameters. In addition, an on-line evaluation protocol is used to assess the performance of the system in an open-ended setting. Results show that the proposed system is able to interact with human users and to learn new object categories continuously over time.
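The classification and off-line evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `OpenEndedNN`, the function `leave_one_out`, the use of Euclidean distance, and the macro-averaging of precision and recall are all assumptions made for illustration, and the descriptor vectors are plain numeric stand-ins for spin-image features.

```python
import numpy as np


class OpenEndedNN:
    """Hypothetical instance-based learner whose category set can grow at any time."""

    def __init__(self):
        self.views = []   # stored descriptor vectors (stand-ins for spin-image features)
        self.labels = []  # category label taught for each stored view

    def teach(self, descriptor, label):
        """Store a labelled view; an unseen label simply becomes a new category."""
        self.views.append(np.asarray(descriptor, dtype=float))
        self.labels.append(label)

    def predict(self, descriptor):
        """Return the label of the closest stored view, or None if nothing is known yet."""
        if not self.views:
            return None
        query = np.asarray(descriptor, dtype=float)
        # Euclidean distance is an assumption made for this sketch.
        dists = np.linalg.norm(np.stack(self.views) - query, axis=1)
        return self.labels[int(np.argmin(dists))]


def leave_one_out(descriptors, labels):
    """Classify each view against all others; return macro-averaged precision and recall."""
    predictions = []
    for i in range(len(descriptors)):
        learner = OpenEndedNN()
        for j, (d, l) in enumerate(zip(descriptors, labels)):
            if j != i:  # hold out view i, train on the rest
                learner.teach(d, l)
        predictions.append(learner.predict(descriptors[i]))

    precisions, recalls = [], []
    for c in sorted(set(labels)):
        tp = sum(p == c and t == c for p, t in zip(predictions, labels))
        fp = sum(p == c and t != c for p, t in zip(predictions, labels))
        fn = sum(p != c and t == c for p, t in zip(predictions, labels))
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return float(np.mean(precisions)), float(np.mean(recalls))
```

Because `teach` only appends to the stored views, a label the learner has never seen before needs no retraining step, which is the property the open-ended setting relies on. A real system would replace the toy vectors with descriptors computed from segmented point clouds.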

Author information

Corresponding author

Correspondence to S. Hamidreza Kasaei.

About this article

Cite this article

Kasaei, S.H., Oliveira, M., Lim, G.H. et al. Interactive Open-Ended Learning for 3D Object Recognition: An Approach and Experiments. J Intell Robot Syst 80, 537–553 (2015). https://doi.org/10.1007/s10846-015-0189-z

Keywords

  • Open-ended learning
  • 3D object recognition
  • Spin-image descriptor
  • Autonomous robots