Abstract
Natural human-robot interaction requires robots to link words to objects and actions, i.e. to ground them. Although grounding has been investigated in previous studies, few have considered the grounding of synonyms, and most of the employed models work only offline. In this paper, we fill this gap by introducing an online learning framework that grounds synonymous object and action names using cross-situational learning. Words are grounded through the geometric characteristics of objects and the kinematic features of the robot's joints during action execution. An interaction experiment between a human tutor and a Human Support Robot (HSR) is used to evaluate the proposed framework. The results show that the framework successfully grounds all used words.
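As a rough illustration of the cross-situational learning idea underlying the framework, the following sketch grounds each word to the perceptual category it co-occurs with most often across situations. The toy situations, percept labels, and winner-take-all rule are illustrative assumptions, not the paper's actual model.

```python
from collections import defaultdict

# Hypothetical situations: the words of an instruction paired with the
# percepts (action/object features) observed at the same time.
situations = [
    (["lift", "up", "the", "lemonade"], ["ACT_LIFT", "OBJ_CYLINDER"]),
    (["raise", "the", "lemonade"], ["ACT_LIFT", "OBJ_CYLINDER"]),
    (["push", "the", "book"], ["ACT_PUSH", "OBJ_CUBOID"]),
    (["lift", "up", "the", "book"], ["ACT_LIFT", "OBJ_CUBOID"]),
    (["push", "the", "lemonade"], ["ACT_PUSH", "OBJ_CYLINDER"]),
]

# Count word-percept co-occurrences across all situations.
counts = defaultdict(lambda: defaultdict(int))
for words, percepts in situations:
    for w in words:
        for p in percepts:
            counts[w][p] += 1

# Ground each word to its most frequently co-occurring percept.
grounding = {w: max(ps, key=ps.get) for w, ps in counts.items()}
```

Note how the synonyms "lift" and "raise" both end up grounded to the same action percept, because ambiguity that cannot be resolved within a single situation is resolved across several.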
Notes
- 1.
The threshold for the blob size was manually set after selecting the objects for the experiment and should be suitable for all objects of similar size.
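The blob-size filtering described in this note can be sketched as follows; the labels and threshold value are hypothetical, not those used in the experiment.

```python
from collections import Counter

# Hypothetical cluster labels from a segmentation step; -1 marks noise.
labels = [0, 0, 0, 0, 1, 1, -1]

# Manually chosen minimum blob size (illustrative value).
MIN_BLOB_SIZE = 3

# Keep only clusters whose size reaches the threshold.
sizes = Counter(l for l in labels if l != -1)
kept = {l for l, n in sizes.items() if n >= MIN_BLOB_SIZE}
print(kept)  # cluster 1 (only 2 points) is discarded
```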
- 2.
The DBSCAN implementation used is available in scikit-learn [18].
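A minimal usage sketch of that scikit-learn implementation on toy 3-D points; the `eps` and `min_samples` values are illustrative, not the settings used in the study.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy 3-D points: two well-separated blobs plus one isolated point.
points = np.array([
    [0.0, 0.0, 0.0], [0.01, 0.0, 0.0], [0.0, 0.01, 0.0],
    [1.0, 1.0, 1.0], [1.01, 1.0, 1.0], [1.0, 1.01, 1.0],
    [5.0, 5.0, 5.0],  # isolated point, labelled as noise (-1)
])

# Illustrative parameters: points within eps of a core point join its cluster.
labels = DBSCAN(eps=0.05, min_samples=3).fit_predict(points)
print(labels)  # two clusters (0 and 1) and one noise label (-1)
```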
- 3.
The used instructions contain only one auxiliary word, i.e. a word that has no corresponding percept: the article "the". They also contain eight phrases, e.g. "lord of the ring" or "lift up". In this study, two manually predefined dictionaries were used to identify auxiliary words and phrases; in future work, we will investigate creating these dictionaries automatically and in an unsupervised manner during grounding.
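One possible sketch of such dictionary-based preprocessing is shown below: known phrases are merged into single tokens and auxiliary words are dropped before grounding. The dictionary contents and the function itself are hypothetical, mirroring only the examples given in this note.

```python
# Hypothetical dictionaries mirroring the two used in the study.
AUXILIARY_WORDS = {"the"}
PHRASES = {
    ("lift", "up"): "lift_up",
    ("lord", "of", "the", "ring"): "lord_of_the_ring",
}

def preprocess(instruction):
    """Merge known phrases into single tokens and drop auxiliary words."""
    tokens = instruction.lower().split()
    out, i = [], 0
    while i < len(tokens):
        for phrase, merged in PHRASES.items():
            if tuple(tokens[i:i + len(phrase)]) == phrase:
                out.append(merged)
                i += len(phrase)
                break
        else:
            if tokens[i] not in AUXILIARY_WORDS:
                out.append(tokens[i])
            i += 1
    return out

print(preprocess("lift up the lemonade"))   # ['lift_up', 'lemonade']
print(preprocess("push the lord of the ring"))  # ['push', 'lord_of_the_ring']
```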
- 4.
The Human Support Robot from Toyota, which is used for the experiment, can move omnidirectionally and has a cylindrical body with one arm and gripper. It has 11 degrees of freedom and is equipped with a variety of sensors, such as stereo and wide-angle cameras [Official Toyota HSR Website].
- 5.
The latter is only used for sentences with the book object. For example: “lift up harry potter” represents the structure “action object”, while “lift up the lemonade” represents the structure “action the object”.
References
Aly, A., Taniguchi, A., Taniguchi, T.: A generative framework for multimodal learning of spatial concepts and object categories: an unsupervised part-of-speech tagging and 3D visual perception based approach. In: IEEE International Conference on Development and Learning and the International Conference on Epigenetic Robotics (ICDL-EpiRob), Lisbon, Portugal, September 2017
Blythe, R.A., Smith, K., Smith, A.D.M.: Learning times for large lexicons through cross-situational learning. Cogn. Sci. 34, 620–642 (2010)
Clark, E.V.: The principle of contrast: a constraint on language acquisition. In: Mechanisms of Language Acquisition, pp. 1–33. Lawrence Erlbaum Associates (1987)
Craye, C., Filliat, D., Goudou, J.F.: Environment exploration for object-based visual saliency learning. In: IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, May 2016
Dawson, C.R., Wright, J., Rebguns, A., Escárcega, M.V., Fried, D., Cohen, P.R.: A generative probabilistic framework for learning spatial language. In: IEEE Third Joint International Conference on Development and Learning and Epigenetic Robotics (ICDL), Osaka, Japan, August 2013
Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining (KDD), Portland, Oregon, USA, pp. 226–231, August 1996
Filin, S., Pfeifer, N.: Segmentation of airborne laser scanning data using a slope adaptive neighborhood. ISPRS J. Photogram. Remote Sens. (P&RS) 60, 71–80 (2006)
Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM (CACM) 24(6), 381–395 (1981)
Fisher, C., Hall, D.G., Rakowitz, S., Gleitman, L.: When it is better to receive than to give: syntactic and conceptual constraints on vocabulary growth. Lingua 92, 333–375 (1994)
Fontanari, J.F., Tikhanoff, V., Cangelosi, A., Ilin, R., Perlovsky, L.I.: Cross-situational learning of object-word mapping using neural modeling fields. Neural Netw. 22(5–6), 579–585 (2009)
Fontanari, J.F., Tikhanoff, V., Cangelosi, A., Perlovsky, L.I.: A cross-situational algorithm for learning a lexicon using neural modeling fields. In: International Joint Conference on Neural Networks (IJCNN), Atlanta, GA, USA, June 2009
Harnad, S.: The symbol grounding problem. Physica D 42, 335–346 (1990)
Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2(1), 193–218 (1985)
International Federation of Robotics: World robotics 2017 - service robots (2017)
Kemp, C.C., Edsinger, A., Torres-Jara, E.: Challenges for robot manipulation in human environments. IEEE Robot. Autom. Mag. 14(1), 20–29 (2007)
Koster, K., Spann, M.: MIR: an approach to robust clustering-application to range image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 22(5), 430–444 (2000)
Nguyen, A., Le, B.: 3D point cloud segmentation: a survey. In: 6th IEEE Conference on Robotics, Automation and Mechatronics (RAM). IEEE, Manila, November 2013
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Pinker, S.: Learnability and Cognition. MIT Press, Cambridge (1989)
Roesler, O., Aly, A., Taniguchi, T., Hayashi, Y.: A probabilistic framework for comparing syntactic and semantic grounding of synonyms through cross-situational learning. In: ICRA-18 Workshop on Representing a Complex World: Perception, Inference, and Learning for Joint Semantic, Geometric, and Physical Understanding, Brisbane, Australia, May 2018
Roesler, O., Aly, A., Taniguchi, T., Hayashi, Y.: Evaluation of word representations in grounding natural language instructions through computational human-robot interaction. In: Proceedings of the 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI), Daegu, South Korea, March 2019
Rusu, R.B., Bradski, G., Thibaux, R., Hsu, J.: Fast 3D recognition and pose using the viewpoint feature histogram. In: Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Taipei, Taiwan, pp. 2155–2162, October 2010
Sappa, A.D., Devy, M.: Fast range image segmentation by an edge detection strategy. In: Proceedings of the Third International Conference on 3-D Digital Imaging and Modeling (3DIM), Quebec City, Quebec, Canada, August 2002
Schnabel, R., Wahl, R., Klein, R.: Efficient RANSAC for point-cloud shape detection. Comput. Graphics Forum 26(2), 214–226 (2007)
Schubert, E., Sander, J., Ester, M., Kriegel, H.P., Xu, X.: DBSCAN revisited, revisited: why and how you should (still) use DBSCAN. ACM Trans. Database Syst. (TODS) 42(3), 19 (2017)
She, L., Yang, S., Cheng, Y., Jia, Y., Chai, J.Y., Xi, N.: Back to the blocks world: learning new actions through situated human-robot dialogue. In: Proceedings of the SIGDIAL 2014 Conference, Philadelphia, U.S.A., pp. 89–97, June 2014
Siskind, J.M.: A computational study of cross-situational techniques for learning word-to-meaning mappings. Cognition 61, 39–91 (1996)
Smith, A.D.M., Smith, K.: Cross-Situational Learning, pp. 864–866. Springer, Boston (2012). https://doi.org/10.1007/978-1-4419-1428-6_1712
Smith, K., Smith, A.D.M., Blythe, R.A.: Cross-situational learning: an experimental study of word-learning mechanisms. Cogn. Sci. 35(3), 480–498 (2011)
Steels, L., Loetzsch, M.: The grounded naming game. In: Steels, L. (ed.) Experiments in Cultural Language Evolution, pp. 41–59. John Benjamins, Amsterdam (2012)
Strom, J., Richardson, A., Olson, E.: Graph-based segmentation for colored 3D laser point clouds. In: International Conference on Intelligent Robots and Systems (IROS), Taipei, Taiwan (2010)
Taniguchi, A., Taniguchi, T., Cangelosi, A.: Cross-situational learning with Bayesian generative models for multimodal category and word learning in robots. Front. Neurorobot. 11, 66 (2017)
Tellex, S., Kollar, T., Dickerson, S., Walter, M.R., Banerjee, A.G., Teller, S., Roy, N.: Approaching the symbol grounding problem with probabilistic graphical models. AI Mag. 32(4), 64–76 (2011)
Toyota Motor Corporation: HSR Manual, 2017.4.17 edn., April 2017
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Roesler, O. (2020). A Cross-Situational Learning Based Framework for Grounding of Synonyms in Human-Robot Interactions. In: Silva, M., Luís Lima, J., Reis, L., Sanfeliu, A., Tardioli, D. (eds) Robot 2019: Fourth Iberian Robotics Conference. ROBOT 2019. Advances in Intelligent Systems and Computing, vol 1093. Springer, Cham. https://doi.org/10.1007/978-3-030-36150-1_19
Print ISBN: 978-3-030-36149-5
Online ISBN: 978-3-030-36150-1
eBook Packages: Intelligent Technologies and Robotics; Intelligent Technologies and Robotics (R0)