Inference of Other’s Minds with Limited Information in Evolutionary Robotics

Abstract

Theory of mind (ToM) is the ability to understand others’ mental states (e.g., intentions). Studies on human ToM show that the way we understand others’ mental states is very efficient, in the sense that observing only some portion of others’ behaviors can lead to successful performance. Recently, ToM has gained interest in robotics to build robots that can engage in complex social interactions. Although it has been shown that robots can infer others’ internal states, there has been limited focus on the data utilization of ToM mechanisms in robots. Here we show that robots can infer others’ intentions based on limited information by selectively and flexibly using behavioral cues similar to humans. To test such data utilization, we impaired certain parts of an actor robot’s behavioral information given to the observer, and compared the observer’s performance under each impairment condition. We found that although the observer’s performance was not perfect compared to when all information was available, it could infer the actor’s mind to a degree if the goal-relevant information was intact. These results demonstrate that, similar to humans, robots can learn to infer others’ mental states with limited information.


Notes

  1. http://www.geforce.com/Hardware/Technologies/physx

References

  1. Premack D, Woodruff G (1978) Does the chimpanzee have a theory of mind? Behav Brain Sci 1(4):515–526
  2. Gallese V, Goldman A (1998) Mirror neurons and the simulation theory of mind-reading. Trends Cogn Sci 2:493–501
  3. Baron-Cohen S, Wheelwright S, Jolliffe T (1997) Is there a “language of the eyes”? Evidence from normal adults, and adults with autism or Asperger syndrome. Vis Cogn 4:311–331
  4. Carpenter M, Call J, Tomasello M (2005) Twelve- and 18-month-olds copy actions in terms of goals. Dev Sci 8:F13–F20
  5. Fong T, Nourbakhsh I, Dautenhahn K (2003) A survey of socially interactive robots. Robot Auton Syst 42:143–166
  6. Kim K-J, Lipson H (2009) Towards a “theory of mind” in simulated robots. In: Proceedings of the 11th annual conference companion on genetic and evolutionary computation conference. pp 2071–2076
  7. Kim K-J, Lipson H (2009) Towards a simple robotic theory of mind. In: Proceedings of the 9th workshop on performance metrics for intelligent systems. pp 131–138
  8. Kim K-J, Cho S-B (2015) Inference of other’s internal neural models from active observation. BioSystems 128:37–47
  9. Meltzoff AN, Decety J (2003) What imitation tells us about social cognition: a rapprochement between developmental psychology and cognitive neuroscience. Philos Trans R Soc Lond B Biol Sci 358:491–500
  10. Meltzoff AN, Moore M (1997) Explaining facial imitation: a theoretical model. Early Dev Parent 6:179–192
  11. Hillyard SA, Hink RF, Schwent VL, Picton TW (1973) Electrical signs of selective attention in the human brain. Science 182:177–180
  12. Moran J, Desimone R (1985) Selective attention gates visual processing in the extrastriate cortex. Science 229:782–784
  13. Baldwin DA, Baird JA, Saylor MM, Clark MA (2001) Infants parse dynamic action. Child Dev 72:708–717
  14. Zadny J, Gerard HB (1974) Attributed intentions and informational selectivity. J Exp Soc Psychol 10:34–52
  15. Sonne T, Kingo OS, Krojgaard P (2016) Occlusions at event boundaries during encoding have a negative effect on infant memory. Conscious Cogn 41:72–82
  16. Lakusta L, DiFabrizio S (2017) And the winner is…A visual preference for endpoints over starting points in infant’s motion event representations. Infancy 22(3):323–343
  17. Csibra G, Bíró S, Koós O, Gergely G (2003) One-year-old infants use teleological representations of actions productively. Cogn Sci 27:111–133
  18. Daum MM, Prinz W, Aschersleben G (2008) Encoding the goal of an object-directed but uncompleted reaching action in 6- and 9-month-old infants. Dev Sci 11:607–619
  19. Meltzoff AN (1995) Understanding the intentions of others: re-enactment of intended acts by 18-month-old children. Dev Psychol 31:838–850
  20. Brandone AC, Horwitz SR, Aslin RN, Wellman HM (2014) Infants’ goal anticipation during failed and successful reaching actions. Dev Sci 17:23–34
  21. Kim EY, Song H-J (2015) Six-month-olds actively predict others’ goal-directed actions. Cogn Dev 33:1–13
  22. Applin JB, Kibbe MM (2019) Six-month-old infants predict agents’ goal-directed actions on occluded objects. Infancy 24:392–410
  23. Scassellati B (2002) Theory of mind for a humanoid robot. Auton Robots 12:13–24
  24. Bien ZZ, Park K-H, Jung J-W, Do J-H (2005) Intention reading is essential in human-friendly interfaces for the elderly and the handicapped. IEEE Trans Ind Inform 52:1500–1505
  25. Kaliouby RE, Robinson P (2004) Mind reading machines: automated inference of cognitive mental states from video. Conf Proc IEEE Int Conf Syst Man Cybern 1:682–688
  26. Buchsbaum D, Blumberg B, Breazeal C, Meltzoff AN (2005) A simulation-theory inspired social learning system for interactive characters. In: IEEE international workshop on robot and human interactive communication. pp 85–90
  27. Breazeal C, Buchsbaum D, Gray J, Gatenby D, Blumberg B (2005) Learning from and about others: towards using imitation to bootstrap the social understanding of others by robots. Artif Life 11:31–62
  28. Gray J, Breazeal C, Berlin M, Brooks A, Lieberman J (2005) Action parsing and goal inference using self as simulator. In: IEEE international workshop on robot and human interactive communication. pp 202–209
  29. Kelley R, King C, Tavakkoli A, Nicolescu M, Nicolescu M, Bebis G (2008) An architecture for understanding intent using a novel hidden Markov formulation. Int J Hum Robot 5(2):1–22
  30. Yokoya R, Ogata T, Tani J, Komatani K, Okuno HG (2007) Discovery of other individuals by projecting a self-model through imitation. In: IEEE/RSJ international conference on intelligent robots and systems. pp 1009–1014
  31. Demiris Y, Johnson M (2003) Distributed, predictive perception of actions: a biologically inspired robotics architecture for imitation and learning. Connect Sci 15(4):231–243
  32. Takanashi T, Kawamata T, Asada M, Negrello M (2007) Emulation and behavior understanding through shared values. In: IEEE/RSJ international conference on intelligent robots and systems. pp 3950–3955
  33. Kim KJ, Eo KY, Jung YR, Kim SO, Cho SB (2013) Evolutionary conditions for the emergence of robotic theory of mind with multiple goals. In: IEEE workshop on robotic intelligence in informationally structured space. pp 48–54
  34. Bongard J, Lipson H (2007) Automated reverse engineering of nonlinear dynamical systems. Proc Natl Acad Sci USA 104:9943–9948
  35. Bongard J, Zykov V, Lipson H (2006) Resilient machines through continuous self-modeling. Science 314:1118–1121
  36. Saxena A, Lipson H, Valero-Cuevas FJ (2012) Functional inference of complex anatomical tendinous networks at a macroscopic scale via sparse experimentation. PLoS Comput Biol 8:e1002751
  37. Seung HS, Opper M, Sompolinsky H (1992) Query by committee. In: Proceedings of the fifth annual ACM workshop on computational learning theory. pp 287–294
  38. Zacks JM (2004) Using movement and intentions to understand simple events. Cogn Sci 28:979–1008
  39. Baker CL, Saxe R, Tenenbaum JB (2009) Action understanding as inverse planning. Cognition 113:329–349
  40. Field DJ, Hayes A, Hess RF (1993) Contour integration by the human visual system: evidence for a local “association field”. Vis Res 33:173–193
  41. Prinzmetal W, Banks WP (1977) Good continuation affects visual detection. Percept Psychophys 21:389–395
  42. Aslin RN, Saffran JR, Newport EL (1998) Computation of conditional probability statistics by 8-month-old infants. Psychol Sci 9:321–324
  43. Buchsbaum D, Griffiths TL, Gopnik A, Baldwin DA (2009) Learning from actions and their consequences: inferring causal variables from continuous sequences of human action. In: Proceedings of the 31st annual conference of the cognitive science society. pp 2493–2498
  44. Allen K, Ibara S, Seymour A, Cordova N, Botvinick M (2010) Abstract structural representations of goal-directed behavior. Psychol Sci 21:1518–1524
  45. Beyer HG, Schwefel HP (2002) Evolution strategies: a comprehensive introduction. Nat Comput 1:3–52
  46. Kim KJ, Wang A, Lipson H (2010) Automated synthesis of resilient and tamper-evident analog circuits without a single point of failure. Genet Program Evol Mach 11:35–59
  47. Kim TS, Na JC, Kim KJ (2012) Optimization of autonomous car controller using self-adaptive evolutionary strategy. Int J Adv Robot Syst 9:73. https://doi.org/10.5772/50848
  48. Chellapilla K, Fogel DB (2001) Evolving an expert checkers playing program without using human expertise. IEEE Trans Evol Comput 5:422–428
  49. Beyer HG (1996) Toward a theory of evolution strategies: self-adaptation. Evolut Comput 3:311–347

Acknowledgements

The authors would like to thank Woon Ju Park, Sang-Ah Yoo, and Kangyong Eo for running the experiments and preparing an early draft of this paper.

Funding

K.-J. Kim was supported by the Ministry of Culture, Sports and Tourism (MCST) and the Korea Creative Content Agency (KOCCA) through the Culture Technology (CT) Research & Development Program 2020, and S.-B. Cho was supported by the Defense Acquisition Program Administration and the Agency for Defense Development under contract UD190016ED.

Author information

Corresponding author

Correspondence to Sung-Bae Cho.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Actor Learning

The actor’s neural network (NN) is defined by real-valued parameters. The observer is assumed to have no direct access to these parameters and instead attempts to infer them from the actor’s behavior (its trajectories). The actor’s NN was evolved to move towards a light source placed in the environment at (L_X, L_Y). The evolution searches for the NN parameters (weights) that control the robot’s behavior. Because these parameters are real-valued, the space of candidate NNs is very large. An evolutionary algorithm, inspired by natural evolution, guides the search for the NN parameters.

In this paper, we adopt an evolution strategy (ES) [45], which has been applied to many engineering problems (for example, game strategy and analog circuit evolution [46,47,48]). The ES is simpler than many other evolutionary methods, yet it has been successful in optimizing real-valued parameters for engineering problems. For example, it optimized the 1741 weights of a neural network that played checkers better than 99.61% of the players on zone.com [48].

In the ES, each solution encodes the neural network’s weights together with their corresponding mutation step sizes. The evolutionary search optimizes the weights and the mutation strengths together. Selection is deterministic: only the fitter half of the combined pool of parents and offspring survives to the next generation. This technique is called “truncation” or “breeding” selection [45].

The details of the ES are as follows. Initially, P parent NNs are generated randomly (P is the population size). Weights (including the bias weights) are drawn from a uniform distribution over the range −0.2 to 0.2. Each weight has a corresponding mutation step size initialized to 0.05. Each parent NN generates one offspring through mutation, yielding 2 × P neural networks (parents + offspring). The mutation operator, which slightly perturbs the current weight \( w_{i} (j) \) into a new weight \( w^{\prime}_{i} (j) \) to produce the offspring, is defined as follows:

$$ \begin{aligned} \sigma_{i}^{\prime } (j) & = \sigma_{i} (j)\exp \left( {\tau \times N_{j} (0,1)} \right) \\ w_{i}^{\prime } (j) & = w_{i} (j) + \sigma_{i}^{\prime } (j)N_{j} (0,1) \end{aligned} $$
(10)

where \( N_{W} \) is the number of weights, \( \tau \) is the learning parameter, \( w_{i} (j) \) is the jth weight of the ith neural network in the population, \( \sigma_{i} (j) \) is the corresponding mutation step size for \( w_{i} (j) \), and \( N_{j} (0,1) \) is a standard Gaussian random variable re-sampled for every j. The learning parameter \( \tau \) is defined as follows.

$$ \tau = 1/\sqrt {2\sqrt {N_{W} } } $$
(11)

The value of \( \tau \) in Eq. (11) is chosen on the basis of theoretical and empirical evidence [49].
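
The initialization and the self-adaptive mutation of Eqs. (10) and (11) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`learning_rate`, `init_individual`, `mutate`) are hypothetical, and the sketch assumes that the Gaussian samples used for the step-size update and for the weight update are drawn independently.

```python
import math
import random


def learning_rate(num_weights: int) -> float:
    """tau = 1 / sqrt(2 * sqrt(N_W)), following Eq. (11)."""
    return 1.0 / math.sqrt(2.0 * math.sqrt(num_weights))


def init_individual(num_weights: int, rng: random.Random):
    """Random parent: weights uniform in [-0.2, 0.2], step sizes set to 0.05."""
    weights = [rng.uniform(-0.2, 0.2) for _ in range(num_weights)]
    sigmas = [0.05] * num_weights
    return weights, sigmas


def mutate(weights, sigmas, rng: random.Random):
    """Produce one offspring from one parent, following Eq. (10).

    Assumption: the two Gaussian samples per weight are independent draws.
    The parent lists are left unmodified.
    """
    tau = learning_rate(len(weights))
    new_weights, new_sigmas = [], []
    for w, s in zip(weights, sigmas):
        # Step size is perturbed log-normally, so it always stays positive.
        s_new = s * math.exp(tau * rng.gauss(0.0, 1.0))
        # The weight is then perturbed using the *new* step size.
        w_new = w + s_new * rng.gauss(0.0, 1.0)
        new_sigmas.append(s_new)
        new_weights.append(w_new)
    return new_weights, new_sigmas


if __name__ == "__main__":
    rng = random.Random(0)
    parent_w, parent_s = init_individual(17, rng)
    child_w, child_s = mutate(parent_w, parent_s, rng)
    print(learning_rate(17), child_w[:3], child_s[:3])
```

Because the step size is updated before the weight, an offspring that inherits a well-suited step size tends to produce better weights, which is how the search adapts its own mutation strength.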

From the pool of parents and offspring, only half survive to the next generation, based on fitness. Because the goal of this evolution was to reach the light, fitness was defined as the reciprocal of the Euclidean distance between the robot and the light source, summed over every time step of the navigation, where (R_X(j), R_Y(j)) is the robot’s position at time step j.

$$ Fitness_{i} = \frac{1}{{\sum\nolimits_{j = 1}^{MAX\_STEPS} {\sqrt {(L\_X - R\_X(j))^{2} + (L\_Y - R\_Y(j))^{2} } } }} $$
(12)
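
As a rough illustration of Eq. (12), the fitness of one run can be computed from the recorded robot positions. The function name `fitness` and the representation of a trajectory as a list of (x, y) tuples are assumptions made for this sketch; returning infinity for a zero total distance is also an assumption, since the source does not discuss that case.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]


def fitness(trajectory: Iterable[Point], light: Point) -> float:
    """Reciprocal of the summed robot-to-light distance, as in Eq. (12).

    trajectory: robot positions (R_X(j), R_Y(j)) for j = 1..MAX_STEPS.
    light: light-source position (L_X, L_Y).
    """
    lx, ly = light
    total = sum(math.hypot(lx - rx, ly - ry) for rx, ry in trajectory)
    # Assumption: a robot that sits exactly on the light at every step
    # gets infinite fitness instead of raising ZeroDivisionError.
    return math.inf if total == 0.0 else 1.0 / total


if __name__ == "__main__":
    light = (3.0, 4.0)
    far_run = [(0.0, 0.0), (0.0, 0.0)]
    near_run = [(0.0, 0.0), (3.0, 0.0)]
    print(fitness(far_run, light), fitness(near_run, light))
```

Summing over the whole trajectory rewards robots that approach the light quickly, not only those that end up near it.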

The ES sorts the 2 × P candidate NNs (parents + offspring) by fitness, and only the fitter half survive to become parents in the next generation.
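
One generation of this parents-plus-offspring truncation selection can be sketched as follows. The function `next_generation` and its `mutate` and `evaluate` callbacks are hypothetical names used only for illustration; in the setting described here, `evaluate` would require running the physics simulation, which this sketch replaces with a toy fitness function.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def next_generation(
    parents: List[T],
    mutate: Callable[[T], T],
    evaluate: Callable[[T], float],
) -> List[T]:
    """Each parent yields one offspring; the fitter half of the 2P pool survives."""
    offspring = [mutate(p) for p in parents]
    pool = parents + offspring  # 2 x P candidates
    ranked = sorted(pool, key=evaluate, reverse=True)  # highest fitness first
    return ranked[: len(parents)]  # deterministic truncation


if __name__ == "__main__":
    # Toy problem (assumption for illustration): individuals are numbers,
    # fitness is higher the closer a number is to 3.0.
    population = [0.0, 1.0, 2.0, 3.0]
    survivors = next_generation(
        population,
        mutate=lambda x: x + 0.5,
        evaluate=lambda x: -abs(x - 3.0),
    )
    print(survivors)
```

Because parents compete with their own offspring, the best individual found so far is never lost from one generation to the next.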

Appendix B: Robot’s Details

  • Body (Morphology) The robot resembles a tricycle, with a large spherical main body and three wheels (one front wheel and two rear wheels) (Fig. 3a). The radius of the sphere is 1 m. The radius and width of each wheel are 0.5 m and 0.3 m, respectively. The density of the robot is 5 kg/m³. The model is a modification of the sample tricycle (which has a rectangular body) provided with the PhysX simulator.

  • Sensors The robot has two light sensors, located on the front side of the upper hemisphere of its body at angles of +π/4 and −π/4, which detect the light level around the body. The light level is measured using the following equation, where r is the Euclidean distance between the sensor and the light source and θ is the angle between the sensor and the light source:

    $$ sensor\_value = \frac{1000}{{r^{2} }}\cos (\theta ) $$
    (13)
  • Actuators At each time step (every 1/60 s), the simulator sets the angle of the front wheel (−π/3 to π/3) and the speed of the rear wheels based on the outputs of the controller. The maximum speed of the robot is 4 m/s.

  • Controller The robot receives two light sensor values (right and left), which are fed to the NN; the NN produces two real values that set the direction and speed of the wheels (Fig. 3b). The network has three hidden neurons and two output neurons. It has 17 weights: 12 connection weights (2 × 3 input-to-hidden and 3 × 2 hidden-to-output) and 5 biases, one for each hidden and output neuron. The hyperbolic tangent is used as the activation function. The actor learned to move towards a light source by evolving an “innate” NN through its interaction with the environment. In this way, the actor is able to follow the light source using its unique neural controller (see Fig. 10).

    Fig. 10 The behavior and neural topology of the actor. a Trajectories produced by the actor’s neural network (the black cross is the light source and the circles mark the starting positions of the robot; the robot’s initial angle is set to 0 degrees). b The actor robot’s neural network model evolved to reach the light, shown with its evolved weights
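
The sensor model of Eq. (13) and the 17-weight controller described above can be sketched together. This is a hedged illustration under several assumptions not stated in the source: the ordering of the weights in the flat vector, the use of tanh on the output neurons as well as the hidden ones, and the linear scaling of the two outputs to a steering angle of up to π/3 rad and a speed of up to 4 m/s. All function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

N_IN, N_HID, N_OUT = 2, 3, 2
# 6 input-to-hidden + 3 hidden biases + 6 hidden-to-output + 2 output biases = 17
N_WEIGHTS = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT


def sensor_value(r: float, theta: float) -> float:
    """Light level from Eq. (13): (1000 / r^2) * cos(theta)."""
    return 1000.0 / (r * r) * math.cos(theta)


def controller(weights: Sequence[float], left: float, right: float) -> Tuple[float, float]:
    """Forward pass of a 2-3-2 tanh network; both outputs lie in [-1, 1].

    Assumed weight layout: [input-to-hidden (6), hidden biases (3),
    hidden-to-output (6), output biases (2)].
    """
    if len(weights) != N_WEIGHTS:
        raise ValueError(f"expected {N_WEIGHTS} weights, got {len(weights)}")
    w_ih, b_h = weights[0:6], weights[6:9]
    w_ho, b_o = weights[9:15], weights[15:17]
    inputs = (left, right)
    hidden = [
        math.tanh(sum(w_ih[h * N_IN + i] * inputs[i] for i in range(N_IN)) + b_h[h])
        for h in range(N_HID)
    ]
    outputs = [
        math.tanh(sum(w_ho[o * N_HID + h] * hidden[h] for h in range(N_HID)) + b_o[o])
        for o in range(N_OUT)
    ]
    return outputs[0], outputs[1]


def to_actuators(direction: float, speed: float) -> Tuple[float, float]:
    """Assumed scaling: steering angle in [-pi/3, pi/3] rad, speed up to 4 m/s."""
    return direction * math.pi / 3.0, speed * 4.0


if __name__ == "__main__":
    left = sensor_value(r=10.0, theta=math.pi / 4)
    right = sensor_value(r=10.0, theta=-math.pi / 4)
    direction, speed = controller([0.1] * N_WEIGHTS, left, right)
    print(to_actuators(direction, speed))
```

Calling `controller` once per simulation step (every 1/60 s) with fresh sensor readings gives the closed sensing-and-acting loop that the evolved weights are evaluated on.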

About this article

Cite this article

Kim, KJ., Cho, SB. Inference of Other’s Minds with Limited Information in Evolutionary Robotics. Int J of Soc Robotics (2020). https://doi.org/10.1007/s12369-020-00660-x

Keywords

  • Evolutionary robotics
  • Estimation–exploration algorithm
  • Neural network
  • Physics-based simulation
  • Information loss