Leolani: A Reference Machine with a Theory of Mind for Social Communication

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11107)


Our state of mind is based on experiences and what other people tell us. This may result in conflicting information, uncertainty, and alternative facts. We present a robot that models the relativity of knowledge and perception within social interaction, following principles of the theory of mind. We utilized the vision and speech capabilities of a Pepper robot to build an interaction model that stores the interpretations of perceptions and conversations in combination with provenance on their sources. The robot learns directly from what people tell it, possibly in relation to its perception. We demonstrate how the robot’s communication is driven by a hunger to acquire more knowledge from and about people and objects, to resolve uncertainties and conflicts, and to share awareness of the perceived environment. Likewise, the robot can refer to the world, to its knowledge about the world, and to the encounters with people that yielded this knowledge.
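The abstract's core idea, storing what is said together with who said it, and surfacing conflicts between sources, can be illustrated with a minimal sketch. This is a hypothetical toy model, not the paper's actual implementation: the names `Claim`, `Brain`, `hear`, and `conflicts` are illustrative inventions.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Claim:
    """A subject-predicate-object statement the robot heard or perceived."""
    subject: str
    predicate: str
    obj: str

@dataclass
class Brain:
    """Stores claims together with the sources that asserted them (provenance)."""
    sources: dict = field(default_factory=dict)  # Claim -> set of source names

    def hear(self, claim: Claim, source: str) -> None:
        # Record the claim and attribute it to its source.
        self.sources.setdefault(claim, set()).add(source)

    def conflicts(self, claim: Claim) -> list:
        # Claims that share subject and predicate but disagree on the object.
        return [c for c in self.sources
                if c.subject == claim.subject
                and c.predicate == claim.predicate
                and c.obj != claim.obj]
```

Keeping provenance per claim rather than collapsing to a single "truth" is what lets the robot represent disagreement between interlocutors (e.g., two people asserting different colors for the same object) and ask follow-up questions to resolve it.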


Keywords: Robot · Theory of mind · Social learning · Communication



This research was funded by the VU University Amsterdam and the Netherlands Organization for Scientific Research via the Spinoza grant awarded to Piek Vossen. We also thank Bob van der Graft for his support.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Computational Lexicology and Terminology Lab, VU University Amsterdam, Amsterdam, The Netherlands
