International Journal of Social Robotics, Volume 8, Issue 1, pp 31–50

People Interpret Robotic Non-linguistic Utterances Categorically

  • Robin Read
  • Tony Belpaeme


We present the results of an experiment probing whether adults exhibit categorical perception when affectively rating robot-like sounds (Non-linguistic Utterances). The experimental design followed the traditional methodology used in psychology for measuring categorical perception: stimulus continua of robot sounds were presented to subjects, who were asked to complete a discrimination task and an identification task. In the former, subjects rated whether stimulus pairs were affectively different; in the latter, they rated single stimuli affectively. The results indicate that Non-linguistic Utterances can convey affect and that their affective interpretations are drawn towards prototypical emotions, which shows that people exhibit categorical perception at the level of inferred affective meaning when hearing robot-like sounds. We speculate on how these insights can be used to automatically design and generate affect-laden robot-like utterances.


Keywords: Non-linguistic Utterances; Social human–robot interaction; Categorical perception; Affective displays; Multi-modal human–robot interaction



This work was (partially) funded by the EU FP7 ALIZ-E project (Grant 248116).

Supplementary material

Supplementary material 1: 12369_2015_304_MOESM1_ESM.mp3
Supplementary material 2: 12369_2015_304_MOESM2_ESM.mp3
Supplementary material 3: 12369_2015_304_MOESM3_ESM.mp3
Supplementary material 4: 12369_2015_304_MOESM4_ESM.mp3
Supplementary material 5: 12369_2015_304_MOESM5_ESM.mp3
Supplementary material 6: 12369_2015_304_MOESM6_ESM.mp3
Supplementary material 7: 12369_2015_304_MOESM7_ESM.mp3
Supplementary material 8: 12369_2015_304_MOESM8_ESM.mp3
Supplementary material 9: 12369_2015_304_MOESM9_ESM.mp3
Supplementary material 10: 12369_2015_304_MOESM10_ESM.mp3
Supplementary material 11: 12369_2015_304_MOESM11_ESM.mp3
Supplementary material 12: 12369_2015_304_MOESM12_ESM.mp3
Supplementary material 13: 12369_2015_304_MOESM13_ESM.mpg
Supplementary material 14: 12369_2015_304_MOESM14_ESM.mpg



Copyright information

© Springer Science+Business Media Dordrecht 2015

Authors and Affiliations

  1. Centre for Robotics and Neural Systems, Plymouth University, Plymouth, UK
