Evaluation of Human-Robot Interaction Quality: A Toolkit for Workplace Design

  • Patricia H. Rosen
  • Sarah Sommer
  • Sascha Wischniwski
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 824)


The working world is facing constant change. New technologies are emerging that enable new forms of human-system interaction. Autonomous robots in service industries as well as in manufacturing settings, in particular, create novel forms of human-robot interaction. Not only researchers but also system integrators and practitioners are confronted with the question of how to analyze, evaluate, and ultimately design these new working systems in a human-centered way. In this paper we present evaluation criteria as well as a toolkit with concrete measures that enable a holistic evaluation of cognitive aspects of human-robot interaction in work-related scenarios. The evaluation criteria comprise both technology-related and human-related parameters. Furthermore, the paper presents a first empirical validation of the evaluation criteria and their measurements. The validation study uses a manual assembly task accomplished with a lightweight robot. The results indicate that the evaluation criteria can be used to describe the quality of human-robot interaction.


Keywords: Workplace evaluation · Human-centered workplace design · Socio-technical system



Parts of this work were developed within the research project Hybr-iT. The research project Hybr-iT is funded by the Federal Ministry of Education and Research and is administered by the DLR Project Management Agency (reference no. 01IS16026H).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Patricia H. Rosen¹ (corresponding author)
  • Sarah Sommer¹
  • Sascha Wischniwski¹

  1. Unit “Human Factors, Ergonomics”, Federal Institute of Occupational Safety and Health, Dortmund, Germany
