Human Performance Modeling with Deep Learning

Yuan, Arianna; Pfeuffer, Ken; Li, Yang

doi:10.1007/978-3-030-82681-9_1

Part of the book series: Human–Computer Interaction Series (HCIS)

Abstract

Predicting human performance in interaction tasks lets designers and developers estimate how a target interface will perform without testing it with real users. In this chapter, we discuss how deep learning methods can aid human performance prediction in the context of HCI, through three case studies. In the first, we discuss deep models for goal-driven human visual search on arbitrary web pages. In the second, we show that deep learning models can capture human learning effects from repetitive interaction with vertical menus. In the third, we describe how deep models can be combined with analytical understanding to capture high-level interaction strategies and low-level behaviors in touchscreen grid interfaces on mobile devices. Across these studies, we show that deep learning offers substantial capacity for modeling complex interaction behaviors that would be extremely difficult to capture with traditional heuristic-based models. Furthermore, we showcase different ways to analyze a learned deep model, both to improve model interpretability and to deepen the understanding of human behavior, thereby advancing the science.

Arianna Yuan and Ken Pfeuffer conducted the work during an internship at Google Research.

Author information

Authors and Affiliations

Stanford University, 450 Serra Mall, Stanford, CA, USA
Arianna Yuan
Aarhus University, Nordre Ringgade 1, Aarhus, Denmark
Ken Pfeuffer
Google Research, Mountain View, CA, USA
Yang Li

Corresponding author

Correspondence to Yang Li.

Editor information

Editors and Affiliations

Google Research (United States), Mountain View, CA, USA
Yang Li
Advanced Interactive Technologies Lab, ETH Zurich, Zurich, Switzerland
Otmar Hilliges

About this chapter

Cite this chapter

Yuan, A., Pfeuffer, K., Li, Y. (2021). Human Performance Modeling with Deep Learning. In: Li, Y., Hilliges, O. (eds) Artificial Intelligence for Human Computer Interaction: A Modern Approach. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-030-82681-9_1

DOI: https://doi.org/10.1007/978-3-030-82681-9_1
Published: 05 November 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-82680-2
Online ISBN: 978-3-030-82681-9
eBook Packages: Computer Science, Computer Science (R0)
