Abstract
Predicting human performance in interaction tasks allows designers and developers to understand the expected performance of a target interface without testing it with real users. In this chapter, we discuss how deep learning methods can aid human performance prediction in the context of HCI, through three case studies. In the first case study, we discuss deep models for goal-driven human visual search on arbitrary web pages. In the second, we show that deep learning models can successfully capture human learning effects from repetitive interaction with vertical menus. In the third, we describe how deep models can be combined with analytical understanding to capture high-level interaction strategies and low-level behaviors in touchscreen grid interfaces on mobile devices. Across these studies, we show that deep learning provides great capacity for modeling complex interaction behaviors, which would be extremely difficult for traditional heuristic-based models. Furthermore, we showcase different ways to analyze a learned deep model to improve model interpretability and to better understand human behaviors, thereby advancing the science.
Arianna Yuan and Ken Pfeuffer conducted the work during an internship at Google Research.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Yuan, A., Pfeuffer, K., Li, Y. (2021). Human Performance Modeling with Deep Learning. In: Li, Y., Hilliges, O. (eds) Artificial Intelligence for Human Computer Interaction: A Modern Approach. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-030-82681-9_1
DOI: https://doi.org/10.1007/978-3-030-82681-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-82680-2
Online ISBN: 978-3-030-82681-9
eBook Packages: Computer Science, Computer Science (R0)