Creating user stereotypes for persona development from qualitative data through semi-automatic subspace clustering

  • Dannie Korsgaard
  • Thomas Bjørner
  • Pernille Krog Sørensen
  • Paolo Burelli


Personas are models of users that incorporate motivations, wishes, and objectives. These models are employed in user-centred design to help design better user experiences and have recently been employed in adaptive systems to help tailor the personalized user experience. Designing with personas involves the production of descriptions of fictitious users, which are often based on data from real users. The majority of data-driven persona development performed today is based on qualitative data from a limited set of interviewees, which is transformed into personas using labour-intensive manual techniques. In this study, we propose a method that employs the modelling of user stereotypes to automate part of the persona creation process and addresses the drawbacks of the existing semi-automated methods for persona development. The description of the method is accompanied by an empirical comparison with a manual technique and a semi-automated alternative (multiple correspondence analysis). The results of the comparison show that manual techniques differ between human persona designers, leading to different results. The proposed algorithm provides similar results based on parameter input, but is more rigorous and finds optimal clusters, while lowering the labour associated with finding the clusters in the dataset. The output of the method also represents the largest variances in the dataset identified by the multiple correspondence analysis.


Ethnography · Persona · Mixed method · Subspace clustering · Older adults



This study was part of the ELDORADO project “Preventing malnourishment and promoting well-being among the older adults at home through personalized cost-effective food and meal supply” supported by a grant (4105-00009B) from the Innovation Fund Denmark.


Copyright information

© Springer Nature B.V. 2020

Authors and Affiliations

  1. Aalborg University Copenhagen, Copenhagen, Denmark
  2. IT University, Copenhagen, Denmark
