Cluster Computing, Volume 21, Issue 1, pp 1097–1107

Mining biometric data to predict programmer expertise and task difficulty

  • Seolhwa Lee
  • Danial Hooshyar
  • Hyesung Ji
  • Kichun Nam
  • Heuiseok Lim (corresponding author)


Programming mistakes frequently waste software developers’ time and can introduce bugs into their software, creating serious risks for their customers. Earlier work has traditionally tried to spot such bug risks by using the correlation between various software process metrics and defects. This study departs from that work by examining a more direct method: using psycho-physiological sensor data to detect the difficulty of program comprehension tasks and the programmer's level of expertise. In a study with 38 expert and novice programmers, we investigated how well electroencephalography (EEG) and eye tracking can predict programmer expertise (novice/expert) and task difficulty (easy/difficult). Using data from both sensors, we could predict task difficulty with 64.9% precision and 68.6% recall, and programmer expertise with 97.7% precision and 96.4% recall. The results show that it is possible to predict the perceived difficulty of a task and the expertise level of developers using psycho-physiological sensor data. In addition, we found that while a single biometric sensor gives good results, combining both sensors leads to the best overall performance.
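
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how precision and recall are computed for one class of a binary classifier. The function name `precision_recall` and the label lists are hypothetical and chosen only for illustration; they are not data from the study, and the abstract does not say how the authors aggregated the metrics across classes.

```python
from typing import Sequence, Tuple


def precision_recall(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Tuple[float, float]:
    """Return (precision, recall) for the class named `positive`."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    # False positives: predicted positive but actually another class.
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # False negatives: actually positive but predicted as another class.
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for illustration only; not data from the study.
y_true = ["expert", "expert", "expert", "novice", "novice", "novice"]
y_pred = ["expert", "expert", "novice", "expert", "novice", "novice"]

precision, recall = precision_recall(y_true, y_pred, positive="expert")
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision answers "of the programmers labelled expert, how many really were?", while recall answers "of the real experts, how many were found?". Reporting both, as the abstract does, shows whether a classifier trades one for the other.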


Code comprehension, Programming expertise, Task difficulty, Biometric data, Machine learning



This work was supported by the ICT R&D Program of MSIP/IITP [Grant Number 2016(B0101-16-0340), Development of distribution and diffusion service technology through individual and collective intelligence to digital contents] and by a National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIP) (No. R1610941).



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Seolhwa Lee (1)
  • Danial Hooshyar (1)
  • Hyesung Ji (1)
  • Kichun Nam (2)
  • Heuiseok Lim (1), corresponding author
  1. Department of Computer Science and Engineering, Korea University, Seoul, Korea
  2. Department of Psychology, Korea University, Seoul, Korea
